    blk-mq: fix a hung issue when fsync · 85bd6e61
    Jianchao Wang authored
    Florian reported an I/O hang during fsync(). It appears to be
    triggered by the following race condition.
    
    data + post flush         a flush
    
    blk_flush_complete_seq
      case REQ_FSEQ_DATA
        blk_flush_queue_rq
        issued to driver      blk_mq_dispatch_rq_list
                                try to issue a flush req
                                failed due to NON-NCQ command
                                .queue_rq returns BLK_STS_DEV_RESOURCE
    
    request completion
      req->end_io // doesn't check RESTART
      mq_flush_data_end_io
        case REQ_FSEQ_POSTFLUSH
          blk_kick_flush
            do nothing because previous flush
            has not been completed
         blk_mq_run_hw_queue
                                  insert rq into hctx->dispatch
                                  RESTART is still set, so do nothing
    
    To fix this, replace the blk_mq_run_hw_queue() call in
    mq_flush_data_end_io() with blk_mq_sched_restart(), which checks and
    clears the RESTART flag.
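
The interleaving above can be replayed in a small user-space model. This is only a sketch under stated assumptions: every name prefixed with model_ is hypothetical and is not a kernel API, the hardware queue is reduced to one flag and one counter, and "running the queue" is assumed to issue everything parked on the dispatch list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hardware queue context (not the kernel struct). */
struct model_hctx {
    bool restart;      /* models the RESTART flag */
    int  dispatch_len; /* requests parked on the dispatch list */
};

/* Models a queue run: assume the device accepts everything that is parked. */
static void model_run_hw_queue(struct model_hctx *h)
{
    h->dispatch_len = 0;
}

/* Models the restart helper: test and clear RESTART, then run the queue. */
static void model_sched_restart(struct model_hctx *h)
{
    if (h->restart) {
        h->restart = false;
        model_run_hw_queue(h);
    }
}

/*
 * Replays the race in the order shown in the diagram and returns the
 * number of requests left stranded on the dispatch list.
 */
static int model_replay(bool use_sched_restart)
{
    struct model_hctx h = { .restart = false, .dispatch_len = 0 };

    /* Flush side: RESTART is marked, then the driver rejects the flush. */
    h.restart = true;

    /* Completion side: end_io handler of the data request. */
    if (use_sched_restart)
        model_sched_restart(&h);  /* clears RESTART */
    else
        model_run_hw_queue(&h);   /* list still empty, RESTART untouched */

    /* Flush side: park the request, rerun only if RESTART is clear. */
    h.dispatch_len++;
    if (!h.restart)
        model_run_hw_queue(&h);

    return h.dispatch_len;
}

/* Checks both variants of the model. */
static void model_check(void)
{
    assert(model_replay(false) == 1); /* old behaviour: request stranded */
    assert(model_replay(true) == 0);  /* with the restart helper: drained */
}
```

In this model the plain queue run leaves RESTART set, so the flush side parks its request and nobody reruns the queue. Clearing the flag from the completion side lets the flush side rerun the queue itself after parking the request.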
    
    Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers)
    Reported-by: Florian Stecker <m19@florianstecker.de>
    Tested-by: Florian Stecker <m19@florianstecker.de>
    Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-flush.c 14.4 KB