Linux-NVME Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Nilay Shroff <nilay@linux.ibm.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, kbusch@kernel.org,
	sagi@grimberg.me, axboe@fb.com, chaitanyak@nvidia.com,
	gjoyce@linux.ibm.com
Subject: Re: [PATCH 2/2] nvme: make keep-alive synchronous operation
Date: Mon, 7 Oct 2024 13:25:56 +0530	[thread overview]
Message-ID: <5cfe49e2-0d78-4e16-b8a7-bc3fda90c2e5@linux.ibm.com> (raw)
In-Reply-To: <20241007064149.GB1141@lst.de>



On 10/7/24 12:11, Christoph Hellwig wrote:
> On Fri, Oct 04, 2024 at 05:16:57PM +0530, Nilay Shroff wrote:
>> The nvme keep-alive operation, which executes at a periodic interval,
>> could potentially sneak in while shutting down a fabric controller.
>> This may lead to a race between the fabric controller admin queue
>> destroy code path (while shutting down controller) and the blk-mq
>> hw/hctx queuing from the keep-alive thread.
>>
>> This fix helps avoid race by implementing keep-alive as a synchronous
>> operation so that admin queue-usage ref counter is decremented only
>> after keep-alive command finish execution and returns its status.
> 
> With that you mean ->q_usage_counter?
Yes I meant ->q_usage_counter.
> 
> Moving to synchronous submission and wasting a workqueue context for
> that is a bit sad.  I think just removing the blk_mq_free_request call
> from nvme_keep_alive_finish and returning RQ_END_IO_FREE instead
> should have the same effect, or am I missing something?
> 
Unfortunately, that would not help because we still fall through the same 
code path and here we're just moving blk_mq_free_request call from 
nvme_keep_alive_finish to its caller. For instance, assuming we keep 
nvme_keep_alive work asynchronous then the code path for invoking 
nvme_keep_alive_finish would be,

nvme_keep_alive_work()
  ->blk_execute_rq_no_wait()
    ->blk_mq_run_hw_queue()
      ->blk_mq_sched_dispatch_requests()
        ->__blk_mq_sched_dispatch_requests()
          ->blk_mq_dispatch_rq_list()
            ->nvme_loop_queue_rq()
              ->nvme_fail_nonready_command() 
                ->nvme_complete_rq()
                  ->nvme_end_req()
                    ->blk_mq_end_request()
                      ->__blk_mq_end_request()
                        ->nvme_keep_alive_finish() 

In the above call path, if nvme_keep_alive_finish returns RQ_END_IO_FREE (and we 
remove blk_mq_free_request call from it) then the caller __blk_mq_end_request would 
then invoke blk_mq_free_request and we may still hit the same bug because 
this would allow blk_mq_destroy_queue() (which may be running on another cpu from 
the controller shutdown code path) forward progress and next from blk_put_queue()
the admin queue resources are deleted.

>> Also, while we are at it, instead of first acquiring ctrl lock and then
>> accessing NVMe controller state, lets use the helper function
>> nvme_ctrl_state() in nvme_keep_alive_end_io() and get rid of the
>> lock.
> 
> Please split that into a separate patch.
Sure I would split the patch and resend the series. 

And thank you for your review comments!

--Nilay


      reply	other threads:[~2024-10-07  8:21 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-04 11:46 [PATCH 0/2] nvme: system fault while shutting down fabric controller Nilay Shroff
2024-10-04 11:46 ` [PATCH 1/2] nvme-loop: flush off pending I/O while shutting down loop controller Nilay Shroff
2024-10-07  6:37   ` Christoph Hellwig
2024-10-04 11:46 ` [PATCH 2/2] nvme: make keep-alive synchronous operation Nilay Shroff
2024-10-07  6:41   ` Christoph Hellwig
2024-10-07  7:55     ` Nilay Shroff [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5cfe49e2-0d78-4e16-b8a7-bc3fda90c2e5@linux.ibm.com \
    --to=nilay@linux.ibm.com \
    --cc=axboe@fb.com \
    --cc=chaitanyak@nvidia.com \
    --cc=gjoyce@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox