From: yaoma <yaoma@linux.alibaba.com>
To: Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>
Cc: axboe@kernel.dk, hch@lst.de, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, kanie@linux.alibaba.com
Subject: Re: [PATCH] nvme: fix deadlock between reset and scan
Date: Wed, 29 Nov 2023 11:23:13 +0800
Message-ID: <6430641c-0db3-d9c2-4b75-51c179c52e8d@linux.alibaba.com>
In-Reply-To: <ZWYqvo86PI6iHoXV@kbusch-mbp.dhcp.thefacebook.com>
I previously tried the approach you proposed, and it does resolve this
deadlock. My worry is that if an I/O timeout occurs during the scan, it
will trigger a reset. With flush_work() in the reset path, the reset would
then wait for the scan to finish while the scan is still waiting on the
timed-out I/O, which could introduce a new risk of deadlock.
I prefer the suggestion made by Sagi Grimberg, since that approach does
not appear to introduce new problems.
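To make the idea concrete, here is a minimal single-threaded userspace
model of the flag-based approach Sagi describes in the quoted text below.
It is only a sketch under stated assumptions, not the driver code: struct
model_ctrl and every model_* function are hypothetical names, the rwsem is
reduced to a bookkeeping flag, and "frozen" stands in for the proposed
NVME_CTRL_FROZEN bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these are the kernel definitions. */
enum { MODEL_MAX_NS = 8 };

struct model_ctrl {
	bool frozen;                  /* stands in for the proposed NVME_CTRL_FROZEN */
	bool rwsem_held;              /* stands in for ctrl->namespaces_rwsem */
	int nr_ns;                    /* namespaces currently on the list */
	bool ns_frozen[MODEL_MAX_NS]; /* whether each ns->q has been frozen */
};

/* Bookkeeping stubs: they only check that lock and unlock are paired. */
static void model_lock(struct model_ctrl *c)
{
	assert(!c->rwsem_held);
	c->rwsem_held = true;
}

static void model_unlock(struct model_ctrl *c)
{
	assert(c->rwsem_held);
	c->rwsem_held = false;
}

/* Reset path: set the flag first, then freeze every queue on the list. */
static void model_start_freeze(struct model_ctrl *c)
{
	c->frozen = true;
	model_lock(c);
	for (int i = 0; i < c->nr_ns; i++)
		c->ns_frozen[i] = true;
	model_unlock(c);
}

/* Reset path: the wait can only finish if every listed queue is frozen. */
static bool model_wait_freeze_would_finish(struct model_ctrl *c)
{
	bool done = true;

	model_lock(c);
	for (int i = 0; i < c->nr_ns; i++)
		if (!c->ns_frozen[i])
			done = false;
	model_unlock(c);
	return done;
}

/* Reset path: unfreeze the queues, then allow the scan to add again. */
static void model_unfreeze(struct model_ctrl *c)
{
	model_lock(c);
	for (int i = 0; i < c->nr_ns; i++)
		c->ns_frozen[i] = false;
	model_unlock(c);
	c->frozen = false;
}

/* Scan path: check the flag under the lock before adding a namespace. */
static bool model_scan_add_ns(struct model_ctrl *c)
{
	bool added = false;

	model_lock(c);
	if (!c->frozen && c->nr_ns < MODEL_MAX_NS) {
		c->ns_frozen[c->nr_ns] = false; /* a new queue starts unfrozen */
		c->nr_ns++;
		added = true;
	}
	model_unlock(c);
	return added;
}
```

In this model a scan that runs after model_start_freeze() is refused, so
the list never gains an unfrozen queue and the wait can complete. The scan
never blocks on the reset, which is why I do not expect the timeout case
above to be a problem with this approach.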
On 2023/11/29 02:00, Keith Busch wrote:
> On Tue, Nov 28, 2023 at 12:13:59PM +0200, Sagi Grimberg wrote:
>>
>>
>> On 11/28/23 08:22, yaoma wrote:
>>> Hi Keith Busch
>>>
>>> Thanks for your reply.
>>>
>>> The idea to avoid such a deadlock between nvme_reset and nvme_scan is to
>>> ensure that no namespace can be added to ctrl->namespaces after
>>> nvme_start_freeze has already been called. We can achieve this goal by
>>> assessing the ctrl->state after we have already acquired the
>>> ctrl->namespaces_rwsem lock, to decide whether to add the namespace to
>>> the list or not.
>>> 1. After we determine that ctrl->state is LIVE, it may be immediately
>>> changed to another state. However, since we have already acquired the
>>> lock, other tasks cannot access ctrl->namespaces, so we can still safely
>>> add the namespace to the list. After acquiring the lock,
>>> nvme_start_freeze will freeze all ns->q in the list, including any newly
>>> added namespaces.
>>> 2. Before the completion of nvme_reset, ctrl->state will not be changed
>>> to LIVE, so we will not add any more namespaces to the list. All ns->q
>>> in the list are frozen, so nvme_wait_freeze can exit normally.
>>
>> I agree with the analysis: there is a window between nvme_start_freeze and
>> nvme_wait_freeze in which a scan may add a ns to the ctrl ns list.
>>
>> However, the fix should be to mark the ctrl with, say, an NVME_CTRL_FROZEN
>> flag that is set in nvme_start_freeze and cleared in nvme_unfreeze (similar
>> to what we did with quiesce). Then the scan can check it before adding
>> the new namespace (under the namespaces_rwsem).
>
> Could we just make sure that scan_work isn't running? If we reset a live
> controller, then we're not depending on reset_work to unblock scan_work,
> and can let scan_work end gracefully. The scan_work can't be rescheduled
> again while in the resetting state.
>
> ---
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index fad4cccce745c..5d6305475bad5 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2701,8 +2701,10 @@ static void nvme_reset_work(struct work_struct *work)
>  	 * If we're called to reset a live controller first shut it down before
>  	 * moving on.
>  	 */
> -	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
> +	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) {
> +		flush_work(&dev->ctrl.scan_work);
>  		nvme_dev_disable(dev, false);
> +	}
>  	nvme_sync_queues(&dev->ctrl);
>
>  	mutex_lock(&dev->shutdown_lock);
> --