public inbox for linux-block@vger.kernel.org
From: Nilay Shroff <nilay@linux.ibm.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	kbusch@kernel.org, hare@suse.de, sagi@grimberg.me,
	jmeneghi@redhat.com, axboe@kernel.dk, martin.petersen@oracle.com,
	gjoyce@ibm.com
Subject: Re: [RFC PATCHv2 1/3] nvme-multipath: introduce delayed removal of the multipath head node
Date: Mon, 28 Apr 2025 12:35:39 +0530	[thread overview]
Message-ID: <3916ace7-833e-4cbb-8f8a-cb8e87403c7a@linux.ibm.com> (raw)
In-Reply-To: <20250425144336.GA12179@lst.de>



On 4/25/25 8:13 PM, Christoph Hellwig wrote:
> On Fri, Apr 25, 2025 at 04:03:08PM +0530, Nilay Shroff wrote:
>> @@ -4007,10 +4008,6 @@ static void nvme_ns_remove(struct nvme_ns *ns)
>>  
>>  	mutex_lock(&ns->ctrl->subsys->lock);
>>  	list_del_rcu(&ns->siblings);
>> -	if (list_empty(&ns->head->list)) {
>> -		list_del_init(&ns->head->entry);
>> -		last_path = true;
>> -	}
>>  	mutex_unlock(&ns->ctrl->subsys->lock);
>>  
>>  	/* guarantee not available in head->list */
>> @@ -4028,8 +4025,7 @@ static void nvme_ns_remove(struct nvme_ns *ns)
>>  	mutex_unlock(&ns->ctrl->namespaces_lock);
>>  	synchronize_srcu(&ns->ctrl->srcu);
>>  
>> -	if (last_path)
>> -		nvme_mpath_shutdown_disk(ns->head);
>> +	nvme_mpath_shutdown_disk(ns->head);
> 
> This now removes the head list deletion from the first critical
> section into nvme_mpath_shutdown_disk.  I remember we had to do it
> the way it currently is done because we were running into issues
> otherwise, the commit history might help a bit with what the issues
> were.
> 
Thank you for the pointers! Yes, I checked the commit history and found
commit 9edceaf43050 ("nvme: avoid race in shutdown namespace removal"), which
avoids the race between nvme_find_ns_head and nvme_ns_remove. Looking at that
history, it seems we also need to handle that case here. I think if the user
doesn't configure delayed removal of the head node, then the existing behavior
should be preserved to avoid the above-mentioned race. However, if the user does
configure delayed head node removal, then we would defer the head->entry removal
until the delayed removal time elapses. I'll handle this case in the next patch.
 
>> +	if (a == &dev_attr_delayed_removal_secs.attr) {
>> +		struct nvme_ns_head *head = dev_to_ns_head(dev);
>> +		struct gendisk *disk = dev_to_disk(dev);
>> +
>> +		/*
>> +		 * This attribute is only valid for head node and non-fabric
>> +		 * setup.
>> +		 */
>> +		if (!nvme_disk_is_ns_head(disk) ||
>> +				test_bit(NVME_NSHEAD_FABRICS, &head->flags))
>> +			return 0;
>> +	}
> 
> Didn't we say we also want to allow fabrics to use it if explicitly
> configured?
> 
Hmm, I think there's some confusion here. I thought we discussed that for a
fabrics setup, fabric link failures are managed through other parameters such as
"reconnect_delay" and "max_reconnects", which are handled in the respective
fabric driver layer. Therefore, head->delayed_removal_secs is intended
exclusively for non-fabric (e.g., PCIe) multipath configurations.

Even if we were to support delayed head node removal for a fabrics setup, I
wonder who would handle the fabric (re-)connect? Once we exhaust the configured
number of reconnect attempts, we stop the reconnect work and delete the fabric
controller.

So do you think we should still expose head->delayed_removal_secs for a fabrics
setup? If so, it would add a further delay (with no reconnect attempts) after
fabric link failures, wouldn't it?

Thanks,
--Nilay


Thread overview: 18+ messages
2025-04-25 10:33 [RFC PATCHv2 0/3] improve NVMe multipath handling Nilay Shroff
2025-04-25 10:33 ` [RFC PATCHv2 1/3] nvme-multipath: introduce delayed removal of the multipath head node Nilay Shroff
2025-04-25 14:43   ` Christoph Hellwig
2025-04-28  7:05     ` Nilay Shroff [this message]
2025-04-25 22:26   ` Sagi Grimberg
2025-04-28  7:39     ` Nilay Shroff
2025-04-25 10:33 ` [RFC PATCHv2 2/3] nvme: introduce multipath_head_always module param Nilay Shroff
2025-04-25 14:45   ` Christoph Hellwig
2025-04-29  6:26     ` Nilay Shroff
2025-04-28  6:57   ` Hannes Reinecke
2025-04-28  7:39     ` Nilay Shroff
2025-04-29  5:49       ` Hannes Reinecke
2025-04-29  6:24         ` Nilay Shroff
2025-04-29  7:01           ` Hannes Reinecke
2025-04-29  7:15             ` Nilay Shroff
2025-04-25 10:33 ` [RFC PATCHv2 3/3] nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk Nilay Shroff
2025-04-25 14:46   ` Christoph Hellwig
2025-04-25 22:27   ` Sagi Grimberg
