From: Nilay Shroff <nilay@linux.ibm.com>
To: John Garry <john.g.garry@oracle.com>,
hch@lst.de, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com,
martin.petersen@oracle.com,
james.bottomley@hansenpartnership.com, hare@suse.com
Cc: jmeneghi@redhat.com, linux-nvme@lists.infradead.org,
linux-scsi@vger.kernel.org, michael.christie@oracle.com,
snitzer@kernel.org, bmarzins@redhat.com,
dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/13] libmultipath: Add path selection support
Date: Tue, 14 Apr 2026 17:13:25 +0530
Message-ID: <6dc8e0d7-9b2b-4867-9df7-6853f4b2fa05@linux.ibm.com>
In-Reply-To: <bc18ad6f-10b1-4a28-b88d-aed5754b968d@oracle.com>
On 4/14/26 3:33 PM, John Garry wrote:
> Hi Nilay,
>
>>>
>>> I think so, but we will need scsi to maintain such a count internally to support this policy. And for NVMe we will need some abstraction to lookup the per-controller QD for a mpath_device.
>>>
>> This raises another question regarding the current framework. From what I can see, all NVMe multipath I/O policies are currently supported for SCSI as well. Going forward, if we introduce a new I/O policy for NVMe that does not make sense for SCSI, how can we ensure that the new policy is supported only for NVMe and not for SCSI? Conversely, we may also want to introduce a policy that is relevant only for SCSI but not for NVMe.
>>
>> With the current framework, it seems difficult to restrict a policy to a specific transport. It appears that all policies are implicitly shared between NVMe and SCSI.
>>
>> Would it make sense to introduce some abstraction for I/O policies in the framework so that a given policy can be implemented and exposed only for the relevant transport (e.g., NVMe-only or SCSI-only), rather than requiring it to be supported by both?
>
> I am just coming back to this now....
>
> about the queue-depth iopolicy, why is depth per controller and not per NS (path)? The following does not mention:
>
> https://lore.kernel.org/linux-nvme/20240625122605.857462-3-jmeneghi@redhat.com/
>
> Is the idea that some controller may have another NS attached and have traffic there, and we need to account according to this also?
>
Yes, the idea is that congestion should be evaluated at the controller level rather than per-namespace.
In NVMe, multiple namespaces can be attached to the same controller, and all of them share the same
transport path and I/O queue resources (submission and completion queues). As a result, any contention
or congestion is fundamentally observed at the controller, and not at an individual namespace.

If we were to track queue depth per namespace, it could give a misleading view of the actual load on
the underlying path, since multiple namespaces may be contributing to the same set of queues. In contrast,
tracking queue depth per controller provides a more accurate representation of the total outstanding I/O
and the level of congestion on that path.

In a multipath configuration, this allows us to compare controllers directly. For example, if one controller
has a lower queue depth than another, it is likely experiencing less contention and may offer lower latency,
making it a better candidate for forwarding I/O.
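
To make the point concrete, here is a minimal userspace sketch of the idea, not the in-kernel code. It assumes a single in-flight counter per controller that every namespace behind that controller bumps on submission and drops on completion. The names demo_ctrl, demo_path, nr_active and the demo_*() helpers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller: one counter shared by all
 * namespaces attached to it. */
struct demo_ctrl {
	atomic_int nr_active;	/* I/Os in flight across every namespace */
};

/* Hypothetical stand-in for a path: one namespace reached through one
 * controller. */
struct demo_path {
	struct demo_ctrl *ctrl;
	int nsid;
};

/* Account a submitted I/O against the controller, not the namespace. */
static void demo_start_io(struct demo_path *p)
{
	atomic_fetch_add(&p->ctrl->nr_active, 1);
}

/* Drop the controller-wide count when the I/O completes. */
static void demo_end_io(struct demo_path *p)
{
	int prev = atomic_fetch_sub(&p->ctrl->nr_active, 1);

	assert(prev > 0);
	(void)prev;
}

/* Return the path whose controller has the fewest in-flight I/Os.
 * The first path wins a tie; NULL is returned when n == 0. */
static struct demo_path *demo_select_path(struct demo_path *paths, size_t n)
{
	struct demo_path *best = NULL;
	int best_depth = 0;

	for (size_t i = 0; i < n; i++) {
		int depth = atomic_load(&paths[i].ctrl->nr_active);

		if (!best || depth < best_depth) {
			best = &paths[i];
			best_depth = depth;
		}
	}
	return best;
}
```

With two paths to namespace 1, one through controller A and one through controller B, two I/Os outstanding on namespace 2 through controller A are enough for demo_select_path() to pick the path through controller B for namespace 1. A per-namespace counter would see both namespace 1 paths as idle and could not tell them apart.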
Thanks,
--Nilay