* Fwd: [LSF/MM/BPF TOPIC] Native SCSI multipath support
[not found] <69349b51-72c2-47f9-948f-f89843af62e4@oracle.com>
@ 2026-02-13 15:26 ` Xose Vazquez Perez
2026-02-21 17:41 ` Mike Snitzer
1 sibling, 0 replies; 9+ messages in thread
From: Xose Vazquez Perez @ 2026-02-13 15:26 UTC (permalink / raw)
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, DM_DEVEL-ML
FYI: https://lore.kernel.org/linux-scsi/69349b51-72c2-47f9-948f-f89843af62e4@oracle.com/dm
-------- Forwarded Message --------
Subject: [LSF/MM/BPF TOPIC] Native SCSI multipath support
Date: Fri, 13 Feb 2026 14:19:11 +0000
From: John Garry <john.g.garry@oracle.com>
Organization: Oracle Corporation
To: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-scsi@vger.kernel.org
At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's discuss this topic at LSFMM.
The idea for this is that SCSI could natively support multipath, like how NVMe host driver does today. It is intended as an alternative to dm-multipath support.
I have been working on the implementation and I plan to post patches in the next cycle. I am looking at a 3-stage approach:
a. create a driver-agnostic multipath library, very heavily based on NVMe host multipath support.
The library would support features such as path management, path selection/iopolicy, failover recovery, PR, delayed removal, gendisk management etc.
b. switch NVMe over to use this library
c. add native SCSI multipath support based on this common library
Thanks,
John
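[Editor's sketch] The path selection and failover features listed in (a) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the names `Path`, `MultipathHead`, `select_path` and `submit` are hypothetical and are not taken from the NVMe host code or from the patches mentioned above. It shows a round-robin policy that skips unusable paths and fails over when I/O on the selected path fails.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Path:
    # Hypothetical per-path state; a real implementation tracks far more.
    name: str
    usable: bool = True


@dataclass
class MultipathHead:
    # Hypothetical container for all paths to one logical device.
    paths: List[Path] = field(default_factory=list)
    _next: int = 0

    def select_path(self) -> Optional[Path]:
        """Round-robin over usable paths; return None if none is usable."""
        n = len(self.paths)
        for i in range(n):
            idx = (self._next + i) % n
            if self.paths[idx].usable:
                self._next = (idx + 1) % n
                return self.paths[idx]
        return None

    def submit(self, io_fn: Callable[[Path], bool]) -> Path:
        """Issue I/O; on failure mark the path unusable and try the next one."""
        while True:
            path = self.select_path()
            if path is None:
                raise IOError("no usable path")
            if io_fn(path):
                return path
            path.usable = False  # failover: stop selecting this path
```

In this model a failed path stays unusable until something else sets `usable` back to `True`; how and when paths are reinstated is a separate design question that the sketch does not address.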
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
[not found] <69349b51-72c2-47f9-948f-f89843af62e4@oracle.com>
2026-02-13 15:26 ` Fwd: [LSF/MM/BPF TOPIC] Native SCSI multipath support Xose Vazquez Perez
@ 2026-02-21 17:41 ` Mike Snitzer
2026-02-24 9:56 ` John Garry
2026-02-25 0:46 ` Benjamin Marzinski
1 sibling, 2 replies; 9+ messages in thread
From: Mike Snitzer @ 2026-02-21 17:41 UTC (permalink / raw)
To: John Garry; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel
On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> discuss this topic at LSFMM.
>
> The idea for this is that SCSI could natively support multipath, like how
> NVMe host driver does today. It is intended as an alternative to
> dm-multipath support.
>
> I have been working on the implementation and I plan to post patches in the
> next cycle. I am looking at a 3-stage approach:
> a. create a driver-agnostic multipath library, very heavily based on NVMe
> host multipath support.
> The library would support features such as path management, path
> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> management etc.
> b. switch NVMe over to use this library
I can appreciate that the kernel to userspace interface of DM
multipath is clearly unwanted (hence NVMe multipath and now SCSI
multipath).
But you should really be switching DM-multipath over to using it too;
or at least detailing _why_ the core of DM multipath
(drivers/md/dm-mpath.c) cannot be updated to use this common backend
library.
This line of work makes little sense to me if it just ignores
dm-multipath.
Mike
> c. add native SCSI multipath support based on this common library
>
> Thanks,
> John
>
>
>
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-02-21 17:41 ` Mike Snitzer
@ 2026-02-24 9:56 ` John Garry
2026-02-25 0:46 ` Benjamin Marzinski
1 sibling, 0 replies; 9+ messages in thread
From: John Garry @ 2026-02-24 9:56 UTC (permalink / raw)
To: Mike Snitzer; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel
On 21/02/2026 17:41, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>> discuss this topic at LSFMM.
>>
>> The idea for this is that SCSI could natively support multipath, like how
>> NVMe host driver does today. It is intended as an alternative to
>> dm-multipath support.
>>
>> I have been working on the implementation and I plan to post patches in the
>> next cycle. I am looking at a 3-stage approach:
>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>> host multipath support.
>> The library would support features such as path management, path
>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>> management etc.
>> b. switch NVMe over to use this library
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
>
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
>
> This line of work makes little sense to me if it just ignores
> dm-multipath.
What I am proposing is refactoring the NVMe multipath code so that it
can be used for SCSI as well.
I am not sure where to begin on saying that this library would be
unsuitable for dm-mpath. For a start, the bio flow is totally different.
Then path selection is totally different.
Anyway, I'll post the code this week and you can check it.
John
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-02-21 17:41 ` Mike Snitzer
2026-02-24 9:56 ` John Garry
@ 2026-02-25 0:46 ` Benjamin Marzinski
2026-02-25 8:11 ` Hannes Reinecke
1 sibling, 1 reply; 9+ messages in thread
From: Benjamin Marzinski @ 2026-02-25 0:46 UTC (permalink / raw)
To: Mike Snitzer
Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel
On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> > At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> > discuss this topic at LSFMM.
> >
> > The idea for this is that SCSI could natively support multipath, like how
> > NVMe host driver does today. It is intended as an alternative to
> > dm-multipath support.
> >
> > I have been working on the implementation and I plan to post patches in the
> > next cycle. I am looking at a 3-stage approach:
> > a. create a driver-agnostic multipath library, very heavily based on NVMe
> > host multipath support.
> > The library would support features such as path management, path
> > selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> > management etc.
> > b. switch NVMe over to use this library
>
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
>
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
>
> This line of work makes little sense to me if it just ignores
> dm-multipath.
>
> Mike
Thinking about this work from a DM multipath perspective, I'm more
interested in how much it plans to handle the more annoying niche cases
of dealing with SCSI devices, like paths that confidently report that
they are able to accept IO, only to fail all IO sent to them. Also, I
wonder how/if this is planning on handling Persistent Reservations. The
arrays, I assume, are still going to see this as a collection of I_T
Nexuses (some of which may be down and unable to accept commands at any
given time, and to which new ones may be added) instead of a single one.
I also think this would be useful to talk about at LSF.
-Ben
>
> > c. add native SCSI multipath support based on this common library
> >
> > Thanks,
> > John
> >
> >
> >
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-02-25 0:46 ` Benjamin Marzinski
@ 2026-02-25 8:11 ` Hannes Reinecke
2026-02-25 9:26 ` John Garry
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2026-02-25 8:11 UTC (permalink / raw)
To: Benjamin Marzinski, Mike Snitzer
Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel
On 2/25/26 01:46, Benjamin Marzinski wrote:
> On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
>> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>>> discuss this topic at LSFMM.
>>>
>>> The idea for this is that SCSI could natively support multipath, like how
>>> NVMe host driver does today. It is intended as an alternative to
>>> dm-multipath support.
>>>
>>> I have been working on the implementation and I plan to post patches in the
>>> next cycle. I am looking at a 3-stage approach:
>>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>>> host multipath support.
>>> The library would support features such as path management, path
>>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>>> management etc.
>>> b. switch NVMe over to use this library
>>
>> I can appreciate that the kernel to userspace interface of DM
>> multipath is clearly unwanted (hence NVMe multipath and now SCSI
>> multipath).
>>
>> But you should really be switching DM-multipath over to using it too;
>> or at least detailing _why_ the core of DM multipath
>> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
>> library.
>>
>> This line of work makes little sense to me if it just ignores
>> dm-multipath.
>>
>> Mike
>
> Thinking about this work from a DM multipath perspective, I'm more
> interested in how much it plans to handle the more annoying niche cases
> of dealing with SCSI devices, like paths that confidently report that
> they are able to accept IO, only to fail all IO sent to them. Also, I
> wonder how/if this is planning on handling Persistent Reservations. The
> arrays, I assume, are still going to see this as a collection of I_T
> Nexuses (some of which may be down and unable to accept commands at any
> given time, and to which new ones may be added) instead of a single one.
>
> I also think this would be useful to talk about at LSF.
>
And that even makes me wonder whether we should have a discussion about
persistent reservations at LSF, too.
I seem to be involved in discussions about PRs from various angles now
(live migration seems to want to join the fray), so maybe we could get
together to discuss things.
And I _still_ want to have a blktests for persistent reservations ...
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-02-25 8:11 ` Hannes Reinecke
@ 2026-02-25 9:26 ` John Garry
2026-03-10 17:12 ` Ewan Milne
0 siblings, 1 reply; 9+ messages in thread
From: John Garry @ 2026-02-25 9:26 UTC (permalink / raw)
To: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer
Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel
On 25/02/2026 08:11, Hannes Reinecke wrote:
> And I _still_ want to have a blktests for persistent reservations ...
nvme/054 supports resv testing.
For scsi PR, we could use util-linux, which has blkpr.
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-02-25 9:26 ` John Garry
@ 2026-03-10 17:12 ` Ewan Milne
2026-03-10 18:05 ` John Garry
0 siblings, 1 reply; 9+ messages in thread
From: Ewan Milne @ 2026-03-10 17:12 UTC (permalink / raw)
To: John Garry
Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc,
linux-nvme, linux-block, linux-scsi, dm-devel
Hi John-
Sorry, I was out for a couple of weeks and have been catching up...
Re: sg support, there were issues in the past with people attempting to
do SG_IO through dm-mp assuming that DM would handle retry on other
paths, which it didn't. You also have to be aware that non-idempotent
commands don't work right if retried. My recommendation would be to
avoid implementing it. Although there has been interest in a better way
to do multipathed "generic" commands (e.g. virt pass-through), I think
that is a more involved project than you want to do here.
I see the discussion has progressed re: ALUA support in your later patch
postings, which is good. As Hannes said, a Native SCSI MP would be
useless without it. You don't have to support the older non-ALUA
mechanisms though, those arrays are way, way old.
SCSI does not have the equivalent of NVMe's AEN, so you need a way to
ensure that your ALUA info is up-to-date. DM-MP's path checker normally
does this by sending commands on which the Unit Attention can be
reported so that the code can fetch up-to-date ALUA info. Hannes made
some optimizations years ago to avoid excessive RTPG commands with large
numbers of LUNs which we would need also.
It will be necessary for the functionality to be enabled via a module
option, at least initially. Introducing this in general use will be a
big change for people who have Enterprise SAN configurations with their
own custom path monitoring tools. I believe we put some functionality
into userspace multipath tools so e.g. Native NVMe devices can still be
monitored/observed which made things a bit easier for people.
Unfortunately I will not be able to attend LSF/MM this year. I am
sure it will be a good discussion.
-Ewan
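[Editor's sketch] The point about keeping ALUA info current without sending an RTPG per LUN can be illustrated with a small user-space model. It is a sketch under stated assumptions, not a description of the existing kernel code or of the optimizations mentioned above: the class `AluaCache`, its methods, and the `send_rtpg` callback are hypothetical names. It assumes that LUNs sharing a target port group share one ALUA state, so a Unit Attention on any LUN only marks its group stale and a later refresh issues one RTPG per stale group.

```python
from typing import Callable, Dict, Hashable, Set


class AluaCache:
    """Hypothetical cache of ALUA state, keyed by target port group."""

    def __init__(self, send_rtpg: Callable[[Hashable], str]):
        self.send_rtpg = send_rtpg                    # issues one RTPG, returns state
        self.group_of: Dict[Hashable, Hashable] = {}  # LUN -> port group id
        self.state: Dict[Hashable, str] = {}          # port group id -> ALUA state
        self.stale: Set[Hashable] = set()             # groups needing a refresh

    def add_lun(self, lun: Hashable, group_id: Hashable) -> None:
        self.group_of[lun] = group_id
        self.stale.add(group_id)

    def unit_attention(self, lun: Hashable) -> None:
        """A Unit Attention on any LUN only marks its port group as stale."""
        self.stale.add(self.group_of[lun])

    def refresh(self) -> int:
        """Issue one RTPG per stale group; return how many were sent."""
        issued = 0
        for group_id in sorted(self.stale):
            self.state[group_id] = self.send_rtpg(group_id)
            issued += 1
        self.stale.clear()
        return issued
```

With 100 LUNs spread over two port groups, a burst of Unit Attentions on every LUN in one group results in a single RTPG on the next `refresh()` in this model.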
On Wed, Feb 25, 2026 at 4:27 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 25/02/2026 08:11, Hannes Reinecke wrote:
> > And I _still_ want to have a blktests for persistent reservations ...
> nvme/054 supports resv testing.
>
> For scsi PR, we could use util-linux, which has blkpr.
>
>
>
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-03-10 17:12 ` Ewan Milne
@ 2026-03-10 18:05 ` John Garry
2026-03-10 18:42 ` Benjamin Marzinski
0 siblings, 1 reply; 9+ messages in thread
From: John Garry @ 2026-03-10 18:05 UTC (permalink / raw)
To: Ewan Milne
Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc,
linux-nvme, linux-block, linux-scsi, dm-devel
On 10/03/2026 17:12, Ewan Milne wrote:
> Hi John-
>
> Sorry, I was out for a couple of weeks and have been catching up...
>
> Re: sg support, there were issues in the past with people attempting to
> do SG_IO through dm-mp assuming that DM would handle retry on other
> paths, which it didn't. You also have to be aware that non-idempotent
> commands don't work right if retried. My recommendation would be to
> avoid implementing it. Although there has been interest in a better way
> to do multipathed "generic" commands (e.g. virt pass-through), I think
> that is a more involved project than you want to do here.
Understood, my current plan is not to have a multipathed sg driver - we
will still have the per-scsi device/path sg device.
>
> I see the discussion has progressed re: ALUA support in your later patch
> postings, which is good. As Hannes said, a Native SCSI MP would be
> useless without it. You don't have to support the older non-ALUA
> mechanisms though, those arrays are way, way old.
>
> SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> ensure that your ALUA info is up-to-date. DM-MP's path checker normally
> does this by sending commands on which the Unit Attention can be
> reported so that the code can fetch up-to-date ALUA info. Hannes made
> some optimizations years ago to avoid excessive RTPG commands with large
> numbers of LUNs which we would need also.
Hannes is suggesting to not have a kernel path checker, so let me know
if there is any issue with that.
>
> It will be necessary for the functionality to be enabled via a module
> option, at least initially. Introducing this in general use will be a
> big change for people who have Enterprise SAN configurations with their
> own custom path monitoring tools. I believe we put some functionality
> into userspace multipath tools so e.g. Native NVMe devices can still be
> monitored/observed which made things a bit easier for people.
>
Sure, if you check my patches, we disable by default and enable via a
module param
cheers
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
2026-03-10 18:05 ` John Garry
@ 2026-03-10 18:42 ` Benjamin Marzinski
0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2026-03-10 18:42 UTC (permalink / raw)
To: John Garry
Cc: Ewan Milne, Hannes Reinecke, Mike Snitzer, lsf-pc, linux-nvme,
linux-block, linux-scsi, dm-devel
On Tue, Mar 10, 2026 at 06:05:29PM +0000, John Garry wrote:
> On 10/03/2026 17:12, Ewan Milne wrote:
> > Hi John-
> >
> > Sorry, I was out for a couple of weeks and have been catching up...
> >
> > Re: sg support, there were issues in the past with people attempting to
> > do SG_IO through dm-mp assuming that DM would handle retry on other
> > paths, which it didn't. You also have to be aware that non-idempotent
> > commands don't work right if retried. My recommendation would be to
> > avoid implementing it. Although there has been interest in a better way
> > to do multipathed "generic" commands (e.g. virt pass-through), I think
> > that is a more involved project than you want to do here.
>
> Understood, my current plan is not to have a multipathed sg driver - we will
> still have the per-scsi device/path sg device.
The sd devices still handle SG_IO ioctls. For instance, the persistent
reservation ioctls are SG_IO ioctls. But like I said elsewhere, getting
multipathed persistent reservations working safely is going to be a
large effort, better left for later.
But even without them, to handle things like sending SCSI WRITE commands
over SG_IO ioctls, in an ideal world, you would want to be able to retry
on other paths in the ioctl code. However, like Ewan mentioned, there
are times when you don't want to retry the ioctl. Just sending SG_IO
ioctls to one path and letting them fail if they fail down that path is
the safest way for now, even if there are times when that SG_IO ioctl
could complete successfully down another path.
-Ben
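[Editor's sketch] The trade-off described above, sending a pass-through command down one path and letting it fail versus retrying it on other paths, can be shown with a small user-space model. This is only an illustration under stated assumptions: `passthrough`, `send` and `retry_other_paths` are hypothetical names, and the device names in the usage note are made up. The default models the "safest for now" behaviour, where the first path's result is final.

```python
from typing import Callable, Iterable


def passthrough(paths: Iterable[str],
                send: Callable[[str], bool],
                retry_other_paths: bool = False) -> bool:
    """Issue one pass-through command.

    With retry_other_paths=False (the default), the command is sent on the
    first path only and its result is returned as-is, so a non-idempotent
    command is never issued twice. With retry_other_paths=True, each
    further path is tried until one succeeds.
    """
    for path in paths:
        if send(path):
            return True
        if not retry_other_paths:
            return False
    return False
```

For example, with paths `["sda", "sdb"]` where only `sdb` works, the default call fails after a single attempt on `sda`, while `retry_other_paths=True` succeeds on `sdb` at the cost of having issued the command twice.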
>
> >
> > I see the discussion has progressed re: ALUA support in your later patch
> > postings, which is good. As Hannes said, a Native SCSI MP would be
> > useless without it. You don't have to support the older non-ALUA
> > mechanisms though, those arrays are way, way old.
> >
> > SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> > ensure that your ALUA info is up-to-date. DM-MP's path checker normally
> > does this by sending commands on which the Unit Attention can be
> > reported so that the code can fetch up-to-date ALUA info. Hannes made
> > some optimizations years ago to avoid excessive RTPG commands with large
> > numbers of LUNs which we would need also.
>
> Hannes is suggesting to not have a kernel path checker, so let me know if
> there is any issue with that.
>
> >
> > It will be necessary for the functionality to be enabled via a module
> > option, at least initially. Introducing this in general use will be a
> > big change for people who have Enterprise SAN configurations with their
> > own custom path monitoring tools. I believe we put some functionality
> > into userspace multipath tools so e.g. Native NVMe devices can still be
> > monitored/observed which made things a bit easier for people.
> >
>
> Sure, if you check my patches, we disable by default and enable via a module
> param
>
> cheers