public inbox for linux-block@vger.kernel.org
* [LSF/MM/BPF TOPIC] Native SCSI multipath support
@ 2026-02-13 14:19 John Garry
  2026-02-13 17:21 ` Hannes Reinecke
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: John Garry @ 2026-02-13 14:19 UTC (permalink / raw)
  To: lsf-pc, linux-nvme, linux-block, linux-scsi

At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
Let's discuss this topic at LSFMM.

The idea for this is that SCSI could natively support multipath, like 
how NVMe host driver does today. It is intended as an alternative to 
dm-multipath support.

I have been working on the implementation and I plan to post patches in 
the next cycle. I am looking at a 3-stage approach:
a. create a driver-agnostic multipath library, very heavily based on 
NVMe host multipath support.
The library would support features such as path management, path 
selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
management etc.
b. switch NVMe over to use this library
c. add native SCSI multipath support based on this common library
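
As a rough illustration of the path selection/iopolicy part of such a library, a round-robin policy with failover could look like the sketch below. It is only a sketch: the names mpath_path, mpath_head and mpath_select_rr are hypothetical and are not taken from the NVMe code or from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, for illustration only. */
struct mpath_path {
	int id;        /* illustrative path identifier */
	bool usable;   /* false once the path has failed */
};

struct mpath_head {
	struct mpath_path *paths;
	size_t nr_paths;
	size_t last;   /* index of the path chosen last time */
};

/*
 * Round-robin selection: start after the previously used path and return
 * the first usable one. Returns NULL when no path is usable; a real
 * implementation would then requeue or fail the I/O.
 */
struct mpath_path *mpath_select_rr(struct mpath_head *head)
{
	assert(head != NULL);

	for (size_t i = 1; i <= head->nr_paths; i++) {
		size_t idx = (head->last + i) % head->nr_paths;

		if (head->paths[idx].usable) {
			head->last = idx;
			return &head->paths[idx];
		}
	}
	return NULL;
}
```

Marking a path as not usable is enough to make later selections skip it, which is the failover behaviour in its simplest form.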

Thanks,
John




* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-13 14:19 [LSF/MM/BPF TOPIC] Native SCSI multipath support John Garry
@ 2026-02-13 17:21 ` Hannes Reinecke
  2026-02-14  9:42   ` John Garry
  2026-02-17 19:33 ` Bart Van Assche
  2026-02-21 17:41 ` Mike Snitzer
  2 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-13 17:21 UTC (permalink / raw)
  To: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On 2/13/26 15:19, John Garry wrote:
> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
> Let's discuss this topic at LSFMM.
> 
> The idea for this is that SCSI could natively support multipath, like 
> how NVMe host driver does today. It is intended as an alternative to
> dm-multipath support.
> 
> I have been working on the implementation and I plan to post patches in 
> the next cycle. I am looking at a 3-stage approach:
> a. create a driver-agnostic multipath library, very heavily based on 
> NVMe host multipath support.
> The library would support features such as path management, path 
> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
> management etc.
> b. switch NVMe over to use this library
> c. add native SCSI multipath support based on this common library
> 
Go for it, John!

I'd be very interested in that.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-13 17:21 ` Hannes Reinecke
@ 2026-02-14  9:42   ` John Garry
  2026-02-16  7:26     ` Hannes Reinecke
  2026-02-16 16:32     ` Hannes Reinecke
  0 siblings, 2 replies; 22+ messages in thread
From: John Garry @ 2026-02-14  9:42 UTC (permalink / raw)
  To: Hannes Reinecke, lsf-pc, linux-nvme, linux-block, linux-scsi

On 13/02/2026 17:21, Hannes Reinecke wrote:
>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
>> Let's discuss this topic at LSFMM.
>>
>> The idea for this is that SCSI could natively support multipath, like 
>> how NVMe host driver does today. It is intended as an alternative to 
>> dm-multipath support.
>>
>> I have been working on the implementation and I plan to post patches 
>> in the next cycle. I am looking at a 3-stage approach:
>> a. create a driver-agnostic multipath library, very heavily based on 
>> NVMe host multipath support.
>> The library would support features such as path management, path 
>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
>> management etc.
>> b. switch NVMe over to use this library
>> c. add native SCSI multipath support based on this common library
>>
> Go for it, John!
> 
> I'd be very interested in that.

cheers, in the meantime, I have some comments:

- I need to test PRs for both NVMe and SCSI, any advice on that would be 
good. I don't think that blktests covers it. I did see Christoph mention 
a testsuite at: 
https://lore.kernel.org/linux-nvme/1438672271-11309-1-git-send-email-hch@lst.de/ 
- I can check that.

- I am still not sure on whether we require a multipath version of sg. 
We can still have per-path sg. NVMe does have a multipath nvme-generic 
dev, but that just handles IOCTLs/uring cmd, and nothing like sg 
read/write fops

- I have not tried to detangle ALUA support from SCSI DH, so no ALUA 
support yet



* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-14  9:42   ` John Garry
@ 2026-02-16  7:26     ` Hannes Reinecke
  2026-02-16 16:32     ` Hannes Reinecke
  1 sibling, 0 replies; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-16  7:26 UTC (permalink / raw)
  To: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On 2/14/26 10:42, John Garry wrote:
> On 13/02/2026 17:21, Hannes Reinecke wrote:
>>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
>>> Let's discuss this topic at LSFMM.
>>>
>>> The idea for this is that SCSI could natively support multipath, like 
>>> how NVMe host driver does today. It is intended as an alternative to 
>>> dm-multipath support.
>>>
>>> I have been working on the implementation and I plan to post patches 
>>> in the next cycle. I am looking at a 3-stage approach:
>>> a. create a driver-agnostic multipath library, very heavily based on 
>>> NVMe host multipath support.
>>> The library would support features such as path management, path 
>>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
>>> management etc.
>>> b. switch NVMe over to use this library
>>> c. add native SCSI multipath support based on this common library
>>>
>> Go for it, John!
>>
>> I'd be very interested in that.
> 
> cheers, in the meantime, I have some comments:
> 
> - I need to test PRs for both NVMe and SCSI, any advice on that would be 
> good. I don't think that blktests covers it. I did see Christoph mention 
> a testsuite at:
> https://lore.kernel.org/linux-nvme/1438672271-11309-1-git-send-email-hch@lst.de/
> - I can check that.
> 
Well, you might have seen the discussion on the device-mapper list, 
where stefanha is implementing generic PRs for dm-multipathing.
Or rather, trying to. We might need to revisit that and see what we
could be doing on the SCSI side.
Maybe we should be having a session about PRs at LSF?

> - I am still not sure on whether we require a multipath version of sg. 
> We can still have per-path sg. NVMe does have a multipath nvme-generic 
> dev, but that just handles IOCTLs/uring cmd, and nothing like sg
> read/write fops
> 
'sg' is primarily for testing 'raw' SCSI commands. (And dastardly
complex to boot). I really would keep it in its current form, and not
try to mimic something with SCSI multipathing.

> - I have not tried to detangle ALUA support from SCSI DH, so no ALUA 
> support yet
> 
Ouch. But that is the key point of the implementation; ALUA provides
_all_ the information required for multipathing, so how can you _not_
have support for it?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-14  9:42   ` John Garry
  2026-02-16  7:26     ` Hannes Reinecke
@ 2026-02-16 16:32     ` Hannes Reinecke
  2026-02-16 16:55       ` John Garry
  2026-02-21 17:47       ` Mike Snitzer
  1 sibling, 2 replies; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-16 16:32 UTC (permalink / raw)
  To: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On 2/14/26 10:42, John Garry wrote:
> On 13/02/2026 17:21, Hannes Reinecke wrote:
>>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
>>> Let's discuss this topic at LSFMM.
>>>
>>> The idea for this is that SCSI could natively support multipath, like 
>>> how NVMe host driver does today. It is intended as an alternative to 
>>> dm-multipath support.
>>>
>>> I have been working on the implementation and I plan to post patches 
>>> in the next cycle. I am looking at a 3-stage approach:
>>> a. create a driver-agnostic multipath library, very heavily based on 
>>> NVMe host multipath support.
>>> The library would support features such as path management, path 
>>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
>>> management etc.
>>> b. switch NVMe over to use this library
>>> c. add native SCSI multipath support based on this common library
>>>
>> Go for it, John!
>>
>> I'd be very interested in that.
> 
> cheers, in the meantime, I have some comments:
> 
> - I need to test PRs for both NVMe and SCSI, any advice on that would be 
> good. I don't think that blktests covers it. I did see Christoph mention 
> a testsuite at:
> https://lore.kernel.org/linux-nvme/1438672271-11309-1-git-send-email-hch@lst.de/
> - I can check that.
> 
Well, you might have seen the discussion on the device-mapper list, 
where stefanha is implementing generic PRs for dm-multipathing.
Or rather, trying to. We might need to revisit that and see what we
could be doing on the SCSI side.
Maybe we should be having a session about PRs at LSF?

> - I am still not sure on whether we require a multipath version of sg. 
> We can still have per-path sg. NVMe does have a multipath nvme-generic 
> dev, but that just handles IOCTLs/uring cmd, and nothing like sg
> read/write fops
> 
'sg' is primarily for testing 'raw' SCSI commands. (And dastardly
complex to boot). I really would keep it in its current form, and not
try to mimic something with SCSI multipathing.

> - I have not tried to detangle ALUA support from SCSI DH, so no ALUA 
> support yet
> 
Ouch. But that is the key point of the implementation; ALUA provides
_all_ the information required for multipathing, so how can you _not_
have support for it?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-16 16:32     ` Hannes Reinecke
@ 2026-02-16 16:55       ` John Garry
  2026-02-17  7:05         ` Hannes Reinecke
  2026-02-21 17:47       ` Mike Snitzer
  1 sibling, 1 reply; 22+ messages in thread
From: John Garry @ 2026-02-16 16:55 UTC (permalink / raw)
  To: Hannes Reinecke, lsf-pc, linux-nvme, linux-block, linux-scsi

On 16/02/2026 16:32, Hannes Reinecke wrote:
>> cheers, in the meantime, I have some comments:
>>
>> - I need to test PRs for both NVMe and SCSI, any advice on that would 
>> be good. I don't think that blktests covers it. I did see Christoph 
>> mention a testsuite at:
>> https://lore.kernel.org/linux-nvme/1438672271-11309-1-git-send-email-hch@lst.de/
>> - I can check that.
>>
> Well, you might have seen the discussion on the device-mapper list, 
> where stefanha is implementing generic PRs for dm-multipathing.
> Or rather, trying to. We might need to revisit that and see what we
> could be doing on the SCSI side.
> Maybe we should be having a session about PRs at LSF?

maybe...

> 
>> - I am still not sure on whether we require a multipath version of sg. 
>> We can still have per-path sg. NVMe does have a multipath nvme-generic 
>> dev, but that just handles IOCTLs/uring cmd, and nothing like sg
>> read/write fops
>>
> 'sg' is primarily for testing 'raw' SCSI commands. (And dastardly
> complex to boot). I really would keep it in its current form, and not
> try to mimic something with SCSI multipathing.

I can get to scsi_ioctl() from the multipath sd device ioctl - 
sd_ioctl() - maybe that is enough.

> 
>> - I have not tried to detangle ALUA support from SCSI DH, so no ALUA 
>> support yet
>>
> Ouch. But that is the key point of the implementation; ALUA provides
> _all_ the information required for multipathing, so how can you _not_
> have support for it?

So far every path is just "optimised" and scsi_vpd_lun_id() is used to 
match scsi_devices ... ALUA support will be added, but if I were to do 
it now, it would just delay posting anything even further...
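
The matching step described here can be pictured with the small sketch below. It is a minimal userspace model under stated assumptions: the id string stands in for whatever scsi_vpd_lun_id() would return in the kernel, and mp_head, mp_table and mp_add_path are hypothetical names, not code from the pending patches. Every path is simply counted as usable, as in the "optimised" case above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HEADS 8
#define ID_LEN    64

/* Hypothetical multipath "head": one per unique LUN identifier. */
struct mp_head {
	char id[ID_LEN];
	int nr_paths;
};

struct mp_table {
	struct mp_head heads[MAX_HEADS];
	int nr_heads;
};

/*
 * Attach a path to the head whose ID matches, creating the head the first
 * time an ID is seen. Returns the head, or NULL if the ID is empty or too
 * long, or if the table is full.
 */
struct mp_head *mp_add_path(struct mp_table *t, const char *id)
{
	assert(t != NULL && id != NULL);

	size_t len = strlen(id);

	if (len == 0 || len >= ID_LEN)
		return NULL;

	for (int i = 0; i < t->nr_heads; i++) {
		if (strcmp(t->heads[i].id, id) == 0) {
			t->heads[i].nr_paths++;
			return &t->heads[i];
		}
	}

	if (t->nr_heads == MAX_HEADS)
		return NULL;

	struct mp_head *h = &t->heads[t->nr_heads++];

	memcpy(h->id, id, len + 1);   /* copy including the terminating NUL */
	h->nr_paths = 1;
	return h;
}
```

Two scsi_devices that report the same ID end up under one head; a device without a usable ID is left alone.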


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-16 16:55       ` John Garry
@ 2026-02-17  7:05         ` Hannes Reinecke
  0 siblings, 0 replies; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-17  7:05 UTC (permalink / raw)
  To: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On 2/16/26 17:55, John Garry wrote:
> On 16/02/2026 16:32, Hannes Reinecke wrote:
>>> cheers, in the meantime, I have some comments:
>>>
[ .. ]
>>> - I am still not sure on whether we require a multipath version of
>>> sg. We can still have per-path sg. NVMe does have a multipath
>>> nvme-generic dev, but that just handles IOCTLs/uring cmd, and nothing
>>> like sg read/write fops
>>>
>> 'sg' is primarily for testing 'raw' SCSI commands. (And dastardly
>> complex to boot). I really would keep it in its current form, and not
>> try to mimic something with SCSI multipathing.
> 
> I can get to scsi_ioctl() from the multipath sd device ioctl - 
> sd_ioctl() - maybe that is enough.
> 
Yeah, it should. In the end, the read/write path is less interesting
for 'raw' SCSI commands; I would think sg is more interesting for the
more obscure commands. And most of these would be path-specific anyway.

>>
>>> - I have not tried to detangle ALUA support from SCSI DH, so no ALUA 
>>> support yet
>>>
>> Ouch. But that is the key point of the implementation; ALUA provides
>> _all_ the information required for multipathing, so how can you _not_
>> have support for it?
> 
> So far every path is just "optimised" and scsi_vpd_lun_id() is used to 
> match scsi_devices ... ALUA support will be added, but if I were to do 
> it now, it would just delay posting anything even further...

Fair enough. Eagerly awaiting the patchset.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-13 14:19 [LSF/MM/BPF TOPIC] Native SCSI multipath support John Garry
  2026-02-13 17:21 ` Hannes Reinecke
@ 2026-02-17 19:33 ` Bart Van Assche
  2026-02-17 20:13   ` Keith Busch
  2026-02-18  8:23   ` John Garry
  2026-02-21 17:41 ` Mike Snitzer
  2 siblings, 2 replies; 22+ messages in thread
From: Bart Van Assche @ 2026-02-17 19:33 UTC (permalink / raw)
  To: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On 2/13/26 6:19 AM, John Garry wrote:
> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
> Let's discuss this topic at LSFMM.
> 
> The idea for this is that SCSI could natively support multipath, like 
> how NVMe host driver does today. It is intended as an alternative to
> dm-multipath support.
> 
> I have been working on the implementation and I plan to post patches in 
> the next cycle. I am looking at a 3-stage approach:
> a. create a driver-agnostic multipath library, very heavily based on 
> NVMe host multipath support.
> The library would support features such as path management, path 
> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
> management etc.
> b. switch NVMe over to use this library
> c. add native SCSI multipath support based on this common library

A minor comment: maybe "in-kernel" makes more clear what this proposal
is about than "native"?

More important: what will the performance impact be on SCSI devices that
do not need multipath support? UFS devices don't need multipath support
and soon (later this year) will support more than one million IOPS per
device. Further performance improvements are on the roadmap.

Thanks,

Bart.


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-17 19:33 ` Bart Van Assche
@ 2026-02-17 20:13   ` Keith Busch
  2026-02-18  2:39     ` [Lsf-pc] " Martin K. Petersen
  2026-02-18  8:23   ` John Garry
  1 sibling, 1 reply; 22+ messages in thread
From: Keith Busch @ 2026-02-17 20:13 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On Tue, Feb 17, 2026 at 11:33:12AM -0800, Bart Van Assche wrote:
> More important: what will the performance impact be on SCSI devices that
> do not need multipath support? UFS devices don't need multipath support
> and soon (later this year) will support more than one million IOPS per
> device. Further performance improvements are on the roadmap.

For nvme, we can detect if a device is multipath capable. If not, we
skip the multipath layer altogether so it has no performance impact. I'd
imagine the generic library version would similarly require an opt-in
approach that UFS simply wouldn't subscribe to.
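
The opt-in idea can be sketched as a routing decision made once per device. This is an illustration only: dev_info, submit_route and pick_route are hypothetical names, and neither the NVMe driver nor the proposed library is claimed to be structured this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum submit_route {
	ROUTE_DIRECT,     /* straight to the device queue, no multipath code */
	ROUTE_MULTIPATH,  /* via a shared multipath head */
};

/* Hypothetical per-device state, filled in at probe time. */
struct dev_info {
	bool driver_opted_in;    /* the driver subscribed to the library */
	bool multipath_capable;  /* the device reports it can be multipathed */
};

/*
 * Both conditions must hold for a device to go through the multipath
 * layer. A driver that never opts in (UFS in the example above) always
 * gets the direct route and pays nothing for the extra layer.
 */
enum submit_route pick_route(const struct dev_info *d)
{
	assert(d != NULL);

	if (d->driver_opted_in && d->multipath_capable)
		return ROUTE_MULTIPATH;
	return ROUTE_DIRECT;
}
```

Because the result is fixed at probe time, the I/O path of a non-multipath device never has to evaluate it again.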


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-17 20:13   ` Keith Busch
@ 2026-02-18  2:39     ` Martin K. Petersen
  2026-02-18  7:35       ` Hannes Reinecke
  0 siblings, 1 reply; 22+ messages in thread
From: Martin K. Petersen @ 2026-02-18  2:39 UTC (permalink / raw)
  To: Keith Busch via Lsf-pc
  Cc: Bart Van Assche, Keith Busch, John Garry, linux-nvme, linux-block,
	linux-scsi


Keith,

> For nvme, we can detect if a device is multipath capable.

Yep. Same with SCSI...

-- 
Martin K. Petersen


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-18  2:39     ` [Lsf-pc] " Martin K. Petersen
@ 2026-02-18  7:35       ` Hannes Reinecke
  2026-02-18  8:35         ` John Garry
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-18  7:35 UTC (permalink / raw)
  To: Martin K. Petersen, Keith Busch via Lsf-pc
  Cc: Bart Van Assche, Keith Busch, John Garry, linux-nvme, linux-block,
	linux-scsi

On 2/18/26 03:39, Martin K. Petersen wrote:
> 
> Keith,
> 
>> For nvme, we can detect if a device is multipath capable.
> 
> Yep. Same with SCSI...
> 
And that's how we handle things currently. We've learned from long and 
painful experiences that there is _NO_ way to automatically figure out
if a device is multipathed. That will always be an admin decision, so
there needs to be an opt-in mechanism.
And that needs to be set _prior_ to probing.
And you need a driver-specific opt-out, to disable all devices from
this driver for multipathing (UFS, USB, ATA, you name it).

Once you have that you can declare all ALUA capable devices with
a VPD page 83 device identifier as multipathed. Irrespective of
how many paths will show up.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-17 19:33 ` Bart Van Assche
  2026-02-17 20:13   ` Keith Busch
@ 2026-02-18  8:23   ` John Garry
  1 sibling, 0 replies; 22+ messages in thread
From: John Garry @ 2026-02-18  8:23 UTC (permalink / raw)
  To: Bart Van Assche, lsf-pc, linux-nvme, linux-block, linux-scsi

On 17/02/2026 19:33, Bart Van Assche wrote:
> On 2/13/26 6:19 AM, John Garry wrote:
>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. 
>> Let's discuss this topic at LSFMM.
>>
>> The idea for this is that SCSI could natively support multipath, like 
>> how NVMe host driver does today. It is intended as an alternative to 
>> dm-multipath support.
>>
>> I have been working on the implementation and I plan to post patches 
>> in the next cycle. I am looking at a 3-stage approach:
>> a. create a driver-agnostic multipath library, very heavily based on 
>> NVMe host multipath support.
>> The library would support features such as path management, path 
>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk 
>> management etc.
>> b. switch NVMe over to use this library
>> c. add native SCSI multipath support based on this common library
> 
> A minor comment: maybe "in-kernel" makes more clear what this proposal
> is about than "native"?

dm-multipath is also in-kernel. It just requires userspace for 
config/control.

The key difference is that "native" scsi multipathing provides a scsi 
disk which supports multipathing.


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-18  7:35       ` Hannes Reinecke
@ 2026-02-18  8:35         ` John Garry
  0 siblings, 0 replies; 22+ messages in thread
From: John Garry @ 2026-02-18  8:35 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen, Keith Busch via Lsf-pc
  Cc: Bart Van Assche, Keith Busch, linux-nvme, linux-block, linux-scsi

On 18/02/2026 07:35, Hannes Reinecke wrote:
> On 2/18/26 03:39, Martin K. Petersen wrote:
>>
>> Keith,
>>
>>> For nvme, we can detect if a device is multipath capable.
>>
>> Yep. Same with SCSI...
>>
> And that's how we handle things currently. We've learned from long and 
> painful experiences that there is _NO_ way to automatically figure out
> if a device is multipathed. That will always be an admin decision, so
> there needs to be an opt-in mechanism.
> And that needs to be set _prior_ to probing.
> And you need a driver-specific opt-out, to disable all devices from
> this driver for multipathing (UFS, USB, ATA, you name it).
> 
> Once you have that you can declare all ALUA capable devices with
> a VPD page 83 device identifier as multipathed. Irrespective of
> how many paths will show up.

What I am going to post introduces two mod params.

scsi_multipath.multipath and .enable_always

Any scsi device gets multipath treatment when either:

a. .multipath enabled and ALUA supported (scsi_device_tpgs() non-zero) 
and unique ID from VPD page 83

b. .enable_always enabled and unique ID from VPD page 83

And all of this will need a new SCSI MULTIPATH config option enabled.

NVMe host driver has similar params.
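
The two rules above reduce to a small predicate, sketched below as a standalone model. The struct and function names are hypothetical; the tpgs field only models a non-zero scsi_device_tpgs() result, and nothing here is taken from the patches themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the two module parameters described above. */
struct mp_params {
	bool multipath;      /* the plain multipath enable parameter */
	bool enable_always;  /* the .enable_always parameter */
};

/* Hypothetical summary of what was learned about a device at probe time. */
struct dev_caps {
	int tpgs;           /* models scsi_device_tpgs(); non-zero means ALUA */
	bool has_vpd83_id;  /* a unique ID was obtained from VPD page 83 */
};

bool wants_multipath(const struct mp_params *p, const struct dev_caps *c)
{
	assert(p != NULL && c != NULL);

	if (!c->has_vpd83_id)
		return false;                     /* both rules need a unique ID */
	if (p->enable_always)
		return true;                      /* rule b */
	return p->multipath && c->tpgs != 0;  /* rule a */
}
```

With both parameters off, no device is multipathed, which matches the opt-in behaviour asked for earlier in the thread.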


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-13 14:19 [LSF/MM/BPF TOPIC] Native SCSI multipath support John Garry
  2026-02-13 17:21 ` Hannes Reinecke
  2026-02-17 19:33 ` Bart Van Assche
@ 2026-02-21 17:41 ` Mike Snitzer
  2026-02-24  9:56   ` John Garry
  2026-02-25  0:46   ` Benjamin Marzinski
  2 siblings, 2 replies; 22+ messages in thread
From: Mike Snitzer @ 2026-02-21 17:41 UTC (permalink / raw)
  To: John Garry; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> discuss this topic at LSFMM.
> 
> The idea for this is that SCSI could natively support multipath, like how
> NVMe host driver does today. It is intended as an alternative to
> dm-multipath support.
> 
> I have been working on the implementation and I plan to post patches in the
> next cycle. I am looking at a 3-stage approach:
> a. create a driver-agnostic multipath library, very heavily based on NVMe
> host multipath support.
> The library would support features such as path management, path
> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> management etc.
> b. switch NVMe over to use this library

I can appreciate that the kernel to userspace interface of DM
multipath is clearly unwanted (hence NVMe multipath and now SCSI
multipath).

But you should really be switching DM-multipath over to using it too;
or at least detailing _why_ the core of DM multipath
(drivers/md/dm-mpath.c) cannot be updated to use this common backend
library.

This line of work makes little sense to me if it just ignores
dm-multipath.

Mike

> c. add native SCSI multipath support based on this common library
> 
> Thanks,
> John
> 
> 
> 


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-16 16:32     ` Hannes Reinecke
  2026-02-16 16:55       ` John Garry
@ 2026-02-21 17:47       ` Mike Snitzer
  1 sibling, 0 replies; 22+ messages in thread
From: Mike Snitzer @ 2026-02-21 17:47 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi

On Mon, Feb 16, 2026 at 05:32:04PM +0100, Hannes Reinecke wrote:
> On 2/14/26 10:42, John Garry wrote:
> > On 13/02/2026 17:21, Hannes Reinecke wrote:
> > > > At ALPSS 25 I presented a proposal for Native SCSI multipath
> > > > support. Let's discuss this topic at LSFMM.
> > > > 
> > > > The idea for this is that SCSI could natively support multipath,
> > > > like how NVMe host driver does today. It is intended as an
> > > > alternative to dm-multipath support.
> > > > 
> > > > I have been working on the implementation and I plan to post
> > > > patches in the next cycle. I am looking at a 3-stage approach:
> > > > a. create a driver-agnostic multipath library, very heavily
> > > > based on NVMe host multipath support.
> > > > The library would support features such as path management, path
> > > > selection/iopolicy, failover recovery, PR, delayed removal,
> > > > gendisk management etc.
> > > > b. switch NVMe over to use this library
> > > > c. add native SCSI multipath support based on this common library
> > > > 
> > > Go for it, John!
> > > 
> > > I'd be very interested in that.
> > 
> > cheers, in the meantime, I have some comments:
> > 
> > - I need to test PRs for both NVMe and SCSI, any advice on that would be
> > good. I don't think that blktests covers it. I did see Christoph mention
> > a testsuite at:
> > https://lore.kernel.org/linux-nvme/1438672271-11309-1-git-send-email-hch@lst.de/
> > - I can check that.
> > 
> Well, you might have seen the discussion on the device-mapper list, where
> stefanha is implementing generic PRs for dm-multipathing.
> Or rather, trying to. We might need to revisit that and see what we
> could be doing on the SCSI side.
> Maybe we should be having a session about PRs at LSF?
> 
> > - I am still not sure on whether we require a multipath version of sg.
> > We can still have per-path sg. NVMe does have a multipath nvme-generic
> > dev, but that just handles IOCTLs/uring cmd, and nothing like sg
> > read/write fops
> > 
> 'sg' is primarily for testing 'raw' SCSI commands. (And dastardly complex to
> boot). I really would keep it in its current form, and not
> try to mimic something with SCSI multipathing.
> 
> > - I have not tried to detangle ALUA support from SCSI DH, so no ALUA
> > support yet
> > 
> Ouch. But that is the key point of the implementation; ALUA provides
> _all_ the information required for multipathing, so how can you _not_
> have support for it?

Exactly: no ALUA support makes this, whatever it is, completely useless.


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-21 17:41 ` Mike Snitzer
@ 2026-02-24  9:56   ` John Garry
  2026-02-25  0:46   ` Benjamin Marzinski
  1 sibling, 0 replies; 22+ messages in thread
From: John Garry @ 2026-02-24  9:56 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 21/02/2026 17:41, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>> discuss this topic at LSFMM.
>>
>> The idea for this is that SCSI could natively support multipath, like how
>> NVMe host driver does today. It is intended as an alternative to
>> dm-multipath support.
>>
>> I have been working on the implementation and I plan to post patches in the
>> next cycle. I am looking at a 3-stage approach:
>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>> host multipath support.
>> The library would support features such as path management, path
>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>> management etc.
>> b. switch NVMe over to use this library
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
> 
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
> 
> This line of work makes little sense to me if it just ignores
> dm-multipath.

What I am proposing is refactoring the NVMe multipath code so that it 
can be used for SCSI as well.

I am not sure where to begin on saying that this library would be 
unsuitable for dm-mpath. For a start, the bio flow is totally different. 
Then path selection is totally different.

Anyway, I'll post the code this week and you can check it.

John


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-21 17:41 ` Mike Snitzer
  2026-02-24  9:56   ` John Garry
@ 2026-02-25  0:46   ` Benjamin Marzinski
  2026-02-25  8:11     ` Hannes Reinecke
  1 sibling, 1 reply; 22+ messages in thread
From: Benjamin Marzinski @ 2026-02-25  0:46 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> > At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> > discuss this topic at LSFMM.
> > 
> > The idea for this is that SCSI could natively support multipath, like how
> > NVMe host driver does today. It is intended as an alternative to
> > dm-multipath support.
> > 
> > I have been working on the implementation and I plan to post patches in the
> > next cycle. I am looking at a 3-stage approach:
> > a. create a driver-agnostic multipath library, very heavily based on NVMe
> > host multipath support.
> > The library would support features such as path management, path
> > selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> > management etc.
> > b. switch NVMe over to use this library
> 
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
> 
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
> 
> This line of work makes little sense to me if it just ignores
> dm-multipath.
> 
> Mike

Thinking about this work from a DM multipath perspective, I'm more
interested in how much it plans to handle the more annoying niche cases
of dealing with SCSI devices, like paths that confidently report that
they are able to accept IO, only to fail all IO sent to them. Also, I
wonder how/if this is planning on handling Persistent Reservations. The
arrays, I assume, are still going to see this as a collection of I_T
Nexuses (some of which may be down and unable to accept commands at any
given time, and to which new ones may be added) instead of a single one.

I also think this would be useful to talk about at LSF.

-Ben

> 
> > c. add native SCSI multipath support based on this common library
> > 
> > Thanks,
> > John
> > 
> > 
> > 



* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-25  0:46   ` Benjamin Marzinski
@ 2026-02-25  8:11     ` Hannes Reinecke
  2026-02-25  9:26       ` John Garry
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2026-02-25  8:11 UTC (permalink / raw)
  To: Benjamin Marzinski, Mike Snitzer
  Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 2/25/26 01:46, Benjamin Marzinski wrote:
> On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
>> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>>> discuss this topic at LSFMM.
>>>
>>> The idea for this is that SCSI could natively support multipath, like how
>>> NVMe host driver does today. It is intended as an alternative to
>>> dm-multipath support.
>>>
>>> I have been working on the implementation and I plan to post patches in the
>>> next cycle. I am looking at a 3-stage approach:
>>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>>> host multipath support.
>>> The library would support features such as path management, path
>>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>>> management etc.
>>> b. switch NVMe over to use this library
>>
>> I can appreciate that the kernel to userspace interface of DM
>> multipath is clearly unwanted (hence NVMe multipath and now SCSI
>> multipath).
>>
>> But you should really be switching DM-multipath over to using it too;
>> or at least detailing _why_ the core of DM multipath
>> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
>> library.
>>
>> This line of work makes little sense to me if it just ignores
>> dm-multipath.
>>
>> Mike
> 
> Thinking about this work from a DM multipath perspective, I'm more
> interested in how much it plans to handle the more annoying niche cases
> of dealing with SCSI devices, like paths that confidently report that
> they are able to accept IO, only to fail all IO sent to them. Also, I
> wonder how/if this is planning on handling Persistent Reservations. The
> arrays, I assume, are still going to see this as a collection of I_T
> Nexuses (some of which may be down and unable to accept commands at any
> given time, and to which new ones may be added) instead of a single one.
> 
> I also think this would be useful to talk about at LSF.
> 
And that even makes me wonder whether we should have a discussion about
persistent reservations at LSF, too.
I seem to be involved in discussions about PRs from various angles now
(live migration seems to want to join the fray), so maybe we could get
together to discuss things.

And I _still_ want to have a blktests for persistent reservations ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-25  8:11     ` Hannes Reinecke
@ 2026-02-25  9:26       ` John Garry
  2026-03-10 17:12         ` Ewan Milne
  0 siblings, 1 reply; 22+ messages in thread
From: John Garry @ 2026-02-25  9:26 UTC (permalink / raw)
  To: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer
  Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 25/02/2026 08:11, Hannes Reinecke wrote:
> And I _still_ want to have a blktests for persistent reservations ...
nvme/054 supports resv testing.

For scsi PR, we could use util-linux, which has blkpr.
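
As a rough illustration of what such a test would end up exercising: blkpr drives the block-layer PR ioctls declared in <linux/pr.h>. The sketch below builds the IOC_PR_REGISTER request number and its argument in Python. It assumes the asm-generic ioctl encoding (as on x86-64 and arm64; some architectures encode ioctls differently), and the helper names `pr_registration()` and `pr_register()` are made up for this example, not part of any existing tool.

```python
import fcntl
import struct

# asm-generic ioctl number layout (assumption: x86-64/arm64 style encoding).
_IOC_NRSHIFT, _IOC_TYPESHIFT, _IOC_SIZESHIFT, _IOC_DIRSHIFT = 0, 8, 16, 30
_IOC_WRITE = 1


def _iow(type_char, nr, size):
    """Python equivalent of the C _IOW() macro."""
    return ((_IOC_WRITE << _IOC_DIRSHIFT) | (size << _IOC_SIZESHIFT)
            | (ord(type_char) << _IOC_TYPESHIFT) | (nr << _IOC_NRSHIFT))


# struct pr_registration { __u64 old_key; __u64 new_key; __u32 flags; __u32 __pad; };
PR_REGISTRATION_FMT = "=QQII"
IOC_PR_REGISTER = _iow("p", 200, struct.calcsize(PR_REGISTRATION_FMT))


def pr_registration(old_key, new_key, flags=0):
    """Pack the ioctl argument; the trailing pad field is always zero."""
    return struct.pack(PR_REGISTRATION_FMT, old_key, new_key, flags, 0)


def pr_register(fd, old_key, new_key, flags=0):
    """Issue IOC_PR_REGISTER on an open block device file descriptor."""
    return fcntl.ioctl(fd, IOC_PR_REGISTER,
                       pr_registration(old_key, new_key, flags))
```

Only `pr_register()` touches a device; the other helpers can be checked without hardware, which may be handy when writing the test itself.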




* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-02-25  9:26       ` John Garry
@ 2026-03-10 17:12         ` Ewan Milne
  2026-03-10 18:05           ` John Garry
  0 siblings, 1 reply; 22+ messages in thread
From: Ewan Milne @ 2026-03-10 17:12 UTC (permalink / raw)
  To: John Garry
  Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc,
	linux-nvme, linux-block, linux-scsi, dm-devel

Hi John-

Sorry, I was out for a couple of weeks and have been catching up...

Re: sg support, there were issues in the past with people attempting to
do SG_IO through dm-mp, assuming that DM would handle retry on other
paths, which it didn't. You also have to be aware that non-idempotent
commands don't work right if retried. My recommendation would be to
avoid implementing it. Although there has been interest in a better way
to do multipathed "generic" commands (e.g. virt pass-through), I think
that is a more involved project than you want to do here.

I see the discussion has progressed re: ALUA support in your later patch
postings, which is good. As Hannes said, a Native SCSI MP would be
useless without it. You don't have to support the older non-ALUA
mechanisms though, those arrays are way, way old.

SCSI does not have the equivalent of NVMe's AEN, so you need a way to
ensure that your ALUA info is up-to-date. DM-MP's path checker normally
does this by sending commands on which the Unit Attention can be
reported so that the code can fetch up-to-date ALUA info. Hannes made
some optimizations years ago to avoid excessive RTPG commands with large
numbers of LUNs which we would need also.
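
To make the Unit Attention point concrete, here is a minimal sketch of the check involved: decode the sense data returned with a command and decide whether it signals an ALUA state change, i.e. whether fresh RTPG data is needed. The sense-data layout and the 2Ah/06h (ASYMMETRIC ACCESS STATE CHANGED) code come from SPC. The function names are invented for this example, and the sketch leaves out the coalescing of RTPG across many LUNs mentioned above.

```python
UNIT_ATTENTION = 0x06
# ASC/ASCQ 2Ah/06h: ASYMMETRIC ACCESS STATE CHANGED
ALUA_STATE_CHANGED = (0x2A, 0x06)


def decode_sense(sense):
    """Return (sense_key, asc, ascq), or None if the buffer is unusable."""
    if len(sense) < 1:
        return None
    code = sense[0] & 0x7F
    if code in (0x70, 0x71):          # fixed-format sense data
        if len(sense) < 14:
            return None
        return sense[2] & 0x0F, sense[12], sense[13]
    if code in (0x72, 0x73):          # descriptor-format sense data
        if len(sense) < 4:
            return None
        return sense[1] & 0x0F, sense[2], sense[3]
    return None


def needs_rtpg(sense):
    """True if the sense data reports a changed ALUA access state."""
    decoded = decode_sense(sense)
    if decoded is None:
        return False
    key, asc, ascq = decoded
    return key == UNIT_ATTENTION and (asc, ascq) == ALUA_STATE_CHANGED
```

The catch is that this only fires when some command is actually sent down the path, which is why a periodic checker (or an equivalent trigger) matters for idle paths.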

It will be necessary for the functionality to be enabled via a module
option, at least initially. Introducing this in general use will be a
big change for people who have Enterprise SAN configurations with their
own custom path monitoring tools. I believe we put some functionality
into userspace multipath tools so e.g. Native NVMe devices can still be
monitored/observed, which made things a bit easier for people.

Unfortunately I will not be able to attend LSF/MM this year. I am sure
it will be a good discussion.

-Ewan

On Wed, Feb 25, 2026 at 4:27 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 25/02/2026 08:11, Hannes Reinecke wrote:
> > And I _still_ want to have a blktests for persistent reservations ...
> nvme/054 supports resv testing.
>
> For scsi PR, we could use util-linux, which has blkpr.
>
>
>



* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-03-10 17:12         ` Ewan Milne
@ 2026-03-10 18:05           ` John Garry
  2026-03-10 18:42             ` Benjamin Marzinski
  0 siblings, 1 reply; 22+ messages in thread
From: John Garry @ 2026-03-10 18:05 UTC (permalink / raw)
  To: Ewan Milne
  Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc,
	linux-nvme, linux-block, linux-scsi, dm-devel

On 10/03/2026 17:12, Ewan Milne wrote:
> Hi John-
> 
> Sorry, I was out for a couple of weeks and have been catching up...
> 
> Re: sg support, there were issues in the past with people attempting
> to do SG_IO through dm-mp
> assuming that DM would handle retry on other paths, which it didn't.
> You also have to be aware
> that non-idempotent commands don't work right if retried.  My
> recommendation would be to avoid
> implementing it, although there has been interest in a better way to
> do multipathed "generic"
> commands (e.g. virt pass-through) I think that is a more involved
> project than you want to do here.

Understood, my current plan is to not have a multipathed sg driver - we 
will still have the per-scsi device/path sg device.

> 
> I see the discussion has progressed re: ALUA support in your later
> patch postings, which is good.
> As Hannes said, a Native SCSI MP would be useless without it.  You
> don't have to support the
> older non-ALUA mechanisms though, those arrays are way, way old.
> 
> SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> ensure that your
> ALUA info is up-to-date.  DM-MP's path checker normally does this by
> sending commands on
> which the Unit Attention can be reported so that the code can fetch
> up-to-date ALUA info.
> Hannes made some optimizations years ago to avoid excessive RTPG
> commands with large
> numbers of LUNs which we would need also.

Hannes is suggesting to not have a kernel path checker, so let me know 
if there is any issue with that.

> 
> It will be necessary for the functionality to be enabled via a module
> option, at least initially.
> Introducing this in general use will be a big change for people who
> have Enterprise SAN
> configurations with their own custom path monitoring tools.  I believe
> we put some functionality
> into userspace multipath tools so e.g. Native NVMe devices can still be
> monitored/observed
> which made things a bit easier for people.
> 

Sure, if you check my patches, we disable by default and enable via a 
module param

cheers


* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
  2026-03-10 18:05           ` John Garry
@ 2026-03-10 18:42             ` Benjamin Marzinski
  0 siblings, 0 replies; 22+ messages in thread
From: Benjamin Marzinski @ 2026-03-10 18:42 UTC (permalink / raw)
  To: John Garry
  Cc: Ewan Milne, Hannes Reinecke, Mike Snitzer, lsf-pc, linux-nvme,
	linux-block, linux-scsi, dm-devel

On Tue, Mar 10, 2026 at 06:05:29PM +0000, John Garry wrote:
> On 10/03/2026 17:12, Ewan Milne wrote:
> > Hi John-
> > 
> > Sorry, I was out for a couple of weeks and have been catching up...
> > 
> > Re: sg support, there were issues in the past with people attempting
> > to do SG_IO through dm-mp
> > assuming that DM would handle retry on other paths, which it didn't.
> > You also have to be aware
> > that non-idempotent commands don't work right if retried.  My
> > recommendation would be to avoid
> > implementing it, although there has been interest in a better way to
> > do multipathed "generic"
> > commands (e.g. virt pass-through) I think that is a more involved
> > project than you want to do here.
> 
> Understood, my current plan is to not have a multipathed sg driver - we will
> still have the per-scsi device/path sg device.

The sd devices still handle SG_IO ioctls. For instance, the persistent
reservation ioctls are SG_IO ioctls. But like I said elsewhere, getting
multipathed persistent reservations working safely is going to be a
large effort, better left for later.

But even without them, to handle things like sending SCSI WRITE commands
over SG_IO ioctls, in an ideal world, you would want to be able to retry
on other paths in the ioctl code. However, like Ewan mentioned, there
are times when you don't want to retry the ioctl. Just sending SG_IO
ioctls to one path and letting them fail if they fail down that path is
the safest way for now, even if there are times when that SG_IO ioctl
could complete successfully down another path.
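
Roughly, the two options look like the sketch below. This is only an illustration of the policy choice, not code from the patches: the `Policy` names, the opcode allowlist and `may_retry_on_other_path()` are all assumptions made for the example. Commands such as COMPARE AND WRITE (89h) or PERSISTENT RESERVE OUT (5Fh) are deliberately kept off the list, since repeating them after an ambiguous failure can change the result.

```python
from enum import Enum


class Policy(Enum):
    SINGLE_PATH = "single-path"  # send down one path, let failures surface
    RETRY_SAFE = "retry-safe"    # hypothetical: retry allowlisted opcodes only


# Opcodes whose repetition does not change the outcome on a disk device.
# The selection is illustrative, not exhaustive.
RETRY_SAFE_OPCODES = {
    0x00,  # TEST UNIT READY
    0x12,  # INQUIRY
    0x25,  # READ CAPACITY(10)
    0x28,  # READ(10)
    0x88,  # READ(16)
}


def may_retry_on_other_path(cdb, policy=Policy.SINGLE_PATH):
    """Decide whether a failed passthrough command may go to another path."""
    if policy is Policy.SINGLE_PATH or not cdb:
        return False
    # Anything not explicitly allowlisted is treated as non-idempotent.
    return cdb[0] in RETRY_SAFE_OPCODES
```

With the default policy every failure is reported to the caller, which matches the "safest way for now" above; the allowlist variant is where the extra work would go later.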

-Ben

> 
> > 
> > I see the discussion has progressed re: ALUA support in your later
> > patch postings, which is good.
> > As Hannes said, a Native SCSI MP would be useless without it.  You
> > don't have to support the
> > older non-ALUA mechanisms though, those arrays are way, way old.
> > 
> > SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> > ensure that your
> > ALUA info is up-to-date.  DM-MP's path checker normally does this by
> > sending commands on
> > which the Unit Attention can be reported so that the code can fetch
> > up-to-date ALUA info.
> > Hannes made some optimizations years ago to avoid excessive RTPG
> > commands with large
> > numbers of LUNs which we would need also.
> 
> Hannes is suggesting to not have a kernel path checker, so let me know if
> there is any issue with that.
> 
> > 
> > It will be necessary for the functionality to be enabled via a module
> > option, at least initially.
> > Introducing this in general use will be a big change for people who
> > have Enterprise SAN
> > configurations with their own custom path monitoring tools.  I believe
> > we put some functionality
> > into userspace multipath tools so e.g. Native NVMe devices can still be
> > monitored/observed
> > which made things a bit easier for people.
> > 
> 
> Sure, if you check my patches, we disable by default and enable via a module
> param
> 
> cheers




Thread overview: 22+ messages
2026-02-13 14:19 [LSF/MM/BPF TOPIC] Native SCSI multipath support John Garry
2026-02-13 17:21 ` Hannes Reinecke
2026-02-14  9:42   ` John Garry
2026-02-16  7:26     ` Hannes Reinecke
2026-02-16 16:32     ` Hannes Reinecke
2026-02-16 16:55       ` John Garry
2026-02-17  7:05         ` Hannes Reinecke
2026-02-21 17:47       ` Mike Snitzer
2026-02-17 19:33 ` Bart Van Assche
2026-02-17 20:13   ` Keith Busch
2026-02-18  2:39     ` [Lsf-pc] " Martin K. Petersen
2026-02-18  7:35       ` Hannes Reinecke
2026-02-18  8:35         ` John Garry
2026-02-18  8:23   ` John Garry
2026-02-21 17:41 ` Mike Snitzer
2026-02-24  9:56   ` John Garry
2026-02-25  0:46   ` Benjamin Marzinski
2026-02-25  8:11     ` Hannes Reinecke
2026-02-25  9:26       ` John Garry
2026-03-10 17:12         ` Ewan Milne
2026-03-10 18:05           ` John Garry
2026-03-10 18:42             ` Benjamin Marzinski
