* Fwd: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Xose Vazquez Perez @ 2026-02-13 15:26 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, DM_DEVEL-ML

FYI: https://lore.kernel.org/linux-scsi/69349b51-72c2-47f9-948f-f89843af62e4@oracle.com/dm

-------- Forwarded Message --------
Subject: [LSF/MM/BPF TOPIC] Native SCSI multipath support
Date: Fri, 13 Feb 2026 14:19:11 +0000
From: John Garry <john.g.garry@oracle.com>
Organization: Oracle Corporation
To: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-scsi@vger.kernel.org

At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
discuss this topic at LSFMM.

The idea for this is that SCSI could natively support multipath, like how
NVMe host driver does today. It is intended as an alternative to
dm-multipath support.

I have been working on the implementation and I plan to post patches in the
next cycle. I am looking at a 3-stage approach:
a. create a driver-agnostic multipath library, very heavily based on NVMe
   host multipath support.
   The library would support features such as path management, path
   selection/iopolicy, failover recovery, PR, delayed removal, gendisk
   management etc.
b. switch NVMe over to use this library
c. add native SCSI multipath support based on this common library

Thanks,
John
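The path selection and failover duties listed for the library in stage (a) can be pictured with a small toy model. The sketch below is in Python rather than kernel C, and every name in it (`Path`, `MultipathHead`, `select_path`, `submit`) is hypothetical. It assumes a plain round-robin iopolicy and is not taken from the NVMe multipath code or from the proposed library.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One path to the device (hypothetical model, not a kernel structure)."""
    name: str
    usable: bool = True


@dataclass
class MultipathHead:
    """Toy stand-in for a multipath device that owns several paths."""
    paths: list
    _next: int = 0  # round-robin cursor

    def select_path(self):
        """Round-robin over usable paths; return None if every path is down."""
        n = len(self.paths)
        for offset in range(n):
            idx = (self._next + offset) % n
            if self.paths[idx].usable:
                self._next = (idx + 1) % n
                return self.paths[idx]
        return None

    def submit(self, io, send):
        """Try the I/O on successive paths; a path that fails is marked unusable."""
        while (path := self.select_path()) is not None:
            if send(path, io):
                return path.name
            path.usable = False  # failover: stop selecting the failed path
        raise OSError("no usable path")
```

Here `send` is a caller-supplied callable returning True on success, so the same selection loop can be exercised against any fake transport.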
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Mike Snitzer @ 2026-02-21 17:41 UTC
To: John Garry; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> discuss this topic at LSFMM.
>
> The idea for this is that SCSI could natively support multipath, like how
> NVMe host driver does today. It is intended as an alternative to
> dm-multipath support.
>
> I have been working on the implementation and I plan to post patches in the
> next cycle. I am looking at a 3-stage approach:
> a. create a driver-agnostic multipath library, very heavily based on NVMe
> host multipath support.
> The library would support features such as path management, path
> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> management etc.
> b. switch NVMe over to use this library

I can appreciate that the kernel to userspace interface of DM
multipath is clearly unwanted (hence NVMe multipath and now SCSI
multipath).

But you should really be switching DM-multipath over to using it too;
or at least detailing _why_ the core of DM multipath
(drivers/md/dm-mpath.c) cannot be updated to use this common backend
library.

This line of work makes little sense to me if it just ignores
dm-multipath.

Mike

> c. add native SCSI multipath support based on this common library
>
> Thanks,
> John
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: John Garry @ 2026-02-24 9:56 UTC
To: Mike Snitzer; +Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 21/02/2026 17:41, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>> discuss this topic at LSFMM.
>>
>> The idea for this is that SCSI could natively support multipath, like how
>> NVMe host driver does today. It is intended as an alternative to
>> dm-multipath support.
>>
>> I have been working on the implementation and I plan to post patches in the
>> next cycle. I am looking at a 3-stage approach:
>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>> host multipath support.
>> The library would support features such as path management, path
>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>> management etc.
>> b. switch NVMe over to use this library
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
>
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
>
> This line of work makes little sense to me if it just ignores
> dm-multipath.

What I am proposing is refactoring the NVMe multipath code so that it
can be used for SCSI as well.

I am not sure where to begin on saying that this library would be
unsuitable for dm-mpath. For a start, the bio flow is totally different.
Then path selection is totally different.

Anyway, I'll post the code this week and you can check it.

John
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Benjamin Marzinski @ 2026-02-25 0:46 UTC
To: Mike Snitzer
Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
> > At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
> > discuss this topic at LSFMM.
> >
> > The idea for this is that SCSI could natively support multipath, like how
> > NVMe host driver does today. It is intended as an alternative to
> > dm-multipath support.
> >
> > I have been working on the implementation and I plan to post patches in the
> > next cycle. I am looking at a 3-stage approach:
> > a. create a driver-agnostic multipath library, very heavily based on NVMe
> > host multipath support.
> > The library would support features such as path management, path
> > selection/iopolicy, failover recovery, PR, delayed removal, gendisk
> > management etc.
> > b. switch NVMe over to use this library
>
> I can appreciate that the kernel to userspace interface of DM
> multipath is clearly unwanted (hence NVMe multipath and now SCSI
> multipath).
>
> But you should really be switching DM-multipath over to using it too;
> or at least detailing _why_ the core of DM multipath
> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
> library.
>
> This line of work makes little sense to me if it just ignores
> dm-multipath.
>
> Mike

Thinking about this work from a DM multipath perspective, I'm more
interested in how much it plans to handle the more annoying niche cases
of dealing with SCSI devices, like paths that confidently report that
they are able to accept IO, only to fail all IO sent to them. Also, I
wonder how/if this is planning on handling Persistent Reservations. The
arrays, I assume, are still going to see this as a collection of I_T
Nexuses (some of which may be down and unable to accept commands at any
given time, and to which new ones may be added) instead of a single one.

I also think this would be useful to talk about at LSF.

-Ben

>
> > c. add native SCSI multipath support based on this common library
> >
> > Thanks,
> > John
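The Persistent Reservation concern above can be illustrated with a toy model: because the array sees each I_T nexus separately, a registration has to reach every nexus, including ones that were down or did not exist when the key was first registered. The Python sketch below only shows that bookkeeping. All names (`Nexus`, `MultipathReservation`, `register`, `path_restored`, `add_path`) are hypothetical, and it makes no claim about how dm-multipath or the proposed native SCSI multipath handles this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nexus:
    """One I_T nexus, i.e. one path as the array sees it (hypothetical model)."""
    name: str
    up: bool = True
    key: Optional[int] = None  # reservation key registered via this nexus


class MultipathReservation:
    """Keeps one logical registration in sync across many I_T nexuses."""

    def __init__(self, nexuses):
        self.nexuses = list(nexuses)
        self.key = None

    def register(self, key):
        """Register on every nexus that is up; return the names that were missed."""
        self.key = key
        missed = []
        for nexus in self.nexuses:
            if nexus.up:
                nexus.key = key
            else:
                missed.append(nexus.name)
        return missed

    def path_restored(self, name):
        """A path came back: replay the registration it missed."""
        for nexus in self.nexuses:
            if nexus.name == name:
                nexus.up = True
                nexus.key = self.key

    def add_path(self, nexus):
        """A new path appeared: it needs the current key before carrying I/O."""
        if nexus.up:
            nexus.key = self.key
        self.nexuses.append(nexus)
```

The list returned by `register` is the awkward part in practice: those nexuses are unregistered until they return, which is one reason the thread treats multipathed reservations as a topic of its own.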
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Hannes Reinecke @ 2026-02-25 8:11 UTC
To: Benjamin Marzinski, Mike Snitzer
Cc: John Garry, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 2/25/26 01:46, Benjamin Marzinski wrote:
> On Sat, Feb 21, 2026 at 12:41:28PM -0500, Mike Snitzer wrote:
>> On Fri, Feb 13, 2026 at 02:19:11PM +0000, John Garry wrote:
>>> At ALPSS 25 I presented a proposal for Native SCSI multipath support. Let's
>>> discuss this topic at LSFMM.
>>>
>>> The idea for this is that SCSI could natively support multipath, like how
>>> NVMe host driver does today. It is intended as an alternative to
>>> dm-multipath support.
>>>
>>> I have been working on the implementation and I plan to post patches in the
>>> next cycle. I am looking at a 3-stage approach:
>>> a. create a driver-agnostic multipath library, very heavily based on NVMe
>>> host multipath support.
>>> The library would support features such as path management, path
>>> selection/iopolicy, failover recovery, PR, delayed removal, gendisk
>>> management etc.
>>> b. switch NVMe over to use this library
>>
>> I can appreciate that the kernel to userspace interface of DM
>> multipath is clearly unwanted (hence NVMe multipath and now SCSI
>> multipath).
>>
>> But you should really be switching DM-multipath over to using it too;
>> or at least detailing _why_ the core of DM multipath
>> (drivers/md/dm-mpath.c) cannot be updated to use this common backend
>> library.
>>
>> This line of work makes little sense to me if it just ignores
>> dm-multipath.
>>
>> Mike
>
> Thinking about this work from a DM multipath perspective, I'm more
> interested in how much it plans to handle the more annoying niche cases
> of dealing with SCSI devices, like paths that confidently report that
> they are able to accept IO, only to fail all IO sent to them. Also, I
> wonder how/if this is planning on handling Persistent Reservations. The
> arrays, I assume, are still going to see this as a collection of I_T
> Nexuses (some of which may be down and unable to accept commands at any
> given time, and to which new ones may be added) instead of a single one.
>
> I also think this would be useful to talk about at LSF.
>
And that even makes me wonder whether we should have a discussion
about persistent reservations at LSF, too. I seem to be involved in
discussions about PRs from various angles now (live migration seems
to want to join the fray), so maybe we could get together to discuss
things.

And I _still_ want to have a blktests for persistent reservations ...

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: John Garry @ 2026-02-25 9:26 UTC
To: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer
Cc: lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 25/02/2026 08:11, Hannes Reinecke wrote:
> And I _still_ want to have a blktests for persistent reservations ...

nvme/054 supports resv testing.

For scsi PR, we could use util-linux, which has blkpr.
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Ewan Milne @ 2026-03-10 17:12 UTC
To: John Garry
Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

Hi John-

Sorry, I was out for a couple of weeks and have been catching up...

Re: sg support, there were issues in the past with people attempting
to do SG_IO through dm-mp assuming that DM would handle retry on other
paths, which it didn't. You also have to be aware that non-idempotent
commands don't work right if retried. My recommendation would be to
avoid implementing it. Although there has been interest in a better way
to do multipathed "generic" commands (e.g. virt pass-through), I think
that is a more involved project than you want to do here.

I see the discussion has progressed re: ALUA support in your later
patch postings, which is good. As Hannes said, a Native SCSI MP would
be useless without it. You don't have to support the older non-ALUA
mechanisms though, those arrays are way, way old.

SCSI does not have the equivalent of NVMe's AEN, so you need a way to
ensure that your ALUA info is up-to-date. DM-MP's path checker normally
does this by sending commands on which the Unit Attention can be
reported so that the code can fetch up-to-date ALUA info. Hannes made
some optimizations years ago to avoid excessive RTPG commands with
large numbers of LUNs which we would need also.

It will be necessary for the functionality to be enabled via a module
option, at least initially. Introducing this in general use will be a
big change for people who have Enterprise SAN configurations with their
own custom path monitoring tools. I believe we put some functionality
into userspace multipath tools so e.g. Native NVMe devices can still be
monitored/observed which made things a bit easier for people.

Unfortunately I will not be able to attend LSF/MM this year. I am sure
it will be a good discussion.

-Ewan

On Wed, Feb 25, 2026 at 4:27 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 25/02/2026 08:11, Hannes Reinecke wrote:
> > And I _still_ want to have a blktests for persistent reservations ...
> nvme/054 supports resv testing.
>
> For scsi PR, we could use util-linux, which has blkpr.
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: John Garry @ 2026-03-10 18:05 UTC
To: Ewan Milne
Cc: Hannes Reinecke, Benjamin Marzinski, Mike Snitzer, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On 10/03/2026 17:12, Ewan Milne wrote:
> Hi John-
>
> Sorry, I was out for a couple of weeks and have been catching up...
>
> Re: sg support, there were issues in the past with people attempting
> to do SG_IO through dm-mp assuming that DM would handle retry on other
> paths, which it didn't. You also have to be aware that non-idempotent
> commands don't work right if retried. My recommendation would be to
> avoid implementing it. Although there has been interest in a better way
> to do multipathed "generic" commands (e.g. virt pass-through), I think
> that is a more involved project than you want to do here.

Understood, my current plan is not to have a multipathed sg driver - we
will still have the per-scsi device/path sg device.

>
> I see the discussion has progressed re: ALUA support in your later
> patch postings, which is good. As Hannes said, a Native SCSI MP would
> be useless without it. You don't have to support the older non-ALUA
> mechanisms though, those arrays are way, way old.
>
> SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> ensure that your ALUA info is up-to-date. DM-MP's path checker normally
> does this by sending commands on which the Unit Attention can be
> reported so that the code can fetch up-to-date ALUA info. Hannes made
> some optimizations years ago to avoid excessive RTPG commands with
> large numbers of LUNs which we would need also.

Hannes is suggesting to not have a kernel path checker, so let me know
if there is any issue with that.

>
> It will be necessary for the functionality to be enabled via a module
> option, at least initially. Introducing this in general use will be a
> big change for people who have Enterprise SAN configurations with their
> own custom path monitoring tools. I believe we put some functionality
> into userspace multipath tools so e.g. Native NVMe devices can still be
> monitored/observed which made things a bit easier for people.
>

Sure, if you check my patches, we disable by default and enable via a
module param.

cheers
* Re: [LSF/MM/BPF TOPIC] Native SCSI multipath support
From: Benjamin Marzinski @ 2026-03-10 18:42 UTC
To: John Garry
Cc: Ewan Milne, Hannes Reinecke, Mike Snitzer, lsf-pc, linux-nvme, linux-block, linux-scsi, dm-devel

On Tue, Mar 10, 2026 at 06:05:29PM +0000, John Garry wrote:
> On 10/03/2026 17:12, Ewan Milne wrote:
> > Hi John-
> >
> > Sorry, I was out for a couple of weeks and have been catching up...
> >
> > Re: sg support, there were issues in the past with people attempting
> > to do SG_IO through dm-mp assuming that DM would handle retry on other
> > paths, which it didn't. You also have to be aware that non-idempotent
> > commands don't work right if retried. My recommendation would be to
> > avoid implementing it. Although there has been interest in a better way
> > to do multipathed "generic" commands (e.g. virt pass-through), I think
> > that is a more involved project than you want to do here.
>
> Understood, my current plan is not to have a multipathed sg driver - we
> will still have the per-scsi device/path sg device.

The sd devices still handle SG_IO ioctls. For instance, the persistent
reservation ioctls are SG_IO ioctls. But like I said elsewhere, getting
multipathed persistent reservations working safely is going to be a
large effort, better left for later. But even without them, to handle
things like sending SCSI WRITE commands over SG_IO ioctls, in an ideal
world, you would want to be able to retry on other paths in the ioctl
code.

However, like Ewan mentioned, there are times when you don't want to
retry the ioctl. Just sending SG_IO ioctls to one path and letting them
fail if they fail down that path is the safest way for now, even if
there are times when that SG_IO ioctl could complete successfully down
another path.

-Ben

> >
> > I see the discussion has progressed re: ALUA support in your later
> > patch postings, which is good. As Hannes said, a Native SCSI MP would
> > be useless without it. You don't have to support the older non-ALUA
> > mechanisms though, those arrays are way, way old.
> >
> > SCSI does not have the equivalent of NVMe's AEN, so you need a way to
> > ensure that your ALUA info is up-to-date. DM-MP's path checker normally
> > does this by sending commands on which the Unit Attention can be
> > reported so that the code can fetch up-to-date ALUA info. Hannes made
> > some optimizations years ago to avoid excessive RTPG commands with
> > large numbers of LUNs which we would need also.
>
> Hannes is suggesting to not have a kernel path checker, so let me know
> if there is any issue with that.
>
> >
> > It will be necessary for the functionality to be enabled via a module
> > option, at least initially. Introducing this in general use will be a
> > big change for people who have Enterprise SAN configurations with their
> > own custom path monitoring tools. I believe we put some functionality
> > into userspace multipath tools so e.g. Native NVMe devices can still be
> > monitored/observed which made things a bit easier for people.
> >
>
> Sure, if you check my patches, we disable by default and enable via a
> module param.
>
> cheers
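The trade-off described in this reply (send a passthrough command down one path and let it fail, versus retrying on another path only when the command is safe to repeat) can be sketched as a toy policy. The Python below is illustrative only: `passthrough`, `allow_failover` and the `RETRY_SAFE_OPCODES` allow-list are hypothetical names, and which opcodes count as safe to retry is an assumption made for this example, not something the thread specifies.

```python
# Standard SCSI opcodes used in this example.
TEST_UNIT_READY = 0x00
INQUIRY = 0x12
READ_10 = 0x28
READ_16 = 0x88
PERSISTENT_RESERVE_OUT = 0x5F

# Assumption for illustration: only these read-only commands may be repeated.
RETRY_SAFE_OPCODES = {TEST_UNIT_READY, INQUIRY, READ_10, READ_16}


def passthrough(cdb, paths, send, allow_failover=False):
    """Send a passthrough command, by default on the first path only.

    With allow_failover=False (the conservative behaviour described above)
    a failure on the chosen path is reported to the caller. With
    allow_failover=True, other paths are tried only for opcodes in the
    allow-list, so a non-idempotent command is never replayed.
    """
    first, *rest = paths
    if send(first, cdb):
        return first
    if allow_failover and cdb[0] in RETRY_SAFE_OPCODES:
        for path in rest:
            if send(path, cdb):
                return path
    raise OSError("passthrough command failed")
```

Keeping `allow_failover` off by default mirrors the "safest way for now" position: the caller sees the failure and decides what to do, rather than the multipath layer guessing whether a replay is harmless.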