* multipathing pending issues with rhel
From: Christophe Varoqui @ 2008-10-08 21:07 UTC
To: Benjamin Marzinski; +Cc: dm-devel

Ben,

I'd like to summarize all the issues I raised recently through my
employer's support channel on the multipath subsystem, and see if
something can be done about them, at least in the upstream codebases
concerned.

1/ multipathd's private namespace pins lvm2 logical volume maps mounted
at daemon startup, thus making "vgchange -ay" fail, even after
unmounting the visible mount. In my context, it also means I can't stop
a clustered service built on this vg to start it on another node. This
problem does not affect upstream, which does not create a private
namespace.

2/ Can't map a rw multipath over read-only paths. The quick workaround
is to create a ro multipath, but ro->rw promotion is not automatic when
paths become writable. I keep thinking we should allow rw multipath over
ro paths. The ro->rw event might also work, but what will trigger the
kernel rw status change in the first place? To my knowledge, only a
manual scsi device rescan can force this status update, which makes it
a less user-friendly solution than the former.

3/ Can't use scsi-3 persistent reservations on clariion multipathed
luns: with paths reserved on node A, writes submitted on node B should
be errored immediately to ensure data integrity. Instead, writes get
buffered in the "queue_if_no_path" logic, and finally corrupt the data
when the reservation gets cleared. In my context, reservation is the
preferred io fencing method for clusters.
The kernel knows the write io submitted on a path is refused due to a
reservation conflict, but this status is not propagated to multipath
for it to react by not queuing this io as it should.

Regards,
cvaroqui
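To make issue 1/ concrete: a daemon that lives in its own mount namespace keeps every mount it inherited busy unless it drops the ones it does not need. The sketch below only shows the selection step, under loud assumptions: `mount_is_needed` and the `keep` list are hypothetical names invented for illustration and are not taken from multipath-tools.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from multipath-tools): returns 1 if mount point
 * `mnt` equals, or is an ancestor directory of, any path in `keep`, and so
 * has to stay mounted; returns 0 if it could be unmounted inside the
 * daemon's private namespace.
 */
int mount_is_needed(const char *mnt, const char *const *keep, size_t nkeep)
{
    size_t len = strlen(mnt);

    for (size_t i = 0; i < nkeep; i++) {
        /* The root mount is an ancestor of every absolute path. */
        if (strcmp(mnt, "/") == 0)
            return 1;
        /* Match whole path components only: "/var" covers "/var/run",
         * but "/sb" does not cover "/sbin". */
        if (strncmp(keep[i], mnt, len) == 0 &&
            (keep[i][len] == '\0' || keep[i][len] == '/'))
            return 1;
    }
    return 0;
}
```

In a real daemon the mounts rejected by such a test would then be unmounted inside the private namespace (for example with `umount2()` after `unshare(CLONE_NEWNS)`), so that they no longer hold the underlying logical volumes open.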
* Re: multipathing pending issues with rhel
From: Benjamin Marzinski @ 2008-10-08 22:11 UTC
To: Christophe Varoqui; +Cc: dm-devel

On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> Ben,
>
> I'd like to summarize all the issues I raised recently through my
> employer's support channel on the multipath subsystem, and see if
> something can be done about them, at least in the upstream codebases
> concerned.
>
> 1/ multipathd's private namespace pins lvm2 logical volume maps
> mounted at daemon startup, thus making "vgchange -ay" fail, even after
> unmounting the visible mount. In my context, it also means I can't
> stop a clustered service built on this vg to start it on another node.
> This problem does not affect upstream, which does not create a private
> namespace.

This already has a fix queued for 5.3. Multipathd now unmounts all of
the unnecessary mount points after creating the private namespace.

> 2/ Can't map a rw multipath over read-only paths. The quick workaround
> is to create a ro multipath, but ro->rw promotion is not automatic
> when paths become writable. I keep thinking we should allow rw
> multipath over ro paths. The ro->rw event might also work, but what
> will trigger the kernel rw status change in the first place? To my
> knowledge, only a manual scsi device rescan can force this status
> update, which makes it a less user-friendly solution than the former.

The workaround is in place for 5.3, but I fully agree that a kernel
patch to allow rw maps on top of ro devices is the way to go in the
future.

> 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> luns: with paths reserved on node A, writes submitted on node B should
> be errored immediately to ensure data integrity. Instead, writes get
> buffered in the "queue_if_no_path" logic, and finally corrupt the data
> when the reservation gets cleared. In my context, reservation is the
> preferred io fencing method for clusters.
> The kernel knows the write io submitted on a path is refused due to a
> reservation conflict, but this status is not propagated to multipath
> for it to react by not queuing this io as it should.
>
> Regards,
> cvaroqui
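For issue 2/, the read-only state the kernel currently holds for a path can be read with the Linux `BLKROGET` ioctl. The sketch below is illustrative only: `device_is_read_only` and `map_must_be_read_only` are hypothetical helpers, and the "read-only while any path is read-only" policy is an assumption about how the workaround behaves, not something the thread spells out. As noted in the thread, the kernel flag itself may stay stale until a scsi device rescan.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* BLKROGET */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns 1 if the kernel reports the block device as read-only,
 * 0 if it reports it writable, -1 if the device cannot be queried. */
int device_is_read_only(const char *devnode)
{
    int ro = 0;
    int fd = open(devnode, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (ioctl(fd, BLKROGET, &ro) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return ro ? 1 : 0;
}

/* Assumed policy: the map is created read-only as long as at least one
 * path is read-only. path_ro[i] holds a device_is_read_only() result. */
int map_must_be_read_only(const int *path_ro, size_t npaths)
{
    for (size_t i = 0; i < npaths; i++)
        if (path_ro[i] > 0)
            return 1;
    return 0;
}
```

Automatic ro->rw promotion would amount to re-evaluating `map_must_be_read_only` whenever a path's state changes and reloading the map when the answer flips to 0.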
* Re: multipathing pending issues with rhel
From: Christophe Varoqui @ 2008-10-08 23:06 UTC
To: Benjamin Marzinski; +Cc: dm-devel

On Wed, 8 Oct 2008 17:11:48 -0500,
Benjamin Marzinski <bmarzins@redhat.com> wrote:

> On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > Ben,
> >
> > I'd like to summarize all the issues I raised recently through my
> > employer's support channel on the multipath subsystem, and see if
> > something can be done about them, at least in the upstream
> > codebases concerned.
> >
> > 1/ multipathd's private namespace pins lvm2 logical volume maps
> > mounted at daemon startup, thus making "vgchange -ay" fail, even
> > after unmounting the visible mount. In my context, it also means I
> > can't stop a clustered service built on this vg to start it on
> > another node. This problem does not affect upstream, which does not
> > create a private namespace.

Erratum: "vgchange -an", though it's clear you read it as I meant :)

> This already has a fix queued for 5.3. Multipathd now unmounts all of
> the unnecessary mount points after creating the private namespace.

Good to know, thanks for this update.

> > 2/ Can't map a rw multipath over read-only paths. The quick
> > workaround is to create a ro multipath, but ro->rw promotion is not
> > automatic when paths become writable. I keep thinking we should
> > allow rw multipath over ro paths. The ro->rw event might also work,
> > but what will trigger the kernel rw status change in the first
> > place? To my knowledge, only a manual scsi device rescan can force
> > this status update, which makes it a less user-friendly solution
> > than the former.
>
> The workaround is in place for 5.3, but I fully agree that a kernel
> patch to allow rw maps on top of ro devices is the way to go in the
> future.

Glad we share this point of view. Will you propose patches for
inclusion upstream? Which update level do you estimate will include
this fix?

> > 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> > luns: with paths reserved on node A, writes submitted on node B
> > should be errored immediately to ensure data integrity. Instead,
> > writes get buffered in the "queue_if_no_path" logic, and finally
> > corrupt the data when the reservation gets cleared. In my context,
> > reservation is the preferred io fencing method for clusters.
> > The kernel knows the write io submitted on a path is refused due to
> > a reservation conflict, but this status is not propagated to
> > multipath for it to react by not queuing this io as it should.

This one is hard to understand, I empathize. Nonetheless it deserves
attention, as it is a data-corrupter for the clients using, or eager to
use, persistent reservations on these widespread Clariion arrays. I know
Mike Christie already tried to address the issue years ago ... he might
be willing to take over again. (hope)

> > Regards,
> > cvaroqui
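The behaviour requested in issue 3/ can be written down as a small decision function. This is a model of the desired outcome only, not dm-mpath code: the `io_action` values, `struct map_state` and `handle_failed_write` are all hypothetical names, and the sketch assumes the reservation-conflict status has somehow been propagated up to the multipath layer, which is exactly what the thread says is missing today.

```c
/* Possible outcomes for a write that failed on one path (illustrative). */
enum io_action { IO_FAIL_NOW, IO_RETRY_OTHER_PATH, IO_QUEUE };

/* Minimal, assumed view of a multipath map's state. */
struct map_state {
    int usable_paths;      /* paths still considered usable */
    int queue_if_no_path;  /* non-zero if the feature is enabled */
};

enum io_action handle_failed_write(int reservation_conflict,
                                   const struct map_state *m)
{
    /* A reservation conflict is a fencing decision, not a path failure:
     * fail the write immediately so it can never be replayed later. */
    if (reservation_conflict)
        return IO_FAIL_NOW;

    /* Ordinary path failure: try another path if one is left. */
    if (m->usable_paths > 0)
        return IO_RETRY_OTHER_PATH;

    /* No path left: queue only if queue_if_no_path is set. */
    return m->queue_if_no_path ? IO_QUEUE : IO_FAIL_NOW;
}
```

The first branch is the whole point: without it, a fenced node's writes fall through to the queueing branch and are replayed once the reservation is cleared.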
* RE: Re: multipathing pending issues with rhel
From: Edward Goggin @ 2008-11-26 20:13 UTC
To: 'device-mapper development', Benjamin Marzinski

> -----Original Message-----
> From: dm-devel-bounces@redhat.com
> [mailto:dm-devel-bounces@redhat.com] On Behalf Of Christophe Varoqui
> Sent: Wednesday, October 08, 2008 7:07 PM
> To: Benjamin Marzinski
> Cc: dm-devel@redhat.com
> Subject: [dm-devel] Re: multipathing pending issues with rhel
>
> On Wed, 8 Oct 2008 17:11:48 -0500,
> Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> > On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > > Ben,
> > >
> > > I'd like to summarize all the issues I raised recently through my
> > > employer's support channel on the multipath subsystem, and see if
> > > something can be done about them, at least in the upstream
> > > codebases concerned.
> > >
> > > 1/ multipathd's private namespace pins lvm2 logical volume maps
> > > mounted at daemon startup, thus making "vgchange -ay" fail, even
> > > after unmounting the visible mount. In my context, it also means
> > > I can't stop a clustered service built on this vg to start it on
> > > another node. This problem does not affect upstream, which does
> > > not create a private namespace.
>
> Erratum: "vgchange -an", though it's clear you read it as I meant :)
>
> > This already has a fix queued for 5.3. Multipathd now unmounts all
> > of the unnecessary mount points after creating the private
> > namespace.
>
> Good to know, thanks for this update.
>
> > > 2/ Can't map a rw multipath over read-only paths. The quick
> > > workaround is to create a ro multipath, but ro->rw promotion is
> > > not automatic when paths become writable. I keep thinking we
> > > should allow rw multipath over ro paths. The ro->rw event might
> > > also work, but what will trigger the kernel rw status change in
> > > the first place? To my knowledge, only a manual scsi device
> > > rescan can force this status update, which makes it a less
> > > user-friendly solution than the former.
> >
> > The workaround is in place for 5.3, but I fully agree that a kernel
> > patch to allow rw maps on top of ro devices is the way to go in the
> > future.
>
> Glad we share this point of view. Will you propose patches for
> inclusion upstream? Which update level do you estimate will include
> this fix?
>
> > > 3/ Can't use scsi-3 persistent reservations on clariion
> > > multipathed luns: with paths reserved on node A, writes submitted
> > > on node B should be errored immediately to ensure data integrity.
> > > Instead, writes get buffered in the "queue_if_no_path" logic, and
> > > finally corrupt the data when the reservation gets cleared. In my
> > > context, reservation is the preferred io fencing method for
> > > clusters.
> > > The kernel knows the write io submitted on a path is refused due
> > > to a reservation conflict, but this status is not propagated to
> > > multipath for it to react by not queuing this io as it should.
>
> This one is hard to understand, I empathize. Nonetheless it deserves
> attention, as it is a data-corrupter for the clients using, or eager
> to use, persistent reservations on these widespread Clariion arrays.
> I know Mike Christie already tried to address the issue years ago ...
> he might be willing to take over again. (hope)

I think I inadvertently made this matter worse about two years ago when
I changed the clariion path checker to issue a scsi read via sg_read in
libsg.c after inquiry page 0xc0, in order to distinguish an inactive
from an active clariion snapshot logical unit. This change to
multipath-tools/libcheckers/emc_clariion.c causes a path check to incur
a reservation error if the reservation is held by a different I/T
nexus.

A reasonable fix for this problem is to have sg_read in
multipath-tools/libcheckers/libsg.c return PATH_UP, even if the return
value from ioctl is < 0, when the returned scsi status in io_hdr.status
is SAM_STAT_RESERVATION_CONFLICT. The dm-mpath.c

> > > Regards,
> > > cvaroqui
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
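The checker change proposed in the last message can be sketched as a pure classification step. This is not the actual sg_read code from libsg.c: `classify_sg_read` is a hypothetical function, the `SKETCH_PATH_*` values stand in for the checker's PATH_UP and PATH_DOWN, and the handling of every status other than "good" and "reservation conflict" is deliberately simplified. The 0x18 value is the SAM status code for RESERVATION CONFLICT.

```c
/* Stand-ins for the checker's path states (values are illustrative). */
enum sketch_path_state { SKETCH_PATH_DOWN, SKETCH_PATH_UP };

#define SKETCH_STAT_GOOD                 0x00
#define SKETCH_STAT_RESERVATION_CONFLICT 0x18  /* SAM status code */

/*
 * ioctl_ret:   return value of the SG_IO ioctl
 * scsi_status: the status byte reported in io_hdr.status
 */
enum sketch_path_state classify_sg_read(int ioctl_ret,
                                        unsigned char scsi_status)
{
    /* The target answered, so the path itself works. Another I/T nexus
     * holding the reservation must not mark the path as failed. */
    if (scsi_status == SKETCH_STAT_RESERVATION_CONFLICT)
        return SKETCH_PATH_UP;

    /* Any other ioctl failure still counts as a dead path. */
    if (ioctl_ret < 0)
        return SKETCH_PATH_DOWN;

    return scsi_status == SKETCH_STAT_GOOD ? SKETCH_PATH_UP
                                           : SKETCH_PATH_DOWN;
}
```

Checking the reservation conflict before the ioctl return value mirrors the wording of the proposal: the path is reported up even when the ioctl itself returned an error, as long as the status byte shows the conflict.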