* multipathing pending issues with rhel
From: Christophe Varoqui @ 2008-10-08 21:07 UTC
To: Benjamin Marzinski; +Cc: dm-devel
Ben,
I'd like to summarize all the issues I raised recently through my
employer's support channel on the multipath subsystem, and see whether
something can be done about them, at least in the relevant upstream
codebases.
1/ multipathd private namespace pins lvm2 logical volume maps mounted
at daemon startup, thus making "vgchange -ay" fail, even after
umounting the visible mount. In my context, it also means I can't stop
a clustered service built on this vg to start it on another node. This
problem does not affect upstream, which does not create a private
namespace.
2/ Can't map a rw multipath over read-only paths. The quick workaround
is to create a ro multipath, but ro->rw promotion is not automatic when
paths become writable. I keep thinking we should allow rw multipath
over ro paths. The ro->rw event might also work, but what will trigger
the kernel rw status change in the first place? To my knowledge, only a
manual scsi device rescan can force this status update, which makes it
a less user-friendly solution than the former.
3/ Can't use scsi-3 persistent reservations on clariion multipathed
luns: with paths reserved on node A, writes submitted on node B should
be errored immediately to ensure data integrity. Instead, writes get
buffered in the "queue_if_no_path" logic, and finally corrupt the data
when the reservation gets cleared. In my context, reservation is the
preferred io fencing method for clusters.
The kernel knows the write io submitted on a path is refused due to a
reservation conflict, but this status is not propagated to multipath,
so multipath cannot react by not queuing this io as it should.
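The behaviour asked for in 3/ can be pictured with a minimal sketch. It
assumes the reservation conflict is somehow made visible to the multipath
layer, which is exactly what is missing today. The enum, the function name
and its parameters are hypothetical and are not taken from dm-mpath.c.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a write that failed on one path.
 * These names are illustrative, not dm-mpath.c identifiers. */
enum io_action {
    IO_FAIL_NOW,         /* return the error to the submitter */
    IO_RETRY_OTHER_PATH, /* fail this path, resubmit on another one */
    IO_QUEUE             /* hold the io until a path comes back */
};

/* Decide what to do with a failed write.
 * A reservation conflict is not a path failure: every path to the lun
 * would refuse the write, so queuing only delays the corruption. */
enum io_action decide_io_action(bool reservation_conflict,
                                unsigned int usable_paths_left,
                                bool queue_if_no_path)
{
    if (reservation_conflict)
        return IO_FAIL_NOW;
    if (usable_paths_left > 0)
        return IO_RETRY_OTHER_PATH;
    return queue_if_no_path ? IO_QUEUE : IO_FAIL_NOW;
}
```

The point of the sketch is only the ordering of the checks: the
reservation conflict is tested before the "queue_if_no_path" logic gets
a chance to buffer the write.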
Regards,
cvaroqui
* Re: multipathing pending issues with rhel
From: Benjamin Marzinski @ 2008-10-08 22:11 UTC
To: Christophe Varoqui; +Cc: dm-devel
On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> Ben,
>
> I'd like to summarize all the issues I raised recently through my
> employer's support channel on the multipath subsystem, and see whether
> something can be done about them, at least in the relevant upstream
> codebases.
>
> 1/ multipathd private namespace pins lvm2 logical volume maps mounted
> at daemon startup, thus making "vgchange -ay" fail, even after
> umounting the visible mount. In my context, it also means I can't stop
> a clustered service built on this vg to start it on another node. This
> problem does not affect upstream, which does not create a private
> namespace.
This already has a fix queued for 5.3. Multipathd now unmounts all of
the unnecessary mount points after creating the private namespace.
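As a rough illustration of that fix, here is a minimal sketch of the
detach step. It assumes the daemon has already entered its private mount
namespace (for instance with unshare(CLONE_NEWNS)) and has a list of
current mount points. The keep-list and both function names are
hypothetical, and this is not the actual multipathd code.

```c
#include <stddef.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical list of mount points the daemon still needs. */
static const char *const keep_mounts[] = { "/", "/proc", "/sys", "/dev" };

/* Return 1 if mnt is on the keep-list or lies below a kept directory
 * (for example /dev/pts below /dev), 0 otherwise. */
int mount_is_needed(const char *mnt)
{
    for (size_t i = 0; i < sizeof(keep_mounts) / sizeof(keep_mounts[0]); i++) {
        size_t len = strlen(keep_mounts[i]);

        if (strcmp(mnt, keep_mounts[i]) == 0)
            return 1;
        /* Skip "/" here, otherwise every mount point would match. */
        if (len > 1 && strncmp(mnt, keep_mounts[i], len) == 0 &&
            mnt[len] == '/')
            return 1;
    }
    return 0;
}

/* Lazily detach every mount the daemon does not need, last mounted
 * first. Assumes the caller is already in a private mount namespace,
 * so the detach is not visible to the rest of the system. */
void detach_unneeded_mounts(const char *const *mounts, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        if (!mount_is_needed(mounts[i]))
            (void)umount2(mounts[i], MNT_DETACH);
    }
}
```

Once the private copy no longer references a logical volume's mount,
unmounting it in the visible namespace releases the device, so the
volume group can be deactivated again.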
>
> 2/ Can't map a rw multipath over read-only paths. The quick workaround
> is to create a ro multipath, but ro->rw promotion is not automatic when
> paths become writable. I keep thinking we should allow rw multipath
> over ro paths. The ro->rw event might also work, but what will trigger
> the kernel rw status change in the first place? To my knowledge, only a
> manual scsi device rescan can force this status update, which makes it
> a less user-friendly solution than the former.
The workaround is in place for 5.3, but I fully agree that a kernel
patch to allow rw maps on top of ro devices is the way to go in the
future.
> 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> luns: with paths reserved on node A, writes submitted on node B should
> be errored immediately to ensure data integrity. Instead, writes get
> buffered in the "queue_if_no_path" logic, and finally corrupt the data
> when the reservation gets cleared. In my context, reservation is the
> preferred io fencing method for clusters.
> The kernel knows the write io submitted on a path is refused due to a
> reservation conflict, but this status is not propagated to multipath,
> so multipath cannot react by not queuing this io as it should.
>
> Regards,
> cvaroqui
* Re: multipathing pending issues with rhel
From: Christophe Varoqui @ 2008-10-08 23:06 UTC
To: Benjamin Marzinski; +Cc: dm-devel
On Wed, 8 Oct 2008 17:11:48 -0500,
Benjamin Marzinski <bmarzins@redhat.com> wrote:
> On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > Ben,
> >
> > I'd like to summarize all the issues I raised recently through my
> > employer's support channel on the multipath subsystem, and see
> > whether something can be done about them, at least in the relevant
> > upstream codebases.
> >
> > 1/ multipathd private namespace pins lvm2 logical volume maps
> > mounted at daemon startup, thus making "vgchange -ay" fail, even
> > after umounting the visible mount. In my context, it also means I
> > can't stop a clustered service built on this vg to start it on
> > another node. This problem does not affect upstream, which does
> > not create a private namespace.
>
Erratum: "vgchange -an", though it's clear you read it as I meant :)
> This already has a fix queued for 5.3. Multipathd now unmounts all of
> the unnecessary mount points after creating the private namespace.
>
Good to know, thanks for this update.
> >
> > 2/ Can't map a rw multipath over read-only paths. The quick
> > workaround is to create a ro multipath, but ro->rw promotion is not
> > automatic when paths become writable. I keep thinking we should
> > allow rw multipath over ro paths. The ro->rw event might also work,
> > but what will trigger the kernel rw status change in the first
> > place? To my knowledge, only a manual scsi device rescan can force
> > this status update, which makes it a less user-friendly solution
> > than the former.
>
> The workaround is in place for 5.3, but I fully agree that a kernel
> patch to allow rw maps on top of ro devices is the way to go in the
> future.
>
Glad we share this point of view. Will you propose patches for inclusion
upstream? Which update level do you estimate will include this fix?
> > 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> > luns: with paths reserved on node A, writes submitted on node B
> > should be errored immediately to ensure data integrity. Instead,
> > writes get buffered in the "queue_if_no_path" logic, and finally
> > corrupt the data when the reservation gets cleared. In my context,
> > reservation is the preferred io fencing method for clusters.
> > The kernel knows the write io submitted on a path is refused due to
> > a reservation conflict, but this status is not propagated to
> > multipath, so multipath cannot react by not queuing this io as it
> > should.
> >
This one is hard to understand, I empathize. Nonetheless it deserves
attention, as it is a data corrupter for the clients using, or
eager to use, persistent reservations on these widespread Clariion
arrays. I know Mike Christie already tried to address the issue
years ago ... he might be willing to take it up again. (hope)
> > Regards,
> > cvaroqui
* RE: Re: multipathing pending issues with rhel
From: Edward Goggin @ 2008-11-26 20:13 UTC
To: 'device-mapper development', Benjamin Marzinski
> -----Original Message-----
> From: dm-devel-bounces@redhat.com
> [mailto:dm-devel-bounces@redhat.com] On Behalf Of Christophe Varoqui
> Sent: Wednesday, October 08, 2008 7:07 PM
> To: Benjamin Marzinski
> Cc: dm-devel@redhat.com
> Subject: [dm-devel] Re: multipathing pending issues with rhel
>
> On Wed, 8 Oct 2008 17:11:48 -0500,
> Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> > On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > > Ben,
> > >
> > > I'd like to summarize all the issues I raised recently through my
> > > employer's support channel on the multipath subsystem, and see
> > > whether something can be done about them, at least in the relevant
> > > upstream codebases.
> > >
> > > 1/ multipathd private namespace pins lvm2 logical volume maps
> > > mounted at daemon startup, thus making "vgchange -ay" fail, even
> > > after umounting the visible mount. In my context, it also means I
> > > can't stop a clustered service built on this vg to start it on
> > > another node. This problem does not affect upstream, which does
> > > not create a private namespace.
> >
> Erratum: "vgchange -an", though it's clear you read it as I meant :)
>
> > This already has a fix queued for 5.3. Multipathd now unmounts all
> > of the unnecessary mount points after creating the private namespace.
> >
> Good to know, thanks for this update.
>
> > >
> > > 2/ Can't map a rw multipath over read-only paths. The quick
> > > workaround is to create a ro multipath, but ro->rw promotion is
> > > not automatic when paths become writable. I keep thinking we
> > > should allow rw multipath over ro paths. The ro->rw event might
> > > also work, but what will trigger the kernel rw status change in
> > > the first place? To my knowledge, only a manual scsi device
> > > rescan can force this status update, which makes it a less
> > > user-friendly solution than the former.
> >
> > The workaround is in place for 5.3, but I fully agree that a kernel
> > patch to allow rw maps on top of ro devices is the way to go in the
> > future.
> >
> Glad we share this point of view. Will you propose patches
> for inclusion upstream? Which update level do you estimate
> will include this fix?
>
> > > 3/ Can't use scsi-3 persistent reservations on clariion
> > > multipathed luns: with paths reserved on node A, writes submitted
> > > on node B should be errored immediately to ensure data integrity.
> > > Instead, writes get buffered in the "queue_if_no_path" logic, and
> > > finally corrupt the data when the reservation gets cleared. In my
> > > context, reservation is the preferred io fencing method for
> > > clusters.
> > > The kernel knows the write io submitted on a path is refused due
> > > to a reservation conflict, but this status is not propagated to
> > > multipath, so multipath cannot react by not queuing this io as it
> > > should.
> > >
> This one is hard to understand, I empathize. Nonetheless it
> deserves attention, as it is a data corrupter for the clients
> using, or eager to use, persistent reservations on these
> widespread Clariion arrays. I know Mike Christie already
> tried to address the issue years ago ... he might be willing
> to take it up again. (hope)
>
I think I inadvertently made this matter worse about two years ago
when I changed the clariion path checker to issue a scsi read via
sg_read in libsg.c after inquiry page 0xc0, in order to distinguish an
inactive from an active clariion snapshot logical unit. This change
to multipath-tools/libcheckers/emc_clariion.c causes a path check to
incur a reservation error if the reservation is held by a different
I/T nexus.
A reasonable fix for this problem is to have sg_read in
multipath-tools/libcheckers/libsg.c return PATH_UP, even if the
return value from ioctl is < 0, when the returned scsi status
in io_hdr.status is SAM_STAT_RESERVATION_CONFLICT.
The dm-mpath.c
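The sg_read change proposed above could look roughly like the sketch
below. It is not the actual libsg.c code: the helper name
classify_sg_read and its parameters are hypothetical, the enum values
are illustrative, and the real function also deals with sense data and
retries, which are left out here. Only the status value 0x18 for
RESERVATION CONFLICT is taken from the SCSI architecture model.

```c
/* Path states named after the multipath-tools checker states.
 * The numeric values are illustrative only. */
enum path_state { PATH_DOWN = 0, PATH_UP = 1 };

/* SCSI status byte for RESERVATION CONFLICT. */
#ifndef SAM_STAT_RESERVATION_CONFLICT
#define SAM_STAT_RESERVATION_CONFLICT 0x18
#endif

/* Hypothetical helper: map the result of the SG_IO read issued by the
 * checker to a path state. ioctl_ret is the ioctl return value and
 * scsi_status is the value found in io_hdr.status. */
enum path_state classify_sg_read(int ioctl_ret, unsigned char scsi_status)
{
    /* The device answered, it just refused the read because another
     * I/T nexus holds the reservation: the path itself is usable. */
    if (scsi_status == SAM_STAT_RESERVATION_CONFLICT)
        return PATH_UP;
    if (ioctl_ret < 0)
        return PATH_DOWN;
    /* Simplification: any other non-zero status counts as down. */
    return scsi_status == 0 ? PATH_UP : PATH_DOWN;
}
```

With this ordering a reserved lun keeps its paths marked up on the node
that does not hold the reservation, so the checker no longer turns a
reservation conflict into a path failure.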
> > > Regards,
> > > cvaroqui