* multipathing pending issues with rhel
@ 2008-10-08 21:07 Christophe Varoqui
  2008-10-08 22:11 ` Benjamin Marzinski
  0 siblings, 1 reply; 4+ messages in thread
From: Christophe Varoqui @ 2008-10-08 21:07 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

Ben,

I'd like to summarize all the issues I raised recently through my
employer's support channel on the multipath subsystem, and see if
something can be done about them, at least in the relevant upstream
codebases.

1/ The multipathd private namespace pins the lvm2 logical volume maps
mounted at daemon startup, thus making "vgchange -ay" fail, even after
umounting the visible mount. In my context, it also means I can't stop
a clustered service built on this vg to start it on another node. This
problem does not affect upstream, which does not create a private
namespace.

2/ Can't map a rw multipath over read-only paths. The quick workaround
is to create a ro multipath, but ro->rw promotion is not automatic when
the paths become writable. I keep thinking we should allow rw multipath
over ro paths. The ro->rw event might also work, but what will trigger
the kernel rw status change in the first place? To my knowledge, only a
manual scsi device rescan can force this status update, which makes for
a less user-friendly solution than the former.

3/ Can't use scsi-3 persistent reservations on clariion multipathed
luns: with paths reserved on node A, writes submitted on node B should
be errored immediately to ensure data integrity. Instead, writes get
buffered in the "queue_if_no_path" logic, and finally corrupt the data
when the reservation gets cleared. In my context, reservation is the
preferred io fencing method for clusters.
The kernel knows the write io submitted on a path is refused due to a
reservation conflict, but this status is not propagated to multipath
for it to react by not queuing this io as it should.

Regards,
cvaroqui


* Re: multipathing pending issues with rhel
  2008-10-08 21:07 multipathing pending issues with rhel Christophe Varoqui
@ 2008-10-08 22:11 ` Benjamin Marzinski
  2008-10-08 23:06   ` Christophe Varoqui
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin Marzinski @ 2008-10-08 22:11 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: dm-devel

On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> Ben,
> 
> I'd like to summarize all the issues I raised recently through my
> employer's support channel on the multipath subsystem, and see if
> something can be done about them, at least in the relevant upstream
> codebases.
> 
> 1/ The multipathd private namespace pins the lvm2 logical volume maps
> mounted at daemon startup, thus making "vgchange -ay" fail, even after
> umounting the visible mount. In my context, it also means I can't stop
> a clustered service built on this vg to start it on another node. This
> problem does not affect upstream, which does not create a private
> namespace.

This already has a fix queued for 5.3.  Multipathd now unmounts all of
the unnecessary mount points after creating the private namespace.

> 
> 2/ Can't map a rw multipath over read-only paths. The quick workaround
> is to create a ro multipath, but ro->rw promotion is not automatic when
> the paths become writable. I keep thinking we should allow rw multipath
> over ro paths. The ro->rw event might also work, but what will trigger
> the kernel rw status change in the first place? To my knowledge, only a
> manual scsi device rescan can force this status update, which makes for
> a less user-friendly solution than the former.

The workaround is in place for 5.3, but I fully agree that a kernel
patch to allow rw maps on top of ro devices is the way to go in the
future.

> 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> luns: with paths reserved on node A, writes submitted on node B should
> be errored immediately to ensure data integrity. Instead, writes get
> buffered in the "queue_if_no_path" logic, and finally corrupt the data
> when the reservation gets cleared. In my context, reservation is the
> preferred io fencing method for clusters.
> The kernel knows the write io submitted on a path is refused due to a
> reservation conflict, but this status is not propagated to multipath
> for it to react by not queuing this io as it should.
> 
> Regards,
> cvaroqui


* Re: multipathing pending issues with rhel
  2008-10-08 22:11 ` Benjamin Marzinski
@ 2008-10-08 23:06   ` Christophe Varoqui
  2008-11-26 20:13     ` Edward Goggin
  0 siblings, 1 reply; 4+ messages in thread
From: Christophe Varoqui @ 2008-10-08 23:06 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Wed, 8 Oct 2008 17:11:48 -0500,
Benjamin Marzinski <bmarzins@redhat.com> wrote:

> On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > Ben,
> > 
> > I'd like to summarize all the issues I raised recently through my
> > employer's support channel on the multipath subsystem, and see if
> > something can be done about them, at least in the relevant upstream
> > codebases.
> > 
> > 1/ The multipathd private namespace pins the lvm2 logical volume maps
> > mounted at daemon startup, thus making "vgchange -ay" fail, even
> > after umounting the visible mount. In my context, it also means I
> > can't stop a clustered service built on this vg to start it on
> > another node. This problem does not affect upstream, which does not
> > create a private namespace.
>
errata: "vgchange -an", though it's clear you read it as I meant :)
 
> This already has a fix queued for 5.3.  Multipathd now unmounts all of
> the unnecessary mount points after creating the private namespace.
>
Good to know, thanks for this update.
 
> > 
> > 2/ Can't map a rw multipath over read-only paths. The quick
> > workaround is to create a ro multipath, but ro->rw promotion is not
> > automatic when the paths become writable. I keep thinking we should
> > allow rw multipath over ro paths. The ro->rw event might also work,
> > but what will trigger the kernel rw status change in the first
> > place? To my knowledge, only a manual scsi device rescan can force
> > this status update, which makes for a less user-friendly solution
> > than the former.
> 
> The workaround is in place for 5.3, but I fully agree that a kernel
> patch to allow rw maps on top of ro devices is the way to go in the
> future.
>
Glad we share this point of view. Will you propose patches for inclusion
upstream? Which update level do you estimate will include this fix?
 
> > 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> > luns: with paths reserved on node A, writes submitted on node B
> > should be errored immediately to ensure data integrity. Instead,
> > writes get buffered in the "queue_if_no_path" logic, and finally
> > corrupt the data when the reservation gets cleared. In my context,
> > reservation is the preferred io fencing method for clusters.
> > The kernel knows the write io submitted on a path is refused due to
> > a reservation conflict, but this status is not propagated to
> > multipath for it to react by not queuing this io as it should.
> > 
This one is hard to understand, I empathize. Nonetheless it deserves
attention, as it can corrupt data for clients using, or eager to use,
persistent reservations on these widespread Clariion arrays. I know
Mike Christie already tried to address the issue years ago ... he might
be willing to take it up again (I hope).

> > Regards,
> > cvaroqui


* RE: Re: multipathing pending issues with rhel
  2008-10-08 23:06   ` Christophe Varoqui
@ 2008-11-26 20:13     ` Edward Goggin
  0 siblings, 0 replies; 4+ messages in thread
From: Edward Goggin @ 2008-11-26 20:13 UTC (permalink / raw)
  To: 'device-mapper development', Benjamin Marzinski



> -----Original Message-----
> From: dm-devel-bounces@redhat.com 
> [mailto:dm-devel-bounces@redhat.com] On Behalf Of Christophe Varoqui
> Sent: Wednesday, October 08, 2008 7:07 PM
> To: Benjamin Marzinski
> Cc: dm-devel@redhat.com
> Subject: [dm-devel] Re: multipathing pending issues with rhel
> 
> On Wed, 8 Oct 2008 17:11:48 -0500,
> Benjamin Marzinski <bmarzins@redhat.com> wrote:
> 
> > On Wed, Oct 08, 2008 at 11:07:25PM +0200, Christophe Varoqui wrote:
> > > Ben,
> > >
> > > I'd like to summarize all the issues I raised recently through my
> > > employer's support channel on the multipath subsystem, and see if
> > > something can be done about them, at least in the relevant upstream
> > > codebases.
> > >
> > > 1/ The multipathd private namespace pins the lvm2 logical volume
> > > maps mounted at daemon startup, thus making "vgchange -ay" fail,
> > > even after umounting the visible mount. In my context, it also
> > > means I can't stop a clustered service built on this vg to start it
> > > on another node. This problem does not affect upstream, which does
> > > not create a private namespace.
> >
> errata: "vgchange -an", though it's clear you read it as I meant :)
> 
> > This already has a fix queued for 5.3.  Multipathd now unmounts all
> > of the unnecessary mount points after creating the private namespace.
> >
> Good to know, thanks for this update.
> 
> > >
> > > 2/ Can't map a rw multipath over read-only paths. The quick
> > > workaround is to create a ro multipath, but ro->rw promotion is not
> > > automatic when the paths become writable. I keep thinking we should
> > > allow rw multipath over ro paths. The ro->rw event might also work,
> > > but what will trigger the kernel rw status change in the first
> > > place? To my knowledge, only a manual scsi device rescan can force
> > > this status update, which makes for a less user-friendly solution
> > > than the former.
> >
> > The workaround is in place for 5.3, but I fully agree that a kernel
> > patch to allow rw maps on top of ro devices is the way to go in the
> > future.
> >
> Glad we share this point of view. Will you propose patches for
> inclusion upstream? Which update level do you estimate will include
> this fix?
> 
> > > 3/ Can't use scsi-3 persistent reservations on clariion multipathed
> > > luns: with paths reserved on node A, writes submitted on node B
> > > should be errored immediately to ensure data integrity. Instead,
> > > writes get buffered in the "queue_if_no_path" logic, and finally
> > > corrupt the data when the reservation gets cleared. In my context,
> > > reservation is the preferred io fencing method for clusters.
> > > The kernel knows the write io submitted on a path is refused due to
> > > a reservation conflict, but this status is not propagated to
> > > multipath for it to react by not queuing this io as it should.
> > >
> This one is hard to understand, I empathize. Nonetheless it deserves
> attention, as it can corrupt data for clients using, or eager to use,
> persistent reservations on these widespread Clariion arrays. I know
> Mike Christie already tried to address the issue years ago ... he
> might be willing to take it up again (I hope).
> 

I think I inadvertently made this matter worse about two years ago
when I changed the clariion path checker to issue a scsi read via
sg_read in libsg.c after inquiry page 0xc0 in order to distinguish an
inactive from an active clariion snapshot logical unit.  This change
to multipath-tools/libcheckers/emc_clariion.c causes a path check on
a path to incur a reservation error if the reservation is held
by a different I/T nexus.

A reasonable fix for this problem is to have sg_read in
multipath-tools/libcheckers/libsg.c return PATH_UP even if the
return value from ioctl is < 0 if the returned scsi status
in io_hdr.status is SAM_STAT_RESERVATION_CONFLICT.
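
A rough sketch of that decision, for illustration only. This is not the
actual libsg.c code: classify_read_result() is a hypothetical helper, the
PATH_UP/PATH_DOWN values are local stand-ins for the checker states, and
sense-data handling and retries are left out. Only the 0x18 status value
is taken from the SCSI architecture model.

```c
/* Local stand-ins for the path checker states (assumed values). */
enum path_state { PATH_DOWN = 2, PATH_UP = 3 };

/* SCSI status byte for RESERVATION CONFLICT, as defined by SAM. */
#define SAM_STAT_RESERVATION_CONFLICT 0x18

/*
 * Hypothetical helper: map the outcome of the SG_IO read to a path state.
 * ioctl_rc    - return value of the ioctl() call
 * scsi_status - status byte reported in io_hdr.status
 */
int classify_read_result(int ioctl_rc, unsigned char scsi_status)
{
    /*
     * A reservation conflict means the target answered the command, so
     * the path itself works: report it as up even if ioctl_rc < 0.
     */
    if (scsi_status == SAM_STAT_RESERVATION_CONFLICT)
        return PATH_UP;

    /* Any other ioctl failure is treated as a dead path. */
    if (ioctl_rc < 0)
        return PATH_DOWN;

    /* Simplification: only GOOD status (0) counts as a working path. */
    return (scsi_status == 0) ? PATH_UP : PATH_DOWN;
}
```

The reservation check comes first on purpose, so that a conflict seen on a
path held by another I/T nexus never marks that path as failed.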

The dm-mpath.c  
> > > Regards,
> > > cvaroqui
> 
