public inbox for linux-scsi@vger.kernel.org
* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
       [not found] ` <20030819094838.F8428@sistina.com>
@ 2003-08-19  9:04   ` christophe.varoqui
  2003-08-19  9:51     ` Heinz J . Mauelshagen
  0 siblings, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-19  9:04 UTC (permalink / raw)
  To: mauelshagen; +Cc: mge, thornber, linux-scsi

Hello,

pvscan will block on ghosts (verified here).
It stays in uninterruptible sleep until a START_LUN is sent through the ghost it
blocks on. You need a way to detect and skip those paths: this detection is
vendor-specific and not related to volume management.
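For reference, the START_LUN activation mentioned above corresponds to the SCSI START STOP UNIT command (opcode 0x1B) with the START bit set. A minimal sketch of building its 6-byte CDB follows; the SG_IO plumbing needed to actually send it down a ghost path is vendor/OS specific and omitted:

```python
# Build a SCSI START STOP UNIT (a.k.a. START_LUN) CDB.
# 6-byte CDB, opcode 0x1B; setting the START bit (bit 0 of byte 4)
# asks the target to spin up / activate the LUN.
def start_stop_unit_cdb(start=True, immed=False):
    cdb = bytearray(6)
    cdb[0] = 0x1B                  # operation code: START STOP UNIT
    cdb[1] = 0x01 if immed else 0  # IMMED: return before the action completes
    cdb[4] = 0x01 if start else 0  # START bit: 1 = start (activate), 0 = stop
    return bytes(cdb)
```

Actually sending this would go through the SG_IO ioctl on e.g. /dev/sgN; that part is left out here.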

I still consider multipath a requirement that sits above the LVM need, and as
such it shouldn't be bundled with LVM2: LVM2 shouldn't be a requirement for
multipathing. Moreover, a PV UUID should not be necessary for multipath; the
inquiry data suffices.

In a multipathed environment, with dm multipath maps set up outside LVM2 as I
suggested, LVM2 just needs to restrict its view (through /etc/lvm.conf) to
those maps in order to avoid blocking on ghosts.

I cc: linux-scsi, as opinions from the folks at the OLS talk seem needed.

regards,
cvaroqui

Selon "Heinz J . Mauelshagen" <Heinz.Mauelshagen@t-online.de>:

> 
> Christophe,
> 
> in LVM2, such "ghost disks" will never be used to multipath IO
> to the device(s), because LVM2 discovers disks by label. In order to do
> this, the tools need to read the labels off the devices, which is not
> possible for "ghost disks" if I understand you correctly.
> 
> Please keep in mind, that LVM2 tool support for multipathing is under
> development (code will show up in October).
> 
> It should be trivial to call the respective LVM2 tool from a hotplug
> script to automatically enable new paths to the device.
> 
> IO balancing is already supported in our multipath target code.
> 
> BTW: People could still use the dmsetup tool (which comes with the
>      device-mapper source distro) to set up a multipath configuration
>      accessing "ghost disks". But this is no problem IMO, because
>      dmsetup is an expert tool which is not to be used by regular users
>      anyway, who will be using the LVM2 tools.
> 
> 
> Regards,
> Heinz    -- The LVM Guy --
> 
> 
> > Date: Tue, 19 Aug 2003 00:52:49 +0200
> > From: christophe varoqui <christophe.varoqui@free.fr>
> > To: thornber@sistina.com
> > Subject: dm multipath target
> > 
> > Hello,
> > 
> > I read that you are preparing a dm multipath target for release.
> > As a Storageworks HSV client, I would like to know how/if you plan to
> > manage "ghost disks"?
> > 
> > First a little background info:
> > A ghost is a path to a disk that you can use to submit inquiries, but it
> > cannot take IO. The HSV controller presents those ghosts through half of
> > its 4 FC ports, for each LUN. A host can force the activation of a ghost
> > by sending a "START_LUN" on it; a valid IO path then becomes a ghost in its
> > place. Maybe other controllers use other activation procedures.
> > 
> > A complete multipath integration in Linux means (to me) :
> > 
> > * auto-detection and setting of the multipath device at boot and at LUN 
> > detection time : with dm (or md) and a block.* hotplug script for 2.6 
> > kernels, this seems easy.
> > * IO balancing across all paths active at any moment, and never send IO 
> > to ghosts : the tough part, as it certainly is vendor-specific.
> > 
> > In my case, the ghost paths can be seen as spares.
> > 
> > This notion of spare/ghost could be kept at the userland level; then the
> > multipath target only needs to know about active paths: easy. But you
> > need to call userland each time you fail a path, for it to try and
> > replace it by activating a ghost.
> > 
> > What are your current thoughts about that, especially about the timing
> > and mechanics of the userland callbacks? What hooks will be in the
> > first release, and what's the larger plan?
> > I certainly wish I could help with testing your target, with up to 8
> > paths per LUN (4/4 active/passive), and I'll try to get the userland right
> > for my environment.
> > 
> > regards,
> > cvaroqui
> > 
> > ----- End forwarded message -----
> 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> 
> Heinz Mauelshagen                                 Sistina Software Inc.
> Senior Consultant/Developer                       Am Sonnenhang 11
>                                                   56242 Marienrachdorf
>                                                   Germany
> Mauelshagen@Sistina.com                           +49 2626 141200
>                                                        FAX 924446
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> 


-- 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19  9:04   ` [lvm] [christophe.varoqui@free.fr: dm multipath target] christophe.varoqui
@ 2003-08-19  9:51     ` Heinz J . Mauelshagen
  2003-08-19 10:48       ` christophe.varoqui
  2003-08-19 14:19       ` James Bottomley
  0 siblings, 2 replies; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-19  9:51 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mge

On Tue, Aug 19, 2003 at 11:04:31AM +0200, christophe.varoqui@free.fr wrote:
> Hello,
> 
> pvscan will block on ghosts (verified here).
> It stays in uninterruptible sleep until a START_LUN is sent through the ghost it
> blocks on. You need a way to detect and skip those paths: this detection is
> vendor-specific and not related to volume management.

No, device-mapper is designed to be completely generic (LVM2 as well, BTW).
It will never contain any vendor-specific hacks.

Ghosts should report EIO in the case of an unstarted LUN anyway, shouldn't they?
In case they are started, they are supposed to report EPERM on IO because
they don't support any.

Even reporting EPERM in both cases would be sufficient, because IO
will never make it through a ghost. Blocking makes no sense to me at all.

> 
> I still consider multipath a requirement that sits above the LVM need,

There are various "flavours" of where to put multipath into the IO chain.
The kernel hackers' conclusion is: in device-mapper.

> and as such it
> shouldn't be bundled with LVM2: LVM2 shouldn't be a requirement for
> multipathing.

It isn't: that was my point about device-mapper and dmsetup.
Multipathing is a device-mapper target and therefore offers a general kernel
service to be used by arbitrary applications.
LVM2 is our tool that will support it in the future.

Other LVM applications are free to implement a different model.

> Moreover, a PV UUID should not be necessary for multipath; the
> inquiry data suffices.

It isn't. Device-mapper has no clue about volume management metadata; LVM2 does.

We need better device recovery in Linux (which will happen in the Linux 2.7
series later). PV UUIDs enable LVM2 to discover PVs.

> 
> In a multipathed environment, with dm multipath maps set up outside LVM2 as I
> suggested, LVM2 just needs to restrict its view (through /etc/lvm.conf) to
> those maps in order to avoid blocking on ghosts.

Yep, it already can, if the admin wants it to.

I would much rather prefer a cleaner ghost implementation that does not block
critical applications.

> 
> I cc: linux-scsi, as opinions from the folks at the OLS talk seem needed.

Ok.

Regards,
Heinz    -- The LVM Guy --

> 
> regards,
> cvaroqui
> 
> Selon "Heinz J . Mauelshagen" <Heinz.Mauelshagen@t-online.de>:
> 
> > 
> > Christophe,
> > 
> > in LVM2, such "ghost disks" will never be used to multipath IO
> > to the device(s), because LVM2 discovers disks by label. In order to do
> > this, the tools need to read the labels off the devices, which is not
> > possible for "ghost disks" if I understand you correctly.
> > 
> > Please keep in mind, that LVM2 tool support for multipathing is under
> > development (code will show up in October).
> > 
> > It should be trivial to call the respective LVM2 tool from a hotplug
> > script to automatically enable new paths to the device.
> > 
> > IO balancing is already supported in our multipath target code.
> > 
> > BTW: People could still use the dmsetup tool (which comes with the
> >      device-mapper source distro) to set up a multipath configuration
> >      accessing "ghost disks". But this is no problem IMO, because
> >      dmsetup is an expert tool which is not to be used by regular users
> >      anyway, who will be using the LVM2 tools.
> > 
> > 
> > Regards,
> > Heinz    -- The LVM Guy --
> > 
> > 
> > > Date: Tue, 19 Aug 2003 00:52:49 +0200
> > > From: christophe varoqui <christophe.varoqui@free.fr>
> > > To: thornber@sistina.com
> > > Subject: dm multipath target
> > > 
> > > Hello,
> > > 
> > > I read that you are preparing a dm multipath target for release.
> > > As a Storageworks HSV client, I would like to know how/if you plan to
> > > manage "ghost disks"?
> > > 
> > > First a little background info:
> > > A ghost is a path to a disk that you can use to submit inquiries, but it
> > > cannot take IO. The HSV controller presents those ghosts through half of
> > > its 4 FC ports, for each LUN. A host can force the activation of a ghost
> > > by sending a "START_LUN" on it; a valid IO path then becomes a ghost in its
> > > place. Maybe other controllers use other activation procedures.
> > > 
> > > A complete multipath integration in Linux means (to me) :
> > > 
> > > * auto-detection and setting of the multipath device at boot and at LUN 
> > > detection time : with dm (or md) and a block.* hotplug script for 2.6 
> > > kernels, this seems easy.
> > > * IO balancing across all paths active at any moment, and never send IO 
> > > to ghosts : the tough part, as it certainly is vendor-specific.
> > > 
> > > In my case, the ghost paths can be seen as spares.
> > > 
> > > This notion of spare/ghost could be kept at the userland level; then the
> > > multipath target only needs to know about active paths: easy. But you
> > > need to call userland each time you fail a path, for it to try and
> > > replace it by activating a ghost.
> > > 
> > > What are your current thoughts about that, especially about the timing
> > > and mechanics of the userland callbacks? What hooks will be in the
> > > first release, and what's the larger plan?
> > > I certainly wish I could help with testing your target, with up to 8
> > > paths per LUN (4/4 active/passive), and I'll try to get the userland right
> > > for my environment.
> > > 
> > > regards,
> > > cvaroqui
> > > 
> > > ----- End forwarded message -----
> > 

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19  9:51     ` Heinz J . Mauelshagen
@ 2003-08-19 10:48       ` christophe.varoqui
  2003-08-19 12:35         ` Heinz J . Mauelshagen
  2003-08-19 14:19       ` James Bottomley
  1 sibling, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-19 10:48 UTC (permalink / raw)
  To: mauelshagen; +Cc: mge, linux-scsi

Selon "Heinz J . Mauelshagen" <Heinz.Mauelshagen@t-online.de>:

> Ghosts should report EIO in the case of an unstarted LUN anyway, shouldn't they?
> In case they are started, they are supposed to report EPERM on IO because
> they don't support any.
> 
All I can say is that it blocks pvscan for a looong time. I haven't had the
patience to see if it finally aborts.

> Even reporting EPERM in both cases would be sufficient, because IO
> will never make it through a ghost. Blocking makes no sense to me at all.
> 
That's the vendor-specific issue I'm talking about: other controllers certainly
return errors, others activate the path on IO submission (with or without a
penalty). HSV simply blocks.

> There are various "flavours" of where to put multipath into the IO chain.
> The kernel hackers' conclusion is: in device-mapper.
> 
No problem with that.

> Multipathing is a device-mapper target and therefore offers a general kernel
> service to be used by arbitrary applications.
> LVM2 is our tool that will support it in the future.
> 
Does it mean the multipath tools from you will be packaged in the LVM compound
binary, or as an independent set?

> Other LVM applications are free to implement a different model.
> 
> Device-mapper has no clue about volume management metadata; LVM2 does.
>
> We need better device recovery in Linux (which will happen in the Linux 2.7
> series later). PV UUIDs enable LVM2 to discover PVs.
> 
Yes, but does it mean _your_ multipath tools will need PV UUIDs?
Will these tools be relevant outside of the LVM2 volume management scope, so
that they can become "standard"?

> I would much rather prefer a cleaner ghost implementation that does not block
> critical applications.
> 
The ghost behaviour being vendor-specific, policy belongs in userspace. If the
vendor says the ghost does not respond to IO, then it's up to userland to
instruct the kernel not to submit IO to it (i.e. to set the device mappings).
As you plan to package a set of userspace tools for multipath management, you
should allow plugins implementing those policies.
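The plugin mechanism being suggested could be sketched as a simple registry keyed by the controller's inquiry vendor string. All names here are hypothetical, for illustration only; no real tool exposed this API:

```python
# Hypothetical vendor-plugin registry for path-activation policies.
# Each plugin maps a vendor inquiry string to a callback that knows
# how to turn a ghost path into an active one.
ACTIVATION_PLUGINS = {}

def register_plugin(vendor, activate):
    """Register an activation callback for a controller vendor."""
    ACTIVATION_PLUGINS[vendor] = activate

def activate_path(vendor, path):
    """Dispatch activation to the vendor plugin, if one exists."""
    plugin = ACTIVATION_PLUGINS.get(vendor)
    if plugin is None:
        return False          # no policy known: leave the path alone
    return plugin(path)

# Example: an HSV plugin would send START_LUN down the ghost path.
def hsv_activate(path):
    # real code would issue a START STOP UNIT through SG_IO here
    return True

register_plugin("COMPAQ HSV", hsv_activate)
```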

regards,
cvaroqui

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 10:48       ` christophe.varoqui
@ 2003-08-19 12:35         ` Heinz J . Mauelshagen
  2003-08-19 13:14           ` christophe.varoqui
  2003-08-19 13:26           ` christophe.varoqui
  0 siblings, 2 replies; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-19 12:35 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mauelshagen, mge, linux-scsi

On Tue, Aug 19, 2003 at 12:48:07PM +0200, christophe.varoqui@free.fr wrote:
> Selon "Heinz J . Mauelshagen" <Heinz.Mauelshagen@t-online.de>:
> 
> > Ghosts should report EIO in the case of an unstarted LUN anyway, shouldn't they?
> > In case they are started, they are supposed to report EPERM on IO because
> > they don't support any.
> > 
> All I can say is that it blocks pvscan for a looong time. I haven't had the
> patience to see if it finally aborts.

:(
Then set up a device filter in lvm.conf to avoid access.
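Such a filter could look like this in /etc/lvm.conf (a sketch; the rejected device names are installation specific and purely illustrative):

```
devices {
    # reject the known ghost paths explicitly, accept everything else
    filter = [ "r|/dev/sdc|", "r|/dev/sdd|", "a|.*|" ]
}
```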

> 
> > Even reporting EPERM in both cases would be sufficient, because IO
> > will never make it through a ghost. Blocking makes no sense to me at all.
> > 
> That's the vendor-specific issue I'm talking about: other controllers certainly
> return errors, others activate the path on IO submission (with or without a
> penalty). HSV simply blocks.

Well, error reporting and handling in Linux is "weak" anyway.
More 2.7 work (as kernel hackers know), including standards for vendor-specific
drivers ITR.

> 
> > There are various "flavours" of where to put multipath into the IO chain.
> > The kernel hackers' conclusion is: in device-mapper.
> > 
> No problem with that.

:)

> 
> > Multipathing is a device-mapper target and therefore offers a general kernel
> > service to be used by arbitrary applications.
> > LVM2 is our tool that will support it in the future.
> > 
> Does it mean the multipath tools from you will be packaged in the LVM compound
> binary, or as an independent set?

In the LVM2 binary.

> 
> > Other LVM applications are free to implement a different model.
> > 
> > Device-mapper has no clue about volume management metadata; LVM2 does.
> >
> > We need better device recovery in Linux (which will happen in the Linux 2.7
> > series later). PV UUIDs enable LVM2 to discover PVs.
> > 
> Yes, but does it mean _your_ multipath tools will need PV UUIDs?

Yes.

> Will these tools be relevant outside of the LVM2 volume management scope, so
> that they can become "standard"?

No restriction: that part of the LVM2 lib could be reused, as it's LGPLed.

> 
> > I much rather prefer a cleaner ghosts implementation not blocking
> > critical applications.
> > 
> The ghost behaviour being vendor-specific, policy belongs in userspace. If the
> vendor says the ghost does not respond to IO, then it's up to userland to
> instruct the kernel not to submit IO to it (i.e. to set the device mappings).

Sorry, this sounds like a workaround mess with arbitrary vendor-specific
drivers and their behaviours. As said: we can filter such devices out
in LVM2.

> As you plan to package a set of userspace tools for multipath management, you
> should allow plugins implementing those policies.

Overkill, because of the missing standard(s). I would much rather prefer
enhanced and correct error reporting.

And as said above: no need because we can easily filter out device nodes
which don't allow (IO) access.

> 
> regards,
> cvaroqui

-- 

Regards,
Heinz    -- The LVM Guy --

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 12:35         ` Heinz J . Mauelshagen
@ 2003-08-19 13:14           ` christophe.varoqui
  2003-08-19 13:26           ` christophe.varoqui
  1 sibling, 0 replies; 19+ messages in thread
From: christophe.varoqui @ 2003-08-19 13:14 UTC (permalink / raw)
  To: mauelshagen; +Cc: mge, linux-scsi

> 
> Sorry, this sounds like a workaround mess with arbitrary vendor-specific
> drivers and their behaviours. As said: we can filter such devices out
> in LVM2.
> 
> I would much rather prefer enhanced and correct error reporting.
> 
> And as said above: no need because we can easily filter out device nodes
> which don't allow (IO) access.
> 
Yes, but a ghost might become active when:
* any host that has access to it through LUN masking sends a START_LUN
* a controller restarts or crashes; active paths on that controller are then
migrated to the other one, thus activating ghosts.

I understand that you want to rely on kernel error reporting to take the failed
paths out of the multipath, but how will you bring the freshly activated
paths back in if you have filtered them out?

If you do not bring them in, you can reach a situation where your multipath
pool ends up empty even though there are valid paths around.

You won't be able to get away with an initial static marking of active/inactive
paths: this marking is dynamic. Only a proper callback on failing a path and on
manual activation can help vendor-specific plugins keep those lists in sync.
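The bookkeeping being argued for here can be sketched as a tiny state machine (illustrative only; the class and callback names are hypothetical, not any real tool's interface):

```python
# Sketch of dynamic active/ghost path bookkeeping for one multipath device.
# On a path failure the failed path is demoted to ghost and, if needed,
# a ghost is promoted in its place via a vendor-specific activation callback.
class MultipathState:
    def __init__(self, active, ghosts):
        self.active = list(active)   # paths currently taking IO
        self.ghosts = list(ghosts)   # spare paths, inquiry-only

    def on_path_failure(self, path, activate):
        """Callback fired (e.g. from a hotplug/event script) on a failed path.

        `activate` is the vendor plugin that turns a ghost into an active
        path, e.g. by sending START_LUN down it."""
        if path in self.active:
            self.active.remove(path)
            self.ghosts.append(path)      # failed active path becomes a ghost
        if not self.active and self.ghosts:
            candidate = self.ghosts.pop(0)
            if activate(candidate):       # vendor-specific activation
                self.active.append(candidate)
```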

Am I overlooking or misreading a capability of your tools?

regards,
cvaroqui

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 12:35         ` Heinz J . Mauelshagen
  2003-08-19 13:14           ` christophe.varoqui
@ 2003-08-19 13:26           ` christophe.varoqui
  2003-08-19 16:23             ` Heinz J . Mauelshagen
  1 sibling, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-19 13:26 UTC (permalink / raw)
  To: mauelshagen; +Cc: mge, linux-scsi

> > Yes, but does it mean _your_ multipath tools will need PV UUIDs?
> 
> Yes.
> 
You realize that this makes those tools unsuitable for general use:
* one may not want to use the full LVM compound binary to set up multipaths
* one may not want to tag one's devices to set up multipaths where it's not necessary

Shouldn't LVM2 just rely on a separate (dm-based) multipath management layer?
There will surely be one such implementation. We users don't need fragmentation
in such a low-level area.

PS: I certainly do not disregard Sistina's work on this front, and do not mean
to be disrespectful. I'm just trying to put forward my admin point of view.

regards,
cvaroqui

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19  9:51     ` Heinz J . Mauelshagen
  2003-08-19 10:48       ` christophe.varoqui
@ 2003-08-19 14:19       ` James Bottomley
  2003-08-19 16:09         ` Heinz J . Mauelshagen
  1 sibling, 1 reply; 19+ messages in thread
From: James Bottomley @ 2003-08-19 14:19 UTC (permalink / raw)
  To: mauelshagen; +Cc: christophe.varoqui, mge, SCSI Mailing List

On Tue, 2003-08-19 at 04:51, Heinz J . Mauelshagen wrote:
> On Tue, Aug 19, 2003 at 11:04:31AM +0200, christophe.varoqui@free.fr wrote:
> No, device-mapper is designed completely generic (LVM2 as well BTW).
> It will never contain any vendor specific hacks.

We don't need it to contain any vendor code, but we do need an interface
for vendor specific additions.

How arrays accomplish a path switch is highly vendor specific.  Some
(like EMC) run all paths active, so nothing needs doing.  Others need a
specific command sent to the array down the path before the switch can
be done.

> Ghosts should report EIO in the case of an unstarted LUN anyway, shouldn't they?
> In case they are started, they are supposed to report EPERM on IO because
> they don't support any.

This is awfully vendor specific again.  I know, for example, the HP MSA
array will report a check condition, not ready.  I think we translate
this to I/O error in the mid-layer (best we can do).  This is why we'll
do a better job of error reporting in fastfail.

James



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 14:19       ` James Bottomley
@ 2003-08-19 16:09         ` Heinz J . Mauelshagen
  0 siblings, 0 replies; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-19 16:09 UTC (permalink / raw)
  To: James Bottomley; +Cc: mauelshagen, christophe.varoqui, mge, SCSI Mailing List

On Tue, Aug 19, 2003 at 09:19:53AM -0500, James Bottomley wrote:
> On Tue, 2003-08-19 at 04:51, Heinz J . Mauelshagen wrote:
> > On Tue, Aug 19, 2003 at 11:04:31AM +0200, christophe.varoqui@free.fr wrote:
> > No, device-mapper is designed completely generic (LVM2 as well BTW).
> > It will never contain any vendor specific hacks.
> 
> We don't need it to contain any vendor code, but we do need an interface
> for vendor specific additions.

In case you need no more than the given target methods in
device-mapper (e.g. the mapping and endio methods), you've already got it :)

> 
> How arrays accomplish a path switch is highly vendor specific.  Some
> (like EMC) run all paths active, so nothing needs doing.  Others need a
> specific command sent to the array down the path before the switch can
> be done.

I know, and those private policies should not be addressed
in device-mapper but rather in userspace, so that there are accessible
paths once the mapping tables are activated.

> 
> > Ghosts should report EIO in the case of an unstarted LUN anyway, shouldn't they?
> > In case they are started, they are supposed to report EPERM on IO because
> > they don't support any.
> 
> This is awfully vendor specific again.  I know, for example, the HP MSA
> array will report a check condition, not ready.  I think we translate
> this to I/O error in the mid-layer (best we can do).

Ok, fair enough. We know that we need better error reporting and
handling in 2.7.
Plus we need some decent "basic" errors reported by such smart devices,
so that we don't end up with piles of vendor-specific modules.

Now I hear you say: "No chance short term" ;)
Right, that's why I'd like to cover such hassle in userspace rather than
in the kernel if possible.

> This is why we'll
> do a better job of error reporting in fastfail.

:)

> 
> James
> 

-- 

Regards,
Heinz    -- The LVM Guy --

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 13:26           ` christophe.varoqui
@ 2003-08-19 16:23             ` Heinz J . Mauelshagen
  2003-08-19 23:22               ` christophe varoqui
  0 siblings, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-19 16:23 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mauelshagen, mge, linux-scsi

On Tue, Aug 19, 2003 at 03:26:12PM +0200, christophe.varoqui@free.fr wrote:
> > > Yes, but does it mean _your_ multipath tools will need PV UUIDs?
> > 
> > Yes.
> > 
> You realize that this makes those tools unsuitable for general use:
> * one may not want to use the full LVM compound binary to set up multipaths
> * one may not want to tag one's devices to set up multipaths where it's not necessary

The "Yes" was about the LVM2 tools, and we do recommend them to be installed
to support the variety of volume management features.

If you have a corner use case where you don't need any of them but multipath,
dmsetup can be used to cover what you want out of the box (by creating ASCII
mapping tables, which is simple for multipath).
No need to install LVM2 at all for that.
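For illustration, a dm mapping table is a one-line ASCII description fed to dmsetup on stdin. The sketch below only formats such a line; the multipath target's actual argument syntax was still under development at the time, so the fields shown are hypothetical:

```python
# Format a hypothetical one-line device-mapper table for a multipath map.
# Fields: start sector, length in sectors, target name, then target args
# (here just the candidate path devices). The real dm-multipath argument
# syntax differs; this only illustrates the "ASCII table" mechanism.
def multipath_table(num_sectors, paths):
    return "0 %d multipath %s" % (num_sectors, " ".join(paths))

# Usage (as root, via the dmsetup expert tool):
#   echo "0 4194304 multipath /dev/sda /dev/sdb" | dmsetup create mpath0
```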

> 
> Shouldn't LVM2 just rely on a separate (dm-based) multipath management layer?
> There will surely be one such implementation. We users don't need fragmentation
> in such a low-level area.

dm is the multipath runtime (through a dm multipath target).

As said: you can set up multipath configs without LVM2, in case you
don't need full volume management functionality, using the dmsetup tool.
This is not the recommended way, but you can still do it ;)

> 
> PS: I certainly do not disregard Sistina's work on this front, and do not mean
> to be disrespectful. I'm just trying to put forward my admin point of view.

Of course not.

> 
> regards,
> cvaroqui

-- 

Regards,
Heinz    -- The LVM Guy --

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 16:23             ` Heinz J . Mauelshagen
@ 2003-08-19 23:22               ` christophe varoqui
  2003-08-20 13:02                 ` Heinz J . Mauelshagen
  0 siblings, 1 reply; 19+ messages in thread
From: christophe varoqui @ 2003-08-19 23:22 UTC (permalink / raw)
  To: mauelshagen; +Cc: linux-scsi

On Tue, 19 Aug 2003 18:23:47 +0200
Heinz.Mauelshagen@t-online.de (Heinz J . Mauelshagen) wrote:

> > You realize that this makes those tools unsuitable for general use:
> > * one may not want to use the full LVM compound binary to set up multipaths
> > * one may not want to tag one's devices to set up multipaths where it's not necessary
> 
> The "Yes" was about the LVM2 tools and we do recommend them to be installed
> to support the variaty of volume management features.
> 
> If you have a corner use case where you don't need any of them but multipath,
> dmsetup can be used to cover what you want out of the box (by creating ASCII
> mapping tables, which is simple for multipath).
> No need to install LVM2 at all for that.
> 
This is not a corner-case use: most users who face the multipath problem deal with FC RAID controllers that embed the volume management. Those users need a clean and standalone multipath layer, not a Swiss-army-knife volume manager.

I understand that the implementation you propose offers a clean and coherent interface for volume management and multipath, but it will work as expected only for "all paths to the LUN are active" configurations: i.e. high-end EMC clients (who don't need host volume management) and "FC JBOD connected through 2 HBAs" users. All other users must cope with ghosts.

You suggest admins can filter those out with the LVM2 config file: OK, easy to do at initialisation time, but how can admins cope with ghosts becoming active and valid paths becoming ghosts?
In the first case, admins must insert the newly activated path into the multipath map _when the event occurs_.
In the second case, admins must blacklist the formerly valid path in the LVM2 config file _when the event occurs_.

It can only be done by scripts/plugins triggered by path failures, as:
* a ghost becomes active if
  1) the host explicitly asks for it, in which case housekeeping is up to the initiator of the activation request
  2) the raid controller has failed an active path over to the ghost, which implies the hosts get a path failure in the mp target anyway
* an active path becoming a ghost implies a path failure

As a whole MP device is vendor-coherent, but different MP devices can be held by different vendors, I wonder whether it would be right to set the vendor-specific script/plugin path as a parameter of the target map.

Comments ?

> > 
> > Shouldn't LVM2 just rely on a separate (dm-based) multipath management layer?
> > There will surely be one such implementation. We users don't need fragmentation
> > in such a low-level area.
> 
> dm is the multipath runtime (through a dm multipath target).
> 
No discussion here.

> As said: you can set up multipath configs without LVM2, in case you
> don't need full volume management functionality, using the dmsetup tool.
> This is not the recommended way, but you can still do it ;)
> 
You just cannot recommend what you described to an HPQ StorageWorks customer like me :)
If the reasons why are still unclear, please point out the blurry details, as I'll be glad to elaborate.

regards,
 cvaroqui

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-19 23:22               ` christophe varoqui
@ 2003-08-20 13:02                 ` Heinz J . Mauelshagen
  2003-08-20 14:19                   ` christophe.varoqui
  0 siblings, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-20 13:02 UTC (permalink / raw)
  To: christophe varoqui; +Cc: mauelshagen, linux-scsi

On Wed, Aug 20, 2003 at 01:22:04AM +0200, christophe varoqui wrote:
> On Tue, 19 Aug 2003 18:23:47 +0200
> Heinz.Mauelshagen@t-online.de (Heinz J . Mauelshagen) wrote:
> 
> > > You realize that this makes those tools unsuitable for general use:
> > > * one may not want to use the full LVM compound binary to set up multipaths
> > > * one may not want to tag one's devices to set up multipaths where it's not necessary
> > 
> > The "Yes" was about the LVM2 tools, and we do recommend them to be installed
> > to support the variety of volume management features.
> > 
> > If you have a corner use case where you don't need any of them but multipath,
> > dmsetup can be used to cover what you want out of the box (by creating ASCII
> > mapping tables, which is simple for multipath).
> > No need to install LVM2 at all for that.
> > 
This is not a corner case : most users who face multipath problems cope with FC raid controllers that embed the volume management. Those users need a clean, stand-alone multipath layer, not a swiss-knife volume manager.

Agreed that smart controllers and intelligent disk subsystems have intrinsic
volume management capabilities.
The point you miss, IMHO, is that people using those nevertheless want volume
management on top of their EMC boxen in order to virtualize their storage
(i.e. to move data online or to have logical volumes spanning multiple
disk subsystems).

IOW: people want storage virtualization of _all_ their storage devices,
     not just some, to handle their varying storage management needs for all of
     their storage.

That's why I talked about a corner case.

> 
> I understand that the implementation you propose offers a clean and coherent interface for volume management and multipath, but it will work as expected only for "all paths to LUN are active" configurations : ie high-end EMC clients (who don't need host volume management) and "FC JBOD connected through 2 HBA"-users. All other users must cope with ghosts.
> 

Well, the high-end storage boxen don't need volume management SW to do their
private business, right. But the point is, that volume management is not a
single storage device business as mentioned above.

> You suggest admins can filter those out with LVM2 config file : OK, easy to do at initialisation time, but how can admins cope with ghosts becoming active and valid paths becoming ghosts ?
> In the first case admins must insert the newly activated path in the multipath map _when the event occurs_.

If valid paths can become ghosts, that is an argument that ghosts need to report
errors on io accesses, whereas they actually hang the application if they
behave the way you described for ghosts in the first place.

Other than that, hotplug events should be used to cope with this.

> In the second case, admins must blacklist a former valid path in the LVM2 config file _when the event occurs_.

The filter in the config file is only used by the tools, not by
device-mapper at runtime. device-mapper's multipath target needs to get
an error reported in case a path fails in order to recognize it.

Hotplug to blacklist those as well ?

> 
> It can only be done by scripts/plugins triggered by path failures as
> * a ghost becomes active if 
>   1) the host explicitly asks for it; housekeeping is in charge of the initiator of the activation demand
>   2) the raid controller has failed an active path over to the ghost, which implies the host gets a path failure in an mp target anyway
> * an active path becoming a ghost implies a path failure
> 
> Since a whole MP device is vendor-coherent, but different MP devices can be held by different vendors, I wonder whether it would be right to set the vendor-specific script/plugin path as a parameter of the target map.
> 

The "activate all paths we want to use" is outside the scope of LVM.

a. If the vendor's model handles activation on the host side, a script to do
   it (or some vendor daemon handling this on the host or hotplug) is necessary
   anyway because there's no standard.
   Additional programs need to be hookable into the (de)activation
   event queue to handle config file updates.

b. And if an invalid io path reports an error on io access (as I mentioned
   earlier in my posts), which is elementary to enable multipath to work as
   mentioned above (valid path -> ghost change), LVM2 and device-mapper can
   cope with it now. No principal need for config file updates if b is given
   (rather than maybe performance penalties accessing invalid paths
    by the tools)


If a valid io path can change to a ghost _and_ no io error is
reported accessing the ghost then, the system is in deep trouble anyway.

Or what happens now if you move LVM2/device-mapper completely out of the picture
and mount a filesystem on a valid path which turns into a ghost while
the filesystem is mounted and under load ?

Regards,
Heinz    -- The LVM Guy --





> Comments ?
> 
> > > 
> > > Shouldn't LVM2 just rely on a separate multipath management (dm-based)? There
> > > surely will be one such implementation. We users don't need fragmentation in
> > > such a low-level area.
> > 
> > dm is the multipath runtime (through a dm multipath target).
> > 
> No discussion here.
> 
> > As said: you can set up multipath configs without LVM2 in case you
> > don't need full volume management functionality using the dmsetup tool.
> > This is not the recommended way but you still can do it ;)
> > 
> You just cannot recommend what you described to an HPQ StorageWorks customer, like me :)
> If the reasons why are still unclear, please point out the blurry details as I'll be glad to elaborate.
> 
> regards,
>  cvaroqui

*** Software bugs are stupid.
    Nevertheless it needs not so stupid people to solve them ***

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-20 13:02                 ` Heinz J . Mauelshagen
@ 2003-08-20 14:19                   ` christophe.varoqui
  2003-08-21 12:47                     ` Heinz J . Mauelshagen
  0 siblings, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-20 14:19 UTC (permalink / raw)
  To: mauelshagen; +Cc: linux-scsi

> > 
> > This is not a corner case : most users who face multipath problems cope
> > with FC raid controllers that embed the volume management. Those users
> > need a clean, stand-alone multipath layer, not a swiss-knife volume
> > manager.
> 
> Agreed that smart controllers and intelligent disk subsystems have intrinsic
> volume management capabilities.
> The point you miss, IMHO, is that people using those nevertheless want volume
> management on top of their EMC boxen in order to virtualize their storage
> (i.e. to move data online or to have logical volumes spanning multiple
> disk subsystems).
> 
> IOW: people want storage virtualization of _all_ their storage
> devices,
>      not just some, to handle their varying storage management needs for all
> of
>      their storage.
> 
> That's why I talked about a corner case.
> 
I still politely disagree. Take the same HSV controller : this beast hosts up to
252 FC disks, up to 144 GB each. This controller embeds a _real_ volume manager
that allocates extents (striped, raid0, raid5) and uses a rotating allocator that
spreads every LUN on ALL disks. It also features a load-leveler and an
allocation-leveler daemon. Admins can resize the LUNs, no matter the parity
level ... and soon Linux will support that (as Win2k & Tru64 already do).

The only use of host volume management would be persistently mirroring 2 such
arrays, which LVM2 does not propose at the moment.

I, for one, would like the multipath implementation to be stand-alone.
LVM2 can be layered on top of that without any modification : I see no advantage
for you in merging the two subsystems.

> > 
> > I understand that the implementation you propose offers a clean and
> > coherent interface for volume management and multipath, but it will work as
> > expected only for "all paths to LUN are active" configurations : ie high-end
> > EMC clients (who don't need host volume management) and "FC JBOD connected
> > through 2 HBA"-users. All other users must cope with ghosts.
> > 
> 
> Well, the high-end storage boxen don't need volume management SW to do their
> private business, right. But the point is, that volume management is not a
> single storage device business as mentioned above.
> 
Point taken, but where you say "volume management is the answer", I would rather
say "a clean & lean multipath implementation is the answer".

> > You suggest admins can filter those out with LVM2 config file : OK, easy to
> > do at initialisation time, but how can admins cope with ghosts becoming
> > active and valid paths becoming ghosts ?
> > In the first case admins must insert the newly activated path in the
> > multipath map _when the event occurs_.
> 
> If valid paths can become ghosts, that is an argument that ghosts need to report
> errors on io accesses, whereas they actually hang the application if they
> behave the way you described for ghosts in the first place.
> 
Yes, James B seems confident that failfast will permit that.
I'm eager to run tests if needed. 
James, would the patch you posted along with the OLS talk minutes, on top of a
test3 kernel, be sufficient to see the behaviour change with ghosts ?
 
> Other than that, hotplug events should be used to cope with this.
> 
No, I guess not : realize that a ghost becoming a valid path triggers _nothing_
in a current kernel. I do not even know whether the HSV notifies such a change.
Thus, no hotplug event gets triggered.

> > In the second case, admins must blacklist a former valid path in the LVM2
> config file _when the event occurs_.
> 
> The filter in the config file is only used by the tools, not by
> device-mapper at runtime. device-mapper's multipath target needs to get
> an error reported in case a path fails in order to recognize it.
> 
> Hotplug to blacklist those as well ?
> 
As said above, no hotplug event will be generated when ghosts and valid paths get
swapped around. But there will be failpath events, which need to be notified to a
vendor-specific script.

> > 
> > It can only be done by scripts/plugins triggered by path failures as
> > * a ghost becomes active if 
> >   1) the host explicitly asks for it; housekeeping is in charge of the
> initiator of the activation demand
> >   2) the raid controller has failed an active path over to the ghost which
> implies the host gets a path failure in an mp target anyway
> > * an active path becoming a ghost implies a path failure
> > 
> > Since a whole MP device is vendor-coherent, but different MP devices can be
> held by different vendors, I wonder whether it would be right to set the
> vendor-specific script/plugin path as a parameter of the target map.
> > 
> 
> The "activate all paths we want to use" is outside the scope of LVM.
> 
> a. If the vendor's model handles activation on the host side, a script to do
>    it (or some vendor daemon handling this on the host or hotplug) is
> necessary
>    anyway because there's no standard.
>    Additional programs need to be hookable into the (de)activation
>    event queue to handle config file updates.
> 
ok,
but I would rather say "the mp framework (be it LVM2 or anything else) accepts
vendor-specific plugins" than "vendor-specific hacks must accept plugins for LVM2
config housekeeping". Don't you agree ?

> b. And if an invalid io path reports an error on io access (as I mentioned
>    earlier in my posts), which is elementary to enable multipath to work as
>    mentioned above (valid path -> ghost change), LVM2 and device-mapper can
>    cope with it now. No principal need for config file updates if b is given
>    (rather than maybe performance penalties accessing invalid paths
>     by the tools)
>
No, you still need to rescan for new valid paths outside the dm target's control,
as a path going ghost often implies a ghost going valid. The new valid path must
be added to the dm target to restore the redundancy level and nominal performance.
This rescan must be triggered on path failure, and this is vendor-specific too.

>
> If a valid io path can change to a ghost _and_ no io error is
> reported accessing the ghost then, the system is in deep trouble anyway.
> 
I can START_LUN it back to unhang processes. Going ghost is not the same as failing.

> Or what happens now if you move LVM2/device-mapper completely out of the
> picture
> and mount a filesystem on a valid path which turns into a ghost while
> the filesystem is mounted and under load ?
Same as above.
But why would I want to do that anyway ?
Multipath devices need to be accessed through a multipath metadevice.

If Linux cannot do it, do not use Linux with this array : this is my regretful
conclusion. Hope it changes soon because I have a load of Linux-wannabe servers
& apps in that situation.
Of course there is HP SecurePath, but what's the point of an Open Source kernel
tainted with closed modules ?

regards,
cvaroqui

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-20 14:19                   ` christophe.varoqui
@ 2003-08-21 12:47                     ` Heinz J . Mauelshagen
  2003-08-21 16:34                       ` christophe.varoqui
  0 siblings, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-21 12:47 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mauelshagen, linux-scsi

On Wed, Aug 20, 2003 at 04:19:24PM +0200, christophe.varoqui@free.fr wrote:
> > > 
> > > This is not a corner case : most users who face multipath problems cope
> > > with FC raid controllers that embed the volume management. Those users
> > > need a clean, stand-alone multipath layer, not a swiss-knife volume
> > > manager.
> > 
> > Agreed that smart controllers and intelligent disk subsystems have intrinsic
> > volume management capabilities.
> > The point you miss, IMHO, is that people using those nevertheless want volume
> > management on top of their EMC boxen in order to virtualize their storage
> > (i.e. to move data online or to have logical volumes spanning multiple
> > disk subsystems).
> > 
> > IOW: people want storage virtualization of _all_ their storage
> > devices,
> >      not just some, to handle their varying storage management needs for all
> > of
> >      their storage.
> > 
> > That's why I talked about a corner case.
> > 
> I still politely disagree. Take the same HSV controller : this beast hosts up to
> 252 FC disks, up to 144 GB each. This controller embeds a _real_ volume manager
> that allocates extents (striped, raid0, raid5) and uses a rotating allocator that
> spreads every LUN on ALL disks. It also features a load-leveler and an
> allocation-leveler daemon. Admins can resize the LUNs, no matter the parity
> level ... and soon Linux will support that (as Win2k & Tru64 already do).


Of course you find such smart solutions out there (look at large
EMC Symmetrix etc.).
I just say it is not the general volume management case.

> 
> The only use of host volume management would be persistently mirroring 2 such
> arrays, which LVM2 does not propose at the moment.

Yep, that's an example (we'll support in fall this year).

Others are volumes larger than the largest array, striping beyond array limits,
hot spot removal and online data relocation to newer disk subsystems in order
to remove the old ones.

> 
> I, for one, would like the multipath implementation to be stand-alone.
> LVM2 can be layered on top of that without any modification : I see no advantage
> for you in merging the two subsystems.

They aren't. As I tried to explain before, LVM2 is just a volume management
application that will support multipathing shortly.

If more people need stand-alone multipathing, dmsetup can be used together
with some scripting, or a small multipathing application can be created.

I just can't see that there's lots of demand ;)

> 
> > > 
> > > I understand that the implementation you propose offers a clean and
> > > coherent interface for volume management and multipath, but it will work as
> > > expected only for "all paths to LUN are active" configurations : ie high-end
> > > EMC clients (who don't need host volume management) and "FC JBOD connected
> > > through 2 HBA"-users. All other users must cope with ghosts.
> > > 
> > 
> > Well, the high-end storage boxen don't need volume management SW to do their
> > private business, right. But the point is, that volume management is not a
> > single storage device business as mentioned above.
> > 
> Point taken, but where you say "volume management is the answer", I would rather
> say "a clean & lean multipath implementation is the answer".

See above, please.

> 
> > > You suggest admins can filter those out with LVM2 config file : OK, easy to
> > > do at initialisation time, but how can admins cope with ghosts becoming
> > > active and valid paths becoming ghosts ?
> > > In the first case admins must insert the newly activated path in the
> > > multipath map _when the event occurs_.
> > 
> > If valid paths can become ghosts, that is an argument that ghosts need to report
> > errors on io accesses, whereas they actually hang the application if they
> > behave the way you described for ghosts in the first place.
> > 
> Yes, James B seems confident that failfast will permit that.
> I'm eager to run tests if needed. 
> James, would the patch you posted along with the OLS talk minutes, on top of a
> test3 kernel, be sufficient to see the behaviour change with ghosts ?
>  
> > Other than that, hotplug events should be used to cope with this.
> > 
> No, I guess not : realize that a ghost becoming a valid path triggers _nothing_
> in a current kernel. I do not even know whether the HSV notifies such a change.
> Thus, no hotplug event gets triggered.

Too bad.

> 
> > > In the second case, admins must blacklist a former valid path in the LVM2
> > config file _when the event occurs_.
> > 
> > The filter in the config file is only used by the tools, not by
> > device-mapper at runtime. device-mapper's multipath target needs to get
> > an error reported in case a path fails in order to recognize it.
> > 
> > Hotplug to blacklist those as well ?
> > 
> As said above, no hotplug event will be generated when ghosts and valid paths get
> swapped around.

Ok.

> But there will be failpath events, which need to be notified to a
> vendor-specific script.

Which in turn can run the LVM2 tools, or dmsetup in case LVM2 is not wanted,
to reload changed multipath device mappings.

> 
> > > 
> > > It can only be done by scripts/plugins triggered by path failures as
> > > * a ghost becomes active if 
> > >   1) the host explicitly asks for it; housekeeping is in charge of the
> > initiator of the activation demand
> > >   2) the raid controller has failed an active path over to the ghost which
> > implies the host gets a path failure in an mp target anyway
> > > * an active path becoming a ghost implies a path failure
> > > 
> > > Since a whole MP device is vendor-coherent, but different MP devices can be
> > held by different vendors, I wonder whether it would be right to set the
> > vendor-specific script/plugin path as a parameter of the target map.
> > > 
> > 
> > The "activate all paths we want to use" is outside the scope of LVM.
> > 
> > a. If the vendor's model handles activation on the host side, a script to do
> >    it (or some vendor daemon handling this on the host or hotplug) is
> > necessary
> >    anyway because there's no standard.
> >    Additional programs need to be hookable into the (de)activation
> >    event queue to handle config file updates.
> > 
> ok,
> but I would rather say "the mp framework (be it LVM2 or anything else) accepts
> vendor-specific plugins" than "vendor-specific hacks must accept plugins for LVM2
> config housekeeping". Don't you agree ?

No, Linux device-mapper multipathing sits generically above any vendor-specific
lower-level drivers and is set up from user-land (dmsetup or the device-mapper library used by LVM2).
If only the vendor knows about any internal state machinery of his storage
subsystem and reports related state changes in a vendor specific way, part
of that report processing should interface into device-mapping.

> 
> > b. And if an invalid io path reports an error on io access (as I mentioned
> >    earlier in my posts), which is elementary to enable multipath to work as
> >    mentioned above (valid path -> ghost change), LVM2 and device-mapper can
> >    cope with it now. No principal need for config file updates if b is given
> >    (rather than maybe performance penalties accessing invalid paths
> >     by the tools)
> >
> No, you still need to rescan for new valid paths outside the dm target's control,
> as a path going ghost often implies a ghost going valid. The new valid path must
> be added to the dm target to restore the redundancy level and nominal performance.
> This rescan must be triggered on path failure, and this is vendor-specific too.

The vendor-specific thing is then the initiator and should drive the
device-mapping changes.

> 
> >
> > If a valid io path can change to a ghost _and_ no io error is
> > reported accessing the ghost then, the system is in deep trouble anyway.
> > 
> I can START_LUN it back to unhang processes. Going ghost is not the same as failing.

Ok. Principally to be covered the same way as above.

> 
> > Or what happens now if you move LVM2/device-mapper completely out of the
> > picture
> > and mount a filesystem on a valid path which turns into a ghost while
> > the filesystem is mounted and under load ?
> Same as above.
> But why would I want to do that anyway ?

This was a single-path example to clarify the failure behaviour of ghosts
for me.

> Multipath devices need to be accessed through a multipath metadevice.
> 
> If Linux cannot do it, do not use Linux with this array : this is my regretful
> conclusion. Hope it changes soon because I have a load of Linux-wannabe servers
> & apps in that situation.

What about the existing MD multipath ?
How does it work together with that array ?

> Of course there is HP SecurePath, but what's the point of an Open Source kernel
> tainted with closed modules ?

No point.

> 
> regards,
> cvaroqui

-- 

Regards,
Heinz    -- The LVM Guy --

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-21 12:47                     ` Heinz J . Mauelshagen
@ 2003-08-21 16:34                       ` christophe.varoqui
  2003-08-22  8:51                         ` Heinz J . Mauelshagen
  0 siblings, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-21 16:34 UTC (permalink / raw)
  To: mauelshagen; +Cc: linux-scsi

> > 
> > The only use of host volume management would be persistently mirroring 2
> such
> > arrays, which LVM2 does not propose at the moment.
> 
> Yep, that's an example (we'll support in fall this year).
> 
> Others are volumes larger than the largest array, striping beyond array
> limits,
> hot spot removal and online data relocation to newer disk subsystems in
> order
> to remove the old ones.
>
yes
 
> > 
> > I, for one, would like the multipath implementation to be stand-alone.
> > LVM2 can be layered on top of that without any modification : I see no
> > advantage
> > for you in merging the two subsystems.
> 
> They aren't. As I tried to explain before, LVM2 is just a volume management
> application that will support multipathing shortly.
> 
> If more people need stand-alone multipathing, dmsetup can be used together
> with some scripting, or a small multipathing application can be created.
> 
Why favor fragmentation ? An independent multipath implementation would
become a de facto standard. Instead, you force others to propose alternative
implementations : EVMS will do that, and surely in the monolithic IBM
way too, then another lean implementation will appear ...
Which implementation will the vendors support ? All, random, none ...
Anyway, not good for us users.

You said it's not harder to do it independently, so why not ?

> I just can't see that there's lots of demand ;)
> 
Indeed, we feel alone in this thread :)

> > realize that a ghost becoming a valid path triggers
> > _nothing_
> > in a current kernel. I do not even know whether the HSV notifies such a change.
> > Thus, no hotplug event gets triggered.
> 
> Too bad.
> 
Yep

> > But there will be failpath events, which need to be notified to a
> > vendor-specific script.
> 
> Which in turn can run the LVM2 tools, or dmsetup in case LVM2 is not wanted,
> to reload changed multipath device mappings.
> 
Yes,
the question boils down to "does/will the dm multipath target forward failpath
events to userspace ?"

> > but I would rather say "the mp framework (be it LVM2 or anything else) accepts
> > vendor-specific plugins" than "vendor-specific hacks must accept plugins for
> > LVM2
> > config housekeeping". Don't you agree ?
> 
> No, Linux device-mapper multipathing sits generically above any vendor
> specific lower level drivers and is set up from user-land (dmsetup or
> device-mapper library used by LVM2).
> If only the vendor knows about any internal state machinery of his storage
> subsystem and reports related state changes in a vendor specific way, part
> of that report processing should interface into device-mapping.
> 
Such a daemon may not be able to keep its in-memory representation of path
states in sync with reality without path-failure notification.
In the case of HSVs, I scanned all the INQUIRY data and saw nothing that
differentiates a ghost from a valid path. Most sense data is unreachable, as
sg_mode hangs on ghosts while trying to retrieve it.

> > No, you still need to rescan for new valid paths outside the dm target's
> control,
> > as a path going ghost often implies a ghost going valid. The new valid path
> must
> > be added to the dm target to restore the redundancy level and nominal
> performance.
> > This rescan must be triggered on path failure, and this is vendor-specific
> too.
> 
> The vendor specific thing is the initiator then and should address
> device-mapping changes.
> 
As above

> 
> What about the existing MD multipath ?
> How does it work together with that array ?
>
Awfully :)
I may be able to autoconfigure md mps at boot, but :
With ghosts included in md : a failover can elect a ghost and then hang forever
(certainly true with a dm mp target too, but failfast could change that ?)
With ghosts excluded from md : I need a callback on path failure to hot-add
activated ghosts. This callback is absent (mdadm does that asynchronously, but I
had no luck with that feature)
Both solutions : the selected active path going ghost leaves me hung, with no
failover happening (as you do load balancing, you will certainly be hit by that
even more)


We have had 3 hot spots in this thread :

1) Failure notification on a ghost IO path is needed : hanging is not acceptable.
Agreement reached here :)

2) Multipath support for "bizarre" arrays needs a userspace callback on path failure.
Agreement seems to be reached. Still not clear what provides this callback :
MD/DM or the block layer ?

3) The multipath implementation needs to be separate from volume managers. (DM is not
a volume manager, LVM2 & EVMS are !)
Agreement not reached :(

Do others on this list have comments/opinions ?

regards,
cvaroqui
-- 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-21 16:34                       ` christophe.varoqui
@ 2003-08-22  8:51                         ` Heinz J . Mauelshagen
  2003-08-22 14:59                           ` Patrick Mansfield
  0 siblings, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2003-08-22  8:51 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mauelshagen, linux-scsi

On Thu, Aug 21, 2003 at 06:34:52PM +0200, christophe.varoqui@free.fr wrote:
> > > 
> > > The only use of host volume management would be persistently mirroring 2
> > such
> > > arrays, which LVM2 does not propose at the moment.
> > 
> > Yep, that's an example (we'll support in fall this year).
> > 
> > Others are volumes larger than the largest array, striping beyond array
> > limits,
> > hot spot removal and online data relocation to newer disk subsystems in
> > order
> > to remove the old ones.
> >
> yes
>  
> > > 
> > > I, for one, would like the multipath implementation to be stand-alone.
> > > LVM2 can be layered on top of that without any modification : I see no
> > > advantage
> > > for you in merging the two subsystems.
> > 
> > They aren't. As I tried to explain before, LVM2 is just a volume management
> > application that will support multipathing shortly.
> > 
> > If more people need stand-alone multipathing, dmsetup can be used together
> > with some scripting, or a small multipathing application can be created.
> > 
> Why favor fragmentation ? An independent multipath implementation would
> become a de facto standard. Instead, you force others to propose alternative
> implementations : EVMS will do that, and surely in the monolithic IBM
> way too, then another lean implementation will appear ...
> Which implementation will the vendors support ? All, random, none ...
> Anyway, not good for us users.
> 
> You said it's not harder to do it independently, so why not ?

dmsetup + script.

> 
> > I just can't see that there's lots of demand ;)
> > 
> Indeed, we feel alone in this thread :)

If there's more demand later, it will happen.

> 
> > > realize that a ghost becoming a valid path triggers
> > > _nothing_
> > > in a current kernel. I do not even know whether the HSV notifies such a change.
> > > Thus, no hotplug event gets triggered.
> > 
> > Too bad.
> > 
> Yep
> 
> > > But there will be failpath events, which need to be notified to a
> > > vendor-specific script.
> > 
> > Which in turn can run the LVM2 tools, or dmsetup in case LVM2 is not wanted,
> > to reload changed multipath device mappings.
> > 
> Yes,
> the question boils down to "does/will the dm multipath target forward failpath
> events to userspace ?"

It does. "dmsetup status MAPPED-DEVICE" shows it (using the LGPLed
device-mapper library).
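As a minimal sketch of how a userspace watcher could consume that, the fragment below parses one status line and flags failed paths. The status format used here (device/state pairs, with "F" meaning failed) is assumed purely for illustration — the real multipath status syntax must be checked against the device-mapper version in use — and the vendor handler path is hypothetical:

```shell
#!/bin/sh
# Parse a "dmsetup status"-style line and report failed paths.
# In practice STATUS would come from: dmsetup status MAPPED-DEVICE
# The field layout below (start, length, target name, path count, then
# device/state pairs with "F" = failed) is ASSUMED for illustration.
STATUS="0 409600 multipath 2 /dev/sda A /dev/sdb F"

set -- $STATUS               # split the status line into fields
shift 4                      # drop start, length, target name, path count
failed=""
while [ "$#" -ge 2 ]; do
    dev=$1; state=$2; shift 2
    if [ "$state" = "F" ]; then
        failed="$failed $dev"
        echo "failed path:$dev"
        # A vendor-specific handler (hypothetical path) would go here:
        # /etc/multipath.d/vendor-failover.sh "$dev"
    fi
done
```

A daemon polling this way could then hand each failed path to the vendor-specific rescan/activation script discussed in this thread.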

> 
> > > but I would rather say "the mp framework (be it LVM2 or anything else) accepts
> > > vendor-specific plugins" than "vendor-specific hacks must accept plugins for
> > > LVM2
> > > config housekeeping". Don't you agree ?
> > 
> > No, Linux device-mapper multipathing sits generically above any vendor
> > specific lower level drivers and is set up from user-land (dmsetup or
> > device-mapper library used by LVM2).
> > If only the vendor knows about any internal state machinery of his storage
> > subsystem and reports related state changes in a vendor specific way, part
> > of that report processing should interface into device-mapping.
> > 
> Such a daemon may not be able to keep its in-memory representation of path
> states in sync with reality without path-failure notification.
> In the case of HSVs, I scanned all the INQUIRY data and saw nothing that
> differentiates a ghost from a valid path. Most sense data is unreachable, as
> sg_mode hangs on ghosts while trying to retrieve it.
> 
> > > No, you still need to rescan for new valid paths outside the dm target's
> > control,
> > > as a path going ghost often implies a ghost going valid. The new valid path
> > must
> > > be added to the dm target to restore the redundancy level and nominal
> > performance.
> > > This rescan must be triggered on path failure, and this is vendor-specific
> > too.
> > 
> > The vendor specific thing is the initiator then and should address
> > device-mapping changes.
> > 
> As above
> 
> > 
> > What about the existing MD multipath ?
> > How does it work together with that array ?
> >
> awfully :)
> I may be able to autoconfigure md multipaths at boot, but :
> With ghosts included in md : a failover can elect a ghost and then hang forever
> (certainly true with a dm mp target too, but failfast could change that ?)
> With ghosts excluded from md : I need a callback on path failure to hot-add
> activated ghosts. This callback is absent (mdadm does that asynchronously, but
> I had no luck with that feature)
> In both cases : the selected active path going ghost leaves me hung, with no
> failover happening (as you do load balancing, you will certainly be hit by
> that even more)
> 

Gets us back to my other argument: ghosts should _never_ hang, they should
report an error.

> 
> We have had 3 hot spots in this thread :
> 
> 1) failure notification on ghost IO path is needed : hanging is not acceptable
> Agreement reached here :)

Good :)

> 
> 2) multipath support for "bizarre" arrays needs a userspace callback on path
> failure.
> Agreement seems to be reached. It is still not clear what provides this
> callback : MD/DM or the block layer ?

Presuming device-mapper multipath is the one we end up with at the end of the
day, the answer is DM (the status information retrievable from DM already
provides it)

> 
> 3) the multipath implementation needs to be separate from volume managers (DM
> is not a volume manager; LVM2 & EVMS are !)
> Agreement not reached :(

To me this seems more to be a misunderstanding than a disagreement.

Our multipathing target is part of device-mapper, which is completely
separate from LVM2, EVMS, ...
Those volume manager applications _use_ device-mapper as the mapping runtime
by providing appropriate mapping tables to be loaded into device-mapper
or unloaded from it.

As I mentioned before: dmsetup is the device-mapper tool which supports
such (un)loading of mapping tables _without_ any line of LVM2 (or arbitrary
other volume management application) code at all.

Setting up a multipath mapping using dmsetup means creating a one line
mapping table with a couple of parameters per path (see multipath target
code we plan to release next week).
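
The one-line table described above could be sketched as follows. The parameter
layout shown here follows the later-documented dm-multipath table format and is
an assumption, since the 2003 target code was not yet released; the map name
"mpath0", the size, and the 8:16/8:32 device numbers are examples:

```shell
# One-line multipath mapping table:
#   start length multipath #features #hwhandler_args #pathgroups \
#   init_group selector #selector_args #paths #path_args dev ... dev ...
# (layout assumed from later dm-multipath documentation)
TABLE="0 2097152 multipath 0 0 1 1 round-robin 0 2 1 8:16 1000 8:32 1000"
echo "$TABLE"
# To actually load it (requires root and the dm multipath module):
#   echo "$TABLE" | dmsetup create mpath0
```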

> 
> Do others on this list have comments/opinions ?
> 
> regards,
> cvaroqui
> -- 

-- 

Regards,
Heinz    -- The LVM Guy --

*** Software bugs are stupid.
    Nevertheless it needs not so stupid people to solve them ***

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-22  8:51                         ` Heinz J . Mauelshagen
@ 2003-08-22 14:59                           ` Patrick Mansfield
  2003-08-22 15:34                             ` christophe.varoqui
  2003-08-22 16:07                             ` christophe.varoqui
  0 siblings, 2 replies; 19+ messages in thread
From: Patrick Mansfield @ 2003-08-22 14:59 UTC (permalink / raw)
  To: mauelshagen; +Cc: christophe.varoqui, linux-scsi

On Fri, Aug 22, 2003 at 10:51:46AM +0200, Heinz J . Mauelshagen wrote:
> On Thu, Aug 21, 2003 at 06:34:52PM +0200, christophe.varoqui@free.fr wrote:

> > 2) multipath support for "bizarre" arrays needs a userspace callback on path
> > failure.
> > Agreement seems to be reached. It is still not clear what provides this
> > callback : MD/DM or the block layer ?
> 
> Presuming device-mapper multipath is the one we end up with at the end of the
> day, the answer is DM (the status information retrievable from DM already
> provides it)

The sending of any special command to enable an alternative path (like the
ghost LUN) should be done in a kernel thread or some way that allows the
root file system to be multi-pathed on such devices.

> Setting up a multipath mapping using dmsetup means creating a one line
> mapping table with a couple of parameters per path (see multipath target
> code we plan to release next week).

Is this the same code Joe is working on?

-- Patrick Mansfield


* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-22 14:59                           ` Patrick Mansfield
@ 2003-08-22 15:34                             ` christophe.varoqui
  2003-08-22 15:55                               ` Patrick Mansfield
  2003-08-22 16:07                             ` christophe.varoqui
  1 sibling, 1 reply; 19+ messages in thread
From: christophe.varoqui @ 2003-08-22 15:34 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: mauelshagen, linux-scsi

Selon Patrick Mansfield <patmans@us.ibm.com>:
> The sending of any special command to enable an alternative path (like the
> ghost LUN) should be done in a kernel thread or some way that allows the
> root file system to be multi-pathed on such devices.
> 
Early userspace should do the trick and keep vendors out of kernel development.
Anyway, it is still not clear to me how HBA BIOSes treat such ghost LUNs at boot.



* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-22 15:34                             ` christophe.varoqui
@ 2003-08-22 15:55                               ` Patrick Mansfield
  0 siblings, 0 replies; 19+ messages in thread
From: Patrick Mansfield @ 2003-08-22 15:55 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: mauelshagen, linux-scsi

On Fri, Aug 22, 2003 at 05:34:03PM +0200, christophe.varoqui@free.fr wrote:
> Selon Patrick Mansfield <patmans@us.ibm.com>:
> > The sending of any special command to enable an alternative path (like the
> > ghost LUN) should be done in a kernel thread or some way that allows the
> > root file system to be multi-pathed on such devices.
> > 
> Early userspace should do the trick and keep vendors out of kernel development.
> Anyway, it is still not clear to me how HBA BIOSes treat such ghost LUNs at boot.

I don't mean boot time, but special commands sent at the time of failure.

You can't run a user space program if the disk it's on needs to failover.

-- Patrick Mansfield


* Re: [lvm] [christophe.varoqui@free.fr: dm multipath target]
  2003-08-22 14:59                           ` Patrick Mansfield
  2003-08-22 15:34                             ` christophe.varoqui
@ 2003-08-22 16:07                             ` christophe.varoqui
  1 sibling, 0 replies; 19+ messages in thread
From: christophe.varoqui @ 2003-08-22 16:07 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: mauelshagen, linux-scsi

From "Working Draft SCSI Block Commands - 2 (SBC-2)"

4.2.1.4 Ready state
A direct-access block device is ready when medium access commands can be
executed. A block device using removable media is not ready until a volume is
mounted. Such a block device, with a volume not mounted, shall terminate medium
access commands with CHECK CONDITION status and the sense key shall be set to
NOT READY with the appropriate additional sense code for the condition.
Some direct-access block devices may be switched from being ready to being not
ready by using the START STOP UNIT command. An application client may need to
issue a START STOP UNIT command with a START bit set to bring a block device ready.
---
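
The command the excerpt refers to is a 6-byte CDB. A sketch of its layout per
SBC-2; the device path and the tool invocation in the comments are examples,
and the sg3_utils flag syntax varies between versions:

```shell
# START STOP UNIT CDB: opcode 0x1B; byte 4, bit 0 is the START bit,
# so setting it asks the device to transition to the ready state.
CDB="1b 00 00 00 01 00"
echo "START STOP UNIT (START=1) CDB: $CDB"
# With sg3_utils this is roughly "sg_start 1 /dev/sg1" (needs root).
```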

It seems a ghost LUN on HSV controllers is not doing the right thing, as
"sg_turs" reports NOT READY the first time, then hangs in D state on
consecutive tries.

Is the Linux kernel at fault, or is it HP ? Maybe the kernel should trap this
NOT READY sense on the fly and disconnect the path.
If it is HP, is there a way to bring them into conformance ?

regards,
cvaroqui


end of thread, other threads:[~2003-08-22 16:07 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20030819073926.GA423@fib011235813.fsnet.co.uk>
     [not found] ` <20030819094838.F8428@sistina.com>
2003-08-19  9:04   ` [lvm] [christophe.varoqui@free.fr: dm multipath target] christophe.varoqui
2003-08-19  9:51     ` Heinz J . Mauelshagen
2003-08-19 10:48       ` christophe.varoqui
2003-08-19 12:35         ` Heinz J . Mauelshagen
2003-08-19 13:14           ` christophe.varoqui
2003-08-19 13:26           ` christophe.varoqui
2003-08-19 16:23             ` Heinz J . Mauelshagen
2003-08-19 23:22               ` christophe varoqui
2003-08-20 13:02                 ` Heinz J . Mauelshagen
2003-08-20 14:19                   ` christophe.varoqui
2003-08-21 12:47                     ` Heinz J . Mauelshagen
2003-08-21 16:34                       ` christophe.varoqui
2003-08-22  8:51                         ` Heinz J . Mauelshagen
2003-08-22 14:59                           ` Patrick Mansfield
2003-08-22 15:34                             ` christophe.varoqui
2003-08-22 15:55                               ` Patrick Mansfield
2003-08-22 16:07                             ` christophe.varoqui
2003-08-19 14:19       ` James Bottomley
2003-08-19 16:09         ` Heinz J . Mauelshagen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox