* Multipathing hints request
@ 2005-07-21 14:31 Philipp Niemann
2005-07-21 18:31 ` christophe varoqui
0 siblings, 1 reply; 7+ messages in thread
From: Philipp Niemann @ 2005-07-21 14:31 UTC (permalink / raw)
To: dm-devel
Hello list, nice to meet you.
As I am new to the list I am not quite sure if the topic is correct,
but...
Around here I have two EMC Clariion 4500 Storage Arrays, two Brocade Silkworms
each building a fabric, and two QLogic 2340 HBA in my PC. For now I am
using debian stable (sarge) with a vanilla kernel-2.6.12.3 compiled with
all I can get from the md/devmapper part of config.
I want the OS to use the two HBAs for path failover. The Clariion is an
active/passive device.
I use the multipath-tools written by Christophe Varoqui and others,
version 0.4.2.4-2 as packaged with debian. There is a more up to date
one on the project homepage but I haven't bothered yet.
Is this the right list to ask for help/suggestions? What details would
you like to have then? If not the right list, where to post?
More details:
HBA1 is connected to Clariion1_SPA and Clariion2_SPA via fabric1
HBA2 is connected to Clariion1_SPB and Clariion2_SPB via fabric2
I have the devices in /proc/partitions (total of 5 LUNs configured, each
with 2 paths, which makes 10 devices in /proc/partitions)
I have multipath configured like this:
# multipath -lv2 # should be group-by-node-name policy

3600601602001f1a665653dbe010465c5
[size=22 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 6:0:0:0 sda 8:0 [ready ][active]
  \_ 7:0:1:0 sdh 8:112 [faulty][active]

360060160200161a037f90d75dfca84f7
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
  \_ 6:0:1:1 sde 8:64 [faulty][active]
  \_ 7:0:0:1 sdg 8:96 [ready ][active]

3600601602001f1a600a524d18275c0b2
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 6:0:0:1 sdb 8:16 [ready ][active]
  \_ 7:0:1:1 sdi 8:128 [faulty][active]

3600601602001f1a6e0c6fdbae081bd89
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 6:0:0:2 sdc 8:32 [faulty][active]
  \_ 7:0:1:2 sdj 8:144 [ready ][active]

360060160200161a06215f38ffbb2d6e9
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 6:0:1:0 sdd 8:48 [faulty][active]
  \_ 7:0:0:0 sdf 8:80 [ready ][active]

I think this doesn't look too bad. Ready/Faulty match the default owner for the
LUN on the Clariion. I am able to create a filesystem say on
360060160200161a037f90d75dfca84f7, which might appear as dm-4 in /proc/partitions
BTW: Anyone to know what "hwhandler" tells me? Can I change it? Same for
"features".
BTW2: I'd prefer to use "emc" instead of "round-robin". How?
I concluded something like that might be possible as dm-round-robin
and dm-emc exist
I am able to mount that filesystem and to create files on it. I prefer doing something
like dd if=/dev/zero of=/mnt/test bs=1024 count=$((1024*1024*19)). That way I have
plenty of time to do some failover tests.
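
A quick check of the dd arithmetic, as a minimal Python sketch (the variable names are only illustrative): with bs=1024 and count=1024*1024*19 the command writes 19 GiB, which nearly fills a 20 GB LUN.

```python
# Size written by:
#   dd if=/dev/zero of=/mnt/test bs=1024 count=$((1024*1024*19))
block_size = 1024                # bs, in bytes
block_count = 1024 * 1024 * 19   # count, in blocks

total_bytes = block_size * block_count
total_gib = total_bytes / 2**30

print(total_bytes)  # 20401094656
print(total_gib)    # 19.0
```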
My problem (finally): If I pull the cable establishing the path to sdg, I get
almost immediate hard IO errors, corrupting the filesystem. The devmapper
won't switch to the second path, though it recognizes the failing
disk sdg and disables the path 8:96. How do I do that properly?
Thanks for ideas,
Philipp Niemann
BTW3: Anyone with a hint how to do this with a non-customized RHES 4.0 or 3.0?
No multipath-tools there :(
--
Philipp Niemann DIMDI
Abteilung D / AG D4
Peripherie / UNIX-Systembetreuer Waisenhausgasse 36-38a
Tel. : 0221/4724-281 50676 Koeln
* Re: Multipathing hints request
2005-07-21 14:31 Multipathing hints request Philipp Niemann
@ 2005-07-21 18:31 ` christophe varoqui
2005-07-22 8:37 ` Philipp Niemann
0 siblings, 1 reply; 7+ messages in thread
From: christophe varoqui @ 2005-07-21 18:31 UTC (permalink / raw)
To: device-mapper development
On jeu, 2005-07-21 at 16:31 +0200, Philipp Niemann wrote:
> Hello list, nice to meet you.
>
> As I am new to the list I am not quite sure if the topic is correct,
> but...
>
> Around here I have two EMC Clariion 4500 Storage Arrays, two Brocade Silkworms
> each building a fabric, and two QLogic 2340 HBA in my PC. For now I am
> using debian stable (sarge) with a vanilla kernel-2.6.12.3 compiled with
> all I can get from the md/devmapper part of config.
>
> I want the OS to use the two HBAs for path failover. The Clariion is an
> active/passive device.
>
> I use the multipath-tools written by Christophe Varoqui and others,
> version 0.4.2.4-2 as packaged with debian. There is a more up to date
> one on the project homepage but I haven't bothered yet.
>
> Is this the right list to ask for help/suggestions? What details would
> you like to have then? If not the right list, where to post?
>
> More details:
>
> HBA1 is connected to Clariion1_SPA and Clariion2_SPA via fabric1
> HBA2 is connected to Clariion1_SPB and Clariion2_SPB via fabric2
>
> I have the devices in /proc/partitions (total of 5 LUNs configured, each
> with 2 path makes 10 devices in /proc/partitions)
>
> I have multipath configured like this:
>
> # multipath -lv2 # should be group-by-node-name policy
> 3600601602001f1a665653dbe010465c5
> [size=22 GB][features="0"][hwhandler="0"]
> \_ round-robin 0 [enabled][first]
> \_ 6:0:0:0 sda 8:0 [ready ][active]
> \_ 7:0:1:0 sdh 8:112 [faulty][active]
>
> 360060160200161a037f90d75dfca84f7
> [size=20 GB][features="0"][hwhandler="0"]
> \_ round-robin 0 [active][first]
> \_ 6:0:1:1 sde 8:64 [faulty][active]
> \_ 7:0:0:1 sdg 8:96 [ready ][active]
>
> 3600601602001f1a600a524d18275c0b2
> [size=20 GB][features="0"][hwhandler="0"]
> \_ round-robin 0 [enabled][first]
> \_ 6:0:0:1 sdb 8:16 [ready ][active]
> \_ 7:0:1:1 sdi 8:128 [faulty][active]
>
> 3600601602001f1a6e0c6fdbae081bd89
> [size=20 GB][features="0"][hwhandler="0"]
> \_ round-robin 0 [enabled][first]
> \_ 6:0:0:2 sdc 8:32 [faulty][active]
> \_ 7:0:1:2 sdj 8:144 [ready ][active]
>
> 360060160200161a06215f38ffbb2d6e9
> [size=20 GB][features="0"][hwhandler="0"]
> \_ round-robin 0 [enabled][first]
> \_ 6:0:1:0 sdd 8:48 [faulty][active]
> \_ 7:0:0:0 sdf 8:80 [ready ][active]
>
>
> I think this doesn't look too bad. Ready/Faulty match the default owner for the
> LUN on the Clariion. I am able to create a filesystem say on
> 360060160200161a037f90d75dfca84f7, which might appear as dm-4 in /proc/partitions
>
It does *not* look good. You ought to have 2 path groups per multipath:
one for active paths, the other for inactive paths. As you'll see, this
is why the failover does not work for you.
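
The check described here (count the path groups under each map) can be scripted. What follows is a minimal Python sketch, assuming the listing format shown in this thread for multipath-tools 0.4.x; parse_listing is a made-up helper, and other releases may print the listing differently.

```python
import re

# A path line looks like "\_ 6:0:1:1 sde 8:64 [faulty][active]";
# a path-group line looks like "\_ round-robin 0 [enabled][first]".
PATH_RE = re.compile(r"\\_ \d+:\d+:\d+:\d+ (\S+) \d+:\d+ \[(\w+)\s*\]")
GROUP_RE = re.compile(r"\\_ \S+ \d+ \[")

def parse_listing(listing):
    """Return {map_name: [[(device, state), ...], ...]}, one inner list per path group."""
    maps = {}
    current = None
    for raw in listing.splitlines():
        line = raw.strip()
        if not line or line.startswith("["):
            continue  # blank line, or the [size=...][features=...] line
        path = PATH_RE.match(line)
        if path:
            maps[current][-1].append(path.groups())
        elif GROUP_RE.match(line):
            maps[current].append([])
        else:
            current = line  # anything else starts a new map (WWID or alias)
            maps[current] = []
    return maps

listing = r"""
360060160200161a037f90d75dfca84f7
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
\_ 6:0:1:1 sde 8:64 [faulty][active]
\_ 7:0:0:1 sdg 8:96 [ready ][active]
"""
groups = parse_listing(listing)["360060160200161a037f90d75dfca84f7"]
print(len(groups))  # 1
```

With a single group holding both paths, there is no second group for the kernel to switch to when sdg fails.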
> BTW: Anyone to know what "hwhandler" tells me? Can I change it? Same for
> "features".
The "hardware handler" is an optional additional kernel module used to
trigger a specific operation when a new path group gets activated. This
is how the host will ask your Clariion controller to activate the
inactive paths, for example.
So yes, you need hwhandler="1 emc". "0" meaning no hwhandler at all.
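
The hwhandler and features strings share one convention: the leading number counts the words that follow. A minimal Python sketch of that reading (parse_counted is a made-up helper, not part of multipath-tools):

```python
def parse_counted(spec):
    """Split a count-prefixed string such as "1 emc" into its words.

    The leading number says how many words follow; "0" means none.
    """
    words = spec.split()
    count = int(words[0])
    args = words[1:]
    if count != len(args):
        raise ValueError(f"{spec!r}: expected {count} words, got {len(args)}")
    return args

print(parse_counted("0"))      # []
print(parse_counted("1 emc"))  # ['emc']
```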
> BTW2: I'd prefer to use "emc" instead of "round-robin". How?
> I concluded something like that might be possible as dm-round-robin
> and dm-emc exist
>
bzz. round-robin is the I/O spreading policy, whereas "emc" is a
hwhandler. Different beasts.
> I am able to mount that filesystem and to create files on it. I prefer doing something
> like dd if=/dev/zero of=/mnt/test bs=1024 count=$((1024*1024*19)). That way I have
> plenty of time to do some failover tests.
>
> My problem (finally): If I pull the cable establishing the path to sdg, I get
> almost immediate hard IO errors, corrupting the filesystem. The devmapper
> won't switch to the second path, though it recognizes the failing
> disk sdg and disables the path 8:96. How do I do that properly?
>
You should now know what to do :/
Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
* Re: Multipathing hints request
2005-07-21 18:31 ` christophe varoqui
@ 2005-07-22 8:37 ` Philipp Niemann
2005-07-22 9:58 ` Christophe Varoqui
0 siblings, 1 reply; 7+ messages in thread
From: Philipp Niemann @ 2005-07-22 8:37 UTC (permalink / raw)
To: dm-devel
Hello again!
On Thu, 21.07.2005-20:31:27 +0200, christophe varoqui wrote:
> On jeu, 2005-07-21 at 16:31 +0200, Philipp Niemann wrote:
[...]
> > 360060160200161a06215f38ffbb2d6e9
> > [size=20 GB][features="0"][hwhandler="0"]
> > \_ round-robin 0 [enabled][first]
> > \_ 6:0:1:0 sdd 8:48 [faulty][active]
> > \_ 7:0:0:0 sdf 8:80 [ready ][active]
> >
> >
> > I think this doesn't look too bad. Ready/Faulty match the default owner for the
> > LUN on the Clariion. I am able to create a filesystem say on
> > 360060160200161a037f90d75dfca84f7, which might appear as dm-4 in /proc/partitions
> >
> It does *not* look good. You ought to have 2 path groups per multipath:
> one for active paths, the other for inactive paths. As you'll see, this
> is why the failover does not work for you.
Thanks!
Now it looks different. Think I got what you meant. Or didn't I?
Mostly I changed the config file to contain a multipath section. See
below.
416L3 (360060160200161a037f90d75dfca84f7)
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
  \_ 6:0:1:1 sde 8:64 [faulty][active]
\_ round-robin 0 [enabled][first]
  \_ 7:0:0:1 sdg 8:96 [ready ][active]
> > BTW: Anyone to know what "hwhandler" tells me? Can I change it? Same for
> > "features".
>
> The "hardware handler" is an optional additional kernel module used to
> trigger a specific operation when a new path group gets activated. This
> is how the host will ask your Clariion controller to activate the
> inactive paths, for example.
>
> So yes, you need hwhandler="1 emc". "0" meaning no hwhandler at all.
As you see above, I didn't manage that yet. Snip of the config that
produces the above:
multipaths {
        multipath {
                wwid                    360060160200161a037f90d75dfca84f7
                alias                   416L3
                path_grouping_policy    failover
                path_selector           "round-robin 0"
                features                "0"
                hardware_handler        "1 emc"
        }
}
Is that syntax correct? I don't get the activation of the second path.
Is it the release? Should I upgrade from 0.4.2.3? I'd prefer to keep the
debian package for simplicity and I didn't find anything sounding
important in ChangeLog.
Regards,
Philipp
--
Philipp Niemann DIMDI
Abteilung D / AG D4
Peripherie / UNIX-Systembetreuer Waisenhausgasse 36-38a
Tel. : 0221/4724-281 50676 Koeln
* Re: Multipathing hints request
2005-07-22 8:37 ` Philipp Niemann
@ 2005-07-22 9:58 ` Christophe Varoqui
2005-07-22 12:09 ` Philipp Niemann
0 siblings, 1 reply; 7+ messages in thread
From: Christophe Varoqui @ 2005-07-22 9:58 UTC (permalink / raw)
To: device-mapper development
>
> multipaths {
> multipath {
> wwid 360060160200161a037f90d75dfca84f7
> alias 416L3
> path_grouping_policy failover
> path_selector "round-robin 0"
> features "0"
> hardware_handler "1 emc"
> }
> }
>
The hardware_handler and features params must be in the devices {} section, not in multipaths {}.
Also, pgpolicy is best set to group_by_serial with Clariion hw.
You'll see it produces the same grouping as failover in your case.
Regards,
cvaroqui
* Re: Multipathing hints request
2005-07-22 9:58 ` Christophe Varoqui
@ 2005-07-22 12:09 ` Philipp Niemann
2005-07-24 20:02 ` Lars Marowsky-Bree
0 siblings, 1 reply; 7+ messages in thread
From: Philipp Niemann @ 2005-07-22 12:09 UTC (permalink / raw)
To: dm-devel
Well, almost at it.
On Fri, 22.07.2005-11:58:28 +0200, Christophe Varoqui wrote:
> hardware_handler and features params must be in the devices {} section, not in multipaths {}.
I changed the settings in /etc/multipath.conf again, as you advised.
Here it is:
# Begin of /etc/multipath.conf
defaults {
        multipath_tool                  "/sbin/multipath -v 0 -S"
        udev_dir                        /dev
        polling_interval                10
        default_selector                "round-robin 0"
        default_path_grouping_policy    failover
        default_getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        default_prio_callout            "/bin/false"
        default_features                "0"
}
multipaths {
        multipath {
                wwid                    360060160200161a037f90d75dfca84f7
                alias                   416L3
                path_grouping_policy    failover
                path_selector           "round-robin 0"
        }
}
devices {
        device {
                vendor                  "DGC "
                product                 "RAID 5 "
                path_grouping_policy    group_by_serial
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                path_checker            emc_clariion
                path_selector           "round-robin 0"
                features                "0"
                hardware_handler        "1 emc"
        }
}
# End of /etc/multipath.conf
I figured out that the vendor value must be exactly 8 characters long.
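
For background: the SCSI INQUIRY vendor identification field is 8 bytes wide and the product identification field is 16 bytes wide, both space-padded, which would explain the 8-character requirement. A minimal Python sketch of the padding (pad_field is a made-up helper; whether this multipath-tools release also wants the product padded to 16 is an assumption the thread does not confirm):

```python
VENDOR_LEN = 8    # width of the SCSI INQUIRY vendor identification field
PRODUCT_LEN = 16  # width of the SCSI INQUIRY product identification field

def pad_field(value, width):
    """Left-justify a value and pad it with spaces to the fixed field width."""
    if len(value) > width:
        raise ValueError(f"{value!r} is longer than {width} characters")
    return value.ljust(width)

vendor = pad_field("DGC", VENDOR_LEN)
product = pad_field("RAID 5", PRODUCT_LEN)
print(len(vendor), len(product))  # 8 16
```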
> Also, pgpolicy is best set to group_by_serial with Clariion hw.
> You'll see it produces the same grouping as failover in your case.
Hm, doesn't look the same here. It again leaves me with only one
priority group. So I stick to failover for now.
What different features are available? I just managed to freeze my
network conn (at least, haven't checked the host yet) by adding "1
queue_if_no_path" as used with the compaq entry in the examples.
I didn't manage to get failover to work, yet. Am I correct in assuming
that the multipathd daemon is only needed for path recovery? Because I
have problems getting that one to react to signals properly.
Right, gonna check what happened to the machine.
CU,
Philipp
--
Philipp Niemann DIMDI
Abteilung D / AG D4
Peripherie / UNIX-Systembetreuer Waisenhausgasse 36-38a
Tel. : 0221/4724-281 50676 Koeln
* Re: Multipathing hints request
2005-07-22 12:09 ` Philipp Niemann
@ 2005-07-24 20:02 ` Lars Marowsky-Bree
2005-07-25 13:02 ` Philipp Niemann
0 siblings, 1 reply; 7+ messages in thread
From: Lars Marowsky-Bree @ 2005-07-24 20:02 UTC (permalink / raw)
To: device-mapper development
On 2005-07-22T14:09:55, Philipp Niemann <niemann@dimdi.de> wrote:
> Well, almost at it.
>
> On Fri, 22.07.2005-11:58:28 +0200, Christophe Varoqui wrote:
> > hardware_handler and features params must be in the devices {} section, not in multipaths {}.
>
> I changed the settings in /etc/multipath.conf again, as you advised.
> Here is it:
>
> # Begin of /etc/multipath.conf
> defaults {
> multipath_tool "/sbin/multipath -v 0 -S"
> udev_dir /dev
> polling_interval 10
> default_selector "round-robin 0"
> default_path_grouping_policy failover
> default_getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
> default_prio_callout "/bin/false"
> default_features "0"
> }
> multipaths {
> multipath {
> wwid 360060160200161a037f90d75dfca84f7
> alias 416L3
> path_grouping_policy failover
> path_selector "round-robin 0"
> }
> }
> devices {
> device {
> vendor "DGC "
> product "RAID 5 "
> path_grouping_policy group_by_serial
> getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
> path_checker emc_clariion
> path_selector "round-robin 0"
> features "0"
> hardware_handler "1 emc"
> }
For the Clariion 4500, you need to set the hardware_handler to
"3 emc 1 0"
(The trespass command to transfer ownership of a LU from one
Service-Processor to the other is different from the FC4500 to the EMC
CX series. I've got to admit I've not tested the FC4500 yet, though.)
You also want the pp_emc callout from the newer multipath-tools
package and use that as the priority callout, and group by priority.
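
Grouping by priority can be pictured as follows. This is a minimal Python sketch, not the multipath-tools implementation; the priority numbers are invented for illustration, since the thread does not show what pp_emc actually prints.

```python
from itertools import groupby

def group_by_prio(paths):
    """paths: list of (device, priority). Returns path groups, highest priority first."""
    ordered = sorted(paths, key=lambda p: p[1], reverse=True)
    return [[dev for dev, _ in grp]
            for _, grp in groupby(ordered, key=lambda p: p[1])]

# Hypothetical priorities: the path to the SP that owns the LUN gets the higher value.
paths = [("sde", 0), ("sdg", 1)]
print(group_by_prio(paths))  # [['sdg'], ['sde']]
```

Paths with equal priority land in the same group, so each LUN ends up with one group of preferred paths and one group to fail over to.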
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
* Re: Multipathing hints request
2005-07-24 20:02 ` Lars Marowsky-Bree
@ 2005-07-25 13:02 ` Philipp Niemann
0 siblings, 0 replies; 7+ messages in thread
From: Philipp Niemann @ 2005-07-25 13:02 UTC (permalink / raw)
To: device-mapper development
Jo!
A very happy me shouts out thanks to all the list. Special thanks to
Lars who gave the final hint:
On Sun, 24.07.2005-22:02:38 +0200, Lars Marowsky-Bree wrote:
> On 2005-07-22T14:09:55, Philipp Niemann <niemann@dimdi.de> wrote:
> >
[snip]
> > hardware_handler "1 emc"
> > }
>
> For the Clariion 4500, you need to set the hardware_handler to
> "3 emc 1 0"
>
> (The trespass command to transfer ownership of a LU from one
> Service-Processor to the other is different from the FC4500 to the EMC
> CX series. I've got to admit I've not tested the FC4500 yet, though.)
>
> You also want the pp_emc callout from the newer multipath-tools
> package and use that as the priority callout, and group by priority.
Actually I am using 'hardware_handler "3 emc 1 1"' in my config file now.
Without the reservation bit honouring the failover didn't work. I also
upgraded to multipath-tools-0.4.4 to get the pp_emc thingy.
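
For reference, the handler strings from this thread can be pulled apart as in the minimal Python sketch below. The flag names are my reading of the thread, not names it confirms: the first flag is taken to select the FC4500-style trespass command Lars mentions, and the second is the "reservation bit". Treating a bare "1 emc" as both flags off is likewise an assumption.

```python
def decode_emc_handler(spec):
    """Decode "1 emc" or "3 emc <a> <b>" into labelled flags (labels are assumptions)."""
    words = spec.split()
    count, name, args = int(words[0]), words[1], words[2:]
    if name != "emc" or count != len(words) - 1:
        raise ValueError(f"unexpected hardware_handler string: {spec!r}")
    if len(args) not in (0, 2):
        raise ValueError(f"expected zero or two flags, got {len(args)}")
    flags = [a == "1" for a in args] or [False, False]
    return {"short_trespass": flags[0], "honor_reservations": flags[1]}

print(decode_emc_handler("3 emc 1 0"))
# {'short_trespass': True, 'honor_reservations': False}
print(decode_emc_handler("3 emc 1 1"))
# {'short_trespass': True, 'honor_reservations': True}
```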
That gives me a configuration where I can trespass LUNs on the Array via
the Management Suite and get them switched back by my linux box.
The other way works too, so that a write operation on a SAN Filesystem
continues to write even if I pull out the active cable. When I plug it
back in place after some time, the system even switches back to the
default path. Wow! The only scratch on the surface is that the devices come
back with new names, so the former sde becomes sdo. Maybe persistent
bindings might help.
I got somewhat scared as the system issues _lots_ of IO Errors. But a
filesystem check after IO Operation finished gave no problems
whatsoever. So I guess it worked alright.
Thanks again for your support,
Philipp
--
Philipp Niemann DIMDI
Abteilung D / AG D4
Peripherie / UNIX-Systembetreuer Waisenhausgasse 36-38a
Tel. : 0221/4724-281 50676 Koeln