* rdac priority checker changing priorities
@ 2009-04-29 22:34 Lucas Brasilino
From: Lucas Brasilino @ 2009-04-29 22:34 UTC
To: dm-devel
Hi
I don't know if I'm misunderstanding something. I've got a DS4700 and I'm
switching from RDAC[1] to multipath, since it's natively supported in
the distribution I use here (SLES 10 SP2).
Since RDAC[1] works perfectly, I'm trying to use the 'rdac' priority in multipath.
My /etc/multipath.conf is quite tiny, since I'm building it step-by-step :-) :
blacklist {
devnode "^sda[0-9]*"
}
defaults {
user_friendly_names yes
prio rdac
path_checker tur
}
multipaths {
multipath {
wwid 3600a0b8000327b900000107549f85224
alias mpath0
}
}
I think that using 'prio rdac' makes multipath use the 'mpath_prio_rdac' tool.
# multipath -v2 -ll
mpath0 (3600a0b8000327b900000107549f85224) dm-0 IBM,1814 FAStT
[size=140G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=6][active]
\_ 9:0:0:0 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 10:0:0:0 sdc 8:32 [active][ghost]
So the first path has priority 6, as I can confirm:
# mpath_prio_rdac /dev/sdb
6
# mpath_prio_rdac /dev/sdc
1
After the first path (prio=6) failure I get:
# multipath -v2 -ll
sdb: rdac prio: inquiry command indicates error
mpath0 (3600a0b8000327b900000107549f85224) dm-0 IBM,1814 FAStT
[size=140G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
\_ 9:0:0:0 sdb 8:16 [failed][faulty]
\_ round-robin 0 [prio=1][enabled]
\_ 10:0:0:0 sdc 8:32 [active][ghost]
OK, working great, activating the second path. But after the faulty
path is restored:
# multipath -v2 -ll
mpath0 (3600a0b8000327b900000107549f85224) dm-0 IBM,1814 FAStT
[size=140G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=2][enabled]
\_ 9:0:0:0 sdb 8:16 [active][ghost]
\_ round-robin 0 [prio=5][active]
\_ 10:0:0:0 sdc 8:32 [active][ready]
The second path now has the higher priority!!! And of course it does not fail back! By
the way, my LUN is configured in the
DS4700 in such a way that the first path *is* the path to the preferred controller.
I think path priorities should not change. If they stayed fixed, the first path would go back
to 'active' status.
Am I misunderstanding something? Or messing things up?
By the way, here is the default 'multipath.conf':
#defaults {
# udev_dir /dev
# polling_interval 10
# selector "round-robin 0"
# path_grouping_policy multibus
# getuid_callout "/lib/udev/scsi_id -g -u -s /block/%n"
# prio const
# path_checker directio
# rr_min_io 100
# max_fds 8192
# rr_weight priorities
# failback immediate
# no_path_retry fail
# user_friendly_names no
#}
#blacklist {
# wwid 26353900f02796769
# devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
# devnode "^hd[a-z][[0-9]*]"
# devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
# device {
# vendor DEC.*
# product MSA[15]00
# }
#}
[...]
#devices {
# device {
# vendor "COMPAQ "
# product "HSV110 (C)COMPAQ"
# path_grouping_policy multibus
# getuid_callout "/lib/udev/scsi_id -g -u -s /block/%n"
# path_checker directio
# path_selector "round-robin 0"
# hardware_handler "0"
# failback 15
# rr_weight priorities
# no_path_retry queue
# rr_min_io 100
# product_blacklist LUN_Z
# }
# device {
# vendor "COMPAQ "
# product "MSA1000 "
# path_grouping_policy multibus
# }
#}
Thanks a lot in advance
Lucas Brasilino
[1] http://www.lsi.com/rdac/ds4000.html

* Re: rdac priority checker changing priorities
From: Hannes Reinecke @ 2009-04-30 6:25 UTC
To: device-mapper development

Hi Lucas,

Lucas Brasilino wrote:
> [...]
> I think path priorities should not change. If so first path goes back
> to 'active' status.
> Am I misunderstanding something ? Or messing things up?
>
You are using an old version of multipathing for SLES10 SP2.
This had a bug triggering priority inversion on RDAC.
Please update to the latest version.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
hare@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

* Re: rdac priority checker changing priorities
From: Chandra Seetharaman @ 2009-04-30 18:05 UTC
To: Hannes Reinecke; +Cc: dm-devel

Hannes,

I think we need to revisit the priority value we provide for preferred
path (4) relative to active path (2) and non-preferred (1).

Consider the following scenario:

Access to a lun thru 2 preferred and 2 non-preferred paths. Let's call
the path group with preferred paths pg1 and the one with non-preferred
paths pg2.

Initially pg1 has priority of 8 and pg2 has priority of 2. pg1 is chosen
and I/O goes thru pg1, all good.

Both the paths in pg1 fail, pg2 has been made the active path group and
I/O is sent thru that path and since it became "active", its priority
rises to 6 (2 paths times (active + non-preferred)).

When one of the paths in pg1 comes back, one would expect the failback
to happen. It doesn't happen as pg1's priority (4) is smaller than that
of pg2 (6). Which is not correct.

We can do the same exercise with more than 4 paths also, like 6, 8 etc.,
and the results are worse.

So, IMO we need to give the disproportionately large number for
preferred path w.r.t active and non-preferred. What do you think ?

chandra

On Thu, 2009-04-30 at 08:25 +0200, Hannes Reinecke wrote:
> [...]
> You are using an old version of multipathing for SLES10 SP2.
> This had a bug triggering priority inversion on RDAC.
> Please update to the latest version.
>
> Cheers,
>
> Hannes
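
The arithmetic behind the scenario above can be sketched in a few lines of Python. This is only an illustration, not code from multipath-tools: the weights (preferred 4, active 2, non-preferred 1) are the ones quoted in the message, the names `path_prio` and `group_prio` are made up for this sketch, and it assumes that a path group's priority is the sum of its paths' priorities and that a failed path counts as 0.

```python
# Illustrative weights quoted in the thread; not multipath-tools code.
PREFERRED = 4      # path leads to the LUN's preferred controller
NON_PREFERRED = 1  # path leads to the other controller
ACTIVE = 2         # path's controller currently owns the LUN


def path_prio(preferred, active, failed=False):
    """Priority of one path; a failed path contributes 0 (assumption)."""
    if failed:
        return 0
    base = PREFERRED if preferred else NON_PREFERRED
    return base + (ACTIVE if active else 0)


def group_prio(paths):
    """Assumed rule: a path group's priority is the sum over its paths."""
    return sum(path_prio(**p) for p in paths)


# Both pg1 paths failed earlier, so pg2 took over and its paths are active.
# Now one preferred path in pg1 has come back (still passive).
pg1 = [
    {"preferred": True, "active": False},
    {"preferred": True, "active": False, "failed": True},
]
pg2 = [
    {"preferred": False, "active": True},
    {"preferred": False, "active": True},
]

print("pg1:", group_prio(pg1))  # 4
print("pg2:", group_prio(pg2))  # 6
print("failback:", group_prio(pg1) > group_prio(pg2))  # False
```

Under these assumptions the restored preferred group (4) never outranks the active non-preferred group (6), which matches the missing failback described in the message.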

* Re: rdac priority checker changing priorities
From: Hannes Reinecke @ 2009-05-04 10:43 UTC
To: sekharan; +Cc: dm-devel

Hi Chandra,

Chandra Seetharaman wrote:
> [...]
> When one of the paths in pg1 comes back, one would expect the failback
> to happen. It doesn't happen as pg1's priority (4) is smaller than that
> of pg2 (6). Which is not correct.
>
Is this really a valid case?
This means we'll have a setup like this:

rdac
  pg1
    sda failed
    sdb failed
  pg2
    sdc active
    sdd active

Correct?
So, given your assumptions, the proposed scenario would be represented
like this:

rdac
  pg1
    sda active
    sdb failed
  pg2
    sdc active
    sdd active

So it is really a good idea to switch paths in this case? The 'sdb'
path would not be reachable here, so any path switch command wouldn't
have been received, either. I'm not sure _what_ is going to happen
when we switch paths now and sdb comes back later; but most likely
the entire setup will be messed up then:
sda (pref & owned) 6
sdb                0
sdc (sec)          1
sdd (sec & owned)  3
and we'll be getting the path layout thoroughly jumbled then.
So I don't really like this idea. We should only be switching
paths when _all_ paths of a path group become available again.
Providing not all paths have failed in the active group, of course.
Then we should be switching paths regardless.

Cheers,

Hannes

* Re: rdac priority checker changing priorities
From: Chandra Seetharaman @ 2009-05-04 17:30 UTC
To: Hannes Reinecke; +Cc: dm-devel

Hi Hannes

On Mon, 2009-05-04 at 12:43 +0200, Hannes Reinecke wrote:
<snip>
> >
> Is this really a valid case?

Yes, having more than one path to a (or both) controller(s) is a valid
case.

> This means we'll have a setup like this:
>
> rdac
>   pg1
>     sda failed
>     sdb failed
>   pg2
>     sdc active
>     sdd active
>
> Correct?

correct.

> So, given your assumptions, the proposed scenario would be represented

It is not an assumption, it is the behavior I have seen :)

> like this:
>
> rdac
>   pg1
>     sda active
>     sdb failed
>   pg2
>     sdc active
>     sdd active
>
> So it is really a good idea to switch paths in this case? The 'sdb'

Yes. We need to switch for two reasons
 - since there is a preferred path available we _should_ use it
   (otherwise it will throw off the load balancing the admin has made
   in the storage).
 - To be consistent with multipath's state before the access to the
   preferred controller failed. i.e. if multipath has configured a dm
   device in this state, multipath _does_ make pg1 the active path group.

> path would not be reachable here, so any path switch command wouldn't
> have been received, either. I'm not sure _what_ is going to happen

Since both paths are leading to the same controller, mode select sent
for sda would have made sdb also the active controller. But, as you
mentioned, it is not seen by dm-multipath.

> when we switch paths now and sdb comes back later; but most likely

The patch I re-submitted last week (Handle multipath paths in a path
group properly during pg_init :
http://marc.info/?l=dm-devel&m=124094710300894&w=2) handles this
situation correctly, by sending an activate during reinstate.

> the entire setup will be messed up then:
> sda (pref & owned) 6
> sdb 0
> sdc (sec) 1
> sdd (sec & owned) 3

No, this will not be the case. As soon as the access to sdb comes back
it will be seen as pref and owned and hence will get a priority value of
6. Also, as soon as sda has been made active, sdd will become
passive/ghost, and hence will have the priority value of 1.

> and we'll be getting the path layout thoroughly jumbled then.
> So I don't really like this idea. We should only be switching
> paths when _all_ paths of a path group become available again.
> Providing not all paths have failed in the active group, of course.
> Then we should be switching paths regardless.
>

Here are the details:
===========================================================
(1) Initial configuration (all are good):

pg1
  sda (pref and active) - 6
  sdb (pref and active) - 6
pg2
  sdc (sec and passive) - 1
  sdd (sec and passive) - 1
------
(2) Access to sdb goes down
------
pg1
  sda (pref and active) - 6
  sdb (not there)       - 0
pg2
  sdc (sec and passive) - 1
  sdd (sec and passive) - 1
------
(3) Access to sda goes down, path group switches
------
pg1
  sda (not there) - 0
  sdb (not there) - 0
pg2
  sdc (sec and active) - 3
  sdd (sec and active) - 3
------
(4) sda comes back, path group switch _should_ happen here, to be
    consistent with (1). If the path group switch happens, sda will have
    a priority of 6 and sdc/sdd will have priority of 1 each (as they
    will become passive).

    Path switch can happen only if the priority we give for preferred
    path is a lot more than the sum of all priorities of all the paths
    in the other path group.
------
pg1
  sda (pref and passive) - 4
  sdb (not there)        - 0
pg2
  sdc (sec and active) - 3
  sdd (sec and active) - 3

Hope it is clear now.

> Cheers,
>
> Hannes

* Re: rdac priority checker changing priorities
From: Chandra Seetharaman @ 2009-06-23 0:47 UTC
To: Hannes Reinecke; +Cc: dm-devel

Hi Hannes,

Please see the attached file for the real example.

Can I go ahead and generate a patch to increase the priority of the
preferred path to say, 50 ?

chandra

On Mon, 2009-05-04 at 12:43 +0200, Hannes Reinecke wrote:
> [...]
> So I don't really like this idea. We should only be switching
> paths when _all_ paths of a path group become available again.
> Providing not all paths have failed in the active group, of course.
> Then we should be switching paths regardless.
>
> Cheers,
>
> Hannes

[-- Attachment #2: typescript --]
[-- Type: text/plain, Size: 4379 bytes --]

$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=12][active]
\_ 2:0:1:4 sdu 65:64 [active][ready]
\_ 2:0:0:4 sdp 8:240 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:1:4 sdk 8:160 [active][ghost]
\_ 1:0:0:4 sdf 8:80 [active][ghost]
$ # disabled one preferred path
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdp: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=6][active]
\_ 2:0:1:4 sdu 65:64 [active][ready]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:1:4 sdk 8:160 [active][ghost]
\_ 1:0:0:4 sdf 8:80 [active][ghost]
$ # ALL GOOD
$ # disabled another preferred path
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdu: rdac prio: inquiry command indicates error
sdp: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:1:4 sdu 65:64 [failed][faulty]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=6][active]
\_ 1:0:1:4 sdk 8:160 [active][ready]
\_ 1:0:0:4 sdf 8:80 [active][ready]
$ # failed over to the non-preferred path
$ # that is good
$ # disabled a non-preferred path
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdu: rdac prio: inquiry command indicates error
sdp: rdac prio: inquiry command indicates error
sdk: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:1:4 sdu 65:64 [failed][faulty]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=3][active]
\_ 1:0:1:4 sdk 8:160 [failed][faulty]
\_ 1:0:0:4 sdf 8:80 [active][ready]
$ # all good
$ # enabled a non-preferred path
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdu: rdac prio: inquiry command indicates error
sdp: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:1:4 sdu 65:64 [failed][faulty]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=6][active]
\_ 1:0:1:4 sdk 8:160 [active][ready]
\_ 1:0:0:4 sdf 8:80 [active][ready]
$ # Good
$ # enabled a preferred path.
$ # expected failback to the preferred path group
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdp: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=4][enabled]
\_ 2:0:1:4 sdu 65:64 [active][ghost]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=6][active]
\_ 1:0:1:4 sdk 8:160 [active][ready]
\_ 1:0:0:4 sdf 8:80 [active][ready]
$ # no. failback did not happen. [see the first path group still states "ghost"]
$ # the reason is that the priority of the preferred path group is less than
$ # that of the non-preferred path group.
$ # Basically, non-preferred path is used even though one preferred path is available
$ # which is not correct
$ # wait for a a minute, may be
$ sleep 60
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
sdp: rdac prio: inquiry command indicates error
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=4][enabled]
\_ 2:0:1:4 sdu 65:64 [active][ghost]
\_ 2:0:0:4 sdp 8:240 [failed][faulty]
\_ round-robin 0 [prio=6][active]
\_ 1:0:1:4 sdk 8:160 [active][ready]
\_ 1:0:0:4 sdf 8:80 [active][ready]
$ # nope... failback didn't happen.
$ # enabled the other preferred path.
$ # only now the failback happens.
$
$ multipath -ll 3600a0b800011a1ee00003f834a3f7a65
3600a0b800011a1ee00003f834a3f7a65 dm-0 IBM,1815 FAStT
[size=10G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=12][active]
\_ 2:0:1:4 sdu 65:64 [active][ready]
\_ 2:0:0:4 sdp 8:240 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:1:4 sdk 8:160 [active][ghost]
\_ 1:0:0:4 sdf 8:80 [active][ghost]
$ exit
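
How far a larger preferred weight such as the proposed 50 would carry can be estimated with a short sketch. It is not multipath-tools code and the function names are hypothetical. It assumes the summing behaviour the transcript suggests: each active non-preferred path contributes 1 + 2 = 3 to its group, and a single restored preferred path contributes only its preferred weight while it is still passive.

```python
NON_PREFERRED = 1  # weight quoted in the thread
ACTIVE = 2         # weight quoted in the thread


def failback_happens(preferred_weight, other_group_paths):
    """True if one restored (passive) preferred path outranks a group of
    `other_group_paths` active non-preferred paths under the summing rule."""
    restored_group = preferred_weight
    active_group = other_group_paths * (NON_PREFERRED + ACTIVE)
    return restored_group > active_group


def max_paths_supported(preferred_weight):
    """Largest size of the non-preferred group for which failback still works."""
    n = 0
    while failback_happens(preferred_weight, n + 1):
        n += 1
    return n


for weight in (4, 50):
    print(weight, max_paths_supported(weight))
# 4 1
# 50 16
```

Under these assumptions the current weight of 4 only works when the other group has a single path, and a weight of 50 stops working once the other group has 17 or more paths, so any fixed weight only moves the limit.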

* Re: rdac priority checker changing priorities
From: Hannes Reinecke @ 2009-06-23 6:20 UTC
To: sekharan; +Cc: dm-devel

Chandra Seetharaman wrote:
> Hi Hannes,
>
> Please see the attached file for the real example.
>
> Can I go ahead and generate a patch to increase the priority of the
> preferred path to say, 50 ?
>
No. That's just wrong and we'll run into the same problem once
someone increases the number of paths to 50.

What we should do here is to modify the priority value,
or rather the way priority is used.
We should be splitting the current priority value into
two fields, pg priority and number of paths in a pg.
pg priority is the priority of a _single_ path here,
and, by definition as we're using group_by_prio,
the priority of each path in the pg.

Then we should be modifying the algorithm to choose the
next pg to do something like this:

-> Choose the pg with the highest priority
-> If two pgs have the same priority choose the pg with the
   highest path count.

Maybe we could even use the highest _valid_ path count
here, depending if we have the information at that point.

This algorithm would solve the problem we're having now
once and for all. Just adding the priorities of the
individual paths will always lead to these type of problems.

I see if I can find some time to draw up a patch.

Cheers,

Hannes
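
The selection rule proposed above can be sketched as follows. This is a minimal illustration in Python, not the eventual multipath-tools patch: `PathGroup` and `choose_group` are hypothetical names, the tie-break uses the count of valid paths (the "maybe" variant from the message), and skipping groups with no valid path is an extra assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class PathGroup:
    name: str
    prio: int         # priority of a single path in the group
    valid_paths: int  # number of usable (non-failed) paths


def choose_group(groups):
    """Pick the group with the highest per-path priority; break ties by
    the number of valid paths. Groups with no valid path are skipped
    (assumption of this sketch)."""
    usable = [g for g in groups if g.valid_paths > 0]
    if not usable:
        return None
    return max(usable, key=lambda g: (g.prio, g.valid_paths))


# State from the transcript: one preferred path restored (passive, prio 4),
# two active non-preferred paths (prio 3 each).
pg1 = PathGroup("pg1", prio=4, valid_paths=1)
pg2 = PathGroup("pg2", prio=3, valid_paths=2)
print(choose_group([pg1, pg2]).name)  # pg1
```

Because the per-path priority is compared before the path count, the number of paths in the other group no longer affects whether failback to the preferred group happens.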

* Re: rdac priority checker changing priorities
From: Lucas Brasilino @ 2009-05-05 17:59 UTC
To: device-mapper development

Well, maybe it's a newbie opinion, but it looks like it would be easier if each path
had a 'fixed value' priority, wouldn't it ?
I've written a little code here that returns a chosen fixed value for each path and
everything goes fine.....

regards
Lucas Brasilino

2009/4/30 Chandra Seetharaman <sekharan@us.ibm.com>:
> [...]
> So, IMO we need to give the disproportionately large number for
> preferred path w.r.t active and non-preferred. What do you think ?
>
> chandra