* release 0.4.4 ?
@ 2005-04-17 9:50 christophe varoqui
2005-04-18 9:02 ` Lars Marowsky-Bree
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-17 9:50 UTC (permalink / raw)
To: device-mapper development
Hello,
Does anyone have an objection to my releasing pre14 as 0.4.4?
Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-17 9:50 release 0.4.4 ? christophe varoqui
@ 2005-04-18 9:02 ` Lars Marowsky-Bree
2005-04-19 17:45 ` Alasdair G Kergon
2005-04-19 23:32 ` christophe varoqui
2 siblings, 0 replies; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-18 9:02 UTC (permalink / raw)
To: device-mapper development
On 2005-04-17T11:50:46, christophe varoqui <christophe.varoqui@free.fr> wrote:

> Does anyone have an objection to my releasing pre14 as 0.4.4?

I'm reviewing some minor cleanups we still have queued. I'll send them
to you later today after figuring out whether they are really needed ;-)

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-17 9:50 release 0.4.4 ? christophe varoqui
2005-04-18 9:02 ` Lars Marowsky-Bree
@ 2005-04-19 17:45 ` Alasdair G Kergon
2005-04-19 21:14 ` christophe varoqui
2005-04-19 23:32 ` christophe varoqui
2 siblings, 1 reply; 30+ messages in thread
From: Alasdair G Kergon @ 2005-04-19 17:45 UTC (permalink / raw)
To: christophe varoqui; +Cc: device-mapper development
---x--S--T 1 root root 1032 Apr 19 11:13 .multipath.cache

Alasdair
--
agk@redhat.com

--- multipath-tools-0.4.4.2/libmultipath/cache.c 2005-04-11 10:46:14.000000000 +0100
+++ multipath-tools-0.4.4.2-new/libmultipath/cache.c 2005-04-19 18:20:16.000000000 +0100
@@ -86,7 +86,7 @@
 	off_t record_len;
 	struct path * pp;

-	fd = open(CACHE_TMPFILE, O_RDWR|O_CREAT);
+	fd = open(CACHE_TMPFILE, O_RDWR|O_CREAT, 0600);

 	if (fd < 0)
 		return 1;
^ permalink raw reply [flat|nested] 30+ messages in thread
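The odd `---x--S--T` mode in the listing above is what a missing third argument to open() can produce: with O_CREAT the mode is required, and when it is absent the permission bits are arbitrary. A minimal sketch of the corrected call follows, assuming a POSIX system. The function names and the `path` parameter are hypothetical stand-ins for illustration, not part of multipath-tools.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) a scratch file that only its owner may read and write.
 * `path` is a hypothetical argument standing in for CACHE_TMPFILE.
 * Returns the file descriptor, or -1 on error.
 */
int open_private_file(const char *path)
{
	assert(path != NULL);

	/* With O_CREAT, open() takes a third argument giving the mode.
	 * Leaving it out gives the new file arbitrary permission bits. */
	return open(path, O_RDWR | O_CREAT, 0600);
}

/* Return the permission bits of an open file, or (mode_t)-1 on error. */
mode_t file_permissions(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return (mode_t)-1;
	return st.st_mode & 07777;
}
```

The requested mode is still filtered through the process umask, so 0600 is an upper bound on the permissions rather than a guarantee.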
* Re: release 0.4.4 ?
2005-04-19 17:45 ` Alasdair G Kergon
@ 2005-04-19 21:14 ` christophe varoqui
0 siblings, 0 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-19 21:14 UTC (permalink / raw)
To: device-mapper development
Merged

On Tue, 2005-04-19 at 18:45 +0100, Alasdair G Kergon wrote:
> --- multipath-tools-0.4.4.2/libmultipath/cache.c 2005-04-11 10:46:14.000000000 +0100
> +++ multipath-tools-0.4.4.2-new/libmultipath/cache.c 2005-04-19 18:20:16.000000000 +0100
> @@ -86,7 +86,7 @@
>  	off_t record_len;
>  	struct path * pp;
>
> -	fd = open(CACHE_TMPFILE, O_RDWR|O_CREAT);
> +	fd = open(CACHE_TMPFILE, O_RDWR|O_CREAT, 0600);
>
>  	if (fd < 0)
>  		return 1;
>
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-17 9:50 release 0.4.4 ? christophe varoqui
2005-04-18 9:02 ` Lars Marowsky-Bree
2005-04-19 17:45 ` Alasdair G Kergon
@ 2005-04-19 23:32 ` christophe varoqui
2005-04-20 13:31 ` Lars Marowsky-Bree
2 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-19 23:32 UTC (permalink / raw)
To: device-mapper development
A lot of fixes flowed in since that proposal.
Now I propose pre16 as a 0.4.4 candidate.

Objections still welcome.

Regards,

On Sun, 2005-04-17 at 11:50 +0200, christophe varoqui wrote:
> Hello,
>
> Does anyone have an objection to my releasing pre14 as 0.4.4?
>
> Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-19 23:32 ` christophe varoqui
@ 2005-04-20 13:31 ` Lars Marowsky-Bree
2005-04-20 14:04 ` Lars Marowsky-Bree
2005-04-20 14:20 ` christophe varoqui
0 siblings, 2 replies; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 13:31 UTC (permalink / raw)
To: device-mapper development; +Cc: hare
On 2005-04-20T01:32:51, christophe varoqui <christophe.varoqui@free.fr> wrote:

> A lot of fixes flowed in since that proposal.
> Now I propose pre16 as a 0.4.4 candidate.
>
> Objections still welcome.

-pre16 /sbin/multipath doesn't create _any_ mapping for me.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 13:31 ` Lars Marowsky-Bree
@ 2005-04-20 14:04 ` Lars Marowsky-Bree
2005-04-20 14:24 ` christophe varoqui
2005-04-20 14:20 ` christophe varoqui
1 sibling, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 14:04 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T15:31:17, Lars Marowsky-Bree <lmb@suse.de> wrote:

> > A lot of fixes flowed in since that proposal.
> > Now I propose pre16 as a 0.4.4 candidate.
> >
> > Objections still welcome.
> -pre16 /sbin/multipath doesn't create _any_ mapping for me.

The change to hwtable.c / config.c no longer can deal with store_hwe()
not filling in _all_ details.

It does not differentiate between a parameter being NULL because memory
couldn't be allocated and a parameter NULL (use default) being passed
in.

Replacing the NULL entries with "/bin/false" is a temporary
workaround...

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 14:04 ` Lars Marowsky-Bree
@ 2005-04-20 14:24 ` christophe varoqui
0 siblings, 0 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 14:24 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 16:04 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T15:31:17, Lars Marowsky-Bree <lmb@suse.de> wrote:
>
> > > A lot of fixes flowed in since that proposal.
> > > Now I propose pre16 as a 0.4.4 candidate.
> > >
> > > Objections still welcome.
> > -pre16 /sbin/multipath doesn't create _any_ mapping for me.
>
> The change to hwtable.c / config.c no longer can deal with store_hwe()
> not filling in _all_ details.
>
> It does not differentiate between a parameter being NULL because memory
> couldn't be allocated and a parameter NULL (use default) being passed
> in.
>
> Replacing the NULL entries with "/bin/false" is a temporary
> workaround...
>
(*ashamed*) :/

NULL must remain a valid param value.
I'll make sure of that when I regain SAN access.
You can beat me to it, too :)

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
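The distinction Lars describes (NULL meaning "use the default" versus NULL meaning "allocation failed") can be kept apart by reporting failure through the return value instead of through the pointer. The following is only a sketch of that idea. `copy_optional` is a hypothetical helper and is not the actual store_hwe() code or its pre17 fix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy an optional string parameter.
 *
 * A NULL `src` means "use the default" and is not an error: *dst is set
 * to NULL and 0 is returned. Only a failed allocation returns 1.
 * All names here are hypothetical, not the multipath-tools API.
 */
int copy_optional(char **dst, const char *src)
{
	size_t len;

	assert(dst != NULL);

	if (src == NULL) {
		*dst = NULL;	/* caller asked for the default */
		return 0;
	}

	len = strlen(src) + 1;
	*dst = malloc(len);
	if (*dst == NULL)
		return 1;	/* genuine out-of-memory failure */

	memcpy(*dst, src, len);
	return 0;
}
```

With this shape a caller can sum the return codes, as the `r += store_hwe_ext(...)` lines in hwtable.c do, without a NULL callout being counted as an error.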
* Re: release 0.4.4 ?
2005-04-20 13:31 ` Lars Marowsky-Bree
2005-04-20 14:04 ` Lars Marowsky-Bree
@ 2005-04-20 14:20 ` christophe varoqui
2005-04-20 14:26 ` Lars Marowsky-Bree
2005-04-20 20:12 ` Lars Marowsky-Bree
1 sibling, 2 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 14:20 UTC (permalink / raw)
To: device-mapper development; +Cc: hare
On Wed, 2005-04-20 at 15:31 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T01:32:51, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > A lot of fixes flowed in since that proposal.
> > Now I propose pre16 as a 0.4.4 candidate.
> >
> > Objections still welcome.
>
> -pre16 /sbin/multipath doesn't create _any_ mapping for me.
>
Not funny.

Unfortunately, I had no SAN access for the last week, which plagued the
quality of the pre-release, as you can see.

I guess I'll setup the OSDL environment now, for testing and debugging.
Hope to be able to diagnose in a few hours.

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 14:20 ` christophe varoqui
@ 2005-04-20 14:26 ` Lars Marowsky-Bree
2005-04-20 14:41 ` christophe varoqui
2005-04-20 20:59 ` christophe varoqui
2005-04-20 20:12 ` Lars Marowsky-Bree
1 sibling, 2 replies; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 14:26 UTC (permalink / raw)
To: device-mapper development
[-- Attachment #1: Type: text/plain, Size: 610 bytes --]
On 2005-04-20T16:20:23, christophe varoqui <christophe.varoqui@free.fr> wrote:

> > -pre16 /sbin/multipath doesn't create _any_ mapping for me.
> Not funny.

Not very ;-)

> I guess I'll setup the OSDL environment now, for testing and debugging.
> Hope to be able to diagnose in a few hours.

Well, I forgot to attach a patch last time. Here we go. Also initializes
pp_emc so that the sense buffer doesn't contain crap.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
[-- Attachment #2: multipath-tools-fixes.patch --]
[-- Type: text/plain, Size: 2745 bytes --]
diff -ru multipath-tools-0.4.4-pre16.old/libmultipath/cache.h multipath-tools-0.4.4-pre16/libmultipath/cache.h
--- multipath-tools-0.4.4-pre16.old/libmultipath/cache.h 2005-04-20 01:09:06.000000000 +0200
+++ multipath-tools-0.4.4-pre16/libmultipath/cache.h 2005-04-20 16:05:04.859208723 +0200
@@ -1,5 +1,5 @@
-#define CACHE_FILE "/var/cache/multipath/.multipath.cache"
-#define CACHE_TMPFILE "/var/cache/multipath/.multipath.cache.tmp"
+#define CACHE_FILE "/dev/.multipath.cache"
+#define CACHE_TMPFILE "/dev/.multipath.cache.tmp"
 #define CACHE_EXPIRE 5
 #define MAX_WAIT 5
Only in multipath-tools-0.4.4-pre16/libmultipath: cache.h~
diff -ru multipath-tools-0.4.4-pre16.old/libmultipath/hwtable.c multipath-tools-0.4.4-pre16/libmultipath/hwtable.c
--- multipath-tools-0.4.4-pre16.old/libmultipath/hwtable.c 2005-04-18 21:27:41.000000000 +0200
+++ multipath-tools-0.4.4-pre16/libmultipath/hwtable.c 2005-04-20 16:04:13.538010731 +0200
@@ -36,11 +36,11 @@
 	r += store_hwe_ext(hw, "DGC", "*", GROUP_BY_PRIO, DEFAULT_GETUID,
 		"/sbin/pp_emc /dev/%n", "1 emc", "0", "emc_clariion");
 	r += store_hwe_ext(hw, "IBM", "3542", GROUP_BY_SERIAL, DEFAULT_GETUID,
-		NULL, "0", "0", "tur");
+		"/bin/false", "0", "0", "tur");
 	r += store_hwe_ext(hw, "SGI", "TP9400", MULTIBUS, DEFAULT_GETUID,
-		NULL, "0", "0", "tur");
+		"/bin/false", "0", "0", "tur");
 	r += store_hwe_ext(hw, "SGI", "TP9500", FAILOVER, DEFAULT_GETUID,
-		NULL, "0", "0", "tur");
+		"/bin/false", "0", "0", "tur");
 	return r;
 }
diff -ru multipath-tools-0.4.4-pre16.old/multipath/Makefile multipath-tools-0.4.4-pre16/multipath/Makefile
--- multipath-tools-0.4.4-pre16.old/multipath/Makefile 2005-04-20 01:08:37.000000000 +0200
+++ multipath-tools-0.4.4-pre16/multipath/Makefile 2005-04-20 16:04:42.961537675 +0200
@@ -43,7 +43,6 @@
 install:
 	install -d $(DESTDIR)$(bindir)
 	install -m 755 $(EXEC) $(DESTDIR)$(bindir)/
-	install -d $(DESTDIR)/var/cache/multipath/
 	install -d $(DESTDIR)/etc/dev.d/block/
 	install -m 755 multipath.dev $(DESTDIR)/etc/dev.d/block/
 	install -d $(DESTDIR)/etc/udev/rules.d
diff -ru multipath-tools-0.4.4-pre16.old/libcheckers/emc_clariion.c multipath-tools-0.4.4-pre16/libcheckers/emc_clariion.c
--- multipath-tools-0.4.4-pre16.old/libcheckers/emc_clariion.c 2005-04-11 16:54:47.000000000 +0200
+++ multipath-tools-0.4.4-pre16/libcheckers/emc_clariion.c 2005-04-20 16:23:12.376089751 +0200
@@ -25,8 +25,8 @@
 int emc_clariion(int fd, char *msg, void **context)
 {
-	unsigned char sense_buffer[256];
-	unsigned char sb[128];
+	unsigned char sense_buffer[256] = { 0, };
+	unsigned char sb[128] = { 0, };
 	unsigned char inqCmdBlk[INQUIRY_CMDLEN] = {INQUIRY_CMD, 1, 0xC0, 0, sizeof(sb), 0};
 	struct sg_io_hdr io_hdr;
[-- Attachment #3: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
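The last hunk of the patch relies on a C rule that is easy to overlook: an array with a partial initializer such as `{ 0, }` has every remaining element zeroed too, while an uninitialized automatic array holds indeterminate bytes. A small sketch of that behaviour, with hypothetical helper names and a buffer size borrowed from `sb[128]` purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define SENSE_LEN 128	/* illustrative size, matching sb[128] in the patch */

/* Return 1 if every byte in buf is zero, 0 otherwise. */
int all_zero(const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}

/*
 * Declare a buffer the way the patch does and report whether it starts
 * out clean. The initializer zeroes every element, not only the first.
 */
int fresh_sense_buffer_is_clean(void)
{
	unsigned char sb[SENSE_LEN] = { 0, };

	return all_zero(sb, sizeof(sb));
}
```

An explicit `memset(sb, 0, sizeof(sb))` would have the same effect; the initializer form simply keeps the zeroing next to the declaration.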
* Re: release 0.4.4 ?
2005-04-20 14:26 ` Lars Marowsky-Bree
@ 2005-04-20 14:41 ` christophe varoqui
2005-04-20 15:01 ` Lars Marowsky-Bree
2005-04-20 20:59 ` christophe varoqui
1 sibling, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 14:41 UTC (permalink / raw)
To: device-mapper development
> Well, I forgot to attach a patch last time. Here we go. Also initializes
> pp_emc so that the sense buffer doesn't contain crap.
>
>
Did you reach an agreement with Alasdair regarding the cache file
location ?
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 14:41 ` christophe varoqui
@ 2005-04-20 15:01 ` Lars Marowsky-Bree
2005-04-20 16:06 ` Alasdair G Kergon
0 siblings, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 15:01 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T16:41:54, christophe varoqui <christophe.varoqui@free.fr> wrote:

> Did you reach an agreement with Alasdair regarding the cache file
> location ?

Not yet. I just saw that that part of the patch I need locally on top of
-pre16 slipped in too; you may want to delay applying that until
Alasdair has replied.

But I think it's pretty obvious that multipath is needed before /var is
mounted, I think.

Kind regards,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 15:01 ` Lars Marowsky-Bree
@ 2005-04-20 16:06 ` Alasdair G Kergon
2005-04-20 16:27 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: Alasdair G Kergon @ 2005-04-20 16:06 UTC (permalink / raw)
To: device-mapper development
On Wed, Apr 20, 2005 at 05:01:07PM +0200, Lars Marowsky-Bree wrote:
> But I think it's pretty obvious that multipath is needed before /var is
> mounted, I think.

I would prefer to see the temporary file disappear completely:
as I see things, it was a quick workaround for a problem until
there's time to replace it with a better permanent solution.

Alasdair
--
agk@redhat.com
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 16:06 ` Alasdair G Kergon
@ 2005-04-20 16:27 ` christophe varoqui
2005-04-20 17:34 ` Lars Marowsky-Bree
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 16:27 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 17:06 +0100, Alasdair G Kergon wrote:
> On Wed, Apr 20, 2005 at 05:01:07PM +0200, Lars Marowsky-Bree wrote:
> > But I think it's pretty obvious that multipath is needed before /var is
> > mounted, I think.
>
> I would prefer to see the temporary file disappear completely:
> as I see things, it was a quick workaround for a problem until
> there's time to replace it with a better permanent solution.
>
Yes, it is a workaround fitted to 0.4.4.
0.4.5 won't use a cache, as Alasdair suggests.

In the mean time, the problem boils down to which distro will maintain a
patch over 0.4.4 to set the cache location to the place it feels
appropriate.

Volunteers?
Maybe some other location can be agreed on?

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 16:27 ` christophe varoqui
@ 2005-04-20 17:34 ` Lars Marowsky-Bree
0 siblings, 0 replies; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 17:34 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T18:27:11, christophe varoqui <christophe.varoqui@free.fr> wrote:

> In the mean time, the problem boils down to which distro will maintain a
> patch over 0.4.4 to set the cache location to the place it feels
> appropriate.

Just leave it as it is then, I don't care much; it's a pretty simple
change to maintain.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 14:26 ` Lars Marowsky-Bree
2005-04-20 14:41 ` christophe varoqui
@ 2005-04-20 20:59 ` christophe varoqui
1 sibling, 0 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 20:59 UTC (permalink / raw)
To: device-mapper development
Indeed, store_hwe was completely brain dead.
Corrected in pre17, online.

Also merged the sense buffer zeroing in the EMC checker.

Regards,
cvaroqui

On Wed, 2005-04-20 at 16:26 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T16:20:23, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > > -pre16 /sbin/multipath doesn't create _any_ mapping for me.
> > Not funny.
>
> Not very ;-)
>
> > I guess I'll setup the OSDL environment now, for testing and debugging.
> > Hope to be able to diagnose in a few hours.
>
> Well, I forgot to attach a patch last time. Here we go. Also initializes
> pp_emc so that the sense buffer doesn't contain crap.
>
>
> Sincerely,
>     Lars Marowsky-Brée <lmb@suse.de>
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 14:20 ` christophe varoqui
2005-04-20 14:26 ` Lars Marowsky-Bree
@ 2005-04-20 20:12 ` Lars Marowsky-Bree
2005-04-20 20:39 ` christophe varoqui
1 sibling, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 20:12 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T16:20:23, christophe varoqui <christophe.varoqui@free.fr> wrote:

> I guess I'll setup the OSDL environment now, for testing and debugging.
> Hope to be able to diagnose in a few hours.

Something else is strange.

This is what /sbin/multipath creates:

0 267249280 multipath 0 1 emc 2 1 round-robin 0 2 1 66:208 1000 8:240 1000 round-robin 0 2 1 65:224 1000 8:0 1000
action preset to 0
action set to 4
create: 3600601607cf30e00164589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]
\_ round-robin 0 [first]
  \_ 1:0:1:0 sdat 66:208 [ready ]
  \_ 0:0:1:0 sdp 8:240 [ready ]
\_ round-robin 0
  \_ 1:0:0:0 sdae 65:224 [ready ]
  \_ 0:0:0:0 sda 8:0 [ready ]

multipathd however:
...
Apr 20 22:09:11 chip multipathd: getprio = /sbin/pp_emc /dev/%n (controler setting)
Apr 20 22:09:11 chip multipathd: error calling out /sbin/pp_emc /dev/sdp

Resulting in a flat map:
3600601607cf30e0070af8bb37a31d911
[size=133 GB][features="0"][hwhandler="1 emc"]
\_ round-robin 0 [active][first]
  \_ 0:0:1:12 sdab 65:176 [ready ][active]
  \_ 1:0:0:12 sdaq 66:160 [ready ][active]
  \_ 1:0:1:12 sdbf 67:144 [ready ][active]
  \_ 0:0:0:12 sdm 8:192 [ready ][active]

Which is broken and actually causes the system to misbehave quite
spectacularly.

What is multipathd doing differently from multipath? The two parts being
distinct in behaviour is really one of the worst design decisions in the
entire setup, in my not so humble opinion.

Second, if a callout _fails_, it ought to refuse to create the map IMHO,
rather than risk creating a broken one.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 20:12 ` Lars Marowsky-Bree
@ 2005-04-20 20:39 ` christophe varoqui
2005-04-20 20:51 ` Lars Marowsky-Bree
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 20:39 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 22:12 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T16:20:23, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > I guess I'll setup the OSDL environment now, for testing and debugging.
> > Hope to be able to diagnose in a few hours.
>
> Something else is strange.
>
> This is what /sbin/multipath creates:
>
> 0 267249280 multipath 0 1 emc 2 1 round-robin 0 2 1 66:208 1000 8:240 1000 round-robin 0 2 1 65:224 1000 8:0 1000
> action preset to 0
> action set to 4
> create: 3600601607cf30e00164589a37a31d911
> [size=127 GB][features="0"][hwhandler="1 emc"]
> \_ round-robin 0 [first]
>   \_ 1:0:1:0 sdat 66:208 [ready ]
>   \_ 0:0:1:0 sdp 8:240 [ready ]
> \_ round-robin 0
>   \_ 1:0:0:0 sdae 65:224 [ready ]
>   \_ 0:0:0:0 sda 8:0 [ready ]
>
> multipathd however:
> ...
> Apr 20 22:09:11 chip multipathd: getprio = /sbin/pp_emc /dev/%n (controler setting)
> Apr 20 22:09:11 chip multipathd: error calling out /sbin/pp_emc /dev/sdp
>
from libmultipath/discovery.c:devinfo() :

	if (apply_format(pp->getprio, &buff[0], pp)) {
		pp->priority = 1;
	} else if (execute_program(buff, prio, 16)) {
		condlog(3, "error calling out %s", buff);
		pp->priority = 1;
	} else
		pp->priority = atoi(prio);

A failed exec results in a default path weight.
This code is common to multipath and multipathd.

> Resulting in a flat map:
> 3600601607cf30e0070af8bb37a31d911
> [size=133 GB][features="0"][hwhandler="1 emc"]
> \_ round-robin 0 [active][first]
>   \_ 0:0:1:12 sdab 65:176 [ready ][active]
>   \_ 1:0:0:12 sdaq 66:160 [ready ][active]
>   \_ 1:0:1:12 sdbf 67:144 [ready ][active]
>   \_ 0:0:0:12 sdm 8:192 [ready ][active]
>
> Which is broken and actually causes the system to misbehave quite
> spectacularly.
>
Indeed.

> What is multipathd doing differently from multipath? The two parts being
> distinct in behaviour is really one of the worst design decisions in the
> entire setup, in my not so humble opinion.
>
Sure, it was acknowledged and corrected from 0.4.2.

Multipathd *never* decides to change a map. All it does is exec
multipath, which means two runs of multipath gave 2 different maps :
very bad, but nothing to do with code divergence between daemon and
tool.

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 20:39 ` christophe varoqui
@ 2005-04-20 20:51 ` Lars Marowsky-Bree
2005-04-20 21:01 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 20:51 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T22:39:56, christophe varoqui <christophe.varoqui@free.fr> wrote:

> Multipathd *never* decides to change a map. All it does is exec
> multipath, which means two runs of multipath gave 2 different maps :
> very bad, but nothing to do with code divergence between daemon and
> tool.

But it _always_ happens when multipath is called from multipathd.

It _never_ happens when multipath is called manually by root.

It didn't with -pre14, but does with -pre16.

What could have caused this?

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 20:51 ` Lars Marowsky-Bree
@ 2005-04-20 21:01 ` christophe varoqui
2005-04-20 21:10 ` Lars Marowsky-Bree
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 21:01 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 22:51 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T22:39:56, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > Multipathd *never* decides to change a map. All it does is exec
> > multipath, which means two runs of multipath gave 2 different maps :
> > very bad, but nothing to do with code divergence between daemon and
> > tool.
>
> But it _always_ happens when multipath is called from multipathd.
>
> It _never_ happens when multipath is called manually by root.
>
> It didn't with -pre14, but does with -pre16.
>
> What could have caused this?
>
Don't forget the binaries caching in a ramfs.
If you don't restart the daemon, it may still use a WIP multipath
version as a callout.

I'll try and reproduce this at OSDL.

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 21:01 ` christophe varoqui
@ 2005-04-20 21:10 ` Lars Marowsky-Bree
2005-04-20 21:38 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 21:10 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T23:01:36, christophe varoqui <christophe.varoqui@free.fr> wrote:

> Don't forget the binaries caching in a ramfs.
> If you don't restart the daemon, it may still use a WIP multipath
> version as a callout.
>
> I'll try and reproduce this at OSDL.

I've most certainly restarted the daemon. To make sure the kernel didn't
get stuck somewhere (it's sometimes a bit annoying to be debugging
user-space and the kernel at the same time ;-) I also rebooted the node,
which I guess should count as a restart ;-)

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 21:10 ` Lars Marowsky-Bree
@ 2005-04-20 21:38 ` christophe varoqui
2005-04-20 21:52 ` Lars Marowsky-Bree
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 21:38 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 23:10 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T23:01:36, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > Don't forget the binaries caching in a ramfs.
> > If you don't restart the daemon, it may still use a WIP multipath
> > version as a callout.
> >
> > I'll try and reproduce this at OSDL.
>
> I've most certainly restarted the daemon. To make sure the kernel didn't
> get stuck somewhere (it's sometimes a bit annoying to be debugging
> user-space and the kernel at the same time ;-) I also rebooted the node,
> which I guess should count as a restart ;-)
>
I can't reproduce that at OSDL :

* no config file
* IBM 3542 (tur/group_by_serial/no hwh/no feature)
* mp-tools 0.4.4-pre17

==== Phase 1 : a running dd, all is fine ====

[root@cl039 block]# jobs
[1]+ Running dd if=/dev/mapper/3600a0b80000b596a000003113d9c5381 of=/dev/null & (wd: ~)
[root@cl039 block]# multipath -l 3600a0b80000b596a000003113d9c5381
3600a0b80000b596a000003113d9c5381
[size=33 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
  \_ 2:0:3:9 sdan 66:112 [ready ][active]
  \_ 3:0:1:9 sdbh 67:176 [ready ][active]
\_ round-robin 0 [enabled]
  \_ 3:0:3:9 sdcb 68:240 [ready ][active]
  \_ 2:0:1:9 sdt 65:48 [ready ][active]

==== Phase 2 : entropy ====

[root@cl039 device]# echo 1>/sys/block/sdbh/device/delete

Apr 20 14:34:08 cl039 kernel: Synchronizing SCSI cache for disk sdbh:
Apr 20 14:34:08 cl039 kernel: scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:34:08 cl039 multipathd: devmap event (2) on 3600a0b80000b596a000003113d9c5381
Apr 20 14:34:08 cl039 multipathd: mark 67:176 as failed
Apr 20 14:34:10 cl039 kernel: scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:34:10 cl039 multipathd: 67:176: tur checker reports path is up
Apr 20 14:34:11 cl039 multipathd: devmap event (3) on 3600a0b80000b596a000003113d9c5381
Apr 20 14:34:11 cl039 multipathd: mark 67:176 as failed
Apr 20 14:34:16 cl039 kernel: scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:34:16 cl039 multipathd: 67:176: tur checker reports path is up
Apr 20 14:34:17 cl039 multipathd: devmap event (4) on 3600a0b80000b596a000003113d9c5381
Apr 20 14:34:22 cl039 kernel: scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:34:54 cl039 last message repeated 6 times
Apr 20 14:35:52 cl039 last message repeated 11 times

### note the kernel seems to be slow reaping down the device, which causes some up/down cycles ###

[root@cl039 device]# multipath -l 3600a0b80000b596a000003113d9c5381
3600a0b80000b596a000003113d9c5381
[size=33 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
  \_ 2:0:3:9 sdan 66:112 [ready ][active]
  \_ 0:0:0:0 67:176 [undef ][active]
\_ round-robin 0 [active][first]
  \_ 3:0:3:9 sdcb 68:240 [ready ][active]
  \_ 2:0:1:9 sdt 65:48 [ready ][active]

### note multipath got execed by multipathd and chose *not* to reload the map, leaving the dead device in place, in case it comes up again ###
### also note the active PG switched to the one with the most valid paths (all path prio == 1) ###

==== Phase 3 : restore ====

[root@cl039 root]# echo "scsi add-single-device 3 0 1 9">/proc/scsi/scsi

Apr 20 14:35:56 cl039 kernel: Vendor: IBM Model: 3542 Rev: 0401
Apr 20 14:35:57 cl039 kernel: Type: Direct-Access ANSI SCSI revision: 03
Apr 20 14:35:57 cl039 kernel: qla2200 0000:03:05.0: scsi(3:0:1:9): Enabled tagged queuing, queue depth 16.
Apr 20 14:35:57 cl039 kernel: SCSI device sdcc: 71014400 512-byte hdwr sectors (36359 MB)
Apr 20 14:35:57 cl039 kernel: SCSI device sdcc: drive cache: write back
Apr 20 14:35:57 cl039 kernel: sdcc:<3>scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:35:58 cl039 multipathd: 68:240: tur checker reports path is down
Apr 20 14:35:58 cl039 multipathd: 65:48: tur checker reports path is down
Apr 20 14:35:58 cl039 kernel: unknown partition table
Apr 20 14:35:58 cl039 kernel: Attached scsi disk sdcc at scsi3, channel 0, id 1, lun 9
Apr 20 14:35:58 cl039 kernel: Attached scsi generic sg59 at scsi3, channel 0, id 1, lun 9, type 0
Apr 20 14:36:03 cl039 kernel: scsi3 (1:9): rejecting I/O to dead device
Apr 20 14:36:04 cl039 multipathd: 68:240: tur checker reports path is up
Apr 20 14:36:04 cl039 multipathd: devmap event (5) on 3600a0b80000b596a000003113d9c5381
Apr 20 14:36:04 cl039 multipathd: 65:48: tur checker reports path is up

[root@cl039 root]# multipath -l 3600a0b80000b596a000003113d9c5381
3600a0b80000b596a000003113d9c5381
[size=33 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
  \_ 2:0:3:9 sdan 66:112 [ready ][active]
  \_ 3:0:1:9 sdcc 69:0 [ready ][active]
\_ round-robin 0 [enabled]
  \_ 3:0:3:9 sdcb 68:240 [ready ][active]
  \_ 2:0:1:9 sdt 65:48 [ready ][active]

### multipath got execed by multipathd, and judged it opportune to reload a map, removing the dead device and adding the new renamed one ###

==== end ====

It all seems sane to me.

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 21:38 ` christophe varoqui
@ 2005-04-20 21:52 ` Lars Marowsky-Bree
2005-04-20 22:04 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 21:52 UTC (permalink / raw)
To: device-mapper development
On 2005-04-20T23:38:39, christophe varoqui <christophe.varoqui@free.fr> wrote:

> I can't reproduce that at OSDL :
>
> * no config file
> * IBM 3542 (tur/group_by_serial/no hwh/no feature)
> * mp-tools 0.4.4-pre17

That means you're not using a priority callout, which is what the bug is
about, right?

Anyway, I'll try -pre17 (or maybe -pre18 by then ;-) tomorrow.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: release 0.4.4 ?
2005-04-20 21:52 ` Lars Marowsky-Bree
@ 2005-04-20 22:04 ` christophe varoqui
2005-04-20 22:10 ` christophe varoqui
2005-04-20 22:34 ` christophe varoqui
0 siblings, 2 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 22:04 UTC (permalink / raw)
To: device-mapper development
On Wed, 2005-04-20 at 23:52 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-20T23:38:39, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > I can't reproduce that at OSDL :
> >
> > * no config file
> > * IBM 3542 (tur/group_by_serial/no hwh/no feature)
> > * mp-tools 0.4.4-pre17
>
> That means you're not using a priority callout, which is what the bug is
> about, right?
>
> Anyway, I'll try -pre17 (or maybe -pre18 by then ;-) tomorrow.
>
oh, so you use the group_by_prio pgpolicy ?
Indeed, the fallback prio to 1 is annoying in that scenario.

I'll see how to propagate the error to prevent a map reloading in case
of prio callout error.

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
^ permalink raw reply [flat|nested] 30+ messages in thread
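One way to read "propagate the error" is to stop substituting a default priority when the callout fails and to hand the failure back to the caller, who can then decline to load or reload the map. The sketch below illustrates that idea only. `path_priority` and its parameters are hypothetical and do not reflect how multipath-tools eventually handled this.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn the result of a priority callout into a
 * path priority.
 *
 * callout_rc is the callout's return code (non-zero means it failed)
 * and output is the text it printed. Returns 0 and stores the priority
 * on success. Returns -1 on any failure and leaves *prio untouched, so
 * the caller can refuse to build a map from a guessed value.
 */
int path_priority(int callout_rc, const char *output, int *prio)
{
	char *end;
	long value;

	assert(prio != NULL);

	if (callout_rc != 0 || output == NULL)
		return -1;

	errno = 0;
	value = strtol(output, &end, 10);
	if (end == output || errno != 0 || value < 0 || value > INT_MAX)
		return -1;	/* no digits, overflow, or out of range */

	*prio = (int)value;
	return 0;
}
```

Using strtol() rather than atoi() also lets unparsable output be told apart from a legitimate priority of 0, which atoi() cannot do.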
* Re: release 0.4.4 ?
2005-04-20 22:04 ` christophe varoqui
@ 2005-04-20 22:10 ` christophe varoqui
2005-04-20 22:40 ` Lars Marowsky-Bree
2005-04-20 22:34 ` christophe varoqui
1 sibling, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 22:10 UTC (permalink / raw)
To: device-mapper development

On Thu, 2005-04-21 at 00:04 +0200, christophe varoqui wrote:
> On Wed, 2005-04-20 at 23:52 +0200, Lars Marowsky-Bree wrote:
> > On 2005-04-20T23:38:39, christophe varoqui <christophe.varoqui@free.fr> wrote:
> >
> > > I can't reproduce that at OSDL :
> > >
> > > * no config file
> > > * IBM 3542 (tur/group_by_serial/no hwh/no feature)
> > > * mp-tools 0.4.4-pre17
> >
> > That means you're not using a priority callout, which is what the bug is
> > about, right?
> >
> > Anyway, I'll try -pre17 (or maybe -pre18 by then ;-) tomorrow.
> >
> oh, so you use the group_by_prio pgpolicy ?
> Indeed, the fallback prio to 1 is annoying in that scenario.
>
> I'll see how to propagate the error to prevent a map reloading in case
> of prio callout error.
>
In the meantime, you should check why your prio callout always fails
when run from multipathd. One obvious reason would be that the callout
was not copied into the ramfs ...

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
* Re: release 0.4.4 ?
2005-04-20 22:10 ` christophe varoqui
@ 2005-04-20 22:40 ` Lars Marowsky-Bree
2005-04-20 22:49 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-20 22:40 UTC (permalink / raw)
To: device-mapper development

On 2005-04-21T00:10:52, christophe varoqui <christophe.varoqui@free.fr> wrote:

> In the mean time, you should check why your prio callout always errors
> when run from multipathd. One obvious reason would be that the callout
> was not copied into the ramfs ...

How can I check that? I don't see any ramfs/shmfs in mount, and
multipathd also doesn't log anything about it.

Kind regards,
Lars Marowsky-Brée <lmb@suse.de>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
* Re: release 0.4.4 ?
2005-04-20 22:40 ` Lars Marowsky-Bree
@ 2005-04-20 22:49 ` christophe varoqui
2005-04-20 23:18 ` christophe varoqui
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 22:49 UTC (permalink / raw)
To: device-mapper development

On Thu, 2005-04-21 at 00:40 +0200, Lars Marowsky-Bree wrote:
> On 2005-04-21T00:10:52, christophe varoqui <christophe.varoqui@free.fr> wrote:
>
> > In the mean time, you should check why your prio callout always errors
> > when run from multipathd. One obvious reason would be that the callout
> > was not copied into the ramfs ...
>
> How can I check that? I don't see any ramfs/shmfs in mount, and
> multipathd also doesn't log anything about it.
>
"multipathd -v3" will log which files it copies into the ramfs.

You don't see it in "mount" because it's in a private namespace, but you
should see it listed in /proc/$(pidof multipathd)/mounts ... though that
won't help you see its content :/

Regards,
--
christophe varoqui <christophe.varoqui@free.fr>
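
The check described in this message (looking for the ramfs in the daemon's own mount table) can also be done programmatically. The following is only a sketch under stated assumptions: the helper names `count_mounts_of_type` and `count_proc_mounts` are hypothetical and not part of multipath-tools, and the code relies solely on the usual mounts layout of whitespace-separated device, mount point and filesystem type fields.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count entries of the given filesystem type in a mounts-format stream
 * (fields: device, mount point, type, options, ...) and print each match.
 * Hypothetical helper, for illustration only.
 */
int count_mounts_of_type(FILE *fp, const char *fstype)
{
	char line[1024], dev[256], dir[256], type[64];
	int found = 0;

	while (fgets(line, sizeof(line), fp)) {
		/* skip lines that do not carry at least three fields */
		if (sscanf(line, "%255s %255s %63s", dev, dir, type) != 3)
			continue;
		if (strcmp(type, fstype) == 0) {
			printf("%s mounted on %s\n", type, dir);
			found++;
		}
	}
	return found;
}

/*
 * Same check against /proc/<pid>/mounts, which reflects the mount
 * namespace of that process. Returns -1 if the file cannot be opened.
 */
int count_proc_mounts(long pid, const char *fstype)
{
	char path[64];
	FILE *fp;
	int found;

	snprintf(path, sizeof(path), "/proc/%ld/mounts", pid);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	found = count_mounts_of_type(fp, fstype);
	fclose(fp);
	return found;
}
```

Calling `count_proc_mounts(pid, "ramfs")` with the pid of multipathd would report whether a ramfs is mounted in its namespace; as noted in the message, it says nothing about which files were copied into it.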
* Re: release 0.4.4 ?
2005-04-20 22:49 ` christophe varoqui
@ 2005-04-20 23:18 ` christophe varoqui
2005-04-21 8:52 ` Lars Marowsky-Bree
0 siblings, 1 reply; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 23:18 UTC (permalink / raw)
To: device-mapper development

On Thu, 2005-04-21 at 00:49 +0200, christophe varoqui wrote:
> On Thu, 2005-04-21 at 00:40 +0200, Lars Marowsky-Bree wrote:
> > On 2005-04-21T00:10:52, christophe varoqui <christophe.varoqui@free.fr> wrote:
> >
> > > In the mean time, you should check why your prio callout always errors
> > > when run from multipathd. One obvious reason would be that the callout
> > > was not copied into the ramfs ...
> >
> > How can I check that? I don't see any ramfs/shmfs in mount, and
> > multipathd also doesn't log anything about it.
> >
> "multipathd -v3" will log what files it copies into the ramfs.
>
> You don't see it in "mount" because its in a private namespace, but you
> should see it listed in /proc/$(pidof multipathd)/mounts ... though it
> won't help you see its content :/
>
I think I know what happens. The binvec is constructed from dict.c only,
so it works only in the presence of a config file.

The attached patch:

* moves push_callout to config.[ch]
* uses it in store_hwe() too

To make you right, I'll push pre18 now ;)

Regards,
cvaroqui

diff -urN multipath-tools-0.4.4-pre17/libmultipath/config.c multipath-tools-0.4.4-pre18/libmultipath/config.c
--- multipath-tools-0.4.4-pre17/libmultipath/config.c	2005-04-20 12:50:09.000000000 -0700
+++ multipath-tools-0.4.4-pre18/libmultipath/config.c	2005-04-20 16:24:31.951046352 -0700
@@ -14,6 +14,56 @@
 
 #include "../libcheckers/checkers.h"
 
+/*
+ * helper function to draw a list of callout binaries found in the config file
+ */
+extern int
+push_callout(char * callout)
+{
+	int i;
+	char * bin;
+	char * p;
+
+	/*
+	 * purge command line arguments
+	 */
+	p = callout;
+
+	while (*p != ' ' && *p != '\0')
+		p++;
+
+	if (!conf->binvec)
+		conf->binvec = vector_alloc();
+
+
+	if (!conf->binvec)
+		return 1;
+
+	/*
+	 * if this callout is already stored in binvec, don't store it twice
+	 */
+	vector_foreach_slot (conf->binvec, bin, i)
+		if (memcmp(bin, callout, p - callout) == 0)
+			return 0;
+
+	/*
+	 * else, store it
+	 */
+	bin = MALLOC((p - callout) + 1);
+
+	if (!bin)
+		return 1;
+
+	strncpy(bin, callout, p - callout);
+
+	if (!vector_alloc_slot(conf->binvec))
+		return 1;
+
+	vector_set_slot(conf->binvec, bin);
+
+	return 0;
+}
+
 struct hwentry *
 find_hwe (vector hwtable, char * vendor, char * product)
 {
@@ -195,10 +245,13 @@
 	if (pgp)
 		hwe->pgpolicy = pgp;
 
-	if (getuid)
+	if (getuid) {
 		hwe->getuid = set_param_str(getuid);
-	else
+		push_callout(getuid);
+	} else {
 		hwe->getuid = set_default(DEFAULT_GETUID);
+		push_callout(DEFAULT_GETUID);
+	}
 
 	if (!hwe->getuid)
 		goto out;
@@ -238,17 +291,21 @@
 	if (pgp)
 		hwe->pgpolicy = pgp;
 
-	if (getuid)
+	if (getuid) {
 		hwe->getuid = set_param_str(getuid);
-	else
+		push_callout(getuid);
+	} else {
 		hwe->getuid = set_default(DEFAULT_GETUID);
+		push_callout(DEFAULT_GETUID);
+	}
 
 	if (!hwe->getuid)
 		goto out;
 
-	if (getprio)
+	if (getprio) {
 		hwe->getprio = set_param_str(getprio);
-	else
+		push_callout(getprio);
+	} else
 		hwe->getprio = NULL;
 
 	if (hwhandler)
@@ -404,3 +461,4 @@
 	free_config(conf);
 	return 1;
 }
+
diff -urN multipath-tools-0.4.4-pre17/libmultipath/config.h multipath-tools-0.4.4-pre18/libmultipath/config.h
--- multipath-tools-0.4.4-pre17/libmultipath/config.h	2005-04-18 12:12:39.000000000 -0700
+++ multipath-tools-0.4.4-pre18/libmultipath/config.h	2005-04-20 16:20:27.623189832 -0700
@@ -66,6 +66,8 @@
 
 struct config * conf;
 
+extern int push_callout(char * callout);
+
 struct hwentry * find_hwe (vector hwtable, char * vendor, char * product);
 struct mpentry * find_mpe (char * wwid);
 char * get_mpe_wwid (char * alias);
diff -urN multipath-tools-0.4.4-pre17/libmultipath/dict.h multipath-tools-0.4.4-pre18/libmultipath/dict.h
--- multipath-tools-0.4.4-pre17/libmultipath/dict.h	2005-03-29 04:19:43.000000000 -0800
+++ multipath-tools-0.4.4-pre18/libmultipath/dict.h	2005-04-20 16:20:40.726197872 -0700
@@ -6,6 +6,5 @@
 #endif
 
 vector init_keywords(void);
-void push_callout(char * callout);
 
 #endif /* _DICT_H */
diff -urN multipath-tools-0.4.4-pre17/libmultipath/dict.c multipath-tools-0.4.4-pre18/libmultipath/dict.c
--- multipath-tools-0.4.4-pre17/libmultipath/dict.c	2005-04-18 09:08:03.000000000 -0700
+++ multipath-tools-0.4.4-pre18/libmultipath/dict.c	2005-04-20 16:28:21.002225280 -0700
@@ -11,56 +11,6 @@
 #include "../libcheckers/checkers.h"
 
 /*
- * helper function to draw a list of callout binaries found in the config file
- */
-extern int
-push_callout(char * callout)
-{
-	int i;
-	char * bin;
-	char * p;
-
-	/*
-	 * purge command line arguments
-	 */
-	p = callout;
-
-	while (*p != ' ' && *p != '\0')
-		p++;
-
-	if (!conf->binvec)
-		conf->binvec = vector_alloc();
-
-
-	if (!conf->binvec)
-		return 1;
-
-	/*
-	 * if this callout is already stored in binvec, don't store it twice
-	 */
-	vector_foreach_slot (conf->binvec, bin, i)
-		if (memcmp(bin, callout, p - callout) == 0)
-			return 0;
-
-	/*
-	 * else, store it
-	 */
-	bin = MALLOC((p - callout) + 1);
-
-	if (!bin)
-		return 1;
-
-	strncpy(bin, callout, p - callout);
-
-	if (!vector_alloc_slot(conf->binvec))
-		return 1;
-
-	vector_set_slot(conf->binvec, bin);
-
-	return 0;
-}
-
-/*
  * default block handlers
  */
 static int

--
christophe varoqui <christophe.varoqui@free.fr>
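
The logic of push_callout() in the patch above (strip the arguments from a callout command line, then record the binary only once) can be tried in isolation. This is a minimal sketch under assumptions, not the multipath-tools code: `struct binlist` is a hypothetical stand-in for `conf->binvec`, and `push_callout_sketch` is a hypothetical name. It also departs from the patch in two places: it compares whole names rather than a `memcmp()` prefix, and it terminates the copied string explicitly instead of relying on the allocator to return zeroed memory.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for conf->binvec: a growable array of strings. */
struct binlist {
	char **bins;
	size_t count;
};

/*
 * Record the binary named by a callout command line, at most once.
 * Returns 0 on success (including "already present"), 1 on allocation
 * failure, mirroring the return convention of the patch.
 */
int push_callout_sketch(struct binlist *list, const char *callout)
{
	/* length of the binary path: everything before the first space */
	size_t len = strcspn(callout, " ");
	size_t i;
	char *bin;
	char **grown;

	/* skip duplicates; both length and bytes must match */
	for (i = 0; i < list->count; i++)
		if (strlen(list->bins[i]) == len &&
		    memcmp(list->bins[i], callout, len) == 0)
			return 0;

	bin = malloc(len + 1);
	if (!bin)
		return 1;
	memcpy(bin, callout, len);
	bin[len] = '\0';

	grown = realloc(list->bins, (list->count + 1) * sizeof(*grown));
	if (!grown) {
		free(bin);
		return 1;
	}
	list->bins = grown;
	list->bins[list->count++] = bin;
	return 0;
}
```

With a prefix-only comparison, a callout such as `/sbin/scsi` would be treated as already stored once `/sbin/scsi_id` is in the list; comparing lengths as well avoids that, which is why the sketch makes the swap.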
* Re: release 0.4.4 ?
2005-04-20 23:18 ` christophe varoqui
@ 2005-04-21 8:52 ` Lars Marowsky-Bree
0 siblings, 0 replies; 30+ messages in thread
From: Lars Marowsky-Bree @ 2005-04-21 8:52 UTC (permalink / raw)
To: device-mapper development

On 2005-04-21T01:18:02, christophe varoqui <christophe.varoqui@free.fr> wrote:

> To make you right, I'll push pre18 now ;)

Thanks, that seems to fix it for me!

Sincerely,
Lars Marowsky-Brée <lmb@suse.de>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
* Re: release 0.4.4 ?
2005-04-20 22:04 ` christophe varoqui
2005-04-20 22:10 ` christophe varoqui
@ 2005-04-20 22:34 ` christophe varoqui
1 sibling, 0 replies; 30+ messages in thread
From: christophe varoqui @ 2005-04-20 22:34 UTC (permalink / raw)
To: device-mapper development

On Thu, 2005-04-21 at 00:04 +0200, christophe varoqui wrote:
> On Wed, 2005-04-20 at 23:52 +0200, Lars Marowsky-Bree wrote:
> > On 2005-04-20T23:38:39, christophe varoqui <christophe.varoqui@free.fr> wrote:
> >
> > > I can't reproduce that at OSDL :
> > >
> > > * no config file
> > > * IBM 3542 (tur/group_by_serial/no hwh/no feature)
> > > * mp-tools 0.4.4-pre17
> >
> > That means you're not using a priority callout, which is what the bug is
> > about, right?
> >
> > Anyway, I'll try -pre17 (or maybe -pre18 by then ;-) tomorrow.
> >
> oh, so you use the group_by_prio pgpolicy ?
> Indeed, the fallback prio to 1 is annoying in that scenario.
>
> I'll see how to propagate the error to prevent a map reloading in case
> of prio callout error.
>
> Regards,

That should do the trick :

* errors on prio callout are stored as a negative prio value
* multipath/main.c:coalesce_path() marks maps untouchable if they
  contain a path with negative prio

diff -urN multipath-tools-0.4.4-pre17/libmultipath/discovery.c multipath-tools-0.4.4-pre18/libmultipath/discovery.c
--- multipath-tools-0.4.4-pre17/libmultipath/discovery.c	2005-04-20 12:57:10.000000000 -0700
+++ multipath-tools-0.4.4-pre18/libmultipath/discovery.c	2005-04-20 15:35:29.676340096 -0700
@@ -481,10 +481,11 @@
 		select_getprio(pp);
 
 		if (apply_format(pp->getprio, &buff[0], pp)) {
-			pp->priority = 1;
+			condlog(0, "error formatting prio callout command");
+			pp->priority = -1;
 		} else if (execute_program(buff, prio, 16)) {
-			condlog(3, "error calling out %s", buff);
-			pp->priority = 1;
+			condlog(0, "error calling out %s", buff);
+			pp->priority = -1;
 		} else
 			pp->priority = atoi(prio);
 
@@ -498,9 +499,10 @@
 		select_getuid(pp);
 
 		if (apply_format(pp->getuid, &buff[0], pp)) {
+			condlog(0, "error formatting uid callout command");
 			memset(pp->wwid, 0, WWID_SIZE);
 		} else if (execute_program(buff, pp->wwid, WWID_SIZE)) {
-			condlog(3, "error calling out %s", buff);
+			condlog(0, "error calling out %s", buff);
 			memset(pp->wwid, 0, WWID_SIZE);
 		}
 		condlog(3, "uid = %s (callout)", pp->wwid);
diff -urN multipath-tools-0.4.4-pre17/multipath/main.c multipath-tools-0.4.4-pre18/multipath/main.c
--- multipath-tools-0.4.4-pre17/multipath/main.c	2005-04-20 13:36:44.000000000 -0700
+++ multipath-tools-0.4.4-pre18/multipath/main.c	2005-04-20 15:41:35.135781816 -0700
@@ -710,6 +710,9 @@
 		mpp->size = pp1->size;
 		mpp->paths = vector_alloc();
 
+		if (pp1->priority < 0)
+			mpp->action = ACT_NOTHING;
+
 		if (!mpp->paths)
 			return 1;
 
@@ -732,6 +735,9 @@
 					mpp->wwid);
 				mpp->action = ACT_NOTHING;
 			}
+			if (pp2->priority < 0)
+				mpp->action = ACT_NOTHING;
+
 			if (store_path(mpp->paths, pp2))
 				return 1;
 		}

--
christophe varoqui <christophe.varoqui@free.fr>
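
The idea behind this patch is a sentinel value: a failed prio callout yields a negative priority instead of the old fallback of 1, and any map containing such a path is left alone. A minimal sketch of that decision, assuming nothing about the real data structures: the enum, `prio_from_callout` and `decide_action` are all hypothetical names introduced for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the map actions used by multipath. */
enum sketch_action { SKETCH_ACT_RELOAD, SKETCH_ACT_NOTHING };

/*
 * Turn a callout result into a priority. A failed callout yields -1
 * rather than a plausible-looking default, so callers can tell
 * "unknown" apart from a real priority.
 */
int prio_from_callout(int callout_failed, const char *output)
{
	if (callout_failed)
		return -1;
	return atoi(output);
}

/*
 * Keep the wanted action only if every path has a known priority;
 * a single negative priority makes the whole map untouchable.
 */
enum sketch_action decide_action(const int *prios, size_t npaths,
				 enum sketch_action wanted)
{
	size_t i;

	for (i = 0; i < npaths; i++)
		if (prios[i] < 0)
			return SKETCH_ACT_NOTHING;
	return wanted;
}
```

One consequence of this design, visible in the sketch as well, is that a callout which legitimately printed a negative number would be indistinguishable from a failed one.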