* multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-26 2:07 UTC
To: dm-devel
Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
and device-mapper-multipath-0.4.7-17.el5. I have a custom
mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
out the path from /dev/disk/by-path and then echoes a priority based upon
a lookup table. It works perfectly fine from the command line.
multipath -ll shows the priorities assigned perfectly and exactly the
right paths are active.
However, when I start multipathd, it all goes down the tubes. The paths
disappear and /var/log/messages is filled with:
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdd
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sde
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdf
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdg
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdd
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sde
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdf
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdg
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdk
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdl
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdm
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdn
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdo
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdp
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdq
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdr
Feb 25 20:50:18 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdk
Feb 25 20:50:18 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdl
Feb 25 20:50:18 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdm
The callout does exist, is executable, and works with multipath. Based
upon some email threads, I tried moving it to / and /sbin. No change.
When multipathd starts I get:
Feb 25 20:50:17 vd01 multipathd: cannot open /sbin/dasd_id : No such file or directory
Feb 25 20:50:17 vd01 multipathd: [copy.c] cannot open /sbin/dasd_id
Feb 25 20:50:17 vd01 multipathd: cannot copy /sbin/dasd_id in ramfs : No such file or directory
Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sda
and when it stops I get:
Feb 25 20:43:17 vd01 multipathd: error calling out /usr/local/bin/mpath_prio_ssi sdj
Feb 25 20:43:17 vd01 multipathd: isdb: stop event checker thread
Feb 25 20:43:17 vd01 multipathd: isdc: stop event checker thread
Feb 25 20:43:17 vd01 multipathd: isda: stop event checker thread
Feb 25 20:43:17 vd01 multipathd: isdd: stop event checker thread
Feb 25 20:43:17 vd01 kernel: multipathd[28942]: segfault at a ip 0000003e8a26fa3d sp 00007fff47e946d0 error 4 in libc-2.5.so[3e8a200000+14a000]
Here are the uncommented portions of my /etc/multipath.conf:
blacklist {
    # devnode "*"
    # sdb
    wwid SATA_ST3250310NS_9SF0L234
    #sda
    wwid SATA_ST3250310NS_9SF0LVSR
}
defaults {
    failback 5
    path_grouping_policy failover
    path_checker tur
    prio_callout "/usr/local/sbin/mpath_prio_ssi %n"
}
multipaths {
    multipath {
        wwid 3600144f049a4acec00003048c6912c00
        alias isda
    }
    multipath {
        wwid 3600144f049a4acf500003048c6912c00
        alias isdb
    }
    multipath {
        wwid 3600144f049a4ace600003048c6912c00
        alias isdc
    }
    multipath {
        wwid 3600144f049a448a700003048c6912c00
        alias isdd
    }
}
devices {
    device {
        vendor "SUN"
        product "SOLARIS"
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        features "0"
        hardware_handler "0"
        path_grouping_policy failover
        rr_weight uniform
        rr_min_io 1000
        path_checker readsector0
    }
}
Here is the callout script:
#!/bin/bash
# if not passed any device name, return a priority of 0
if [ -z "${1}" ];then
    echo 0
    exit
fi
ENTRY="$(ls -l1 /dev/disk/by-path | grep /${1}'$')"
DEV="${ENTRY##* ip-}"
if [ "$DEV" = "${ENTRY}" ];then #This is not an iSCSI device
    echo 0
    exit
fi
DEV="${DEV%% ->*}"
echo $(grep ${DEV} /usr/local/etc/iscsi.list | cut -f2)
and here is iscsi.list:
172.x.x.158:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 5
172.x.x.158:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 99
172.x.x.158:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 5
172.30.13.158:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 24
172.x.x.174:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 1
172.x.x.174:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 2
172.x.x.174:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 1
172.x.x.174:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 49
172.x.x.190:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 0
172.x.x.190:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 49
172.x.x.190:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 0
172.x.x.190:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 2
172.x.x.206:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 11
172.x.x.206:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 24
172.x.x.206:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 11
172.x.x.206:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 99
172.x.x.30:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 24
172.x.x.30:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 11
172.x.x.30:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 99
172.x.x.30:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 5
172.x.x.46:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 49
172.x.x.46:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 0
172.x.x.46:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 2
172.x.x.46:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 1
172.x.x.62:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 2
172.x.x.62:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 1
172.x.x.62:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 49
172.x.x.62:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 0
172.x.x.78:3260-iscsi-iqn.1986-03.com.sun:02:04e2454b-e60c-cfaf-d8b9-923b9fed0020 99
172.x.x.78:3260-iscsi-iqn.1986-03.com.sun:02:8af2554f-13f6-c0e1-8343-c457e4802cb9 5
172.x.x.78:3260-iscsi-iqn.1986-03.com.sun:02:b8a8e5cd-4501-c8f7-be51-ad7fcb31f782 24
172.x.x.78:3260-iscsi-iqn.1986-03.com.sun:02:c3420146-eecc-c364-dc6e-a3914db5a639 11
What have I misconfigured? How do I make this work properly? Thanks -
John
--
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan@opensourcedevel.com
http://www.spiritualoutreach.com
Making Christianity intelligible to secular society
* Re: multipathd segfault and error calling out
From: Konrad Rzeszutek @ 2009-02-26 3:04 UTC
To: device-mapper development
On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> and device-mapper-multipath-0.4.7-17.el5. I have a custom
> mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> out the path from /dev/disk/by-path and then echoes a priority based upon
> a lookup table. It works perfectly fine from the command line.
> multipath -ll shows the priorities assigned perfectly and exactly the
> right paths are active.
>
> However, when I start multipathd, it all goes down the tubes. The paths
> disappear and /var/log/messages is filled with:
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh

Keep in mind that the environment you have when multipathd calls is quite
limited. I believe there is no PATH set, nor any other "normal" values.

Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
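A minimal sketch of the advice above, assuming the callout is a shell script that depends on external tools. It checks up front that each tool is named by an absolute path and is executable, so a missing dependency is reported instead of failing silently. The function name require_tools is hypothetical and not part of multipath-tools.

```shell
# Hypothetical helper: succeed only if every argument is an absolute path
# to an executable file. Nothing here relies on PATH being set.
require_tools() {
    for tool in "$@"; do
        case $tool in
            /*)
                if [ ! -x "$tool" ]; then
                    printf 'missing tool: %s\n' "$tool" >&2
                    return 1
                fi
                ;;
            *)
                printf 'not an absolute path: %s\n' "$tool" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

A callout could start with something like: require_tools /bin/ls /bin/grep /bin/cut || { echo 0; exit 0; }. The fallback priority of 0 is an assumption borrowed from the non-iSCSI branch of the script in this thread.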
* Re: multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-26 3:23 UTC
To: device-mapper development
On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> > and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> > out the path from /dev/disk/by-path and then echoes a priority based upon
> > a lookup table. It works perfectly fine from the command line.
> > multipath -ll shows the priorities assigned perfectly and exactly the
> > right paths are active.
> >
> > However, when I start multipathd, it all goes down the tubes. The paths
> > disappear and /var/log/messages is filled with:
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
>
> Keep in mind that the environment you have when multipathd calls is quite
> limited. I believe there is no PATH set, nor any other "normal" values.
>
> Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
<snip>

Thank you. I was enthusiastic that might have been the problem, but
alas not. Even with absolute pathnames and setting the PATH variable, it
still gives the same error. In fact, I should have mentioned, I created
a bogus file with the same pathname which did nothing but "echo hello"
and it gave the same "error calling out" error. What next? - John
* Re: multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-26 5:14 UTC
To: device-mapper development
On Wed, 2009-02-25 at 22:23 -0500, John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> > On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> > > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> > > and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > > mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> > > out the path from /dev/disk/by-path and then echoes a priority based upon
> > > a lookup table. It works perfectly fine from the command line.
> > > multipath -ll shows the priorities assigned perfectly and exactly the
> > > right paths are active.
> > >
> > > However, when I start multipathd, it all goes down the tubes. The paths
> > > disappear and /var/log/messages is filled with:
> > > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> >
> > Keep in mind that the environment you have when multipathd calls is quite
> > limited. I believe there is no PATH set, nor any other "normal" values.
> >
> > Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
> <snip>
> Thank you. I was enthusiastic that might have been the problem, but
> alas not. Even with absolute pathnames and setting the PATH variable, it
> still gives the same error. In fact, I should have mentioned, I created
> a bogus file with the same pathname which did nothing but "echo hello"
> and it gave the same "error calling out" error. What next? - John

This is increasingly bizarre. I did an strace on the multipath command
and on the multipathd command.

Here is a portion of the strace for multipath:
close(1) = 0
dup(6) = 1
execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = 0
brk(0) = 0x8c3000

Here is the same call from multipathd:
close(1) = 0
dup(7) = 1
execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = -1 ENOENT (No such file or directory)
exit_group(-1) = ?

Is it my imagination, or is it exactly the same call, with one finding
the file and the other not? What could cause this? It is an explicit
pathname and the file exists??!! Thanks - John
* Re: multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-26 5:30 UTC
To: device-mapper development
On Thu, 2009-02-26 at 00:14 -0500, John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:23 -0500, John A. Sullivan III wrote:
> > On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> > > On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> > > > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> > > > and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > > > mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> > > > out the path from /dev/disk/by-path and then echoes a priority based upon
> > > > a lookup table. It works perfectly fine from the command line.
> > > > multipath -ll shows the priorities assigned perfectly and exactly the
> > > > right paths are active.
> > > >
> > > > However, when I start multipathd, it all goes down the tubes. The paths
> > > > disappear and /var/log/messages is filled with:
> > > > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> > >
> > > Keep in mind that the environment you have when multipathd calls is quite
> > > limited. I believe there is no PATH set, nor any other "normal" values.
> > >
> > > Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
> > <snip>
> > Thank you. I was enthusiastic that might have been the problem, but
> > alas not. Even with absolute pathnames and setting the PATH variable, it
> > still gives the same error. In fact, I should have mentioned, I created
> > a bogus file with the same pathname which did nothing but "echo hello"
> > and it gave the same "error calling out" error. What next? - John
> This is increasingly bizarre. I did an strace on the multipath command
> and on the multipathd command.
>
> Here is a portion of the strace for multipath:
> close(1) = 0
> dup(6) = 1
> execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = 0
> brk(0) = 0x8c3000
>
> Here is the same call from multipathd:
> close(1) = 0
> dup(7) = 1
> execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = -1 ENOENT (No such file or directory)
> exit_group(-1) = ?
>
> Is it my imagination, or is it exactly the same call, with one finding
> the file and the other not? What could cause this? It is an explicit
> pathname and the file exists??!! Thanks - John

I should also mention that the trace shows there is no problem for
multipathd to open the file. Two threads before the failure, we see
this in the strace:
stat("/var/cache/multipathd", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
open("/usr/local/sbin/mpath_prio_ssi", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=368, ...}) = 0
close(4) = 0

So the problem appears to be explicitly with the execve call. How does
one fix this? Thanks - John
* Re: multipathd segfault and error calling out
From: Hannes Reinecke @ 2009-02-26 7:16 UTC
To: device-mapper development
John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
>> On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
>>> Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
>>> and device-mapper-multipath-0.4.7-17.el5. I have a custom
>>> mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
>>> out the path from /dev/disk/by-path and then echoes a priority based upon
>>> a lookup table. It works perfectly fine from the command line.
>>> multipath -ll shows the priorities assigned perfectly and exactly the
>>> right paths are active.
>>>
>>> However, when I start multipathd, it all goes down the tubes. The paths
>>> disappear and /var/log/messages is filled with:
>>> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
>> Keep in mind that the environment you have when multipathd calls is quite
>> limited. I believe there is no PATH set, nor any other "normal" values.
>>
>> Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
> <snip>
> Thank you. I was enthusiastic that might have been the problem, but
> alas not. Even with absolute pathnames and setting the PATH variable, it
> still gives the same error. In fact, I should have mentioned, I created
> a bogus file with the same pathname which did nothing but "echo hello"
> and it gave the same "error calling out" error. What next? - John

Return an explicit exit code. It might be that e.g. 'cut' returns a
non-zero value, which would then be interpreted as a failure.

Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
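A minimal sketch of the suggestion above, assuming (as Hannes does) that multipathd treats a non-zero exit status from the callout as a failure. The function name emit_priority is hypothetical, and falling back to priority 0 when the lookup yields nothing numeric is an assumption that mirrors the script's own non-iSCSI branch.

```shell
# Hypothetical wrapper: print a numeric priority and always return 0,
# regardless of what the commands that produced the value returned.
emit_priority() {
    prio=$1
    case $prio in
        ''|*[!0-9]*) prio=0 ;;   # empty or non-numeric lookup result
    esac
    printf '%s\n' "$prio"
    return 0
}
```

The last line of a callout could then read emit_priority "$VALUE" followed by exit 0, where VALUE stands for whatever the lookup produced.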
* Re: multipathd segfault and error calling out
From: Bryn M. Reeves @ 2009-02-26 9:40 UTC
To: device-mapper development
John A. Sullivan III wrote:
> Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with
> VServer and device-mapper-multipath-0.4.7-17.el5. I have a custom
> mpath_prio_ssi script which takes the device name (e.g., sdaa),
> pulls out the path from /dev/disk/by-path and then echoes a priority
> based upon a lookup table. It works perfectly fine from the
> command line. multipath -ll shows the priorities assigned perfectly
> and exactly the right paths are active.
>
> However, when I start multipathd, it all goes down the tubes. The
> paths disappear and /var/log/messages is filled with:
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc

I think you'll need to modify the multipathd binary to achieve this.

To avoid deadlocking when file system access is interrupted due to
path failures, multipathd forks into a new namespace and discards all
the device-backed file systems that are mounted.

It creates an in-memory file system (ramfs) and copies all the
binaries it will need into this. The file system is locked into memory
so that multipathd can continue to function even if the paths backing
the root file system have all failed.

For the callouts themselves (getuid and getprio binaries) the config
file processing takes care of this, but this only works for stand-alone
binaries. If your script has other dependencies then you'll have to
add code to pull those into the ramfs volume.

See libmultipath/config.c:push_callout(),
libmultipath/config.c:store_hwe(),
multipathd/main.c:prepare_namespace() and other code that manipulates
the list of binaries stored in conf->binvec.

Regards,
Bryn.
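The "stand-alone binaries" point can be illustrated with a small sketch. Per the execve(2) manual page, ENOENT is returned not only when the file itself is missing but also when a script's "#!" interpreter does not exist. One plausible reading of the strace output in this thread (an assumption, not something the posters confirmed) is that the script was copied into the ramfs while /bin/bash was not. The function name and the bogus interpreter path below are illustrative only.

```shell
# Create an executable script whose "#!" interpreter does not exist, then
# try to run it. The file is present and executable, yet the exec fails.
demo_missing_interpreter() {
    demo_dir=$(mktemp -d) || return 2
    demo_script=$demo_dir/callout
    printf '#!/nonexistent/interpreter\necho 5\n' > "$demo_script"
    chmod 755 "$demo_script"
    if "$demo_script" >/dev/null 2>&1; then
        demo_result="ran"
    else
        demo_result="exec failed although the file exists"
    fi
    rm -rf "$demo_dir"
    printf '%s\n' "$demo_result"
}
```

Running the function under strace -f -e trace=execve should show the failing execve call, in the same way as the multipathd trace quoted earlier in the thread.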
* Re: multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-26 13:33 UTC
To: bmr, device-mapper development
On Thu, 2009-02-26 at 09:40 +0000, Bryn M. Reeves wrote:
> John A. Sullivan III wrote:
> > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with
> > VServer and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > mpath_prio_ssi script which takes the device name (e.g., sdaa),
> > pulls out the path from /dev/disk/by-path and then echoes a priority
> > based upon a lookup table. It works perfectly fine from the
> > command line. multipath -ll shows the priorities assigned perfectly
> > and exactly the right paths are active.
> >
> > However, when I start multipathd, it all goes down the tubes. The
> > paths disappear and /var/log/messages is filled with:
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc
>
> I think you'll need to modify the multipathd binary to achieve this.
>
> To avoid deadlocking when file system access is interrupted due to
> path failures, multipathd forks into a new namespace and discards all
> the device-backed file systems that are mounted.
>
> It creates an in-memory file system (ramfs) and copies all the
> binaries it will need into this. The file system is locked into memory
> so that multipathd can continue to function even if the paths backing
> the root file system have all failed.
>
> For the callouts themselves (getuid and getprio binaries) the config
> file processing takes care of this, but this only works for stand-alone
> binaries. If your script has other dependencies then you'll have to
> add code to pull those into the ramfs volume.
>
> See libmultipath/config.c:push_callout(),
> libmultipath/config.c:store_hwe(),
> multipathd/main.c:prepare_namespace() and other code that manipulates
> the list of binaries stored in conf->binvec.
>
> Regards,
> Bryn.
<snip>

Thank you very much, Bryn. That finally makes sense of it all.
Unfortunately, I am not a developer at all and hence approach this more
as a systems designer. If I understand you correctly, the best approach
would be to create my script as a compiled binary rather than a bash
script. Then the config file processing will load it into memory. Is
that correct?

Does that also imply that the file referenced as the list of iSCSI IDs
and priorities needs to be embedded in the binary? Is that a non-issue
if I am not using multipathing for the devices containing the referenced
script?

As my skills are limited for converting this from bash to C, could I
achieve the same thing by calling bash rather than the script and
passing the script as an argument, e.g.,
prio_callout "/bin/bash /usr/local/sbin/mpath_prio_ssi %n"
Thanks again - John
* Re: multipathd segfault and error calling out
From: John A. Sullivan III @ 2009-02-27 3:06 UTC
To: bmr, device-mapper development
On Thu, 2009-02-26 at 09:40 +0000, Bryn M. Reeves wrote:
> John A. Sullivan III wrote:
> > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with
> > VServer and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > mpath_prio_ssi script which takes the device name (e.g., sdaa),
> > pulls out the path from /dev/disk/by-path and then echoes a priority
> > based upon a lookup table. It works perfectly fine from the
> > command line. multipath -ll shows the priorities assigned perfectly
> > and exactly the right paths are active.
> >
> > However, when I start multipathd, it all goes down the tubes. The
> > paths disappear and /var/log/messages is filled with:
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
> > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc
>
> I think you'll need to modify the multipathd binary to achieve this.
>
> To avoid deadlocking when file system access is interrupted due to
> path failures, multipathd forks into a new namespace and discards all
> the device-backed file systems that are mounted.
>
> It creates an in-memory file system (ramfs) and copies all the
> binaries it will need into this. The file system is locked into memory
> so that multipathd can continue to function even if the paths backing
> the root file system have all failed.
>
> For the callouts themselves (getuid and getprio binaries) the config
> file processing takes care of this, but this only works for stand-alone
> binaries. If your script has other dependencies then you'll have to
> add code to pull those into the ramfs volume.
>
> See libmultipath/config.c:push_callout(),
> libmultipath/config.c:store_hwe(),
> multipathd/main.c:prepare_namespace() and other code that manipulates
> the list of binaries stored in conf->binvec.
<snip>

You were exactly right (of course!). I changed prio_callout from
directly calling a bash script to /bin/bash scriptname %n and that
eliminated the callout errors. However, as expected, the internal calls
to /bin/ls, /bin/grep, etc. all failed. I then rewrote the script to use
nothing but bash internals (it took a little doing, such as getting the
path list from /dev/disk/by-path, but it seems to work).

In our initial testing of simply pulling the network cable (no live
data transfer yet), multipathd fails the devices and fails them back on
recovery but, after recovery, all the paths are shown as enabled - none
are active. We hope to start live data testing tomorrow. Thanks again -
John
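The rewritten script was not posted, so what follows is only a sketch of how a lookup using nothing but shell builtins might look; it is not John's actual script. It assumes the same /usr/local/etc/iscsi.list layout shown earlier (a key, whitespace, then a priority), and it matches a list key as a prefix of the by-path link name. The function name lookup_priority and its optional directory arguments are hypothetical, added so the function can be tried against a scratch directory.

```shell
# Sketch: print the priority for a device using only shell builtins.
# Arguments: device name, by-path directory, device directory, list file.
lookup_priority() {
    dev=$1
    by_path=${2:-/dev/disk/by-path}
    dev_dir=${3:-/dev}
    list=${4:-/usr/local/etc/iscsi.list}

    # No device name, or no readable list: priority 0.
    if [ -z "$dev" ] || [ ! -r "$list" ]; then
        echo 0
        return 0
    fi

    for link in "$by_path"/ip-*; do
        # -ef is true when both names resolve to the same file, which
        # replaces the "ls -l | grep" step without any external command.
        [ "$link" -ef "$dev_dir/$dev" ] || continue
        name=${link##*/ip-}
        while read -r key prio; do
            [ -n "$key" ] || continue
            case $name in
                "$key"*) echo "${prio:-0}"; return 0 ;;
            esac
        done < "$list"
    done

    echo 0      # not an iSCSI path, or no matching entry in the list
    return 0
}
```

With the config change described above, the script body would reduce to something like lookup_priority "$1". Whether /dev/disk/by-path and the list file are visible from inside multipathd's private namespace depends on the setup and is not something this sketch can guarantee.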