* multipathd segfault and SCSI errors
@ 2008-10-24 6:11 Prakash Rudraraju
2008-10-24 6:37 ` Prakash Rudraraju
2008-10-24 13:30 ` Konrad Rzeszutek
0 siblings, 2 replies; 4+ messages in thread
From: Prakash Rudraraju @ 2008-10-24 6:11 UTC (permalink / raw)
To: dm-devel@redhat.com
Hi,
We have set up a Compellent SAN with 2 HBAs attached to dual fabrics. Under load, when we import a 60GB database, paths fail very often. The following is the failed-path behavior from syslog.
Oct 23 02:01:15 db03 kernel: sd 2:0:1:1: SCSI error: return code = 0x08000002
Oct 23 02:01:15 db03 kernel: sde: Current: sense key: Aborted Command
Oct 23 02:01:15 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:15 db03 kernel:
Oct 23 02:01:15 db03 kernel: end_request: I/O error, dev sde, sector 911585239
Oct 23 02:01:15 db03 kernel: device-mapper: multipath: Failing path 8:64.
Oct 23 02:01:15 db03 multipathd: 8:64: mark as failed
Oct 23 02:01:15 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:15 db03 kernel: sd 1:0:3:1: SCSI error: return code = 0x08000002
Oct 23 02:01:15 db03 kernel: sdc: Current: sense key: Aborted Command
Oct 23 02:01:15 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:15 db03 kernel:
Oct 23 02:01:15 db03 kernel: end_request: I/O error, dev sdc, sector 911585239
Oct 23 02:01:15 db03 kernel: device-mapper: multipath: Failing path 8:32.
Oct 23 02:01:16 db03 multipathd: 8:32: mark as failed
Oct 23 02:01:16 db03 multipathd: mpath1: remaining active paths: 0
Oct 23 02:01:19 db03 multipathd: sde: tur checker reports path is up
Oct 23 02:01:19 db03 multipathd: 8:64: reinstated
Oct 23 02:01:19 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:20 db03 multipathd: sdc: tur checker reports path is up
Oct 23 02:01:20 db03 multipathd: 8:32: reinstated
Oct 23 02:01:20 db03 multipathd: mpath1: remaining active paths: 2
Oct 23 02:01:21 db03 kernel: sd 2:0:1:1: SCSI error: return code = 0x08000002
Oct 23 02:01:21 db03 kernel: sde: Current: sense key: Aborted Command
Oct 23 02:01:21 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:21 db03 kernel:
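To put a number on the flapping in the syslog excerpt above, the multipathd lines can be tallied per path. The following is only a rough sketch: the function and pattern names (summarize, EVENT, REMAINING) are made up for illustration, and the sample text is a handful of multipathd lines copied from the excerpt, not the full log.

```python
import re
from collections import Counter

# A few multipathd lines copied from the syslog excerpt above.
SYSLOG = """\
Oct 23 02:01:15 db03 multipathd: 8:64: mark as failed
Oct 23 02:01:15 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:16 db03 multipathd: 8:32: mark as failed
Oct 23 02:01:16 db03 multipathd: mpath1: remaining active paths: 0
Oct 23 02:01:19 db03 multipathd: 8:64: reinstated
Oct 23 02:01:19 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:20 db03 multipathd: 8:32: reinstated
Oct 23 02:01:20 db03 multipathd: mpath1: remaining active paths: 2
"""

# Hypothetical patterns, written against the message wording seen above.
EVENT = re.compile(r"multipathd: (\d+:\d+): (mark as failed|reinstated)")
REMAINING = re.compile(r"multipathd: (\S+): remaining active paths: (\d+)")


def summarize(text):
    """Count fail/reinstate events per path and all-paths-down events per map."""
    events = Counter()
    all_paths_down = Counter()
    for line in text.splitlines():
        m = EVENT.search(line)
        if m:
            events[(m.group(1), m.group(2))] += 1
            continue
        m = REMAINING.search(line)
        if m and int(m.group(2)) == 0:
            all_paths_down[m.group(1)] += 1
    return events, all_paths_down


events, all_paths_down = summarize(SYSLOG)
for (path, kind), count in sorted(events.items()):
    print(path, kind, count)
print("maps that lost every path:", dict(all_paths_down))
```

Run against the whole syslog, this would show how often mpath1 drops to zero active paths, which is when I/O is queued because of no_path_retry queue.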
multipathd also segfaults during boot. The following is from the dmesg output:
multipathd[7165]: segfault at 000000000000000a rip 00002aaaaaf51a3d rsp 00007fff03b50090 error 4
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 912637903
device-mapper: multipath: Failing path 8:64.
sd 1:0:3:1: SCSI error: return code = 0x08000002
sdc: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sdc, sector 915472343
device-mapper: multipath: Failing path 8:32.
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 915472343
device-mapper: multipath: Failing path 8:64.
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 919728103
device-mapper: multipath: Failing path 8:64.
sd 1:0:3:1: SCSI error: return code = 0x08000002
sdc: Current: sense key: Aborted Command
Add. Sense: Internal target failure
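For reference, the "return code = 0x08000002" in these messages is the packed SCSI result word. A minimal sketch of splitting it into its four bytes follows, assuming the usual Linux layout (driver byte in bits 24-31, host byte in bits 16-23, message byte in bits 8-15, status byte in bits 0-7). The function name decode_scsi_result is made up for illustration.

```python
# Well-known values for the two non-zero bytes in 0x08000002.
DRIVER_SENSE = 0x08      # driver byte: sense data is available
CHECK_CONDITION = 0x02   # SAM status byte: CHECK CONDITION
DID_OK = 0x00            # host byte: no error reported by the host adapter


def decode_scsi_result(result):
    """Split a packed SCSI result word into its four bytes."""
    return {
        "driver_byte": (result >> 24) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "status_byte": result & 0xFF,
    }


decoded = decode_scsi_result(0x08000002)
print(decoded)
print("sense data present:", decoded["driver_byte"] == DRIVER_SENSE)
print("check condition:", decoded["status_byte"] == CHECK_CONDITION)
print("host adapter ok:", decoded["host_byte"] == DID_OK)
```

Read this way, the host byte is zero and the status is CHECK CONDITION with sense data, which is consistent with the "Aborted Command / Internal target failure" sense lines that follow each error. That suggests the error is being reported by the target rather than by the HBA or transport, though the logs alone do not prove it.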
We have experienced the same failures on both RHEL 5.1 and CentOS. The following is /etc/multipath.conf:
defaults {
        user_friendly_names yes
        path_grouping_policy multibus
}
devices {
        device {
                vendor "COMPELNT"
                product "Compellent Vol"
                path_checker tur
                polling_interval 10
                no_path_retry queue
        }
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^(hd|xvd)[a-z]*"
        wwid "*"
}
# Make sure our multipath devices are enabled.
blacklist_exceptions {
        wwid "36000d310000e63000000000000000007"
        wwid "36000d310000e6300000000000000000c"
}
# multipath -ll
mpath1 (36000d310000e6300000000000000000c) dm-5 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
\_ 1:0:3:1 sdc 8:32 [active][ready]
\_ 2:0:1:1 sde 8:64 [active][ready]
mpath0 (36000d310000e63000000000000000007) dm-0 COMPELNT,Compellent Vol
[size=50G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
\_ 1:0:3:0 sdb 8:16 [active][ready]
\_ 2:0:1:0 sdd 8:48 [active][ready]
Please let me know if you need more information. This is my first experience with SAN configuration, and I suspect I have missed something very obvious, because searching for these errors did not turn up any meaningful results.
Thanks,
Prakash.
* RE: multipathd segfault and SCSI errors
2008-10-24 6:11 multipathd segfault and SCSI errors Prakash Rudraraju
@ 2008-10-24 6:37 ` Prakash Rudraraju
2008-10-24 13:30 ` Konrad Rzeszutek
1 sibling, 0 replies; 4+ messages in thread
From: Prakash Rudraraju @ 2008-10-24 6:37 UTC (permalink / raw)
To: device-mapper development
Here is more information about the setup, from the multipath verbose output.
# multipath -v3
dm-0: blacklisted
dm-1: blacklisted
dm-2: blacklisted
dm-3: blacklisted
dm-4: blacklisted
dm-5: blacklisted
dm-6: blacklisted
md0: blacklisted
ram0: blacklisted
ram10: blacklisted
ram11: blacklisted
ram12: blacklisted
ram13: blacklisted
ram14: blacklisted
ram15: blacklisted
ram1: blacklisted
ram2: blacklisted
ram3: blacklisted
ram4: blacklisted
ram5: blacklisted
ram6: blacklisted
ram7: blacklisted
ram8: blacklisted
ram9: blacklisted
sda: not found in pathvec
sda: mask = 0x1f
sda: bus = 1
sda: dev_t = 8:0
sda: size = 142082048
sda: vendor = DELL
sda: product = PERC 6/i Adapter
sda: rev = 1.11
sda: h:b:t:l = 0:2:0:0
sda: serial = 0036a2870ae53c2410003469e390ec01
sda: path checker = readsector0 (config file default)
sda: state = 2
sda: getprio = NULL (internal default)
sda: prio = 1
sda: getuid = /sbin/scsi_id -g -u -s /block/%n (config file default)
sda: uid = 36001ec90e369340010243ce50a87a236 (callout)
sdb: not found in pathvec
sdb: mask = 0x1f
sdb: bus = 1
sdb: dev_t = 8:16
sdb: size = 104857600
sdb: vendor = COMPELNT
sdb: product = Compellent Vol
sdb: rev = 0401
sdb: h:b:t:l = 1:0:3:0
sdb: tgt_node_name = 0x5000d310000e6302
sdb: serial = 00000e63-00000007
sdb: path checker = tur (controller setting)
sdb: state = 2
sdb: getprio = NULL (internal default)
sdb: prio = 1
sdb: getuid = /sbin/scsi_id -g -u -s /block/%n (config file default)
sdb: uid = 36000d310000e63000000000000000007 (callout)
sdc: not found in pathvec
sdc: mask = 0x1f
sdc: bus = 1
sdc: dev_t = 8:32
sdc: size = 1048576000
sdc: vendor = COMPELNT
sdc: product = Compellent Vol
sdc: rev = 0401
sdc: h:b:t:l = 1:0:3:1
sdc: tgt_node_name = 0x5000d310000e6302
sdc: serial = 00000e63-0000000c
sdc: path checker = tur (controller setting)
sdc: state = 2
sdc: getprio = NULL (internal default)
sdc: prio = 1
sdc: getuid = /sbin/scsi_id -g -u -s /block/%n (config file default)
sdc: uid = 36000d310000e6300000000000000000c (callout)
sdd: not found in pathvec
sdd: mask = 0x1f
sdd: bus = 1
sdd: dev_t = 8:48
sdd: size = 104857600
sdd: vendor = COMPELNT
sdd: product = Compellent Vol
sdd: rev = 0401
sdd: h:b:t:l = 2:0:1:0
sdd: tgt_node_name = 0x5000d310000e6302
sdd: serial = 00000e63-00000007
sdd: path checker = tur (controller setting)
sdd: state = 2
sdd: getprio = NULL (internal default)
sdd: prio = 1
sdd: getuid = /sbin/scsi_id -g -u -s /block/%n (config file default)
sdd: uid = 36000d310000e63000000000000000007 (callout)
sde: not found in pathvec
sde: mask = 0x1f
sde: bus = 1
sde: dev_t = 8:64
sde: size = 1048576000
sde: vendor = COMPELNT
sde: product = Compellent Vol
sde: rev = 0401
sde: h:b:t:l = 2:0:1:1
sde: tgt_node_name = 0x5000d310000e6302
sde: serial = 00000e63-0000000c
sde: path checker = tur (controller setting)
sde: state = 2
sde: getprio = NULL (internal default)
sde: prio = 1
sde: getuid = /sbin/scsi_id -g -u -s /block/%n (config file default)
sde: uid = 36000d310000e6300000000000000000c (callout)
===== paths list =====
uuid hcil dev dev_t pri dm_st chk_st vend/pr
36001ec90e369340010243ce50a87a236 0:2:0:0 sda 8:0 1 [undef][ready] DELL,PE
36000d310000e63000000000000000007 1:0:3:0 sdb 8:16 1 [undef][ready] COMPELN
36000d310000e6300000000000000000c 1:0:3:1 sdc 8:32 1 [undef][ready] COMPELN
36000d310000e63000000000000000007 2:0:1:0 sdd 8:48 1 [undef][ready] COMPELN
36000d310000e6300000000000000000c 2:0:1:1 sde 8:64 1 [undef][ready] COMPELN
params = 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:32 1000 8:64 1000
status = 1 0 0 1 1 A 0 2 0 8:32 A 47 8:64 A 49
params = 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:16 1000 8:48 1000
status = 1 0 0 1 1 A 0 2 0 8:16 A 0 8:48 A 0
36001ec90e369340010243ce50a87a236: blacklisted
36000d310000e63000000000000000007: exception-listed
Found matching wwid [36000d310000e63000000000000000007] in bindings file.
Setting alias to mpath0
sdb: ownership set to mpath0
sdb: not found in pathvec
sdb: mask = 0xc
sdb: state = 2
sdb: prio = 1
sdd: ownership set to mpath0
sdd: not found in pathvec
sdd: mask = 0xc
sdd: state = 2
sdd: prio = 1
mpath0: pgfailover = -1 (internal default)
mpath0: pgpolicy = multibus (config file default)
mpath0: selector = round-robin 0 (internal default)
mpath0: features = 0 (internal default)
mpath0: hwhandler = 0 (internal default)
mpath0: rr_weight = 1 (internal default)
mpath0: minio = 1000 (config file default)
mpath0: no_path_retry = -2 (controller setting)
pg_timeout = NONE (internal default)
mpath0: set ACT_NOTHING (map unchanged)
36000d310000e6300000000000000000c: exception-listed
Found matching wwid [36000d310000e6300000000000000000c] in bindings file.
Setting alias to mpath1
sdc: ownership set to mpath1
sdc: not found in pathvec
sdc: mask = 0xc
sdc: state = 2
sdc: prio = 1
sde: ownership set to mpath1
sde: not found in pathvec
sde: mask = 0xc
sde: state = 2
sde: prio = 1
mpath1: pgfailover = -1 (internal default)
mpath1: pgpolicy = multibus (config file default)
mpath1: selector = round-robin 0 (internal default)
mpath1: features = 0 (internal default)
mpath1: hwhandler = 0 (internal default)
mpath1: rr_weight = 1 (internal default)
mpath1: minio = 1000 (config file default)
mpath1: no_path_retry = -2 (controller setting)
pg_timeout = NONE (internal default)
mpath1: set ACT_NOTHING (map unchanged)
36000d310000e63000000000000000007: exception-listed
36000d310000e6300000000000000000c: exception-listed
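The two "status =" lines in the output above carry a per-path failure counter. A small sketch for pulling it out follows, assuming the dm-multipath status layout in which each path is reported as "<major:minor> <A|F> <fail_count>". The names STATUS, PATH and fail_counts are made up for illustration, and the assignment of each status line to mpath1 or mpath0 is inferred from the device numbers in the preceding "params =" line.

```python
import re

# Status lines copied from the multipath -v3 output above.
STATUS = {
    "mpath1": "1 0 0 1 1 A 0 2 0 8:32 A 47 8:64 A 49",
    "mpath0": "1 0 0 1 1 A 0 2 0 8:16 A 0 8:48 A 0",
}

# One path entry: device number, state (A = active, F = failed), fail count.
PATH = re.compile(r"(\d+:\d+) ([AF]) (\d+)")


def fail_counts(status):
    """Return {device: (state, fail_count)} for every path in a status line."""
    return {dev: (state, int(count)) for dev, state, count in PATH.findall(status)}


for name, status in STATUS.items():
    print(name, fail_counts(status))
```

If that reading of the format is right, the two paths of the 500G volume (8:32 and 8:64) had already failed 47 and 49 times when this was captured, while the paths of the 50G volume had not failed at all.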
Thanks,
Prakash.
* Re: multipathd segfault and SCSI errors
2008-10-24 6:11 multipathd segfault and SCSI errors Prakash Rudraraju
2008-10-24 6:37 ` Prakash Rudraraju
@ 2008-10-24 13:30 ` Konrad Rzeszutek
2008-10-24 15:39 ` Prakash Rudraraju
1 sibling, 1 reply; 4+ messages in thread
From: Konrad Rzeszutek @ 2008-10-24 13:30 UTC (permalink / raw)
To: device-mapper development
> multipathd[7165]: segfault at 000000000000000a rip 00002aaaaaf51a3d rsp 00007fff03b50090 error 4
Can you run multipathd like so:
multipathd -v9 -d
And provide the last 200 lines of output leading up to the segfault? You might need to
edit /etc/init.d/multipathd to make this work, and pipe the output to a log file or similar.
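While waiting for that output, the segfault line quoted above can be picked apart. This is only a sketch: it assumes the x86 page-fault error code layout (bit 0 = protection violation rather than a not-present page, bit 1 = write access, bit 2 = fault taken in user mode) and that the kernel prints the error code in hex. The names PATTERN and decode_segfault are made up for illustration.

```python
import re

LINE = ("multipathd[7165]: segfault at 000000000000000a "
        "rip 00002aaaaaf51a3d rsp 00007fff03b50090 error 4")

PATTERN = re.compile(
    r"(\S+)\[(\d+)\]: segfault at ([0-9a-f]+) "
    r"rip ([0-9a-f]+) rsp ([0-9a-f]+) error ([0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a 2.6-era x86_64 segfault message into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    name, pid, addr, rip, rsp, err = m.groups()
    err = int(err, 16)  # assumption: error code is printed in hex
    return {
        "process": name,
        "pid": int(pid),
        "address": int(addr, 16),
        "rip": int(rip, 16),
        "rsp": int(rsp, 16),
        "page_present": bool(err & 1),  # 0 means the page was not mapped
        "write": bool(err & 2),         # 0 means the access was a read
        "user_mode": bool(err & 4),     # 1 means the fault came from user space
    }


info = decode_segfault(LINE)
print(info)
```

With those assumptions, "error 4" is a user-space read of an unmapped page, and the faulting address 0xa is only a few bytes above zero, which would fit a read through a NULL pointer at a small offset. The -v9 output should show which device was being processed when that happened.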
* RE: multipathd segfault and SCSI errors
2008-10-24 13:30 ` Konrad Rzeszutek
@ 2008-10-24 15:39 ` Prakash Rudraraju
0 siblings, 0 replies; 4+ messages in thread
From: Prakash Rudraraju @ 2008-10-24 15:39 UTC (permalink / raw)
To: device-mapper development
Excluding the local disk /dev/sda by adding it to the blacklist fixed the segmentation fault during boot.
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^(hd|xvd)[a-z]*"
        devnode "sda"
        wwid "*"
}
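As a quick check of what the devnode patterns in this blacklist cover, they can be tried against a few device names. This is a sketch only: Python's re module stands in for the POSIX regular expressions that multipath uses, on the assumption that the patterns are matched unanchored unless they contain ^ or $. The names ORIGINAL, UPDATED and blacklisted are made up for illustration.

```python
import re

# devnode patterns from the original blacklist and from the updated one.
ORIGINAL = [
    r"^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*",
    r"^(hd|xvd)[a-z]*",
]
UPDATED = ORIGINAL + [r"sda"]


def blacklisted(devnode, patterns):
    """True if any pattern matches anywhere in the device node name."""
    return any(re.search(p, devnode) for p in patterns)


for dev in ["sda", "sdb", "sdc", "sdd", "sde", "sdaa", "ram0", "dm-5"]:
    print(dev, blacklisted(dev, ORIGINAL), blacklisted(dev, UPDATED))
```

Under that assumption the original devnode patterns never matched sda, and the new "sda" entry does. Because it is unanchored it would also match names such as sdaa on a host with many LUNs; a pattern like "^sda$" would be tighter if that ever becomes a concern.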
I am currently importing the database to check for path failures under load. The import takes about 12 hours to complete, but we have had no path failures so far.
I will collect the "multipathd -v9 -d" output after the import is complete; for now the daemon is already running:
# multipathd -v9 -d
Oct 24 08:38:39 | --------start up--------
Oct 24 08:38:39 | read /etc/multipath.conf
Oct 24 08:38:39 | process is already running
Thanks,
Prakash.