* segfault the use xen and multipath devices
@ 2011-11-10 9:52 Vasiliy Tolstov
  2011-11-10 19:07 ` Bart Van Assche
  0 siblings, 1 reply; 6+ messages in thread
From: Vasiliy Tolstov @ 2011-11-10 9:52 UTC (permalink / raw)
To: linux-scsi

Hello.

I'm using xen-2.6.32.46 from Jeremy's Linux tree (http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git). dm-3, dm-4 and dm-5 are multipath devices; the devices underneath them are connected via SRP.

After I manually shut down the SRP devices, SRP notices and tries to reconnect. After the unsuccessful reconnect it says:

scsi host4: ib_srp: reconnect failed (-22), removing target port.

Messages like "I/O error, dev dm-5, sector 0" show that multipath tries to write to the shut-down device.

After that SCSI says:

scsi 4:0:0:0: Device offlined - not ready after error recovery

and in dmesg I see a segfault:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038

After that SCSI offlines the second target:

scsi 5:0:0:1: Device offlined - not ready after error recovery

and then I see:

invalid opcode: 0000 [#2] SMP

Do these messages indicate that the kernel SCSI subsystem is not working correctly, or do they point to a Xen-related problem?

Thanks for any suggestions.

dmesg log:
[239828.774879] device-mapper: multipath: Failing path 8:160.
[239829.810871] device-mapper: multipath: Failing path 8:176.
[239831.189426] scsi host4: ib_srp: failed send status 12 [239831.189489] scsi host4: ib_srp: failed send status 5 [239831.189542] scsi host5: ib_srp: failed send status 12 [239831.189596] scsi host4: ib_srp: failed send status 5 [239831.189648] scsi host5: ib_srp: failed send status 5 [239831.189704] scsi host4: ib_srp: failed send status 5 [239831.189759] scsi host5: ib_srp: failed send status 5 [239832.485855] scsi host4: SRP abort called [239832.485912] scsi host4: SRP abort called [239832.485960] scsi host4: SRP abort called [239832.486008] scsi host4: SRP abort called [239832.486063] scsi host4: SRP reset_device called [239832.486114] scsi host4: SRP reset_device called [239832.486164] scsi host4: SRP reset_device called [239832.486229] scsi host4: ib_srp: SRP reset_host called state 0 qp_err 1 [239832.846957] device-mapper: multipath: Failing path 8:96. [239832.850767] device-mapper: multipath: Failing path 8:112. [239832.853760] end_request: I/O error, dev dm-4, sector 0 [239832.853822] Buffer I/O error on device dm-4, logical block 0 [239832.853889] end_request: I/O error, dev dm-4, sector 4 [239832.853944] Buffer I/O error on device dm-4, logical block 1 [239832.854008] Buffer I/O error on device dm-4, logical block 2 [239832.854068] Buffer I/O error on device dm-4, logical block 3 [239832.854072] end_request: I/O error, dev dm-4, sector 0 [239832.854076] Buffer I/O error on device dm-4, logical block 0 [239832.854081] Buffer I/O error on device dm-4, logical block 1 [239832.854186] end_request: I/O error, dev dm-4, sector 3515228760 [239832.854190] Buffer I/O error on device dm-4, logical block 878807190 [239832.854243] end_request: I/O error, dev dm-4, sector 3515228760 [239832.854246] Buffer I/O error on device dm-4, logical block 878807190 [239832.854315] end_request: I/O error, dev dm-4, sector 0 [239832.854319] Buffer I/O error on device dm-4, logical block 0 [239832.854327] Buffer I/O error on device dm-4, logical block 1 [239832.854486] 
end_request: I/O error, dev dm-4, sector 0 [239832.854728] device-mapper: multipath: Failing path 8:128. [239832.854852] end_request: I/O error, dev dm-4, sector 24 [239832.858110] end_request: I/O error, dev dm-5, sector 0 [239832.858176] end_request: I/O error, dev dm-5, sector 4 [239832.858305] end_request: I/O error, dev dm-5, sector 0 [239832.858466] end_request: I/O error, dev dm-5, sector 3515228760 [239832.858573] end_request: I/O error, dev dm-5, sector 3515228760 [239832.858711] end_request: I/O error, dev dm-5, sector 0 [239832.858886] end_request: I/O error, dev dm-5, sector 0 [239832.859003] end_request: I/O error, dev dm-5, sector 24 [239832.903778] end_request: I/O error, dev dm-4, sector 0 [239832.903909] end_request: I/O error, dev dm-4, sector 3515228760 [239832.904027] end_request: I/O error, dev dm-4, sector 0 [239832.904173] end_request: I/O error, dev dm-4, sector 0 [239832.904289] end_request: I/O error, dev dm-4, sector 24 [239832.907028] end_request: I/O error, dev dm-5, sector 0 [239832.907165] end_request: I/O error, dev dm-5, sector 3515228760 [239832.907284] end_request: I/O error, dev dm-5, sector 0 [239832.907418] end_request: I/O error, dev dm-5, sector 0 [239832.907535] end_request: I/O error, dev dm-5, sector 24 [239834.401687] scsi host5: SRP abort called [239834.401745] scsi host5: SRP abort called [239834.401793] scsi host5: SRP abort called [239834.401847] scsi host5: SRP reset_device called [239834.401897] scsi host5: SRP reset_device called [239834.401948] scsi host5: SRP reset_device called [239834.402000] scsi host5: ib_srp: SRP reset_host called state 0 qp_err 1 [239834.870961] device-mapper: multipath: Failing path 8:144. 
[239852.489918] scsi host4: SRP abort called [239852.489974] scsi host4: SRP reset_device called [239852.490026] scsi host4: ib_srp: SRP reset_host called state 0 qp_err 1 [239854.405903] scsi host5: SRP abort called [239854.405961] scsi host5: SRP reset_device called [239854.406013] scsi host5: ib_srp: SRP reset_host called state 0 qp_err 1 [239861.499864] scsi host4: ib_srp: Got failed path rec status -22 [239861.499980] scsi host4: ib_srp: Path record query failed [239861.500037] scsi host4: ib_srp: reconnect failed (-22), removing target port. [239861.501259] scsi host5: ib_srp: Got failed path rec status -22 [239861.501324] scsi host5: ib_srp: Path record query failed [239861.501380] scsi host5: ib_srp: reconnect failed (-22), removing target port. [239862.494015] scsi 4:0:0:0: Device offlined - not ready after error recovery [239862.494111] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 [239862.494267] IP: [<ffffffff811781d0>] elv_requeue_request+0x42/0x6d [239862.494334] PGD 0 [239862.494380] Oops: 0000 [#1] SMP [239862.494435] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host5/target5:0:0/5:0:0:2/block/sdl/uevent [239862.494535] CPU 7 [239862.494580] Modules linked in: dm_ioband raid1 dm_round_robin sd_mod crc_t10dif nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib ib_mad ib_core md_mod ata_generic evdev snd_pcm ata_piix snd_timer snd tpm_tis soundcore libata tpm snd_page_alloc tpm_bios scsi_mod pcspkr serio_raw button processor acpi_processor squashfs loop aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore nls_base mlx4_core igb dca thermal thermal_sys [239862.495433] Pid: 3957, comm: scsi_eh_4 Tainted: G C 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6 [239862.495527] RIP: 
e030:[<ffffffff811781d0>] [<ffffffff811781d0>] elv_requeue_request+0x42/0x6d [239862.495623] RSP: e02b:ffff88010726bd30 EFLAGS: 00010002 [239862.495677] RAX: 0000000000000000 RBX: ffff8800b3f818f0 RCX: ffff8800b3f81a10 [239862.495763] RDX: ffff8800b3f81a10 RSI: ffff8800b3f818f0 RDI: ffff880107232340 [239862.495888] RBP: ffff880107232340 R08: ffff880107232340 R09: 0000000000000000 [239862.495975] R10: 0000160000000000 R11: ffff88010726bea0 R12: ffff880107232340 [239862.496062] R13: ffff8801049ae200 R14: ffff88010726c000 R15: ffff88010723f828 [239862.496153] FS: 00007fb7c1e357a0(0000) GS:ffff880028122000(0000) knlGS:0000000000000000 [239862.496243] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [239862.496297] CR2: 0000000000000038 CR3: 000000011395e000 CR4: 0000000000002660 [239862.496384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [239862.496471] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [239862.496558] Process scsi_eh_4 (pid: 3957, threadinfo ffff88010726a000, task ffff880107260700) [239862.496648] Stack: [239862.496689] 0000000000000200 0000000000000200 0000000000001057 ffffffffa010ecc0 [239862.496760] <0> 0000000007232340 0000000000000000 ffff8800b3f818f0 ffff8801049ae200 [239862.496867] <0> 0000000000000000 ffff8800b3f818f0 0000000000080000 ffffffffa010f0d1 [239862.497008] Call Trace: [239862.497061] [<ffffffffa010ecc0>] ? __scsi_queue_insert+0xbe/0xe4 [scsi_mod] [239862.497126] [<ffffffffa010f0d1>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod] [239862.497219] [<ffffffffa010bbac>] ? scsi_eh_flush_done_q+0xe3/0x104 [scsi_mod] [239862.497311] [<ffffffffa010cdbf>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod] [239862.497404] [<ffffffffa010c9f4>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod] [239862.497466] [<ffffffff81065c71>] ? kthread+0x79/0x81 [239862.497522] [<ffffffff81012baa>] ? child_rip+0xa/0x20 [239862.497577] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b [239862.497634] [<ffffffff8101251d>] ? 
retint_restore_args+0x5/0x6 [239862.497692] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1 [239862.497750] [<ffffffff81012ba0>] ? child_rip+0x0/0x20 [239862.497802] Code: 01 74 04 a8 10 74 35 25 01 00 04 00 ff c8 0f 95 c0 83 e0 01 48 05 ec 00 00 00 ff 4c 85 04 f6 43 48 20 74 18 48 8b 45 18 48 8b 00 <48> 8b 40 38 48 85 c0 74 08 48 89 de 48 89 ef ff d0 81 63 48 ff [239862.498351] RIP [<ffffffff811781d0>] elv_requeue_request+0x42/0x6d [239862.498412] RSP <ffff88010726bd30> [239862.498459] CR2: 0000000000000038 [239862.498880] ---[ end trace 80ef0ddbabc31003 ]--- [239864.410004] scsi 5:0:0:1: Device offlined - not ready after error recovery [239864.410151] invalid opcode: 0000 [#2] SMP [239864.418643] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host5/target5:0:0/5:0:0:2/block/sdl/uevent [239864.418798] CPU 8 [239864.418937] Modules linked in: dm_ioband raid1 dm_round_robin sd_mod crc_t10dif nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib ib_mad ib_core md_mod ata_generic evdev snd_pcm ata_piix snd_timer snd tpm_tis soundcore libata tpm snd_page_alloc tpm_bios scsi_mod pcspkr serio_raw button processor acpi_processor squashfs loop aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore nls_base mlx4_core igb dca thermal thermal_sys [239864.423151] Pid: 3975, comm: scsi_eh_5 Tainted: G D C 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6 [239864.423290] RIP: e030:[<ffffffff8147b5b5>] [<ffffffff8147b5b5>] 0xffffffff8147b5b5 [239864.423481] RSP: e02b:ffff88010718fd28 EFLAGS: 00010086 [239864.423580] RAX: ffffffff8147b590 RBX: ffff8800b3f80a80 RCX: ffff8800b3f847a0 [239864.423820] RDX: ffff8800b3f80ba0 RSI: ffff8800b3f80a80 RDI: ffff880107234680 [239864.423955] RBP: ffff880107234680 R08: ffff880107234680 R09: 0000000000000000 
[239864.424088] R10: 0000160000000000 R11: ffff88010718fea0 R12: ffff880107234680 [239864.424220] R13: ffff8801049aef00 R14: ffff880107338000 R15: ffff88010723d028 [239864.424357] FS: 00007fcd364b9700(0000) GS:ffff880028140000(0000) knlGS:0000000000000000 [239864.424493] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [239864.424594] CR2: 00007fb7c2cf7e30 CR3: 00000000b60ef000 CR4: 0000000000002660 [239864.424727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [239864.424860] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [239864.424994] Process scsi_eh_5 (pid: 3975, threadinfo ffff88010718e000, task ffff880107260e00) [239864.425131] Stack: [239864.425218] ffffffff811781e1 0000000000000200 0000000000000200 0000000000001057 [239864.425480] <0> ffffffffa010ecc0 0000000007234680 0000000000000000 ffff8800b3f80a80 [239864.425875] <0> ffff8801049aef00 0000000000000000 ffff8800b3f80a80 0000000000080000 [239864.426374] Call Trace: [239864.426470] [<ffffffff811781e1>] ? elv_requeue_request+0x53/0x6d [239864.426581] [<ffffffffa010ecc0>] ? __scsi_queue_insert+0xbe/0xe4 [scsi_mod] [239864.426693] [<ffffffffa010f0d1>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod] [239864.426832] [<ffffffffa010bbac>] ? scsi_eh_flush_done_q+0xe3/0x104 [scsi_mod] [239864.426971] [<ffffffffa010cdbf>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod] [239864.427110] [<ffffffffa010c9f4>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod] [239864.427220] [<ffffffff81065c71>] ? kthread+0x79/0x81 [239864.427322] [<ffffffff81012baa>] ? child_rip+0xa/0x20 [239864.427422] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b [239864.427526] [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6 [239864.427630] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1 [239864.427735] [<ffffffff81012ba0>] ? 
child_rip+0x0/0x20 [239864.427833] Code: 47 81 ff ff ff ff e9 77 17 81 ff ff ff ff b0 b5 47 81 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 93 78 17 81 ff <ff> ff ff 23 78 17 81 ff ff ff ff 00 00 00 00 00 00 00 00 08 7d [239864.431489] RIP [<ffffffff8147b5b5>] 0xffffffff8147b5b5 [239864.431643] RSP <ffff88010718fd28> [239864.431738] ---[ end trace 80ef0ddbabc31004 ]--- -- Vasiliy Tolstov, Clodo.ru e-mail: v.tolstov@selfip.ru jabber: vase@selfip.ru
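The log above contains two separate oopses, one per SCSI error-handler thread. As a reading aid, here is a minimal Python sketch that condenses such a log into one summary per oops. It assumes the dmesg text is available as a single string; the function name `summarize_oopses` and the regular expressions are illustrative only and cover just the two oops kinds that appear in this report.

```python
import re

# Start of an oops: only the two kinds seen in this thread are recognised.
OOPS_START = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?P<kind>BUG: unable to handle kernel NULL pointer dereference at [0-9a-f]+"
    r"|invalid opcode: \d+ \[#\d+\] SMP)"
)
COMM = re.compile(r"comm: (\S+)")                    # name of the faulting task
RIP = re.compile(r"RIP\s+\[<([0-9a-f]+)>\]\s+(\S+)")  # final "RIP [<addr>] sym" line


def summarize_oopses(log):
    """Return one dict per oops found in `log` (a dmesg dump as a string)."""
    starts = list(OOPS_START.finditer(log))
    events = []
    for i, match in enumerate(starts):
        # An oops extends up to the start of the next one, or to end of log.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log)
        chunk = log[match.start():end]
        comm = COMM.search(chunk)
        rip = RIP.search(chunk)
        events.append({
            "timestamp": float(match.group("ts")),
            "kind": match.group("kind"),
            "comm": comm.group(1) if comm else None,
            "rip": rip.group(2) if rip else None,
        })
    return events


# Excerpt of the log above, shortened to the lines the sketch looks at.
SAMPLE = (
    "[239862.494111] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 "
    "[239862.495433] Pid: 3957, comm: scsi_eh_4 Tainted: G C 2.6.32-5-xen-amd64 #1 "
    "[239862.498351] RIP [<ffffffff811781d0>] elv_requeue_request+0x42/0x6d "
    "[239864.410151] invalid opcode: 0000 [#2] SMP "
    "[239864.423151] Pid: 3975, comm: scsi_eh_5 Tainted: G D C 2.6.32-5-xen-amd64 #1 "
    "[239864.431489] RIP [<ffffffff8147b5b5>] 0xffffffff8147b5b5"
)

for event in summarize_oopses(SAMPLE):
    print(event["timestamp"], event["comm"], event["kind"], "at", event["rip"])
```

Run on the excerpt, this reports one oops in `scsi_eh_4` and one in `scsi_eh_5`, roughly two seconds apart, which matches the order in which the two targets are offlined in the log.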
* Re: segfault the use xen and multipath devices
  2011-11-10 9:52 segfault the use xen and multipath devices Vasiliy Tolstov
@ 2011-11-10 19:07 ` Bart Van Assche
  2011-11-10 19:24 ` Vasiliy Tolstov
  2011-11-15 10:00 ` Vasiliy Tolstov
  0 siblings, 2 replies; 6+ messages in thread
From: Bart Van Assche @ 2011-11-10 19:07 UTC (permalink / raw)
To: Vasiliy Tolstov; +Cc: linux-scsi

On Thu, Nov 10, 2011 at 10:52 AM, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
> and in dmesg i see segfault: BUG: unable to handle kernel NULL pointer
> dereference at 0000000000000038
>
> afte that scsi doing offline second target: scsi 5:0:0:1: Device
> offlined - not ready after error recovery
>
> after that i see: invalid opcode: 0000 [#2] SMP
>
> Does this messages says, that in kernel scsi subsystem not works fine
> or they says about xen related problems?

Does this patch help: http://marc.info/?l=linux-scsi&m=131680195721932 ?

Bart.
* Re: segfault the use xen and multipath devices
  2011-11-10 19:07 ` Bart Van Assche
@ 2011-11-10 19:24 ` Vasiliy Tolstov
  2011-11-15 10:00 ` Vasiliy Tolstov
  1 sibling, 0 replies; 6+ messages in thread
From: Vasiliy Tolstov @ 2011-11-10 19:24 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-scsi

2011/11/10 Bart Van Assche <bvanassche@acm.org>:
> Does this patch help: http://marc.info/?l=linux-scsi&m=131680195721932 ?

Thanks, I'll try it and send a reply.

--
Vasiliy Tolstov,
Clodo.ru
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru
* Re: segfault the use xen and multipath devices
  2011-11-10 19:07 ` Bart Van Assche
  2011-11-10 19:24 ` Vasiliy Tolstov
@ 2011-11-15 10:00 ` Vasiliy Tolstov
  2011-11-15 11:12 ` Hannes Reinecke
  1 sibling, 1 reply; 6+ messages in thread
From: Vasiliy Tolstov @ 2011-11-15 10:00 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-scsi

2011/11/10 Bart Van Assche <bvanassche@acm.org>:
> On Thu, Nov 10, 2011 at 10:52 AM, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
>> and in dmesg i see segfault: BUG: unable to handle kernel NULL pointer
>> dereference at 0000000000000038
>>
>> afte that scsi doing offline second target: scsi 5:0:0:1: Device
>> offlined - not ready after error recovery
>>
>> after that i see: invalid opcode: 0000 [#2] SMP
>>
>> Does this messages says, that in kernel scsi subsystem not works fine
>> or they says about xen related problems?
>
> Does this patch help: http://marc.info/?l=linux-scsi&m=131680195721932 ?
>
> Bart.
>

The provided patch does not solve my problem; the kernel dies again:

[ 1116.770956] scsi host3: SRP abort called
[ 1116.776450] device-mapper: multipath: Failing path 8:16.
[ 1116.780413] device-mapper: multipath: Failing path 8:0.
[ 1116.784288] device-mapper: multipath: Failing path 8:32.
[ 1116.788277] device-mapper: multipath: Failing path 8:80.
[ 1116.792256] device-mapper: multipath: Failing path 8:48.
[ 1116.796324] device-mapper: multipath: Failing path 8:64.
[ 1116.820330] end_request: I/O error, dev dm-0, sector 0 [ 1116.820442] Buffer I/O error on device dm-0, logical block 0 [ 1116.820545] Buffer I/O error on device dm-0, logical block 1 [ 1116.820652] Buffer I/O error on device dm-0, logical block 2 [ 1116.820727] end_request: I/O error, dev dm-0, sector 0 [ 1116.820733] Buffer I/O error on device dm-0, logical block 0 [ 1116.820739] Buffer I/O error on device dm-0, logical block 1 [ 1116.820811] end_request: I/O error, dev dm-0, sector 3515228760 [ 1116.820819] Buffer I/O error on device dm-0, logical block 878807190 [ 1116.820854] end_request: I/O error, dev dm-0, sector 3515228760 [ 1116.820858] Buffer I/O error on device dm-0, logical block 878807190 [ 1116.820904] end_request: I/O error, dev dm-0, sector 0 [ 1116.820908] Buffer I/O error on device dm-0, logical block 0 [ 1116.820917] Buffer I/O error on device dm-0, logical block 1 [ 1116.821000] end_request: I/O error, dev dm-0, sector 0 [ 1116.821003] Buffer I/O error on device dm-0, logical block 0 [ 1116.822043] end_request: I/O error, dev dm-0, sector 24 [ 1116.826036] end_request: I/O error, dev dm-1, sector 0 [ 1116.826193] end_request: I/O error, dev dm-1, sector 0 [ 1116.826375] end_request: I/O error, dev dm-1, sector 2929357400 [ 1116.826518] end_request: I/O error, dev dm-1, sector 2929357400 [ 1116.826671] end_request: I/O error, dev dm-1, sector 0 [ 1116.826851] end_request: I/O error, dev dm-1, sector 0 [ 1116.827051] end_request: I/O error, dev dm-1, sector 24 [ 1116.830046] end_request: I/O error, dev dm-2, sector 0 [ 1116.830218] end_request: I/O error, dev dm-2, sector 0 [ 1116.830387] end_request: I/O error, dev dm-2, sector 3515228760 [ 1116.830545] end_request: I/O error, dev dm-2, sector 3515228760 [ 1116.830716] end_request: I/O error, dev dm-2, sector 0 [ 1116.830890] end_request: I/O error, dev dm-2, sector 0 [ 1116.831064] end_request: I/O error, dev dm-2, sector 24 [ 1116.869530] end_request: I/O error, dev dm-0, sector 0 [ 
1116.869694] end_request: I/O error, dev dm-0, sector 0 [ 1116.869884] end_request: I/O error, dev dm-0, sector 3515228760 [ 1116.870048] end_request: I/O error, dev dm-0, sector 3515228760 [ 1116.870209] end_request: I/O error, dev dm-0, sector 0 [ 1116.870400] end_request: I/O error, dev dm-0, sector 0 [ 1116.870574] end_request: I/O error, dev dm-0, sector 24 [ 1116.874661] end_request: I/O error, dev dm-1, sector 0 [ 1116.874823] end_request: I/O error, dev dm-1, sector 0 [ 1116.874998] end_request: I/O error, dev dm-1, sector 2929357400 [ 1116.875167] end_request: I/O error, dev dm-1, sector 2929357400 [ 1116.875385] end_request: I/O error, dev dm-1, sector 0 [ 1116.875615] end_request: I/O error, dev dm-1, sector 0 [ 1116.875774] end_request: I/O error, dev dm-1, sector 24 [ 1116.878236] end_request: I/O error, dev dm-2, sector 0 [ 1116.878398] end_request: I/O error, dev dm-2, sector 0 [ 1116.878563] end_request: I/O error, dev dm-2, sector 3515228760 [ 1116.878707] end_request: I/O error, dev dm-2, sector 3515228760 [ 1116.878864] end_request: I/O error, dev dm-2, sector 0 [ 1116.879043] end_request: I/O error, dev dm-2, sector 0 [ 1116.879200] end_request: I/O error, dev dm-2, sector 24 [ 1119.831908] scsi host2: ib_srp: failed send status 12 [ 1119.832014] scsi host2: ib_srp: failed send status 5 [ 1119.832113] scsi host3: ib_srp: failed send status 12 [ 1119.832212] scsi host2: ib_srp: failed send status 5 [ 1119.832310] scsi host3: ib_srp: failed send status 5 [ 1119.832407] scsi host2: ib_srp: failed send status 5 [ 1119.832505] scsi host3: ib_srp: failed send status 5 [ 1121.770852] scsi host3: SRP abort called [ 1121.770949] scsi host3: SRP abort called [ 1121.771048] scsi host3: SRP reset_device called [ 1121.771144] scsi host3: SRP reset_device called [ 1121.771239] scsi host3: SRP reset_device called [ 1121.771337] scsi host3: ib_srp: SRP reset_host called state 0 qp_err 1 [ 1141.774844] scsi host3: SRP abort called [ 1141.774943] scsi host3: SRP 
reset_device called [ 1141.775039] scsi host3: ib_srp: SRP reset_host called state 0 qp_err 1 [ 1142.770859] scsi host2: SRP abort called [ 1142.770956] scsi host2: SRP abort called [ 1142.771050] scsi host2: SRP abort called [ 1142.771143] scsi host2: SRP abort called [ 1142.771235] scsi host2: SRP abort called [ 1142.771333] scsi host2: SRP reset_device called [ 1142.771429] scsi host2: SRP reset_device called [ 1142.771524] scsi host2: SRP reset_device called [ 1142.771621] scsi host2: ib_srp: SRP reset_host called state 0 qp_err 1 [ 1149.904557] scsi host2: ib_srp: Got failed path rec status -22 [ 1149.904675] scsi host2: ib_srp: Path record query failed [ 1149.904780] scsi host2: ib_srp: reconnect failed (-22), removing target port. [ 1149.906138] scsi host3: ib_srp: Got failed path rec status -22 [ 1149.906258] scsi host3: ib_srp: Path record query failed [ 1149.906359] scsi host3: ib_srp: reconnect failed (-22), removing target port. [ 1149.950999] scsi: killing requests for dead queue [ 1149.983013] scsi: killing requests for dead queue [ 1150.015030] scsi: killing requests for dead queue [ 1150.046977] scsi: killing requests for dead queue [ 1150.079151] scsi: killing requests for dead queue [ 1150.111163] scsi: killing requests for dead queue [ 1151.778872] scsi 3:0:0:2: Device offlined - not ready after error recovery [ 1151.779003] invalid opcode: 0000 [#1] SMP [ 1151.779202] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host3/target3:0:0/3:0:0:2/block/sdf/uevent [ 1151.779347] CPU 15 [ 1151.779485] Modules linked in: dm_round_robin sd_mod crc_t10dif nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib ib_mad ib_core md_mod snd_pcm ata_generic snd_timer snd evdev soundcore ata_piix snd_page_alloc tpm_tis libata tpm 
pcspkr tpm_bios scsi_mod serio_raw button processor acpi_processor squashfs loop aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore nls_base mlx4_core igb thermal dca thermal_sys [ 1151.783206] Pid: 5368, comm: scsi_eh_3 Tainted: G C 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6 [ 1151.783338] RIP: e030:[<ffffffff8147b595>] [<ffffffff8147b595>] 0xffffffff8147b595 [ 1151.783513] RSP: e02b:ffff8801110f5d28 EFLAGS: 00010086 [ 1151.783605] RAX: 00000000ffc7b7f9 RBX: ffff8801035818f0 RCX: ffff880103581a10 [ 1151.783702] RDX: ffff880103581a10 RSI: ffff8801035818f0 RDI: ffff88012f251a70 [ 1151.783800] RBP: ffff88012f251a70 R08: ffff88012f251a70 R09: 0000000000000000 [ 1151.783897] R10: 0000160000000000 R11: ffff8801110f5ea0 R12: ffff88012f251a70 [ 1151.783995] R13: ffff880135129800 R14: ffff880103488000 R15: ffff8801090be828 [ 1151.784095] FS: 00007f8eccbb17a0(0000) GS:ffff880028212000(0000) knlGS:0000000000000000 [ 1151.784223] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1151.784316] CR2: 00000000025cf178 CR3: 00000001352c5000 CR4: 0000000000002660 [ 1151.784414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1151.784512] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1151.784610] Process scsi_eh_3 (pid: 5368, threadinfo ffff8801110f4000, task ffff88011101c600) [ 1151.784738] Stack: [ 1151.784818] ffffffff811782f5 0000000000000200 0000000000000200 0000000000001057 [ 1151.785053] <0> ffffffffa010dd13 000000002f251a70 0000000000000000 ffff8801035818f0 [ 1151.785407] <0> ffff880135129800 0000000000000000 ffff8801035818f0 0000000000080000 [ 1151.785834] Call Trace: [ 1151.785923] [<ffffffff811782f5>] ? elv_requeue_request+0x53/0x6d [ 1151.786023] [<ffffffffa010dd13>] ? __scsi_queue_insert+0xbe/0xe5 [scsi_mod] [ 1151.786125] [<ffffffffa010e125>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod] [ 1151.786227] [<ffffffffa010abb0>] ? scsi_eh_flush_done_q+0xe3/0x104 [scsi_mod] [ 1151.786356] [<ffffffffa010bdc3>] ? 
scsi_error_handler+0x3cb/0x5b5 [scsi_mod] [ 1151.786458] [<ffffffffa010b9f8>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod] [ 1151.786558] [<ffffffff81065c69>] ? kthread+0x79/0x81 [ 1151.786653] [<ffffffff81012baa>] ? child_rip+0xa/0x20 [ 1151.786745] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b [ 1151.786841] [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6 [ 1151.786937] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1 [ 1151.787033] [<ffffffff81012ba0>] ? child_rip+0x0/0x20 [ 1151.787124] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0 c1 47 81 ff ff ff ff 40 c6 47 81 ff ff ff ff 0d 79 17 81 ff <ff> ff ff b0 b5 47 81 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 [ 1151.790294] RIP [<ffffffff8147b595>] 0xffffffff8147b595 [ 1151.790432] RSP <ffff8801110f5d28> [ 1151.790519] ---[ end trace 0facc831a040d284 ]--- [ 1152.774945] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 [ 1152.775190] IP: [<ffffffff811782e4>] elv_requeue_request+0x42/0x6d [ 1152.775350] PGD 0 [ 1152.775488] Oops: 0000 [#2] SMP [ 1152.775683] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host5/scsi_host/host5/local_ib_port [ 1152.775827] CPU 11 [ 1152.775965] Modules linked in: dm_round_robin sd_mod crc_t10dif nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib ib_mad ib_core md_mod snd_pcm ata_generic snd_timer snd evdev soundcore ata_piix snd_page_alloc tpm_tis libata tpm pcspkr tpm_bios scsi_mod serio_raw button processor acpi_processor squashfs loop aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore nls_base mlx4_core igb thermal dca thermal_sys [ 1152.780034] Pid: 5365, comm: scsi_eh_2 Tainted: G D C 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6 [ 1152.780174] RIP: 
e030:[<ffffffff811782e4>] [<ffffffff811782e4>] elv_requeue_request+0x42/0x6d [ 1152.780351] RSP: e02b:ffff8801386ddd30 EFLAGS: 00010002 [ 1152.780443] RAX: 0000000000000000 RBX: ffff880102863110 RCX: ffff880102863230 [ 1152.780541] RDX: ffff880102863230 RSI: ffff880102863110 RDI: ffff88012f250000 [ 1152.780638] RBP: ffff88012f250000 R08: ffff88012f250000 R09: 0000000000000000 [ 1152.780736] R10: 0000160000000000 R11: 0000000000000001 R12: ffff88012f250000 [ 1152.780834] R13: ffff880135bb3c00 R14: ffff8801093d8000 R15: ffff8801090b9c28 [ 1152.780934] FS: 00007f1fcf595700(0000) GS:ffff88002819a000(0000) knlGS:0000000000000000 [ 1152.781062] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1152.781155] CR2: 0000000000000038 CR3: 0000000001001000 CR4: 0000000000002660 [ 1152.781252] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1152.781350] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1152.781448] Process scsi_eh_2 (pid: 5365, threadinfo ffff8801386dc000, task ffff88011101db00) [ 1152.781576] Stack: [ 1152.781656] 0000000000000200 0000000000000200 0000000000001057 ffffffffa010dd13 [ 1152.781892] <0> 000000002f250000 0000000000000000 ffff880102863110 ffff880135bb3c00 [ 1152.782245] <0> 0000000000000000 ffff880102863110 0000000000080000 ffffffffa010e125 [ 1152.782672] Call Trace: [ 1152.782760] [<ffffffffa010dd13>] ? __scsi_queue_insert+0xbe/0xe5 [scsi_mod] [ 1152.782862] [<ffffffffa010e125>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod] [ 1152.782964] [<ffffffffa010abb0>] ? scsi_eh_flush_done_q+0xe3/0x104 [scsi_mod] [ 1152.783093] [<ffffffffa010bdc3>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod] [ 1152.783195] [<ffffffffa010b9f8>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod] [ 1152.783295] [<ffffffff81065c69>] ? kthread+0x79/0x81 [ 1152.783389] [<ffffffff81012baa>] ? child_rip+0xa/0x20 [ 1152.783482] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b [ 1152.783577] [<ffffffff8101251d>] ? 
retint_restore_args+0x5/0x6 [ 1152.783673] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1 [ 1152.783769] [<ffffffff81012ba0>] ? child_rip+0x0/0x20 [ 1152.783859] Code: 01 74 04 a8 10 74 35 25 01 00 04 00 ff c8 0f 95 c0 83 e0 01 48 05 ec 00 00 00 ff 4c 85 04 f6 43 48 20 74 18 48 8b 45 18 48 8b 00 <48> 8b 40 38 48 85 c0 74 08 48 89 de 48 89 ef ff d0 81 63 48 ff [ 1152.787020] RIP [<ffffffff811782e4>] elv_requeue_request+0x42/0x6d [ 1152.787162] RSP <ffff8801386ddd30> [ 1152.787247] CR2: 0000000000000038 [ 1152.787333] ---[ end trace 0facc831a040d285 ]--- -- Vasiliy Tolstov, Clodo.ru e-mail: v.tolstov@selfip.ru jabber: vase@selfip.ru
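The `Code:` line of the NULL-pointer oops marks the first byte of the faulting instruction with `<48>`. The following Python sketch shows how the reported fault address 0x38 can be cross-checked against those bytes. It is not a disassembler: it decodes only the single instruction form that occurs here (a 64-bit `MOV` load, `REX.W 8B /r`, with an 8-bit displacement), and the helper names `faulting_bytes` and `decode_mov_load` are made up for illustration.

```python
# 64-bit register names indexed by their 3-bit ModRM encoding (no REX.R/B).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# Tail of the Code: line from the elv_requeue_request+0x42 oops above.
CODE = "48 8b 45 18 48 8b 00 <48> 8b 40 38 48 85 c0 74 08 48 89 de 48 89 ef ff d0"


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker to the end of the dump."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load(insn):
    """Decode 'REX.W 8B /r' with mod == 01 (register base + disp8) only."""
    if insn[0] != 0x48 or insn[1] != 0x8B:
        raise ValueError("not a 64-bit MOV load")
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # rm == 4 would mean a SIB byte follows
        raise ValueError("unsupported addressing form")
    disp = insn[3] - 256 if insn[3] > 127 else insn[3]  # sign-extend disp8
    return REGS[reg], REGS[rm], disp


dest, base, disp = decode_mov_load(faulting_bytes(CODE))
print("mov %s, [%s+%#x]" % (dest, base, disp))

# RAX is 0000000000000000 in the register dump of that oops.
rax = 0x0
print("fault address: %#x" % (rax + disp))
```

With RAX = 0 in the register dump, the load address is 0 + 0x38, which agrees with `CR2: 0000000000000038`. The two instructions before the marker (`48 8b 45 18` and `48 8b 00`) are also pointer loads, so the bytes are consistent with a chain of pointer dereferences that ends in a NULL pointer. Which structure members these are depends on the kernel build and is not established in this thread.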
* Re: segfault the use xen and multipath devices
  2011-11-15 10:00 ` Vasiliy Tolstov
@ 2011-11-15 11:12 ` Hannes Reinecke
  2011-11-15 16:06 ` Bart Van Assche
  0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2011-11-15 11:12 UTC
To: Vasiliy Tolstov; +Cc: SCSI Mailing List

On 11/15/2011 11:00 AM, Vasiliy Tolstov wrote:
> 2011/11/10 Bart Van Assche <bvanassche@acm.org>:
>> On Thu, Nov 10, 2011 at 10:52 AM, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
>>> and in dmesg i see segfault: BUG: unable to handle kernel NULL pointer
>>> dereference at 0000000000000038
>>>
>>> afte that scsi doing offline second target: scsi 5:0:0:1: Device
>>> offlined - not ready after error recovery
>>>
>>> after that i see: invalid opcode: 0000 [#2] SMP
>>>
>>> Does this messages says, that in kernel scsi subsystem not works fine
>>> or they says about xen related problems?
>>
>> Does this patch help: http://marc.info/?l=linux-scsi&m=131680195721932 ?
>>
>> Bart.
>>
>
> Provided patch not solve my problem, another die:
> [ 1116.770956] scsi host3: SRP abort called
> [ 1116.776450] device-mapper: multipath: Failing path 8:16.
> [ 1116.780413] device-mapper: multipath: Failing path 8:0.
> [ 1116.784288] device-mapper: multipath: Failing path 8:32.
> [ 1116.788277] device-mapper: multipath: Failing path 8:80.
> [ 1116.792256] device-mapper: multipath: Failing path 8:48.
> [ 1116.796324] device-mapper: multipath: Failing path 8:64.
> [ 1116.820330] end_request: I/O error, dev dm-0, sector 0
> [ 1116.820442] Buffer I/O error on device dm-0, logical block 0
> [ 1116.820545] Buffer I/O error on device dm-0, logical block 1
> [ 1116.820652] Buffer I/O error on device dm-0, logical block 2
> [ 1116.820727] end_request: I/O error, dev dm-0, sector 0
> [ 1116.820733] Buffer I/O error on device dm-0, logical block 0
> [ 1116.820739] Buffer I/O error on device dm-0, logical block 1
> [ 1116.820811] end_request: I/O error, dev dm-0, sector 3515228760
> [ 1116.820819] Buffer I/O error on device dm-0, logical block 878807190
> [ 1116.820854] end_request: I/O error, dev dm-0, sector 3515228760
> [ 1116.820858] Buffer I/O error on device dm-0, logical block 878807190
> [ 1116.820904] end_request: I/O error, dev dm-0, sector 0
> [ 1116.820908] Buffer I/O error on device dm-0, logical block 0
> [ 1116.820917] Buffer I/O error on device dm-0, logical block 1
> [ 1116.821000] end_request: I/O error, dev dm-0, sector 0
> [ 1116.821003] Buffer I/O error on device dm-0, logical block 0
> [ 1116.822043] end_request: I/O error, dev dm-0, sector 24
> [ 1116.826036] end_request: I/O error, dev dm-1, sector 0
> [ 1116.826193] end_request: I/O error, dev dm-1, sector 0
> [ 1116.826375] end_request: I/O error, dev dm-1, sector 2929357400
> [ 1116.826518] end_request: I/O error, dev dm-1, sector 2929357400
> [ 1116.826671] end_request: I/O error, dev dm-1, sector 0
> [ 1116.826851] end_request: I/O error, dev dm-1, sector 0
> [ 1116.827051] end_request: I/O error, dev dm-1, sector 24
> [ 1116.830046] end_request: I/O error, dev dm-2, sector 0
> [ 1116.830218] end_request: I/O error, dev dm-2, sector 0
> [ 1116.830387] end_request: I/O error, dev dm-2, sector 3515228760
> [ 1116.830545] end_request: I/O error, dev dm-2, sector 3515228760
> [ 1116.830716] end_request: I/O error, dev dm-2, sector 0
> [ 1116.830890] end_request: I/O error, dev dm-2, sector 0
> [ 1116.831064] end_request: I/O error, dev dm-2, sector 24
> [ 1116.869530] end_request: I/O error, dev dm-0, sector 0
> [ 1116.869694] end_request: I/O error, dev dm-0, sector 0
> [ 1116.869884] end_request: I/O error, dev dm-0, sector 3515228760
> [ 1116.870048] end_request: I/O error, dev dm-0, sector 3515228760
> [ 1116.870209] end_request: I/O error, dev dm-0, sector 0
> [ 1116.870400] end_request: I/O error, dev dm-0, sector 0
> [ 1116.870574] end_request: I/O error, dev dm-0, sector 24
> [ 1116.874661] end_request: I/O error, dev dm-1, sector 0
> [ 1116.874823] end_request: I/O error, dev dm-1, sector 0
> [ 1116.874998] end_request: I/O error, dev dm-1, sector 2929357400
> [ 1116.875167] end_request: I/O error, dev dm-1, sector 2929357400
> [ 1116.875385] end_request: I/O error, dev dm-1, sector 0
> [ 1116.875615] end_request: I/O error, dev dm-1, sector 0
> [ 1116.875774] end_request: I/O error, dev dm-1, sector 24
> [ 1116.878236] end_request: I/O error, dev dm-2, sector 0
> [ 1116.878398] end_request: I/O error, dev dm-2, sector 0
> [ 1116.878563] end_request: I/O error, dev dm-2, sector 3515228760
> [ 1116.878707] end_request: I/O error, dev dm-2, sector 3515228760
> [ 1116.878864] end_request: I/O error, dev dm-2, sector 0
> [ 1116.879043] end_request: I/O error, dev dm-2, sector 0
> [ 1116.879200] end_request: I/O error, dev dm-2, sector 24
> [ 1119.831908] scsi host2: ib_srp: failed send status 12
> [ 1119.832014] scsi host2: ib_srp: failed send status 5
> [ 1119.832113] scsi host3: ib_srp: failed send status 12
> [ 1119.832212] scsi host2: ib_srp: failed send status 5
> [ 1119.832310] scsi host3: ib_srp: failed send status 5
> [ 1119.832407] scsi host2: ib_srp: failed send status 5
> [ 1119.832505] scsi host3: ib_srp: failed send status 5
> [ 1121.770852] scsi host3: SRP abort called
> [ 1121.770949] scsi host3: SRP abort called
> [ 1121.771048] scsi host3: SRP reset_device called
> [ 1121.771144] scsi host3: SRP reset_device called
> [ 1121.771239] scsi host3: SRP reset_device called
> [ 1121.771337] scsi host3: ib_srp: SRP reset_host called state 0 qp_err 1
> [ 1141.774844] scsi host3: SRP abort called
> [ 1141.774943] scsi host3: SRP reset_device called
> [ 1141.775039] scsi host3: ib_srp: SRP reset_host called state 0 qp_err 1
> [ 1142.770859] scsi host2: SRP abort called
> [ 1142.770956] scsi host2: SRP abort called
> [ 1142.771050] scsi host2: SRP abort called
> [ 1142.771143] scsi host2: SRP abort called
> [ 1142.771235] scsi host2: SRP abort called
> [ 1142.771333] scsi host2: SRP reset_device called
> [ 1142.771429] scsi host2: SRP reset_device called
> [ 1142.771524] scsi host2: SRP reset_device called
> [ 1142.771621] scsi host2: ib_srp: SRP reset_host called state 0 qp_err 1
> [ 1149.904557] scsi host2: ib_srp: Got failed path rec status -22
> [ 1149.904675] scsi host2: ib_srp: Path record query failed
> [ 1149.904780] scsi host2: ib_srp: reconnect failed (-22), removing target port.
> [ 1149.906138] scsi host3: ib_srp: Got failed path rec status -22
> [ 1149.906258] scsi host3: ib_srp: Path record query failed
> [ 1149.906359] scsi host3: ib_srp: reconnect failed (-22), removing target port.
> [ 1149.950999] scsi: killing requests for dead queue
> [ 1149.983013] scsi: killing requests for dead queue
> [ 1150.015030] scsi: killing requests for dead queue
> [ 1150.046977] scsi: killing requests for dead queue
> [ 1150.079151] scsi: killing requests for dead queue
> [ 1150.111163] scsi: killing requests for dead queue
> [ 1151.778872] scsi 3:0:0:2: Device offlined - not ready after error recovery
> [ 1151.779003] invalid opcode: 0000 [#1] SMP
> [ 1151.779202] last sysfs file:
> /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host3/target3:0:0/3:0:0:2/block/sdf/uevent
> [ 1151.779347] CPU 15
> [ 1151.779485] Modules linked in: dm_round_robin sd_mod crc_t10dif
> nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter
> ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath
> dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib
> ib_mad ib_core md_mod snd_pcm ata_generic snd_timer snd evdev
> soundcore ata_piix snd_page_alloc tpm_tis libata tpm pcspkr tpm_bios
> scsi_mod serio_raw button processor acpi_processor squashfs loop
> aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore
> nls_base mlx4_core igb thermal dca thermal_sys
> [ 1151.783206] Pid: 5368, comm: scsi_eh_3 Tainted: G C
> 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6
> [ 1151.783338] RIP: e030:[<ffffffff8147b595>] [<ffffffff8147b595>]
> 0xffffffff8147b595
> [ 1151.783513] RSP: e02b:ffff8801110f5d28 EFLAGS: 00010086
> [ 1151.783605] RAX: 00000000ffc7b7f9 RBX: ffff8801035818f0 RCX: ffff880103581a10
> [ 1151.783702] RDX: ffff880103581a10 RSI: ffff8801035818f0 RDI: ffff88012f251a70
> [ 1151.783800] RBP: ffff88012f251a70 R08: ffff88012f251a70 R09: 0000000000000000
> [ 1151.783897] R10: 0000160000000000 R11: ffff8801110f5ea0 R12: ffff88012f251a70
> [ 1151.783995] R13: ffff880135129800 R14: ffff880103488000 R15: ffff8801090be828
> [ 1151.784095] FS: 00007f8eccbb17a0(0000) GS:ffff880028212000(0000)
> knlGS:0000000000000000
> [ 1151.784223] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1151.784316] CR2: 00000000025cf178 CR3: 00000001352c5000 CR4: 0000000000002660
> [ 1151.784414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1151.784512] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 1151.784610] Process scsi_eh_3 (pid: 5368, threadinfo
> ffff8801110f4000, task ffff88011101c600)
> [ 1151.784738] Stack:
> [ 1151.784818] ffffffff811782f5 0000000000000200 0000000000000200
> 0000000000001057
> [ 1151.785053] <0> ffffffffa010dd13 000000002f251a70 0000000000000000
> ffff8801035818f0
> [ 1151.785407] <0> ffff880135129800 0000000000000000 ffff8801035818f0
> 0000000000080000
> [ 1151.785834] Call Trace:
> [ 1151.785923] [<ffffffff811782f5>] ? elv_requeue_request+0x53/0x6d
> [ 1151.786023] [<ffffffffa010dd13>] ? __scsi_queue_insert+0xbe/0xe5 [scsi_mod]
> [ 1151.786125] [<ffffffffa010e125>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod]
> [ 1151.786227] [<ffffffffa010abb0>] ? scsi_eh_flush_done_q+0xe3/0x104
> [scsi_mod]
> [ 1151.786356] [<ffffffffa010bdc3>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod]
> [ 1151.786458] [<ffffffffa010b9f8>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod]
> [ 1151.786558] [<ffffffff81065c69>] ? kthread+0x79/0x81
> [ 1151.786653] [<ffffffff81012baa>] ? child_rip+0xa/0x20
> [ 1151.786745] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b
> [ 1151.786841] [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6
> [ 1151.786937] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 1151.787033] [<ffffffff81012ba0>] ? child_rip+0x0/0x20
> [ 1151.787124] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 e0 c1 47 81 ff ff ff ff 40 c6 47 81 ff ff ff ff 0d
> 79 17 81 ff <ff> ff ff b0 b5 47 81 ff ff ff ff 00 00 00 00 00 00 00 00
> 00 00
> [ 1151.790294] RIP [<ffffffff8147b595>] 0xffffffff8147b595
> [ 1151.790432] RSP <ffff8801110f5d28>
> [ 1151.790519] ---[ end trace 0facc831a040d284 ]---
> [ 1152.774945] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000038
> [ 1152.775190] IP: [<ffffffff811782e4>] elv_requeue_request+0x42/0x6d
> [ 1152.775350] PGD 0
> [ 1152.775488] Oops: 0000 [#2] SMP
> [ 1152.775683] last sysfs file:
> /sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/host5/scsi_host/host5/local_ib_port
> [ 1152.775827] CPU 11
> [ 1152.775965] Modules linked in: dm_round_robin sd_mod crc_t10dif
> nls_utf8 cifs bonding ip6table_filter ip6_tables iptable_filter
> ip_tables ebtable_nat ebtables x_tables xen_evtchn xenfs dm_multipath
> dm_mod scsi_dh ib_sdp ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_addr ib_sa ib_uverbs mlx4_ib
> ib_mad ib_core md_mod snd_pcm ata_generic snd_timer snd evdev
> soundcore ata_piix snd_page_alloc tpm_tis libata tpm pcspkr tpm_bios
> scsi_mod serio_raw button processor acpi_processor squashfs loop
> aufs(C) ide_generic ide_core mlx4_en uhci_hcd ehci_hcd usbcore
> nls_base mlx4_core igb thermal dca thermal_sys
> [ 1152.780034] Pid: 5365, comm: scsi_eh_2 Tainted: G D C
> 2.6.32-5-xen-amd64 #1 ProLiant DL170h G6
> [ 1152.780174] RIP: e030:[<ffffffff811782e4>] [<ffffffff811782e4>]
> elv_requeue_request+0x42/0x6d
> [ 1152.780351] RSP: e02b:ffff8801386ddd30 EFLAGS: 00010002
> [ 1152.780443] RAX: 0000000000000000 RBX: ffff880102863110 RCX: ffff880102863230
> [ 1152.780541] RDX: ffff880102863230 RSI: ffff880102863110 RDI: ffff88012f250000
> [ 1152.780638] RBP: ffff88012f250000 R08: ffff88012f250000 R09: 0000000000000000
> [ 1152.780736] R10: 0000160000000000 R11: 0000000000000001 R12: ffff88012f250000
> [ 1152.780834] R13: ffff880135bb3c00 R14: ffff8801093d8000 R15: ffff8801090b9c28
> [ 1152.780934] FS: 00007f1fcf595700(0000) GS:ffff88002819a000(0000)
> knlGS:0000000000000000
> [ 1152.781062] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1152.781155] CR2: 0000000000000038 CR3: 0000000001001000 CR4: 0000000000002660
> [ 1152.781252] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1152.781350] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 1152.781448] Process scsi_eh_2 (pid: 5365, threadinfo
> ffff8801386dc000, task ffff88011101db00)
> [ 1152.781576] Stack:
> [ 1152.781656] 0000000000000200 0000000000000200 0000000000001057
> ffffffffa010dd13
> [ 1152.781892] <0> 000000002f250000 0000000000000000 ffff880102863110
> ffff880135bb3c00
> [ 1152.782245] <0> 0000000000000000 ffff880102863110 0000000000080000
> ffffffffa010e125
> [ 1152.782672] Call Trace:
> [ 1152.782760] [<ffffffffa010dd13>] ? __scsi_queue_insert+0xbe/0xe5 [scsi_mod]
> [ 1152.782862] [<ffffffffa010e125>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod]
> [ 1152.782964] [<ffffffffa010abb0>] ? scsi_eh_flush_done_q+0xe3/0x104
> [scsi_mod]
> [ 1152.783093] [<ffffffffa010bdc3>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod]
> [ 1152.783195] [<ffffffffa010b9f8>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod]
> [ 1152.783295] [<ffffffff81065c69>] ? kthread+0x79/0x81
> [ 1152.783389] [<ffffffff81012baa>] ? child_rip+0xa/0x20
> [ 1152.783482] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b
> [ 1152.783577] [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6
> [ 1152.783673] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 1152.783769] [<ffffffff81012ba0>] ? child_rip+0x0/0x20
> [ 1152.783859] Code: 01 74 04 a8 10 74 35 25 01 00 04 00 ff c8 0f 95
> c0 83 e0 01 48 05 ec 00 00 00 ff 4c 85 04 f6 43 48 20 74 18 48 8b 45
> 18 48 8b 00 <48> 8b 40 38 48 85 c0 74 08 48 89 de 48 89 ef ff d0 81 63
> 48 ff
> [ 1152.787020] RIP [<ffffffff811782e4>] elv_requeue_request+0x42/0x6d
> [ 1152.787162] RSP <ffff8801386ddd30>
> [ 1152.787247] CR2: 0000000000000038
> [ 1152.787333] ---[ end trace 0facc831a040d285 ]---
>
Ouch. Requeue while the queue is dead.

I'm pretty sure we need to fix the error handler for this case.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
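
As a cross-check on the two traces quoted in the message above, the symbol offsets can be worked through by hand. The sketch below is not from the thread: it only redoes the arithmetic on addresses copied from the quoted dmesg output, and the helper name `function_start` is made up for illustration. Under that assumption it suggests both oopses pass through the same function, since the first call-trace entry of oops #1 and the faulting RIP of oops #2 resolve to one start address for elv_requeue_request.

```python
# Addresses and offsets are copied from the quoted dmesg output.
# function_start() is a hypothetical helper, not a kernel or tool API.

def function_start(address: int, offset: int) -> int:
    """Return the start address implied by a 'symbol+offset' annotation."""
    return address - offset

# Oops #2: RIP is annotated elv_requeue_request+0x42/0x6d
oops2_rip = 0xffffffff811782e4
# Oops #1: first call-trace entry is elv_requeue_request+0x53/0x6d
oops1_ret = 0xffffffff811782f5

start_from_oops2 = function_start(oops2_rip, 0x42)
start_from_oops1 = function_start(oops1_ret, 0x53)

# Oops #2 reports RAX = 0 and CR2 = 0x38. The marked instruction bytes
# (48 8b 40 38) encode a load from 0x38(%rax), so a NULL base pointer
# plus that displacement gives the reported fault address.
null_base = 0x0
displacement = 0x38
fault_address = null_base + displacement

print(hex(start_from_oops2))
print(hex(start_from_oops1))
print(hex(fault_address))
```

Both computed start addresses agree, which fits the reading that the requeue path is what trips over the dead queue in each case. Which structure field sits at offset 0x38 is not stated in the thread and is left open here.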
* Re: segfault the use xen and multipath devices
  2011-11-15 11:12 ` Hannes Reinecke
@ 2011-11-15 16:06 ` Bart Van Assche
  0 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2011-11-15 16:06 UTC
To: Hannes Reinecke; +Cc: Vasiliy Tolstov, SCSI Mailing List

On Tue, Nov 15, 2011 at 12:12 PM, Hannes Reinecke <hare@suse.de> wrote:
> On 11/15/2011 11:00 AM, Vasiliy Tolstov wrote:
> > [ 1152.782672] Call Trace:
> > [ 1152.782760] [<ffffffffa010dd13>] ? __scsi_queue_insert+0xbe/0xe5 [scsi_mod]
> > [ 1152.782862] [<ffffffffa010e125>] ? scsi_io_completion+0x3eb/0x3fa [scsi_mod]
> > [ 1152.782964] [<ffffffffa010abb0>] ? scsi_eh_flush_done_q+0xe3/0x104
> > [scsi_mod]
> > [ 1152.783093] [<ffffffffa010bdc3>] ? scsi_error_handler+0x3cb/0x5b5 [scsi_mod]
> > [ 1152.783195] [<ffffffffa010b9f8>] ? scsi_error_handler+0x0/0x5b5 [scsi_mod]
> > [ 1152.783295] [<ffffffff81065c69>] ? kthread+0x79/0x81
> > [ 1152.783389] [<ffffffff81012baa>] ? child_rip+0xa/0x20
> > [ 1152.783482] [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b
> > [ 1152.783577] [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6
> > [ 1152.783673] [<ffffffff8100ef4f>] ? xen_restore_fl_direct_end+0x0/0x1
> > [ 1152.783769] [<ffffffff81012ba0>] ? child_rip+0x0/0x20
>
> Ouch. Requeue while the queue is dead.
>
> I'm pretty sure we need to fix the error handler for this case.

Apparently when a SCSI host is removed the SCSI queue is killed from
inside __scsi_remove_device() before the error handler thread is
stopped from scsi_host_dev_release(). That order doesn't seem correct
to me.

Bart.
end of thread, other threads:[~2011-11-15 16:06 UTC]

Thread overview: 6+ messages
2011-11-10  9:52 segfault the use xen and multipath devices Vasiliy Tolstov
2011-11-10 19:07 ` Bart Van Assche
2011-11-10 19:24 ` Vasiliy Tolstov
2011-11-15 10:00 ` Vasiliy Tolstov
2011-11-15 11:12 ` Hannes Reinecke
2011-11-15 16:06 ` Bart Van Assche