* crash in iscsi/scsi initiator with linux-4.15.0-rc1
@ 2017-12-01 17:00 Steve Wise
2017-12-01 18:32 ` Ewan D. Milne
0 siblings, 1 reply; 6+ messages in thread
From: Steve Wise @ 2017-12-01 17:00 UTC
To: linux-scsi
Hey,

I'm seeing this null pointer dereference with linux-4.15.0-rc1.  To reproduce
it, I connect two ram disks via iscsi/TCP, and start an fio:

iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
iscsiadm -m node -p 172.16.1.10:3260 -l
ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap \
  --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1 \
  --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300 \
  --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000

Then on the initiator node, while the fio test is running, I detach the devices:

iscsiadm -m node -p 172.16.1.10:3260 -I iser -u

Then I hit this crash.  Has anyone else encountered this issue?  Wondering if
there is a fix handy. :)

Thanks,

Steve.
----
[ 127.175953] scsi 8:0:0:0: alua: Detached
[ 127.175955] scsi 8:0:0:0: alua: Detached
[ 127.175981] ------------[ cut here ]------------
[ 127.175984] list_del corruption. prev->next should be ffff8803382f1240, but
was ffff88039ab0f780
[ 127.176010] WARNING: CPU: 5 PID: 373 at lib/list_debug.c:53
__list_del_entry_valid+0x7c/0xa0
[ 127.176011] Modules linked in: iscsi_tcp libiscsi_tcp rpcrdma ib_isert
iscsi_target_mod libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp
scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm
iw_cm libcxgb mlx5_ib ext4 ib_core dm_mirror dm_region_hash dm_log dm_mod
mbcache jbd2 coretemp kvm iTCO_wdt ppdev irqbypass iTCO_vendor_support gpio_ich
i2c_i801 pcspkr lpc_ich parport_pc i5400_edac sg parport i5k_amb shpchp nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod nouveau
cdrom sd_mod ata_generic pata_acpi video mxm_wmi wmi drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm igb cxgb4 ahci firewire_ohci
ata_piix libahci firewire_core dca i2c_algo_bit devlink libata ptp serio_raw
i2c_core crc_itu_t pps_core [last unloaded: ib_iser]
[ 127.176055] CPU: 5 PID: 373 Comm: kworker/u16:4 Not tainted 4.15.0-rc1+ #6
[ 127.176056] Hardware name: Supermicro X7DWA/X7DWA, BIOS 6.00 12/21/2007
[ 127.176074] Workqueue: scsi_wq_9 __iscsi_unbind_session
[scsi_transport_iscsi]
[ 127.176075] task: ffff88039a498000 task.stack: ffffc90002880000
[ 127.176076] RIP: 0010:__list_del_entry_valid+0x7c/0xa0
[ 127.176076] RSP: 0018:ffffc90002883d38 EFLAGS: 00010082
[ 127.176077] RAX: 0000000000000000 RBX: ffff8803382f1240 RCX: 0000000000000000
[ 127.176078] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000092
[ 127.176079] RBP: ffff8803982129c0 R08: 0000000000000054 R09: ffffffff823d60e0
[ 127.176079] R10: 0000000000000473 R11: 0000000000000000 R12: ffff880398212800
[ 127.176080] R13: ffff880396701800 R14: ffff880396701800 R15: ffff8801afc31000
[ 127.176081] FS: 0000000000000000(0000) GS:ffff8803bfd40000(0000)
knlGS:0000000000000000
[ 127.176082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 127.176083] CR2: 00007f6a80028038 CR3: 000000039a957000 CR4: 00000000000006e0
[ 127.176084] Call Trace:
[ 127.176091] alua_bus_detach+0x5c/0xc0
[ 127.176095] scsi_dh_release_device+0x18/0x50
[ 127.176098] scsi_device_dev_release_usercontext+0x25/0x230
[ 127.176107] execute_in_process_context+0x58/0x60
[ 127.176110] device_release+0x2d/0x80
[ 127.176113] kobject_cleanup+0x5e/0x180
[ 127.176115] scsi_remove_target+0x16b/0x1b0
[ 127.176119] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
[ 127.176121] process_one_work+0x141/0x340
[ 127.176123] worker_thread+0x47/0x3e0
[ 127.176124] kthread+0xf5/0x130
[ 127.176126] ? rescuer_thread+0x380/0x380
[ 127.176127] ? kthread_associate_blkcg+0x90/0x90
[ 127.176129] ret_from_fork+0x1f/0x30
[ 127.176130] Code: ff 31 c0 c3 48 89 fe 31 c0 48 c7 c7 60 19 a9 81 e8 3a 33 d0
ff 0f ff 31 c0 c3 48 89 fe 31 c0 48 c7 c7 20 19 a9 81 e8 24 33 d0 ff <0f> ff 31
c0 c3 48 89 fe 31 c0 48 c7 c7 e8 18 a9 81 e8 0e 33 d0
[ 127.176145] ---[ end trace e7e378e0f32966e0 ]---
[ 127.176148] scsi 9:0:0:0: alua: Detached
[ 127.466362] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 127.474355] IP: _raw_spin_lock_irqsave+0x1e/0x40
[ 127.479136] PGD 399e70067 P4D 399e70067 PUD 3966cd067 PMD 0
[ 127.484961] Oops: 0002 [#1] SMP
[ 127.488269] Modules linked in: iscsi_tcp libiscsi_tcp rpcrdma ib_isert
iscsi_target_mod libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp
scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm
iw_cm libcxgb mlx5_ib ext4 ib_core dm_mirror dm_region_hash dm_log dm_mod
mbcache jbd2 coretemp kvm iTCO_wdt ppdev irqbypass iTCO_vendor_support gpio_ich
i2c_i801 pcspkr lpc_ich parport_pc i5400_edac sg parport i5k_amb shpchp nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod nouveau
cdrom sd_mod ata_generic pata_acpi video mxm_wmi wmi drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm igb cxgb4 ahci firewire_ohci
ata_piix libahci firewire_core dca i2c_algo_bit devlink libata ptp serio_raw
i2c_core crc_itu_t pps_core [last unloaded: ib_iser]
[ 127.565494] CPU: 0 PID: 374 Comm: kworker/u16:5 Tainted: G W
4.15.0-rc1+ #6
[ 127.573846] Hardware name: Supermicro X7DWA/X7DWA, BIOS 6.00 12/21/2007
[ 127.580649] Workqueue: scsi_wq_8 __iscsi_unbind_session
[scsi_transport_iscsi]
[ 127.588054] task: ffff88039a4995c0 task.stack: ffffc90002888000
[ 127.594138] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
[ 127.599433] RSP: 0018:ffffc9000288bd68 EFLAGS: 00010046
[ 127.604819] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
[ 127.612129] RDX: 0000000000000001 RSI: ffff8803bfc0e038 RDI: 0000000000000000
[ 127.619427] RBP: ffff880396700f28 R08: 0000000000000000 R09: 0000000000000496
[ 127.626768] R10: 0000000000000000 R11: 0000000000000010 R12: ffff88033ab43900
[ 127.634067] R13: ffff88033997f000 R14: ffff880396700800 R15: ffff88033997f000
[ 127.641390] FS: 0000000000000000(0000) GS:ffff8803bfc00000(0000)
knlGS:0000000000000000
[ 127.649667] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 127.655579] CR2: 0000000000000000 CR3: 0000000396042000 CR4: 00000000000006f0
[ 127.662890] Call Trace:
[ 127.665521] scsi_device_dev_release_usercontext+0x40/0x230
[ 127.671273] execute_in_process_context+0x58/0x60
[ 127.676144] device_release+0x2d/0x80
[ 127.679987] kobject_cleanup+0x5e/0x180
[ 127.684005] scsi_remove_target+0x16b/0x1b0
[ 127.688356] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
[ 127.694972] process_one_work+0x141/0x340
[ 127.699179] worker_thread+0x47/0x3e0
[ 127.703018] kthread+0xf5/0x130
[ 127.706330] ? rescuer_thread+0x380/0x380
[ 127.710504] ? kthread_associate_blkcg+0x90/0x90
[ 127.715321] ret_from_fork+0x1f/0x30
[ 127.719083] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c
58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1
17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 63 98 ff eb
[ 127.738870] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc9000288bd68
[ 127.745673] CR2: 0000000000000000
* Re: crash in iscsi/scsi initiator with linux-4.15.0-rc1
2017-12-01 17:00 crash in iscsi/scsi initiator with linux-4.15.0-rc1 Steve Wise
@ 2017-12-01 18:32 ` Ewan D. Milne
2017-12-01 20:36 ` Steve Wise
2017-12-19 19:31 ` Steve Wise
0 siblings, 2 replies; 6+ messages in thread
From: Ewan D. Milne @ 2017-12-01 18:32 UTC
To: Steve Wise; +Cc: linux-scsi
On Fri, 2017-12-01 at 11:00 -0600, Steve Wise wrote:
> Hey,
>
> I'm seeing this null pointer dereference with linux-4.15.0-rc1. To reproduce
> it, I connect two ram disks via iscsi/TCP, and start an fio:
>
> iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
> iscsiadm -m node -p 172.16.1.10:3260 -l
> ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap
> --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1
> --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300
> --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000
>
> Then on the initiator node, while the fio test is running, I detach the devices:
>
> iscsiadm -m node -p 172.16.1.10:3260 -I iser -u
>
> Then I hit this crash. Has anyone else encountered this issue? Wondering if
> there is a fix handy. :)
>
This is the same problem that is being discussed under the thread:
"[PATCH] scsi: fix race condition when removing target".
We had good test results with both Jason Yan's patch and Bart's patch
applied, however the ultimate solution is still in progress, see James'
comments.
You could also try reverting fbce4d97fd "scsi: fixup kernel warning
during rmmod()" if you just need to get past this.
-Ewan
> Thanks,
>
> Steve.
>
> ----
>
> [ 127.175953] scsi 8:0:0:0: alua: Detached
> [ 127.175955] scsi 8:0:0:0: alua: Detached
> [ 127.175981] ------------[ cut here ]------------
> [ 127.175984] list_del corruption. prev->next should be ffff8803382f1240, but
> was ffff88039ab0f780
> [ 127.176010] WARNING: CPU: 5 PID: 373 at lib/list_debug.c:53
> __list_del_entry_valid+0x7c/0xa0
> [ 127.176011] Modules linked in: iscsi_tcp libiscsi_tcp rpcrdma ib_isert
> iscsi_target_mod libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp
> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm
> iw_cm libcxgb mlx5_ib ext4 ib_core dm_mirror dm_region_hash dm_log dm_mod
> mbcache jbd2 coretemp kvm iTCO_wdt ppdev irqbypass iTCO_vendor_support gpio_ich
> i2c_i801 pcspkr lpc_ich parport_pc i5400_edac sg parport i5k_amb shpchp nfsd
> auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod nouveau
> cdrom sd_mod ata_generic pata_acpi video mxm_wmi wmi drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm igb cxgb4 ahci firewire_ohci
> ata_piix libahci firewire_core dca i2c_algo_bit devlink libata ptp serio_raw
> i2c_core crc_itu_t pps_core [last unloaded: ib_iser]
> [ 127.176055] CPU: 5 PID: 373 Comm: kworker/u16:4 Not tainted 4.15.0-rc1+ #6
> [ 127.176056] Hardware name: Supermicro X7DWA/X7DWA, BIOS 6.00 12/21/2007
> [ 127.176074] Workqueue: scsi_wq_9 __iscsi_unbind_session
> [scsi_transport_iscsi]
> [ 127.176075] task: ffff88039a498000 task.stack: ffffc90002880000
> [ 127.176076] RIP: 0010:__list_del_entry_valid+0x7c/0xa0
> [ 127.176076] RSP: 0018:ffffc90002883d38 EFLAGS: 00010082
> [ 127.176077] RAX: 0000000000000000 RBX: ffff8803382f1240 RCX: 0000000000000000
> [ 127.176078] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000092
> [ 127.176079] RBP: ffff8803982129c0 R08: 0000000000000054 R09: ffffffff823d60e0
> [ 127.176079] R10: 0000000000000473 R11: 0000000000000000 R12: ffff880398212800
> [ 127.176080] R13: ffff880396701800 R14: ffff880396701800 R15: ffff8801afc31000
> [ 127.176081] FS: 0000000000000000(0000) GS:ffff8803bfd40000(0000)
> knlGS:0000000000000000
> [ 127.176082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 127.176083] CR2: 00007f6a80028038 CR3: 000000039a957000 CR4: 00000000000006e0
> [ 127.176084] Call Trace:
> [ 127.176091] alua_bus_detach+0x5c/0xc0
> [ 127.176095] scsi_dh_release_device+0x18/0x50
> [ 127.176098] scsi_device_dev_release_usercontext+0x25/0x230
> [ 127.176107] execute_in_process_context+0x58/0x60
> [ 127.176110] device_release+0x2d/0x80
> [ 127.176113] kobject_cleanup+0x5e/0x180
> [ 127.176115] scsi_remove_target+0x16b/0x1b0
> [ 127.176119] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
> [ 127.176121] process_one_work+0x141/0x340
> [ 127.176123] worker_thread+0x47/0x3e0
> [ 127.176124] kthread+0xf5/0x130
> [ 127.176126] ? rescuer_thread+0x380/0x380
> [ 127.176127] ? kthread_associate_blkcg+0x90/0x90
> [ 127.176129] ret_from_fork+0x1f/0x30
> [ 127.176130] Code: ff 31 c0 c3 48 89 fe 31 c0 48 c7 c7 60 19 a9 81 e8 3a 33 d0
> ff 0f ff 31 c0 c3 48 89 fe 31 c0 48 c7 c7 20 19 a9 81 e8 24 33 d0 ff <0f> ff 31
> c0 c3 48 89 fe 31 c0 48 c7 c7 e8 18 a9 81 e8 0e 33 d0
> [ 127.176145] ---[ end trace e7e378e0f32966e0 ]---
> [ 127.176148] scsi 9:0:0:0: alua: Detached
> [ 127.466362] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 127.474355] IP: _raw_spin_lock_irqsave+0x1e/0x40
> [ 127.479136] PGD 399e70067 P4D 399e70067 PUD 3966cd067 PMD 0
> [ 127.484961] Oops: 0002 [#1] SMP
> [ 127.488269] Modules linked in: iscsi_tcp libiscsi_tcp rpcrdma ib_isert
> iscsi_target_mod libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp
> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm
> iw_cm libcxgb mlx5_ib ext4 ib_core dm_mirror dm_region_hash dm_log dm_mod
> mbcache jbd2 coretemp kvm iTCO_wdt ppdev irqbypass iTCO_vendor_support gpio_ich
> i2c_i801 pcspkr lpc_ich parport_pc i5400_edac sg parport i5k_amb shpchp nfsd
> auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod nouveau
> cdrom sd_mod ata_generic pata_acpi video mxm_wmi wmi drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm igb cxgb4 ahci firewire_ohci
> ata_piix libahci firewire_core dca i2c_algo_bit devlink libata ptp serio_raw
> i2c_core crc_itu_t pps_core [last unloaded: ib_iser]
> [ 127.565494] CPU: 0 PID: 374 Comm: kworker/u16:5 Tainted: G W
> 4.15.0-rc1+ #6
> [ 127.573846] Hardware name: Supermicro X7DWA/X7DWA, BIOS 6.00 12/21/2007
> [ 127.580649] Workqueue: scsi_wq_8 __iscsi_unbind_session
> [scsi_transport_iscsi]
> [ 127.588054] task: ffff88039a4995c0 task.stack: ffffc90002888000
> [ 127.594138] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
> [ 127.599433] RSP: 0018:ffffc9000288bd68 EFLAGS: 00010046
> [ 127.604819] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
> [ 127.612129] RDX: 0000000000000001 RSI: ffff8803bfc0e038 RDI: 0000000000000000
> [ 127.619427] RBP: ffff880396700f28 R08: 0000000000000000 R09: 0000000000000496
> [ 127.626768] R10: 0000000000000000 R11: 0000000000000010 R12: ffff88033ab43900
> [ 127.634067] R13: ffff88033997f000 R14: ffff880396700800 R15: ffff88033997f000
> [ 127.641390] FS: 0000000000000000(0000) GS:ffff8803bfc00000(0000)
> knlGS:0000000000000000
> [ 127.649667] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 127.655579] CR2: 0000000000000000 CR3: 0000000396042000 CR4: 00000000000006f0
> [ 127.662890] Call Trace:
> [ 127.665521] scsi_device_dev_release_usercontext+0x40/0x230
> [ 127.671273] execute_in_process_context+0x58/0x60
> [ 127.676144] device_release+0x2d/0x80
> [ 127.679987] kobject_cleanup+0x5e/0x180
> [ 127.684005] scsi_remove_target+0x16b/0x1b0
> [ 127.688356] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
> [ 127.694972] process_one_work+0x141/0x340
> [ 127.699179] worker_thread+0x47/0x3e0
> [ 127.703018] kthread+0xf5/0x130
> [ 127.706330] ? rescuer_thread+0x380/0x380
> [ 127.710504] ? kthread_associate_blkcg+0x90/0x90
> [ 127.715321] ret_from_fork+0x1f/0x30
> [ 127.719083] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c
> 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1
> 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 63 98 ff eb
> [ 127.738870] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc9000288bd68
> [ 127.745673] CR2: 0000000000000000
>
* RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1
2017-12-01 18:32 ` Ewan D. Milne
@ 2017-12-01 20:36 ` Steve Wise
2017-12-19 19:31 ` Steve Wise
1 sibling, 0 replies; 6+ messages in thread
From: Steve Wise @ 2017-12-01 20:36 UTC
To: emilne; +Cc: linux-scsi
> > Then I hit this crash. Has anyone else encountered this issue? Wondering if
> > there is a fix handy. :)
> >
>
> This is the same problem that is being discussed under the thread:
> "[PATCH] scsi: fix race condition when removing target".
>
> We had good test results with both Jason Yan's patch and Bart's patch
> applied, however the ultimate solution is still in progress, see James'
> comments.
>
> You could also try reverting fbce4d97fd "scsi: fixup kernel warning
> during rmmod()" if you just need to get past this.
>
> -Ewan

Thanks Ewan, I'll back out that commit just to verify I'm seeing the same
issue.  I'm also happy to test any final fix.

Steve.
* RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1
2017-12-01 18:32 ` Ewan D. Milne
2017-12-01 20:36 ` Steve Wise
@ 2017-12-19 19:31 ` Steve Wise
2017-12-19 20:20 ` Ewan D. Milne
1 sibling, 1 reply; 6+ messages in thread
From: Steve Wise @ 2017-12-19 19:31 UTC
To: emilne; +Cc: linux-scsi, yanaijie, Bart.VanAssche
> > Hey,
> >
> > I'm seeing this null pointer dereference with linux-4.15.0-rc1. To reproduce
> > it, I connect two ram disks via iscsi/TCP, and start an fio:
> >
> > iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
> > iscsiadm -m node -p 172.16.1.10:3260 -l
> > ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap
> > --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1
> > --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300
> > --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000
> >
> > Then on the initiator node, while the fio test is running, I detach the devices:
> >
> > iscsiadm -m node -p 172.16.1.10:3260 -I iser -u
> >
> > Then I hit this crash. Has anyone else encountered this issue? Wondering if
> > there is a fix handy. :)
> >
>
> This is the same problem that is being discussed under the thread:
> "[PATCH] scsi: fix race condition when removing target".
>
> We had good test results with both Jason Yan's patch and Bart's patch
> applied, however the ultimate solution is still in progress, see James'
> comments.
>
> You could also try reverting fbce4d97fd "scsi: fixup kernel warning
> during rmmod()" if you just need to get past this.
>
> -Ewan
>
Hey Ewan, Yan, Bart,
I'm still seeing this issue with 4.15-rc4. Is the issue still outstanding?
Steve.
---
[ 1002.205103] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1002.213022] IP: _raw_spin_lock_irqsave+0x1e/0x40
[ 1002.217740] PGD 0 P4D 0
[ 1002.220382] Oops: 0002 [#1] SMP
[ 1002.223637] Modules linked in: iw_cxgb4 cxgb4 nvme_rdma nvme_fabrics rdma_ktest(O) rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core libcxgb vfat intel_rapl fat iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi mei_me ipmi_si lpc_ich mei pcspkr i2c_i801 mfd_core ipmi_devintf shpchp sg ipmi_msghandler wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod igb drm ahci libahci dca mlx4_core
[ 1002.295663] ptp libata pps_core crc32c_intel nvme i2c_algo_bit i2c_core nvme_core [last unloaded: cxgb4]
[ 1002.305563] CPU: 4 PID: 5156 Comm: fio Tainted: G O 4.15.0-rc4 #3
[ 1002.313223] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
[ 1002.320555] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
[ 1002.326077] RSP: 0018:ffffc900070cbd10 EFLAGS: 00010046
[ 1002.331692] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
[ 1002.339225] RDX: 0000000000000001 RSI: ffff88085fd0e038 RDI: 0000000000000000
[ 1002.346763] RBP: ffff880855a65f18 R08: 0000000000000000 R09: 0000000000000744
[ 1002.354315] R10: 00000000000003ff R11: 0000000000000001 R12: ffff88084992e180
[ 1002.361873] R13: ffff880855a67000 R14: ffff880855a65800 R15: ffff880856d7d5a8
[ 1002.369447] FS: 0000000000000000(0000) GS:ffff88085fd00000(0000) knlGS:0000000000000000
[ 1002.377995] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1002.384209] CR2: 0000000000000000 CR3: 0000000001c09005 CR4: 00000000000606e0
[ 1002.391826] Call Trace:
[ 1002.394774] scsi_device_dev_release_usercontext+0x40/0x230
[ 1002.400858] execute_in_process_context+0x58/0x60
[ 1002.406085] device_release+0x2d/0x80
[ 1002.410277] kobject_cleanup+0x5e/0x180
[ 1002.414659] scsi_disk_put+0x2b/0x40 [sd_mod]
[ 1002.419559] __blkdev_put+0x1b5/0x1d0
[ 1002.423777] ? disk_flush_events+0x24/0x60
[ 1002.428430] blkdev_close+0x21/0x30
[ 1002.432484] __fput+0xd5/0x210
[ 1002.436111] task_work_run+0x82/0xa0
[ 1002.440262] do_exit+0x2be/0xb20
[ 1002.444074] ? syscall_trace_enter+0x1af/0x290
[ 1002.449110] do_group_exit+0x39/0xa0
[ 1002.453287] SyS_exit_group+0x10/0x10
[ 1002.457557] do_syscall_64+0x61/0x1a0
[ 1002.461829] entry_SYSCALL64_slow_path+0x25/0x25
[ 1002.467064] RIP: 0033:0x7f9abb1c8529
[ 1002.471266] RSP: 002b:00007ffe53be40d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 1002.479482] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f9abb1c8529
[ 1002.487279] RDX: 0000000000000005 RSI: 000000000000000a RDI: 0000000000000005
[ 1002.495079] RBP: 00007f9a9c9de818 R08: 000000000000003c R09: 00000000000000e7
[ 1002.502882] R10: ffffffffffffff60 R11: 0000000000000206 R12: 0000000000000006
[ 1002.510690] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000172a440
[ 1002.518497] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 06 9e ff eb
[ 1002.538742] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc900070cbd10
[ 1002.546055] CR2: 0000000000000000
* RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1
2017-12-19 19:31 ` Steve Wise
@ 2017-12-19 20:20 ` Ewan D. Milne
2017-12-20 19:05 ` Steve Wise
0 siblings, 1 reply; 6+ messages in thread
From: Ewan D. Milne @ 2017-12-19 20:20 UTC
To: Steve Wise; +Cc: linux-scsi, yanaijie, Bart.VanAssche
On Tue, 2017-12-19 at 13:31 -0600, Steve Wise wrote:
> > > Hey,
> > >
> > > I'm seeing this null pointer dereference with linux-4.15.0-rc1. To reproduce
> > > it, I connect two ram disks via iscsi/TCP, and start an fio:
> > >
> > > iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
> > > iscsiadm -m node -p 172.16.1.10:3260 -l
> > > ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap
> > > --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1
> > > --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300
> > > --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000
> > >
> > > Then on the initiator node, while the fio test is running, I detach the devices:
> > >
> > > iscsiadm -m node -p 172.16.1.10:3260 -I iser -u
> > >
> > > Then I hit this crash. Has anyone else encountered this issue? Wondering if
> > > there is a fix handy. :)
> > >
> >
> > This is the same problem that is being discussed under the thread:
> > "[PATCH] scsi: fix race condition when removing target".
> >
> > We had good test results with both Jason Yan's patch and Bart's patch
> > applied, however the ultimate solution is still in progress, see James'
> > comments.
> >
> > You could also try reverting fbce4d97fd "scsi: fixup kernel warning
> > during rmmod()" if you just need to get past this.
> >
> > -Ewan
> >
>
> Hey Ewan, Yan, Bart,
>
> I'm still seeing this issue with 4.15-rc4. Is the issue still outstanding?
>
> Steve.
>
Please apply the following commit from the 4.15/scsi-fixes branch of
git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
and advise if it does not fix your issue. It should.
----
commit 81b6c999897919d5a16fedc018fe375dbab091c5
Author: Hannes Reinecke <hare@suse.de>
Date: Wed Dec 13 14:21:37 2017 +0100
scsi: core: check for device state in __scsi_remove_target()
As it turned out device_get() doesn't use kref_get_unless_zero(), so we
will be always getting a device pointer. Consequently, we need to check
for the device state in __scsi_remove_target() to avoid tripping over
deleted objects.
Fixes: fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()")
Reported-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
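
For readers following along, here is a minimal userspace sketch of the idea
described in the commit message above. It is not the kernel patch. The names
model_dev, model_get(), model_get_unless_zero() and the MODEL_* states are
hypothetical stand-ins for illustration. It assumes only what the commit
message says: a plain "get" always hands back a pointer, so the caller has to
look at the device state itself before touching the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for device states; not the kernel's enum. */
enum model_state { MODEL_RUNNING, MODEL_CANCEL, MODEL_DEL };

struct model_dev {
    int refcount;            /* models an embedded reference count */
    enum model_state state;  /* models the per-device state field */
};

/* Plain "get": always bumps the count and returns the pointer,
 * even when the count had already dropped to zero. */
struct model_dev *model_get(struct model_dev *d)
{
    if (d)
        d->refcount++;
    return d;
}

/* "Unless zero" variant: refuses an object whose count already hit zero. */
struct model_dev *model_get_unless_zero(struct model_dev *d)
{
    if (!d || d->refcount == 0)
        return NULL;
    d->refcount++;
    return d;
}

/* Removal pass relying only on the plain get: every entry is picked,
 * including ones that are already being torn down. */
int pick_without_state_check(struct model_dev *devs, size_t n)
{
    int picked = 0;
    for (size_t i = 0; i < n; i++)
        if (model_get(&devs[i]))
            picked++;
    return picked;
}

/* Removal pass that checks the state first and skips entries that are
 * deleted or on their way out, before taking a reference. */
int pick_with_state_check(struct model_dev *devs, size_t n)
{
    int picked = 0;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].state == MODEL_DEL || devs[i].state == MODEL_CANCEL)
            continue;
        if (model_get(&devs[i]))
            picked++;
    }
    return picked;
}
```

In this model, given one running device, one deleted device whose count is
already zero, and one device being cancelled, the first pass picks all three
while the second picks only the running one.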
> ---
>
> [ 1002.205103] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 1002.213022] IP: _raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.217740] PGD 0 P4D 0
> [ 1002.220382] Oops: 0002 [#1] SMP
> [ 1002.223637] Modules linked in: iw_cxgb4 cxgb4 nvme_rdma nvme_fabrics rdma_ktest(O) rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core libcxgb vfat intel_rapl fat iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi mei_me ipmi_si lpc_ich mei pcspkr i2c_i801 mfd_core ipmi_devintf shpchp sg ipmi_msghandler wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod igb drm ahci libahci dca mlx4_core
> [ 1002.295663] ptp libata pps_core crc32c_intel nvme i2c_algo_bit i2c_core nvme_core [last unloaded: cxgb4]
> [ 1002.305563] CPU: 4 PID: 5156 Comm: fio Tainted: G O 4.15.0-rc4 #3
> [ 1002.313223] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
> [ 1002.320555] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.326077] RSP: 0018:ffffc900070cbd10 EFLAGS: 00010046
> [ 1002.331692] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
> [ 1002.339225] RDX: 0000000000000001 RSI: ffff88085fd0e038 RDI: 0000000000000000
> [ 1002.346763] RBP: ffff880855a65f18 R08: 0000000000000000 R09: 0000000000000744
> [ 1002.354315] R10: 00000000000003ff R11: 0000000000000001 R12: ffff88084992e180
> [ 1002.361873] R13: ffff880855a67000 R14: ffff880855a65800 R15: ffff880856d7d5a8
> [ 1002.369447] FS: 0000000000000000(0000) GS:ffff88085fd00000(0000) knlGS:0000000000000000
> [ 1002.377995] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1002.384209] CR2: 0000000000000000 CR3: 0000000001c09005 CR4: 00000000000606e0
> [ 1002.391826] Call Trace:
> [ 1002.394774] scsi_device_dev_release_usercontext+0x40/0x230
> [ 1002.400858] execute_in_process_context+0x58/0x60
> [ 1002.406085] device_release+0x2d/0x80
> [ 1002.410277] kobject_cleanup+0x5e/0x180
> [ 1002.414659] scsi_disk_put+0x2b/0x40 [sd_mod]
> [ 1002.419559] __blkdev_put+0x1b5/0x1d0
> [ 1002.423777] ? disk_flush_events+0x24/0x60
> [ 1002.428430] blkdev_close+0x21/0x30
> [ 1002.432484] __fput+0xd5/0x210
> [ 1002.436111] task_work_run+0x82/0xa0
> [ 1002.440262] do_exit+0x2be/0xb20
> [ 1002.444074] ? syscall_trace_enter+0x1af/0x290
> [ 1002.449110] do_group_exit+0x39/0xa0
> [ 1002.453287] SyS_exit_group+0x10/0x10
> [ 1002.457557] do_syscall_64+0x61/0x1a0
> [ 1002.461829] entry_SYSCALL64_slow_path+0x25/0x25
> [ 1002.467064] RIP: 0033:0x7f9abb1c8529
> [ 1002.471266] RSP: 002b:00007ffe53be40d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
> [ 1002.479482] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f9abb1c8529
> [ 1002.487279] RDX: 0000000000000005 RSI: 000000000000000a RDI: 0000000000000005
> [ 1002.495079] RBP: 00007f9a9c9de818 R08: 000000000000003c R09: 00000000000000e7
> [ 1002.502882] R10: ffffffffffffff60 R11: 0000000000000206 R12: 0000000000000006
> [ 1002.510690] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000172a440
> [ 1002.518497] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 06 9e ff eb
> [ 1002.538742] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc900070cbd10
> [ 1002.546055] CR2: 0000000000000000
* RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1
2017-12-19 20:20 ` Ewan D. Milne
@ 2017-12-20 19:05 ` Steve Wise
0 siblings, 0 replies; 6+ messages in thread
From: Steve Wise @ 2017-12-20 19:05 UTC
To: emilne; +Cc: linux-scsi, yanaijie, Bart.VanAssche
> > Hey Ewan, Yan, Bart,
> >
> > I'm still seeing this issue with 4.15-rc4. Is the issue still outstanding?
> >
> > Steve.
> >
>
> Please apply the following commit from the 4.15/scsi-fixes branch of
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
>
> and advise if it does not fix your issue. It should.
This seems to resolve my issue. Thanks!
If you want, you can add a Tested-by: from me.
Steve.