* 2.6.38-rc2+ tcm_mvsas kernel oops
@ 2011-01-30 19:02 Fubo Chen
2011-01-30 21:34 ` Nicholas A. Bellinger
0 siblings, 1 reply; 8+ messages in thread
From: Fubo Chen @ 2011-01-30 19:02 UTC (permalink / raw)
To: Nicholas A. Bellinger; +Cc: linux-scsi
Hello,
Today I did what I should have done before: try to load and unload
tcm_mvsas kernel module. Surprised to see that this triggered kernel
oops. Did I make stupid mistake ?
What I did:
# rm -rf drivers/target/tcm_mvsas
# cd Documentation/target
# { echo yes; echo yes; } | ./tcm_mod_builder.py -m tcm_mvsas -p SAS
# cd ../..
# echo m | make oldconfig
# make prepare
# make M=drivers/target/tcm_mvsas modules modules_install
# modprobe tcm_mvsas
# rmmod tcm_mvsas
# rmmod target_core_mod
Segmentation fault
>From console:
<<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>>
Initialized struct target_fabric_configfs: ffff880027680000 for mvsas
<<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>>
TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs
<<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>>
Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas
<<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>>
TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs
general protection fault: 0000 [#1] SMP
last sysfs file:
/sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent
CPU 0
Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse
serio_raw shpchp i2c_piix4 mptspi mptscsih mptbase scsi_transport_spi
e1000 floppy [last unloaded: tcm_mvsas]
Pid: 2346, comm: rmmod Not tainted 2.6.38-rc2+
RIP: 0010:[<ffffffff810946a4>] [<ffffffff810946a4>] __lock_acquire+0x64/0x1510
RSP: 0018:ffff8800275cdb18 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff8800275cdbe8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002cb4a350
FS: 00007f9238be4700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f92386d1fc0 CR3: 0000000027560000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2346, threadinfo ffff8800275cc000, task ffff88002cb4a350)
Stack:
0000000000000004 ffff88002cb4a350 ffffffff82033ee0 ffffffff81010dfd
ffff8800275cdb68 ffffffff81ed1590 ffff8800275cdb68 0000000000000000
32a19d8cf067a674 ffff88002cb4ab08 ffff8800275cdc48 0000000000000002
Call Trace:
[<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50
[<ffffffff81095bf0>] lock_acquire+0xa0/0x150
[<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs]
[<ffffffff81545a04>] ? __mutex_lock_common+0x2a4/0x3e0
[<ffffffffa0147004>] ? detach_groups+0xa4/0x120 [configfs]
[<ffffffff815472b6>] _raw_spin_lock+0x36/0x70
[<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs]
[<ffffffffa0146f8f>] detach_groups+0x2f/0x120 [configfs]
[<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs]
[<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs]
[<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs]
[<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs]
[<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs]
[<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs]
[<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs]
[<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs]
[<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs]
[<ffffffffa0147122>] configfs_unregister_subsystem+0xa2/0x130 [configfs]
[<ffffffffa014fc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod]
[<ffffffff810a0a32>] sys_delete_module+0x1a2/0x280
[<ffffffff81547019>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81002f82>] system_call_fastpath+0x16/0x1b
Code: 8b 05 a1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45
85 c0 0f 84 4a 04 00 00 8b 3d 08 96 cd 00 85 ff 0f 84 5c 04 00 00 <48>
81 3b 20 15 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86
RIP [<ffffffff810946a4>] __lock_acquire+0x64/0x1510
RSP <ffff8800275cdb18>
---[ end trace f4ddfaa61a61623b ]---
Thanks for all help.
Fubo.
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-01-30 19:02 2.6.38-rc2+ tcm_mvsas kernel oops Fubo Chen @ 2011-01-30 21:34 ` Nicholas A. Bellinger 2011-01-31 17:21 ` Fubo Chen 0 siblings, 1 reply; 8+ messages in thread From: Nicholas A. Bellinger @ 2011-01-30 21:34 UTC (permalink / raw) To: Fubo Chen; +Cc: linux-scsi On Sun, 2011-01-30 at 20:02 +0100, Fubo Chen wrote: > Hello, > > Today I did what I should have done before: try to load and unload > tcm_mvsas kernel module. Surprised to see that this triggered kernel > oops. Did I make stupid mistake ? > Hi Fubo, > What I did: > > # rm -rf drivers/target/tcm_mvsas > # cd Documentation/target > # { echo yes; echo yes; } | ./tcm_mod_builder.py -m tcm_mvsas -p SAS > # cd ../.. > # echo m | make oldconfig > # make prepare FYI, you do not need to be calling make oldconfig + prepare each time to rebuild a single fabric module like tcm_mvsas.ko > # make M=drivers/target/tcm_mvsas modules modules_install > # modprobe tcm_mvsas What happened to 'modprobe target_core_mod' before loading tcm_mvsas..? Typically if you are running 'make oldconfig' and change your .config, you need to be running a matched set of modules, and not something that was potentially built from a different .config. > # rmmod tcm_mvsas > # rmmod target_core_mod > Segmentation fault > > >From console: > > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > Initialized struct target_fabric_configfs: ffff880027680000 for mvsas > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs > general protection fault: 0000 [#1] SMP > last sysfs file: > /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent > CPU 0 > Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp > libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse > serio_raw shpchp i2c_piix4 mptspi mptscsih mptbase scsi_transport_spi > e1000 floppy [last unloaded: tcm_mvsas] > > Pid: 2346, comm: rmmod Not tainted 2.6.38-rc2+ > RIP: 0010:[<ffffffff810946a4>] [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 > RSP: 0018:ffff8800275cdb18 EFLAGS: 00010046 > RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffff8800275cdbe8 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002cb4a350 > FS: 00007f9238be4700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00007f92386d1fc0 CR3: 0000000027560000 CR4: 00000000000006f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process rmmod (pid: 2346, threadinfo ffff8800275cc000, task ffff88002cb4a350) > Stack: > 0000000000000004 ffff88002cb4a350 ffffffff82033ee0 ffffffff81010dfd > ffff8800275cdb68 ffffffff81ed1590 ffff8800275cdb68 0000000000000000 > 32a19d8cf067a674 ffff88002cb4ab08 ffff8800275cdc48 0000000000000002 > Call Trace: > [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 > [<ffffffff81095bf0>] lock_acquire+0xa0/0x150 > [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] > [<ffffffff81545a04>] ? __mutex_lock_common+0x2a4/0x3e0 > [<ffffffffa0147004>] ? detach_groups+0xa4/0x120 [configfs] > [<ffffffff815472b6>] _raw_spin_lock+0x36/0x70 > [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] > [<ffffffffa0146f8f>] detach_groups+0x2f/0x120 [configfs] > [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa0147122>] configfs_unregister_subsystem+0xa2/0x130 [configfs] > [<ffffffffa014fc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] > [<ffffffff810a0a32>] sys_delete_module+0x1a2/0x280 > [<ffffffff81547019>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b > Code: 8b 05 a1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 > 85 c0 0f 84 4a 04 00 00 8b 3d 08 96 cd 00 85 ff 0f 84 5c 04 00 00 <48> > 81 3b 20 15 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 > RIP [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 > RSP <ffff8800275cdb18> > ---[ end trace f4ddfaa61a61623b ]--- > > Ok, just to verify. I have tried a couple varitions of the following after generating a fresh 'tcm_mvsas' fabric skeleton on lio-core-2.6.git/linus-38-rc2: while [ 1 ]; do modprobe target_core_mod ; sleep 1 ; modprobe tcm_mvsas ; rmmod tcm_mvsas ; rmmod target_core_mod; done and nothing out of the ordinary appers with .38-rc2 target code on x86_64 VM while this runs so far.. Did something change in your .config between the running target_core_mod and newly built tcm_mvsas.ko that could cause a GFP like this..? Please verify your 'rmmod tcm_mvsas' test with a single set of .config options and rebuild + reboot with: make clean ; make bzImage ; make modules ; make modules_install ; make install Thanks, --nab ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-01-30 21:34 ` Nicholas A. Bellinger @ 2011-01-31 17:21 ` Fubo Chen 2011-01-31 20:55 ` Nicholas A. Bellinger 0 siblings, 1 reply; 8+ messages in thread From: Fubo Chen @ 2011-01-31 17:21 UTC (permalink / raw) To: Nicholas A. Bellinger; +Cc: linux-scsi On Sun, Jan 30, 2011 at 10:34 PM, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote: > On Sun, 2011-01-30 at 20:02 +0100, Fubo Chen wrote: >> [ ... ] >> # make M=drivers/target/tcm_mvsas modules modules_install >> # modprobe tcm_mvsas > > What happened to 'modprobe target_core_mod' before loading tcm_mvsas..? 'modprobe tcm_mvsas' loads target_core_mod automatically as far as i know ? > Typically if you are running 'make oldconfig' and change your .config, > you need to be running a matched set of modules, and not something that > was potentially built from a different .config. > >> # rmmod tcm_mvsas >> # rmmod target_core_mod >> Segmentation fault >> >> >From console: >> >> <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> Initialized struct target_fabric_configfs: ffff880027680000 for mvsas >> <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs >> <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas >> <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs >> general protection fault: 0000 [#1] SMP >> last sysfs file: >> /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent >> CPU 0 >> Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp >> libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse >> serio_raw shpchp i2c_piix4 mptspi mptscsih mptbase scsi_transport_spi >> e1000 floppy [last unloaded: tcm_mvsas] >> >> Pid: 2346, comm: rmmod Not tainted 2.6.38-rc2+ >> RIP: 0010:[<ffffffff810946a4>] [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 >> RSP: 0018:ffff8800275cdb18 EFLAGS: 00010046 >> RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 >> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >> RBP: ffff8800275cdbe8 R08: 0000000000000001 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 >> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002cb4a350 >> FS: 00007f9238be4700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> CR2: 00007f92386d1fc0 CR3: 0000000027560000 CR4: 00000000000006f0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> Process rmmod (pid: 2346, threadinfo ffff8800275cc000, task ffff88002cb4a350) >> Stack: >> 0000000000000004 ffff88002cb4a350 ffffffff82033ee0 ffffffff81010dfd >> ffff8800275cdb68 ffffffff81ed1590 ffff8800275cdb68 0000000000000000 >> 32a19d8cf067a674 ffff88002cb4ab08 ffff8800275cdc48 0000000000000002 >> Call Trace: >> [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 >> [<ffffffff81095bf0>] lock_acquire+0xa0/0x150 >> [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] >> [<ffffffff81545a04>] ? __mutex_lock_common+0x2a4/0x3e0 >> [<ffffffffa0147004>] ? detach_groups+0xa4/0x120 [configfs] >> [<ffffffff815472b6>] _raw_spin_lock+0x36/0x70 >> [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] >> [<ffffffffa0146f8f>] detach_groups+0x2f/0x120 [configfs] >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] >> [<ffffffffa0147122>] configfs_unregister_subsystem+0xa2/0x130 [configfs] >> [<ffffffffa014fc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] >> [<ffffffff810a0a32>] sys_delete_module+0x1a2/0x280 >> [<ffffffff81547019>] ? trace_hardirqs_on_thunk+0x3a/0x3f >> [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b >> Code: 8b 05 a1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 >> 85 c0 0f 84 4a 04 00 00 8b 3d 08 96 cd 00 85 ff 0f 84 5c 04 00 00 <48> >> 81 3b 20 15 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 >> RIP [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 >> RSP <ffff8800275cdb18> >> ---[ end trace f4ddfaa61a61623b ]--- >> >> > > Ok, just to verify. I have tried a couple varitions of the following > after generating a fresh 'tcm_mvsas' fabric skeleton on > lio-core-2.6.git/linus-38-rc2: > > while [ 1 ]; do modprobe target_core_mod ; sleep 1 ; modprobe > tcm_mvsas ; rmmod tcm_mvsas ; rmmod target_core_mod; done > > and nothing out of the ordinary appers with .38-rc2 target code on > x86_64 VM while this runs so far.. > > Did something change in your .config between the running target_core_mod > and newly built tcm_mvsas.ko that could cause a GFP like this..? > > Please verify your 'rmmod tcm_mvsas' test with a single set of .config > options and rebuild + reboot with: > > make clean ; make bzImage ; make modules ; make modules_install ; make install Thanks for hint. I have rebuilt kernel but unfortunately crash still occurs. Maybe it's because I have enabled SLUB poisoning ? Fubo. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-01-31 17:21 ` Fubo Chen @ 2011-01-31 20:55 ` Nicholas A. Bellinger 2011-02-01 17:55 ` Fubo Chen 0 siblings, 1 reply; 8+ messages in thread From: Nicholas A. Bellinger @ 2011-01-31 20:55 UTC (permalink / raw) To: Fubo Chen; +Cc: linux-scsi On Mon, 2011-01-31 at 18:21 +0100, Fubo Chen wrote: > On Sun, Jan 30, 2011 at 10:34 PM, Nicholas A. Bellinger > <nab@linux-iscsi.org> wrote: > > On Sun, 2011-01-30 at 20:02 +0100, Fubo Chen wrote: > >> [ ... ] > >> # make M=drivers/target/tcm_mvsas modules modules_install > >> # modprobe tcm_mvsas > > > > What happened to 'modprobe target_core_mod' before loading tcm_mvsas..? > > 'modprobe tcm_mvsas' loads target_core_mod automatically as far as i know ? > Hmmm, yes.. Typically after target_core_mod is initial loaded, and then doing a: mkdir -p /sys/kernel/config/target/$FABRIC_MOD will call request_module() based on known module names in target_core_configfs.c:target_core_register_fabric() to autoload the fabric module. But since tcm_mvsas does not have an entry there yet, this AFAIK should not many any difference. > > Typically if you are running 'make oldconfig' and change your .config, > > you need to be running a matched set of modules, and not something that > > was potentially built from a different .config. > > > >> # rmmod tcm_mvsas > >> # rmmod target_core_mod > >> Segmentation fault > >> > >> >From console: > >> > >> <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > >> Initialized struct target_fabric_configfs: ffff880027680000 for mvsas > >> <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > >> TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs > >> <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > >> Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas > >> <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > >> TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs > >> general protection fault: 0000 [#1] SMP > >> last sysfs file: > >> /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent > >> CPU 0 > >> Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp > >> libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse > >> serio_raw shpchp i2c_piix4 mptspi mptscsih mptbase scsi_transport_spi > >> e1000 floppy [last unloaded: tcm_mvsas] > >> > >> Pid: 2346, comm: rmmod Not tainted 2.6.38-rc2+ > >> RIP: 0010:[<ffffffff810946a4>] [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 > >> RSP: 0018:ffff8800275cdb18 EFLAGS: 00010046 > >> RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 > >> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > >> RBP: ffff8800275cdbe8 R08: 0000000000000001 R09: 0000000000000000 > >> R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 > >> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002cb4a350 > >> FS: 00007f9238be4700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 > >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > >> CR2: 00007f92386d1fc0 CR3: 0000000027560000 CR4: 00000000000006f0 > >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > >> Process rmmod (pid: 2346, threadinfo ffff8800275cc000, task ffff88002cb4a350) > >> Stack: > >> 0000000000000004 ffff88002cb4a350 ffffffff82033ee0 ffffffff81010dfd > >> ffff8800275cdb68 ffffffff81ed1590 ffff8800275cdb68 0000000000000000 > >> 32a19d8cf067a674 ffff88002cb4ab08 ffff8800275cdc48 0000000000000002 > >> Call Trace: > >> [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 > >> [<ffffffff81095bf0>] lock_acquire+0xa0/0x150 > >> [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] > >> [<ffffffff81545a04>] ? __mutex_lock_common+0x2a4/0x3e0 > >> [<ffffffffa0147004>] ? detach_groups+0xa4/0x120 [configfs] > >> [<ffffffff815472b6>] _raw_spin_lock+0x36/0x70 > >> [<ffffffffa0146f8f>] ? detach_groups+0x2f/0x120 [configfs] > >> [<ffffffffa0146f8f>] detach_groups+0x2f/0x120 [configfs] > >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > >> [<ffffffffa0147012>] detach_groups+0xb2/0x120 [configfs] > >> [<ffffffffa0146f46>] configfs_detach_group+0x16/0x30 [configfs] > >> [<ffffffffa0147122>] configfs_unregister_subsystem+0xa2/0x130 [configfs] > >> [<ffffffffa014fc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] > >> [<ffffffff810a0a32>] sys_delete_module+0x1a2/0x280 > >> [<ffffffff81547019>] ? trace_hardirqs_on_thunk+0x3a/0x3f > >> [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b > >> Code: 8b 05 a1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 > >> 85 c0 0f 84 4a 04 00 00 8b 3d 08 96 cd 00 85 ff 0f 84 5c 04 00 00 <48> > >> 81 3b 20 15 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 > >> RIP [<ffffffff810946a4>] __lock_acquire+0x64/0x1510 > >> RSP <ffff8800275cdb18> > >> ---[ end trace f4ddfaa61a61623b ]--- > >> > >> > > > > Ok, just to verify. I have tried a couple varitions of the following > > after generating a fresh 'tcm_mvsas' fabric skeleton on > > lio-core-2.6.git/linus-38-rc2: > > > > while [ 1 ]; do modprobe target_core_mod ; sleep 1 ; modprobe > > tcm_mvsas ; rmmod tcm_mvsas ; rmmod target_core_mod; done > > > > and nothing out of the ordinary appers with .38-rc2 target code on > > x86_64 VM while this runs so far.. > > > > Did something change in your .config between the running target_core_mod > > and newly built tcm_mvsas.ko that could cause a GFP like this..? > > > > Please verify your 'rmmod tcm_mvsas' test with a single set of .config > > options and rebuild + reboot with: > > > > make clean ; make bzImage ; make modules ; make modules_install ; make install > > Thanks for hint. I have rebuilt kernel but unfortunately crash still > occurs. Maybe it's because I have enabled SLUB poisoning ? > Hmmm, I don't see how this would make a difference, and FYI the above test loops for 'rmmod tcm_mvsas' where running with slub_debug=FZ w/o issue. Well, if you are certain things are working fine on .37-FINAL, you can try using 'git bisect' from a known working LIO .37 commit and build +test until you locate an offending commit. But again, this appears to be working in lio-core-2.6.git/linus-38-rc2, please verify this is what is being tested..? --nab ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-01-31 20:55 ` Nicholas A. Bellinger @ 2011-02-01 17:55 ` Fubo Chen 2011-02-02 3:01 ` Nicholas A. Bellinger 0 siblings, 1 reply; 8+ messages in thread From: Fubo Chen @ 2011-02-01 17:55 UTC (permalink / raw) To: Nicholas A. Bellinger; +Cc: linux-scsi On Mon, Jan 31, 2011 at 9:55 PM, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote: > [ ... ] > > Hmmm, I don't see how this would make a difference, and FYI the above > test loops for 'rmmod tcm_mvsas' where running with slub_debug=FZ w/o > issue. > > Well, if you are certain things are working fine on .37-FINAL, you can > try using 'git bisect' from a known working LIO .37 commit and build > +test until you locate an offending commit. > > But again, this appears to be working in lio-core-2.6.git/linus-38-rc2, > please verify this is what is being tested..? Thanks for looking at this. This is what I get with v2.6.38-rc2, tcm_mvsas and slub poisoning: # cat /proc/cmdline BOOT_IMAGE=/boot/vmlinuz-2.6.38-rc2 root=UUID=c2d91556-8ed3-4a2a-95d9-50d0203bcfcc ro quiet splash slub_debug=FPUZ # modprobe tcm_mvsas # rmmod tcm_mvsas # rmmod target_core_mod Segmentation fault and on the console: <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> Initialized struct target_fabric_configfs: ffff880025e09090 for mvsas <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent CPU 0 Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse serio_raw i2c_piix4 shpchp mptspi mptscsih e1000 mptbase scsi_transport_spi floppy [last unloaded: tcm_mvsas] Pid: 1432, comm: rmmod Not tainted 2.6.38-rc2 #4 440BX Desktop Reference Platform/VMware Virtual Platform RIP: 0010:[<ffffffff81094684>] [<ffffffff81094684>] __lock_acquire+0x64/0x1510 RSP: 0018:ffff880022697b18 EFLAGS: 00010046 RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff880022697be8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002a1a2350 FS: 00007f844069c700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f8440189fc0 CR3: 0000000025d67000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process rmmod (pid: 1432, threadinfo ffff880022696000, task ffff88002a1a2350) Stack: 0000000000000004 ffff88002a1a2350 ffffffff82030820 ffffffff81010dfd ffff880022697b68 ffffffff81ed0590 ffff880022697b68 0000000000000000 3161938ca065261c ffff88002a1a2b08 ffff880022697c48 0000000000000002 Call Trace: [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 [<ffffffff81095bd0>] lock_acquire+0xa0/0x150 [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] [<ffffffff81540f44>] ? __mutex_lock_common+0x2a4/0x3e0 [<ffffffffa00e5ff4>] ? detach_groups+0xa4/0x120 [configfs] [<ffffffff815427f6>] _raw_spin_lock+0x36/0x70 [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] [<ffffffffa00e5f7f>] detach_groups+0x2f/0x120 [configfs] [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] [<ffffffffa00e6112>] configfs_unregister_subsystem+0xa2/0x130 [configfs] [<ffffffffa00efc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] [<ffffffff810a0a12>] sys_delete_module+0x1a2/0x280 [<ffffffff81542559>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b Code: 8b 05 c1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 85 c0 0f 84 4a 04 00 00 8b 3d 28 86 cd 00 85 ff 0f 84 5c 04 00 00 <48> 81 3b 20 05 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 RIP [<ffffffff81094684>] __lock_acquire+0x64/0x1510 RSP <ffff880022697b18> ---[ end trace 4abcf014267c1c85 ]--- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-02-01 17:55 ` Fubo Chen @ 2011-02-02 3:01 ` Nicholas A. Bellinger 2011-02-02 4:46 ` Nicholas A. Bellinger 0 siblings, 1 reply; 8+ messages in thread From: Nicholas A. Bellinger @ 2011-02-02 3:01 UTC (permalink / raw) To: Fubo Chen; +Cc: linux-scsi, Joel Becker On Tue, 2011-02-01 at 18:55 +0100, Fubo Chen wrote: > On Mon, Jan 31, 2011 at 9:55 PM, Nicholas A. Bellinger > <nab@linux-iscsi.org> wrote: > > [ ... ] > > > > Hmmm, I don't see how this would make a difference, and FYI the above > > test loops for 'rmmod tcm_mvsas' where running with slub_debug=FZ w/o > > issue. > > > > Well, if you are certain things are working fine on .37-FINAL, you can > > try using 'git bisect' from a known working LIO .37 commit and build > > +test until you locate an offending commit. > > > > But again, this appears to be working in lio-core-2.6.git/linus-38-rc2, > > please verify this is what is being tested..? > > Thanks for looking at this. This is what I get with v2.6.38-rc2, > tcm_mvsas and slub poisoning: > > # cat /proc/cmdline > BOOT_IMAGE=/boot/vmlinuz-2.6.38-rc2 > root=UUID=c2d91556-8ed3-4a2a-95d9-50d0203bcfcc ro quiet splash > slub_debug=FPUZ > # modprobe tcm_mvsas > # rmmod tcm_mvsas > # rmmod target_core_mod > Segmentation fault > Thanks for this info.. I am now able to reproduce w/ .38-rc2 using slub_debug=FPUZ.. (More below) > and on the console: > > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > Initialized struct target_fabric_configfs: ffff880025e09090 for mvsas > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs > general protection fault: 0000 [#1] SMP > last sysfs file: > /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent > CPU 0 > Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp > libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse > serio_raw i2c_piix4 shpchp mptspi mptscsih e1000 mptbase > scsi_transport_spi floppy [last unloaded: tcm_mvsas] > > Pid: 1432, comm: rmmod Not tainted 2.6.38-rc2 #4 440BX Desktop > Reference Platform/VMware Virtual Platform > RIP: 0010:[<ffffffff81094684>] [<ffffffff81094684>] __lock_acquire+0x64/0x1510 > RSP: 0018:ffff880022697b18 EFLAGS: 00010046 > RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffff880022697be8 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002a1a2350 > FS: 00007f844069c700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00007f8440189fc0 CR3: 0000000025d67000 CR4: 00000000000006f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process rmmod (pid: 1432, threadinfo ffff880022696000, task ffff88002a1a2350) > Stack: > 0000000000000004 ffff88002a1a2350 ffffffff82030820 ffffffff81010dfd > ffff880022697b68 ffffffff81ed0590 ffff880022697b68 0000000000000000 > 3161938ca065261c ffff88002a1a2b08 ffff880022697c48 0000000000000002 > Call Trace: > [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 > [<ffffffff81095bd0>] lock_acquire+0xa0/0x150 > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] > [<ffffffff81540f44>] ? __mutex_lock_common+0x2a4/0x3e0 > [<ffffffffa00e5ff4>] ? detach_groups+0xa4/0x120 [configfs] > [<ffffffff815427f6>] _raw_spin_lock+0x36/0x70 > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] > [<ffffffffa00e5f7f>] detach_groups+0x2f/0x120 [configfs] > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > [<ffffffffa00e6112>] configfs_unregister_subsystem+0xa2/0x130 [configfs] > [<ffffffffa00efc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] > [<ffffffff810a0a12>] sys_delete_module+0x1a2/0x280 > [<ffffffff81542559>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b > Code: 8b 05 c1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 > 85 c0 0f 84 4a 04 00 00 8b 3d 28 86 cd 00 85 ff 0f 84 5c 04 00 00 <48> > 81 3b 20 05 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 > RIP [<ffffffff81094684>] __lock_acquire+0x64/0x1510 > RSP <ffff880022697b18> > ---[ end trace 4abcf014267c1c85 ]--- > -- So this is coming from target_core_exit_configfs() -> configfs_unregister_system() from a simple 'modprobe target_core_mo ; rmmod target_core_mod' with slub_debug=FPUZ.. It appears to be related to the TCM top level struct configfs_subsystem->su_group->default_groups[], which we setup in target_core_init_configfs() and from which are released individually in target_core_exit_configfs() before calling configfs_unregister_system(). Note that target_core_exit_configfs() is following the same logic as default_groups for non struct configfs_subsystem backed groups, so I am thinking this is going to be the root culprit. After a quick test w/o the above subsys->su_group.default_groups allocation/release (and the rest of the top level cg->default_groups[] disabled), the GFP no longer appears. They appear to be coming more than a single stale struct configfs_dirent->s_children from the top level TCM default groups attached fs/configfs/dir.c:detach_groups(). (jlbec CC'ed) I am still looking at what is the expected way to handle multiple default_groups (including a default_group with children) with struct configfs_subsystem deregister() in fs/configfs/dir.c code, and will send a followup later this evening. Thanks again for your report, --nab ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-02-02 3:01 ` Nicholas A. Bellinger @ 2011-02-02 4:46 ` Nicholas A. Bellinger 2011-02-02 17:53 ` Fubo Chen 0 siblings, 1 reply; 8+ messages in thread From: Nicholas A. Bellinger @ 2011-02-02 4:46 UTC (permalink / raw) To: Fubo Chen; +Cc: linux-scsi, Joel Becker On Tue, 2011-02-01 at 19:01 -0800, Nicholas A. Bellinger wrote: > On Tue, 2011-02-01 at 18:55 +0100, Fubo Chen wrote: > > On Mon, Jan 31, 2011 at 9:55 PM, Nicholas A. Bellinger > > <nab@linux-iscsi.org> wrote: > > > [ ... ] > > > > > > Hmmm, I don't see how this would make a difference, and FYI the above > > > test loops for 'rmmod tcm_mvsas' where running with slub_debug=FZ w/o > > > issue. > > > > > > Well, if you are certain things are working fine on .37-FINAL, you can > > > try using 'git bisect' from a known working LIO .37 commit and build > > > +test until you locate an offending commit. > > > > > > But again, this appears to be working in lio-core-2.6.git/linus-38-rc2, > > > please verify this is what is being tested..? > > > > Thanks for looking at this. This is what I get with v2.6.38-rc2, > > tcm_mvsas and slub poisoning: > > > > # cat /proc/cmdline > > BOOT_IMAGE=/boot/vmlinuz-2.6.38-rc2 > > root=UUID=c2d91556-8ed3-4a2a-95d9-50d0203bcfcc ro quiet splash > > slub_debug=FPUZ > > # modprobe tcm_mvsas > > # rmmod tcm_mvsas > > # rmmod target_core_mod > > Segmentation fault > > > > Thanks for this info.. I am now able to reproduce w/ .38-rc2 using > slub_debug=FPUZ.. (More below) > > > and on the console: > > > > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > > Initialized struct target_fabric_configfs: ffff880025e09090 for mvsas > > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > > TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs > > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> > > Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas > > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> > > TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs > > general protection fault: 0000 [#1] SMP > > last sysfs file: > > /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent > > CPU 0 > > Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp > > libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse > > serio_raw i2c_piix4 shpchp mptspi mptscsih e1000 mptbase > > scsi_transport_spi floppy [last unloaded: tcm_mvsas] > > > > Pid: 1432, comm: rmmod Not tainted 2.6.38-rc2 #4 440BX Desktop > > Reference Platform/VMware Virtual Platform > > RIP: 0010:[<ffffffff81094684>] [<ffffffff81094684>] __lock_acquire+0x64/0x1510 > > RSP: 0018:ffff880022697b18 EFLAGS: 00010046 > > RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > > RBP: ffff880022697be8 R08: 0000000000000001 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 > > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002a1a2350 > > FS: 00007f844069c700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 00007f8440189fc0 CR3: 0000000025d67000 CR4: 00000000000006f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process rmmod (pid: 1432, threadinfo ffff880022696000, task ffff88002a1a2350) > > Stack: > > 0000000000000004 ffff88002a1a2350 ffffffff82030820 ffffffff81010dfd > > ffff880022697b68 ffffffff81ed0590 ffff880022697b68 0000000000000000 > > 3161938ca065261c ffff88002a1a2b08 ffff880022697c48 0000000000000002 > > Call Trace: > > [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 > > [<ffffffff81095bd0>] lock_acquire+0xa0/0x150 > > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] > > [<ffffffff81540f44>] ? __mutex_lock_common+0x2a4/0x3e0 > > [<ffffffffa00e5ff4>] ? detach_groups+0xa4/0x120 [configfs] > > [<ffffffff815427f6>] _raw_spin_lock+0x36/0x70 > > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] > > [<ffffffffa00e5f7f>] detach_groups+0x2f/0x120 [configfs] > > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] > > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] > > [<ffffffffa00e6112>] configfs_unregister_subsystem+0xa2/0x130 [configfs] > > [<ffffffffa00efc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] > > [<ffffffff810a0a12>] sys_delete_module+0x1a2/0x280 > > [<ffffffff81542559>] ? trace_hardirqs_on_thunk+0x3a/0x3f > > [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b > > Code: 8b 05 c1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 > > 85 c0 0f 84 4a 04 00 00 8b 3d 28 86 cd 00 85 ff 0f 84 5c 04 00 00 <48> > > 81 3b 20 05 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 > > RIP [<ffffffff81094684>] __lock_acquire+0x64/0x1510 > > RSP <ffff880022697b18> > > ---[ end trace 4abcf014267c1c85 ]--- > > -- > > So this is coming from target_core_exit_configfs() -> > configfs_unregister_system() from a simple 'modprobe target_core_mo ; > rmmod target_core_mod' with slub_debug=FPUZ.. > > It appears to be related to the TCM top level struct > configfs_subsystem->su_group->default_groups[], which we setup in > target_core_init_configfs() and from which are released individually in > target_core_exit_configfs() before calling configfs_unregister_system(). > > Note that target_core_exit_configfs() is following the same logic as > default_groups for non struct configfs_subsystem backed groups, so I am > thinking this is going to be the root culprit. > > After a quick test w/o the above subsys->su_group.default_groups > allocation/release (and the rest of the top level cg->default_groups[] > disabled), the GFP no longer appears. They appear to be coming more > than a single stale struct configfs_dirent->s_children from the top > level TCM default groups attached fs/configfs/dir.c:detach_groups(). > (jlbec CC'ed) > > I am still looking at what is the expected way to handle multiple > default_groups (including a default_group with children) with struct > configfs_subsystem deregister() in fs/configfs/dir.c code, and will send > a followup later this evening. > > Thanks again for your report, > Ok, after some more research and testing there appears to be two issues in target_core_exit_configfs() wrt to default groups. First, the call to configfs_unregister_subsystem() is expected to drain top level struct configfs_subsystem->su_group.default_groups[] in fs/configfs/dir.c: configfs_unregister_subsystem() -> unlink_group(), and not directly by the configfs consumer. These second issue is core_alua_free_lu_gp(se_global->default_lu_gp) releasing default_lu_gp->lun_group before lu_gp_cg->default_groups is drained. Here the change that is now resolving the issue on my end with .38-rc2 using slub_debug=FPUZ, and I will send out a proper patch for lio-core-2.6.git/linus-38-rc2 shortly.. Please verify this works for you. Thanks! --nab diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c index 9ff1942..7d7dfbc 100644 --- a/drivers/target/target_core_configfs.c +++ b/drivers/target/target_core_configfs.c @@ -3262,8 +3262,7 @@ static void target_core_exit_configfs(void) config_item_put(item); } kfree(lu_gp_cg->default_groups); - core_alua_free_lu_gp(se_global->default_lu_gp); - se_global->default_lu_gp = NULL; + lu_gp_cg->default_groups = NULL; alua_cg = &se_global->alua_group; for (i = 0; alua_cg->default_groups[i]; i++) { @@ -3272,6 +3271,7 @@ static void target_core_exit_configfs(void) config_item_put(item); } kfree(alua_cg->default_groups); + alua_cg->default_groups = NULL; hba_cg = &se_global->target_core_hbagroup; for (i = 0; hba_cg->default_groups[i]; i++) { @@ -3280,15 +3280,17 @@ static void target_core_exit_configfs(void) config_item_put(item); } kfree(hba_cg->default_groups); - - for (i = 0; subsys->su_group.default_groups[i]; i++) { - item = &subsys->su_group.default_groups[i]->cg_item; - subsys->su_group.default_groups[i] = NULL; - config_item_put(item); - } + hba_cg->default_groups = NULL; + /* + * We expect subsys->su_group.default_groups to be released + * by configfs subsystem provider logic.. + */ + configfs_unregister_subsystem(subsys); kfree(subsys->su_group.default_groups); - configfs_unregister_subsystem(subsys); + core_alua_free_lu_gp(se_global->default_lu_gp); + se_global->default_lu_gp = NULL; + printk(KERN_INFO "TARGET_CORE[0]: Released ConfigFS Fabric" " Infrastructure\n"); ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: 2.6.38-rc2+ tcm_mvsas kernel oops 2011-02-02 4:46 ` Nicholas A. Bellinger @ 2011-02-02 17:53 ` Fubo Chen 0 siblings, 0 replies; 8+ messages in thread From: Fubo Chen @ 2011-02-02 17:53 UTC (permalink / raw) To: Nicholas A. Bellinger; +Cc: linux-scsi, Joel Becker On Wed, Feb 2, 2011 at 5:46 AM, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote: > On Tue, 2011-02-01 at 19:01 -0800, Nicholas A. Bellinger wrote: >> On Tue, 2011-02-01 at 18:55 +0100, Fubo Chen wrote: >> > On Mon, Jan 31, 2011 at 9:55 PM, Nicholas A. Bellinger >> > <nab@linux-iscsi.org> wrote: >> > > [ ... ] >> > > >> > > Hmmm, I don't see how this would make a difference, and FYI the above >> > > test loops for 'rmmod tcm_mvsas' where running with slub_debug=FZ w/o >> > > issue. >> > > >> > > Well, if you are certain things are working fine on .37-FINAL, you can >> > > try using 'git bisect' from a known working LIO .37 commit and build >> > > +test until you locate an offending commit. >> > > >> > > But again, this appears to be working in lio-core-2.6.git/linus-38-rc2, >> > > please verify this is what is being tested..? >> > >> > Thanks for looking at this. This is what I get with v2.6.38-rc2, >> > tcm_mvsas and slub poisoning: >> > >> > # cat /proc/cmdline >> > BOOT_IMAGE=/boot/vmlinuz-2.6.38-rc2 >> > root=UUID=c2d91556-8ed3-4a2a-95d9-50d0203bcfcc ro quiet splash >> > slub_debug=FPUZ >> > # modprobe tcm_mvsas >> > # rmmod tcm_mvsas >> > # rmmod target_core_mod >> > Segmentation fault >> > >> >> Thanks for this info.. I am now able to reproduce w/ .38-rc2 using >> slub_debug=FPUZ.. (More below) >> >> > and on the console: >> > >> > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> > Initialized struct target_fabric_configfs: ffff880025e09090 for mvsas >> > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> > TCM_MVSAS[0] - Set fabric -> tcm_mvsas_fabric_configfs >> > <<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> > Target_Core_ConfigFS: DEREGISTER -> Releasing tf: mvsas >> > <<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>>>>>> >> > TCM_MVSAS[0] - Cleared tcm_mvsas_fabric_configfs >> > general protection fault: 0000 [#1] SMP >> > last sysfs file: >> > /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/usb1/1-0:1.0/uevent >> > CPU 0 >> > Modules linked in: target_core_mod(-) configfs netconsole iscsi_tcp >> > libiscsi_tcp libiscsi scsi_transport_iscsi binfmt_misc psmouse >> > serio_raw i2c_piix4 shpchp mptspi mptscsih e1000 mptbase >> > scsi_transport_spi floppy [last unloaded: tcm_mvsas] >> > >> > Pid: 1432, comm: rmmod Not tainted 2.6.38-rc2 #4 440BX Desktop >> > Reference Platform/VMware Virtual Platform >> > RIP: 0010:[<ffffffff81094684>] [<ffffffff81094684>] __lock_acquire+0x64/0x1510 >> > RSP: 0018:ffff880022697b18 EFLAGS: 00010046 >> > RAX: 0000000000000046 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000 >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >> > RBP: ffff880022697be8 R08: 0000000000000001 R09: 0000000000000000 >> > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002 >> > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88002a1a2350 >> > FS: 00007f844069c700(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000 >> > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> > CR2: 00007f8440189fc0 CR3: 0000000025d67000 CR4: 00000000000006f0 >> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> > Process rmmod (pid: 1432, threadinfo ffff880022696000, task ffff88002a1a2350) >> > Stack: >> > 0000000000000004 ffff88002a1a2350 ffffffff82030820 ffffffff81010dfd >> > ffff880022697b68 ffffffff81ed0590 ffff880022697b68 0000000000000000 >> > 3161938ca065261c ffff88002a1a2b08 ffff880022697c48 0000000000000002 >> > Call Trace: >> > [<ffffffff81010dfd>] ? save_stack_trace+0x2d/0x50 >> > [<ffffffff81095bd0>] lock_acquire+0xa0/0x150 >> > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] >> > [<ffffffff81540f44>] ? __mutex_lock_common+0x2a4/0x3e0 >> > [<ffffffffa00e5ff4>] ? detach_groups+0xa4/0x120 [configfs] >> > [<ffffffff815427f6>] _raw_spin_lock+0x36/0x70 >> > [<ffffffffa00e5f7f>] ? detach_groups+0x2f/0x120 [configfs] >> > [<ffffffffa00e5f7f>] detach_groups+0x2f/0x120 [configfs] >> > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] >> > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] >> > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] >> > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] >> > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] >> > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] >> > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] >> > [<ffffffffa00e6002>] detach_groups+0xb2/0x120 [configfs] >> > [<ffffffffa00e5f36>] configfs_detach_group+0x16/0x30 [configfs] >> > [<ffffffffa00e6112>] configfs_unregister_subsystem+0xa2/0x130 [configfs] >> > [<ffffffffa00efc84>] target_core_exit_configfs+0x184/0x1c0 [target_core_mod] >> > [<ffffffff810a0a12>] sys_delete_module+0x1a2/0x280 >> > [<ffffffff81542559>] ? trace_hardirqs_on_thunk+0x3a/0x3f >> > [<ffffffff81002f82>] system_call_fastpath+0x16/0x1b >> > Code: 8b 05 c1 64 9a 00 4c 89 75 f0 48 89 fb 41 89 d5 4c 8b 55 10 45 >> > 85 c0 0f 84 4a 04 00 00 8b 3d 28 86 cd 00 85 ff 0f 84 5c 04 00 00 <48> >> > 81 3b 20 05 dd 81 b8 01 00 00 00 44 0f 44 e0 83 fe 01 0f 86 >> > RIP [<ffffffff81094684>] __lock_acquire+0x64/0x1510 >> > RSP <ffff880022697b18> >> > ---[ end trace 4abcf014267c1c85 ]--- >> > -- >> >> So this is coming from target_core_exit_configfs() -> >> configfs_unregister_system() from a simple 'modprobe target_core_mo ; >> rmmod target_core_mod' with slub_debug=FPUZ.. >> >> It appears to be related to the TCM top level struct >> configfs_subsystem->su_group->default_groups[], which we setup in >> target_core_init_configfs() and from which are released individually in >> target_core_exit_configfs() before calling configfs_unregister_system(). >> >> Note that target_core_exit_configfs() is following the same logic as >> default_groups for non struct configfs_subsystem backed groups, so I am >> thinking this is going to be the root culprit. >> >> After a quick test w/o the above subsys->su_group.default_groups >> allocation/release (and the rest of the top level cg->default_groups[] >> disabled), the GFP no longer appears. They appear to be coming more >> than a single stale struct configfs_dirent->s_children from the top >> level TCM default groups attached fs/configfs/dir.c:detach_groups(). >> (jlbec CC'ed) >> >> I am still looking at what is the expected way to handle multiple >> default_groups (including a default_group with children) with struct >> configfs_subsystem deregister() in fs/configfs/dir.c code, and will send >> a followup later this evening. >> >> Thanks again for your report, >> > > Ok, after some more research and testing there appears to be two issues > in target_core_exit_configfs() wrt to default groups. First, the call > to configfs_unregister_subsystem() is expected to drain top level struct > configfs_subsystem->su_group.default_groups[] in fs/configfs/dir.c: > configfs_unregister_subsystem() -> unlink_group(), and not directly by > the configfs consumer. > > These second issue is core_alua_free_lu_gp(se_global->default_lu_gp) > releasing default_lu_gp->lun_group before lu_gp_cg->default_groups is > drained. > > Here the change that is now resolving the issue on my end with .38-rc2 > using slub_debug=FPUZ, and I will send out a proper patch for > lio-core-2.6.git/linus-38-rc2 shortly.. Please verify this works for > you. yes, this works forme. Thank you ! Fubo. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-02-02 17:53 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-01-30 19:02 2.6.38-rc2+ tcm_mvsas kernel oops Fubo Chen 2011-01-30 21:34 ` Nicholas A. Bellinger 2011-01-31 17:21 ` Fubo Chen 2011-01-31 20:55 ` Nicholas A. Bellinger 2011-02-01 17:55 ` Fubo Chen 2011-02-02 3:01 ` Nicholas A. Bellinger 2011-02-02 4:46 ` Nicholas A. Bellinger 2011-02-02 17:53 ` Fubo Chen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox