public inbox for linux-scsi@vger.kernel.org
* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
       [not found] <AANLkTimmb26UiBSukdNnVdxLJpCGd=QqpCw8vQoHALh-@mail.gmail.com>
@ 2011-02-09 17:28 ` Randy Dunlap
  2011-02-09 19:00   ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2011-02-09 17:28 UTC (permalink / raw)
  To: Linus Torvalds, scsi; +Cc: Linux Kernel Mailing List

x86_64, nearly allmodconfig.  No target hardware.


[  144.508473] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[  144.509901] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb6/6-1/6-1.3/devnum
[  144.512026] CPU 1 
[  144.512026] Modules linked in: target_core_mod(-) configfs af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq freq_table mperf binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod mousedev joydev evdev mac_hid snd_hda_codec_analog usbmouse snd_hda_intel snd_hda_codec usbkbd usbhid hid snd_hwdep snd_seq 8250_pnp snd_seq_device dcdbas pcspkr sr_mod i2c_i801 cdrom tg3 sg iTCO_wdt snd_pcm 8250 iTCO_vendor_support rtc_cmos serial_core rtc_core snd_timer rtc_lib snd processor button thermal_sys soundcore intel_agp hwmon snd_page_alloc intel_gtt unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base [last unloaded: microcode]
[  144.512026] 
[  144.512026] Pid: 2597, comm: rmmod Not tainted 2.6.38-rc4 #1 0TY565/OptiPlex 745                 
[  144.512026] RIP: 0010:[<ffffffff810c3e5f>]  [<ffffffff810c3e5f>] __lock_acquire+0xd8/0x4e8
[  144.512026] RSP: 0018:ffff88006df1bb78  EFLAGS: 00010006
[  144.512026] RAX: 0000000000000002 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000
[  144.512026] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6be3
[  144.512026] RBP: ffff88006df1bbd8 R08: 0000000000000001 R09: 0000000000000000
[  144.512026] R10: 0000000000000006 R11: ffffffffa06ab0ef R12: 0000000000000000
[  144.512026] R13: ffff88006dec3000 R14: 0000000000000000 R15: 0000000000000000
[  144.512026] FS:  00007fe0320d36f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
[  144.512026] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  144.512026] CR2: 0000003fadc7bf20 CR3: 000000006de4f000 CR4: 00000000000006e0
[  144.512026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  144.512026] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  144.512026] Process rmmod (pid: 2597, threadinfo ffff88006df1a000, task ffff88006dec3000)
[  144.512026] Stack:
[  144.512026]  0000000000000118 ffff8800000003a7 0000000000000005 00000000810b175b
[  144.512026]  ffff88006df1bba8 0000000000000000 ffff88006dec3000 0000000000000000
[  144.512026]  ffff88006dec3000 ffffffffa06ab0ef 0000000000000001 0000000000000000
[  144.512026] Call Trace:
[  144.512026]  [<ffffffffa06ab0ef>] ? spin_lock+0x15/0x1e [configfs]
[  144.512026]  [<ffffffff810c436f>] lock_acquire+0x100/0x150
[  144.512026]  [<ffffffffa06ab0ef>] ? spin_lock+0x15/0x1e [configfs]
[  144.512026]  [<ffffffffa06ac40f>] ? detach_groups+0x91/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac40f>] ? detach_groups+0x91/0x12e [configfs]
[  144.512026]  [<ffffffff81556300>] _raw_spin_lock+0x44/0xaf
[  144.512026]  [<ffffffffa06ab0ef>] ? spin_lock+0x15/0x1e [configfs]
[  144.512026]  [<ffffffff810c47db>] ? lock_release_nested+0xfb/0x133
[  144.512026]  [<ffffffffa06ab0ef>] spin_lock+0x15/0x1e [configfs]
[  144.512026]  [<ffffffffa06ab144>] dget+0x2e/0x56 [configfs]
[  144.512026]  [<ffffffffa06ac3a4>] detach_groups+0x26/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac363>] configfs_detach_group+0x2d/0x48 [configfs]
[  144.512026]  [<ffffffffa06ac41f>] detach_groups+0xa1/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac363>] configfs_detach_group+0x2d/0x48 [configfs]
[  144.512026]  [<ffffffffa06ac41f>] detach_groups+0xa1/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac363>] configfs_detach_group+0x2d/0x48 [configfs]
[  144.512026]  [<ffffffffa06ac41f>] detach_groups+0xa1/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac363>] configfs_detach_group+0x2d/0x48 [configfs]
[  144.512026]  [<ffffffffa06ac41f>] detach_groups+0xa1/0x12e [configfs]
[  144.512026]  [<ffffffffa06ac363>] configfs_detach_group+0x2d/0x48 [configfs]
[  144.512026]  [<ffffffffa06ace26>] configfs_unregister_subsystem+0x105/0x194 [configfs]
[  144.512026]  [<ffffffffa06baf55>] target_core_exit_configfs+0x185/0x1eb [target_core_mod]
[  144.512026]  [<ffffffff810d46a8>] sys_delete_module+0x2d6/0x368
[  144.512026]  [<ffffffff8155602d>] ? lockdep_sys_exit_thunk+0x35/0x67
[  144.512026]  [<ffffffff81555fb7>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  144.512026]  [<ffffffff8100e942>] system_call_fastpath+0x16/0x1b
[  144.512026] Code: 05 8f 32 8d 01 e8 6c b1 fb ff 48 ff 05 8b 32 8d 01 48 ff 05 8c 32 8d 01 48 ff 05 95 32 8d 01 e9 e3 03 00 00 48 ff 05 81 32 8d 01 <48> 81 3b 40 5f 26 82 75 07 48 ff 05 81 32 8d 01 83 fe 01 77 13 
[  144.512026] RIP  [<ffffffff810c3e5f>] __lock_acquire+0xd8/0x4e8
[  144.512026]  RSP <ffff88006df1bb78>
[  144.512026] ---[ end trace 37e0ba5347875330 ]---

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 17:28 ` Linux 2.6.38-rc4 (target_core: rmmod GP fault) Randy Dunlap
@ 2011-02-09 19:00   ` Linus Torvalds
  2011-02-09 20:02     ` Nicholas A. Bellinger
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2011-02-09 19:00 UTC (permalink / raw)
  To: Randy Dunlap, Nicholas Bellinger, Joel Becker, James Bottomley
  Cc: scsi, Linux Kernel Mailing List

On Wed, Feb 9, 2011 at 9:28 AM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> x86_64, nearly allmodconfig.  No target hardware.
>
>
> [  144.508473] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> [  144.509901] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb6/6-1/6-1.3/devnum
> [  144.512026] CPU 1
> [  144.512026]
> [  144.512026] Pid: 2597, comm: rmmod Not tainted 2.6.38-rc4 #1 0TY565/OptiPlex 745
> [  144.512026] RIP: 0010:[<ffffffff810c3e5f>]  [<ffffffff810c3e5f>] __lock_acquire+0xd8/0x4e8
> [  144.512026] RSP: 0018:ffff88006df1bb78  EFLAGS: 00010006
> [  144.512026] RAX: 0000000000000002 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000

The code disassembles to

   0:	8d 01                	lea    (%rcx),%eax
   2:	e8 6c b1 fb ff       	callq  0xfffffffffffbb173
   7:	48 ff 05 8b 32 8d 01 	incq   0x18d328b(%rip)        # 0x18d3299
   e:	48 ff 05 8c 32 8d 01 	incq   0x18d328c(%rip)        # 0x18d32a1
  15:	48 ff 05 95 32 8d 01 	incq   0x18d3295(%rip)        # 0x18d32b1
  1c:	e9 e3 03 00 00       	jmpq   0x404
  21:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32a9
  28:*	48 81 3b 40 5f 26 82 	cmpq   $0xffffffff82265f40,(%rbx)     <-- trapping instruction
  2f:	75 07                	jne    0x38
  31:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32b9
  38:	83 fe 01             	cmp    $0x1,%esi

and %rbx (and %rdi) contains the poison pattern for free'd memory (0x6b6b6b..).
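
For anyone decoding the register dump, here is a minimal userspace sketch (not kernel code) of the check described above. The helper names are made up for illustration. The only facts it assumes are that the slab free-poison byte is 0x6b and that, on x86_64 with 48-bit virtual addresses, dereferencing a non-canonical address raises a general protection fault instead of a page fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects (POISON_FREE in the kernel). */
#define POISON_FREE_BYTE 0x6bU

/* Hypothetical helper: a 64-bit word made of eight poison bytes. */
static uint64_t poison_word(void)
{
	uint64_t w = 0;

	for (int i = 0; i < 8; i++)
		w = (w << 8) | POISON_FREE_BYTE;
	return w;
}

/*
 * Hypothetical heuristic: true if the upper seven bytes are all poison,
 * i.e. the value looks like "poisoned pointer + small offset".
 * It misses offsets large enough to carry out of the low byte.
 */
static bool looks_like_freed_pointer(uint64_t val)
{
	return (val >> 8) == (poison_word() >> 8);
}

/* Offset that was added to the poisoned pointer; only meaningful
 * when looks_like_freed_pointer(val) is true. */
static uint64_t poison_offset(uint64_t val)
{
	return val - poison_word();
}

/* Canonical form with 48-bit virtual addresses: bits 63..47 all equal. */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

With RBX = 0x6b6b6b6b6b6b6be3 from the oops, looks_like_freed_pointer() is true, poison_offset() gives 0x78, and is_canonical_48() is false, which fits a GP fault rather than a page fault. What lives at offset 0x78 of the freed object is not something the oops alone can tell us.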

> [  144.512026] Process rmmod (pid: 2597, threadinfo ffff88006df1a000, task ffff88006dec3000)

.. and that's likely not a very commonly tested case.

> [  144.512026]  [<ffffffffa06ace26>] configfs_unregister_subsystem+0x105/0x194 [configfs]
> [  144.512026]  [<ffffffffa06baf55>] target_core_exit_configfs+0x185/0x1eb [target_core_mod]
> [  144.512026]  [<ffffffff810d46a8>] sys_delete_module+0x2d6/0x368

The target_core_exit_configfs() code looks _very_ broken. It looks
broken for two reasons:

 - it's very different from the cleanup code for the "failed to init"
case in target_core_init_configfs, which does a lot less (see the
"out:" code there)

 - it seems to do a lot of manual freeing of the
"su_group.default_groups" stuff etc, which is all internal configfs
stuff, and seems to be used by the register/unregister phases.

So somebody who knows configfs better should really check that
cleanup, but it looks like target-core is just totally broken for the
rmmod case.

Added more people to the cc. Nicholas, Joel and James. Guys: please
check the insmod/rmmod case with
 (a) spinlock debugging and lockdep enabled
 (b) SLUB poisoning enabled.
ie all of these should be on:

  CONFIG_SLUB_DEBUG_ON=y
  CONFIG_DEBUG_SPINLOCK=y
  CONFIG_DEBUG_MUTEXES=y
  CONFIG_DEBUG_LOCK_ALLOC=y
  CONFIG_PROVE_LOCKING=y
  CONFIG_LOCKDEP=y
  CONFIG_DEBUG_LOCKDEP=y
  CONFIG_TRACE_IRQFLAGS=y
  CONFIG_DEBUG_SPINLOCK_SLEEP=y
  CONFIG_STACKTRACE=y

and you might also want to add CONFIG_DEBUG_PAGEALLOC to the mix.

                        Linus


* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 19:00   ` Linus Torvalds
@ 2011-02-09 20:02     ` Nicholas A. Bellinger
  2011-02-09 20:13       ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Nicholas A. Bellinger @ 2011-02-09 20:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Joel Becker, James Bottomley, scsi,
	Linux Kernel Mailing List

On Wed, 2011-02-09 at 11:00 -0800, Linus Torvalds wrote:
> On Wed, Feb 9, 2011 at 9:28 AM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > x86_64, nearly allmodconfig.  No target hardware.
> >
> >
> > [  144.508473] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> > [  144.509901] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb6/6-1/6-1.3/devnum
> > [  144.512026] CPU 1
> > [  144.512026]
> > [  144.512026] Pid: 2597, comm: rmmod Not tainted 2.6.38-rc4 #1 0TY565/OptiPlex 745
> > [  144.512026] RIP: 0010:[<ffffffff810c3e5f>]  [<ffffffff810c3e5f>] __lock_acquire+0xd8/0x4e8
> > [  144.512026] RSP: 0018:ffff88006df1bb78  EFLAGS: 00010006
> > [  144.512026] RAX: 0000000000000002 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000
> 
> The code disassembles to
> 
>    0:	8d 01                	lea    (%rcx),%eax
>    2:	e8 6c b1 fb ff       	callq  0xfffffffffffbb173
>    7:	48 ff 05 8b 32 8d 01 	incq   0x18d328b(%rip)        # 0x18d3299
>    e:	48 ff 05 8c 32 8d 01 	incq   0x18d328c(%rip)        # 0x18d32a1
>   15:	48 ff 05 95 32 8d 01 	incq   0x18d3295(%rip)        # 0x18d32b1
>   1c:	e9 e3 03 00 00       	jmpq   0x404
>   21:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32a9
>   28:*	48 81 3b 40 5f 26 82 	cmpq   $0xffffffff82265f40,(%rbx)     <-- trapping instruction
>   2f:	75 07                	jne    0x38
>   31:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32b9
>   38:	83 fe 01             	cmp    $0x1,%esi
> 
> and %rbx (and %rdi) contains the poison pattern for free'd memory (0x6b6b6b..).
> 
> > [  144.512026] Process rmmod (pid: 2597, threadinfo ffff88006df1a000, task ffff88006dec3000)
> 
> .. and that's likely not a very commonly tested case.
> 
> > [  144.512026]  [<ffffffffa06ace26>] configfs_unregister_subsystem+0x105/0x194 [configfs]
> > [  144.512026]  [<ffffffffa06baf55>] target_core_exit_configfs+0x185/0x1eb [target_core_mod]
> > [  144.512026]  [<ffffffff810d46a8>] sys_delete_module+0x2d6/0x368
> 
> The target_core_exit_configfs() code looks _very_ broken. It looks
> broken for two reasons:
> 
>  - it's very different from the cleanup code for the "failed to init"
> case in target_core_init_configfs, which does a lot less (see the
> "out:" code there)
> 

When registering a top level struct configfs_subsystem to appear under

	/sys/kernel/config/$SUBSYSTEM

the releasing of the top-level default group via
configfs_unregister_subsystem() during a failure in
target_core_init_configfs() is done for us, but we are still missing the
extra config_item_put()'s on the sub top-level groups (Joel, please
correct me)

The original 'out:' failure path code does not call config_item_put() on
these default groups, because config_group_init_type_name() has only
initialized struct config_group until configfs_register_subsystem() is
called to register the top level struct configfs_subsystem.

With the current 'out:' path being broken, to address the first point I
think moving the following code chunk in target_core_init_configfs to
before the configfs_register_subsystem() would make sense so that
configfs_register_subsystem() will fail last:

        /*
         * Register built-in RAMDISK subsystem logic for virtual LUN 0
         */
        ret = rd_module_init();
        if (ret < 0)
                goto out;

        if (core_dev_setup_virtual_lun0() < 0)
                goto out;

        return 0;

However looking at fs/configfs/dir.c:configfs_register_subsystem(), I
think the caller is still expected to release any sub top-level struct
config_group->default_groups[] w/ config_item_put() even though
unlink_group() is called from the configfs_attach_group() failure path..
(Joel..?)
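
To make the ordering argument concrete, here is a minimal userspace sketch of "do the fallible setup first, register last", with a goto-based unwind. Everything in it (the step names, struct init_state, do_step(), undo_step(), sketch_init()) is a hypothetical stand-in and not the target_core or configfs API. It also does not model the open question above about who releases the default_groups.

```c
#include <stdbool.h>

/* All names here are hypothetical stand-ins for illustration only. */
enum step { STEP_RAMDISK, STEP_LUN0, STEP_REGISTER, STEP_COUNT };

struct init_state {
	bool done[STEP_COUNT];	/* which steps are currently set up */
};

/* Simulated step: succeeds unless it is the one chosen to fail. */
static int do_step(struct init_state *st, enum step s, int fail_at)
{
	if ((int)s == fail_at)
		return -1;
	st->done[s] = true;
	return 0;
}

static void undo_step(struct init_state *st, enum step s)
{
	st->done[s] = false;
}

/*
 * Registration runs last, so a failure in an earlier step never has
 * to tear down an already registered subsystem.
 * fail_at < 0 means "nothing fails".
 */
static int sketch_init(struct init_state *st, int fail_at)
{
	if (do_step(st, STEP_RAMDISK, fail_at) < 0)
		goto out;
	if (do_step(st, STEP_LUN0, fail_at) < 0)
		goto out_ramdisk;
	if (do_step(st, STEP_REGISTER, fail_at) < 0)
		goto out_lun0;
	return 0;

out_lun0:
	undo_step(st, STEP_LUN0);
out_ramdisk:
	undo_step(st, STEP_RAMDISK);
out:
	return -1;
}
```

Calling sketch_init(&st, STEP_REGISTER) returns -1 and leaves every entry of st.done false: the two earlier steps are unwound in reverse order and nothing was ever registered.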

>  - it seems to do a lot of manual freeing of the
> "su_group.default_groups" stuff etc, which is all internal configfs
> stuff, and seems to be used by the register/unregister phases.
> 

The specific rmmod issue with SLUB poisoning was reported by Fubo
Chen to linux-scsi in the last few weeks.  The patch to address the proper
release of the top-level + sub top-level struct configfs_subsystem's
default_groups in target_core_exit_configfs() has been committed into
the upstream tree in lio-core-2.6.git/linus-38-rc3 and sent out to
linux-scsi here:

[PATCH] target: Fix top-level configfs_subsystem default_group shutdown breakage
http://marc.info/?l=linux-scsi&m=129662389218924&w=2

> So somebody who knows configfs better should really check that
> cleanup, but it looks like target-core is just totally broken for the
> rmmod case.
> 
> Added more people to the cc. Nicholas, Joel and James. Guys: please
> check the insmod/rmmod case with
>  (a) spinlock debugging and lockdep enabled
>  (b) SLUB poisoning enabled.
> ie all of these should be on:
> 
>   CONFIG_SLUB_DEBUG_ON=y
>   CONFIG_DEBUG_SPINLOCK=y
>   CONFIG_DEBUG_MUTEXES=y
>   CONFIG_DEBUG_LOCK_ALLOC=y
>   CONFIG_PROVE_LOCKING=y
>   CONFIG_LOCKDEP=y
>   CONFIG_DEBUG_LOCKDEP=y
>   CONFIG_TRACE_IRQFLAGS=y
>   CONFIG_DEBUG_SPINLOCK_SLEEP=y
>   CONFIG_STACKTRACE=y
> 
> and you might also want to add CONFIG_DEBUG_PAGEALLOC to the mix.
> 

<nod>  I believe the above patch resolves the specific rmmod issue.
However, during SLUB poisoning testing we also came across errors with
the incorrect use of struct config_item_operations->release() in
target_core_configfs.c and target_core_fabric_configfs.c code.  The
series to address these was included in the last series to James here:

[PATCH 00/12] target: Updates for .38-rc4
http://marc.info/?l=linux-scsi&m=129680191624837&w=2

Note that this series for-38 mainline needs to be applied on top of the
original update series after the drivers/target/ mainline merge:

[PATCH 00/24] target updates for .38-rc3 (v2)
http://marc.info/?l=linux-scsi&m=129632617326015&w=2

The entire series is available from
     
   git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6.git for-38-rc4

James, please review + sign-off so we can get these updates into mainline.



* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 20:02     ` Nicholas A. Bellinger
@ 2011-02-09 20:13       ` James Bottomley
  2011-02-09 20:20         ` Nicholas A. Bellinger
  0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2011-02-09 20:13 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Linus Torvalds, Randy Dunlap, Joel Becker, scsi,
	Linux Kernel Mailing List

On Wed, 2011-02-09 at 12:02 -0800, Nicholas A. Bellinger wrote:
> On Wed, 2011-02-09 at 11:00 -0800, Linus Torvalds wrote:
> > On Wed, Feb 9, 2011 at 9:28 AM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > > x86_64, nearly allmodconfig.  No target hardware.
> > >
> > >
> > > [  144.508473] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> > > [  144.509901] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb6/6-1/6-1.3/devnum
> > > [  144.512026] CPU 1
> > > [  144.512026]
> > > [  144.512026] Pid: 2597, comm: rmmod Not tainted 2.6.38-rc4 #1 0TY565/OptiPlex 745
> > > [  144.512026] RIP: 0010:[<ffffffff810c3e5f>]  [<ffffffff810c3e5f>] __lock_acquire+0xd8/0x4e8
> > > [  144.512026] RSP: 0018:ffff88006df1bb78  EFLAGS: 00010006
> > > [  144.512026] RAX: 0000000000000002 RBX: 6b6b6b6b6b6b6be3 RCX: 0000000000000000
> > 
> > The code disassembles to
> > 
> >    0:	8d 01                	lea    (%rcx),%eax
> >    2:	e8 6c b1 fb ff       	callq  0xfffffffffffbb173
> >    7:	48 ff 05 8b 32 8d 01 	incq   0x18d328b(%rip)        # 0x18d3299
> >    e:	48 ff 05 8c 32 8d 01 	incq   0x18d328c(%rip)        # 0x18d32a1
> >   15:	48 ff 05 95 32 8d 01 	incq   0x18d3295(%rip)        # 0x18d32b1
> >   1c:	e9 e3 03 00 00       	jmpq   0x404
> >   21:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32a9
> >   28:*	48 81 3b 40 5f 26 82 	cmpq   $0xffffffff82265f40,(%rbx)     <-- trapping instruction
> >   2f:	75 07                	jne    0x38
> >   31:	48 ff 05 81 32 8d 01 	incq   0x18d3281(%rip)        # 0x18d32b9
> >   38:	83 fe 01             	cmp    $0x1,%esi
> > 
> > and %rbx (and %rdi) contains the poison pattern for free'd memory (0x6b6b6b..).
> > 
> > > [  144.512026] Process rmmod (pid: 2597, threadinfo ffff88006df1a000, task ffff88006dec3000)
> > 
> > .. and that's likely not a very commonly tested case.
> > 
> > > [  144.512026]  [<ffffffffa06ace26>] configfs_unregister_subsystem+0x105/0x194 [configfs]
> > > [  144.512026]  [<ffffffffa06baf55>] target_core_exit_configfs+0x185/0x1eb [target_core_mod]
> > > [  144.512026]  [<ffffffff810d46a8>] sys_delete_module+0x2d6/0x368
> > 
> > The target_core_exit_configfs() code looks _very_ broken. It looks
> > broken for two reasons:
> > 
> >  - it's very different from the cleanup code for the "failed to init"
> > case in target_core_init_configfs, which does a lot less (see the
> > "out:" code there)
> > 
> 
> When registering a top level struct configfs_subsystem to appear under
> 
> 	/sys/kernel/config/$SUBSYSTEM
> 
> the releasing of the top-level default group via
> configfs_unregister_subsystem() during a failure in
> target_core_init_configfs() is done for us, but we are still missing the
> extra config_item_put()'s on the sub top-level groups (Joel, please
> correct me)
> 
> The original 'out:' failure path code does not call config_item_put() on
> these default groups, because config_group_init_type_name() has only
> initialized struct config_group until configfs_register_subsystem() is
> called to register the top level struct configfs_subsystem.
> 
> With the current 'out:' path being broken, to address the first point I
> think moving the following code chunk in target_core_init_configfs to
> before the configfs_register_subsystem() would make sense so that
> configfs_register_subsystem() will fail last:
> 
>         /*
>          * Register built-in RAMDISK subsystem logic for virtual LUN 0
>          */
>         ret = rd_module_init();
>         if (ret < 0)
>                 goto out;
> 
>         if (core_dev_setup_virtual_lun0() < 0)
>                 goto out;
> 
>         return 0;
> 
> However looking at fs/configfs/dir.c:configfs_register_subsystem(), I
> think the caller is still expected to release any sub top-level struct
> config_group->default_groups[] w/ config_item_put() even though
> unlink_group() is called from the configfs_attach_group() failure path..
> (Joel..?)
> 
> >  - it seems to do a lot of manual freeing of the
> > "su_group.default_groups" stuff etc, which is all internal configfs
> > stuff, and seems to be used by the register/unregister phases.
> > 
> 
> The specific rmmod issue with SLUB poisoning was reported by Fubo
> Chen to linux-scsi in the last few weeks.  The patch to address the proper
> release of the top-level + sub top-level struct configfs_subsystem's
> default_groups in target_core_exit_configfs() has been committed into
> the upstream tree in lio-core-2.6.git/linus-38-rc3 and sent out to
> linux-scsi here:
> 
> [PATCH] target: Fix top-level configfs_subsystem default_group shutdown breakage
> http://marc.info/?l=linux-scsi&m=129662389218924&w=2
> 
> > So somebody who knows configfs better should really check that
> > cleanup, but it looks like target-core is just totally broken for the
> > rmmod case.
> > 
> > Added more people to the cc. Nicholas, Joel and James. Guys: please
> > check the insmod/rmmod case with
> >  (a) spinlock debugging and lockdep enabled
> >  (b) SLUB poisoning enabled.
> > ie all of these should be on:
> > 
> >   CONFIG_SLUB_DEBUG_ON=y
> >   CONFIG_DEBUG_SPINLOCK=y
> >   CONFIG_DEBUG_MUTEXES=y
> >   CONFIG_DEBUG_LOCK_ALLOC=y
> >   CONFIG_PROVE_LOCKING=y
> >   CONFIG_LOCKDEP=y
> >   CONFIG_DEBUG_LOCKDEP=y
> >   CONFIG_TRACE_IRQFLAGS=y
> >   CONFIG_DEBUG_SPINLOCK_SLEEP=y
> >   CONFIG_STACKTRACE=y
> > 
> > and you might also want to add CONFIG_DEBUG_PAGEALLOC to the mix.
> > 
> 
> <nod>  I believe the above patch resolves the specific rmmod issue.
> However, during SLUB poisoning testing we also came across errors with
> the incorrect use of struct config_item_operations->release() in
> target_core_configfs.c and target_core_fabric_configfs.c code.  The
> series to address these was included in the last series to James here:
> 
> [PATCH 00/12] target: Updates for .38-rc4
> http://marc.info/?l=linux-scsi&m=129680191624837&w=2
> 
> Note that this series for-38 mainline needs to be applied on top of the
> original update series after the drivers/target/ mainline merge:
> 
> [PATCH 00/24] target updates for .38-rc3 (v2)
> http://marc.info/?l=linux-scsi&m=129632617326015&w=2
> 
> The entire series is available from
>      
>    git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6.git for-38-rc4
> 
> James, please review + sign-off so we can get these updates into mainline.

Firstly, could we get the serious bug fixes identified and separated
from the general enhancement updates, so they can go in a fixes tree
without depending on enhancements?  The former category would include
the /proc interface removal, since we don't want the legacy interface to
be in a released kernel.

James


* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 20:13       ` James Bottomley
@ 2011-02-09 20:20         ` Nicholas A. Bellinger
  2011-02-09 20:28           ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Nicholas A. Bellinger @ 2011-02-09 20:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Linus Torvalds, Randy Dunlap, Joel Becker, scsi,
	Linux Kernel Mailing List

On Wed, 2011-02-09 at 14:13 -0600, James Bottomley wrote:
> On Wed, 2011-02-09 at 12:02 -0800, Nicholas A. Bellinger wrote:
> > On Wed, 2011-02-09 at 11:00 -0800, Linus Torvalds wrote:
> > > On Wed, Feb 9, 2011 at 9:28 AM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > > > x86_64, nearly allmodconfig.  No target hardware.

<SNIP>

> > <nod>  I believe the above patch resolves the specific rmmod issue.
> > However, during SLUB poisoning testing we also came across errors with
> > the incorrect use of struct config_item_operations->release() in
> > target_core_configfs.c and target_core_fabric_configfs.c code.  The
> > series to address these was included in the last series to James here:
> > 
> > [PATCH 00/12] target: Updates for .38-rc4
> > http://marc.info/?l=linux-scsi&m=129680191624837&w=2
> > 
> > Note that this series for-38 mainline needs to be applied on top of the
> > original update series after the drivers/target/ mainline merge:
> > 
> > [PATCH 00/24] target updates for .38-rc3 (v2)
> > http://marc.info/?l=linux-scsi&m=129632617326015&w=2
> > 
> > The entire series is available from
> >      
> >    git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6.git for-38-rc4
> > 
> > James, please review + sign-off so we can get these updates into mainline.
> 
> Firstly, could we get the serious bug fixes identified and separated
> from the general enhancement updates, so they can go in a fixes tree
> without depending on enhancements?  The former category would include
> the /proc interface removal, since we don't want the legacy interface to
> be in a released kernel.
> 

Everything in those two series should be considered bug fixes and
immediate for-38 mainline material.

The target_core_mib.c statistics logic using procfs seq_list() has been
removed in [PATCH 12/12] of the most recent series above.

Thanks,

--nab


* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 20:20         ` Nicholas A. Bellinger
@ 2011-02-09 20:28           ` James Bottomley
  2011-02-09 20:44             ` Nicholas A. Bellinger
  0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2011-02-09 20:28 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Linus Torvalds, Randy Dunlap, Joel Becker, scsi,
	Linux Kernel Mailing List

On Wed, 2011-02-09 at 12:20 -0800, Nicholas A. Bellinger wrote:
> On Wed, 2011-02-09 at 14:13 -0600, James Bottomley wrote:
> > Firstly, could we get the serious bug fixes identified and separated
> > from the general enhancement updates, so they can go in a fixes tree
> > without depending on enhancements?  The former category would include
> > the /proc interface removal, since we don't want the legacy interface to
> > be in a released kernel.
> > 
> 
> Everything in those two series should be considered bug fixes and
> immediate for-38 mainline material.

Things like this:

target: remove EXTRA_CFLAGS
target: Remove unnecessary container_of() pointer check
target: Remove unnecessary se_clear_dev_ports legacy code
target: Remove spurious double cast from structure macro accessors
target: Convert TMR REQ/RSP definitions to target namespace
target: Minor sparse warning fixes and annotations
target: Remove unneeded test of se_cmd

Are not serious bug fixes.  I could go either way on some of the error path changes.

James


* Re: Linux 2.6.38-rc4 (target_core: rmmod GP fault)
  2011-02-09 20:28           ` James Bottomley
@ 2011-02-09 20:44             ` Nicholas A. Bellinger
  0 siblings, 0 replies; 7+ messages in thread
From: Nicholas A. Bellinger @ 2011-02-09 20:44 UTC (permalink / raw)
  To: James Bottomley
  Cc: Linus Torvalds, Randy Dunlap, Joel Becker, scsi,
	Linux Kernel Mailing List, Christoph Hellwig

On Wed, 2011-02-09 at 14:28 -0600, James Bottomley wrote:
> On Wed, 2011-02-09 at 12:20 -0800, Nicholas A. Bellinger wrote:
> > On Wed, 2011-02-09 at 14:13 -0600, James Bottomley wrote:
> > > Firstly, could we get the serious bug fixes identified and separated
> > > from the general enhancement updates, so they can go in a fixes tree
> > > without depending on enhancements?  The former category would include
> > > the /proc interface removal, since we don't want the legacy interface to
> > > be in a released kernel.
> > > 
> > 
> > Everything in those two series should be considered bug fixes and
> > immediate for-38 mainline material.
> 
> Things like this:
> 
> target: remove EXTRA_CFLAGS
> target: Remove unnecessary container_of() pointer check
> target: Remove unnecessary se_clear_dev_ports legacy code
> target: Remove spurious double cast from structure macro accessors
> target: Convert TMR REQ/RSP definitions to target namespace

This is an important one: without TMR_*-prefixed definitions for the
task management request/response defs, we run into problems with the
existing include/scsi/scsi.h message codes.
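
A small sketch of the kind of clash meant here. The macro below is only a stand-in for a message code that an existing shared header already defines, with an illustrative value, and the enum name and its values are likewise assumptions made for the example.

```c
/*
 * Stand-in for a message code already defined by an existing shared
 * header. The value is illustrative.
 */
#define ABORT_TASK_SET 0x06

/*
 * An unprefixed enumerator with the same name would not compile once
 * the macro is visible, because the preprocessor rewrites
 *
 *     enum { ABORT_TASK_SET = 2 };
 *
 * into "enum { 0x06 = 2 };". Prefixed names avoid that.
 */
enum tmr_request_sketch {
	TMR_ABORT_TASK     = 1,
	TMR_ABORT_TASK_SET = 2,
};
```

Both names can then be used in the same translation unit, and each keeps its own value.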

> target: Minor sparse warning fixes and annotations
> target: Remove unneeded test of se_cmd
> 
> Are not serious bug fixes.  I could go either way on some of the error path changes.
> 

Yes, these others are minor items that have been submitted by people
(hch, roland, jesper, danc) who have been reviewing target code since it
was merged for .38-rc1.

Considering how minor these are, I would prefer to have them merged
rather than deferred to for-39.  If not, I will need to respin the tree
w/o their cleanups, if that is what you prefer to be sent to Linus
for-38.  There are also 3 other bugfix patches in the upstream LIO tree
that I have been saving for-38-rc5 since for-38-rc4 was cut (considering
the two outstanding series).  I would like to have these included as well.

Do you want me to send them all to the linux-scsi list again...?

--nab

  



