[BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4

linuxppc-dev.lists.ozlabs.org archive mirror

* [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
@ 2007-11-16 14:15 Kamalesh Babulal
  2007-11-17 17:53 ` Torsten Kaiser
  0 siblings, 1 reply; 20+ messages in thread
From: Kamalesh Babulal @ 2007-11-16 14:15 UTC (permalink / raw)
  To: Andrew Morton, LKML, linuxppc-dev, nfs, Andy Whitcroft,
	Balbir Singh

Hi Andrew,

The kernel enters the xmon state while running the file system
stress test on an nfs v4 mounted partition.
 
0:mon> e
cpu 0x0: Vector: 300 (Data Access) at [c0000000dbd4f820]
    pc: c000000000065be4: .__wake_up_common+0x44/0xe8
    lr: c000000000069768: .__wake_up+0x54/0x88
    sp: c0000000dbd4faa0
   msr: 8000000000001032
   dar: 0
 dsisr: 40010000
  current = 0xc0000000dfb6f680
  paca    = 0xc000000000574580
    pid   = 1865, comm = rpciod/0
0:mon> t
[c0000000dbd4fb50] c000000000069768 .__wake_up+0x54/0x88
[c0000000dbd4fc00] d00000000086b890 .nfs_sb_deactive+0x44/0x58 [nfs]
[c0000000dbd4fc80] d000000000872658 .nfs_free_unlinkdata+0x2c/0x74 [nfs]
[c0000000dbd4fd10] d000000000598510 .rpc_release_calldata+0x50/0x74 [sunrpc]
[c0000000dbd4fda0] c00000000008d960 .run_workqueue+0x10c/0x1f4
[c0000000dbd4fe50] c00000000008ec70 .worker_thread+0x118/0x138
[c0000000dbd4ff00] c0000000000939f4 .kthread+0x78/0xc4
[c0000000dbd4ff90] c00000000002b060 .kernel_thread+0x4c/0x68
0:mon> r
R00 = c000000000069768   R16 = 4000000001c00000
R01 = c0000000dbd4faa0   R17 = c0000000004410c0
R02 = c0000000006752d8   R18 = 0000000000000000
R03 = c0000000ace4ffc0   R19 = 000000000019c000
R04 = 0000000000000003   R20 = c00000000050af08
R05 = 0000000000000001   R21 = 000000000210af08
R06 = 0000000000000000   R22 = 000000000210b178
R07 = 0000000000000000   R23 = c00000000050b178
R08 = 0000000000000000   R24 = 0000000000000003
R09 = 0000000000000000   R25 = 0000000000000000
R10 = 0000000000000001   R26 = 0000000000000000
R11 = ffffffffffffffe8   R27 = c0000000ace4ffc0
R12 = 0000000000004000   R28 = 0000000000000001
R13 = c000000000574580   R29 = 0000000000000003
R14 = 0000000000000000   R30 = c00000000061bad8
R15 = c000000000442888   R31 = d0000000008baa50
pc  = c000000000065be4 .__wake_up_common+0x44/0xe8
lr  = c000000000069768 .__wake_up+0x54/0x88
msr = 8000000000001032   cr  = 24000022
ctr = 80000000001af404   xer = 0000000000000002   trap =  300
dar = 0000000000000000   dsisr = 40010000
0:mon>

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
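
A note on the register dump above: dar = 0 together with
R11 = ffffffffffffffe8 is what the container_of() arithmetic of a wait
queue walk would produce if the list head's next pointer were NULL, for
example with a wait queue head that was never initialised. This is an
inference from the dump, not something the report states. The sketch
below redoes that arithmetic in user space. struct wait_entry is a
hypothetical stand-in, and its field order is an assumption about the
kernel's wait queue entry on a 64-bit build.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical stand-in for a wait queue entry. The field order is an
 * assumption; on an LP64 target it puts task_list at byte offset 24.
 */
struct wait_entry {
	unsigned int flags;
	void *private_data;
	int (*func)(struct wait_entry *, unsigned int, int, void *);
	struct list_head task_list;
};

/*
 * The subtraction container_of() performs, done on integers so that it
 * stays well defined even when the link address is 0.
 */
uintptr_t entry_from_link(uintptr_t link_addr)
{
	return link_addr - offsetof(struct wait_entry, task_list);
}

/* Address that is loaded when the walk reads entry->task_list.next. */
uintptr_t next_field_addr(uintptr_t entry_addr)
{
	return entry_addr + offsetof(struct wait_entry, task_list)
			  + offsetof(struct list_head, next);
}
```

With a NULL link, entry_from_link(0) gives 0xffffffffffffffe8 on a 64-bit
build, and next_field_addr() of that value wraps back to 0, which would
match R11 and dar in the dump.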

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-16 14:15 [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4 Kamalesh Babulal
@ 2007-11-17 17:53 ` Torsten Kaiser
  2007-11-17 18:05   ` Andrew Morton
                     ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-17 17:53 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: LKML, Trond Myklebust, linuxppc-dev, nfs, Andrew Morton,
	Jan Blunck, Balbir Singh

On Nov 16, 2007 3:15 PM, Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> Hi Andrew,
>
> The kernel enters the xmon state while running the file system
> stress on nfs v4 mounted partition.
[snip]
> 0:mon> t
> [c0000000dbd4fb50] c000000000069768 .__wake_up+0x54/0x88
> [c0000000dbd4fc00] d00000000086b890 .nfs_sb_deactive+0x44/0x58 [nfs]
> [c0000000dbd4fc80] d000000000872658 .nfs_free_unlinkdata+0x2c/0x74 [nfs]
> [c0000000dbd4fd10] d000000000598510 .rpc_release_calldata+0x50/0x74 [sunrpc]
> [c0000000dbd4fda0] c00000000008d960 .run_workqueue+0x10c/0x1f4
> [c0000000dbd4fe50] c00000000008ec70 .worker_thread+0x118/0x138
> [c0000000dbd4ff00] c0000000000939f4 .kthread+0x78/0xc4
> [c0000000dbd4ff90] c00000000002b060 .kernel_thread+0x4c/0x68

Definitely not a ppc problem.
Got nearly the same backtrace on 64bit x86:
[  966.712167] BUG: soft lockup - CPU#3 stuck for 11s! [rpciod/3:605]
[  966.718522] CPU 3:
[  966.720589] Modules linked in: radeon drm nfsd exportfs ipv6 w83792d tuner tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761 tvaudio msp3400 bttv ir_common compat_ioctl32 videobuf_dma_sg videobuf_core btcx_risc tveeprom videodev usbhid v4l2_common v4l1_compat hid sg i2c_nforce2 pata_amd
[  966.748306] Pid: 605, comm: rpciod/3 Not tainted 2.6.24-rc2-mm1 #4
[  966.754653] RIP: 0010:[<ffffffff805b0542>]  [<ffffffff805b0542>] _spin_lock_irqsave+0x12/0x30
[  966.763424] RSP: 0018:ffff81007ef33e28  EFLAGS: 00000286
[  966.768879] RAX: 0000000000000286 RBX: ffff81007ef33e60 RCX: 0000000000000000
[  966.776204] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff81011e107960
[  966.783511] RBP: ffff81011cc6c588 R08: ffff8100db918130 R09: ffff81011cc6c540
[  966.790837] R10: 0000000000000000 R11: ffffffff80266390 R12: ffff8100d2d693a8
[  966.798170] R13: ffff81011cc6c588 R14: ffff8100d2d693a8 R15: ffffffff80302726
[  966.805505] FS:  00007f9e739d96f0(0000) GS:ffff81011ff12700(0000) knlGS:0000000000000000
[  966.813805] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  966.819703] CR2: 0000000001b691d0 CR3: 0000000069861000 CR4: 00000000000006e0
[  966.827039] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  966.834362] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  966.841687]
[  966.841687] Call Trace:
[  966.845728]  [<ffffffff8022cf4d>] __wake_up+0x2d/0x70
[  966.850900]  [<ffffffff802f5e6e>] nfs_free_unlinkdata+0x1e/0x50
[  966.857004]  [<ffffffff80593f66>] rpc_release_calldata+0x26/0x50
[  966.863161]  [<ffffffff80594930>] rpc_async_schedule+0x0/0x10
[  966.869078]  [<ffffffff80245cec>] run_workqueue+0xcc/0x170
[  966.874705]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
[  966.880163]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
[  966.885610]  [<ffffffff8024680d>] worker_thread+0x6d/0xb0
[  966.891148]  [<ffffffff8024a140>] autoremove_wake_function+0x0/0x30
[  966.897606]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
[  966.903045]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
[  966.908485]  [<ffffffff80249d5b>] kthread+0x4b/0x80
[  966.913484]  [<ffffffff8020ca28>] child_rip+0xa/0x12
[  966.918579]  [<ffffffff80249d10>] kthread+0x0/0x80
[  966.923498]  [<ffffffff8020ca1e>] child_rip+0x0/0x12
[  966.928584]

Sadly lockdep does not work for me, as it gets turned off early:
[   39.851594] ---------------------------------
[   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
[   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>] add_partial+0x31/0xa0
[   39.874712] {softirq-on-W} state was registered at:
[   39.879788]   [<ffffffff80259fb8>] __lock_acquire+0x3e8/0x1140
[   39.885763]   [<ffffffff80259838>] debug_check_no_locks_freed+0x188/0x1a0
[   39.892682]   [<ffffffff8025ad65>] lock_acquire+0x55/0x70
[   39.898840]   [<ffffffff802935c1>] add_partial+0x31/0xa0
[   39.904288]   [<ffffffff805c76de>] _spin_lock+0x1e/0x30
[   39.909650]   [<ffffffff802935c1>] add_partial+0x31/0xa0
[   39.915097]   [<ffffffff80296f9c>] kmem_cache_open+0x1cc/0x330
[   39.921066]   [<ffffffff805c7984>] _spin_unlock_irq+0x24/0x30
[   39.926946]   [<ffffffff802974f4>] create_kmalloc_cache+0x64/0xf0
[   39.933172]   [<ffffffff80295640>] init_alloc_cpu_cpu+0x70/0x90
[   39.939226]   [<ffffffff8080ada5>] kmem_cache_init+0x65/0x1d0
[   39.945289]   [<ffffffff807f1b4e>] start_kernel+0x23e/0x350
[   39.950996]   [<ffffffff807f112d>] _sinittext+0x12d/0x140
[   39.956529]   [<ffffffffffffffff>] 0xffffffffffffffff
[   39.961720] irq event stamp: 1207
[   39.965048] hardirqs last  enabled at (1206): [<ffffffff80259838>] debug_check_no_locks_freed+0x188/0x1a0
[   39.974701] hardirqs last disabled at (1207): [<ffffffff802952eb>] __slab_free+0x3b/0x190
[   39.982968] softirqs last  enabled at (570): [<ffffffff8020cf0c>] call_softirq+0x1c/0x30
[   39.991148] softirqs last disabled at (1197): [<ffffffff8020cf0c>] call_softirq+0x1c/0x30
[   39.999415]
[   39.999416] other info that might help us debug this:
[   40.005990] no locks held by swapper/0.
[   40.010018]
[   40.010018] stack backtrace:
[   40.014429]
[   40.014429] Call Trace:
[   40.018407]  <IRQ>  [<ffffffff8025847c>] print_usage_bug+0x18c/0x1a0
[   40.024817]  [<ffffffff802593ec>] mark_lock+0x64c/0x660
[   40.030057]  [<ffffffff80259f6e>] __lock_acquire+0x39e/0x1140
[   40.035818]  [<ffffffff80257717>] save_trace+0x37/0xa0
[   40.040972]  [<ffffffff802492cd>] __rcu_process_callbacks+0x8d/0x250
[   40.047335]  [<ffffffff8025ad65>] lock_acquire+0x55/0x70
[   40.052663]  [<ffffffff802935c1>] add_partial+0x31/0xa0
[   40.057905]  [<ffffffff802595d3>] trace_hardirqs_on+0x83/0x160
[   40.063750]  [<ffffffff805c76de>] _spin_lock+0x1e/0x30
[   40.068905]  [<ffffffff802935c1>] add_partial+0x31/0xa0
[   40.074311]  [<ffffffff802953b0>] __slab_free+0x100/0x190
[   40.079724]  [<ffffffff802492cd>] __rcu_process_callbacks+0x8d/0x250
[   40.086088]  [<ffffffff8023b79c>] tasklet_action+0x2c/0xc0
[   40.091588]  [<ffffffff802494b3>] rcu_process_callbacks+0x23/0x50
[   40.097694]  [<ffffffff8023b7ba>] tasklet_action+0x4a/0xc0
[   40.103194]  [<ffffffff8023b67a>] __do_softirq+0x7a/0x100
[   40.108607]  [<ffffffff8020cf0c>] call_softirq+0x1c/0x30
[   40.113935]  [<ffffffff8020f125>] do_softirq+0x55/0xb0
[   40.119089]  [<ffffffff8023b5f7>] irq_exit+0x97/0xa0
[   40.124073]  [<ffffffff8021bf2c>] smp_apic_timer_interrupt+0x7c/0xc0
[   40.130434]  [<ffffffff8020ac70>] default_idle+0x0/0x60
[   40.135840]  [<ffffffff8020ac70>] default_idle+0x0/0x60
[   40.141080]  [<ffffffff8020c9bb>] apic_timer_interrupt+0x6b/0x70
[   40.147100]  <EOI>  [<ffffffff8020aca7>] default_idle+0x37/0x60
[   40.153066]  [<ffffffff8020aca5>] default_idle+0x35/0x60
[   40.158393]  [<ffffffff8020ad2f>] cpu_idle+0x5f/0x90
[   40.163374]
[   40.164888] INFO: lockdep is turned off.

Don't know who to bug about that.

Torsten

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 17:53 ` Torsten Kaiser
@ 2007-11-17 18:05   ` Andrew Morton
  2007-11-17 19:33     ` Christoph Lameter
  2007-11-17 18:09   ` Ingo Molnar
  2007-11-17 18:58   ` Trond Myklebust
  2 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2007-11-17 18:05 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Trond Myklebust, LKML, Kamalesh Babulal, linuxppc-dev, nfs,
	Christoph Lameter, Jan Blunck, Balbir Singh

On Sat, 17 Nov 2007 18:53:45 +0100 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:

> On Nov 16, 2007 3:15 PM, Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> > Hi Andrew,
> >
> > The kernel enters the xmon state while running the file system
> > stress on nfs v4 mounted partition.
> [snip]
> > 0:mon> t
> > [c0000000dbd4fb50] c000000000069768 .__wake_up+0x54/0x88
> > [c0000000dbd4fc00] d00000000086b890 .nfs_sb_deactive+0x44/0x58 [nfs]
> > [c0000000dbd4fc80] d000000000872658 .nfs_free_unlinkdata+0x2c/0x74 [nfs]
> > [c0000000dbd4fd10] d000000000598510 .rpc_release_calldata+0x50/0x74 [sunrpc]
> > [c0000000dbd4fda0] c00000000008d960 .run_workqueue+0x10c/0x1f4
> > [c0000000dbd4fe50] c00000000008ec70 .worker_thread+0x118/0x138
> > [c0000000dbd4ff00] c0000000000939f4 .kthread+0x78/0xc4
> > [c0000000dbd4ff90] c00000000002b060 .kernel_thread+0x4c/0x68
> 
> Definitely not a ppc problem.
> Got nearly the same backtrace on 64bit x86:
> [  966.712167] BUG: soft lockup - CPU#3 stuck for 11s! [rpciod/3:605]
> [  966.718522] CPU 3:
> [  966.720589] Modules linked in: radeon drm nfsd exportfs ipv6
> w83792d tuner tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx
> tea5761 tvaudio msp3400 bttv ir_common compat_ioctl32 videobuf_dma_sg
> videobuf_core btcx_risc tveeprom videodev usbhid v4l2_common
> v4l1_compat hid sg i2c_nforce2 pata_amd
> [  966.748306] Pid: 605, comm: rpciod/3 Not tainted 2.6.24-rc2-mm1 #4
> [  966.754653] RIP: 0010:[<ffffffff805b0542>]  [<ffffffff805b0542>]
> _spin_lock_irqsave+0x12/0x30
> [  966.763424] RSP: 0018:ffff81007ef33e28  EFLAGS: 00000286
> [  966.768879] RAX: 0000000000000286 RBX: ffff81007ef33e60 RCX: 0000000000000000
> [  966.776204] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff81011e107960
> [  966.783511] RBP: ffff81011cc6c588 R08: ffff8100db918130 R09: ffff81011cc6c540
> [  966.790837] R10: 0000000000000000 R11: ffffffff80266390 R12: ffff8100d2d693a8
> [  966.798170] R13: ffff81011cc6c588 R14: ffff8100d2d693a8 R15: ffffffff80302726
> [  966.805505] FS:  00007f9e739d96f0(0000) GS:ffff81011ff12700(0000)
> knlGS:0000000000000000
> [  966.813805] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [  966.819703] CR2: 0000000001b691d0 CR3: 0000000069861000 CR4: 00000000000006e0
> [  966.827039] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  966.834362] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  966.841687]
> [  966.841687] Call Trace:
> [  966.845728]  [<ffffffff8022cf4d>] __wake_up+0x2d/0x70
> [  966.850900]  [<ffffffff802f5e6e>] nfs_free_unlinkdata+0x1e/0x50
> [  966.857004]  [<ffffffff80593f66>] rpc_release_calldata+0x26/0x50
> [  966.863161]  [<ffffffff80594930>] rpc_async_schedule+0x0/0x10
> [  966.869078]  [<ffffffff80245cec>] run_workqueue+0xcc/0x170
> [  966.874705]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
> [  966.880163]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
> [  966.885610]  [<ffffffff8024680d>] worker_thread+0x6d/0xb0
> [  966.891148]  [<ffffffff8024a140>] autoremove_wake_function+0x0/0x30
> [  966.897606]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
> [  966.903045]  [<ffffffff802467a0>] worker_thread+0x0/0xb0
> [  966.908485]  [<ffffffff80249d5b>] kthread+0x4b/0x80
> [  966.913484]  [<ffffffff8020ca28>] child_rip+0xa/0x12
> [  966.918579]  [<ffffffff80249d10>] kthread+0x0/0x80
> [  966.923498]  [<ffffffff8020ca1e>] child_rip+0x0/0x12
> [  966.928584]

I don't know what's causing that.  I suppose I should set up nfs4.

> Sadly lockdep does not work for me, as it gets turned off early:
> [   39.851594] ---------------------------------
> [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> [   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>]
> add_partial+0x31/0xa0
> [   39.874712] {softirq-on-W} state was registered at:
> [   39.879788]   [<ffffffff80259fb8>] __lock_acquire+0x3e8/0x1140
> [   39.885763]   [<ffffffff80259838>] debug_check_no_locks_freed+0x188/0x1a0
> [   39.892682]   [<ffffffff8025ad65>] lock_acquire+0x55/0x70
> [   39.898840]   [<ffffffff802935c1>] add_partial+0x31/0xa0
> [   39.904288]   [<ffffffff805c76de>] _spin_lock+0x1e/0x30
> [   39.909650]   [<ffffffff802935c1>] add_partial+0x31/0xa0
> [   39.915097]   [<ffffffff80296f9c>] kmem_cache_open+0x1cc/0x330
> [   39.921066]   [<ffffffff805c7984>] _spin_unlock_irq+0x24/0x30
> [   39.926946]   [<ffffffff802974f4>] create_kmalloc_cache+0x64/0xf0
> [   39.933172]   [<ffffffff80295640>] init_alloc_cpu_cpu+0x70/0x90
> [   39.939226]   [<ffffffff8080ada5>] kmem_cache_init+0x65/0x1d0
> [   39.945289]   [<ffffffff807f1b4e>] start_kernel+0x23e/0x350
> [   39.950996]   [<ffffffff807f112d>] _sinittext+0x12d/0x140
> [   39.956529]   [<ffffffffffffffff>] 0xffffffffffffffff
> [   39.961720] irq event stamp: 1207
> [   39.965048] hardirqs last  enabled at (1206): [<ffffffff80259838>]
> debug_check_no_locks_freed+0x188/0x1a0
> [   39.974701] hardirqs last disabled at (1207): [<ffffffff802952eb>]
> __slab_free+0x3b/0x190
> [   39.982968] softirqs last  enabled at (570): [<ffffffff8020cf0c>]
> call_softirq+0x1c/0x30
> [   39.991148] softirqs last disabled at (1197): [<ffffffff8020cf0c>]
> call_softirq+0x1c/0x30
> [   39.999415]
> [   39.999416] other info that might help us debug this:
> [   40.005990] no locks held by swapper/0.
> [   40.010018]
> [   40.010018] stack backtrace:
> [   40.014429]
> [   40.014429] Call Trace:
> [   40.018407]  <IRQ>  [<ffffffff8025847c>] print_usage_bug+0x18c/0x1a0
> [   40.024817]  [<ffffffff802593ec>] mark_lock+0x64c/0x660
> [   40.030057]  [<ffffffff80259f6e>] __lock_acquire+0x39e/0x1140
> [   40.035818]  [<ffffffff80257717>] save_trace+0x37/0xa0
> [   40.040972]  [<ffffffff802492cd>] __rcu_process_callbacks+0x8d/0x250
> [   40.047335]  [<ffffffff8025ad65>] lock_acquire+0x55/0x70
> [   40.052663]  [<ffffffff802935c1>] add_partial+0x31/0xa0
> [   40.057905]  [<ffffffff802595d3>] trace_hardirqs_on+0x83/0x160
> [   40.063750]  [<ffffffff805c76de>] _spin_lock+0x1e/0x30
> [   40.068905]  [<ffffffff802935c1>] add_partial+0x31/0xa0
> [   40.074311]  [<ffffffff802953b0>] __slab_free+0x100/0x190
> [   40.079724]  [<ffffffff802492cd>] __rcu_process_callbacks+0x8d/0x250
> [   40.086088]  [<ffffffff8023b79c>] tasklet_action+0x2c/0xc0
> [   40.091588]  [<ffffffff802494b3>] rcu_process_callbacks+0x23/0x50
> [   40.097694]  [<ffffffff8023b7ba>] tasklet_action+0x4a/0xc0
> [   40.103194]  [<ffffffff8023b67a>] __do_softirq+0x7a/0x100
> [   40.108607]  [<ffffffff8020cf0c>] call_softirq+0x1c/0x30
> [   40.113935]  [<ffffffff8020f125>] do_softirq+0x55/0xb0
> [   40.119089]  [<ffffffff8023b5f7>] irq_exit+0x97/0xa0
> [   40.124073]  [<ffffffff8021bf2c>] smp_apic_timer_interrupt+0x7c/0xc0
> [   40.130434]  [<ffffffff8020ac70>] default_idle+0x0/0x60
> [   40.135840]  [<ffffffff8020ac70>] default_idle+0x0/0x60
> [   40.141080]  [<ffffffff8020c9bb>] apic_timer_interrupt+0x6b/0x70
> [   40.147100]  <EOI>  [<ffffffff8020aca7>] default_idle+0x37/0x60
> [   40.153066]  [<ffffffff8020aca5>] default_idle+0x35/0x60
> [   40.158393]  [<ffffffff8020ad2f>] cpu_idle+0x5f/0x90
> [   40.163374]
> [   40.164888] INFO: lockdep is turned off.
> 
> Don't know who to bug about that.
> 

That's slub.  It appears that list_lock is being taken from process context
in one place and from softirq in another.
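
The "inconsistent {softirq-on-W} -> {in-softirq-W} usage" report can be
pictured with a much-reduced user-space model of per-lock-class usage
tracking. All names below are hypothetical and the real validator tracks
far more state; this only sketches the one rule that fired here: a lock
class that has been taken with softirqs enabled must not also be taken
from inside a softirq.

```c
#include <stdbool.h>

/* Hypothetical, reduced usage record for one lock class. */
struct lock_class_usage {
	bool taken_in_softirq;      /* acquired while running a softirq */
	bool taken_with_softirq_on; /* acquired where a softirq could still fire */
};

/* Context in which a single acquisition happens. */
enum acquire_ctx {
	CTX_PROCESS_SOFTIRQ_ON,
	CTX_PROCESS_SOFTIRQ_OFF,
	CTX_IN_SOFTIRQ
};

/*
 * Record one acquisition. Returns false once the class has been seen
 * both with softirqs enabled and from softirq context, which is the
 * combination reported as inconsistent.
 */
bool record_acquire(struct lock_class_usage *u, enum acquire_ctx ctx)
{
	switch (ctx) {
	case CTX_PROCESS_SOFTIRQ_ON:
		u->taken_with_softirq_on = true;
		break;
	case CTX_IN_SOFTIRQ:
		u->taken_in_softirq = true;
		break;
	case CTX_PROCESS_SOFTIRQ_OFF:
		break;
	}
	return !(u->taken_in_softirq && u->taken_with_softirq_on);
}
```

In the trace, the first acquisition is the boot-time add_partial() call
under kmem_cache_init() and the second is add_partial() reached from
__slab_free() in the RCU softirq; in the model that is
CTX_PROCESS_SOFTIRQ_ON followed by CTX_IN_SOFTIRQ, and the second call
returns false.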

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 17:53 ` Torsten Kaiser
  2007-11-17 18:05   ` Andrew Morton
@ 2007-11-17 18:09   ` Ingo Molnar
  2007-11-17 18:19     ` Andrew Morton
  2007-11-17 23:00     ` root
  2007-11-17 18:58   ` Trond Myklebust
  2 siblings, 2 replies; 20+ messages in thread
From: Ingo Molnar @ 2007-11-17 18:09 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Trond Myklebust, Peter Zijlstra, LKML, Kamalesh Babulal,
	linuxppc-dev, nfs, Andrew Morton, Jan Blunck, Balbir Singh


* Torsten Kaiser <just.for.lkml@googlemail.com> wrote:

> Sadly lockdep does not work for me, as it gets turned off early:
> [   39.851594] ---------------------------------
> [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> [   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>]

hey, that means it found a bug - which is not sad at all :-)

	Ingo

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 18:09   ` Ingo Molnar
@ 2007-11-17 18:19     ` Andrew Morton
  2007-11-17 19:40       ` Torsten Kaiser
  2007-11-17 23:00     ` root
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2007-11-17 18:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Trond Myklebust, Peter Zijlstra, LKML, Torsten Kaiser,
	Kamalesh Babulal, linuxppc-dev, nfs, Jan Blunck, Balbir Singh

On Sat, 17 Nov 2007 19:09:46 +0100 Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Torsten Kaiser <just.for.lkml@googlemail.com> wrote:
> 
> > Sadly lockdep does not work for me, as it gets turned off early:
> > [   39.851594] ---------------------------------
> > [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> > [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > [   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>]
> 
> hey, that means it found a bug - which is not sad at all :-)
> 

mutter.

Torsten, you could try CONFIG_SLAB=y, CONFIG_SLUB=n to see if you can make
some progress on the NFS problem.

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 17:53 ` Torsten Kaiser
  2007-11-17 18:05   ` Andrew Morton
  2007-11-17 18:09   ` Ingo Molnar
@ 2007-11-17 18:58   ` Trond Myklebust
  2007-11-17 19:18     ` Torsten Kaiser
  2 siblings, 1 reply; 20+ messages in thread
From: Trond Myklebust @ 2007-11-17 18:58 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: LKML, Kamalesh Babulal, linuxppc-dev, nfs, Andrew Morton,
	Jan Blunck, Balbir Singh

[-- Attachment #1: Type: text/plain, Size: 893 bytes --]


On Sat, 2007-11-17 at 18:53 +0100, Torsten Kaiser wrote:
> On Nov 16, 2007 3:15 PM, Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> > Hi Andrew,
> >
> > The kernel enters the xmon state while running the file system
> > stress on nfs v4 mounted partition.
> [snip]
> > 0:mon> t
> > [c0000000dbd4fb50] c000000000069768 .__wake_up+0x54/0x88
> > [c0000000dbd4fc00] d00000000086b890 .nfs_sb_deactive+0x44/0x58 [nfs]
> > [c0000000dbd4fc80] d000000000872658 .nfs_free_unlinkdata+0x2c/0x74 [nfs]
> > [c0000000dbd4fd10] d000000000598510 .rpc_release_calldata+0x50/0x74 [sunrpc]
> > [c0000000dbd4fda0] c00000000008d960 .run_workqueue+0x10c/0x1f4
> > [c0000000dbd4fe50] c00000000008ec70 .worker_thread+0x118/0x138
> > [c0000000dbd4ff00] c0000000000939f4 .kthread+0x78/0xc4
> > [c0000000dbd4ff90] c00000000002b060 .kernel_thread+0x4c/0x68

Could you try with the attached patch.

Cheers
  Trond

[-- Attachment #2: linux-2.6.24-007-fix_nfs_free_unlinkdata.dif --]
[-- Type: message/rfc822, Size: 1254 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: NFS: Fix nfs_free_unlinkdata()
Date: Sat, 17 Nov 2007 13:52:36 -0500
Message-ID: <1195325920.7484.2.camel@localhost.localdomain>

We should really only be calling nfs_sb_deactive() at the end of an RPC
call, to balance the nfs_sb_active() call in nfs_do_call_unlink(). OTOH,
nfs_free_unlinkdata() can be called from a variety of other situations.

Fix is to move the call to nfs_sb_deactive() into
nfs_async_unlink_release().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/unlink.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index b97d3bb..c90862a 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -31,7 +31,6 @@ struct nfs_unlinkdata {
 static void
 nfs_free_unlinkdata(struct nfs_unlinkdata *data)
 {
-	nfs_sb_deactive(NFS_SERVER(data->dir));
 	iput(data->dir);
 	put_rpccred(data->cred);
 	kfree(data->args.name.name);
@@ -116,6 +115,7 @@ static void nfs_async_unlink_release(void *calldata)
 	struct nfs_unlinkdata	*data = calldata;
 
 	nfs_dec_sillycount(data->dir);
+	nfs_sb_deactive(NFS_SERVER(data->dir));
 	nfs_free_unlinkdata(data);
 }
 

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 18:58   ` Trond Myklebust
@ 2007-11-17 19:18     ` Torsten Kaiser
  0 siblings, 0 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-17 19:18 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: LKML, Kamalesh Babulal, linuxppc-dev, nfs, Andrew Morton,
	Jan Blunck, Balbir Singh

On Nov 17, 2007 7:58 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> On Sat, 2007-11-17 at 18:53 +0100, Torsten Kaiser wrote:
> > On Nov 16, 2007 3:15 PM, Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> > > Hi Andrew,
> > >
> > > The kernel enters the xmon state while running the file system
> > > stress on nfs v4 mounted partition.
> > [snip]
> > > 0:mon> t
> > > [c0000000dbd4fb50] c000000000069768 .__wake_up+0x54/0x88
> > > [c0000000dbd4fc00] d00000000086b890 .nfs_sb_deactive+0x44/0x58 [nfs]
> > > [c0000000dbd4fc80] d000000000872658 .nfs_free_unlinkdata+0x2c/0x74 [nfs]
> > > [c0000000dbd4fd10] d000000000598510 .rpc_release_calldata+0x50/0x74 [sunrpc]
> > > [c0000000dbd4fda0] c00000000008d960 .run_workqueue+0x10c/0x1f4
> > > [c0000000dbd4fe50] c00000000008ec70 .worker_thread+0x118/0x138
> > > [c0000000dbd4ff00] c0000000000939f4 .kthread+0x78/0xc4
> > > [c0000000dbd4ff90] c00000000002b060 .kernel_thread+0x4c/0x68
>
> Could you try with the attached patch.
[snip]
> Fix is to move the call to nfs_sb_deactive() into
> nfs_async_unlink_release().

I really doubt that will fix it.

My stacktrace was like:
run_workqueue
called: rpc_async_schedule
  that called: rpc_release_calldata
    which points to: nfs_async_unlink_release
       that called: nfs_free_unlinkdata

So it does not matter for me if nfs_sb_deactive is called one step earlier.

Currently building with SLAB instead of SLUB to see if lockdep tells something...

Torsten
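
The two positions in this exchange can be put side by side with a small
user-space model of the active count. Everything here is hypothetical
scaffolding: struct server, the two variants and the path functions only
mirror the call structure described in the patch and in the stack trace.
It is a sketch of the accounting, not of the hang itself.

```c
/* Hypothetical stand-in for the per-server active counter. */
struct server {
	int active;
};

enum variant { BEFORE_PATCH, AFTER_PATCH };

static void sb_active(struct server *s)   { s->active++; }
static void sb_deactive(struct server *s) { s->active--; }

/* Models nfs_free_unlinkdata(): drops the count only before the patch. */
static void free_unlinkdata(struct server *s, enum variant v)
{
	if (v == BEFORE_PATCH)
		sb_deactive(s);
}

/* Models nfs_async_unlink_release(): drops the count after the patch. */
static void async_unlink_release(struct server *s, enum variant v)
{
	if (v == AFTER_PATCH)
		sb_deactive(s);
	free_unlinkdata(s, v);
}

/* Path seen in the traces: an RPC was started, then released by rpciod. */
int rpc_path_balance(enum variant v)
{
	struct server s = { 0 };

	sb_active(&s);               /* taken when the unlink RPC is issued */
	async_unlink_release(&s, v);
	return s.active;
}

/* Another caller: the data is freed without an RPC having been started. */
int no_rpc_path_balance(enum variant v)
{
	struct server s = { 0 };

	free_unlinkdata(&s, v);
	return s.active;
}
```

In this model the RPC path ends at 0 for both variants, which is the
objection raised above, while the path without an RPC ends at -1 before
the patch and at 0 after it, which is the imbalance the patch description
refers to.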

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 18:05   ` Andrew Morton
@ 2007-11-17 19:33     ` Christoph Lameter
  2007-11-17 20:10       ` Torsten Kaiser
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2007-11-17 19:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Trond Myklebust, LKML, Torsten Kaiser, Kamalesh Babulal,
	linuxppc-dev, nfs, Jan Blunck, Balbir Singh

On Sat, 17 Nov 2007, Andrew Morton wrote:

> > Don't know who to bug about that.
> 
> That's slub.  It appears that list_lock is being taken from process context
> in one place and from softirq in another.

I kicked out some weird interrupt disable code in mm that was only run during
NUMA bootstrap.

This should fix it, but isn't there some mechanism to convince lockdep that
it is okay to do these things during bootstrap?

---
 mm/slub.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2007-11-17 11:31:21.044136631 -0800
+++ linux-2.6/mm/slub.c	2007-11-17 11:32:17.364386560 -0800
@@ -2044,7 +2044,9 @@ static struct kmem_cache_node *early_kme
 #endif
 	init_kmem_cache_node(n);
 	atomic_long_inc(&n->nr_slabs);
+	local_irq_disable();
 	add_partial(kmalloc_caches, page, 0);
+	local_irq_enable();
 	return n;
 }
 

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 18:19     ` Andrew Morton
@ 2007-11-17 19:40       ` Torsten Kaiser
  2007-11-17 23:05         ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-17 19:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Trond Myklebust, Peter Zijlstra, LKML, Kamalesh Babulal,
	linuxppc-dev, nfs, Ingo Molnar, Jan Blunck, Balbir Singh

On Nov 17, 2007 7:19 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Sat, 17 Nov 2007 19:09:46 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> >
> > * Torsten Kaiser <just.for.lkml@googlemail.com> wrote:
> >
> > > Sadly lockdep does not work for me, as it gets turned off early:
> > > [   39.851594] ---------------------------------
> > > [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> > > [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > > [   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>]
> >
> > hey, that means it found a bug - which is not sad at all :-)

It was sad that it found a bug I was not searching for. ;)

> mutter.
>
> Torsten, you could try CONFIG_SLAB=y, CONFIG_SLUB=n to see if you can make
> some progress on the NFS problem.

I should have thought of that myself... OK, anyway, here is the result:

The hang is reproducible; emerge froze the system again after downloading
the source.
Lockdep triggers immediately before the freeze, but the result is still
not helpful:

[  221.565011] INFO: trying to register non-static key.
[  221.566999] the code is fine but needs lockdep annotation.
[  221.569206] turning off the locking correctness validator.
[  221.571404]
[  221.571405] Call Trace:
[  221.572996]  [<ffffffff8025a1b4>] __lock_acquire+0x4c4/0x1140
[  221.575298]  [<ffffffff8025ae85>] lock_acquire+0x55/0x70
[  221.577429]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
[  221.579457]  [<ffffffff805c5f04>] _spin_lock_irqsave+0x34/0x50
[  221.581800]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
[  221.584317]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
[  221.586344]  [<ffffffff805a88b0>] rpc_async_schedule+0x0/0x10
[  221.588648]  [<ffffffff802fface>] nfs_free_unlinkdata+0x1e/0x50
[  221.591023]  [<ffffffff805a7e96>] rpc_release_calldata+0x26/0x50
[  221.593428]  [<ffffffff8024778f>] run_workqueue+0x16f/0x210
[  221.595662]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
[  221.598004]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
[  221.600130]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
[  221.602265]  [<ffffffff8024843d>] worker_thread+0x6d/0xb0
[  221.604431]  [<ffffffff8024bfc0>] autoremove_wake_function+0x0/0x30
[  221.606939]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
[  221.609067]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
[  221.611199]  [<ffffffff8024bbeb>] kthread+0x4b/0x80
[  221.613156]  [<ffffffff8020cb98>] child_rip+0xa/0x12
[  221.615151]  [<ffffffff8020c2af>] restore_args+0x0/0x30
[  221.617247]  [<ffffffff8024bba0>] kthread+0x0/0x80
[  221.619162]  [<ffffffff8020cb8e>] child_rip+0x0/0x12
[  221.621147]
[  221.621749] INFO: lockdep is turned off.
[  226.369259] SysRq : Emergency Sync
[  226.331342] Emergency Sync complete
[  227.064545] SysRq : Emergency Remount R/O
[  228.193491] SysRq : Emergency Sync
[  228.155593] Emergency Sync complete
[  228.767931] SysRq : Resetting

I also had another BUG output during system startup, but that should
be unrelated:
[  103.254681] BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[  103.257757] in_atomic():0, irqs_disabled():1
[  103.259469] 1 lock held by artsd/5883:
[  103.259470]  #0:  (pm_qos_lock){....}, at: [<ffffffff80250efb>] pm_qos_add_requirement+0x6b/0xf0
[  103.263316] irq event stamp: 49712
[  103.263318] hardirqs last  enabled at (49711): [<ffffffff802941ed>] __kmalloc+0x10d/0x180
[  103.263321] hardirqs last disabled at (49712): [<ffffffff805c5eea>] _spin_lock_irqsave+0x1a/0x50
[  103.263326] softirqs last  enabled at (48820): [<ffffffff805954d9>] unix_release_sock+0x79/0x240
[  103.263330] softirqs last disabled at (48818): [<ffffffff805c5b89>] _write_lock_bh+0x9/0x30
[  103.263333]
[  103.263333] Call Trace:
[  103.263335]  [<ffffffff8024fc25>] down_read+0x15/0x40
[  103.263338]  [<ffffffff802507e6>] __blocking_notifier_call_chain+0x46/0x90
[  103.263341]  [<ffffffff80250f23>] pm_qos_add_requirement+0x93/0xf0
[  103.263344]  [<ffffffff804fdc4a>] snd_pcm_hw_params+0x2fa/0x380
[  103.263347]  [<ffffffff804fe93c>] snd_pcm_common_ioctl1+0xb4c/0xdc0
[  103.263350]  [<ffffffff8027b167>] __do_fault+0x227/0x470
[  103.263353]  [<ffffffff8025a435>] __lock_acquire+0x745/0x1140
[  103.263357]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
[  103.263359]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
[  103.263362]  [<ffffffff804fee88>] snd_pcm_playback_ioctl1+0x48/0x240
[  103.263365]  [<ffffffff804ffa36>] snd_pcm_playback_ioctl+0x36/0x50
[  103.263367]  [<ffffffff802a80bf>] vfs_ioctl+0x2f/0xa0
[  103.263369]  [<ffffffff802a8390>] do_vfs_ioctl+0x260/0x2e0
[  103.263371]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
[  103.263373]  [<ffffffff802a84a1>] sys_ioctl+0x91/0xb0
[  103.263376]  [<ffffffff8020bc5e>] system_call+0x7e/0x83
[  103.263379]

Torsten
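
On the "trying to register non-static key" message in the first trace of
the mail above: one reading, which the thread up to this point does not
confirm, is that the lock inside the wait queue head reached by
__wake_up() was never run through an init helper. The user-space sketch
below models only that classification step. All names are hypothetical
and the check is reduced to "does the lock carry a key, or is it a
static object".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lock: key stays NULL until an init helper sets it. */
struct model_lock {
	const void *key;
};

/* Hypothetical wait queue head embedding such a lock. */
struct model_waitq {
	struct model_lock lock;
};

/* One static object serves as the class key for every wait queue lock. */
static int waitq_class_key;

/* Models the init helper that would normally attach the class key. */
void model_init_waitq(struct model_waitq *q)
{
	q->lock.key = &waitq_class_key;
}

/* A lock can be classified if it has a key or lives in static storage. */
bool model_can_classify(const struct model_lock *l, bool lock_is_static)
{
	return l->key != NULL || lock_is_static;
}

/* Allocate a zeroed head on the heap, optionally initialise it, classify. */
bool heap_waitq_classifies(bool call_init)
{
	struct model_waitq *q = calloc(1, sizeof(*q));
	bool ok;

	assert(q != NULL);
	if (call_init)
		model_init_waitq(q);
	ok = model_can_classify(&q->lock, false);
	free(q);
	return ok;
}
```

Here heap_waitq_classifies(false) fails the check and
heap_waitq_classifies(true) passes it, so in this model the message
points at a missing initialisation rather than at a missing annotation.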

* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 19:33     ` Christoph Lameter
@ 2007-11-17 20:10       ` Torsten Kaiser
  0 siblings, 0 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-17 20:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Trond Myklebust, LKML, Kamalesh Babulal, linuxppc-dev, nfs,
	Andrew Morton, Jan Blunck, Balbir Singh

On Nov 17, 2007 8:33 PM, Christoph Lameter <clameter@sgi.com> wrote:
> On Sat, 17 Nov 2007, Andrew Morton wrote:
>
> > That's slub.  It appears that list_lock is being taken from process context
> > in one place and from softirq in another.
>
> I kicked out some weird interrupt disable code in mm that was only run during
> NUMA bootstrap.

I'm using NUMA (Opteron), so this indeed fixes it.

A kernel compiled with SLUB now outputs the same message as the SLAB
one: lockdep annotations are needed at the place where nfs hangs.

> This should fix it but isn't there some mechanism to convince lockdep that
> it is okay to do these things during bootstrap?
>
> ---
>  mm/slub.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c    2007-11-17 11:31:21.044136631 -0800
> +++ linux-2.6/mm/slub.c 2007-11-17 11:32:17.364386560 -0800
> @@ -2044,7 +2044,9 @@ static struct kmem_cache_node *early_kme
>  #endif
>         init_kmem_cache_node(n);
>         atomic_long_inc(&n->nr_slabs);
> +       local_irq_disable();
>         add_partial(kmalloc_caches, page, 0);
> +       local_irq_enable();
>         return n;
>  }
>
>
>


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 18:09   ` Ingo Molnar
  2007-11-17 18:19     ` Andrew Morton
@ 2007-11-17 23:00     ` root
  2007-11-19 22:50       ` Christoph Lameter
  1 sibling, 1 reply; 20+ messages in thread
From: root @ 2007-11-17 23:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Trond Myklebust, Peter Zijlstra, LKML, Torsten Kaiser,
	Kamalesh Babulal, linuxppc-dev, nfs, Christoph Lameter,
	Andrew Morton, Jan Blunck, Balbir Singh

On Sat, Nov 17, 2007 at 07:09:46PM +0100, Ingo Molnar wrote:
> 
> * Torsten Kaiser <just.for.lkml@googlemail.com> wrote:
> 
> > Sadly lockdep does not work for me, as it gets turned off early:
> > [   39.851594] ---------------------------------
> > [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> > [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > [   39.866963]  (&n->list_lock){-+..}, at: [<ffffffff802935c1>]
> 
> hey, that means it found a bug - which is not sad at all :-)
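
To make the report above concrete, here is a toy userspace model of the
rule lockdep is enforcing. It is only a sketch: the names
(toy_lock_class, toy_mark_acquire, the TOY_* bits) are made up for
illustration, and real lockdep tracks far more state than this. The
assumption modelled is that a lock class which is ever taken from
softirq context must never also be taken with softirqs enabled, since a
softirq could then interrupt the holder on the same CPU and spin on the
lock forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified usage bits; real lockdep records many more. */
enum {
	TOY_USED_IN_SOFTIRQ   = 1 << 0, /* taken while running in softirq context */
	TOY_TAKEN_SOFTIRQS_ON = 1 << 1, /* taken with softirqs enabled */
};

struct toy_lock_class {
	unsigned int usage; /* accumulated over every acquisition so far */
};

/*
 * Record one acquisition of the lock class and return true while the
 * accumulated usage is still consistent, false once both bits are set.
 */
static bool toy_mark_acquire(struct toy_lock_class *c,
			     bool in_softirq, bool softirqs_enabled)
{
	const unsigned int both = TOY_USED_IN_SOFTIRQ | TOY_TAKEN_SOFTIRQS_ON;

	assert(c != NULL);
	if (in_softirq)
		c->usage |= TOY_USED_IN_SOFTIRQ;
	else if (softirqs_enabled)
		c->usage |= TOY_TAKEN_SOFTIRQS_ON;

	return (c->usage & both) != both;
}
```

In this model, taking the lock once at boot with softirqs enabled and
later from softirq context makes toy_mark_acquire() return false, which
is the situation the {softirq-on-W} -> {in-softirq-W} message describes.
Disabling interrupts around the boot-time acquisition keeps the second
bit from ever being set.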

---
Subject: lockdep: slub: annotate boot time node->list_lock usage

inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
 (&n->list_lock){-+..}, at: [<ffffffff802935c1>] add_partial+0x31/0xa0
{softirq-on-W} state was registered at:
  [<ffffffff80259fb8>] __lock_acquire+0x3e8/0x1140
  [<ffffffff80259838>] debug_check_no_locks_freed+0x188/0x1a0
  [<ffffffff8025ad65>] lock_acquire+0x55/0x70
  [<ffffffff802935c1>] add_partial+0x31/0xa0
  [<ffffffff805c76de>] _spin_lock+0x1e/0x30
  [<ffffffff802935c1>] add_partial+0x31/0xa0
  [<ffffffff80296f9c>] kmem_cache_open+0x1cc/0x330
  [<ffffffff805c7984>] _spin_unlock_irq+0x24/0x30
  [<ffffffff802974f4>] create_kmalloc_cache+0x64/0xf0
  [<ffffffff80295640>] init_alloc_cpu_cpu+0x70/0x90
  [<ffffffff8080ada5>] kmem_cache_init+0x65/0x1d0
  [<ffffffff807f1b4e>] start_kernel+0x23e/0x350
  [<ffffffff807f112d>] _sinittext+0x12d/0x140
  [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Christoph Lameter <clameter@sgi.com>
CC: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
 mm/slub.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c
+++ linux-2.6/mm/slub.c
@@ -2155,6 +2155,7 @@ static struct kmem_cache_node *early_kme
 {
 	struct page *page;
 	struct kmem_cache_node *n;
+	unsigned long flags;
 
 	BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
 
@@ -2179,7 +2180,14 @@ static struct kmem_cache_node *early_kme
 #endif
 	init_kmem_cache_node(n);
 	atomic_long_inc(&n->nr_slabs);
+	/*
+	 * lockdep requires consistent irq usage for each lock
+	 * so even though there cannot be a race this early in
+	 * the boot sequence, we still disable irqs.
+	 */
+	local_irq_save(flags);
 	add_partial(kmalloc_caches, page, 0);
+	local_irq_restore(flags);
 	return n;
 }
 


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 19:40       ` Torsten Kaiser
@ 2007-11-17 23:05         ` Peter Zijlstra
  2007-11-17 23:44           ` Torsten Kaiser
  2007-11-18 18:44           ` Torsten Kaiser
  0 siblings, 2 replies; 20+ messages in thread
From: Peter Zijlstra @ 2007-11-17 23:05 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Trond Myklebust, Peter Zijlstra, steved, LKML, Kamalesh Babulal,
	linuxppc-dev, nfs, Andrew Morton, Jan Blunck, Ingo Molnar,
	Balbir Singh

On Sat, Nov 17, 2007 at 08:40:22PM +0100, Torsten Kaiser wrote:

> Lockdep triggers immediately before the freeze, but the result is still
> not helpful:
> 
> [  221.565011] INFO: trying to register non-static key.
> [  221.566999] the code is fine but needs lockdep annotation.
> [  221.569206] turning off the locking correctness validator.
> [  221.571404]
> [  221.571405] Call Trace:
> [  221.572996]  [<ffffffff8025a1b4>] __lock_acquire+0x4c4/0x1140
> [  221.575298]  [<ffffffff8025ae85>] lock_acquire+0x55/0x70
> [  221.577429]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> [  221.579457]  [<ffffffff805c5f04>] _spin_lock_irqsave+0x34/0x50
> [  221.581800]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
> [  221.584317]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> [  221.586344]  [<ffffffff805a88b0>] rpc_async_schedule+0x0/0x10
> [  221.588648]  [<ffffffff802fface>] nfs_free_unlinkdata+0x1e/0x50
> [  221.591023]  [<ffffffff805a7e96>] rpc_release_calldata+0x26/0x50
> [  221.593428]  [<ffffffff8024778f>] run_workqueue+0x16f/0x210
> [  221.595662]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  221.598004]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.600130]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.602265]  [<ffffffff8024843d>] worker_thread+0x6d/0xb0
> [  221.604431]  [<ffffffff8024bfc0>] autoremove_wake_function+0x0/0x30
> [  221.606939]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.609067]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.611199]  [<ffffffff8024bbeb>] kthread+0x4b/0x80
> [  221.613156]  [<ffffffff8020cb98>] child_rip+0xa/0x12
> [  221.615151]  [<ffffffff8020c2af>] restore_args+0x0/0x30
> [  221.617247]  [<ffffffff8024bba0>] kthread+0x0/0x80
> [  221.619162]  [<ffffffff8020cb8e>] child_rip+0x0/0x12
> [  221.621147]
> [  221.621749] INFO: lockdep is turned off.

I've been staring at this NFS code for a while and can't make any sense
out of it. It seems to correctly initialize the waitqueue. So this would
indicate corruption of some sort.



> I also had another BUG output during system startup, but that should
> be unrelated:
> [  103.254681] BUG: sleeping function called from invalid context at
> kernel/rwsem.c:20
> [  103.257757] in_atomic():0, irqs_disabled():1
> [  103.259469] 1 lock held by artsd/5883:
> [  103.259470]  #0:  (pm_qos_lock){....}, at: [<ffffffff80250efb>]
> pm_qos_add_requirement+0x6b/0xf0
> [  103.263316] irq event stamp: 49712
> [  103.263318] hardirqs last  enabled at (49711): [<ffffffff802941ed>]
> __kmalloc+0x10d/0x180
> [  103.263321] hardirqs last disabled at (49712): [<ffffffff805c5eea>]
> _spin_lock_irqsave+0x1a/0x50
> [  103.263326] softirqs last  enabled at (48820): [<ffffffff805954d9>]
> unix_release_sock+0x79/0x240
> [  103.263330] softirqs last disabled at (48818): [<ffffffff805c5b89>]
> _write_lock_bh+0x9/0x30
> [  103.263333]
> [  103.263333] Call Trace:
> [  103.263335]  [<ffffffff8024fc25>] down_read+0x15/0x40
> [  103.263338]  [<ffffffff802507e6>] __blocking_notifier_call_chain+0x46/0x90
> [  103.263341]  [<ffffffff80250f23>] pm_qos_add_requirement+0x93/0xf0
> [  103.263344]  [<ffffffff804fdc4a>] snd_pcm_hw_params+0x2fa/0x380
> [  103.263347]  [<ffffffff804fe93c>] snd_pcm_common_ioctl1+0xb4c/0xdc0
> [  103.263350]  [<ffffffff8027b167>] __do_fault+0x227/0x470
> [  103.263353]  [<ffffffff8025a435>] __lock_acquire+0x745/0x1140
> [  103.263357]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
> [  103.263359]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  103.263362]  [<ffffffff804fee88>] snd_pcm_playback_ioctl1+0x48/0x240
> [  103.263365]  [<ffffffff804ffa36>] snd_pcm_playback_ioctl+0x36/0x50
> [  103.263367]  [<ffffffff802a80bf>] vfs_ioctl+0x2f/0xa0
> [  103.263369]  [<ffffffff802a8390>] do_vfs_ioctl+0x260/0x2e0
> [  103.263371]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  103.263373]  [<ffffffff802a84a1>] sys_ioctl+0x91/0xb0
> [  103.263376]  [<ffffffff8020bc5e>] system_call+0x7e/0x83
> [  103.263379]

This pm-qos code is fubar, it calls blocking_notifier_call_chain while
holding a spinlock (and that is after 'fixing' it from a
srcu_notifier_call_chain - which is equally wrong).


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 23:05         ` Peter Zijlstra
@ 2007-11-17 23:44           ` Torsten Kaiser
  2007-11-18 18:44           ` Torsten Kaiser
  1 sibling, 0 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-17 23:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Trond Myklebust, steved, LKML, Kamalesh Babulal, linuxppc-dev,
	nfs, Andrew Morton, Jan Blunck, Ingo Molnar, Balbir Singh

On Nov 18, 2007 12:05 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>
> On Sat, Nov 17, 2007 at 08:40:22PM +0100, Torsten Kaiser wrote:
>
> > Lockdep triggers immediately before the freeze, but the result is still
> > not helpful:
> >
> > [  221.565011] INFO: trying to register non-static key.
> > [  221.566999] the code is fine but needs lockdep annotation.
> > [  221.569206] turning off the locking correctness validator.
> > [  221.571404]
> > [  221.571405] Call Trace:
> > [  221.572996]  [<ffffffff8025a1b4>] __lock_acquire+0x4c4/0x1140
> > [  221.575298]  [<ffffffff8025ae85>] lock_acquire+0x55/0x70
> > [  221.577429]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> > [  221.579457]  [<ffffffff805c5f04>] _spin_lock_irqsave+0x34/0x50
> > [  221.581800]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
> > [  221.584317]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> > [  221.586344]  [<ffffffff805a88b0>] rpc_async_schedule+0x0/0x10
> > [  221.588648]  [<ffffffff802fface>] nfs_free_unlinkdata+0x1e/0x50
> > [  221.591023]  [<ffffffff805a7e96>] rpc_release_calldata+0x26/0x50
> > [  221.593428]  [<ffffffff8024778f>] run_workqueue+0x16f/0x210
> > [  221.595662]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> > [  221.598004]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> > [  221.600130]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> > [  221.602265]  [<ffffffff8024843d>] worker_thread+0x6d/0xb0
> > [  221.604431]  [<ffffffff8024bfc0>] autoremove_wake_function+0x0/0x30
> > [  221.606939]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> > [  221.609067]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> > [  221.611199]  [<ffffffff8024bbeb>] kthread+0x4b/0x80
> > [  221.613156]  [<ffffffff8020cb98>] child_rip+0xa/0x12
> > [  221.615151]  [<ffffffff8020c2af>] restore_args+0x0/0x30
> > [  221.617247]  [<ffffffff8024bba0>] kthread+0x0/0x80
> > [  221.619162]  [<ffffffff8020cb8e>] child_rip+0x0/0x12
> > [  221.621147]
> > [  221.621749] INFO: lockdep is turned off.
>
> I've been staring at this NFS code for a while and can't make any sense
> out of it. It seems to correctly initialize the waitqueue. So this would
> indicate corruption of some sort.

Not sure if this is helpful, but after looking into the code, the
above stacktrace looks somewhat damaged.
Might be my fault: # CONFIG_FRAME_POINTER is not set
On the other hand the stacktrace from the run with the SLUB lockdep
fix shows the same function names.

That trace contains this line:
 [<ffffffff8030167e>] nfs_free_unlinkdata+0x1e/0x50
(gdb) list *0xffffffff8030167e
0xffffffff8030167e is in nfs_free_unlinkdata (fs/nfs/unlink.c:33).
28       */
29      static void
30      nfs_free_unlinkdata(struct nfs_unlinkdata *data)
31      {
32              nfs_sb_deactive(NFS_SERVER(data->dir));
33              iput(data->dir);
34              put_rpccred(data->cred);
35              kfree(data->args.name.name);
36              kfree(data);
37      }

Is some inode lock guilty?
Please ask, if you need more information.

Torsten


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 23:05         ` Peter Zijlstra
  2007-11-17 23:44           ` Torsten Kaiser
@ 2007-11-18 18:44           ` Torsten Kaiser
  2007-11-18 19:18             ` Trond Myklebust
  1 sibling, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-18 18:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Trond Myklebust, steved, LKML, Kamalesh Babulal, linuxppc-dev,
	nfs, Andrew Morton, Jan Blunck, Ingo Molnar, Balbir Singh

On Nov 18, 2007 12:05 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> I've been staring at this NFS code for a while and can't make any sense
> out of it. It seems to correctly initialize the waitqueue. So this would
> indicate corruption of some sort.

No, it does not "correctly" initialize the waitqueue. It doesn't even
try to initialize it.

I now found the guilty patch and what is wrong with it.

nfs-stop-sillyname-renames-and-unmounts-from-racing.patch adds:

@@ -110,8 +112,22 @@ struct nfs_server {
                                                   filesystem */
 #endif
        void (*destroy)(struct nfs_server *);
+
+       atomic_t active; /* Keep trace of any activity to this server */
+       wait_queue_head_t active_wq;  /* Wait for any activity to stop  */

and tries to initialize it:
@@ -593,6 +593,10 @@ static int nfs_init_server(struct nfs_server *server,
        server->namelen  = data->namlen;
        /* Create a client RPC handle for the NFSv3 ACL management interface */
        nfs_init_server_aclclient(server);
+
+       init_waitqueue_head(&server->active_wq);
+       atomic_set(&server->active, 0);
+

and then uses it via nfs_sb_active and nfs_sb_deactive:

@@ -29,6 +29,7 @@ struct nfs_unlinkdata {
 static void
 nfs_free_unlinkdata(struct nfs_unlinkdata *data)
 {
+       nfs_sb_deactive(NFS_SERVER(data->dir));
        iput(data->dir);
        put_rpccred(data->cred);
        kfree(data->args.name.name);
@@ -151,6 +152,7 @@ static int nfs_do_call_unlink(struct dentry
*parent, struct inode *dir, struct n
                nfs_dec_sillycount(dir);
                return 0;
        }
+       nfs_sb_active(NFS_SERVER(dir));
        data->args.fh = NFS_FH(dir);
        nfs_fattr_init(&data->res.dir_attr);


But it does not notice this:
struct dentry_operations nfs_dentry_operations = {
        .d_revalidate   = nfs_lookup_revalidate,
        .d_delete       = nfs_dentry_delete,
        .d_iput         = nfs_dentry_iput,
};
struct dentry_operations nfs4_dentry_operations = {
        .d_revalidate   = nfs_open_revalidate,
        .d_delete       = nfs_dentry_delete,
        .d_iput         = nfs_dentry_iput,
};

NFSv2/3 and NFSv4 share the same dentry_iput and so share the same
unlink and sillyrename logic.
But they do not share nfs_init_server()!

I wonder why this doesn't blow up more violently, but only hangs...

But as I don't know if it is correct to add the waitqueue
initialization to nfs4_init_server() or remove the nfs_sb_active /
nfs_sb_deactive for the NFSv4 case, I can't offer a patch to fix this.
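
To illustrate the problem, here is a small userspace model. It is a
sketch under assumptions, not the real code: struct toy_server and all
toy_* functions are hypothetical stand-ins, and a boolean replaces the
actual init_waitqueue_head() state. It shows why initialization that
lives in a version-specific setup function is missed by the other
version, while initialization in a shared allocator covers both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct nfs_server; only the two new fields matter. */
struct toy_server {
	int  active;          /* models atomic_t active */
	bool active_wq_ready; /* models "init_waitqueue_head() has run" */
};

/*
 * Allocator shared by every protocol version. When init_in_alloc is
 * true the wait queue is set up here, for all callers.
 */
static struct toy_server *toy_alloc_server(bool init_in_alloc)
{
	struct toy_server *s = calloc(1, sizeof(*s));

	if (s && init_in_alloc)
		s->active_wq_ready = true;
	return s;
}

/* Only the v2/v3 mount path runs this setup step. */
static void toy_init_server_v3(struct toy_server *s)
{
	s->active_wq_ready = true;
}

/* The v4 mount path has its own setup and never calls the v3 one. */
static void toy_init_server_v4(struct toy_server *s)
{
	(void)s; /* nothing here touches the wait queue */
}

/* Shared unlink path: waking the queue is safe only if it was set up. */
static bool toy_sb_deactive_is_safe(const struct toy_server *s)
{
	return s->active_wq_ready;
}
```

With toy_alloc_server(false), a server set up through
toy_init_server_v4() reaches the shared unlink path with an
uninitialized wait queue; with toy_alloc_server(true) both versions are
covered without touching either setup function.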

Torsten


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-18 18:44           ` Torsten Kaiser
@ 2007-11-18 19:18             ` Trond Myklebust
  2007-11-19  7:15               ` Torsten Kaiser
  2007-11-20  5:35               ` Andrew Morton
  0 siblings, 2 replies; 20+ messages in thread
From: Trond Myklebust @ 2007-11-18 19:18 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Peter Zijlstra, steved, LKML, Kamalesh Babulal, linuxppc-dev, nfs,
	Andrew Morton, Jan Blunck, Ingo Molnar, Balbir Singh

[-- Attachment #1: Type: text/plain, Size: 3001 bytes --]


On Sun, 2007-11-18 at 19:44 +0100, Torsten Kaiser wrote:
> On Nov 18, 2007 12:05 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > I've been staring at this NFS code for a while and can't make any sense
> > out of it. It seems to correctly initialize the waitqueue. So this would
> > indicate corruption of some sort.
> 
> No, it does not "correctly" initialize the waitqueue. It doesn't even
> try to initialize it.
> 
> I now found the guilty patch and what is wrong with it.
> 
> nfs-stop-sillyname-renames-and-unmounts-from-racing.patch adds:
> 
> @@ -110,8 +112,22 @@ struct nfs_server {
>                                                    filesystem */
>  #endif
>         void (*destroy)(struct nfs_server *);
> +
> +       atomic_t active; /* Keep trace of any activity to this server */
> +       wait_queue_head_t active_wq;  /* Wait for any activity to stop  */
> 
> and tries to initialize it:
> @@ -593,6 +593,10 @@ static int nfs_init_server(struct nfs_server *server,
>         server->namelen  = data->namlen;
>         /* Create a client RPC handle for the NFSv3 ACL management interface */
>         nfs_init_server_aclclient(server);
> +
> +       init_waitqueue_head(&server->active_wq);
> +       atomic_set(&server->active, 0);
> +
> 
> and then uses it via nfs_sb_active and nfs_sb_deactive:
> 
> @@ -29,6 +29,7 @@ struct nfs_unlinkdata {
>  static void
>  nfs_free_unlinkdata(struct nfs_unlinkdata *data)
>  {
> +       nfs_sb_deactive(NFS_SERVER(data->dir));
>         iput(data->dir);
>         put_rpccred(data->cred);
>         kfree(data->args.name.name);
> @@ -151,6 +152,7 @@ static int nfs_do_call_unlink(struct dentry
> *parent, struct inode *dir, struct n
>                 nfs_dec_sillycount(dir);
>                 return 0;
>         }
> +       nfs_sb_active(NFS_SERVER(dir));
>         data->args.fh = NFS_FH(dir);
>         nfs_fattr_init(&data->res.dir_attr);
> 
> 
> But it does not notice this:
> struct dentry_operations nfs_dentry_operations = {
>         .d_revalidate   = nfs_lookup_revalidate,
>         .d_delete       = nfs_dentry_delete,
>         .d_iput         = nfs_dentry_iput,
> };
> struct dentry_operations nfs4_dentry_operations = {
>         .d_revalidate   = nfs_open_revalidate,
>         .d_delete       = nfs_dentry_delete,
>         .d_iput         = nfs_dentry_iput,
> };
> 
> NFSv2/3 and NFSv4 share the same dentry_iput and so share the same
> unlink and sillyrename logic.
> But they do not share nfs_init_server()!
> 
> I wonder why this doesn't blow up more violently, but only hangs...
> 
> But as I don't know if it is correct to add the waitqueue
> initialization to nfs4_init_server() or remove the nfs_sb_active /
> nfs_sb_deactive for the NFSv4 case, I can't offer a patch to fix this.
> 
> Torsten

I had already fixed that one in my own stack. Attached are the 3 patches
that I've got. 1 from SteveD, 2 fixes.

Andrew, could you please unapply the sillyrename patches you've got, and
apply these 3 instead?

Trond


[-- Attachment #2: linux-2.6.24-005-fix_sillyrename_bug_on_umount.dif --]
[-- Type: message/rfc822, Size: 4060 bytes --]

From: Steve Dickson <SteveD@redhat.com>
Subject: NFS: Stop sillyname renames and unmounts from racing
Date: Thu, 08 Nov 2007 04:05:04 -0500
Message-ID: <1195413486.7893.17.camel@heimdal.trondhjem.org>

Added an active/deactive mechanism to the nfs_server structure
allowing async operations to hold off umount until the
operations are done.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/client.c           |    4 ++++
 fs/nfs/super.c            |   13 +++++++++++++
 fs/nfs/unlink.c           |    2 ++
 include/linux/nfs_fs_sb.h |   17 +++++++++++++++++
 4 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 70587f3..2ecf726 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -593,6 +593,10 @@ static int nfs_init_server(struct nfs_server *server,
 	server->namelen  = data->namlen;
 	/* Create a client RPC handle for the NFSv3 ACL management interface */
 	nfs_init_server_aclclient(server);
+
+	init_waitqueue_head(&server->active_wq);
+	atomic_set(&server->active, 0);
+
 	dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp);
 	return 0;
 
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 71067d1..833aed8 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -202,6 +202,7 @@ static int nfs_get_sb(struct file_system_type *, int, const char *, void *, stru
 static int nfs_xdev_get_sb(struct file_system_type *fs_type,
 		int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static void nfs_kill_super(struct super_block *);
+static void nfs_put_super(struct super_block *);
 
 static struct file_system_type nfs_fs_type = {
 	.owner		= THIS_MODULE,
@@ -223,6 +224,7 @@ static const struct super_operations nfs_sops = {
 	.alloc_inode	= nfs_alloc_inode,
 	.destroy_inode	= nfs_destroy_inode,
 	.write_inode	= nfs_write_inode,
+	.put_super	= nfs_put_super,
 	.statfs		= nfs_statfs,
 	.clear_inode	= nfs_clear_inode,
 	.umount_begin	= nfs_umount_begin,
@@ -1772,6 +1774,17 @@ static void nfs4_kill_super(struct super_block *sb)
 	nfs_free_server(server);
 }
 
+static void nfs_put_super(struct super_block *sb)
+{
+	struct nfs_server *server = NFS_SB(sb);
+	/*
+	 * Make sure there are no outstanding ops to this server.
+	 * If so, wait for them to finish before allowing the
+	 * unmount to continue.
+	 */
+	wait_event(server->active_wq, atomic_read(&server->active) == 0);
+}
+
 /*
  * Clone an NFS4 server record on xdev traversal (FSID-change)
  */
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 233ad38..cf12a24 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -29,6 +29,7 @@ struct nfs_unlinkdata {
 static void
 nfs_free_unlinkdata(struct nfs_unlinkdata *data)
 {
+	nfs_sb_deactive(NFS_SERVER(data->dir));
 	iput(data->dir);
 	put_rpccred(data->cred);
 	kfree(data->args.name.name);
@@ -151,6 +152,7 @@ static int nfs_do_call_unlink(struct dentry *parent, struct inode *dir, struct n
 		nfs_dec_sillycount(dir);
 		return 0;
 	}
+	nfs_sb_active(NFS_SERVER(dir));
 	data->args.fh = NFS_FH(dir);
 	nfs_fattr_init(&data->res.dir_attr);
 
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 0cac49b..6ef3af8 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -4,6 +4,8 @@
 #include <linux/list.h>
 #include <linux/backing-dev.h>
 
+#include <asm/atomic.h>
+
 struct nfs_iostats;
 
 /*
@@ -110,8 +112,23 @@ struct nfs_server {
 						   filesystem */
 #endif
 	void (*destroy)(struct nfs_server *);
+
+	atomic_t active; /* Keep trace of any activity to this server */
+	wait_queue_head_t active_wq;  /* Wait for any activity to stop  */
 };
 
+static inline void 
+nfs_sb_active(struct nfs_server *server)
+{
+	atomic_inc(&server->active);
+}
+static inline void 
+nfs_sb_deactive(struct nfs_server *server)
+{
+	if (atomic_dec_and_test(&server->active))
+		wake_up(&server->active_wq);
+}
+
 /* Server capabilities */
 #define NFS_CAP_READDIRPLUS	(1U << 0)
 #define NFS_CAP_HARDLINKS	(1U << 1)

[-- Attachment #3: linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif --]
[-- Type: message/rfc822, Size: 4205 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: NFS: Fix up problems with Steve's sillyrename fix
Date: Sat, 17 Nov 2007 13:08:49 -0500
Message-ID: <1195413486.7893.18.camel@heimdal.trondhjem.org>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/client.c           |    6 +++---
 fs/nfs/internal.h         |    2 ++
 fs/nfs/super.c            |   33 ++++++++++++++++++++++-----------
 fs/nfs/unlink.c           |    2 ++
 include/linux/nfs_fs_sb.h |   13 +------------
 5 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 2ecf726..be9fecb 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -594,9 +594,6 @@ static int nfs_init_server(struct nfs_server *server,
 	/* Create a client RPC handle for the NFSv3 ACL management interface */
 	nfs_init_server_aclclient(server);
 
-	init_waitqueue_head(&server->active_wq);
-	atomic_set(&server->active, 0);
-
 	dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp);
 	return 0;
 
@@ -736,6 +733,9 @@ static struct nfs_server *nfs_alloc_server(void)
 	INIT_LIST_HEAD(&server->client_link);
 	INIT_LIST_HEAD(&server->master_link);
 
+	init_waitqueue_head(&server->active_wq);
+	atomic_set(&server->active, 0);
+
 	server->io_stats = nfs_alloc_iostats();
 	if (!server->io_stats) {
 		kfree(server);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index f3acf48..7579379 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -160,6 +160,8 @@ extern struct rpc_stat nfs_rpcstat;
 
 extern int __init register_nfs_fs(void);
 extern void __exit unregister_nfs_fs(void);
+extern void nfs_sb_active(struct nfs_server *server);
+extern void nfs_sb_deactive(struct nfs_server *server);
 
 /* namespace.c */
 extern char *nfs_path(const char *base,
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 833aed8..046d1ac 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -327,6 +327,28 @@ void __exit unregister_nfs_fs(void)
 	unregister_filesystem(&nfs_fs_type);
 }
 
+void nfs_sb_active(struct nfs_server *server)
+{
+	atomic_inc(&server->active);
+}
+
+void nfs_sb_deactive(struct nfs_server *server)
+{
+	if (atomic_dec_and_test(&server->active))
+		wake_up(&server->active_wq);
+}
+
+static void nfs_put_super(struct super_block *sb)
+{
+	struct nfs_server *server = NFS_SB(sb);
+	/*
+	 * Make sure there are no outstanding ops to this server.
+	 * If so, wait for them to finish before allowing the
+	 * unmount to continue.
+	 */
+	wait_event(server->active_wq, atomic_read(&server->active) == 0);
+}
+
 /*
  * Deliver file system statistics to userspace
  */
@@ -1774,17 +1796,6 @@ static void nfs4_kill_super(struct super_block *sb)
 	nfs_free_server(server);
 }
 
-static void nfs_put_super(struct super_block *sb)
-{
-	struct nfs_server *server = NFS_SB(sb);
-	/*
-	 * Make sure there are no outstanding ops to this server.
-	 * If so, wait for them to finish before allowing the
-	 * unmount to continue.
-	 */
-	wait_event(server->active_wq, atomic_read(&server->active) == 0);
-}
-
 /*
  * Clone an NFS4 server record on xdev traversal (FSID-change)
  */
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index cf12a24..b97d3bb 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -14,6 +14,8 @@
 #include <linux/sched.h>
 #include <linux/wait.h>
 
+#include "internal.h"
+
 struct nfs_unlinkdata {
 	struct hlist_node list;
 	struct nfs_removeargs args;
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 6ef3af8..9f949b5 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -3,6 +3,7 @@
 
 #include <linux/list.h>
 #include <linux/backing-dev.h>
+#include <linux/wait.h>
 
 #include <asm/atomic.h>
 
@@ -117,18 +118,6 @@ struct nfs_server {
 	wait_queue_head_t active_wq;  /* Wait for any activity to stop  */
 };
 
-static inline void 
-nfs_sb_active(struct nfs_server *server)
-{
-	atomic_inc(&server->active);
-}
-static inline void 
-nfs_sb_deactive(struct nfs_server *server)
-{
-	if (atomic_dec_and_test(&server->active))
-		wake_up(&server->active_wq);
-}
-
 /* Server capabilities */
 #define NFS_CAP_READDIRPLUS	(1U << 0)
 #define NFS_CAP_HARDLINKS	(1U << 1)

[-- Attachment #4: linux-2.6.24-007-fix_nfs_free_unlinkdata.dif --]
[-- Type: message/rfc822, Size: 1255 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: NFS: Fix nfs_free_unlinkdata()
Date: Sat, 17 Nov 2007 13:52:36 -0500
Message-ID: <1195413486.7893.19.camel@heimdal.trondhjem.org>

We should really only be calling nfs_sb_deactive() at the end of an RPC
call, to balance the nfs_sb_active() call in nfs_do_call_unlink(). OTOH,
nfs_free_unlinkdata() can be called from a variety of other situations.

Fix is to move the call to nfs_sb_deactive() into
nfs_async_unlink_release().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/unlink.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index b97d3bb..c90862a 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -31,7 +31,6 @@ struct nfs_unlinkdata {
 static void
 nfs_free_unlinkdata(struct nfs_unlinkdata *data)
 {
-	nfs_sb_deactive(NFS_SERVER(data->dir));
 	iput(data->dir);
 	put_rpccred(data->cred);
 	kfree(data->args.name.name);
@@ -116,6 +115,7 @@ static void nfs_async_unlink_release(void *calldata)
 	struct nfs_unlinkdata	*data = calldata;
 
 	nfs_dec_sillycount(data->dir);
+	nfs_sb_deactive(NFS_SERVER(data->dir));
 	nfs_free_unlinkdata(data);
 }
 

[-- Attachment #5: series --]
[-- Type: text/plain, Size: 231 bytes --]

# This series applies on GIT commit 4c1fe2f78a08e2c514a39c91a0eb7b55bbd3c0d2
linux-2.6.24-005-fix_sillyrename_bug_on_umount.dif
linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif
linux-2.6.24-007-fix_nfs_free_unlinkdata.dif


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-18 19:18             ` Trond Myklebust
@ 2007-11-19  7:15               ` Torsten Kaiser
  2007-11-19  9:00                 ` Andrew Morton
  2007-11-20  5:35               ` Andrew Morton
  1 sibling, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-19  7:15 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Zijlstra, steved, LKML, Kamalesh Babulal, linuxppc-dev, nfs,
	Andrew Morton, Jan Blunck, Ingo Molnar, Balbir Singh

On Nov 18, 2007 8:18 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Sun, 2007-11-18 at 19:44 +0100, Torsten Kaiser wrote:
> > NFSv2/3 and NFSv4 share the same dentry_iput and so share the same
> > unlink and sillyrename logic.
> > But they do not share nfs_init_server()!
> >
> > I wonder why this doesn't blow up more violently, but only hangs...
> >
> > But as I don't know if it is correct to add the waitqueue
> > initialization to nfs4_init_server() or remove the nfs_sb_active /
> > nfs_sb_deactive for the NFSv4 case, I can't offer a patch to fix this.
> >
> > Torsten
>
> I had already fixed that one in my own stack. Attached are the 3 patches
> that I've got. 1 from SteveD, 2 fixes.
>

Moving the init_waitqueue_head() call as in patch
linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif and applying
linux-2.6.24-007-fix_nfs_free_unlinkdata.dif lets my testcase work.
Also lockdep no longer complains about the non-static key.
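
For reference, the counter balance that the 007 patch restores can be
modelled in userspace. This is a sketch only: the toy_* names are
hypothetical, a plain int stands in for the atomic counter, and the
wake_up() call is left out. It assumes, as the patch description says,
that the free function can also be reached without a matching "active"
call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the server's activity counter. */
struct toy_srv {
	int active;
};

static void toy_sb_active(struct toy_srv *s)   { s->active++; }
static void toy_sb_deactive(struct toy_srv *s) { s->active--; }

/* Models the free function: also reachable when no RPC was started. */
static void toy_free_unlinkdata(struct toy_srv *s, bool deactive_in_free)
{
	if (deactive_in_free)
		toy_sb_deactive(s); /* placement before the 007 patch */
}

/* Models the RPC release callback: runs once per RPC that was started. */
static void toy_async_unlink_release(struct toy_srv *s, bool deactive_in_free)
{
	if (!deactive_in_free)
		toy_sb_deactive(s); /* placement after the 007 patch */
	toy_free_unlinkdata(s, deactive_in_free);
}

/*
 * One completed RPC plus one unlinkdata freed without any RPC.
 * Returns the final counter; 0 means the calls were balanced.
 */
static int toy_run(bool deactive_in_free)
{
	struct toy_srv s = { 0 };

	assert(s.active == 0);
	toy_sb_active(&s);                              /* RPC started */
	toy_async_unlink_release(&s, deactive_in_free); /* RPC finished */
	toy_free_unlinkdata(&s, deactive_in_free);      /* freed without RPC */
	return s.active;
}
```

With the decrement in the free function, toy_run(true) ends below zero
because the extra free path decrements without a matching increment.
With the decrement in the release callback, toy_run(false) ends at zero.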

Torsten


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-19  7:15               ` Torsten Kaiser
@ 2007-11-19  9:00                 ` Andrew Morton
  2007-11-19 18:24                   ` Torsten Kaiser
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2007-11-19  9:00 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Jan Blunck, Peter Zijlstra, steved, LKML, Kamalesh Babulal,
	linuxppc-dev, nfs, Ingo Molnar, Trond Myklebust, Balbir Singh

On Mon, 19 Nov 2007 08:15:48 +0100 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:

> On Nov 18, 2007 8:18 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > On Sun, 2007-11-18 at 19:44 +0100, Torsten Kaiser wrote:
> > > NFSv2/3 and NFSv4 share the same dentry_iput and so share the same
> > > unlink and sillyrename logic.
> > > But they do not share nfs_init_server()!
> > >
> > > I wonder why this doesn't blow up more violently, but only hangs...
> > >
> > > But as I don't know if it is correct to add the waitqueue
> > > initialization to nfs4_init_server() or remove the nfs_sb_active /
> > > nfs_sb_deactive for the NFSv4 case, I can't offer a patch to fix this.
> > >
> > > Torsten
> >
> > I had already fixed that one in my own stack. Attached are the 3 patches
> > that I've got. 1 from SteveD, 2 fixes.
> >
> 
> Moving the init_waitqueue_head() like patch
> linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif and applying
> linux-2.6.24-007-fix_nfs_free_unlinkdata.dif lets my testcase work.
> Also lockdep no longer complains about the non-static key.
> 

Thanks.

To avoid goofups, could you please send the full fix against 2.6.24-rc2-mm1?


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-19  9:00                 ` Andrew Morton
@ 2007-11-19 18:24                   ` Torsten Kaiser
  0 siblings, 0 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-11-19 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Blunck, Peter Zijlstra, steved, LKML, Kamalesh Babulal,
	linuxppc-dev, nfs, Ingo Molnar, Trond Myklebust, Balbir Singh

On Nov 19, 2007 10:00 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 19 Nov 2007 08:15:48 +0100 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
> > On Nov 18, 2007 8:18 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > > I had already fixed that one in my own stack. Attached are the 3 patches
> > > that I've got. 1 from SteveD, 2 fixes.
> >
> > Moving the init_waitqueue_head() like patch
> > linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif and applying
> > linux-2.6.24-007-fix_nfs_free_unlinkdata.dif lets my testcase work.
> > Also lockdep no longer complains about the non-static key.
>
> Thanks.
>
> To avoid goofups, could you please send the full fix against 2.6.24-rc2-mm1?

Umm... As I applied these changes manually, there is a not insignificant
chance of goofups on my part...

For the hang problem I think Trond's suggestion of replacing the
patches from -mm with fresh versions would be the best.


Anyway, currently I have the patch from
http://lkml.org/lkml/2007/11/16/74 to fix the can't-create-files-bug.

To fix the hang bug I used Trond's
linux-2.6.24-007-fix_nfs_free_unlinkdata.dif and the first two hunks
from linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif.

Torsten

The needed 2 hunks for reference:

--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -594,9 +594,6 @@ static int nfs_init_server(struct nfs_server *server,
 	/* Create a client RPC handle for the NFSv3 ACL management interface */
 	nfs_init_server_aclclient(server);

-	init_waitqueue_head(&server->active_wq);
-	atomic_set(&server->active, 0);
-
 	dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp);
 	return 0;

@@ -736,6 +733,9 @@ static struct nfs_server *nfs_alloc_server(void)
 	INIT_LIST_HEAD(&server->client_link);
 	INIT_LIST_HEAD(&server->master_link);

+	init_waitqueue_head(&server->active_wq);
+	atomic_set(&server->active, 0);
+
 	server->io_stats = nfs_alloc_iostats();
 	if (!server->io_stats) {
 		kfree(server);


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-17 23:00     ` root
@ 2007-11-19 22:50       ` Christoph Lameter
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2007-11-19 22:50 UTC (permalink / raw)
  To: root
  Cc: Trond Myklebust, Peter Zijlstra, LKML, Torsten Kaiser,
	Kamalesh Babulal, linuxppc-dev, nfs, Ingo Molnar, Jan Blunck,
	Andrew Morton, Balbir Singh

On Sun, 18 Nov 2007, root wrote:

> @@ -2155,6 +2155,7 @@ static struct kmem_cache_node *early_kme
>  {
>  	struct page *page;
>  	struct kmem_cache_node *n;
> +	unsigned long flags;
>  
>  	BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
>  

Well, local_irq_save is a bit of overkill. We know that interrupts are
enabled during this phase of the boot sequence.


* Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
  2007-11-18 19:18             ` Trond Myklebust
  2007-11-19  7:15               ` Torsten Kaiser
@ 2007-11-20  5:35               ` Andrew Morton
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2007-11-20  5:35 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Zijlstra, steved, LKML, Torsten Kaiser, Kamalesh Babulal,
	linuxppc-dev, nfs, Ingo Molnar, Jan Blunck, Balbir Singh

On Sun, 18 Nov 2007 14:18:06 -0500 Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> > 
> > Torsten
> 
> I had already fixed that one in my own stack. Attached are the 3 patches
> that I've got. 1 from SteveD, 2 fixes.
> 
> Andrew, could you please unapply the sillyrename patches you've got, and
> apply these 3 instead?

I'd expect to see things like this appear in git-nfs.patch.  Did something change?


end of thread, other threads:[~2007-11-20  5:41 UTC | newest]

Thread overview: 20+ messages
2007-11-16 14:15 [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4 Kamalesh Babulal
2007-11-17 17:53 ` Torsten Kaiser
2007-11-17 18:05   ` Andrew Morton
2007-11-17 19:33     ` Christoph Lameter
2007-11-17 20:10       ` Torsten Kaiser
2007-11-17 18:09   ` Ingo Molnar
2007-11-17 18:19     ` Andrew Morton
2007-11-17 19:40       ` Torsten Kaiser
2007-11-17 23:05         ` Peter Zijlstra
2007-11-17 23:44           ` Torsten Kaiser
2007-11-18 18:44           ` Torsten Kaiser
2007-11-18 19:18             ` Trond Myklebust
2007-11-19  7:15               ` Torsten Kaiser
2007-11-19  9:00                 ` Andrew Morton
2007-11-19 18:24                   ` Torsten Kaiser
2007-11-20  5:35               ` Andrew Morton
2007-11-17 23:00     ` root
2007-11-19 22:50       ` Christoph Lameter
2007-11-17 18:58   ` Trond Myklebust
2007-11-17 19:18     ` Torsten Kaiser
