From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Torsten Kaiser <just.for.lkml@googlemail.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	steved@redhat.com, LKML <linux-kernel@vger.kernel.org>,
	Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
	linuxppc-dev@ozlabs.org, nfs@lists.sourceforge.net,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Blunck <jblunck@suse.de>, Ingo Molnar <mingo@elte.hu>,
	Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4
Date: Sun, 18 Nov 2007 00:05:08 +0100
Message-ID: <20071117230508.GB25905@dyad>
In-Reply-To: <64bb37e0711171140w5f1451e0qea081a4fbc7a45f7@mail.gmail.com>

On Sat, Nov 17, 2007 at 08:40:22PM +0100, Torsten Kaiser wrote:

> Lockdep triggers immediately before the freeze, but the result is still
> not helpful:
> 
> [  221.565011] INFO: trying to register non-static key.
> [  221.566999] the code is fine but needs lockdep annotation.
> [  221.569206] turning off the locking correctness validator.
> [  221.571404]
> [  221.571405] Call Trace:
> [  221.572996]  [<ffffffff8025a1b4>] __lock_acquire+0x4c4/0x1140
> [  221.575298]  [<ffffffff8025ae85>] lock_acquire+0x55/0x70
> [  221.577429]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> [  221.579457]  [<ffffffff805c5f04>] _spin_lock_irqsave+0x34/0x50
> [  221.581800]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
> [  221.584317]  [<ffffffff8022d6fd>] __wake_up+0x2d/0x70
> [  221.586344]  [<ffffffff805a88b0>] rpc_async_schedule+0x0/0x10
> [  221.588648]  [<ffffffff802fface>] nfs_free_unlinkdata+0x1e/0x50
> [  221.591023]  [<ffffffff805a7e96>] rpc_release_calldata+0x26/0x50
> [  221.593428]  [<ffffffff8024778f>] run_workqueue+0x16f/0x210
> [  221.595662]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  221.598004]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.600130]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.602265]  [<ffffffff8024843d>] worker_thread+0x6d/0xb0
> [  221.604431]  [<ffffffff8024bfc0>] autoremove_wake_function+0x0/0x30
> [  221.606939]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.609067]  [<ffffffff802483d0>] worker_thread+0x0/0xb0
> [  221.611199]  [<ffffffff8024bbeb>] kthread+0x4b/0x80
> [  221.613156]  [<ffffffff8020cb98>] child_rip+0xa/0x12
> [  221.615151]  [<ffffffff8020c2af>] restore_args+0x0/0x30
> [  221.617247]  [<ffffffff8024bba0>] kthread+0x0/0x80
> [  221.619162]  [<ffffffff8020cb8e>] child_rip+0x0/0x12
> [  221.621147]
> [  221.621749] INFO: lockdep is turned off.

I've been staring at this NFS code for a while and can't make any sense
out of it. It seems to correctly initialize the waitqueue. So this would
indicate corruption of some sort.



> I also had another BUG output during system startup, but that should
> be unrelated:
> [  103.254681] BUG: sleeping function called from invalid context at kernel/rwsem.c:20
> [  103.257757] in_atomic():0, irqs_disabled():1
> [  103.259469] 1 lock held by artsd/5883:
> [  103.259470]  #0:  (pm_qos_lock){....}, at: [<ffffffff80250efb>] pm_qos_add_requirement+0x6b/0xf0
> [  103.263316] irq event stamp: 49712
> [  103.263318] hardirqs last  enabled at (49711): [<ffffffff802941ed>] __kmalloc+0x10d/0x180
> [  103.263321] hardirqs last disabled at (49712): [<ffffffff805c5eea>] _spin_lock_irqsave+0x1a/0x50
> [  103.263326] softirqs last  enabled at (48820): [<ffffffff805954d9>] unix_release_sock+0x79/0x240
> [  103.263330] softirqs last disabled at (48818): [<ffffffff805c5b89>] _write_lock_bh+0x9/0x30
> [  103.263333]
> [  103.263333] Call Trace:
> [  103.263335]  [<ffffffff8024fc25>] down_read+0x15/0x40
> [  103.263338]  [<ffffffff802507e6>] __blocking_notifier_call_chain+0x46/0x90
> [  103.263341]  [<ffffffff80250f23>] pm_qos_add_requirement+0x93/0xf0
> [  103.263344]  [<ffffffff804fdc4a>] snd_pcm_hw_params+0x2fa/0x380
> [  103.263347]  [<ffffffff804fe93c>] snd_pcm_common_ioctl1+0xb4c/0xdc0
> [  103.263350]  [<ffffffff8027b167>] __do_fault+0x227/0x470
> [  103.263353]  [<ffffffff8025a435>] __lock_acquire+0x745/0x1140
> [  103.263357]  [<ffffffff805c5e45>] _spin_unlock_irqrestore+0x55/0x70
> [  103.263359]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  103.263362]  [<ffffffff804fee88>] snd_pcm_playback_ioctl1+0x48/0x240
> [  103.263365]  [<ffffffff804ffa36>] snd_pcm_playback_ioctl+0x36/0x50
> [  103.263367]  [<ffffffff802a80bf>] vfs_ioctl+0x2f/0xa0
> [  103.263369]  [<ffffffff802a8390>] do_vfs_ioctl+0x260/0x2e0
> [  103.263371]  [<ffffffff80259731>] trace_hardirqs_on+0xc1/0x160
> [  103.263373]  [<ffffffff802a84a1>] sys_ioctl+0x91/0xb0
> [  103.263376]  [<ffffffff8020bc5e>] system_call+0x7e/0x83
> [  103.263379]

This pm-qos code is fubar: it calls blocking_notifier_call_chain() while
holding a spinlock (and that is after 'fixing' it from a
srcu_notifier_call_chain() - which is equally wrong).
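
To make the complaint concrete, here is a small user-space model of the
pattern, not the actual pm-qos code. Every name in it (model_spin_lock,
model_might_sleep, model_blocking_notify, add_requirement_buggy,
add_requirement_reworked, target_value) is hypothetical and only stands in
for the kernel primitive it is named after. The "reworked" variant is just
one possible shape of a fix (snapshot under the lock, notify after dropping
it), assuming the notifier only needs the updated value.

```c
/* User-space model only: none of these are the real kernel functions. */

static int atomic_depth;       /* > 0 while a modelled spinlock is held */
static int sleep_violations;   /* sleeping calls made in atomic context */
static int notified_value;     /* last value handed to the notifier */
static int target_value = 100; /* hypothetical aggregated requirement */

static void model_spin_lock(void)   { atomic_depth++; }
static void model_spin_unlock(void) { atomic_depth--; }

/* Stand-in for might_sleep(): record a call made while atomic. */
static void model_might_sleep(void)
{
	if (atomic_depth > 0)
		sleep_violations++;
}

/*
 * Stand-in for blocking_notifier_call_chain(), which takes an rwsem
 * (down_read() in the trace) and therefore may sleep.
 */
static void model_blocking_notify(int value)
{
	model_might_sleep();
	notified_value = value;
}

/* Pattern from the trace: the notifier runs with the lock still held. */
static void add_requirement_buggy(int value)
{
	model_spin_lock();
	if (value < target_value)
		target_value = value;
	model_blocking_notify(target_value); /* may sleep while atomic */
	model_spin_unlock();
}

/* One possible rework: snapshot under the lock, notify after unlock. */
static void add_requirement_reworked(int value)
{
	int snapshot;

	model_spin_lock();
	if (value < target_value)
		target_value = value;
	snapshot = target_value;
	model_spin_unlock();

	model_blocking_notify(snapshot); /* sleeping is allowed here */
}
```

Calling add_requirement_buggy() bumps sleep_violations, which is the model's
equivalent of the "sleeping function called from invalid context" splat;
add_requirement_reworked() leaves the counter alone. Whether dropping the
lock before the notifier call is acceptable for the real code depends on
ordering requirements this sketch does not capture.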
