From: Cliff Wickman <cpw@sgi.com>
To: John F Flynn III <flynnj@cs.fiu.edu>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Very rare crash in prune_dcache
Date: Mon, 19 Dec 2005 16:34:35 -0600
Message-ID: <20051219223435.GA2576@sgi.com>
In-Reply-To: <43A7286F.3080104@cs.fiu.edu>
We've seen the crash below on a 2.6.5 kernel (SuSE SLES9) at SGI.
Does it look like your crash?
The panic is by kswapd0:
<1>Unable to handle kernel NULL pointer dereference (address 0000000000000078)
<4>kswapd0[122]: Oops 8813272891392 [1]
whose stack shows:
[<a0000001001cecf0>] clear_inode+0x1b0/0x2c0
[<a0000001001d03d0>] generic_drop_inode+0x3b0/0x400
[<a0000001001ccf30>] iput+0x130/0x1c0
[<a00000020b6f0cd0>] nfs_dentry_iput+0x170/0x1c0 [nfs]
[<a0000001001ca050>] prune_dcache+0x510/0x540
[<a0000001001ca0c0>] shrink_dcache_memory+0x40/0x80
[<a00000010014c360>] shrink_slab+0x2e0/0x440
Both generic_shutdown_super() (through its calls to shrink_dcache_parent()
and shrink_dcache_anon()) and kswapd0 (through its call to
shrink_dcache_memory()) end up calling prune_dcache().
I suspect a race condition inside prune_dcache().
The prune_dcache() function:
  lock dcache_lock
  scan the dentry_unused list for a given number ("count") of dentries
  to free:
    if there is a dentry to free, call prune_one_dentry()
      dentry_iput()
        unlock dcache_lock
        iput() any associated inode
      d_free() the dentry
      lock dcache_lock
  unlock dcache_lock
Two processors entering prune_dcache() near the same time will both scan
the dentry_unused list and could try to iput() the same inode twice. That is
because the dcache_lock is released while running iput().
I suppose the dcache_lock must be released here because the iput() may take
a long time, and the dcache_lock is used in many places in the system
to protect the dentry cache's lists.
It would seem to me that a straightforward fix would be to add another
lock that protects just the scan of the dentry_unused list, only here in
prune_dcache().
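To make the two-lock idea concrete, here is a minimal userspace sketch using
pthreads. It is not kernel code and does not reproduce the crash; it only
shows the lock layering. All names in it (fake_inode, fake_dentry,
cache_lock, prune_lock, prune_cache, run_pruners) are hypothetical stand-ins:
cache_lock plays the role of dcache_lock and prune_lock is the proposed
extra lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NENTRIES    1000
#define MAX_THREADS 16

/* Hypothetical stand-ins, not the kernel's struct inode / struct dentry. */
struct fake_inode  { int refcount; int released; };
struct fake_dentry { struct fake_dentry *next; struct fake_inode *inode; };

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER; /* "dcache_lock" */
static pthread_mutex_t prune_lock = PTHREAD_MUTEX_INITIALIZER; /* proposed lock */
static struct fake_dentry entries[NENTRIES];
static struct fake_inode inodes[NENTRIES];
static struct fake_dentry *unused_head;    /* protected by cache_lock */

/* Stand-in for a slow iput(); called WITHOUT cache_lock held. */
static void fake_iput(struct fake_inode *inode)
{
    assert(inode->refcount > 0);           /* a second iput would trip this */
    if (--inode->refcount == 0)
        inode->released++;
}

/* Prune up to 'count' entries; returns how many were pruned. */
static int prune_cache(int count)
{
    int pruned = 0;

    pthread_mutex_lock(&prune_lock);       /* only one pruner scans at a time */
    pthread_mutex_lock(&cache_lock);
    while (pruned < count && unused_head != NULL) {
        struct fake_dentry *d = unused_head;

        unused_head = d->next;             /* unlink under cache_lock */
        pthread_mutex_unlock(&cache_lock); /* let other list users in */
        fake_iput(d->inode);               /* slow work, prune_lock still held */
        pthread_mutex_lock(&cache_lock);
        pruned++;
    }
    pthread_mutex_unlock(&cache_lock);
    pthread_mutex_unlock(&prune_lock);
    return pruned;
}

static void *pruner(void *arg)
{
    (void)arg;
    while (prune_cache(64) > 0)
        ;                                  /* keep pruning in batches of 64 */
    return NULL;
}

/* Returns the number of inodes NOT released exactly once, or -1 on error. */
int run_pruners(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    int i, created = 0, bad = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    unused_head = NULL;
    for (i = 0; i < NENTRIES; i++) {
        inodes[i].refcount = 1;
        inodes[i].released = 0;
        entries[i].inode = &inodes[i];
        entries[i].next = unused_head;
        unused_head = &entries[i];
    }
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, pruner, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    if (created != nthreads)
        return -1;
    for (i = 0; i < NENTRIES; i++)
        if (inodes[i].released != 1)
            bad++;
    return bad;
}
```

The point of the layering is that prune_lock is held across the whole scan,
including the window where cache_lock is dropped, so other users of
cache_lock are not held up by a slow iput() while concurrent pruners are
still serialized against each other.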
-Cliff Wickman
On Mon, Dec 19, 2005 at 04:38:55PM -0500, John F Flynn III wrote:
> Good evening, folks...
>
> We have been experiencing a very rare (on average once every two to
> three months) crash on some of our servers.
>
> uname -a:
> Linux cheetah 2.6.9-22.0.1.ELsmp #1 SMP Thu Oct 27 13:14:25 CDT 2005
> i686 i686 i386 GNU/Linux
>
> (This is a CentOS provided kernel)
>
> Here is a photo of the bottom of the panic. Unfortunately the kernel has
> no chance to log this anywhere else:
>
> http://www.cs.fiu.edu/~flynnj/cheetah-crash.jpg
>
>
> The crash appears to be in prune_dcache, and has happened on several
> distinct machines, so we do not believe it is a hardware problem.
>
> If anyone has pointers on what bug could be causing this crash, or if
> it's been fixed in newer kernels we could try, it would be greatly
> appreciated. This only seems to happen on loaded production machines,
> and it happens so rarely that more detailed debugging is nearly impossible.
>
> Thanks in advance,
> -John Flynn
>
> --
> John Flynn flynnj@cs.fiu.edu
> =========================================================
> Systems and Network Administration /\_/\
> School of Computer Science ( O.O )
> Florida International University > <
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824