From: David Chinner <dgc@sgi.com>
To: Jan Kara <jack@suse.cz>
Cc: David Chinner <dgc@sgi.com>, lkml <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: BUG: drop_pagecache_sb vs kjournald lockup
Date: Wed, 19 Mar 2008 23:17:21 +1100
Message-ID: <20080319121721.GH155407@sgi.com>
In-Reply-To: <20080319100805.GB22109@duck.suse.cz>

On Wed, Mar 19, 2008 at 11:08:05AM +0100, Jan Kara wrote:
> On Wed 19-03-08 07:46:46, David Chinner wrote:
> > On Tue, Mar 18, 2008 at 02:43:26PM +0100, Jan Kara wrote:
> > > > 2.6.25-rc3, 4p ia64, ext3 root drive.
> > > >
> > > > I was running an XFS stress test on one of the XFS partitions on
> > > > the machine (zero load on the root ext3 drive), when the system
> > > > locked up in kjournald with this on the console:
> > > >
> > > > BUG: spinlock lockup on CPU#2, kjournald/2150, a000000100e022e0
> > > >
> > > <snip traces>
> >
> > <snip other stuff>
> >
> > > > Anyone know the reason why drop_pagecache_sb() uses such a brute-force
> > > > mechanism to free up clean page cache pages?
> > > Yes, we know that drop_pagecache_sb() has locking issues but since it
> > > is intended to be used for debugging purposes only, nobody cared enough
> > > to fix it. Completely untested patch below if you dare to try ;)
> >
> > It may be intended for debugging purposes, but it does get used in
> > production HPC environments (a lot!). I guess I've never seen this
> > lockup before because SGI customers don't use ext3, but they have
> > complained about the system "stopping" while drop_caches is executed.
> > This locking ..... strategy would explain it, though.
> ;) But in case your customers use it in production, shouldn't you push
> for a better interface for such feature? Just a thought...

Because it's much less horrible than the thing it replaced, does
what is needed and mostly does not cause problems? And because it
usually just works I've never felt the need to look at the
implementation....

/me is off to look at Fengguang's patch

> > I'll try the patch, but I can't guarantee anything - I only saw this
> > lockup once in about 18 hours when dropping caches every 2 seconds.
> Well, if you enable lockdep, it should warn you about possible problems
> much earlier (at least we already got several reports of lockdep warnings
> when using drop_caches).

ia64 - no lockdep :/

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group