From: Chris Mason <chris.mason@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: Zach Brown <zach.brown@oracle.com>,
Erez Zadok <ezk@cs.sunysb.edu>,
linux-kernel@vger.kernel.org, ext3-users@redhat.com,
Peter Zijlstra <peterz@infradead.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: lockdep warning with LTP dio test (v2.6.24-rc6-125-g5356f66)
Date: Mon, 14 Jan 2008 13:14:54 -0500
Message-ID: <20080114131454.37eb7c12@think.oraclecorp.com>
In-Reply-To: <20080114170609.GH4214@duck.suse.cz>
On Mon, 14 Jan 2008 18:06:09 +0100
Jan Kara <jack@suse.cz> wrote:
> On Wed 02-01-08 12:42:19, Zach Brown wrote:
> > Erez Zadok wrote:
> > > Setting: ltp-full-20071031, dio01 test on ext3 with Linus's
> > > latest tree. Kernel w/ SMP, preemption, and lockdep configured.
> >
> > This is a real lock ordering problem. Thanks for reporting it.
> >
> > The updating of atime inside sys_mmap() orders the mmap_sem in the
> > vfs outside of the journal handle in ext3's inode dirtying:
> >
[ lock inversion traces ]
> > Two fixes come to mind:
> >
> > 1) use something like Peter's ->mmap_prepare() to update atime
> > before acquiring the mmap_sem.
> > ( http://lkml.org/lkml/2007/11/11/97 ). I don't know if this would
> > leave more paths which do a journal_start() while holding the
> > mmap_sem.
> >
> > 2) rework ext3's dio to only hold the jbd handle in
> > ext3_get_block(). Chris has a patch for this kicking around
> > somewhere but I'm told it has problems exposing old blocks in
> > ordered data mode.
> >
> > Does anyone have preferences? I could go either way. I certainly
> > don't like the idea of journal handles being held across the
> > entirety of fs/direct-io.c. It's yet another case of O_DIRECT
> > differing wildly from the buffered path :(.
> I've looked more into it and I think that 2) is the only way to go
> since transaction start ranks below page lock (standard buffered
> write path) and page lock ranks below mmap_sem. So we have at least
> one more dependency mmap_sem must go before transaction start...
Just to clarify a little bit:
If ext3's DIO code only touches transactions in get_block, then it can
violate data=ordered rules. Basically the transaction that allocates
the blocks might commit before the DIO code gets around to writing them.
A crash in the wrong place will expose stale data on disk.
-chris
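
The crash window described above can be sketched as a toy model. This is not ext3 or jbd code: `toy_disk`, `toy_disk_init()`, `read_after_crash()` and the two step names are hypothetical, invented only to show why the order of "allocating transaction commits" versus "DIO data reaches disk" matters. The model assumes a single data block that still holds old contents and a single flag standing in for the committed block mapping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only (hypothetical names, not kernel code). */
struct toy_disk {
    char block[16];       /* on-disk contents of the newly allocated block */
    int  mapped_on_disk;  /* 1 once the allocating transaction has committed */
};

enum step {
    ALLOC_COMMIT,  /* transaction that allocated the block commits */
    DIO_WRITE      /* direct I/O writes the new data into the block */
};

/* The block starts out holding leftover data from a previous owner. */
void toy_disk_init(struct toy_disk *d)
{
    snprintf(d->block, sizeof(d->block), "%s", "stale data");
    d->mapped_on_disk = 0;
}

/*
 * Apply the steps in the given order, but "crash" once crash_after steps
 * have completed. Return what a reader of the file sees after recovery:
 * the block contents if the mapping was committed, otherwise a hole.
 */
const char *read_after_crash(struct toy_disk *d, const enum step *order,
                             int nsteps, int crash_after)
{
    assert(nsteps >= 0 && crash_after >= 0);

    for (int i = 0; i < nsteps && i < crash_after; i++) {
        if (order[i] == ALLOC_COMMIT)
            d->mapped_on_disk = 1;
        else
            snprintf(d->block, sizeof(d->block), "%s", "new data");
    }
    return d->mapped_on_disk ? d->block : "(hole)";
}
```

In this model, running `ALLOC_COMMIT` then `DIO_WRITE` with a crash after the first step leaves the file pointing at the old block contents, which is the stale-data exposure. Running `DIO_WRITE` first and crashing at the same point leaves only an unmapped hole, which is the ordering that data=ordered is meant to guarantee.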
Thread overview: 6+ messages
2007-12-24 23:02 lockdep warning with LTP dio test (v2.6.24-rc6-125-g5356f66) Erez Zadok
2008-01-02 20:42 ` Zach Brown
2008-01-14 17:06 ` Jan Kara
2008-01-14 18:14 ` Chris Mason [this message]
2008-01-25 16:09 ` Jan Kara
2008-01-25 16:16 ` Chris Mason