From: Mingming Cao <cmm@us.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>,
Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>,
Andrew Morton <akpm@linux-foundation.org>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] jbd jbd2: fix dio write returning EIO when try_to_release_page fails
Date: Wed, 06 Aug 2008 15:57:57 -0700
Message-ID: <1218063477.6383.41.camel@mingming-laptop>
In-Reply-To: <20080806135337.GA3615@duck.suse.cz>
On Wed, 2008-08-06 at 15:53 +0200, Jan Kara wrote:
> On Wed 06-08-08 09:25:13, Chris Mason wrote:
> > On Tue, 2008-08-05 at 14:17 -0700, Mingming Cao wrote:
> > > On Tue, 2008-08-05 at 12:17 -0400, Chris Mason wrote:
> > > > On Tue, 2008-08-05 at 13:51 +0900, Hisashi Hifumi wrote:
> > > > > >> >
> > > > > >> > diff -Nrup linux-2.6.27-rc1.org/fs/jbd/transaction.c linux-2.6.27-rc1/fs/jbd/transaction.c
> > > > > >> > --- linux-2.6.27-rc1.org/fs/jbd/transaction.c 2008-07-29 19:28:47.000000000 +0900
> > > > > >> > +++ linux-2.6.27-rc1/fs/jbd/transaction.c 2008-07-29 20:40:12.000000000 +0900
> > > > > >> > @@ -1764,6 +1764,12 @@ int journal_try_to_free_buffers(journal_
> > > > > >> >           */
> > > > > >> >          if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
> > > > > >> >                  journal_wait_for_transaction_sync_data(journal);
> > > > > >> > +
> > > > > >> > +                bh = head;
> > > > > >> > +                do {
> > > > > >> > +                        while (atomic_read(&bh->b_count))
> > > > > >> > +                                schedule();
> > > > > >> > +                } while ((bh = bh->b_this_page) != head);
> > > > > >> >                  ret = try_to_free_buffers(page);
> > > > > >> >          }
> > > > > >>
> > > > > >> The loop is problematic. If the scheduler decides to keep running this
> > > > > >> task then we have a busy loop. If this task has realtime policy then
> > > > > >> it might even lock up the kernel.
> > > > > >>
> > > > > >
> > > > > >ocfs2 calls journal_try_to_free_buffers too, looping on b_count might
> > > > > >not be the best idea there either.
> > > > > >
> > > > > >This code gets called from releasepage, which is used other places than
> > > > > >the O_DIRECT invalidation paths, I'd be worried about performance
> > > > > >problems here.
> > > > > >
> > > > >
> > > > > > try_to_release_page() has a gfp_mask parameter, so when it is
> > > > > > called from a performance-sensitive path, these gfp_mask flags should
> > > > > > not be set. The b_count check loop is inside the
> > > > > > (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS) check.
> > > >
> > > > Looks like try_to_free_pages will go into releasepage with wait & fs
> > > > both set. This kind of change would make me very nervous.
> > > >
> > >
> > > Hi Chris,
> > >
> > > The gfp_mask that try_to_free_pages() takes from its caller is passed
> > > down to try_to_release_page(). Based on the meaning of __GFP_WAIT and
> > > __GFP_FS, if the upper-level caller sets these two flags, I assume it
> > > expects a delay and is prepared to wait for the fs to finish?
> > >
> > >
> > > But I agree that using a loop in journal_try_to_free_buffers() to wait
> > > for a busy bh to drop its reference count is expensive...
> >
> > I rediscovered your old thread about trying to do this in a launder_page
> > call ;)
> Yes, we thought about using launder_page() before :).
>
> > Does it make more sense to fix do_launder_page to call into the FS on
> > every page, and let the FS check for PageDirty on its own? That way
> > invalidate_inode_pages2_range basically gets its own private call into
> > the FS that says wait around until this page is really free.
> That would certainly work as well. But IMHO waiting for ->writepage()
> call to finish isn't really a big deal even in try_to_release_page() if
> __GFP_FS (and __GFP_WAIT) is set. The only problem is that there is no
> effective way to do so, and so Hisashi used that "wait for b_count to drop"
> loop, which looks really scary, and I don't like it either.
>
I was looking at the comment in invalidate_complete_page2(), which is
now only called from the DIO path. It says:
/*
 * This is like invalidate_complete_page(), except it ignores the page's
 * refcount. We do this because invalidate_inode_pages2() needs stronger
 * invalidation guarantees, and cannot afford to leave pages behind because
 * shrink_page_list() has a temp ref on them, or because they're transiently
 * sitting in the lru_cache_add() pagevecs.
 */
I am wondering why we need stronger invalidation guarantees for DIO ->
invalidate_inode_pages2_range(), which forces the page to be removed from
the page cache. If a bh is busy due to ext3 writeout,
journal_try_to_free_buffers() could return a different error number (EBUSY)
to try_to_release_page() (instead of EIO). In that case, could we just
leave the page in the cache, clear PageUptodate (to force a later buffered
read to read from disk), and then have invalidate_complete_page2() return
successfully? Any issue with this approach?
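
To make the question concrete, here is a rough user-space sketch of the
flow described above. It is only a toy model and not kernel code: struct
toy_page, toy_release_page() and toy_invalidate_page() are made-up names
for illustration, and the -EBUSY return from the release step is the
assumed new behaviour, not what the code does today.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel structure. */
struct toy_page {
	bool uptodate;      /* models the PageUptodate flag */
	bool in_cache;      /* models the page being in the page cache */
	bool bh_busy;       /* models a buffer head with nonzero b_count */
	bool release_fails; /* models any other release failure */
};

/*
 * Models a release path that distinguishes "buffers still busy"
 * (-EBUSY) from a real failure (-EIO). This distinction is the
 * assumption being proposed, not existing behaviour.
 */
int toy_release_page(const struct toy_page *p)
{
	if (p->bh_busy)
		return -EBUSY;
	if (p->release_fails)
		return -EIO;
	return 0;
}

/*
 * Models the proposed invalidation step: on -EBUSY, keep the page in
 * the cache but clear its uptodate state so that a later buffered read
 * goes to disk, and report success to the DIO caller.
 */
int toy_invalidate_page(struct toy_page *p)
{
	int err = toy_release_page(p);

	if (err == -EBUSY) {
		p->uptodate = false; /* force re-read from disk later */
		return 0;            /* page stays cached, no error */
	}
	if (err)
		return err;          /* genuine failure is still reported */

	p->in_cache = false;         /* normal case: drop the page */
	return 0;
}
```

In this model a busy page ends up still cached but not uptodate, and the
caller sees success; only a genuine release failure still surfaces as
-EIO.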
Mingming
> Honza