linux-fsdevel.vger.kernel.org archive mirror
From: Zach Brown <zach.brown@oracle.com>
To: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Chris Mason <chris.mason@oracle.com>,
	Badari Pulavarty <pbadari@us.ibm.com>
Subject: Re: [PATCH] dio: falling through to buffered I/O when invalidation of a page fails
Date: Tue, 11 Dec 2007 17:00:32 -0800
Message-ID: <475F32B0.20107@oracle.com>
In-Reply-To: <6.0.0.20.2.20071210164242.03915ca8@172.19.0.2>

Hisashi Hifumi wrote:
> Hi.
> 
> Current dio has some problems:
> 1, In ext3 ordered, dio write can return with EIO because of the race
> between invalidation of a page and jbd. jbd pins the bhs while
> committing journal so try_to_release_page fails when jbd is committing
> the transaction.

Yeah.  It sure would be fantastic if some ext3 expert could stop this
from happening somehow.  But that hasn't happened in.. uh.. Badari, for
how many years has this been on the radar? :)

> 
> Past discussion about this issue is as follows.
> http://marc.info/?t=119343431200004&r=1&w=2
> http://marc.info/?t=112656762800002&r=1&w=2
> 
> 2, invalidate_inode_pages2_range() sets ret=-EIO when
> invalidate_complete_page2() fails, but this ret is cleared if
> do_launder_page() succeeds on the page at the next index.

Oops.  That's too bad.  So maybe we should fix it by not stomping on
that return code?

	ret2 = do_launder()
	if (ret2 == 0)
		ret2 = invalidate()
	if (ret == 0)
		ret = ret2

I'd be surprised if we ever wanted to mask an -EIO when later pages
laundered successfully.
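
To make the "don't stomp on the return code" idea concrete, here is a minimal userspace sketch, not the kernel code. The struct page_result type and invalidate_range_sketch() are hypothetical names invented for illustration; each array entry just records what do_launder_page() and invalidate_complete_page2() would have reported for one page.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-page outcome, standing in for the real page-cache calls. */
struct page_result {
	int launder_ret;   /* 0 on success, negative errno on failure */
	int invalidate_ok; /* nonzero if the page was invalidated */
};

/*
 * Walk every page and return the first error seen, so that a later page
 * that launders cleanly cannot overwrite an earlier -EIO.
 */
int invalidate_range_sketch(const struct page_result *pages, size_t n)
{
	int ret = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int ret2 = pages[i].launder_ret;

		/* Only try to invalidate if laundering succeeded. */
		if (ret2 == 0 && !pages[i].invalidate_ok)
			ret2 = -EIO;
		/* Keep the first failure; never reset it to 0. */
		if (ret == 0)
			ret = ret2;
	}
	return ret;
}
```

With pages = { {0, 0}, {0, 1} } the first page fails invalidation and the second succeeds, and the function still returns -EIO. Assigning every page's result straight into ret would return 0 for the same input.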

> In this case, dio is carried out even if invalidate_complete_page2()
> fails on some pages. This can cause inconsistency between memory and
> blocks on HDD because the page cache still exists.

Yeah.

> I solved the problems above by introducing
> invalidate_inode_pages3_range() and falling through to buffered I/O
> when invalidation of a page failed.

Well, I like the idea of more intelligently dealing with the known
problem between dio and ext3.  I'm not sure that falling back to
buffered is right.

If dio can tell that it only failed because jbd held the buffer, should
we have waited for the transaction to complete before trying to
invalidate again?

If we could do that, we'd avoid performing the IO twice.
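
A rough userspace sketch of that wait-and-retry shape, under loud assumptions: struct dio_ops, invalidate_with_retry(), and the fake_journal helpers are all hypothetical, and using -EBUSY to mean "the journal still pins the buffer" is an assumption made for illustration, not something the current code reports.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hooks for this sketch; neither is a real kernel interface. */
struct dio_ops {
	int (*invalidate)(void *ctx);      /* 0, -EBUSY if pinned, or another error */
	int (*wait_for_commit)(void *ctx); /* 0 once the pinning commit has finished */
};

/* Retry only the "pinned by the journal" case, a bounded number of times. */
int invalidate_with_retry(const struct dio_ops *ops, void *ctx, int max_retries)
{
	int ret;

	assert(ops != NULL && ops->invalidate != NULL &&
	       ops->wait_for_commit != NULL);

	ret = ops->invalidate(ctx);
	while (ret == -EBUSY && max_retries-- > 0) {
		int err = ops->wait_for_commit(ctx);

		if (err)
			return err;
		ret = ops->invalidate(ctx);
	}
	return ret;
}

/* Toy journal used only to exercise the sketch. */
struct fake_journal {
	int commits_pending; /* commits that still pin the buffer */
	int waits;           /* how many times the caller waited */
};

int fake_invalidate(void *ctx)
{
	struct fake_journal *j = ctx;

	return j->commits_pending > 0 ? -EBUSY : 0;
}

int fake_wait_for_commit(void *ctx)
{
	struct fake_journal *j = ctx;

	j->commits_pending--;
	j->waits++;
	return 0;
}
```

If the retry succeeds, the data goes to disk once through the direct path instead of being written a second time through the page cache. The retry bound keeps a busy journal from stalling the writer indefinitely; what to do when the bound is hit is left open here.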

> We can distinguish between failure of page invalidation and other errors
> with the return value of invalidate_inode_pages3_range().

I'm not sure duplicating the invalidation loop into a new function is
the right thing.  Maybe we'd just tweak invalidate_inode_pages2_range()
to indicate to the caller the specific failing circumstances somehow.
Maybe.
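
One way the caller side could look if the existing function reported the cause through its return value. This is only a sketch: enum dio_action and dio_decide() are hypothetical, and picking -EBUSY for "the page could not be released" is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcomes for a direct write after the invalidation attempt. */
enum dio_action {
	DIO_PROCEED,           /* page cache is clean, do the direct I/O */
	DIO_FALLBACK_BUFFERED, /* page could not be released, use buffered I/O */
	DIO_FAIL               /* genuine error, report it to the caller */
};

/* Map the invalidation return code (0 or a negative errno) to an action. */
enum dio_action dio_decide(int invalidate_ret)
{
	assert(invalidate_ret <= 0);

	if (invalidate_ret == 0)
		return DIO_PROCEED;
	if (invalidate_ret == -EBUSY) /* assumed marker for a pinned page */
		return DIO_FALLBACK_BUFFERED;
	return DIO_FAIL;
}
```

The point is only that a distinct error code lets the caller tell "pinned" apart from a real -EIO without a second copy of the invalidation loop. Whether the pinned case should fall back to buffered I/O or wait and retry is a separate question.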

- z


Thread overview: 6+ messages
2007-12-10  7:52 [PATCH] dio: falling through to buffered I/O when invalidation of a page fails Hisashi Hifumi
2007-12-12  1:00 ` Zach Brown [this message]
2007-12-12  7:51   ` [PATCH] dio: falling through to buffered I/O when invalidationof " Hisashi Hifumi
2007-12-14 18:59   ` [PATCH] dio: falling through to buffered I/O when invalidation of " Badari Pulavarty
2007-12-14 19:15     ` Zach Brown
2007-12-17  2:38       ` [PATCH] dio: falling through to buffered I/O when invalidationof " Hisashi Hifumi
