linux-ext4.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>, linux-ext4@vger.kernel.org
Subject: Re: Uninitialized extent races
Date: Fri, 21 Dec 2012 13:02:43 -0500
Message-ID: <20121221180243.GB31731@thunk.org>
In-Reply-To: <20121221161929.GF17357@quack.suse.cz>

On Fri, Dec 21, 2012 at 05:19:29PM +0100, Jan Kara wrote:
>   No, I'm speaking about merging currently uninitialized extents. I.e.
> suppose someone does the following on a filesystem with dioread_nolock so
> that writeback happens via unwritten extents:
>   fd = open("file", O_RDWR);
>   pwrite(fd, buf, 4096, 0);
> 					flusher thread starts writing
> 					we create uninitialized extent for
> 					  range 0-4096
>   fallocate(fd, 0, 4096, 4096);
>     - we merge extents and now have just 1 uninitialized extent for range
>       0-8192
> 					ext4_convert_unwritten_extents() now
> 					  has to split the extent to finish
> 					  the IO.

Ah, I see.  You mean disabling the merging that might take place as a
result of the fallocate.  Yes, I agree that's a completely sane thing
to do.
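
To make the scenario concrete, here is a minimal userspace sketch of why
the merge forces a split at IO completion.  It is a toy model only: the
toy_* names and the struct layout are made up for illustration and are
not the ext4 extent format or any kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an extent: a byte range plus an "unwritten" bit.
 * Hypothetical; this is not the on-disk ext4 extent layout. */
struct toy_extent {
	unsigned long start;	/* first byte covered */
	unsigned long len;	/* length in bytes */
	bool unwritten;		/* true = uninitialized extent */
};

/* In this model, extents merge if they are adjacent and share state. */
static bool toy_can_merge(const struct toy_extent *a,
			  const struct toy_extent *b)
{
	return a->start + a->len == b->start && a->unwritten == b->unwritten;
}

/* Grow a to also cover b. */
static void toy_merge(struct toy_extent *a, const struct toy_extent *b)
{
	assert(toy_can_merge(a, b));
	a->len += b->len;
}

/* Converting [start, start + len) to written needs a split unless the
 * range covers the whole extent. */
static bool toy_convert_needs_split(const struct toy_extent *ex,
				    unsigned long start, unsigned long len)
{
	assert(start >= ex->start && start + len <= ex->start + ex->len);
	return start != ex->start || len != ex->len;
}

/* Replay the sequence from the quoted mail: writeback creates an
 * unwritten extent for 0-4096, then fallocate adds 4096-8192.
 * Returns true if finishing the IO has to split the extent. */
static bool toy_scenario(bool allow_merge)
{
	struct toy_extent io = { 0, 4096, true };
	struct toy_extent prealloc = { 4096, 4096, true };

	if (allow_merge && toy_can_merge(&io, &prealloc))
		toy_merge(&io, &prealloc);
	return toy_convert_needs_split(&io, 0, 4096);
}
```

With merging allowed, toy_scenario() reports that the conversion must
split; with merging disabled, the extent still matches the IO range
exactly and no split is needed.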

The alternate approach would be to add a flag in the extent status
tree indicating that an unwritten conversion is pending, but that
would add more complexity.
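
As a rough illustration of the flag idea, the sketch below adds a
"conversion pending" bit to a toy extent status entry and refuses to
merge while it is set.  The flag names, struct, and helpers are all
hypothetical; nothing here is taken from the actual extent status tree
code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status bits for a toy extent status entry. */
#define TOY_UNWRITTEN		0x1u	/* extent is uninitialized */
#define TOY_CONV_PENDING	0x2u	/* IO in flight, conversion queued */

struct toy_es {
	unsigned long start;
	unsigned long len;
	unsigned int flags;
};

/* Refuse to merge if either side has a conversion pending, so the
 * range covered by the in-flight IO keeps its own entry. */
static bool toy_es_can_merge(const struct toy_es *a, const struct toy_es *b)
{
	if (a->start + a->len != b->start)
		return false;
	if ((a->flags | b->flags) & TOY_CONV_PENDING)
		return false;
	return a->flags == b->flags;
}

/* Called when writeback is submitted against an unwritten extent. */
static void toy_es_io_submitted(struct toy_es *e)
{
	assert(e->flags & TOY_UNWRITTEN);
	e->flags |= TOY_CONV_PENDING;
}

/* Called at IO completion: the extent becomes written. */
static void toy_es_io_done(struct toy_es *e)
{
	assert(e->flags & TOY_CONV_PENDING);
	e->flags &= ~(TOY_UNWRITTEN | TOY_CONV_PENDING);
}
```

The extra state has to be set at submission and cleared at completion
on every path, which is the added complexity mentioned above.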

Hmmm.... do we need that complexity anyway?  What happens if we have a
race between a punch (or truncate) and the flusher thread, so there is
a pending write?  There are two things that would be of concern.  (1)
Will convert_unwritten_extents do the right thing if the extent in
question has disappeared, and (2) what if the block gets reused for
some other inode in the interim?

I _think_ we're OK in the case of (2), since we're not using FUA
writes for anything other than the commit block, so there shouldn't be
any way that a write for the new inode could complete before the
pending write finishes up.  And (1) should be OK, although it may end
up triggering a WARN_ON and a scary ext4_msg() in
ext4_convert_unwritten_extents().   But it made me stop and think....

> And regarding more merging, that could be done (obviously), just we might
> need to postpone that after writeback is finished (PageWriteback is
> cleared) because there extent estimates are not clear. And I need to know
> necessary number of extents well in advance to be able to reserve credits
> in the journal. OTOH maybe we could use jbd2_journal_extend() to get more
> credits if we need them for merging. And when that fails, bad luck but we
> can cope... Anyway, this is a different problem.

Yeah, using jbd2_journal_extend() was what I was thinking about doing
where we could do some opportunistic merging if there's room in the
journal to allow that.  But I agree that's a different problem....
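
The opportunistic pattern could look roughly like the userspace sketch
below.  The toy_journal and toy_handle types and the per-merge credit
cost are assumptions made for illustration; toy_journal_extend() only
imitates the return convention of jbd2_journal_extend() as I remember
it (0 on success, positive when the transaction has no room) and is
not the real function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for a journal and a handle; not the jbd2 structures. */
struct toy_journal {
	int free_credits;	/* room left in the running transaction */
};

struct toy_handle {
	struct toy_journal *journal;
	int credits;		/* credits reserved for this handle */
};

/* Imitates the assumed jbd2_journal_extend() convention:
 * 0 = extended, > 0 = no room in the transaction. */
static int toy_journal_extend(struct toy_handle *h, int nblocks)
{
	assert(nblocks >= 0);
	if (h->journal->free_credits < nblocks)
		return 1;
	h->journal->free_credits -= nblocks;
	h->credits += nblocks;
	return 0;
}

/* Try to merge only if credits are available or can be obtained.
 * On failure nothing is changed and the extents stay unmerged. */
static bool toy_try_merge(struct toy_handle *h, int merge_cost)
{
	if (h->credits < merge_cost &&
	    toy_journal_extend(h, merge_cost - h->credits) != 0)
		return false;	/* bad luck, but we can cope */
	h->credits -= merge_cost;
	return true;
}
```

The point of the design is that a failed extension is not an error:
the merge is simply skipped and can be retried later.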

	   	 	      	    - Ted
