linux-ext4.vger.kernel.org archive mirror
From: Eric Sandeen <sandeen@redhat.com>
To: "Ted Ts'o" <tytso@mit.edu>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH V2] ext4: serialize unaligned asynchronous DIO
Date: Fri, 21 Jan 2011 10:00:01 -0600
Message-ID: <4D39AD81.4010407@redhat.com>
In-Reply-To: <4D3087CE.2060200@redhat.com>

On 1/14/11 11:28 AM, Eric Sandeen wrote:
> 
> 
> ext4 has a data corruption case when doing non-block-aligned
> asynchronous direct IO into a sparse file, as demonstrated
> by xfstest 240.
> 
> The root cause is that while ext4 preallocates space in the
> hole, mappings of that space still look "new" and 
> dio_zero_block() will zero out the unwritten portions.  When
> more than one AIO thread is going, they both find this "new"
> block and race to zero out their portion; this is uncoordinated
> and causes data corruption.
> 
> Dave Chinner fixed this for xfs by simply serializing all
> unaligned asynchronous direct IO.  I've done the same here.
> This is a very big hammer, and I'm not very pleased with
> stuffing this into ext4_file_write().  But since ext4 is
> DIO_LOCKING, we need to serialize it at this high level.
> 
> I tried to move this into ext4_ext_direct_IO, but by then
> we have the i_mutex already, and we will wait on the
> work queue to do conversions - which must also take the
> i_mutex.  So that won't work.
> 
> This was originally exposed by qemu-kvm installing to
> a raw disk image with a normal sector-63 alignment.  I've
> tested a backport of this patch with qemu, and it does
> avoid the corruption.  It is also quite a lot slower
> (14 min for package installs, vs. 8 min for well-aligned)
> but I'll take slow correctness over fast corruption any day.
> 
> Mingming suggested that perhaps we can track outstanding
> conversions, and wait on that instead so that non-sparse
> files won't be affected, but I've had trouble making that
> work so far, and would like to get the corruption hole
> plugged ASAP.  Perhaps adding a printk_once() warning of
> the perf degradation on this path would be useful?
> 

I've sent a patch to do the above, as V3, twice now, but the list
is eating it.  Ted, you were cc'd so hopefully you got it?

-Eric
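
For illustration only, the alignment test that decides whether a write
counts as "unaligned" can be sketched in user space.  This is not the
code from the patch; the helper name is_unaligned_write() and its
parameters are hypothetical, and block_size is assumed to be a nonzero
power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space helper, not the code from the patch: report
 * whether a write of `count` bytes at byte offset `pos` starts or ends
 * in the middle of a filesystem block.  `block_size` is assumed to be
 * a nonzero power of two, e.g. 4096.
 */
static bool is_unaligned_write(uint64_t pos, uint64_t count,
			       uint64_t block_size)
{
	uint64_t mask = block_size - 1;

	/* Unaligned if either the start or the end falls inside a block. */
	return (pos & mask) != 0 || ((pos + count) & mask) != 0;
}
```

Assuming 512-byte sectors and a 4096-byte filesystem block, a partition
that starts at sector 63 begins at byte 63 * 512 = 32256, which is 3584
bytes into a block.  A 4096-byte write from the guest at that offset
therefore straddles two blocks in the image file, and
is_unaligned_write(32256, 4096, 4096) returns true.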


Thread overview: 14+ messages
2011-01-13 22:23 [PATCH] ext4: serialize unaligned asynchronous DIO Eric Sandeen
2011-01-14  4:15 ` Ted Ts'o
2011-01-14  4:41   ` Eric Sandeen
2011-01-14 17:28   ` [PATCH V2] " Eric Sandeen
2011-01-18 16:23     ` Eric Sandeen
2011-01-21 16:00     ` Eric Sandeen [this message]
2011-01-21 18:26     ` [PATCH V3 RESEND 2] " Eric Sandeen
2011-01-21 23:27       ` Ted Ts'o
2011-02-07  2:33       ` Ted Ts'o
2011-02-07 15:59         ` Ted Ts'o
2011-02-07 17:58           ` Eric Sandeen
2011-02-07 22:18             ` Mingming Cao
2012-02-23 13:23           ` backport "ext4: serialize unaligned asynchronous DIO" to 2.6.32 Philipp Hahn
2012-02-23 15:15             ` Eric Sandeen
