From: Theodore Ts'o <tytso@mit.edu>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: linux-ext4@vger.kernel.org, Eric Sandeen <sandeen@redhat.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [heads-up][RFC] ext4_file_write() breakage
Date: Thu, 3 Apr 2014 22:55:59 -0400
Message-ID: <20140404025558.GB2525@thunk.org>
In-Reply-To: <20140403163739.GR18016@ZenIV.linux.org.uk>

On Thu, Apr 03, 2014 at 05:37:39PM +0100, Al Viro wrote:
> 2) simply looking at file size in O_APPEND case instead of pos would not
> close that one - file size is unstable at that point (we don't have any
> locks held here).
> 
> 3) ext4_unaligned_aio() suffers the same problem, but that's *not* the
> only issue with it.

So basically, we'll have to take i_mutex in order to check the file
size, which means there's no point to the ext4_unaligned_aio()
logic.  We can just take i_mutex and then do the tests based on
i_size in ext4_file_dio_write().

>  It checks that (O_DIRECT) aio write tries to hit
> something aligned only to hw sector and not to block size.  Fine, but...
> think what rlimit will do to us.  generic_write_checks() contains this:
> 
> 	unsigned long limit = rlimit(RLIMIT_FSIZE);
> 	....
> 		if (limit != RLIM_INFINITY) {
> 			if (*pos >= limit) {
> 				send_sig(SIGXFSZ, current, 0);
> 				return -EFBIG;
> 			}
> 			if (*count > limit - (typeof(limit))*pos) {
> 				*count = limit - (typeof(limit))*pos;
> 			}
> 		}
> 
> and it's done only after we'd called ext4_unaligned_aio().  

Can we solve this problem by simply doing these tests in
ext4_file_dio_write(), so we modify pos/count before we do the
ext4_unaligned_aio() checks?  We don't need i_mutex to do these
particular tests, right?

> So it doesn't
> predict whether the iovec seen by ->direct_IO() will be unaligned - there
> are false negatives.  Even worse, consider an iovec that consists of
> 8 segments, 512 bytes each.  Starting offset in file is a multiple of block
> size.  Everything's fine from ext4_unaligned_aio() POV, right?  And from
> fs/direct-io.c one it's only sector-aligned sucker.  For a good reason,
> since a segment in the middle of that thing might very well point to unmapped
> memory, which will mean short write, with all zeroing issues ext4 is trying
> to avoid here.

I'm not sure I understand the concern here.  The zeroing issue we're
concerned about arises when two threads need to work on the same
unwritten block.  So if the pos and size are block aligned, this can't
happen.  What am I missing?

					- Ted
