linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: go straight to real allocations for direct I/O COW writes
Date: Tue, 24 Jan 2017 16:09:34 -0800	[thread overview]
Message-ID: <20170125000934.GG9134@birch.djwong.org> (raw)
In-Reply-To: <20170124200855.GA1385@lst.de>

On Tue, Jan 24, 2017 at 09:08:55PM +0100, Christoph Hellwig wrote:
> On Tue, Jan 24, 2017 at 12:43:18PM -0500, Brian Foster wrote:
> > Hmm, that's not what I'm seeing (not that it really matters, but I'm
> > curious if I'm missing something):
> 
> Yeah, I can reproduce this on mainline.  Turns out the it was done
> by the align call in xfs_bmap_btalloc that even my before run had
> removed.
> 
> Took me some time to spin my head around this.
> 
> Btw, I think we have a nasty race in the current DIO code that might
> expose stale data, but this is just the same kind of head spinning
> exercise for now:
> 
> Thread 1 writes a range from B to c
> 
>                     B --------- C

          A --------- B --------- C
               ^           ^
               d           e

I'm assuming B-C has no shared blocks, d-B has shared blocks, and that
both d & e are cowextsize boundaries.

> a little later thread 2 writes from A to B
> 
>         A --------- B
> but the code preallocates beyond B into the range where thread
> 1 has just written, but ->end_io hasn't been called yet.
> But once ->end_io is called thread 2 has already allocated
> up to the extent size hint into the write range of thread 1,
> so the end_io handler will splice the unintialized blocks from
> that preallocation back into the file right after B.

I think you're right about there being a dio race here.  I'm not sure
what the solution here is -- certainly we could disregard the cowextsize
hint, though that has a fragmentation cost that we already know about.

We could also change the dio write setup to extend the range that it
checks for shared blocks up and down to the nearest cowextsize boundary
and set up the delalloc reservations in the cow fork as necessary.  If
our thread2 comes along then it'll find the reservations already set
up for a cow so that we avoid the situation where B-C changes between
iomap_begin and dio_write_end_io does the wrong thing.  That's more in
the spirit of cowextsize since we'd promote future writes to CoW.
However it's more wasteful of blocks since we have no idea if those
future writes are ever actually going to happen, and it also adds more
bmap records if we don't use all of the reservation.

Ugh, my head hurts, I'm going to go for a walk to ponder this some more,
and to try to work out whether the buffered path is all right.

--D

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2017-01-25  0:09 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-05 21:05 reflink COW improvements Christoph Hellwig
2016-12-05 21:05 ` [PATCH 1/3] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig
2016-12-07 18:59   ` Brian Foster
2016-12-05 21:05 ` [PATCH 2/3] xfs: go straight to real allocations for direct I/O COW writes Christoph Hellwig
2016-12-07 19:00   ` Brian Foster
2016-12-07 19:37     ` Christoph Hellwig
2016-12-07 19:46       ` Brian Foster
2016-12-08  4:23         ` Darrick J. Wong
2017-01-24  8:37         ` Christoph Hellwig
2017-01-24 13:50           ` Brian Foster
2017-01-24 13:59             ` Christoph Hellwig
2017-01-24 15:02               ` Brian Foster
2017-01-24 15:09                 ` Christoph Hellwig
2017-01-24 16:17                   ` Brian Foster
2017-01-24 16:21                     ` Christoph Hellwig
2017-01-24 17:43                       ` Brian Foster
2017-01-24 20:08                         ` Christoph Hellwig
2017-01-24 20:10                           ` Christoph Hellwig
2017-01-25  0:09                           ` Darrick J. Wong [this message]
2017-01-27 17:44                             ` Darrick J. Wong
2017-01-27 17:48                               ` Christoph Hellwig
2016-12-05 21:05 ` [PATCH 3/3] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig
2016-12-06  2:09 ` reflink COW improvements Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170125000934.GG9134@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).