From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: go straight to real allocations for direct I/O COW writes
Date: Fri, 27 Jan 2017 09:44:47 -0800 [thread overview]
Message-ID: <20170127174447.GZ9134@birch.djwong.org> (raw)
In-Reply-To: <20170125000934.GG9134@birch.djwong.org>
On Tue, Jan 24, 2017 at 04:09:34PM -0800, Darrick J. Wong wrote:
> On Tue, Jan 24, 2017 at 09:08:55PM +0100, Christoph Hellwig wrote:
> > On Tue, Jan 24, 2017 at 12:43:18PM -0500, Brian Foster wrote:
> > > Hmm, that's not what I'm seeing (not that it really matters, but I'm
> > > curious if I'm missing something):
> >
> > Yeah, I can reproduce this on mainline. Turns out the it was done
> > by the align call in xfs_bmap_btalloc that even my before run had
> > removed.
> >
> > Took me some time to spin my head around this.
> >
> > Btw, I think we have a nasty race in the current DIO code that might
> > expose stale data, but this is just the same kind of head spinning
> > exercise for now:
> >
> > Thread 1 writes a range from B to c
> >
> > B --------- C
>
> A --------- B --------- C
> ^ ^
> d e
>
> I'm assuming B-C has no shared blocks, d-B has shared blocks, and that
> both d & e are cowextsize boundaries.
>
> > a little later thread 2 writes from A to B
> >
> > A --------- B
> > but the code preallocates beyond B into the range where thread
> > 1 has just written, but ->end_io hasn't been called yet.
> > But once ->end_io is called thread 2 has already allocated
> > up to the extent size hint into the write range of thread 1,
> > so the end_io handler will splice the unintialized blocks from
> > that preallocation back into the file right after B.
>
> I think you're right about there being a dio race here. I'm not sure
> what the solution here is -- certainly we could disregard the cowextsize
> hint, though that has a fragmentation cost that we already know about.
>
> We could also change the dio write setup to extend the range that it
> checks for shared blocks up and down to the nearest cowextsize boundary
> and set up the delalloc reservations in the cow fork as necessary. If
> our thread2 comes along then it'll find the reservations already set
> up for a cow so that we avoid the situation where B-C changes between
> iomap_begin and dio_write_end_io does the wrong thing. That's more in
> the spirit of cowextsize since we'd promote future writes to CoW.
> However it's more wasteful of blocks since we have no idea if those
> future writes are ever actually going to happen, and it also adds more
> bmap records if we don't use all of the reservation.
>
> Ugh, my head hurts, I'm going to go for a walk to ponder this some more,
> and to try to work out whether the buffered path is all right.
I think I came up with a better idea overnight: take advantage of the
unwritten bit. Currently, all extents in the cow fork are either
delalloc or normal extents. When we allocate the blocks to fill a cow
extent, we'll flag the extent as unwritten. When we go to write the
data to disk, we convert the unwritten extent to a real extent, and the
post-write remap (since it only takes a file offset and length) will
be changed to remap only real, written extents into the data fork.
--D
>
> --D
>
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2017-01-27 17:45 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-05 21:05 reflink COW improvements Christoph Hellwig
2016-12-05 21:05 ` [PATCH 1/3] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig
2016-12-07 18:59 ` Brian Foster
2016-12-05 21:05 ` [PATCH 2/3] xfs: go straight to real allocations for direct I/O COW writes Christoph Hellwig
2016-12-07 19:00 ` Brian Foster
2016-12-07 19:37 ` Christoph Hellwig
2016-12-07 19:46 ` Brian Foster
2016-12-08 4:23 ` Darrick J. Wong
2017-01-24 8:37 ` Christoph Hellwig
2017-01-24 13:50 ` Brian Foster
2017-01-24 13:59 ` Christoph Hellwig
2017-01-24 15:02 ` Brian Foster
2017-01-24 15:09 ` Christoph Hellwig
2017-01-24 16:17 ` Brian Foster
2017-01-24 16:21 ` Christoph Hellwig
2017-01-24 17:43 ` Brian Foster
2017-01-24 20:08 ` Christoph Hellwig
2017-01-24 20:10 ` Christoph Hellwig
2017-01-25 0:09 ` Darrick J. Wong
2017-01-27 17:44 ` Darrick J. Wong [this message]
2017-01-27 17:48 ` Christoph Hellwig
2016-12-05 21:05 ` [PATCH 3/3] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig
2016-12-06 2:09 ` reflink COW improvements Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170127174447.GZ9134@birch.djwong.org \
--to=darrick.wong@oracle.com \
--cc=bfoster@redhat.com \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).