From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: fix shared extent data corruption due to missing cow reservation
Date: Fri, 16 Nov 2018 15:35:08 +1100 [thread overview]
Message-ID: <20181116043507.GB19305@dastard> (raw)
In-Reply-To: <20181115055020.GS4235@magnolia>
On Wed, Nov 14, 2018 at 09:50:20PM -0800, Darrick J. Wong wrote:
> On Tue, Nov 13, 2018 at 12:08:19PM -0500, Brian Foster wrote:
> > Page writeback indirectly handles shared extents via the existence
> > of overlapping COW fork blocks. If COW fork blocks exist, writeback
> > always performs the associated copy-on-write regardless if the
> > underlying blocks are actually shared. If the blocks are shared,
> > then overlapping COW fork blocks must always exist.
> >
> > fstests shared/010 reproduces a case where a buffered write occurs
> > over a shared block without performing the requisite COW fork
> > reservation. This ultimately causes writeback to the shared extent
> > and data corruption that is detected across md5 checks of the
> > filesystem across a mount cycle.
> >
> > The problem occurs when a buffered write lands over a shared extent
> > that crosses an extent size hint boundary and that also happens to
> > have a partial COW reservation that doesn't cover the start and end
> > blocks of the data fork extent.
> >
> > For example, a buffered write occurs across the file offset (in FSB
> > units) range of [29, 57]. A shared extent exists at blocks [29, 35]
> > and COW reservation already exists at blocks [32, 34]. After
> > accommodating a COW extent size hint of 32 blocks and the existing
> > reservation at offset 32, xfs_reflink_reserve_cow() allocates 32
> > blocks of reservation at offset 0 and returns with COW reservation
> > across the range of [0, 34]. The associated data fork extent is
> > still [29, 35], however, which isn't fully covered by the COW
> > reservation.
> >
> > This leads to a buffered write at file offset 35 over a shared
> > extent without associated COW reservation. Writeback eventually
> > kicks in, performs an overwrite of the underlying shared block and
> > causes the associated data corruption.
> >
> > Update xfs_reflink_reserve_cow() to accommodate the fact that a
> > delalloc allocation request may not fully cover the extent in the
> > data fork. Trim the data fork extent appropriately, just as is done
> > for shared extent boundaries and/or existing COW reservations that
> > happen to overlap the start of the data fork extent. This prevents
> > shared/010 failures due to data corruption on reflink enabled
> > filesystems.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > This is not fully tested yet beyond verification that it solves the
> > problem reproduced by shared/010. I'll be running more tests today, but
> > I'm sending sooner for review and testing due to the nature of the
> > problem and the fact that it's a fairly isolated change. I'll follow up
> > if I discover any resulting regressions..
>
> Did you find any regressions?
>
> I ran this through my overnight tests and saw no adverse effects, though
> Dave was complaining yesterday about continuing generic/091 corruptions
> (which I didn't see with this patch applied...)
I can say now that this patch hasn't caused any new corruptions. It
hasn't fixed any of the (many) corruptions that I'm hitting, either,
so from that perspective it's no better or worse than what we have
now :P
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2018-11-16 14:45 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-13 17:08 [PATCH] xfs: fix shared extent data corruption due to missing cow reservation Brian Foster
2018-11-15 5:50 ` Darrick J. Wong
2018-11-15 9:49 ` Christoph Hellwig
2018-11-15 12:33 ` Brian Foster
2018-11-16 4:35 ` Dave Chinner [this message]
2018-11-16 13:32 ` Brian Foster
2018-11-16 21:19 ` Dave Chinner
2018-11-17 13:33 ` Brian Foster
2018-11-15 9:50 ` Christoph Hellwig
2018-11-15 15:51 ` Eric Sandeen
2018-11-15 15:58 ` Brian Foster
2018-11-15 15:59 ` Eric Sandeen
2018-11-15 16:10 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181116043507.GB19305@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox