From: Dave Chinner <david@fromorbit.com>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: linux-xfs@vger.kernel.org,
"Spraul Manfred (XC/QMM21-CT)" <Manfred.Spraul@de.bosch.com>
Subject: Re: Metadata CRC error detected at xfs_dir3_block_read_verify+0x9e/0xc0 [xfs], xfs_dir3_block block 0x86f58
Date: Thu, 17 Mar 2022 13:47:05 +1100
Message-ID: <20220317024705.GY3927073@dread.disaster.area>
In-Reply-To: <3242ad20-0039-2579-b125-b7a9447a7230@colorfullife.com>
On Wed, Mar 16, 2022 at 09:55:04AM +0100, Manfred Spraul wrote:
> Hi Dave,
>
> On 3/14/22 16:18, Manfred Spraul wrote:
> > Hi Dave,
> >
> > On 3/13/22 23:46, Dave Chinner wrote:
> > > OK, this test is explicitly tearing writes at the storage level.
> > > When there is an update to multiple sectors of the metadata block,
> > > the metadata will be inconsistent on disk while those individual
> > > sector writes are replayed.
> >
> > Thanks for the clarification.
> >
> > I'll modify the test application to never tear write operations and
> > retry.
> >
> > If there are findings, then I'll distribute them.
> >
> I've modified the test app, and with 4000 simulated power failures I have
> not seen any corruptions.
>
>
> Thus:
>
> - With torn write operations: 2 corruptions from ~800 simulated power
> failures
>
> - Without torn write operations: no corruptions from ~4000 simulated power
> failures.
Good to hear.
> But:
>
> I've checked the eMMC specification, and the spec allows torn writes to
> happen:
Yes, most storage only guarantees that sector writes are atomic and
so multi-sector writes have no guarantees of being written
atomically. IOWs, all storage technologies that currently exist are
allowed to tear multi-sector writes.
However, FUA writes are guaranteed to be whole on persistent storage
regardless of size when the hardware signals completion. And any
write that the hardware has signalled as complete before a cache
flush is received is also guaranteed to be whole on persistent
storage when the cache flush is signalled as complete by the
hardware. These mechanisms provide protection against torn writes.
IOWs, it's up to filesystems to guarantee data is on stable storage
before they trust it fully. Filesystems are pretty good at using
REQ_FLUSH, REQ_FUA and write completion ordering to ensure that
anything they need whole and complete on stable storage is actually
whole and complete.
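
As a rough userspace analogue of that ordering (not XFS code), the sketch
below writes a multi-sector payload, waits for fsync() to report it
durable, and only then writes a single-sector commit record. The file
layout, the COMMITOK magic, the helper names and the 512-byte sector size
are all assumptions made up for illustration. On Linux, fsync() on a
regular file normally causes the filesystem to issue a cache flush to the
device, which is where the guarantee comes from.

```python
import os
import struct
import tempfile

SECTOR = 512             # assumed sector size; many devices use 4096
COMMIT_OFFSET = 0        # hypothetical layout: sector 0 holds the commit record
PAYLOAD_OFFSET = SECTOR  # hypothetical layout: payload starts at sector 1
MAGIC = b"COMMITOK"      # hypothetical 8-byte marker


def write_all(fd, data, offset):
    """pwrite() may write less than requested, so loop until done."""
    view = memoryview(data)
    done = 0
    while done < len(view):
        done += os.pwrite(fd, view[done:], offset + done)


def commit(fd, payload):
    # Step 1: the multi-sector payload. It may be torn until the flush
    # below has completed, so nothing may point at it yet.
    write_all(fd, payload, PAYLOAD_OFFSET)
    os.fsync(fd)
    # Step 2: a commit record that fits in one sector, written only
    # after the payload flush has been reported complete.
    record = struct.pack("<8sQ", MAGIC, len(payload)).ljust(SECTOR, b"\0")
    write_all(fd, record, COMMIT_OFFSET)
    os.fsync(fd)


def read_committed(fd):
    """Return the payload if a valid commit record exists, else None."""
    record = os.pread(fd, SECTOR, COMMIT_OFFSET)
    if len(record) < struct.calcsize("<8sQ"):
        return None
    magic, length = struct.unpack_from("<8sQ", record)
    if magic != MAGIC:
        return None
    return os.pread(fd, length, PAYLOAD_OFFSET)
```

The point is only the ordering: the record that makes the payload
"trusted" is issued after the flush covering the payload has completed.
Real filesystems do this with REQ_FLUSH/REQ_FUA in their journalling code,
which is considerably more involved than this sketch.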
In the cases where torn writes occur because they haven't been
covered by a FUA or cache flush guarantee (such as your test),
filesystems need mechanisms in their metadata to detect such events.
CRCs are the prime mechanism for this - that's what XFS uses, and it
was XFS reporting a CRC failure when reading torn metadata that
started this whole thread.
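
A minimal simulation of that detection, under stated assumptions: a
4096-byte metadata block made of eight 512-byte sectors, with a checksum
stored in its first four bytes. The layout and the names seal(), verify()
and tear() are hypothetical, and zlib.crc32 (plain CRC-32) stands in for
the CRC32c that XFS actually uses. A torn write is modelled as the first
few sectors of the new block landing on top of the old block.

```python
import struct
import zlib

SECTOR = 512   # assumed sector size
BLOCK = 4096   # assumed metadata block size (8 sectors)
CRC_LEN = 4    # hypothetical layout: checksum in the first 4 bytes


def seal(body):
    """Prefix the body with its checksum to form a full block."""
    if len(body) != BLOCK - CRC_LEN:
        raise ValueError("body must fill the block exactly")
    return struct.pack("<I", zlib.crc32(body)) + body


def verify(block):
    """Recompute the checksum on read and compare with the stored one."""
    (stored,) = struct.unpack_from("<I", block)
    return stored == zlib.crc32(block[CRC_LEN:])


def tear(old, new, sectors_written):
    """Simulate power loss after only the first N sectors were updated."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]


old = seal(b"A" * (BLOCK - CRC_LEN))
new = seal(b"B" * (BLOCK - CRC_LEN))
torn = tear(old, new, 3)  # 3 new sectors followed by 5 stale ones
```

Here verify(old) and verify(new) both succeed, while verify(torn) fails
because the stored checksum describes the new contents but five of the
eight sectors still hold old data. That mismatch is the same kind of
event as the read verifier error in the subject line.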
> Is my understanding correct that XFS supports neither eMMC nor NVM devices?
> (unless there is a battery backup that exceeds the guarantees from the spec)
Incorrect.
They are supported just fine because flush/FUA semantics provide
guarantees against torn writes in normal operation. IOWs, torn
writes are something that almost *never* happens in real life, even
when power fails suddenly. Despite this, XFS can detect when one has
occurred (because broken storage is all too common!), and if it
can't recover automatically, it will shut down and ask the user to
correct the problem.
BTRFS and ZFS can also detect torn writes, and if you use the
(non-default) ext4 option "metadata_csum" it will also detect torn
writes to metadata via CRC failures. There are other filesystems
that can detect and correct torn writes, too.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com