From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs: fix log CRC mismatches between i386 and other architectures
Date: Mon, 15 Sep 2025 11:25:13 -0700
Message-ID: <20250915182513.GP8096@frogsfrogsfrogs>
In-Reply-To: <20250915132047.159473-3-hch@lst.de>
On Mon, Sep 15, 2025 at 06:20:30AM -0700, Christoph Hellwig wrote:
> When mounting file systems with a log that was dirtied on i386 on
> other architectures or vice versa, log recovery is unhappy:
>
> [ 11.068052] XFS (vdb): Torn write (CRC failure) detected at log block 0x2. Truncating head block from 0xc.
>
> This is because the CRCs generated by i386 and other architectures
> always differ. The reason for that is that sizeof(struct xlog_rec_header)
> returns different values for i386 vs the rest (324 vs 328), because the
> struct is not sizeof(uint64_t) aligned, and i386 has odd struct size
> alignment rules.

...and let me guess, the checksum function samples data all the way out
to byte 324/328 too?

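For readers following along, the 324 vs 328 difference can be reproduced in user space. The sketch below is an illustration only, not code from the patch. struct rec_header_sketch is a hypothetical mirror of the pre-patch header (the leading fields are written from the kernel header as I understand it, and 64 cycle-data words are assumed), offsetofend is re-declared locally, and unpadded_size(), padded_size() and check_layout() are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Kernel helper, re-declared so the sketch builds in user space. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical mirror of the pre-patch record header (no h_pad0). */
struct rec_header_sketch {
	uint32_t h_magicno;
	uint32_t h_cycle;
	uint32_t h_version;
	uint32_t h_len;
	uint64_t h_lsn;			/* 64-bit members set the struct alignment */
	uint64_t h_tail_lsn;
	uint32_t h_crc;
	uint32_t h_prev_block;
	uint32_t h_num_logops;
	uint32_t h_cycle_data[64];	/* assumes XLOG_HEADER_CYCLE_SIZE / BBSIZE == 64 */
	uint32_t h_fmt;
	uint8_t  h_fs_uuid[16];
	uint32_t h_size;
};

/* Bytes actually occupied by members: ends right after h_size. */
size_t unpadded_size(void)
{
	return offsetofend(struct rec_header_sketch, h_size);
}

/* Member bytes rounded up to the struct's alignment on this ABI. */
size_t padded_size(void)
{
	size_t align = _Alignof(struct rec_header_sketch);

	return (unpadded_size() + align - 1) / align * align;
}

/* Sanity checks that should hold regardless of the ABI. */
void check_layout(void)
{
	assert(offsetof(struct rec_header_sketch, h_size) == 320);
	assert(padded_size() == sizeof(struct rec_header_sketch));
}
```

On x86-64 I would expect sizeof(struct rec_header_sketch) to come out as 328 while unpadded_size() stays at 324. Building with -m32 on x86 should make both 324, because the i386 ABI only aligns 64-bit members to 4 bytes inside structs.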
> This issue goes back to commit 13cdc853c519 ("Add log versioning, and new
> super block field for the log stripe") in the xfs-import tree, which
> adds log v2 support and the h_size field that causes the unaligned size.
> At that time it only mattered for the crude debug-only log header
> checksum, but with commit 0e446be44806 ("xfs: add CRC checks to the log")
> it became a real issue for v5 file systems, because now there is a proper
> CRC, and regular builds actually expect it to match.
>
> Fix this by allowing checksums with and without the padding.
>
> Fixes: 0e446be44806 ("xfs: add CRC checks to the log")

Cc: <stable@vger.kernel.org> # v3.8

Perhaps? This seems like a serious tripping point for old kernels.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/libxfs/xfs_log_format.h | 30 +++++++++++++++++++++++++++++-
> fs/xfs/libxfs/xfs_ondisk.h | 2 ++
> fs/xfs/xfs_log.c | 8 ++++----
> fs/xfs/xfs_log_priv.h | 4 ++--
> fs/xfs/xfs_log_recover.c | 19 +++++++++++++++++--
> 5 files changed, 54 insertions(+), 9 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
> index 0d637c276db0..942c490f23e4 100644
> --- a/fs/xfs/libxfs/xfs_log_format.h
> +++ b/fs/xfs/libxfs/xfs_log_format.h
> @@ -174,12 +174,40 @@ typedef struct xlog_rec_header {
> __be32 h_prev_block; /* block number to previous LR : 4 */
> __be32 h_num_logops; /* number of log operations in this LR : 4 */
> __be32 h_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE];
> - /* new fields */
> +
> + /* fields added by the Linux port: */
> __be32 h_fmt; /* format of log record : 4 */
> uuid_t h_fs_uuid; /* uuid of FS : 16 */
> +
> + /* fields added for log v2: */
> __be32 h_size; /* iclog size : 4 */
> +
> + /*
> + * When h_size was added for log v2 support, it caused the structure to
> + * have a different size on i386 vs all other architectures because the
> + * sum of the member sizes is not a multiple of the size of the largest
> + * (__be64) member, and i386 has really odd struct alignment rules.
> + *
> + * Due to the way the log headers are laid out on disk, that alone is
> + * not a problem because the xlog_rec_header always sits alone in a
> + * BBSIZE-sized area, and the rest of that area is padded with zeroes.
> + * But xlog_cksum used to calculate the checksum based on the structure
> + * size, and thus gives different checksums for i386 vs the rest.
> + * We now do two checksum validation passes for both sizes to allow
> + * moving v5 file systems with unclean logs between i386 and other
> + * (little-endian) architectures.

Is this a problem on other 32-bit platforms? Or just i386?

> + */
> + __u32 h_pad0;
> } xlog_rec_header_t;
>
> +#ifdef __i386__
> +#define XLOG_REC_SIZE offsetofend(struct xlog_rec_header, h_size)
> +#define XLOG_REC_SIZE_OTHER sizeof(struct xlog_rec_header)
> +#else
> +#define XLOG_REC_SIZE sizeof(struct xlog_rec_header)
> +#define XLOG_REC_SIZE_OTHER offsetofend(struct xlog_rec_header, h_size)
> +#endif /* __i386__ */
> +
> typedef struct xlog_rec_ext_header {
> __be32 xh_cycle; /* write cycle of log : 4 */
> __be32 xh_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE]; /* : 256 */
> diff --git a/fs/xfs/libxfs/xfs_ondisk.h b/fs/xfs/libxfs/xfs_ondisk.h
> index 5ed44fdf7491..7bfa3242e2c5 100644
> --- a/fs/xfs/libxfs/xfs_ondisk.h
> +++ b/fs/xfs/libxfs/xfs_ondisk.h
> @@ -174,6 +174,8 @@ xfs_check_ondisk_structs(void)
> XFS_CHECK_STRUCT_SIZE(struct xfs_rud_log_format, 16);
> XFS_CHECK_STRUCT_SIZE(struct xfs_map_extent, 32);
> XFS_CHECK_STRUCT_SIZE(struct xfs_phys_extent, 16);
> + XFS_CHECK_STRUCT_SIZE(struct xlog_rec_header, 328);
> + XFS_CHECK_STRUCT_SIZE(struct xlog_rec_ext_header, 260);

I guess we'll find out from the build bots. ;)

The code changes look ok modulo my various questions.

--D

>
> XFS_CHECK_OFFSET(struct xfs_bui_log_format, bui_extents, 16);
> XFS_CHECK_OFFSET(struct xfs_cui_log_format, cui_extents, 16);
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index c8a57e21a1d3..69703dc3ef94 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1568,13 +1568,13 @@ xlog_cksum(
> struct xlog *log,
> struct xlog_rec_header *rhead,
> char *dp,
> - int size)
> + unsigned int hdrsize,
> + unsigned int size)
> {
> uint32_t crc;
>
> /* first generate the crc for the record header ... */
> - crc = xfs_start_cksum_update((char *)rhead,
> - sizeof(struct xlog_rec_header),
> + crc = xfs_start_cksum_update((char *)rhead, hdrsize,
> offsetof(struct xlog_rec_header, h_crc));
>
> /* ... then for additional cycle data for v2 logs ... */
> @@ -1818,7 +1818,7 @@ xlog_sync(
>
> /* calculcate the checksum */
> iclog->ic_header.h_crc = xlog_cksum(log, &iclog->ic_header,
> - iclog->ic_datap, size);
> + iclog->ic_datap, XLOG_REC_SIZE, size);
> /*
> * Intentionally corrupt the log record CRC based on the error injection
> * frequency, if defined. This facilitates testing log recovery in the
> diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
> index a9a7a271c15b..0cfc654d8e87 100644
> --- a/fs/xfs/xfs_log_priv.h
> +++ b/fs/xfs/xfs_log_priv.h
> @@ -499,8 +499,8 @@ xlog_recover_finish(
> extern void
> xlog_recover_cancel(struct xlog *);
>
> -extern __le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead,
> - char *dp, int size);
> +__le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead,
> + char *dp, unsigned int hdrsize, unsigned int size);
>
> extern struct kmem_cache *xfs_log_ticket_cache;
> struct xlog_ticket *xlog_ticket_alloc(struct xlog *log, int unit_bytes,
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 0a4db8efd903..549d60959aee 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2894,9 +2894,24 @@ xlog_recover_process(
> int pass,
> struct list_head *buffer_list)
> {
> - __le32 expected_crc = rhead->h_crc, crc;
> + __le32 expected_crc = rhead->h_crc, crc, other_crc;
>
> - crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len));
> + crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE,
> + be32_to_cpu(rhead->h_len));
> +
> + /*
> + * Look at the end of the struct xlog_rec_header definition in
> + * xfs_log_format.h for the gory details.
> + */
> + if (expected_crc && crc != expected_crc) {
> + other_crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE_OTHER,
> + be32_to_cpu(rhead->h_len));
> + if (other_crc == expected_crc) {
> + xfs_notice_once(log->l_mp,
> + "Fixing up incorrect CRC due to padding.");
> + crc = other_crc;
> + }
> + }
>
> /*
> * Nothing else to do if this is a CRC verification pass. Just return
> --
> 2.47.2
>
>
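To make the two-pass validation concrete, here is a minimal user-space sketch of the idea, under several assumptions. crc32c_sketch() is a plain bitwise CRC32C used as a stand-in for the kernel's checksum helpers, header_crc_ok() is a hypothetical name, and the 324/328 lengths are hard-coded instead of using XLOG_REC_SIZE and XLOG_REC_SIZE_OTHER. Unlike xlog_cksum(), it neither masks out the h_crc field nor covers extended headers and record data, and it skips the patch's "native size first, zero CRC ignored" details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE	512	/* stand-in for BBSIZE */
#define HDR_UNPADDED	324	/* header bytes up to the end of h_size */
#define HDR_PADDED	328	/* header bytes including trailing padding */

/* Bitwise CRC32C (Castagnoli polynomial, reflected); slow but dependency-free. */
uint32_t crc32c_sketch(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Accept a stored checksum that matches either header length.
 * Returns 1 on a match with the unpadded or the padded length, else 0.
 */
int header_crc_ok(const uint8_t *sector, uint32_t stored)
{
	assert(sector != NULL);

	if (crc32c_sketch(sector, HDR_UNPADDED) == stored)
		return 1;
	return crc32c_sketch(sector, HDR_PADDED) == stored;
}
```

Because the header sits alone in a zero-padded sector, the two lengths cover the same header bytes and differ only by four trailing zero bytes. That is still enough to change the CRC, which is why a writer using one length and a reader using the other disagree unless the reader tries both.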