From: Ming Lei <ming.lei@redhat.com>
To: zhangshida <starzhangzsd@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
nvdimm@lists.linux.dev, virtualization@lists.linux.dev,
linux-nvme@lists.infradead.org, gfs2@lists.linux.dev,
ntfs3@lists.linux.dev, linux-xfs@vger.kernel.org,
zhangshida@kylinos.cn
Subject: Re: Fix potential data loss and corruption due to Incorrect BIO Chain Handling
Date: Sat, 22 Nov 2025 11:35:31 +0800
Message-ID: <aSEvg8z9qxSwJmZn@fedora>
In-Reply-To: <20251121081748.1443507-1-zhangshida@kylinos.cn>
On Fri, Nov 21, 2025 at 04:17:39PM +0800, zhangshida wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
>
> Hello everyone,
>
> We have recently encountered a severe data loss issue on kernel version 4.19,
> and we suspect the same underlying problem may exist in the latest kernel versions.
>
> Environment:
> * **Architecture:** arm64
> * **Page Size:** 64KB
> * **Filesystem:** XFS with a 4KB block size
>
> Scenario:
> The issue occurs while running a MySQL instance where one thread appends data
> to a log file, and a separate thread concurrently reads that file to perform
> CRC checks on its contents.
>
> Problem Description:
> Occasionally, the reading thread detects data corruption. Specifically, it finds
> that stale data has been exposed in the middle of the file.
>
> We have captured four instances of this corruption in our production environment.
> In each case, we observed a distinct pattern:
> The corruption starts at an offset that aligns with the beginning of an XFS extent.
> The corruption ends at an offset that is aligned to the system's `PAGE_SIZE` (64KB in our case).
>
> Corruption Instances:
> 1. **Start:** `0x73be000`, **End:** `0x73c0000` (Length: 8KB)
> 2. **Start:** `0x10791a000`, **End:** `0x107920000` (Length: 24KB)
> 3. **Start:** `0x14535a000`, **End:** `0x145b70000` (Length: 8280KB)
> 4. **Start:** `0x370d000`, **End:** `0x3710000` (Length: 12KB)
>
> After analysis, we believe the root cause is in the handling of chained bios, specifically
> related to out-of-order io completion.
>
> Consider a bio chain where `bi_remaining` is decremented as each bio in the chain completes.
> For example,
> if a chain consists of three bios (bio1 -> bio2 -> bio3) with
> bi_remaining count:
> 1->2->2
Right.
> If the bios complete in reverse order, there will be a problem.
> If bio 3 completes first, it will become:
> 1->2->1
Yes.
> then bio 2 completes:
> 1->1->0
No, it is supposed to be 1->1->1: bio 2 completing only drops its own
count from 2 to 1.

When bio 1 completes, it becomes 0->0->0.

bio3's `__bi_remaining` won't drop to zero until bio2's reaches
zero, and bio2's won't reach zero until bio1 has ended.

Please look at bio_endio():

void bio_endio(struct bio *bio)
{
again:
	if (!bio_remaining_done(bio))
		return;
	...
	if (bio->bi_end_io == bio_chain_endio) {
		bio = __bio_chain_endio(bio);
		goto again;
	}
	...
}
Thanks,
Ming