* [PATCH v5 0/3] Fix bio chain related issues
@ 2025-12-04 2:47 zhangshida
2025-12-04 2:47 ` [PATCH v5 1/3] bcache: fix improper use of bi_end_io zhangshida
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: zhangshida @ 2025-12-04 2:47 UTC (permalink / raw)
To: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander,
colyli
Cc: linux-block, linux-bcache, linux-kernel, zhangshida, starzhangzsd
From: Shida Zhang <zhangshida@kylinos.cn>
This series addresses incorrect usage of bio_chain_endio().
Note: Patch 2 depends on changes introduced in patch 1. Therefore, patch
1 is still included in this series even though Coly suggested sending it
directly to the bcache mailing list:
https://lore.kernel.org/all/20251201082611.2703889-1-zhangshida@kylinos.cn/
V5:
- Patch 2: Replaced BUG_ON(1) with BUG().
- Patch 3: Rephrased the commit message.
v4:
- Removed unnecessary cleanups from the series.
https://lore.kernel.org/all/20251201090442.2707362-1-zhangshida@kylinos.cn/
v3:
- Remove the dead code in bio_chain_endio and drop patch 1 in v2
- Refined the __bio_chain_endio changes with minor modifications (was
patch 02 in v2).
- Dropped cleanup patches 06 and 12 from v2 due to an incorrect 'prev'
and 'new' order.
https://lore.kernel.org/all/20251129090122.2457896-1-zhangshida@kylinos.cn/
v2:
- Added fix for bcache.
- Added BUG_ON() in bio_chain_endio().
- Enhanced commit messages for each patch
https://lore.kernel.org/all/20251128083219.2332407-1-zhangshida@kylinos.cn/
v1:
https://lore.kernel.org/all/20251121081748.1443507-1-zhangshida@kylinos.cn/
Shida Zhang (3):
bcache: fix improper use of bi_end_io
block: prohibit calls to bio_chain_endio
block: prevent race condition on bi_status in __bio_chain_endio
block/bio.c | 11 ++++++++---
drivers/md/bcache/request.c | 6 +++---
2 files changed, 11 insertions(+), 6 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH v5 1/3] bcache: fix improper use of bi_end_io 2025-12-04 2:47 [PATCH v5 0/3] Fix bio chain related issues zhangshida @ 2025-12-04 2:47 ` zhangshida 2025-12-04 9:38 ` Christoph Hellwig 2025-12-04 2:47 ` [PATCH v5 2/3] block: prohibit calls to bio_chain_endio zhangshida 2025-12-04 2:47 ` [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio zhangshida 2 siblings, 1 reply; 9+ messages in thread From: zhangshida @ 2025-12-04 2:47 UTC (permalink / raw) To: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander, colyli Cc: linux-block, linux-bcache, linux-kernel, zhangshida, starzhangzsd From: Shida Zhang <zhangshida@kylinos.cn> Don't call bio->bi_end_io() directly. Use the bio_endio() helper function instead, which handles completion more safely and uniformly. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> --- drivers/md/bcache/request.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c index af345dc6fde..82fdea7dea7 100644 --- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -1104,7 +1104,7 @@ static void detached_dev_end_io(struct bio *bio) } kfree(ddip); - bio->bi_end_io(bio); + bio_endio(bio); } static void detached_dev_do_request(struct bcache_device *d, struct bio *bio, @@ -1121,7 +1121,7 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio, ddip = kzalloc(sizeof(struct detached_dev_io_private), GFP_NOIO); if (!ddip) { bio->bi_status = BLK_STS_RESOURCE; - bio->bi_end_io(bio); + bio_endio(bio); return; } @@ -1136,7 +1136,7 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio, if ((bio_op(bio) == REQ_OP_DISCARD) && !bdev_max_discard_sectors(dc->bdev)) - bio->bi_end_io(bio); + detached_dev_end_io(bio); else submit_bio_noacct(bio); } -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v5 1/3] bcache: fix improper use of bi_end_io 2025-12-04 2:47 ` [PATCH v5 1/3] bcache: fix improper use of bi_end_io zhangshida @ 2025-12-04 9:38 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2025-12-04 9:38 UTC (permalink / raw) To: zhangshida Cc: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander, colyli, linux-block, linux-bcache, linux-kernel, zhangshida Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v5 2/3] block: prohibit calls to bio_chain_endio 2025-12-04 2:47 [PATCH v5 0/3] Fix bio chain related issues zhangshida 2025-12-04 2:47 ` [PATCH v5 1/3] bcache: fix improper use of bi_end_io zhangshida @ 2025-12-04 2:47 ` zhangshida 2025-12-04 9:40 ` Christoph Hellwig 2025-12-04 2:47 ` [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio zhangshida 2 siblings, 1 reply; 9+ messages in thread From: zhangshida @ 2025-12-04 2:47 UTC (permalink / raw) To: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander, colyli Cc: linux-block, linux-bcache, linux-kernel, zhangshida, starzhangzsd From: Shida Zhang <zhangshida@kylinos.cn> Now that all potential callers of bio_chain_endio have been eliminated, completely prohibit any future calls to this function. Suggested-by: Ming Lei <ming.lei@redhat.com> Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> --- block/bio.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/bio.c b/block/bio.c index b3a79285c27..cfb751dfcf5 100644 --- a/block/bio.c +++ b/block/bio.c @@ -320,9 +320,13 @@ static struct bio *__bio_chain_endio(struct bio *bio) return parent; } +/** + * This function should only be used as a flag and must never be called. + * If execution reaches here, it indicates a serious programming error. + */ static void bio_chain_endio(struct bio *bio) { - bio_endio(__bio_chain_endio(bio)); + BUG(); } /** -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v5 2/3] block: prohibit calls to bio_chain_endio 2025-12-04 2:47 ` [PATCH v5 2/3] block: prohibit calls to bio_chain_endio zhangshida @ 2025-12-04 9:40 ` Christoph Hellwig 2025-12-06 9:26 ` Stephen Zhang 0 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2025-12-04 9:40 UTC (permalink / raw) To: zhangshida Cc: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander, colyli, linux-block, linux-bcache, linux-kernel, zhangshida On Thu, Dec 04, 2025 at 10:47:47AM +0800, zhangshida wrote: > From: Shida Zhang <zhangshida@kylinos.cn> > > Now that all potential callers of bio_chain_endio have been > eliminated, completely prohibit any future calls to this function. > > Suggested-by: Ming Lei <ming.lei@redhat.com> > Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> > Suggested-by: Christoph Hellwig <hch@infradead.org> > Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> > --- > block/bio.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/block/bio.c b/block/bio.c > index b3a79285c27..cfb751dfcf5 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -320,9 +320,13 @@ static struct bio *__bio_chain_endio(struct bio *bio) > return parent; > } > > +/** > + * This function should only be used as a flag and must never be called. > + * If execution reaches here, it indicates a serious programming error. > + */ This is not a kerneldoc comment and thus should not use /** to start the comment, otherwise the kerneldoc script will complain about missing kernel doc elelemts. Otherwise looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 2/3] block: prohibit calls to bio_chain_endio 2025-12-04 9:40 ` Christoph Hellwig @ 2025-12-06 9:26 ` Stephen Zhang 0 siblings, 0 replies; 9+ messages in thread From: Stephen Zhang @ 2025-12-06 9:26 UTC (permalink / raw) To: Christoph Hellwig Cc: Johannes.Thumshirn, agruenba, ming.lei, hsiangkao, csander, colyli, linux-block, linux-bcache, linux-kernel, zhangshida Christoph Hellwig <hch@infradead.org> 于2025年12月4日周四 17:40写道: > > On Thu, Dec 04, 2025 at 10:47:47AM +0800, zhangshida wrote: > > From: Shida Zhang <zhangshida@kylinos.cn> > > > > Now that all potential callers of bio_chain_endio have been > > eliminated, completely prohibit any future calls to this function. > > > > Suggested-by: Ming Lei <ming.lei@redhat.com> > > Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> > > Suggested-by: Christoph Hellwig <hch@infradead.org> > > Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> > > --- > > block/bio.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/block/bio.c b/block/bio.c > > index b3a79285c27..cfb751dfcf5 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -320,9 +320,13 @@ static struct bio *__bio_chain_endio(struct bio *bio) > > return parent; > > } > > > > +/** > > + * This function should only be used as a flag and must never be called. > > + * If execution reaches here, it indicates a serious programming error. > > + */ > > This is not a kerneldoc comment and thus should not use /** to start > the comment, otherwise the kerneldoc script will complain about > missing kernel doc elelemts. > Thanks for catching that. I'll fix the comment format in the next revision. Thanks, Shida > Otherwise looks good: > > Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio 2025-12-04 2:47 [PATCH v5 0/3] Fix bio chain related issues zhangshida 2025-12-04 2:47 ` [PATCH v5 1/3] bcache: fix improper use of bi_end_io zhangshida 2025-12-04 2:47 ` [PATCH v5 2/3] block: prohibit calls to bio_chain_endio zhangshida @ 2025-12-04 2:47 ` zhangshida 2025-12-04 12:01 ` Andreas Gruenbacher 2 siblings, 1 reply; 9+ messages in thread From: zhangshida @ 2025-12-04 2:47 UTC (permalink / raw) To: Johannes.Thumshirn, hch, agruenba, ming.lei, hsiangkao, csander, colyli Cc: linux-block, linux-bcache, linux-kernel, zhangshida, starzhangzsd, Christoph Hellwig From: Shida Zhang <zhangshida@kylinos.cn> Andreas point out that multiple completions can race setting bi_status. If __bio_chain_endio() is called concurrently from multiple threads accessing the same parent bio, it should use WRITE_ONCE()/READ_ONCE() to access parent->bi_status and avoid data races. On x86 and ARM, these macros compile to the same instruction as a normal write, but they may be required on other architectures to prevent tearing, and to ensure the compiler does not add or remove memory accesses under the assumption that the values are not accessed concurrently. Adopting a cmpxchg approach, as used in other code paths, resolves all these issues, as suggested by Christoph. Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> --- block/bio.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/block/bio.c b/block/bio.c index cfb751dfcf5..51b57f9d8bd 100644 --- a/block/bio.c +++ b/block/bio.c @@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio) { struct bio *parent = bio->bi_private; - if (bio->bi_status && !parent->bi_status) - parent->bi_status = bio->bi_status; + if (bio->bi_status) + cmpxchg(&parent->bi_status, 0, bio->bi_status); + bio_put(bio); return parent; } -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio 2025-12-04 2:47 ` [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio zhangshida @ 2025-12-04 12:01 ` Andreas Gruenbacher 2025-12-05 7:13 ` Stephen Zhang 0 siblings, 1 reply; 9+ messages in thread From: Andreas Gruenbacher @ 2025-12-04 12:01 UTC (permalink / raw) To: zhangshida Cc: Johannes.Thumshirn, hch, ming.lei, hsiangkao, csander, colyli, linux-block, linux-bcache, linux-kernel, zhangshida, Christoph Hellwig On Thu, Dec 4, 2025 at 3:48 AM zhangshida <starzhangzsd@gmail.com> wrote: > From: Shida Zhang <zhangshida@kylinos.cn> > > Andreas point out that multiple completions can race setting > bi_status. What I've actually pointed out is that the '!parent->bi_status' check in this statement is an unnecessary optimization that can be removed. But this is not what this discussion is mainly about anymore. In the current code, multiple completions can race setting bi_status, but that is fine as long as bi_status is never set to 0 during bio completion. The effect is that when there are multiple errors, the bi_error field of the final bio will eventually be set to an error code, but we don't know which error code will win. This all works correctly today, and there is no race to fix because the race is intentional. > If __bio_chain_endio() is called concurrently from multiple threads > accessing the same parent bio, it should use WRITE_ONCE()/READ_ONCE() > to access parent->bi_status and avoid data races. > > On x86 and ARM, these macros compile to the same instruction as a > normal write, but they may be required on other architectures to > prevent tearing, and to ensure the compiler does not add or remove > memory accesses under the assumption that the values are not accessed > concurrently. WRITE_ONCE() and READ_ONCE() also prevent the compiler from reordering operations. Even when the compiler doesn't seem to do anything nasty at the moment, it would probably still be worthwhile to use WRITE_ONCE() for setting bi_status throughout the code. But that's beyond the scope of this patch, and it calls for more than a global search and replace job. > Adopting a cmpxchg approach, as used in other code paths, resolves all > these issues, as suggested by Christoph. No, the cmpxchg() doesn't actually achieve anything, it only makes things worse. For example, when there is an A -> B chain, we can end up with the following sequence of events: - A fails, sets A->bi_status, and calls bio_endio(A). - B->status is still 0, so bio_endio(A) sets B->bi_status to A->bi_status. - B fails and sets B->bi_status, OVERRIDING the value of A->bi_status. - bio_endio(B) calls B->bi_end_io(). Things get worse in an A -> B -> C chain, but I've already mentioned that earlier in this thread. So again, the cmpxchg() is unnecessary, but it is also harmless because it suggests that there is some form of synchronization that doesn't exist. The btrfs code that the cmpxchg() was taken from seems to implement actual first-failure-wins semantics, but this patch does not. The underlying question here is whether we want to change things so that bi_status is set to the first error that occurs (probably first in time, not first in the chain). If that is the goal, then we should be explicit about it. Right now, I don't see the need. Thanks, Andreas > Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> > Suggested-by: Christoph Hellwig <hch@infradead.org> > Suggested-by: Caleb Sander Mateos <csander@purestorage.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> > --- > block/bio.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index cfb751dfcf5..51b57f9d8bd 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio) > { > struct bio *parent = bio->bi_private; > > - if (bio->bi_status && !parent->bi_status) > - parent->bi_status = bio->bi_status; > + if (bio->bi_status) > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > + > bio_put(bio); > return parent; > } > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio 2025-12-04 12:01 ` Andreas Gruenbacher @ 2025-12-05 7:13 ` Stephen Zhang 0 siblings, 0 replies; 9+ messages in thread From: Stephen Zhang @ 2025-12-05 7:13 UTC (permalink / raw) To: Andreas Gruenbacher Cc: Johannes.Thumshirn, hch, ming.lei, hsiangkao, csander, colyli, linux-block, linux-bcache, linux-kernel, zhangshida, Christoph Hellwig Andreas Gruenbacher <agruenba@redhat.com> 于2025年12月4日周四 20:01写道: > > On Thu, Dec 4, 2025 at 3:48 AM zhangshida <starzhangzsd@gmail.com> wrote: > > From: Shida Zhang <zhangshida@kylinos.cn> > > > > Andreas point out that multiple completions can race setting > > bi_status. > > What I've actually pointed out is that the '!parent->bi_status' check > in this statement is an unnecessary optimization that can be removed. > But this is not what this discussion is mainly about anymore. > > In the current code, multiple completions can race setting bi_status, > but that is fine as long as bi_status is never set to 0 during bio > completion. The effect is that when there are multiple errors, the > bi_error field of the final bio will eventually be set to an error > code, but we don't know which error code will win. This all works > correctly today, and there is no race to fix because the race is > intentional. > > > If __bio_chain_endio() is called concurrently from multiple threads > > accessing the same parent bio, it should use WRITE_ONCE()/READ_ONCE() > > to access parent->bi_status and avoid data races. > > > > On x86 and ARM, these macros compile to the same instruction as a > > normal write, but they may be required on other architectures to > > prevent tearing, and to ensure the compiler does not add or remove > > memory accesses under the assumption that the values are not accessed > > concurrently. > > WRITE_ONCE() and READ_ONCE() also prevent the compiler from reordering > operations. Even when the compiler doesn't seem to do anything nasty > at the moment, it would probably still be worthwhile to use > WRITE_ONCE() for setting bi_status throughout the code. But that's > beyond the scope of this patch, and it calls for more than a global > search and replace job. > > > Adopting a cmpxchg approach, as used in other code paths, resolves all > > these issues, as suggested by Christoph. > > No, the cmpxchg() doesn't actually achieve anything, it only makes > things worse. For example, when there is an A -> B chain, we can end > up with the following sequence of events: > > - A fails, sets A->bi_status, and calls bio_endio(A). > - B->status is still 0, so bio_endio(A) sets B->bi_status to A->bi_status. > - B fails and sets B->bi_status, OVERRIDING the value of A->bi_status. > - bio_endio(B) calls B->bi_end_io(). > > Things get worse in an A -> B -> C chain, but I've already mentioned > that earlier in this thread. > > So again, the cmpxchg() is unnecessary, but it is also harmless > because it suggests that there is some form of synchronization that > doesn't exist. The btrfs code that the cmpxchg() was taken from seems > to implement actual first-failure-wins semantics, but this patch does > not. > > The underlying question here is whether we want to change things so > that bi_status is set to the first error that occurs (probably first > in time, not first in the chain). If that is the goal, then we should > be explicit about it. Right now, I don't see the need. > Thank you for the thorough discussion on this matter. Based on my understanding, there appear to be two main viewpoints: Side 1 (S1) argues: A. Using cmpxchg() addresses all the issues that READ_ONCE()/WRITE_ONCE() can solve. That said, cmpxchg() appears to provide a comprehensive solution. B. While cmpxchg() introduces some additional overhead, this may not be significant since it occurs in a slow error-handling path. Or nondeterministic order of overriding, Which is not a problem. Side 2 (S2) argues: A. While READ_ONCE()/WRITE_ONCE()-style protections may be needed, addressing them in this specific patch may not be appropriate. B. The cmpxchg() approach adds unnecessary complexity without clear benefits. The trade-off between S1B and S2B seems somewhat balanced, with no clear winner. However, regarding S1A versus S2A, I wonder: If we agree that READ_ONCE()/WRITE_ONCE() protections are required here, is there a specific reason we couldn't address just this code snippet in this patch? We could then follow the S1 approach and use cmpxchg(). However, If this change remains controversial, I'd be happy to drop it and resend v6 without this modification. Thanks, Shida > Thanks, > Andreas > > > Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> > > Suggested-by: Christoph Hellwig <hch@infradead.org> > > Suggested-by: Caleb Sander Mateos <csander@purestorage.com> > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> > > --- > > block/bio.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/block/bio.c b/block/bio.c > > index cfb751dfcf5..51b57f9d8bd 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio) > > { > > struct bio *parent = bio->bi_private; > > > > - if (bio->bi_status && !parent->bi_status) > > - parent->bi_status = bio->bi_status; > > + if (bio->bi_status) > > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > > + > > bio_put(bio); > > return parent; > > } > > -- > > 2.34.1 > > > ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-12-06 9:26 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-12-04 2:47 [PATCH v5 0/3] Fix bio chain related issues zhangshida 2025-12-04 2:47 ` [PATCH v5 1/3] bcache: fix improper use of bi_end_io zhangshida 2025-12-04 9:38 ` Christoph Hellwig 2025-12-04 2:47 ` [PATCH v5 2/3] block: prohibit calls to bio_chain_endio zhangshida 2025-12-04 9:40 ` Christoph Hellwig 2025-12-06 9:26 ` Stephen Zhang 2025-12-04 2:47 ` [PATCH v5 3/3] block: prevent race condition on bi_status in __bio_chain_endio zhangshida 2025-12-04 12:01 ` Andreas Gruenbacher 2025-12-05 7:13 ` Stephen Zhang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox