* [PATCH v2] bcache: fix cached_dev.sb_bio use-after-free and crash
From: colyli @ 2026-03-22 13:41 UTC (permalink / raw)
To: axboe; +Cc: linux-bcache, linux-block, Mingzhe Zou, stable, Coly Li
From: Mingzhe Zou <mingzhe.zou@easystack.cn>
In our production environment, we have received multiple crash reports
regarding libceph, which have caught our attention:
```
[6888366.280350] Call Trace:
[6888366.280452] blk_update_request+0x14e/0x370
[6888366.280561] blk_mq_end_request+0x1a/0x130
[6888366.280671] rbd_img_handle_request+0x1a0/0x1b0 [rbd]
[6888366.280792] rbd_obj_handle_request+0x32/0x40 [rbd]
[6888366.280903] __complete_request+0x22/0x70 [libceph]
[6888366.281032] osd_dispatch+0x15e/0xb40 [libceph]
[6888366.281164] ? inet_recvmsg+0x5b/0xd0
[6888366.281272] ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
[6888366.281405] ceph_con_process_message+0x79/0x140 [libceph]
[6888366.281534] ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[6888366.281661] ceph_con_workfn+0x329/0x680 [libceph]
```
After analyzing the coredump, we found that the memory backing
dc->sb_bio had already been freed. A cached_dev is freed only when the
device is stopped.
sb_bio is embedded in struct cached_dev rather than allocated for each
write. If the device is stopped while a superblock write is in flight,
the endio path accesses the freed memory.
This patch waits for sb_write to complete in cached_dev_free().
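The lifetime problem can be illustrated with a small user-space sketch.
This is not bcache code: struct fake_dev, struct fake_bio and every
fake_* function are hypothetical stand-ins, and the loop in
fake_dev_free() only models what waiting on the pending write is meant
to achieve, under the assumption that the completion callback reaches
the device through the embedded bio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in bcache. */
struct fake_bio {
	bool in_flight;                       /* set at submit, cleared at endio */
	void (*end_io)(struct fake_bio *bio); /* completion callback */
};

struct fake_dev {
	struct fake_bio sb_bio; /* embedded, so it is freed with the device */
	int sb_writes_done;     /* touched by the completion callback */
};

/* Completion callback: recovers the owning device from the embedded bio. */
static void fake_sb_endio(struct fake_bio *bio)
{
	struct fake_dev *dev =
		(struct fake_dev *)((char *)bio - offsetof(struct fake_dev, sb_bio));

	dev->sb_writes_done++; /* a use-after-free if dev were already freed */
	bio->in_flight = false;
}

/* Submit the superblock write; the completion is delivered later. */
static void fake_write_super(struct fake_dev *dev)
{
	dev->sb_bio.end_io = fake_sb_endio;
	dev->sb_bio.in_flight = true;
}

/* Stand-in for the block layer delivering one completion. */
static void fake_deliver_completion(struct fake_bio *bio)
{
	if (bio->in_flight)
		bio->end_io(bio);
}

/*
 * Free path. Draining the in-flight write first models the wait added
 * by the patch. Returns the number of completions that ran before the
 * memory was released.
 */
static int fake_dev_free(struct fake_dev *dev)
{
	int done;

	while (dev->sb_bio.in_flight)
		fake_deliver_completion(&dev->sb_bio); /* models sleeping until endio */

	done = dev->sb_writes_done;
	free(dev);
	return done;
}

/* Stop the device while a superblock write is still pending. */
static int demo_stop_during_sb_write(void)
{
	struct fake_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return -1;
	fake_write_super(dev);
	return fake_dev_free(dev);
}
```

Without the loop in fake_dev_free(), the completion would run after
free() and write into released memory, which is the pattern described
above.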
Note that we analyzed the cause of the problem, then gave all the
details to QWEN and adopted the modification it made.
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Fixes: cafe563591446 ("bcache: A block layer cache")
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Coly Li <colyli@fnnas.com>
---
Change log,
v2, fix emiail address type to stable kerenl.
v1, initial version.
drivers/md/bcache/super.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 64bb38c95895..6627a381f65a 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1373,6 +1373,13 @@ static CLOSURE_CALLBACK(cached_dev_free)
 	mutex_unlock(&bch_register_lock);
+	/*
+	 * Wait for any pending sb_write to complete before free.
+	 * The sb_bio is embedded in struct cached_dev, so we must
+	 * ensure no I/O is in progress.
+	 */
+	closure_sync(&dc->sb_write);
+
 	if (dc->sb_disk)
 		folio_put(virt_to_folio(dc->sb_disk));
--
2.47.3
* Re: [PATCH v2] bcache: fix cached_dev.sb_bio use-after-free and crash
From: Jens Axboe @ 2026-03-22 14:29 UTC (permalink / raw)
To: colyli, axboe; +Cc: linux-bcache, linux-block, Mingzhe Zou, stable
On 3/22/26 7:41 AM, colyli@fnnas.com wrote:
> Change log,
> v2, fix emiail address type to stable kerenl.
Thankfully no typos in v2...
--
Jens Axboe
* Re: [PATCH v2] bcache: fix cached_dev.sb_bio use-after-free and crash
From: Jens Axboe @ 2026-03-22 14:30 UTC (permalink / raw)
To: axboe, colyli; +Cc: linux-bcache, linux-block, Mingzhe Zou, stable
On Sun, 22 Mar 2026 21:41:02 +0800, colyli@fnnas.com wrote:
> In our production environment, we have received multiple crash reports
> regarding libceph, which have caught our attention:
>
> ```
> [6888366.280350] Call Trace:
> [6888366.280452] blk_update_request+0x14e/0x370
> [6888366.280561] blk_mq_end_request+0x1a/0x130
> [6888366.280671] rbd_img_handle_request+0x1a0/0x1b0 [rbd]
> [6888366.280792] rbd_obj_handle_request+0x32/0x40 [rbd]
> [6888366.280903] __complete_request+0x22/0x70 [libceph]
> [6888366.281032] osd_dispatch+0x15e/0xb40 [libceph]
> [6888366.281164] ? inet_recvmsg+0x5b/0xd0
> [6888366.281272] ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
> [6888366.281405] ceph_con_process_message+0x79/0x140 [libceph]
> [6888366.281534] ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
> [6888366.281661] ceph_con_workfn+0x329/0x680 [libceph]
> ```
>
> [...]
Applied, thanks!
[1/1] bcache: fix cached_dev.sb_bio use-after-free and crash
commit: b36478a1fece72b5d4540141fd31024dcba1d241
Best regards,
--
Jens Axboe
* [PATCH v2] bcache: fix cached_dev.sb_bio use-after-free and crash
From: mingzhe.zou @ 2026-03-23 13:01 UTC (permalink / raw)
To: colyli, colyli; +Cc: linux-bcache, zoumingzhe, zoumingzhe, Mingzhe Zou
From: Mingzhe Zou <mingzhe.zou@easystack.cn>
In our production environment, we have received multiple crash reports
regarding libceph, which have caught our attention:
```
[6888366.280350] Call Trace:
[6888366.280452] blk_update_request+0x14e/0x370
[6888366.280561] blk_mq_end_request+0x1a/0x130
[6888366.280671] rbd_img_handle_request+0x1a0/0x1b0 [rbd]
[6888366.280792] rbd_obj_handle_request+0x32/0x40 [rbd]
[6888366.280903] __complete_request+0x22/0x70 [libceph]
[6888366.281032] osd_dispatch+0x15e/0xb40 [libceph]
[6888366.281164] ? inet_recvmsg+0x5b/0xd0
[6888366.281272] ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
[6888366.281405] ceph_con_process_message+0x79/0x140 [libceph]
[6888366.281534] ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[6888366.281661] ceph_con_workfn+0x329/0x680 [libceph]
```
After analyzing the coredump, we found that the memory backing dc->sb_bio
had already been freed. A cached_dev is freed only when the device is stopped.
sb_bio is embedded in struct cached_dev rather than allocated for each write.
If the device is stopped while a superblock write is in flight, the endio path
accesses the freed memory.
This patch waits for sb_write to complete in cached_dev_free().
Note that we analyzed the cause of the problem, then gave all the
details to QWEN and adopted the modification it made.
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
---
v2: fix the crash caused by not calling closure_init in v1
---
drivers/md/bcache/super.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 64bb38c95895..b76edbaaf4f3 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1373,6 +1373,13 @@ static CLOSURE_CALLBACK(cached_dev_free)
 	mutex_unlock(&bch_register_lock);
+	/*
+	 * Wait for any pending sb_write to complete before free.
+	 * The sb_bio is embedded in struct cached_dev, so we must
+	 * ensure no I/O is in progress.
+	 */
+	down(&dc->sb_write_mutex);
+
 	if (dc->sb_disk)
 		folio_put(virt_to_folio(dc->sb_disk));
--
2.34.1
* Re: [PATCH v2] bcache: fix cached_dev.sb_bio use-after-free and crash
From: Coly Li @ 2026-03-23 14:25 UTC (permalink / raw)
To: mingzhe.zou; +Cc: colyli, linux-bcache, zoumingzhe, zoumingzhe
On Mon, Mar 23, 2026 at 09:01:19PM +0800, mingzhe.zou@easystack.cn wrote:
> From: Mingzhe Zou <mingzhe.zou@easystack.cn>
>
> In our production environment, we have received multiple crash reports
> regarding libceph, which have caught our attention:
>
> ```
> [6888366.280350] Call Trace:
> [6888366.280452] blk_update_request+0x14e/0x370
> [6888366.280561] blk_mq_end_request+0x1a/0x130
> [6888366.280671] rbd_img_handle_request+0x1a0/0x1b0 [rbd]
> [6888366.280792] rbd_obj_handle_request+0x32/0x40 [rbd]
> [6888366.280903] __complete_request+0x22/0x70 [libceph]
> [6888366.281032] osd_dispatch+0x15e/0xb40 [libceph]
> [6888366.281164] ? inet_recvmsg+0x5b/0xd0
> [6888366.281272] ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
> [6888366.281405] ceph_con_process_message+0x79/0x140 [libceph]
> [6888366.281534] ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
> [6888366.281661] ceph_con_workfn+0x329/0x680 [libceph]
> ```
>
> After analyzing the coredump, we found that the memory backing dc->sb_bio
> had already been freed. A cached_dev is freed only when the device is stopped.
>
> sb_bio is embedded in struct cached_dev rather than allocated for each write.
> If the device is stopped while a superblock write is in flight, the endio path
> accesses the freed memory.
>
> This patch waits for sb_write to complete in cached_dev_free().
>
> Note that we analyzed the cause of the problem, then gave all the
> details to QWEN and adopted the modification it made.
>
> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
>
> ---
> v2: fix the crash caused by not calling closure_init in v1
> ---
> drivers/md/bcache/super.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 64bb38c95895..b76edbaaf4f3 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -1373,6 +1373,13 @@ static CLOSURE_CALLBACK(cached_dev_free)
>
> mutex_unlock(&bch_register_lock);
>
> + /*
> + * Wait for any pending sb_write to complete before free.
> + * The sb_bio is embedded in struct cached_dev, so we must
> + * ensure no I/O is in progress.
> + */
> + down(&dc->sb_write_mutex);
> +
I know what you mean. dc->sb_write cannot be accessed outside of
bch_write_bdev_super(). But I am not comfortable with the down() approach
above, IMHO.
Fortunately, when cached_dev_free() is called from cached_dev_flush(), the
kobjs of the bcache device have already been deleted by
kobject_del(&d->kobj), so there is no chance to call bch_write_bdev_super()
via the sysfs interface. And when cached_dev_free() is called, no other code
path calls bch_write_bdev_super() either.
So a pair of
	down(&dc->sb_write_mutex);
	up(&dc->sb_write_mutex);
might be enough to make sure the last in-flight bch_write_bdev_super() has
completed?
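The intent of the down()/up() pair can be sketched with a small
single-threaded model. It assumes, as the discussion above implies, that
the write path holds sb_write_mutex from submission until completion.
struct fake_sem and the fake_* helpers are hypothetical and are not the
kernel semaphore API; where a real down() would sleep, the model returns
false instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a counting semaphore. */
struct fake_sem {
	int count;
};

/* Non-blocking take: returns true if the semaphore was acquired. */
static bool fake_try_down(struct fake_sem *s)
{
	if (s->count <= 0)
		return false; /* a real down() would sleep here */
	s->count--;
	return true;
}

static void fake_up(struct fake_sem *s)
{
	s->count++;
}

/*
 * Barrier for the free path: passes only once no write holds the
 * semaphore, and leaves the count unchanged when it passes.
 */
static bool fake_wait_for_sb_write(struct fake_sem *s)
{
	if (!fake_try_down(s))
		return false;
	fake_up(s);
	return true;
}

/* Write in flight, free path held back, write completes, free path passes. */
static int demo_down_up_pair(void)
{
	struct fake_sem sb_write_mutex = { .count = 1 };

	if (!fake_try_down(&sb_write_mutex))          /* writer takes it */
		return -1;
	if (fake_wait_for_sb_write(&sb_write_mutex))  /* must not pass yet */
		return -2;
	fake_up(&sb_write_mutex);                     /* completion releases it */
	if (!fake_wait_for_sb_write(&sb_write_mutex)) /* now the barrier passes */
		return -3;
	return sb_write_mutex.count;                  /* balanced: back to 1 */
}
```

The pair acts purely as a barrier: it returns the semaphore to its
initial count, whereas a lone down() would leave it held when the
device is freed.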
> if (dc->sb_disk)
> folio_put(virt_to_folio(dc->sb_disk));
>
> --
> 2.34.1