* [PATCH v2 0/2] btrfs: fix use-after-free in btrfs_encoded_read_endio @ 2024-11-12 13:53 Johannes Thumshirn 2024-11-12 13:53 ` [PATCH v2 1/2] " Johannes Thumshirn 2024-11-12 13:53 ` [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios Johannes Thumshirn 0 siblings, 2 replies; 9+ messages in thread From: Johannes Thumshirn @ 2024-11-12 13:53 UTC (permalink / raw) To: linux-btrfs Cc: Filipe Manana, Damien Le Moal, Johannes Thumshirn, Johannes Thumshirn, Mark Harmstone, Omar Sandoval Shinichiro reported an occasional memory corruption in our CI system with btrfs/248 that led to panics. He also managed to reproduce this corruption reliably on one host. See patch 1/2 for details on the corruption and the fix; patch 2/2 is a cleanup Damien suggested on top of the fix to make the code more obvious. Changes to v1: - Update commit message of patch 1/2 - Prevent double-free of 'priv' in case of io_uring in 2/2 - Use wait_for_completion_io() in 2/2 - Convert priv->pending from atomic_t to refcount_t, calling it refs, in 2/2 Link to v1: https://lore.kernel.org/linux-btrfs/cover.1731316882.git.jth@kernel.org Johannes Thumshirn (2): btrfs: fix use-after-free in btrfs_encoded_read_endio btrfs: simplify waiting for encoded read endios fs/btrfs/inode.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) -- 2.43.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-12 13:53 [PATCH v2 0/2] btrfs: fix use-after-free in btrfs_encoded_read_endio Johannes Thumshirn @ 2024-11-12 13:53 ` Johannes Thumshirn 2024-11-12 14:45 ` Filipe Manana 2024-11-12 20:50 ` Qu Wenruo 2024-11-12 13:53 ` [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios Johannes Thumshirn 1 sibling, 2 replies; 9+ messages in thread From: Johannes Thumshirn @ 2024-11-12 13:53 UTC (permalink / raw) To: linux-btrfs Cc: Filipe Manana, Damien Le Moal, Johannes Thumshirn, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal From: Johannes Thumshirn <johannes.thumshirn@wdc.com> Shinichiro reported the following use-after free that sometimes is happening in our CI system when running fstests' btrfs/284 on a TCMU runner device: ================================================================== BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780 Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219 CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15 Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 ? lock_release+0x708/0x780 print_report+0x174/0x505 ? lock_release+0x708/0x780 ? __virt_addr_valid+0x224/0x410 ? lock_release+0x708/0x780 kasan_report+0xda/0x1b0 ? lock_release+0x708/0x780 ? __wake_up+0x44/0x60 lock_release+0x708/0x780 ? __pfx_lock_release+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? lock_is_held_type+0x9a/0x110 _raw_spin_unlock_irqrestore+0x1f/0x60 __wake_up+0x44/0x60 btrfs_encoded_read_endio+0x14b/0x190 [btrfs] btrfs_check_read_bio+0x8d9/0x1360 [btrfs] ? lock_release+0x1b0/0x780 ? trace_lock_acquire+0x12f/0x1a0 ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs] ? process_one_work+0x7e3/0x1460 ? lock_acquire+0x31/0xc0 ? process_one_work+0x7e3/0x1460 process_one_work+0x85c/0x1460 ? 
__pfx_process_one_work+0x10/0x10 ? assign_work+0x16c/0x240 worker_thread+0x5e6/0xfc0 ? __pfx_worker_thread+0x10/0x10 kthread+0x2c3/0x3a0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0xaa/0xb0 btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x4f/0x70 kfree+0x143/0x490 btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff888106a83f00 which belongs to the cache kmalloc-rnd-07-96 of size 96 The buggy address is located 24 bytes inside of freed 96-byte region [ffff888106a83f00, ffff888106a83f60) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83 flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004 raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc 
ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Further analyzing the trace and the crash dump's vmcore file shows that the wake_up() call in btrfs_encoded_read_endio() is calling wake_up() on the wait_queue that is in the private data passed to the end_io handler. Commit 4ff47df40447 ("btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()") moved 'struct btrfs_encoded_read_private' off the stack. Before that commit one can see a corruption of the private data when analyzing the vmcore after a crash: *(struct btrfs_encoded_read_private *)0xffff88815626eec8 = { .wait = (wait_queue_head_t){ .lock = (spinlock_t){ .rlock = (struct raw_spinlock){ .raw_lock = (arch_spinlock_t){ .val = (atomic_t){ .counter = (int)-2005885696, }, .locked = (u8)0, .pending = (u8)157, .locked_pending = (u16)40192, .tail = (u16)34928, }, .magic = (unsigned int)536325682, .owner_cpu = (unsigned int)29, .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 }, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .head = (struct list_head){ .next = (struct list_head *)0x112cca, .prev = 
(struct list_head *)0x47, }, }, .pending = (atomic_t){ .counter = (int)-1491499288, }, .status = (blk_status_t)130, } Here we can see several indicators of in-memory data corruption, e.g. the large negative atomic values of ->pending or ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic 0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead, or the bogus pointer values for ->wait->head. To fix this, move the call to bio_put() before the atomic dec-and-test operation so the submitter side in btrfs_encoded_read_regular_fill_pages() is not woken up before the bio is cleaned up. Also change atomic_dec_return() to atomic_dec_and_test() to fix the corruption, as atomic_dec_return() is defined as two instructions on x86_64, whereas atomic_dec_and_test() is defined as a single atomic operation. This can lead to a situation where the counter value is already decremented but the if statement in btrfs_encoded_read_endio() is not completely processed, i.e. the 0 test has not completed. If another thread continues executing btrfs_encoded_read_regular_fill_pages(), the atomic_dec_return() there can see an already updated ->pending counter and continues by freeing the private data. Continuing in the endio handler, the test for 0 succeeds and the wait_queue is woken up, resulting in a use-after-free. 
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com> Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> --- fs/btrfs/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 22b8e2764619..cb8b23a3e56b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9089,7 +9089,8 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) */ WRITE_ONCE(priv->status, bbio->bio.bi_status); } - if (atomic_dec_return(&priv->pending) == 0) { + bio_put(&bbio->bio); + if (atomic_dec_and_test(&priv->pending)) { int err = blk_status_to_errno(READ_ONCE(priv->status)); if (priv->uring_ctx) { @@ -9099,7 +9100,6 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) wake_up(&priv->wait); } } - bio_put(&bbio->bio); } int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-12 13:53 ` [PATCH v2 1/2] " Johannes Thumshirn @ 2024-11-12 14:45 ` Filipe Manana 2024-11-12 17:20 ` Johannes Thumshirn 2024-11-12 20:50 ` Qu Wenruo 1 sibling, 1 reply; 9+ messages in thread From: Filipe Manana @ 2024-11-12 14:45 UTC (permalink / raw) To: Johannes Thumshirn Cc: linux-btrfs, Filipe Manana, Damien Le Moal, Johannes Thumshirn, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal On Tue, Nov 12, 2024 at 1:54 PM Johannes Thumshirn <jth@kernel.org> wrote: > > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Shinichiro reported the following use-after free that sometimes is > happening in our CI system when running fstests' btrfs/284 on a TCMU > runner device: > > ================================================================== > BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780 > Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219 > > CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15 > Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 > Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] > Call Trace: > <TASK> > dump_stack_lvl+0x6e/0xa0 > ? lock_release+0x708/0x780 > print_report+0x174/0x505 > ? lock_release+0x708/0x780 > ? __virt_addr_valid+0x224/0x410 > ? lock_release+0x708/0x780 > kasan_report+0xda/0x1b0 > ? lock_release+0x708/0x780 > ? __wake_up+0x44/0x60 > lock_release+0x708/0x780 > ? __pfx_lock_release+0x10/0x10 > ? __pfx_do_raw_spin_lock+0x10/0x10 > ? lock_is_held_type+0x9a/0x110 > _raw_spin_unlock_irqrestore+0x1f/0x60 > __wake_up+0x44/0x60 > btrfs_encoded_read_endio+0x14b/0x190 [btrfs] > btrfs_check_read_bio+0x8d9/0x1360 [btrfs] > ? lock_release+0x1b0/0x780 > ? trace_lock_acquire+0x12f/0x1a0 > ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs] > ? process_one_work+0x7e3/0x1460 > ? lock_acquire+0x31/0xc0 > ? process_one_work+0x7e3/0x1460 > process_one_work+0x85c/0x1460 > ? 
__pfx_process_one_work+0x10/0x10 > ? assign_work+0x16c/0x240 > worker_thread+0x5e6/0xfc0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0x2c3/0x3a0 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x31/0x70 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > Allocated by task 3661: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > __kasan_kmalloc+0xaa/0xb0 > btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs] > send_extent_data+0xf0f/0x24a0 [btrfs] > process_extent+0x48a/0x1830 [btrfs] > changed_cb+0x178b/0x2ea0 [btrfs] > btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] > _btrfs_ioctl_send+0x117/0x330 [btrfs] > btrfs_ioctl+0x184a/0x60a0 [btrfs] > __x64_sys_ioctl+0x12e/0x1a0 > do_syscall_64+0x95/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 3661: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > kasan_save_free_info+0x3b/0x70 > __kasan_slab_free+0x4f/0x70 > kfree+0x143/0x490 > btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs] > send_extent_data+0xf0f/0x24a0 [btrfs] > process_extent+0x48a/0x1830 [btrfs] > changed_cb+0x178b/0x2ea0 [btrfs] > btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] > _btrfs_ioctl_send+0x117/0x330 [btrfs] > btrfs_ioctl+0x184a/0x60a0 [btrfs] > __x64_sys_ioctl+0x12e/0x1a0 > do_syscall_64+0x95/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff888106a83f00 > which belongs to the cache kmalloc-rnd-07-96 of size 96 > The buggy address is located 24 bytes inside of > freed 96-byte region [ffff888106a83f00, ffff888106a83f60) > > The buggy address belongs to the physical page: > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83 > flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004 > raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000 > page dumped because: kasan: bad access detected > > Memory 
state around the buggy address: > ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ^ > ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ================================================================== > > Further analyzing the trace and the crash dump's vmcore file shows that > the wake_up() call in btrfs_encoded_read_endio() is calling wake_up() on > the wait_queue that is in the private data passed to the end_io handler. > > Commit 4ff47df40447 ("btrfs: move priv off stack in > btrfs_encoded_read_regular_fill_pages()") moved 'struct > btrfs_encoded_read_private' off the stack. > > Before that commit one can see a corruption of the private data when > analyzing the vmcore after a crash: > > *(struct btrfs_encoded_read_private *)0xffff88815626eec8 = { > .wait = (wait_queue_head_t){ > .lock = (spinlock_t){ > .rlock = (struct raw_spinlock){ > .raw_lock = (arch_spinlock_t){ > .val = (atomic_t){ > .counter = (int)-2005885696, > }, > .locked = (u8)0, > .pending = (u8)157, > .locked_pending = (u16)40192, > .tail = (u16)34928, > }, > .magic = (unsigned int)536325682, > .owner_cpu = (unsigned int)29, > .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0, > .dep_map = (struct lockdep_map){ > .key = (struct lock_class_key *)0xffff8881575a3b6c, > .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, > .name = (const char *)0xffff88815626f100 = "", > .wait_type_outer = (u8)37, > .wait_type_inner = (u8)178, > .lock_type = (u8)154, > }, > }, > .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 }, > .dep_map = (struct lockdep_map){ > .key = (struct lock_class_key *)0xffff8881575a3b6c, > .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, > .name = (const 
char *)0xffff88815626f100 = "", > .wait_type_outer = (u8)37, > .wait_type_inner = (u8)178, > .lock_type = (u8)154, > }, > }, > .head = (struct list_head){ > .next = (struct list_head *)0x112cca, > .prev = (struct list_head *)0x47, > }, > }, > .pending = (atomic_t){ > .counter = (int)-1491499288, > }, > .status = (blk_status_t)130, > } > > Here we can see several indicators of in-memory data corruption, e.g. the > large negative atomic values of ->pending or > ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic > 0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead or the bogus > pointer values for ->wait->head. > > To fix this, move the call to bio_put() before the atomic_test operation > so the submitter side in btrfs_encoded_read_regular_fill_pages() is not > woken up before the bio is cleaned up. This is the part I don't see what's the relation to the use-after-free problem on the private structure. This seems like a cleanup that should be a separate patch with its own changelog. > > Also change atomic_dec_return() to atomic_dec_and_test() to fix the > corruption, as atomic_dec_return() is defined as two instructions on > x86_64, whereas atomic_dec_and_test() is defined as a single atomic > operation. This can lead to a situation where counter value is already > decremented but the if statement in btrfs_encoded_read_endio() is not > completely processed, i.e. the 0 test has not completed. If another thread > continues executing btrfs_encoded_read_regular_fill_pages() the > atomic_dec_return() there can see an already updated ->pending counter and > continues by freeing the private data. Continuing in the endio handler the > test for 0 succeeds and the wait_queue is woken up, resulting in a > use-after-free. This is the sort of explanation that should have been in v1. 
Basically, that the non-atomicity of atomic_dec_return() can make the waiter see the 0 value and free the private structure before the waker does a wake_up() against the private's wait queue. So with bio_put() change in a separate patch: Reviewed-by: Filipe Manana <fdmanana@suse.com> (or with it but with an explanation on how this relates to the use-after-free, which I can't see) Thanks. > > Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com> > Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 22b8e2764619..cb8b23a3e56b 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -9089,7 +9089,8 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) > */ > WRITE_ONCE(priv->status, bbio->bio.bi_status); > } > - if (atomic_dec_return(&priv->pending) == 0) { > + bio_put(&bbio->bio); > + if (atomic_dec_and_test(&priv->pending)) { > int err = blk_status_to_errno(READ_ONCE(priv->status)); > > if (priv->uring_ctx) { > @@ -9099,7 +9100,6 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) > wake_up(&priv->wait); > } > } > - bio_put(&bbio->bio); > } > > int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-12 14:45 ` Filipe Manana @ 2024-11-12 17:20 ` Johannes Thumshirn 0 siblings, 0 replies; 9+ messages in thread From: Johannes Thumshirn @ 2024-11-12 17:20 UTC (permalink / raw) To: Filipe Manana, Johannes Thumshirn Cc: linux-btrfs@vger.kernel.org, Filipe Manana, Damien Le Moal, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal On 12.11.24 15:46, Filipe Manana wrote: > This is the part I don't see what's the relation to the use-after-free > problem on the private structure. > This seems like a cleanup that should be a separate patch with its own > changelog. I need to re-test first, because AFAIR both the bio_put() and the atomic_dec_and_test() were needed to fix the bug on the test system. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-12 13:53 ` [PATCH v2 1/2] " Johannes Thumshirn 2024-11-12 14:45 ` Filipe Manana @ 2024-11-12 20:50 ` Qu Wenruo 2024-11-13 8:01 ` Johannes Thumshirn 1 sibling, 1 reply; 9+ messages in thread From: Qu Wenruo @ 2024-11-12 20:50 UTC (permalink / raw) To: Johannes Thumshirn, linux-btrfs Cc: Filipe Manana, Damien Le Moal, Johannes Thumshirn, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal 在 2024/11/13 00:23, Johannes Thumshirn 写道: > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Shinichiro reported the following use-after free that sometimes is > happening in our CI system when running fstests' btrfs/284 on a TCMU > runner device: > > ================================================================== > BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780 > Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219 > > CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15 > Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 > Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] > Call Trace: > <TASK> > dump_stack_lvl+0x6e/0xa0 > ? lock_release+0x708/0x780 > print_report+0x174/0x505 > ? lock_release+0x708/0x780 > ? __virt_addr_valid+0x224/0x410 > ? lock_release+0x708/0x780 > kasan_report+0xda/0x1b0 > ? lock_release+0x708/0x780 > ? __wake_up+0x44/0x60 > lock_release+0x708/0x780 > ? __pfx_lock_release+0x10/0x10 > ? __pfx_do_raw_spin_lock+0x10/0x10 > ? lock_is_held_type+0x9a/0x110 > _raw_spin_unlock_irqrestore+0x1f/0x60 > __wake_up+0x44/0x60 > btrfs_encoded_read_endio+0x14b/0x190 [btrfs] > btrfs_check_read_bio+0x8d9/0x1360 [btrfs] > ? lock_release+0x1b0/0x780 > ? trace_lock_acquire+0x12f/0x1a0 > ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs] > ? process_one_work+0x7e3/0x1460 > ? lock_acquire+0x31/0xc0 > ? process_one_work+0x7e3/0x1460 > process_one_work+0x85c/0x1460 > ? __pfx_process_one_work+0x10/0x10 > ? 
assign_work+0x16c/0x240 > worker_thread+0x5e6/0xfc0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0x2c3/0x3a0 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x31/0x70 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > Allocated by task 3661: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > __kasan_kmalloc+0xaa/0xb0 > btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs] > send_extent_data+0xf0f/0x24a0 [btrfs] > process_extent+0x48a/0x1830 [btrfs] > changed_cb+0x178b/0x2ea0 [btrfs] > btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] > _btrfs_ioctl_send+0x117/0x330 [btrfs] > btrfs_ioctl+0x184a/0x60a0 [btrfs] > __x64_sys_ioctl+0x12e/0x1a0 > do_syscall_64+0x95/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 3661: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > kasan_save_free_info+0x3b/0x70 > __kasan_slab_free+0x4f/0x70 > kfree+0x143/0x490 > btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs] > send_extent_data+0xf0f/0x24a0 [btrfs] > process_extent+0x48a/0x1830 [btrfs] > changed_cb+0x178b/0x2ea0 [btrfs] > btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] > _btrfs_ioctl_send+0x117/0x330 [btrfs] > btrfs_ioctl+0x184a/0x60a0 [btrfs] > __x64_sys_ioctl+0x12e/0x1a0 > do_syscall_64+0x95/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff888106a83f00 > which belongs to the cache kmalloc-rnd-07-96 of size 96 > The buggy address is located 24 bytes inside of > freed 96-byte region [ffff888106a83f00, ffff888106a83f60) > > The buggy address belongs to the physical page: > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83 > flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004 > raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > 
ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ^ > ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc > ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ================================================================== > > Further analyzing the trace and the crash dump's vmcore file shows that > the wake_up() call in btrfs_encoded_read_endio() is calling wake_up() on > the wait_queue that is in the private data passed to the end_io handler. > > Commit 4ff47df40447 ("btrfs: move priv off stack in > btrfs_encoded_read_regular_fill_pages()") moved 'struct > btrfs_encoded_read_private' off the stack. > > Before that commit one can see a corruption of the private data when > analyzing the vmcore after a crash: > > *(struct btrfs_encoded_read_private *)0xffff88815626eec8 = { > .wait = (wait_queue_head_t){ > .lock = (spinlock_t){ > .rlock = (struct raw_spinlock){ > .raw_lock = (arch_spinlock_t){ > .val = (atomic_t){ > .counter = (int)-2005885696, > }, > .locked = (u8)0, > .pending = (u8)157, > .locked_pending = (u16)40192, > .tail = (u16)34928, > }, > .magic = (unsigned int)536325682, > .owner_cpu = (unsigned int)29, > .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0, > .dep_map = (struct lockdep_map){ > .key = (struct lock_class_key *)0xffff8881575a3b6c, > .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, > .name = (const char *)0xffff88815626f100 = "", > .wait_type_outer = (u8)37, > .wait_type_inner = (u8)178, > .lock_type = (u8)154, > }, > }, > .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 }, > .dep_map = (struct lockdep_map){ > .key = (struct lock_class_key *)0xffff8881575a3b6c, > .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, > .name = (const char *)0xffff88815626f100 = "", > 
.wait_type_outer = (u8)37, > .wait_type_inner = (u8)178, > .lock_type = (u8)154, > }, > }, > .head = (struct list_head){ > .next = (struct list_head *)0x112cca, > .prev = (struct list_head *)0x47, > }, > }, > .pending = (atomic_t){ > .counter = (int)-1491499288, > }, > .status = (blk_status_t)130, > } > > Here we can see several indicators of in-memory data corruption, e.g. the > large negative atomic values of ->pending or > ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic > 0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead or the bogus > pointer values for ->wait->head. > > To fix this, move the call to bio_put() before the atomic_test operation > so the submitter side in btrfs_encoded_read_regular_fill_pages() is not > woken up before the bio is cleaned up. > > Also change atomic_dec_return() to atomic_dec_and_test() to fix the > corruption, as atomic_dec_return() is defined as two instructions on > x86_64, whereas atomic_dec_and_test() is defined as a single atomic > operation. This means we should not utilize "atomic_dec_return() == 0" as a way to do synchronization. And unfortunately I'm also seeing other locations still utilizing the same pattern inside btrfs_encoded_read_regular_fill_pages(). Shouldn't we also fix that call site even just for the sake of consistency? Thanks, Qu > This can lead to a situation where counter value is already > decremented but the if statement in btrfs_encoded_read_endio() is not > completely processed, i.e. the 0 test has not completed. If another thread > continues executing btrfs_encoded_read_regular_fill_pages() the > atomic_dec_return() there can see an already updated ->pending counter and > continues by freeing the private data. Continuing in the endio handler the > test for 0 succeeds and the wait_queue is woken up, resulting in a > use-after-free. 
> > Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com> > Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 22b8e2764619..cb8b23a3e56b 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -9089,7 +9089,8 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) > */ > WRITE_ONCE(priv->status, bbio->bio.bi_status); > } > - if (atomic_dec_return(&priv->pending) == 0) { > + bio_put(&bbio->bio); > + if (atomic_dec_and_test(&priv->pending)) { > int err = blk_status_to_errno(READ_ONCE(priv->status)); > > if (priv->uring_ctx) { > @@ -9099,7 +9100,6 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) > wake_up(&priv->wait); > } > } > - bio_put(&bbio->bio); > } > > int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-12 20:50 ` Qu Wenruo @ 2024-11-13 8:01 ` Johannes Thumshirn 2024-11-25 15:55 ` David Sterba 0 siblings, 1 reply; 9+ messages in thread From: Johannes Thumshirn @ 2024-11-13 8:01 UTC (permalink / raw) To: Qu Wenruo, Johannes Thumshirn, linux-btrfs@vger.kernel.org Cc: Filipe Manana, Damien Le Moal, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal On 12.11.24 21:51, Qu Wenruo wrote: >> To fix this, move the call to bio_put() before the atomic_test operation >> so the submitter side in btrfs_encoded_read_regular_fill_pages() is not >> woken up before the bio is cleaned up. >> >> Also change atomic_dec_return() to atomic_dec_and_test() to fix the >> corruption, as atomic_dec_return() is defined as two instructions on >> x86_64, whereas atomic_dec_and_test() is defined as a single atomic >> operation. > > This means we should not utilize "atomic_dec_return() == 0" as a way to > do synchronization. At least not for reference counting, hence refcount_t doesn't even have an equivalent. > And unfortunately I'm also seeing other locations still utilizing the > same patter inside btrfs_encoded_read_regular_fill_pages() > > Shouldn't we also fix that call site even just for the sake of consistency? I have no idea, TBH. The other user of atomic_dec_return() in btrfs is in delayed-inode.c:finish_one_item(): /* atomic_dec_return implies a barrier */ if ((atomic_dec_return(&delayed_root->items) < BTRFS_DELAYED_BACKGROUND || seq % BTRFS_DELAYED_BATCH == 0)) cond_wake_up_nomb(&delayed_root->wait); And that one looks safe in my eyes. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] btrfs: fix use-after-free in btrfs_encoded_read_endio 2024-11-13 8:01 ` Johannes Thumshirn @ 2024-11-25 15:55 ` David Sterba 0 siblings, 0 replies; 9+ messages in thread From: David Sterba @ 2024-11-25 15:55 UTC (permalink / raw) To: Johannes Thumshirn Cc: Qu Wenruo, Johannes Thumshirn, linux-btrfs@vger.kernel.org, Filipe Manana, Damien Le Moal, Mark Harmstone, Omar Sandoval, Shinichiro Kawasaki, Damien Le Moal On Wed, Nov 13, 2024 at 08:01:29AM +0000, Johannes Thumshirn wrote: > On 12.11.24 21:51, Qu Wenruo wrote: > >> To fix this, move the call to bio_put() before the atomic_test operation > >> so the submitter side in btrfs_encoded_read_regular_fill_pages() is not > >> woken up before the bio is cleaned up. > >> > >> Also change atomic_dec_return() to atomic_dec_and_test() to fix the > >> corruption, as atomic_dec_return() is defined as two instructions on > >> x86_64, whereas atomic_dec_and_test() is defined as a single atomic > >> operation. > > > > This means we should not utilize "atomic_dec_return() == 0" as a way to > > do synchronization. > > At least not for reference counting, hence recount_t doesn't even have > an equivalent. > > > And unfortunately I'm also seeing other locations still utilizing the > > same patter inside btrfs_encoded_read_regular_fill_pages() > > > > Shouldn't we also fix that call site even just for the sake of consistency? > > I have no idea, TBH. The other user of atomic_dec_return() in btrfs is > in delayed-inode.c:finish_one_item(): > > /* atomic_dec_return implies a barrier */ > if ((atomic_dec_return(&delayed_root->items) < > BTRFS_DELAYED_BACKGROUND || seq % BTRFS_DELAYED_BATCH == 0)) > cond_wake_up_nomb(&delayed_root->wait); > > And that one looks safe in my eyes. A safe pattern for atomic_dec_return() is when, once the counter reaches 0, there will be no further increment. Which is for example when setting the pending counter, possibly adding e.g. pages one by one (atomic_inc) and then starting that. 
Each processed page then decrements the counter, once all are done the 0 will be there. There are atomic_dec_return() that could be wrong, I haven't examined all of them but it seems to be always safer to use atomic_dec_and_test(). ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios 2024-11-12 13:53 [PATCH v2 0/2] btrfs: fix use-after-free in btrfs_encoded_read_endio Johannes Thumshirn 2024-11-12 13:53 ` [PATCH v2 1/2] " Johannes Thumshirn @ 2024-11-12 13:53 ` Johannes Thumshirn 2024-11-12 14:57 ` Filipe Manana 1 sibling, 1 reply; 9+ messages in thread From: Johannes Thumshirn @ 2024-11-12 13:53 UTC (permalink / raw) To: linux-btrfs Cc: Filipe Manana, Damien Le Moal, Johannes Thumshirn, Damien Le Moal From: Johannes Thumshirn <johannes.thumshirn@wdc.com> Simplify the I/O completion path for encoded reads by using a completion instead of a wait_queue. Furthermore, skip taking an extra reference that is instantly dropped anyway and convert btrfs_encoded_read_private::pending into a refcount_t field instead of atomic_t. Freeing of the private data is now handled at a common place in btrfs_encoded_read_regular_fill_pages() and if btrfs_encoded_read_endio() is freeing the data in case it has an io_uring context associated, the btrfs_bio's private field is NULLed to avoid a double free of the private data.
Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/inode.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cb8b23a3e56b..3093905364e5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9068,9 +9068,9 @@ static ssize_t btrfs_encoded_read_inline(
 }
 
 struct btrfs_encoded_read_private {
-	wait_queue_head_t wait;
+	struct completion done;
 	void *uring_ctx;
-	atomic_t pending;
+	refcount_t refs;
 	blk_status_t status;
 };
 
@@ -9090,14 +9090,15 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio)
 		WRITE_ONCE(priv->status, bbio->bio.bi_status);
 	}
 	bio_put(&bbio->bio);
-	if (atomic_dec_and_test(&priv->pending)) {
+	if (refcount_dec_and_test(&priv->refs)) {
 		int err = blk_status_to_errno(READ_ONCE(priv->status));
 
 		if (priv->uring_ctx) {
 			btrfs_uring_read_extent_endio(priv->uring_ctx, err);
+			bbio->private = NULL;
 			kfree(priv);
 		} else {
-			wake_up(&priv->wait);
+			complete(&priv->done);
 		}
 	}
 }
@@ -9116,8 +9117,8 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 	if (!priv)
 		return -ENOMEM;
 
-	init_waitqueue_head(&priv->wait);
-	atomic_set(&priv->pending, 1);
+	init_completion(&priv->done);
+	refcount_set(&priv->refs, 1);
 	priv->status = 0;
 	priv->uring_ctx = uring_ctx;
 
@@ -9130,7 +9131,7 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 		size_t bytes = min_t(u64, disk_io_size, PAGE_SIZE);
 
 		if (bio_add_page(&bbio->bio, pages[i], bytes, 0) < bytes) {
-			atomic_inc(&priv->pending);
+			refcount_inc(&priv->refs);
 			btrfs_submit_bbio(bbio, 0);
 
 			bbio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ, fs_info,
@@ -9145,26 +9146,26 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 		disk_io_size -= bytes;
 	} while (disk_io_size);
 
-	atomic_inc(&priv->pending);
 	btrfs_submit_bbio(bbio, 0);
 
 	if (uring_ctx) {
-		if (atomic_dec_return(&priv->pending) == 0) {
+		if (bbio->private && refcount_read(&priv->refs) == 0) {
 			ret = blk_status_to_errno(READ_ONCE(priv->status));
 			btrfs_uring_read_extent_endio(uring_ctx, ret);
-			kfree(priv);
-			return ret;
+			goto out;
 		}
 
 		return -EIOCBQUEUED;
 	} else {
-		if (atomic_dec_return(&priv->pending) != 0)
-			io_wait_event(priv->wait, !atomic_read(&priv->pending));
+		wait_for_completion_io(&priv->done);
 		/* See btrfs_encoded_read_endio() for ordering. */
 		ret = blk_status_to_errno(READ_ONCE(priv->status));
-		kfree(priv);
-		return ret;
 	}
+
+out:
+	kfree(priv);
+	return ret;
+
 }
 
 ssize_t btrfs_encoded_read_regular(struct kiocb *iocb, struct iov_iter *iter,
-- 
2.43.0

^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios 2024-11-12 13:53 ` [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios Johannes Thumshirn @ 2024-11-12 14:57 ` Filipe Manana 0 siblings, 0 replies; 9+ messages in thread From: Filipe Manana @ 2024-11-12 14:57 UTC (permalink / raw) To: Johannes Thumshirn Cc: linux-btrfs, Filipe Manana, Damien Le Moal, Johannes Thumshirn, Damien Le Moal On Tue, Nov 12, 2024 at 1:54 PM Johannes Thumshirn <jth@kernel.org> wrote: > > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Simplify the I/O completion path for encoded reads by using a > completion instead of a wait_queue. > > Furthermore, skip taking an extra reference that is instantly > dropped anyway and convert btrfs_encoded_read_private::pending into a > refcount_t field instead of atomic_t. > > Freeing of the private data is now handled at a common place in > btrfs_encoded_read_regular_fill_pages() and if btrfs_encoded_read_endio() > is freeing the data in case it has an io_uring context associated, the > btrfs_bio's private field is NULLed to avoid a double free of the private > data.
> > Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/inode.c | 31 ++++++++++++++++--------------- > 1 file changed, 16 insertions(+), 15 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index cb8b23a3e56b..3093905364e5 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -9068,9 +9068,9 @@ static ssize_t btrfs_encoded_read_inline( > } > > struct btrfs_encoded_read_private { > - wait_queue_head_t wait; > + struct completion done; > void *uring_ctx; > - atomic_t pending; > + refcount_t refs; > blk_status_t status; > }; > > @@ -9090,14 +9090,15 @@ static void btrfs_encoded_read_endio(struct btrfs_bio *bbio) > WRITE_ONCE(priv->status, bbio->bio.bi_status); > } > bio_put(&bbio->bio); > - if (atomic_dec_and_test(&priv->pending)) { > + if (refcount_dec_and_test(&priv->refs)) { > int err = blk_status_to_errno(READ_ONCE(priv->status)); > > if (priv->uring_ctx) { > btrfs_uring_read_extent_endio(priv->uring_ctx, err); > + bbio->private = NULL; Isn't this racy? We decremented priv->refs to 0, the task at btrfs_encoded_read_regular_fill_pages() sees it as 0 and sees bbio->private still as non-NULL, does a kfree() on it and then we do it here again. I.e. we should set bbio->private to NULL before decrementing, and possibly some barriers. Thanks. 
> kfree(priv); > } else { > - wake_up(&priv->wait); > + complete(&priv->done); > } > } > } > @@ -9116,8 +9117,8 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, > if (!priv) > return -ENOMEM; > > - init_waitqueue_head(&priv->wait); > - atomic_set(&priv->pending, 1); > + init_completion(&priv->done); > + refcount_set(&priv->refs, 1); > priv->status = 0; > priv->uring_ctx = uring_ctx; > > @@ -9130,7 +9131,7 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, > size_t bytes = min_t(u64, disk_io_size, PAGE_SIZE); > > if (bio_add_page(&bbio->bio, pages[i], bytes, 0) < bytes) { > - atomic_inc(&priv->pending); > + refcount_inc(&priv->refs); > btrfs_submit_bbio(bbio, 0); > > bbio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ, fs_info, > @@ -9145,26 +9146,26 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, > disk_io_size -= bytes; > } while (disk_io_size); > > - atomic_inc(&priv->pending); > btrfs_submit_bbio(bbio, 0); > > if (uring_ctx) { > - if (atomic_dec_return(&priv->pending) == 0) { > + if (bbio->private && refcount_read(&priv->refs) == 0) { > ret = blk_status_to_errno(READ_ONCE(priv->status)); > btrfs_uring_read_extent_endio(uring_ctx, ret); > - kfree(priv); > - return ret; > + goto out; > } > > return -EIOCBQUEUED; > } else { > - if (atomic_dec_return(&priv->pending) != 0) > - io_wait_event(priv->wait, !atomic_read(&priv->pending)); > + wait_for_completion_io(&priv->done); > /* See btrfs_encoded_read_endio() for ordering. */ > ret = blk_status_to_errno(READ_ONCE(priv->status)); > - kfree(priv); > - return ret; > } > + > +out: > + kfree(priv); > + return ret; > + > } > > ssize_t btrfs_encoded_read_regular(struct kiocb *iocb, struct iov_iter *iter, > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-11-25 15:55 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-11-12 13:53 [PATCH v2 0/2] btrfs: fix use-after-free in btrfs_encoded_read_endio Johannes Thumshirn 2024-11-12 13:53 ` [PATCH v2 1/2] " Johannes Thumshirn 2024-11-12 14:45 ` Filipe Manana 2024-11-12 17:20 ` Johannes Thumshirn 2024-11-12 20:50 ` Qu Wenruo 2024-11-13 8:01 ` Johannes Thumshirn 2024-11-25 15:55 ` David Sterba 2024-11-12 13:53 ` [PATCH v2 2/2] btrfs: simplify waiting for encoded read endios Johannes Thumshirn 2024-11-12 14:57 ` Filipe Manana