* [PATCH v4 0/2] xfs: shutdown UAF fixes
@ 2022-11-17 14:50 Guo Xuenan
2022-11-17 14:50 ` [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
2022-11-17 14:50 ` [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
0 siblings, 2 replies; 6+ messages in thread
From: Guo Xuenan @ 2022-11-17 14:50 UTC (permalink / raw)
To: djwong, dchinner
Cc: linux-xfs, guoxuenan, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
Hi xfs folks:
The following patches fix some race of xfs force shutdown.
Patch 1 fix uaf in xfs_trans_ail_delete during xlog force shutdown.
In commit cd6f79d1fb32 ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks"), seems not enough to avoid UAF of AIL
before it being tear down in umount.
Patch 2 fix uaf of super block buffer log item during xlog shut down,
since xfs buf log item can be reloged, super block buffer is most
frequently modified of all xfs_buf, especially when disable lazy-count
feature, during force shutdown we will unpin and release log item, due
to xfs relog mechanism, which may release the log item alread inserted
in CIL.
I reproduce the two problems, /importantly/, adding following patches and
disable lazy-count feature to increase recurrence probability.
Kernel patch for reproduce the issue of patch 1:
```
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 522d450a94b1..b1221d517c00 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -503,6 +503,9 @@ xfs_buf_item_unpin(
* the AIL properly holds a reference on the bli.
*/
freed = atomic_dec_and_test(&bip->bli_refcount);
+ if (remove)
+ mdelay(1000);
+
if (freed && !stale && remove)
xfs_buf_hold(bp);
if (atomic_dec_and_test(&bp->b_pin_count))
```
Kernel patch for reproduce the issue of patch 2:
```
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 7bd16fbff534..b1aac4a7576c 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -25,6 +25,9 @@
#include "xfs_dquot.h"
#include "xfs_icache.h"
+#include <linux/delay.h>
+#include "xfs_log_priv.h"
+
struct kmem_cache *xfs_trans_cache;
#if defined(CONFIG_TRACEPOINTS)
@@ -1002,6 +1005,8 @@ __xfs_trans_commit(
xfs_trans_apply_sb_deltas(tp);
xfs_trans_apply_dquot_deltas(tp);
+ if (xlog_is_shutdown(log))
+ mdelay(1000);
xlog_cil_commit(log, tp, &commit_seq, regrant);
xfs_trans_free(tp);
```
Guo Xuenan (2):
xfs: wait xlog ioend workqueue drained before tearing down AIL
xfs: fix super block buf log item UAF during force shutdown
fs/xfs/xfs_buf_item.c | 8 +++++---
fs/xfs/xfs_trans_ail.c | 3 +++
2 files changed, 8 insertions(+), 3 deletions(-)
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL
2022-11-17 14:50 [PATCH v4 0/2] xfs: shutdown UAF fixes Guo Xuenan
@ 2022-11-17 14:50 ` Guo Xuenan
2022-11-17 19:20 ` Darrick J. Wong
2022-11-17 14:50 ` [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
1 sibling, 1 reply; 6+ messages in thread
From: Guo Xuenan @ 2022-11-17 14:50 UTC (permalink / raw)
To: djwong, dchinner
Cc: linux-xfs, guoxuenan, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
Fix uaf in xfs_trans_ail_delete during xlog force shutdown.
In commit cd6f79d1fb32 ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and wait for iclog completion to avoid unmount path untimely destroy AIL.
But which seems not enough to ensue this, adding mdelay in
`xfs_buf_item_unpin` can prove that.
The reproduction is as follows. To ensure destroy AIL safely,
we should wait all xlog ioend workers done and sync the AIL.
==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43
CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G W
6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x66
print_report+0x171/0x4a6
kasan_report+0xb3/0x130
xfs_trans_ail_delete+0x240/0x2a0
xfs_buf_item_done+0x7b/0xa0
xfs_buf_ioend+0x1e9/0x11f0
xfs_buf_item_unpin+0x4c8/0x860
xfs_trans_committed_bulk+0x4c2/0x7c0
xlog_cil_committed+0xab6/0xfb0
xlog_cil_process_committed+0x117/0x1e0
xlog_state_shutdown_callbacks+0x208/0x440
xlog_force_shutdown+0x1b3/0x3a0
xlog_ioend_work+0xef/0x1d0
process_one_work+0x6f9/0xf70
worker_thread+0x578/0xf30
kthread+0x28c/0x330
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 9606:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x7a/0x90
__kmalloc+0x59/0x140
kmem_alloc+0xb2/0x2f0
xfs_trans_ail_init+0x20/0x320
xfs_log_mount+0x37e/0x690
xfs_mountfs+0xe36/0x1b40
xfs_fs_fill_super+0xc5c/0x1a70
get_tree_bdev+0x3c5/0x6c0
vfs_get_tree+0x85/0x250
path_mount+0xec3/0x1830
do_mount+0xef/0x110
__x64_sys_mount+0x150/0x1f0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 9662:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
__kasan_slab_free+0x105/0x1a0
__kmem_cache_free+0x99/0x2d0
kvfree+0x3a/0x40
xfs_log_unmount+0x60/0xf0
xfs_unmountfs+0xf3/0x1d0
xfs_fs_put_super+0x78/0x300
generic_shutdown_super+0x151/0x400
kill_block_super+0x9a/0xe0
deactivate_locked_super+0x82/0xe0
deactivate_super+0x91/0xb0
cleanup_mnt+0x32a/0x4a0
task_work_run+0x15f/0x240
exit_to_user_mode_prepare+0x188/0x190
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x42/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888023169400
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
128-byte region [ffff888023169400, ffff888023169480)
The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
Fixes: cd6f79d1fb32 ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
---
fs/xfs/xfs_log.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 0141d9907d31..a8dbd4caea51 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -82,6 +82,9 @@ STATIC int
xlog_iclogs_empty(
struct xlog *log);
+static void
+xfs_iclog_iodone_wait(struct xlog *log);
+
static int
xfs_log_cover(struct xfs_mount *);
@@ -888,6 +891,23 @@ xlog_force_iclog(
return xlog_state_release_iclog(iclog->ic_log, iclog, NULL);
}
+/*
+ * Cycle all the iclogbuf locks to make sure all log IO completion
+ * is done before we tear down these buffers.
+ */
+static void
+xfs_iclog_iodone_wait(struct xlog *log)
+{
+ int i;
+ xlog_in_core_t *iclog = log->l_iclog;
+
+ for (i = 0; i < log->l_iclog_bufs; i++) {
+ down(&iclog->ic_sema);
+ up(&iclog->ic_sema);
+ iclog = iclog->ic_next;
+ }
+}
+
/*
* Wait for the iclog and all prior iclogs to be written disk as required by the
* log force state machine. Waiting on ic_force_wait ensures iclog completions
@@ -1113,6 +1133,14 @@ xfs_log_unmount(
{
xfs_log_clean(mp);
+ /*
+ * If shutdown has come from iclog IO context, the log
+ * cleaning will have been skipped and so we need to wait
+ * for ithe iclog to complete shutdown processing before we
+ * tear anything down.
+ */
+ xfs_iclog_iodone_wait(mp->m_log);
+
xfs_buftarg_drain(mp->m_ddev_targp);
xfs_trans_ail_destroy(mp);
@@ -2115,17 +2143,6 @@ xlog_dealloc_log(
xlog_in_core_t *iclog, *next_iclog;
int i;
- /*
- * Cycle all the iclogbuf locks to make sure all log IO completion
- * is done before we tear down these buffers.
- */
- iclog = log->l_iclog;
- for (i = 0; i < log->l_iclog_bufs; i++) {
- down(&iclog->ic_sema);
- up(&iclog->ic_sema);
- iclog = iclog->ic_next;
- }
-
/*
* Destroy the CIL after waiting for iclog IO completion because an
* iclog EIO error will try to shut down the log, which accesses the
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL
2022-11-17 14:50 ` [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
@ 2022-11-17 19:20 ` Darrick J. Wong
2022-11-18 2:13 ` Guo Xuenan
0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2022-11-17 19:20 UTC (permalink / raw)
To: Guo Xuenan
Cc: dchinner, linux-xfs, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
On Thu, Nov 17, 2022 at 10:50:29PM +0800, Guo Xuenan wrote:
> Fix uaf in xfs_trans_ail_delete during xlog force shutdown.
> In commit cd6f79d1fb32 ("xfs: run callbacks before waking waiters in
> xlog_state_shutdown_callbacks") changed the order of running callbacks
> and wait for iclog completion to avoid unmount path untimely destroy AIL.
> But which seems not enough to ensue this, adding mdelay in
> `xfs_buf_item_unpin` can prove that.
>
> The reproduction is as follows. To ensure destroy AIL safely,
> we should wait all xlog ioend workers done and sync the AIL.
>
> ==================================================================
> BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
> Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43
>
> CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G W
> 6.1.0-rc1-00002-gc28266863c4a #137
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> Workqueue: xfs-log/sda xlog_ioend_work
> Call Trace:
> <TASK>
> dump_stack_lvl+0x4d/0x66
> print_report+0x171/0x4a6
> kasan_report+0xb3/0x130
> xfs_trans_ail_delete+0x240/0x2a0
> xfs_buf_item_done+0x7b/0xa0
> xfs_buf_ioend+0x1e9/0x11f0
> xfs_buf_item_unpin+0x4c8/0x860
> xfs_trans_committed_bulk+0x4c2/0x7c0
> xlog_cil_committed+0xab6/0xfb0
> xlog_cil_process_committed+0x117/0x1e0
> xlog_state_shutdown_callbacks+0x208/0x440
> xlog_force_shutdown+0x1b3/0x3a0
> xlog_ioend_work+0xef/0x1d0
> process_one_work+0x6f9/0xf70
> worker_thread+0x578/0xf30
> kthread+0x28c/0x330
> ret_from_fork+0x1f/0x30
> </TASK>
>
> Allocated by task 9606:
> kasan_save_stack+0x1e/0x40
> kasan_set_track+0x21/0x30
> __kasan_kmalloc+0x7a/0x90
> __kmalloc+0x59/0x140
> kmem_alloc+0xb2/0x2f0
> xfs_trans_ail_init+0x20/0x320
> xfs_log_mount+0x37e/0x690
> xfs_mountfs+0xe36/0x1b40
> xfs_fs_fill_super+0xc5c/0x1a70
> get_tree_bdev+0x3c5/0x6c0
> vfs_get_tree+0x85/0x250
> path_mount+0xec3/0x1830
> do_mount+0xef/0x110
> __x64_sys_mount+0x150/0x1f0
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Freed by task 9662:
> kasan_save_stack+0x1e/0x40
> kasan_set_track+0x21/0x30
> kasan_save_free_info+0x2a/0x40
> __kasan_slab_free+0x105/0x1a0
> __kmem_cache_free+0x99/0x2d0
> kvfree+0x3a/0x40
> xfs_log_unmount+0x60/0xf0
> xfs_unmountfs+0xf3/0x1d0
> xfs_fs_put_super+0x78/0x300
> generic_shutdown_super+0x151/0x400
> kill_block_super+0x9a/0xe0
> deactivate_locked_super+0x82/0xe0
> deactivate_super+0x91/0xb0
> cleanup_mnt+0x32a/0x4a0
> task_work_run+0x15f/0x240
> exit_to_user_mode_prepare+0x188/0x190
> syscall_exit_to_user_mode+0x12/0x30
> do_syscall_64+0x42/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> The buggy address belongs to the object at ffff888023169400
> which belongs to the cache kmalloc-128 of size 128
> The buggy address is located 0 bytes inside of
> 128-byte region [ffff888023169400, ffff888023169480)
>
> The buggy address belongs to the physical page:
> page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
> index:0xffff888023168f80 pfn:0x23168
> head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
> flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
> raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
> raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
>
> Fixes: cd6f79d1fb32 ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
> ---
> fs/xfs/xfs_log.c | 39 ++++++++++++++++++++++++++++-----------
> 1 file changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 0141d9907d31..a8dbd4caea51 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -82,6 +82,9 @@ STATIC int
> xlog_iclogs_empty(
> struct xlog *log);
>
> +static void
> +xfs_iclog_iodone_wait(struct xlog *log);
> +
Static functions used once don't need forward declarations.
> static int
> xfs_log_cover(struct xfs_mount *);
>
> @@ -888,6 +891,23 @@ xlog_force_iclog(
> return xlog_state_release_iclog(iclog->ic_log, iclog, NULL);
> }
>
> +/*
> + * Cycle all the iclogbuf locks to make sure all log IO completion
> + * is done before we tear down these buffers.
> + */
> +static void
> +xfs_iclog_iodone_wait(struct xlog *log)
Hmm, if this is a function acting on a struct xlog (and not a struct
xfs_mount) then this ought to be xlog_, right?
static inline void
xlog_wait_iclog_completion(
struct xlog *log)
{
struct xlog_in_core *iclog = log->l_iclog;
int i;
> +{
> + int i;
> + xlog_in_core_t *iclog = log->l_iclog;
Same "Don't add to the struct typedef usage" comment I've had since the
start of this thread.
> +
> + for (i = 0; i < log->l_iclog_bufs; i++) {
> + down(&iclog->ic_sema);
> + up(&iclog->ic_sema);
> + iclog = iclog->ic_next;
> + }
> +}
> +
> /*
> * Wait for the iclog and all prior iclogs to be written disk as required by the
> * log force state machine. Waiting on ic_force_wait ensures iclog completions
> @@ -1113,6 +1133,14 @@ xfs_log_unmount(
> {
> xfs_log_clean(mp);
>
> + /*
> + * If shutdown has come from iclog IO context, the log
> + * cleaning will have been skipped and so we need to wait
> + * for ithe iclog to complete shutdown processing before we
> + * tear anything down.
Good comment though.
--D
> + */
> + xfs_iclog_iodone_wait(mp->m_log);
> +
> xfs_buftarg_drain(mp->m_ddev_targp);
>
> xfs_trans_ail_destroy(mp);
> @@ -2115,17 +2143,6 @@ xlog_dealloc_log(
> xlog_in_core_t *iclog, *next_iclog;
> int i;
>
> - /*
> - * Cycle all the iclogbuf locks to make sure all log IO completion
> - * is done before we tear down these buffers.
> - */
> - iclog = log->l_iclog;
> - for (i = 0; i < log->l_iclog_bufs; i++) {
> - down(&iclog->ic_sema);
> - up(&iclog->ic_sema);
> - iclog = iclog->ic_next;
> - }
> -
> /*
> * Destroy the CIL after waiting for iclog IO completion because an
> * iclog EIO error will try to shut down the log, which accesses the
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL
2022-11-17 19:20 ` Darrick J. Wong
@ 2022-11-18 2:13 ` Guo Xuenan
0 siblings, 0 replies; 6+ messages in thread
From: Guo Xuenan @ 2022-11-18 2:13 UTC (permalink / raw)
To: Darrick J. Wong
Cc: dchinner, linux-xfs, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
On 2022/11/18 3:20, Darrick J. Wong wrote:
> On Thu, Nov 17, 2022 at 10:50:29PM +0800, Guo Xuenan wrote:
>> Fix uaf in xfs_trans_ail_delete during xlog force shutdown.
>> In commit cd6f79d1fb32 ("xfs: run callbacks before waking waiters in
>> xlog_state_shutdown_callbacks") changed the order of running callbacks
>> and wait for iclog completion to avoid unmount path untimely destroy AIL.
>> But which seems not enough to ensue this, adding mdelay in
>> `xfs_buf_item_unpin` can prove that.
>>
>> The reproduction is as follows. To ensure destroy AIL safely,
>> we should wait all xlog ioend workers done and sync the AIL.
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
>> Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43
>>
>> CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G W
>> 6.1.0-rc1-00002-gc28266863c4a #137
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.13.0-1ubuntu1.1 04/01/2014
>> Workqueue: xfs-log/sda xlog_ioend_work
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0x4d/0x66
>> print_report+0x171/0x4a6
>> kasan_report+0xb3/0x130
>> xfs_trans_ail_delete+0x240/0x2a0
>> xfs_buf_item_done+0x7b/0xa0
>> xfs_buf_ioend+0x1e9/0x11f0
>> xfs_buf_item_unpin+0x4c8/0x860
>> xfs_trans_committed_bulk+0x4c2/0x7c0
>> xlog_cil_committed+0xab6/0xfb0
>> xlog_cil_process_committed+0x117/0x1e0
>> xlog_state_shutdown_callbacks+0x208/0x440
>> xlog_force_shutdown+0x1b3/0x3a0
>> xlog_ioend_work+0xef/0x1d0
>> process_one_work+0x6f9/0xf70
>> worker_thread+0x578/0xf30
>> kthread+0x28c/0x330
>> ret_from_fork+0x1f/0x30
>> </TASK>
>>
>> Allocated by task 9606:
>> kasan_save_stack+0x1e/0x40
>> kasan_set_track+0x21/0x30
>> __kasan_kmalloc+0x7a/0x90
>> __kmalloc+0x59/0x140
>> kmem_alloc+0xb2/0x2f0
>> xfs_trans_ail_init+0x20/0x320
>> xfs_log_mount+0x37e/0x690
>> xfs_mountfs+0xe36/0x1b40
>> xfs_fs_fill_super+0xc5c/0x1a70
>> get_tree_bdev+0x3c5/0x6c0
>> vfs_get_tree+0x85/0x250
>> path_mount+0xec3/0x1830
>> do_mount+0xef/0x110
>> __x64_sys_mount+0x150/0x1f0
>> do_syscall_64+0x35/0x80
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> Freed by task 9662:
>> kasan_save_stack+0x1e/0x40
>> kasan_set_track+0x21/0x30
>> kasan_save_free_info+0x2a/0x40
>> __kasan_slab_free+0x105/0x1a0
>> __kmem_cache_free+0x99/0x2d0
>> kvfree+0x3a/0x40
>> xfs_log_unmount+0x60/0xf0
>> xfs_unmountfs+0xf3/0x1d0
>> xfs_fs_put_super+0x78/0x300
>> generic_shutdown_super+0x151/0x400
>> kill_block_super+0x9a/0xe0
>> deactivate_locked_super+0x82/0xe0
>> deactivate_super+0x91/0xb0
>> cleanup_mnt+0x32a/0x4a0
>> task_work_run+0x15f/0x240
>> exit_to_user_mode_prepare+0x188/0x190
>> syscall_exit_to_user_mode+0x12/0x30
>> do_syscall_64+0x42/0x80
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> The buggy address belongs to the object at ffff888023169400
>> which belongs to the cache kmalloc-128 of size 128
>> The buggy address is located 0 bytes inside of
>> 128-byte region [ffff888023169400, ffff888023169480)
>>
>> The buggy address belongs to the physical page:
>> page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
>> index:0xffff888023168f80 pfn:0x23168
>> head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
>> flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
>> raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
>> raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>> ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ^
>> ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==================================================================
>> Disabling lock debugging due to kernel taint
>>
>> Fixes: cd6f79d1fb32 ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
>> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
>> ---
>> fs/xfs/xfs_log.c | 39 ++++++++++++++++++++++++++++-----------
>> 1 file changed, 28 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
>> index 0141d9907d31..a8dbd4caea51 100644
>> --- a/fs/xfs/xfs_log.c
>> +++ b/fs/xfs/xfs_log.c
>> @@ -82,6 +82,9 @@ STATIC int
>> xlog_iclogs_empty(
>> struct xlog *log);
>>
>> +static void
>> +xfs_iclog_iodone_wait(struct xlog *log);
>> +
> Static functions used once don't need forward declarations.
OK,get it.
>> static int
>> xfs_log_cover(struct xfs_mount *);
>>
>> @@ -888,6 +891,23 @@ xlog_force_iclog(
>> return xlog_state_release_iclog(iclog->ic_log, iclog, NULL);
>> }
>>
>> +/*
>> + * Cycle all the iclogbuf locks to make sure all log IO completion
>> + * is done before we tear down these buffers.
>> + */
>> +static void
>> +xfs_iclog_iodone_wait(struct xlog *log)
> Hmm, if this is a function acting on a struct xlog (and not a struct
> xfs_mount) then this ought to be xlog_, right?
right, it will be named xlog_wait_iclog_completion.
> static inline void
> xlog_wait_iclog_completion(
> struct xlog *log)
> {
> struct xlog_in_core *iclog = log->l_iclog;
> int i;
>
>
>> +{
>> + int i;
>> + xlog_in_core_t *iclog = log->l_iclog;
> Same "Don't add to the struct typedef usage" comment I've had since the
> start of this thread.
sorry,will fix it.
>> +
>> + for (i = 0; i < log->l_iclog_bufs; i++) {
>> + down(&iclog->ic_sema);
>> + up(&iclog->ic_sema);
>> + iclog = iclog->ic_next;
>> + }
>> +}
>> +
>> /*
>> * Wait for the iclog and all prior iclogs to be written disk as required by the
>> * log force state machine. Waiting on ic_force_wait ensures iclog completions
>> @@ -1113,6 +1133,14 @@ xfs_log_unmount(
>> {
>> xfs_log_clean(mp);
>>
>> + /*
>> + * If shutdown has come from iclog IO context, the log
>> + * cleaning will have been skipped and so we need to wait
>> + * for ithe iclog to complete shutdown processing before we
>> + * tear anything down.
> Good comment though.
Yes, Dave deserve all the credit,copy from Dave's review comments. :)
> --D
>
>> + */
>> + xfs_iclog_iodone_wait(mp->m_log);
>> +
>> xfs_buftarg_drain(mp->m_ddev_targp);
>>
>> xfs_trans_ail_destroy(mp);
>> @@ -2115,17 +2143,6 @@ xlog_dealloc_log(
>> xlog_in_core_t *iclog, *next_iclog;
>> int i;
>>
>> - /*
>> - * Cycle all the iclogbuf locks to make sure all log IO completion
>> - * is done before we tear down these buffers.
>> - */
>> - iclog = log->l_iclog;
>> - for (i = 0; i < log->l_iclog_bufs; i++) {
>> - down(&iclog->ic_sema);
>> - up(&iclog->ic_sema);
>> - iclog = iclog->ic_next;
>> - }
>> -
>> /*
>> * Destroy the CIL after waiting for iclog IO completion because an
>> * iclog EIO error will try to shut down the log, which accesses the
>> --
>> 2.31.1
>>
> .
--
Guo Xuenan [OS Kernel Lab]
-----------------------------
Email: guoxuenan@huawei.com
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown
2022-11-17 14:50 [PATCH v4 0/2] xfs: shutdown UAF fixes Guo Xuenan
2022-11-17 14:50 ` [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
@ 2022-11-17 14:50 ` Guo Xuenan
2022-11-17 19:28 ` Darrick J. Wong
1 sibling, 1 reply; 6+ messages in thread
From: Guo Xuenan @ 2022-11-17 14:50 UTC (permalink / raw)
To: djwong, dchinner
Cc: linux-xfs, guoxuenan, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
xfs log io error will trigger xlog shut down, and end_io worker call
xlog_state_shutdown_callbacks to unpin and release the buf log item.
The race condition is that when there are some thread doing transaction
commit and happened not to be intercepted by xlog_is_shutdown, then,
these log item will be insert into CIL, when unpin and release these
buf log item, UAF will occur. BTW, add delay before `xlog_cil_commit`
can increase recurrence probability.
The following call graph actually encountered this bad situation.
fsstress io end worker kworker/0:1H-216
xlog_ioend_work
->xlog_force_shutdown
->xlog_state_shutdown_callbacks
->xlog_cil_process_committed
->xlog_cil_committed
->xfs_trans_committed_bulk
->xfs_trans_apply_sb_deltas ->li_ops->iop_unpin(lip, 1);
->xfs_trans_getsb
->_xfs_trans_bjoin
->xfs_buf_item_init
->if (bip) { return 0;} //relog
->xlog_cil_commit
->xlog_cil_insert_items //insert into CIL
->xfs_buf_ioend_fail(bp);
->xfs_buf_ioend
->xfs_buf_item_done
->xfs_buf_item_relse
->xfs_buf_item_free
when cil push worker gather percpu cil and insert super block buf log item
into ctx->log_items then uaf occurs.
==================================================================
BUG: KASAN: use-after-free in xlog_cil_push_work+0x1c8f/0x22f0
Write of size 8 at addr ffff88801800f3f0 by task kworker/u4:4/105
CPU: 0 PID: 105 Comm: kworker/u4:4 Tainted: G W
6.1.0-rc1-00001-g274115149b42 #136
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-cil/sda xlog_cil_push_work
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x66
print_report+0x171/0x4a6
kasan_report+0xb3/0x130
xlog_cil_push_work+0x1c8f/0x22f0
process_one_work+0x6f9/0xf70
worker_thread+0x578/0xf30
kthread+0x28c/0x330
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 2145:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_slab_alloc+0x54/0x60
kmem_cache_alloc+0x14a/0x510
xfs_buf_item_init+0x160/0x6d0
_xfs_trans_bjoin+0x7f/0x2e0
xfs_trans_getsb+0xb6/0x3f0
xfs_trans_apply_sb_deltas+0x1f/0x8c0
__xfs_trans_commit+0xa25/0xe10
xfs_symlink+0xe23/0x1660
xfs_vn_symlink+0x157/0x280
vfs_symlink+0x491/0x790
do_symlinkat+0x128/0x220
__x64_sys_symlink+0x7a/0x90
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 216:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
__kasan_slab_free+0x105/0x1a0
kmem_cache_free+0xb6/0x460
xfs_buf_ioend+0x1e9/0x11f0
xfs_buf_item_unpin+0x3d6/0x840
xfs_trans_committed_bulk+0x4c2/0x7c0
xlog_cil_committed+0xab6/0xfb0
xlog_cil_process_committed+0x117/0x1e0
xlog_state_shutdown_callbacks+0x208/0x440
xlog_force_shutdown+0x1b3/0x3a0
xlog_ioend_work+0xef/0x1d0
process_one_work+0x6f9/0xf70
worker_thread+0x578/0xf30
kthread+0x28c/0x330
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff88801800f388
which belongs to the cache xfs_buf_item of size 272
The buggy address is located 104 bytes inside of
272-byte region [ffff88801800f388, ffff88801800f498)
The buggy address belongs to the physical page:
page:ffffea0000600380 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff88801800f208 pfn:0x1800e
head:ffffea0000600380 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea0000699788 ffff88801319db50 ffff88800fb50640
raw: ffff88801800f208 000000000015000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88801800f280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801800f300: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801800f380: fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88801800f400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801800f480: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
---
fs/xfs/xfs_buf_item.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 522d450a94b1..df7322ed73fa 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1018,6 +1018,8 @@ xfs_buf_item_relse(
trace_xfs_buf_item_relse(bp, _RET_IP_);
ASSERT(!test_bit(XFS_LI_IN_AIL, &bip->bli_item.li_flags));
+ if (atomic_read(&bip->bli_refcount))
+ return;
bp->b_log_item = NULL;
xfs_buf_rele(bp);
xfs_buf_item_free(bip);
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown
2022-11-17 14:50 ` [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
@ 2022-11-17 19:28 ` Darrick J. Wong
0 siblings, 0 replies; 6+ messages in thread
From: Darrick J. Wong @ 2022-11-17 19:28 UTC (permalink / raw)
To: Guo Xuenan
Cc: dchinner, linux-xfs, houtao1, jack.qiu, fangwei1, yi.zhang,
zhengbin13, leo.lilong, zengheng4
On Thu, Nov 17, 2022 at 10:50:30PM +0800, Guo Xuenan wrote:
> xfs log io error will trigger xlog shut down, and end_io worker call
> xlog_state_shutdown_callbacks to unpin and release the buf log item.
> The race condition is that when there are some thread doing transaction
> commit and happened not to be intercepted by xlog_is_shutdown, then,
> these log item will be insert into CIL, when unpin and release these
> buf log item, UAF will occur. BTW, add delay before `xlog_cil_commit`
> can increase recurrence probability.
>
> The following call graph actually encountered this bad situation.
> fsstress io end worker kworker/0:1H-216
> xlog_ioend_work
> ->xlog_force_shutdown
> ->xlog_state_shutdown_callbacks
> ->xlog_cil_process_committed
> ->xlog_cil_committed
> ->xfs_trans_committed_bulk
Odd switch from ^^^^^^^^^^^^^ spaces to tabs here.
> ->xfs_trans_apply_sb_deltas ->li_ops->iop_unpin(lip, 1);
> ->xfs_trans_getsb
> ->_xfs_trans_bjoin
> ->xfs_buf_item_init
> ->if (bip) { return 0;} //relog
> ->xlog_cil_commit
> ->xlog_cil_insert_items //insert into CIL
Why are we still inserting things into the CIL of a dead log?
> ->xfs_buf_ioend_fail(bp);
> ->xfs_buf_ioend
> ->xfs_buf_item_done
> ->xfs_buf_item_relse
> ->xfs_buf_item_free
>
> when cil push worker gather percpu cil and insert super block buf log item
> into ctx->log_items then uaf occurs.
>
> ==================================================================
> BUG: KASAN: use-after-free in xlog_cil_push_work+0x1c8f/0x22f0
> Write of size 8 at addr ffff88801800f3f0 by task kworker/u4:4/105
>
> CPU: 0 PID: 105 Comm: kworker/u4:4 Tainted: G W
> 6.1.0-rc1-00001-g274115149b42 #136
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> Workqueue: xfs-cil/sda xlog_cil_push_work
> Call Trace:
> <TASK>
> dump_stack_lvl+0x4d/0x66
> print_report+0x171/0x4a6
> kasan_report+0xb3/0x130
> xlog_cil_push_work+0x1c8f/0x22f0
> process_one_work+0x6f9/0xf70
> worker_thread+0x578/0xf30
> kthread+0x28c/0x330
> ret_from_fork+0x1f/0x30
> </TASK>
>
> Allocated by task 2145:
> kasan_save_stack+0x1e/0x40
> kasan_set_track+0x21/0x30
> __kasan_slab_alloc+0x54/0x60
> kmem_cache_alloc+0x14a/0x510
> xfs_buf_item_init+0x160/0x6d0
> _xfs_trans_bjoin+0x7f/0x2e0
> xfs_trans_getsb+0xb6/0x3f0
> xfs_trans_apply_sb_deltas+0x1f/0x8c0
> __xfs_trans_commit+0xa25/0xe10
> xfs_symlink+0xe23/0x1660
> xfs_vn_symlink+0x157/0x280
> vfs_symlink+0x491/0x790
> do_symlinkat+0x128/0x220
> __x64_sys_symlink+0x7a/0x90
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Freed by task 216:
> kasan_save_stack+0x1e/0x40
> kasan_set_track+0x21/0x30
> kasan_save_free_info+0x2a/0x40
> __kasan_slab_free+0x105/0x1a0
> kmem_cache_free+0xb6/0x460
> xfs_buf_ioend+0x1e9/0x11f0
> xfs_buf_item_unpin+0x3d6/0x840
> xfs_trans_committed_bulk+0x4c2/0x7c0
> xlog_cil_committed+0xab6/0xfb0
> xlog_cil_process_committed+0x117/0x1e0
> xlog_state_shutdown_callbacks+0x208/0x440
> xlog_force_shutdown+0x1b3/0x3a0
> xlog_ioend_work+0xef/0x1d0
> process_one_work+0x6f9/0xf70
> worker_thread+0x578/0xf30
> kthread+0x28c/0x330
> ret_from_fork+0x1f/0x30
>
> The buggy address belongs to the object at ffff88801800f388
> which belongs to the cache xfs_buf_item of size 272
> The buggy address is located 104 bytes inside of
> 272-byte region [ffff88801800f388, ffff88801800f498)
>
> The buggy address belongs to the physical page:
> page:ffffea0000600380 refcount:1 mapcount:0 mapping:0000000000000000
> index:0xffff88801800f208 pfn:0x1800e
> head:ffffea0000600380 order:1 compound_mapcount:0 compound_pincount:0
> flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
> raw: 001fffff80010200 ffffea0000699788 ffff88801319db50 ffff88800fb50640
> raw: ffff88801800f208 000000000015000a 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88801800f280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88801800f300: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88801800f380: fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff88801800f400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88801800f480: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
>
> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
> ---
> fs/xfs/xfs_buf_item.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 522d450a94b1..df7322ed73fa 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -1018,6 +1018,8 @@ xfs_buf_item_relse(
> trace_xfs_buf_item_relse(bp, _RET_IP_);
> ASSERT(!test_bit(XFS_LI_IN_AIL, &bip->bli_item.li_flags));
>
> + if (atomic_read(&bip->bli_refcount))
> + return;
If we're releasing the buf item, shouldn't this be an
atomic_dec_and_test or something?
--D
> bp->b_log_item = NULL;
> xfs_buf_rele(bp);
> xfs_buf_item_free(bip);
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-11-18 2:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-17 14:50 [PATCH v4 0/2] xfs: shutdown UAF fixes Guo Xuenan
2022-11-17 14:50 ` [PATCH v4 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
2022-11-17 19:20 ` Darrick J. Wong
2022-11-18 2:13 ` Guo Xuenan
2022-11-17 14:50 ` [PATCH v4 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
2022-11-17 19:28 ` Darrick J. Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox