[PATCH v2 0/3] xfs: fix two problem when recovery intents fails

linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/3] xfs: fix two problem when recovery intents fails
@ 2023-07-15  6:36 Long Li
  2023-07-15  6:36 ` [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort Long Li
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Long Li @ 2023-07-15  6:36 UTC (permalink / raw)
  To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun

This patch set fix two problem when recovery intents fails.

Patches 1-2 fix the possible problem that intent items not released.
When recovery intents, new intents items may be created during recovery
intents. if recovery fails, new intents items may be left in AIL or
leaks.

Patch 3 fix a uaf problem, when recovery intents fails, intent items
may be freed before done item commited.

v2:
 - change xfs_defer_pending_abort to static in patch 1
 - rewrite commit message in patch 2-3
 - rename xfs_defer_ops_capture_free to xfs_defer_ops_capture_abort, and
 add xfs_defer_pending_abort to the start of xfs_defer_ops_capture_abort

Long Li (3):
  xfs: factor out xfs_defer_pending_abort
  xfs: abort intent items when recovery intents fail
  xfs: make sure done item committed before cancel intents

 fs/xfs/libxfs/xfs_defer.c | 28 ++++++++++++++++++----------
 fs/xfs/libxfs/xfs_defer.h |  2 +-
 fs/xfs/xfs_log_recover.c  | 16 ++++++++--------
 3 files changed, 27 insertions(+), 19 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort
  2023-07-15  6:36 [PATCH v2 0/3] xfs: fix two problem when recovery intents fails Long Li
@ 2023-07-15  6:36 ` Long Li
  2023-07-19  2:08   ` Darrick J. Wong
  2023-07-15  6:36 ` [PATCH v2 2/3] xfs: abort intent items when recovery intents fail Long Li
  2023-07-15  6:36 ` [PATCH v2 3/3] xfs: make sure done item committed before cancel intents Long Li
  2 siblings, 1 reply; 12+ messages in thread
From: Long Li @ 2023-07-15  6:36 UTC (permalink / raw)
  To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun

Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(), which
not use transaction parameter, so it can be used after the transaction
life cycle.

Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/libxfs/xfs_defer.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index bcfb6a4203cd..88388e12f8e7 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -245,21 +245,18 @@ xfs_defer_create_intents(
 	return ret;
 }
 
-/* Abort all the intents that were committed. */
 STATIC void
-xfs_defer_trans_abort(
-	struct xfs_trans		*tp,
-	struct list_head		*dop_pending)
+xfs_defer_pending_abort(
+	struct xfs_mount		*mp,
+	struct list_head		*dop_list)
 {
 	struct xfs_defer_pending	*dfp;
 	const struct xfs_defer_op_type	*ops;
 
-	trace_xfs_defer_trans_abort(tp, _RET_IP_);
-
 	/* Abort intent items that don't have a done item. */
-	list_for_each_entry(dfp, dop_pending, dfp_list) {
+	list_for_each_entry(dfp, dop_list, dfp_list) {
 		ops = defer_op_types[dfp->dfp_type];
-		trace_xfs_defer_pending_abort(tp->t_mountp, dfp);
+		trace_xfs_defer_pending_abort(mp, dfp);
 		if (dfp->dfp_intent && !dfp->dfp_done) {
 			ops->abort_intent(dfp->dfp_intent);
 			dfp->dfp_intent = NULL;
@@ -267,6 +264,16 @@ xfs_defer_trans_abort(
 	}
 }
 
+/* Abort all the intents that were committed. */
+STATIC void
+xfs_defer_trans_abort(
+	struct xfs_trans		*tp,
+	struct list_head		*dop_pending)
+{
+	trace_xfs_defer_trans_abort(tp, _RET_IP_);
+	xfs_defer_pending_abort(tp->t_mountp, dop_pending);
+}
+
 /*
  * Capture resources that the caller said not to release ("held") when the
  * transaction commits.  Caller is responsible for zero-initializing @dres.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/3] xfs: abort intent items when recovery intents fail
  2023-07-15  6:36 [PATCH v2 0/3] xfs: fix two problem when recovery intents fails Long Li
  2023-07-15  6:36 ` [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort Long Li
@ 2023-07-15  6:36 ` Long Li
  2023-07-19  2:07   ` Darrick J. Wong
  2023-07-15  6:36 ` [PATCH v2 3/3] xfs: make sure done item committed before cancel intents Long Li
  2 siblings, 1 reply; 12+ messages in thread
From: Long Li @ 2023-07-15  6:36 UTC (permalink / raw)
  To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun

When recovering intents, we capture newly created intent items as part of
committing recovered intent items.  If intent recovery fails at a later
point, we forget to remove those newly created intent items from the AIL
and hang:

    [root@localhost ~]# cat /proc/539/stack
    [<0>] xfs_ail_push_all_sync+0x174/0x230
    [<0>] xfs_unmount_flush_inodes+0x8d/0xd0
    [<0>] xfs_mountfs+0x15f7/0x1e70
    [<0>] xfs_fs_fill_super+0x10ec/0x1b20
    [<0>] get_tree_bdev+0x3c8/0x730
    [<0>] vfs_get_tree+0x89/0x2c0
    [<0>] path_mount+0xecf/0x1800
    [<0>] do_mount+0xf3/0x110
    [<0>] __x64_sys_mount+0x154/0x1f0
    [<0>] do_syscall_64+0x39/0x80
    [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

When newly created intent items fail to commit via transaction, intent
recovery hasn't created done items for these newly created intent items,
so the capture structure is the sole owner of the captured intent items.
We must release them explicitly or else they leak:

unreferenced object 0xffff888016719108 (size 432):
  comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
  hex dump (first 32 bytes):
    08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff  ..q.......q.....
    18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff  ..q.......q.....
  backtrace:
    [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
    [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
    [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
    [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
    [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
    [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
    [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
    [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
    [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
    [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
    [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
    [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
    [<ffffffff81a9f35f>] path_mount+0xecf/0x1800
    [<ffffffff81a9fd83>] do_mount+0xf3/0x110
    [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
    [<ffffffff83968739>] do_syscall_64+0x39/0x80

Fix the problem above by abort intent items that don't have a done item
when recovery intents fail.

Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery")
Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/libxfs/xfs_defer.c | 5 +++--
 fs/xfs/libxfs/xfs_defer.h | 2 +-
 fs/xfs/xfs_log_recover.c  | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index 88388e12f8e7..f71679ce23b9 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -763,12 +763,13 @@ xfs_defer_ops_capture(
 
 /* Release all resources that we used to capture deferred ops. */
 void
-xfs_defer_ops_capture_free(
+xfs_defer_ops_capture_abort(
 	struct xfs_mount		*mp,
 	struct xfs_defer_capture	*dfc)
 {
 	unsigned short			i;
 
+	xfs_defer_pending_abort(mp, &dfc->dfc_dfops);
 	xfs_defer_cancel_list(mp, &dfc->dfc_dfops);
 
 	for (i = 0; i < dfc->dfc_held.dr_bufs; i++)
@@ -809,7 +810,7 @@ xfs_defer_ops_capture_and_commit(
 	/* Commit the transaction and add the capture structure to the list. */
 	error = xfs_trans_commit(tp);
 	if (error) {
-		xfs_defer_ops_capture_free(mp, dfc);
+		xfs_defer_ops_capture_abort(mp, dfc);
 		return error;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 114a3a4930a3..8788ad5f6a73 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -121,7 +121,7 @@ int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp,
 		struct list_head *capture_list);
 void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp,
 		struct xfs_defer_resources *dres);
-void xfs_defer_ops_capture_free(struct xfs_mount *mp,
+void xfs_defer_ops_capture_abort(struct xfs_mount *mp,
 		struct xfs_defer_capture *d);
 void xfs_defer_resources_rele(struct xfs_defer_resources *dres);
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 82c81d20459d..fdaa0ffe029b 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2511,7 +2511,7 @@ xlog_abort_defer_ops(
 
 	list_for_each_entry_safe(dfc, next, capture_list, dfc_list) {
 		list_del_init(&dfc->dfc_list);
-		xfs_defer_ops_capture_free(mp, dfc);
+		xfs_defer_ops_capture_abort(mp, dfc);
 	}
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-15  6:36 [PATCH v2 0/3] xfs: fix two problem when recovery intents fails Long Li
  2023-07-15  6:36 ` [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort Long Li
  2023-07-15  6:36 ` [PATCH v2 2/3] xfs: abort intent items when recovery intents fail Long Li
@ 2023-07-15  6:36 ` Long Li
  2023-07-19  2:19   ` Darrick J. Wong
  2023-07-19  6:41   ` Dave Chinner
  2 siblings, 2 replies; 12+ messages in thread
From: Long Li @ 2023-07-15  6:36 UTC (permalink / raw)
  To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun

KASAN report a uaf when recover intents fails:

 ==================================================================
 BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
 Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
 CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
 Workqueue: xfs-cil/sda xlog_cil_push_work
 Call Trace:
  <TASK>
  dump_stack_lvl+0x50/0x70
  print_report+0xc2/0x600
  kasan_report+0xb6/0xe0
  xfs_cui_release+0xb7/0xc0
  xfs_cud_item_release+0x3c/0x90
  xfs_trans_committed_bulk+0x2d5/0x7f0
  xlog_cil_committed+0xaba/0xf20
  xlog_cil_push_work+0x1a60/0x2360
  process_one_work+0x78e/0x1140
  worker_thread+0x58b/0xf60
  kthread+0x2cd/0x3c0
  ret_from_fork+0x1f/0x30
  </TASK>

 Allocated by task 531:
  kasan_save_stack+0x22/0x40
  kasan_set_track+0x25/0x30
  __kasan_slab_alloc+0x55/0x60
  kmem_cache_alloc+0x195/0x5f0
  xfs_cui_init+0x198/0x1d0
  xlog_recover_cui_commit_pass2+0x133/0x5f0
  xlog_recover_items_pass2+0x107/0x230
  xlog_recover_commit_trans+0x3e7/0x9c0
  xlog_recovery_process_trans+0x140/0x1d0
  xlog_recover_process_ophdr+0x1a0/0x3d0
  xlog_recover_process_data+0x108/0x2d0
  xlog_recover_process+0x1f6/0x280
  xlog_do_recovery_pass+0x609/0xdb0
  xlog_do_log_recovery+0x84/0xe0
  xlog_do_recover+0x7d/0x470
  xlog_recover+0x25f/0x490
  xfs_log_mount+0x2dd/0x6f0
  xfs_mountfs+0x11ce/0x1e70
  xfs_fs_fill_super+0x10ec/0x1b20
  get_tree_bdev+0x3c8/0x730
  vfs_get_tree+0x89/0x2c0
  path_mount+0xecf/0x1800
  do_mount+0xf3/0x110
  __x64_sys_mount+0x154/0x1f0
  do_syscall_64+0x39/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

 Freed by task 531:
  kasan_save_stack+0x22/0x40
  kasan_set_track+0x25/0x30
  kasan_save_free_info+0x2b/0x40
  __kasan_slab_free+0x114/0x1b0
  kmem_cache_free+0xf8/0x510
  xfs_cui_item_free+0x95/0xb0
  xfs_cui_release+0x86/0xc0
  xlog_recover_cancel_intents.isra.0+0xf8/0x210
  xlog_recover_finish+0x7e7/0x980
  xfs_log_mount_finish+0x2bb/0x4a0
  xfs_mountfs+0x14bf/0x1e70
  xfs_fs_fill_super+0x10ec/0x1b20
  get_tree_bdev+0x3c8/0x730
  vfs_get_tree+0x89/0x2c0
  path_mount+0xecf/0x1800
  do_mount+0xf3/0x110
  __x64_sys_mount+0x154/0x1f0
  do_syscall_64+0x39/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

 The buggy address belongs to the object at ffff888012575dc8
  which belongs to the cache xfs_cui_item of size 432
 The buggy address is located 152 bytes inside of
  freed 432-byte region [ffff888012575dc8, ffff888012575f78)

 The buggy address belongs to the physical page:
 page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
 head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
 page_type: 0xffffffff()
 raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
 raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
  ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
 >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                        ^
  ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
 ==================================================================

If process intents fails, intent items left in AIL will be delete
from AIL and freed in error handling, even intent items that have been
recovered and created done items. After this, uaf will be triggered when
done item commited, because at this point the released intent item will
be accessed.

xlog_recover_finish                     xlog_cil_push_work
----------------------------            ---------------------------
xlog_recover_process_intents
  xfs_cui_item_recover//cui_refcount == 1
    xfs_trans_get_cud
    xfs_trans_commit
      <add cud item to cil>
  xfs_cui_item_recover
    <error occurred and return>
xlog_recover_cancel_intents
  xfs_cui_release     //cui_refcount == 0
    xfs_cui_item_free //free cui
  <release other intent items>
xlog_force_shutdown   //shutdown
                               <...>
                                        <push items in cil>
                                        xlog_cil_committed
                                          xfs_cud_item_release
                                            xfs_cui_release // UAF

Fix it by move log force forward to make sure done items committed before
cancel intents.

Fixes: 2e76f188fd90 ("xfs: cancel intents immediately if process_intents fails")
Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/xfs_log_recover.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index fdaa0ffe029b..c37031e64db5 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3444,6 +3444,13 @@ xlog_recover_finish(
 	int	error;
 
 	error = xlog_recover_process_intents(log);
+	/*
+	 * Sync the log to get all the intents that have done item out of
+	 * the AIL.  This isn't absolutely necessary, but it helps in case
+	 * the unlink transactions would have problems pushing the intents
+	 * out of the way.
+	 */
+	xfs_log_force(log->l_mp, XFS_LOG_SYNC);
 	if (error) {
 		/*
 		 * Cancel all the unprocessed intent items now so that we don't
@@ -3458,13 +3465,6 @@ xlog_recover_finish(
 		return error;
 	}
 
-	/*
-	 * Sync the log to get all the intents out of the AIL.  This isn't
-	 * absolutely necessary, but it helps in case the unlink transactions
-	 * would have problems pushing the intents out of the way.
-	 */
-	xfs_log_force(log->l_mp, XFS_LOG_SYNC);
-
 	/*
 	 * Now that we've recovered the log and all the intents, we can clear
 	 * the log incompat feature bits in the superblock because there's no
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] xfs: abort intent items when recovery intents fail
  2023-07-15  6:36 ` [PATCH v2 2/3] xfs: abort intent items when recovery intents fail Long Li
@ 2023-07-19  2:07   ` Darrick J. Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2023-07-19  2:07 UTC (permalink / raw)
  To: Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun

On Sat, Jul 15, 2023 at 02:36:46PM +0800, Long Li wrote:
> When recovering intents, we capture newly created intent items as part of
> committing recovered intent items.  If intent recovery fails at a later
> point, we forget to remove those newly created intent items from the AIL
> and hang:
> 
>     [root@localhost ~]# cat /proc/539/stack
>     [<0>] xfs_ail_push_all_sync+0x174/0x230
>     [<0>] xfs_unmount_flush_inodes+0x8d/0xd0
>     [<0>] xfs_mountfs+0x15f7/0x1e70
>     [<0>] xfs_fs_fill_super+0x10ec/0x1b20
>     [<0>] get_tree_bdev+0x3c8/0x730
>     [<0>] vfs_get_tree+0x89/0x2c0
>     [<0>] path_mount+0xecf/0x1800
>     [<0>] do_mount+0xf3/0x110
>     [<0>] __x64_sys_mount+0x154/0x1f0
>     [<0>] do_syscall_64+0x39/0x80
>     [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> When newly created intent items fail to commit via transaction, intent
> recovery hasn't created done items for these newly created intent items,
> so the capture structure is the sole owner of the captured intent items.
> We must release them explicitly or else they leak:
> 
> unreferenced object 0xffff888016719108 (size 432):
>   comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
>   hex dump (first 32 bytes):
>     08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff  ..q.......q.....
>     18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff  ..q.......q.....
>   backtrace:
>     [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
>     [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
>     [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
>     [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
>     [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
>     [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
>     [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
>     [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
>     [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
>     [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
>     [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
>     [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
>     [<ffffffff81a9f35f>] path_mount+0xecf/0x1800
>     [<ffffffff81a9fd83>] do_mount+0xf3/0x110
>     [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
>     [<ffffffff83968739>] do_syscall_64+0x39/0x80
> 
> Fix the problem above by abort intent items that don't have a done item
> when recovery intents fail.
> 
> Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery")
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
>  fs/xfs/libxfs/xfs_defer.c | 5 +++--
>  fs/xfs/libxfs/xfs_defer.h | 2 +-
>  fs/xfs/xfs_log_recover.c  | 2 +-
>  3 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index 88388e12f8e7..f71679ce23b9 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -763,12 +763,13 @@ xfs_defer_ops_capture(
>  
>  /* Release all resources that we used to capture deferred ops. */
>  void
> -xfs_defer_ops_capture_free(
> +xfs_defer_ops_capture_abort(
>  	struct xfs_mount		*mp,
>  	struct xfs_defer_capture	*dfc)
>  {
>  	unsigned short			i;
>  
> +	xfs_defer_pending_abort(mp, &dfc->dfc_dfops);

Aha, so ... intent recovery can itself create new intent items.  These
new items are tracked via t_dfops in the transaction, but they've been
moved into the xfs_defer_capture structure by xfs_defer_ops_capture...

>  	xfs_defer_cancel_list(mp, &dfc->dfc_dfops);
>  
>  	for (i = 0; i < dfc->dfc_held.dr_bufs; i++)
> @@ -809,7 +810,7 @@ xfs_defer_ops_capture_and_commit(
>  	/* Commit the transaction and add the capture structure to the list. */
>  	error = xfs_trans_commit(tp);
>  	if (error) {
> -		xfs_defer_ops_capture_free(mp, dfc);
> +		xfs_defer_ops_capture_abort(mp, dfc);

...but then the transaction commit fails.  The capture structure still
has the new intents attached, but nobody actually aborts them.  They
leak.

That's why _capture_free needs to call xfs_defer_pending_abort so that
we call ->abort_intent to free the new intents instead of leaking them.

Ok.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


>  		return error;
>  	}
>  
> diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
> index 114a3a4930a3..8788ad5f6a73 100644
> --- a/fs/xfs/libxfs/xfs_defer.h
> +++ b/fs/xfs/libxfs/xfs_defer.h
> @@ -121,7 +121,7 @@ int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp,
>  		struct list_head *capture_list);
>  void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp,
>  		struct xfs_defer_resources *dres);
> -void xfs_defer_ops_capture_free(struct xfs_mount *mp,
> +void xfs_defer_ops_capture_abort(struct xfs_mount *mp,
>  		struct xfs_defer_capture *d);
>  void xfs_defer_resources_rele(struct xfs_defer_resources *dres);
>  
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 82c81d20459d..fdaa0ffe029b 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2511,7 +2511,7 @@ xlog_abort_defer_ops(
>  
>  	list_for_each_entry_safe(dfc, next, capture_list, dfc_list) {
>  		list_del_init(&dfc->dfc_list);
> -		xfs_defer_ops_capture_free(mp, dfc);
> +		xfs_defer_ops_capture_abort(mp, dfc);
>  	}
>  }
>  
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort
  2023-07-15  6:36 ` [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort Long Li
@ 2023-07-19  2:08   ` Darrick J. Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2023-07-19  2:08 UTC (permalink / raw)
  To: Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun

On Sat, Jul 15, 2023 at 02:36:45PM +0800, Long Li wrote:
> Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(), which
> not use transaction parameter, so it can be used after the transaction
> life cycle.
> 
> Signed-off-by: Long Li <leo.lilong@huawei.com>

Pretty straightforward slicing and dicing, so
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/libxfs/xfs_defer.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index bcfb6a4203cd..88388e12f8e7 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -245,21 +245,18 @@ xfs_defer_create_intents(
>  	return ret;
>  }
>  
> -/* Abort all the intents that were committed. */
>  STATIC void
> -xfs_defer_trans_abort(
> -	struct xfs_trans		*tp,
> -	struct list_head		*dop_pending)
> +xfs_defer_pending_abort(
> +	struct xfs_mount		*mp,
> +	struct list_head		*dop_list)
>  {
>  	struct xfs_defer_pending	*dfp;
>  	const struct xfs_defer_op_type	*ops;
>  
> -	trace_xfs_defer_trans_abort(tp, _RET_IP_);
> -
>  	/* Abort intent items that don't have a done item. */
> -	list_for_each_entry(dfp, dop_pending, dfp_list) {
> +	list_for_each_entry(dfp, dop_list, dfp_list) {
>  		ops = defer_op_types[dfp->dfp_type];
> -		trace_xfs_defer_pending_abort(tp->t_mountp, dfp);
> +		trace_xfs_defer_pending_abort(mp, dfp);
>  		if (dfp->dfp_intent && !dfp->dfp_done) {
>  			ops->abort_intent(dfp->dfp_intent);
>  			dfp->dfp_intent = NULL;
> @@ -267,6 +264,16 @@ xfs_defer_trans_abort(
>  	}
>  }
>  
> +/* Abort all the intents that were committed. */
> +STATIC void
> +xfs_defer_trans_abort(
> +	struct xfs_trans		*tp,
> +	struct list_head		*dop_pending)
> +{
> +	trace_xfs_defer_trans_abort(tp, _RET_IP_);
> +	xfs_defer_pending_abort(tp->t_mountp, dop_pending);
> +}
> +
>  /*
>   * Capture resources that the caller said not to release ("held") when the
>   * transaction commits.  Caller is responsible for zero-initializing @dres.
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-15  6:36 ` [PATCH v2 3/3] xfs: make sure done item committed before cancel intents Long Li
@ 2023-07-19  2:19   ` Darrick J. Wong
  2023-07-22  2:16     ` Long Li
  2023-07-19  6:41   ` Dave Chinner
  1 sibling, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2023-07-19  2:19 UTC (permalink / raw)
  To: Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun

On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> KASAN report a uaf when recover intents fails:
> 
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
>  Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
>  CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
>  Workqueue: xfs-cil/sda xlog_cil_push_work
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x50/0x70
>   print_report+0xc2/0x600
>   kasan_report+0xb6/0xe0
>   xfs_cui_release+0xb7/0xc0
>   xfs_cud_item_release+0x3c/0x90
>   xfs_trans_committed_bulk+0x2d5/0x7f0
>   xlog_cil_committed+0xaba/0xf20
>   xlog_cil_push_work+0x1a60/0x2360
>   process_one_work+0x78e/0x1140
>   worker_thread+0x58b/0xf60
>   kthread+0x2cd/0x3c0
>   ret_from_fork+0x1f/0x30
>   </TASK>
> 
>  Allocated by task 531:
>   kasan_save_stack+0x22/0x40
>   kasan_set_track+0x25/0x30
>   __kasan_slab_alloc+0x55/0x60
>   kmem_cache_alloc+0x195/0x5f0
>   xfs_cui_init+0x198/0x1d0
>   xlog_recover_cui_commit_pass2+0x133/0x5f0
>   xlog_recover_items_pass2+0x107/0x230
>   xlog_recover_commit_trans+0x3e7/0x9c0
>   xlog_recovery_process_trans+0x140/0x1d0
>   xlog_recover_process_ophdr+0x1a0/0x3d0
>   xlog_recover_process_data+0x108/0x2d0
>   xlog_recover_process+0x1f6/0x280
>   xlog_do_recovery_pass+0x609/0xdb0
>   xlog_do_log_recovery+0x84/0xe0
>   xlog_do_recover+0x7d/0x470
>   xlog_recover+0x25f/0x490
>   xfs_log_mount+0x2dd/0x6f0
>   xfs_mountfs+0x11ce/0x1e70
>   xfs_fs_fill_super+0x10ec/0x1b20
>   get_tree_bdev+0x3c8/0x730
>   vfs_get_tree+0x89/0x2c0
>   path_mount+0xecf/0x1800
>   do_mount+0xf3/0x110
>   __x64_sys_mount+0x154/0x1f0
>   do_syscall_64+0x39/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
>  Freed by task 531:
>   kasan_save_stack+0x22/0x40
>   kasan_set_track+0x25/0x30
>   kasan_save_free_info+0x2b/0x40
>   __kasan_slab_free+0x114/0x1b0
>   kmem_cache_free+0xf8/0x510
>   xfs_cui_item_free+0x95/0xb0
>   xfs_cui_release+0x86/0xc0
>   xlog_recover_cancel_intents.isra.0+0xf8/0x210
>   xlog_recover_finish+0x7e7/0x980
>   xfs_log_mount_finish+0x2bb/0x4a0
>   xfs_mountfs+0x14bf/0x1e70
>   xfs_fs_fill_super+0x10ec/0x1b20
>   get_tree_bdev+0x3c8/0x730
>   vfs_get_tree+0x89/0x2c0
>   path_mount+0xecf/0x1800
>   do_mount+0xf3/0x110
>   __x64_sys_mount+0x154/0x1f0
>   do_syscall_64+0x39/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
>  The buggy address belongs to the object at ffff888012575dc8
>   which belongs to the cache xfs_cui_item of size 432
>  The buggy address is located 152 bytes inside of
>   freed 432-byte region [ffff888012575dc8, ffff888012575f78)
> 
>  The buggy address belongs to the physical page:
>  page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
>  head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
>  flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
>  page_type: 0xffffffff()
>  raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
>  raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
>  page dumped because: kasan: bad access detected
> 
>  Memory state around the buggy address:
>   ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
>   ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
>  >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                         ^
>   ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
>  ==================================================================
> 
> If process intents fails, intent items left in AIL will be delete
> from AIL and freed in error handling, even intent items that have been
> recovered and created done items. After this, uaf will be triggered when
> done item commited, because at this point the released intent item will
> be accessed.
> 
> xlog_recover_finish                     xlog_cil_push_work
> ----------------------------            ---------------------------
> xlog_recover_process_intents
>   xfs_cui_item_recover//cui_refcount == 1
>     xfs_trans_get_cud
>     xfs_trans_commit
>       <add cud item to cil>
>   xfs_cui_item_recover
>     <error occurred and return>
> xlog_recover_cancel_intents
>   xfs_cui_release     //cui_refcount == 0
>     xfs_cui_item_free //free cui
>   <release other intent items>
> xlog_force_shutdown   //shutdown
>                                <...>
>                                         <push items in cil>
>                                         xlog_cil_committed
>                                           xfs_cud_item_release
>                                             xfs_cui_release // UAF
> 
> Fix it by move log force forward to make sure done items committed before
> cancel intents.
> 
> Fixes: 2e76f188fd90 ("xfs: cancel intents immediately if process_intents fails")
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
>  fs/xfs/xfs_log_recover.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index fdaa0ffe029b..c37031e64db5 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3444,6 +3444,13 @@ xlog_recover_finish(
>  	int	error;
>  
>  	error = xlog_recover_process_intents(log);
> +	/*
> +	 * Sync the log to get all the intents that have done item out of

"...that have intent done items out of the AIL."

> +	 * the AIL.  This isn't absolutely necessary, but it helps in case
> +	 * the unlink transactions would have problems pushing the intents
> +	 * out of the way.
> +	 */
> +	xfs_log_force(log->l_mp, XFS_LOG_SYNC);

Hmm.  Why doesn't the shutdown force the (now dead) dead log items out
of memory?

	/*
	 * Flush all the completed transactions to disk before marking the log
	 * being shut down. We need to do this first as shutting down the log
	 * before the force will prevent the log force from flushing the iclogs
	 * to disk.
	 *
	 * When we are in recovery, there are no transactions to flush, and
	 * we don't want to touch the log because we don't want to perturb the
	 * current head/tail for future recovery attempts. Hence we need to
	 * avoid a log force in this case.
	 *
	 * If we are shutting down due to a log IO error, then we must avoid
	 * trying to write the log as that may just result in more IO errors and
	 * an endless shutdown/force loop.
	 */
	if (!log_error && !xlog_in_recovery(log))
		xfs_log_force(log->l_mp, XFS_LOG_SYNC);

Oh.  We're in recovery, but we've passed the part where we've finished
with the log/inode/dquot items that we read from the disk.  IOWs, now
we've gotten to the part of recovery where we're actually standing up
new transactions to finish the work that was in the intents ... with new
incore intents.

I wonder, does the problem go away if you (hackishly)
clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate); before the
xlog_force_shutdown?

--D

>  	if (error) {
>  		/*
>  		 * Cancel all the unprocessed intent items now so that we don't
> @@ -3458,13 +3465,6 @@ xlog_recover_finish(
>  		return error;
>  	}
>  
> -	/*
> -	 * Sync the log to get all the intents out of the AIL.  This isn't
> -	 * absolutely necessary, but it helps in case the unlink transactions
> -	 * would have problems pushing the intents out of the way.
> -	 */
> -	xfs_log_force(log->l_mp, XFS_LOG_SYNC);
> -
>  	/*
>  	 * Now that we've recovered the log and all the intents, we can clear
>  	 * the log incompat feature bits in the superblock because there's no
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-15  6:36 ` [PATCH v2 3/3] xfs: make sure done item committed before cancel intents Long Li
  2023-07-19  2:19   ` Darrick J. Wong
@ 2023-07-19  6:41   ` Dave Chinner
  2023-07-22  1:19     ` Long Li
  1 sibling, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2023-07-19  6:41 UTC (permalink / raw)
  To: Long Li; +Cc: djwong, linux-xfs, yi.zhang, houtao1, yangerkun

On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> KASAN report a uaf when recover intents fails:
....
> 
> If process intents fails, intent items left in AIL will be delete
> from AIL and freed in error handling, even intent items that have been
> recovered and created done items. After this, uaf will be triggered when
> done item commited, because at this point the released intent item will
> be accessed.
> 
> xlog_recover_finish                     xlog_cil_push_work
> ----------------------------            ---------------------------
> xlog_recover_process_intents
>   xfs_cui_item_recover//cui_refcount == 1
>     xfs_trans_get_cud
>     xfs_trans_commit
>       <add cud item to cil>
>   xfs_cui_item_recover
>     <error occurred and return>
> xlog_recover_cancel_intents
>   xfs_cui_release     //cui_refcount == 0
>     xfs_cui_item_free //free cui
>   <release other intent items>
> xlog_force_shutdown   //shutdown
>                                <...>
>                                         <push items in cil>
>                                         xlog_cil_committed
>                                           xfs_cud_item_release
>                                             xfs_cui_release // UAF

Huh. The log stores items in the AIL without holding a reference to
them, then on shutdown takes the intent done reference away because
it assumes the intent has not been processed as it is still in the
AIL.

Ok, that's broken.

> Fix it by move log force forward to make sure done items committed before
> cancel intents.

That doesn't fix the fact we have a reference counted object that is
being accessed by code that doesn't actually own a reference to the
object.  Intent log items are created with a reference count of 2 -
one for the creator, and one for the intent done object.

Look at xlog_recover_cui_commit_pass2():

        /*
         * Insert the intent into the AIL directly and drop one reference so
         * that finishing or canceling the work will drop the other.
         */
        xfs_trans_ail_insert(log->l_ailp, &cuip->cui_item, lsn);
        xfs_cui_release(cuip);
        return 0;
}

Log recovery explicitly drops the creator reference after it is
inserted into the AIL, but it then processes the log item as if it
also owns the intent-done reference. The moment we call
->iop_recover(), the intent-done reference should be owned by the
log item.

The recovery of the BUI, RUI and EFI all do the same thing. I
suspect that these references should actually be held by log
recovery until it is done processing the item, at which point it
should be removed from the AIL by xlog_recover_process_intents().

The code in ->iop_recovery should assume that it passes the
reference to the done intent, but if that code fails before creating
the done-intent then it needs to release the intent reference
itself.

That way when we go to cancel the intent, the only intents we find
in the AIL are the ones we know have not been processed yet and
hence we can safely drop both the creator and the intent done
reference from xlog_recover_cancel_intents().

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-19  6:41   ` Dave Chinner
@ 2023-07-22  1:19     ` Long Li
  2023-07-24  1:05       ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Long Li @ 2023-07-22  1:19 UTC (permalink / raw)
  To: Dave Chinner, Long Li; +Cc: djwong, linux-xfs, yi.zhang, houtao1, yangerkun

On Wed, Jul 19, 2023 at 04:41:41PM +1000, Dave Chinner wrote:
> On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> > KASAN report a uaf when recover intents fails:
> ....
> > 
> > If process intents fails, intent items left in AIL will be delete
> > from AIL and freed in error handling, even intent items that have been
> > recovered and created done items. After this, uaf will be triggered when
> > done item commited, because at this point the released intent item will
> > be accessed.
> > 
> > xlog_recover_finish                     xlog_cil_push_work
> > ----------------------------            ---------------------------
> > xlog_recover_process_intents
> >   xfs_cui_item_recover//cui_refcount == 1
> >     xfs_trans_get_cud
> >     xfs_trans_commit
> >       <add cud item to cil>
> >   xfs_cui_item_recover
> >     <error occurred and return>
> > xlog_recover_cancel_intents
> >   xfs_cui_release     //cui_refcount == 0
> >     xfs_cui_item_free //free cui
> >   <release other intent items>
> > xlog_force_shutdown   //shutdown
> >                                <...>
> >                                         <push items in cil>
> >                                         xlog_cil_committed
> >                                           xfs_cud_item_release
> >                                             xfs_cui_release // UAF
> 
> Huh. The log stores items in the AIL without holding a reference to
> them, then on shutdown takes the intent done reference away because
> it assumes the intent has not been processed as it is still in the
> AIL.
> 
> Ok, that's broken.
> 
> > Fix it by move log force forward to make sure done items committed before
> > cancel intents.
> 
> That doesn't fix the fact we have a reference counted object that is
> being accessed by code that doesn't actually own a reference to the
> object.  Intent log items are created with a reference count of 2 -
> one for the creator, and one for the intent done object.
> 
> Look at xlog_recover_cui_commit_pass2():
> 
>         /*
>          * Insert the intent into the AIL directly and drop one reference so
>          * that finishing or canceling the work will drop the other.
>          */
>         xfs_trans_ail_insert(log->l_ailp, &cuip->cui_item, lsn);
>         xfs_cui_release(cuip);
>         return 0;
> }
> 
> Log recovery explicitly drops the creator reference after it is
> inserted into the AIL, but it then processes the log item as if it
> also owns the intent-done reference. The moment we call
> ->iop_recover(), the intent-done reference should be owned by the
> log item.

Hi, Dave

Thanks for the reply. Yes, your analysis seems reasonable, it helped me a
lot to understand the intent lifecycle.

> 
> The recovery of the BUI, RUI and EFI all do the same thing. I
> suspect that these references should actually be held by log
> recovery until it is done processing the item, at which point it
> should be removed from the AIL by xlog_recover_process_intents().

Why do we need to remove the intent from the AIL at this point, shouldn't
it be removed from the AIL when the done intent is committed? Or is there
any way to ensure that the intents are removed from the AIL when they are
processed.

> 
> The code in ->iop_recovery should assume that it passes the
> reference to the done intent, but if that code fails before creating
> the done-intent then it needs to release the intent reference
> itself.
> 
> That way when we go to cancel the intent, the only intents we find
> in the AIL are the ones we know have not been processed yet and
> hence we can safely drop both the creator and the intent done
> reference from xlog_recover_cancel_intents().

Yes, if the processed intent has been removed from the AIL, then only the
unprocessed intent remains in the current AIL, and it is safe to cancel
the intent.

Best regards
Long Li
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-19  2:19   ` Darrick J. Wong
@ 2023-07-22  2:16     ` Long Li
  0 siblings, 0 replies; 12+ messages in thread
From: Long Li @ 2023-07-22  2:16 UTC (permalink / raw)
  To: Darrick J. Wong, Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun

On Tue, Jul 18, 2023 at 07:19:14PM -0700, Darrick J. Wong wrote:
> On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> > KASAN report a uaf when recover intents fails:
> > 
> >  ==================================================================
> >  BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
> >  Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
> >  CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
> >  Workqueue: xfs-cil/sda xlog_cil_push_work
> >  Call Trace:
> >   <TASK>
> >   dump_stack_lvl+0x50/0x70
> >   print_report+0xc2/0x600
> >   kasan_report+0xb6/0xe0
> >   xfs_cui_release+0xb7/0xc0
> >   xfs_cud_item_release+0x3c/0x90
> >   xfs_trans_committed_bulk+0x2d5/0x7f0
> >   xlog_cil_committed+0xaba/0xf20
> >   xlog_cil_push_work+0x1a60/0x2360
> >   process_one_work+0x78e/0x1140
> >   worker_thread+0x58b/0xf60
> >   kthread+0x2cd/0x3c0
> >   ret_from_fork+0x1f/0x30
> >   </TASK>
> > 
> >  Allocated by task 531:
> >   kasan_save_stack+0x22/0x40
> >   kasan_set_track+0x25/0x30
> >   __kasan_slab_alloc+0x55/0x60
> >   kmem_cache_alloc+0x195/0x5f0
> >   xfs_cui_init+0x198/0x1d0
> >   xlog_recover_cui_commit_pass2+0x133/0x5f0
> >   xlog_recover_items_pass2+0x107/0x230
> >   xlog_recover_commit_trans+0x3e7/0x9c0
> >   xlog_recovery_process_trans+0x140/0x1d0
> >   xlog_recover_process_ophdr+0x1a0/0x3d0
> >   xlog_recover_process_data+0x108/0x2d0
> >   xlog_recover_process+0x1f6/0x280
> >   xlog_do_recovery_pass+0x609/0xdb0
> >   xlog_do_log_recovery+0x84/0xe0
> >   xlog_do_recover+0x7d/0x470
> >   xlog_recover+0x25f/0x490
> >   xfs_log_mount+0x2dd/0x6f0
> >   xfs_mountfs+0x11ce/0x1e70
> >   xfs_fs_fill_super+0x10ec/0x1b20
> >   get_tree_bdev+0x3c8/0x730
> >   vfs_get_tree+0x89/0x2c0
> >   path_mount+0xecf/0x1800
> >   do_mount+0xf3/0x110
> >   __x64_sys_mount+0x154/0x1f0
> >   do_syscall_64+0x39/0x80
> >   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > 
> >  Freed by task 531:
> >   kasan_save_stack+0x22/0x40
> >   kasan_set_track+0x25/0x30
> >   kasan_save_free_info+0x2b/0x40
> >   __kasan_slab_free+0x114/0x1b0
> >   kmem_cache_free+0xf8/0x510
> >   xfs_cui_item_free+0x95/0xb0
> >   xfs_cui_release+0x86/0xc0
> >   xlog_recover_cancel_intents.isra.0+0xf8/0x210
> >   xlog_recover_finish+0x7e7/0x980
> >   xfs_log_mount_finish+0x2bb/0x4a0
> >   xfs_mountfs+0x14bf/0x1e70
> >   xfs_fs_fill_super+0x10ec/0x1b20
> >   get_tree_bdev+0x3c8/0x730
> >   vfs_get_tree+0x89/0x2c0
> >   path_mount+0xecf/0x1800
> >   do_mount+0xf3/0x110
> >   __x64_sys_mount+0x154/0x1f0
> >   do_syscall_64+0x39/0x80
> >   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > 
> >  The buggy address belongs to the object at ffff888012575dc8
> >   which belongs to the cache xfs_cui_item of size 432
> >  The buggy address is located 152 bytes inside of
> >   freed 432-byte region [ffff888012575dc8, ffff888012575f78)
> > 
> >  The buggy address belongs to the physical page:
> >  page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
> >  head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> >  flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
> >  page_type: 0xffffffff()
> >  raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
> >  raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
> >  page dumped because: kasan: bad access detected
> > 
> >  Memory state around the buggy address:
> >   ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
> >   ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
> >  >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >                                                         ^
> >   ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >   ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
> >  ==================================================================
> > 
> > If process intents fails, intent items left in AIL will be delete
> > from AIL and freed in error handling, even intent items that have been
> > recovered and created done items. After this, uaf will be triggered when
> > done item commited, because at this point the released intent item will
> > be accessed.
> > 
> > xlog_recover_finish                     xlog_cil_push_work
> > ----------------------------            ---------------------------
> > xlog_recover_process_intents
> >   xfs_cui_item_recover//cui_refcount == 1
> >     xfs_trans_get_cud
> >     xfs_trans_commit
> >       <add cud item to cil>
> >   xfs_cui_item_recover
> >     <error occurred and return>
> > xlog_recover_cancel_intents
> >   xfs_cui_release     //cui_refcount == 0
> >     xfs_cui_item_free //free cui
> >   <release other intent items>
> > xlog_force_shutdown   //shutdown
> >                                <...>
> >                                         <push items in cil>
> >                                         xlog_cil_committed
> >                                           xfs_cud_item_release
> >                                             xfs_cui_release // UAF
> > 
> > Fix it by move log force forward to make sure done items committed before
> > cancel intents.
> > 
> > Fixes: 2e76f188fd90 ("xfs: cancel intents immediately if process_intents fails")
> > Signed-off-by: Long Li <leo.lilong@huawei.com>
> > ---
> >  fs/xfs/xfs_log_recover.c | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index fdaa0ffe029b..c37031e64db5 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -3444,6 +3444,13 @@ xlog_recover_finish(
> >  	int	error;
> >  
> >  	error = xlog_recover_process_intents(log);
> > +	/*
> > +	 * Sync the log to get all the intents that have done item out of
> 
> "...that have intent done items out of the AIL."
> 
> > +	 * the AIL.  This isn't absolutely necessary, but it helps in case
> > +	 * the unlink transactions would have problems pushing the intents
> > +	 * out of the way.
> > +	 */
> > +	xfs_log_force(log->l_mp, XFS_LOG_SYNC);
> 
> Hmm.  Why doesn't the shutdown force the (now dead) dead log items out
> of memory?
> 
> 	/*
> 	 * Flush all the completed transactions to disk before marking the log
> 	 * being shut down. We need to do this first as shutting down the log
> 	 * before the force will prevent the log force from flushing the iclogs
> 	 * to disk.
> 	 *
> 	 * When we are in recovery, there are no transactions to flush, and
> 	 * we don't want to touch the log because we don't want to perturb the
> 	 * current head/tail for future recovery attempts. Hence we need to
> 	 * avoid a log force in this case.
> 	 *
> 	 * If we are shutting down due to a log IO error, then we must avoid
> 	 * trying to write the log as that may just result in more IO errors and
> 	 * an endless shutdown/force loop.
> 	 */
> 	if (!log_error && !xlog_in_recovery(log))
> 		xfs_log_force(log->l_mp, XFS_LOG_SYNC);
> 
> Oh.  We're in recovery, but we've passed the part where we've finished
> with the log/inode/dquot items that we read from the disk.  IOWs, now
> we've gotten to the part of recovery where we're actually standing up
> new transactions to finish the work that was in the intents ... with new
> incore intents.
> 
> I wonder, does the problem go away if you (hackishly)
> clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate); before the
> xlog_force_shutdown?

I used the patch below and tested it to confirm that the uaf problem
goes away.

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 82c81d20459d..5d9f84ce2b9b 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3452,9 +3452,10 @@ xlog_recover_finish(
 		 * (inode reclaim does this) before we get around to
 		 * xfs_log_mount_cancel.
 		 */
+		clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate);
+		xlog_force_shutdown(log, SHUTDOWN_CORRUPT_INCORE);
 		xlog_recover_cancel_intents(log);
 		xfs_alert(log->l_mp, "Failed to recover intents");
-		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
 		return error;
 	}
 
I'm wondering if a shutdown with SHUTDOWN_CORRUPT_INCORE is appropriate,
even though there is no log io error at this point.

Best regards
Long Li

> 
> --D
> 
> >  	if (error) {
> >  		/*
> >  		 * Cancel all the unprocessed intent items now so that we don't
> > @@ -3458,13 +3465,6 @@ xlog_recover_finish(
> >  		return error;
> >  	}
> >  
> > -	/*
> > -	 * Sync the log to get all the intents out of the AIL.  This isn't
> > -	 * absolutely necessary, but it helps in case the unlink transactions
> > -	 * would have problems pushing the intents out of the way.
> > -	 */
> > -	xfs_log_force(log->l_mp, XFS_LOG_SYNC);
> > -
> >  	/*
> >  	 * Now that we've recovered the log and all the intents, we can clear
> >  	 * the log incompat feature bits in the superblock because there's no
> > -- 
> > 2.31.1
> > 


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-22  1:19     ` Long Li
@ 2023-07-24  1:05       ` Dave Chinner
  2023-07-25 12:25         ` Long Li
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2023-07-24  1:05 UTC (permalink / raw)
  To: Long Li; +Cc: Long Li, djwong, linux-xfs, yi.zhang, houtao1, yangerkun

On Sat, Jul 22, 2023 at 09:19:09AM +0800, Long Li wrote:
> On Wed, Jul 19, 2023 at 04:41:41PM +1000, Dave Chinner wrote:
> > On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> > > KASAN report a uaf when recover intents fails:
> > ....
> > > 
> > > If process intents fails, intent items left in AIL will be delete
> > > from AIL and freed in error handling, even intent items that have been
> > > recovered and created done items. After this, uaf will be triggered when
> > > done item commited, because at this point the released intent item will
> > > be accessed.
> > > 
> > > xlog_recover_finish                     xlog_cil_push_work
> > > ----------------------------            ---------------------------
> > > xlog_recover_process_intents
> > >   xfs_cui_item_recover//cui_refcount == 1
> > >     xfs_trans_get_cud
> > >     xfs_trans_commit
> > >       <add cud item to cil>
> > >   xfs_cui_item_recover
> > >     <error occurred and return>
> > > xlog_recover_cancel_intents
> > >   xfs_cui_release     //cui_refcount == 0
> > >     xfs_cui_item_free //free cui
> > >   <release other intent items>
> > > xlog_force_shutdown   //shutdown
> > >                                <...>
> > >                                         <push items in cil>
> > >                                         xlog_cil_committed
> > >                                           xfs_cud_item_release
> > >                                             xfs_cui_release // UAF
> > 
> > Huh. The log stores items in the AIL without holding a reference to
> > them, then on shutdown takes the intent done reference away because
> > it assumes the intent has not been processed as it is still in the
> > AIL.
> > 
> > Ok, that's broken.
> > 
> > > Fix it by move log force forward to make sure done items committed before
> > > cancel intents.
> > 
> > That doesn't fix the fact we have a reference counted object that is
> > being accessed by code that doesn't actually own a reference to the
> > object.  Intent log items are created with a reference count of 2 -
> > one for the creator, and one for the intent done object.
> > 
> > Look at xlog_recover_cui_commit_pass2():
> > 
> >         /*
> >          * Insert the intent into the AIL directly and drop one reference so
> >          * that finishing or canceling the work will drop the other.
> >          */
> >         xfs_trans_ail_insert(log->l_ailp, &cuip->cui_item, lsn);
> >         xfs_cui_release(cuip);
> >         return 0;
> > }
> > 
> > Log recovery explicitly drops the creator reference after it is
> > inserted into the AIL, but it then processes the log item as if it
> > also owns the intent-done reference. The moment we call
> > ->iop_recover(), the intent-done reference should be owned by the
> > log item.
> 
> Hi, Dave
> 
> Thanks for the reply. Yes, your analysis seems reasonable, it helped me a
> lot to understand the intent lifecycle.
> 
> > 
> > The recovery of the BUI, RUI and EFI all do the same thing. I
> > suspect that these references should actually be held by log
> > recovery until it is done processing the item, at which point it
> > should be removed from the AIL by xlog_recover_process_intents().
> 
> Why do we need to remove the intent from the AIL at this point,

Because we've processed the recovery of it - it is either completely
done or we have a new intent in the CIL ready to continue operation.
Either way, the next write to the journal will remove the item from
the AIL when it completes.

Intents don't need to be in the AIL, though - we can cancel them
in memory (see the intent whiteout code) and so when we process the
done item from journal IO completion the last reference goes away
and they won't be in the AIL at this point in time.

IOWs, the intent freeing code doesn't care if the intent is in the
AIL or not, it does the right thing either way.

Hence if we remove the intent from the list of intents that need to
be recovered after we have done the initial recovery, we acheive two
things:

1. the tail of the log can be moved forward with the commit of the
done intent or new intent to continue the operation, and

2. We avoid the problem of trying to determine how many reference
counts we need to drop from intent recovery cancelling because we
never come across intents we've actually attempted recovery on.

> shouldn't
> it be removed from the AIL when the done intent is committed? Or is there
> any way to ensure that the intents are removed from the AIL when they are
> processed.

THe reference counting ensures the right thing is done when the last
reference goes away. If it is in the AIL, it will get removed, if it
is not in the AIL, then AIL removal is a no-op and nothign bad
happens.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] xfs: make sure done item committed before cancel intents
  2023-07-24  1:05       ` Dave Chinner
@ 2023-07-25 12:25         ` Long Li
  0 siblings, 0 replies; 12+ messages in thread
From: Long Li @ 2023-07-25 12:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Long Li, djwong, linux-xfs, yi.zhang, houtao1, yangerkun

On Mon, Jul 24, 2023 at 11:05:48AM +1000, Dave Chinner wrote:
> On Sat, Jul 22, 2023 at 09:19:09AM +0800, Long Li wrote:
> > On Wed, Jul 19, 2023 at 04:41:41PM +1000, Dave Chinner wrote:
> > > On Sat, Jul 15, 2023 at 02:36:47PM +0800, Long Li wrote:
> > > > KASAN report a uaf when recover intents fails:
> > > ....
> > > > 
> > > > If process intents fails, intent items left in AIL will be delete
> > > > from AIL and freed in error handling, even intent items that have been
> > > > recovered and created done items. After this, uaf will be triggered when
> > > > done item commited, because at this point the released intent item will
> > > > be accessed.
> > > > 
> > > > xlog_recover_finish                     xlog_cil_push_work
> > > > ----------------------------            ---------------------------
> > > > xlog_recover_process_intents
> > > >   xfs_cui_item_recover//cui_refcount == 1
> > > >     xfs_trans_get_cud
> > > >     xfs_trans_commit
> > > >       <add cud item to cil>
> > > >   xfs_cui_item_recover
> > > >     <error occurred and return>
> > > > xlog_recover_cancel_intents
> > > >   xfs_cui_release     //cui_refcount == 0
> > > >     xfs_cui_item_free //free cui
> > > >   <release other intent items>
> > > > xlog_force_shutdown   //shutdown
> > > >                                <...>
> > > >                                         <push items in cil>
> > > >                                         xlog_cil_committed
> > > >                                           xfs_cud_item_release
> > > >                                             xfs_cui_release // UAF
> > > 
> > > Huh. The log stores items in the AIL without holding a reference to
> > > them, then on shutdown takes the intent done reference away because
> > > it assumes the intent has not been processed as it is still in the
> > > AIL.
> > > 
> > > Ok, that's broken.
> > > 
> > > > Fix it by move log force forward to make sure done items committed before
> > > > cancel intents.
> > > 
> > > That doesn't fix the fact we have a reference counted object that is
> > > being accessed by code that doesn't actually own a reference to the
> > > object.  Intent log items are created with a reference count of 2 -
> > > one for the creator, and one for the intent done object.
> > > 
> > > Look at xlog_recover_cui_commit_pass2():
> > > 
> > >         /*
> > >          * Insert the intent into the AIL directly and drop one reference so
> > >          * that finishing or canceling the work will drop the other.
> > >          */
> > >         xfs_trans_ail_insert(log->l_ailp, &cuip->cui_item, lsn);
> > >         xfs_cui_release(cuip);
> > >         return 0;
> > > }
> > > 
> > > Log recovery explicitly drops the creator reference after it is
> > > inserted into the AIL, but it then processes the log item as if it
> > > also owns the intent-done reference. The moment we call
> > > ->iop_recover(), the intent-done reference should be owned by the
> > > log item.
> > 
> > Hi, Dave
> > 
> > Thanks for the reply. Yes, your analysis seems reasonable, it helped me a
> > lot to understand the intent lifecycle.
> > 
> > > 
> > > The recovery of the BUI, RUI and EFI all do the same thing. I
> > > suspect that these references should actually be held by log
> > > recovery until it is done processing the item, at which point it
> > > should be removed from the AIL by xlog_recover_process_intents().
> > 
> > Why do we need to remove the intent from the AIL at this point,
> 
> Because we've processed the recovery of it - it is either completely
> done or we have a new intent in the CIL ready to continue operation.
> Either way, the next write to the journal will remove the item from
> the AIL when it completes.
> 
> Intents don't need to be in the AIL, though - we can cancel them
> in memory (see the intent whiteout code) and so when we process the
> done item from journal IO completion the last reference goes away
> and they won't be in the AIL at this point in time.
> 
> IOWs, the intent freeing code doesn't care if the intent is in the
> AIL or not, it does the right thing either way.
> 
> Hence if we remove the intent from the list of intents that need to
> be recovered after we have done the initial recovery, we acheive two
> things:
> 
> 1. the tail of the log can be moved forward with the commit of the
> done intent or new intent to continue the operation, and
> 
> 2. We avoid the problem of trying to determine how many reference
> counts we need to drop from intent recovery cancelling because we
> never come across intents we've actually attempted recovery on.
> 
> > shouldn't
> > it be removed from the AIL when the done intent is committed? Or is there
> > any way to ensure that the intents are removed from the AIL when they are
> > processed.
> 
> THe reference counting ensures the right thing is done when the last
> reference goes away. If it is in the AIL, it will get removed, if it
> is not in the AIL, then AIL removal is a no-op and nothign bad
> happens.

Thank you for your detailed answer, it was great, I understand
what you mean.

The intent item stays in the AIL until the done item is committed,
which prevents the tail lsn from moving forward. After the transaction
is alloced in ->iop_recover(), we don't need to stop tail lsn from
moving forward because there is no other thread committing the
transaction at this point, the intent item will not be overwritten
until the done intent is written to the log, so we can just remove
the intent item from the AIL.

I think we can remove the intent item from the AIL as soon as the
done item is created, since the reference count has already been
passed to the done item. Eventually the problem can be fixed.

This is my understanding.

Thanks
Long Li


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-07-25 12:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-15  6:36 [PATCH v2 0/3] xfs: fix two problem when recovery intents fails Long Li
2023-07-15  6:36 ` [PATCH v2 1/3] xfs: factor out xfs_defer_pending_abort Long Li
2023-07-19  2:08   ` Darrick J. Wong
2023-07-15  6:36 ` [PATCH v2 2/3] xfs: abort intent items when recovery intents fail Long Li
2023-07-19  2:07   ` Darrick J. Wong
2023-07-15  6:36 ` [PATCH v2 3/3] xfs: make sure done item committed before cancel intents Long Li
2023-07-19  2:19   ` Darrick J. Wong
2023-07-22  2:16     ` Long Li
2023-07-19  6:41   ` Dave Chinner
2023-07-22  1:19     ` Long Li
2023-07-24  1:05       ` Dave Chinner
2023-07-25 12:25         ` Long Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).