* [PATCH 0/3] xfs: fix two problem when recovery intents fails
@ 2023-06-29 13:17 Long Li
2023-06-29 13:17 ` [PATCH 1/3] xfs: factor out xfs_defer_pending_abort Long Li
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Long Li @ 2023-06-29 13:17 UTC (permalink / raw)
To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun
This patch set fix two problem when recovery intents fails.
Patches 1-2 fix the possible problem that intent items not released.
When recovery intents, new intents items may be created during recovery
intents. if recovery fails, new intents items may be left in AIL or
leaks.
Patch 3 fix a uaf problem, when recovery intents fails, intent items
may be freed before done item commited.
Long Li (3):
xfs: factor out xfs_defer_pending_abort
xfs: abort intent items when recovery intents fail
xfs: make sure done item committed before cancel intents
fs/xfs/libxfs/xfs_defer.c | 26 +++++++++++++++++---------
fs/xfs/libxfs/xfs_defer.h | 1 +
fs/xfs/xfs_log_recover.c | 15 ++++++++-------
3 files changed, 26 insertions(+), 16 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/3] xfs: factor out xfs_defer_pending_abort 2023-06-29 13:17 [PATCH 0/3] xfs: fix two problem when recovery intents fails Long Li @ 2023-06-29 13:17 ` Long Li 2023-06-29 15:10 ` Darrick J. Wong 2023-06-29 13:17 ` [PATCH 2/3] xfs: abort intent items when recovery intents fail Long Li 2023-06-29 13:17 ` [PATCH 3/3] xfs: make sure done item committed before cancel intents Long Li 2 siblings, 1 reply; 7+ messages in thread From: Long Li @ 2023-06-29 13:17 UTC (permalink / raw) To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(), which not use transaction parameter, so it can be used after the transaction life cycle. Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/libxfs/xfs_defer.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index bcfb6a4203cd..7ec6812fa625 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -245,21 +245,18 @@ xfs_defer_create_intents( return ret; } -/* Abort all the intents that were committed. */ -STATIC void -xfs_defer_trans_abort( - struct xfs_trans *tp, - struct list_head *dop_pending) +void +xfs_defer_pending_abort( + struct xfs_mount *mp, + struct list_head *dop_list) { struct xfs_defer_pending *dfp; const struct xfs_defer_op_type *ops; - trace_xfs_defer_trans_abort(tp, _RET_IP_); - /* Abort intent items that don't have a done item. */ - list_for_each_entry(dfp, dop_pending, dfp_list) { + list_for_each_entry(dfp, dop_list, dfp_list) { ops = defer_op_types[dfp->dfp_type]; - trace_xfs_defer_pending_abort(tp->t_mountp, dfp); + trace_xfs_defer_pending_abort(mp, dfp); if (dfp->dfp_intent && !dfp->dfp_done) { ops->abort_intent(dfp->dfp_intent); dfp->dfp_intent = NULL; @@ -267,6 +264,16 @@ xfs_defer_trans_abort( } } +/* Abort all the intents that were committed. */ +STATIC void +xfs_defer_trans_abort( + struct xfs_trans *tp, + struct list_head *dop_pending) +{ + trace_xfs_defer_trans_abort(tp, _RET_IP_); + xfs_defer_pending_abort(tp->t_mountp, dop_pending); +} + /* * Capture resources that the caller said not to release ("held") when the * transaction commits. Caller is responsible for zero-initializing @dres. -- 2.31.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] xfs: factor out xfs_defer_pending_abort 2023-06-29 13:17 ` [PATCH 1/3] xfs: factor out xfs_defer_pending_abort Long Li @ 2023-06-29 15:10 ` Darrick J. Wong 0 siblings, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2023-06-29 15:10 UTC (permalink / raw) To: Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun On Thu, Jun 29, 2023 at 09:17:23PM +0800, Long Li wrote: > Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(), which > not use transaction parameter, so it can be used after the transaction > life cycle. > > Signed-off-by: Long Li <leo.lilong@huawei.com> Looks fine to me, Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > --- > fs/xfs/libxfs/xfs_defer.c | 25 ++++++++++++++++--------- > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c > index bcfb6a4203cd..7ec6812fa625 100644 > --- a/fs/xfs/libxfs/xfs_defer.c > +++ b/fs/xfs/libxfs/xfs_defer.c > @@ -245,21 +245,18 @@ xfs_defer_create_intents( > return ret; > } > > -/* Abort all the intents that were committed. */ > -STATIC void > -xfs_defer_trans_abort( > - struct xfs_trans *tp, > - struct list_head *dop_pending) > +void > +xfs_defer_pending_abort( > + struct xfs_mount *mp, > + struct list_head *dop_list) > { > struct xfs_defer_pending *dfp; > const struct xfs_defer_op_type *ops; > > - trace_xfs_defer_trans_abort(tp, _RET_IP_); > - > /* Abort intent items that don't have a done item. */ > - list_for_each_entry(dfp, dop_pending, dfp_list) { > + list_for_each_entry(dfp, dop_list, dfp_list) { > ops = defer_op_types[dfp->dfp_type]; > - trace_xfs_defer_pending_abort(tp->t_mountp, dfp); > + trace_xfs_defer_pending_abort(mp, dfp); > if (dfp->dfp_intent && !dfp->dfp_done) { > ops->abort_intent(dfp->dfp_intent); > dfp->dfp_intent = NULL; > @@ -267,6 +264,16 @@ xfs_defer_trans_abort( > } > } > > +/* Abort all the intents that were committed. */ > +STATIC void > +xfs_defer_trans_abort( > + struct xfs_trans *tp, > + struct list_head *dop_pending) > +{ > + trace_xfs_defer_trans_abort(tp, _RET_IP_); > + xfs_defer_pending_abort(tp->t_mountp, dop_pending); > +} > + > /* > * Capture resources that the caller said not to release ("held") when the > * transaction commits. Caller is responsible for zero-initializing @dres. > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/3] xfs: abort intent items when recovery intents fail 2023-06-29 13:17 [PATCH 0/3] xfs: fix two problem when recovery intents fails Long Li 2023-06-29 13:17 ` [PATCH 1/3] xfs: factor out xfs_defer_pending_abort Long Li @ 2023-06-29 13:17 ` Long Li 2023-06-29 14:42 ` Darrick J. Wong 2023-06-29 13:17 ` [PATCH 3/3] xfs: make sure done item committed before cancel intents Long Li 2 siblings, 1 reply; 7+ messages in thread From: Long Li @ 2023-06-29 13:17 UTC (permalink / raw) To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun When recovery intents, it may capture some deferred ops and commit the new intent items, if recovery intents fails, there will be no done item drop the reference to the new intent item. New intent items will left in AIL and caused mount thread hung all the time as fllows: [root@localhost ~]# cat /proc/539/stack [<0>] xfs_ail_push_all_sync+0x174/0x230 [<0>] xfs_unmount_flush_inodes+0x8d/0xd0 [<0>] xfs_mountfs+0x15f7/0x1e70 [<0>] xfs_fs_fill_super+0x10ec/0x1b20 [<0>] get_tree_bdev+0x3c8/0x730 [<0>] vfs_get_tree+0x89/0x2c0 [<0>] path_mount+0xecf/0x1800 [<0>] do_mount+0xf3/0x110 [<0>] __x64_sys_mount+0x154/0x1f0 [<0>] do_syscall_64+0x39/0x80 [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd During intent item recover, if transaction that have deferred ops commmit fails in xfs_defer_ops_capture_and_commit(), defer capture would not added to capture list, so matching done items would not commit when finish defer operations, this will cause intent item leaks: unreferenced object 0xffff888016719108 (size 432): comm "mount", pid 529, jiffies 4294706839 (age 144.463s) hex dump (first 32 bytes): 08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff ..q.......q..... 18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff ..q.......q..... backtrace: [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0 [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150 [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340 [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0 [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980 [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70 [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0 [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0 [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70 [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20 [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0 [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0 [<ffffffff81a9f35f>] path_mount+0xecf/0x1800 [<ffffffff81a9fd83>] do_mount+0xf3/0x110 [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0 [<ffffffff83968739>] do_syscall_64+0x39/0x80 Fix the problem above by abort intent items that don't have a done item when recovery intents fail. Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery") Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/libxfs/xfs_defer.c | 1 + fs/xfs/libxfs/xfs_defer.h | 1 + fs/xfs/xfs_log_recover.c | 1 + 3 files changed, 3 insertions(+) diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index 7ec6812fa625..b2b46d142281 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -809,6 +809,7 @@ xfs_defer_ops_capture_and_commit( /* Commit the transaction and add the capture structure to the list. */ error = xfs_trans_commit(tp); if (error) { + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); xfs_defer_ops_capture_free(mp, dfc); return error; } diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index 114a3a4930a3..c3775014f7ab 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -37,6 +37,7 @@ struct xfs_defer_pending { enum xfs_defer_ops_type dfp_type; }; +void xfs_defer_pending_abort(struct xfs_mount *mp, struct list_head *dop_list); void xfs_defer_add(struct xfs_trans *tp, enum xfs_defer_ops_type type, struct list_head *h); int xfs_defer_finish_noroll(struct xfs_trans **tp); diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 82c81d20459d..924beecf07bb 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2511,6 +2511,7 @@ xlog_abort_defer_ops( list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { list_del_init(&dfc->dfc_list); + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); xfs_defer_ops_capture_free(mp, dfc); } } -- 2.31.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] xfs: abort intent items when recovery intents fail 2023-06-29 13:17 ` [PATCH 2/3] xfs: abort intent items when recovery intents fail Long Li @ 2023-06-29 14:42 ` Darrick J. Wong 2023-06-30 1:51 ` Long Li 0 siblings, 1 reply; 7+ messages in thread From: Darrick J. Wong @ 2023-06-29 14:42 UTC (permalink / raw) To: Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun On Thu, Jun 29, 2023 at 09:17:24PM +0800, Long Li wrote: > When recovery intents, it may capture some deferred ops and commit the new > intent items, if recovery intents fails, there will be no done item drop > the reference to the new intent item. New intent items will left in AIL > and caused mount thread hung all the time as fllows: Er... let me try rewriting this a bit: "When recovering intents, we capture newly created intent items as part of committing recovered intent items. If intent recovery fails at a later point, we forget to remove those newly created intent items from the AIL and hang: > > [root@localhost ~]# cat /proc/539/stack > [<0>] xfs_ail_push_all_sync+0x174/0x230 > [<0>] xfs_unmount_flush_inodes+0x8d/0xd0 > [<0>] xfs_mountfs+0x15f7/0x1e70 > [<0>] xfs_fs_fill_super+0x10ec/0x1b20 > [<0>] get_tree_bdev+0x3c8/0x730 > [<0>] vfs_get_tree+0x89/0x2c0 > [<0>] path_mount+0xecf/0x1800 > [<0>] do_mount+0xf3/0x110 > [<0>] __x64_sys_mount+0x154/0x1f0 > [<0>] do_syscall_64+0x39/0x80 > [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd > > During intent item recover, if transaction that have deferred ops commmit > fails in xfs_defer_ops_capture_and_commit(), defer capture would not added > to capture list, so matching done items would not commit when finish defer > operations, this will cause intent item leaks: "Intent recovery hasn't created done items for these newly created intent items, so the capture structure is the sole owner of the captured intent items. We must release them explicitly or else they leak: > unreferenced object 0xffff888016719108 (size 432): > comm "mount", pid 529, jiffies 4294706839 (age 144.463s) > hex dump (first 32 bytes): > 08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff ..q.......q..... > 18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff ..q.......q..... > backtrace: > [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0 > [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150 > [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340 > [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0 > [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980 > [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70 > [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0 > [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0 > [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70 > [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20 > [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0 > [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0 > [<ffffffff81a9f35f>] path_mount+0xecf/0x1800 > [<ffffffff81a9fd83>] do_mount+0xf3/0x110 > [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0 > [<ffffffff83968739>] do_syscall_64+0x39/0x80 > > Fix the problem above by abort intent items that don't have a done item > when recovery intents fail. > > Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery") > Signed-off-by: Long Li <leo.lilong@huawei.com> > --- > fs/xfs/libxfs/xfs_defer.c | 1 + > fs/xfs/libxfs/xfs_defer.h | 1 + > fs/xfs/xfs_log_recover.c | 1 + > 3 files changed, 3 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c > index 7ec6812fa625..b2b46d142281 100644 > --- a/fs/xfs/libxfs/xfs_defer.c > +++ b/fs/xfs/libxfs/xfs_defer.c > @@ -809,6 +809,7 @@ xfs_defer_ops_capture_and_commit( > /* Commit the transaction and add the capture structure to the list. */ > error = xfs_trans_commit(tp); > if (error) { > + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); > xfs_defer_ops_capture_free(mp, dfc); I prefer that we not go extern'ing two functions that mess around with internal state. Could you instead add the xfs_defer_pending_abort call to the start of xfs_defer_ops_capture_free, and rename xfs_defer_ops_capture_free to xfs_defer_ops_capture_abort? Other than those two things, I /think/ this looks correct. Assuming my understanding of the problem is reflected in the tweaks I made to the commit message. ;) --D > return error; > } > diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h > index 114a3a4930a3..c3775014f7ab 100644 > --- a/fs/xfs/libxfs/xfs_defer.h > +++ b/fs/xfs/libxfs/xfs_defer.h > @@ -37,6 +37,7 @@ struct xfs_defer_pending { > enum xfs_defer_ops_type dfp_type; > }; > > +void xfs_defer_pending_abort(struct xfs_mount *mp, struct list_head *dop_list); > void xfs_defer_add(struct xfs_trans *tp, enum xfs_defer_ops_type type, > struct list_head *h); > int xfs_defer_finish_noroll(struct xfs_trans **tp); > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > index 82c81d20459d..924beecf07bb 100644 > --- a/fs/xfs/xfs_log_recover.c > +++ b/fs/xfs/xfs_log_recover.c > @@ -2511,6 +2511,7 @@ xlog_abort_defer_ops( > > list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { > list_del_init(&dfc->dfc_list); > + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); > xfs_defer_ops_capture_free(mp, dfc); > } > } > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] xfs: abort intent items when recovery intents fail 2023-06-29 14:42 ` Darrick J. Wong @ 2023-06-30 1:51 ` Long Li 0 siblings, 0 replies; 7+ messages in thread From: Long Li @ 2023-06-30 1:51 UTC (permalink / raw) To: Darrick J. Wong, Long Li; +Cc: david, linux-xfs, yi.zhang, houtao1, yangerkun On Thu, Jun 29, 2023 at 07:42:36AM -0700, Darrick J. Wong wrote: > On Thu, Jun 29, 2023 at 09:17:24PM +0800, Long Li wrote: > > When recovery intents, it may capture some deferred ops and commit the new > > intent items, if recovery intents fails, there will be no done item drop > > the reference to the new intent item. New intent items will left in AIL > > and caused mount thread hung all the time as fllows: > > Er... let me try rewriting this a bit: > I have tried my best to express it clearly, but there are still some language barriers. Thank you very much for making the commit message clearer. > "When recovering intents, we capture newly created intent items as part > of committing recovered intent items. If intent recovery fails at a > later point, we forget to remove those newly created intent items from > the AIL and hang: > > > > > [root@localhost ~]# cat /proc/539/stack > > [<0>] xfs_ail_push_all_sync+0x174/0x230 > > [<0>] xfs_unmount_flush_inodes+0x8d/0xd0 > > [<0>] xfs_mountfs+0x15f7/0x1e70 > > [<0>] xfs_fs_fill_super+0x10ec/0x1b20 > > [<0>] get_tree_bdev+0x3c8/0x730 > > [<0>] vfs_get_tree+0x89/0x2c0 > > [<0>] path_mount+0xecf/0x1800 > > [<0>] do_mount+0xf3/0x110 > > [<0>] __x64_sys_mount+0x154/0x1f0 > > [<0>] do_syscall_64+0x39/0x80 > > [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd > > > > During intent item recover, if transaction that have deferred ops commmit > > fails in xfs_defer_ops_capture_and_commit(), defer capture would not added > > to capture list, so matching done items would not commit when finish defer > > operations, this will cause intent item leaks: > > "Intent recovery hasn't created done items for these newly created > intent items, so the capture structure is the sole owner of the captured > intent items. We must release them explicitly or else they leak: > > > unreferenced object 0xffff888016719108 (size 432): > > comm "mount", pid 529, jiffies 4294706839 (age 144.463s) > > hex dump (first 32 bytes): > > 08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff ..q.......q..... > > 18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff ..q.......q..... > > backtrace: > > [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0 > > [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150 > > [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340 > > [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0 > > [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980 > > [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70 > > [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0 > > [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0 > > [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70 > > [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20 > > [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0 > > [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0 > > [<ffffffff81a9f35f>] path_mount+0xecf/0x1800 > > [<ffffffff81a9fd83>] do_mount+0xf3/0x110 > > [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0 > > [<ffffffff83968739>] do_syscall_64+0x39/0x80 > > > > Fix the problem above by abort intent items that don't have a done item > > when recovery intents fail. > > > > Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery") > > Signed-off-by: Long Li <leo.lilong@huawei.com> > > --- > > fs/xfs/libxfs/xfs_defer.c | 1 + > > fs/xfs/libxfs/xfs_defer.h | 1 + > > fs/xfs/xfs_log_recover.c | 1 + > > 3 files changed, 3 insertions(+) > > > > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c > > index 7ec6812fa625..b2b46d142281 100644 > > --- a/fs/xfs/libxfs/xfs_defer.c > > +++ b/fs/xfs/libxfs/xfs_defer.c > > @@ -809,6 +809,7 @@ xfs_defer_ops_capture_and_commit( > > /* Commit the transaction and add the capture structure to the list. */ > > error = xfs_trans_commit(tp); > > if (error) { > > + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); > > xfs_defer_ops_capture_free(mp, dfc); > > I prefer that we not go extern'ing two functions that mess around with > internal state. Could you instead add the xfs_defer_pending_abort call > to the start of xfs_defer_ops_capture_free, and rename > xfs_defer_ops_capture_free to xfs_defer_ops_capture_abort? > Thanks for your suggestion, it seems reasonable and clean enough, it will change in the next version. > Other than those two things, I /think/ this looks correct. Assuming my > understanding of the problem is reflected in the tweaks I made to the > commit message. ;) > > --D Thanks for your review again, the commit message you wrote is exactly what I want to describe, so I will send a new version later. ddThanks, Long Li > > > return error; > > } > > diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h > > index 114a3a4930a3..c3775014f7ab 100644 > > --- a/fs/xfs/libxfs/xfs_defer.h > > +++ b/fs/xfs/libxfs/xfs_defer.h > > @@ -37,6 +37,7 @@ struct xfs_defer_pending { > > enum xfs_defer_ops_type dfp_type; > > }; > > > > +void xfs_defer_pending_abort(struct xfs_mount *mp, struct list_head *dop_list); > > void xfs_defer_add(struct xfs_trans *tp, enum xfs_defer_ops_type type, > > struct list_head *h); > > int xfs_defer_finish_noroll(struct xfs_trans **tp); > > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > > index 82c81d20459d..924beecf07bb 100644 > > --- a/fs/xfs/xfs_log_recover.c > > +++ b/fs/xfs/xfs_log_recover.c > > @@ -2511,6 +2511,7 @@ xlog_abort_defer_ops( > > > > list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { > > list_del_init(&dfc->dfc_list); > > + xfs_defer_pending_abort(mp, &dfc->dfc_dfops); > > xfs_defer_ops_capture_free(mp, dfc); > > } > > } > > -- > > 2.31.1 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 3/3] xfs: make sure done item committed before cancel intents 2023-06-29 13:17 [PATCH 0/3] xfs: fix two problem when recovery intents fails Long Li 2023-06-29 13:17 ` [PATCH 1/3] xfs: factor out xfs_defer_pending_abort Long Li 2023-06-29 13:17 ` [PATCH 2/3] xfs: abort intent items when recovery intents fail Long Li @ 2023-06-29 13:17 ` Long Li 2 siblings, 0 replies; 7+ messages in thread From: Long Li @ 2023-06-29 13:17 UTC (permalink / raw) To: djwong, david; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun When recovery intents fails, all intent items left in AIL will be delete from AIL and released in xlog_recover_cancel_intents(). If an intent item that have been recover and log a new done item, it may be freed before done item committed due to intents cancel, if so, uaf will be triggered as fllows when done item committed. Fix it by move log force forward to make sure done items committed before cancel intent items. ================================================================== BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0 Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103 CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166 Workqueue: xfs-cil/sda xlog_cil_push_work Call Trace: <TASK> dump_stack_lvl+0x50/0x70 print_report+0xc2/0x600 kasan_report+0xb6/0xe0 xfs_cui_release+0xb7/0xc0 xfs_cud_item_release+0x3c/0x90 xfs_trans_committed_bulk+0x2d5/0x7f0 xlog_cil_committed+0xaba/0xf20 xlog_cil_push_work+0x1a60/0x2360 process_one_work+0x78e/0x1140 worker_thread+0x58b/0xf60 kthread+0x2cd/0x3c0 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 531: kasan_save_stack+0x22/0x40 kasan_set_track+0x25/0x30 __kasan_slab_alloc+0x55/0x60 kmem_cache_alloc+0x195/0x5f0 xfs_cui_init+0x198/0x1d0 xlog_recover_cui_commit_pass2+0x133/0x5f0 xlog_recover_items_pass2+0x107/0x230 xlog_recover_commit_trans+0x3e7/0x9c0 xlog_recovery_process_trans+0x140/0x1d0 xlog_recover_process_ophdr+0x1a0/0x3d0 xlog_recover_process_data+0x108/0x2d0 xlog_recover_process+0x1f6/0x280 xlog_do_recovery_pass+0x609/0xdb0 xlog_do_log_recovery+0x84/0xe0 xlog_do_recover+0x7d/0x470 xlog_recover+0x25f/0x490 xfs_log_mount+0x2dd/0x6f0 xfs_mountfs+0x11ce/0x1e70 xfs_fs_fill_super+0x10ec/0x1b20 get_tree_bdev+0x3c8/0x730 vfs_get_tree+0x89/0x2c0 path_mount+0xecf/0x1800 do_mount+0xf3/0x110 __x64_sys_mount+0x154/0x1f0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 531: kasan_save_stack+0x22/0x40 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x40 __kasan_slab_free+0x114/0x1b0 kmem_cache_free+0xf8/0x510 xfs_cui_item_free+0x95/0xb0 xfs_cui_release+0x86/0xc0 xlog_recover_cancel_intents.isra.0+0xf8/0x210 xlog_recover_finish+0x7e7/0x980 xfs_log_mount_finish+0x2bb/0x4a0 xfs_mountfs+0x14bf/0x1e70 xfs_fs_fill_super+0x10ec/0x1b20 get_tree_bdev+0x3c8/0x730 vfs_get_tree+0x89/0x2c0 path_mount+0xecf/0x1800 do_mount+0xf3/0x110 __x64_sys_mount+0x154/0x1f0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The buggy address belongs to the object at ffff888012575dc8 which belongs to the cache xfs_cui_item of size 432 The buggy address is located 152 bytes inside of freed 432-byte region [ffff888012575dc8, ffff888012575f78) The buggy address belongs to the physical page: page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574 head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) page_type: 0xffffffff() raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150 raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc ================================================================== Fixes: 2e76f188fd90 ("xfs: cancel intents immediately if process_intents fails") Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/xfs_log_recover.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 924beecf07bb..8e1f32cea42b 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -3445,6 +3445,13 @@ xlog_recover_finish( int error; error = xlog_recover_process_intents(log); + /* + * Sync the log to get all the intents that have done item out of + * the AIL. This isn't absolutely necessary, but it helps in case + * the unlink transactions would have problems pushing the intents + * out of the way. + */ + xfs_log_force(log->l_mp, XFS_LOG_SYNC); if (error) { /* * Cancel all the unprocessed intent items now so that we don't @@ -3459,13 +3466,6 @@ xlog_recover_finish( return error; } - /* - * Sync the log to get all the intents out of the AIL. This isn't - * absolutely necessary, but it helps in case the unlink transactions - * would have problems pushing the intents out of the way. - */ - xfs_log_force(log->l_mp, XFS_LOG_SYNC); - /* * Now that we've recovered the log and all the intents, we can clear * the log incompat feature bits in the superblock because there's no -- 2.31.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-06-30 1:54 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-06-29 13:17 [PATCH 0/3] xfs: fix two problem when recovery intents fails Long Li 2023-06-29 13:17 ` [PATCH 1/3] xfs: factor out xfs_defer_pending_abort Long Li 2023-06-29 15:10 ` Darrick J. Wong 2023-06-29 13:17 ` [PATCH 2/3] xfs: abort intent items when recovery intents fail Long Li 2023-06-29 14:42 ` Darrick J. Wong 2023-06-30 1:51 ` Long Li 2023-06-29 13:17 ` [PATCH 3/3] xfs: make sure done item committed before cancel intents Long Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox