From: Long Li <leo.lilong@huawei.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: <linux-xfs@vger.kernel.org>, <chandanbabu@kernel.org>, <hch@lst.de>
Subject: Re: [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state
Date: Thu, 30 Nov 2023 20:06:28 +0800 [thread overview]
Message-ID: <20231130120628.GA1599481@ceph-admin> (raw)
In-Reply-To: <170120318847.13206.17051442307252477333.stgit@frogsfrogsfrogs>
On Tue, Nov 28, 2023 at 12:26:28PM -0800, Darrick J. Wong wrote:
> Hi all,
>
> Long Li reported a KASAN report from a UAF when intent recovery fails:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
> Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
> CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
> Workqueue: xfs-cil/sda xlog_cil_push_work
> Call Trace:
> <TASK>
> dump_stack_lvl+0x50/0x70
> print_report+0xc2/0x600
> kasan_report+0xb6/0xe0
> xfs_cui_release+0xb7/0xc0
> xfs_cud_item_release+0x3c/0x90
> xfs_trans_committed_bulk+0x2d5/0x7f0
> xlog_cil_committed+0xaba/0xf20
> xlog_cil_push_work+0x1a60/0x2360
> process_one_work+0x78e/0x1140
> worker_thread+0x58b/0xf60
> kthread+0x2cd/0x3c0
> ret_from_fork+0x1f/0x30
> </TASK>
>
> Allocated by task 531:
> kasan_save_stack+0x22/0x40
> kasan_set_track+0x25/0x30
> __kasan_slab_alloc+0x55/0x60
> kmem_cache_alloc+0x195/0x5f0
> xfs_cui_init+0x198/0x1d0
> xlog_recover_cui_commit_pass2+0x133/0x5f0
> xlog_recover_items_pass2+0x107/0x230
> xlog_recover_commit_trans+0x3e7/0x9c0
> xlog_recovery_process_trans+0x140/0x1d0
> xlog_recover_process_ophdr+0x1a0/0x3d0
> xlog_recover_process_data+0x108/0x2d0
> xlog_recover_process+0x1f6/0x280
> xlog_do_recovery_pass+0x609/0xdb0
> xlog_do_log_recovery+0x84/0xe0
> xlog_do_recover+0x7d/0x470
> xlog_recover+0x25f/0x490
> xfs_log_mount+0x2dd/0x6f0
> xfs_mountfs+0x11ce/0x1e70
> xfs_fs_fill_super+0x10ec/0x1b20
> get_tree_bdev+0x3c8/0x730
> vfs_get_tree+0x89/0x2c0
> path_mount+0xecf/0x1800
> do_mount+0xf3/0x110
> __x64_sys_mount+0x154/0x1f0
> do_syscall_64+0x39/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Freed by task 531:
> kasan_save_stack+0x22/0x40
> kasan_set_track+0x25/0x30
> kasan_save_free_info+0x2b/0x40
> __kasan_slab_free+0x114/0x1b0
> kmem_cache_free+0xf8/0x510
> xfs_cui_item_free+0x95/0xb0
> xfs_cui_release+0x86/0xc0
> xlog_recover_cancel_intents.isra.0+0xf8/0x210
> xlog_recover_finish+0x7e7/0x980
> xfs_log_mount_finish+0x2bb/0x4a0
> xfs_mountfs+0x14bf/0x1e70
> xfs_fs_fill_super+0x10ec/0x1b20
> get_tree_bdev+0x3c8/0x730
> vfs_get_tree+0x89/0x2c0
> path_mount+0xecf/0x1800
> do_mount+0xf3/0x110
> __x64_sys_mount+0x154/0x1f0
> do_syscall_64+0x39/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> The buggy address belongs to the object at ffff888012575dc8
> which belongs to the cache xfs_cui_item of size 432
> The buggy address is located 152 bytes inside of
> freed 432-byte region [ffff888012575dc8, ffff888012575f78)
>
> The buggy address belongs to the physical page:
> page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
> head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
> page_type: 0xffffffff()
> raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
> raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
> ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
> >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
> ==================================================================
>
> "If process intents fails, intent items left in AIL will be delete
> from AIL and freed in error handling, even intent items that have been
> recovered and created done items. After this, uaf will be triggered when
> done item committed, because at this point the released intent item will
> be accessed.
>
> xlog_recover_finish xlog_cil_push_work
> ---------------------------- ---------------------------
> xlog_recover_process_intents
> xfs_cui_item_recover//cui_refcount == 1
> xfs_trans_get_cud
> xfs_trans_commit
> <add cud item to cil>
> xfs_cui_item_recover
> <error occurred and return>
> xlog_recover_cancel_intents
> xfs_cui_release //cui_refcount == 0
> xfs_cui_item_free //free cui
> <release other intent items>
> xlog_force_shutdown //shutdown
> <...>
> <push items in cil>
> xlog_cil_committed
> xfs_cud_item_release
> xfs_cui_release // UAF
>
> "Intent log items are created with a reference count of 2, one for the
> creator, and one for the intent done object. Log recovery explicitly
> drops the creator reference after it is inserted into the AIL, but it
> then processes the log item as if it also owns the intent-done reference.
>
> "The code in ->iop_recovery should assume that it passes the reference
> to the done intent, we can remove the intent item from the AIL after
> creating the done-intent, but if that code fails before creating the
> done-intent then it needs to release the intent reference by log recovery
> itself.
>
> "That way when we go to cancel the intent, the only intents we find in
> the AIL are the ones we know have not been processed yet and hence we
> can safely drop both the creator and the intent done reference from
> xlog_recover_cancel_intents().
>
> "Hence if we remove the intent from the list of intents that need to
> be recovered after we have done the initial recovery, we acheive two
> things:
>
> "1. the tail of the log can be moved forward with the commit of the
> done intent or new intent to continue the operation, and
>
> "2. We avoid the problem of trying to determine how many reference
> counts we need to drop from intent recovery cancelling because we
> never come across intents we've actually attempted recovery on."
>
> Restated: The cause of the UAF is that xlog_recover_cancel_intents
> thinks that it owns the refcount on any intent item in the AIL, and that
> it's always safe to release these intent items. This is not true after
> the recovery function creates an log intent done item and points it at
> the log intent item because releasing the done item always releases the
> intent item.
>
> The runtime defer ops code avoids all this by tracking both the log
> intent and the intent done items, and releasing only the intent done
> item if both have been created. Long Li proposed fixing this by adding
> state flags, but I have a more comprehensive fix.
>
> First, observe that the latter half of the intent _recover functions are
> nearly open-coded versions of the corresponding _finish_one function
> that uses an onstack deferred work item to single-step through the item.
>
> Second, notice that the recover function is not an exact match because
> of the odd behavior that unfinished recovered work items are relogged
> with separate log intent items instead of a single new log intent item,
> which is what the defer ops machinery does.
>
> Dave and I have long suspected that recovery should be reconstructing
> the defer work state from what's in the recovered intent item. Now we
> finally have an excuse to refactor the code to do that.
>
> This series starts by fixing a resource leak in LARP recovery. We fix
> the bug that Long Li reported by switching the intent recovery code to
> construct chains of xfs_defer_pending objects and then using the defer
> pending objects to track the intent/done item ownership. Finally, we
> clean up the code to reconstruct the exact incore state, which means we
> can remove all the opencoded _recover code, which makes maintaining log
> items much easier.
>
Thanks for fixing this UAF issue, it really is a much more comprehensive fix,
and makes the intent item recovery code much easier to maintain.
Best Regards
Long Li
prev parent reply other threads:[~2023-11-30 12:02 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-28 20:26 [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state Darrick J. Wong
2023-11-28 20:26 ` [PATCH 1/7] xfs: don't leak recovered attri intent items Darrick J. Wong
2023-11-30 7:34 ` Christoph Hellwig
2023-11-30 17:02 ` Darrick J. Wong
2023-12-04 4:50 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 2/7] xfs: use xfs_defer_pending objects to recover " Darrick J. Wong
2023-11-30 7:53 ` Christoph Hellwig
2023-11-30 17:14 ` Darrick J. Wong
2023-12-04 4:50 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 3/7] xfs: pass the xfs_defer_pending object to iop_recover Darrick J. Wong
2023-11-30 7:53 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 4/7] xfs: transfer recovered intent item ownership in ->iop_recover Darrick J. Wong
2023-11-30 7:55 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 5/7] xfs: recreate work items when recovering intent items Darrick J. Wong
2023-11-30 8:06 ` Christoph Hellwig
2023-11-30 17:36 ` Darrick J. Wong
2023-11-30 13:13 ` Long Li
2023-11-30 17:19 ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 6/7] xfs: use xfs_defer_finish_one to finish recovered work items Darrick J. Wong
2023-11-30 8:10 ` Christoph Hellwig
2023-11-30 17:59 ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 7/7] xfs: move ->iop_recover to xfs_defer_op_type Darrick J. Wong
2023-11-30 8:11 ` Christoph Hellwig
2023-11-30 12:06 ` Long Li [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231130120628.GA1599481@ceph-admin \
--to=leo.lilong@huawei.com \
--cc=chandanbabu@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox