From: Long Li <leo.lilong@huawei.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: <linux-xfs@vger.kernel.org>, <chandanbabu@kernel.org>, <hch@lst.de>
Subject: Re: [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state
Date: Thu, 30 Nov 2023 20:06:28 +0800 [thread overview]
Message-ID: <20231130120628.GA1599481@ceph-admin> (raw)
In-Reply-To: <170120318847.13206.17051442307252477333.stgit@frogsfrogsfrogs>
On Tue, Nov 28, 2023 at 12:26:28PM -0800, Darrick J. Wong wrote:
> Hi all,
>
> Long Li reported a KASAN report from a UAF when intent recovery fails:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
> Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
> CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
> Workqueue: xfs-cil/sda xlog_cil_push_work
> Call Trace:
> <TASK>
> dump_stack_lvl+0x50/0x70
> print_report+0xc2/0x600
> kasan_report+0xb6/0xe0
> xfs_cui_release+0xb7/0xc0
> xfs_cud_item_release+0x3c/0x90
> xfs_trans_committed_bulk+0x2d5/0x7f0
> xlog_cil_committed+0xaba/0xf20
> xlog_cil_push_work+0x1a60/0x2360
> process_one_work+0x78e/0x1140
> worker_thread+0x58b/0xf60
> kthread+0x2cd/0x3c0
> ret_from_fork+0x1f/0x30
> </TASK>
>
> Allocated by task 531:
> kasan_save_stack+0x22/0x40
> kasan_set_track+0x25/0x30
> __kasan_slab_alloc+0x55/0x60
> kmem_cache_alloc+0x195/0x5f0
> xfs_cui_init+0x198/0x1d0
> xlog_recover_cui_commit_pass2+0x133/0x5f0
> xlog_recover_items_pass2+0x107/0x230
> xlog_recover_commit_trans+0x3e7/0x9c0
> xlog_recovery_process_trans+0x140/0x1d0
> xlog_recover_process_ophdr+0x1a0/0x3d0
> xlog_recover_process_data+0x108/0x2d0
> xlog_recover_process+0x1f6/0x280
> xlog_do_recovery_pass+0x609/0xdb0
> xlog_do_log_recovery+0x84/0xe0
> xlog_do_recover+0x7d/0x470
> xlog_recover+0x25f/0x490
> xfs_log_mount+0x2dd/0x6f0
> xfs_mountfs+0x11ce/0x1e70
> xfs_fs_fill_super+0x10ec/0x1b20
> get_tree_bdev+0x3c8/0x730
> vfs_get_tree+0x89/0x2c0
> path_mount+0xecf/0x1800
> do_mount+0xf3/0x110
> __x64_sys_mount+0x154/0x1f0
> do_syscall_64+0x39/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Freed by task 531:
> kasan_save_stack+0x22/0x40
> kasan_set_track+0x25/0x30
> kasan_save_free_info+0x2b/0x40
> __kasan_slab_free+0x114/0x1b0
> kmem_cache_free+0xf8/0x510
> xfs_cui_item_free+0x95/0xb0
> xfs_cui_release+0x86/0xc0
> xlog_recover_cancel_intents.isra.0+0xf8/0x210
> xlog_recover_finish+0x7e7/0x980
> xfs_log_mount_finish+0x2bb/0x4a0
> xfs_mountfs+0x14bf/0x1e70
> xfs_fs_fill_super+0x10ec/0x1b20
> get_tree_bdev+0x3c8/0x730
> vfs_get_tree+0x89/0x2c0
> path_mount+0xecf/0x1800
> do_mount+0xf3/0x110
> __x64_sys_mount+0x154/0x1f0
> do_syscall_64+0x39/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> The buggy address belongs to the object at ffff888012575dc8
> which belongs to the cache xfs_cui_item of size 432
> The buggy address is located 152 bytes inside of
> freed 432-byte region [ffff888012575dc8, ffff888012575f78)
>
> The buggy address belongs to the physical page:
> page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
> head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
> page_type: 0xffffffff()
> raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
> raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
> ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
> >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
> ==================================================================
>
> "If process intents fails, intent items left in AIL will be delete
> from AIL and freed in error handling, even intent items that have been
> recovered and created done items. After this, uaf will be triggered when
> done item committed, because at this point the released intent item will
> be accessed.
>
> xlog_recover_finish xlog_cil_push_work
> ---------------------------- ---------------------------
> xlog_recover_process_intents
> xfs_cui_item_recover//cui_refcount == 1
> xfs_trans_get_cud
> xfs_trans_commit
> <add cud item to cil>
> xfs_cui_item_recover
> <error occurred and return>
> xlog_recover_cancel_intents
> xfs_cui_release //cui_refcount == 0
> xfs_cui_item_free //free cui
> <release other intent items>
> xlog_force_shutdown //shutdown
> <...>
> <push items in cil>
> xlog_cil_committed
> xfs_cud_item_release
> xfs_cui_release // UAF
>
> "Intent log items are created with a reference count of 2, one for the
> creator, and one for the intent done object. Log recovery explicitly
> drops the creator reference after it is inserted into the AIL, but it
> then processes the log item as if it also owns the intent-done reference.
>
> "The code in ->iop_recovery should assume that it passes the reference
> to the done intent, we can remove the intent item from the AIL after
> creating the done-intent, but if that code fails before creating the
> done-intent then it needs to release the intent reference by log recovery
> itself.
>
> "That way when we go to cancel the intent, the only intents we find in
> the AIL are the ones we know have not been processed yet and hence we
> can safely drop both the creator and the intent done reference from
> xlog_recover_cancel_intents().
>
> "Hence if we remove the intent from the list of intents that need to
> be recovered after we have done the initial recovery, we acheive two
> things:
>
> "1. the tail of the log can be moved forward with the commit of the
> done intent or new intent to continue the operation, and
>
> "2. We avoid the problem of trying to determine how many reference
> counts we need to drop from intent recovery cancelling because we
> never come across intents we've actually attempted recovery on."
>
> Restated: The cause of the UAF is that xlog_recover_cancel_intents
> thinks that it owns the refcount on any intent item in the AIL, and that
> it's always safe to release these intent items. This is not true after
> the recovery function creates an log intent done item and points it at
> the log intent item because releasing the done item always releases the
> intent item.
>
> The runtime defer ops code avoids all this by tracking both the log
> intent and the intent done items, and releasing only the intent done
> item if both have been created. Long Li proposed fixing this by adding
> state flags, but I have a more comprehensive fix.
>
> First, observe that the latter half of the intent _recover functions are
> nearly open-coded versions of the corresponding _finish_one function
> that uses an onstack deferred work item to single-step through the item.
>
> Second, notice that the recover function is not an exact match because
> of the odd behavior that unfinished recovered work items are relogged
> with separate log intent items instead of a single new log intent item,
> which is what the defer ops machinery does.
>
> Dave and I have long suspected that recovery should be reconstructing
> the defer work state from what's in the recovered intent item. Now we
> finally have an excuse to refactor the code to do that.
>
> This series starts by fixing a resource leak in LARP recovery. We fix
> the bug that Long Li reported by switching the intent recovery code to
> construct chains of xfs_defer_pending objects and then using the defer
> pending objects to track the intent/done item ownership. Finally, we
> clean up the code to reconstruct the exact incore state, which means we
> can remove all the opencoded _recover code, which makes maintaining log
> items much easier.
>
Thanks for fixing this UAF issue, it really is a much more comprehensive fix,
and makes the intent item recovery code much easier to maintain.
Best Regards
Long Li
prev parent reply other threads:[~2023-11-30 12:02 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-28 20:26 [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state Darrick J. Wong
2023-11-28 20:26 ` [PATCH 1/7] xfs: don't leak recovered attri intent items Darrick J. Wong
2023-11-30 7:34 ` Christoph Hellwig
2023-11-30 17:02 ` Darrick J. Wong
2023-12-04 4:50 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 2/7] xfs: use xfs_defer_pending objects to recover " Darrick J. Wong
2023-11-30 7:53 ` Christoph Hellwig
2023-11-30 17:14 ` Darrick J. Wong
2023-12-04 4:50 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 3/7] xfs: pass the xfs_defer_pending object to iop_recover Darrick J. Wong
2023-11-30 7:53 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 4/7] xfs: transfer recovered intent item ownership in ->iop_recover Darrick J. Wong
2023-11-30 7:55 ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 5/7] xfs: recreate work items when recovering intent items Darrick J. Wong
2023-11-30 8:06 ` Christoph Hellwig
2023-11-30 17:36 ` Darrick J. Wong
2023-11-30 13:13 ` Long Li
2023-11-30 17:19 ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 6/7] xfs: use xfs_defer_finish_one to finish recovered work items Darrick J. Wong
2023-11-30 8:10 ` Christoph Hellwig
2023-11-30 17:59 ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 7/7] xfs: move ->iop_recover to xfs_defer_op_type Darrick J. Wong
2023-11-30 8:11 ` Christoph Hellwig
2023-11-30 12:06 ` Long Li [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231130120628.GA1599481@ceph-admin \
--to=leo.lilong@huawei.com \
--cc=chandanbabu@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.