Re: [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Long Li <leo.lilong@huawei.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: <linux-xfs@vger.kernel.org>, <chandanbabu@kernel.org>, <hch@lst.de>
Subject: Re: [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state
Date: Thu, 30 Nov 2023 20:06:28 +0800	[thread overview]
Message-ID: <20231130120628.GA1599481@ceph-admin> (raw)
In-Reply-To: <170120318847.13206.17051442307252477333.stgit@frogsfrogsfrogs>

On Tue, Nov 28, 2023 at 12:26:28PM -0800, Darrick J. Wong wrote:
> Hi all,
> 
> Long Li reported a KASAN report from a UAF when intent recovery fails:
> 
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in xfs_cui_release+0xb7/0xc0
>  Read of size 4 at addr ffff888012575e60 by task kworker/u8:3/103
>  CPU: 3 PID: 103 Comm: kworker/u8:3 Not tainted 6.4.0-rc7-next-20230619-00003-g94543a53f9a4-dirty #166
>  Workqueue: xfs-cil/sda xlog_cil_push_work
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x50/0x70
>   print_report+0xc2/0x600
>   kasan_report+0xb6/0xe0
>   xfs_cui_release+0xb7/0xc0
>   xfs_cud_item_release+0x3c/0x90
>   xfs_trans_committed_bulk+0x2d5/0x7f0
>   xlog_cil_committed+0xaba/0xf20
>   xlog_cil_push_work+0x1a60/0x2360
>   process_one_work+0x78e/0x1140
>   worker_thread+0x58b/0xf60
>   kthread+0x2cd/0x3c0
>   ret_from_fork+0x1f/0x30
>   </TASK>
> 
>  Allocated by task 531:
>   kasan_save_stack+0x22/0x40
>   kasan_set_track+0x25/0x30
>   __kasan_slab_alloc+0x55/0x60
>   kmem_cache_alloc+0x195/0x5f0
>   xfs_cui_init+0x198/0x1d0
>   xlog_recover_cui_commit_pass2+0x133/0x5f0
>   xlog_recover_items_pass2+0x107/0x230
>   xlog_recover_commit_trans+0x3e7/0x9c0
>   xlog_recovery_process_trans+0x140/0x1d0
>   xlog_recover_process_ophdr+0x1a0/0x3d0
>   xlog_recover_process_data+0x108/0x2d0
>   xlog_recover_process+0x1f6/0x280
>   xlog_do_recovery_pass+0x609/0xdb0
>   xlog_do_log_recovery+0x84/0xe0
>   xlog_do_recover+0x7d/0x470
>   xlog_recover+0x25f/0x490
>   xfs_log_mount+0x2dd/0x6f0
>   xfs_mountfs+0x11ce/0x1e70
>   xfs_fs_fill_super+0x10ec/0x1b20
>   get_tree_bdev+0x3c8/0x730
>   vfs_get_tree+0x89/0x2c0
>   path_mount+0xecf/0x1800
>   do_mount+0xf3/0x110
>   __x64_sys_mount+0x154/0x1f0
>   do_syscall_64+0x39/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
>  Freed by task 531:
>   kasan_save_stack+0x22/0x40
>   kasan_set_track+0x25/0x30
>   kasan_save_free_info+0x2b/0x40
>   __kasan_slab_free+0x114/0x1b0
>   kmem_cache_free+0xf8/0x510
>   xfs_cui_item_free+0x95/0xb0
>   xfs_cui_release+0x86/0xc0
>   xlog_recover_cancel_intents.isra.0+0xf8/0x210
>   xlog_recover_finish+0x7e7/0x980
>   xfs_log_mount_finish+0x2bb/0x4a0
>   xfs_mountfs+0x14bf/0x1e70
>   xfs_fs_fill_super+0x10ec/0x1b20
>   get_tree_bdev+0x3c8/0x730
>   vfs_get_tree+0x89/0x2c0
>   path_mount+0xecf/0x1800
>   do_mount+0xf3/0x110
>   __x64_sys_mount+0x154/0x1f0
>   do_syscall_64+0x39/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
>  The buggy address belongs to the object at ffff888012575dc8
>   which belongs to the cache xfs_cui_item of size 432
>  The buggy address is located 152 bytes inside of
>   freed 432-byte region [ffff888012575dc8, ffff888012575f78)
> 
>  The buggy address belongs to the physical page:
>  page:ffffea0000495d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888012576208 pfn:0x12574
>  head:ffffea0000495d00 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
>  flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
>  page_type: 0xffffffff()
>  raw: 001fffff80010200 ffff888012092f40 ffff888014570150 ffff888014570150
>  raw: ffff888012576208 00000000001e0010 00000001ffffffff 0000000000000000
>  page dumped because: kasan: bad access detected
> 
>  Memory state around the buggy address:
>   ffff888012575d00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
>   ffff888012575d80: fc fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
>  >ffff888012575e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                         ^
>   ffff888012575e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ffff888012575f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
>  ==================================================================
> 
> "If process intents fails, intent items left in AIL will be delete
> from AIL and freed in error handling, even intent items that have been
> recovered and created done items. After this, uaf will be triggered when
> done item committed, because at this point the released intent item will
> be accessed.
> 
> xlog_recover_finish                     xlog_cil_push_work
> ----------------------------            ---------------------------
> xlog_recover_process_intents
>   xfs_cui_item_recover//cui_refcount == 1
>     xfs_trans_get_cud
>     xfs_trans_commit
>       <add cud item to cil>
>   xfs_cui_item_recover
>     <error occurred and return>
> xlog_recover_cancel_intents
>   xfs_cui_release     //cui_refcount == 0
>     xfs_cui_item_free //free cui
>   <release other intent items>
> xlog_force_shutdown   //shutdown
>                                <...>
>                                         <push items in cil>
>                                         xlog_cil_committed
>                                           xfs_cud_item_release
>                                             xfs_cui_release // UAF
> 
> "Intent log items are created with a reference count of 2, one for the
> creator, and one for the intent done object. Log recovery explicitly
> drops the creator reference after it is inserted into the AIL, but it
> then processes the log item as if it also owns the intent-done reference.
> 
> "The code in ->iop_recovery should assume that it passes the reference
> to the done intent, we can remove the intent item from the AIL after
> creating the done-intent, but if that code fails before creating the
> done-intent then it needs to release the intent reference by log recovery
> itself.
> 
> "That way when we go to cancel the intent, the only intents we find in
> the AIL are the ones we know have not been processed yet and hence we
> can safely drop both the creator and the intent done reference from
> xlog_recover_cancel_intents().
> 
> "Hence if we remove the intent from the list of intents that need to
> be recovered after we have done the initial recovery, we acheive two
> things:
> 
> "1. the tail of the log can be moved forward with the commit of the
> done intent or new intent to continue the operation, and
> 
> "2. We avoid the problem of trying to determine how many reference
> counts we need to drop from intent recovery cancelling because we
> never come across intents we've actually attempted recovery on."
> 
> Restated: The cause of the UAF is that xlog_recover_cancel_intents
> thinks that it owns the refcount on any intent item in the AIL, and that
> it's always safe to release these intent items.  This is not true after
> the recovery function creates an log intent done item and points it at
> the log intent item because releasing the done item always releases the
> intent item.
> 
> The runtime defer ops code avoids all this by tracking both the log
> intent and the intent done items, and releasing only the intent done
> item if both have been created.  Long Li proposed fixing this by adding
> state flags, but I have a more comprehensive fix.
> 
> First, observe that the latter half of the intent _recover functions are
> nearly open-coded versions of the corresponding _finish_one function
> that uses an onstack deferred work item to single-step through the item.
> 
> Second, notice that the recover function is not an exact match because
> of the odd behavior that unfinished recovered work items are relogged
> with separate log intent items instead of a single new log intent item,
> which is what the defer ops machinery does.
> 
> Dave and I have long suspected that recovery should be reconstructing
> the defer work state from what's in the recovered intent item.  Now we
> finally have an excuse to refactor the code to do that.
> 
> This series starts by fixing a resource leak in LARP recovery.  We fix
> the bug that Long Li reported by switching the intent recovery code to
> construct chains of xfs_defer_pending objects and then using the defer
> pending objects to track the intent/done item ownership.  Finally, we
> clean up the code to reconstruct the exact incore state, which means we
> can remove all the opencoded _recover code, which makes maintaining log
> items much easier.
> 

Thanks for fixing this UAF issue, it really is a much more comprehensive fix,
and makes the intent item recovery code much easier to maintain.

Best Regards
Long Li

     prev parent reply	other threads:[~2023-11-30 12:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-28 20:26 [PATCHSET RFC 0/7] xfs: log intent item recovery should reconstruct defer work state Darrick J. Wong
2023-11-28 20:26 ` [PATCH 1/7] xfs: don't leak recovered attri intent items Darrick J. Wong
2023-11-30  7:34   ` Christoph Hellwig
2023-11-30 17:02     ` Darrick J. Wong
2023-12-04  4:50       ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 2/7] xfs: use xfs_defer_pending objects to recover " Darrick J. Wong
2023-11-30  7:53   ` Christoph Hellwig
2023-11-30 17:14     ` Darrick J. Wong
2023-12-04  4:50       ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 3/7] xfs: pass the xfs_defer_pending object to iop_recover Darrick J. Wong
2023-11-30  7:53   ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 4/7] xfs: transfer recovered intent item ownership in ->iop_recover Darrick J. Wong
2023-11-30  7:55   ` Christoph Hellwig
2023-11-28 20:26 ` [PATCH 5/7] xfs: recreate work items when recovering intent items Darrick J. Wong
2023-11-30  8:06   ` Christoph Hellwig
2023-11-30 17:36     ` Darrick J. Wong
2023-11-30 13:13   ` Long Li
2023-11-30 17:19     ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 6/7] xfs: use xfs_defer_finish_one to finish recovered work items Darrick J. Wong
2023-11-30  8:10   ` Christoph Hellwig
2023-11-30 17:59     ` Darrick J. Wong
2023-11-28 20:27 ` [PATCH 7/7] xfs: move ->iop_recover to xfs_defer_op_type Darrick J. Wong
2023-11-30  8:11   ` Christoph Hellwig
2023-11-30 12:06 ` Long Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231130120628.GA1599481@ceph-admin \
    --to=leo.lilong@huawei.com \
    --cc=chandanbabu@kernel.org \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox