From: Waqar Hameed <waqar.hameed@axis.com>
To: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Richard Weinberger <richard@nod.at>,
Sascha Hauer <s.hauer@pengutronix.de>, <kernel@axis.com>,
<linux-mtd@lists.infradead.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit
Date: Tue, 15 Oct 2024 20:52:37 +0200 [thread overview]
Message-ID: <pnd7ca9r0pt.fsf@axis.com> (raw)
In-Reply-To: <5173d3d2-4a6b-8b0b-c8f7-8034c9763532@huawei.com> (Zhihao Cheng's message of "Sat, 12 Oct 2024 20:30:17 +0800")
On Sat, Oct 12, 2024 at 20:30 +0800 Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> 在 2024/10/9 22:46, Waqar Hameed 写道:
>> Running
>> rm -f /etc/test-file.bin
>> dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync
>> in a loop, with `CONFIG_UBIFS_FS_AUTHENTICATION`, KASAN reports:
>> BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
>> Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153
>> Call trace:
>> dump_backtrace+0x0/0x340
>> show_stack+0x18/0x24
>> dump_stack_lvl+0x9c/0xbc
>> print_address_description.constprop.0+0x74/0x2b0
>> kasan_report+0x1d8/0x1f0
>> kasan_check_range+0xf8/0x1a0
>> memcpy+0x84/0xf4
>> ubifs_tnc_end_commit+0xa5c/0x1950
>> do_commit+0x4e0/0x1340
>> ubifs_bg_thread+0x234/0x2e0
>> kthread+0x36c/0x410
>> ret_from_fork+0x10/0x20
>> Allocated by task 401:
>> kasan_save_stack+0x38/0x70
>> __kasan_kmalloc+0x8c/0xd0
>> __kmalloc+0x34c/0x5bc
>> tnc_insert+0x140/0x16a4
>> ubifs_tnc_add+0x370/0x52c
>> ubifs_jnl_write_data+0x5d8/0x870
>> do_writepage+0x36c/0x510
>> ubifs_writepage+0x190/0x4dc
>> __writepage+0x58/0x154
>> write_cache_pages+0x394/0x830
>> do_writepages+0x1f0/0x5b0
>> filemap_fdatawrite_wbc+0x170/0x25c
>> file_write_and_wait_range+0x140/0x190
>> ubifs_fsync+0xe8/0x290
>> vfs_fsync_range+0xc0/0x1e4
>> do_fsync+0x40/0x90
>> __arm64_sys_fsync+0x34/0x50
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>> Freed by task 403:
>> kasan_save_stack+0x38/0x70
>> kasan_set_track+0x28/0x40
>> kasan_set_free_info+0x28/0x4c
>> __kasan_slab_free+0xd4/0x13c
>> kfree+0xc4/0x3a0
>> tnc_delete+0x3f4/0xe40
>> ubifs_tnc_remove_range+0x368/0x73c
>> ubifs_tnc_remove_ino+0x29c/0x2e0
>> ubifs_jnl_delete_inode+0x150/0x260
>> ubifs_evict_inode+0x1d4/0x2e4
>> evict+0x1c8/0x450
>> iput+0x2a0/0x3c4
>> do_unlinkat+0x2cc/0x490
>> __arm64_sys_unlinkat+0x90/0x100
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>> The offending `memcpy` is in `ubifs_copy_hash()`. Fix this by checking
>> if the `znode` is obsolete before accessing the hash (just like we do
>> for `znode->parent`).
>
> Do you mean that the UAF occurs in following path:
> do_commit -> ubifs_tnc_end_commit -> write_index:
> while (1) {
> ...
> znode = cnext;
> ...
> if (znode->cparent)
> ubifs_copy_hash(c, hash, znode->cparent->zbranch[znode->ciip].hash); //
> znode->cparent has been freed!
> }
Yes, that's what KASAN reports. It's the `memcpy()` in
`ubifs_copy_hash()` that triggers the slab-use-after-free.
>
> If so, according to the current implementation(lastest linux kernel is v6.12), I
> cannot understand that how the znode->cparent is freed before write_index()
> finished, it looks impossible.
> All dirty znodes are gathered by function get_znodes_to_commit() which is
> protected by c->tnc_mutex, and the 'cparent' member in all dirty znodes is
> assigned with non-NULL. Then I think the znode memory freeing path
> 'tnc_delete->kfree(znode)' cannot happen, because:
> 1) If a znode is dirtied, all its' ancestor znodes(a path from znode to root
> znode) must be dirtied, which is guaranteed by UBIFS. See
> dirty_cow_bottom_up/lookup_level0_dirty.
> 2) A dirty znode waiting for commit cannot be freed before write_index()
> finished, which is guaranteed by tnc_delete:
> if (znode->cnext) {
> __set_bit(OBSOLETE_ZNODE, &znode->flags);
> ...
> } else {
> kfree(znode);
> }
I'm with you here. Initially I thought there was some lock missing
(since it is showing signs of a race, e.g. not deterministic). But as
you point out, it is protected with `tnc_mutex`, hence my "RFC" tag on
this patch.
>> Fixes: 16a26b20d2af ("ubifs: authentication: Add hashes to index nodes")
>> Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
>> ---
>> I'm not entirely sure if this is the _correct_ way to fix this. However,
>> testing shows that the problem indeed disappears.
>> My understanding is that the `znode` could concurrently be deleted (with
>> a reference in an unprotected code section without `tnc_mutex`). The
>> assumption is that in this case it would be sufficient to check
>> `ubifs_zn_obsolete(znode)`, like as in the if-statement for
>> `znode->parent` just below.
>
> I'm analyzing tnc-related code these days, however I can't find places that may
> concurrently operate the same znode. And I cannot reproduce the problem with
> your reproducer:
> while true; do
> rm -f /UBIFS_MNT/test-file.bin
> dd if=/dev/urandom of=/UBIFS_MNT/test-file.bin bs=1M count=60 conv=fsync
> done
For completeness, here are the _exact_ steps that I have used to
reproduce this on my system with v6.12-rc2 (commit 75b607fab38d "Merge
tag 'sched_ext-for-6.12-rc2-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext"):
```
ubiattach -m 2
keyctl add logon dummy_key: dummy_load @us
ubimkvol /dev/ubi0 -s 80MiB -n 0 -N test-vol
ubiupdatevol /dev/ubi0_0 -t
mount -t ubifs /dev/ubi0_0 /mnt/flash -o auth_hash_name=sha256,auth_key=dummy_key:
count=0
while true; do
date
count=$(($count + 1))
echo count=$count
rm -f /mnt/flash/test-file.bin
dd if=/dev/urandom of=/mnt/flash/test-file.bin bs=1M count=60 conv=fsync
echo ""
done
```
Note that you need to have `CONFIG_UBIFS_FS_AUTHENTICATION=y` (and
`CONFIG_KASAN=y` obviously) in your `.config` in order to trigger the
offending `memcpy()` in `ubifs_copy_hash()`. Also, it takes a while. For
example, last time it took me 88 iterations of the above loop before it
triggered. So you might need to let it spin for a while.
>
> Can you dig more deeper by adding more debug message, so that we can figure out
> what is really happening.
Certainly! I could try to enable the debug prints from UBIFS, however
they are *a lot*. Moreover, printing that much changes the timing
behavior and might make it harder to trigger the use-after-free. Do you
have any tips on where we should try to focus the debug prints (a
dynamic debug filter).
[...]
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
WARNING: multiple messages have this Message-ID (diff)
From: Waqar Hameed <waqar.hameed@axis.com>
To: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Richard Weinberger <richard@nod.at>,
Sascha Hauer <s.hauer@pengutronix.de>, <kernel@axis.com>,
<linux-mtd@lists.infradead.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit
Date: Tue, 15 Oct 2024 20:52:37 +0200 [thread overview]
Message-ID: <pnd7ca9r0pt.fsf@axis.com> (raw)
In-Reply-To: <5173d3d2-4a6b-8b0b-c8f7-8034c9763532@huawei.com> (Zhihao Cheng's message of "Sat, 12 Oct 2024 20:30:17 +0800")
On Sat, Oct 12, 2024 at 20:30 +0800 Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> 在 2024/10/9 22:46, Waqar Hameed 写道:
>> Running
>> rm -f /etc/test-file.bin
>> dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync
>> in a loop, with `CONFIG_UBIFS_FS_AUTHENTICATION`, KASAN reports:
>> BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
>> Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153
>> Call trace:
>> dump_backtrace+0x0/0x340
>> show_stack+0x18/0x24
>> dump_stack_lvl+0x9c/0xbc
>> print_address_description.constprop.0+0x74/0x2b0
>> kasan_report+0x1d8/0x1f0
>> kasan_check_range+0xf8/0x1a0
>> memcpy+0x84/0xf4
>> ubifs_tnc_end_commit+0xa5c/0x1950
>> do_commit+0x4e0/0x1340
>> ubifs_bg_thread+0x234/0x2e0
>> kthread+0x36c/0x410
>> ret_from_fork+0x10/0x20
>> Allocated by task 401:
>> kasan_save_stack+0x38/0x70
>> __kasan_kmalloc+0x8c/0xd0
>> __kmalloc+0x34c/0x5bc
>> tnc_insert+0x140/0x16a4
>> ubifs_tnc_add+0x370/0x52c
>> ubifs_jnl_write_data+0x5d8/0x870
>> do_writepage+0x36c/0x510
>> ubifs_writepage+0x190/0x4dc
>> __writepage+0x58/0x154
>> write_cache_pages+0x394/0x830
>> do_writepages+0x1f0/0x5b0
>> filemap_fdatawrite_wbc+0x170/0x25c
>> file_write_and_wait_range+0x140/0x190
>> ubifs_fsync+0xe8/0x290
>> vfs_fsync_range+0xc0/0x1e4
>> do_fsync+0x40/0x90
>> __arm64_sys_fsync+0x34/0x50
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>> Freed by task 403:
>> kasan_save_stack+0x38/0x70
>> kasan_set_track+0x28/0x40
>> kasan_set_free_info+0x28/0x4c
>> __kasan_slab_free+0xd4/0x13c
>> kfree+0xc4/0x3a0
>> tnc_delete+0x3f4/0xe40
>> ubifs_tnc_remove_range+0x368/0x73c
>> ubifs_tnc_remove_ino+0x29c/0x2e0
>> ubifs_jnl_delete_inode+0x150/0x260
>> ubifs_evict_inode+0x1d4/0x2e4
>> evict+0x1c8/0x450
>> iput+0x2a0/0x3c4
>> do_unlinkat+0x2cc/0x490
>> __arm64_sys_unlinkat+0x90/0x100
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>> The offending `memcpy` is in `ubifs_copy_hash()`. Fix this by checking
>> if the `znode` is obsolete before accessing the hash (just like we do
>> for `znode->parent`).
>
> Do you mean that the UAF occurs in following path:
> do_commit -> ubifs_tnc_end_commit -> write_index:
> while (1) {
> ...
> znode = cnext;
> ...
> if (znode->cparent)
> ubifs_copy_hash(c, hash, znode->cparent->zbranch[znode->ciip].hash); //
> znode->cparent has been freed!
> }
Yes, that's what KASAN reports. It's the `memcpy()` in
`ubifs_copy_hash()` that triggers the slab-use-after-free.
>
> If so, according to the current implementation(lastest linux kernel is v6.12), I
> cannot understand that how the znode->cparent is freed before write_index()
> finished, it looks impossible.
> All dirty znodes are gathered by function get_znodes_to_commit() which is
> protected by c->tnc_mutex, and the 'cparent' member in all dirty znodes is
> assigned with non-NULL. Then I think the znode memory freeing path
> 'tnc_delete->kfree(znode)' cannot happen, because:
> 1) If a znode is dirtied, all its' ancestor znodes(a path from znode to root
> znode) must be dirtied, which is guaranteed by UBIFS. See
> dirty_cow_bottom_up/lookup_level0_dirty.
> 2) A dirty znode waiting for commit cannot be freed before write_index()
> finished, which is guaranteed by tnc_delete:
> if (znode->cnext) {
> __set_bit(OBSOLETE_ZNODE, &znode->flags);
> ...
> } else {
> kfree(znode);
> }
I'm with you here. Initially I thought there was some lock missing
(since it is showing signs of a race, e.g. not deterministic). But as
you point out, it is protected with `tnc_mutex`, hence my "RFC" tag on
this patch.
>> Fixes: 16a26b20d2af ("ubifs: authentication: Add hashes to index nodes")
>> Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
>> ---
>> I'm not entirely sure if this is the _correct_ way to fix this. However,
>> testing shows that the problem indeed disappears.
>> My understanding is that the `znode` could concurrently be deleted (with
>> a reference in an unprotected code section without `tnc_mutex`). The
>> assumption is that in this case it would be sufficient to check
>> `ubifs_zn_obsolete(znode)`, like as in the if-statement for
>> `znode->parent` just below.
>
> I'm analyzing tnc-related code these days, however I can't find places that may
> concurrently operate the same znode. And I cannot reproduce the problem with
> your reproducer:
> while true; do
> rm -f /UBIFS_MNT/test-file.bin
> dd if=/dev/urandom of=/UBIFS_MNT/test-file.bin bs=1M count=60 conv=fsync
> done
For completeness, here are the _exact_ steps that I have used to
reproduce this on my system with v6.12-rc2 (commit 75b607fab38d "Merge
tag 'sched_ext-for-6.12-rc2-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext"):
```
ubiattach -m 2
keyctl add logon dummy_key: dummy_load @us
ubimkvol /dev/ubi0 -s 80MiB -n 0 -N test-vol
ubiupdatevol /dev/ubi0_0 -t
mount -t ubifs /dev/ubi0_0 /mnt/flash -o auth_hash_name=sha256,auth_key=dummy_key:
count=0
while true; do
date
count=$(($count + 1))
echo count=$count
rm -f /mnt/flash/test-file.bin
dd if=/dev/urandom of=/mnt/flash/test-file.bin bs=1M count=60 conv=fsync
echo ""
done
```
Note that you need to have `CONFIG_UBIFS_FS_AUTHENTICATION=y` (and
`CONFIG_KASAN=y` obviously) in your `.config` in order to trigger the
offending `memcpy()` in `ubifs_copy_hash()`. Also, it takes a while. For
example, last time it took me 88 iterations of the above loop before it
triggered. So you might need to let it spin for a while.
>
> Can you dig more deeper by adding more debug message, so that we can figure out
> what is really happening.
Certainly! I could try to enable the debug prints from UBIFS, however
they are *a lot*. Moreover, printing that much changes the timing
behavior and might make it harder to trigger the use-after-free. Do you
have any tips on where we should try to focus the debug prints (a
dynamic debug filter).
[...]
next prev parent reply other threads:[~2024-10-15 18:55 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-09 14:46 [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit Waqar Hameed
2024-10-09 14:46 ` Waqar Hameed
2024-10-12 12:30 ` Zhihao Cheng
2024-10-12 12:30 ` Zhihao Cheng
2024-10-15 18:52 ` Waqar Hameed [this message]
2024-10-15 18:52 ` Waqar Hameed
2024-10-16 2:11 ` Zhihao Cheng
2024-10-16 2:11 ` Zhihao Cheng
2024-10-17 18:36 ` Waqar Hameed
2024-10-17 18:36 ` Waqar Hameed
2024-10-18 1:40 ` Zhihao Cheng
2024-10-18 1:40 ` Zhihao Cheng
2024-11-06 16:36 ` Waqar Hameed
2024-11-06 16:36 ` Waqar Hameed
2024-11-07 7:14 ` Zhihao Cheng
2024-11-07 7:14 ` Zhihao Cheng
2024-11-07 22:11 ` Waqar Hameed
2024-11-07 22:11 ` Waqar Hameed
2024-10-29 8:10 ` Ryder Wang
2024-10-29 8:10 ` Ryder Wang
2024-10-29 9:06 ` Zhihao Cheng
2024-10-29 9:06 ` Zhihao Cheng
2024-10-29 9:56 ` Ryder Wang
2024-10-29 9:56 ` Ryder Wang
2024-10-29 11:19 ` Zhihao Cheng
2024-10-29 11:19 ` Zhihao Cheng
2024-11-07 8:39 ` Zhihao Cheng
2024-11-07 8:39 ` Zhihao Cheng
2024-11-07 22:38 ` Waqar Hameed
2024-11-07 22:38 ` Waqar Hameed
2024-11-08 1:38 ` Zhihao Cheng
2024-11-08 1:38 ` Zhihao Cheng
2024-11-08 18:02 ` Waqar Hameed
2024-11-08 18:02 ` Waqar Hameed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pnd7ca9r0pt.fsf@axis.com \
--to=waqar.hameed@axis.com \
--cc=chengzhihao1@huawei.com \
--cc=kernel@axis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mtd@lists.infradead.org \
--cc=richard@nod.at \
--cc=s.hauer@pengutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.