From: Leo Martins <loemra.dev@gmail.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
linux-kernel@vger.kernel.org, David Sterba <dsterba@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [linus:master] [btrfs] e8513c012d: addition_on#;use-after-free
Date: Mon, 1 Dec 2025 16:51:41 -0800 [thread overview]
Message-ID: <20251202005146.3723457-1-loemra.dev@gmail.com> (raw)
In-Reply-To: <202511262228.6dda231e-lkp@intel.com>
On Wed, 26 Nov 2025 22:23:21 +0800 kernel test robot <oliver.sang@intel.com> wrote:
>
>
> Hello,
>
> kernel test robot noticed "addition_on#;use-after-free" on:
>
> commit: e8513c012de75fd65e2df5499572bc6ef3f6e409 ("btrfs: implement ref_tracker for delayed_nodes")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master 30f09200cc4aefbd8385b01e41bde2e4565a6f0e]
> [test failed on linux-next/master 92fd6e84175befa1775e5c0ab682938eca27c0b2]
>
> in testcase: blogbench
> version: blogbench-x86_64-1.2-1_20251009
> with following parameters:
>
> disk: 1SSD
> fs: btrfs
> cpufreq_governor: performance
>
>
>
> config: x86_64-rhel-9.4
> compiler: gcc-14
> test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz (Cascade Lake) with 176G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202511262228.6dda231e-lkp@intel.com
>
>
> [ 92.359770][ T1319] ------------[ cut here ]------------
> [ 92.365399][ T2115] 1513 283338 42318 194834 39930 140043 83712
> [ 92.365683][ T1319] refcount_t: addition on 0; use-after-free.
> [ 92.371060][ T2115]
> [ 92.381336][ T1319] WARNING: CPU: 29 PID: 1319 at lib/refcount.c:25 refcount_warn_saturate (lib/refcount.c:25 (discriminator 1))
> [ 92.389049][ T2115] 2174 291211 39936 202785 36246 141919 76774
> [ 92.389356][ T1319] Modules linked in:
> [ 92.398571][ T2115]
> [ 92.419599][ T1319] dm_mod intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp btrfs kvm_intel blake2b_generic xor kvm raid6_pq snd_pcm irqbypass ipmi_ssif snd_timer ghash_clmulni_intel ahci ast rapl snd intel_cstate libahci acpi_power_meter nvme drm_client_lib binfmt_misc drm_shmem_helper mei_me soundcore intel_uncore ioatdma ipmi_si i2c_i801 pcspkr nvme_core acpi_ipmi libata mei drm_kms_helper i2c_smbus lpc_ich intel_pch_thermal dca ipmi_devintf wmi ipmi_msghandler acpi_pad joydev drm fuse nfnetlink
> [ 92.477068][ T1319] CPU: 29 UID: 0 PID: 1319 Comm: kworker/u770:33 Tainted: G S 6.17.0-rc7-00022-ge8513c012de7 #1 VOLUNTARY
> [ 92.490808][ T1319] Tainted: [S]=CPU_OUT_OF_SPEC
> [ 92.495958][ T1319] Hardware name: Intel Corporation ............/S9200WKBRD2, BIOS SE5C620.86B.0D.01.0552.060220191912 06/02/2019
> [ 92.508765][ T1319] Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
> [ 92.516545][ T1319] RIP: 0010:refcount_warn_saturate (lib/refcount.c:25 (discriminator 1))
> [ 92.523057][ T1319] Code: 95 95 ff 0f 0b e9 4b 45 90 00 80 3d d3 40 98 01 00 0f 85 5e ff ff ff 48 c7 c7 88 fe ad 82 c6 05 bf 40 98 01 01 e8 9b 95 95 ff <0f> 0b c3 cc cc cc cc 48 c7 c7 e0 fe ad 82 c6 05 a3 40 98 01 01 e8
> All code
> ========
> 0: 95 xchg %eax,%ebp
> 1: 95 xchg %eax,%ebp
> 2: ff 0f decl (%rdi)
> 4: 0b e9 or %ecx,%ebp
> 6: 4b rex.WXB
> 7: 45 90 rex.RB xchg %eax,%r8d
> 9: 00 80 3d d3 40 98 add %al,-0x67bf2cc3(%rax)
> f: 01 00 add %eax,(%rax)
> 11: 0f 85 5e ff ff ff jne 0xffffffffffffff75
> 17: 48 c7 c7 88 fe ad 82 mov $0xffffffff82adfe88,%rdi
> 1e: c6 05 bf 40 98 01 01 movb $0x1,0x19840bf(%rip) # 0x19840e4
> 25: e8 9b 95 95 ff call 0xffffffffff9595c5
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: c3 ret
> 2d: cc int3
> 2e: cc int3
> 2f: cc int3
> 30: cc int3
> 31: 48 c7 c7 e0 fe ad 82 mov $0xffffffff82adfee0,%rdi
> 38: c6 05 a3 40 98 01 01 movb $0x1,0x19840a3(%rip) # 0x19840e2
> 3f: e8 .byte 0xe8
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: c3 ret
> 3: cc int3
> 4: cc int3
> 5: cc int3
> 6: cc int3
> 7: 48 c7 c7 e0 fe ad 82 mov $0xffffffff82adfee0,%rdi
> e: c6 05 a3 40 98 01 01 movb $0x1,0x19840a3(%rip) # 0x19840b8
> 15: e8 .byte 0xe8
> [ 92.543676][ T1319] RSP: 0018:ffffc9000fcabca0 EFLAGS: 00010282
> [ 92.550187][ T1319] RAX: 0000000000000000 RBX: ffff888e3d2c8d68 RCX: 0000000000000000
> [ 92.558625][ T1319] RDX: ffff88984f36a3c0 RSI: 0000000000000001 RDI: ffff88984f35c200
> [ 92.567066][ T1319] RBP: ffff8881069c6368 R08: 00000000000008fd R09: 000000000000001d
> [ 92.575508][ T1319] R10: 5b5d393933353633 R11: 205d353131325420 R12: ffff888cd91ced20
> [ 92.583960][ T1319] R13: 0000000000081087 R14: ffff888ed98c2ba8 R15: ffff8881069c6000
> [ 92.592397][ T1319] FS: 0000000000000000(0000) GS:ffff8898cb53c000(0000) knlGS:0000000000000000
> [ 92.601843][ T1319] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 92.608920][ T1319] CR2: 0000000000000800 CR3: 0000002c7de24001 CR4: 00000000007726f0
> [ 92.617398][ T1319] PKRU: 55555554
> [ 92.621442][ T1319] Call Trace:
> [ 92.625216][ T1319] <TASK>
> [ 92.628645][ T1319] btrfs_get_delayed_node+0xda/0x1b0 btrfs
> [ 92.636027][ T1319] btrfs_get_or_create_delayed_node+0x12a/0x1b0 btrfs
> [ 92.644356][ T1319] btrfs_delayed_update_inode (fs/btrfs/delayed-inode.c:1957) btrfs
> [ 92.651387][ T1319] ? btrfs_update_root_times (fs/btrfs/root-tree.c:488) btrfs
> [ 92.658201][ T1319] btrfs_update_inode (fs/btrfs/inode.c:4174) btrfs
> [ 92.664431][ T1319] btrfs_finish_one_ordered (fs/btrfs/inode.c:4188 fs/btrfs/inode.c:3226) btrfs
> [ 92.671351][ T1319] btrfs_work_helper (fs/btrfs/async-thread.c:313) btrfs
> [ 92.677606][ T1319] process_one_work (arch/x86/include/asm/jump_label.h:36 include/trace/events/workqueue.h:110 kernel/workqueue.c:3241)
> [ 92.682972][ T1319] worker_thread (kernel/workqueue.c:3313 (discriminator 2) kernel/workqueue.c:3400 (discriminator 2))
> [ 92.688065][ T1319] ? __pfx_worker_thread (kernel/workqueue.c:3346)
> [ 92.693679][ T1319] kthread (kernel/kthread.c:463)
> [ 92.698145][ T1319] ? __pfx_kthread (kernel/kthread.c:412)
> [ 92.703205][ T1319] ret_from_fork (arch/x86/kernel/process.c:148)
> [ 92.708102][ T1319] ? __pfx_kthread (kernel/kthread.c:412)
> [ 92.713169][ T1319] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
> [ 92.718420][ T1319] </TASK>
> [ 92.721904][ T1319] ---[ end trace 0000000000000000 ]---
> [ 92.772617][ T2117] [ perf record: Woken up 232 times to write data ]
> [ 92.772626][ T2117]
> [ 97.041836][ T2115] 2696 307116 30435 210930 28018 143529 49252
> [ 97.041848][ T2115]
> [ 107.042097][ T2115] 3311 309408 37927 214378 34672 141593 71871
> [ 107.042111][ T2115]
> [ 107.492140][ T2117] Warning:
> [ 107.492150][ T2117]
> [ 107.504083][ T2117] 174 out of order events recorded.
> [ 107.504089][ T2117]
> [ 107.572544][ T2117] [ perf record: Captured and wrote 680.421 MB /tmp/lkp/perf-sched.data (2685986 samples) ]
> [ 107.572552][ T2117]
> [ 116.735824][ C3] perf: interrupt took too long (2518 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
> [ 116.747160][ C3] perf: interrupt took too long (3163 > 3147), lowering kernel.perf_event_max_sample_rate to 63000
> [ 116.760698][ T2117] Events enabled
> [ 116.760708][ T2117]
> [ 116.763935][ C3] perf: interrupt took too long (3955 > 3953), lowering kernel.perf_event_max_sample_rate to 50000
> [ 116.796549][ C67] perf: interrupt took too long (4953 > 4943), lowering kernel.perf_event_max_sample_rate to 40000
> [ 116.866088][ C131] perf: interrupt took too long (6210 > 6191), lowering kernel.perf_event_max_sample_rate to 32000
> [ 117.042049][ T2115] 3836 310809 30767 213835 28646 137339 48846
> [ 117.042057][ T2115]
> [ 117.739156][ C25] perf: interrupt took too long (7763 > 7762), lowering kernel.perf_event_max_sample_rate to 25000
> [ 119.838008][ T2117] [ perf record: Woken up 399 times to write data ]
> [ 119.838018][ T2117]
> [ 124.613970][ T2117] Warning:
> [ 124.613980][ T2117]
> [ 124.625324][ T2117] Processed 1601669 events and lost 6 chunks!
> [ 124.625332][ T2117]
> [ 124.635344][ T2117]
> [ 124.635349][ T2117]
> [ 124.642513][ T2117] Check IO/CPU overload!
> [ 124.642518][ T2117]
> [ 124.649994][ T2117]
> [ 124.649999][ T2117]
> [ 124.656077][ T2117] Warning:
> [ 124.656082][ T2117]
> [ 124.664342][ T2117] 35 out of order events recorded.
> [ 124.664347][ T2117]
> [ 124.678444][ T2117] [ perf record: Captured and wrote 252.317 MB /tmp/lkp/perf_c2c.data (1412807 samples) ]
> [ 124.678450][ T2117]
> [ 127.042332][ T2115] 4407 303071 35022 208647 31346 136630 47202
> [ 127.042344][ T2115]
> [ 137.042250][ T2115] 4943 299584 31810 208626 30129 128390 56528
> [ 137.042263][ T2115]
> [ 147.042441][ T2115] 5385 300009 27874 208644 24778 132677 46221
> [ 147.042453][ T2115]
> [ 157.042433][ T2115] 5910 307972 30431 216088 28665 137389 57696
> [ 157.042445][ T2115]
> [ 167.042628][ T2115] 6460 304663 35022 213949 31758 139786 62225
> [ 167.042640][ T2115]
> [ 177.042573][ T2115] 6967 307323 30606 213465 26983 141516 63168
> [ 177.042587][ T2115]
> [ 187.042733][ T2115] 7407 307075 26787 213546 24994 142923 41523
> [ 187.042745][ T2115]
> [ 197.042791][ T2115] 7927 308105 34190 214356 29643 140592 52914
> [ 197.042803][ T2115]
> [ 207.042918][ T2115] 8549 301948 38038 210474 34706 141120 67811
> [ 207.042931][ T2115]
> [ 217.042636][ T2115] 8955 297050 25135 208501 22518 135881 44260
> [ 217.042648][ T2115]
> [ 227.038865][ T2115] 9471 304296 32936 210610 29938 137844 62809
> [ 227.038870][ T2115]
> [ 237.038945][ T2115] 9964 305929 31628 213754 28012 141671 62429
> [ 237.038951][ T2115]
> [ 247.039008][ T2115] 10440 307726 30685 214830 28062 140111 62249
> [ 247.039015][ T2115]
> [ 257.039183][ T2115] 10802 292789 23345 205882 20689 138434 48453
> [ 257.039188][ T2115]
> [ 267.043497][ T2115] 11273 306881 30725 212403 27270 143693 60639
> [ 267.043508][ T2115]
> [ 277.043510][ T2115] 11797 302806 33279 211710 30071 142609 64344
> [ 277.043519][ T2115]
> [ 287.039432][ T2115] 12205 264441 24852 186434 24016 129256 56311
> [ 287.039437][ T2115]
> [ 297.043412][ T2115] 12673 310406 27331 217050 25960 146991 54571
> [ 297.043421][ T2115]
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20251126/202511262228.6dda231e-lkp@intel.com
>
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
Hello,
I believe I have identified the root cause of the warning.
However, I'm having some troubles running the reproducer as I
haven't setup lkp-tests yet. Could you test the patch below
against your reproducer to see if it fixes the issue?
---8<---
[PATCH] btrfs: fix use-after-free in btrfs_get_or_create_delayed_node
Previously, btrfs_get_or_create_delayed_node sets the delayed_node's
refcount before acquiring the root->delayed_nodes lock.
Commit e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes")
moves refcount_set inside the critical section which means
there is no longer a memory barrier between setting the refcount and
setting btrfs_inode->delayed_node = node.
This allows btrfs_get_or_create_delayed_node to set
btrfs_inode->delayed_node before setting the refcount.
A different thread is then able to read and increase the refcount
of btrfs_inode->delayed_node leading to a refcounting bug and
a use-after-free warning.
The fix is to move refcount_set back to where it was to take
advantage of the implicit memory barrier provided by lock
acquisition.
Fixes: e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511262228.6dda231e-lkp@intel.com
Signed-off-by: Leo Martins <loemra.dev@gmail.com>
---
fs/btrfs/delayed-inode.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 364814642a91..f61f10000e33 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -152,37 +152,39 @@ static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
return ERR_PTR(-ENOMEM);
btrfs_init_delayed_node(node, root, ino);
+ /* Cached in the inode and can be accessed. */
+ refcount_set(&node->refs, 2);
+ btrfs_delayed_node_ref_tracker_alloc(node, tracker, GFP_ATOMIC);
+ btrfs_delayed_node_ref_tracker_alloc(node, &node->inode_cache_tracker, GFP_ATOMIC);
+
/* Allocate and reserve the slot, from now it can return a NULL from xa_load(). */
ret = xa_reserve(&root->delayed_nodes, ino, GFP_NOFS);
- if (ret == -ENOMEM) {
- btrfs_delayed_node_ref_tracker_dir_exit(node);
- kmem_cache_free(delayed_node_cache, node);
- return ERR_PTR(-ENOMEM);
- }
+ if (ret == -ENOMEM)
+ goto cleanup;
+
xa_lock(&root->delayed_nodes);
ptr = xa_load(&root->delayed_nodes, ino);
if (ptr) {
/* Somebody inserted it, go back and read it. */
xa_unlock(&root->delayed_nodes);
- btrfs_delayed_node_ref_tracker_dir_exit(node);
- kmem_cache_free(delayed_node_cache, node);
- node = NULL;
- goto again;
+ goto cleanup;
}
ptr = __xa_store(&root->delayed_nodes, ino, node, GFP_ATOMIC);
ASSERT(xa_err(ptr) != -EINVAL);
ASSERT(xa_err(ptr) != -ENOMEM);
ASSERT(ptr == NULL);
-
- /* Cached in the inode and can be accessed. */
- refcount_set(&node->refs, 2);
- btrfs_delayed_node_ref_tracker_alloc(node, tracker, GFP_ATOMIC);
- btrfs_delayed_node_ref_tracker_alloc(node, &node->inode_cache_tracker, GFP_ATOMIC);
-
- btrfs_inode->delayed_node = node;
+ WRITE_ONCE(btrfs_inode->delayed_node, node);
xa_unlock(&root->delayed_nodes);
return node;
+cleanup:
+ btrfs_delayed_node_ref_tracker_free(node, tracker);
+ btrfs_delayed_node_ref_tracker_free(node, &node->inode_cache_tracker);
+ btrfs_delayed_node_ref_tracker_dir_exit(node);
+ kmem_cache_free(delayed_node_cache, node);
+ if (ret)
+ return ERR_PTR(ret);
+ goto again;
}
/*
--
2.47.3
next prev parent reply other threads:[~2025-12-02 0:51 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-26 14:23 [linus:master] [btrfs] e8513c012d: addition_on#;use-after-free kernel test robot
2025-12-02 0:51 ` Leo Martins [this message]
2025-12-02 8:40 ` Oliver Sang
2025-12-02 15:04 ` Filipe Manana
2025-12-02 17:17 ` Leo Martins
2025-12-02 19:19 ` Filipe Manana
2025-12-02 21:03 ` Leo Martins
2025-12-03 10:45 ` Filipe Manana
2025-12-02 17:18 ` Leo Martins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251202005146.3723457-1-loemra.dev@gmail.com \
--to=loemra.dev@gmail.com \
--cc=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lkp@intel.com \
--cc=oe-lkp@lists.linux.dev \
--cc=oliver.sang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.