Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Kai Krakow <kai@kaishome.de>, linux-btrfs@vger.kernel.org
Cc: Eli Venter <eli@genedx.com>
Subject: Re: [PATCH] btrfs: harden __reserve_bytes() with space_info==NULL
Date: Sun, 14 Dec 2025 07:34:36 +1030	[thread overview]
Message-ID: <4eed3843-a706-4203-b797-13cc9e19190a@suse.com> (raw)
In-Reply-To: <350e9b44-7a16-4c3f-a54f-5b6b3f4931f3@suse.com>



在 2025/12/14 07:18, Qu Wenruo 写道:
> 
> 
> 在 2025/12/14 06:39, Kai Krakow 写道:
>> During mount, the global block reserve might not have its space_info
>> initialized yet. If we try to reserve bytes in this state (e.g. via
>> early sysfs writes), we must not crash.
>>
>> This happened while developing patches which allow modification of the
>> devinfo.type field via sysfs. If this write access is executed by the
>> user before the mount finished, the kernel crashed with a NULL pointer
>> dereference:
> 
> 
> I'd say the modification through sysfs itself is a dangerous idea, it 
> will need to hold the proper locks and if not properly checked can 
> easily introduce unexpected races.
> 
> 
> Furthermore currently there is no RW support for devinfo related member.

Correction, there are RW members but that are only runtime ones, e.g. 
scrub throughput and iops limits.

Nothing is going to trigger any metadata writes.
And that's true for all RW members, no matter if it's 
space_info->chunk_size or reclaim control.

Even for the label modification through sysfs, we do not trigger any 
metadata changes, but only modify in-memory structures.

If you're going to use sysfs to trigger a transaction, I'd just say, DON'T.

Thanks,
Qu

> 
> So this means your patch is fixing something that is only affecting your 
> out-of-tree development branch, which is not bringing much usefulness to 
> upstream.
> 
> Thanks,
> Qu
> 
>>
>>> Noticed an oops with these patches when doing echo 1 >devinfo/2/type
>>> while mount is still ongoing. My btrfs is big so the mount takes
>>> 20-30 minutes. Reboot and wait until mount is complete and this
>>> worked fine.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000008
>> PGD 0 P4D 0
>> Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
>> CPU: 4 UID: 0 PID: 3520 Comm: bash Not tainted 6.12.52-dirty #2
>> Hardware name: Penguin Computing Relion 1900/MD90-FS0-ZB-XX, BIOS R15 
>> 06/25/2018
>> RIP: 0010:_raw_spin_lock+0x17/0x30
>> Code: 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 
>> 1e fa 0f 1f 44 00 00 65 ff 05 e8 c0 d8 5e 31 c0 ba 01 00 00 00 0f b1 
>> 17 75 05 c3 cc cc cc cc 89 c6 e9 97 01 00 00 0f 1f 80 00
>> RSP: 0018:ffffbc13a95837c8 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000008
>> RBP: 0000000000000008 R08: ffffbc13a9583a07 R09: 0000000000000001
>> R10: d800000000000000 R11: 0000000000000001 R12: ffff9bee913db000
>> R13: 0000000000000000 R14: 00000000fffffffb R15: ffff9bee913db000
>> FS: 00007fd6e270f740(0000) GS:ffff9bfddfc00000(0000) 
>> knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000008 CR3: 00000008d9986004 CR4: 00000000003706f0
>> Call Trace:
>>
>> __reserve_bytes+0x70/0x720 [btrfs]
>> ? get_page_from_freelist+0x343/0x1570
>> btrfs_reserve_metadata_bytes+0x1d/0xd0 [btrfs]
>> btrfs_use_block_rsv+0x153/0x220 [btrfs]
>> btrfs_alloc_tree_block+0x83/0x580 [btrfs]
>> btrfs_force_cow_block+0x129/0x620 [btrfs]
>> btrfs_cow_block+0xcd/0x230 [btrfs]
>> btrfs_search_slot+0x566/0xd60 [btrfs]
>> ? kmem_cache_alloc_noprof+0x106/0x2f0
>> btrfs_update_device+0x91/0x1d0 [btrfs]
>> btrfs_devinfo_type_store+0xb8/0x140 [btrfs]
>> kernfs_fop_write_iter+0x14c/0x200
>> vfs_write+0x289/0x440
>> ksys_write+0x6d/0xf0
>> trace_clock_x86_tsc+0x20/0x20
>> ? do_wp_page+0x838/0xf90
>> ? __do_sys_newfstat+0x68/0x70
>> ? __pte_offset_map+0x1b/0xf0
>> ? __handle_mm_fault+0xa6c/0x10f0
>> ? __count_memcg_events+0x53/0xf0
>> ? handle_mm_fault+0x1c4/0x2d0
>> ? do_user_addr_fault+0x334/0x620
>> ? arch_exit_to_user_mode_prepare.isra.0+0x11/0x90
>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> RIP: 0033:0x7fd6e27a1687
>> Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 
>> f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 
>> 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
>> RSP: 002b:00007ffecb401260 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 00007fd6e270f740 RCX: 00007fd6e27a1687
>> RDX: 0000000000000002 RSI: 0000557a2c38ad20 RDI: 0000000000000001
>> RBP: 0000557a2c38ad20 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
>> R13: 00007fd6e28fa5c0 R14: 00007fd6e28f7e80 R15: 0000000000000000
>>
>> Modules linked in: rpcsec_gss_krb5 nfsv3 nfsv4 dns_resolver nfs netfs 
>> zram lz4hc_compress lz4_compress dm_crypt bonding tls ipmi_ssif 
>> intel_rapl_msr nfsd binfmt_misc auth_rpcgss nfs_acl lockd grace 
>> intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common 
>> sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp rapl 
>> intel_cstate s>
>> intel_pmc_bxt ixgbe ehci_pci iTCO_vendor_support xfrm_algo gf128mul 
>> libata mpt3sas xhci_hcd ehci_hcd watchdog crypto_simd mdio_devres 
>> libphy cryptd raid_class usbcore scsi_transport_sas mdio igb scsi_mod 
>> wmi usb_common i2c_i801 lpc_ich scsi_common i2c_smbus i2c_algo_bit dca
>> CR2: 0000000000000008
>> ---[ end trace 0000000000000000 ]---
>> RIP: 0010:_raw_spin_lock+0x17/0x30
>> Code: 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 
>> 1e fa 0f 1f 44 00 00 65 ff 05 e8 c0 d8 5e 31 c0 ba 01 00 00 00 0f b1 
>> 17 75 05 c3 cc cc cc cc 89 c6 e9 97 01 00 00 0f 1f 80 00
>> RSP: 0018:ffffbc13a95837c8 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000008
>> RBP: 0000000000000008 R08: ffffbc13a9583a07 R09: 0000000000000001
>> R10: d800000000000000 R11: 0000000000000001 R12: ffff9bee913db000
>> R13: 0000000000000000 R14: 00000000fffffffb R15: ffff9bee913db000
>> FS: 00007fd6e270f740(0000) GS:ffff9bfddfc00000(0000) 
>> knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000008 CR3: 00000008d9986004 CR4: 00000000003706f0
>>
>> Reported-by: Eli Venter <eli@genedx.com>
>> Signed-off-by: Kai Krakow <kai@kaishome.de>
>> ---
>>   fs/btrfs/space-info.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
>> index 97452fb5d29b..cbb6c4924850 100644
>> --- a/fs/btrfs/space-info.c
>> +++ b/fs/btrfs/space-info.c
>> @@ -1752,6 +1752,14 @@ static int __reserve_bytes(struct btrfs_fs_info 
>> *fs_info,
>>           ASSERT(flush != BTRFS_RESERVE_FLUSH_EVICT);
>>       }
>> +    /*
>> +     * During mount, the global block reserve might not have its 
>> space_info
>> +     * initialized yet. If we try to reserve bytes in this state 
>> (e.g. via
>> +     * early sysfs writes), we must not crash.
>> +     */
>> +    if (unlikely(!space_info))
>> +        return -EBUSY;
>> +
>>       if (flush == BTRFS_RESERVE_FLUSH_DATA)
>>           async_work = &fs_info->async_data_reclaim_work;
>>       else
> 
> 


  reply	other threads:[~2025-12-13 21:04 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-13 20:09 [PATCH] btrfs: harden __reserve_bytes() with space_info==NULL Kai Krakow
2025-12-13 20:48 ` Qu Wenruo
2025-12-13 21:04   ` Qu Wenruo [this message]
2025-12-13 21:14     ` Kai Krakow
2025-12-13 21:10   ` Kai Krakow
2025-12-13 21:15     ` Qu Wenruo
2025-12-13 21:43       ` Kai Krakow
2025-12-13 22:31         ` Qu Wenruo
2025-12-15 19:28           ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4eed3843-a706-4203-b797-13cc9e19190a@suse.com \
    --to=wqu@suse.com \
    --cc=eli@genedx.com \
    --cc=kai@kaishome.de \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox