linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
@ 2025-08-29  9:39 Chenzhi Yang
  2025-08-29 21:53 ` Viacheslav Dubeyko
  0 siblings, 1 reply; 4+ messages in thread
From: Chenzhi Yang @ 2025-08-29  9:39 UTC (permalink / raw)
  To: slava, glaubitz, frank.li
  Cc: linux-fsdevel, linux-kernel, syzbot+005d2a9ecd9fbf525f6a,
	Yang Chenzhi

From: Yang Chenzhi <yang.chenzhi@vivo.com>

When sync() and link() are called concurrently, both threads may
enter hfs_bnode_find() without finding the node in the hash table
and proceed to create it.

Thread A:
  hfsplus_write_inode()
    -> hfsplus_write_system_inode()
      -> hfs_btree_write()
        -> hfs_bnode_find(tree, 0)
          -> __hfs_bnode_create(tree, 0)

Thread B:
  hfsplus_create_cat()
    -> hfs_brec_insert()
      -> hfs_bnode_split()
        -> hfs_bmap_alloc()
          -> hfs_bnode_find(tree, 0)
            -> __hfs_bnode_create(tree, 0)

In this case, thread A creates the bnode, sets refcnt=1, and hashes it.
Thread B also tries to create the same bnode, notices it has already
been inserted, drops its own instance, and uses the hashed one without
taking a reference on it.

```
	node2 = hfs_bnode_findhash(tree, cnid);
	if (!node2) {                                 <- Thread A
		hash = hfs_bnode_hash(cnid);
		node->next_hash = tree->node_hash[hash];
		tree->node_hash[hash] = node;
		tree->node_hash_cnt++;
	} else {                                      <- Thread B
		spin_unlock(&tree->hash_lock);
		kfree(node);
		wait_event(node2->lock_wq,
			!test_bit(HFS_BNODE_NEW, &node2->flags));
		return node2;
	}
```

However, hfs_bnode_find() requires each caller to take its own reference.
Here the node is handed to both threads while refcnt remains 1, so when
they later put the node, the second put triggers:

BUG_ON(!atomic_read(&node->refcnt))

In this scenario, Thread B in fact finds the node in the hash table
rather than creating a new one, and thus must take a reference.

Fix this by calling hfs_bnode_get() when reusing a bnode newly created by
another thread to ensure the refcount is updated correctly.

A similar bug was fixed in HFS long ago in commit
a9dc087fd3c4 ("fix missing hfs_bnode_get() in __hfs_bnode_create")
but the same issue remained in HFS+ until now.

Reported-by: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com
Signed-off-by: Yang Chenzhi <yang.chenzhi@vivo.com>
---
 fs/hfsplus/bnode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c
index 14f4995588ff..e774bd4f40c3 100644
--- a/fs/hfsplus/bnode.c
+++ b/fs/hfsplus/bnode.c
@@ -522,6 +522,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct hfs_btree *tree, u32 cnid)
 		tree->node_hash[hash] = node;
 		tree->node_hash_cnt++;
 	} else {
+		hfs_bnode_get(node2);
 		spin_unlock(&tree->hash_lock);
 		kfree(node);
 		wait_event(node2->lock_wq,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
  2025-08-29  9:39 [PATCH] hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create Chenzhi Yang
@ 2025-08-29 21:53 ` Viacheslav Dubeyko
  2025-09-01 14:21   ` 杨晨志
  0 siblings, 1 reply; 4+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-29 21:53 UTC (permalink / raw)
  To: glaubitz@physik.fu-berlin.de, yang.chenzhi@vivo.com,
	slava@dubeyko.com, frank.li@vivo.com
  Cc: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, 2025-08-29 at 17:39 +0800, Chenzhi Yang wrote:
> From: Yang Chenzhi <yang.chenzhi@vivo.com>
> 
> When sync() and link() are called concurrently, both threads may
> enter hfs_bnode_find() without finding the node in the hash table
> and proceed to create it.
> 

It's clear why link() could try to create a b-tree node. But it's not completely
clear why sync() should try to create a b-tree node. Usually, sync() should try
to flush already existing inodes, especially if we mean system inodes that
are created mostly during the mount operation. How is it possible that sync()
could try to create the node? Moreover, node_id 0 (root node) should be already
created if we created the b-tree.

I am currently investigating an issue where the root node is re-written by
another node during record insertion in HFS (but I assume that HFS+ could have
the same issue). So, I think your fix is good, but we have a more serious issue
in the background.

So, if you are fixing the syzbot-reported issue, then it could be good to have
the call trace in the commit message to understand the fix environment. And it
could be clearer in what environment sync() is trying to create the b-tree node.
I like your analysis of the issue, but not all details are shared. And I still
have questions about how sync() could try to create the b-tree node. Probably,
this fix could be justified by concurrent link() requests or something like
this. But concurrent creation of a b-tree node by sync() and link() looks
really suspicious.

Thanks,
Slava.



* Re: [PATCH] hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
  2025-08-29 21:53 ` Viacheslav Dubeyko
@ 2025-09-01 14:21   ` 杨晨志
  2025-09-02 21:57     ` Viacheslav Dubeyko
  0 siblings, 1 reply; 4+ messages in thread
From: 杨晨志 @ 2025-09-01 14:21 UTC (permalink / raw)
  To: Viacheslav Dubeyko, glaubitz@physik.fu-berlin.de,
	slava@dubeyko.com, 李扬韬
  Cc: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

Thanks for reviewing.

The syzbot C reproducer creates five processes, each of which repeatedly 
performs the following sequence of operations in a loop device: mount, 
hard link creation, sync, unmount, and loop device reset[1]. In certain 
cases, this can result in situations where, for example, Thread A is 
executing sync while Thread B is concurrently performing a link operation.

> It's clear why link() could try to create a b-tree node. But it's not completely
> clear why sync() should try to create a b-tree node. Usually, sync() should try
> to flush already existing inodes, especially if we mean system inodes that
> are created mostly during the mount operation. How is it possible that sync()
> could try to create the node?

```
if (!cnid && tree->cnid == 4)
	dump_stack();
```

Add the code above to __hfs_bnode_create() and call sync() immediately
after hfsplus is mounted successfully. The resulting call trace shows that
sync() can create a bnode (the header node, id 0).

> Moreover, node_id 0 (root node) should be already
> created if we created the b-tree.
> 

node_id 0 is the header node, which contains a bitmap, and node_id 1 is 
the root node (the root node can change if the tree height increases, 
but the initial root node ID of the catalog tree is always 1).

In fact, the root node (ID 1) is the bnode we create when initializing 
the catalog btree. However, during the creation of the catalog tree, we 
do not create the header node (ID 0).

hfsplus_fill_super()
   -> hfsplus_iget(ROOT_CNID)
     -> hfs_find_init(catalog_tree)
       -> hfsplus_find_cat
         -> hfs_brec_read
           -> hfs_brec_find(tree->root)
             -> hfs_bnode_find(tree->root)
               -> __hfs_bnode_create(tree->root)

From my perspective, the header node is somewhat special. There are two
cases where we need to use it. First, when searching the bitmap to 
allocate or free a node. Second, when reading or writing metadata in the 
header record. In hfs_btree_open(), we do read data from the header 
record; however, it reads this data directly from the page rather than 
creating a header node.

```
	mapping = tree->inode->i_mapping;
	page = read_mapping_page(mapping, 0, NULL);
	if (IS_ERR(page))
		goto free_inode;

	/* Load the header */
	head = (struct hfs_btree_header_rec *)(kmap_local_page(page) +
		sizeof(struct hfs_bnode_desc));
	tree->root = be32_to_cpu(head->root);

```

So we lose the opportunity to initialize the header node during the 
mounting process. We can only wait until the next time we need to use 
the header node to create it. In this scenario, it could be sync() -> 
hfs_btree_write to write metadata back to disk, or link() -> 
hfs_bnode_split to use the bitmap to allocate a new node.

Besides this, I found hfs_bnode_find() to be somewhat strange.
hfs_bnode_find() does not record the node it creates in the bitmap (e.g. the
header node). The recording is handled by hfs_bmap_alloc(), meaning that
recording and creation are separate operations, and the two functions
do not have a one-to-one relationship. The caller must ensure that the
node it wants to find is already in the bitmap. However, we do not
currently have such checks in place; or perhaps there is some mechanism
that guarantees every node passed through it is already in the bitmap?

Interestingly, since hfsplus_fill_super() needs to initialize the root 
directory “/”, this operation must rely on the catalog_tree. As a 
result, the first byte of the catalog_tree bitmap is 0xC0 (i.e., 
0b11000000), meaning that both the header node and the root node are 
pre-marked in the bitmap so that hfs_bnode_find can later be used to 
create the header node (although in practice, the root node is the first 
one created).

```
	/* Load the root directory */
	root = hfsplus_iget(sb, HFSPLUS_ROOT_CNID);  <- use catalog tree
	if (IS_ERR(root)) {
		pr_err("failed to load root directory\n");
		err = PTR_ERR(root);
		goto out_put_alloc_file;
	}

```

In contrast, the extents tree is not used at all during the mounting
process. As a result, the first byte of the extents tree bitmap is 0x80 (i.e.,
0b10000000), so its root node can be created later, when
hfs_brec_insert() performs the first insertion. At that point, the root
node is marked in the bitmap through hfs_bmap_alloc().

In short, during the mounting process, the extents tree initializes 
neither the header node nor the root node.

```
	if (!fd->bnode) {
		if (!tree->root)
			hfs_btree_inc_height(tree);  <- hfs_bmap_alloc
		node = hfs_bnode_find(tree, tree->leaf_head);
		if (IS_ERR(node))
			return PTR_ERR(node);
		fd->bnode = node;
		fd->record = -1;
	}

```

> So, if you are fixing the syzbot-reported issue, then it could be good to have
> the call trace in the commit message to understand the fix environment. And it
> could be clearer in what environment sync() is trying to create the b-tree node.

OK, I will put the call trace in the commit message in future revisions.

syzbot reported call traces: [2][3]

Call Trace:
  <TASK>
  hfsplus_btree_write+0x379/0x7b0 fs/hfsplus/btree.c:309
  hfsplus_system_write_inode fs/hfsplus/super.c:137 [inline]
  hfsplus_write_inode+0x4c9/0x5f0 fs/hfsplus/super.c:163
  write_inode fs/fs-writeback.c:1525 [inline]
  __writeback_single_inode+0x6f4/0x1000 fs/fs-writeback.c:1745
  writeback_sb_inodes+0x6b7/0xf60 fs/fs-writeback.c:1976
  wb_writeback+0x43b/0xaf0 fs/fs-writeback.c:2156
  wb_do_writeback fs/fs-writeback.c:2303 [inline]
  wb_workfn+0x40e/0xf00 fs/fs-writeback.c:2343
  process_one_work kernel/workqueue.c:3236 [inline]
  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3319
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Call Trace:
  <TASK>
  hfsplus_bmap_alloc+0x5a5/0x640 fs/hfsplus/btree.c:414
  hfs_bnode_split+0xcc/0xef0 fs/hfsplus/brec.c:245
  hfsplus_brec_insert+0x38f/0xcc0 fs/hfsplus/brec.c:100
  hfsplus_rename_cat+0x51c/0xde0 fs/hfsplus/catalog.c:491
  hfsplus_link+0x359/0x6a0 fs/hfsplus/dir.c:323
  vfs_link+0x4ea/0x6e0 fs/namei.c:4854
  do_linkat+0x272/0x560 fs/namei.c:4924
  __do_sys_link fs/namei.c:4958 [inline]
  __se_sys_link fs/namei.c:4956 [inline]
  __x64_sys_link+0x82/0x90 fs/namei.c:4956
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

There are two different call traces, depending on which path acquires 
the node first.

Finally, I believe it is necessary to add the missing hfs_bnode_get in 
__hfs_bnode_create. This ensures that each find call holds a proper 
reference, aligning the reference counting design with that of VFS 
inodes, and effectively preventing miscounting in concurrent scenarios.

If you would like, I can rewrite the commit message and resend the 
patch. Please let me know.

Thanks,
Chenzhi Yang

[1] https://syzkaller.appspot.com/text?tag=ReproC&x=12211dbc580000
[2] https://syzkaller.appspot.com/text?tag=CrashReport&x=1076eda2580000
[3] https://syzkaller.appspot.com/text?tag=CrashReport&x=13cd7682580000


* RE: [PATCH] hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
  2025-09-01 14:21   ` 杨晨志
@ 2025-09-02 21:57     ` Viacheslav Dubeyko
  0 siblings, 0 replies; 4+ messages in thread
From: Viacheslav Dubeyko @ 2025-09-02 21:57 UTC (permalink / raw)
  To: glaubitz@physik.fu-berlin.de, yang.chenzhi@vivo.com,
	slava@dubeyko.com, frank.li@vivo.com
  Cc: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Mon, 2025-09-01 at 14:21 +0000, 杨晨志 wrote:
> Thanks for reviewing.
> 

Thanks a lot for the detailed explanation. It is helpful.

> The syzbot C reproducer creates five processes, each of which repeatedly 
> performs the following sequence of operations in a loop device: mount, 
> hard link creation, sync, unmount, and loop device reset[1]. In certain 
> cases, this can result in situations where, for example, Thread A is 
> executing sync while Thread B is concurrently performing a link operation.
> 
> > It's clear why link() could try to create a b-tree node. But it's not completely
> > clear why sync() should try to create a b-tree node. Usually, sync() should try
> > to flush already existing inodes, especially if we mean system inodes that
> > are created mostly during the mount operation. How is it possible that sync()
> > could try to create the node?
> 
> ```
> if (!cnid && tree->cnid == 4)
> 	dump_stack();
> ```
> 
> Add the code above to __hfs_bnode_create() and call sync() immediately
> after hfsplus is mounted successfully. The resulting call trace shows that
> sync() can create a bnode (the header node, id 0).
> 
> > Moreover, node_id 0 (root node) should be already
> > created if we created the b-tree.
> > 
> 
> node_id 0 is the header node, which contains a bitmap, and node_id 1 is 
> the root node (the root node can change if the tree height increases, 
> but the initial root node ID of the catalog tree is always 1).
> 
> In fact, the root node (ID 1) is the bnode we create when initializing 
> the catalog btree. However, during the creation of the catalog tree, we 
> do not create the header node (ID 0).
> 
> hfsplus_fill_super()
>    -> hfsplus_iget(ROOT_CNID)
>      -> hfs_find_init(catalog_tree)
>        -> hfsplus_find_cat
>          -> hfs_brec_read
>            -> hfs_brec_find(tree->root)
>              -> hfs_bnode_find(tree->root)
>                -> __hfs_bnode_create(tree->root)
> 
> From my perspective, the header node is somewhat special. There are two
> cases where we need to use it. First, when searching the bitmap to 
> allocate or free a node. Second, when reading or writing metadata in the 
> header record. In hfs_btree_open(), we do read data from the header 
> record; however, it reads this data directly from the page rather than 
> creating a header node.
> 
> ```
> 	mapping = tree->inode->i_mapping;
> 	page = read_mapping_page(mapping, 0, NULL);
> 	if (IS_ERR(page))
> 		goto free_inode;
> 
> 	/* Load the header */
> 	head = (struct hfs_btree_header_rec *)(kmap_local_page(page) +
> 		sizeof(struct hfs_bnode_desc));
> 	tree->root = be32_to_cpu(head->root);
> 
> ```
> 
> So we lose the opportunity to initialize the header node during the 
> mounting process. We can only wait until the next time we need to use 
> the header node to create it. In this scenario, it could be sync() -> 
> hfs_btree_write to write metadata back to disk, or link() -> 
> hfs_bnode_split to use the bitmap to allocate a new node.
> 
> Besides this, I found hfs_bnode_find() to be somewhat strange.
> hfs_bnode_find() does not record the node it creates in the bitmap (e.g. the
> header node). The recording is handled by hfs_bmap_alloc(), meaning that
> recording and creation are separate operations, and the two functions
> do not have a one-to-one relationship. The caller must ensure that the
> node it wants to find is already in the bitmap. However, we do not
> currently have such checks in place; or perhaps there is some mechanism
> that guarantees every node passed through it is already in the bitmap?
> 

I assume that we have plenty of bugs in this logic. Currently, I am
investigating an issue in HFS related to the node bitmap management logic. And
I still don't have the complete picture of the issue:

[   34.496034][ T9845] hfs: new node 0 already hashed?
[   34.496526][ T9845] ------------[ cut here ]------------
[   34.497010][ T9845] WARNING: CPU: 2 PID: 9845 at fs/hfs/bnode.c:549
hfs_bnode_create+0x461/0x4f0
[   34.497805][ T9845] Modules linked in:
[   34.498186][ T9845] CPU: 2 UID: 0 PID: 9845 Comm: repro Not tainted 6.17.0-
rc1-00116-gd7ee5bdce789-dirty #32 PREEMPT(full) 
[   34.499175][ T9845] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   34.500144][ T9845] RIP: 0010:hfs_bnode_create+0x461/0x4f0
[   34.500650][ T9845] Code: c1 8b 89 ee e8 e0 8a 93 fe e9 cf fc ff ff e8 66 49
24 ff 4c 89 ef e8 9e ad f4 08 48 c7 c7 20 cb cf
[   34.502377][ T9845] RSP: 0018:ffffc9000d036f20 EFLAGS: 00010246
[   34.502918][ T9845] RAX: 000000000000001f RBX: ffff888022b28000 RCX:
f65fb9590b3bdd00
[   34.503625][ T9845] RDX: 0000000000000000 RSI: 0000000080000000 RDI:
0000000000000000
[   34.504423][ T9845] RBP: 0000000000000000 R08: ffff888062924253 R09:
1ffff1100c52484a
[   34.505281][ T9845] R10: dffffc0000000000 R11: ffffed100c52484b R12:
0000000000000000
[   34.506257][ T9845] R13: ffff888022b280e0 R14: ffff888021a88a00 R15:
dffffc0000000000
[   34.507072][ T9845] FS:  00007f8a41e5c540(0000) GS:ffff8880cfce5000(0000)
knlGS:0000000000000000
[   34.508102][ T9845] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.508861][ T9845] CR2: 00007f19ac0470f8 CR3: 00000001101d8000 CR4:
00000000000006f0
[   34.509696][ T9845] Call Trace:
[   34.510067][ T9845]  <TASK>
[   34.510360][ T9845]  ? do_raw_spin_unlock+0x4d/0x210
[   34.510890][ T9845]  hfs_bmap_alloc+0x83b/0x8f0
[   34.511419][ T9845]  ? __pfx_hfs_bnode_find+0x10/0x10
[   34.511886][ T9845]  ? __pfx_hfs_bmap_alloc+0x10/0x10
[   34.512452][ T9845]  hfs_btree_inc_height+0xf6/0xac0
[   34.512937][ T9845]  ? _atomic_dec_and_lock+0xbd/0x140
[   34.513475][ T9845]  ? __pfx_folio_mark_accessed+0x10/0x10
[   34.514117][ T9845]  ? __pfx_hfs_btree_inc_height+0x10/0x10
[   34.514703][ T9845]  ? do_raw_spin_unlock+0x4d/0x210
[   34.515217][ T9845]  hfs_brec_insert+0x891/0xd40
[   34.515707][ T9845]  ? __pfx_hfs_brec_insert+0x10/0x10
[   34.516286][ T9845]  ? __kmalloc_noprof+0x2a9/0x510
[   34.516795][ T9845]  __hfs_ext_cache_extent+0x2c7/0xdc0
[   34.517394][ T9845]  hfs_ext_read_extent+0x288/0x300
[   34.517922][ T9845]  ? __pfx_hfs_ext_read_extent+0x10/0x10
[   34.518523][ T9845]  ? __pfx_clean_bdev_aliases+0x10/0x10
[   34.519071][ T9845]  hfs_extend_file+0x14e/0xee0
[   34.519570][ T9845]  ? hfs_bmap_reserve+0x3e9/0x430
[   34.520045][ T9845]  ? __pfx_hfs_extend_file+0x10/0x10
[   34.520593][ T9845]  hfs_get_block+0x3a9/0x9d0
[   34.521062][ T9845]  __block_write_begin_int+0x6c2/0x1990
[   34.521617][ T9845]  ? __pfx___folio_batch_add_and_move+0x10/0x10
[   34.522242][ T9845]  ? __pfx_hfs_get_block+0x10/0x10
[   34.522758][ T9845]  ? __pfx___block_write_begin_int+0x10/0x10
[   34.523387][ T9845]  cont_write_begin+0x6d5/0xa50
[   34.523879][ T9845]  ? __pfx_cont_write_begin+0x10/0x10
[   34.524440][ T9845]  ? __pfx___might_resched+0x10/0x10
[   34.524976][ T9845]  ? __mark_inode_dirty+0x3db/0xeb0
[   34.525490][ T9845]  ? folio_unlock+0x10c/0x170
[   34.525981][ T9845]  hfs_write_begin+0x66/0xb0
[   34.526489][ T9845]  ? __pfx_hfs_get_block+0x10/0x10
[   34.527044][ T9845]  generic_perform_write+0x2c5/0x900
[   34.527569][ T9845]  ? __pfx_generic_perform_write+0x10/0x10
[   34.528161][ T9845]  ? file_update_time+0x2ce/0x470
[   34.528733][ T9845]  ? __generic_file_write_iter+0xf9/0x230
[   34.529270][ T9845]  ? generic_file_write_iter+0x103/0x550
[   34.529907][ T9845]  generic_file_write_iter+0x117/0x550
[   34.530445][ T9845]  ? __pfx_generic_file_write_iter+0x10/0x10
[   34.531076][ T9845]  ? stack_depot_save_flags+0x40/0x890
[   34.531600][ T9845]  ? __pfx_aa_file_perm+0x10/0x10
[   34.532108][ T9845]  ? __lock_acquire+0xaec/0xd80
[   34.532619][ T9845]  ? rcu_read_lock_any_held+0xb3/0x120
[   34.533138][ T9845]  ? __pfx_rcu_read_lock_any_held+0x10/0x10
[   34.533742][ T9845]  vfs_write+0x60a/0xb80
[   34.534189][ T9845]  ? __pfx_generic_file_write_iter+0x10/0x10
[   34.534769][ T9845]  ? __pfx_vfs_write+0x10/0x10
[   34.535259][ T9845]  ? do_sys_openat2+0x154/0x1c0
[   34.535781][ T9845]  ? kmem_cache_free+0x18c/0x400
[   34.536288][ T9845]  ksys_write+0x14a/0x250
[   34.536748][ T9845]  ? __pfx_ksys_write+0x10/0x10
[   34.537207][ T9845]  ? do_syscall_64+0xb7/0x3a0
[   34.537661][ T9845]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   34.538268][ T9845]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   34.538886][ T9845]  do_syscall_64+0xf3/0x3a0
[   34.539380][ T9845]  ? exc_page_fault+0x9f/0xf0
[   34.539871][ T9845]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   34.540472][ T9845] RIP: 0033:0x7f8a41d7bfc9
[   34.540915][ T9845] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48
[   34.542872][ T9845] RSP: 002b:00007ffeb34ea6e8 EFLAGS: 00000202 ORIG_RAX:
0000000000000001
[   34.543642][ T9845] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
00007f8a41d7bfc9
[   34.544324][ T9845] RDX: 000000000208e24b RSI: 0000200000000080 RDI:
0000000000000005
[   34.545145][ T9845] RBP: 00007ffeb34ea700 R08: 00007ffeb34ea700 R09:
00007ffeb34ea700
[   34.545885][ T9845] R10: 00007ffeb34ea700 R11: 0000000000000202 R12:
000055a027173260
[   34.546641][ T9845] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[   34.547329][ T9845]  </TASK>

And I suspect that HFS+ could have the same issue.

> Interestingly, since hfsplus_fill_super() needs to initialize the root 
> directory “/”, this operation must rely on the catalog_tree. As a 
> result, the first byte of the catalog_tree bitmap is 0xC0 (i.e., 
> 0b11000000), meaning that both the header node and the root node are 
> pre-marked in the bitmap so that hfs_bnode_find can later be used to 
> create the header node (although in practice, the root node is the first 
> one created).
> 
> ```
> 	/* Load the root directory */
> 	root = hfsplus_iget(sb, HFSPLUS_ROOT_CNID);  <- use catalog tree
> 	if (IS_ERR(root)) {
> 		pr_err("failed to load root directory\n");
> 		err = PTR_ERR(root);
> 		goto out_put_alloc_file;
> 	}
> 
> ```
> 
> In contrast, the extents tree is not used at all during the mounting
> process. As a result, the first byte of the extents tree bitmap is 0x80 (i.e.,
> 0b10000000), so its root node can be created later, when
> hfs_brec_insert() performs the first insertion. At that point, the root
> node is marked in the bitmap through hfs_bmap_alloc().
> 
> In short, during the mounting process, the extents tree initializes 
> neither the header node nor the root node.
> 
> ```
> 	if (!fd->bnode) {
> 		if (!tree->root)
> 			hfs_btree_inc_height(tree);  <- hfs_bmap_alloc
> 		node = hfs_bnode_find(tree, tree->leaf_head);
> 		if (IS_ERR(node))
> 			return PTR_ERR(node);
> 		fd->bnode = node;
> 		fd->record = -1;
> 	}
> 
> ```
> 
> > So, if you are fixing the syzbot-reported issue, then it could be good to have
> > the call trace in the commit message to understand the fix environment. And it
> > could be clearer in what environment sync() is trying to create the b-tree node.
> 
> OK, I will put the call trace in the commit message in future revisions.
> 
> syzbot reported call traces: [2][3]
> 
> Call Trace:
>   <TASK>
>   hfsplus_btree_write+0x379/0x7b0 fs/hfsplus/btree.c:309
>   hfsplus_system_write_inode fs/hfsplus/super.c:137 [inline]
>   hfsplus_write_inode+0x4c9/0x5f0 fs/hfsplus/super.c:163
>   write_inode fs/fs-writeback.c:1525 [inline]
>   __writeback_single_inode+0x6f4/0x1000 fs/fs-writeback.c:1745
>   writeback_sb_inodes+0x6b7/0xf60 fs/fs-writeback.c:1976
>   wb_writeback+0x43b/0xaf0 fs/fs-writeback.c:2156
>   wb_do_writeback fs/fs-writeback.c:2303 [inline]
>   wb_workfn+0x40e/0xf00 fs/fs-writeback.c:2343
>   process_one_work kernel/workqueue.c:3236 [inline]
>   process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3319
>   worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
>   kthread+0x711/0x8a0 kernel/kthread.c:463
>   ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> Call Trace:
>   <TASK>
>   hfsplus_bmap_alloc+0x5a5/0x640 fs/hfsplus/btree.c:414
>   hfs_bnode_split+0xcc/0xef0 fs/hfsplus/brec.c:245
>   hfsplus_brec_insert+0x38f/0xcc0 fs/hfsplus/brec.c:100
>   hfsplus_rename_cat+0x51c/0xde0 fs/hfsplus/catalog.c:491
>   hfsplus_link+0x359/0x6a0 fs/hfsplus/dir.c:323
>   vfs_link+0x4ea/0x6e0 fs/namei.c:4854
>   do_linkat+0x272/0x560 fs/namei.c:4924
>   __do_sys_link fs/namei.c:4958 [inline]
>   __se_sys_link fs/namei.c:4956 [inline]
>   __x64_sys_link+0x82/0x90 fs/namei.c:4956
>   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>   do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> There are two different call traces, depending on which path acquires 
> the node first.
> 
> Finally, I believe it is necessary to add the missing hfs_bnode_get in 
> __hfs_bnode_create. This ensures that each find call holds a proper 
> reference, aligning the reference counting design with that of VFS 
> inodes, and effectively preventing miscounting in concurrent scenarios.
> 

I agree with the fix. The patch simply needs to be more clearly explained.

> If you would like, I can rewrite the commit message and resend the 
> patch. Please let me know.
> 

Yes, please, the commit message needs to be improved.


Thanks,
Slava.


end of thread, other threads:[~2025-09-02 21:57 UTC | newest]
