linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness
@ 2025-08-27  6:40 Chenzhi Yang
  2025-08-27 20:04 ` Viacheslav Dubeyko
  0 siblings, 1 reply; 4+ messages in thread
From: Chenzhi Yang @ 2025-08-27  6:40 UTC (permalink / raw)
  To: slava, glaubitz, frank.li; +Cc: linux-fsdevel, linux-kernel, Yang Chenzhi

From: Yang Chenzhi <yang.chenzhi@vivo.com>

This patch addresses two issues in the hfs filesystem:

1. out-of-bounds access in hfs_bmap_alloc

Analysis can be found here:
https://lore.kernel.org/all/20250818141734.8559-2-yang.chenzhi@vivo.com/

With specially crafted offsets from syzbot, hfs_brec_lenoff()
could return invalid offset and length values.

This patch introduces a return value for hfs_brec_lenoff().
The function now validates offset and length:
  - if invalid, it returns an error code;
  - if valid, it returns 0.

All callers of hfs_brec_lenoff() are updated to check its return
value before using offset and length, thus preventing out-of-bounds access.

2. potential use of uninitialized memory in hfs_bnode_dump

Related bug report:
https://syzkaller.appspot.com/bug?extid=f687659f3c2acfa34201

This bug was previously fixed in commit:
commit a431930c9bac518bf99d6b1da526a7f37ddee8d8

However, a new syzbot report indicated a KMSAN use-uninit-value.
The root cause is that hfs_bnode_dump() calls hfs_bnode_read_u16()
with an invalid offset.
  - hfs_bnode_read() detects the invalid offset and returns immediately;
  - Back in hfs_bnode_read_u16(), be16_to_cpu() was then called on an
    uninitialized variable.

To address this, the intended direction is for hfs_bnode_read()
to return a status code (0 on success, negative errno on failure)
so that callers can detect errors and exit early, avoiding the use
of uninitialized memory.

However, hfs_bnode_read() is widely used, this patch does not modify
it directly. Instead, new __hfs_bnode_read*() helper functions are
introduced, which mirror the original behavior but add offset/length
validation and return values.

For now, only the hfs_bnode_dump() code path is updated to use these
helpers in order to validate the feasibility of this approach.

After applying the patch, the xfstests quick suite was run:
  - The previously failing generic/113 test now passes;
  - All other test cases remain unchanged.

-------------------------------------------

The main idea of this patch is to:
Add explicit return values to critical functions so that
invalid offset/length values are reported via error codes;

Require all callers to check return values, ensuring
invalid parameters are not propagated further;

Improve the overall robustness of the HFS codebase and
protect against syzbot-crafted invalid inputs.

RFC: feedback is requested on whether adding return values
to hfs_brec_lenoff() and hfs_bnode_read() in this manner
is an acceptable direction, and if such safety improvements
should be expanded more broadly within the HFS subsystem.

Signed-off-by: Yang Chenzhi <yang.chenzhi@vivo.com>
---
 fs/hfs/bfind.c | 14 ++++----
 fs/hfs/bnode.c | 87 +++++++++++++++++++++++++++++++++++---------------
 fs/hfs/brec.c  | 13 ++++++--
 fs/hfs/btree.c | 21 +++++++++---
 fs/hfs/btree.h | 21 +++++++++++-
 5 files changed, 116 insertions(+), 40 deletions(-)

diff --git a/fs/hfs/bfind.c b/fs/hfs/bfind.c
index 34e9804e0f36..aea6edd4d830 100644
--- a/fs/hfs/bfind.c
+++ b/fs/hfs/bfind.c
@@ -61,16 +61,16 @@ int __hfs_brec_find(struct hfs_bnode *bnode, struct hfs_find_data *fd)
 	u16 off, len, keylen;
 	int rec;
 	int b, e;
-	int res;
+	int res, ret;
 
 	b = 0;
 	e = bnode->num_recs - 1;
 	res = -ENOENT;
 	do {
 		rec = (e + b) / 2;
-		len = hfs_brec_lenoff(bnode, rec, &off);
+		ret = hfs_brec_lenoff(bnode, rec, &off, &len);
 		keylen = hfs_brec_keylen(bnode, rec);
-		if (keylen == 0) {
+		if (keylen == 0 || ret) {
 			res = -EINVAL;
 			goto fail;
 		}
@@ -87,9 +87,9 @@ int __hfs_brec_find(struct hfs_bnode *bnode, struct hfs_find_data *fd)
 			e = rec - 1;
 	} while (b <= e);
 	if (rec != e && e >= 0) {
-		len = hfs_brec_lenoff(bnode, e, &off);
+		ret = hfs_brec_lenoff(bnode, e, &off, &len);
 		keylen = hfs_brec_keylen(bnode, e);
-		if (keylen == 0) {
+		if (keylen == 0 || ret) {
 			res = -EINVAL;
 			goto fail;
 		}
@@ -223,9 +223,9 @@ int hfs_brec_goto(struct hfs_find_data *fd, int cnt)
 		fd->record += cnt;
 	}
 
-	len = hfs_brec_lenoff(bnode, fd->record, &off);
+	res = hfs_brec_lenoff(bnode, fd->record, &off, &len);
 	keylen = hfs_brec_keylen(bnode, fd->record);
-	if (keylen == 0) {
+	if (keylen == 0 || res) {
 		res = -EINVAL;
 		goto out;
 	}
diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
index e8cd1a31f247..b0bbaf016b8d 100644
--- a/fs/hfs/bnode.c
+++ b/fs/hfs/bnode.c
@@ -57,26 +57,16 @@ int check_and_correct_requested_length(struct hfs_bnode *node, int off, int len)
 	return len;
 }
 
-void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len)
+int __hfs_bnode_read(struct hfs_bnode *node, void *buf, u16 off, u16 len)
 {
 	struct page *page;
 	int pagenum;
 	int bytes_read;
 	int bytes_to_read;
 
-	if (!is_bnode_offset_valid(node, off))
-		return;
-
-	if (len == 0) {
-		pr_err("requested zero length: "
-		       "NODE: id %u, type %#x, height %u, "
-		       "node_size %u, offset %d, len %d\n",
-		       node->this, node->type, node->height,
-		       node->tree->node_size, off, len);
-		return;
-	}
-
-	len = check_and_correct_requested_length(node, off, len);
+	/* len = 0 is invalid: prevent use of an uninitalized buffer*/
+	if (!len || !hfs_off_and_len_is_valid(node, off, len))
+		return -EINVAL;
 
 	off += node->page_offset;
 	pagenum = off >> PAGE_SHIFT;
@@ -93,6 +83,47 @@ void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len)
 		pagenum++;
 		off = 0; /* page offset only applies to the first page */
 	}
+
+	return 0;
+}
+
+static int __hfs_bnode_read_u16(struct hfs_bnode *node, u16* buf, u16 off)
+{
+	__be16 data;
+	int res;
+
+	res = __hfs_bnode_read(node, (void*)(&data), off, 2);
+	if (res)
+		return res;
+	*buf = be16_to_cpu(data);
+	return 0;
+}
+
+
+static int __hfs_bnode_read_u8(struct hfs_bnode *node, u8* buf, u16 off)
+{
+	int res;
+
+	res = __hfs_bnode_read(node, (void*)(&buf), off, 2);
+	if (res)
+		return res;
+	return 0;
+}
+
+void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len)
+{
+	int res;
+
+	len = check_and_correct_requested_length(node, off, len);
+	res = __hfs_bnode_read(node, buf, (u16)off, (u16)len);
+	if (res) {
+		pr_err("hfs_bnode_read error: "
+		       "NODE: id %u, type %#x, height %u, "
+		       "node_size %u, offset %d, len %d\n",
+		       node->this, node->type, node->height,
+		       node->tree->node_size, off, len);
+	}
+	return;
 }
 
 u16 hfs_bnode_read_u16(struct hfs_bnode *node, int off)
@@ -241,7 +272,8 @@ void hfs_bnode_dump(struct hfs_bnode *node)
 {
 	struct hfs_bnode_desc desc;
 	__be32 cnid;
-	int i, off, key_off;
+	int i, res;
+	u16 off, key_off;
 
 	hfs_dbg(BNODE_MOD, "bnode: %d\n", node->this);
 	hfs_bnode_read(node, &desc, 0, sizeof(desc));
@@ -251,23 +283,28 @@ void hfs_bnode_dump(struct hfs_bnode *node)
 
 	off = node->tree->node_size - 2;
 	for (i = be16_to_cpu(desc.num_recs); i >= 0; off -= 2, i--) {
-		key_off = hfs_bnode_read_u16(node, off);
+		res = __hfs_bnode_read_u16(node, &key_off, off);
+		if (res) return;
 		hfs_dbg_cont(BNODE_MOD, " %d", key_off);
 		if (i && node->type == HFS_NODE_INDEX) {
-			int tmp;
-
-			if (node->tree->attributes & HFS_TREE_VARIDXKEYS)
-				tmp = (hfs_bnode_read_u8(node, key_off) | 1) + 1;
-			else
+			u8 tmp, data;
+			if (node->tree->attributes & HFS_TREE_VARIDXKEYS) {
+				res = __hfs_bnode_read_u8(node, &data, key_off);
+				if (res) return;
+				tmp = (data | 1) + 1;
+			} else {
 				tmp = node->tree->max_key_len + 1;
-			hfs_dbg_cont(BNODE_MOD, " (%d,%d",
-				     tmp, hfs_bnode_read_u8(node, key_off));
+			}
+			res = __hfs_bnode_read_u8(node, &data, key_off);
+			if (res) return;
+			hfs_dbg_cont(BNODE_MOD, " (%d,%d", tmp, data);
 			hfs_bnode_read(node, &cnid, key_off + tmp, 4);
 			hfs_dbg_cont(BNODE_MOD, ",%d)", be32_to_cpu(cnid));
 		} else if (i && node->type == HFS_NODE_LEAF) {
-			int tmp;
+			u8 tmp;
 
-			tmp = hfs_bnode_read_u8(node, key_off);
+			res = __hfs_bnode_read_u8(node, &tmp, key_off);
+			if (res) return;
 			hfs_dbg_cont(BNODE_MOD, " (%d)", tmp);
 		}
 	}
diff --git a/fs/hfs/brec.c b/fs/hfs/brec.c
index 896396554bcc..d7026a3ffeea 100644
--- a/fs/hfs/brec.c
+++ b/fs/hfs/brec.c
@@ -16,15 +16,22 @@ static int hfs_brec_update_parent(struct hfs_find_data *fd);
 static int hfs_btree_inc_height(struct hfs_btree *tree);
 
 /* Get the length and offset of the given record in the given node */
-u16 hfs_brec_lenoff(struct hfs_bnode *node, u16 rec, u16 *off)
+int hfs_brec_lenoff(struct hfs_bnode *node, u16 rec, u16 *off, u16 *len)
 {
 	__be16 retval[2];
 	u16 dataoff;
+	int res;
 
 	dataoff = node->tree->node_size - (rec + 2) * 2;
-	hfs_bnode_read(node, retval, dataoff, 4);
+	res = __hfs_bnode_read(node, retval, dataoff, 4);
+	if (res)
+		return -EINVAL;
 	*off = be16_to_cpu(retval[1]);
-	return be16_to_cpu(retval[0]) - *off;
+	*len = be16_to_cpu(retval[0]) - *off;
+	if (!hfs_off_and_len_is_valid(node, *off, *len) ||
+			*off < sizeof(struct hfs_bnode_desc))
+		return -EINVAL;
+	return 0;
 }
 
 /* Get the length of the key from a keyed record */
diff --git a/fs/hfs/btree.c b/fs/hfs/btree.c
index e86e1e235658..b13582dcc27a 100644
--- a/fs/hfs/btree.c
+++ b/fs/hfs/btree.c
@@ -301,7 +301,9 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree *tree)
 	node = hfs_bnode_find(tree, nidx);
 	if (IS_ERR(node))
 		return node;
-	len = hfs_brec_lenoff(node, 2, &off16);
+	res = hfs_brec_lenoff(node, 2, &off16, &len);
+	if (res)
+		return ERR_PTR(res);
 	off = off16;
 
 	off += node->page_offset;
@@ -347,7 +349,9 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree *tree)
 			return next_node;
 		node = next_node;
 
-		len = hfs_brec_lenoff(node, 0, &off16);
+		res = hfs_brec_lenoff(node, 0, &off16, &len);
+		if (res)
+			return ERR_PTR(res);
 		off = off16;
 		off += node->page_offset;
 		pagep = node->page + (off >> PAGE_SHIFT);
@@ -363,6 +367,7 @@ void hfs_bmap_free(struct hfs_bnode *node)
 	u16 off, len;
 	u32 nidx;
 	u8 *data, byte, m;
+	int res;
 
 	hfs_dbg(BNODE_MOD, "btree_free_node: %u\n", node->this);
 	tree = node->tree;
@@ -370,7 +375,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
 	node = hfs_bnode_find(tree, 0);
 	if (IS_ERR(node))
 		return;
-	len = hfs_brec_lenoff(node, 2, &off);
+	res = hfs_brec_lenoff(node, 2, &off, &len);
+	if (res)
+		goto fail;
 	while (nidx >= len * 8) {
 		u32 i;
 
@@ -394,7 +401,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
 			hfs_bnode_put(node);
 			return;
 		}
-		len = hfs_brec_lenoff(node, 0, &off);
+		res = hfs_brec_lenoff(node, 0, &off, &len);
+		if (res)
+			goto fail;
 	}
 	off += node->page_offset + nidx / 8;
 	page = node->page[off >> PAGE_SHIFT];
@@ -415,4 +424,8 @@ void hfs_bmap_free(struct hfs_bnode *node)
 	hfs_bnode_put(node);
 	tree->free_nodes++;
 	mark_inode_dirty(tree->inode);
+	return;
+fail:
+	hfs_bnode_put(node);
+	pr_err("fail to free a bnode due to invalid off or len\n");
 }
diff --git a/fs/hfs/btree.h b/fs/hfs/btree.h
index 0e6baee93245..78f228e62a86 100644
--- a/fs/hfs/btree.h
+++ b/fs/hfs/btree.h
@@ -94,6 +94,7 @@ extern struct hfs_bnode * hfs_bmap_alloc(struct hfs_btree *);
 extern void hfs_bmap_free(struct hfs_bnode *node);
 
 /* bnode.c */
+extern int __hfs_bnode_read(struct hfs_bnode *, void *, u16, u16);
 extern void hfs_bnode_read(struct hfs_bnode *, void *, int, int);
 extern u16 hfs_bnode_read_u16(struct hfs_bnode *, int);
 extern u8 hfs_bnode_read_u8(struct hfs_bnode *, int);
@@ -116,7 +117,7 @@ extern void hfs_bnode_get(struct hfs_bnode *);
 extern void hfs_bnode_put(struct hfs_bnode *);
 
 /* brec.c */
-extern u16 hfs_brec_lenoff(struct hfs_bnode *, u16, u16 *);
+extern int hfs_brec_lenoff(struct hfs_bnode *, u16, u16 *, u16 *);
 extern u16 hfs_brec_keylen(struct hfs_bnode *, u16);
 extern int hfs_brec_insert(struct hfs_find_data *, void *, int);
 extern int hfs_brec_remove(struct hfs_find_data *);
@@ -170,3 +171,21 @@ struct hfs_btree_header_rec {
 						   max key length. use din catalog
 						   b-tree but not in extents
 						   b-tree (hfsplus). */
+static inline
+bool hfs_off_and_len_is_valid(struct hfs_bnode *node, u16 off, u16 len)
+{
+	bool ret = true;
+	if (off > node->tree->node_size ||
+			off + len > node->tree->node_size)
+		ret = false;
+
+	if (!ret) {
+		pr_err("requested invalid offset: "
+		       "NODE: id %u, type %#x, height %u, "
+		       "node_size %u, offset %u, length %u\n",
+		       node->this, node->type, node->height,
+		       node->tree->node_size, off, len);
+	}
+
+	return ret;
+}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness
  2025-08-27  6:40 [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness Chenzhi Yang
@ 2025-08-27 20:04 ` Viacheslav Dubeyko
  2025-08-28 12:35   ` 杨晨志
  0 siblings, 1 reply; 4+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-27 20:04 UTC (permalink / raw)
  To: Chenzhi Yang, glaubitz, frank.li; +Cc: linux-fsdevel, linux-kernel

On Wed, 2025-08-27 at 14:40 +0800, Chenzhi Yang wrote:
> From: Yang Chenzhi <yang.chenzhi@vivo.com>
> 
> This patch addresses two issues in the hfs filesystem:
> 
> 1. out-of-bounds access in hfs_bmap_alloc
> 
> Analysis can be found here:
> https://lore.kernel.org/all/20250818141734.8559-2-yang.chenzhi@vivo.com/
> 
> With specially crafted offsets from syzbot, hfs_brec_lenoff()
> could return invalid offset and length values.
> 
> This patch introduces a return value for hfs_brec_lenoff().
> The function now validates offset and length:
>   - if invalid, it returns an error code;
>   - if valid, it returns 0.
> 
> All callers of hfs_brec_lenoff() are updated to check its return
> value before using offset and length, thus preventing out-of-bounds
> access.
> 
> 2. potential use of uninitialized memory in hfs_bnode_dump
> 
> Related bug report:
> https://syzkaller.appspot.com/bug?extid=f687659f3c2acfa34201
> 
> This bug was previously fixed in commit:
> commit a431930c9bac518bf99d6b1da526a7f37ddee8d8
> 
> However, a new syzbot report indicated a KMSAN use-uninit-value.
> The root cause is that hfs_bnode_dump() calls hfs_bnode_read_u16()
> with an invalid offset.
>   - hfs_bnode_read() detects the invalid offset and returns
> immediately;
>   - Back in hfs_bnode_read_u16(), be16_to_cpu() was then called on an
>     uninitialized variable.
> 
> To address this, the intended direction is for hfs_bnode_read()
> to return a status code (0 on success, negative errno on failure)
> so that callers can detect errors and exit early, avoiding the use
> of uninitialized memory.
> 
> However, hfs_bnode_read() is widely used, this patch does not modify
> it directly. Instead, new __hfs_bnode_read*() helper functions are
> introduced, which mirror the original behavior but add offset/length
> validation and return values.
> 
> For now, only the hfs_bnode_dump() code path is updated to use these
> helpers in order to validate the feasibility of this approach.
> 
> After applying the patch, the xfstests quick suite was run:
>   - The previously failing generic/113 test now passes;
>   - All other test cases remain unchanged.
> 
> -------------------------------------------
> 
> The main idea of this patch is to:
> Add explicit return values to critical functions so that
> invalid offset/length values are reported via error codes;
> 
> Require all callers to check return values, ensuring
> invalid parameters are not propagated further;
> 
> Improve the overall robustness of the HFS codebase and
> protect against syzbot-crafted invalid inputs.
> 
> RFC: feedback is requested on whether adding return values
> to hfs_brec_lenoff() and hfs_bnode_read() in this manner
> is an acceptable direction, and if such safety improvements
> should be expanded more broadly within the HFS subsystem.
> 
> Signed-off-by: Yang Chenzhi <yang.chenzhi@vivo.com>
> ---
>  fs/hfs/bfind.c | 14 ++++----
>  fs/hfs/bnode.c | 87 +++++++++++++++++++++++++++++++++++-------------
> --
>  fs/hfs/brec.c  | 13 ++++++--
>  fs/hfs/btree.c | 21 +++++++++---
>  fs/hfs/btree.h | 21 +++++++++++-
>  5 files changed, 116 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/hfs/bfind.c b/fs/hfs/bfind.c
> index 34e9804e0f36..aea6edd4d830 100644
> --- a/fs/hfs/bfind.c
> +++ b/fs/hfs/bfind.c
> @@ -61,16 +61,16 @@ int __hfs_brec_find(struct hfs_bnode *bnode,
> struct hfs_find_data *fd)
>  	u16 off, len, keylen;
>  	int rec;
>  	int b, e;
> -	int res;
> +	int res, ret;
>  
>  	b = 0;
>  	e = bnode->num_recs - 1;
>  	res = -ENOENT;
>  	do {
>  		rec = (e + b) / 2;
> -		len = hfs_brec_lenoff(bnode, rec, &off);
> +		ret = hfs_brec_lenoff(bnode, rec, &off, &len);

Frankly speaking, I don't think that reworking this method for
returning the error code is necessary. Currently, we return the length
value (u16) and we can return U16_MAX as for off as for len for the
case of incorrect offset or erroneous logic. We can treat U16_MAX as
error condition and we can check off and len for this value. Usually,
HFS b-tree node has 512 bytes in size and as offset as length cannot be
equal to U16_MAX (or bigger). And we don't need to change the input and
output arguments if we will check for U16_MAX value.

>  		keylen = hfs_brec_keylen(bnode, rec);
> -		if (keylen == 0) {
> +		if (keylen == 0 || ret) {
>  			res = -EINVAL;
>  			goto fail;
>  		}
> @@ -87,9 +87,9 @@ int __hfs_brec_find(struct hfs_bnode *bnode, struct
> hfs_find_data *fd)
>  			e = rec - 1;
>  	} while (b <= e);
>  	if (rec != e && e >= 0) {
> -		len = hfs_brec_lenoff(bnode, e, &off);
> +		ret = hfs_brec_lenoff(bnode, e, &off, &len);
>  		keylen = hfs_brec_keylen(bnode, e);
> -		if (keylen == 0) {
> +		if (keylen == 0 || ret) {

The same here.

>  			res = -EINVAL;
>  			goto fail;
>  		}
> @@ -223,9 +223,9 @@ int hfs_brec_goto(struct hfs_find_data *fd, int
> cnt)
>  		fd->record += cnt;
>  	}
>  
> -	len = hfs_brec_lenoff(bnode, fd->record, &off);
> +	res = hfs_brec_lenoff(bnode, fd->record, &off, &len);
>  	keylen = hfs_brec_keylen(bnode, fd->record);
> -	if (keylen == 0) {
> +	if (keylen == 0 || res) {

Ditto.

>  		res = -EINVAL;
>  		goto out;
>  	}
> diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
> index e8cd1a31f247..b0bbaf016b8d 100644
> --- a/fs/hfs/bnode.c
> +++ b/fs/hfs/bnode.c
> @@ -57,26 +57,16 @@ int check_and_correct_requested_length(struct
> hfs_bnode *node, int off, int len)
>  	return len;
>  }
>  
> -void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int
> len)
> +int __hfs_bnode_read(struct hfs_bnode *node, void *buf, u16 off, u16
> len)

I don't follow why do we need to introduce __hfs_bnode_read(). One
method for all is enough. And I think we still can use only
hfs_bnode_read(). Because, we can initialize the buffer by 0x00 or 0xFF
in the case we cannot read if offset or length are invalid. Usually,
every method checks (or should) check the returning value of
hfs_bnode_read().

>  {
>  	struct page *page;
>  	int pagenum;
>  	int bytes_read;
>  	int bytes_to_read;
>  
> -	if (!is_bnode_offset_valid(node, off))
> -		return;
> -
> -	if (len == 0) {
> -		pr_err("requested zero length: "
> -		       "NODE: id %u, type %#x, height %u, "
> -		       "node_size %u, offset %d, len %d\n",
> -		       node->this, node->type, node->height,
> -		       node->tree->node_size, off, len);
> -		return;
> -	}
> -
> -	len = check_and_correct_requested_length(node, off, len);
> +	/* len = 0 is invalid: prevent use of an uninitalized
> buffer*/
> +	if (!len || !hfs_off_and_len_is_valid(node, off, len))
> +		return -EINVAL;
>  
>  	off += node->page_offset;
>  	pagenum = off >> PAGE_SHIFT;
> @@ -93,6 +83,47 @@ void hfs_bnode_read(struct hfs_bnode *node, void
> *buf, int off, int len)
>  		pagenum++;
>  		off = 0; /* page offset only applies to the first
> page */
>  	}
> +
> +	return 0;
> +}
> +
> +static int __hfs_bnode_read_u16(struct hfs_bnode *node, u16* buf,
> u16 off)

I don't see the point to introduce another version of method because we
can return U16_MAX as invalid value.

> +{
> +	__be16 data;
> +	int res;
> +
> +	res = __hfs_bnode_read(node, (void*)(&data), off, 2);
> +	if (res)
> +		return res;
> +	*buf = be16_to_cpu(data);
> +	return 0;
> +}
> +
> +
> +static int __hfs_bnode_read_u8(struct hfs_bnode *node, u8* buf, u16
> off)

And we can return U8_MAX as invalid value here too.

> +{
> +	int res;
> +
> +	res = __hfs_bnode_read(node, (void*)(&buf), off, 2);
> +	if (res)
> +		return res;
> +	return 0;
> +}
> +
> +void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int
> len)

I don't think that we need two methods instead of one.

> +{
> +	int res;
> +
> +	len = check_and_correct_requested_length(node, off, len);
> +	res = __hfs_bnode_read(node, buf, (u16)off, (u16)len);
> +	if (res) {
> +		pr_err("hfs_bnode_read error: "
> +		       "NODE: id %u, type %#x, height %u, "
> +		       "node_size %u, offset %d, len %d\n",
> +		       node->this, node->type, node->height,
> +		       node->tree->node_size, off, len);
> +	}
> +	return;
>  }
>  
>  u16 hfs_bnode_read_u16(struct hfs_bnode *node, int off)
> @@ -241,7 +272,8 @@ void hfs_bnode_dump(struct hfs_bnode *node)

The hfs_bnode_dump() is mostly debugging method. So, maybe, it could be
interesting to see the "garbage" instead of breaking the read logic.

>  {
>  	struct hfs_bnode_desc desc;
>  	__be32 cnid;
> -	int i, off, key_off;
> +	int i, res;
> +	u16 off, key_off;
>  
>  	hfs_dbg(BNODE_MOD, "bnode: %d\n", node->this);
>  	hfs_bnode_read(node, &desc, 0, sizeof(desc));
> @@ -251,23 +283,28 @@ void hfs_bnode_dump(struct hfs_bnode *node)
>  
>  	off = node->tree->node_size - 2;
>  	for (i = be16_to_cpu(desc.num_recs); i >= 0; off -= 2, i--)
> {
> -		key_off = hfs_bnode_read_u16(node, off);
> +		res = __hfs_bnode_read_u16(node, &key_off, off);
> +		if (res) return;
>  		hfs_dbg_cont(BNODE_MOD, " %d", key_off);
>  		if (i && node->type == HFS_NODE_INDEX) {
> -			int tmp;
> -
> -			if (node->tree->attributes &
> HFS_TREE_VARIDXKEYS)
> -				tmp = (hfs_bnode_read_u8(node,
> key_off) | 1) + 1;
> -			else
> +			u8 tmp, data;
> +			if (node->tree->attributes &
> HFS_TREE_VARIDXKEYS) {
> +				res = __hfs_bnode_read_u8(node,
> &data, key_off);
> +				if (res) return;

This breaks the kernel code style.

> +				tmp = (data | 1) + 1;
> +			} else {
>  				tmp = node->tree->max_key_len + 1;
> -			hfs_dbg_cont(BNODE_MOD, " (%d,%d",
> -				     tmp, hfs_bnode_read_u8(node,
> key_off));
> +			}
> +			res = __hfs_bnode_read_u8(node, &data,
> key_off);
> +			if (res) return;

This breaks the kernel code style.

> +			hfs_dbg_cont(BNODE_MOD, " (%d,%d", tmp,
> data);
>  			hfs_bnode_read(node, &cnid, key_off + tmp,
> 4);
>  			hfs_dbg_cont(BNODE_MOD, ",%d)",
> be32_to_cpu(cnid));
>  		} else if (i && node->type == HFS_NODE_LEAF) {
> -			int tmp;
> +			u8 tmp;
>  
> -			tmp = hfs_bnode_read_u8(node, key_off);
> +			res = __hfs_bnode_read_u8(node, &tmp,
> key_off);
> +			if (res) return;

This breaks the kernel code style.

>  			hfs_dbg_cont(BNODE_MOD, " (%d)", tmp);
>  		}
>  	}
> diff --git a/fs/hfs/brec.c b/fs/hfs/brec.c
> index 896396554bcc..d7026a3ffeea 100644
> --- a/fs/hfs/brec.c
> +++ b/fs/hfs/brec.c
> @@ -16,15 +16,22 @@ static int hfs_brec_update_parent(struct
> hfs_find_data *fd);
>  static int hfs_btree_inc_height(struct hfs_btree *tree);
>  
>  /* Get the length and offset of the given record in the given node
> */
> -u16 hfs_brec_lenoff(struct hfs_bnode *node, u16 rec, u16 *off)
> +int hfs_brec_lenoff(struct hfs_bnode *node, u16 rec, u16 *off, u16
> *len)

Please, see my comments above. I think we can use U16_MAX as invalid
value.

>  {
>  	__be16 retval[2];
>  	u16 dataoff;
> +	int res;
>  
>  	dataoff = node->tree->node_size - (rec + 2) * 2;
> -	hfs_bnode_read(node, retval, dataoff, 4);
> +	res = __hfs_bnode_read(node, retval, dataoff, 4);
> +	if (res)
> +		return -EINVAL;
>  	*off = be16_to_cpu(retval[1]);
> -	return be16_to_cpu(retval[0]) - *off;
> +	*len = be16_to_cpu(retval[0]) - *off;
> +	if (!hfs_off_and_len_is_valid(node, *off, *len) ||
> +			*off < sizeof(struct hfs_bnode_desc))
> +		return -EINVAL;
> +	return 0;
>  }
>  
>  /* Get the length of the key from a keyed record */
> diff --git a/fs/hfs/btree.c b/fs/hfs/btree.c
> index e86e1e235658..b13582dcc27a 100644
> --- a/fs/hfs/btree.c
> +++ b/fs/hfs/btree.c
> @@ -301,7 +301,9 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree
> *tree)
>  	node = hfs_bnode_find(tree, nidx);
>  	if (IS_ERR(node))
>  		return node;
> -	len = hfs_brec_lenoff(node, 2, &off16);
> +	res = hfs_brec_lenoff(node, 2, &off16, &len);
> +	if (res)
> +		return ERR_PTR(res);
>  	off = off16;
>  
>  	off += node->page_offset;
> @@ -347,7 +349,9 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree
> *tree)
>  			return next_node;
>  		node = next_node;
>  
> -		len = hfs_brec_lenoff(node, 0, &off16);
> +		res = hfs_brec_lenoff(node, 0, &off16, &len);
> +		if (res)
> +			return ERR_PTR(res);
>  		off = off16;
>  		off += node->page_offset;
>  		pagep = node->page + (off >> PAGE_SHIFT);
> @@ -363,6 +367,7 @@ void hfs_bmap_free(struct hfs_bnode *node)
>  	u16 off, len;
>  	u32 nidx;
>  	u8 *data, byte, m;
> +	int res;
>  
>  	hfs_dbg(BNODE_MOD, "btree_free_node: %u\n", node->this);
>  	tree = node->tree;
> @@ -370,7 +375,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
>  	node = hfs_bnode_find(tree, 0);
>  	if (IS_ERR(node))
>  		return;
> -	len = hfs_brec_lenoff(node, 2, &off);
> +	res = hfs_brec_lenoff(node, 2, &off, &len);
> +	if (res)
> +		goto fail;
>  	while (nidx >= len * 8) {
>  		u32 i;
>  
> @@ -394,7 +401,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
>  			hfs_bnode_put(node);
>  			return;
>  		}
> -		len = hfs_brec_lenoff(node, 0, &off);
> +		res = hfs_brec_lenoff(node, 0, &off, &len);
> +		if (res)
> +			goto fail;
>  	}
>  	off += node->page_offset + nidx / 8;
>  	page = node->page[off >> PAGE_SHIFT];
> @@ -415,4 +424,8 @@ void hfs_bmap_free(struct hfs_bnode *node)
>  	hfs_bnode_put(node);
>  	tree->free_nodes++;
>  	mark_inode_dirty(tree->inode);
> +	return;
> +fail:
> +	hfs_bnode_put(node);
> +	pr_err("fail to free a bnode due to invalid off or len\n");

I am not sure that breaking free bnode logic because of incorrect
length (and, maybe, even offset) is correct behavior. Because, we can
correct length and continue to free bnode. So, I am not sure that
complete failure is the correct way of managing the situation.

Thanks,
Slava.

>  }
> diff --git a/fs/hfs/btree.h b/fs/hfs/btree.h
> index 0e6baee93245..78f228e62a86 100644
> --- a/fs/hfs/btree.h
> +++ b/fs/hfs/btree.h
> @@ -94,6 +94,7 @@ extern struct hfs_bnode * hfs_bmap_alloc(struct
> hfs_btree *);
>  extern void hfs_bmap_free(struct hfs_bnode *node);
>  
>  /* bnode.c */
> +extern int __hfs_bnode_read(struct hfs_bnode *, void *, u16, u16);
>  extern void hfs_bnode_read(struct hfs_bnode *, void *, int, int);
>  extern u16 hfs_bnode_read_u16(struct hfs_bnode *, int);
>  extern u8 hfs_bnode_read_u8(struct hfs_bnode *, int);
> @@ -116,7 +117,7 @@ extern void hfs_bnode_get(struct hfs_bnode *);
>  extern void hfs_bnode_put(struct hfs_bnode *);
>  
>  /* brec.c */
> -extern u16 hfs_brec_lenoff(struct hfs_bnode *, u16, u16 *);
> +extern int hfs_brec_lenoff(struct hfs_bnode *, u16, u16 *, u16 *);
>  extern u16 hfs_brec_keylen(struct hfs_bnode *, u16);
>  extern int hfs_brec_insert(struct hfs_find_data *, void *, int);
>  extern int hfs_brec_remove(struct hfs_find_data *);
> @@ -170,3 +171,21 @@ struct hfs_btree_header_rec {
>  						   max key length.
> use din catalog
>  						   b-tree but not in
> extents
>  						   b-tree (hfsplus).
> */
> +static inline
> +bool hfs_off_and_len_is_valid(struct hfs_bnode *node, u16 off, u16
> len)
> +{
> +	bool ret = true;
> +	if (off > node->tree->node_size ||
> +			off + len > node->tree->node_size)
> +		ret = false;
> +
> +	if (!ret) {
> +		pr_err("requested invalid offset: "
> +		       "NODE: id %u, type %#x, height %u, "
> +		       "node_size %u, offset %u, length %u\n",
> +		       node->this, node->type, node->height,
> +		       node->tree->node_size, off, len);
> +	}
> +
> +	return ret;
> +}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness
  2025-08-27 20:04 ` Viacheslav Dubeyko
@ 2025-08-28 12:35   ` 杨晨志
  2025-08-28 19:46     ` Viacheslav Dubeyko
  0 siblings, 1 reply; 4+ messages in thread
From: 杨晨志 @ 2025-08-28 12:35 UTC (permalink / raw)
  To: Viacheslav Dubeyko, glaubitz@physik.fu-berlin.de,
	李扬韬
  Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

在 2025/8/28 4:04, Viacheslav Dubeyko 写道:
> Frankly speaking, I don't think that reworking this method for
> returning the error code is necessary. Currently, we return the length
> value (u16) and we can return U16_MAX as for off as for len for the
> case of incorrect offset or erroneous logic. We can treat U16_MAX as
> error condition and we can check off and len for this value. Usually,
> HFS b-tree node has 512 bytes in size and as offset as length cannot be
> equal to U16_MAX (or bigger). And we don't need to change the input and
> output arguments if we will check for U16_MAX value.

Using U16_MAX as the error return value is reasonable. This change also 
applies to hfsplus, since the maximum node_size in hfsplus is 32768 
(0x8000), which is less than U16_MAX. Adopting this approach helps avoid 
extensive interface changes. I agree with your point.

>> diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
>> index e8cd1a31f247..b0bbaf016b8d 100644
>> --- a/fs/hfs/bnode.c
>> +++ b/fs/hfs/bnode.c
>> @@ -57,26 +57,16 @@ int check_and_correct_requested_length(struct
>> hfs_bnode *node, int off, int len)
>>   	return len;
>>   }
>>   
>> -void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int
>> len)
>> +int __hfs_bnode_read(struct hfs_bnode *node, void *buf, u16 off, u16
>> len)
> 
> I don't follow why do we need to introduce __hfs_bnode_read(). One
> method for all is enough. And I think we still can use only
> hfs_bnode_read(). Because, we can initialize the buffer by 0x00 or 0xFF
> in the case we cannot read if offset or length are invalid. Usually,
> every method checks (or should) check the returning value of
> hfs_bnode_read().
> 
>>   {
>>   	struct page *page;
>>   	int pagenum;
>>   	int bytes_read;
>>   	int bytes_to_read;
>>   
>> -	if (!is_bnode_offset_valid(node, off))
>> -		return;
>> -
>> -	if (len == 0) {
>> -		pr_err("requested zero length: "
>> -		       "NODE: id %u, type %#x, height %u, "
>> -		       "node_size %u, offset %d, len %d\n",
>> -		       node->this, node->type, node->height,
>> -		       node->tree->node_size, off, len);
>> -		return;
>> -	}
>> -
>> -	len = check_and_correct_requested_length(node, off, len);
>> +	/* len = 0 is invalid: prevent use of an uninitalized
>> buffer*/
>> +	if (!len || !hfs_off_and_len_is_valid(node, off, len))
>> +		return -EINVAL;
>>   
>>   	off += node->page_offset;
>>   	pagenum = off >> PAGE_SHIFT;
>> @@ -93,6 +83,47 @@ void hfs_bnode_read(struct hfs_bnode *node, void
>> *buf, int off, int len)
>>   		pagenum++;
>>   		off = 0; /* page offset only applies to the first
>> page */
>>   	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int __hfs_bnode_read_u16(struct hfs_bnode *node, u16* buf,
>> u16 off)
> 
> I don't see the point to introduce another version of method because we
> can return U16_MAX as invalid value.
> 
>> +{
>> +	__be16 data;
>> +	int res;
>> +
>> +	res = __hfs_bnode_read(node, (void*)(&data), off, 2);
>> +	if (res)
>> +		return res;
>> +	*buf = be16_to_cpu(data);
>> +	return 0;
>> +}
>> +
>> +
>> +static int __hfs_bnode_read_u8(struct hfs_bnode *node, u8* buf, u16
>> off)
> 
> And we can return U8_MAX as invalid value here too.

My original plan was to add a return value to hfs_bnode_read(). However, 
since modifying this interface would require changes across dozens of 
call sites—which seems too extensive—and introducing a return value 
might necessitate additional error handling in certain functions (e.g., 
hfs_bnode_split()), I wasn’t entirely sure how to proceed.

Therefore, I created __hfs_bnode_read() -- a function that behaves 
identically to hfs_bnode_read but includes a return value—as a way to 
experiment and facilitate discussion. I don’t think we ultimately need 
an additional function; its purpose is strictly to help evaluate, within 
this RFC patch, whether adding a return value to hfs_bnode_read is 
desirable and whether the approach taken in __hfs_bnode_read() would be 
suitable.

I agree that we should check return values at every call site—there are 
too many issues in HFS caused by missing checks. However, I’m a bit 
confused by the suggestion to validate the hfs_bnode_read buffer 
directly. If the buffer is a structured type (such as btree_key), 
detecting errors becomes more challenging. In such cases, we may need to 
rely on methods like memcmp() or memchr_inv(). From that perspective, 
adding a return value to hfs_bnode_read seems an easier way.

If you have a lightweight method to quickly verify the validity of a 
read without introducing a return value, please let me know. I can 
follow up with a new patch to address it.
>> @@ -251,23 +283,28 @@ void hfs_bnode_dump(struct hfs_bnode *node)
>>   
>>   	off = node->tree->node_size - 2;
>>   	for (i = be16_to_cpu(desc.num_recs); i >= 0; off -= 2, i--)
>> {
>> -		key_off = hfs_bnode_read_u16(node, off);
>> +		res = __hfs_bnode_read_u16(node, &key_off, off);
>> +		if (res) return;
>>   		hfs_dbg_cont(BNODE_MOD, " %d", key_off);
>>   		if (i && node->type == HFS_NODE_INDEX) {
>> -			int tmp;
>> -
>> -			if (node->tree->attributes &
>> HFS_TREE_VARIDXKEYS)
>> -				tmp = (hfs_bnode_read_u8(node,
>> key_off) | 1) + 1;
>> -			else
>> +			u8 tmp, data;
>> +			if (node->tree->attributes &
>> HFS_TREE_VARIDXKEYS) {
>> +				res = __hfs_bnode_read_u8(node,
>> &data, key_off);
>> +				if (res) return;
> 
> This breaks the kernel code style.

Thank you for pointing out the formatting issues. I’ll be more careful 
about that in future development.

> 
> Please, see my comments above. I think we can use U16_MAX as invalid
> value.
> 

I understand your point. For functions like hfs_brec_lenoff, 
hfs_bnode_read_u16, and hfs_bnode_read_u8, using U16_MAX or U8_MAX as 
error return values does seem reasonable.

I will submit a new patch for modifying hfs_brec_lenoff using U16_MAX as 
returned-value to prevent potential out-of-bounds issues in 
hfs_bmap_alloc and hfs_bmap_free.

Regarding changes to hfs_bnode_read, I’d appreciate further guidance 
from you before proceeding with any modifications.

>> @@ -394,7 +401,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
>>   			hfs_bnode_put(node);
>>   			return;
>>   		}
>> -		len = hfs_brec_lenoff(node, 0, &off);
>> +		res = hfs_brec_lenoff(node, 0, &off, &len);
>> +		if (res)
>> +			goto fail;
>>   	}
>>   	off += node->page_offset + nidx / 8;
>>   	page = node->page[off >> PAGE_SHIFT];
>> @@ -415,4 +424,8 @@ void hfs_bmap_free(struct hfs_bnode *node)
>>   	hfs_bnode_put(node);
>>   	tree->free_nodes++;
>>   	mark_inode_dirty(tree->inode);
>> +	return;
>> +fail:
>> +	hfs_bnode_put(node);
>> +	pr_err("fail to free a bnode due to invalid off or len\n");
> 
> I am not sure that breaking free bnode logic because of incorrect
> length (and, maybe, even offset) is correct behavior. Because, we can
> correct length and continue to free bnode. So, I am not sure that
> complete failure is the correct way of managing the situation.
> 

I'll look into this more carefully — the current approach does seem 
suboptimal.

Thanks,
Chenzhi Yang


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness
  2025-08-28 12:35   ` 杨晨志
@ 2025-08-28 19:46     ` Viacheslav Dubeyko
  0 siblings, 0 replies; 4+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-28 19:46 UTC (permalink / raw)
  To: glaubitz@physik.fu-berlin.de, yang.chenzhi@vivo.com,
	slava@dubeyko.com, frank.li@vivo.com
  Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Thu, 2025-08-28 at 12:35 +0000, 杨晨志 wrote:
> 在 2025/8/28 4:04, Viacheslav Dubeyko 写道:
> > Frankly speaking, I don't think that reworking this method for
> > returning the error code is necessary. Currently, we return the length
> > value (u16) and we can return U16_MAX as for off as for len for the
> > case of incorrect offset or erroneous logic. We can treat U16_MAX as
> > error condition and we can check off and len for this value. Usually,
> > HFS b-tree node has 512 bytes in size and as offset as length cannot be
> > equal to U16_MAX (or bigger). And we don't need to change the input and
> > output arguments if we will check for U16_MAX value.
> 
> Using U16_MAX as the error return value is reasonable. This change also 
> applies to hfsplus, since the maximum node_size in hfsplus is 32768 
> (0x8000), which is less than U16_MAX. Adopting this approach helps avoid 
> extensive interface changes. I agree with your point.
> 
> > > diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
> > > index e8cd1a31f247..b0bbaf016b8d 100644
> > > --- a/fs/hfs/bnode.c
> > > +++ b/fs/hfs/bnode.c
> > > @@ -57,26 +57,16 @@ int check_and_correct_requested_length(struct
> > > hfs_bnode *node, int off, int len)
> > >   	return len;
> > >   }
> > >   
> > > -void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int
> > > len)
> > > +int __hfs_bnode_read(struct hfs_bnode *node, void *buf, u16 off, u16
> > > len)
> > 
> > I don't follow why do we need to introduce __hfs_bnode_read(). One
> > method for all is enough. And I think we still can use only
> > hfs_bnode_read(). Because, we can initialize the buffer by 0x00 or 0xFF
> > in the case we cannot read if offset or length are invalid. Usually,
> > every method checks (or should) check the returning value of
> > hfs_bnode_read().
> > 
> > >   {
> > >   	struct page *page;
> > >   	int pagenum;
> > >   	int bytes_read;
> > >   	int bytes_to_read;
> > >   
> > > -	if (!is_bnode_offset_valid(node, off))
> > > -		return;
> > > -
> > > -	if (len == 0) {
> > > -		pr_err("requested zero length: "
> > > -		       "NODE: id %u, type %#x, height %u, "
> > > -		       "node_size %u, offset %d, len %d\n",
> > > -		       node->this, node->type, node->height,
> > > -		       node->tree->node_size, off, len);
> > > -		return;
> > > -	}
> > > -
> > > -	len = check_and_correct_requested_length(node, off, len);
> > > +	/* len = 0 is invalid: prevent use of an uninitalized
> > > buffer*/
> > > +	if (!len || !hfs_off_and_len_is_valid(node, off, len))
> > > +		return -EINVAL;
> > >   
> > >   	off += node->page_offset;
> > >   	pagenum = off >> PAGE_SHIFT;
> > > @@ -93,6 +83,47 @@ void hfs_bnode_read(struct hfs_bnode *node, void
> > > *buf, int off, int len)
> > >   		pagenum++;
> > >   		off = 0; /* page offset only applies to the first
> > > page */
> > >   	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int __hfs_bnode_read_u16(struct hfs_bnode *node, u16* buf,
> > > u16 off)
> > 
> > I don't see the point to introduce another version of method because we
> > can return U16_MAX as invalid value.
> > 
> > > +{
> > > +	__be16 data;
> > > +	int res;
> > > +
> > > +	res = __hfs_bnode_read(node, (void*)(&data), off, 2);
> > > +	if (res)
> > > +		return res;
> > > +	*buf = be16_to_cpu(data);
> > > +	return 0;
> > > +}
> > > +
> > > +
> > > +static int __hfs_bnode_read_u8(struct hfs_bnode *node, u8* buf, u16
> > > off)
> > 
> > And we can return U8_MAX as invalid value here too.
> 
> My original plan was to add a return value to hfs_bnode_read(). However, 
> since modifying this interface would require changes across dozens of 
> call sites—which seems too extensive—and introducing a return value 
> might necessitate additional error handling in certain functions (e.g., 
> hfs_bnode_split()), I wasn’t entirely sure how to proceed.
> 
> Therefore, I created __hfs_bnode_read() -- a function that behaves 
> identically to hfs_bnode_read but includes a return value—as a way to 
> experiment and facilitate discussion. I don’t think we ultimately need 
> an additional function; its purpose is strictly to help evaluate, within 
> this RFC patch, whether adding a return value to hfs_bnode_read is 
> desirable and whether the approach taken in __hfs_bnode_read() would be 
> suitable.
> 
> I agree that we should check return values at every call site—there are 
> too many issues in HFS caused by missing checks. However, I’m a bit 
> confused by the suggestion to validate the hfs_bnode_read buffer 
> directly. If the buffer is a structured type (such as btree_key), 
> detecting errors becomes more challenging. In such cases, we may need to 
> rely on methods like memcmp() or memchr_inv(). From that perspective, 
> adding a return value to hfs_bnode_read seems an easier way.
> 
> If you have a lightweight method to quickly verify the validity of a 
> read without introducing a return value, please let me know. I can 
> follow up with a new patch to address it.

My initial idea was to check the fields of the record. For example, like this
[1]:

hfs_bnode_read(fd.bnode, &entry, fd.entryoffset, fd.entrylength);
if (entry.type != HFS_CDR_THD) {
     pr_err("bad catalog folder thread\n");
     err = -EIO;
     goto out;
}

However, unfortunately, we also have such patterns [2]:

hfs_bnode_read(node, &node_desc, 0, sizeof(node_desc));
node_desc.next = cpu_to_be32(node->next);
node_desc.num_recs = cpu_to_be16(node->num_recs);
hfs_bnode_write(node, &node_desc, 0, sizeof(node_desc));

Of course, technically speaking, it is possible try to check the record's fields
but we could miss something, it is complicated and it could be more compute-
intensive. So, it is much simpler simply to check the returned error. The
question here: how to minimize the required changes? So, returning error code
from hfs_bnode_read() could be the simplest solution.

However, I don;t completely agree with hfs_off_and_len_is_valid() logic:

+static inline
+bool hfs_off_and_len_is_valid(struct hfs_bnode *node, u16 off, u16 len)
+{
+	bool ret = true;
+	if (off > node->tree->node_size ||
+			off + len > node->tree->node_size)
+		ret = false;
+
+	if (!ret) {
+		pr_err("requested invalid offset: "
+		       "NODE: id %u, type %#x, height %u, "
+		       "node_size %u, offset %u, length %u\n",
+		       node->this, node->type, node->height,
+		       node->tree->node_size, off, len);
+	}
+
+	return ret;
+}

If offset is out of node size, then, of course, we cannot read anything. But if
the offset is inside node and (offset + length) is bigger than node size, then
we can simply correct the length and read anyway. The question here: has buffer
enough allocated memory to receive the read data? But this check is should be
done by caller. Potentially, it is possible to receive a granularity size of
requested metadata structure and to compare with the requested length.
Otherwise, we cannot make any other reasonable conclusion.

As far as I can see, we have around 25 calls of hfs_bnode_read() and mostly
functions are ready to return error. So, we can carefully rework this logic.
The hfs_brec_lenoff(), hfs_bnode_read_u16() can return U16_MAX and
hfs_bnode_read_u8() can return U8_MAX. The hfs_bnode_read_key() is under
question yet. Should we return error code here or calling memset() will be
enough? I think hfs_bnode_dump() doesn't require significant rework because it
is mostly debugging function and it should dump as much as possible nevertheless
of detected errors.

Thanks,
Slava.

[1] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/hfs/dir.c#L82
[2] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/hfs/brec.c#L327

> > 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-08-28 19:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-27  6:40 [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness Chenzhi Yang
2025-08-27 20:04 ` Viacheslav Dubeyko
2025-08-28 12:35   ` 杨晨志
2025-08-28 19:46     ` Viacheslav Dubeyko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).