public inbox for linux-erofs@ozlabs.org
 help / color / mirror / Atom feed
* [PATCH] erofs-utils: lib: validate z_extents against inode size
@ 2026-03-17  8:21 Utkal Singh
  2026-03-17  8:34 ` Gao Xiang
  0 siblings, 1 reply; 4+ messages in thread
From: Utkal Singh @ 2026-03-17  8:21 UTC (permalink / raw)
  To: linux-erofs; +Cc: hsiangkao, yifan.yfzhao, singhutkal015

z_extents is read from on-disk metadata and used as the upper bound
for extent lookups in z_erofs_map_blocks_ext().  A corrupted value
can be arbitrarily large (up to 2^48-1), causing erofs_read_metabuf()
to access offsets far beyond the actual extent table.  The resulting
garbage is parsed as z_erofs_extent records, leading to wrong physical
addresses used for I/O, silent data corruption, or crashes.

Since each extent covers at least one logical cluster, the extent
count cannot exceed DIV_ROUND_UP(i_size, 1 << z_lclusterbits).
Validate z_extents against this bound at inode initialization time
and reject invalid values with -EFSCORRUPTED.

Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
---
 lib/zmap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/lib/zmap.c b/lib/zmap.c
index 0e7af4e..2f679b7 100644
--- a/lib/zmap.c
+++ b/lib/zmap.c
@@ -675,8 +675,26 @@ static int z_erofs_fill_inode_lazy(struct erofs_inode *vi)
 	vi->z_lclusterbits = sbi->blkszbits + (h->h_clusterbits & 15);
 	if (vi->datalayout == EROFS_INODE_COMPRESSED_FULL &&
 	    (vi->z_advise & Z_EROFS_ADVISE_EXTENTS)) {
+		u64 max_extents;
+
 		vi->z_extents = le32_to_cpu(h->h_extents_lo) |
 			((u64)le16_to_cpu(h->h_extents_hi) << 32);
+
+		/*
+		 * Each extent covers at least one logical cluster, so
+		 * the extent count must not exceed the number of lclusters.
+		 * Reject bogus values to prevent out-of-bounds metadata
+		 * reads in z_erofs_map_blocks_ext().
+		 */
+		max_extents = DIV_ROUND_UP(vi->i_size,
+					   1ULL << vi->z_lclusterbits);
+		if (vi->z_extents > max_extents) {
+			erofs_err("bogus z_extents %llu (max %llu) for nid %llu",
+				  vi->z_extents | 0ULL, max_extents | 0ULL,
+				  vi->nid | 0ULL);
+			err = -EFSCORRUPTED;
+			goto out_put_metabuf;
+		}
 		goto done;
 	}
 	vi->z_algorithmtype[0] = h->h_algorithmtype & 15;
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] erofs-utils: lib: validate z_extents against inode size
  2026-03-17  8:21 [PATCH] erofs-utils: lib: validate z_extents against inode size Utkal Singh
@ 2026-03-17  8:34 ` Gao Xiang
  2026-03-17  9:01   ` Gao Xiang
  0 siblings, 1 reply; 4+ messages in thread
From: Gao Xiang @ 2026-03-17  8:34 UTC (permalink / raw)
  To: Utkal Singh, linux-erofs; +Cc: yifan.yfzhao



On 2026/3/17 16:21, Utkal Singh wrote:
> z_extents is read from on-disk metadata and used as the upper bound
> for extent lookups in z_erofs_map_blocks_ext().  A corrupted value
> can be arbitrarily large (up to 2^48-1), causing erofs_read_metabuf()
> to access offsets far beyond the actual extent table.  The resulting
> garbage is parsed as z_erofs_extent records, leading to wrong physical
> addresses used for I/O, silent data corruption, or crashes.

No, I don't think it needs to be fixed since it won't cause any
harmful behavior if the image is already corrupted.

Please don't submit any patches like this if you don't have any
reproducer and i_size is too long can cause overly long time,
but the image can be valid.


> 
> Since each extent covers at least one logical cluster, the extent
> count cannot exceed DIV_ROUND_UP(i_size, 1 << z_lclusterbits).
> Validate z_extents against this bound at inode initialization time
> and reject invalid values with -EFSCORRUPTED.
> 
> Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
> ---
>   lib/zmap.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/lib/zmap.c b/lib/zmap.c
> index 0e7af4e..2f679b7 100644
> --- a/lib/zmap.c
> +++ b/lib/zmap.c
> @@ -675,8 +675,26 @@ static int z_erofs_fill_inode_lazy(struct erofs_inode *vi)
>   	vi->z_lclusterbits = sbi->blkszbits + (h->h_clusterbits & 15);
>   	if (vi->datalayout == EROFS_INODE_COMPRESSED_FULL &&
>   	    (vi->z_advise & Z_EROFS_ADVISE_EXTENTS)) {
> +		u64 max_extents;
> +
>   		vi->z_extents = le32_to_cpu(h->h_extents_lo) |
>   			((u64)le16_to_cpu(h->h_extents_hi) << 32);
> +
> +		/*
> +		 * Each extent covers at least one logical cluster, so
> +		 * the extent count must not exceed the number of lclusters.
> +		 * Reject bogus values to prevent out-of-bounds metadata
> +		 * reads in z_erofs_map_blocks_ext().
> +		 */
> +		max_extents = DIV_ROUND_UP(vi->i_size,
> +					   1ULL << vi->z_lclusterbits);

extents can be within 1ULL << vi->z_lclusterbits instead,
I don't think it's valid.

`up to 2^48-1 extents` is fine and valid, the users can
always interrupt this at any time.

Thanks,
Gao Xiang

> +		if (vi->z_extents > max_extents) {
> +			erofs_err("bogus z_extents %llu (max %llu) for nid %llu",
> +				  vi->z_extents | 0ULL, max_extents | 0ULL,
> +				  vi->nid | 0ULL);
> +			err = -EFSCORRUPTED;
> +			goto out_put_metabuf;
> +		}
>   		goto done;
>   	}
>   	vi->z_algorithmtype[0] = h->h_algorithmtype & 15;



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] erofs-utils: lib: validate z_extents against inode size
  2026-03-17  8:34 ` Gao Xiang
@ 2026-03-17  9:01   ` Gao Xiang
  2026-03-17  9:28     ` Utkal Singh
  0 siblings, 1 reply; 4+ messages in thread
From: Gao Xiang @ 2026-03-17  9:01 UTC (permalink / raw)
  To: Utkal Singh, linux-erofs; +Cc: yifan.yfzhao



On 2026/3/17 16:34, Gao Xiang wrote:
> 
> 
> On 2026/3/17 16:21, Utkal Singh wrote:
>> z_extents is read from on-disk metadata and used as the upper bound
>> for extent lookups in z_erofs_map_blocks_ext().  A corrupted value
>> can be arbitrarily large (up to 2^48-1), causing erofs_read_metabuf()
>> to access offsets far beyond the actual extent table.  The resulting
>> garbage is parsed as z_erofs_extent records, leading to wrong physical
>> addresses used for I/O, silent data corruption, or crashes.
> 
> No, I don't think it needs to be fixed since it won't cause any
> harmful behavior if the image is already corrupted.
> 
> Please don't submit any patches like this if you don't have any
> reproducer and i_size is too long can cause overly long time,
> but the image can be valid.
> 
> 
>>
>> Since each extent covers at least one logical cluster, the extent
>> count cannot exceed DIV_ROUND_UP(i_size, 1 << z_lclusterbits).
>> Validate z_extents against this bound at inode initialization time
>> and reject invalid values with -EFSCORRUPTED.
>>
>> Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
>> ---
>>   lib/zmap.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/lib/zmap.c b/lib/zmap.c
>> index 0e7af4e..2f679b7 100644
>> --- a/lib/zmap.c
>> +++ b/lib/zmap.c
>> @@ -675,8 +675,26 @@ static int z_erofs_fill_inode_lazy(struct erofs_inode *vi)
>>       vi->z_lclusterbits = sbi->blkszbits + (h->h_clusterbits & 15);
>>       if (vi->datalayout == EROFS_INODE_COMPRESSED_FULL &&
>>           (vi->z_advise & Z_EROFS_ADVISE_EXTENTS)) {
>> +        u64 max_extents;
>> +
>>           vi->z_extents = le32_to_cpu(h->h_extents_lo) |
>>               ((u64)le16_to_cpu(h->h_extents_hi) << 32);
>> +
>> +        /*
>> +         * Each extent covers at least one logical cluster, so
>> +         * the extent count must not exceed the number of lclusters.
>> +         * Reject bogus values to prevent out-of-bounds metadata
>> +         * reads in z_erofs_map_blocks_ext().

How do you define out-of-bounds metadata read? as long as it exists
in the image, and the extent can be parsed correctly, it should be
considered as valid rather than corrupted.

Or if it triggers end-of-device or end-of-image, erofs_read_metabuf()
will trigger EIO instead; or if fails due to some sanity checks, it
should fail in the corresponding logic instead of here.

Please identify which behavior we really need to fix:

  - if some metadata is strictly invalid, we should return corrupted
    image indeed;

  or

  - we just allow such metadata because it defines valid according
    to EROFS on-disk format and never let them generate bad behaviors
    to the system, e.g.

      - Crashes (in that case we should fix the crashes);

      - DOS (we should allow users' interruption, for example
        i_size == 2^48 is valid, but it should take time, what we'd
        like to do is to allow user to interrupt this; rather than
        just set some arbitary meaningless small values).

Thanks,
Gao Xiang


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] erofs-utils: lib: validate z_extents against inode size
  2026-03-17  9:01   ` Gao Xiang
@ 2026-03-17  9:28     ` Utkal Singh
  0 siblings, 0 replies; 4+ messages in thread
From: Utkal Singh @ 2026-03-17  9:28 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-erofs, yifan.yfzhao

[-- Attachment #1: Type: text/plain, Size: 402 bytes --]

>
> Understood. Thanks for the detailed clarification.
>
> I see that large on-disk values can still be valid and that existing error
> handling (e.g., erofs_read_metabuf()) already handles boundary conditions
> appropriately.
>
> I will drop this approach and focus on identifying issues with concrete
> reproducers and clear impact under the current design.
>
> Thanks for your guidance.
>
> Utkal
>

[-- Attachment #2: Type: text/html, Size: 1343 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-03-17  9:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-17  8:21 [PATCH] erofs-utils: lib: validate z_extents against inode size Utkal Singh
2026-03-17  8:34 ` Gao Xiang
2026-03-17  9:01   ` Gao Xiang
2026-03-17  9:28     ` Utkal Singh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox