public inbox for linux-erofs@ozlabs.org
From: Utkal Singh <singhutkal015@gmail.com>
To: linux-erofs@lists.ozlabs.org
Cc: hsiangkao@linux.alibaba.com, yifan.yfzhao@foxmail.com,
	Utkal Singh <singhutkal015@gmail.com>
Subject: [PATCH] erofs-utils: lib: switch ZSTD decompression to streaming API
Date: Mon, 23 Mar 2026 05:22:57 +0000
Message-ID: <20260323052257.11377-1-singhutkal015@gmail.com>

The current ZSTD decompression path calls ZSTD_getFrameContentSize()
(or the legacy ZSTD_getDecompressedSize()) to read the decompressed
size from the ZSTD frame header.  If rq->decodedskip is set or that
size differs from rq->decodedlength, it then malloc()s a buffer of
that size.

This is problematic because the frame content size field is untrusted
on-disk metadata; a crafted EROFS image can set it to an arbitrarily
large value, triggering a large allocation before any real validation
occurs.
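
To illustrate how little input is needed to claim a huge size, here is a
minimal hand-rolled sketch that reads the Frame_Content_Size field
following the frame header layout in RFC 8878.  It is not what
erofs-utils does (the library call above is used there), and the helper
name claimed_content_size() is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZSTD_FRAME_MAGIC 0xFD2FB528u

/*
 * Hypothetical helper: store the size a frame header *claims* and return 0,
 * or return -1 if the header is malformed or records no content size.
 */
static int claimed_content_size(const uint8_t *p, size_t len, uint64_t *claimed)
{
	static const unsigned int did_bytes[4] = { 0, 1, 2, 4 };
	static const unsigned int fcs_bytes[4] = { 0, 2, 4, 8 };
	size_t pos = 5, n, i;
	uint32_t magic;
	uint64_t v = 0;
	int single;

	assert(p && claimed);
	if (len < 5)
		return -1;
	magic = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		(uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	if (magic != ZSTD_FRAME_MAGIC)
		return -1;

	single = (p[4] >> 5) & 1;	/* Single_Segment_Flag */
	n = fcs_bytes[p[4] >> 6];	/* Frame_Content_Size_Flag */
	if (single && n == 0)
		n = 1;
	if (n == 0)
		return -1;		/* content size not recorded */
	if (!single)
		pos++;			/* skip Window_Descriptor */
	pos += did_bytes[p[4] & 3];	/* skip Dictionary_ID */
	if (len < pos + n)
		return -1;

	for (i = 0; i < n; i++)		/* little-endian field */
		v |= (uint64_t)p[pos + i] << (8 * i);
	if (n == 2)
		v += 256;		/* 2-byte form carries an offset */
	*claimed = v;
	return 0;
}
```

A 14-byte header is enough to claim a size close to 2^63 bytes; nothing
in the header itself ties the value to the data that follows.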

The Linux kernel's erofs ZSTD decompressor does not use
ZSTD_getFrameContentSize() at all.  It uses ZSTD_decompressStream(),
which decompresses directly into a caller-supplied buffer whose size
is already known from the extent map.

Align erofs-utils with the kernel:

- Use rq->decodedlength (from the trusted extent map) to size the
  output buffer, removing the dependency on the on-disk frame header.
- Replace ZSTD_decompress() with ZSTD_createDStream(),
  ZSTD_initDStream(), and ZSTD_decompressStream().
- Remove the HAVE_ZSTD_GETFRAMECONTENTSIZE ifdef block entirely.
- For the decodedskip case, allocate a temporary buffer of exactly
  rq->decodedlength (not the untrusted frame size).

Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
---
 lib/decompress.c | 76 +++++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 30 deletions(-)

diff --git a/lib/decompress.c b/lib/decompress.c
index e66693c..19cde03 100644
--- a/lib/decompress.c
+++ b/lib/decompress.c
@@ -28,57 +28,73 @@ static unsigned int z_erofs_fixup_insize(const u8 *padbuf, unsigned int padbufsi
 /* also a very preliminary userspace version */
 static int z_erofs_decompress_zstd(struct z_erofs_decompress_req *rq)
 {
-	int ret = 0;
+	ZSTD_DStream *dstream;
+	ZSTD_inBuffer in;
+	ZSTD_outBuffer out;
 	char *dest = rq->out;
 	char *src = rq->in;
 	char *buff = NULL;
-	unsigned int inputmargin = 0;
-	unsigned long long total;
+	unsigned int inputmargin;
+	size_t ret;
+	int err = 0;
 
 	inputmargin = z_erofs_fixup_insize((u8 *)src, rq->inputsize);
 	if (inputmargin >= rq->inputsize)
 		return -EFSCORRUPTED;
 
-#ifdef HAVE_ZSTD_GETFRAMECONTENTSIZE
-	total = ZSTD_getFrameContentSize(src + inputmargin,
-					 rq->inputsize - inputmargin);
-	if (total == ZSTD_CONTENTSIZE_UNKNOWN ||
-	    total == ZSTD_CONTENTSIZE_ERROR)
-		return -EFSCORRUPTED;
-#else
-	total = ZSTD_getDecompressedSize(src + inputmargin,
-					 rq->inputsize - inputmargin);
-#endif
-	if (rq->decodedskip || total != rq->decodedlength) {
-		buff = malloc(total);
+	if (rq->decodedskip) {
+		buff = malloc(rq->decodedlength);
 		if (!buff)
 			return -ENOMEM;
 		dest = buff;
 	}
 
-	ret = ZSTD_decompress(dest, total,
-			      src + inputmargin, rq->inputsize - inputmargin);
+	dstream = ZSTD_createDStream();
+	if (!dstream) {
+		err = -ENOMEM;
+		goto out_free_buff;
+	}
+
+	ZSTD_initDStream(dstream);
+
+	in.src  = src + inputmargin;
+	in.size = rq->inputsize - inputmargin;
+	in.pos  = 0;
+
+	out.dst  = dest;
+	out.size = rq->decodedlength;
+	out.pos  = 0;
+
+	ret = ZSTD_decompressStream(dstream, &out, &in);
 	if (ZSTD_isError(ret)) {
-		erofs_err("ZSTD decompress failed %d: %s", ZSTD_getErrorCode(ret),
-			  ZSTD_getErrorName(ret));
-		ret = -EIO;
-		goto out;
+		erofs_err("ZSTD decompress failed: %s", ZSTD_getErrorName(ret));
+		err = -EFSCORRUPTED;
+		goto out_free_dstream;
 	}
 
-	if (ret != (int)total) {
-		erofs_err("ZSTD decompress length mismatch %d, expected %d",
-			  ret, total);
-		ret = -EIO;
-		goto out;
+	if (ret != 0) {
+		erofs_err("ZSTD frame not fully decoded");
+		err = -EFSCORRUPTED;
+		goto out_free_dstream;
+	}
+
+	if (out.pos != rq->decodedlength) {
+		erofs_err("ZSTD decompress length mismatch: got %zu, expected %u",
+			  out.pos, rq->decodedlength);
+		err = -EFSCORRUPTED;
+		goto out_free_dstream;
 	}
-	if (rq->decodedskip || total != rq->decodedlength)
+
+	if (rq->decodedskip)
 		memcpy(rq->out, dest + rq->decodedskip,
 		       rq->decodedlength - rq->decodedskip);
-	ret = 0;
-out:
+
+out_free_dstream:
+	ZSTD_freeDStream(dstream);
+out_free_buff:
 	if (buff)
 		free(buff);
-	return ret;
+	return err;
 }
 #endif
 
-- 
2.43.0


