From: Utkal Singh <singhutkal015@gmail.com>
To: linux-erofs@lists.ozlabs.org
Cc: hsiangkao@linux.alibaba.com, yifan.yfzhao@foxmail.com,
Utkal Singh <singhutkal015@gmail.com>
Subject: [PATCH] erofs-utils: lib: switch ZSTD decompression to streaming API
Date: Mon, 23 Mar 2026 05:22:57 +0000
Message-ID: <20260323052257.11377-1-singhutkal015@gmail.com>

The current ZSTD decompression path calls ZSTD_getFrameContentSize()
(or the legacy ZSTD_getDecompressedSize()) to read the decompressed size
from the ZSTD frame header. Whenever rq->decodedskip is set or that size
differs from rq->decodedlength, it then malloc()s a buffer of that size.
This is problematic because the frame content size field is untrusted
on-disk metadata: a crafted EROFS image can set it to an arbitrarily
large value, triggering a large allocation before any real validation
occurs.

The Linux kernel's erofs ZSTD decompressor does not use
ZSTD_getFrameContentSize() at all. It uses ZSTD_decompressStream(),
which decompresses directly into a caller-supplied buffer whose size
is already known from the extent map.

Align erofs-utils with the kernel:

 - Use rq->decodedlength (from the trusted extent map) to size the
   output buffer, removing the dependency on the on-disk frame header.
 - Replace ZSTD_decompress() with ZSTD_createDStream(),
   ZSTD_initDStream(), and ZSTD_decompressStream().
 - Remove the HAVE_ZSTD_GETFRAMECONTENTSIZE ifdef block entirely.
 - For the decodedskip case, allocate a temporary buffer of exactly
   rq->decodedlength (not the untrusted frame size).
 - Report decode errors, incomplete frames, and length mismatches as
   -EFSCORRUPTED instead of -EIO.

Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
---
lib/decompress.c | 76 +++++++++++++++++++++++++++++-------------------
1 file changed, 46 insertions(+), 30 deletions(-)
diff --git a/lib/decompress.c b/lib/decompress.c
index e66693c..19cde03 100644
--- a/lib/decompress.c
+++ b/lib/decompress.c
@@ -28,57 +28,73 @@ static unsigned int z_erofs_fixup_insize(const u8 *padbuf, unsigned int padbufsi
/* also a very preliminary userspace version */
static int z_erofs_decompress_zstd(struct z_erofs_decompress_req *rq)
{
- int ret = 0;
+ ZSTD_DStream *dstream;
+ ZSTD_inBuffer in;
+ ZSTD_outBuffer out;
char *dest = rq->out;
char *src = rq->in;
char *buff = NULL;
- unsigned int inputmargin = 0;
- unsigned long long total;
+ unsigned int inputmargin;
+ size_t ret;
+ int err = 0;
inputmargin = z_erofs_fixup_insize((u8 *)src, rq->inputsize);
if (inputmargin >= rq->inputsize)
return -EFSCORRUPTED;
-#ifdef HAVE_ZSTD_GETFRAMECONTENTSIZE
- total = ZSTD_getFrameContentSize(src + inputmargin,
- rq->inputsize - inputmargin);
- if (total == ZSTD_CONTENTSIZE_UNKNOWN ||
- total == ZSTD_CONTENTSIZE_ERROR)
- return -EFSCORRUPTED;
-#else
- total = ZSTD_getDecompressedSize(src + inputmargin,
- rq->inputsize - inputmargin);
-#endif
- if (rq->decodedskip || total != rq->decodedlength) {
- buff = malloc(total);
+ if (rq->decodedskip) {
+ buff = malloc(rq->decodedlength);
if (!buff)
return -ENOMEM;
dest = buff;
}
- ret = ZSTD_decompress(dest, total,
- src + inputmargin, rq->inputsize - inputmargin);
+ dstream = ZSTD_createDStream();
+ if (!dstream) {
+ err = -ENOMEM;
+ goto out_free_buff;
+ }
+
+ ZSTD_initDStream(dstream);
+
+ in.src = src + inputmargin;
+ in.size = rq->inputsize - inputmargin;
+ in.pos = 0;
+
+ out.dst = dest;
+ out.size = rq->decodedlength;
+ out.pos = 0;
+
+ ret = ZSTD_decompressStream(dstream, &out, &in);
if (ZSTD_isError(ret)) {
- erofs_err("ZSTD decompress failed %d: %s", ZSTD_getErrorCode(ret),
- ZSTD_getErrorName(ret));
- ret = -EIO;
- goto out;
+ erofs_err("ZSTD decompress failed: %s", ZSTD_getErrorName(ret));
+ err = -EFSCORRUPTED;
+ goto out_free_dstream;
}
- if (ret != (int)total) {
- erofs_err("ZSTD decompress length mismatch %d, expected %d",
- ret, total);
- ret = -EIO;
- goto out;
+ if (ret != 0) {
+ erofs_err("ZSTD frame not fully decoded");
+ err = -EFSCORRUPTED;
+ goto out_free_dstream;
+ }
+
+ if (out.pos != rq->decodedlength) {
+ erofs_err("ZSTD decompress length mismatch: got %zu, expected %u",
+ out.pos, rq->decodedlength);
+ err = -EFSCORRUPTED;
+ goto out_free_dstream;
}
- if (rq->decodedskip || total != rq->decodedlength)
+
+ if (rq->decodedskip)
memcpy(rq->out, dest + rq->decodedskip,
rq->decodedlength - rq->decodedskip);
- ret = 0;
-out:
+
+out_free_dstream:
+ ZSTD_freeDStream(dstream);
+out_free_buff:
if (buff)
free(buff);
- return ret;
+ return err;
}
#endif
--
2.43.0