[PATCH 1/7] lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression

From: Gao Xiang <xiang@kernel.org>
To: linux-erofs@lists.ozlabs.org, LKML <linux-kernel@vger.kernel.org>
Cc: Lasse Collin <lasse.collin@tukaani.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	Gao Xiang <hsiangkao@linux.alibaba.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 1/7] lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
Date: Mon, 11 Oct 2021 05:31:39 +0800
Message-ID: <20211010213145.17462-2-xiang@kernel.org>
In-Reply-To: <20211010213145.17462-1-xiang@kernel.org>

From: Lasse Collin <lasse.collin@tukaani.org>

With valid files, the safety margin described in lib/decompress_unxz.c
ensures that the input and output buffers cannot overlap. But if the
uncompressed size of the input is larger than the caller thought, which
is possible when the input file is invalid/corrupt, the buffers can
overlap. Obviously the result will then be garbage (and usually the
decoder will return an error too), but with memmove() no other harm
will happen when such an overrun occurs.

This change affects only uncompressed LZMA2 chunks, so it should
have no effect on performance.

Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 lib/decompress_unxz.c |  2 +-
 lib/xz/xz_dec_lzma2.c | 21 +++++++++++++++++++--
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/lib/decompress_unxz.c b/lib/decompress_unxz.c
index a2f38e23004a..f7a3dc13316a 100644
--- a/lib/decompress_unxz.c
+++ b/lib/decompress_unxz.c
@@ -167,7 +167,7 @@
  * memeq and memzero are not used much and any remotely sane implementation
  * is fast enough. memcpy/memmove speed matters in multi-call mode, but
  * the kernel image is decompressed in single-call mode, in which only
- * memcpy speed can matter and only if there is a lot of uncompressible data
+ * memmove speed can matter and only if there is a lot of uncompressible data
  * (LZMA2 stores uncompressible chunks in uncompressed form). Thus, the
  * functions below should just be kept small; it's probably not worth
  * optimizing for speed.
diff --git a/lib/xz/xz_dec_lzma2.c b/lib/xz/xz_dec_lzma2.c
index 7a6781e3f47b..d548cf0e59fe 100644
--- a/lib/xz/xz_dec_lzma2.c
+++ b/lib/xz/xz_dec_lzma2.c
@@ -387,7 +387,14 @@ static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
 
 		*left -= copy_size;
 
-		memcpy(dict->buf + dict->pos, b->in + b->in_pos, copy_size);
+		/*
+		 * If doing in-place decompression in single-call mode and the
+		 * uncompressed size of the file is larger than the caller
+		 * thought (i.e. it is invalid input!), the buffers below may
+		 * overlap and cause undefined behavior with memcpy().
+		 * With valid inputs memcpy() would be fine here.
+		 */
+		memmove(dict->buf + dict->pos, b->in + b->in_pos, copy_size);
 		dict->pos += copy_size;
 
 		if (dict->full < dict->pos)
@@ -397,7 +404,11 @@ static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
 			if (dict->pos == dict->end)
 				dict->pos = 0;
 
-			memcpy(b->out + b->out_pos, b->in + b->in_pos,
+			/*
+			 * Like above but for multi-call mode: use memmove()
+			 * to avoid undefined behavior with invalid input.
+			 */
+			memmove(b->out + b->out_pos, b->in + b->in_pos,
 					copy_size);
 		}
 
@@ -421,6 +432,12 @@ static uint32_t dict_flush(struct dictionary *dict, struct xz_buf *b)
 		if (dict->pos == dict->end)
 			dict->pos = 0;
 
+		/*
+		 * These buffers cannot overlap even if doing in-place
+		 * decompression because in multi-call mode dict->buf
+		 * has been allocated by us in this file; it's not
+		 * provided by the caller like in single-call mode.
+		 */
 		memcpy(b->out + b->out_pos, dict->buf + dict->start,
 				copy_size);
 	}
-- 
2.20.1
