* [PATCH 0/4] linux-user: fix several mremap bugs
@ 2025-10-11 20:03 Matthew Lugg
2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg
I was recently debugging a strange crash in a downstream project which turned
out to be a QEMU bug related to the `mremap` implementation in linux-user. In
practice, this bug essentially led to arbitrary memory regions being unmapped
when a 32-bit guest, running on a 64-bit host, uses `mremap` to shrink a memory
mapping.
The first patch in this set resolves that bug. Since the patch is very simple,
and the bug is quite likely to be hit, I suspect that that commit is a good
candidate for qemu-stable.
The following two patches just resolve two more bugs I became aware of whilst
working on this code. I believe the messages in those patches contain all
necessary context. They are less critical and the fixes more complex, so are
likely not suitable for backporting into qemu-stable.
The final commits adds tcg tests for the fixed `mremap` behavior. The third fix
is unfortunately difficult to test programmatically, but I have confirmed that
it behaves as expected by observing the output of `strace qemu-i386 repro`,
where `repro` is the following C program:
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
int main(void) {
char *a = mmap(NULL, 4097, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
char *b = mmap(NULL, 4097, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
mremap(b, 4097, 4097, MREMAP_FIXED | MREMAP_MAYMOVE, a);
// QEMU has now leaked a page of its memory reservation!
return 0;
}
Prior to the patch, as the comment says, QEMU leaks a page of its address space
reservation (i.e. the page becomes unmapped). After the patch, QEMU correctly
reclaims that page with `mmap`.
Matthew Lugg (4):
linux-user: fix mremap unmapping adjacent region
linux-user: fix mremap errors for invalid ranges
linux-user: fix reserved_va page leak in do_munmap
tests: add tcg coverage for fixed mremap bugs
linux-user/mmap.c | 75 +++++++++++++--------------------
tests/tcg/multiarch/test-mmap.c | 47 ++++++++++++++++++---
2 files changed, 71 insertions(+), 51 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/4] linux-user: fix mremap unmapping adjacent region
2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg
@ 2025-10-11 20:03 ` Matthew Lugg
2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg
This typo meant that calls to `mremap` which shrink a mapping by some N
bytes would, when the virtual address space was pre-reserved (e.g.
32-bit guest on 64-bit host), unmap the N bytes following the *original*
mapping.
Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk>
---
linux-user/mmap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 847092a28a..ec8392b35b 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -1164,7 +1164,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
errno = ENOMEM;
host_addr = MAP_FAILED;
} else if (reserved_va && old_size > new_size) {
- mmap_reserve_or_unmap(old_addr + old_size,
+ /* Re-reserve pages we just shrunk out of the mapping */
+ mmap_reserve_or_unmap(old_addr + new_size,
old_size - new_size);
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/4] linux-user: fix mremap errors for invalid ranges
2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg
2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg
@ 2025-10-11 20:03 ` Matthew Lugg
2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg
2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg
3 siblings, 0 replies; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg
If an address range given to `mremap` is invalid (exceeds addressing
bounds on the guest), we were previously returning `ENOMEM`, which is
not correct. The manpage and the Linux kernel implementation both agree
that if `old_addr`/`old_size` refer to an invalid address, `EFAULT` is
returned, and if `new_addr`/`new_size` refer to an invalid address,
`EINVAL` is returned.
Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk>
---
linux-user/mmap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index ec8392b35b..4c5fe832ad 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -1103,12 +1103,15 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
int prot;
void *host_addr;
- if (!guest_range_valid_untagged(old_addr, old_size) ||
- ((flags & MREMAP_FIXED) &&
+ if (!guest_range_valid_untagged(old_addr, old_size)) {
+ errno = EFAULT;
+ return -1;
+ }
+ if (((flags & MREMAP_FIXED) &&
!guest_range_valid_untagged(new_addr, new_size)) ||
((flags & MREMAP_MAYMOVE) == 0 &&
!guest_range_valid_untagged(old_addr, new_size))) {
- errno = ENOMEM;
+ errno = EINVAL;
return -1;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap
2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg
2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg
2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg
@ 2025-10-11 20:03 ` Matthew Lugg
2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg
3 siblings, 0 replies; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg
The previous logic here had an off-by-one error: assuming 4k pages on
host and guest, if `len == 4097` (indicating to unmap 2 pages), then
`last = start + 4096`, so `real_last = start + 4095`, so ultimately
`real_len = 4096`. I do not believe this could cause any observable bugs
in guests, because `target_munmap` page-aligns the length it passes in.
However, calls to this function in `target_mremap` do not page-align the
length, so those calls could "drop" pages, leading to a part of the
reserved region becoming unmapped. At worst, a host allocation could get
mapped into that hole, then clobbered by a new guest mapping.
A simple fix didn't feel ideal here, because I think this function was
not written as well as it could be. Instead, the logic is simpler if we
use `end = start + len` instead of `last = start + len - 1` (overflow
does not cause any problem here), and use offsets in the loops (avoiding
overflows since the offset is never larger than the host page size).
Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk>
---
linux-user/mmap.c | 63 ++++++++++++++++-------------------------------
1 file changed, 21 insertions(+), 42 deletions(-)
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 4c5fe832ad..e1ed9085c7 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -1014,59 +1014,38 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
static int mmap_reserve_or_unmap(abi_ulong start, abi_ulong len)
{
int host_page_size = qemu_real_host_page_size();
+ abi_ulong end;
abi_ulong real_start;
- abi_ulong real_last;
- abi_ulong real_len;
- abi_ulong last;
- abi_ulong a;
+ abi_ulong real_end;
+ abi_ulong off;
void *host_start;
int prot;
- last = start + len - 1;
+ end = ROUND_UP(start + len, TARGET_PAGE_SIZE);
+
real_start = start & -host_page_size;
- real_last = ROUND_UP(last, host_page_size) - 1;
+ real_end = ROUND_UP(end, host_page_size);
- /*
- * If guest pages remain on the first or last host pages,
- * adjust the deallocation to retain those guest pages.
- * The single page special case is required for the last page,
- * lest real_start overflow to zero.
- */
- if (real_last - real_start < host_page_size) {
- prot = 0;
- for (a = real_start; a < start; a += TARGET_PAGE_SIZE) {
- prot |= page_get_flags(a);
- }
- for (a = last; a < real_last; a += TARGET_PAGE_SIZE) {
- prot |= page_get_flags(a + 1);
- }
- if (prot != 0) {
- return 0;
- }
- } else {
- for (prot = 0, a = real_start; a < start; a += TARGET_PAGE_SIZE) {
- prot |= page_get_flags(a);
- }
- if (prot != 0) {
- real_start += host_page_size;
- }
+ /* end or real_end may have overflowed to 0, but that's okay. */
- for (prot = 0, a = last; a < real_last; a += TARGET_PAGE_SIZE) {
- prot |= page_get_flags(a + 1);
- }
- if (prot != 0) {
- real_last -= host_page_size;
- }
+ /* If [real_start,start) contains a mapped guest page, retain the first page. */
+ for (prot = 0, off = 0; off < start - real_start; off += TARGET_PAGE_SIZE) {
+ prot |= page_get_flags(real_start + off);
+ }
+ if (prot != 0) {
+ real_start += host_page_size;
+ }
- if (real_last < real_start) {
- return 0;
- }
+ /* If [end,real_end) contains a mapped guest page, retain the last page. */
+ for (prot = 0, off = 0; off < real_end - end; off += TARGET_PAGE_SIZE) {
+ prot |= page_get_flags(end + off);
+ }
+ if (prot != 0) {
+ real_end -= host_page_size;
}
- real_len = real_last - real_start + 1;
host_start = g2h_untagged(real_start);
-
- return do_munmap(host_start, real_len);
+ return do_munmap(host_start, real_end - real_start);
}
int target_munmap(abi_ulong start, abi_ulong len)
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs
2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg
` (2 preceding siblings ...)
2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg
@ 2025-10-11 20:03 ` Matthew Lugg
2025-10-11 20:15 ` Matthew Lugg
3 siblings, 1 reply; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg
These tests cover the first two fixes in this patch series. The final
patch is not covered because the bug it fixes is not easily observable
by the guest.
Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk>
---
tests/tcg/multiarch/test-mmap.c | 47 +++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 5 deletions(-)
diff --git a/tests/tcg/multiarch/test-mmap.c b/tests/tcg/multiarch/test-mmap.c
index 96257f8ebe..64df694d1a 100644
--- a/tests/tcg/multiarch/test-mmap.c
+++ b/tests/tcg/multiarch/test-mmap.c
@@ -22,6 +22,7 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
+#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
@@ -36,12 +37,12 @@
do \
{ \
if (!(x)) { \
- fprintf(stderr, "FAILED at %s:%d\n", __FILE__, __LINE__); \
+ fprintf(stderr, " FAILED at %s:%d\n", __FILE__, __LINE__); \
exit (EXIT_FAILURE); \
} \
} while (0)
-unsigned char *dummybuf;
+unsigned char *dummybuf; /* length is 2*pagesize */
static unsigned int pagesize;
static unsigned int pagemask;
int test_fd;
@@ -439,21 +440,56 @@ void check_invalid_mmaps(void)
{
unsigned char *addr;
+ fprintf(stdout, "%s", __func__);
+
/* Attempt to map a zero length page. */
addr = mmap(NULL, 0, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
- fprintf(stdout, "%s addr=%p", __func__, (void *)addr);
fail_unless(addr == MAP_FAILED);
fail_unless(errno == EINVAL);
/* Attempt to map a over length page. */
addr = mmap(NULL, -4, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
- fprintf(stdout, "%s addr=%p", __func__, (void *)addr);
fail_unless(addr == MAP_FAILED);
fail_unless(errno == ENOMEM);
+ /* Attempt to remap a region which exceeds the bounds of memory. */
+ addr = mremap((void *)((uintptr_t)pagesize * 10), SIZE_MAX & ~(size_t)pagemask, pagesize, 0);
+ fail_unless(addr == MAP_FAILED);
+ fail_unless(errno == EFAULT);
+
fprintf(stdout, " passed\n");
}
+void check_shrink_mmaps(void)
+{
+ unsigned char *a, *b, *c;
+ a = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ b = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ c = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+ fail_unless(a != MAP_FAILED);
+ fail_unless(b != MAP_FAILED);
+ fail_unless(c != MAP_FAILED);
+
+ /* Ensure we can read the full mappings */
+ memcpy(dummybuf, a, 2 * pagesize);
+ memcpy(dummybuf, b, 2 * pagesize);
+ memcpy(dummybuf, c, 2 * pagesize);
+
+ /* Shrink the middle mapping in-place; the others should be unaffected */
+ b = mremap(b, pagesize * 2, pagesize, 0);
+ fail_unless(b != MAP_FAILED);
+
+ /* Ensure we can still access all valid mappings */
+ memcpy(dummybuf, a, 2 * pagesize);
+ memcpy(dummybuf, b, pagesize);
+ memcpy(dummybuf, c, 2 * pagesize);
+
+ munmap(a, 2 * pagesize);
+ munmap(b, pagesize);
+ munmap(c, 2 * pagesize);
+}
+
int main(int argc, char **argv)
{
char tempname[] = "/tmp/.cmmapXXXXXX";
@@ -468,7 +504,7 @@ int main(int argc, char **argv)
/* Assume pagesize is a power of two. */
pagemask = pagesize - 1;
- dummybuf = malloc (pagesize);
+ dummybuf = malloc (pagesize * 2);
printf ("pagesize=%u pagemask=%x\n", pagesize, pagemask);
test_fd = mkstemp(tempname);
@@ -496,6 +532,7 @@ int main(int argc, char **argv)
check_file_fixed_eof_mmaps();
check_file_unfixed_eof_mmaps();
check_invalid_mmaps();
+ check_shrink_mmaps();
/* Fails at the moment. */
/* check_aligned_anonymous_fixed_mmaps_collide_with_host(); */
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs
2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg
@ 2025-10-11 20:15 ` Matthew Lugg
0 siblings, 0 replies; 6+ messages in thread
From: Matthew Lugg @ 2025-10-11 20:15 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent, qemu-stable
I only just noticed I accidentally changed a line in a macro at the top
of this
file ("FAILED" -> " FAILED"); that was unintentional. I'll omit that
change from
future versions of this patch.
On 10/11/25 21:03, Matthew Lugg wrote:
> These tests cover the first two fixes in this patch series. The final
> patch is not covered because the bug it fixes is not easily observable
> by the guest.
>
> Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk>
> ---
> tests/tcg/multiarch/test-mmap.c | 47 +++++++++++++++++++++++++++++----
> 1 file changed, 42 insertions(+), 5 deletions(-)
>
> diff --git a/tests/tcg/multiarch/test-mmap.c b/tests/tcg/multiarch/test-mmap.c
> index 96257f8ebe..64df694d1a 100644
> --- a/tests/tcg/multiarch/test-mmap.c
> +++ b/tests/tcg/multiarch/test-mmap.c
> @@ -22,6 +22,7 @@
> * along with this program; if not, see <http://www.gnu.org/licenses/>.
> */
>
> +#define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <stdint.h>
> @@ -36,12 +37,12 @@
> do \
> { \
> if (!(x)) { \
> - fprintf(stderr, "FAILED at %s:%d\n", __FILE__, __LINE__); \
> + fprintf(stderr, " FAILED at %s:%d\n", __FILE__, __LINE__); \
> exit (EXIT_FAILURE); \
> } \
> } while (0)
>
> -unsigned char *dummybuf;
> +unsigned char *dummybuf; /* length is 2*pagesize */
> static unsigned int pagesize;
> static unsigned int pagemask;
> int test_fd;
> @@ -439,21 +440,56 @@ void check_invalid_mmaps(void)
> {
> unsigned char *addr;
>
> + fprintf(stdout, "%s", __func__);
> +
> /* Attempt to map a zero length page. */
> addr = mmap(NULL, 0, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> - fprintf(stdout, "%s addr=%p", __func__, (void *)addr);
> fail_unless(addr == MAP_FAILED);
> fail_unless(errno == EINVAL);
>
> /* Attempt to map a over length page. */
> addr = mmap(NULL, -4, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> - fprintf(stdout, "%s addr=%p", __func__, (void *)addr);
> fail_unless(addr == MAP_FAILED);
> fail_unless(errno == ENOMEM);
>
> + /* Attempt to remap a region which exceeds the bounds of memory. */
> + addr = mremap((void *)((uintptr_t)pagesize * 10), SIZE_MAX & ~(size_t)pagemask, pagesize, 0);
> + fail_unless(addr == MAP_FAILED);
> + fail_unless(errno == EFAULT);
> +
> fprintf(stdout, " passed\n");
> }
>
> +void check_shrink_mmaps(void)
> +{
> + unsigned char *a, *b, *c;
> + a = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + b = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + c = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +
> + fail_unless(a != MAP_FAILED);
> + fail_unless(b != MAP_FAILED);
> + fail_unless(c != MAP_FAILED);
> +
> + /* Ensure we can read the full mappings */
> + memcpy(dummybuf, a, 2 * pagesize);
> + memcpy(dummybuf, b, 2 * pagesize);
> + memcpy(dummybuf, c, 2 * pagesize);
> +
> + /* Shrink the middle mapping in-place; the others should be unaffected */
> + b = mremap(b, pagesize * 2, pagesize, 0);
> + fail_unless(b != MAP_FAILED);
> +
> + /* Ensure we can still access all valid mappings */
> + memcpy(dummybuf, a, 2 * pagesize);
> + memcpy(dummybuf, b, pagesize);
> + memcpy(dummybuf, c, 2 * pagesize);
> +
> + munmap(a, 2 * pagesize);
> + munmap(b, pagesize);
> + munmap(c, 2 * pagesize);
> +}
> +
> int main(int argc, char **argv)
> {
> char tempname[] = "/tmp/.cmmapXXXXXX";
> @@ -468,7 +504,7 @@ int main(int argc, char **argv)
>
> /* Assume pagesize is a power of two. */
> pagemask = pagesize - 1;
> - dummybuf = malloc (pagesize);
> + dummybuf = malloc (pagesize * 2);
> printf ("pagesize=%u pagemask=%x\n", pagesize, pagemask);
>
> test_fd = mkstemp(tempname);
> @@ -496,6 +532,7 @@ int main(int argc, char **argv)
> check_file_fixed_eof_mmaps();
> check_file_unfixed_eof_mmaps();
> check_invalid_mmaps();
> + check_shrink_mmaps();
>
> /* Fails at the moment. */
> /* check_aligned_anonymous_fixed_mmaps_collide_with_host(); */
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-10-11 20:21 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg
2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg
2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg
2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg
2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg
2025-10-11 20:15 ` Matthew Lugg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).