* [PATCH 0/4] linux-user: fix several mremap bugs @ 2025-10-11 20:03 Matthew Lugg 2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg ` (3 more replies) 0 siblings, 4 replies; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg I was recently debugging a strange crash in a downstream project which turned out to be a QEMU bug related to the `mremap` implementation in linux-user. In practice, this bug essentially led to arbitrary memory regions being unmapped when a 32-bit guest, running on a 64-bit host, uses `mremap` to shrink a memory mapping. The first patch in this set resolves that bug. Since the patch is very simple, and the bug is quite likely to be hit, I suspect that that commit is a good candidate for qemu-stable. The following two patches just resolve two more bugs I became aware of whilst working on this code. I believe the messages in those patches contain all necessary context. They are less critical and the fixes more complex, so are likely not suitable for backporting into qemu-stable. The final commits adds tcg tests for the fixed `mremap` behavior. The third fix is unfortunately difficult to test programmatically, but I have confirmed that it behaves as expected by observing the output of `strace qemu-i386 repro`, where `repro` is the following C program: #define _GNU_SOURCE #include <stddef.h> #include <sys/mman.h> int main(void) { char *a = mmap(NULL, 4097, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); char *b = mmap(NULL, 4097, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); mremap(b, 4097, 4097, MREMAP_FIXED | MREMAP_MAYMOVE, a); // QEMU has now leaked a page of its memory reservation! return 0; } Prior to the patch, as the comment says, QEMU leaks a page of its address space reservation (i.e. the page becomes unmapped). After the patch, QEMU correctly reclaims that page with `mmap`. Matthew Lugg (4): linux-user: fix mremap unmapping adjacent region linux-user: fix mremap errors for invalid ranges linux-user: fix reserved_va page leak in do_munmap tests: add tcg coverage for fixed mremap bugs linux-user/mmap.c | 75 +++++++++++++-------------------- tests/tcg/multiarch/test-mmap.c | 47 ++++++++++++++++++--- 2 files changed, 71 insertions(+), 51 deletions(-) -- 2.51.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/4] linux-user: fix mremap unmapping adjacent region 2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg @ 2025-10-11 20:03 ` Matthew Lugg 2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg ` (2 subsequent siblings) 3 siblings, 0 replies; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg This typo meant that calls to `mremap` which shrink a mapping by some N bytes would, when the virtual address space was pre-reserved (e.g. 32-bit guest on 64-bit host), unmap the N bytes following the *original* mapping. Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk> --- linux-user/mmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/linux-user/mmap.c b/linux-user/mmap.c index 847092a28a..ec8392b35b 100644 --- a/linux-user/mmap.c +++ b/linux-user/mmap.c @@ -1164,7 +1164,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size, errno = ENOMEM; host_addr = MAP_FAILED; } else if (reserved_va && old_size > new_size) { - mmap_reserve_or_unmap(old_addr + old_size, + /* Re-reserve pages we just shrunk out of the mapping */ + mmap_reserve_or_unmap(old_addr + new_size, old_size - new_size); } } -- 2.51.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/4] linux-user: fix mremap errors for invalid ranges 2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg 2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg @ 2025-10-11 20:03 ` Matthew Lugg 2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg 2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg 3 siblings, 0 replies; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg If an address range given to `mremap` is invalid (exceeds addressing bounds on the guest), we were previously returning `ENOMEM`, which is not correct. The manpage and the Linux kernel implementation both agree that if `old_addr`/`old_size` refer to an invalid address, `EFAULT` is returned, and if `new_addr`/`new_size` refer to an invalid address, `EINVAL` is returned. Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk> --- linux-user/mmap.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/linux-user/mmap.c b/linux-user/mmap.c index ec8392b35b..4c5fe832ad 100644 --- a/linux-user/mmap.c +++ b/linux-user/mmap.c @@ -1103,12 +1103,15 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size, int prot; void *host_addr; - if (!guest_range_valid_untagged(old_addr, old_size) || - ((flags & MREMAP_FIXED) && + if (!guest_range_valid_untagged(old_addr, old_size)) { + errno = EFAULT; + return -1; + } + if (((flags & MREMAP_FIXED) && !guest_range_valid_untagged(new_addr, new_size)) || ((flags & MREMAP_MAYMOVE) == 0 && !guest_range_valid_untagged(old_addr, new_size))) { - errno = ENOMEM; + errno = EINVAL; return -1; } -- 2.51.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap 2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg 2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg 2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg @ 2025-10-11 20:03 ` Matthew Lugg 2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg 3 siblings, 0 replies; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg The previous logic here had an off-by-one error: assuming 4k pages on host and guest, if `len == 4097` (indicating to unmap 2 pages), then `last = start + 4096`, so `real_last = start + 4095`, so ultimately `real_len = 4096`. I do not believe this could cause any observable bugs in guests, because `target_munmap` page-aligns the length it passes in. However, calls to this function in `target_mremap` do not page-align the length, so those calls could "drop" pages, leading to a part of the reserved region becoming unmapped. At worst, a host allocation could get mapped into that hole, then clobbered by a new guest mapping. A simple fix didn't feel ideal here, because I think this function was not written as well as it could be. Instead, the logic is simpler if we use `end = start + len` instead of `last = start + len - 1` (overflow does not cause any problem here), and use offsets in the loops (avoiding overflows since the offset is never larger than the host page size). Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk> --- linux-user/mmap.c | 63 ++++++++++++++++------------------------------- 1 file changed, 21 insertions(+), 42 deletions(-) diff --git a/linux-user/mmap.c b/linux-user/mmap.c index 4c5fe832ad..e1ed9085c7 100644 --- a/linux-user/mmap.c +++ b/linux-user/mmap.c @@ -1014,59 +1014,38 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot, static int mmap_reserve_or_unmap(abi_ulong start, abi_ulong len) { int host_page_size = qemu_real_host_page_size(); + abi_ulong end; abi_ulong real_start; - abi_ulong real_last; - abi_ulong real_len; - abi_ulong last; - abi_ulong a; + abi_ulong real_end; + abi_ulong off; void *host_start; int prot; - last = start + len - 1; + end = ROUND_UP(start + len, TARGET_PAGE_SIZE); + real_start = start & -host_page_size; - real_last = ROUND_UP(last, host_page_size) - 1; + real_end = ROUND_UP(end, host_page_size); - /* - * If guest pages remain on the first or last host pages, - * adjust the deallocation to retain those guest pages. - * The single page special case is required for the last page, - * lest real_start overflow to zero. - */ - if (real_last - real_start < host_page_size) { - prot = 0; - for (a = real_start; a < start; a += TARGET_PAGE_SIZE) { - prot |= page_get_flags(a); - } - for (a = last; a < real_last; a += TARGET_PAGE_SIZE) { - prot |= page_get_flags(a + 1); - } - if (prot != 0) { - return 0; - } - } else { - for (prot = 0, a = real_start; a < start; a += TARGET_PAGE_SIZE) { - prot |= page_get_flags(a); - } - if (prot != 0) { - real_start += host_page_size; - } + /* end or real_end may have overflowed to 0, but that's okay. */ - for (prot = 0, a = last; a < real_last; a += TARGET_PAGE_SIZE) { - prot |= page_get_flags(a + 1); - } - if (prot != 0) { - real_last -= host_page_size; - } + /* If [real_start,start) contains a mapped guest page, retain the first page. */ + for (prot = 0, off = 0; off < start - real_start; off += TARGET_PAGE_SIZE) { + prot |= page_get_flags(real_start + off); + } + if (prot != 0) { + real_start += host_page_size; + } - if (real_last < real_start) { - return 0; - } + /* If [end,real_end) contains a mapped guest page, retain the last page. */ + for (prot = 0, off = 0; off < real_end - end; off += TARGET_PAGE_SIZE) { + prot |= page_get_flags(end + off); + } + if (prot != 0) { + real_end -= host_page_size; } - real_len = real_last - real_start + 1; host_start = g2h_untagged(real_start); - - return do_munmap(host_start, real_len); + return do_munmap(host_start, real_end - real_start); } int target_munmap(abi_ulong start, abi_ulong len) -- 2.51.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs 2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg ` (2 preceding siblings ...) 2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg @ 2025-10-11 20:03 ` Matthew Lugg 2025-10-11 20:15 ` Matthew Lugg 3 siblings, 1 reply; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:03 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable, Matthew Lugg These tests cover the first two fixes in this patch series. The final patch is not covered because the bug it fixes is not easily observable by the guest. Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk> --- tests/tcg/multiarch/test-mmap.c | 47 +++++++++++++++++++++++++++++---- 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/tests/tcg/multiarch/test-mmap.c b/tests/tcg/multiarch/test-mmap.c index 96257f8ebe..64df694d1a 100644 --- a/tests/tcg/multiarch/test-mmap.c +++ b/tests/tcg/multiarch/test-mmap.c @@ -22,6 +22,7 @@ * along with this program; if not, see <http://www.gnu.org/licenses/>. */ +#define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <stdint.h> @@ -36,12 +37,12 @@ do \ { \ if (!(x)) { \ - fprintf(stderr, "FAILED at %s:%d\n", __FILE__, __LINE__); \ + fprintf(stderr, " FAILED at %s:%d\n", __FILE__, __LINE__); \ exit (EXIT_FAILURE); \ } \ } while (0) -unsigned char *dummybuf; +unsigned char *dummybuf; /* length is 2*pagesize */ static unsigned int pagesize; static unsigned int pagemask; int test_fd; @@ -439,21 +440,56 @@ void check_invalid_mmaps(void) { unsigned char *addr; + fprintf(stdout, "%s", __func__); + /* Attempt to map a zero length page. */ addr = mmap(NULL, 0, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); - fprintf(stdout, "%s addr=%p", __func__, (void *)addr); fail_unless(addr == MAP_FAILED); fail_unless(errno == EINVAL); /* Attempt to map a over length page. */ addr = mmap(NULL, -4, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); - fprintf(stdout, "%s addr=%p", __func__, (void *)addr); fail_unless(addr == MAP_FAILED); fail_unless(errno == ENOMEM); + /* Attempt to remap a region which exceeds the bounds of memory. */ + addr = mremap((void *)((uintptr_t)pagesize * 10), SIZE_MAX & ~(size_t)pagemask, pagesize, 0); + fail_unless(addr == MAP_FAILED); + fail_unless(errno == EFAULT); + fprintf(stdout, " passed\n"); } +void check_shrink_mmaps(void) +{ + unsigned char *a, *b, *c; + a = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + b = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + c = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + + fail_unless(a != MAP_FAILED); + fail_unless(b != MAP_FAILED); + fail_unless(c != MAP_FAILED); + + /* Ensure we can read the full mappings */ + memcpy(dummybuf, a, 2 * pagesize); + memcpy(dummybuf, b, 2 * pagesize); + memcpy(dummybuf, c, 2 * pagesize); + + /* Shrink the middle mapping in-place; the others should be unaffected */ + b = mremap(b, pagesize * 2, pagesize, 0); + fail_unless(b != MAP_FAILED); + + /* Ensure we can still access all valid mappings */ + memcpy(dummybuf, a, 2 * pagesize); + memcpy(dummybuf, b, pagesize); + memcpy(dummybuf, c, 2 * pagesize); + + munmap(a, 2 * pagesize); + munmap(b, pagesize); + munmap(c, 2 * pagesize); +} + int main(int argc, char **argv) { char tempname[] = "/tmp/.cmmapXXXXXX"; @@ -468,7 +504,7 @@ int main(int argc, char **argv) /* Assume pagesize is a power of two. */ pagemask = pagesize - 1; - dummybuf = malloc (pagesize); + dummybuf = malloc (pagesize * 2); printf ("pagesize=%u pagemask=%x\n", pagesize, pagemask); test_fd = mkstemp(tempname); @@ -496,6 +532,7 @@ int main(int argc, char **argv) check_file_fixed_eof_mmaps(); check_file_unfixed_eof_mmaps(); check_invalid_mmaps(); + check_shrink_mmaps(); /* Fails at the moment. */ /* check_aligned_anonymous_fixed_mmaps_collide_with_host(); */ -- 2.51.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs 2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg @ 2025-10-11 20:15 ` Matthew Lugg 0 siblings, 0 replies; 6+ messages in thread From: Matthew Lugg @ 2025-10-11 20:15 UTC (permalink / raw) To: qemu-devel; +Cc: laurent, qemu-stable I only just noticed I accidentally changed a line in a macro at the top of this file ("FAILED" -> " FAILED"); that was unintentional. I'll omit that change from future versions of this patch. On 10/11/25 21:03, Matthew Lugg wrote: > These tests cover the first two fixes in this patch series. The final > patch is not covered because the bug it fixes is not easily observable > by the guest. > > Signed-off-by: Matthew Lugg <mlugg@mlugg.co.uk> > --- > tests/tcg/multiarch/test-mmap.c | 47 +++++++++++++++++++++++++++++---- > 1 file changed, 42 insertions(+), 5 deletions(-) > > diff --git a/tests/tcg/multiarch/test-mmap.c b/tests/tcg/multiarch/test-mmap.c > index 96257f8ebe..64df694d1a 100644 > --- a/tests/tcg/multiarch/test-mmap.c > +++ b/tests/tcg/multiarch/test-mmap.c > @@ -22,6 +22,7 @@ > * along with this program; if not, see <http://www.gnu.org/licenses/>. > */ > > +#define _GNU_SOURCE > #include <stdio.h> > #include <stdlib.h> > #include <stdint.h> > @@ -36,12 +37,12 @@ > do \ > { \ > if (!(x)) { \ > - fprintf(stderr, "FAILED at %s:%d\n", __FILE__, __LINE__); \ > + fprintf(stderr, " FAILED at %s:%d\n", __FILE__, __LINE__); \ > exit (EXIT_FAILURE); \ > } \ > } while (0) > > -unsigned char *dummybuf; > +unsigned char *dummybuf; /* length is 2*pagesize */ > static unsigned int pagesize; > static unsigned int pagemask; > int test_fd; > @@ -439,21 +440,56 @@ void check_invalid_mmaps(void) > { > unsigned char *addr; > > + fprintf(stdout, "%s", __func__); > + > /* Attempt to map a zero length page. */ > addr = mmap(NULL, 0, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); > - fprintf(stdout, "%s addr=%p", __func__, (void *)addr); > fail_unless(addr == MAP_FAILED); > fail_unless(errno == EINVAL); > > /* Attempt to map a over length page. */ > addr = mmap(NULL, -4, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); > - fprintf(stdout, "%s addr=%p", __func__, (void *)addr); > fail_unless(addr == MAP_FAILED); > fail_unless(errno == ENOMEM); > > + /* Attempt to remap a region which exceeds the bounds of memory. */ > + addr = mremap((void *)((uintptr_t)pagesize * 10), SIZE_MAX & ~(size_t)pagemask, pagesize, 0); > + fail_unless(addr == MAP_FAILED); > + fail_unless(errno == EFAULT); > + > fprintf(stdout, " passed\n"); > } > > +void check_shrink_mmaps(void) > +{ > + unsigned char *a, *b, *c; > + a = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); > + b = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); > + c = mmap(NULL, pagesize * 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); > + > + fail_unless(a != MAP_FAILED); > + fail_unless(b != MAP_FAILED); > + fail_unless(c != MAP_FAILED); > + > + /* Ensure we can read the full mappings */ > + memcpy(dummybuf, a, 2 * pagesize); > + memcpy(dummybuf, b, 2 * pagesize); > + memcpy(dummybuf, c, 2 * pagesize); > + > + /* Shrink the middle mapping in-place; the others should be unaffected */ > + b = mremap(b, pagesize * 2, pagesize, 0); > + fail_unless(b != MAP_FAILED); > + > + /* Ensure we can still access all valid mappings */ > + memcpy(dummybuf, a, 2 * pagesize); > + memcpy(dummybuf, b, pagesize); > + memcpy(dummybuf, c, 2 * pagesize); > + > + munmap(a, 2 * pagesize); > + munmap(b, pagesize); > + munmap(c, 2 * pagesize); > +} > + > int main(int argc, char **argv) > { > char tempname[] = "/tmp/.cmmapXXXXXX"; > @@ -468,7 +504,7 @@ int main(int argc, char **argv) > > /* Assume pagesize is a power of two. */ > pagemask = pagesize - 1; > - dummybuf = malloc (pagesize); > + dummybuf = malloc (pagesize * 2); > printf ("pagesize=%u pagemask=%x\n", pagesize, pagemask); > > test_fd = mkstemp(tempname); > @@ -496,6 +532,7 @@ int main(int argc, char **argv) > check_file_fixed_eof_mmaps(); > check_file_unfixed_eof_mmaps(); > check_invalid_mmaps(); > + check_shrink_mmaps(); > > /* Fails at the moment. */ > /* check_aligned_anonymous_fixed_mmaps_collide_with_host(); */ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-10-11 20:21 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-10-11 20:03 [PATCH 0/4] linux-user: fix several mremap bugs Matthew Lugg 2025-10-11 20:03 ` [PATCH 1/4] linux-user: fix mremap unmapping adjacent region Matthew Lugg 2025-10-11 20:03 ` [PATCH 2/4] linux-user: fix mremap errors for invalid ranges Matthew Lugg 2025-10-11 20:03 ` [PATCH 3/4] linux-user: fix reserved_va page leak in do_munmap Matthew Lugg 2025-10-11 20:03 ` [PATCH 4/4] tests: add tcg coverage for fixed mremap bugs Matthew Lugg 2025-10-11 20:15 ` Matthew Lugg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).