From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Richard Henderson" <richard.henderson@linaro.org>
Cc: qemu-devel@nongnu.org, laurent@vivier.eu
Subject: qemu-x86_64, buster /sbin/ldconfig and setup_arg_pages (a mind dump)
Date: Fri, 17 Jan 2020 17:33:20 +0000
Message-ID: <874kwukmxr.fsf@linaro.org>
Hi Richard,
While I was attempting to test the new vsyscall patches for x86 I
discovered I couldn't debootstrap an x86_64 buster image on my ARM box.
After digging further into it I discovered it was because executing
/sbin/ldconfig crashes and aborts the bootstrap.
This is helpfully reproducible on my main development system which is
also running buster:
./x86_64-linux-user/qemu-x86_64 /sbin/ldconfig
setup_arg_pages: 00000040000e0000
target_set_brk: new_brk=00000040000dfdf8
do_brk(0000000000000000) -> 00000040000e0000 (!new_brk)
do_brk(00000040000e11c0) -> do_brk: allocating 8192 => 00007fb2dace5000
00000040000e0000 (mapped_addr != -1 or brk_page)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
fish: Job 2, “./x86_64-linux-user/qemu-x86_64…” terminated by signal SIGSEGV (Address boundary error)
The failure is in the second do_brk, during the early setup of the
binary's TLS data area. However, for some reason this isn't always the
case. For example, with testthread, which also uses TLS:
./x86_64-linux-user/qemu-x86_64 ./tests/tcg/x86_64-linux-user/testthread
setup_arg_pages: 0000004000000000
target_set_brk: new_brk=00000000004c8558
do_brk(0000000000000000) -> 00000000004c9000 (!new_brk)
do_brk(00000000004ca1c0) -> do_brk: allocating 8192 => 00000000004c9000
00000000004ca1c0 (mapped_addr == brk_page)
do_brk(00000000004eb1c0) -> do_brk: allocating 135168 => 00000000004cb000
00000000004eb1c0 (mapped_addr == brk_page)
do_brk(00000000004ec000) -> 00000000004ec000 (new_brk <= brk_page)
thread1: 0 hello1
thread2: 0 hello2
thread1: 1 hello1
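The two traces can be replayed with a rough model of the do_brk
decision path. This is only a sketch: it assumes 4 KiB pages, uses a
hypothetical list of (start, end) ranges in place of the real guest
address space, and takes its outcome labels from the debug output
above rather than from the QEMU source.

```python
PAGE = 0x1000  # assumed page size


def page_align(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE - 1) & ~(PAGE - 1)


class BrkModel:
    """Toy model of the brk handling seen in the traces (not QEMU code)."""

    def __init__(self, initial_brk, mapped):
        # The traces show the first do_brk(0) returning a page-aligned value.
        self.brk = page_align(initial_brk)
        self.brk_page = self.brk
        self.mapped = list(mapped)  # hypothetical (start, end) ranges
        self.last_alloc = 0

    def _is_free(self, start, end):
        return all(end <= lo or start >= hi for lo, hi in self.mapped)

    def do_brk(self, new_brk):
        if new_brk == 0:
            return self.brk, "!new_brk"
        if new_brk <= self.brk_page:
            # Still inside pages already reserved for the heap.
            self.brk = new_brk
            return self.brk, "new_brk <= brk_page"
        # Need more pages directly above brk_page.
        size = page_align(new_brk - self.brk_page)
        self.last_alloc = size
        if self._is_free(self.brk_page, self.brk_page + size):
            self.mapped.append((self.brk_page, self.brk_page + size))
            self.brk = new_brk
            self.brk_page = page_align(new_brk)
            return self.brk, "mapped_addr == brk_page"
        # Something else already lives there: the brk does not move.
        return self.brk, "mapped_addr != -1 or brk_page"


# ldconfig: the stack mapping starts exactly at the page-aligned brk.
ldconfig = BrkModel(0x40000dfdf8, [(0x4000000000, 0x40000e0000),
                                   (0x40000e0000, 0x40008e1000)])
addr, why = ldconfig.do_brk(0x40000e11c0)
print(hex(addr), why, ldconfig.last_alloc)
```

Feeding the same model the testthread values (initial brk 0x4c8558,
nothing mapped directly above it) lets each request succeed, which is
the difference between the two traces.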
Ultimately the failure is down to setup_arg_pages allocating too low in
the address space in the ldconfig case, which leaves the second brk
unable to expand its region of memory. Turn on -d page and you can
see the region forming:
page layout changed following target_mmap
start end size prot
0000004000000000-0000004000009000 0000000000009000 r--
0000004000009000-00000040000ae000 00000000000a5000 r-x
00000040000ae000-00000040000d8000 000000000002a000 r--
00000040000d8000-00000040000df000 0000000000007000 rw-
00000040000df000-00000040000e0000 0000000000001000 ---
page layout changed following target_mmap
start end size prot
0000004000000000-0000004000009000 0000000000009000 r--
0000004000009000-00000040000ae000 00000000000a5000 r-x
00000040000ae000-00000040000d8000 000000000002a000 r--
00000040000d8000-00000040008e1000 0000000000809000 rw-
setup_arg_pages: 00000040000e0000
guest_base 0x0
page layout changed following binary load
start end size prot
0000004000000000-0000004000009000 0000000000009000 r--
0000004000009000-00000040000ae000 00000000000a5000 r-x
00000040000ae000-00000040000d8000 000000000002a000 r--
00000040000d8000-00000040000e0000 0000000000008000 rw-
00000040000e0000-00000040000e1000 0000000000001000 ---
00000040000e1000-00000040008e1000 0000000000800000 rw-
start_brk 0x0000000000000000
end_code 0x00000040000ad971
start_code 0x0000004000009000
start_data 0x00000040000d8778
end_data 0x00000040000de510
start_stack 0x00000040008e02d0
brk 0x00000040000dfdf8
entry 0x000000400000a370
argv_start 0x00000040008e02d8
env_start 0x00000040008e02e8
auxv_start 0x00000040008e0428
target_set_brk: new_brk=00000040000dfdf8
page layout changed following target_mmap
start end size prot
0000004000000000-0000004000009000 0000000000009000 r--
0000004000009000-00000040000ae000 00000000000a5000 r-x
00000040000ae000-00000040000d8000 000000000002a000 r--
00000040000d8000-00000040000e0000 0000000000008000 rw-
00000040000e0000-00000040000e1000 0000000000001000 ---
00000040000e1000-00000040008e2000 0000000000801000 rw-
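To put a number on how little room the heap has, the layout dump can
be parsed and the gap between the page-aligned brk and the next
mapping computed. A minimal sketch, assuming 4 KiB pages; the helper
names parse_layout and brk_headroom are made up for illustration:

```python
PAGE = 0x1000  # assumed 4 KiB target page size

# "binary load" layout for ldconfig, copied from the -d page output above.
LDCONFIG_LAYOUT = """\
0000004000000000-0000004000009000 0000000000009000 r--
0000004000009000-00000040000ae000 00000000000a5000 r-x
00000040000ae000-00000040000d8000 000000000002a000 r--
00000040000d8000-00000040000e0000 0000000000008000 rw-
00000040000e0000-00000040000e1000 0000000000001000 ---
00000040000e1000-00000040008e1000 0000000000800000 rw-
"""


def parse_layout(text):
    """Return (start, end, prot) tuples for each 'start-end size prot' line."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or "-" not in fields[0]:
            continue  # header or unrelated debug line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        regions.append((start, end, fields[2]))
    return regions


def brk_headroom(brk, regions):
    """Bytes between the page-aligned brk and the next mapping above it.

    Returns None when nothing is mapped above the brk.
    """
    brk_page = (brk + PAGE - 1) & ~(PAGE - 1)
    gaps = [max(start - brk_page, 0)
            for start, end, _ in regions if end > brk_page]
    return min(gaps) if gaps else None


regions = parse_layout(LDCONFIG_LAYOUT)
print(hex(brk_headroom(0x40000dfdf8, regions)))  # prints 0x0
```

With brk at 0x40000dfdf8 the aligned value is 0x40000e0000, which is
exactly where the stack's guard page begins, so the headroom is zero.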
So it looks like setup_arg_pages just creates a segment right in the
middle of a previously allocated block of storage. This is odd because
the loader basically just leaves it to mmap to pick a region:
    error = target_mmap(0, size + guard, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
AFAICT this just depends on where we last allocated. In the
testthread case we already have a high mapping to splat:
page layout changed following target_mmap
start end size prot
0000000000400000-0000000000401000 0000000000001000 r--
0000000000401000-0000000000495000 0000000000094000 r-x
0000000000495000-00000000004bc000 0000000000027000 r--
00000000004bd000-00000000004c9000 000000000000c000 rw-
0000004000000000-0000004000801000 0000000000801000 rw-
setup_arg_pages: 0000004000000000
guest_base 0x0
page layout changed following binary load
start end size prot
0000000000400000-0000000000401000 0000000000001000 r--
0000000000401000-0000000000495000 0000000000094000 r-x
0000000000495000-00000000004bc000 0000000000027000 r--
00000000004bd000-00000000004c9000 000000000000c000 rw-
0000004000000000-0000004000001000 0000000000001000 ---
0000004000001000-0000004000801000 0000000000800000 rw-
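The stack placement in both cases is consistent with a simple
"lowest free gap at or above a base address" search. The sketch below
is an assumption about the behaviour, not the real mmap_find_vma: the
base value 0x4000000000 is read off the traces, first_fit is a
hypothetical name, and the request size 0x801000 is the 8 MiB stack
plus one guard page seen in the layouts.

```python
TASK_UNMAPPED_BASE = 0x4000000000  # base address as it appears in the traces


def first_fit(mapped, size, base=TASK_UNMAPPED_BASE):
    """Return the lowest address >= base where `size` bytes fit.

    `mapped` is a list of (start, end) ranges already in use.
    """
    addr = base
    for start, end in sorted(mapped):
        if end <= addr:
            continue  # region lies entirely below the candidate
        if start - addr >= size:
            break  # the gap before this region is big enough
        addr = end  # skip past this region and keep looking
    return addr


STACK_AND_GUARD = 0x801000

# ldconfig: the image itself sits at the base, so the stack lands right
# behind it, immediately above the brk.
print(hex(first_fit([(0x4000000000, 0x40000e0000)], STACK_AND_GUARD)))

# testthread: the image is low in memory, so the stack gets the base.
print(hex(first_fit([(0x400000, 0x4c9000)], STACK_AND_GUARD)))
```

Under this model the only thing that decides whether the stack ends up
adjacent to the brk is whether the executable itself was placed at the
base address.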
And comparing the ldconfig case to a "normal" one, we can see that the
problem is that all of ldconfig has been allocated in the
TASK_UNMAPPED_BASE region. This is due to ldconfig having a DYNAMIC
region without a load address, which causes mmap_find_vma to get called
to find space for it and then for all the subsequent anonymous regions
that are needed:
load_elf_image: dynamic loaddr 0000000000000000
mmap_find_vma: 0000004000000000
load_elf_image: mapping un-backed region: 0000004000000000:0000000000009000
load_elf_image: mapping un-backed region: 0000004000009000:00000000000a5000
load_elf_image: mapping un-backed region: 00000040000ae000:000000000002a000
load_elf_image: mapping un-backed region: 00000040000d8000:0000000000007000
mmap_find_vma: 00000040000e0000
setup_arg_pages: 00000040000e0000
target_set_brk: new_brk=00000040000dfdf8
mmap_find_vma: 00000040008e1000
mmap_find_vma: 00000040008e2000
do_brk(0000000000000000) -> 00000040000e0000 (!new_brk)
do_brk(00000040000e11c0) -> mmap_find_vma: 00000040000e0000
do_brk: allocating 8192 => 00007fb999e49000
00000040000e0000 (mapped_addr != -1 or brk_page)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
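When the debug output gets this interleaved, it can help to pull out
just the brk requests and their results. A small sketch, assuming the
ad-hoc debug format shown in this mail (16-digit hex addresses and a
parenthesised reason); brk_calls and refused are illustrative names:

```python
import re

# The do_brk lines from the ldconfig trace above.
TRACE = """\
do_brk(0000000000000000) -> 00000040000e0000 (!new_brk)
do_brk(00000040000e11c0) -> mmap_find_vma: 00000040000e0000
do_brk: allocating 8192 => 00007fb999e49000
00000040000e0000 (mapped_addr != -1 or brk_page)
"""

# A request, then (possibly after interleaved output) "<result> (<reason>)".
BRK_RE = re.compile(
    r"do_brk\(([0-9a-f]{16})\) -> .*?([0-9a-f]{16}) \(([^)]*)\)",
    re.DOTALL,
)


def brk_calls(trace):
    """Return (request, result, reason) for each do_brk call in the trace."""
    calls = []
    for m in BRK_RE.finditer(trace):
        request, result = int(m.group(1), 16), int(m.group(2), 16)
        calls.append((request, result, m.group(3)))
    return calls


def refused(calls):
    """Non-zero requests where the returned brk differs from the request."""
    return [c for c in calls if c[0] != 0 and c[1] != c[0]]


for request, result, reason in refused(brk_calls(TRACE)):
    print(f"asked for {request:#x}, got {result:#x} ({reason})")
```

For the ldconfig trace this reports the single request for
0x40000e11c0 that came back as 0x40000e0000.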
But no, actually this all seems to be normal for dynamically linked
things. Still, something must be different:
./x86_64-linux-user/qemu-x86_64 ./tests/tcg/x86_64-linux-user/testthread.dyn
load_elf_image: dynamic loaddr 0000000000000000
mmap_find_vma: 0000004000000000
load_elf_image: mapping un-backed region: 0000004000000000:0000000000001000
load_elf_image: mapping un-backed region: 0000004000001000:0000000000001000
load_elf_image: mapping un-backed region: 0000004000002000:0000000000001000
load_elf_image: mapping un-backed region: 0000004000003000:0000000000002000
mmap_find_vma: 0000004000005000
setup_arg_pages: 0000004000005000
load_elf_image: dynamic loaddr 0000000000000000
mmap_find_vma: 0000004000806000
load_elf_image: mapping un-backed region: 0000004000806000:0000000000001000
load_elf_image: mapping un-backed region: 0000004000807000:000000000001e000
load_elf_image: mapping un-backed region: 0000004000825000:0000000000008000
load_elf_image: mapping un-backed region: 000000400082d000:0000000000002000
target_set_brk: new_brk=0000004000004070
mmap_find_vma: 0000004000830000
mmap_find_vma: 0000004000831000
do_brk(0000000000000000) -> 0000004000005000 (!new_brk)
mmap_find_vma: 0000004000832000
mmap_find_vma: 0000004000857000
mmap_find_vma: 0000004000878000
mmap_find_vma: 000000400087a000
mmap_find_vma: 0000004000a3b000
mmap_find_vma: 0000004000a3e000
do_brk(0000000000000000) -> 0000004000005000 (!new_brk)
do_brk(0000004000026000) -> mmap_find_vma: 0000004000005000
do_brk: allocating 135168 => 00007fa00659b000
0000004000005000 (mapped_addr != -1 or brk_page)
mmap_find_vma: 000000400123f000
mmap_find_vma: 000000400923f000
Recompiled as a dynamic executable, testthread runs fine, leaving
itself enough space to expand the brk region at least once.
So what do we take away from this?
* we need testcases to exercise the memory layout of dynamic binaries
* "special" dynamic binaries can break our careful memory layout
* I feel as though I've trodden on a nest of vipers
Does any of this track with you? What is different about ldconfig that
breaks our memory placement?
--
Alex Bennée