From: Akihiko Odaki <akihiko.odaki@daynix.com>
To: Helge Deller <deller@gmx.de>
Cc: qemu-devel@nongnu.org, Laurent Vivier <laurent@vivier.eu>
Subject: Re: [PATCH 5/5] linux-user: Do not align brk with host page size
Date: Tue, 1 Aug 2023 07:43:54 +0900 [thread overview]
Message-ID: <aa7e87f0-ae72-0147-3e8e-4d06b736b28c@daynix.com> (raw)
In-Reply-To: <7b55e607-45f9-a574-fb00-b65508e57b10@gmx.de>
On 2023/08/01 0:51, Helge Deller wrote:
> On 7/31/23 10:03, Akihiko Odaki wrote:
>> do_brk() minimizes calls into target_mmap() by aligning the address
>> with host page size, which is potentially larger than the target page
>> size.
>
> Keep in mind, that host page size can be smaller than target page size too.
> That's the reason why host brk (brk_page) and target brk (target_brk)
> needs to be tracked individually.
> So, it's not an optimization, but required.
> Btw, I think we have been there before with the idea of
> just keeping track of host pages...
It does not matter even if the host page size is smaller or larger.
do_brk() solely relies on target_mmap(), which works with target page size.
>
>> However, the current implementation of this optimization has two
>> bugs:
>>
>> - The start of brk is rounded up with the host page size while brk
>> advertises an address aligned with the target page size as the
>> beginning of brk. This makes the beginning of brk unmapped.
>> - Content clearing after mapping is flawed. The size to clear is
>> specified as HOST_PAGE_ALIGN(brk_page) - brk_page, but brk_page is
>> aligned with the host page size so it is always zero.
>>
>> This optimization actually has no practical benefit. It makes difference
>> when brk() is called multiple times with values in a range of the host
>> page size. However, sophisticated memory allocators try to avoid to
>> make such frequent brk() calls. For example, glibc 2.37 calls brk() to
>> shrink the heap only when there is a room more than 128 KiB. It is
>> rare to have a page size larger than 128 KiB if it happens.
>>
>> Let's remove the optimization to fix the bugs and make the code simpler.
>>
>> Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1616
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>> linux-user/elfload.c | 4 ++--
>> linux-user/syscall.c | 54 ++++++++++----------------------------------
>> 2 files changed, 14 insertions(+), 44 deletions(-)
>>
>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>> index 861ec07abc..2aee2298ec 100644
>> --- a/linux-user/elfload.c
>> +++ b/linux-user/elfload.c
>> @@ -3678,8 +3678,8 @@ int load_elf_binary(struct linux_binprm *bprm,
>> struct image_info *info)
>> * to mmap pages in this space.
>> */
>> if (info->reserve_brk) {
>> - abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
>> - abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk +
>> info->reserve_brk);
>> + abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
>> + abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk +
>> info->reserve_brk);
>
> In my patchset I removed the reserve_brk stuff...
>
>> target_munmap(start_brk, end_brk - start_brk);
>> }
>>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index ebdc8c144c..475260b7ce 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -802,81 +802,51 @@ static inline int host_to_target_sock_type(int
>> host_type)
>> }
>>
>> static abi_ulong target_brk, initial_target_brk;
>> -static abi_ulong brk_page;
>
> with that you loose the ability to track the brk address on host.
brk_page actually doesn't track the address on the host. It's just an
address on the target but aligned with host page size. And the alignment
is purely for optimization.
>
>
>> void target_set_brk(abi_ulong new_brk)
>> {
>> target_brk = TARGET_PAGE_ALIGN(new_brk);
>> initial_target_brk = target_brk;
>> - brk_page = HOST_PAGE_ALIGN(target_brk);
>> }
>>
>> /* do_brk() must return target values and target errnos. */
>> abi_long do_brk(abi_ulong brk_val)
>> {
>> abi_long mapped_addr;
>> - abi_ulong new_alloc_size;
>> - abi_ulong new_brk, new_host_brk_page;
>> + abi_ulong new_brk;
>> + abi_ulong old_brk;
>>
>> /* brk pointers are always untagged */
>>
>> - /* return old brk value if brk_val unchanged */
>> - if (brk_val == target_brk) {
>> - return target_brk;
>> - }
>> -
>> /* do not allow to shrink below initial brk value */
>> if (brk_val < initial_target_brk) {
>> return target_brk;
>> }
>>
>> new_brk = TARGET_PAGE_ALIGN(brk_val);
>> - new_host_brk_page = HOST_PAGE_ALIGN(brk_val);
>> + old_brk = TARGET_PAGE_ALIGN(target_brk);
>>
>> - /* brk_val and old target_brk might be on the same page */
>> - if (new_brk == TARGET_PAGE_ALIGN(target_brk)) {
>> - /* empty remaining bytes in (possibly larger) host page */
>> - memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
>
> you can't drop that.
> guest could have clobbered the upper parts, e.g. when it first mapped
> all (e.g. 8k), wrote to it, then released upper half of the host page
> (4k) and
> then remaps the upper 4k again.
> In that case we need to ensure that the upper 4k gets cleaned. On a
> physical
> machine the kernel would have requested a new page, but here we are stuck
> with the bigger host page which we can't simply release.
Such a case should also be handled with target_mmap().
Regards,
Akihiko Odaki
next prev parent reply other threads:[~2023-07-31 22:44 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-31 8:03 [PATCH 0/5] linux-user: brk/mmap fixes Akihiko Odaki
2023-07-31 8:03 ` [PATCH 1/5] linux-user: Unset MAP_FIXED_NOREPLACE for host Akihiko Odaki
2023-07-31 15:43 ` Richard Henderson
2023-07-31 8:03 ` [PATCH 2/5] linux-user: Do not call get_errno() in do_brk() Akihiko Odaki
2023-07-31 15:13 ` Helge Deller
2023-07-31 8:03 ` [PATCH 3/5] linux-user: Use MAP_FIXED_NOREPLACE for do_brk() Akihiko Odaki
2023-07-31 11:44 ` Peter Maydell
2023-07-31 13:32 ` Akihiko Odaki
2023-07-31 13:45 ` Peter Maydell
2023-07-31 15:44 ` Richard Henderson
2023-07-31 8:03 ` [PATCH 4/5] linux-user: Do nothing if too small brk is specified Akihiko Odaki
2023-07-31 15:16 ` Helge Deller
2023-07-31 8:03 ` [PATCH 5/5] linux-user: Do not align brk with host page size Akihiko Odaki
2023-07-31 15:51 ` Helge Deller
2023-07-31 22:43 ` Akihiko Odaki [this message]
2023-07-31 9:31 ` [PATCH 0/5] linux-user: brk/mmap fixes Michael Tokarev
2023-07-31 10:17 ` Joel Stanley
2023-07-31 10:23 ` Akihiko Odaki
2023-07-31 16:47 ` Helge Deller
2023-07-31 17:43 ` Helge Deller
2023-07-31 18:24 ` Helge Deller
2023-08-01 4:49 ` Joel Stanley
2023-08-01 10:43 ` Helge Deller
2023-08-02 7:26 ` Akihiko Odaki
2023-08-02 10:37 ` Helge Deller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aa7e87f0-ae72-0147-3e8e-4d06b736b28c@daynix.com \
--to=akihiko.odaki@daynix.com \
--cc=deller@gmx.de \
--cc=laurent@vivier.eu \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).