From: David Hildenbrand <david@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>,
Cornelia Huck <cohuck@redhat.com>,
Igor Mammedov <imammedo@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
ehabkost@redhat.com, groug@kaod.org, qemu-devel@nongnu.org,
qemu-s390x@nongnu.org, qemu-ppc@nongnu.org, clg@kaod.org,
pbonzini@redhat.com
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH for-2.13] Clear mem_path if we fall back to anonymous RAM allocation
Date: Thu, 19 Apr 2018 16:11:37 +0200
Message-ID: <065165b5-3ab4-ae1a-f72c-c04f911656c3@redhat.com>
In-Reply-To: <77d0717b-6eba-8b20-6691-c3085937604b@de.ibm.com>
On 19.04.2018 15:34, Christian Borntraeger wrote:
>
>
> On 04/19/2018 02:58 PM, Cornelia Huck wrote:
>> On Thu, 19 Apr 2018 14:33:18 +0200
>> Igor Mammedov <imammedo@redhat.com> wrote:
>>
>>> On Thu, 19 Apr 2018 17:21:23 +1000
>>> David Gibson <david@gibson.dropbear.id.au> wrote:
>>>
>>>> If the -mem-path option is set, we attempt to map the guest's RAM from a
>>>> file in the given path; it's usually used to back guest RAM with hugepages.
>>>> If we're unable to (e.g. not enough free hugepages) then we fall back to
>>>> allocating normal anonymous pages. This behaviour can be surprising, but a
>>>> comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
>>>> we can't change.
>>>>
>>>> What really isn't ok, though, is that in this case we leave mem_path set.
>>>> That means functions which attempt to determine the pagesize of main RAM
>>>> can erroneously think it is hugepage based on the requested path, even
>>>> though it's not.
>>>>
>>>> This is particularly bad for the pseries machine type. KVM HV limitations
>>>> mean the guest can't use pagesizes larger than the host page size used to
>>>> back RAM. That means that such a fallback, rather than merely giving
>>>> poorer performance than expected, will cause the guest to freeze up early in
>>>> boot as it attempts to use large page mappings that can't work.
>>>>
>>>> This patch addresses the problem by clearing the mem_path variable when we
>>>> fall back to anonymous pages, meaning that subsequent attempts to
>>>> determine the RAM page size will get an accurate result.
>>>>
>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>> ---
>>>> numa.c | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>>
>>>> Paolo et al, as with my earlier patches adding some extensions to the
>>>> helpers for determining backing page sizes, if there are no objections
>>>> can I get an ack to merge this via my ppc tree?
>>>>
>>>> diff --git a/numa.c b/numa.c
>>>> index 1116c90af9..78a869e598 100644
>>>> --- a/numa.c
>>>> +++ b/numa.c
>>>> @@ -469,6 +469,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>>>>              /* Legacy behavior: if allocation failed, fall back to
>>>>               * regular RAM allocation.
>>>>               */
>>>> +            mem_path = NULL;
>>>>              memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
>>>>          }
>>>>  #else
>>>
>>> mem_path is also used by kvm_s390_apply_cpu_model(),
>>> and in ccw_init() memory is initialized before the CPUs are,
>>> so if QEMU was started with -mem-path, then before the patch
>>> the created CPU won't have CMM enabled and will print the warning:
>>>
>>> "CMM will not be enabled because it is not compatible with hugetlbfs."
>>>
>>> and after the patch it might enable CMM if we clear mem_path.
>>> So the question is: do we care about this?
>>
>> I don't quite remember the cmm semantics here -- Christian?
>
> The CMMA interface does not work on large pages. I think the kernel will react
> with EFAULT in some cases (cmma migration and others) so qemu will probably fail
> unexpectedly.
>
> But this patch seems to only clear mem-path if we do not allocate at all from
> hugetlbfs. So things should be ok, no?
>
>
This even looks like the right thing to me, as hugetlbfs was never
supported.
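
To illustrate why a stale mem_path matters, here is a minimal,
self-contained sketch in C. It is not QEMU code: alloc_from_path(),
allocate_ram(), ram_page_size() and the two page-size constants are
hypothetical stand-ins. Only the mem_path name and the idea of clearing
it on fallback come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values, chosen only for illustration. */
#define HOST_PAGE_SIZE (64UL * 1024)         /* assumed host page size */
#define HUGE_PAGE_SIZE (16UL * 1024 * 1024)  /* assumed hugepage size  */

/* Models the value given with -mem-path (NULL if the option is unset). */
static const char *mem_path;

/* Hypothetical file-backed allocation: fails when no hugepages are free. */
static bool alloc_from_path(const char *path, bool hugepages_free)
{
    return path != NULL && hugepages_free;
}

/*
 * Models the legacy fallback. clear_on_fallback selects the patched
 * behaviour (true) or the old behaviour (false).
 */
static void allocate_ram(bool hugepages_free, bool clear_on_fallback)
{
    if (mem_path && !alloc_from_path(mem_path, hugepages_free)) {
        if (clear_on_fallback) {
            mem_path = NULL;   /* the one-line change from the patch */
        }
        /* ...anonymous RAM would be allocated here... */
    }
}

/* Models a helper that infers the RAM page size from mem_path alone. */
static unsigned long ram_page_size(void)
{
    return mem_path ? HUGE_PAGE_SIZE : HOST_PAGE_SIZE;
}
```

In this model, with mem_path set and no hugepages free,
allocate_ram(false, false) leaves ram_page_size() reporting the hugepage
size although RAM is anonymous, while allocate_ram(false, true) makes it
report the host page size.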
--
Thanks,
David / dhildenb