qemu-devel.nongnu.org archive mirror
From: Greg Kurz <groug@kaod.org>
To: David Hildenbrand <david@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	ehabkost@redhat.com, qemu-devel@nongnu.org,
	qemu-s390x@nongnu.org, qemu-ppc@nongnu.org, clg@kaod.org,
	pbonzini@redhat.com
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH for-2.13] Clear mem_path if we fall back to anonymous RAM allocation
Date: Thu, 19 Apr 2018 18:08:51 +0200
Message-ID: <20180419180851.461a0db3@bahia.lan>
In-Reply-To: <065165b5-3ab4-ae1a-f72c-c04f911656c3@redhat.com>

On Thu, 19 Apr 2018 16:11:37 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 19.04.2018 15:34, Christian Borntraeger wrote:
> > 
> > 
> > On 04/19/2018 02:58 PM, Cornelia Huck wrote:  
> >> On Thu, 19 Apr 2018 14:33:18 +0200
> >> Igor Mammedov <imammedo@redhat.com> wrote:
> >>  
> >>> On Thu, 19 Apr 2018 17:21:23 +1000
> >>> David Gibson <david@gibson.dropbear.id.au> wrote:
> >>>  
> >>>> If the -mem-path option is set, we attempt to map the guest's RAM from a
> >>>> file in the given path; it's usually used to back guest RAM with hugepages.
> >>>> If we're unable to (e.g. not enough free hugepages) then we fall back to
> >>>> allocating normal anonymous pages.  This behaviour can be surprising, but a
> >>>> comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
> >>>> we can't change.
> >>>>
> >>>> What really isn't ok, though, is that in this case we leave mem_path set.
> >>>> That means functions which attempt to determine the pagesize of main RAM
> >>>> can erroneously think it is hugepage based on the requested path, even
> >>>> though it's not.
> >>>>
> >>>> This is particularly bad for the pseries machine type.  KVM HV limitations
> >>>> mean the guest can't use pagesizes larger than the host page size used to
> >>>> back RAM.  That means that such a fallback, rather than merely giving
> >>>> poorer performance than expected, will cause the guest to freeze up early in
> >>>> boot as it attempts to use large page mappings that can't work.
> >>>>
> >>>> This patch addresses the problem by clearing the mem_path variable when we
> >>>> fall back to anonymous pages, meaning that subsequent attempts to
> >>>> determine the RAM page size will get an accurate result.
> >>>>
> >>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>>> ---
> >>>>  numa.c | 1 +
> >>>>  1 file changed, 1 insertion(+)
> >>>>
> >>>> Paolo et al, as with my earlier patches adding some extensions to the
> >>>> helpers for determining backing page sizes, if there are no objections
> >>>> can I get an ack to merge this via my ppc tree?
> >>>>
> >>>> diff --git a/numa.c b/numa.c
> >>>> index 1116c90af9..78a869e598 100644
> >>>> --- a/numa.c
> >>>> +++ b/numa.c
> >>>> @@ -469,6 +469,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> >>>>              /* Legacy behavior: if allocation failed, fall back to
> >>>>               * regular RAM allocation.
> >>>>               */
> >>>> +            mem_path = NULL;
> >>>>              memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
> >>>>          }
> >>>>  #else    
> >>>
> >>> mem_path is also used by kvm_s390_apply_cpu_model(),
> >>> and in ccw_init() memory is initialized before CPUs are

Something similar happens with spapr: kvm_fixup_page_sizes() calls
qemu_getrampagesize() during CPU start, which happens before the machine
init calls allocate_system_memory_nonnuma(). Shouldn't we allocate memory
before calling spapr_init_cpus() in spapr_machine_init(), then?
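
To illustrate the ordering concern, here is a toy model in plain C. It is
not QEMU code: ram_pagesize(), allocate_ram(), boot() and the two page size
constants are hypothetical stand-ins for the real helpers, and the sketch
assumes the fallback clears mem_path as the patch proposes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only; real values depend on the host. */
#define BASE_PAGE_SIZE (64UL * 1024)
#define HUGE_PAGE_SIZE (16UL * 1024 * 1024)

/* Toy stand-in for the global set by -mem-path. */
static const char *mem_path;

/* Guess the RAM page size from mem_path alone. */
static unsigned long ram_pagesize(void)
{
    return mem_path ? HUGE_PAGE_SIZE : BASE_PAGE_SIZE;
}

/* Model the legacy fallback: on failure, use anonymous RAM and
 * clear mem_path so that later queries see the real backing. */
static void allocate_ram(bool hugepages_available)
{
    if (mem_path && !hugepages_available) {
        mem_path = NULL;
    }
}

/* Return the page size the "CPU init" step observes when the
 * hugepage allocation fails, for either initialization order. */
static unsigned long boot(bool allocate_first)
{
    unsigned long seen;

    mem_path = "/dev/hugepages";
    if (allocate_first) {
        allocate_ram(false);
        seen = ram_pagesize();
    } else {
        seen = ram_pagesize();   /* too early: mem_path is still set */
        allocate_ram(false);
    }
    return seen;
}
```

In this model, boot(false) still reports the huge page size even though the
fallback happened, while boot(true) reports the base page size.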

> >>> so if QEMU was started with -mem-path, then before the patch the
> >>> created CPU won't have CMM enabled and will print the warning:
> >>>   
> >>>  "CMM will not be enabled because it is not compatible with hugetlbfs."
> >>>
> >>> and after the patch it might enable CMM if we clear mem_path.
> >>> So the question is: do we care about this?  
> >>
> >> I don't quite remember the cmm semantics here -- Christian?  
> > 
> > The CMMA interface does not work on large pages. I think the kernel will react
> > with EFAULT in some cases (cmma migration and others) so qemu will probably fail
> > unexpectedly. 
> > 
> > But this patch seems to only clear mem-path if we do not allocate at all from
> > hugetlbfs. So things should be ok, no?
> > 
> >   
> 
> This even looks like the right thing to me, as hugetlbfs was never
> supported.
> 

Unrelated to this patch: -mem-path can be passed a path that doesn't sit
on a hugetlbfs, in which case we use getpagesize(). Is there a reason for
kvm_s390_enable_cmma() to filter out this case as well? Or should we rather
check that mem_path isn't NULL and points to a hugetlbfs?
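
A minimal sketch of what the second option could look like, assuming Linux
statfs(2) and the hugetlbfs magic number from <linux/magic.h>.
path_is_hugetlbfs() is a hypothetical helper name used here for
illustration, not an existing QEMU function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/vfs.h>

#ifndef HUGETLBFS_MAGIC
#define HUGETLBFS_MAGIC 0x958458f6U   /* value from <linux/magic.h> */
#endif

/* Return true only if path is non-NULL and lives on a hugetlbfs mount. */
static bool path_is_hugetlbfs(const char *path)
{
    struct statfs fs;
    int ret;

    if (path == NULL) {
        return false;
    }

    /* Retry if the call is interrupted by a signal. */
    do {
        ret = statfs(path, &fs);
    } while (ret != 0 && errno == EINTR);

    if (ret != 0) {
        return false;   /* treat an unreadable path as "not hugetlbfs" */
    }

    /* f_type is signed on some ABIs, so compare as unsigned int. */
    return (unsigned int)fs.f_type == HUGETLBFS_MAGIC;
}
```

With such a check, a -mem-path pointing at a regular directory would not be
treated as hugepage backed.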
