xen-devel.lists.xenproject.org archive mirror
From: David Vrabel <david.vrabel@citrix.com>
To: Elena Ufimtseva <ufimtseva@gmail.com>
Cc: Li Yechen <lccycc123@gmail.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Matt Wilson <msw@amazon.com>,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest
Date: Wed, 18 Sep 2013 13:23:41 +0100	[thread overview]
Message-ID: <52399B4D.5070208@citrix.com> (raw)
In-Reply-To: <CAEr7rXg5BrDB4UVmyCdLRQt-oz_CunoGKeOTxzpGRBks-7vMdw@mail.gmail.com>

On 18/09/13 07:16, Elena Ufimtseva wrote:
> On Tue, Sep 17, 2013 at 10:10 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>> On 17/09/13 09:34, Elena Ufimtseva wrote:
>>> Requests NUMA topology info from Xen by issuing a subop
>>> hypercall. Initializes NUMA nodes, sets the number of CPUs,
>>> the distance table and NUMA node memory ranges during boot.
>>> The vNUMA topology is defined by the user in the VM config file.
>>> Memory ranges are represented by the structure vnuma_topology_info,
>>> where the start and end of each memory area are defined in guest
>>> pfns, constructed and aligned according to the domain's e820 map.
>>> If the received structure has errors, the code falls back to the
>>> dummy NUMA init.
>>> Requires Xen with the patches from the vnuma patchset applied;
>>>
>>> Changes since v1:
>>> - moved the test for xen_pv_domain() into xen_numa_init;
>>> - replaced memory block search/allocation by single memblock_alloc;
>>> - moved xen_numa_init to vnuma.c from enlighten.c;
>>> - moved memblock structure to public interface memory.h;
>>> - specified signedness of vnuma topology structure members;
>>> - removed excessive debug output;
>>>
>>> TODO:
>>> - consider common interface for Dom0, HVM and PV guests to provide
>>> vNUMA topology;
>>> - dynamic NUMA balancing (as of kernel 3.11, commit
>>> 6e4664525b1db28f8c4e1130957f70a94c19213e, with the boot parameter
>>> numa_balancing=true, which is the default) crashes a NUMA-enabled
>>> guest. Investigate further.
>>
>>> --- a/arch/x86/mm/numa.c
>>> +++ b/arch/x86/mm/numa.c
>>> @@ -19,6 +19,7 @@
>>>  #include <asm/amd_nb.h>
>>
>> #include <asm/xen/vnuma.h> here...
>>
>>>  #include "numa_internal.h"
>>> +#include "asm/xen/vnuma.h"
>>
>> ... not here.
>>
>>> --- /dev/null
>>> +++ b/arch/x86/xen/vnuma.c
>>> @@ -0,0 +1,92 @@
>>> +#include <linux/err.h>
>>> +#include <linux/memblock.h>
>>> +#include <xen/interface/xen.h>
>>> +#include <xen/interface/memory.h>
>>> +#include <asm/xen/interface.h>
>>> +#include <asm/xen/hypercall.h>
>>> +#include <asm/xen/vnuma.h>
>>> +#ifdef CONFIG_NUMA
>>> +/* Xen PV NUMA topology initialization */
>>> +static unsigned int xen_vnuma_init = 0;
>>> +int xen_vnuma_support()
>>> +{
>>> +     return xen_vnuma_init;
>>> +}
>>
>> I'm not sure how this and the usage in the next patch actually work.
>> xen_vnuma_init is only set after the test of numa_off prior to calling
>> xen_numa_init() which will set xen_vnuma_init.
> 
> David, it's obscure and the naming is not self-explanatory; I will fix
> it. The idea was to make sure that NUMA can be safely turned on (for a
> domU domain, and only if the xen_numa_init call was successful).

I understand what it's for, I just don't see how it works.

The code path looks like (I think):

xen_vnuma_init = 0;

if (!xen_vnuma_init)
    numa_off = 1

if (!numa_off)
    xen_numa_init()

However, if you go with the idea of calling the dummy init in
xen_numa_init()'s error path, you don't need this.


>>> +     for (i = 0; i < numa_topo.nr_nodes; i++) {
>>> +             if (numa_add_memblk(i, varea[i].start, varea[i].end))
>>> +                     /* pass to numa_dummy_init */
>>> +                     goto vnumaout;
>>
>> If there's a failure here, numa may be partially setup.  Do you need to
>> undo any of the bits that have already setup?
> 
> Konrad asked me the same thing, and I was under the impression it is
> safe, but that was based on assumptions I would rather avoid making.
> I will add code to unset NUMA in case of failure.

I looked at the other uses of this and none of them undo on failure so I
think it is fine as is.

>>> +     if (phys)
>>> +             memblock_free(__pa(phys), mem_size);
>>> +     if (physd)
>>> +             memblock_free(__pa(physd), dist_size);
>>> +     if (physc)
>>> +             memblock_free(__pa(physc), cpu_to_node_size);
>>> +     return rc;
>>
>> If you return an error, x86_numa_init() will try to call setup for other
>> NUMA system.  Consider calling numa_dummy_init() directly instead and
>> then returning success.
> 
> David, isn't that what x86_numa_init() is supposed to do: try every
> *numa_init until one succeeds? Will adding an explicit call to the
> dummy NUMA init from xen_numa_init break this logic?

Yes, but if we know we're a PV guest we do not want to try any other
one; we want to fall back to the dummy init immediately.

David

Thread overview: 20+ messages
2013-09-17  8:33 [PATCH RFC v2 0/2] linux/vnuma: vNUMA PV guest support introduction Elena Ufimtseva
2013-09-17  8:34 ` [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest Elena Ufimtseva
2013-09-17 14:10   ` David Vrabel
2013-09-18  6:16     ` Elena Ufimtseva
2013-09-18  7:17       ` Dario Faggioli
2013-09-18  7:41         ` Elena Ufimtseva
2013-09-18 12:23       ` David Vrabel [this message]
2013-09-17 14:21   ` Boris Ostrovsky
2013-09-18  6:30     ` Elena Ufimtseva
2013-09-18  7:33       ` Dario Faggioli
2013-09-18  7:39         ` Elena Ufimtseva
2013-09-18 16:04   ` Dario Faggioli
2013-09-17  8:34 ` [PATCH RFC v2 2/2] linux/vnuma: Enables NUMA for domu PV guest Elena Ufimtseva
2013-09-17 14:17   ` David Vrabel
2013-09-17 14:37     ` Dario Faggioli
2013-09-18  6:32       ` Elena Ufimtseva
2013-09-18 15:14   ` Dario Faggioli
2013-09-27 17:03     ` Konrad Rzeszutek Wilk
2013-09-18 16:16 ` [PATCH RFC v2 0/2] linux/vnuma: vNUMA PV guest support introduction Dario Faggioli
2013-09-18 16:20   ` Elena Ufimtseva
