Re: [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest

All of lore.kernel.org
 help / color / mirror / Atom feed

From: David Vrabel <david.vrabel@citrix.com>
To: Elena Ufimtseva <ufimtseva@gmail.com>
Cc: Li Yechen <lccycc123@gmail.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Matt Wilson <msw@amazon.com>,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest
Date: Wed, 18 Sep 2013 13:23:41 +0100	[thread overview]
Message-ID: <52399B4D.5070208@citrix.com> (raw)
In-Reply-To: <CAEr7rXg5BrDB4UVmyCdLRQt-oz_CunoGKeOTxzpGRBks-7vMdw@mail.gmail.com>

On 18/09/13 07:16, Elena Ufimtseva wrote:
> On Tue, Sep 17, 2013 at 10:10 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>> On 17/09/13 09:34, Elena Ufimtseva wrote:
>>> Requests NUMA topology info from Xen by issuing subop
>>> hypercall. Initializes NUMA nodes, sets number of CPUs,
>>> distance table and NUMA nodes memory ranges during boot.
>>> vNUMA topology defined by user in VM config file. Memory
>>> ranges are represented by structure vnuma_topology_info
>>> where start and end of memory area are defined in guests
>>> pfns numbers, constructed and aligned accordingly to
>>> e820 domain map.
>>> In case the received structure has errors, will fail to
>>> dummy numa init.
>>> Requires XEN with applied patches from vnuma patchset;
>>>
>>> Changes since v1:
>>> - moved the test for xen_pv_domain() into xen_numa_init;
>>> - replaced memory block search/allocation by single memblock_alloc;
>>> - moved xen_numa_init to vnuma.c from enlighten.c;
>>> - moved memblock structure to public interface memory.h;
>>> - specified signedness of vnuma topology structure members;
>>> - removed excessive debug output;
>>>
>>> TODO:
>>> - consider common interface for Dom0, HVM and PV guests to provide
>>> vNUMA topology;
>>> - dynamic numa balancing at the time of this patch (kernel 3.11
>>> 6e4664525b1db28f8c4e1130957f70a94c19213e with boot parameter
>>> numa_balancing=true that is such by default) crashes numa-enabled
>>> guest. Investigate further.
>>
>>> --- a/arch/x86/mm/numa.c
>>> +++ b/arch/x86/mm/numa.c
>>> @@ -19,6 +19,7 @@
>>>  #include <asm/amd_nb.h>
>>
>> #include <asm/xen/vnuma.h> here...
>>
>>>  #include "numa_internal.h"
>>> +#include "asm/xen/vnuma.h"
>>
>> ... not here.
>>
>>> --- /dev/null
>>> +++ b/arch/x86/xen/vnuma.c
>>> @@ -0,0 +1,92 @@
>>> +#include <linux/err.h>
>>> +#include <linux/memblock.h>
>>> +#include <xen/interface/xen.h>
>>> +#include <xen/interface/memory.h>
>>> +#include <asm/xen/interface.h>
>>> +#include <asm/xen/hypercall.h>
>>> +#include <asm/xen/vnuma.h>
>>> +#ifdef CONFIG_NUMA
>>> +/* Xen PV NUMA topology initialization */
>>> +static unsigned int xen_vnuma_init = 0;
>>> +int xen_vnuma_support()
>>> +{
>>> +     return xen_vnuma_init;
>>> +}
>>
>> I'm not sure how this and the usage in the next patch actually work.
>> xen_vnuma_init is only set after the test of numa_off prior to calling
>> xen_numa_init() which will set xen_vnuma_init.
> 
> David, its obscure and naming is not self explanatory.. Will fix it.
> But the idea was to make sure
> that NUMA can be safely turned on (for domu domain and if
> xen_numa_init call was sucessfull).

I understand what it's for, I just don't see how it works.

The code path looks like (I think):

xen_vnuma_init = 0;

if (!xen_vnuma_init)
    numa_off = 1

if (!numa_off)
    xen_numa_init()

However, if you go with the idea of calling dummy init in
xen_num_init()'s error path you don't need this.


>>> +     for (i = 0; i < numa_topo.nr_nodes; i++) {
>>> +             if (numa_add_memblk(i, varea[i].start, varea[i].end))
>>> +                     /* pass to numa_dummy_init */
>>> +                     goto vnumaout;
>>
>> If there's a failure here, numa may be partially setup.  Do you need to
>> undo any of the bits that have already setup?
> 
> Konrad asked me the same and I was under impression it is safe. But
> that was based on assumptions
> what I would rather avoid making.  I will add bits to unset numa in
> case of failure.

I looked at the other uses of this and none of them undo on failure so I
think it is fine as is.

>>> +     if (phys)
>>> +             memblock_free(__pa(phys), mem_size);
>>> +     if (physd)
>>> +             memblock_free(__pa(physd), dist_size);
>>> +     if (physc)
>>> +             memblock_free(__pa(physc), cpu_to_node_size);
>>> +     return rc;
>>
>> If you return an error, x86_numa_init() will try to call setup for other
>> NUMA system.  Consider calling numa_dummy_init() directly instead and
>> then returning success.
> 
> David, isnt it what x86_numa_init() supposed to do? try every
> *numa_init until one succeed?
> Will adding excplicit call to dummy numa from xen_init_numa brake this logic?

Yes, but if we know we're a PV guest we do not want to try any other
one, we want to fallback to the dummy init immediately.

David

next prev parent reply	other threads:[~2013-09-18 12:23 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-17  8:33 [PATCH RFC v2 0/2] linux/vnuma: vNUMA PV guest support introduction Elena Ufimtseva
2013-09-17  8:34 ` [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest Elena Ufimtseva
2013-09-17 14:10   ` David Vrabel
2013-09-18  6:16     ` Elena Ufimtseva
2013-09-18  7:17       ` Dario Faggioli
2013-09-18  7:41         ` Elena Ufimtseva
2013-09-18 12:23       ` David Vrabel [this message]
2013-09-17 14:21   ` Boris Ostrovsky
2013-09-18  6:30     ` Elena Ufimtseva
2013-09-18  7:33       ` Dario Faggioli
2013-09-18  7:39         ` Elena Ufimtseva
2013-09-18 16:04   ` Dario Faggioli
2013-09-17  8:34 ` [PATCH RFC v2 2/2] linux/vnuma: Enables NUMA for domu PV guest Elena Ufimtseva
2013-09-17 14:17   ` David Vrabel
2013-09-17 14:37     ` Dario Faggioli
2013-09-18  6:32       ` Elena Ufimtseva
2013-09-18 15:14   ` Dario Faggioli
2013-09-27 17:03     ` Konrad Rzeszutek Wilk
2013-09-18 16:16 ` [PATCH RFC v2 0/2] linux/vnuma: vNUMA PV guest support introduction Dario Faggioli
2013-09-18 16:20   ` Elena Ufimtseva

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52399B4D.5070208@citrix.com \
    --to=david.vrabel@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dario.faggioli@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=lccycc123@gmail.com \
    --cc=msw@amazon.com \
    --cc=ufimtseva@gmail.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.