All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Alexey Kardashevskiy <aik@ozlabs.ru>,
	Hu Tao <hutao@cn.fujitsu.com>,
	qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
	Anton Blanchard <anton@samba.org>,
	David Rientjes <rientjes@google.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/2 v5] numa: enable sparse node numbering on ppc
Date: Wed, 2 Jul 2014 14:02:14 -0700	[thread overview]
Message-ID: <20140702210214.GC29053@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140702182138.GY3222@otherpad.lan.raisama.net>

On 02.07.2014 [15:21:38 -0300], Eduardo Habkost wrote:
> On Tue, Jul 01, 2014 at 01:50:06PM -0700, Nishanth Aravamudan wrote:
> > On 01.07.2014 [17:39:57 -0300], Eduardo Habkost wrote:
> > > On Tue, Jul 01, 2014 at 01:13:28PM -0700, Nishanth Aravamudan wrote:
> > > [...]
> > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > index 12472c6..cdefafe 100644
> > > > --- a/hw/i386/pc.c
> > > > +++ b/hw/i386/pc.c
> > > > @@ -1121,6 +1121,18 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
> > > >      guest_info->ram_size = below_4g_mem_size + above_4g_mem_size;
> > > >      guest_info->apic_id_limit = pc_apic_id_limit(max_cpus);
> > > >      guest_info->apic_xrupt_override = kvm_allows_irq0_override();
> > > > +    /* No support for sparse NUMA node IDs yet: */
> > > > +    for (i = max_numa_nodeid - 1; i >= 0; i--) {
> > > > +        /* Report large node IDs first, to make mistakes easier to spot */
> > > > +        if (!numa_info[i].present) {
> > > > +            error_report("numa: Node ID missing: %d", i);
> > > > +            exit(EXIT_FAILURE);
> > > > +        }
> > > > +    }
> > > > +
> > > > +    /* This must be always true if all nodes are present */
> > > > +    assert(num_numa_nodes == max_numa_nodeid);
> > > > +
> > > 
> > > I wonder if there's a better place where we could put this check.
> > 
> > Well, only i386 and ppc support NUMA, afaict. So I'm not sure where it
> > makes sense to put it. I guess we could have a flag that the
> > architectures set that indicates sparse NUMA support or not, and put
> > this in the generic code.
> > 
> > Or do you mean putting this check somewhere else in the PC init code?
> 
> I mean somewhere else in the PC init code. But as today the code that
> calls pc_guest_info_init() and pc_memory_init() is duplicated in both
> pc_piix.c and pc_q35.c, this looks like the best place we have.

Ok, so if I send out another revision with the fixed j initialization
below, is there anything else in my changes that you would like fixed?

> > > >      guest_info->numa_nodes = num_numa_nodes;
> > > >      guest_info->node_mem = g_malloc0(guest_info->numa_nodes *
> > > >                                      sizeof *guest_info->node_mem);
> > > [...]
> > > > diff --git a/numa.c b/numa.c
> > > > index 5930df0..a689e52 100644
> > > > --- a/numa.c
> > > > +++ b/numa.c
> > > [...]
> > > > @@ -225,9 +220,12 @@ void set_numa_nodes(void)
> > > >           * must cope with this anyway, because there are BIOSes out there in
> > > >           * real machines which also use this scheme.
> > > >           */
> > > > -        if (i == num_numa_nodes) {
> > > > -            for (i = 0; i < max_cpus; i++) {
> > > > -                set_bit(i, numa_info[i % num_numa_nodes].node_cpu);
> > > > +        if (i == max_numa_nodeid) {
> > > > +            for (i = 0, j = 0; i < max_cpus; i++) {
> > > 
> > > Doesn't j need to be initialized to -1, here?
> > 
> > Arrgh, sorry had been messing with your suggestion to use a while loop.
> > You're right, it needs to be -1 here.
> > 
> > > Except for that, patch looks good to me. But I would be more comfortable
> > > with it if we had automated tests to help ensure we are not breaking
> > > compatibility of existing NUMA command-line conbinations with these
> > > changes.
> > 
> > Is that the test target in the qemu source? Are there examples of any
> > such NUMA tests already?
> 
> I use 'make check' to run them, they are in the tests/ directory.

Got it, thanks.

> I am not aware of any NUMA-related test, but I see two possible ways of
> testing it: using qtest and asking for for the NUMA node info through
> the monitor, or a unit test for numa.c that simply calls
> numa_node_parse() and set_numa_nodes(), and then checks the result on
> numa_info[] directly.

Do you have a preference for which of these to do?

> A third option may be using qtest and checking the resulting ACPI tables
> directly. It would cover even more code, but would be specific to PC.

I'm not comfortable saying I can get to this, as I still don't really
know the ACPI code, but I can put it on my todo list, at least.

> The tests won't be a requirement to me, but they would surely be welcome
> (and would have detected the j=0 mistake above).

I think it makes sense to put this in now, as it would have caught the
original issue(s) with sparse node numbering as well.

Thanks,
Nish

  reply	other threads:[~2014-07-02 21:02 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-01 20:11 [Qemu-devel] [PATCH 1/2] Rename nb_numa_nodes to num_numa_nodes Nishanth Aravamudan
2014-07-01 20:13 ` [Qemu-devel] [PATCH 2/2 v5] numa: enable sparse node numbering on ppc Nishanth Aravamudan
2014-07-01 20:39   ` Eduardo Habkost
2014-07-01 20:50     ` Nishanth Aravamudan
2014-07-02 18:21       ` Eduardo Habkost
2014-07-02 21:02         ` Nishanth Aravamudan [this message]
2014-07-02 21:52           ` Eduardo Habkost
2014-07-01 20:42 ` [Qemu-devel] [PATCH 1/2] Rename nb_numa_nodes to num_numa_nodes Eduardo Habkost
2014-07-03  7:03 ` Michael S. Tsirkin
2014-07-03 16:31   ` Nishanth Aravamudan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140702210214.GC29053@linux.vnet.ibm.com \
    --to=nacc@linux.vnet.ibm.com \
    --cc=aik@ozlabs.ru \
    --cc=anton@samba.org \
    --cc=ehabkost@redhat.com \
    --cc=hutao@cn.fujitsu.com \
    --cc=imammedo@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.