From: "Andreas Färber" <afaerber@suse.de>
To: Igor Mammedov <imammedo@redhat.com>, qemu-devel@nongnu.org
Cc: ehabkost@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 for-2.3] numa: pc: fix default VCPU to node mapping
Date: Thu, 19 Mar 2015 17:05:13 +0100
Message-ID: <550AF3B9.3090104@suse.de>
In-Reply-To: <1426696705-32472-1-git-send-email-imammedo@redhat.com>

On 18.03.2015 at 17:38, Igor Mammedov wrote:
> Since commit
>    dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> the Linux kernel actually tries to use the CPU-to-node mapping from
> the QEMU-provided SRAT table instead of discarding it, and in some
> cases that breaks build_sched_domains(), which expects a sane
> mapping where cores/threads belonging to the same socket are on
> the same NUMA node.
> 
> With the current default round-robin mapping of VCPUs to nodes,
> the guest ends up with cores/threads belonging to the same socket
> being on different NUMA nodes.
> 
> For example, with the following CLI:
> qemu-kvm -m 4G -smp 5,sockets=2,cores=4,threads=1,maxcpus=8 \
>          -numa node,nodeid=0 -numa node,nodeid=1
> 2.6.32-based kernels will hang on boot due to an incorrectly built
> sched_group-s list in update_sd_lb_stats(),
> so the comment in QEMU justifying the dumb default mapping:
>  "
>   guest OSes must cope with this anyway, because there are BIOSes
>   out there in real machines which also use this scheme.
>  "
> isn't really valid.
> 
> Replacing the default mapping with a manual one, where VCPUs
> belonging to the same socket are on the same NUMA node, fixes the
> issue for guests which can't handle a nonsense topology, i.e.
> changing the CLI to:
>   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> 
> So instead of simply scattering VCPUs around nodes, map
> VCPUs of the same socket to the same NUMA node, which is what
> a guest would expect from sane hardware/BIOS.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> v2:
>   - add machine callback cpu_index_to_socket_id() and use it
>     instead of stub approach
> ---
>  hw/i386/pc.c          |  9 +++++++++
>  include/hw/boards.h   |  5 +++++
>  include/sysemu/numa.h |  3 ++-
>  numa.c                | 18 +++++++++++++-----
>  vl.c                  |  2 +-
>  5 files changed, 30 insertions(+), 7 deletions(-)
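
To make the difference between the two mappings concrete, here is a
minimal standalone sketch in C. It is not the code from the patch:
struct smp_topology and both function names are hypothetical, and the
only inputs taken from the commit message are the example topology
(sockets=2, cores=4, threads=1) and the two NUMA nodes.

```c
#include <assert.h>

/* Hypothetical helper type: per-socket layout from the -smp option. */
struct smp_topology {
    unsigned cores;    /* cores per socket */
    unsigned threads;  /* threads per core */
};

/* Old default: scatter VCPUs round-robin across the nodes. */
unsigned node_round_robin(unsigned cpu_index, unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return cpu_index % nb_nodes;
}

/* Proposed default: every VCPU of one socket lands on the same node. */
unsigned node_by_socket(unsigned cpu_index, unsigned nb_nodes,
                        const struct smp_topology *topo)
{
    unsigned cpus_per_socket = topo->cores * topo->threads;

    assert(nb_nodes > 0 && cpus_per_socket > 0);
    /* Socket ID first, then spread whole sockets over the nodes. */
    return (cpu_index / cpus_per_socket) % nb_nodes;
}
```

With cores=4, threads=1 and two nodes, node_round_robin() alternates
VCPUs 0..7 between node 0 and node 1, while node_by_socket() yields
node 0 for VCPUs 0-3 and node 1 for VCPUs 4-7, i.e. the same layout as
the manual cpus=0-3 / cpus=4-7 command line quoted above.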

Looks great to me now, the hook name with _socket_id is perfect,

Reviewed-by: Andreas Färber <afaerber@suse.de>

but can we do that in three steps please? "machine:" adding callback and
default implementation, "numa:" switching to use it and "pc:" overriding
the new callback - not only nicer subjects but easier to cherry-pick and
bisect then.
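
As a rough illustration of that split, the sketch below separates the
three pieces. Only the hook name cpu_index_to_socket_id comes from the
patch; struct machine_class, the other function names, the fixed
PC_CORES/PC_THREADS values and the choice of default behaviour are
assumptions made for the example, not the actual QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a machine class. */
struct machine_class {
    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
};

/*
 * Step 1 ("machine:"): callback plus a default implementation.
 * Assumption: the default treats every VCPU as its own socket, which
 * keeps the existing round-robin result for machines without override.
 */
unsigned default_cpu_index_to_socket_id(unsigned cpu_index)
{
    return cpu_index;
}

/* Step 2 ("numa:"): the default node mapping goes through the hook. */
unsigned numa_default_node(const struct machine_class *mc,
                           unsigned cpu_index, unsigned nb_nodes)
{
    assert(mc != NULL && mc->cpu_index_to_socket_id != NULL);
    assert(nb_nodes > 0);
    return mc->cpu_index_to_socket_id(cpu_index) % nb_nodes;
}

/*
 * Step 3 ("pc:"): override that derives the socket from the topology.
 * The constants stand in for the cores=4,threads=1 example.
 */
enum { PC_CORES = 4, PC_THREADS = 1 };

unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
{
    return cpu_index / (PC_CORES * PC_THREADS);
}
```

Each step then compiles and behaves sensibly on its own: after steps 1
and 2 the mapping is unchanged, and only step 3 alters what a pc guest
sees, which is what makes bisecting easier.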

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)
