qemu-devel.nongnu.org archive mirror
From: Andrew Jones <drjones@redhat.com>
To: gaowanlong@cn.fujitsu.com
Cc: aliguori@us.ibm.com, ehabkost@redhat.com, hutao@cn.fujitsu.com,
	qemu-devel@nongnu.org,
	peter huangpeng <peter.huangpeng@huawei.com>,
	bsd@redhat.com, y-goto@jp.fujitsu.com,
	Paolo Bonzini <pbonzini@redhat.com>,
	lcapitulino@redhat.com, lersek@redhat.com, afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH V9 06/12] NUMA: Add Linux libnuma detection
Date: Thu, 29 Aug 2013 04:15:49 -0400 (EDT)
Message-ID: <994970720.3441149.1377764149516.JavaMail.root@redhat.com>
In-Reply-To: <521EB076.2080604@cn.fujitsu.com>



----- Original Message -----
> On 08/28/2013 09:44 PM, Paolo Bonzini wrote:
> > Il 26/08/2013 10:43, Andrew Jones ha scritto:
> >>
> >> ----- Original Message -----
> >>>> On 08/26/2013 03:46 PM, Andrew Jones wrote:
> >>>>>>>>>>>>>> Is this patch still necessary? I thought that dropping the
> >>>>>>>>>>>>>> numa_num_configured_nodes() calls from patch 8/12 got rid
> >>>>>>>>>>>>>> of the need for this library. Maybe I missed other uses?
> >>>>>>>>>>
> >>>>>>>>>> Yes, in 08/12 we also use mbind(),
> >>>>>> You don't need a whole library for mbind(), it's a syscall. See
> >>>>>> syscall(2).
> >>>>>>
> >>>>>>>>>> and in 09/12 we use numa_max_node().
> >>>>>> Really? I didn't see it there. And anyway, that goes back to our
> >>>>>> discussion about setting qemu's MAX_NODES to whatever we think qemu
> >>>>>> should support, and then just checking that we don't blow that limit
> >>>>>> whenever reading host node info, i.e.
> >>>>>>
> >>>>>> maxnode = 0;
> >>>>>> while (maxnode < MAX_NODES && host_nodes[maxnode])
> >>>>>>   node_read(&info[maxnode++]);
> >>>>>>
> >>>>>> type of a thing.
> >>>>>>
> >>>>>> And, if there's a place you really need to know the current online
> >>>>>> number of host nodes, then, like I said earlier, you should just go
> >>>>>> to sysfs yourself. libnuma:numa_max_node() returns an int that it
> >>>>>> only initializes at library load time, so it's not going to adapt to
> >>>>>> onlining/offlining.
> >>>>
> >>>> OK, thank you.
> >>>> Then I should define MPOL_* macros in QEMU and use mbind(2) syscall
> >>>> directly, right?
> >> Hmm, yeah, that's too bad that numaif.h is part of libnuma, and not a more
> >> general lib. Whether or not we want to redefine those symbols within
> >> qemu, in order to avoid the dependency on installing numactl-devel, isn't
> >> something I can answer. That's a better question for Anthony. Anthony?
> >> Paolo, any opinions? Maybe we should pick up uapi/linux/mempolicy.h with
> >> the linux-header synch script?
> >>
> > 
> > I think using libnuma is fine.  In principle this could be used on other
> > OSes than Linux, I think?
> 
> But it seems that mbind(2) is a Linux-specific syscall, right?
> 
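
To make the quoted sysfs suggestion a bit more concrete, here is a rough
sketch of reading the online host nodes while refusing to exceed qemu's
own limit. It assumes the usual Linux file /sys/devices/system/node/online
and its range-list format (e.g. "0-3,8"). The MAX_NODES value and the
helper names host_parse_node_list() and host_read_online_nodes() are made
up for illustration; they are not taken from the series.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 64 /* assumption: whatever limit qemu decides to support */

/*
 * Parse a sysfs-style range list such as "0-3,8" and set online[n] = 1
 * for every listed node (the array is cleared first). Returns the highest
 * node id seen, or -1 if the list is empty, malformed, or names a node
 * that does not fit below MAX_NODES.
 */
static int host_parse_node_list(const char *s, unsigned char online[MAX_NODES])
{
    int max = -1;

    assert(s != NULL);
    memset(online, 0, MAX_NODES);

    while (*s && *s != '\n') {
        char *end;
        long lo, hi;

        lo = hi = strtol(s, &end, 10);
        if (end == s) {
            return -1;          /* no digits where a number was expected */
        }
        if (*end == '-') {      /* "lo-hi" range */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s) {
                return -1;
            }
        }
        if (lo < 0 || hi < lo || hi >= MAX_NODES) {
            return -1;          /* would blow qemu's limit */
        }
        for (long n = lo; n <= hi; n++) {
            online[n] = 1;
        }
        if (hi > max) {
            max = (int)hi;
        }
        if (*end == ',') {
            s = end + 1;
        } else if (*end == '\0' || *end == '\n') {
            s = end;
        } else {
            return -1;          /* unexpected separator */
        }
    }
    return max;
}

/*
 * Re-read the host's online node list each time it is needed, so that
 * onlining/offlining is picked up. Returns -1 if the file is missing.
 */
static int host_read_online_nodes(unsigned char online[MAX_NODES])
{
    char buf[256];
    int max = -1;
    FILE *f = fopen("/sys/devices/system/node/online", "r");

    if (!f) {
        return -1;
    }
    if (fgets(buf, sizeof(buf), f)) {
        max = host_parse_node_list(buf, online);
    }
    fclose(f);
    return max;
}
```

Since the file is re-read on every call, nothing is cached at load time,
which is the behaviour the quoted text says numa_max_node() lacks.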

You would need to avoid calling mbind() directly, i.e. use libnuma for all
NUMA-related calls. Then, if libnuma were to support more OSes, qemu would
automatically support them as well (wrt NUMA). Your mbind() with libnuma
would look like this:

numa_set_bind_policy(strict)
numa_tonodemask_memory(addr, size, nodemask)

The problem is that numa_set_bind_policy() only takes a bool, and thus only
allows two of the four possible policies:

MPOL_BIND        strict == 1
MPOL_PREFERRED   strict == 0

So, due to libnuma's policy-setting limitations, and the fact that it doesn't
currently support any OS other than Linux, I prefer the current version of
your series that drops libnuma. If qemu ever needs to support NUMA on
another OS, we can cross that bridge when we get there.
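
For reference, going without libnuma could look roughly like the sketch
below: local copies of the four policy values plus a thin wrapper around
the raw syscall. The numeric values are the ones from the Linux uapi
header linux/mempolicy.h. The QEMU_MPOL_* names, qemu_mbind(),
nodemask_set() and policy_from_strict() are hypothetical names picked for
illustration, not something from the series.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Local copies of the Linux policy values (hypothetical QEMU_ prefix). */
#define QEMU_MPOL_DEFAULT     0
#define QEMU_MPOL_PREFERRED   1
#define QEMU_MPOL_BIND        2
#define QEMU_MPOL_INTERLEAVE  3

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Set the bit for 'node' in a nodemask made of 'nlongs' unsigned longs. */
static int nodemask_set(unsigned long *mask, size_t nlongs, unsigned node)
{
    assert(mask != NULL);
    if (node / BITS_PER_LONG >= nlongs) {
        return -1;              /* node does not fit in the mask */
    }
    mask[node / BITS_PER_LONG] |= 1UL << (node % BITS_PER_LONG);
    return 0;
}

/* The only two policies a strict/non-strict bool can select. */
static int policy_from_strict(int strict)
{
    return strict ? QEMU_MPOL_BIND : QEMU_MPOL_PREFERRED;
}

/*
 * Thin wrapper around the raw mbind syscall; see mbind(2) for the exact
 * meaning of maxnode and flags. Returns 0 on success, -1 with errno set
 * on failure (ENOSYS where the syscall number is not available).
 */
static long qemu_mbind(void *addr, unsigned long len, int mode,
                       const unsigned long *nodemask, unsigned long maxnode,
                       unsigned flags)
{
#ifdef SYS_mbind
    return syscall(SYS_mbind, addr, len, mode, nodemask, maxnode, flags);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

With something like this, the mode argument can be any of the four values,
including QEMU_MPOL_INTERLEAVE, rather than only the two that
policy_from_strict() can produce.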

drew


