From: Andrew Jones <drjones@redhat.com>
To: gaowanlong@cn.fujitsu.com
Cc: aliguori@us.ibm.com, ehabkost@redhat.com, hutao@cn.fujitsu.com,
qemu-devel@nongnu.org,
peter huangpeng <peter.huangpeng@huawei.com>,
bsd@redhat.com, y-goto@jp.fujitsu.com,
Paolo Bonzini <pbonzini@redhat.com>,
lcapitulino@redhat.com, lersek@redhat.com, afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH V9 06/12] NUMA: Add Linux libnuma detection
Date: Thu, 29 Aug 2013 04:15:49 -0400 (EDT)
Message-ID: <994970720.3441149.1377764149516.JavaMail.root@redhat.com>
In-Reply-To: <521EB076.2080604@cn.fujitsu.com>
----- Original Message -----
> On 08/28/2013 09:44 PM, Paolo Bonzini wrote:
> > Il 26/08/2013 10:43, Andrew Jones ha scritto:
> >>
> >> ----- Original Message -----
> >>>> On 08/26/2013 03:46 PM, Andrew Jones wrote:
> >>>>>>>>>> Is this patch still necessary? I thought that dropping the
> >>>>>>>>>>>>>> numa_num_configured_nodes() calls from patch 8/12 got rid
> >>>>>>>>>>>>>> of the need for this library. Maybe I missed other uses?
> >>>>>>>>>>
> >>>>>>>>>> Yes, in 08/12 we also use mbind(),
> >>>>>> You don't need a whole library for mbind(), it's a syscall. See
> >>>>>> syscall(2).
> >>>>>>
> >>>>>>>>>> and in 09/12 we use max_numa_node().
> >>>>>> Really? I didn't see it there. And anyway, that goes back to our
> >>>>>> discussion
> >>>>>> about setting qemu's MAX_NODES to whatever we think qemu should
> >>>>>> support,
> >>>>>> and then just checking that we don't blow that limit whenever reading
> >>>>>> host node info, i.e.
> >>>>>>
> >>>>>> maxnode = 0;
> >>>>>> while (host_nodes[maxnode] && maxnode < MAX_NODES)
> >>>>>> node_read(&info[maxnode++]);
> >>>>>>
> >>>>>> type of a thing.
> >>>>>>
> >>>>>> And, if there's a place you really need to know the current online
> >>>>>> number
> >>>>>> of host nodes, then, like I said earlier, you should just go to sysfs
> >>>>>> yourself. libnuma:numa_max_node() returns an int that it only
> >>>>>> initializes
> >>>>>> at library load time, so it's not going to adapt to
> >>>>>> onlining/offlining.
> >>>>
> >>>> OK, thank you.
> >>>> Then I should define MPOL_* macros in QEMU and use mbind(2) syscall
> >>>> directly,
> >>>> right?
> >> Hmm, yeah, that's too bad that numaif.h is part of libnuma, and not a more
> >> general lib. Whether or not we want to redefine those symbols within
> >> qemu, in order to avoid the dependency on installing numactl-devel, isn't
> >> something I can answer. That's a better question for Anthony. Anthony?
> >> Paolo,
> >> any opinions? Maybe we should pick up uapi/linux/mempolicy.h with the
> >> linux-header synch script?
> >>
> >
> > I think using libnuma is fine. In principle this could be used on other
> > OSes than Linux, I think?
>
> But seems that mbind(2) is Linux-specific syscall, right?
>
You would need to avoid calling mbind directly, i.e. use libnuma for all
numa related calls. Then, if libnuma were to support more OSes, qemu would
automatically (wrt numa) as well. Your mbind() with libnuma would look
like this

  numa_set_bind_policy(strict)
  numa_tonodemask_memory(addr, size, nodemask)

The problem is that set_bind_policy only takes a bool, and thus only
allows two of the four possible policies

  MPOL_BIND       strict == 1
  MPOL_PREFERRED  strict == 0
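
To spell that mapping out, here is a minimal sketch. The two helper names
are made up for illustration, and the enum just mirrors the values in
uapi/linux/mempolicy.h instead of pulling in numaif.h.

```c
/* Same values as uapi/linux/mempolicy.h, redefined here for illustration. */
enum {
    MPOL_DEFAULT    = 0,
    MPOL_PREFERRED  = 1,
    MPOL_BIND       = 2,
    MPOL_INTERLEAVE = 3,
};

/*
 * Hypothetical helper: the policy that numa_set_bind_policy(strict)
 * ends up selecting, following the two-row table above.
 */
static int policy_from_strict(int strict)
{
    return strict ? MPOL_BIND : MPOL_PREFERRED;
}

/*
 * Hypothetical helper: returns 1 and stores the matching strict value
 * if the policy can be expressed through the bool, 0 otherwise.
 */
static int strict_for_policy(int policy, int *strict)
{
    switch (policy) {
    case MPOL_BIND:
        *strict = 1;
        return 1;
    case MPOL_PREFERRED:
        *strict = 0;
        return 1;
    default:
        /* MPOL_DEFAULT and MPOL_INTERLEAVE have no strict equivalent. */
        return 0;
    }
}
```

MPOL_DEFAULT and MPOL_INTERLEAVE land in the default case, which is the
limitation in a nutshell.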

So, due to libnuma's policy setting limitations, and the fact that it
doesn't currently support more OSes than Linux, I prefer your current
series version that drops libnuma. If qemu needs to support NUMA on
another OS, then we can cross that bridge when we get there.
drew