From: Dario Faggioli <dario.faggioli@citrix.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: keir@xen.org, Ian.Campbell@citrix.com,
stefano.stabellini@eu.citrix.com, george.dunlap@eu.citrix.com,
msw@linux.com, lccycc123@gmail.com, ian.jackson@eu.citrix.com,
xen-devel@lists.xen.org, JBeulich@suse.com,
Elena Ufimtseva <ufimtseva@gmail.com>
Subject: Re: [PATCH v6 00/10] vnuma introduction
Date: Tue, 22 Jul 2014 17:06:37 +0200
Message-ID: <1406041597.17850.74.camel@Solace>
In-Reply-To: <20140722144846.GB6448@zion.uk.xensource.com>
On mar, 2014-07-22 at 15:48 +0100, Wei Liu wrote:
> On Tue, Jul 22, 2014 at 04:03:44PM +0200, Dario Faggioli wrote:
> > I mean, even right now, PV guests see completely random cache-sharing
> > topology, and that does (at least potentially) affect performance, as
> > the guest scheduler will make incorrect/inconsistent assumptions.
> >
>
> Correct. It's just that it might be more obvious to see the problem with
> vNUMA.
>
Yep.
> > > Yes, given that you derive NUMA memory allocation from cpu pinning, or
> > > use a combination of cpu pinning, vcpu to vnode map and vnode to pnode
> > > map, in those cases those IDs might reflect the right topology.
> > >
> > Well, pinning does (should?) not always happen, as a consequence of a
> > virtual topology being used.
> >
>
> That's true. I was just referring to the current status of the patch
> series. AIUI that's how it is implemented now, not necessarily the way
> it has to be.
>
Ok.
> > With the following guest configuration, in terms of vcpu pinning:
> >
> > 1) 2 vCPUs ==> same pCPUs
>
> 4 vcpus, I think.
>
> > root@benny:~# xl vcpu-list
> > Name ID VCPU CPU State Time(s) CPU Affinity
> > debian.guest.osstest 9 0 0 -b- 2.7 0
> > debian.guest.osstest 9 1 0 -b- 5.2 0
> > debian.guest.osstest 9 2 7 -b- 2.4 7
> > debian.guest.osstest 9 3 7 -b- 4.4 7
> >
What I meant with "2 vCPUs" was that I was putting 2 vCPUs of the guest
(0 and 1) on the same pCPU (0), and the other 2 (2 and 3) on another one
(7).
That should have resulted in a guest topology where at least the lowest
cache level is not shared between the two pairs of vCPUs, but that is
not what happens.
> > 2) no SMT
> > root@benny:~# xl vcpu-list
> > Name ID VCPU CPU State Time(s) CPU Affinity
> > debian.guest.osstest 11 0 0 -b- 0.6 0
> > debian.guest.osstest 11 1 2 -b- 0.4 2
> > debian.guest.osstest 11 2 4 -b- 1.5 4
> > debian.guest.osstest 11 3 6 -b- 0.5 6
> >
> > 3) Random
> > root@benny:~# xl vcpu-list
> > Name ID VCPU CPU State Time(s) CPU Affinity
> > debian.guest.osstest 12 0 3 -b- 1.6 all
> > debian.guest.osstest 12 1 1 -b- 1.4 all
> > debian.guest.osstest 12 2 5 -b- 2.4 all
> > debian.guest.osstest 12 3 7 -b- 1.5 all
> >
> > 4) yes SMT
> > root@benny:~# xl vcpu-list
> > Name ID VCPU CPU State Time(s) CPU Affinity
> > debian.guest.osstest 14 0 1 -b- 1.0 1
> > debian.guest.osstest 14 1 2 -b- 1.8 2
> > debian.guest.osstest 14 2 6 -b- 1.1 6
> > debian.guest.osstest 14 3 7 -b- 0.8 7
> >
> > And, in *all* these 4 cases, here's what I see:
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/core_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# lstopo
> > Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
> > PU L#0 (P#0)
> > PU L#1 (P#1)
> > PU L#2 (P#2)
> > PU L#3 (P#3)
> >
>
> I won't be surprised if the guest builds up a wrong topology, as what
> real "ID"s it sees depends very much on which pcpus you pick.
>
Exactly, but if I pin all the guest vCPUs to specific host pCPUs from
the very beginning (with the pinning specified in the config file, which
is what I'm doing), I should be able to control that...
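For reference, the kind of pinning I described above (vCPUs 0 and 1 on
pCPU 0, vCPUs 2 and 3 on pCPU 7) comes from config file entries along
these lines (just a sketch, not copied from my actual config; it assumes
the per-vCPU list form of "cpus" documented in xl.cfg(5) is available in
the xl being used):

  vcpus = 4
  # one hard affinity entry per vCPU:
  # vCPUs 0,1 --> pCPU 0; vCPUs 2,3 --> pCPU 7 (illustrative values)
  cpus = ["0", "0", "7", "7"]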
> Have you tried pinning vcpus to pcpus [0, 1, 2, 3]? That way you should
> be able to see the same topology as the one you saw in Dom0?
>
Well, at least some of the examples above should have shown some
non-shared cache levels already. Anyway, here it comes:
root@benny:~# xl vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
debian.guest.osstest 15 0 0 -b- 1.8 0
debian.guest.osstest 15 1 1 -b- 0.7 1
debian.guest.osstest 15 2 2 -b- 0.6 2
debian.guest.osstest 15 3 3 -b- 0.7 3
root@debian:~# hwloc-ls --of console
Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
PU L#0 (P#0)
PU L#1 (P#1)
PU L#2 (P#2)
PU L#3 (P#3)
root@debian:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 4
Core(s) per socket: 1
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 3591.780
BogoMIPS: 7183.56
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
So, no, that is not giving the same result as in Dom0. :-(
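(Just as a way of peeking at the raw IDs the guest kernel derives those
sibling maps from, and assuming the guest kernel exposes them in
/proc/cpuinfo, one can also do something like the following; it has
nothing to do with the series itself, it is only for inspection:)

  # print, per vCPU, the APIC IDs the guest kernel sampled at boot;
  # the topology/sibling maps shown above are built out of these
  grep -E '^(processor|apicid|initial apicid)' /proc/cpuinfo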
> > This is not the case for dom0 where (I booted with dom0_max_vcpus=4 on
> > the xen command line) I see this:
> >
>
> I guess this is because you're basically picking pcpu 0-3 for Dom0. It
> doesn't matter if you pin them or not.
>
That makes total sense, and in fact, I was not surprised about Dom0
looking like this... What I am surprised about is not being able to get
a similar topology in the guest, no matter how I pin it... :-/
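(For completeness, on the Dom0 side, the boot time counterpart of this
kind of pinning would be something like the following on the Xen command
line, assuming the dom0_vcpus_pin option is available in the hypervisor
being used:

  # 4 Dom0 vCPUs, each pinned 1:1 to pCPUs 0-3
  dom0_max_vcpus=4 dom0_vcpus_pin

not that it changes much here, since Dom0 ends up on pCPUs 0-3 anyway,
as you say.)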
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)