From: Don Slutz <dslutz@verizon.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Simon Martin <furryfuttock@gmail.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Nate Studer <nate.studer@dornerworks.com>,
Don Slutz <dslutz@verizon.com>,
xen-devel@lists.xen.org
Subject: Re: Strange interdependace between domains
Date: Fri, 14 Feb 2014 05:26:08 -0500 [thread overview]
Message-ID: <52FDEF40.8040709@terremark.com> (raw)
In-Reply-To: <1392333198.32038.153.camel@Solace>
On 02/13/14 18:13, Dario Faggioli wrote:
> On gio, 2014-02-13 at 22:25 +0000, Simon Martin wrote:
>> Thanks for all the replies guys.
>>
> :-)
>
>> Don> How many instructions per second a thread gets does depend on the
>> Don> "idleness" of other threads (no longer just the hyperthread's
>> Don> partner).
>>
>> This seems a bit strange to me. In my case I have the time-critical
>> PV domain running by itself in a CPU pool, so Xen should not be
>> scheduling it, and I can't see how this hypervisor thread would be affected.
>>
> I think Don is referring to the idleness of the other _hardware_ threads
> in the chip, rather than software threads of execution, either in Xen or
> in Dom0/DomU. I checked his original e-mail and, AFAIUI, he seems to
> confirm that the throughput you get on, say, core 3 depends on what
> its sibling core is doing (which really is its sibling hyperthread,
> again in the hardware sense... Gah, the terminology is just a mess!
> :-P). He also seems to add that there is a similar kind of
> inter-dependency between all the hardware hyperthreads, not just
> between siblings.
>
> Does this make sense Don?
>
Yes, but the results I am getting vary based on the distro (most likely
the microcode version).
Linux (and, I think, Xen) both have a CPU scheduler that picks cores
before threads:
top - 04:06:29 up 66 days, 15:31, 11 users, load average: 2.43, 0.72, 0.29
Tasks: 250 total, 1 running, 249 sleeping, 0 stopped, 0 zombie
Cpu0 : 99.7%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.2%hi, 0.1%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Cpu2 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Cpu3 : 1.6%us, 0.1%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 1.4%us, 0.0%sy, 0.0%ni, 98.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Mem: 32940640k total, 18008576k used, 14932064k free, 285740k buffers
Swap: 10223612k total, 4696k used, 10218916k free, 16746224k cached
That is an example without Xen involved, on Fedora 17:
Linux dcs-xen-50 3.8.11-100.fc17.x86_64 #1 SMP Wed May 1 19:31:26 UTC
2013 x86_64 x86_64 x86_64 GNU/Linux
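If you want to double-check which CPU numbers are hyperthread siblings
on your own box, sysfs will tell you (the pairing I use below, 6 & 7
and so on, is just how this machine enumerates them):
  # each file lists the hardware threads that share one core
  grep . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
  # or, if hwloc is installed:
  lstopo --no-io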
On this machine, running my benchmark pinned to the listed CPUs:
CPU 7 only:
start done
thr 0: 14 Feb 14 04:11:08.944566 14 Feb 14 04:13:20.874764
+02:11.930198 ~= 131.93 and 9.10 GiI/Sec
CPUs 6 & 7 (sibling hyperthreads of one core):
start done
thr 0: 14 Feb 14 04:14:31.010426 14 Feb 14 04:18:55.404116
+04:24.393690 ~= 264.39 and 4.54 GiI/Sec
thr 1: 14 Feb 14 04:14:31.010426 14 Feb 14 04:18:55.415561
+04:24.405135 ~= 264.41 and 4.54 GiI/Sec
CPUs 5 & 7 (two separate cores):
start done
thr 0: 14 Feb 14 04:20:28.902831 14 Feb 14 04:22:45.563511
+02:16.660680 ~= 136.66 and 8.78 GiI/Sec
thr 1: 14 Feb 14 04:20:28.902831 14 Feb 14 04:22:46.182159
+02:17.279328 ~= 137.28 and 8.74 GiI/Sec
CPUs 1 & 3 & 5 & 7 (one thread on each core):
start done
thr 0: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:16.870558
+02:52.517256 ~= 172.52 and 6.96 GiI/Sec
thr 1: 14 Feb 14 04:32:24.353301 14 Feb 14 04:35:17.371155
+02:53.017854 ~= 173.02 and 6.94 GiI/Sec
thr 2: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:17.225871
+02:52.872569 ~= 172.87 and 6.94 GiI/Sec
thr 3: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:16.655362
+02:52.302060 ~= 172.30 and 6.96 GiI/Sec
The microcode on this machine (from dmesg):
Feb 14 04:29:21 dcs-xen-51 kernel: [ 41.921367] microcode: CPU3 updated to revision 0x28, date = 2012-04-24
On CentOS 5.10:
Linux dcs-xen-53 2.6.18-371.el5 #1 SMP Tue Oct 1 08:35:08 EDT 2013
x86_64 x86_64 x86_64 GNU/Linux
CPU 7 only:
start done
thr 0: 14 Feb 14 09:43:10.903549 14 Feb 14 09:46:04.925463
+02:54.021914 ~= 174.02 and 6.90 GiI/Sec
CPUs 6 & 7 (sibling hyperthreads of one core):
start done
thr 0: 14 Feb 14 09:49:17.804633 14 Feb 14 09:55:02.473549
+05:44.668916 ~= 344.67 and 3.48 GiI/Sec
thr 1: 14 Feb 14 09:49:17.804618 14 Feb 14 09:55:02.533243
+05:44.728625 ~= 344.73 and 3.48 GiI/Sec
CPUs 5 & 7 (two separate cores):
start done
thr 0: 14 Feb 14 10:01:30.566603 14 Feb 14 10:04:23.024858
+02:52.458255 ~= 172.46 and 6.96 GiI/Sec
thr 1: 14 Feb 14 10:01:30.566603 14 Feb 14 10:04:23.069964
+02:52.503361 ~= 172.50 and 6.96 GiI/Sec
CPUs 1 & 3 & 5 & 7 (one thread on each core):
start done
thr 0: 14 Feb 14 10:05:58.359646 14 Feb 14 10:08:50.984629
+02:52.624983 ~= 172.62 and 6.95 GiI/Sec
thr 1: 14 Feb 14 10:05:58.359646 14 Feb 14 10:08:50.993064
+02:52.633418 ~= 172.63 and 6.95 GiI/Sec
thr 2: 14 Feb 14 10:05:58.359645 14 Feb 14 10:08:50.857982
+02:52.498337 ~= 172.50 and 6.96 GiI/Sec
thr 3: 14 Feb 14 10:05:58.359645 14 Feb 14 10:08:50.905031
+02:52.545386 ~= 172.55 and 6.95 GiI/Sec
The microcode on the CentOS machine (from dmesg):
Feb 14 09:41:42 dcs-xen-53 kernel: microcode: CPU3 updated from revision 0x17 to 0x29, date = 06122013
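If you want to reproduce a rough version of this comparison without my
benchmark program, something like the following bash sketch (untested
as written; the loop count is an arbitrary placeholder) should show the
same effect on bare metal:
  busy() { taskset -c "$1" sh -c \
             'i=0; while [ $i -lt 50000000 ]; do i=$((i+1)); done'; }
  time busy 7                        # baseline: CPU 7 alone
  time { busy 6 & busy 7 & wait; }   # two sibling hyperthreads of one core
  time { busy 5 & busy 7 & wait; }   # two separate cores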
Hope this helps.
-Don Slutz
>>>> 6.- All VCPUs are pinned:
>>>>
>> Dario> Right, although, if you use cpupools, and if I've understood what you're
>> Dario> up to, you really should not require pinning. I mean, the isolation
>> Dario> between the RT-ish domain and the rest of the world should already be
>> Dario> in place thanks to cpupools.
>>
>> This is what I thought; however, when looking at the vcpu-list output,
>> the CPU affinity was "all" until I started pinning. As I wasn't sure
>> whether that was "all inside this cpu pool" or "all", I felt it was
>> safer to do it explicitly.
>>
> Actually, you are right, we could present things in a way that is
> clearer when one observes the output! So, I confirm that, despite the
> fact that you see "all", that "all" is relative to the cpupool the
> domain is assigned to.
>
> I'll try to think about how to make this more evident... A note in the
> manpage and/or the various sources of documentation is the easy (but
> still necessary, I agree) part, and I'll add this to my TODO list.
> Actually modifying the output is trickier, as affinity and cpupools
> are orthogonal by design, and that is (IMHO) the right thing.
>
> I guess trying to tweak the printf()-s in `xl vcpu-list' would not be
> that hard... I'll have a look and see if I can come up with a proposal.
>
>> Dario> So, if you ask me, you're restricting things too much in
>> Dario> pool-0, where dom0 and the Windows VM run. In fact, is there a
>> Dario> specific reason why you need all their vcpus to be statically
>> Dario> pinned, each one to only one pcpu? If not, I'd leave them a
>> Dario> little more freedom.
>>
>> I agree with you here; however, when I don't pin, the CPU affinity is
>> "all". Is that "all in the CPU pool"? I couldn't find that info.
>>
> Again, yes: once a domain is in a cpupool, no matter what its affinity
> says, it won't ever reach a pcpu assigned to another cpupool. The
> technical reason is that each cpupool is ruled by its own (copy of a)
> scheduler, even if you use, e.g., credit for both/all the pools. In
> that case, what you get are two full instances of credit, completely
> independent of each other, each one in charge of only a very specific
> subset of pcpus (as mandated by cpupools). So, different runqueues,
> different data structures, different everything.
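For what it's worth, the way I would set up the dedicated pool is
roughly the following (an untested sketch; "rt-pool" is just a
placeholder name, and the exact cpus= syntax may vary by xl version):
  # rt-pool.cfg (cpupool config file):
  #   name  = "rt-pool"
  #   sched = "credit"
  #   cpus  = ["6", "7"]
  xl cpupool-cpu-remove Pool-0 6
  xl cpupool-cpu-remove Pool-0 7
  xl cpupool-create rt-pool.cfg
  xl cpupool-migrate pv499 rt-pool
  xl vcpu-list pv499    # "all" here now means all pcpus of rt-pool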
>
>> Dario> What I'd try is:
>> Dario> 1. all dom0 and win7 vcpus free, so no pinning in pool0.
>> Dario> 2. pinning as follows:
>> Dario> * all vcpus of win7 --> pcpus 1,2
>> Dario> * all vcpus of dom0 --> no pinning
>> Dario> this way, what you get is the following: win7 could suffer sometimes,
>> Dario> if all its 3 vcpus get busy, but that, I think, is acceptable, at
>> Dario> least up to a certain extent; is that the case?
>> Dario> At the same time, you
>> Dario> are making sure dom0 always has a chance to run, as pcpu#0 would be
>> Dario> its exclusive playground, in case someone, including your pv499
>> Dario> domain, needs its services.
>>
>> This is what I had when I started :-). Thanks for the confirmation
>> that I was doing it right. However, if hyperthreading is the issue,
>> then I will only have 2 pcpus available, and I will assign them both
>> to dom0 and win7.
>>
> Yes, with hyperthreading in mind, that is what you should do.
>
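In xl terms, Dario's suggested layout would be something like the
following (untested; domain names as in your setup, and win7 assumed to
have 3 vcpus as mentioned above):
  # leave dom0 unpinned; restrict all win7 vcpus to pcpus 1-2
  xl vcpu-pin win7 0 1-2
  xl vcpu-pin win7 1 1-2
  xl vcpu-pin win7 2 1-2
  # or persistently, in the win7 domain config:  cpus = "1-2"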
> Once we have confirmed that hyperthreading is the issue, we'll see
> what we can do. I mean, if, in your case, it's fine to 'waste' a cpu,
> then ok, but I think we need a general solution for this... Perhaps
> with slightly worse performance than just leaving one core/hyperthread
> completely idle, but at the same time more resource efficient.
>
> I wonder how tweaking sched_smt_power_savings would deal with
> this...
>
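As far as I know, sched_smt_power_savings is a hypervisor boot
parameter, so trying it means adding it to the Xen command line and
rebooting; something along these lines:
  xl info | grep xen_commandline   # see what the hypervisor booted with
  # then append sched_smt_power_savings=1 to the xen.gz line in the
  # bootloader config and reboot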
>> Dario> Right. Are you familiar with tracing what happens inside Xen
>> Dario> with xentrace and, perhaps, xenalyze? It takes a bit of time to
>> Dario> get used to it but, once you master it, it is a good means of
>> Dario> getting out really useful info!
>>
>> Dario> There is a blog post about that here:
>> Dario> http://blog.xen.org/index.php/2012/09/27/tracing-with-xentrace-and-xenalyze/
>> Dario> and it should have most of the info, or the links to where to
>> Dario> find them.
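For what it's worth, a basic recipe along the lines of that blog post
is roughly the following (from memory and untested; the event mask is
the scheduler class, and the option names are worth checking against
xentrace(8) and xenalyze on your version):
  xentrace -e 0x0002f000 -T 30 /tmp/sched.trace   # ~30s of scheduler events
  xenalyze --summary /tmp/sched.trace > sched.summary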
>>
>> Thanks for this. If this problem is more than just the hyperthreading,
>> then I will definitely use it. It also looks like it might be useful
>> when I start looking at the jitter on the singleshot timer (which
>> should be in a couple of weeks).
>>
> It will prove very useful for that, I'm sure! :-)
>
> Let us know how the re-testing goes.
>
> Regards,
> Dario
>