From: Don Slutz <dslutz@verizon.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Simon Martin <furryfuttock@gmail.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Nate Studer <nate.studer@dornerworks.com>,
Don Slutz <dslutz@verizon.com>,
xen-devel@lists.xen.org
Subject: Re: Strange interdependace between domains
Date: Fri, 14 Feb 2014 05:26:08 -0500
Message-ID: <52FDEF40.8040709@terremark.com>
In-Reply-To: <1392333198.32038.153.camel@Solace>
On 02/13/14 18:13, Dario Faggioli wrote:
> On gio, 2014-02-13 at 22:25 +0000, Simon Martin wrote:
>> Thanks for all the replies guys.
>>
> :-)
>
>> Don> How many instructions per second a thread gets does depend on the
>> Don> "idleness" of other threads (no longer just the hyperthread's
>> Don> partner).
>>
>> This seems a bit strange to me. In my case I have time
>> critical PV running by itself in a CPU pool. So Xen should not be
>> scheduling it, so I can't see how this Hypervisor thread would be affected.
>>
> I think Don is referring to the idleness of the other _hardware_ threads
> in the chip, rather than software threads of execution, either in Xen or
> in Dom0/DomU. I checked his original e-mail and, AFAIUI, he seems to
> confirm that the throughput you get on, say, core 3, depends on what
> its sibling core is doing (which really is its sibling hyperthread, again
> in the hardware sense... Gah, the terminology is just a mess! :-P). He
> seems to also add that there is a similar kind of inter-dependency
> between all the hardware hyperthreads, not just between siblings.
>
> Does this make sense Don?
>
Yes, but the results I am getting vary based on the distro (most likely
the microcode version).
Linux and (I think) Xen both have a CPU scheduler that picks cores
before threads:
top - 04:06:29 up 66 days, 15:31, 11 users, load average: 2.43, 0.72, 0.29
Tasks: 250 total, 1 running, 249 sleeping, 0 stopped, 0 zombie
Cpu0 : 99.7%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.2%hi, 0.1%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Cpu2 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Cpu3 : 1.6%us, 0.1%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 1.4%us, 0.0%sy, 0.0%ni, 98.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 99.9%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.0%si, 0.0%st
Mem: 32940640k total, 18008576k used, 14932064k free, 285740k buffers
Swap: 10223612k total, 4696k used, 10218916k free, 16746224k cached
This is an example without Xen involved, on Fedora 17:
Linux dcs-xen-50 3.8.11-100.fc17.x86_64 #1 SMP Wed May 1 19:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
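To check which logical CPUs are hyperthread siblings on a Linux box, something along these lines could help. It is only a sketch: it assumes the usual sysfs file /sys/devices/system/cpu/cpuN/topology/thread_siblings_list is present, and the function names are made up for illustration.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a sysfs CPU list such as '6-7' or '0,4' into a list of ints."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def sibling_groups(sysfs_root: str = "/sys/devices/system/cpu") -> list[tuple[int, ...]]:
    """Return the distinct sets of logical CPUs that share a physical core."""
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(groups)


if __name__ == "__main__":
    for group in sibling_groups():
        print("core shared by logical CPUs:", group)
```

If the groups come out as adjacent pairs (0 with 1, 2 with 3, and so on), that would match the top output above, where exactly one logical CPU per pair is busy.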
On this machine:
Just 7:
       start                     done
thr 0: 14 Feb 14 04:11:08.944566 14 Feb 14 04:13:20.874764 +02:11.930198 ~= 131.93 and 9.10 GiI/Sec
6 & 7:
       start                     done
thr 0: 14 Feb 14 04:14:31.010426 14 Feb 14 04:18:55.404116 +04:24.393690 ~= 264.39 and 4.54 GiI/Sec
thr 1: 14 Feb 14 04:14:31.010426 14 Feb 14 04:18:55.415561 +04:24.405135 ~= 264.41 and 4.54 GiI/Sec
5 & 7:
       start                     done
thr 0: 14 Feb 14 04:20:28.902831 14 Feb 14 04:22:45.563511 +02:16.660680 ~= 136.66 and 8.78 GiI/Sec
thr 1: 14 Feb 14 04:20:28.902831 14 Feb 14 04:22:46.182159 +02:17.279328 ~= 137.28 and 8.74 GiI/Sec
1 & 3 & 5 & 7:
       start                     done
thr 0: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:16.870558 +02:52.517256 ~= 172.52 and 6.96 GiI/Sec
thr 1: 14 Feb 14 04:32:24.353301 14 Feb 14 04:35:17.371155 +02:53.017854 ~= 173.02 and 6.94 GiI/Sec
thr 2: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:17.225871 +02:52.872569 ~= 172.87 and 6.94 GiI/Sec
thr 3: 14 Feb 14 04:32:24.353302 14 Feb 14 04:35:16.655362 +02:52.302060 ~= 172.30 and 6.96 GiI/Sec
This is from:
Feb 14 04:29:21 dcs-xen-51 kernel: [ 41.921367] microcode: CPU3 updated to revision 0x28, date = 2012-04-24
On CentOS 5.10:
Linux dcs-xen-53 2.6.18-371.el5 #1 SMP Tue Oct 1 08:35:08 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
Only 7:
       start                     done
thr 0: 14 Feb 14 09:43:10.903549 14 Feb 14 09:46:04.925463 +02:54.021914 ~= 174.02 and 6.90 GiI/Sec
6 & 7:
       start                     done
thr 0: 14 Feb 14 09:49:17.804633 14 Feb 14 09:55:02.473549 +05:44.668916 ~= 344.67 and 3.48 GiI/Sec
thr 1: 14 Feb 14 09:49:17.804618 14 Feb 14 09:55:02.533243 +05:44.728625 ~= 344.73 and 3.48 GiI/Sec
5 & 7:
       start                     done
thr 0: 14 Feb 14 10:01:30.566603 14 Feb 14 10:04:23.024858 +02:52.458255 ~= 172.46 and 6.96 GiI/Sec
thr 1: 14 Feb 14 10:01:30.566603 14 Feb 14 10:04:23.069964 +02:52.503361 ~= 172.50 and 6.96 GiI/Sec
1 & 3 & 5 & 7:
       start                     done
thr 0: 14 Feb 14 10:05:58.359646 14 Feb 14 10:08:50.984629 +02:52.624983 ~= 172.62 and 6.95 GiI/Sec
thr 1: 14 Feb 14 10:05:58.359646 14 Feb 14 10:08:50.993064 +02:52.633418 ~= 172.63 and 6.95 GiI/Sec
thr 2: 14 Feb 14 10:05:58.359645 14 Feb 14 10:08:50.857982 +02:52.498337 ~= 172.50 and 6.96 GiI/Sec
thr 3: 14 Feb 14 10:05:58.359645 14 Feb 14 10:08:50.905031 +02:52.545386 ~= 172.55 and 6.95 GiI/Sec
Feb 14 09:41:42 dcs-xen-53 kernel: microcode: CPU3 updated from revision 0x17 to 0x29, date = 06122013
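As a rough cross-check of the two sets of numbers, here is a minimal Python sketch. The rates are copied from the runs above; the function and variable names are just for illustration. It recomputes an elapsed time from the timestamps and expresses each run's per-thread rate relative to the "7 alone" run on the same machine. Time multiplied by rate comes out at roughly 1200 GiI per thread in every run, which suggests a fixed-size workload, though that is an inference from the figures, not something measured separately.

```python
from datetime import datetime

# Timestamp layout used in the tables, e.g. "14 Feb 14 04:11:08.944566"
FMT = "%d %b %y %H:%M:%S.%f"


def elapsed_seconds(start: str, done: str) -> float:
    """Elapsed wall-clock time between two timestamps from the tables."""
    delta = datetime.strptime(done, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Per-thread GiI/sec as reported above, keyed by the hardware threads used.
fedora17 = {
    "7": [9.10],
    "6 & 7": [4.54, 4.54],
    "5 & 7": [8.78, 8.74],
    "1 & 3 & 5 & 7": [6.96, 6.94, 6.94, 6.96],
}
centos5 = {
    "7": [6.90],
    "6 & 7": [3.48, 3.48],
    "5 & 7": [6.96, 6.96],
    "1 & 3 & 5 & 7": [6.95, 6.95, 6.96, 6.95],
}


def summarize(runs: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Map each run to (mean per-thread rate / single-thread rate, total rate)."""
    baseline = runs["7"][0]
    summary = {}
    for label, rates in runs.items():
        mean_rate = sum(rates) / len(rates)
        summary[label] = (round(mean_rate / baseline, 3), round(sum(rates), 2))
    return summary


if __name__ == "__main__":
    for name, runs in (("Fedora 17", fedora17), ("CentOS 5.10", centos5)):
        print(name)
        for label, (ratio, total) in summarize(runs).items():
            print(f"  {label:>14}: {ratio:.3f} of single-thread rate, {total:.2f} GiI/sec total")
```

On both machines the "6 & 7" run gives each thread about half the single-thread rate. Only the Fedora 17 machine shows a further drop as more cores get busy.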
Hope this helps.
-Don Slutz
>>>> 6.- All VCPUs are pinned:
>>>>
>> Dario> Right, although, if you use cpupools, and if I've understood what you're
>> Dario> up to, you really should not require pinning. I mean, the isolation
>> Dario> between the RT-ish domain and the rest of the world should be already in
>> Dario> place thanks to cpupools.
>>
>> This is what I thought; however, when looking at the vcpu-list output,
>> the CPU affinity was "all" until I started pinning. As I wasn't sure
>> whether that was "all inside this cpu pool" or "all", I felt it was
>> safer to do it explicitly.
>>
> Actually, you are right, we could present things in a way that is
> clearer when one observes the output! So, I confirm that, despite the
> fact that you see "all", that "all" is relative to the cpupool the domain
> is assigned to.
>
> I'll try to think on how to make this more evident... A note in the
> manpage and/or the various sources of documentation, is the easy (but
> still necessary, I agree) part, and I'll add this to my TODO list.
> Actually modifying the output is more tricky, as affinity and cpupools
> are orthogonal by design, and that is the right (IMHO) thing.
>
> I guess trying to tweak the printf()-s in `xl vcpu-list' would not be
> that hard... I'll have a look and see if I can come up with a proposal.
>
>> Dario> So, if you ask me, you're restricting things too much in
>> Dario> pool-0, where dom0 and the Windows VM run. In fact, is there a
>> Dario> specific reason why you need all their vcpus to be statically
>> Dario> pinned each one to only one pcpu? If not, I'd leave them a
>> Dario> little bit more freedom.
>>
>> I agree with you here, however when I don't pin CPU affinity is "all".
>> Is this "all in the CPU pool"? I couldn't find that info.
>>
> Again, yes: once a domain is in a cpupool, no matter what its affinity
> says, it won't ever reach a pcpu assigned to another cpupool. The
> technical reason is that each cpupool is ruled by its own (copy of a)
> scheduler, even if you use, e.g., credit, for both/all the pools. In
> that case, what you will get are two full instances of credit,
> completely independent of each other, each one in charge only of a
> very specific subset of pcpus (as mandated by cpupools). So, different
> runqueues, different data structures, different everything.
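One way to picture the rule Dario describes is as a set intersection. The sketch below is a toy model, not Xen code: the function name and the pool layout are invented for illustration. It deliberately says nothing about what Xen does when a domain's affinity and its pool share no pcpus.

```python
def effective_pcpus(affinity, pool_pcpus):
    """Toy model: the pcpus a vcpu can actually run on inside its cpupool.

    `affinity` is either the string "all" (as printed by `xl vcpu-list`)
    or an iterable of pcpu numbers; `pool_pcpus` is the set of pcpus owned
    by the cpupool the domain is assigned to.
    """
    pool = set(pool_pcpus)
    if affinity == "all":
        # "all" is relative to the pool, not to the whole machine.
        return pool
    return set(affinity) & pool


if __name__ == "__main__":
    # Hypothetical layout: pool-0 owns pcpus 0-2, a separate pool owns pcpu 3.
    pool0 = {0, 1, 2}
    rt_pool = {3}
    print(effective_pcpus("all", pool0))     # unpinned domain in pool-0
    print(effective_pcpus("all", rt_pool))   # unpinned domain in the other pool
    print(effective_pcpus({1, 2}, pool0))    # domain pinned to pcpus 1,2
```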
>
>> Dario> What I'd try is:
>> Dario> 1. all dom0 and win7 vcpus free, so no pinning in pool0.
>> Dario> 2. pinning as follows:
>> Dario> * all vcpus of win7 --> pcpus 1,2
>> Dario> * all vcpus of dom0 --> no pinning
>> Dario> this way, what you get is the following: win7 could suffer sometimes,
>> Dario> if all its 3 vcpus get busy, but that, I think, is acceptable, at
>> Dario> least up to a certain extent. Is that the case?
>> Dario> At the same time, you
>> Dario> are making sure dom0 always has a chance to run, as pcpu#0 would be
>> Dario> its exclusive playground, in case someone, including your pv499
>> Dario> domain, needs its services.
>>
>> This is what I had when I started :-). Thanks for the confirmation
>> that I was doing it right. However if the hyperthreading is the issue,
>> then I will only have 2 PCPU available, and I will assign them both to
>> dom0 and win7.
>>
> Yes, with hyperthreading in mind, that is what you should do.
>
> Once we have confirmed that hyperthreading is the issue, we'll see
> what we can do. I mean, if, in your case, it's fine to 'waste' a cpu,
> then ok, but I think we need a general solution for this... Perhaps with
> slightly worse performance than just leaving one core/hyperthread
> completely idle, but at the same time more resource efficient.
>
> I wonder how tweaking the sched_smt_power_savings would deal with
> this...
>
>> Dario> Right. Are you familiar with tracing what happens inside Xen
>> Dario> with xentrace and, perhaps, xenalyze? It takes a bit of time to
>> Dario> get used to it but, once you master it, it is a good means of
>> Dario> getting really useful info out!
>>
>> Dario> There is a blog post about that here:
>> Dario> http://blog.xen.org/index.php/2012/09/27/tracing-with-xentrace-and-xenalyze/
>> Dario> and it should have most of the info, or the links to where to
>> Dario> find them.
>>
>> Thanks for this. If this problem is more than the hyperthreading then
>> I will definitely use it. Also looks like it might be useful when I
>> start looking at the jitter on the singleshot timer (which should be
>> in a couple of weeks).
>>
> It will prove to be very useful for that, I'm sure! :-)
>
> Let us know how the re-testing goes.
>
> Regards,
> Dario
>