From: Don Slutz
Subject: Re: Strange interdependace between domains
Date: Fri, 14 Feb 2014 05:26:08 -0500
Message-ID: <52FDEF40.8040709@terremark.com>
In-Reply-To: <1392333198.32038.153.camel@Solace>
References: <1646915994.20140213165604@gmail.com>
 <1392313015.32038.112.camel@Solace>
 <295276356.20140213222507@gmail.com>
 <1392333198.32038.153.camel@Solace>
To: Dario Faggioli
Cc: Simon Martin, Andrew Cooper, Nate Studer, Don Slutz,
 xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 02/13/14 18:13, Dario Faggioli wrote:
> On Thu, 2014-02-13 at 22:25 +0000, Simon Martin wrote:
>> Thanks for all the replies guys.
>>
> :-)
>
>> Don> How many instructions per second a thread gets does depend on the
>> Don> "idleness" of other threads (no longer just the hyperthread's
>> Don> partner).
>>
>> This seems a bit strange to me. In my case I have a time-critical PV
>> domain running by itself in a CPU pool, so Xen should not be
>> scheduling it, and I can't see how this hypervisor thread would be
>> affected.
>>
> I think Don is referring to the idleness of the other _hardware_ threads
> in the chip, rather than software threads of execution, either in Xen or
> in Dom0/DomU. I checked his original e-mail and, AFAIUI, he seems to
> confirm that the throughput you get on, say, core 3 depends on what its
> sibling core (which really is its sibling hyperthread, again in the
> hardware sense... Gah, the terminology is just a mess! :-P) is doing. He
> also seems to add that there is a similar kind of inter-dependency
> between all the hardware hyperthreads, not just between siblings.
>
> Does this make sense, Don?
>
Yes, but the results I am getting vary based on the distro (most likely
the microcode version).
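On any particular box, the microcode revision and the hyperthread
sibling pairs can be double-checked from the usual /proc and /sys
locations; the CPU number below is only an example, and older kernels
may not expose the /proc/cpuinfo field at all:

    # Microcode revision as seen by the kernel (older kernels that lack
    # this field report it only in the dmesg lines quoted further down):
    grep microcode /proc/cpuinfo | sort -u

    # Logical CPUs that share a physical core with CPU 7:
    cat /sys/devices/system/cpu/cpu7/topology/thread_siblings_list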
Linux (and, I think, Xen) both have a CPU scheduler that picks whole
cores before sibling threads:

top - 04:06:29 up 66 days, 15:31, 11 users,  load average: 2.43, 0.72, 0.29
Tasks: 250 total,   1 running, 249 sleeping,   0 stopped,   0 zombie
Cpu0  : 99.7%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.2%hi,  0.1%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Cpu2  : 99.9%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.1%hi,  0.0%si,  0.0%st
Cpu3  :  1.6%us,  0.1%sy,  0.0%ni, 98.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 99.9%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.1%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  1.4%us,  0.0%sy,  0.0%ni, 98.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  : 99.9%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.1%hi,  0.0%si,  0.0%st
Mem:  32940640k total, 18008576k used, 14932064k free,   285740k buffers
Swap: 10223612k total,     4696k used, 10218916k free, 16746224k cached

That is an example without Xen involved, on Fedora 17:

Linux dcs-xen-50 3.8.11-100.fc17.x86_64 #1 SMP Wed May 1 19:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

On this machine:

Just 7:
            start                       done
thr 0: 14 Feb 14 04:11:08.944566  14 Feb 14 04:13:20.874764  +02:11.930198 ~= 131.93 and 9.10 GiI/Sec

6 & 7:
            start                       done
thr 0: 14 Feb 14 04:14:31.010426  14 Feb 14 04:18:55.404116  +04:24.393690 ~= 264.39 and 4.54 GiI/Sec
thr 1: 14 Feb 14 04:14:31.010426  14 Feb 14 04:18:55.415561  +04:24.405135 ~= 264.41 and 4.54 GiI/Sec

5 & 7:
            start                       done
thr 0: 14 Feb 14 04:20:28.902831  14 Feb 14 04:22:45.563511  +02:16.660680 ~= 136.66 and 8.78 GiI/Sec
thr 1: 14 Feb 14 04:20:28.902831  14 Feb 14 04:22:46.182159  +02:17.279328 ~= 137.28 and 8.74 GiI/Sec

1 & 3 & 5 & 7:
            start                       done
thr 0: 14 Feb 14 04:32:24.353302  14 Feb 14 04:35:16.870558  +02:52.517256 ~= 172.52 and 6.96 GiI/Sec
thr 1: 14 Feb 14 04:32:24.353301  14 Feb 14 04:35:17.371155  +02:53.017854 ~= 173.02 and 6.94 GiI/Sec
thr 2: 14 Feb 14 04:32:24.353302  14 Feb 14 04:35:17.225871  +02:52.872569 ~= 172.87 and 6.94 GiI/Sec
thr 3: 14 Feb 14 04:32:24.353302  14 Feb 14 04:35:16.655362  +02:52.302060 ~= 172.30 and 6.96 GiI/Sec

This is from:

Feb 14 04:29:21 dcs-xen-51 kernel: [   41.921367] microcode: CPU3 updated to revision 0x28, date = 2012-04-24

On CentOS 5.10:

Linux dcs-xen-53 2.6.18-371.el5 #1 SMP Tue Oct 1 08:35:08 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

only 7:
            start                       done
thr 0: 14 Feb 14 09:43:10.903549  14 Feb 14 09:46:04.925463  +02:54.021914 ~= 174.02 and 6.90 GiI/Sec

6 & 7:
            start                       done
thr 0: 14 Feb 14 09:49:17.804633  14 Feb 14 09:55:02.473549  +05:44.668916 ~= 344.67 and 3.48 GiI/Sec
thr 1: 14 Feb 14 09:49:17.804618  14 Feb 14 09:55:02.533243  +05:44.728625 ~= 344.73 and 3.48 GiI/Sec

5 & 7:
            start                       done
thr 0: 14 Feb 14 10:01:30.566603  14 Feb 14 10:04:23.024858  +02:52.458255 ~= 172.46 and 6.96 GiI/Sec
thr 1: 14 Feb 14 10:01:30.566603  14 Feb 14 10:04:23.069964  +02:52.503361 ~= 172.50 and 6.96 GiI/Sec

1 & 3 & 5 & 7:
            start                       done
thr 0: 14 Feb 14 10:05:58.359646  14 Feb 14 10:08:50.984629  +02:52.624983 ~= 172.62 and 6.95 GiI/Sec
thr 1: 14 Feb 14 10:05:58.359646  14 Feb 14 10:08:50.993064  +02:52.633418 ~= 172.63 and 6.95 GiI/Sec
thr 2: 14 Feb 14 10:05:58.359645  14 Feb 14 10:08:50.857982  +02:52.498337 ~= 172.50 and 6.96 GiI/Sec
thr 3: 14 Feb 14 10:05:58.359645  14 Feb 14 10:08:50.905031  +02:52.545386 ~= 172.55 and 6.95 GiI/Sec

Feb 14 09:41:42 dcs-xen-53 kernel: microcode: CPU3 updated from revision 0x17 to 0x29, date = 06122013

Hope this helps.
    -Don Slutz
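The benchmark source itself is not shown in this thread (the headings
appear to be the pcpus the benchmark threads were pinned to, and
GiI/Sec presumably giga-instructions per second), but the
core-vs-sibling effect should be reproducible with any CPU-bound loop
pinned via taskset. A rough sketch, with an arbitrary iteration count
and assuming CPUs 6 and 7 are hyperthread siblings as the numbers above
suggest:

    # One busy loop alone on CPU 7:
    time taskset -c 7 sh -c 'i=0; while [ $i -lt 50000000 ]; do i=$((i+1)); done'

    # The same loop on both hyperthreads of one core (6 and 7); each
    # instance should take noticeably longer than the solo run
    # (roughly 2x in the "6 & 7" numbers above):
    time taskset -c 6 sh -c 'i=0; while [ $i -lt 50000000 ]; do i=$((i+1)); done' &
    time taskset -c 7 sh -c 'i=0; while [ $i -lt 50000000 ]; do i=$((i+1)); done' &
    wait

    # Loops pinned to separate cores (e.g. 5 and 7) should stay close
    # to the solo time, matching the "5 & 7" results above.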
>>>> 6.- All VCPUs are pinned:
>>>>
>> Dario> Right, although, if you use cpupools, and if I've understood
>> Dario> what you're up to, you really should not require pinning. I
>> Dario> mean, the isolation between the RT-ish domain and the rest of
>> Dario> the world should already be in place thanks to cpupools.
>>
>> This is what I thought; however, when looking at the vcpu-list output,
>> the CPU affinity was "all" until I started pinning. As I wasn't sure
>> whether that was "all inside this cpupool" or "all", I felt it was
>> safer to do it explicitly.
>>
> Actually, you are right, we could present things in a way that is
> clearer when one observes the output! So, I confirm that, despite the
> fact that you see "all", that all is relative to the cpupool the domain
> is assigned to.
>
> I'll try to think about how to make this more evident... A note in the
> manpage and/or the various sources of documentation is the easy (but
> still necessary, I agree) part, and I'll add this to my TODO list.
> Actually modifying the output is trickier, as affinity and cpupools are
> orthogonal by design, and that is the right (IMHO) thing.
>
> I guess trying to tweak the printf()-s in `xl vcpu-list' would not be
> that hard... I'll have a look and see if I can come up with a proposal.
>
>> Dario> So, if you ask me, you're restricting things too much in
>> Dario> pool-0, where dom0 and the Windows VM run. In fact, is there a
>> Dario> specific reason why you need all their vcpus to be statically
>> Dario> pinned, each one to only one pcpu? If not, I'd leave them a
>> Dario> little bit more freedom.
>>
>> I agree with you here; however, when I don't pin, CPU affinity is
>> "all". Is this "all in the CPU pool"? I couldn't find that info.
>>
> Again, yes: once a domain is in a cpupool, no matter what its affinity
> says, it won't ever reach a pcpu assigned to another cpupool. The
> technical reason is that each cpupool is ruled by its own (copy of a)
> scheduler, even if you use, e.g., credit for both/all the pools. In
> that case, what you get are two full instances of credit, completely
> independent of each other, each one in charge of only a very specific
> subset of pcpus (as mandated by cpupools). So, different runqueues,
> different data structures, different everything.
>
>> Dario> What I'd try is:
>> Dario>  1. all dom0 and win7 vcpus free, so no pinning in pool0.
>> Dario>  2. pinning as follows:
>> Dario>     * all vcpus of win7 --> pcpus 1,2
>> Dario>     * all vcpus of dom0 --> no pinning
>> Dario> This way, what you get is the following: win7 could suffer
>> Dario> sometimes, if all its 3 vcpus get busy, but that, I think, is
>> Dario> acceptable, at least up to a certain extent. Is that the case?
>> Dario> At the same time, you are making sure dom0 always has a chance
>> Dario> to run, as pcpu#0 would be its exclusive playground, in case
>> Dario> someone, including your pv499 domain, needs its services.
>>
>> This is what I had when I started :-). Thanks for the confirmation
>> that I was doing it right. However, if hyperthreading is the issue,
>> then I will only have 2 pcpus available, and I will assign them both
>> to dom0 and win7.
>>
> Yes, with hyperthreading in mind, that is what you should do.
>
> Once we have confirmed that hyperthreading is the issue, we'll see what
> we can do. I mean, if, in your case, it's fine to 'waste' a cpu, then
> ok, but I think we need a general solution for this... Perhaps with a
> little worse performance than just leaving one core/hyperthread
> completely idle, but at the same time more resource-efficient.
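For reference, the pinning scheme Dario suggests above maps onto xl
roughly like this (domain name as used in this thread; `xl help
vcpu-pin` shows the exact syntax for a given version):

    # Restrict every vcpu of the win7 domain to pcpus 1 and 2, while
    # leaving dom0's vcpus unpinned:
    xl vcpu-pin win7 all 1,2

    # Check the result; remember that an affinity of "all" here is
    # still relative to the cpupool the domain is assigned to:
    xl vcpu-list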
>
> I wonder how tweaking sched_smt_power_savings would deal with this...
>
>> Dario> Right. Are you familiar with tracing what happens inside Xen
>> Dario> with xentrace and, perhaps, xenalyze? It takes a bit of time to
>> Dario> get used to it but, once you master it, it is a good means of
>> Dario> getting out really useful info!
>>
>> Dario> There is a blog post about that here:
>> Dario> http://blog.xen.org/index.php/2012/09/27/tracing-with-xentrace-and-xenalyze/
>> Dario> and it should have most of the info, or the links to where to
>> Dario> find them.
>>
>> Thanks for this. If this problem is more than the hyperthreading then
>> I will definitely use it. It also looks like it might be useful when I
>> start looking at the jitter on the singleshot timer (which should be
>> in a couple of weeks).
>>
> It will prove to be very useful for that, I'm sure! :-)
>
> Let us know how the re-testing goes.
>
> Regards,
> Dario
>
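For completeness: sched_smt_power_savings is a boot-time Xen parameter,
and the basic xentrace/xenalyze workflow from the blog post linked
above looks roughly like the following (exact options differ between
Xen versions, so check the man pages first):

    # sched_smt_power_savings goes on the Xen command line in the boot
    # loader entry (illustrative grub line, other arguments elided):
    #   multiboot /boot/xen.gz ... sched_smt_power_savings=1

    # Capture a trace on the host (stop with Ctrl-C), then post-process
    # it with xenalyze's summary mode:
    xentrace /tmp/trace.bin
    xenalyze --summary /tmp/trace.bin > /tmp/trace-summary.txt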