From: Dario Faggioli <dario.faggioli@citrix.com>
To: Pavlo Suikov <pavlo.suikov@globallogic.com>
Cc: xen-devel@lists.xen.org
Subject: Re: Delays on usleep calls
Date: Tue, 21 Jan 2014 12:46:01 +0100
Message-ID: <1390304761.23576.161.camel@Solace>
In-Reply-To: <CAE4oM6xZpkw5BLcK3toYMix=L9gmp89Ww5=biC5+rsncm10A3w@mail.gmail.com>



On lun, 2014-01-20 at 18:05 +0200, Pavlo Suikov wrote:
> > x86 or ARM host?
> ARM. ARMv7, TI Jacinto6 to be precise.
> 
Ok.

> > Also, how many pCPUs and vCPUs do the host and the various guests
> > have?
> 
> 
> 2 pCPUs, 4 vCPUs: 2 vCPU per domain.
> 
Right. So you are overbooking the platform a bit. Don't get me wrong,
that's not only legitimate, it's actually a good thing, if only because
it gives us something nice to play with, from the Xen scheduling
perspective. If you just had #vCPUs==#pCPUs, that would be way more
boring! :-)

That being said, would it be a problem, as a temporary measure during
this first phase of testing and benchmarking, to change it a bit? I'm
asking because I think that could help isolate the various causes of
the issues you're seeing, and hence face and resolve them.

> > Are you using any vCPU-to-pCPU pinning?
>
> No.
> 
Ok, so, if, as said above, you can do that, I'd try the following. With
the credit scheduler (after having cleared/disabled the rate limiting
thing), go for 1 vCPU in Dom0 and 1 vCPU in DomU.
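
For reference, clearing the ratelimit at runtime should be a matter of
something like the following, assuming your xl build has the -r switch
and that, if I remember correctly, 0 means "disabled":

# xl sched-credit -s -r 0

There should also be a sched_ratelimit_us= Xen boot parameter (IIRC)
for doing the same at boot time, if that is more convenient for you.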

Also, pin both, and do it to different pCPUs. I think booting with
"dom0_max_vcpus=1 dom0_vcpus_pin" on the Xen command line would do the
trick for Dom0. For DomU, just put a "cpus=X" entry in the config file,
as soon as you see which pCPU Dom0 is _not_ pinned to (I suspect Dom0
will end up pinned to pCPU #0, so you should use "cpus=1" for the
DomU). A sketch of both bits follows right below.

With that configuration, repeat the tests.

Basically, what I'm asking you to do is to completely kick the Xen
scheduler out of the window, for now, to try to get some baseline
numbers. Nicely enough, when using only 1 vCPU for both Dom0 and DomU,
you pretty much rule out most of the Linux scheduler's logic as well
(not everything, but at least the load balancing part). To push even
harder on the latter, I'd boost the priority of the test program (I'm
still talking about inside the Linux guest) to some high rtprio level.
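
For instance, assuming your guest userspace ships chrt (from
util-linux), and using ./usleep_test purely as a placeholder name for
your test binary, something like this, run from inside the guest,
should do:

  chrt -f 80 ./usleep_test

That is SCHED_FIFO at priority 80; any reasonably high value would
work, as the point is just to keep other tasks from preempting it.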

What all the above should give you is an estimate of the current lower
bound on the latency and jitter you can get. If that is already not
good enough (provided I did not make any glaring mistake in the
instructions above :-D), then we know there are areas other than the
scheduler that need some intervention, and we can start looking at
which ones and what to do.

Also, whether or not what you get is good enough, one can also start
working on finding which scheduler, and/or which set of scheduling
parameters, is able to replicate the 'static scenario', or at least get
reliably close enough to it.

What do you think?

> We did additional measurements and, as you can see, my first
> impression was not quite correct: a difference between dom0 and domU
> does exist and is quite observable on a larger scale. On the same
> setup, bare metal without Xen, the number of times t > 32 is close to
> 0; on the setup with Xen but without the domU system running, the
> number of times t > 32 is close to 0 as well.
>
I appreciate that. Given the many actors and factors involved, I think
the only way to figure out what's going on is to try to isolate the
various components as much as we can... That's why I'm suggesting we
consider a very very very simple situation first, at least wrt
scheduling.

>  We will make additional measurements with Linux (not Android) as a
> domU guest, though.
> 
Ok.

> > # xl sched-sedf
> 
> # xl sched-sedf
> Cpupool Pool-0:
> Name                                ID Period Slice  Latency Extra Weight
> Domain-0                             0    100      0       0     1      0
> android_4.3                          1    100      0       0     1      0
> 
May I ask for the output of

# xl list -n

and

# xl vcpu-list

in the sEDF case too?

That being said, I suggest you not spend much time on sEDF for now. As
it is, it's broken, especially on SMP, so we either re-engineer it
properly, or turn toward RT-Xen (and, e.g., help Sisu and his team
upstream it).

I think we should have a discussion about the above, outside and beyond
this thread... I'll bring it up in the proper way ASAP.

> > Oh, and now that I think about it, something that is present in
> > credit and not in sEDF that might be worth checking is the
> > scheduling rate limiting thing.
> 
> 
> We'll check it out, thanks!
> 
Right. One other thing that I forgot to mention: the timeslice. Credit
uses, by default, 30ms as its scheduling timeslice which, I think, is
quite high for latency sensitive workloads like yours (Linux typically
uses 1, 3.33, 4 or 10 ms).

# xl sched-credit
Cpupool Pool-0: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0
vm.guest.osstest                     9    256    0

I think another thing worth trying is running the experiments with
that lowered a bit. E.g.:

# xl sched-credit -s -t 1
# xl sched-credit
Cpupool Pool-0: tslice=1ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0
vm.guest.osstest                     9    256    0

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

