Re: Question about the ability of credit scheduler to handle I/O and CPU intensive VMs

From: Yuehai Xu <yuehaixu@gmail.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: xen-devel@lists.xensource.com, yhxu@wayne.edu
Subject: Re: Question about the ability of credit scheduler to handle I/O and CPU intensive VMs
Date: Mon, 4 Oct 2010 22:52:51 -0400
Message-ID: <AANLkTimYTbf5meNptCtuiKWfQGd_qSNCkbCNabfREc_0@mail.gmail.com>
In-Reply-To: <AANLkTi=Oa0_=vXrr63eALBU2sQa3aLV0NiQHt8hPPvcw@mail.gmail.com>

On Thu, Sep 30, 2010 at 9:27 AM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> On Thu, Sep 30, 2010 at 1:28 PM, Yuehai Xu <yuehaixu@gmail.com> wrote:
>>> I agree, letting a VM with an interrupt run for a short period of time
>>> makes sense.  The challenge is to make sure that it can't simply send
>>> itself interrupts every 50us and get to run 100% of the time. :-)
>>
>> I am afraid I don't really understand what the challenge is. In other
>> words, is this method good in principle but hard to implement in
>> practice? As far as I know, the OS should always schedule I/O-related
>> processes as soon as they are on the run queue, so as long as we give
>> even a very short period of time to the woken-up guest VM, the I/O
>> process in it should be scheduled at once. In that case, this problem
>> should be solved. Of course, I have not run any experiments; saying
>> is always much easier than doing.
>
> What I mean is that you have to be careful when implementing it.  A
> very simple implementation would look like this:
> * Normally, let the VM with the highest credits run.  However, if a VM
> is sent an interrupt, give it priority to run for 50us.
>
> Now, suppose, however, that a rogue VM sets up a periodic timer to
> send itself an interrupt every 55us.  Then it will get an interrupt,
> get priority for 50us, be preempted for 5us, and then get another
> interrupt, allowing it to run for another 50us.    Thus it runs 90% of
> the time, even though it should only run (for example) 50% of the
> time.
>
> We need a way to balance interrupt latency (how long after an
> interrupt is raised before a VM can run) and cpu scheduling fairness.
> That means that if we let a VM run for 50us, and then preempt it, and
> it gets an interrupt 5us later, we need a way to know not to schedule
> it until it's been off the cpu for a reasonable amount of time.  It's
> possible, but it will take some experimentation to see what the best
> option is.
>
>  -George
>
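
The 55us example in the quoted text can be checked with a small model.
This is only a sketch, not Xen code: rogue_cpu_share() and its
parameters are hypothetical names, and the min_off_us rule (refuse a
boost until the VM has been off the CPU for a minimum time) is just one
possible reading of "off the cpu for a reasonable amount of time".

```python
def rogue_cpu_share(period_us, boost_us, min_off_us=0, total_us=1_000_000):
    """Fraction of CPU time a VM gets purely from interrupt boosts.

    The VM sends itself an interrupt every period_us and, if the boost is
    granted, runs for boost_us before being preempted.
    """
    ran_us = 0
    boost_end = float("-inf")  # time at which the VM last left the CPU
    t = 0                      # time of the next self-sent interrupt
    while t < total_us:
        # Grant the boost only if the VM is not still running and has been
        # off the CPU for at least min_off_us (hypothetical guard).
        if t >= boost_end and t - boost_end >= min_off_us:
            end = min(t + boost_us, total_us)
            ran_us += end - t
            boost_end = end
        t += period_us
    return ran_us / total_us


print(f"no guard:       {rogue_cpu_share(55, 50):.3f}")
print(f"50us off-CPU:   {rogue_cpu_share(55, 50, min_off_us=50):.3f}")
```

Without the guard the model gives a share close to 50/55, which matches
the roughly 90% figure above. With a 50us off-CPU requirement every
second interrupt is ignored and the share drops to about 50/110.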

I'd like to try to implement this idea in Xen, even though I am not
sure whether I can do it, since I am not an expert. :-D

The first step for me is to write a very simple scheduler that does not
consider CPU fairness, I/O performance, etc. Its mechanism is very
simple: the next VCPU is selected round-robin. The current VCPU is
always inserted at the tail of the list, and the VCPU at the head is
selected to be scheduled next. The current test code is based on the
credit scheduler of Xen 4.0.1-rc6-pre, except that I deleted all the
credit-calculation components; the 10ms and 30ms ticks are deleted as
well. The time slice for the selected VCPU is set to 30ms.
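
The round-robin selection described above can be sketched as follows.
This is a standalone model under stated assumptions, not the modified
Xen code: RoundRobinRunqueue and schedule() are hypothetical names, and
the model ignores blocking, wakeups and pinning entirely.

```python
from collections import deque

TIMESLICE_MS = 30  # fixed slice, as in the description above


class RoundRobinRunqueue:
    """Toy run queue: current VCPU goes to the tail, head runs next."""

    def __init__(self, vcpus):
        self.queue = deque(vcpus)

    def schedule(self, current=None):
        """Return (next_vcpu, timeslice_ms); next_vcpu is None if idle."""
        if current is not None:
            self.queue.append(current)   # still runnable: back to the tail
        if not self.queue:
            return None, TIMESLICE_MS    # nothing runnable
        return self.queue.popleft(), TIMESLICE_MS


rq = RoundRobinRunqueue(["d1v0", "d2v0"])
current, order = None, []
for _ in range(4):
    current, _slice = rq.schedule(current)
    order.append(current)
print(order)
```

With two always-runnable VCPUs the model simply alternates between
them, each receiving a full 30ms slice.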

Here, for simplicity, I assume that Dom0 is pinned to PCPU0, while all
DomUs are pinned to PCPU1.

However, some problems puzzle me a lot. When I start two DomUs that
share PCPU1 and run a CPU-intensive program in both of them, the trace
log from xenalyze is as below (I modified some code, so the format is
different from the original):
...
<  0.399300204 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000802)
<  0.424239058 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30000582)
<  0.449177708 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000336)
<  0.474116762 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30000827)
<  0.499055641 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000596)
<  0.523972987 |x d2v0> (dom: 2) --> (dom: 1) vruntime : 30001301)
<  0.548911095 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 29999684)
...
I think these results make sense, since every DomU uses almost 30ms of
PCPU1 each time it is scheduled.

However, when I stop the CPU-intensive program in one DomU while
keeping the other running, the results are:
.....
<  0.327815345 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 1542607)
<  0.327906620 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 109521)
<  0.344349033 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 19779544)
<  0.344377129 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 33528)
<  0.344570662 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 232540)
<  0.344643933 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 87857)
<  0.345009170 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 439081)
<  0.345034387 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30059)
<  0.369973183 -x d1v0> (dom: 1) --> (dom: 1) vruntime : 30000506)
<  0.392423279 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 27006658)
....

Here I am confused. Since my scheduling algorithm is very simple, every
VM should get 30ms of PCPU at a time; however, from the results, the
time each VCPU holds the PCPU is quite unstable. I think the schedule()
routine must be invoked frequently from somewhere. From xentop, the VM
running the CPU-intensive program occupies almost 97% of the PCPU.
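
To put numbers on the second trace, the lines can be summed per domain.
This is a rough sketch under two assumptions that the trace itself does
not confirm: that the vruntime field is in nanoseconds, and that it is
the time the outgoing VCPU (named in the d<N>v<M> prefix) just ran.
runtime_by_domain() is a hypothetical helper.

```python
import re
from collections import defaultdict

# The ten lines of the second trace, copied verbatim.
TRACE = """\
<  0.327815345 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 1542607)
<  0.327906620 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 109521)
<  0.344349033 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 19779544)
<  0.344377129 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 33528)
<  0.344570662 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 232540)
<  0.344643933 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 87857)
<  0.345009170 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 439081)
<  0.345034387 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30059)
<  0.369973183 -x d1v0> (dom: 1) --> (dom: 1) vruntime : 30000506)
<  0.392423279 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 27006658)
"""

# Outgoing domain from the d<N>v<M> prefix, runtime from the last field.
LINE_RE = re.compile(r"d(\d+)v\d+>.*vruntime\s*:\s*(\d+)")


def runtime_by_domain(trace):
    """Sum the vruntime field (assumed ns) per outgoing domain."""
    totals = defaultdict(int)
    for line in trace.splitlines():
        m = LINE_RE.search(line)
        if m:
            totals[int(m.group(1))] += int(m.group(2))
    return dict(totals)


totals = runtime_by_domain(TRACE)
grand = sum(totals.values())
for dom, ns in sorted(totals.items()):
    print(f"dom {dom}: {ns / 1e6:.2f} ms ({100 * ns / grand:.1f}%)")
```

For these ten lines only, dom 1 accounts for nearly all of the recorded
runtime, while dom 2 runs in bursts well under 1ms. That is consistent
with the high xentop figure, although a sample this short cannot
reproduce the 97% exactly.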


Thanks,
Yuehai
