From: Dario Faggioli <dario.faggioli@citrix.com>
To: Tony S <suokunstar@gmail.com>, xen-devel@lists.xen.org
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Juergen Gross <jgross@suse.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
David Vrabel <david.vrabel@citrix.com>,
Matt Fleming <matt@codeblueprint.co.uk>
Subject: Re: [BUG] Linux process vruntime accounting in Xen
Date: Mon, 16 May 2016 13:37:01 +0200
Message-ID: <1463398621.18789.55.camel@citrix.com>
In-Reply-To: <CAG2GYXEoByMMbxUCMw8-ZMsvnt3mDWND09CjPfMLkt=neCGWyA@mail.gmail.com>
[Adding George again, and a few Linux/Xen folks]
On Sat, 2016-05-14 at 18:25 -0600, Tony S wrote:
> In virtualized environments, we sometimes need to limit the CPU
> resources of a virtual machine (VM). For example, in Xen we use
> $ xl sched-credit -d 1 -c 50
>
> to limit dom 1 to half of
> one physical CPU core. If the VM's CPU resources are capped, processes
> inside the VM will have a vruntime accounting problem. Here, I report
> my findings about the Linux process scheduler under the above scenario.
>
Thanks for this other report as well. :-)
All you say makes sense to me, and I will think about it. I'm not sure
about one thing, though...
> ------------Description------------
> Linux CFS relies on delta_exec to charge the vruntime of processes.
> The variable delta_exec is the time between when a process starts and
> stops running on a CPU. This works well on a physical machine. However,
> in a virtual machine with capped resources, some processes might be
> charged an inaccurate vruntime.
>
> For example, suppose we have a VM which has one vCPU and is capped at
> 50% of a physical CPU. When process A inside the VM starts running
> and the CPU allocation of that VM runs out, the VM will be paused.
> In the next round, when the VM is allocated CPU time again and
> starts running, process A stops running and is put back on the
> runqueue. The delta_exec of process A is accounted as its "real
> execution time" plus the time its VM was paused. That makes the
> vruntime of process A much larger than it should be, and process A
> will not be scheduled again for a long time, until the vruntimes of
> the other processes catch up with it.
> ---------------------------------------
>
>
> ------------Analysis----------------
> When a process stops running and is about to be put back on the
> runqueue, update_curr() will be executed.
> [src/kernel/sched/fair.c]
>
> static void update_curr(struct cfs_rq *cfs_rq)
> {
> 	... ...
> 	delta_exec = now - curr->exec_start;
> 	... ...
> 	curr->exec_start = now;
> 	... ...
> 	curr->sum_exec_runtime += delta_exec;
> 	schedstat_add(cfs_rq, exec_clock, delta_exec);
> 	curr->vruntime += calc_delta_fair(delta_exec, curr);
> 	update_min_vruntime(cfs_rq);
> 	... ...
> }
>
> "now" --> the current time
> "exec_start" --> the time when the current process is put on the CPU
> "delta_exec" --> the time between the process starting and stopping
> running on the CPU
>
> When a process starts running before its VM is paused and stops
> running after its VM is unpaused, delta_exec will include the time
> the VM was paused, which is large compared to the real execution
> time of the process.
>
... but would that also apply to a VM that is not scheduled --just
because of pCPU contention, not because it was paused-- for a while?
Isn't there anything in place in Xen or Linux (the latter being better
suited for something like this, IMHO) to compensate for that?
I have to admit I haven't really ever checked myself, maybe either
George or our Linux people know more?
> This issue can severely harm the performance of the victim process.
> If the process is an I/O-bound workload, its throughput and latency
> will suffer. If the process is a CPU-bound workload, this issue
> will make its vruntime "unfair" compared to other processes under
> CFS.
>
> Because the CPU resources of some types of cloud VMs are limited as
> described above (like Amazon EC2 t2.small instances), I suspect this
> also harms the performance of public cloud instances.
> ---------------------------------------
>
>
> My test environment is as follows: Hypervisor (Xen 4.5.0), Dom 0 (Linux
> 3.18.21), Dom U (Linux 3.18.21). I also tested the longterm kernel Linux
> 3.18.30 and the latest longterm kernel, Linux 4.4.7. All of these
> kernels have this issue.
>
> Please confirm this bug. Thanks.
>
>
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)