From: Dario Faggioli <dario.faggioli@citrix.com>
To: Tony S <suokunstar@gmail.com>, xen-devel@lists.xen.org
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Juergen Gross <jgross@suse.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
David Vrabel <david.vrabel@citrix.com>,
Matt Fleming <matt@codeblueprint.co.uk>
Subject: Re: [BUG] Linux process vruntime accounting in Xen
Date: Mon, 16 May 2016 13:37:01 +0200
Message-ID: <1463398621.18789.55.camel@citrix.com>
In-Reply-To: <CAG2GYXEoByMMbxUCMw8-ZMsvnt3mDWND09CjPfMLkt=neCGWyA@mail.gmail.com>
[Adding George again, and a few Linux/Xen folks]
On Sat, 2016-05-14 at 18:25 -0600, Tony S wrote:
> In virtualized environments, sometimes we need to limit the CPU
> resources of a virtual machine (VM). For example in Xen, we use
> $ xl sched-credit -d 1 -c 50
>
> to limit the CPU resource of dom 1 to half of
> one physical CPU core. If the VM's CPU resource is capped, the processes
> inside the VM will have a vruntime accounting problem. Here, I report
> my findings about the Linux process scheduler under the above scenario.
>
Thanks for this other report as well. :-)
All you say makes sense to me, and I will think about it. I'm not sure
about one thing, though...
> ------------Description------------
> Linux CFS relies on delta_exec to charge the vruntime of processes.
> The variable delta_exec is the time elapsed between when a process
> starts and when it stops running on a CPU. This works well on a
> physical machine. However, in a virtual machine with capped resources,
> some processes might be charged an inaccurate vruntime.
>
> For example, suppose we have a VM which has one vCPU and is capped at
> 50% of a physical CPU. When process A inside the VM starts running
> and the CPU resource of that VM runs out, the VM will be paused. In
> the next round, when the VM is allocated new CPU resource and starts
> running again, process A stops running and is put back on the
> runqueue. The delta_exec of process A is accounted as its "real
> execution time" plus the paused time of its VM. That makes the
> vruntime of process A much larger than it should be, and process A
> will not be scheduled again for a long time, until the vruntimes of
> the other processes catch up with it.
> ---------------------------------------
>
>
> ------------Analysis----------------
> When a process stops running and is about to be put back on the
> runqueue, update_curr() will be executed.
> [src/kernel/sched/fair.c]
>
> static void update_curr(struct cfs_rq *cfs_rq)
> {
> ... ...
> delta_exec = now - curr->exec_start;
> ... ...
> curr->exec_start = now;
> ... ...
> curr->sum_exec_runtime += delta_exec;
> schedstat_add(cfs_rq, exec_clock, delta_exec);
> curr->vruntime += calc_delta_fair(delta_exec, curr);
> update_min_vruntime(cfs_rq);
> ... ...
> }
>
> "now" --> the current time
> "exec_start" --> the time when the current process was put on the CPU
> "delta_exec" --> the time between the process starting and stopping
> running on the CPU
>
> When a process starts running before its VM is paused and the process
> stops running after its VM is unpaused, the delta_exec will include
> the VM suspend time which is pretty large compared to the real
> execution time of a process.
>
... but would that also apply to a VM that is not scheduled --just
because of pCPU contention, not because it was paused-- for some time?
Isn't there anything in place in Xen or Linux (the latter being better
suited for something like this, IMHO) to compensate for that?
I have to admit I haven't really ever checked myself; maybe George or
our Linux people know more?
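To make the scenario concrete, here is a toy user-space model in C. It
is not kernel code: struct toy_task, charge_naive() and
charge_minus_stolen() are names made up for illustration. The first
function mirrors the delta_exec arithmetic from the update_curr()
excerpt quoted above. The second shows one shape a compensation could
take, assuming the guest can learn how long its vCPU was off the pCPU;
whether Xen or Linux already does something equivalent is exactly the
open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of per-task runtime accounting.
 * All times are in nanoseconds. Not kernel code. */
struct toy_task {
    uint64_t exec_start;        /* when the task was last put on the CPU */
    uint64_t sum_exec_runtime;  /* total runtime charged so far */
};

/* Naive accounting: charge all the time elapsed since exec_start,
 * like the delta_exec computation in the quoted excerpt. */
static uint64_t charge_naive(struct toy_task *t, uint64_t now)
{
    assert(now >= t->exec_start);
    uint64_t delta_exec = now - t->exec_start;

    t->exec_start = now;
    t->sum_exec_runtime += delta_exec;
    return delta_exec;
}

/* Assumed compensation: subtract the time during which the vCPU was
 * not running on a pCPU ("stolen"), clamped so that the charge can
 * never become negative. */
static uint64_t charge_minus_stolen(struct toy_task *t, uint64_t now,
                                    uint64_t stolen)
{
    assert(now >= t->exec_start);
    uint64_t delta_exec = now - t->exec_start;

    if (stolen > delta_exec)
        stolen = delta_exec;
    delta_exec -= stolen;

    t->exec_start = now;
    t->sum_exec_runtime += delta_exec;
    return delta_exec;
}
```

With a task that runs for 2 ms, has its vCPU descheduled for 15 ms, and
then runs 1 ms more before the update, charge_naive() charges 18 ms
while charge_minus_stolen() with stolen = 15 ms charges 3 ms. The model
does not care whether those 15 ms come from a cap or from plain pCPU
contention, which is why I would expect both cases to behave the same.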
> This issue will badly hurt the performance of the victim process.
> If the process is an I/O-bound workload, its throughput and latency
> will be affected. If the process is a CPU-bound workload, this issue
> will make its vruntime "unfair" compared to other processes under
> CFS.
>
> Because the CPU resources of some types of VMs in the cloud are
> limited as described above (like Amazon EC2 t2.small instances), I
> suspect that this will also harm the performance of public cloud
> instances.
> ---------------------------------------
>
>
> My test environment is as follows: Hypervisor (Xen 4.5.0), Dom 0 (Linux
> 3.18.21), Dom U (Linux 3.18.21). I also tested the longterm version Linux
> 3.18.30 and the latest longterm version, Linux 4.4.7. Those kernels
> all have this issue.
>
> Please confirm this bug. Thanks.
>
>
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)