* [BUG] Paravirtual time accounting / IRQ time accounting
@ 2014-03-19  9:42 lwcheng
From: lwcheng @ 2014-03-19  9:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: glommer

In consolidated environments, where multiple virtual machines (VMs) share
one CPU core, timekeeping becomes a problem for the guest OS.
Here I report my findings about the Linux process scheduler.


Description
------------
Linux CFS relies on rq->clock_task to charge each task, determine vruntime,
etc.

When CONFIG_IRQ_TIME_ACCOUNTING is enabled, the time spent servicing IRQs
is excluded from rq->clock_task.
When CONFIG_PARAVIRT_TIME_ACCOUNTING is enabled, the time stolen by the
hypervisor is also excluded from rq->clock_task.

With *both* CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_PARAVIRT_TIME_ACCOUNTING
enabled, I put three KVM guests on one core and ran hackbench in each guest.
I found that in the guests rq->clock_task stays *unchanged*, which leaves
CFS unable to charge tasks correctly.
------------


Analysis
------------
[src/kernel/sched/core.c]
static void update_rq_clock_task(struct rq *rq, s64 delta)
{
     ... ...
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
     irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
     ... ...
     rq->prev_irq_time += irq_delta;
     delta -= irq_delta;
#endif

#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
     if (static_key_false((&paravirt_steal_rq_enabled))) {
         steal = paravirt_steal_clock(cpu_of(rq));
         steal -= rq->prev_steal_time_rq;
         ... ...
         rq->prev_steal_time_rq += steal;
         delta -= steal;
     }
#endif

     rq->clock_task += delta;
     ... ...
}
--
"delta" -> the intended increment to rq->clock_task
"irq_delta" -> the time spent on serving IRQ (hard + soft)
"steal" -> the time stolen by the underlying hypervisor
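--
"irq_delta" is calculated from sched_clock_cpu(), which is vulnerable to VM
scheduling delays: if the VM is descheduled while IRQ time is being measured,
"irq_delta" can include part or all of "steal", so the same interval is
deducted twice.
I observe that irq_delta + steal is much larger than delta.
As a result, nothing is left of "delta" after both deductions and it ends up
as zero. That is why rq->clock_task stops advancing.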
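
The double deduction can be illustrated with a small user-space model. This
is only a sketch, not the kernel code: struct sim_rq, sim_update_clock_task()
and sim_run() are hypothetical names, the clamping of each deduction to the
remaining delta is an assumption about the elided lines, and the rounding of
steal time to ticks is ignored. The irq_includes_steal flag models IRQ time
that was measured while the VM was descheduled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct rq fields used in the excerpt. */
struct sim_rq {
    int64_t clock_task;
    int64_t prev_irq_time;
    int64_t prev_steal_time_rq;
};

/*
 * Simplified model of update_rq_clock_task(). Clamping each deduction to
 * what is left of delta is an ASSUMPTION about the elided lines.
 */
static void sim_update_clock_task(struct sim_rq *rq, int64_t delta,
                                  int64_t irq_time_now, int64_t steal_clock_now)
{
    int64_t irq_delta = irq_time_now - rq->prev_irq_time;
    int64_t steal;

    if (irq_delta > delta)
        irq_delta = delta;
    rq->prev_irq_time += irq_delta;
    delta -= irq_delta;

    steal = steal_clock_now - rq->prev_steal_time_rq;
    if (steal > delta)
        steal = delta;
    rq->prev_steal_time_rq += steal;
    delta -= steal;

    rq->clock_task += delta;
}

/*
 * Run `periods` identical update periods and return the final clock_task.
 * Each period lasts wall_ns, contains irq_ns of real IRQ work and steal_ns
 * of stolen time. If irq_includes_steal is nonzero, the measured IRQ time
 * also absorbs the stolen time (the situation described above).
 */
static int64_t sim_run(int periods, int64_t wall_ns, int64_t irq_ns,
                       int64_t steal_ns, int irq_includes_steal)
{
    struct sim_rq rq = { 0, 0, 0 };
    int64_t irq_time = 0, steal_clock = 0;
    int i;

    assert(periods >= 0);
    for (i = 0; i < periods; i++) {
        irq_time += irq_ns + (irq_includes_steal ? steal_ns : 0);
        steal_clock += steal_ns;
        sim_update_clock_task(&rq, wall_ns, irq_time, steal_clock);
    }
    return rq.clock_task;
}
```

With made-up numbers of 10 ms per period, 1 ms of real IRQ work and 7 ms
stolen, sim_run(5, 10000000, 1000000, 7000000, 0) yields 10000000 (2 ms of
task time per period), while the same call with irq_includes_steal set to 1
yields 0: the unaccounted steal backlog keeps growing and the modelled
clock_task never moves.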
------------

Please confirm this bug. Thanks.


Luwei Cheng
--
CS student
The University of Hong Kong

