public inbox for linux-kernel@vger.kernel.org
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Jan Glauber <jang@linux.vnet.ibm.com>,
	heiko.carstens@de.ibm.com, Paul Mackerras <paulus@samba.org>
Subject: Re: [accounting regression since rc1]  scheduler updates
Date: Tue, 21 Aug 2007 12:38:16 +0200
Message-ID: <1187692696.24279.7.camel@localhost>
In-Reply-To: <20070821093434.GB12025@elte.hu>

On Tue, 2007-08-21 at 11:34 +0200, Ingo Molnar wrote:
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > > hm, does on s390 scheduler_tick() get driven in virtual time or in 
> > > real time? The very latest scheduler code will enforce a minimum 
> > > rate of sched_clock() across two scheduler_tick() calls (in rc3 and 
> > > later kernels). If sched_clock() "slows down" but scheduler_tick() 
> > > still has a real-time frequency then that impacts the quality of 
> > > scheduling. So scheduler_tick() and sched_clock() must really have 
> > > the same behavior (either both are virtual or both are real), so 
> > > that scheduling becomes invariant to steal-time.
> > 
> > scheduler_tick() is based on the HZ timer which uses the TOD clock = 
> > real time. sched_clock() currently uses the TOD clock as well so in 
> > regard to the new scheduler we currently do not have a problem. We 
> > have a problem with cpu time accounting, the change to the /proc code 
> > breaks the precise accounting on s390. To solve the cpu time 
> > accounting we need to change sched_clock() to the cpu timer = virtual 
> > time. To change the scheduler_tick() as well requires another patch 
> > and I fear it would complicate things in the s390 backend.
> 
> my feeling is that it gives us generally higher-quality scheduling if we 
> drive all things scheduler via virtual time. Do you agree with that?

Yes, I'm in favour of converting sched_clock to use virtual time. It
makes sense to me.

> > And if you say that the scheduling becomes invariant to steal-time, 
> > how is the cpu time accounting via sum_exec supposed to work if it 
> > does not take steal-time into account ?
> 
> right now there are two distinct and independent things: scheduler 
> behavior (the scheduling decisions the scheduler makes) and accounting 
> behavior.
> 
> the 'invariant' i mentioned only covers scheduler behavior, not 
> accounting behavior. Accounting is separate in theory, but coupled in 
> practice now via sum_exec_runtime.

Hmm, ok. But the fact is that right now the accounting via
sum_exec_runtime is broken in regard to virtual cpus, isn't it?

> Before we do a patch to decouple them again, lets make sure we agree on 
> the direction to take here. There are two ways to account within a 
> virtual machine: either in real time or in virtual time.
> 
> it seems you'd like accounting to be sensitive to 'external load' - i.e. 
> you'd like an 'internal' top to show the 'real' CPU accounting, right? 

Yes, we want utime and stime to represent the time spent on the physical
cpu. To make up for the missing time, the steal time field has been
introduced.
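
For illustration, a minimal sketch of how a tool such as top could derive
%steal from two samples of the aggregate "cpu" line in /proc/stat. The
sample lines and the helper names parse_cpu_line() and steal_percent() are
made up for this example; only the field order (user, nice, system, idle,
iowait, irq, softirq, steal) is taken from proc(5).

```python
# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse one aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("not the aggregate cpu line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def steal_percent(before, after):
    """Share of the sampled interval that was stolen from this guest."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    return 100.0 * delta["steal"] / total if total else 0.0


# Hypothetical samples, not taken from a real system.
before = parse_cpu_line("cpu 100 0 50 800 10 0 0 40")
after = parse_cpu_line("cpu 160 0 70 850 10 0 0 110")
print(steal_percent(before, after))
```

The point of the sketch is only that steal is reported as a separate
counter next to user and system time, so the three together can add up to
the elapsed wall-clock time.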

> Wouldnt it be more consistent if a virtual box would not show any 
> dependency on external load? (i.e. it would slow down all of its 
> internal functionality transparently, without exposing it via /proc. The 
> only way to observe that would be the TOD interfaces: gettimeofday and 
> real-time clock driven POSIX timers. Even timer_list could be driven via 
> virtual time - although that would probably break user expectations, 
> right?) Or would accounting-in-virtual-time break user expectations too? 
> (most of the other hypervisors let guests account in virtual time.)

No, imho it is less consistent if the virtual box shows virtual time. If
you look at the top output as a user and it shows that some process used
x% of cpu what does it tell you? With virtual cpus next to nothing, you
have to normalize the numbers with the %steal while the process was
running. But even then it still is not a good number because the %steal
changes while a process is running. The only good solution is to use
virtual time for all cputime values.
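
A toy model of the normalization problem, as a rough sketch only. It
assumes that the cpu timer (virtual time) advances only while the virtual
cpu is backed by a physical cpu, and that the TOD clock (real time) always
advances. The interval data and all function names are invented for the
example; none of this is kernel code.

```python
# Each interval: (wall_seconds, steal_fraction, process_was_running).
# Hypothetical data: the process runs only in the first interval, which is
# also the only interval in which the hypervisor steals time.
intervals = [
    (1.0, 0.50, True),
    (1.0, 0.00, False),
    (1.0, 0.00, False),
    (1.0, 0.00, False),
]


def charged_real(intervals):
    """CPU time charged to the process when accounting in real (TOD) time."""
    return sum(w for w, _, running in intervals if running)


def charged_virtual(intervals):
    """CPU time charged when accounting in virtual (cpu timer) time."""
    return sum(w * (1.0 - s) for w, s, running in intervals if running)


def normalized_with_average_steal(intervals):
    """Real-time charge corrected with the %steal of the whole sample."""
    total_wall = sum(w for w, _, _ in intervals)
    avg_steal = sum(w * s for w, s, _ in intervals) / total_wall
    return charged_real(intervals) * (1.0 - avg_steal)


print(charged_real(intervals))                   # charged in real time
print(charged_virtual(intervals))                # charged in virtual time
print(normalized_with_average_steal(intervals))  # after-the-fact correction
```

With these made-up numbers the process is charged 1.0s in real time and
0.5s in virtual time, while correcting the real-time figure with the
overall %steal gives 0.875s. The correction only matches the virtual-time
value if %steal is known for exactly the periods in which the process ran.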

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.


