public inbox for kvm@vger.kernel.org
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: [patch 0/3] KVM CPU frequency change hypercalls
Date: Wed, 1 Mar 2017 12:11:03 -0300
Message-ID: <20170301151101.GA17141@amt.cnet>
In-Reply-To: <4cbcac70-23ae-eb2d-b020-14ada6996208@redhat.com>

On Wed, Mar 01, 2017 at 03:21:32PM +0100, Paolo Bonzini wrote:
> 
> 
> On 28/02/2017 03:45, Marcelo Tosatti wrote:
> > On Fri, Feb 24, 2017 at 04:34:52PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 24/02/2017 14:04, Marcelo Tosatti wrote:
> >>>>>>> What's the current use case, or foreseeable future use case, for save/restore
> >>>>>>> across preemption again? (which would validate the broken by design
> >>>>>>> claim).
> >>>>>> Stop a guest that is using cpufreq, start a guest that is not using it.
> >>>>>> The second guest's performance now depends on the state that the first
> >>>>>> guest left in cpufreq.
> >>>>> Nothing forbids the host to implement switching with the
> >>>>> current hypercall interface: all you need is a scheduler
> >>>>> hook.
> >>>> Can it be done in vcpu_load/vcpu_put?  But you still would have two
> >>>> components (KVM and sysfs) potentially fighting over the frequency, and
> >>>> that's still a bit ugly.
> >>>
> >>> Change the frequency at vcpu_load/vcpu_put? Yes: call into
> >>> cpufreq-userspace. But there is no notion of "per-task frequency" on the
> >>> Linux kernel (which was the starting point of this subthread).
> >>
> >> There isn't, but this patchset is providing a direct path from a task to
> >> cpufreq-userspace.  This is as close as you can get to a per-task frequency.
> > 
> > Cpufreq-userspace is supposed to be used by tasks in userspace.
> > That's why it's called "userspace".
> 
> I think the intended usecase is to have a daemon handling a systemwide
> policy.  Examples are the historical (and now obsolete) users such as
> cpufreqd, cpudyn, powernowd, or cpuspeed.  The user alternatively can
> play the role of the daemon by writing to sysfs.
> 
> I've never seen userspace tasks talking to cpufreq-userspace to set
> their own running frequency.  If DPDK does it, that's nasty in my
> opinion

Please explain in detail what "nasty" means. I really don't understand
why it's nasty.

>  and we should find an interface that works best for both DPDK
> and KVM.  Which should be done on linux-pm like Rafael suggested.
> 
> >>> But if you configure all CPUs in the system as cpufreq-userspace,
> >>> then some other (userspace program) has to decide the frequency
> >>> for the other CPUs.
> >>>
> >>> Which agent would do that and why? That's why I initially said "what's the
> >>> use case".
> >>
> >> You could just pin them at the highest non-TurboBoost frequency until a
> >> guest runs.  That's assuming that they are idle and, because of
> >> isol_cpus/nohz_full, they would be almost always in deep C state anyway.
> > 
> > The original claim of the thread was: "this feature (frequency
> > hypercalls) works for pinned vcpu<->pcpu, pcpu dedicated exclusively
> > to vcpu case, let's try to extend this to other cases".
> > 
> > Which is a valid and useful direction to go.
> > 
> > However there is no user for multiple vcpus in the same pcpu now.
> 
> You are still ignoring the case of one guest started after another, or
> of another program started on a CPU that formerly was used by KVM.  They
> don't have to be multiple users at the same time.

Just have the cpufreq-userspace policy be instantiated while the 
isolated vcpu owns the pcpu. Before/after that, the previous policy 
is in place. 
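
As an illustration only (this is not code from the patchset), the
"instantiate cpufreq-userspace only while the isolated vcpu owns the pcpu"
idea can be modelled from userspace with the standard sysfs file
scaling_governor. The helper name userspace_governor and the root
parameter are hypothetical and exist only so the sketch can be exercised
against a scratch directory; an in-kernel implementation would look
different.

```python
import contextlib
from pathlib import Path

# Standard sysfs location of the per-CPU cpufreq attributes.
SYSFS_CPU = Path("/sys/devices/system/cpu")


@contextlib.contextmanager
def userspace_governor(cpu, root=SYSFS_CPU):
    """Hypothetical helper: run one pcpu under the 'userspace' governor.

    The previous governor is remembered on entry and written back on
    exit, so whatever policy was in place before the vcpu took over the
    pcpu is in place again afterwards.
    """
    gov = Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_governor"
    previous = gov.read_text().strip()
    gov.write_text("userspace")
    try:
        yield previous
    finally:
        # Restore the earlier policy even if the body raised.
        gov.write_text(previous)
```

A caller would wrap the lifetime of the pinned vcpu in
"with userspace_governor(cpu):". Writing to the real sysfs file requires
root privileges.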

> > If there were multiple vcpus, all of them requesting a given
> > frequency, it would be necessary to:
> > 
> > 	1) Keep the pcpu at the highest of the requested
> > 	   frequencies.
> > 
> > 		OR
> > 
> > 	2) Not switch frequencies between task switches: switching
> > 	   can take up to 70us (*) (depends on processor), so it's
> > 	   generally not worthwhile.
> 
> Is latency that important, or is rather overhead the one to pay
> attention to?  The slides you linked
> (http://www.ena-hpc.org/2013/pdf/04.pdf) at page 17 suggest it's around
> 10us.

Ok, be it 10us. 10us overhead on every task context switch is not
acceptable.

> One possibility is to do (1) if you have multiple tasks on the run queue
> (or fallback to what is specified in sysfs) and (2) if you only have one
> task.

Sure, that is alright. But the use-case at hand does not involve 
multiple tasks on the pcpu.
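
For reference, one possible reading of the combined policy suggested
above, written as a small model rather than kernel code. The function
name, its parameters and the multi_task_fallback switch are all
hypothetical; frequencies are plain integers in kHz.

```python
def choose_pcpu_frequency(requests_khz, sysfs_khz, multi_task_fallback=False):
    """Pick a frequency for one pcpu (illustrative model only).

    requests_khz: frequencies requested by the tasks currently runnable
                  on the pcpu (one entry per task).
    sysfs_khz:    the frequency configured through sysfs.
    """
    if not requests_khz:
        # Nobody asked for anything: keep the sysfs setting.
        return sysfs_khz
    if len(requests_khz) == 1:
        # Single task on the run queue: honour its request directly.
        # No frequency change is needed on task switches in this case.
        return requests_khz[0]
    if multi_task_fallback:
        # Several tasks: optionally fall back to the sysfs setting.
        return sysfs_khz
    # Several tasks: hold the highest requested frequency so that no
    # frequency switch is paid on every context switch.
    return max(requests_khz)
```

With a dedicated pcpu only the single-task branch is ever taken, which is
the case this patchset targets.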

> Anyway, please repost with Cc to linux-pm so that we can restart the
> discussion there.
> 
> Paolo

Done. Can you please reply with a concise summary of what you object to? 
