From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
Cc: amit.shah@redhat.com, borntraeger@de.ibm.com,
qemu-devel@nongnu.org, afaerber@suse.de, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 1/3] cpu: Provide vcpu throttling interface
Date: Wed, 3 Jun 2015 19:03:56 +0100
Message-ID: <20150603180356.GE2129@work-vm>
In-Reply-To: <556F412C.9090106@linux.vnet.ibm.com>
* Jason J. Herne (jjherne@linux.vnet.ibm.com) wrote:
> On 06/03/2015 03:56 AM, Juan Quintela wrote:
> >"Jason J. Herne" <jjherne@linux.vnet.ibm.com> wrote:
> >>Provide a method to throttle guest cpu execution. CPUState is augmented with
> >>timeout controls and throttle start/stop functions. To throttle the guest cpu
> >>the caller simply has to call the throttle start function and provide a ratio of
> >>sleep time to normal execution time.
> >>
> >>Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> >>Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
> >
> >
> >
> >Hi
> >
> >Sorry that I replied to your previous submission; I had this mail "half done"
> >from yesterday, and I still think it all applies.
> >
> >
> >>---
> >> cpus.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> include/qom/cpu.h | 46 +++++++++++++++++++++++++++++++++++++++++
> >> 2 files changed, 108 insertions(+)
> >>
> >>diff --git a/cpus.c b/cpus.c
> >>index de6469f..7568357 100644
> >>--- a/cpus.c
> >>+++ b/cpus.c
> >>@@ -64,6 +64,9 @@
> >>
> >> #endif /* CONFIG_LINUX */
> >>
> >>+/* Number of ms between cpu throttle operations */
> >>+#define CPU_THROTTLE_TIMESLICE 10
> >
> >
> >We are checking for throttling on each cpu every 10ms.
> >But in patch 2 we can see that we only change the throttling each
> >time we call migration_bitmap_sync(), which only happens once per round
> >through all the pages. Normally auto-converge only matters for machines
> >with lots of memory, so this is going to happen much less often than every
> >10ms (we used to change it every 4 passes). You changed it to every 2 passes,
> >and you add 0.2 each time. I think I would prefer to just change it on every
> >single pass, but add 0.1 each pass? Simpler, and the end result would be the same?
> >
> >
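(An aside for readers of the archive: a rough sketch of how a throttle ratio
maps onto that 10ms timeslice. The helper name and the exact formula are
assumptions made for illustration; this is not code from the series.)

#define CPU_THROTTLE_TIMESLICE 10   /* ms, from the quoted cpus.c hunk */

/* Sketch: a throttle ratio R (sleep time over normal execution time) means
 * the vcpu sleeps roughly R * 10ms in each timeslice; g_usleep() takes
 * microseconds. */
static void cpu_throttle_timeslice(double throttle_ratio)
{
    long sleep_us = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE * 1000);

    if (sleep_us > 0) {
        g_usleep(sleep_us);
    }
}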
>
> Well, we certainly could make it run every pass, but I think it would get
> a little too aggressive then. The reason is, we do not increment the
> throttle rate by adding 0.2 each time; we increment it by multiplying the
> current rate by 2. So doing that every pass would double the exponential
> growth rate. I will admit the numbers I chose are hardly scientific... I
> chose them because they seemed to provide a decent balance of "throttling
> aggression" in my workloads.
That's the advantage of making them parameters.
Dave
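(To make the numbers being discussed concrete, here is a minimal sketch of
the two update policies; the function names and the 0.2 starting value are
assumptions for this example, and only the growth of the rate is shown, not
how it is applied.)

/* Jason's v2 behaviour as described above: double the rate on every second
 * migration pass (0.2 is assumed here as the starting value). */
static double throttle_rate_doubling(double rate, int pass)
{
    if (rate == 0.0) {
        return 0.2;
    }
    return (pass % 2 == 0) ? rate * 2.0 : rate;
}

/* Juan's suggestion: simply add 0.1 on every pass. */
static double throttle_rate_additive(double rate)
{
    return rate + 0.1;
}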
> >This is what the old code did (the new code does exactly the same, just in
> >the previous patch):
> >
> >-static void mig_sleep_cpu(void *opq)
> >-{
> >-    qemu_mutex_unlock_iothread();
> >-    g_usleep(30*1000);
> >-    qemu_mutex_lock_iothread();
> >-}
> >
> >So, what we are doing is calling async_run_on_cpu() with this function.
> >
> >To make a long story short, we end up here:
> >
> >static void flush_queued_work(CPUState *cpu)
> >{
> >    struct qemu_work_item *wi;
> >
> >    if (cpu->queued_work_first == NULL) {
> >        return;
> >    }
> >
> >    while ((wi = cpu->queued_work_first)) {
> >        cpu->queued_work_first = wi->next;
> >        wi->func(wi->data);
> >        wi->done = true;
> >        if (wi->free) {
> >            g_free(wi);
> >        }
> >    }
> >    cpu->queued_work_last = NULL;
> >    qemu_cond_broadcast(&qemu_work_cond);
> >}
> >
> >So, we are walking a list that is protected by the iothread lock, and
> >we are dropping the lock in the middle of the walk..... just hoping
> >that nothing else is calling async_run_on_cpu() from a different place
> >....
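(To spell the hazard out, here is the same loop again with comments added
for this archive; the annotations are not part of Juan's mail.)

    while ((wi = cpu->queued_work_first)) {
        cpu->queued_work_first = wi->next;
        /* For auto-converge, wi->func is the sleep callback (mig_sleep_cpu()
         * in the old code), which drops the iothread lock, sleeps, and then
         * re-takes the lock ... */
        wi->func(wi->data);
        /* ... so while it sleeps, another thread may call async_run_on_cpu()
         * and modify queued_work_first/queued_work_last under our feet. */
        wi->done = true;
        if (wi->free) {
            g_free(wi);
        }
    }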
> >
> >
> >Could we change this to something like:
> >
> >diff --git a/kvm-all.c b/kvm-all.c
> >index 17a3771..ae9635f 100644
> >--- a/kvm-all.c
> >+++ b/kvm-all.c
> >@@ -1787,6 +1787,7 @@ int kvm_cpu_exec(CPUState *cpu)
> > {
> >     struct kvm_run *run = cpu->kvm_run;
> >     int ret, run_ret;
> >+    long throttling_time;
> >
> >     DPRINTF("kvm_cpu_exec()\n");
> >
> >@@ -1813,8 +1814,15 @@ int kvm_cpu_exec(CPUState *cpu)
> >              */
> >             qemu_cpu_kick_self();
> >         }
> >+        throttling_time = cpu->throttling_time;
> >+        cpu->throttling_time = 0;
> >+        cpu->sleeping_time += throttling_time;
> >         qemu_mutex_unlock_iothread();
> >
> >+
> >+        if (throttling_time) {
> >+            g_usleep(throttling_time);
> >+        }
> >         run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
> >
> >         qemu_mutex_lock_iothread();
> >
> >
> >
> >And then, in the place where we do the throttling now, we would just decide
> >what throttling_time and sleeping_time should be? No need to drop the
> >lock or anything.
> >
> >It adds an if to the fast path, but we make a call into the hypervisor
> >each time anyway, so it shouldn't be so bad, no?
> >
> >For tcg/xen we would need to search for a different place to put this code,
> >but you get the idea. The reason for putting it here is that this is where
> >the iothread lock is dropped, so we don't lose any locking; any other place
> >we put it needs review with respect to this.
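(For completeness, a rough sketch of what the other half of that scheme might
look like on the migration side; the helper name is invented here, the
throttling_time field comes from Juan's sketch above, and none of this is
code from the series.)

/* Hypothetical counterpart to the kvm_cpu_exec() change above: instead of
 * queueing a sleep via async_run_on_cpu(), the migration code only records
 * how long each vcpu should sleep, and the vcpu thread picks that up at the
 * point where it already drops the iothread lock. */
static void mig_throttle_cpus(long throttle_us)
{
    CPUState *cpu;

    /* Called with the iothread lock held, e.g. from migration_bitmap_sync(). */
    CPU_FOREACH(cpu) {
        cpu->throttling_time += throttle_us;   /* assumed new CPUState field */
    }
}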
> >
> >
> >Jason, I am really, really sorry to have opened this big can of worms on
> >your patch. Now you know why I was saying that "improving"
> >auto-converge was not as easy as it looked.
> >
> >With respect to the last comment, I also think that we can include
> >Jason's patches. They are an improvement over what we have now. It is
> >just that they need more improvements.
> >
>
> Yes, I understand what you are suggesting here. I can certainly look into it
> if you would prefer that to accepting the current design. The reason I
> did things using the timer and thread is that it fit into the old
> auto-converge code very nicely. Does it make sense to go forward with my
> current design (modified as per your earlier suggestions), and then follow
> up with your proposal at a later date?
>
> --
> -- Jason J. Herne (jjherne@linux.vnet.ibm.com)
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Thread overview: 17+ messages
2015-06-02 17:46 [Qemu-devel] [PATCH v2 0/3] migration: Dynamic cpu throttling for auto-converge Jason J. Herne
2015-06-02 17:46 ` [Qemu-devel] [PATCH v2 1/3] cpu: Provide vcpu throttling interface Jason J. Herne
2015-06-03 7:56 ` Juan Quintela
2015-06-03 18:02 ` Jason J. Herne
2015-06-03 18:03 ` Dr. David Alan Gilbert [this message]
2015-06-03 18:11 ` Jason J. Herne
2015-06-08 19:07 ` Jason J. Herne
2015-06-09 8:06 ` Dr. David Alan Gilbert
2015-06-09 15:14 ` Jason J. Herne
2015-06-10 8:45 ` Dr. David Alan Gilbert
2015-06-10 12:03 ` Jason J. Herne
2015-06-11 6:38 ` Markus Armbruster
2015-06-02 17:46 ` [Qemu-devel] [PATCH v2 2/3] migration: Dynamic cpu throttling for auto-converge Jason J. Herne
2015-06-02 20:08 ` Eric Blake
2015-06-02 17:46 ` [Qemu-devel] [PATCH v2 3/3] qmp/hmp: Add throttle ratio to query-migrate and info migrate Jason J. Herne
2015-06-02 20:11 ` Eric Blake
2015-06-03 17:45 ` Jason J. Herne