From: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
To: Andrey Korolyov <andrey@xdel.ru>
Cc: quintela@redhat.com,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
borntraeger@de.ibm.com, Amit Shah <amit.shah@redhat.com>,
afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH 1/2] cpu: Provide vcpu throttling interface
Date: Mon, 01 Jun 2015 13:04:56 -0400
Message-ID: <556C90B8.5010904@linux.vnet.ibm.com>
In-Reply-To: <CABYiri9TT8yJdhUUWc2waBsxuafupnFOuKzCkfOdyN=mzcMm-Q@mail.gmail.com>
On 06/01/2015 11:23 AM, Andrey Korolyov wrote:
> On Mon, Jun 1, 2015 at 6:17 PM, Jason J. Herne
> <jjherne@linux.vnet.ibm.com> wrote:
>> Provide a method to throttle guest cpu execution. CPUState is augmented with
>> timeout controls and throttle start/stop functions. To throttle the guest cpu
>> the caller simply has to call the throttle start function and provide a ratio of
>> sleep time to normal execution time.
>>
>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
>> ---
>> cpus.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> include/qom/cpu.h | 46 +++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 108 insertions(+)
>>
>> diff --git a/cpus.c b/cpus.c
>> index de6469f..7568357 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -64,6 +64,9 @@
>>
>> #endif /* CONFIG_LINUX */
>>
>> +/* Number of ms between cpu throttle operations */
>> +#define CPU_THROTTLE_TIMESLICE 10
>> +
>> static CPUState *next_cpu;
>> int64_t max_delay;
>> int64_t max_advance;
>> @@ -919,6 +922,65 @@ static void qemu_kvm_wait_io_event(CPUState *cpu)
>>      qemu_wait_io_event_common(cpu);
>> }
>>
>> +static void cpu_throttle_thread(void *opq)
>> +{
>> +    CPUState *cpu = (CPUState *)opq;
>> +    long sleeptime_ms = (long)(cpu->throttle_ratio * CPU_THROTTLE_TIMESLICE);
>> +
>> +    /* Stop the timer if needed */
>> +    if (cpu->throttle_timer_stop) {
>> +        timer_del(cpu->throttle_timer);
>> +        timer_free(cpu->throttle_timer);
>> +        cpu->throttle_timer = NULL;
>> +        return;
>> +    }
>> +
>> +    qemu_mutex_unlock_iothread();
>> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>> +    qemu_mutex_lock_iothread();
>> +
>> +    timer_mod(cpu->throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +              CPU_THROTTLE_TIMESLICE);
>> +}
>> +
>> +static void cpu_throttle_timer_pop(void *opq)
>> +{
>> +    CPUState *cpu = (CPUState *)opq;
>> +
>> +    async_run_on_cpu(cpu, cpu_throttle_thread, cpu);
>> +}
>> +
>> +void cpu_throttle_start(CPUState *cpu, float throttle_ratio)
>> +{
>> +    assert(throttle_ratio > 0);
>> +    cpu->throttle_ratio = throttle_ratio;
>> +
>> +    if (!cpu_throttle_active(cpu)) {
>> +        cpu->throttle_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
>> +                                           cpu_throttle_timer_pop, cpu);
>> +        timer_mod(cpu->throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +                  CPU_THROTTLE_TIMESLICE);
>> +        cpu->throttle_timer_stop = false;
>> +    }
>> +}
>> +
>> +void cpu_throttle_stop(CPUState *cpu)
>> +{
>> +    assert(cpu_throttle_active(cpu));
>> +    cpu->throttle_timer_stop = true;
>> +}
>> +
>> +bool cpu_throttle_active(CPUState *cpu)
>> +{
>> +    return (cpu->throttle_timer != NULL);
>> +}
>> +
>> +float cpu_throttle_get_ratio(CPUState *cpu)
>> +{
>> +    assert(cpu_throttle_active(cpu));
>> +    return cpu->throttle_ratio;
>> +}
>> +
>> static void *qemu_kvm_cpu_thread_fn(void *arg)
>> {
>>      CPUState *cpu = arg;
>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> index 39f0f19..9d16e6a 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -310,6 +310,11 @@ struct CPUState {
>>      uint32_t can_do_io;
>>      int32_t exception_index; /* used by m68k TCG */
>>
>> +    /* vcpu throttling controls */
>> +    QEMUTimer *throttle_timer;
>> +    bool throttle_timer_stop;
>> +    float throttle_ratio;
>> +
>>      /* Note that this is accessed at the start of every TB via a negative
>>         offset from AREG0. Leave this field at the end so as to make the
>>         (absolute value) offset as small as possible. This reduces code
>> @@ -553,6 +558,47 @@ CPUState *qemu_get_cpu(int index);
>> */
>> bool cpu_exists(int64_t id);
>>
>> +/**
>> + * cpu_throttle_start:
>> + * @cpu: The vcpu to throttle
>> + *
>> + * Throttles a vcpu by forcing it to sleep. The duration of the sleep is a
>> + * ratio of sleep time to running time. A ratio of 1.0 corresponds to a 50%
>> + * duty cycle (example: 10ms sleep for every 10ms awake).
>> + *
>> + * cpu_throttle_start can be called as needed to adjust the throttle ratio.
>> + * Once the throttling starts, it will remain in effect until cpu_throttle_stop
>> + * is called.
>> + */
>> +void cpu_throttle_start(CPUState *cpu, float throttle_ratio);
>> +
>> +/**
>> + * cpu_throttle_stop:
>> + * @cpu: The vcpu to stop throttling
>> + *
>> + * Stops the vcpu throttling started by cpu_throttle_start.
>> + */
>> +void cpu_throttle_stop(CPUState *cpu);
>> +
>> +/**
>> + * cpu_throttle_active:
>> + * @cpu: The vcpu to check
>> + *
>> + * Returns %true if this vcpu is currently being throttled, %false otherwise.
>> + */
>> +bool cpu_throttle_active(CPUState *cpu);
>> +
>> +/**
>> + * cpu_throttle_get_ratio:
>> + * @cpu: The vcpu whose throttle ratio to return.
>> + *
>> + * Returns the ratio being used to throttle this vcpu. See cpu_throttle_start
>> + * for details.
>> + *
>> + * Returns The ratio being used to throttle this vcpu.
>> + */
>> +float cpu_throttle_get_ratio(CPUState *cpu);
>> +
>> #ifndef CONFIG_USER_ONLY
>>
>> typedef void (*CPUInterruptHandler)(CPUState *, int);
>> --
>> 1.9.1
>>
>>
>
> Thanks Jason, this patch would be quite interesting, as it eliminates
> the slight scheduler overhead incurred when cgroups are actively used
> for the same task (~5% for a per-vcpu cgroup layout similar to
> libvirt's, for the guest perf numa bench). Are you planning to add a
> wakeup frequency throttler to the same interface as well?
>
No, I was not planning on adding anything. I recognise that the controls
this patch provides are not an ideal throttling mechanism, but they seem
better than what we have today for auto-converge. That said, if people
like the interface and the mechanism, it can certainly be enhanced.
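
For reference, the ratio arithmetic from the quoted patch can be sketched as below. This is a minimal, idealized illustration and not part of the patch: TIMESLICE_MS stands in for CPU_THROTTLE_TIMESLICE, and both helper names are hypothetical. It assumes each cycle is exactly one sleep followed by one 10 ms timeslice of execution, and ignores timer and async_run_on_cpu latency.

```c
#include <assert.h>

/* Stand-in for CPU_THROTTLE_TIMESLICE in the patch: ms of run time per cycle. */
#define TIMESLICE_MS 10

/*
 * Sleep time per cycle, computed the same way as in cpu_throttle_thread():
 * ratio * timeslice, truncated to a whole number of milliseconds.
 */
static long throttle_sleep_ms(float ratio)
{
    assert(ratio > 0); /* cpu_throttle_start() asserts the same condition */
    return (long)(ratio * TIMESLICE_MS);
}

/*
 * Hypothetical helper: percentage of wall-clock time the vcpu spends asleep,
 * assuming an ideal cycle of sleep_ms asleep plus TIMESLICE_MS running.
 */
static double throttle_sleep_percent(float ratio)
{
    long sleep_ms = throttle_sleep_ms(ratio);
    return 100.0 * sleep_ms / (sleep_ms + TIMESLICE_MS);
}
```

Under these assumptions a ratio of 1.0 gives a 10 ms sleep per 10 ms of execution (the 50% duty cycle described in the header comment), and a ratio of 3.0 gives 30 ms of sleep per 10 ms of execution.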
--
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)