From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52265) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YzT98-0000cM-1F for qemu-devel@nongnu.org; Mon, 01 Jun 2015 13:05:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YzT93-0004HX-1f for qemu-devel@nongnu.org; Mon, 01 Jun 2015 13:05:05 -0400
Received: from e19.ny.us.ibm.com ([129.33.205.209]:57291) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YzT92-0004HR-Tx for qemu-devel@nongnu.org; Mon, 01 Jun 2015 13:05:00 -0400
Received: from /spool/local by e19.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 1 Jun 2015 13:05:00 -0400
Received: from b01cxnp23032.gho.pok.ibm.com (b01cxnp23032.gho.pok.ibm.com [9.57.198.27]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 0DA1338C8026 for ; Mon, 1 Jun 2015 13:04:57 -0400 (EDT)
Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by b01cxnp23032.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t51H4uwY39518294 for ; Mon, 1 Jun 2015 17:04:56 GMT
Received: from d01av04.pok.ibm.com (localhost [127.0.0.1]) by d01av04.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t51H4ude014956 for ; Mon, 1 Jun 2015 13:04:56 -0400
Message-ID: <556C90B8.5010904@linux.vnet.ibm.com>
Date: Mon, 01 Jun 2015 13:04:56 -0400
From: "Jason J. Herne"
MIME-Version: 1.0
References: <1433171851-18507-1-git-send-email-jjherne@linux.vnet.ibm.com> <1433171851-18507-2-git-send-email-jjherne@linux.vnet.ibm.com>
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/2] cpu: Provide vcpu throttling interface
Reply-To: jjherne@linux.vnet.ibm.com
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Andrey Korolyov
Cc: quintela@redhat.com, "qemu-devel@nongnu.org" , "Dr. David Alan Gilbert" , borntraeger@de.ibm.com, Amit Shah , afaerber@suse.de

On 06/01/2015 11:23 AM, Andrey Korolyov wrote:
> On Mon, Jun 1, 2015 at 6:17 PM, Jason J. Herne
> wrote:
>> Provide a method to throttle guest cpu execution. CPUState is augmented with
>> timeout controls and throttle start/stop functions. To throttle the guest cpu
>> the caller simply has to call the throttle start function and provide a ratio of
>> sleep time to normal execution time.
>>
>> Signed-off-by: Jason J. Herne
>> Reviewed-by: Matthew Rosato
>> ---
>>  cpus.c            | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/qom/cpu.h | 46 +++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 108 insertions(+)
>>
>> diff --git a/cpus.c b/cpus.c
>> index de6469f..7568357 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -64,6 +64,9 @@
>>
>>  #endif /* CONFIG_LINUX */
>>
>> +/* Number of ms between cpu throttle operations */
>> +#define CPU_THROTTLE_TIMESLICE 10
>> +
>>  static CPUState *next_cpu;
>>  int64_t max_delay;
>>  int64_t max_advance;
>> @@ -919,6 +922,65 @@ static void qemu_kvm_wait_io_event(CPUState *cpu)
>>      qemu_wait_io_event_common(cpu);
>>  }
>>
>> +static void cpu_throttle_thread(void *opq)
>> +{
>> +    CPUState *cpu = (CPUState *)opq;
>> +    long sleeptime_ms = (long)(cpu->throttle_ratio * CPU_THROTTLE_TIMESLICE);
>> +
>> +    /* Stop the timer if needed */
>> +    if (cpu->throttle_timer_stop) {
>> +        timer_del(cpu->throttle_timer);
>> +        timer_free(cpu->throttle_timer);
>> +        cpu->throttle_timer = NULL;
>> +        return;
>> +    }
>> +
>> +    qemu_mutex_unlock_iothread();
>> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>> +    qemu_mutex_lock_iothread();
>> +
>> +    timer_mod(cpu->throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +                                   CPU_THROTTLE_TIMESLICE);
>> +}
>> +
>> +static void cpu_throttle_timer_pop(void *opq)
>> +{
>> +    CPUState *cpu = (CPUState *)opq;
>> +
>> +    async_run_on_cpu(cpu, cpu_throttle_thread, cpu);
>> +}
>> +
>> +void cpu_throttle_start(CPUState *cpu, float throttle_ratio)
>> +{
>> +    assert(throttle_ratio > 0);
>> +    cpu->throttle_ratio = throttle_ratio;
>> +
>> +    if (!cpu_throttle_active(cpu)) {
>> +        cpu->throttle_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
>> +                                           cpu_throttle_timer_pop, cpu);
>> +        timer_mod(cpu->throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +                                       CPU_THROTTLE_TIMESLICE);
>> +        cpu->throttle_timer_stop = false;
>> +    }
>> +}
>> +
>> +void cpu_throttle_stop(CPUState *cpu)
>> +{
>> +    assert(cpu_throttle_active(cpu));
>> +    cpu->throttle_timer_stop = true;
>> +}
>> +
>> +bool cpu_throttle_active(CPUState *cpu)
>> +{
>> +    return (cpu->throttle_timer != NULL);
>> +}
>> +
>> +float cpu_throttle_get_ratio(CPUState *cpu)
>> +{
>> +    assert(cpu_throttle_active(cpu));
>> +    return cpu->throttle_ratio;
>> +}
>> +
>>  static void *qemu_kvm_cpu_thread_fn(void *arg)
>>  {
>>      CPUState *cpu = arg;
>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> index 39f0f19..9d16e6a 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -310,6 +310,11 @@ struct CPUState {
>>      uint32_t can_do_io;
>>      int32_t exception_index; /* used by m68k TCG */
>>
>> +    /* vcpu throttling controls */
>> +    QEMUTimer *throttle_timer;
>> +    bool throttle_timer_stop;
>> +    float throttle_ratio;
>> +
>>      /* Note that this is accessed at the start of every TB via a negative
>>         offset from AREG0. Leave this field at the end so as to make the
>>         (absolute value) offset as small as possible. This reduces code
>> @@ -553,6 +558,47 @@ CPUState *qemu_get_cpu(int index);
>>   */
>>  bool cpu_exists(int64_t id);
>>
>> +/**
>> + * cpu_throttle_start:
>> + * @cpu: The vcpu to throttle
>> + *
>> + * Throttles a vcpu by forcing it to sleep. The duration of the sleep is a
>> + * ratio of sleep time to running time. A ratio of 1.0 corresponds to a 50%
>> + * duty cycle (example: 10ms sleep for every 10ms awake).
>> + *
>> + * cpu_throttle_start can be called as needed to adjust the throttle ratio.
>> + * Once the throttling starts, it will remain in effect until cpu_throttle_stop
>> + * is called.
>> + */
>> +void cpu_throttle_start(CPUState *cpu, float throttle_ratio);
>> +
>> +/**
>> + * cpu_throttle_stop:
>> + * @cpu: The vcpu to stop throttling
>> + *
>> + * Stops the vcpu throttling started by cpu_throttle_start.
>> + */
>> +void cpu_throttle_stop(CPUState *cpu);
>> +
>> +/**
>> + * cpu_throttle_active:
>> + * @cpu: The vcpu to check
>> + *
>> + * Returns %true if this vcpu is currently being throttled, %false otherwise.
>> + */
>> +bool cpu_throttle_active(CPUState *cpu);
>> +
>> +/**
>> + * cpu_throttle_get_ratio:
>> + * @cpu: The vcpu whose throttle ratio to return.
>> + *
>> + * Returns the ratio being used to throttle this vcpu. See cpu_throttle_start
>> + * for details.
>> + *
>> + * Returns The ratio being used to throttle this vcpu.
>> + */
>> +float cpu_throttle_get_ratio(CPUState *cpu);
>> +
>>  #ifndef CONFIG_USER_ONLY
>>
>>  typedef void (*CPUInterruptHandler)(CPUState *, int);
>> --
>> 1.9.1
>>
>>
>
> Thanks Jason, this patch would be quite interesting as it eliminates
> slight overhead from scheduler when cgroups are actively used for same
> task (~5% for per-vcpu cgroup layout similar to libvirt's one for
> guest perf numa bench). Are you planning to add wakeup frequency
> throttler as well to same interface?
>

No, I was not planning on adding anything. I recognise that the controls
this patch provides are not an ideal throttling mechanism. But it seems
better than what we have today for auto-converge. That said, if people
like the interface and the mechanism, it certainly can be enhanced.

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)
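
For reference, the ratio arithmetic described in the cpu_throttle_start
doc comment can be sketched as a small standalone C fragment. This is only
an illustration, not part of the patch: TIMESLICE_MS mirrors
CPU_THROTTLE_TIMESLICE, throttle_sleep_ms() repeats the computation from
cpu_throttle_thread(), and throttle_run_percent() is a hypothetical helper
that assumes the vcpu runs for roughly one full timeslice between sleeps
and ignores timer and scheduling latency.

```c
#include <assert.h>

/* Mirrors CPU_THROTTLE_TIMESLICE from the patch: ms the vcpu is assumed
 * to run between two forced sleeps. */
#define TIMESLICE_MS 10

/* Sleep time per cycle, computed the same way as in cpu_throttle_thread().
 * The cast truncates, so a ratio below 0.1 yields a 0 ms sleep. */
long throttle_sleep_ms(float ratio)
{
    assert(ratio > 0);
    return (long)(ratio * TIMESLICE_MS);
}

/* Hypothetical helper (not in the patch): approximate share of wall-clock
 * time the vcpu spends running, as a percentage. */
double throttle_run_percent(float ratio)
{
    long sleep_ms = throttle_sleep_ms(ratio);
    return 100.0 * TIMESLICE_MS / (TIMESLICE_MS + sleep_ms);
}
```

Under these assumptions a ratio of 1.0 gives a 10 ms sleep per 10 ms of
run time (the 50% duty cycle from the doc comment), and a ratio of 3.0
gives a 30 ms sleep, leaving the vcpu running about 25% of the time.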