From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56251) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bgw1e-0001Zu-FN for qemu-devel@nongnu.org; Mon, 05 Sep 2016 11:41:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bgw1Y-00016s-E6 for qemu-devel@nongnu.org; Mon, 05 Sep 2016 11:41:33 -0400 Received: from mail-wm0-x231.google.com ([2a00:1450:400c:c09::231]:37479) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bgw1Y-00016h-3h for qemu-devel@nongnu.org; Mon, 05 Sep 2016 11:41:28 -0400 Received: by mail-wm0-x231.google.com with SMTP id w12so48063592wmf.0 for ; Mon, 05 Sep 2016 08:41:27 -0700 (PDT) References: <1472725227-10374-1-git-send-email-pbonzini@redhat.com> <1472725227-10374-11-git-send-email-pbonzini@redhat.com> <87pooiz8nh.fsf@linaro.org> From: Alex =?utf-8?Q?Benn=C3=A9e?= In-reply-to: Date: Mon, 05 Sep 2016 16:41:23 +0100 Message-ID: <87mvjmz74s.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [PATCH 10/12] cpus-common: Introduce async_safe_run_on_cpu() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: qemu-devel@nongnu.org, Sergey Fedorov , Sergey Fedorov , "Emilio G. Cota" Paolo Bonzini writes: > On 05/09/2016 17:08, Alex Bennée wrote: >> >> Paolo Bonzini writes: >> >>> Signed-off-by: Paolo Bonzini >>> --- >>> cpus-common.c | 25 +++++++++++++++++++++++++ >>> include/qom/cpu.h | 12 ++++++++++++ >>> 2 files changed, 37 insertions(+) >>> >>> diff --git a/cpus-common.c b/cpus-common.c >>> index 59c8dc8..88cf5ec 100644 >>> --- a/cpus-common.c >>> +++ b/cpus-common.c >>> @@ -144,6 +144,11 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) >>> queue_work_on_cpu(cpu, wi); >>> } >>> >>> +typedef struct SafeWorkItem { >>> + run_on_cpu_func func; >>> + void *data; >>> +} SafeWorkItem; >>> + >>> /* Wait for pending exclusive operations to complete. The exclusive lock >>> must be held. */ >>> static inline void exclusive_idle(void) >>> @@ -208,6 +213,26 @@ void cpu_exec_end(CPUState *cpu) >>> qemu_mutex_unlock(&qemu_cpu_list_mutex); >>> } >>> >>> +static void async_safe_run_on_cpu_fn(CPUState *cpu, void *data) >>> +{ >>> + SafeWorkItem *w = data; >>> + >>> + start_exclusive(); >>> + w->func(cpu, w->data); >>> + end_exclusive(); >>> + g_free(w); >>> +} >>> + >>> +void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) >>> +{ >>> + SafeWorkItem *w = g_new(SafeWorkItem, 1); >> >> OK so I appreciate this approach is a neat way to embed safe work in the >> existing queue but does it really offer that much more for yet another >> dynamic allocation vs an extra flag to the WorkItem? > > Actually I had it that way in the "first version" (the one that I > promised last Monday). The questions are: > > 1) is it so different to have 1 vs. 2 mini allocations > > 2) is this a fast path > > but it's no big deal, I can certainly change back. When the system is under stress it all adds up - but as I said the test case was a fairly special. Keeping it all in one structure does make it easier to convert to an array later though. I'll see if I can come up with a differently pathological case and I'll benchmark the two approaches. > > Paolo > >> In previous iterations I can DoS QEMU with a guest that does heavy >> cross-CPU TLB flushing which led to a storm of mini allocations (for the >> list and associated structures). This caused the massive memory usage as >> the queue backed up. >> >> I appreciate it was a fairly special test case and I introduced other >> mitigations in the base patches cputlb code to get around it however it >> was the driver for me experimenting with the pre-allocated array for >> holding work items. >> >>> + >>> + w->func = func; >>> + w->data = data; >>> + >>> + async_run_on_cpu(cpu, async_safe_run_on_cpu_fn, w); >>> +} >>> + >>> void process_queued_cpu_work(CPUState *cpu) >>> { >>> struct qemu_work_item *wi; >>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h >>> index 0e04e8f..54a875e 100644 >>> --- a/include/qom/cpu.h >>> +++ b/include/qom/cpu.h >>> @@ -663,6 +663,18 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >>> void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >>> >>> /** >>> + * async_safe_run_on_cpu: >>> + * @cpu: The vCPU to run on. >>> + * @func: The function to be executed. >>> + * @data: Data to pass to the function. >>> + * >>> + * Schedules the function @func for execution on the vCPU @cpu asynchronously, >>> + * while all other vCPUs are sleeping. @func is called with the CPU list lock >>> + * taken (and for system emulation the BQL); any other lock can be taken safely. >>> + */ >>> +void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >>> + >>> +/** >>> * qemu_get_cpu: >>> * @index: The CPUState@cpu_index value of the CPU to obtain. >>> * >> >> >> -- >> Alex Bennée >> -- Alex Bennée