From: Alex Bennée
To: Sergey Fedorov
Cc: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com, a.rigo@virtualopensystems.com, cota@braap.org, qemu-devel@nongnu.org, mark.burton@greensocs.com, pbonzini@redhat.com, jan.kiszka@siemens.com, rth@twiddle.net, peter.maydell@linaro.org, claudio.fontana@huawei.com, Peter Crosthwaite, Andreas Färber
Subject: Re: [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu.
Date: Mon, 06 Jun 2016 09:50:39 +0100
Message-ID: <87k2i2ituo.fsf@linaro.org>
In-reply-to: <57544CC2.8030303@gmail.com>

Sergey Fedorov writes:

> On 15/04/16 17:23, Alex Bennée wrote:
>> diff --git a/cpu-exec-common.c b/cpu-exec-common.c
>> index 3d7eaa3..c2f7c29 100644
>> --- a/cpu-exec-common.c
>> +++ b/cpu-exec-common.c
>> @@ -79,3 +79,4 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
>>      cpu->current_tb = NULL;
>>      siglongjmp(cpu->jmp_env, 1);
>>  }
>> +
>
> Do we really want this extra line at the end of the file? :)
>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index 42cec05..2f362f8 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -365,6 +365,17 @@ int cpu_exec(CPUState *cpu)
>>      uintptr_t next_tb;
>>      SyncClocks sc;
>>
>> +    /*
>> +     * This happens when somebody doesn't want this CPU to start,
>> +     * in the MTTCG case.
>> +     */
>> +#ifdef CONFIG_SOFTMMU
>> +    if (async_safe_work_pending()) {
>> +        cpu->exit_request = 1;
>> +        return 0;
>> +    }
>> +#endif
>> +
>
> I can't see the point of this change. We check for
> "async_safe_work_pending()" each time round the loop in
> qemu_tcg_cpu_thread_fn() before entering tcg_cpu_exec(), and we do
> that after 'cpu->exit_request' gets reset.

I can't remember what this was meant to fix now, so I'll remove it as
suggested.

>
>>      /* replay_interrupt may need current_cpu */
>>      current_cpu = cpu;
>>
>> diff --git a/cpus.c b/cpus.c
>> index 9177161..860e2a9 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -928,6 +928,19 @@ static QemuCond qemu_cpu_cond;
>>  static QemuCond qemu_pause_cond;
>>  static QemuCond qemu_work_cond;
>>
>> +/* safe work */
>> +static int safe_work_pending;
>> +static int tcg_scheduled_cpus;
>> +
>> +typedef struct {
>> +    CPUState *cpu;  /* CPU affected */
>
> Do we really need to pass CPU state to the safe work? Async safe work
> seems to be meant for cross-CPU tasks rather than CPU-specific ones.
> TB flush, for example, doesn't really require a CPU to be passed to
> the async safe work handler: we should be able to check for code
> buffer overflow before scheduling the job. Because of this, it seems
> reasonable to rename async_safe_run_on_cpu() to something like
> request_async_cpu_safe_work().

CPUState is common enough to be passed around, and this is a shared
array for all queued work. I wanted to ensure that the second most
common operation, CPUState plus one extra bit of information, was
covered without having to resort to memory allocation. Previous
iterations of this work allocated both the structures and the linked
list, and I was working hard to make the common case allocation-free.
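To illustrate what I mean by allocation-free, the queueing boils down
to a by-value copy (this is a sketch of the pattern rather than the
exact patch code; queue_safe_work_locked is an illustrative name, and
CPUState/run_on_cpu_func come from the existing QEMU headers):

typedef struct {
    CPUState *cpu;        /* CPU affected */
    run_on_cpu_func func; /* helper function */
    void *data;           /* helper data */
} qemu_safe_work_item;

static GArray *safe_work; /* pre-sized with g_array_sized_new() */

/* Illustrative helper: caller holds safe_work_mutex. */
static void queue_safe_work_locked(CPUState *cpu, run_on_cpu_func func,
                                   void *data)
{
    qemu_safe_work_item wi = { .cpu = cpu, .func = func, .data = data };
    /* g_array_append_val() copies the struct into storage the array
     * already owns, so queueing costs no malloc/free at all; the
     * array only reallocates if it outgrows its initial size. */
    g_array_append_val(safe_work, wi);
}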
>
>> +    run_on_cpu_func func; /* Helper function */
>> +    void *data;           /* Helper data */
>> +} qemu_safe_work_item;
>> +
>> +static GArray *safe_work; /* array of qemu_safe_work_items */
>
> I think we shouldn't bother with "static" memory allocation for the
> list of work items. We should be fine with dynamic allocation in
> async_safe_run_on_cpu().

Why? Under heavy load these add up fast: some TLB flush operations can
queue CPU x ASID deferred operations. It's not exactly a large array
to allocate at the start. We could make the initial size smaller if
you want.

> Halting all the CPUs doesn't seem to be cheap anyhow, thus it
> shouldn't be used frequently. The only use so far is to make
> tb_flush() safe. If we look at how tb_flush() is used, we can see
> that this is either code buffer exhaustion or some rare operation
> which needs all the previous translations to be flushed. The former
> shouldn't happen often; the latter isn't supposed to be cheap and
> fast anyway.

The various flushes are more common under ARMv8; we are not just
talking about the big tb_flush using this mechanism.

>
>> +static QemuMutex safe_work_mutex;
>> +
>>  void qemu_init_cpu_loop(void)
>>  {
>>      qemu_init_sigbus();
>> @@ -937,6 +950,9 @@ void qemu_init_cpu_loop(void)
>>      qemu_mutex_init(&qemu_global_mutex);
>>
>>      qemu_thread_get_self(&io_thread);
>> +
>> +    safe_work = g_array_sized_new(TRUE, TRUE, sizeof(qemu_safe_work_item), 128);
>> +    qemu_mutex_init(&safe_work_mutex);
>>  }
>>
>>  void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>> @@ -997,6 +1013,81 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>>      qemu_cpu_kick(cpu);
>>  }
>>
>> +/*
>> + * Safe work interface
>> + *
>> + * Safe work is defined as work that requires the system to be
>> + * quiescent before making changes.
>> + */
>> +
>> +void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>> +{
>> +    CPUState *iter;
>> +    qemu_safe_work_item wi;
>> +    wi.cpu = cpu;
>> +    wi.func = func;
>> +    wi.data = data;
>> +
>> +    qemu_mutex_lock(&safe_work_mutex);
>> +    g_array_append_val(safe_work, wi);
>> +    atomic_inc(&safe_work_pending);
>> +    qemu_mutex_unlock(&safe_work_mutex);
>> +
>> +    /* Signal all vCPUs to halt */
>> +    CPU_FOREACH(iter) {
>> +        qemu_cpu_kick(iter);
>> +    }
>> +}
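For reference, a caller would defer work through this along the
following lines (hypothetical example: do_global_flush and
some_target_helper are illustrative names, not functions from this
series):

/* Runs once flush_queued_safe_work() has brought the system to a
 * quiescent state, so it is safe to rewrite shared structures such
 * as the translation buffer. */
static void do_global_flush(CPUState *cpu, void *data)
{
    tb_flush(cpu);
}

void some_target_helper(CPUState *cpu)
{
    /* Queues the item, bumps safe_work_pending, and kicks every
     * vCPU out of its execution loop. */
    async_safe_run_on_cpu(cpu, do_global_flush, NULL);
}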
>
>> +/**
>> + * flush_queued_safe_work:
>> + *
>> + * @scheduled_cpu_count
>> + *
>> + * If not 0 will signal the other vCPUs and sleep. The last vCPU to
>> + * get to the function then drains the queue while the system is in
>> + * a quiescent state. This allows the operations to change shared
>> + * structures.
>> + *
>> + * @see async_run_safe_work_on_cpu
>> + */
>> +static void flush_queued_safe_work(int scheduled_cpu_count)
>> +{
>> +    qemu_safe_work_item *wi;
>> +    int i;
>> +
>> +    /* bail out if there is nothing to do */
>> +    if (!async_safe_work_pending()) {
>> +        return;
>> +    }
>> +
>> +    if (scheduled_cpu_count) {
>> +
>> +        /* Nothing to do but sleep */
>> +        qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
>
> We'll not be able to extend this to user-mode emulation because we
> don't have BQL there. I think we should consider something like
> exclusive_idle()/start_exclusive()/end_exclusive()/cpu_exec_start()/cpu_exec_end().

This is true, sacrificed on the altar of making things simpler. It
would be nice if we had a single simple dispatch point for linux-user
too, so we could make things sane there as well. I'll have another
look at the exclusive code, although at first blush it seemed a
little more complex because it doesn't have the luxury of knowing how
many threads are running (rough sketch of that mechanism at the end
of this mail).

>
> Kind regards,
> Sergey
>
>> +
>> +    } else {
>> +
>> +        /* We can now do the work */
>> +        qemu_mutex_lock(&safe_work_mutex);
>> +        for (i = 0; i < safe_work->len; i++) {
>> +            wi = &g_array_index(safe_work, qemu_safe_work_item, i);
>> +            wi->func(wi->cpu, wi->data);
>> +        }
>> +        g_array_remove_range(safe_work, 0, safe_work->len);
>> +        atomic_set(&safe_work_pending, 0);
>> +        qemu_mutex_unlock(&safe_work_mutex);
>> +
>> +        /* Wake everyone up */
>> +        qemu_cond_broadcast(&qemu_work_cond);
>> +    }
>> +}
>> +
>> +bool async_safe_work_pending(void)
>> +{
>> +    return (atomic_read(&safe_work_pending) != 0);
>> +}
>> +
>>  static void flush_queued_work(CPUState *cpu)
>>  {
>>      struct qemu_work_item *wi;
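For anyone following along, the linux-user exclusive machinery Sergey
mentions looks roughly like this (a simplified sketch from memory,
written with QemuMutex/QemuCond where linux-user/main.c actually uses
raw pthread primitives; see the real code for the details):

static QemuMutex exclusive_lock;
static QemuCond exclusive_cond;    /* the requester waits here */
static QemuCond exclusive_resume;  /* everyone else waits here */
static int pending_cpus;

/* Simplified sketch of linux-user's start_exclusive(). */
static void start_exclusive(void)
{
    CPUState *other;

    qemu_mutex_lock(&exclusive_lock);
    /* Wait out any exclusive section already in flight. */
    while (pending_cpus) {
        qemu_cond_wait(&exclusive_resume, &exclusive_lock);
    }
    pending_cpus = 1;
    /* Count the threads currently in the execution loop and kick
     * them out: this is where it discovers, on the fly, how many
     * threads are running. */
    CPU_FOREACH(other) {
        if (other->running) {
            pending_cpus++;
            cpu_exit(other);
        }
    }
    /* Each exiting thread decrements pending_cpus in cpu_exec_end()
     * and signals exclusive_cond; once only we remain, we own the
     * machine. end_exclusive() resets pending_cpus and broadcasts
     * exclusive_resume. */
    while (pending_cpus > 1) {
        qemu_cond_wait(&exclusive_cond, &exclusive_lock);
    }
}

--
Alex Bennée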