Re: [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu.

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Alex Bennée" <alex.bennee@linaro.org>
To: Sergey Fedorov <serge.fdrv@gmail.com>
Cc: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com,
	a.rigo@virtualopensystems.com, cota@braap.org,
	qemu-devel@nongnu.org, mark.burton@greensocs.com,
	pbonzini@redhat.com, jan.kiszka@siemens.com, rth@twiddle.net,
	peter.maydell@linaro.org, claudio.fontana@huawei.com,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Andreas Färber" <afaerber@suse.de>
Subject: Re: [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu.
Date: Mon, 06 Jun 2016 09:50:39 +0100	[thread overview]
Message-ID: <87k2i2ituo.fsf@linaro.org> (raw)
In-Reply-To: <57544CC2.8030303@gmail.com>


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 15/04/16 17:23, Alex Bennée wrote:
>> diff --git a/cpu-exec-common.c b/cpu-exec-common.c
>> index 3d7eaa3..c2f7c29 100644
>> --- a/cpu-exec-common.c
>> +++ b/cpu-exec-common.c
>> @@ -79,3 +79,4 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
>>      cpu->current_tb = NULL;
>>      siglongjmp(cpu->jmp_env, 1);
>>  }
>> +
>
> Do we rally want this extra line at the end of the file? :)
>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index 42cec05..2f362f8 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -365,6 +365,17 @@ int cpu_exec(CPUState *cpu)
>>      uintptr_t next_tb;
>>      SyncClocks sc;
>>
>> +    /*
>> +     * This happen when somebody doesn't want this CPU to start
>> +     * In case of MTTCG.
>> +     */
>> +#ifdef CONFIG_SOFTMMU
>> +    if (async_safe_work_pending()) {
>> +        cpu->exit_request = 1;
>> +        return 0;
>> +    }
>> +#endif
>> +
>
> I can't see the point of this change. We check for
> "async_safe_work_pending()" each time round the loop in
> qemu_tcg_cpu_thread_fn() before entering tcg_cpu_exec() and we do that
> after 'cpu->exit_request' gets reset.

I can't remember what this was meant to fix now so I'll remove it as suggested.

>
>>      /* replay_interrupt may need current_cpu */
>>      current_cpu = cpu;
>>
>> diff --git a/cpus.c b/cpus.c
>> index 9177161..860e2a9 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -928,6 +928,19 @@ static QemuCond qemu_cpu_cond;
>>  static QemuCond qemu_pause_cond;
>>  static QemuCond qemu_work_cond;
>>
>> +/* safe work */
>> +static int safe_work_pending;
>> +static int tcg_scheduled_cpus;
>> +
>> +typedef struct {
>> +    CPUState            *cpu;  /* CPU affected */
>
> Do we really need to pass CPU state to the safe work? Async safe work
> seems to be supposed to handle cross-CPU tasks rather than CPU-specific
> ones. TB flush, for example, doesn't really require CPU to be passed to
> the async safe work handler: we should be able to check for code buffer
> overflow before scheduling the job. Because of this, it seems to be
> reasonable to rename async_safe_run_on_cpu() to something like
> request_async_cpu_safe_work().

CPUState is common enough to be passed around and this is a shared array
for all queued work. I wanted to ensure the second most common operation
of CPUState + one extra bit of information was covered without having to
resort to memory allocation.

The previous iterations of this work had allocations for structures and
the linked list and I was working hard to make the common case
allocation free.

>
>> +    run_on_cpu_func     func;  /* Helper function */
>> +    void                *data; /* Helper data */
>> +} qemu_safe_work_item;
>> +
>> +static GArray *safe_work;       /* array of qemu_safe_work_items */
>
> I think we shouldn't bother with "static" memory allocation for the list
> of work items. We should be fine with dynamic allocation in
> async_safe_run_on_cpu().

Why? Under heavy load these add up fast, some TLB flush operations can
queue CPU x ASID deferred operations. It's not exactly a large array to
allocate at the start. We could make the initial size smaller if you
want.

> Halting all the CPUs doesn't seem to be cheap
> anyhow, thus it shouldn't be used frequently. The only use so far is to
> make tb_flush() safe. If we look at how tb_flush() is used we can see
> that this is either code buffer exhaustion or some rare operation which
> needs all the previous translations to be flushed. The former shouldn't
> happen often, the latter isn't supposed to be cheap and fast anyway.

The various flushes are more common under ARMv8, we are not just talking
about the big tb_flush using this mechanism.

>
>> +static QemuMutex safe_work_mutex;
>> +
>>  void qemu_init_cpu_loop(void)
>>  {
>>      qemu_init_sigbus();
>> @@ -937,6 +950,9 @@ void qemu_init_cpu_loop(void)
>>      qemu_mutex_init(&qemu_global_mutex);
>>
>>      qemu_thread_get_self(&io_thread);
>> +
>> +    safe_work = g_array_sized_new(TRUE, TRUE, sizeof(qemu_safe_work_item), 128);
>> +    qemu_mutex_init(&safe_work_mutex);
>>  }
>>
>>  void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>> @@ -997,6 +1013,81 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>>      qemu_cpu_kick(cpu);
>>  }
>>
>> +/*
>> + * Safe work interface
>> + *
>> + * Safe work is defined as work that requires the system to be
>> + * quiescent before making changes.
>> + */
>> +
>> +void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>> +{
>> +    CPUState *iter;
>> +    qemu_safe_work_item wi;
>> +    wi.cpu = cpu;
>> +    wi.func = func;
>> +    wi.data = data;
>> +
>> +    qemu_mutex_lock(&safe_work_mutex);
>> +    g_array_append_val(safe_work, wi);
>> +    atomic_inc(&safe_work_pending);
>> +    qemu_mutex_unlock(&safe_work_mutex);
>> +
>> +    /* Signal all vCPUs to halt */
>> +    CPU_FOREACH(iter) {
>> +        qemu_cpu_kick(iter);
>> +    }
>> +}
>> +
>> +/**
>> + * flush_queued_safe_work:
>> + *
>> + * @scheduled_cpu_count
>> + *
>> + * If not 0 will signal the other vCPUs and sleep. The last vCPU to
>> + * get to the function then drains the queue while the system is in a
>> + * quiescent state. This allows the operations to change shared
>> + * structures.
>> + *
>> + * @see async_run_safe_work_on_cpu
>> + */
>> +static void flush_queued_safe_work(int scheduled_cpu_count)
>> +{
>> +    qemu_safe_work_item *wi;
>> +    int i;
>> +
>> +    /* bail out if there is nothing to do */
>> +    if (!async_safe_work_pending()) {
>> +        return;
>> +    }
>> +
>> +    if (scheduled_cpu_count) {
>> +
>> +        /* Nothing to do but sleep */
>> +        qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
>
> We'll not be able to extend this to user-mode emulation because we don't
> have BQL there. I think we should consider something like
> exclusive_idle()/start_exclusive()/end_exclusive()/cpu_exec_start()/cpu_exec_end().

This is true, sacrificed on the alter of making things simpler. It would
be nice if we had a single simple dispatch point for linux-user as well
and we could make things sane there as well.

I'll have another look at the exclusive code although at first blush it
seemed a little more complex because it didn't have the luxury of
knowing how many threads are running.

>
> Kind regards,
> Sergey
>
>> +
>> +    } else {
>> +
>> +        /* We can now do the work */
>> +        qemu_mutex_lock(&safe_work_mutex);
>> +        for (i = 0; i < safe_work->len; i++) {
>> +            wi = &g_array_index(safe_work, qemu_safe_work_item, i);
>> +            wi->func(wi->cpu, wi->data);
>> +        }
>> +        g_array_remove_range(safe_work, 0, safe_work->len);
>> +        atomic_set(&safe_work_pending, 0);
>> +        qemu_mutex_unlock(&safe_work_mutex);
>> +
>> +        /* Wake everyone up */
>> +        qemu_cond_broadcast(&qemu_work_cond);
>> +    }
>> +}
>> +
>> +bool async_safe_work_pending(void)
>> +{
>> +    return (atomic_read(&safe_work_pending) != 0);
>> +}
>> +
>>  static void flush_queued_work(CPUState *cpu)
>>  {
>>      struct qemu_work_item *wi;


--
Alex Bennée

next prev parent reply	other threads:[~2016-06-06  8:50 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-15 14:23 [Qemu-devel] [RFC v1 00/12] Enable MTTCG for 32 bit arm on x86 Alex Bennée
2016-04-15 14:23 ` Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 01/12] include: move CPU-related definitions out of qemu-common.h Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 02/12] tcg/i386: Make direct jump patching thread-safe Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 03/12] qemu-thread: add simple test-and-set spinlock Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 04/12] atomic: introduce atomic_dec_fetch Alex Bennée
2016-06-02 20:34   ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 05/12] atomic: introduce cmpxchg_bool Alex Bennée
2016-04-15 16:22   ` Richard Henderson
2016-04-15 17:06     ` Alex Bennée
2016-06-03 16:45   ` Sergey Fedorov
2016-06-03 19:12     ` Alex Bennée
2016-06-03 19:20       ` Eric Blake
2016-04-15 14:23 ` [RFC v1 06/12] cpus: pass CPUState to run_on_cpu helpers Alex Bennée
2016-04-15 14:23   ` [Qemu-devel] " Alex Bennée
2016-04-20 18:59   ` Eduardo Habkost
2016-04-20 18:59     ` [Qemu-devel] " Eduardo Habkost
2016-04-20 19:50     ` Alex Bennée
2016-04-20 19:50       ` [Qemu-devel] " Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu Alex Bennée
2016-06-05 16:01   ` Sergey Fedorov
2016-06-06  8:50     ` Alex Bennée [this message]
2016-06-06  9:38       ` Sergey Fedorov
2016-06-05 16:44   ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 08/12] cputlb: introduce tlb_flush_* async work Alex Bennée
2016-06-05 16:39   ` Sergey Fedorov
2016-06-06  8:54     ` Alex Bennée
2016-06-06 10:04       ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 09/12] translate-all: introduces tb_flush_safe Alex Bennée
2016-06-05 16:48   ` Sergey Fedorov
2016-06-06  8:54     ` Alex Bennée
2016-06-06 10:06       ` Sergey Fedorov
2016-04-15 14:23 ` [RFC v1 10/12] arm: use tlb_flush_page_all for tlbimva[a] Alex Bennée
2016-04-15 14:23   ` [Qemu-devel] " Alex Bennée
2016-06-05 16:54   ` Sergey Fedorov
2016-06-05 16:54     ` [Qemu-devel] " Sergey Fedorov
2016-06-06  8:55     ` Alex Bennée
2016-06-06  8:55       ` [Qemu-devel] " Alex Bennée
2016-04-15 14:23 ` [RFC v1 11/12] arm: atomically check the exclusive value in a STREX Alex Bennée
2016-04-15 14:23   ` [Qemu-devel] " Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 12/12] cpus: default MTTCG to on for 32 bit ARM on x86 Alex Bennée
2016-06-05 17:12   ` Sergey Fedorov
2016-06-06  8:58     ` Alex Bennée
2016-06-06 10:19       ` Sergey Fedorov
2016-06-06 10:26   ` Peter Maydell
2016-06-06 14:28     ` Alex Bennée
2016-06-06 14:37       ` Peter Maydell
2016-04-15 19:12 ` [Qemu-devel] [RFC v1 00/12] Enable MTTCG for 32 bit arm " Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k2i2ituo.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=a.rigo@virtualopensystems.com \
    --cc=afaerber@suse.de \
    --cc=claudio.fontana@huawei.com \
    --cc=cota@braap.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=fred.konrad@greensocs.com \
    --cc=jan.kiszka@siemens.com \
    --cc=mark.burton@greensocs.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.