Re: [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Sergey Fedorov <serge.fdrv@gmail.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com,
	a.rigo@virtualopensystems.com, cota@braap.org,
	qemu-devel@nongnu.org, mark.burton@greensocs.com,
	pbonzini@redhat.com, jan.kiszka@siemens.com, rth@twiddle.net,
	peter.maydell@linaro.org, claudio.fontana@huawei.com,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Andreas Färber" <afaerber@suse.de>
Subject: Re: [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu.
Date: Mon, 6 Jun 2016 12:38:16 +0300	[thread overview]
Message-ID: <57554488.1070408@gmail.com> (raw)
In-Reply-To: <87k2i2ituo.fsf@linaro.org>

On 06/06/16 11:50, Alex Bennée wrote:
> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>
>> On 15/04/16 17:23, Alex Bennée wrote:
>>> diff --git a/cpus.c b/cpus.c
>>> index 9177161..860e2a9 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -928,6 +928,19 @@ static QemuCond qemu_cpu_cond;
>>>  static QemuCond qemu_pause_cond;
>>>  static QemuCond qemu_work_cond;
>>>
>>> +/* safe work */
>>> +static int safe_work_pending;
>>> +static int tcg_scheduled_cpus;
>>> +
>>> +typedef struct {
>>> +    CPUState            *cpu;  /* CPU affected */
>> Do we really need to pass CPU state to the safe work? Async safe work
>> seems to be supposed to handle cross-CPU tasks rather than CPU-specific
>> ones. TB flush, for example, doesn't really require CPU to be passed to
>> the async safe work handler: we should be able to check for code buffer
>> overflow before scheduling the job. Because of this, it seems to be
>> reasonable to rename async_safe_run_on_cpu() to something like
>> request_async_cpu_safe_work().
> CPUState is common enough to be passed around and this is a shared array
> for all queued work. I wanted to ensure the second most common operation
> of CPUState + one extra bit of information was covered without having to
> resort to memory allocation.
>
> The previous iterations of this work had allocations for structures and
> the linked list and I was working hard to make the common case
> allocation free.

The point was that this mechanism supposed to be used for operations
requiring all CPUs to be halted thus there should be little chance it
would require a specific CPUState to be passed.

>
>>> +    run_on_cpu_func     func;  /* Helper function */
>>> +    void                *data; /* Helper data */
>>> +} qemu_safe_work_item;
>>> +
>>> +static GArray *safe_work;       /* array of qemu_safe_work_items */
>> I think we shouldn't bother with "static" memory allocation for the list
>> of work items. We should be fine with dynamic allocation in
>> async_safe_run_on_cpu().
> Why? Under heavy load these add up fast, some TLB flush operations can
> queue CPU x ASID deferred operations. It's not exactly a large array to
> allocate at the start. We could make the initial size smaller if you
> want.

We are going to use async_run_on_cpu() for TLB invalidation, aren't we?

>
>> Halting all the CPUs doesn't seem to be cheap
>> anyhow, thus it shouldn't be used frequently. The only use so far is to
>> make tb_flush() safe. If we look at how tb_flush() is used we can see
>> that this is either code buffer exhaustion or some rare operation which
>> needs all the previous translations to be flushed. The former shouldn't
>> happen often, the latter isn't supposed to be cheap and fast anyway.
> The various flushes are more common under ARMv8, we are not just talking
> about the big tb_flush using this mechanism.

Which of them would really require quiescent state? I suppose all
frequent operations are to be handled in per-CPU manner and don't
require this expansive "halt all the CPUs" mechanism.

>
>>
>>> +static QemuMutex safe_work_mutex;
>>> +
>>>  void qemu_init_cpu_loop(void)
>>>  {
>>>      qemu_init_sigbus();
(snip)
>>> @@ -997,6 +1013,81 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>>>      qemu_cpu_kick(cpu);
>>>  }
>>>
>>> +/*
>>> + * Safe work interface
>>> + *
>>> + * Safe work is defined as work that requires the system to be
>>> + * quiescent before making changes.
>>> + */
>>> +
>>> +void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
>>> +{
>>> +    CPUState *iter;
>>> +    qemu_safe_work_item wi;
>>> +    wi.cpu = cpu;
>>> +    wi.func = func;
>>> +    wi.data = data;
>>> +
>>> +    qemu_mutex_lock(&safe_work_mutex);
>>> +    g_array_append_val(safe_work, wi);
>>> +    atomic_inc(&safe_work_pending);
>>> +    qemu_mutex_unlock(&safe_work_mutex);
>>> +
>>> +    /* Signal all vCPUs to halt */
>>> +    CPU_FOREACH(iter) {
>>> +        qemu_cpu_kick(iter);
>>> +    }
>>> +}
>>> +
>>> +/**
>>> + * flush_queued_safe_work:
>>> + *
>>> + * @scheduled_cpu_count
>>> + *
>>> + * If not 0 will signal the other vCPUs and sleep. The last vCPU to
>>> + * get to the function then drains the queue while the system is in a
>>> + * quiescent state. This allows the operations to change shared
>>> + * structures.
>>> + *
>>> + * @see async_run_safe_work_on_cpu
>>> + */
>>> +static void flush_queued_safe_work(int scheduled_cpu_count)
>>> +{
>>> +    qemu_safe_work_item *wi;
>>> +    int i;
>>> +
>>> +    /* bail out if there is nothing to do */
>>> +    if (!async_safe_work_pending()) {
>>> +        return;
>>> +    }
>>> +
>>> +    if (scheduled_cpu_count) {
>>> +
>>> +        /* Nothing to do but sleep */
>>> +        qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
>> We'll not be able to extend this to user-mode emulation because we don't
>> have BQL there. I think we should consider something like
>> exclusive_idle()/start_exclusive()/end_exclusive()/cpu_exec_start()/cpu_exec_end().
> This is true, sacrificed on the alter of making things simpler. It would
> be nice if we had a single simple dispatch point for linux-user as well
> and we could make things sane there as well.
>
> I'll have another look at the exclusive code although at first blush it
> seemed a little more complex because it didn't have the luxury of
> knowing how many threads are running.

Actually, it knows that making use of user-mode-only 'CPUState::running'
and its global counter 'pending_cpus'.

Kind regards,
Sergey

next prev parent reply	other threads:[~2016-06-06  9:38 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-15 14:23 [Qemu-devel] [RFC v1 00/12] Enable MTTCG for 32 bit arm on x86 Alex Bennée
2016-04-15 14:23 ` Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 01/12] include: move CPU-related definitions out of qemu-common.h Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 02/12] tcg/i386: Make direct jump patching thread-safe Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 03/12] qemu-thread: add simple test-and-set spinlock Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 04/12] atomic: introduce atomic_dec_fetch Alex Bennée
2016-06-02 20:34   ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 05/12] atomic: introduce cmpxchg_bool Alex Bennée
2016-04-15 16:22   ` Richard Henderson
2016-04-15 17:06     ` Alex Bennée
2016-06-03 16:45   ` Sergey Fedorov
2016-06-03 19:12     ` Alex Bennée
2016-06-03 19:20       ` Eric Blake
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 06/12] cpus: pass CPUState to run_on_cpu helpers Alex Bennée
2016-04-20 18:59   ` Eduardo Habkost
2016-04-20 19:50     ` Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 07/12] cpus: introduce async_safe_run_on_cpu Alex Bennée
2016-06-05 16:01   ` Sergey Fedorov
2016-06-06  8:50     ` Alex Bennée
2016-06-06  9:38       ` Sergey Fedorov [this message]
2016-06-05 16:44   ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 08/12] cputlb: introduce tlb_flush_* async work Alex Bennée
2016-06-05 16:39   ` Sergey Fedorov
2016-06-06  8:54     ` Alex Bennée
2016-06-06 10:04       ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 09/12] translate-all: introduces tb_flush_safe Alex Bennée
2016-06-05 16:48   ` Sergey Fedorov
2016-06-06  8:54     ` Alex Bennée
2016-06-06 10:06       ` Sergey Fedorov
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 10/12] arm: use tlb_flush_page_all for tlbimva[a] Alex Bennée
2016-06-05 16:54   ` Sergey Fedorov
2016-06-06  8:55     ` Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 11/12] arm: atomically check the exclusive value in a STREX Alex Bennée
2016-04-15 14:23 ` [Qemu-devel] [RFC v1 12/12] cpus: default MTTCG to on for 32 bit ARM on x86 Alex Bennée
2016-06-05 17:12   ` Sergey Fedorov
2016-06-06  8:58     ` Alex Bennée
2016-06-06 10:19       ` Sergey Fedorov
2016-06-06 10:26   ` Peter Maydell
2016-06-06 14:28     ` Alex Bennée
2016-06-06 14:37       ` Peter Maydell
2016-04-15 19:12 ` [Qemu-devel] [RFC v1 00/12] Enable MTTCG for 32 bit arm " Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57554488.1070408@gmail.com \
    --to=serge.fdrv@gmail.com \
    --cc=a.rigo@virtualopensystems.com \
    --cc=afaerber@suse.de \
    --cc=alex.bennee@linaro.org \
    --cc=claudio.fontana@huawei.com \
    --cc=cota@braap.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=fred.konrad@greensocs.com \
    --cc=jan.kiszka@siemens.com \
    --cc=mark.burton@greensocs.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).