From: "Emilio G. Cota" <cota@braap.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, serge.fdrv@gmail.com,
	alex.bennee@linaro.org, sergey.fedorov@linaro.org
Subject: Re: [Qemu-devel] [PATCH 16/16] cpus-common: lock-free fast path for cpu_exec_start/end
Date: Wed, 21 Sep 2016 13:24:44 -0400
Message-ID: <20160921172444.GF13385@flamenco>
In-Reply-To: <1474289459-15242-17-git-send-email-pbonzini@redhat.com>

On Mon, Sep 19, 2016 at 14:50:59 +0200, Paolo Bonzini wrote:
> Set cpu->running without taking the cpu_list lock, only look at it if
> there is a concurrent exclusive section.  This requires adding a new
> field to CPUState, which records whether a running CPU is being counted
> in pending_cpus.  When an exclusive section is started concurrently with
> cpu_exec_start, cpu_exec_start can use the new field to wait for the end
> of the exclusive section.
> 
> This is a separate patch for easier bisection of issues.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpus-common.c              | 73 ++++++++++++++++++++++++++++++++++++++++------
>  docs/tcg-exclusive.promela | 53 +++++++++++++++++++++++++++++++--
>  include/qom/cpu.h          |  5 ++--
>  3 files changed, 117 insertions(+), 14 deletions(-)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index f7ad534..46cf8ef 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -184,8 +184,12 @@ void start_exclusive(void)
>  
>      /* Make all other cpus stop executing.  */
>      pending_cpus = 1;
> +
> +    /* Write pending_cpus before reading other_cpu->running.  */
> +    smp_mb();
>      CPU_FOREACH(other_cpu) {
>          if (other_cpu->running) {
> +            other_cpu->has_waiter = true;
>              pending_cpus++;
>              qemu_cpu_kick(other_cpu);
>          }
> @@ -212,24 +216,75 @@ void end_exclusive(void)
>  /* Wait for exclusive ops to finish, and begin cpu execution.  */
>  void cpu_exec_start(CPUState *cpu)
>  {
> -    qemu_mutex_lock(&qemu_cpu_list_mutex);
> -    exclusive_idle();
>      cpu->running = true;
> -    qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +
> +    /* Write cpu->running before reading pending_cpus.  */
> +    smp_mb();
> +
> +    /* 1. start_exclusive saw cpu->running == true and pending_cpus >= 1.
> +     * After taking the lock we'll see cpu->has_waiter == true and run---not
> +     * for long because start_exclusive kicked us.  cpu_exec_end will
> +     * decrement pending_cpus and signal the waiter.
> +     *
> +     * 2. start_exclusive saw cpu->running == false but pending_cpus >= 1.
> +     * This includes the case when an exclusive item is running now.
> +     * Then we'll see cpu->has_waiter == false and wait for the item to
> +     * complete.
> +     *
> +     * 3. pending_cpus == 0.  Then start_exclusive is definitely going to
> +     * see cpu->running == true, and it will kick the CPU.
> +     */
> +    if (pending_cpus) {
> +        qemu_mutex_lock(&qemu_cpu_list_mutex);
> +        if (!cpu->has_waiter) {
> +            /* Not counted in pending_cpus, let the exclusive item
> +             * run.  Since we have the lock, set cpu->running to true
> +             * while holding it instead of retrying.
> +             */
> +            cpu->running = false;
> +            exclusive_idle();
> +            /* Now pending_cpus is zero.  */
> +            cpu->running = true;
> +        } else {
> +            /* Counted in pending_cpus, go ahead.  */
> +        }
> +        qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +    }

Regarding scenario (3): I don't think start_exclusive is guaranteed to see
cpu->running == true. Consider the following interleaving:

cpu0					cpu1
----					----

cpu->running = true;			pending_cpus = 1;
smp_mb();				smp_mb();
if (pending_cpus) { /* false */ }	CPU_FOREACH(other_cpu) { if (other_cpu->running) { /* false */ } }

The barriers here don't guarantee that the writes are immediately visible to
other CPUs (for that we need strong ops, i.e. atomics). So in the example
above, pending_cpus has been set to 1, but the store might not yet be visible
to cpu0. The same applies to cpu0->running: despite the barrier, cpu1 might
not yet perceive it, and could therefore miss kicking cpu0 (and proceed while
cpu0 is still executing).
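
One way to probe this interleaving outside QEMU is a small store-buffering
litmus test. The sketch below is not QEMU code: it assumes C11 atomics and
pthreads, uses atomic_thread_fence(memory_order_seq_cst) as a stand-in for
smp_mb(), and the names running0, pending and count_both_stale() are made up
for illustration. It counts the rounds in which both sides read the stale
value, which is the outcome described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for cpu0->running and pending_cpus. */
static atomic_int running0;
static atomic_int pending;
static int saw_pending;   /* value of pending as read by "cpu0" */
static int saw_running;   /* value of running0 as read by "cpu1" */

/* Mirrors the cpu_exec_start side: set running, barrier, read pending. */
static void *cpu0_thread(void *arg)
{
    (void)arg;
    atomic_store_explicit(&running0, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stands in for smp_mb() */
    saw_pending = atomic_load_explicit(&pending, memory_order_relaxed);
    return NULL;
}

/* Mirrors the start_exclusive side: set pending, barrier, read running. */
static void *cpu1_thread(void *arg)
{
    (void)arg;
    atomic_store_explicit(&pending, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stands in for smp_mb() */
    saw_running = atomic_load_explicit(&running0, memory_order_relaxed);
    return NULL;
}

/* Returns the number of rounds in which both threads read 0,
 * or -1 if a thread could not be created. */
static int count_both_stale(int rounds)
{
    int both_stale = 0;

    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++) {
        pthread_t t0, t1;

        atomic_store(&running0, 0);
        atomic_store(&pending, 0);
        if (pthread_create(&t0, NULL, cpu0_thread, NULL) != 0) {
            return -1;
        }
        if (pthread_create(&t1, NULL, cpu1_thread, NULL) != 0) {
            pthread_join(t0, NULL);
            return -1;
        }
        /* Joining makes the plain saw_* writes visible to this thread. */
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        if (saw_pending == 0 && saw_running == 0) {
            both_stale++;
        }
    }
    return both_stale;
}
```

Calling count_both_stale() with a large round count from a small driver and
printing the result is enough to try it. A nonzero count would confirm the
missed kick; a count that stays at zero would suggest that full barriers on
both sides already rule this outcome out. Note that the sketch uses atomic
accesses, whereas the patch uses plain loads and stores around smp_mb().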

Is there a performance (scalability) reason behind this patch? I can only
think of a guest with many frequent atomics, which would be very slow. However,
once the cmpxchg patchset goes in, those atomics will be emulated without
leaving the CPU loop.

If we want this to scale better without complicating things too much,
I'd focus on converting the exclusive_resume broadcast into a signal,
so that we avoid the thundering herd problem. Not clear to me what workloads
would contend on start/end_exclusive though.
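
A minimal sketch of the signal-instead-of-broadcast idea, using plain pthreads
rather than QemuMutex/QemuCond. Everything here is an assumption for
illustration: resume_cond, exclusive_active, waiter() and run_chained_wakeup()
are hypothetical names, and chaining the wakeup from one waiter to the next is
just one possible way to make sure every waiter eventually resumes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_WAITERS 64

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t resume_cond = PTHREAD_COND_INITIALIZER;
static bool exclusive_active;   /* true while the exclusive section runs */
static int resumed;             /* number of waiters that got past the wait */

/* Each waiter blocks until the exclusive section ends, then wakes at most
 * one more waiter instead of relying on a broadcast. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (exclusive_active) {          /* loop guards against spurious wakeups */
        pthread_cond_wait(&resume_cond, &lock);
    }
    resumed++;
    pthread_cond_signal(&resume_cond);  /* pass the wakeup along the chain */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts nwaiters threads, ends the exclusive section with a single signal,
 * and returns how many waiters resumed. */
static int run_chained_wakeup(int nwaiters)
{
    pthread_t tids[MAX_WAITERS];
    int created = 0;

    assert(nwaiters >= 0 && nwaiters <= MAX_WAITERS);

    pthread_mutex_lock(&lock);
    exclusive_active = true;
    resumed = 0;
    pthread_mutex_unlock(&lock);

    while (created < nwaiters &&
           pthread_create(&tids[created], NULL, waiter, NULL) == 0) {
        created++;
    }

    /* End of the exclusive section: wake one waiter, not all of them. */
    pthread_mutex_lock(&lock);
    exclusive_active = false;
    pthread_cond_signal(&resume_cond);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
    }
    return resumed;
}
```

With this shape only one thread at a time wakes up and takes the mutex, so the
waiters don't all pile onto the lock at once. Whether that maps cleanly onto
exclusive_idle() and end_exclusive() is something I haven't checked.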

Thanks,

		E.
