qemu-devel.nongnu.org archive mirror
From: "Emilio G. Cota" <cota@braap.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: mttcg@greensocs.com, mark.burton@greensocs.com,
	a.rigo@virtualopensystems.com, qemu-devel@nongnu.org,
	guillaume.delbergue@greensocs.com, alex.bennee@linaro.org,
	Frederic Konrad <fred.konrad@greensocs.com>
Subject: Re: [Qemu-devel] [RFC 05/38] thread-posix: inline qemu_spin functions
Date: Tue, 25 Aug 2015 15:30:53 -0400
Message-ID: <20150825193053.GA29063@flamenco>
In-Reply-To: <20150825023003.GB8880@flamenco>

On Mon, Aug 24, 2015 at 22:30:03 -0400, Emilio G. Cota wrote:
> On Sun, Aug 23, 2015 at 18:04:46 -0700, Paolo Bonzini wrote:
> > On 23/08/2015 17:23, Emilio G. Cota wrote:
> (snip)
> > Applied, but in the end the spinlock will probably simply use a simple
> > test-and-test-and-set lock, or an MCS lock.  There is no need to use
> > pthreads for this.
(snip)
> Note that fair locks (such as MCS) for user-space programs are not
> necessarily a good idea when preemption is considered--and for usermode
> we'd be forced (if we allowed MCS's to nest) to use per-lock stack variables
> given that the number of threads is unbounded, which is pretty ugly.
> 
> If contention is a problem, a simple, fast spinlock combined with an exponential
> backoff is already pretty good. Fairness is not a requirement (the cache
> substrate of a NUMA machine isn't necessarily fair, is it?); scalability is.
> If the algorithm in the guest requires fairness, the guest must use a fair lock
> (e.g. MCS), and that works as intended when run natively or under qemu.
> 
> I just tested a fetch-and-swap+exp.backoff spinlock with usermode on a
> program that spawns N threads and each thread performs 2**M atomic increments
> on the same variable. That is, a degenerate worst-case kind of contention.
> N varies from 1 to 64, and M=15 on all runs, 5 runs per experiment:
> 
>   http://imgur.com/XpYctyT
>   With backoff, the per-access latency grows roughly linearly with the number of
>   cores, i.e. this is scalable. The other two are clearly superlinear.
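
For readers who want something concrete, here is a minimal sketch of a
fetch-and-swap spinlock with exponential backoff, plus a contention test
shaped like the one described above. It is not the code that was measured.
All names (backoff_spinlock, run_contention_test, BACKOFF_MIN, BACKOFF_MAX)
and the backoff bounds are hypothetical, and guarding a plain counter with
the lock is an assumption standing in for the emulated atomic increment.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning values; the thread does not give the real ones. */
#define BACKOFF_MIN 4u
#define BACKOFF_MAX 1024u
#define MAX_THREADS 64

/* Hypothetical type; this is not QEMU's qemu_spin API. */
typedef struct {
    atomic_int locked;              /* 0 = free, 1 = held */
} backoff_spinlock;

static void backoff_spin_lock(backoff_spinlock *l)
{
    unsigned delay = BACKOFF_MIN;

    /* Fetch-and-swap: we own the lock only if the old value was 0. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire)) {
        /* Wait without touching the contended cache line. */
        for (volatile unsigned i = 0; i < delay; i++) {
        }
        if (delay < BACKOFF_MAX) {
            delay *= 2;             /* exponential growth, capped */
        }
    }
}

static void backoff_spin_unlock(backoff_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

struct worker_arg {
    backoff_spinlock *lock;
    uint64_t *counter;
    uint64_t iters;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;

    for (uint64_t i = 0; i < a->iters; i++) {
        backoff_spin_lock(a->lock);
        (*a->counter)++;            /* every thread hits the same variable */
        backoff_spin_unlock(a->lock);
    }
    return NULL;
}

/* Spawn n_threads threads, each doing 2**m locked increments.
 * Returns the final counter value, or 0 on invalid arguments. */
static uint64_t run_contention_test(int n_threads, unsigned m)
{
    pthread_t tids[MAX_THREADS];
    backoff_spinlock lock;
    uint64_t counter = 0;
    struct worker_arg arg;
    int started;

    if (n_threads < 1 || n_threads > MAX_THREADS || m > 30) {
        return 0;
    }
    atomic_init(&lock.locked, 0);
    arg.lock = &lock;
    arg.counter = &counter;
    arg.iters = (uint64_t)1 << m;

    for (started = 0; started < n_threads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0) {
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return counter;
}
```

Calling run_contention_test(N, 15) for N from 1 to 64 and timing each call
would give the shape of the experiment; build with -pthread. The backoff
loop is deliberately greedy: the thread that just unlocked is the one most
likely to win the next exchange.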

Just tried MCS, CLH and ticket spinlocks (with and without backoff).

They take essentially forever for this (admittedly worst-case) test; this
is not surprising when we realise that we're essentially trying to do
in software what the coherence protocol in the cache does (in hardware)
for us when using greedy spinlocks. I'm not even showing numbers because
around N=9 each experiment starts taking way too long for me to wait.
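
To make the contrast with a greedy lock concrete, here is a minimal sketch
of a ticket lock, the simplest of the three fair locks mentioned. It is not
the code that was tested; the names (ticket_lock, ticket_lock_acquire and
so on) are hypothetical. The point to notice is the strict FIFO handoff:
a release can only benefit the holder of the next ticket, so if that thread
is not running, every other waiter keeps spinning.

```c
#include <stdatomic.h>

/* Hypothetical type; not part of QEMU. */
typedef struct {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently allowed to hold the lock */
} ticket_lock;

static void ticket_lock_init(ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static void ticket_lock_acquire(ticket_lock *l)
{
    /* Take a ticket; arrival order fixes the order of acquisition. */
    unsigned me = atomic_fetch_add_explicit(&l->next, 1,
                                            memory_order_relaxed);

    /* Spin until our ticket is served. No other waiter can jump ahead. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me) {
    }
}

static void ticket_lock_release(ticket_lock *l)
{
    /* Only the holder writes owner, so a plain load and store suffice. */
    unsigned cur = atomic_load_explicit(&l->owner, memory_order_relaxed);

    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

With more threads than host cores, the holder of the next ticket can easily
be descheduled, which is one plausible reason fair locks behave so badly in
a user-space test like this one.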

> In light of these results I see very little against going for a solution
> with exponential backoff.

The only reason I can think of against this option is that we'd be altering
the dynamic behaviour of the emulated code. For example, code that is
clearly not scalable (such as the example above) would be made
to "scale" by the use of the backoffs (as a result the CPU that gets a lock
is very likely to grab it again right after unlocking). This makes me
uneasy; it makes intuitive sense to me that unscalable code should
not scale under QEMU, instead of us playing tricks that would confuse
users.

		Emilio
