From: Richard Henderson <richard.henderson@linaro.org>
To: "Emilio G. Cota" <cota@braap.org>, qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v6 73/73] cputlb: queue async flush jobs without the BQL
Date: Wed, 20 Feb 2019 09:18:33 -0800 [thread overview]
Message-ID: <9760d4ba-e71c-77cb-7cab-9bc2f6e6b60e@linaro.org> (raw)
In-Reply-To: <20190130004811.27372-74-cota@braap.org>
On 1/29/19 4:48 PM, Emilio G. Cota wrote:
> This yields sizable scalability improvements, as the below results show.
>
> Host: Two Intel E5-2683 v3 14-core CPUs at 2.00 GHz (Haswell)
>
> Workload: Ubuntu 18.04 ppc64 compiling the linux kernel with
> "make -j N", where N is the number of cores in the guest.
>
> Speedup vs a single thread (higher is better):
>
> 14 +---------------------------------------------------------------+
> | + + + + + + $$$$$$ + |
> | $$$$$ |
> | $$$$$$ |
> 12 |-+ $A$$ +-|
> | $$ |
> | $$$ |
> 10 |-+ $$ ##D#####################D +-|
> | $$$ #####**B**************** |
> | $$####***** ***** |
> | A$#***** B |
> 8 |-+ $$B** +-|
> | $$** |
> | $** |
> 6 |-+ $$* +-|
> | A** |
> | $B |
> | $ |
> 4 |-+ $* +-|
> | $ |
> | $ |
> 2 |-+ $ +-|
> | $ +cputlb-no-bql $$A$$ |
> | A +per-cpu-lock ##D## |
> | + + + + + + baseline **B** |
> 0 +---------------------------------------------------------------+
> 1 4 8 12 16 20 24 28
> Guest vCPUs
> png: https://imgur.com/zZRvS7q
>
> Some notes:
> - baseline corresponds to the commit before this series
>
> - per-cpu-lock is the commit that converts the CPU loop to per-cpu locks.
>
> - cputlb-no-bql is this commit.
>
> - I'm using taskset to assign cores to threads, favouring locality whenever
> possible but not using SMT. When N=1, I'm using a single host core, which
> leads to superlinear speedups (since with more cores the I/O thread can execute
> while vCPU threads sleep). In the future I might use N+1 host cores for N
> guest cores to avoid this, or perhaps pin guest threads to cores one-by-one.
>
> Single-threaded performance is affected very lightly. Results
> below for debian aarch64 bootup+test for the entire series
> on an Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz host:
>
> - Before:
>
> Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):
>
> 7269.033478 task-clock (msec) # 0.998 CPUs utilized ( +- 0.06% )
> 30,659,870,302 cycles # 4.218 GHz ( +- 0.06% )
> 54,790,540,051 instructions # 1.79 insns per cycle ( +- 0.05% )
> 9,796,441,380 branches # 1347.695 M/sec ( +- 0.05% )
> 165,132,201 branch-misses # 1.69% of all branches ( +- 0.12% )
>
> 7.287011656 seconds time elapsed ( +- 0.10% )
>
> - After:
>
> 7375.924053 task-clock (msec) # 0.998 CPUs utilized ( +- 0.13% )
> 31,107,548,846 cycles # 4.217 GHz ( +- 0.12% )
> 55,355,668,947 instructions # 1.78 insns per cycle ( +- 0.05% )
> 9,929,917,664 branches # 1346.261 M/sec ( +- 0.04% )
> 166,547,442 branch-misses # 1.68% of all branches ( +- 0.09% )
>
> 7.389068145 seconds time elapsed ( +- 0.13% )
>
> That is, a 1.37% slowdown.
>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
> accel/tcg/cputlb.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
next prev parent reply other threads:[~2019-02-20 17:18 UTC|newest]
Thread overview: 109+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-30 0:46 [Qemu-devel] [PATCH v6 00/73] per-CPU locks Emilio G. Cota
2019-01-30 0:46 ` [Qemu-devel] [PATCH v6 01/73] cpu: convert queued work to a QSIMPLEQ Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 02/73] cpu: rename cpu->work_mutex to cpu->lock Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 03/73] cpu: introduce cpu_mutex_lock/unlock Emilio G. Cota
2019-02-06 17:21 ` Alex Bennée
2019-02-06 20:02 ` Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 04/73] cpu: make qemu_work_cond per-cpu Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 05/73] cpu: move run_on_cpu to cpus-common Emilio G. Cota
2019-02-06 17:22 ` Alex Bennée
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 06/73] cpu: introduce process_queued_cpu_work_locked Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 07/73] cpu: make per-CPU locks an alias of the BQL in TCG rr mode Emilio G. Cota
2019-02-07 12:40 ` Alex Bennée
2019-02-20 16:12 ` Richard Henderson
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 08/73] tcg-runtime: define helper_cpu_halted_set Emilio G. Cota
2019-02-07 12:40 ` Alex Bennée
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 09/73] ppc: convert to helper_cpu_halted_set Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 10/73] cris: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 11/73] hppa: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 12/73] m68k: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 13/73] alpha: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 14/73] microblaze: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 15/73] cpu: define cpu_halted helpers Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 16/73] tcg-runtime: convert to cpu_halted_set Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 17/73] arm: convert to cpu_halted Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 18/73] ppc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 19/73] sh4: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 20/73] i386: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 21/73] lm32: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 22/73] m68k: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 23/73] mips: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 24/73] riscv: " Emilio G. Cota
2019-02-06 23:50 ` Alistair Francis
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 25/73] s390x: " Emilio G. Cota
2019-01-30 10:30 ` Cornelia Huck
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 26/73] sparc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 27/73] xtensa: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 28/73] gdbstub: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 29/73] openrisc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 30/73] cpu-exec: " Emilio G. Cota
2019-02-07 12:44 ` Alex Bennée
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 31/73] cpu: " Emilio G. Cota
2019-02-07 20:39 ` Alex Bennée
2019-02-20 16:21 ` Richard Henderson
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 32/73] cpu: define cpu_interrupt_request helpers Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 33/73] ppc: use cpu_reset_interrupt Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 34/73] exec: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 35/73] i386: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 36/73] s390x: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 37/73] openrisc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 38/73] arm: convert to cpu_interrupt_request Emilio G. Cota
2019-02-07 20:55 ` Alex Bennée
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 39/73] i386: " Emilio G. Cota
2019-02-08 11:00 ` Alex Bennée
2019-03-02 22:48 ` Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 40/73] i386/kvm: " Emilio G. Cota
2019-02-08 11:15 ` Alex Bennée
2019-03-02 23:14 ` Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 41/73] i386/hax-all: " Emilio G. Cota
2019-02-08 11:20 ` Alex Bennée
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 42/73] i386/whpx-all: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 43/73] i386/hvf: convert to cpu_request_interrupt Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 44/73] ppc: convert to cpu_interrupt_request Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 45/73] sh4: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 46/73] cris: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 47/73] hppa: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 48/73] lm32: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 49/73] m68k: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 50/73] mips: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 51/73] nios: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 52/73] s390x: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 53/73] alpha: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 54/73] moxie: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 55/73] sparc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 56/73] openrisc: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 57/73] unicore32: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 58/73] microblaze: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 59/73] accel/tcg: " Emilio G. Cota
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 60/73] cpu: convert to interrupt_request Emilio G. Cota
2019-02-08 11:21 ` Alex Bennée
2019-02-20 16:55 ` Richard Henderson
2019-01-30 0:47 ` [Qemu-devel] [PATCH v6 61/73] cpu: call .cpu_has_work with the CPU lock held Emilio G. Cota
2019-02-08 11:22 ` Alex Bennée
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 62/73] cpu: introduce cpu_has_work_with_iothread_lock Emilio G. Cota
2019-02-08 11:33 ` Alex Bennée
2019-03-03 19:52 ` Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 63/73] ppc: convert to cpu_has_work_with_iothread_lock Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 64/73] mips: " Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 65/73] s390x: " Emilio G. Cota
2019-01-30 10:35 ` Cornelia Huck
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 66/73] riscv: " Emilio G. Cota
2019-02-06 23:51 ` Alistair Francis
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 67/73] sparc: " Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 68/73] xtensa: " Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 69/73] cpu: rename all_cpu_threads_idle to qemu_tcg_rr_all_cpu_threads_idle Emilio G. Cota
2019-02-08 11:34 ` Alex Bennée
2019-02-20 17:01 ` Richard Henderson
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 70/73] cpu: protect CPU state with cpu->lock instead of the BQL Emilio G. Cota
2019-02-08 14:33 ` Alex Bennée
2019-02-20 17:25 ` Richard Henderson
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 71/73] cpus-common: release BQL earlier in run_on_cpu Emilio G. Cota
2019-02-08 14:34 ` Alex Bennée
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 72/73] cpu: add async_run_on_cpu_no_bql Emilio G. Cota
2019-02-08 14:58 ` Alex Bennée
2019-03-03 20:47 ` Emilio G. Cota
2019-01-30 0:48 ` [Qemu-devel] [PATCH v6 73/73] cputlb: queue async flush jobs without the BQL Emilio G. Cota
2019-02-08 15:58 ` Alex Bennée
2019-02-20 17:18 ` Richard Henderson [this message]
2019-02-20 17:27 ` [Qemu-devel] [PATCH v6 00/73] per-CPU locks Richard Henderson
2019-02-20 22:50 ` Emilio G. Cota
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9760d4ba-e71c-77cb-7cab-9bc2f6e6b60e@linaro.org \
--to=richard.henderson@linaro.org \
--cc=cota@braap.org \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).