From: Robert Foley <robert.foley@linaro.org>
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, cota@braap.org,
alex.bennee@linaro.org, robert.foley@linaro.org,
peter.puhov@linaro.org
Subject: [PATCH v9 74/74] cputlb: queue async flush jobs without the BQL
Date: Thu, 21 May 2020 12:40:11 -0400 [thread overview]
Message-ID: <20200521164011.638-75-robert.foley@linaro.org> (raw)
In-Reply-To: <20200521164011.638-1-robert.foley@linaro.org>
From: "Emilio G. Cota" <cota@braap.org>
This yields sizable scalability improvements, as the below results show.
Host: Two Intel Xeon Silver 4114 10-core CPUs at 2.20 GHz
VM: Ubuntu 18.04 ppc64
Speedup vs a single thread for kernel build
7 +-----------------------------------------------------------------------+
| + + + + + + |
| ########### baseline ******* |
| ##### #### cpu lock ####### |
| ## #### |
6 |-+ ## ## +-|
| ## #### |
| ## ### |
| ## ***** # |
| ## **** *** # |
| ## *** * |
5 |-+ ## *** **** +-|
| # **** ** |
| # ** ** |
| #* ** |
| #* ** |
| #* * |
| # ****** |
| # ** |
| # * |
3 |-+ # +-|
| # |
| # |
| # |
| # |
2 |-+ # +-|
| # |
| # |
| # |
| # |
| # + + + + + + |
1 +-----------------------------------------------------------------------+
0 5 10 15 20 25 30 35
Guest vCPUs
Pictures are also here:
https://drive.google.com/file/d/1ASg5XyP9hNfN9VysXC3qe5s9QSJlwFAt/view?usp=sharing
Some notes:
- baseline corresponds to the commit before this series
- cpu-lock is this series
Single-threaded performance is affected very lightly. Results
below for debian aarch64 bootup+test for the entire series
on an Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz host:
- Before:
Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):
7269.033478 task-clock (msec) # 0.998 CPUs utilized
( +- 0.06% )
30,659,870,302 cycles # 4.218 GHz
( +- 0.06% )
54,790,540,051 instructions # 1.79 insns per cycle
( +- 0.05% )
9,796,441,380 branches # 1347.695 M/sec
( +- 0.05% )
165,132,201 branch-misses # 1.69% of all branches
( +- 0.12% )
7.287011656 seconds time elapsed
( +- 0.10% )
- After:
7375.924053 task-clock (msec) # 0.998 CPUs utilized
( +- 0.13% )
31,107,548,846 cycles # 4.217 GHz
( +- 0.12% )
55,355,668,947 instructions # 1.78 insns per cycle
( +- 0.05% )
9,929,917,664 branches # 1346.261 M/sec
( +- 0.04% )
166,547,442 branch-misses # 1.68% of all branches
( +- 0.09% )
7.389068145 seconds time elapsed
( +- 0.13% )
That is, a 1.37% slowdown.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
[Updated the speedup chart results for re-based series.]
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
accel/tcg/cputlb.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index eb2cf9de5e..50bc76fb61 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -284,7 +284,7 @@ static void flush_all_helper(CPUState *src, run_on_cpu_func fn,
CPU_FOREACH(cpu) {
if (cpu != src) {
- async_run_on_cpu(cpu, fn, d);
+ async_run_on_cpu_no_bql(cpu, fn, d);
}
}
}
@@ -352,8 +352,8 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
tlb_debug("mmu_idx: 0x%" PRIx16 "\n", idxmap);
if (cpu->created && !qemu_cpu_is_self(cpu)) {
- async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
- RUN_ON_CPU_HOST_INT(idxmap));
+ async_run_on_cpu_no_bql(cpu, tlb_flush_by_mmuidx_async_work,
+ RUN_ON_CPU_HOST_INT(idxmap));
} else {
tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap));
}
@@ -547,7 +547,7 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
* we can stuff idxmap into the low TARGET_PAGE_BITS, avoid
* allocating memory for this operation.
*/
- async_run_on_cpu(cpu, tlb_flush_page_by_mmuidx_async_1,
+ async_run_on_cpu_no_bql(cpu, tlb_flush_page_by_mmuidx_async_1,
RUN_ON_CPU_TARGET_PTR(addr | idxmap));
} else {
TLBFlushPageByMMUIdxData *d = g_new(TLBFlushPageByMMUIdxData, 1);
@@ -555,7 +555,7 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
/* Otherwise allocate a structure, freed by the worker. */
d->addr = addr;
d->idxmap = idxmap;
- async_run_on_cpu(cpu, tlb_flush_page_by_mmuidx_async_2,
+ async_run_on_cpu_no_bql(cpu, tlb_flush_page_by_mmuidx_async_2,
RUN_ON_CPU_HOST_PTR(d));
}
}
--
2.17.1
prev parent reply other threads:[~2020-05-21 17:26 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-21 16:38 [PATCH v9 00/74] per-CPU locks Robert Foley
2020-05-21 16:38 ` [PATCH v9 01/74] cpu: convert queued work to a QSIMPLEQ Robert Foley
2020-05-21 16:38 ` [PATCH v9 02/74] cpu: rename cpu->work_mutex to cpu->lock Robert Foley
2020-05-21 16:39 ` [PATCH v9 03/74] cpu: introduce cpu_mutex_lock/unlock Robert Foley
2020-05-21 16:39 ` [PATCH v9 04/74] cpu: make qemu_work_cond per-cpu Robert Foley
2020-05-21 16:39 ` [PATCH v9 05/74] cpu: move run_on_cpu to cpus-common Robert Foley
2020-05-21 16:39 ` [PATCH v9 06/74] cpu: introduce process_queued_cpu_work_locked Robert Foley
2020-05-21 16:39 ` [PATCH v9 07/74] cpu: make per-CPU locks an alias of the BQL in TCG rr mode Robert Foley
2020-05-21 16:39 ` [PATCH v9 08/74] tcg-runtime: define helper_cpu_halted_set Robert Foley
2020-05-21 16:39 ` [PATCH v9 09/74] ppc: convert to helper_cpu_halted_set Robert Foley
2020-05-21 16:39 ` [PATCH v9 10/74] cris: " Robert Foley
2020-05-21 16:45 ` Edgar E. Iglesias
2020-05-21 16:39 ` [PATCH v9 11/74] hppa: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 12/74] m68k: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 13/74] alpha: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 14/74] microblaze: " Robert Foley
2020-05-21 16:45 ` Edgar E. Iglesias
2020-05-21 16:39 ` [PATCH v9 15/74] cpu: define cpu_halted helpers Robert Foley
2020-05-21 16:39 ` [PATCH v9 16/74] tcg-runtime: convert to cpu_halted_set Robert Foley
2020-05-21 16:39 ` [PATCH v9 17/74] hw/semihosting: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 18/74] arm: convert to cpu_halted Robert Foley
2020-05-21 16:39 ` [PATCH v9 19/74] ppc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 20/74] sh4: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 21/74] i386: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 22/74] lm32: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 23/74] m68k: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 24/74] mips: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 25/74] riscv: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 26/74] s390x: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 27/74] sparc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 28/74] xtensa: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 29/74] gdbstub: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 30/74] openrisc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 31/74] cpu-exec: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 32/74] cpu: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 33/74] cpu: define cpu_interrupt_request helpers Robert Foley
2020-05-21 16:39 ` [PATCH v9 34/74] ppc: use cpu_reset_interrupt Robert Foley
2020-05-21 16:39 ` [PATCH v9 35/74] exec: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 36/74] i386: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 37/74] s390x: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 38/74] openrisc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 39/74] arm: convert to cpu_interrupt_request Robert Foley
2020-05-21 16:39 ` [PATCH v9 40/74] i386: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 41/74] i386/kvm: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 42/74] i386/hax-all: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 43/74] i386/whpx-all: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 44/74] i386/hvf: convert to cpu_request_interrupt Robert Foley
2020-05-21 16:39 ` [PATCH v9 45/74] ppc: convert to cpu_interrupt_request Robert Foley
2020-05-21 16:39 ` [PATCH v9 46/74] sh4: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 47/74] cris: " Robert Foley
2020-05-21 16:45 ` Edgar E. Iglesias
2020-05-21 16:39 ` [PATCH v9 48/74] hppa: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 49/74] lm32: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 50/74] m68k: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 51/74] mips: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 52/74] nios: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 53/74] s390x: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 54/74] alpha: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 55/74] moxie: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 56/74] sparc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 57/74] openrisc: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 58/74] unicore32: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 59/74] microblaze: " Robert Foley
2020-05-21 16:46 ` Edgar E. Iglesias
2020-05-21 16:39 ` [PATCH v9 60/74] accel/tcg: " Robert Foley
2020-05-21 16:39 ` [PATCH v9 61/74] cpu: convert to interrupt_request Robert Foley
2020-05-21 16:39 ` [PATCH v9 62/74] cpu: call .cpu_has_work with the CPU lock held Robert Foley
2020-05-21 16:40 ` [PATCH v9 63/74] cpu: introduce cpu_has_work_with_iothread_lock Robert Foley
2020-05-21 16:40 ` [PATCH v9 64/74] ppc: convert to cpu_has_work_with_iothread_lock Robert Foley
2020-05-21 16:40 ` [PATCH v9 65/74] mips: " Robert Foley
2020-05-21 16:40 ` [PATCH v9 66/74] s390x: " Robert Foley
2020-05-21 16:40 ` [PATCH v9 67/74] riscv: " Robert Foley
2020-05-21 16:40 ` [PATCH v9 68/74] sparc: " Robert Foley
2020-05-21 16:40 ` [PATCH v9 69/74] xtensa: " Robert Foley
2020-05-21 16:40 ` [PATCH v9 70/74] cpu: rename all_cpu_threads_idle to qemu_tcg_rr_all_cpu_threads_idle Robert Foley
2020-05-21 16:40 ` [PATCH v9 71/74] cpu: protect CPU state with cpu->lock instead of the BQL Robert Foley
2020-05-21 16:40 ` [PATCH v9 72/74] cpus-common: release BQL earlier in run_on_cpu Robert Foley
2020-05-21 16:40 ` [PATCH v9 73/74] cpu: add async_run_on_cpu_no_bql Robert Foley
2020-05-21 16:40 ` Robert Foley [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200521164011.638-75-robert.foley@linaro.org \
--to=robert.foley@linaro.org \
--cc=alex.bennee@linaro.org \
--cc=cota@braap.org \
--cc=peter.puhov@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).