From: Richard Henderson <rth@twiddle.net>
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, cota@braap.org, alex.bennee@linaro.org
Subject: [Qemu-devel] [PATCH v8 32/37] target-arm: emulate aarch64's LL/SC using cmpxchg helpers
Date: Mon, 24 Oct 2016 10:39:43 -0700 [thread overview]
Message-ID: <1477330788-14996-33-git-send-email-rth@twiddle.net> (raw)
In-Reply-To: <1477330788-14996-1-git-send-email-rth@twiddle.net>
From: "Emilio G. Cota" <cota@braap.org>
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
The appended emulates LL/SC pairs in aarch64 with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.
Hi-res plots: http://imgur.com/a/JVc8Y
atomic_add-bench: 1000000 ops/thread, [0,1] range
18 ++---------+----------+---------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
16 ++master +-H--+ ++
|| |
14 ++ ++
| | |
12 ++| ++
| | |
10 ++++ ++
8 ++E ++
|+++ |
6 ++ | ++
| | |
4 ++ | ++
| | |
2 +H++E+--- ++
+ | +E++----+E+---+--+E+----++E+------+E+------+E++----+E+---+--+E|
0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads
atomic_add-bench: 1000000 ops/thread, [0,2] range
18 ++---------+----------+---------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
16 ++master +-H--+ ++
| | |
14 ++E ++
| | |
12 ++| ++
|+++ |
10 ++ | ++
8 ++ | ++
| | |
6 ++ | ++
| | |
4 ++ | ++
| +E+--- |
2 +H+ +E+-----+++ +++ +++ ---+E+-----+E+------+++
+++ + +E+---+--+E+----++E+------+E+--- ++++ +++ + +E|
0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads
atomic_add-bench: 1000000 ops/thread, [0,128] range
70 ++---------+----------+---------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
60 ++master +-H--+ +++ ---+E+-----+E+------+E+
| +E+------E-------+E+--- |
| --- +++ |
50 ++ +++--- ++
| -+E+ |
40 ++ +++---- ++
| E- |
| --| |
30 ++ -- +++ ++
| +E+ |
20 ++E+ ++
|E+ |
| |
10 ++ ++
+ + + + + + + |
0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads
atomic_add-bench: 1000000 ops/thread, [0,1024] range
160 ++---------+---------+----------+---------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
140 ++master +-H--+ +++ +++
| -+E+-----+E+-------E|
120 ++ +++ ---- +++
| +++ ----E-- |
100 ++ --E--- +++ ++
| +++ ---- +++ |
80 ++ --E-- ++
| ---- +++ |
| -+E+ |
60 ++ ---- +++ ++
| +E+- |
40 ++ -- ++
| +E+ |
20 +EE+ ++
+++ + + + + + + |
0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads
[rth: Rearrange 128-bit cmpxchg helper. Enforce alignment on LL.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-arm/helper-a64.c | 113 +++++++++++++++++++++++++++++++++++++++++++++
target-arm/helper-a64.h | 2 +
target-arm/translate-a64.c | 106 +++++++++++++++++++-----------------------
3 files changed, 163 insertions(+), 58 deletions(-)
diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 41e48a4..98b97df 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -27,6 +27,10 @@
#include "qemu/bitops.h"
#include "internals.h"
#include "qemu/crc32c.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "qemu/int128.h"
+#include "tcg.h"
#include <zlib.h> /* For crc32 */
/* C2.4.7 Multiply and divide */
@@ -444,3 +448,112 @@ uint64_t HELPER(crc32c_64)(uint64_t acc, uint64_t val, uint32_t bytes)
/* Linux crc32c converts the output to one's complement. */
return crc32c(acc, buf, bytes) ^ 0xffffffff;
}
+
+/* Returns 0 on success; 1 otherwise. */
+uint64_t HELPER(paired_cmpxchg64_le)(CPUARMState *env, uint64_t addr,
+ uint64_t new_lo, uint64_t new_hi)
+{
+ uintptr_t ra = GETPC();
+ Int128 oldv, cmpv, newv;
+ bool success;
+
+ cmpv = int128_make128(env->exclusive_val, env->exclusive_high);
+ newv = int128_make128(new_lo, new_hi);
+
+ if (parallel_cpus) {
+#ifndef CONFIG_ATOMIC128
+ cpu_loop_exit_atomic(ENV_GET_CPU(env), ra);
+#else
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOpIdx oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+ oldv = helper_atomic_cmpxchgo_le_mmu(env, addr, cmpv, newv, oi, ra);
+ success = int128_eq(oldv, cmpv);
+#endif
+ } else {
+ uint64_t o0, o1;
+
+#ifdef CONFIG_USER_ONLY
+ /* ??? Enforce alignment. */
+ uint64_t *haddr = g2h(addr);
+ o0 = ldq_le_p(haddr + 0);
+ o1 = ldq_le_p(haddr + 1);
+ oldv = int128_make128(o0, o1);
+
+ success = int128_eq(oldv, cmpv);
+ if (success) {
+ stq_le_p(haddr + 0, int128_getlo(newv));
+ stq_le_p(haddr + 1, int128_gethi(newv));
+ }
+#else
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOpIdx oi0 = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+ TCGMemOpIdx oi1 = make_memop_idx(MO_LEQ, mem_idx);
+
+ o0 = helper_le_ldq_mmu(env, addr + 0, oi0, ra);
+ o1 = helper_le_ldq_mmu(env, addr + 8, oi1, ra);
+ oldv = int128_make128(o0, o1);
+
+ success = int128_eq(oldv, cmpv);
+ if (success) {
+ helper_le_stq_mmu(env, addr + 0, int128_getlo(newv), oi1, ra);
+ helper_le_stq_mmu(env, addr + 8, int128_gethi(newv), oi1, ra);
+ }
+#endif
+ }
+
+ return !success;
+}
+
+uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, uint64_t addr,
+ uint64_t new_lo, uint64_t new_hi)
+{
+ uintptr_t ra = GETPC();
+ Int128 oldv, cmpv, newv;
+ bool success;
+
+ cmpv = int128_make128(env->exclusive_val, env->exclusive_high);
+ newv = int128_make128(new_lo, new_hi);
+
+ if (parallel_cpus) {
+#ifndef CONFIG_ATOMIC128
+ cpu_loop_exit_atomic(ENV_GET_CPU(env), ra);
+#else
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOpIdx oi = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
+ oldv = helper_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra);
+ success = int128_eq(oldv, cmpv);
+#endif
+ } else {
+ uint64_t o0, o1;
+
+#ifdef CONFIG_USER_ONLY
+ /* ??? Enforce alignment. */
+ uint64_t *haddr = g2h(addr);
+ o1 = ldq_be_p(haddr + 0);
+ o0 = ldq_be_p(haddr + 1);
+ oldv = int128_make128(o0, o1);
+
+ success = int128_eq(oldv, cmpv);
+ if (success) {
+ stq_be_p(haddr + 0, int128_gethi(newv));
+ stq_be_p(haddr + 1, int128_getlo(newv));
+ }
+#else
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOpIdx oi0 = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
+ TCGMemOpIdx oi1 = make_memop_idx(MO_BEQ, mem_idx);
+
+ o1 = helper_be_ldq_mmu(env, addr + 0, oi0, ra);
+ o0 = helper_be_ldq_mmu(env, addr + 8, oi1, ra);
+ oldv = int128_make128(o0, o1);
+
+ success = int128_eq(oldv, cmpv);
+ if (success) {
+ helper_be_stq_mmu(env, addr + 0, int128_gethi(newv), oi1, ra);
+ helper_be_stq_mmu(env, addr + 8, int128_getlo(newv), oi1, ra);
+ }
+#endif
+ }
+
+ return !success;
+}
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 1d3d10f..dd32000 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -46,3 +46,5 @@ DEF_HELPER_FLAGS_2(frecpx_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env)
DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
+DEF_HELPER_FLAGS_4(paired_cmpxchg64_le, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
+DEF_HELPER_FLAGS_4(paired_cmpxchg64_be, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 96c2227..ded924a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -1839,37 +1839,41 @@ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
}
}
-/*
- * Load/Store exclusive instructions are implemented by remembering
- * the value/address loaded, and seeing if these are the same
- * when the store is performed. This is not actually the architecturally
- * mandated semantics, but it works for typical guest code sequences
- * and avoids having to monitor regular stores.
- *
- * In system emulation mode only one CPU will be running at once, so
- * this sequence is effectively atomic. In user emulation mode we
- * throw an exception and handle the atomic operation elsewhere.
- */
static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
TCGv_i64 addr, int size, bool is_pair)
{
TCGv_i64 tmp = tcg_temp_new_i64();
- TCGMemOp memop = s->be_data + size;
+ TCGMemOp be = s->be_data;
g_assert(size <= 3);
- tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), memop);
-
if (is_pair) {
- TCGv_i64 addr2 = tcg_temp_new_i64();
TCGv_i64 hitmp = tcg_temp_new_i64();
- g_assert(size >= 2);
- tcg_gen_addi_i64(addr2, addr, 1 << size);
- tcg_gen_qemu_ld_i64(hitmp, addr2, get_mem_index(s), memop);
- tcg_temp_free_i64(addr2);
+ if (size == 3) {
+ TCGv_i64 addr2 = tcg_temp_new_i64();
+
+ tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s),
+ MO_64 | MO_ALIGN_16 | be);
+ tcg_gen_addi_i64(addr2, addr, 8);
+ tcg_gen_qemu_ld_i64(hitmp, addr2, get_mem_index(s),
+ MO_64 | MO_ALIGN | be);
+ tcg_temp_free_i64(addr2);
+ } else {
+ g_assert(size == 2);
+ tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s),
+ MO_64 | MO_ALIGN | be);
+ if (be == MO_LE) {
+ tcg_gen_extr32_i64(tmp, hitmp, tmp);
+ } else {
+ tcg_gen_extr32_i64(hitmp, tmp, tmp);
+ }
+ }
+
tcg_gen_mov_i64(cpu_exclusive_high, hitmp);
tcg_gen_mov_i64(cpu_reg(s, rt2), hitmp);
tcg_temp_free_i64(hitmp);
+ } else {
+ tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), size | MO_ALIGN | be);
}
tcg_gen_mov_i64(cpu_exclusive_val, tmp);
@@ -1879,16 +1883,6 @@ static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
tcg_gen_mov_i64(cpu_exclusive_addr, addr);
}
-#ifdef CONFIG_USER_ONLY
-static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
- TCGv_i64 addr, int size, int is_pair)
-{
- tcg_gen_mov_i64(cpu_exclusive_test, addr);
- tcg_gen_movi_i32(cpu_exclusive_info,
- size | is_pair << 2 | (rd << 4) | (rt << 9) | (rt2 << 14));
- gen_exception_internal_insn(s, 4, EXCP_STREX);
-}
-#else
static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
TCGv_i64 inaddr, int size, int is_pair)
{
@@ -1916,46 +1910,42 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
tcg_gen_brcond_i64(TCG_COND_NE, addr, cpu_exclusive_addr, fail_label);
tmp = tcg_temp_new_i64();
- tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), s->be_data + size);
- tcg_gen_brcond_i64(TCG_COND_NE, tmp, cpu_exclusive_val, fail_label);
- tcg_temp_free_i64(tmp);
-
- if (is_pair) {
- TCGv_i64 addrhi = tcg_temp_new_i64();
- TCGv_i64 tmphi = tcg_temp_new_i64();
-
- tcg_gen_addi_i64(addrhi, addr, 1 << size);
- tcg_gen_qemu_ld_i64(tmphi, addrhi, get_mem_index(s),
- s->be_data + size);
- tcg_gen_brcond_i64(TCG_COND_NE, tmphi, cpu_exclusive_high, fail_label);
-
- tcg_temp_free_i64(tmphi);
- tcg_temp_free_i64(addrhi);
- }
-
- /* We seem to still have the exclusive monitor, so do the store */
- tcg_gen_qemu_st_i64(cpu_reg(s, rt), addr, get_mem_index(s),
- s->be_data + size);
if (is_pair) {
- TCGv_i64 addrhi = tcg_temp_new_i64();
-
- tcg_gen_addi_i64(addrhi, addr, 1 << size);
- tcg_gen_qemu_st_i64(cpu_reg(s, rt2), addrhi,
- get_mem_index(s), s->be_data + size);
- tcg_temp_free_i64(addrhi);
+ if (size == 2) {
+ TCGv_i64 val = tcg_temp_new_i64();
+ tcg_gen_concat32_i64(tmp, cpu_reg(s, rt), cpu_reg(s, rt2));
+ tcg_gen_concat32_i64(val, cpu_exclusive_val, cpu_exclusive_high);
+ tcg_gen_atomic_cmpxchg_i64(tmp, addr, val, tmp,
+ get_mem_index(s),
+ size | MO_ALIGN | s->be_data);
+ tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, val);
+ tcg_temp_free_i64(val);
+ } else if (s->be_data == MO_LE) {
+ gen_helper_paired_cmpxchg64_le(tmp, cpu_env, addr, cpu_reg(s, rt),
+ cpu_reg(s, rt2));
+ } else {
+ gen_helper_paired_cmpxchg64_be(tmp, cpu_env, addr, cpu_reg(s, rt),
+ cpu_reg(s, rt2));
+ }
+ } else {
+ TCGv_i64 val = cpu_reg(s, rt);
+ tcg_gen_atomic_cmpxchg_i64(tmp, addr, cpu_exclusive_val, val,
+ get_mem_index(s),
+ size | MO_ALIGN | s->be_data);
+ tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, cpu_exclusive_val);
}
tcg_temp_free_i64(addr);
- tcg_gen_movi_i64(cpu_reg(s, rd), 0);
+ tcg_gen_mov_i64(cpu_reg(s, rd), tmp);
+ tcg_temp_free_i64(tmp);
tcg_gen_br(done_label);
+
gen_set_label(fail_label);
tcg_gen_movi_i64(cpu_reg(s, rd), 1);
gen_set_label(done_label);
tcg_gen_movi_i64(cpu_exclusive_addr, -1);
-
}
-#endif
/* Update the Sixty-Four bit (SF) registersize. This logic is derived
* from the ARMv8 specs for LDR (Shared decode for all encodings).
--
2.7.4
next prev parent reply other threads:[~2016-10-24 17:40 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-24 17:39 [Qemu-devel] [PATCH v8 00/37] cmpxchg atomic operations Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 01/37] atomics: Add parameters to macros Richard Henderson
2016-10-24 18:13 ` Emilio G. Cota
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 02/37] atomics: add atomic_xor Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 03/37] atomics: add atomic_op_fetch variants Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 04/37] atomics: Add __nocheck atomic operations Richard Henderson
2016-10-24 18:16 ` Emilio G. Cota
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 05/37] exec: Avoid direct references to Int128 parts Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 06/37] int128: Use __int128 if available Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 07/37] int128: Add int128_make128 Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 09/37] linux-user: enable parallel code generation on clone Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 10/37] cputlb: Replace SHIFT with DATA_SIZE Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 11/37] cputlb: Move probe_write out of softmmu_template.h Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 12/37] cputlb: Remove includes from softmmu_template.h Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 13/37] cputlb: Move most of iotlb code out of line Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 14/37] cputlb: Tidy some macros Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 15/37] tcg: Add atomic helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 16/37] tcg: Add atomic128 helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 17/37] tcg: Add CONFIG_ATOMIC64 Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 18/37] tcg: Emit barriers with parallel_cpus Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 19/37] target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 20/37] target-i386: emulate LOCK'ed OP instructions using atomic helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 21/37] target-i386: emulate LOCK'ed INC using atomic helper Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 22/37] target-i386: emulate LOCK'ed NOT " Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 23/37] target-i386: emulate LOCK'ed NEG using cmpxchg helper Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 24/37] target-i386: emulate LOCK'ed XADD using atomic helper Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 25/37] target-i386: emulate LOCK'ed BTX ops using atomic helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 26/37] target-i386: emulate XCHG using atomic helper Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 27/37] target-i386: remove helper_lock() Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 28/37] tests: add atomic_add-bench Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 29/37] target-arm: Rearrange aa32 load and store functions Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 30/37] target-arm: emulate LL/SC using cmpxchg helpers Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 31/37] target-arm: emulate SWP with atomic_xchg helper Richard Henderson
2016-10-24 17:39 ` Richard Henderson [this message]
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 33/37] linux-user: remove handling of ARM's EXCP_STREX Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 34/37] linux-user: remove handling of aarch64's EXCP_STREX Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 35/37] target-arm: remove EXCP_STREX + cpu_exclusive_{test, info} Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 36/37] target-alpha: Introduce MMU_PHYS_IDX Richard Henderson
2016-10-24 17:39 ` [Qemu-devel] [PATCH v8 37/37] target-alpha: Emulate LL/SC using cmpxchg helpers Richard Henderson
2016-10-24 18:27 ` [Qemu-devel] [PATCH v8 00/37] cmpxchg atomic operations Emilio G. Cota
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1477330788-14996-33-git-send-email-rth@twiddle.net \
--to=rth@twiddle.net \
--cc=alex.bennee@linaro.org \
--cc=cota@braap.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).