* [PATCH 0/5] tcg: Issue memory barriers for guest memory model @ 2021-03-16 22:07 Richard Henderson 2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson ` (6 more replies) 0 siblings, 7 replies; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell This is intended to fix the current aarch64 host failure for s390x guest cdrom-test. This is caused by the io thread issuing memory barriers that are supposed to be matched by the vcpu, but are elided by tcg in rr mode as "unnecessary". I know Peter would like a smaller patch to sync the io thread with the vcpu thread. I've made a couple of attempts at this, but haven't managed to get something reliable (although now irritatingly infrequent -- about 1 in 500). I have further patches to further optimize barriers, and to generate load-acquire/store-release instructions in tcg. But it's late in the release cycle, etc etc. I've done nothing to measure the performance impact of this. I quit the cdrom-test cycle after 4000 passes. r~ Richard Henderson (5): tcg: Decode the operand to INDEX_op_mb in dumps tcg: Do not elide memory barriers for CF_PARALLEL tcg: Elide memory barriers implied by the host memory model tcg: Create tcg_req_mo tcg: Add host memory barriers to cpu_ldst.h interfaces include/exec/cpu_ldst.h | 7 ++++ include/tcg/tcg.h | 20 +++++++++++ accel/tcg/cputlb.c | 2 ++ accel/tcg/tcg-all.c | 6 +--- accel/tcg/user-exec.c | 17 +++++++++ tcg/tcg-op.c | 19 +++++----- tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++ 7 files changed, 137 insertions(+), 13 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson @ 2021-03-16 22:07 ` Richard Henderson 2021-03-17 13:32 ` Philippe Mathieu-Daudé 2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson ` (5 subsequent siblings) 6 siblings, 1 reply; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/tcg/tcg.c b/tcg/tcg.c index 2991112829..23a94d771c 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -2415,6 +2415,85 @@ static void tcg_dump_ops(TCGContext *s, bool have_prefs) arg_label(op->args[k])->id); i++, k++; break; + case INDEX_op_mb: + { + TCGBar membar = op->args[k]; + const char *b_op, *m_op; + + switch (membar & TCG_BAR_SC) { + case 0: + b_op = "none"; + break; + case TCG_BAR_LDAQ: + b_op = "acq"; + break; + case TCG_BAR_STRL: + b_op = "rel"; + break; + case TCG_BAR_SC: + b_op = "seq"; + break; + default: + g_assert_not_reached(); + } + + switch (membar & TCG_MO_ALL) { + case 0: + m_op = "none"; + break; + case TCG_MO_LD_LD: + m_op = "rr"; + break; + case TCG_MO_LD_ST: + m_op = "rw"; + break; + case TCG_MO_ST_LD: + m_op = "wr"; + break; + case TCG_MO_ST_ST: + m_op = "ww"; + break; + case TCG_MO_LD_LD | TCG_MO_LD_ST: + m_op = "rr+rw"; + break; + case TCG_MO_LD_LD | TCG_MO_ST_LD: + m_op = "rr+wr"; + break; + case TCG_MO_LD_LD | TCG_MO_ST_ST: + m_op = "rr+ww"; + break; + case TCG_MO_LD_ST | TCG_MO_ST_LD: + m_op = "rw+wr"; + break; + case TCG_MO_LD_ST | TCG_MO_ST_ST: + m_op = "rw+ww"; + break; + case TCG_MO_ST_LD | TCG_MO_ST_ST: + m_op = "wr+ww"; + break; + case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_LD: + m_op = "rr+rw+wr"; + break; + case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST: 
+ m_op = "rr+rw+ww"; + break; + case TCG_MO_LD_LD | TCG_MO_ST_LD | TCG_MO_ST_ST: + m_op = "rr+wr+ww"; + break; + case TCG_MO_LD_ST | TCG_MO_ST_LD | TCG_MO_ST_ST: + m_op = "rw+wr+ww"; + break; + case TCG_MO_ALL: + m_op = "all"; + break; + default: + g_assert_not_reached(); + } + + col += qemu_log("%s%s:%s", (k ? "," : ""), b_op, m_op); + i++, k++; + } + break; default: break; } -- 2.25.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
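For readers following the dump format in patch 1, here is a minimal standalone sketch of the same "<barrier>:<ordering>" decoding, written table-driven rather than as a 16-way switch. It is not the code from the patch. The enum values are assumptions meant to mirror the TCGBar bits in include/tcg/tcg.h, and decode_membar() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Assumed values, intended to mirror the TCGBar enum in include/tcg/tcg.h. */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
    BAR_LDAQ = 0x10,
    BAR_STRL = 0x20,
    BAR_SC   = 0x30,
};

/* Hypothetical helper: format "<bar>:<mo>" into buf, e.g. "seq:rr+ww". */
static const char *decode_membar(unsigned membar, char *buf, size_t len)
{
    /* Listed in the order the patch prints them: rr, rw, wr, ww. */
    static const struct { unsigned bit; const char *name; } mo[] = {
        { MO_LD_LD, "rr" }, { MO_LD_ST, "rw" },
        { MO_ST_LD, "wr" }, { MO_ST_ST, "ww" },
    };
    const char *bar;
    char ord[16] = "";

    switch (membar & BAR_SC) {
    case 0:        bar = "none"; break;
    case BAR_LDAQ: bar = "acq";  break;
    case BAR_STRL: bar = "rel";  break;
    default:       bar = "seq";  break;   /* both bits set: BAR_SC */
    }

    if ((membar & MO_ALL) == 0) {
        strcpy(ord, "none");
    } else if ((membar & MO_ALL) == MO_ALL) {
        strcpy(ord, "all");
    } else {
        /* At most three names are joined here, so ord[] cannot overflow. */
        for (size_t i = 0; i < sizeof(mo) / sizeof(mo[0]); i++) {
            if (membar & mo[i].bit) {
                if (ord[0]) {
                    strcat(ord, "+");
                }
                strcat(ord, mo[i].name);
            }
        }
    }

    snprintf(buf, len, "%s:%s", bar, ord);
    return buf;
}
```

The explicit switch in the patch keeps every string a compile-time constant, which suits qemu_log(); the table form above only trades that for brevity.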
* Re: [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps 2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson @ 2021-03-17 13:32 ` Philippe Mathieu-Daudé 0 siblings, 0 replies; 9+ messages in thread From: Philippe Mathieu-Daudé @ 2021-03-17 13:32 UTC (permalink / raw) To: Richard Henderson, qemu-devel; +Cc: peter.maydell On 3/16/21 11:07 PM, Richard Henderson wrote: > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > --- > tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 79 insertions(+) Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson 2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson @ 2021-03-16 22:07 ` Richard Henderson 2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson ` (4 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell The virtio devices require proper memory ordering between the vcpus and the iothreads. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- tcg/tcg-op.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 70475773f4..76dc7d8dc5 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -97,9 +97,13 @@ void tcg_gen_op6(TCGOpcode opc, TCGArg a1, TCGArg a2, TCGArg a3, void tcg_gen_mb(TCGBar mb_type) { - if (tcg_ctx->tb_cflags & CF_PARALLEL) { - tcg_gen_op1(INDEX_op_mb, mb_type); - } + /* + * It is tempting to elide the barrier in a single-threaded context + * (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu + * we have i/o threads running in parallel, and lack of memory order + * can result in e.g. virtio queue entries being read incorrectly. + */ + tcg_gen_op1(INDEX_op_mb, mb_type); } /* 32 bit ops */ -- 2.25.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
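To illustrate why the ordering matters even with a single vcpu, here is a minimal sketch of a producer/consumer ring using C11 atomics. The struct and function names are hypothetical and the layout is not virtio's. The point is only the pairing: the consumer's acquire load is useful only if the producer's slot write is ordered before its index update, which is the role the guest's barrier plays on the vcpu side.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8   /* hypothetical; must be a power of two */

struct ring {
    uint32_t desc[RING_SIZE];     /* payload slots */
    _Atomic uint16_t avail_idx;   /* written by the producer only */
    uint16_t last_seen;           /* private to the consumer */
};

/* Producer side (think: guest code running on the vcpu thread). */
static void ring_publish(struct ring *r, uint32_t value)
{
    uint16_t idx = atomic_load_explicit(&r->avail_idx, memory_order_relaxed);

    r->desc[idx % RING_SIZE] = value;
    /* Release: the slot write above must be visible before the new index. */
    atomic_store_explicit(&r->avail_idx, (uint16_t)(idx + 1),
                          memory_order_release);
}

/* Consumer side (think: the io thread servicing the queue). */
static bool ring_consume(struct ring *r, uint32_t *out)
{
    /* Acquire: pairs with the release store in ring_publish(). */
    uint16_t idx = atomic_load_explicit(&r->avail_idx, memory_order_acquire);

    if (idx == r->last_seen) {
        return false;             /* nothing new has been published */
    }
    *out = r->desc[r->last_seen % RING_SIZE];
    r->last_seen++;
    return true;
}
```

If the producer's ordering is dropped, as happens when the translator elides the guest barrier, a weakly ordered host may let the consumer observe the new index before the slot contents.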
* [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson 2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson 2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson @ 2021-03-16 22:07 ` Richard Henderson 2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson ` (3 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell Reduce the set of required barriers to those needed by the host right from the beginning. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- tcg/tcg-op.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 76dc7d8dc5..c8501508c2 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -102,8 +102,13 @@ void tcg_gen_mb(TCGBar mb_type) * (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu * we have i/o threads running in parallel, and lack of memory order * can result in e.g. virtio queue entries being read incorrectly. + * + * That said, we can elide anything which the host provides for free. */ - tcg_gen_op1(INDEX_op_mb, mb_type); + mb_type &= ~TCG_TARGET_DEFAULT_MO; + if (mb_type & TCG_MO_ALL) { + tcg_gen_op1(INDEX_op_mb, mb_type); + } } /* 32 bit ops */ -- 2.25.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
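The filtering in patch 3 can be tried outside QEMU with a small model. This is a sketch under stated assumptions, not the patch itself: the enum values are assumed to mirror TCGBar, the two host masks are illustrative (a TSO-like host that provides everything except store-then-load ordering, and a weakly ordered host that provides nothing), and gen_mb() is a hypothetical stand-in that returns the barrier it would emit, or 0 if the barrier is elided.

```c
#include <stdbool.h>

/* Assumed values, intended to mirror the TCGBar enum. */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
    BAR_SC   = 0x30,
};

/* Illustrative host models (assumptions, not copied from any tcg-target.h). */
#define HOST_MO_TSO   (MO_ALL & ~MO_ST_LD)   /* only store->load may reorder */
#define HOST_MO_WEAK  0                      /* no ordering for free */

/*
 * Hypothetical model of the patched tcg_gen_mb(): drop the orderings the
 * host already guarantees, and emit nothing if no ordering bit remains.
 */
static unsigned gen_mb(unsigned mb_type, unsigned host_default_mo)
{
    mb_type &= ~host_default_mo;
    if (mb_type & MO_ALL) {
        return mb_type;           /* barrier op would be emitted */
    }
    return 0;                     /* elided: host provides it implicitly */
}

static bool is_elided(unsigned mb_type, unsigned host_default_mo)
{
    return gen_mb(mb_type, host_default_mo) == 0;
}
```

On the TSO-like host a load-load barrier disappears while a store-load barrier survives; on the weak host nothing is elided.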
* [PATCH 4/5] tcg: Create tcg_req_mo 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson ` (2 preceding siblings ...) 2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson @ 2021-03-16 22:07 ` Richard Henderson 2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson ` (2 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell Split out the logic to emit a host memory barrier in response to a guest memory operation. Do not provide a true default for TCG_GUEST_DEFAULT_MO because the defined() check will still be useful for determining if a guest has been updated for MTTCG. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- include/tcg/tcg.h | 20 ++++++++++++++++++++ accel/tcg/tcg-all.c | 6 +----- tcg/tcg-op.c | 8 +------- 3 files changed, 22 insertions(+), 12 deletions(-) diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index 0f0695e90d..395b3b6964 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -1245,6 +1245,26 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi) return oi & 15; } +/** + * tcg_req_mo: + * @type: TCGBar + * + * Filter @type to the barrier that is required for the guest + * memory ordering vs the host memory ordering. A non-zero + * result indicates that some barrier is required. + * + * If TCG_GUEST_DEFAULT_MO is not defined, assume that the + * guest requires strict ordering. + * + * This is a macro so that it's constant even without optimization. 
+ */ +#ifdef TCG_GUEST_DEFAULT_MO +# define tcg_req_mo(type) \ + ((type) & TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) +#else +# define tcg_req_mo(type) ((type) & ~TCG_TARGET_DEFAULT_MO) +#endif + /** * tcg_qemu_tb_exec: * @env: pointer to CPUArchState for the CPU diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c index e378c2db73..6ae51e3476 100644 --- a/accel/tcg/tcg-all.c +++ b/accel/tcg/tcg-all.c @@ -69,11 +69,7 @@ DECLARE_INSTANCE_CHECKER(TCGState, TCG_STATE, static bool check_tcg_memory_orders_compatible(void) { -#if defined(TCG_GUEST_DEFAULT_MO) && defined(TCG_TARGET_DEFAULT_MO) - return (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) == 0; -#else - return false; -#endif + return tcg_req_mo(TCG_MO_ALL) == 0; } static bool default_mttcg_enabled(void) diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index c8501508c2..12fc8a1b17 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -2796,13 +2796,7 @@ static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr, static void tcg_gen_req_mo(TCGBar type) { -#ifdef TCG_GUEST_DEFAULT_MO - type &= TCG_GUEST_DEFAULT_MO; -#endif - type &= ~TCG_TARGET_DEFAULT_MO; - if (type) { - tcg_gen_mb(type | TCG_BAR_SC); - } + tcg_gen_mb(tcg_req_mo(type) | TCG_BAR_SC); } static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr) -- 2.25.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
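The arithmetic behind tcg_req_mo can be sketched with the guest and host masks passed as parameters instead of compile-time macros, which makes a few guest/host pairings easy to compare. This is an illustration only: req_mo() and memory_orders_compatible() are hypothetical names, the enum values are assumed to mirror TCGBar, and the TSO-like and weak masks are stand-ins rather than the definitions of any particular target.

```c
#include <stdbool.h>

/* Assumed values, intended to mirror the TCGBar enum. */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

/* Illustrative memory models (assumptions for this sketch). */
#define MO_TSO   (MO_ALL & ~MO_ST_LD)   /* everything except store->load */
#define MO_WEAK  0                      /* nothing guaranteed */

/* Orderings the guest needs for @type that the host does not supply. */
static inline unsigned req_mo(unsigned type, unsigned guest_mo,
                              unsigned host_mo)
{
    return type & guest_mo & ~host_mo;
}

/* Same shape as the patched check: no barrier is ever required. */
static inline bool memory_orders_compatible(unsigned guest_mo,
                                            unsigned host_mo)
{
    return req_mo(MO_ALL, guest_mo, host_mo) == 0;
}
```

A strongly ordered guest on a weakly ordered host is the pairing that needs barriers, which matches the direction of the failure described in the cover letter; the reverse pairing needs none.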
* [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson ` (3 preceding siblings ...) 2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson @ 2021-03-16 22:07 ` Richard Henderson 2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply 2021-06-26 6:06 ` Richard Henderson 6 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell Bring the majority of helpers into line with the rest of tcg in respecting guest memory ordering. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- include/exec/cpu_ldst.h | 7 +++++++ accel/tcg/cputlb.c | 2 ++ accel/tcg/user-exec.c | 17 +++++++++++++++++ 3 files changed, 26 insertions(+) diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h index ce6ce82618..f0ab79fe3c 100644 --- a/include/exec/cpu_ldst.h +++ b/include/exec/cpu_ldst.h @@ -169,6 +169,13 @@ void cpu_stl_le_data_ra(CPUArchState *env, abi_ptr ptr, void cpu_stq_le_data_ra(CPUArchState *env, abi_ptr ptr, uint64_t val, uintptr_t ra); +#define cpu_req_mo(type) \ + do { \ + if (tcg_req_mo(type)) { \ + smp_mb(); \ + } \ + } while (0) + #if defined(CONFIG_USER_ONLY) extern __thread uintptr_t helper_retaddr; diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 8a7b779270..a3503eaa71 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -2100,6 +2100,7 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr, meminfo = trace_mem_get_info(op, mmu_idx, false); trace_guest_mem_before_exec(env_cpu(env), addr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); op &= ~MO_SIGN; oi = make_memop_idx(op, mmu_idx); ret = full_load(env, addr, oi, retaddr); @@ -2542,6 +2543,7 @@ cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val, meminfo = trace_mem_get_info(op, mmu_idx, true); 
trace_guest_mem_before_exec(env_cpu(env), addr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); oi = make_memop_idx(op, mmu_idx); store_helper(env, addr, val, oi, retaddr, op); diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index 0d8cc27b21..34f6dfcef4 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -843,6 +843,7 @@ uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldub_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -854,6 +855,7 @@ int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_SB, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldsb_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -865,6 +867,7 @@ uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = lduw_be_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -876,6 +879,7 @@ int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_BESW, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldsw_be_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -887,6 +891,7 @@ uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldl_be_p(g2h(env_cpu(env), 
ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -898,6 +903,7 @@ uint64_t cpu_ldq_be_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldq_be_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -909,6 +915,7 @@ uint32_t cpu_lduw_le_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = lduw_le_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -920,6 +927,7 @@ int cpu_ldsw_le_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_LESW, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldsw_le_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -931,6 +939,7 @@ uint32_t cpu_ldl_le_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldl_le_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -942,6 +951,7 @@ uint64_t cpu_ldq_le_data(CPUArchState *env, abi_ptr ptr) uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, false); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); ret = ldq_le_p(g2h(env_cpu(env), ptr)); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); return ret; @@ -1052,6 +1062,7 @@ void cpu_stb_data(CPUArchState *env, abi_ptr ptr, uint32_t val) uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, true); 
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stb_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1061,6 +1072,7 @@ void cpu_stw_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val) uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stw_be_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1070,6 +1082,7 @@ void cpu_stl_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val) uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stl_be_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1079,6 +1092,7 @@ void cpu_stq_be_data(CPUArchState *env, abi_ptr ptr, uint64_t val) uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stq_be_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1088,6 +1102,7 @@ void cpu_stw_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val) uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stw_le_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1097,6 +1112,7 @@ void cpu_stl_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val) uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stl_le_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } @@ -1106,6 +1122,7 @@ void 
cpu_stq_le_data(CPUArchState *env, abi_ptr ptr, uint64_t val) uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, true); trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo); + cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST); stq_le_p(g2h(env_cpu(env), ptr), val); qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo); } -- 2.25.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
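The helper-side pattern from patch 5 can be modelled standalone as below. This is a sketch, not QEMU code: atomic_thread_fence() from C11 stands in for smp_mb(), the guest and host masks are fixed to illustrative values (a TSO-like guest on a weakly ordered host), the load_u32()/store_u32() helpers are hypothetical, and fences_issued exists only so the behaviour can be observed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed values, intended to mirror the TCGBar enum. */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

/* Illustrative configuration (assumptions for this sketch). */
#define GUEST_DEFAULT_MO  (MO_ALL & ~MO_ST_LD)
#define HOST_DEFAULT_MO   0

#define req_mo(type)  ((type) & GUEST_DEFAULT_MO & ~HOST_DEFAULT_MO)

static int fences_issued;   /* instrumentation for the sketch only */

#define cpu_req_mo(type)                                               \
    do {                                                               \
        if (req_mo(type)) {                                            \
            /* Order earlier guest accesses before the next one. */    \
            atomic_thread_fence(memory_order_seq_cst);                 \
            fences_issued++;                                           \
        }                                                              \
    } while (0)

/* Hypothetical helpers: the barrier precedes the access it protects. */
static uint32_t load_u32(const uint32_t *p)
{
    cpu_req_mo(MO_LD_LD | MO_ST_LD);
    return *p;
}

static void store_u32(uint32_t *p, uint32_t v)
{
    cpu_req_mo(MO_LD_ST | MO_ST_ST);
    *p = v;
}
```

Because req_mo() folds to a constant, a compiler can drop the whole branch when guest and host orderings already agree, so the helpers cost nothing in that configuration.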
* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson ` (4 preceding siblings ...) 2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson @ 2021-03-16 22:30 ` no-reply 2021-06-26 6:06 ` Richard Henderson 6 siblings, 0 replies; 9+ messages in thread From: no-reply @ 2021-03-16 22:30 UTC (permalink / raw) To: richard.henderson; +Cc: peter.maydell, qemu-devel Patchew URL: https://patchew.org/QEMU/20210316220735.2048137-1-richard.henderson@linaro.org/ Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20210316220735.2048137-1-richard.henderson@linaro.org Subject: [PATCH 0/5] tcg: Issue memory barriers for guest memory model === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. 
=== TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu - [tag update] patchew/20210311143958.562625-1-richard.henderson@linaro.org -> patchew/20210311143958.562625-1-richard.henderson@linaro.org * [new tag] patchew/20210316220735.2048137-1-richard.henderson@linaro.org -> patchew/20210316220735.2048137-1-richard.henderson@linaro.org Switched to a new branch 'test' 06ceb5a tcg: Add host memory barriers to cpu_ldst.h interfaces be4ade5 tcg: Create tcg_req_mo 1336778 tcg: Elide memory barriers implied by the host memory model d0f90d5 tcg: Do not elide memory barriers for CF_PARALLEL c9f634b tcg: Decode the operand to INDEX_op_mb in dumps === OUTPUT BEGIN === 1/5 Checking commit c9f634bdbe20 (tcg: Decode the operand to INDEX_op_mb in dumps) 2/5 Checking commit d0f90d584f17 (tcg: Do not elide memory barriers for CF_PARALLEL) 3/5 Checking commit 133677838f14 (tcg: Elide memory barriers implied by the host memory model) 4/5 Checking commit be4ade51a457 (tcg: Create tcg_req_mo) 5/5 Checking commit 06ceb5ad212a (tcg: Add host memory barriers to cpu_ldst.h interfaces) ERROR: memory barrier without comment #189: FILE: include/exec/cpu_ldst.h:175: + smp_mb(); \ total: 1 errors, 0 warnings, 146 lines checked Patch 5/5 has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/20210316220735.2048137-1-richard.henderson@linaro.org/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-devel@redhat.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model 2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson ` (5 preceding siblings ...) 2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply @ 2021-06-26 6:06 ` Richard Henderson 6 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2021-06-26 6:06 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell Ping. A local rebase seems to apply cleanly. r~ On 3/16/21 3:07 PM, Richard Henderson wrote: > This is intended to fix the current aarch64 host failure > for s390x guest cdrom-test. This is caused by the io thread > issuing memory barriers that are supposed to be matched by > the vcpu, but are elided by tcg in rr mode as "unnecessary". > > I know Peter would like a smaller patch to sync the io thread > with the vcpu thread. I've made a couple of attempts at this, > but haven't managed to get something reliable (although now > irritatingly infrequent -- about 1 in 500). > > I have further patches to further optimize barriers, and to > generate load-acquire/store-release instructions in tcg. > But it's late in the release cycle, etc etc. > > I've done nothing to measure the performance impact of this. > I quit the cdrom-test cycle after 4000 passes. > > > r~ > > > Richard Henderson (5): > tcg: Decode the operand to INDEX_op_mb in dumps > tcg: Do not elide memory barriers for CF_PARALLEL > tcg: Elide memory barriers implied by the host memory model > tcg: Create tcg_req_mo > tcg: Add host memory barriers to cpu_ldst.h interfaces > > include/exec/cpu_ldst.h | 7 ++++ > include/tcg/tcg.h | 20 +++++++++++ > accel/tcg/cputlb.c | 2 ++ > accel/tcg/tcg-all.c | 6 +--- > accel/tcg/user-exec.c | 17 +++++++++ > tcg/tcg-op.c | 19 +++++----- > tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++ > 7 files changed, 137 insertions(+), 13 deletions(-) > ^ permalink raw reply [flat|nested] 9+ messages in thread