From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <54DB8E8F.403@greensocs.com>
Date: Wed, 11 Feb 2015 18:17:03 +0100
From: Frederic Konrad
MIME-Version: 1.0
References: <54D885A0.3080609@greensocs.com> <87pp9hrukd.fsf@linaro.org> <54DB15F0.1060004@greensocs.com>
In-Reply-To: <54DB15F0.1060004@greensocs.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Subject: Re: [Qemu-devel] CPU TLB flush with multithread TCG.
Reply-To: Frederic Konrad
To: Alex Bennée
Cc: mttcg@listserver.greensocs.com, Peter Maydell, Jan Kiszka, qemu-devel, Alexander Graf, Paolo Bonzini

On 11/02/2015 09:42, Frederic Konrad wrote:
> On 11/02/2015 04:33, Alex Bennée wrote:
>> Frederic Konrad writes:
>>
>>> Hi everybody,
>>>
>>> In multithread, tlb_flush is broken: CPU A can flush another CPU B
>>> while CPU B is executing code, and fixing this can be quite hard:
>>>    * We need to exit the CPU which is being flushed.
>>>    * Make sure the CPU is stopped.
>>>    * Then we can flush the TLB.
>>> The big issues are:
>>>    * Two threads can be doing a flush at the same time.
>>>    * Something can restart the CPU during the flush.
>>>
>>> A better idea, I think, is that instead of flushing the TLB we can set
>>> a flag in CPUState, such as flush_request, and ask the CPU to exit.
>>> Then later, once the CPU has exited, we can flush its TLBs if
>>> flush_request is set.
>>> This ensures that the CPU won't execute code while its associated
>>> thread is flushing.
>>>
>>> Can this work?
>> Does this imply deferring the work? Surely if we don't flush when
>> instructed things could break down very quickly?
>>
> Yes, this implies deferring the work. It might be an issue as we can't
> exit the CPUs directly; they will finish their current execution before
> flushing.
>
> But maybe it's only a problem if the CPU flushes itself?
> e.g.: if CPU B is executing instructions and CPU A wants to flush it,
> does it make a difference whether the flush happens before or after the
> block of instructions executed by CPU B?
>

This is what we did for that:
  * Add a new flag for a flush/flush_page request (we could use the
    INTERRUPT flag as suggested by Paolo, to be cleaner).
  * Flush when we exit the CPU, or directly if the CPU is associated with
    the current thread.
  * Request a flush and ask the CPU to exit instead of flushing directly.

The problem which might happen is that some instructions are executed
after the TLB change. For example, when CPU A does something which
requires a TLB flush, CPU B will execute some instructions before exiting
and flushing. Can this be a problem?
Here are the changes:

diff --git a/cpus.c b/cpus.c
index c29dc6f..3ed78e5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -42,6 +42,8 @@
 #include "qapi-event.h"
 #include "hw/nmi.h"
 
+#include "exec/cputlb.h"
+
 #ifndef _WIN32
 #include "qemu/compatfd.h"
 #endif
@@ -1379,6 +1381,21 @@ static void tcg_exec_all(CPUState *cpu)
         }
     }
 
+    switch (cpu->flush_request) {
+    case TLB_FLUSH:
+        tlb_flush(cpu, 0);
+        break;
+    case TLB_FLUSH_GLOBAL:
+        tlb_flush(cpu, 1);
+        break;
+    case TLB_FLUSH_PAGE:
+        tlb_flush_page(cpu, cpu->flush_addr);
+        break;
+    default:
+        break;
+    }
+
+    cpu->flush_request = 0;
     cpu->exit_request = 0;
 }

diff --git a/cputlb.c b/cputlb.c
index d5a0e7f..3be7608 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -29,6 +29,7 @@
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
 #include "tcg/tcg.h"
+#include "sysemu/cpus.h"
 
 void qemu_mutex_lock_iothread(void);
 void qemu_mutex_unlock_iothread(void);
@@ -39,6 +40,33 @@ void qemu_mutex_unlock_iothread(void);
 /* statistics */
 int tlb_flush_count;
 
+void tlb_request_flush(CPUState *cpu, int flush_global)
+{
+    if (cpu->thread_id == qemu_get_thread_id()) {
+        tlb_flush(cpu, flush_global);
+        return;
+    }
+
+    if (!flush_global) {
+        cpu->flush_request = TLB_FLUSH;
+    } else {
+        cpu->flush_request = TLB_FLUSH_GLOBAL;
+    }
+    qemu_cpu_kick_thread(cpu);
+}
+
+void tlb_request_flush_page(CPUState *cpu, target_ulong addr)
+{
+    if (cpu->thread_id == qemu_get_thread_id()) {
+        tlb_flush_page(cpu, addr);
+        return;
+    }
+
+    cpu->flush_request = TLB_FLUSH_PAGE;
+    cpu->flush_addr = addr;
+    qemu_cpu_kick_thread(cpu);
+}
+
 /* NOTE:
  * If flush_global is true (the usual case), flush all tlb entries.
 * If flush_global is false, flush (at least) all tlb entries not

diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index e0da9d7..cf23d18 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -19,6 +19,14 @@
 #ifndef CPUTLB_H
 #define CPUTLB_H
 
+enum flush_req
+{
+    TLB_NO_REQUEST = 0,
+    TLB_FLUSH,
+    TLB_FLUSH_GLOBAL,
+    TLB_FLUSH_PAGE
+};
+
 #if !defined(CONFIG_USER_ONLY)
 /* cputlb.c */
 void tlb_protect_code(ram_addr_t ram_addr);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index a334ee4..ba96bcd 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -95,6 +95,17 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
                                    int is_cpu_write_access);
 void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end,
                               int is_cpu_write_access);
+/**
+ * tlb_request_flush.
+ * Request a tlb_flush and exit the cpu.
+ */
+void tlb_request_flush(CPUState *cpu, int flush_global);
+/**
+ * tlb_request_flush_page.
+ * Request a tlb_flush_page and exit the cpu.
+ */
+void tlb_request_flush_page(CPUState *cpu, target_ulong addr);
+
 #if !defined(CONFIG_USER_ONLY)
 bool qemu_in_vcpu_thread(void);
 void cpu_reload_memory_map(CPUState *cpu);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 01c1793..d6ef064 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -295,6 +295,10 @@ struct CPUState {
     uint32_t can_do_io;
     int32_t exception_index; /* used by m68k TCG */
 
+    /* Somebody asked to flush the tlb of this CPU. */
+    int flush_request;
+    uint64_t flush_addr;
+
     /* Note that this is accessed at the start of every TB via a negative
        offset from AREG0.  Leave this field at the end so as to make the
        (absolute value) offset as small as possible.
This reduces code

This is the helper modification:

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 1a5e067..16bd936 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -314,7 +314,7 @@ static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     ARMCPU *cpu = arm_env_get_cpu(env);
 
     raw_write(env, ri, value);
-    tlb_flush(CPU(cpu), 1); /* Flush TLB as domain not tracked in TLB */
+    tlb_request_flush(CPU(cpu), 1); /* Flush TLB as domain not tracked in TLB */
 }
 
 static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
@@ -325,7 +325,7 @@ static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         /* Unlike real hardware the qemu TLB uses virtual addresses,
          * not modified virtual addresses, so this causes a TLB flush.
          */
-        tlb_flush(CPU(cpu), 1);
+        tlb_request_flush(CPU(cpu), 1);
         raw_write(env, ri, value);
     }
 }
@@ -341,7 +341,7 @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
          * format) this register includes the ASID, so do a TLB flush.
          * For PMSA it is purely a process ID and no action is needed.
          */
-        tlb_flush(CPU(cpu), 1);
+        tlb_request_flush(CPU(cpu), 1);
     }
     raw_write(env, ri, value);
 }
 
@@ -352,7 +352,7 @@ static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
     /* Invalidate all (TLBIALL) */
     ARMCPU *cpu = arm_env_get_cpu(env);
 
-    tlb_flush(CPU(cpu), 1);
+    tlb_request_flush(CPU(cpu), 1);
 }
 
 static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -361,7 +361,7 @@ static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
     /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
     ARMCPU *cpu = arm_env_get_cpu(env);
 
-    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
+    tlb_request_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -370,7 +370,7 @@ static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
     /* Invalidate by ASID (TLBIASID) */
     ARMCPU *cpu = arm_env_get_cpu(env);
 
-    tlb_flush(CPU(cpu), value == 0);
+    tlb_request_flush(CPU(cpu), value == 0);
 }
 
 static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -379,7 +379,7 @@ static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
     /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
     ARMCPU *cpu = arm_env_get_cpu(env);
 
-    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
+    tlb_request_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 /* IS variants of TLB operations must affect all cores */
@@ -389,7 +389,7 @@ static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush(other_cs, 1);
+        tlb_request_flush(other_cs, 1);
     }
 }
 
@@ -399,7 +399,7 @@ static void tlbiasid_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush(other_cs, value == 0);
+        tlb_request_flush(other_cs, value == 0);
     }
 }
 
@@ -409,7 +409,7 @@ static void tlbimva_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
+        tlb_request_flush_page(other_cs, value & TARGET_PAGE_MASK);
     }
 }
 
@@ -419,7 +419,7 @@ static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
+        tlb_request_flush_page(other_cs, value & TARGET_PAGE_MASK);
     }
 }
 
@@ -1647,7 +1647,7 @@ static void vmsa_ttbcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         /* With LPAE the TTBCR could result in a change of ASID
         * via the TTBCR.A1 bit, so do a TLB flush.
         */
-        tlb_flush(CPU(cpu), 1);
+        tlb_request_flush(CPU(cpu), 1);
     }
     vmsa_ttbcr_raw_write(env, ri, value);
 }
@@ -1671,7 +1671,7 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     TCR *tcr = raw_ptr(env, ri);
 
     /* For AArch64 the A1 bit could result in a change of ASID, so TLB flush. */
-    tlb_flush(CPU(cpu), 1);
+    tlb_request_flush(CPU(cpu), 1);
     tcr->raw_tcr = value;
 }
 
@@ -1684,7 +1684,7 @@ static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     if (cpreg_field_is_64bit(ri)) {
         ARMCPU *cpu = arm_env_get_cpu(env);
 
-        tlb_flush(CPU(cpu), 1);
+        tlb_request_flush(CPU(cpu), 1);
     }
     raw_write(env, ri, value);
 }
@@ -2018,7 +2018,7 @@ static void tlbi_aa64_va_write(CPUARMState *env, const ARMCPRegInfo *ri,
     ARMCPU *cpu = arm_env_get_cpu(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    tlb_flush_page(CPU(cpu), pageaddr);
+    tlb_request_flush_page(CPU(cpu), pageaddr);
 }
 
 static void tlbi_aa64_vaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2028,7 +2028,7 @@ static void tlbi_aa64_vaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
     ARMCPU *cpu = arm_env_get_cpu(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    tlb_flush_page(CPU(cpu), pageaddr);
+    tlb_request_flush_page(CPU(cpu), pageaddr);
 }
 
 static void tlbi_aa64_asid_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2037,7 +2037,7 @@ static void
tlbi_aa64_asid_write(CPUARMState *env, const ARMCPRegInfo *ri,
     /* Invalidate by ASID (AArch64 version) */
     ARMCPU *cpu = arm_env_get_cpu(env);
     int asid = extract64(value, 48, 16);
 
-    tlb_flush(CPU(cpu), asid == 0);
+    tlb_request_flush(CPU(cpu), asid == 0);
 }
 
 static void tlbi_aa64_va_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2047,7 +2047,7 @@ static void tlbi_aa64_va_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, pageaddr);
+        tlb_request_flush_page(other_cs, pageaddr);
     }
 }
 
@@ -2058,7 +2058,7 @@ static void tlbi_aa64_vaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, pageaddr);
+        tlb_request_flush_page(other_cs, pageaddr);
     }
 }
 
@@ -2069,7 +2069,7 @@ static void tlbi_aa64_asid_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     int asid = extract64(value, 48, 16);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush(other_cs, asid == 0);
+        tlb_request_flush(other_cs, asid == 0);
     }
 }
 
@@ -2132,7 +2132,7 @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     raw_write(env, ri, value);
     /* ??? Lots of these bits are not implemented. */
     /* This may enable/disable the MMU, so do a TLB flush. */
-    tlb_flush(CPU(cpu), 1);
+    tlb_request_flush(CPU(cpu), 1);
 }
 
 static const ARMCPRegInfo v8_cp_reginfo[] = {
@@ -2370,7 +2370,7 @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
      * HCR_DC Disables stage1 and enables stage2 translation
      */
     if ((raw_read(env, ri) ^ value) & (HCR_VM | HCR_PTW | HCR_DC)) {
-        tlb_flush(CPU(cpu), 1);
+        tlb_request_flush(CPU(cpu), 1);
     }
     raw_write(env, ri, value);
 }

> Thanks,
> Fred
>>
>>> Thanks,
>>> Fred
>