From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1Lp0nf-0005Oj-DD for qemu-devel@nongnu.org; Wed, 01 Apr 2009 09:48:15 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1Lp0na-0005Lv-38 for qemu-devel@nongnu.org; Wed, 01 Apr 2009 09:48:14 -0400 Received: from [199.232.76.173] (port=32898 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Lp0nY-0005Lj-Tg for qemu-devel@nongnu.org; Wed, 01 Apr 2009 09:48:09 -0400 Received: from miranda.se.axis.com ([193.13.178.8]:50871) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1Lp0nX-0001PH-Vb for qemu-devel@nongnu.org; Wed, 01 Apr 2009 09:48:08 -0400 Received: from axis.com (edgar.se.axis.com [10.93.151.1]) by miranda.se.axis.com (8.13.4/8.13.4/Debian-3sarge3) with ESMTP id n31Dm412009611 for ; Wed, 1 Apr 2009 15:48:04 +0200 Date: Wed, 1 Apr 2009 15:48:04 +0200 From: "Edgar E. Iglesias" Subject: Re: [Qemu-devel] SH: Fix movca.l/ocbi emulation. Message-ID: <20090401134804.GD24403@edgar.se.axis.com> References: <200812112134.30560.vladimir@codesourcery.com> <200904010058.07573.vladimir@codesourcery.com> <20090401124449.GB24403@edgar.se.axis.com> <200904011736.43530.vladimir@codesourcery.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200904011736.43530.vladimir@codesourcery.com> Reply-To: qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladimir Prus Cc: Shin-ichiro KAWASAKI , qemu-devel@nongnu.org, "Edgar E. Iglesias" On Wed, Apr 01, 2009 at 05:36:43PM +0400, Vladimir Prus wrote: > On Wednesday 01 April 2009 16:44:49 Edgar E. Iglesias wrote: > > Hi Edgar, > > > > > I am afraid I might not be the best person to accurately catch all cases > > > where cache might be enabled or disabled. KAWASAKI-san has posted a function > > > that can be used to check if caching is enabled, in: > > > > > > http://article.gmane.org/gmane.comp.emulators.qemu/36348 > > > > > > Does that one address your suggestion? If so, I can incorporate that function, > > > and checking for it. > > > > I dont have the manuals here but yes, I think something like that will be > > needed. On other archs I've seen ppl do memcpy to IO areas, don't know > > about sh but to be on the safe side we better not do any funny stuff > > with uncached stores. > > I have added the check, using the routine provided by Kawasaki-san. [cut] > > > typedef struct CPUSH4State { > > > int id; /* CPU model */ > > > > > > @@ -149,6 +155,8 @@ typedef struct CPUSH4State { > > > tlb_t itlb[ITLB_SIZE]; /* instruction translation table */ > > > void *intc_handle; > > > int intr_at_halt; /* SR_BL ignored during sleep */ > > > + memory_content_t *movcal_backup; > > > + memory_content_t **movcal_backup_tail; > > > > Do you really need the _tail member? > > Seems to me like you're only maintaining it but not really using it. > > See helper_movcal: > > *(env->movcal_backup_tail) = r; > > I think having movcal backup list in-order is good. Say, if we have two > movcal to the same address and ocbi, we want to restore the saved when > the first movcal was executed, and given that helper_ocbi traversed > the list from the front we need to put elements in-order. > I see, thanks for clarifying that. > > > +void helper_ocbi(uint32_t address) > > > +{ > > > + memory_content_t **current = &(env->movcal_backup); > > > + while (*current) > > > + { > > > + uint32_t a = (*current)->address; > > > + if ((a & ~0x1F) == (address & ~0x1F)) > > > + { > > > + memory_content_t *next = (*current)->next; > > > + > > > + stl_data(a, (*current)->value); > > > + > > > + if (next == 0) > > > + { > > > + env->movcal_backup_tail = current; > > > + } > > > + > > > + free (*current); > > > + *current = next; > > > + break; > > > + } > > > > The indentation got messed up here. > > I have fixed it. Revised patch is attached. Thanks, ATM I don't have any tools/kernel to test with but AFAICS it looks OK. Acked-by: Edgar E. Iglesias > commit d5caa92b7e792c526306b3438e697bfab3dd0c03 > Author: Vladimir Prus > Date: Tue Nov 11 12:29:02 2008 +0300 > > Fix movcal.l/ocbi emulation. > > * target-sh4/cpu.h (memory_content_t): New. > (CPUSH4State): New fields movcal_backup and movcal_backup_tail. > * target-sh4/helper.h (helper_movcal) > (helper_discard_movcal_backup, helper_ocbi): New. > * target-sh4/op_helper.c (helper_movcal) > (helper_discard_movcal_backup, helper_ocbi): New. > * target-sh4/translate.c (DisasContext): New field has_movcal. > (sh4_defs): Update CVS for SH7785. > (cpu_sh4_init): Initialize env->movcal_backup_tail. > (_decode_opc): Discard movca.l-backup. > Make use of helper_movcal and helper_ocbi. > (gen_intermediate_code_internal): Initialize has_movcal to 1. > > diff --git a/target-sh4/cpu.h b/target-sh4/cpu.h > index e9eee2c..d754512 100644 > --- a/target-sh4/cpu.h > +++ b/target-sh4/cpu.h > @@ -100,6 +100,12 @@ enum sh_features { > SH_FEATURE_BCR3_AND_BCR4 = 2, > }; > > +typedef struct memory_content { > + uint32_t address; > + uint32_t value; > + struct memory_content *next; > +} memory_content; > + > typedef struct CPUSH4State { > int id; /* CPU model */ > > @@ -149,6 +155,8 @@ typedef struct CPUSH4State { > tlb_t itlb[ITLB_SIZE]; /* instruction translation table */ > void *intc_handle; > int intr_at_halt; /* SR_BL ignored during sleep */ > + memory_content *movcal_backup; > + memory_content **movcal_backup_tail; > } CPUSH4State; > > CPUSH4State *cpu_sh4_init(const char *cpu_model); > @@ -163,6 +171,8 @@ void sh4_cpu_list(FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...)); > void cpu_sh4_write_mmaped_utlb_addr(CPUSH4State *s, target_phys_addr_t addr, > uint32_t mem_value); > > +int cpu_sh4_is_cached(CPUSH4State * env, target_ulong addr); > + > static inline void cpu_set_tls(CPUSH4State *env, target_ulong newtls) > { > env->gbr = newtls; > @@ -294,16 +304,19 @@ static inline void cpu_pc_from_tb(CPUState *env, TranslationBlock *tb) > env->flags = tb->flags; > } > > +#define TB_FLAG_PENDING_MOVCA (1 << 4) > + > static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc, > target_ulong *cs_base, int *flags) > { > *pc = env->pc; > *cs_base = 0; > *flags = (env->flags & (DELAY_SLOT | DELAY_SLOT_CONDITIONAL > - | DELAY_SLOT_TRUE | DELAY_SLOT_CLEARME)) /* Bits 0- 3 */ > - | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */ > + | DELAY_SLOT_TRUE | DELAY_SLOT_CLEARME)) /* Bits 0- 3 */ > + | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */ > | (env->sr & (SR_MD | SR_RB)) /* Bits 29-30 */ > - | (env->sr & SR_FD); /* Bit 15 */ > + | (env->sr & SR_FD) /* Bit 15 */ > + | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 4 */ > } > > #endif /* _CPU_SH4_H */ > diff --git a/target-sh4/helper.c b/target-sh4/helper.c > index f1feca7..5fa4fb4 100644 > --- a/target-sh4/helper.c > +++ b/target-sh4/helper.c > @@ -652,4 +652,48 @@ void cpu_sh4_write_mmaped_utlb_addr(CPUSH4State *s, target_phys_addr_t addr, > } > } > > +int cpu_sh4_is_cached(CPUSH4State * env, target_ulong addr) > +{ > + int n; > + int use_asid = (env->mmucr & MMUCR_SV) == 0 || (env->sr & SR_MD) == 0; > + > + /* check area */ > + if (env->sr & SR_MD) { > + /* For previledged mode, P2 and P4 area is not cachable. */ > + if ((0xA0000000 <= addr && addr < 0xC0000000) || 0xE0000000 <= addr) > + return 0; > + } else { > + /* For user mode, only U0 area is cachable. */ > + if (0x80000000 <= addr) > + return 0; > + } > + > + /* > + * TODO : Evaluate CCR and check if the cache is on or off. > + * Now CCR is not in CPUSH4State, but in SH7750State. > + * When you move the ccr inot CPUSH4State, the code will be > + * as follows. > + */ > +#if 0 > + /* check if operand cache is enabled or not. */ > + if (!(env->ccr & 1)) > + return 0; > +#endif > + > + /* if MMU is off, no check for TLB. */ > + if (env->mmucr & MMUCR_AT) > + return 1; > + > + /* check TLB */ > + n = find_tlb_entry(env, addr, env->itlb, ITLB_SIZE, use_asid); > + if (n >= 0) > + return env->itlb[n].c; > + > + n = find_tlb_entry(env, addr, env->utlb, UTLB_SIZE, use_asid); > + if (n >= 0) > + return env->utlb[n].c; > + > + return 0; > +} > + > #endif > diff --git a/target-sh4/helper.h b/target-sh4/helper.h > index e665185..4b2fcdd 100644 > --- a/target-sh4/helper.h > +++ b/target-sh4/helper.h > @@ -9,6 +9,10 @@ DEF_HELPER_0(debug, void) > DEF_HELPER_1(sleep, void, i32) > DEF_HELPER_1(trapa, void, i32) > > +DEF_HELPER_2(movcal, void, i32, i32) > +DEF_HELPER_0(discard_movcal_backup, void) > +DEF_HELPER_1(ocbi, void, i32) > + > DEF_HELPER_2(addv, i32, i32, i32) > DEF_HELPER_2(addc, i32, i32, i32) > DEF_HELPER_2(subv, i32, i32, i32) > diff --git a/target-sh4/op_helper.c b/target-sh4/op_helper.c > index 84e1ad3..90c94dc 100644 > --- a/target-sh4/op_helper.c > +++ b/target-sh4/op_helper.c > @@ -18,6 +18,7 @@ > * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA > */ > #include > +#include > #include "exec.h" > #include "helper.h" > > @@ -122,6 +123,57 @@ void helper_trapa(uint32_t tra) > cpu_loop_exit(); > } > > +void helper_movcal(uint32_t address, uint32_t value) > +{ > + if (cpu_sh4_is_cached (env, address)) > + { > + memory_content *r = (memory_content *)malloc (sizeof(memory_content)); > + r->address = address; > + r->value = value; > + r->next = NULL; > + > + *(env->movcal_backup_tail) = r; > + env->movcal_backup_tail = &(r->next); > + } > +} > + > +void helper_discard_movcal_backup(void) > +{ > + memory_content *current = env->movcal_backup; > + > + while(current) > + { > + memory_content *next = current->next; > + free (current); > + env->movcal_backup = current = next; > + if (current == 0) > + env->movcal_backup_tail = &(env->movcal_backup); > + } > +} > + > +void helper_ocbi(uint32_t address) > +{ > + memory_content **current = &(env->movcal_backup); > + while (*current) > + { > + uint32_t a = (*current)->address; > + if ((a & ~0x1F) == (address & ~0x1F)) > + { > + memory_content *next = (*current)->next; > + stl_data(a, (*current)->value); > + > + if (next == 0) > + { > + env->movcal_backup_tail = current; > + } > + > + free (*current); > + *current = next; > + break; > + } > + } > +} > + > uint32_t helper_addc(uint32_t arg0, uint32_t arg1) > { > uint32_t tmp0, tmp1; > diff --git a/target-sh4/translate.c b/target-sh4/translate.c > index cc9f886..4ced176 100644 > --- a/target-sh4/translate.c > +++ b/target-sh4/translate.c > @@ -50,6 +50,7 @@ typedef struct DisasContext { > uint32_t delayed_pc; > int singlestep_enabled; > uint32_t features; > + int has_movcal; > } DisasContext; > > #if defined(CONFIG_USER_ONLY) > @@ -283,6 +284,7 @@ CPUSH4State *cpu_sh4_init(const char *cpu_model) > env = qemu_mallocz(sizeof(CPUSH4State)); > env->features = def->features; > cpu_exec_init(env); > + env->movcal_backup_tail = &(env->movcal_backup); > sh4_translate_init(); > env->cpu_model_str = cpu_model; > cpu_sh4_reset(env); > @@ -495,6 +497,37 @@ static inline void gen_store_fpr64 (TCGv_i64 t, int reg) > > static void _decode_opc(DisasContext * ctx) > { > + /* This code tries to make movcal emulation sufficiently > + accurate for Linux purposes. This instruction writes > + memory, and prior to that, always allocates a cache line. > + It is used in two contexts: > + - in memcpy, where data is copied in blocks, the first write > + of to a block uses movca.l for performance. > + - in arch/sh/mm/cache-sh4.c, movcal.l + ocbi combination is used > + to flush the cache. Here, the data written by movcal.l is never > + written to memory, and the data written is just bogus. > + > + To simulate this, we simulate movcal.l, we store the value to memory, > + but we also remember the previous content. If we see ocbi, we check > + if movcal.l for that address was done previously. If so, the write should > + not have hit the memory, so we restore the previous content. > + When we see an instruction that is neither movca.l > + nor ocbi, the previous content is discarded. > + > + To optimize, we only try to flush stores when we're at the start of > + TB, or if we already saw movca.l in this TB and did not flush stores > + yet. */ > + if (ctx->has_movcal) > + { > + int opcode = ctx->opcode & 0xf0ff; > + if (opcode != 0x0093 /* ocbi */ > + && opcode != 0x00c3 /* movca.l */) > + { > + gen_helper_discard_movcal_backup (); > + ctx->has_movcal = 0; > + } > + } > + > #if 0 > fprintf(stderr, "Translating opcode 0x%04x\n", ctx->opcode); > #endif > @@ -1545,7 +1578,13 @@ static void _decode_opc(DisasContext * ctx) > } > return; > case 0x00c3: /* movca.l R0,@Rm */ > - tcg_gen_qemu_st32(REG(0), REG(B11_8), ctx->memidx); > + { > + TCGv val = tcg_temp_new(); > + tcg_gen_qemu_ld32u(val, REG(B11_8), ctx->memidx); > + gen_helper_movcal (REG(B11_8), val); > + tcg_gen_qemu_st32(REG(0), REG(B11_8), ctx->memidx); > + } > + ctx->has_movcal = 1; > return; > case 0x40a9: > /* MOVUA.L @Rm,R0 (Rm) -> R0 > @@ -1594,9 +1633,7 @@ static void _decode_opc(DisasContext * ctx) > break; > case 0x0093: /* ocbi @Rn */ > { > - TCGv dummy = tcg_temp_new(); > - tcg_gen_qemu_ld32s(dummy, REG(B11_8), ctx->memidx); > - tcg_temp_free(dummy); > + gen_helper_ocbi (REG(B11_8)); > } > return; > case 0x00a3: /* ocbp @Rn */ > @@ -1876,6 +1913,7 @@ gen_intermediate_code_internal(CPUState * env, TranslationBlock * tb, > ctx.tb = tb; > ctx.singlestep_enabled = env->singlestep_enabled; > ctx.features = env->features; > + ctx.has_movcal = (tb->flags & TB_FLAG_PENDING_MOVCA); > > #ifdef DEBUG_DISAS > qemu_log_mask(CPU_LOG_TB_CPU,