From: Peter Maydell
Date: Fri, 7 Mar 2014 15:11:37 +0000
Subject: Re: [Qemu-devel] [PATCH v4 12/21] target-arm: A64: Implement DC ZVA
To: Richard Henderson
Cc: Rob Herring, Peter Crosthwaite, Patch Tracking, Michael Matz,
 Claudio Fontana, Alexander Graf, QEMU Developers, Will Newton,
 Dirk Mueller, Laurent Desnogues, Alex Bennée,
 kvmarm@lists.cs.columbia.edu, Christoffer Dall
In-Reply-To: <5319DCD8.9080307@twiddle.net>
References: <1394134385-1727-1-git-send-email-peter.maydell@linaro.org>
 <1394134385-1727-13-git-send-email-peter.maydell@linaro.org>
 <5319DCD8.9080307@twiddle.net>

On 7 March 2014 14:51, Richard Henderson wrote:
> On 03/06/2014 11:32 AM, Peter Maydell wrote:
>> +/**
>> + * tlb_vaddr_to_host:
>> + * @env: CPUArchState
>> + * @addr: guest virtual address to look up
>> + * @mmu_idx: MMU index to use for lookup
>> + *
>> + * Look up the specified guest virtual index in the TCG softmmu TLB.
>> + * If the TLB contains a host virtual address suitable for direct RAM
>> + * access, then return it. Otherwise (TLB miss, TLB entry is for an
>> + * I/O access, etc) return NULL.
>> + *
>> + * This is the equivalent of the initial fast-path code used by
>> + * TCG backends for guest load and store accesses.
>> + */
>> +static inline void *tlb_vaddr_to_host(CPUArchState *env, target_ulong addr,
>> +                                      int mmu_idx)
>> +{
>> +    int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
>> +    target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
>
> Somewhere I think the function name or at least the block comment should
> indicate that this lookup is for writing, since we hard-code addr_write here.

Doh, yes. I forgot that when I was shifting the code into a more
general function. Is it worth parameterising this for read vs write
lookups?
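Something like this, maybe? (Completely untested sketch; I've gone for
an access_type parameter using the same 0/1/2 read/write/code-fetch
convention that tlb_fill's is_write argument uses, but that's just one
possible choice.)

static inline void *tlb_vaddr_to_host(CPUArchState *env, target_ulong addr,
                                      int access_type, int mmu_idx)
{
    int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
    CPUTLBEntry *tlbentry = &env->tlb_table[mmu_idx][index];
    target_ulong tlb_addr;

    switch (access_type) {
    case 0:
        tlb_addr = tlbentry->addr_read;
        break;
    case 1:
        tlb_addr = tlbentry->addr_write;
        break;
    case 2:
        tlb_addr = tlbentry->addr_code;
        break;
    default:
        g_assert_not_reached();
    }

    if ((addr & TARGET_PAGE_MASK)
        != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
        /* TLB entry is for a different page: miss */
        return NULL;
    }
    if (tlb_addr & ~TARGET_PAGE_MASK) {
        /* I/O access: no host RAM address to hand back */
        return NULL;
    }
    return (void *)((uintptr_t)addr + tlbentry->addend);
}
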
>> +void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
>> +{
>> +    /* Implement DC ZVA, which zeroes a fixed-length block of memory.
>> +     * Note that we do not implement the (architecturally mandated)
>> +     * alignment fault for attempts to use this on Device memory
>> +     * (which matches the usual QEMU behaviour of not implementing either
>> +     * alignment faults or any memory attribute handling).
>> +     */
>> +
>> +    ARMCPU *cpu = arm_env_get_cpu(env);
>> +    uint64_t blocklen = 4 << cpu->dcz_blocksize;
>> +    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
>> +
>> +#ifndef CONFIG_USER_ONLY
>> +    {
>> +        /* Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
>> +         * the block size so we might have to do more than one TLB lookup.
>> +         * We know that in fact for any v8 CPU the page size is at least 4K
>> +         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is
>> +         * only 1K as an artefact of legacy v5 subpage support being present
>> +         * in the same QEMU executable.
>> +         */
>> +        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
>> +        void *hostaddr[maxidx];
>
> What's the maximum blocksize?  Did you really need dynamic allocation here?

Max blocksize architecturally currently is 2K. Dynamic allocation
seemed cleaner code than hardcoding "this will always have either
1 or 2 elements", though. (The field in the "how big is a block?"
register would allow more than 2K, since that is encoded as 0b1001
in a 4 bit field.)

>
>> +        int try, i;
>> +
>> +        for (try = 0; try < 2; try++) {
>> +
>> +            for (i = 0; i < maxidx; i++) {
>> +                hostaddr[i] = tlb_vaddr_to_host(env,
>> +                                                vaddr + TARGET_PAGE_SIZE * i,
>> +                                                cpu_mmu_index(env));
>> +                if (!hostaddr[i]) {
>> +                    break;
>> +                }
>> +            }
>> +            if (i == maxidx) {
>> +                /* If it's all in the TLB it's fair game for just writing to;
>> +                 * we know we don't need to update dirty status, etc.
>> +                 */
>> +                for (i = 0; i < maxidx - 1; i++) {
>> +                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
>> +                }
>> +                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
>> +                return;
>> +            }
>> +            /* OK, try a store and see if we can populate the tlb. This
>> +             * might cause an exception if the memory isn't writable,
>> +             * in which case we will longjmp out of here. We must for
>> +             * this purpose use the actual register value passed to us
>> +             * so that we get the fault address right.
>> +             */
>> +            cpu_stb_data(env, vaddr_in, 0);
>> +            /* Now we can populate the other TLB entries, if any */
>> +            for (i = 0; i < maxidx; i++) {
>> +                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
>> +                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
>> +                    cpu_stb_data(env, va, 0);
>> +                }
>> +            }
>
> cpu_stb_data doesn't take into account user vs kernel mode accesses.

...so what does it use for the mmu index?

> Maybe better off using helper_ret_stb_mmu, and passing along GETRA().

OK.

> As a bonus, you'll have accurate exceptions should the access throw, so
> you don't need to force the save of PC before calling the helper.
> Which... I don't see you doing, so perhaps there's a bug here at the
> moment.

Mmm. (In system mode we'll save PC as a side effect of having an
accessfn defined for the DC_ZVA reginfo.)

thanks
-- PMM
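PS: just to check I've understood the suggestion, I'm assuming the
change is roughly (untested)

    helper_ret_stb_mmu(env, vaddr_in, 0, cpu_mmu_index(env), GETRA());

in place of each cpu_stb_data() call, so the store uses an explicit
mmu index and the unwinder gets the helper's return address.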