Re: [Qemu-devel] [RFC 1/8] cputlb: add physical address to CPUTLBEntry

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Emilio G. Cota" <cota@braap.org>
To: Richard Henderson <rth@twiddle.net>
Cc: mttcg@greensocs.com, Peter Maydell <peter.maydell@linaro.org>,
	qemu-devel@nongnu.org,
	Alvise Rigo <a.rigo@virtualopensystems.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	alex.bennee@linaro.org,
	Frederic Konrad <fred.konrad@greensocs.com>
Subject: Re: [Qemu-devel] [RFC 1/8] cputlb: add physical address to CPUTLBEntry
Date: Sun, 10 May 2015 04:07:48 -0400	[thread overview]
Message-ID: <20150510080748.GA8629@flamenco> (raw)
In-Reply-To: <554D2FFE.20204@twiddle.net>

On Fri, May 08, 2015 at 14:51:58 -0700, Richard Henderson wrote:
> Ouch.  24 of 64 wasted bytes for 64-bit?
> 
> I wonder if there's a better way we can encode this to avoid 3 copies of the
> virtual address for read/write/code.  Or if we're better off using more than
> one insn to multiply by a non-power-of-two.

Adding one more instruction works well. Perf tests
(# time ./selftest.sh for [1]) show no appreciable difference.

Patch (i386-only) appended.

[1] http://wiki.qemu.org/download/ppc-virtexml507-linux-2_6_34.tgz

> Or if the hardware multiplier is
> fast enough just multiply by the proper constant.

This option should be slower (3-cycle latency for IMUL
on Ivy Bridge/Haswell, probably slower on others),
but would make the code very simple.
Unfortunately I couldn't write a patch to do it due to
my poor grasp of TCG backend code. If someone provides a
patch I'd be glad to test it.

If the appended ends up being the preferred option I
can extend it to support all the TCG targets. I cannot
however test all of them--only have access to x86 and ppc
hardware at the moment.

Thanks,

		Emilio

[PATCH] i386-only: remove sizeof(CPUTLBEntry)=pow2 constraint

Breaks all non-i386 TCG backends! Do not apply.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 include/exec/cpu-defs.h | 23 +++++++++++++++++------
 tcg/i386/tcg-target.c   | 12 +++++++-----
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 8891f16..716052a 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -96,14 +96,25 @@ typedef struct CPUTLBEntry {
     /* Addend to virtual address to get host address.  IO accesses
        use the corresponding iotlb value.  */
     uintptr_t addend;
-    /* padding to get a power of two size */
-    uint8_t dummy[(1 << CPU_TLB_ENTRY_BITS) -
-                  (sizeof(target_ulong) * 4 +
-                   ((-sizeof(target_ulong) * 4) & (sizeof(uintptr_t) - 1)) +
-                   sizeof(uintptr_t))];
 } CPUTLBEntry;
 
-QEMU_BUILD_BUG_ON(sizeof(CPUTLBEntry) != (1 << CPU_TLB_ENTRY_BITS));
+/*
+ * Fast TLB hits are essential for softmmu performance. Since we hand-code in
+ * assembly the TCG check for TLB hits, we define here a pair of constants that
+ * allow us to use only shifts to obtain a TLBEntry address given its index.
+ * NOTE: The constants below should thus be updated every time changes are
+ * made to the CPUTLBEntry struct. See the compile-time consistency check below.
+ */
+#if TARGET_LONG_BITS == 64
+#define CPU_TLB_ENTRY_OUT_SHIFT 3
+#define CPU_TLB_ENTRY_IN_SHIFT  2
+#else
+#define CPU_TLB_ENTRY_OUT_SHIFT (HOST_LONG_BITS == 32 ? 2 : 3)
+#define CPU_TLB_ENTRY_IN_SHIFT  (HOST_LONG_BITS == 32 ? 2 : 1)
+#endif
+
+QEMU_BUILD_BUG_ON(sizeof(CPUTLBEntry) != \
+                (1 + BIT(CPU_TLB_ENTRY_IN_SHIFT)) << CPU_TLB_ENTRY_OUT_SHIFT);
 
 /* The IOTLB is not accessed directly inline by generated TCG code,
  * so the CPUIOTLBEntry layout is not as critical as that of the
diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index ab63823..54250f5 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -1195,15 +1195,17 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     tcg_out_mov(s, htype, r0, addrlo);
     tcg_out_mov(s, ttype, r1, addrlo);
 
-    tcg_out_shifti(s, SHIFT_SHR + hrexw, r0,
-                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+    tcg_out_shifti(s, SHIFT_SHR + hrexw, r0, TARGET_PAGE_BITS);
 
     tgen_arithi(s, ARITH_AND + trexw, r1,
                 TARGET_PAGE_MASK | ((1 << s_bits) - 1), 0);
-    tgen_arithi(s, ARITH_AND + hrexw, r0,
-                (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS, 0);
+    tgen_arithi(s, ARITH_AND + hrexw, r0, CPU_TLB_SIZE - 1, 0);
 
-    tcg_out_modrm_sib_offset(s, OPC_LEA + hrexw, r0, TCG_AREG0, r0, 0,
+    tcg_out_modrm_sib_offset(s, OPC_LEA + hrexw, r0, r0, r0,
+                             CPU_TLB_ENTRY_IN_SHIFT, 0);
+
+    tcg_out_modrm_sib_offset(s, OPC_LEA + hrexw, r0, TCG_AREG0, r0,
+                             CPU_TLB_ENTRY_OUT_SHIFT,
                              offsetof(CPUArchState, tlb_table[mem_index][0])
                              + which);
 
-- 
1.8.3

next prev parent reply	other threads:[~2015-05-10  8:07 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-08 21:02 [Qemu-devel] [RFC 0/8] Helper-based Atomic Instruction Emulation (AIE) Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 1/8] cputlb: add physical address to CPUTLBEntry Emilio G. Cota
2015-05-08 21:51   ` Richard Henderson
2015-05-10  8:07     ` Emilio G. Cota [this message]
2015-05-08 21:02 ` [Qemu-devel] [RFC 2/8] softmmu: add helpers to get ld/st physical addresses Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 3/8] tiny_set: add module to test for membership in a tiny set of pointers Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 4/8] radix-tree: add generic lockless radix tree module Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 5/8] aie: add module for Atomic Instruction Emulation Emilio G. Cota
2015-05-08 22:41   ` Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 6/8] aie: add target helpers Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 7/8] target-arm: emulate atomic instructions using AIE Emilio G. Cota
2015-05-08 21:02 ` [Qemu-devel] [RFC 8/8] target-i386: " Emilio G. Cota
2015-05-11 16:01 ` [Qemu-devel] [RFC 0/8] Helper-based Atomic Instruction Emulation (AIE) Frederic Konrad

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:8891f16 dfblob:716052a dfblob:ab63823 dfblob:54250f5 )
 OR (
bs:"Re: [Qemu-devel] [RFC 1/8] cputlb: add physical address to CPUTLBEntry" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150510080748.GA8629@flamenco \
    --to=cota@braap.org \
    --cc=a.rigo@virtualopensystems.com \
    --cc=alex.bennee@linaro.org \
    --cc=fred.konrad@greensocs.com \
    --cc=mttcg@greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).