From: Richard Henderson <rth@twiddle.net>
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, Aurelien Jarno <aurelien@aurel32.net>
Subject: [Qemu-devel] [PULL 13/18] tcg/i386: use softmmu fast path for unaligned accesses
Date: Mon, 24 Aug 2015 12:37:01 -0700 [thread overview]
Message-ID: <1440445026-26522-14-git-send-email-rth@twiddle.net> (raw)
In-Reply-To: <1440445026-26522-1-git-send-email-rth@twiddle.net>
From: Aurelien Jarno <aurelien@aurel32.net>
Softmmu unaligned load/stores currently goes through through the slow
path for two reasons:
- to support unaligned access on host with strict alignement
- to correctly handle accesses crossing pages
x86 is only concerned by the second reason. Unaligned accesses are
avoided by compilers, but are not uncommon. We therefore would like
to see them going through the fast path, if they don't cross pages.
For that we can use the fact that two adjacent TLB entries can't contain
the same page. Therefore accessing the TLB entry corresponding to the
first byte, but comparing its content to page address of the last byte
ensures that we don't cross pages. We can do this check without adding
more instructions in the TLB code (but increasing its length by one
byte) by using the LEA instruction to combine the existing move with the
size addition.
On an x86-64 host, this gives a 3% boot time improvement for a powerpc
guest and 4% for an x86-64 guest.
[rth: Tidied calculation of the offset mask]
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <1436467197-2183-1-git-send-email-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/i386/tcg-target.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index 7648f7e..ff55499 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -1172,7 +1172,7 @@ static void * const qemu_st_helpers[16] = {
First argument register is clobbered. */
static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
- int mem_index, TCGMemOp s_bits,
+ int mem_index, TCGMemOp opc,
tcg_insn_unit **label_ptr, int which)
{
const TCGReg r0 = TCG_REG_L0;
@@ -1180,6 +1180,8 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
TCGType ttype = TCG_TYPE_I32;
TCGType htype = TCG_TYPE_I32;
int trexw = 0, hrexw = 0;
+ int s_mask = (1 << (opc & MO_SIZE)) - 1;
+ bool aligned = (opc & MO_AMASK) == MO_ALIGN || s_mask == 0;
if (TCG_TARGET_REG_BITS == 64) {
if (TARGET_LONG_BITS == 64) {
@@ -1193,13 +1195,19 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
}
tcg_out_mov(s, htype, r0, addrlo);
- tcg_out_mov(s, ttype, r1, addrlo);
+ if (aligned) {
+ tcg_out_mov(s, ttype, r1, addrlo);
+ } else {
+ /* For unaligned access check that we don't cross pages using
+ the page address of the last byte. */
+ tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask);
+ }
tcg_out_shifti(s, SHIFT_SHR + hrexw, r0,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
tgen_arithi(s, ARITH_AND + trexw, r1,
- TARGET_PAGE_MASK | ((1 << s_bits) - 1), 0);
+ TARGET_PAGE_MASK | (aligned ? s_mask : 0), 0);
tgen_arithi(s, ARITH_AND + hrexw, r0,
(CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS, 0);
@@ -1545,7 +1553,6 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
TCGMemOp opc;
#if defined(CONFIG_SOFTMMU)
int mem_index;
- TCGMemOp s_bits;
tcg_insn_unit *label_ptr[2];
#endif
@@ -1558,9 +1565,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
#if defined(CONFIG_SOFTMMU)
mem_index = get_mmuidx(oi);
- s_bits = opc & MO_SIZE;
- tcg_out_tlb_load(s, addrlo, addrhi, mem_index, s_bits,
+ tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
label_ptr, offsetof(CPUTLBEntry, addr_read));
/* TLB Hit. */
@@ -1687,7 +1693,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
TCGMemOp opc;
#if defined(CONFIG_SOFTMMU)
int mem_index;
- TCGMemOp s_bits;
tcg_insn_unit *label_ptr[2];
#endif
@@ -1700,9 +1705,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
#if defined(CONFIG_SOFTMMU)
mem_index = get_mmuidx(oi);
- s_bits = opc & MO_SIZE;
- tcg_out_tlb_load(s, addrlo, addrhi, mem_index, s_bits,
+ tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
label_ptr, offsetof(CPUTLBEntry, addr_write));
/* TLB Hit. */
--
2.4.3
next prev parent reply other threads:[~2015-08-24 19:38 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-24 19:36 [Qemu-devel] [PULL 00/18] Queued TCG patches Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 01/18] tcg/optimize: fix constant signedness Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 02/18] tcg/optimize: optimize temps tracking Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 03/18] tcg/optimize: add temp_is_const and temp_is_copy functions Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 04/18] tcg/optimize: track const/copy status separately Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 05/18] tcg/optimize: allow constant to have copies Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 06/18] tcg: rename trunc_shr_i32 into trunc_shr_i64_i32 Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 07/18] tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32 Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 08/18] tcg: implement real ext_i32_i64 and extu_i32_i64 ops Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 09/18] tcg/optimize: add optimizations for " Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 10/18] tcg: update README about size changing ops Richard Henderson
2015-08-24 19:36 ` [Qemu-devel] [PULL 11/18] tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32 Richard Henderson
2015-08-24 19:37 ` Richard Henderson [this message]
2015-08-24 19:37 ` [Qemu-devel] [PULL 14/18] tcg/ppc: Improve unaligned load/store handling on 64-bit backend Richard Henderson
2015-08-24 19:37 ` [Qemu-devel] [PULL 15/18] tcg/s390: Use softmmu fast path for unaligned accesses Richard Henderson
2015-08-24 19:37 ` [Qemu-devel] [PULL 16/18] tcg/aarch64: " Richard Henderson
2015-08-24 19:37 ` [Qemu-devel] [PULL 17/18] linux-user: remove --enable-guest-base/--disable-guest-base Richard Henderson
2015-08-24 19:37 ` [Qemu-devel] [PULL 18/18] linux-user: remove useless macros GUEST_BASE and RESERVED_VA Richard Henderson
2015-08-28 8:21 ` Cornelia Huck
2015-08-28 8:33 ` Laurent Vivier
2015-08-28 8:55 ` Cornelia Huck
2015-08-25 14:33 ` [Qemu-devel] [PULL 00/18] Queued TCG patches Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1440445026-26522-14-git-send-email-rth@twiddle.net \
--to=rth@twiddle.net \
--cc=aurelien@aurel32.net \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).