Message-ID: <5380BDC8.7050502@twiddle.net>
Date: Sat, 24 May 2014 08:42:00 -0700
From: Richard Henderson
References: <1400051861-5848-1-git-send-email-rth@twiddle.net> <1400051861-5848-7-git-send-email-rth@twiddle.net> <53806C74.3050602@redhat.com>
In-Reply-To: <53806C74.3050602@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 06/24] tcg-mips: Move softmmu slow path out of line
To: Paolo Bonzini, qemu-devel@nongnu.org
Cc: aurelien@aurel32.net

On 05/24/2014 02:55 AM, Paolo Bonzini wrote:
> On 14/05/2014 09:17, Richard Henderson wrote:
>> +    tcg_out_opc_imm(s, OPC_LW, TCG_REG_A0, TCG_REG_A0, add_off);
>> +    tcg_out_opc_reg(s, OPC_AND, TCG_REG_T0, TCG_REG_T0, addrl);
>> +
>> +    label_ptr[0] = s->code_ptr;
>>      tcg_out_opc_br(s, OPC_BNE, TCG_REG_T0, TCG_REG_AT);
>> -    tcg_out_nop(s);
>
> I don't remember mips very well, LW cannot be put in the delay slot?  This would
> let you fill both delay slots for the 64-bit case.  Or is it just that the code
> becomes harder to follow due to the TARGET_LONG_BITS == 64 "if"s?
>
> Alternatively, for 64-bit you could use OR+BNE instead of BNE+NOP+BNE.  Of
> course this can be done later, this patchset is already a big improvement.

It's MIPS I that had all sorts of problems with scheduling loads, including
requiring two cycles between load issue and use.  TCG doesn't handle any of
that; we require a fully interlocked pipeline.  Without looking it up, I'd
guess that means at least MIPS III (circa 1992?).

Mostly that nop is hard to fill because of the ifs, and I wanted to fill the
last slot with the addition that makes up the full host address.

OR+BNE doesn't help; you need 2 XORs and 1 OR to do a double-word equality
comparison.  It might take a bit of measurement to show that's worthwhile.


r~
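
For reference, the double-word comparison mentioned above can be sketched in
plain C.  This is only an illustration of the 2 XORs + 1 OR sequence, not code
from the patch; the helper name dword_differs and its argument names are made
up here.  Each XOR yields zero only when its pair of words matches, so the OR
of the two results is zero only when both halves match, and a single BNE
against the zero register could then branch on it.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns nonzero iff the
 * 64-bit values ah:al and bh:bl differ, using only 32-bit operations.
 */
static uint32_t dword_differs(uint32_t al, uint32_t ah,
                              uint32_t bl, uint32_t bh)
{
    uint32_t t0 = al ^ bl;   /* XOR #1: zero iff the low words match  */
    uint32_t t1 = ah ^ bh;   /* XOR #2: zero iff the high words match */
    return t0 | t1;          /* OR: zero iff both halves match        */
}
```

That is three ALU operations plus one branch, against two branches with their
delay slots in the BNE+NOP+BNE form, which is why it would need measuring.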