From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33268)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZF4os-0002OD-57 for qemu-devel@nongnu.org;
    Tue, 14 Jul 2015 14:20:43 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1ZF4oo-00052g-1q for qemu-devel@nongnu.org;
    Tue, 14 Jul 2015 14:20:42 -0400
Received: from mail-wg0-x22e.google.com ([2a00:1450:400c:c00::22e]:34360)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZF4on-00052O-Rn for qemu-devel@nongnu.org;
    Tue, 14 Jul 2015 14:20:37 -0400
Received: by wgkl9 with SMTP id l9so15287883wgk.1 for ;
    Tue, 14 Jul 2015 11:20:36 -0700 (PDT)
Sender: Paolo Bonzini
References: <1436891912-14742-1-git-send-email-leon.alrae@imgtec.com>
    <20150714170928.GC7569@aurel32.net>
From: Paolo Bonzini
Message-ID: <55A552F1.70000@redhat.com>
Date: Tue, 14 Jul 2015 20:20:33 +0200
MIME-Version: 1.0
In-Reply-To: <20150714170928.GC7569@aurel32.net>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH] target-mips: apply workaround for TCG optimizations for MFC1
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Aurelien Jarno , Leon Alrae
Cc: qemu-devel@nongnu.org, rth@twiddle.net

On 14/07/2015 19:09, Aurelien Jarno wrote:
> On 2015-07-14 17:38, Leon Alrae wrote:
>> There seems to be an issue when trying to keep a pointer in bottom 32-bits
>> of a 64-bit floating point register. Load and store instructions accessing
>> this address for some reason use the whole 64-bit content of floating point
>> register rather than truncated 32-bit value.
>> The following load uses
>> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
>>
>>   0x00400c60: mfc1 t8,$f0
>>   0x00400c64: lw t9,0(t8)
>>
>> It can be reproduced with the following linux userland program when running
>> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
>> mips32r6-generic CPUs have this bit set in linux-user).
>>
>> int main(int argc, char *argv[])
>> {
>>     int tmp = 0x11111111;
>>     /* Set f0 */
>>     __asm__ ("mtc1 %0, $f0\n"
>>              "mthc1 %1, $f0\n"
>>              : : "r" (&tmp), "r" (tmp));
>>     /* At this point $f0: w:76fff040 d:1111111176fff040 */
>>     __asm__ ("mfc1 $t8, $f0\n"
>>              "lw $t9, 0($t8)\n"); /* <--- crash! */
>>     return 0;
>> }
>>
>> Running above program in normal (non-singlestep mode) leads to:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00005555559f6f37 in static_code_gen_buffer ()
>> (gdb) x/i 0x00005555559f6f37
>> => 0x5555559f6f37 : mov %gs:0x0(%rbp),%ebp
>> (gdb) info registers rbp
>> rbp 0x1111111176fff040 0x1111111176fff040
>>
>> The program runs fine in singlestep mode, or with disabled TCG
>> optimizations. Also, I'm not able to reproduce it in system emulation.
>
> I am able to reproduce the problem, but for me disabling the
> optimizations doesn't help. That said the problem is just another issue
> with the "let's assume the target supports move between 32 and 64 bit
> registers". At some point we should add a paragraph to tcg/README, to
> define how to handle 32 vs 64 bit registers and what the TCG targets should
> expect. We had to add special code to handle that for sparc
> (trunc_shr_i32 instruction), but also code to the optimizer to remember
> about "garbage" high bits. I am not sure someone has a global view about
> how all this code interacts.

I certainly don't have a global view, so much so that I didn't think of
the optimizer at all... Instead, it looks to me like a bug in the register
allocator.
In particular this code in tcg_reg_alloc_mov:

        if (IS_DEAD_ARG(1) && !ts->fixed_reg && !ots->fixed_reg) {
            /* the mov can be suppressed */
            if (ots->val_type == TEMP_VAL_REG) {
                s->reg_to_temp[ots->reg] = -1;
            }
            ots->reg = ts->reg;
            temp_dead(s, args[1]);
        }

is not covering the "itype != otype" case. In addition, the
IS_DEAD_ARG(1) case can be covered above in the

    if (((NEED_SYNC_ARG(0) || ots->fixed_reg) && ts->val_type != TEMP_VAL_REG)
        || ts->val_type == TEMP_VAL_MEM) {

conditional: in this case there's no need at all to go through itype,
and it's possible to load directly into ots.

Paolo
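
For illustration only, here is a tiny standalone C model of why suppressing
the mov is a problem when the source and destination types differ. It is not
QEMU code: every function name below is made up for this sketch, and it
assumes that a suppressed mov simply lets the destination take over the
source's 64-bit host register unchanged. The constants are the $f0 and rbp
values from Leon's report.

```c
#include <assert.h>
#include <stdint.h>

/* Value of $f0 from the report: high half 0x11111111, low half a pointer. */
#define F0_EXAMPLE UINT64_C(0x1111111176fff040)

/* What mfc1 should hand to a 32-bit guest: only the low 32 bits. */
uint32_t mfc1_expected(uint64_t fpr)
{
    return (uint32_t)fpr;
}

/* Assumed model of the suppressed mov: the destination temp reuses the
 * source's 64-bit host register, so the high bits survive untouched. */
uint64_t mfc1_mov_suppressed(uint64_t fpr)
{
    return fpr;
}

/* Hypothetical guard: in this model, suppressing the mov is harmless only
 * when the register already holds a clean (zero-extended) 32-bit value. */
int suppression_is_safe(uint64_t fpr)
{
    return mfc1_mov_suppressed(fpr) == (uint64_t)mfc1_expected(fpr);
}

/* Self-check using the numbers from the report. */
void check_report_values(void)
{
    assert(mfc1_expected(F0_EXAMPLE) == UINT32_C(0x76fff040));
    assert(mfc1_mov_suppressed(F0_EXAMPLE) == UINT64_C(0x1111111176fff040));
    assert(!suppression_is_safe(F0_EXAMPLE));
    assert(suppression_is_safe(UINT64_C(0x76fff040)));
}
```

In this model the "suppressed" value matches the rbp contents gdb shows at
the faulting host instruction, while the truncated value is the address the
guest lw was supposed to use.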