qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
@ 2008-09-21 18:43 Blue Swirl
  0 siblings, 0 replies; 6+ messages in thread
From: Blue Swirl @ 2008-09-21 18:43 UTC (permalink / raw)
  To: qemu-devel

Revision: 5283
          http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5283
Author:   blueswir1
Date:     2008-09-21 18:43:17 +0000 (Sun, 21 Sep 2008)

Log Message:
-----------
Use the new concat_tl_i64 op for std and stda

Modified Paths:
--------------
    trunk/target-sparc/translate.c

Modified: trunk/target-sparc/translate.c
===================================================================
--- trunk/target-sparc/translate.c	2008-09-21 18:32:28 UTC (rev 5282)
+++ trunk/target-sparc/translate.c	2008-09-21 18:43:17 UTC (rev 5283)
@@ -1715,14 +1715,10 @@
 
 static inline void gen_stda_asi(TCGv hi, TCGv addr, int insn, int rd)
 {
-    TCGv r_low, r_asi, r_size;
+    TCGv r_asi, r_size;
 
     gen_movl_reg_TN(rd + 1, cpu_tmp0);
-    r_low = tcg_temp_new(TCG_TYPE_I32);
-    tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
-    tcg_gen_trunc_tl_i32(cpu_tmp32, hi);
-    tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
-    tcg_temp_free(r_low);
+    tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, hi);
     r_asi = gen_get_asi(insn, addr);
     r_size = tcg_const_i32(8);
     tcg_gen_helper_0_4(helper_st_asi, addr, cpu_tmp64, r_asi, r_size);
@@ -1818,14 +1814,10 @@
 
 static inline void gen_stda_asi(TCGv hi, TCGv addr, int insn, int rd)
 {
-    TCGv r_low, r_asi, r_size;
+    TCGv r_asi, r_size;
 
     gen_movl_reg_TN(rd + 1, cpu_tmp0);
-    r_low = tcg_temp_new(TCG_TYPE_I32);
-    tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
-    tcg_gen_trunc_tl_i32(cpu_tmp32, hi);
-    tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
-    tcg_temp_free(r_low);
+    tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, hi);
     r_asi = tcg_const_i32(GET_FIELD(insn, 19, 26));
     r_size = tcg_const_i32(8);
     tcg_gen_helper_0_4(helper_st_asi, addr, cpu_tmp64, r_asi, r_size);
@@ -4477,7 +4469,7 @@
                     if (rd & 1)
                         goto illegal_insn;
                     else {
-                        TCGv r_low, r_const;
+                        TCGv r_const;
 
                         save_state(dc, cpu_cond);
                         gen_address_mask(dc, cpu_addr);
@@ -4486,11 +4478,7 @@
                                            r_const); // XXX remove
                         tcg_temp_free(r_const);
                         gen_movl_reg_TN(rd + 1, cpu_tmp0);
-                        r_low = tcg_temp_new(TCG_TYPE_I32);
-                        tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
-                        tcg_gen_trunc_tl_i32(cpu_tmp32, cpu_val);
-                        tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
-                        tcg_temp_free(r_low);
+                        tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, cpu_val);
                         tcg_gen_qemu_st64(cpu_tmp64, cpu_addr, dc->mem_idx);
                     }
                     break;

^ permalink raw reply	[flat|nested] 6+ messages in thread

* re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
@ 2008-09-23 21:49 Vince Weaver
  2008-09-23 21:59 ` Paul Brook
  0 siblings, 1 reply; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 21:49 UTC (permalink / raw)
  To: qemu-devel

Hello

the aforementioned patch broke the spec2k crafty benchmark on 
sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a 
difference).

As always, the problem is when there are high bits (above 32) set when 
concating.

The old code generated:
    mov_i32 tmp11,tmp0
    mov_i32 tmp1,loc4
    movi_i64 tmp12,$0xffffffff
    and_i64 tmp10,tmp1,tmp12
    movi_i64 tmp12,$0x20
    shl_i64 tmp10,tmp10,tmp12
    movi_i64 tmp12,$0xffffffff
    and_i64 tmp2,tmp11,tmp12
    or_i64 tmp2,tmp2,tmp10
    qemu_st64 tmp2,loc5,$0x0

The new code generates:
    movi_i64 tmp12,$0x20
    shl_i64 tmp10,loc4,tmp12
    or_i64 tmp2,tmp0,tmp10
    qemu_st64 tmp2,loc5,$0x0

which generates wrong values because the low value being concatenated 
isn't masked/truncated to 32-bits first.

Vince

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
  2008-09-23 21:49 [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda Vince Weaver
@ 2008-09-23 21:59 ` Paul Brook
  2008-09-23 22:08   ` Vince Weaver
  0 siblings, 1 reply; 6+ messages in thread
From: Paul Brook @ 2008-09-23 21:59 UTC (permalink / raw)
  To: qemu-devel

On Tuesday 23 September 2008, Vince Weaver wrote:
> Hello
>
> the aforementioned patch broke the spec2k crafty benchmark on
> sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a
> difference).

Are you sure? I've double checked the patch and I still think it is correct.

>
> The new code generates:
>     movi_i64 tmp12,$0x20
>     shl_i64 tmp10,loc4,tmp12
>     or_i64 tmp2,tmp0,tmp10
>     qemu_st64 tmp2,loc5,$0x0

The extension should be immediately before the code you quote here.

Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
  2008-09-23 21:59 ` Paul Brook
@ 2008-09-23 22:08   ` Vince Weaver
  2008-09-23 22:31     ` Vince Weaver
  2008-09-23 22:31     ` Paul Brook
  0 siblings, 2 replies; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 22:08 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On Tue, 23 Sep 2008, Paul Brook wrote:

> On Tuesday 23 September 2008, Vince Weaver wrote:
>> Hello
>>
>> the aforementioned patch broke the spec2k crafty benchmark on
>> sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a
>> difference).
>
> Are you sure? I've double checked the patch and I still think it is correct.

Yes.  The program definitely works if I revert the patch, but fails if I 
re-apply it.

>> The new code generates:
>>     movi_i64 tmp12,$0x20
>>     shl_i64 tmp10,loc4,tmp12
>>     or_i64 tmp2,tmp0,tmp10
>>     qemu_st64 tmp2,loc5,$0x0
>
> The extension should be immediately before the code you quote here.

Nope, in both cases the preceeding code is identical:

OP after la:
  movi_i64 tmp8,$0x13bc00
  mov_i64 g1,tmp8
  movi_i64 tmp10,$0x268
  add_i64 loc5,g1,tmp10
  ld_i64 loc4,regwptr,$0x0
  movi_i64 pc,$0x27e78
  movi_i64 npc,$0x27e7c
  movi_i64 tmp10,$0xffffffff
  and_i64 loc5,loc5,tmp10
  movi_i32 tmp11,$0x7
  movi_i64 tmp10,$helper_check_align
  call tmp10,$0x0,$0,loc5,tmp11
  ld_i64 tmp0,regwptr,$0x8

The code in question is:

IN: InitializeMasks
0x0000000000027e74:  sethi  %hi(0x13bc00), %g1
0x0000000000027e78:  std  %o0, [ %g1 + 0x268 ]  ! 0x13be68
0x0000000000027e7c:  call  0x22eac
0x0000000000027e80:  mov  3, %o0


I can create a small assembly language program that duplicates the problem 
if it would help debugging it.

Vince

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
  2008-09-23 22:08   ` Vince Weaver
@ 2008-09-23 22:31     ` Vince Weaver
  2008-09-23 22:31     ` Paul Brook
  1 sibling, 0 replies; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 22:31 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 317 bytes --]

On Tue, 23 Sep 2008, Vince Weaver wrote:
>
> I can create a small assembly language program that duplicates the problem if 
> it would help debugging it.

attached is a short assembly language program that shows the problem.
It fails with current SVN, but passes on real hardware and with patch 5283 
reverted.

Vince

[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 1127 bytes --]

[-- Attachment #3: Type: TEXT/PLAIN, Size: 2011 bytes --]

! as -Av8plus -o std_test.o std_test.s ; ld -o std_test std_test.o

.equ SYSCALL_EXIT,1	
.equ SYSCALL_WRITE,4
.equ STDOUT,1

	! This problem was noticed with crafty (which does 64-bit bit manipulations)
	! After the "std" code moved to using tcg_gen_concat_tl_i64()
	
	.globl _start
_start:
	
	!
	! TEST1 - high bits in y
	! 
	
	set	test1,%o1
	call	write_stdout
	nop
tun_test1:	
	! Set some high bits in o4
	! have to use sllx to get them there
	
	set	0xff000000, %o5
	sllx	%o5,8,%o5
	or	%o5,0xff,%o5
	
	set	0x00000000, %o4

	set	temp_mem, %o3
	
	std	%o4,[%o3]		! write 64-bit value, high 32bits from %o4
					! low 32bits from %o5

	ld	[%o3],%g1		! load back in the high 32 bits
	
	set	0x00000000,%g2		! result _should_ be 0
	cmp	%g1,%g2
	beq	test1_pass
	nop
	
test1_fail:
	set	fail,%o1
	ba	test1_print
	nop
test1_pass:
	set	pass,%o1
test1_print:
	call	write_stdout
	nop
		
	!================================
	! Exit
	!================================
exit:		
        mov	0,%o0			! exit value
        mov	SYSCALL_EXIT,%g1        ! put the exit syscall number in g1
        ta      0x10			! and exit


        !================================
	! WRITE_STDOUT
	!================================
	! %o1 has string

write_stdout:
	set     SYSCALL_WRITE,%g1       ! Write syscall in %g1
	set     STDOUT,%o0              ! 1 in %o0 (stdout)
	set     0,%o2                   ! 0 (count) in %o2

str_loop1:
	ldub    [%o1+%o2],%l0           ! load byte
	cmp     %l0,%g0                 ! compare against zero
	bnz     str_loop1               ! if not nul, repeat
	! BRANCH DELAY SLOT
	inc     %o2                     ! increment count

	dec     %o2                     ! correct count
	ta      0x10                    ! run the syscall

	retl                            ! return
	nop
		
.data
fail:	.ascii "Fail!\n\0"
pass:	.ascii "Pass\n\0"
test1:	.ascii "Test1: Storing 64-bit value : \0"
temp_mem:	.int 0
temp_mem_lo:	.int 0
	

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
  2008-09-23 22:08   ` Vince Weaver
  2008-09-23 22:31     ` Vince Weaver
@ 2008-09-23 22:31     ` Paul Brook
  1 sibling, 0 replies; 6+ messages in thread
From: Paul Brook @ 2008-09-23 22:31 UTC (permalink / raw)
  To: qemu-devel

> >> The new code generates:
> >>     movi_i64 tmp12,$0x20
> >>     shl_i64 tmp10,loc4,tmp12
> >>     or_i64 tmp2,tmp0,tmp10
> >>     qemu_st64 tmp2,loc5,$0x0
> >
> > The extension should be immediately before the code you quote here.
>
> Nope, in both cases the preceeding code is identical:

Ah, I see what's going on now.  Fixed.

Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-09-23 22:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-23 21:49 [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda Vince Weaver
2008-09-23 21:59 ` Paul Brook
2008-09-23 22:08   ` Vince Weaver
2008-09-23 22:31     ` Vince Weaver
2008-09-23 22:31     ` Paul Brook
  -- strict thread matches above, loose matches on Subject: below --
2008-09-21 18:43 Blue Swirl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).