* [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
@ 2008-09-21 18:43 Blue Swirl
0 siblings, 0 replies; 6+ messages in thread
From: Blue Swirl @ 2008-09-21 18:43 UTC (permalink / raw)
To: qemu-devel
Revision: 5283
http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5283
Author: blueswir1
Date: 2008-09-21 18:43:17 +0000 (Sun, 21 Sep 2008)
Log Message:
-----------
Use the new concat_tl_i64 op for std and stda
Modified Paths:
--------------
trunk/target-sparc/translate.c
Modified: trunk/target-sparc/translate.c
===================================================================
--- trunk/target-sparc/translate.c 2008-09-21 18:32:28 UTC (rev 5282)
+++ trunk/target-sparc/translate.c 2008-09-21 18:43:17 UTC (rev 5283)
@@ -1715,14 +1715,10 @@
static inline void gen_stda_asi(TCGv hi, TCGv addr, int insn, int rd)
{
- TCGv r_low, r_asi, r_size;
+ TCGv r_asi, r_size;
gen_movl_reg_TN(rd + 1, cpu_tmp0);
- r_low = tcg_temp_new(TCG_TYPE_I32);
- tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
- tcg_gen_trunc_tl_i32(cpu_tmp32, hi);
- tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
- tcg_temp_free(r_low);
+ tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, hi);
r_asi = gen_get_asi(insn, addr);
r_size = tcg_const_i32(8);
tcg_gen_helper_0_4(helper_st_asi, addr, cpu_tmp64, r_asi, r_size);
@@ -1818,14 +1814,10 @@
static inline void gen_stda_asi(TCGv hi, TCGv addr, int insn, int rd)
{
- TCGv r_low, r_asi, r_size;
+ TCGv r_asi, r_size;
gen_movl_reg_TN(rd + 1, cpu_tmp0);
- r_low = tcg_temp_new(TCG_TYPE_I32);
- tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
- tcg_gen_trunc_tl_i32(cpu_tmp32, hi);
- tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
- tcg_temp_free(r_low);
+ tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, hi);
r_asi = tcg_const_i32(GET_FIELD(insn, 19, 26));
r_size = tcg_const_i32(8);
tcg_gen_helper_0_4(helper_st_asi, addr, cpu_tmp64, r_asi, r_size);
@@ -4477,7 +4469,7 @@
if (rd & 1)
goto illegal_insn;
else {
- TCGv r_low, r_const;
+ TCGv r_const;
save_state(dc, cpu_cond);
gen_address_mask(dc, cpu_addr);
@@ -4486,11 +4478,7 @@
r_const); // XXX remove
tcg_temp_free(r_const);
gen_movl_reg_TN(rd + 1, cpu_tmp0);
- r_low = tcg_temp_new(TCG_TYPE_I32);
- tcg_gen_trunc_tl_i32(r_low, cpu_tmp0);
- tcg_gen_trunc_tl_i32(cpu_tmp32, cpu_val);
- tcg_gen_concat_i32_i64(cpu_tmp64, r_low, cpu_tmp32);
- tcg_temp_free(r_low);
+ tcg_gen_concat_tl_i64(cpu_tmp64, cpu_tmp0, cpu_val);
tcg_gen_qemu_st64(cpu_tmp64, cpu_addr, dc->mem_idx);
}
break;
^ permalink raw reply [flat|nested] 6+ messages in thread
* re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
@ 2008-09-23 21:49 Vince Weaver
2008-09-23 21:59 ` Paul Brook
0 siblings, 1 reply; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 21:49 UTC (permalink / raw)
To: qemu-devel
Hello
the aforementioned patch broke the spec2k crafty benchmark on
sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a
difference).
As always, the problem is when there are high bits (above 32) set when
concating.
The old code generated:
mov_i32 tmp11,tmp0
mov_i32 tmp1,loc4
movi_i64 tmp12,$0xffffffff
and_i64 tmp10,tmp1,tmp12
movi_i64 tmp12,$0x20
shl_i64 tmp10,tmp10,tmp12
movi_i64 tmp12,$0xffffffff
and_i64 tmp2,tmp11,tmp12
or_i64 tmp2,tmp2,tmp10
qemu_st64 tmp2,loc5,$0x0
The new code generates:
movi_i64 tmp12,$0x20
shl_i64 tmp10,loc4,tmp12
or_i64 tmp2,tmp0,tmp10
qemu_st64 tmp2,loc5,$0x0
which generates wrong values because the low value being concatenated
isn't masked/truncated to 32-bits first.
Vince
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
2008-09-23 21:49 [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda Vince Weaver
@ 2008-09-23 21:59 ` Paul Brook
2008-09-23 22:08 ` Vince Weaver
0 siblings, 1 reply; 6+ messages in thread
From: Paul Brook @ 2008-09-23 21:59 UTC (permalink / raw)
To: qemu-devel
On Tuesday 23 September 2008, Vince Weaver wrote:
> Hello
>
> the aforementioned patch broke the spec2k crafty benchmark on
> sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a
> difference).
Are you sure? I've double checked the patch and I still think it is correct.
>
> The new code generates:
> movi_i64 tmp12,$0x20
> shl_i64 tmp10,loc4,tmp12
> or_i64 tmp2,tmp0,tmp10
> qemu_st64 tmp2,loc5,$0x0
The extension should be immediately before the code you quote here.
Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
2008-09-23 21:59 ` Paul Brook
@ 2008-09-23 22:08 ` Vince Weaver
2008-09-23 22:31 ` Vince Weaver
2008-09-23 22:31 ` Paul Brook
0 siblings, 2 replies; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 22:08 UTC (permalink / raw)
To: Paul Brook; +Cc: qemu-devel
On Tue, 23 Sep 2008, Paul Brook wrote:
> On Tuesday 23 September 2008, Vince Weaver wrote:
>> Hello
>>
>> the aforementioned patch broke the spec2k crafty benchmark on
>> sparc32plus-linux-user (I'm running on x86_64 but I don't think it makes a
>> difference).
>
> Are you sure? I've double checked the patch and I still think it is correct.
Yes. The program definitely works if I revert the patch, but fails if I
re-apply it.
>> The new code generates:
>> movi_i64 tmp12,$0x20
>> shl_i64 tmp10,loc4,tmp12
>> or_i64 tmp2,tmp0,tmp10
>> qemu_st64 tmp2,loc5,$0x0
>
> The extension should be immediately before the code you quote here.
Nope, in both cases the preceeding code is identical:
OP after la:
movi_i64 tmp8,$0x13bc00
mov_i64 g1,tmp8
movi_i64 tmp10,$0x268
add_i64 loc5,g1,tmp10
ld_i64 loc4,regwptr,$0x0
movi_i64 pc,$0x27e78
movi_i64 npc,$0x27e7c
movi_i64 tmp10,$0xffffffff
and_i64 loc5,loc5,tmp10
movi_i32 tmp11,$0x7
movi_i64 tmp10,$helper_check_align
call tmp10,$0x0,$0,loc5,tmp11
ld_i64 tmp0,regwptr,$0x8
The code in question is:
IN: InitializeMasks
0x0000000000027e74: sethi %hi(0x13bc00), %g1
0x0000000000027e78: std %o0, [ %g1 + 0x268 ] ! 0x13be68
0x0000000000027e7c: call 0x22eac
0x0000000000027e80: mov 3, %o0
I can create a small assembly language program that duplicates the problem
if it would help debugging it.
Vince
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
2008-09-23 22:08 ` Vince Weaver
@ 2008-09-23 22:31 ` Vince Weaver
2008-09-23 22:31 ` Paul Brook
1 sibling, 0 replies; 6+ messages in thread
From: Vince Weaver @ 2008-09-23 22:31 UTC (permalink / raw)
To: qemu-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 317 bytes --]
On Tue, 23 Sep 2008, Vince Weaver wrote:
>
> I can create a small assembly language program that duplicates the problem if
> it would help debugging it.
attached is a short assembly language program that shows the problem.
It fails with current SVN, but passes on real hardware and with patch 5283
reverted.
Vince
[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 1127 bytes --]
[-- Attachment #3: Type: TEXT/PLAIN, Size: 2011 bytes --]
! as -Av8plus -o std_test.o std_test.s ; ld -o std_test std_test.o
.equ SYSCALL_EXIT,1
.equ SYSCALL_WRITE,4
.equ STDOUT,1
! This problem was noticed with crafty (which does 64-bit bit manipulations)
! After the "std" code moved to using tcg_gen_concat_tl_i64()
.globl _start
_start:
!
! TEST1 - high bits in y
!
set test1,%o1
call write_stdout
nop
tun_test1:
! Set some high bits in o4
! have to use sllx to get them there
set 0xff000000, %o5
sllx %o5,8,%o5
or %o5,0xff,%o5
set 0x00000000, %o4
set temp_mem, %o3
std %o4,[%o3] ! write 64-bit value, high 32bits from %o4
! low 32bits from %o5
ld [%o3],%g1 ! load back in the high 32 bits
set 0x00000000,%g2 ! result _should_ be 0
cmp %g1,%g2
beq test1_pass
nop
test1_fail:
set fail,%o1
ba test1_print
nop
test1_pass:
set pass,%o1
test1_print:
call write_stdout
nop
!================================
! Exit
!================================
exit:
mov 0,%o0 ! exit value
mov SYSCALL_EXIT,%g1 ! put the exit syscall number in g1
ta 0x10 ! and exit
!================================
! WRITE_STDOUT
!================================
! %o1 has string
write_stdout:
set SYSCALL_WRITE,%g1 ! Write syscall in %g1
set STDOUT,%o0 ! 1 in %o0 (stdout)
set 0,%o2 ! 0 (count) in %o2
str_loop1:
ldub [%o1+%o2],%l0 ! load byte
cmp %l0,%g0 ! compare against zero
bnz str_loop1 ! if not nul, repeat
! BRANCH DELAY SLOT
inc %o2 ! increment count
dec %o2 ! correct count
ta 0x10 ! run the syscall
retl ! return
nop
.data
fail: .ascii "Fail!\n\0"
pass: .ascii "Pass\n\0"
test1: .ascii "Test1: Storing 64-bit value : \0"
temp_mem: .int 0
temp_mem_lo: .int 0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda
2008-09-23 22:08 ` Vince Weaver
2008-09-23 22:31 ` Vince Weaver
@ 2008-09-23 22:31 ` Paul Brook
1 sibling, 0 replies; 6+ messages in thread
From: Paul Brook @ 2008-09-23 22:31 UTC (permalink / raw)
To: qemu-devel
> >> The new code generates:
> >> movi_i64 tmp12,$0x20
> >> shl_i64 tmp10,loc4,tmp12
> >> or_i64 tmp2,tmp0,tmp10
> >> qemu_st64 tmp2,loc5,$0x0
> >
> > The extension should be immediately before the code you quote here.
>
> Nope, in both cases the preceeding code is identical:
Ah, I see what's going on now. Fixed.
Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-09-23 22:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-23 21:49 [Qemu-devel] [5283] Use the new concat_tl_i64 op for std and stda Vince Weaver
2008-09-23 21:59 ` Paul Brook
2008-09-23 22:08 ` Vince Weaver
2008-09-23 22:31 ` Vince Weaver
2008-09-23 22:31 ` Paul Brook
-- strict thread matches above, loose matches on Subject: below --
2008-09-21 18:43 Blue Swirl
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).