[Qemu-devel] target-sparc/TODO

* [Qemu-devel] target-sparc/TODO
@ 2009-08-17 10:52 Artyom Tarasenko
  2009-08-17 17:35 ` [Qemu-devel] target-sparc/TODO Blue Swirl
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-17 10:52 UTC (permalink / raw)
  To: qemu-devel, Blue Swirl

> - Global register for regwptr, so that windowed registers can be
> accessed directly

looks like it's already implemented?

> - Synthetic instructions
Is it still open?

> - Hardware breakpoint/watchpoint support
Is it still open?

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-17 10:52 [Qemu-devel] target-sparc/TODO Artyom Tarasenko
@ 2009-08-17 17:35 ` Blue Swirl
  2009-08-19 10:17   ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Blue Swirl @ 2009-08-17 17:35 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: qemu-devel

On Mon, Aug 17, 2009 at 1:52 PM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
>> - Global register for regwptr, so that windowed registers can be
>> accessed directly
>
> looks like it's already implemented?

No, this means that a global register (TCG_AREG1) would be designated
as regwptr, so that the window registers (%o, %l, %i) would be defined
with:

cpu_wregs[i] = tcg_global_mem_new(TCG_AREG1, offsetof(...), name).

This would need some changes to cwp handling to support TCG_AREG1,
maybe also to TCG prologue.

Before TCG, this was difficult because the registers were taken by
cpu_T0, cpu_T1 and cpu_T2.

But it's not clear whether this would give any performance gain:
although window register accesses may get faster (even that is not
certain, because CPUState should reside in cache), one fewer host
register would be available, and that may mean more host memory accesses.
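
The indirection under discussion can be pictured with a toy C model.
This is only a sketch under stated assumptions: ToyCpu, toy_set_cwp()
and the fixed NWINDOWS = 8 are hypothetical names chosen for
illustration, not QEMU's actual code. Every windowed access goes
through regwptr; the TODO item would keep that pointer in a fixed host
register instead of reloading it from the CPU state.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NWINDOWS = 8 };   /* assumed window count for this toy model */

typedef struct {
    uint32_t gregs[8];                    /* %g0..%g7 */
    uint32_t regbase[NWINDOWS * 16 + 8];  /* window file plus 8 wrap slots */
    uint32_t *regwptr;                    /* start of the current window */
    unsigned cwp;                         /* current window pointer */
} ToyCpu;

/* Switch windows. The 8 wrap slots keep the last window's %i registers
   aliased with window 0's %o registers. */
static void toy_set_cwp(ToyCpu *cpu, unsigned new_cwp)
{
    assert(new_cwp < NWINDOWS);
    if (cpu->cwp == NWINDOWS - 1) {
        /* leaving the last window: copy its %i values back to window 0's %o */
        memcpy(cpu->regbase, cpu->regbase + NWINDOWS * 16,
               8 * sizeof(uint32_t));
    }
    cpu->cwp = new_cwp;
    if (new_cwp == NWINDOWS - 1) {
        /* entering the last window: mirror window 0's %o into the wrap slots */
        memcpy(cpu->regbase + NWINDOWS * 16, cpu->regbase,
               8 * sizeof(uint32_t));
    }
    cpu->regwptr = cpu->regbase + new_cwp * 16;
}

static void toy_init(ToyCpu *cpu)
{
    memset(cpu, 0, sizeof(*cpu));
    toy_set_cwp(cpu, 0);
}

/* Register numbers: 0-7 = %g, 8-15 = %o, 16-23 = %l, 24-31 = %i. */
static uint32_t toy_read_reg(const ToyCpu *cpu, unsigned r)
{
    assert(r < 32);
    if (r < 8) {
        return r == 0 ? 0 : cpu->gregs[r];   /* %g0 always reads as zero */
    }
    return cpu->regwptr[r - 8];              /* one indirection via regwptr */
}

static void toy_write_reg(ToyCpu *cpu, unsigned r, uint32_t value)
{
    assert(r < 32);
    if (r == 0) {
        return;                              /* writes to %g0 are discarded */
    }
    if (r < 8) {
        cpu->gregs[r] = value;
    } else {
        cpu->regwptr[r - 8] = value;
    }
}
```

In this model a value written to %o0 in window 3 is visible as %i0
after switching to window 2, which is the overlap a save relies on.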

>> - Synthetic instructions
> Is it still open?

We already handle 'clr' and 'mov'. Code generation is not optimal,
though: for example, arithmetic ops with constants or %g0, or things
like wrpsr, which always does a XOR of its parameters even if they are
constants or %g0.

>> - Hardware breakpoint/watchpoint support
> Is it still open?

I think support for these was only found in a few CPU models, so they
are not used much. Also, nobody has shown any interest or provided a
test case.

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-17 17:35 ` [Qemu-devel] target-sparc/TODO Blue Swirl
@ 2009-08-19 10:17   ` Artyom Tarasenko
  2009-08-19 16:43     ` Blue Swirl
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-19 10:17 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2009/8/17 Blue Swirl <blauwirbel@gmail.com>:
> On Mon, Aug 17, 2009 at 1:52 PM, Artyom
> Tarasenko<atar4qemu@googlemail.com> wrote:
>>> - Global register for regwptr, so that windowed registers can be
>>> accessed directly
>>
>> looks like it's already implemented?
>
> No, this means that a global register (TCG_AREG1) would be designated
> as regwptr, so that the window registers (%o, %l, %i) would be defined
> with:
>
> cpu_wregs[i] = tcg_global_mem_new(TCG_AREG1, offsetof(...), name).
>
> This would need some changes to cwp handling to support TCG_AREG1,
> maybe also to TCG prologue.
>
> Before TCG, this was difficult because the registers were taken by
> cpu_T0, cpu_T1 and cpu_T2.
>
> But it's not clear if this gives any performance gain, because
> although window registers accesses may get faster (this is also not
> certain because CPUstate should reside in cache), there is one host
> register less available and that may mean more host memory accesses.

So, it's only about performance, otherwise the current implementation
is complete?

>>> - Synthetic instructions
>> Is it still open?
>
> We already handle 'clr' and 'mov'. Code generation is not optimal, for
> example arithmetic ops with constants/%g0 or things like wrpsr which
> always does a XOR of the parameters even if they are constants or %g0.

Would the synthetic ops with %g0 produce wrong results?
In particular, I'm interested in whether

jmp     %l1, %g4, %g0

may behave differently than on real hw.

>>> - Hardware breakpoint/watchpoint support
>> Is it still open?
>
> I think support for these was only found in a few CPU models, so they
> are not used much. Nobody has also shown any interest or provided a
> test case.


On OBP start-up I see some "write breakpoint reg" messages in the
debug log. Do they have to do with hardware breakpoint support?

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-19 10:17   ` Artyom Tarasenko
@ 2009-08-19 16:43     ` Blue Swirl
  2009-08-20  9:44       ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Blue Swirl @ 2009-08-19 16:43 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: qemu-devel

On Wed, Aug 19, 2009 at 1:17 PM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
> 2009/8/17 Blue Swirl <blauwirbel@gmail.com>:
>> On Mon, Aug 17, 2009 at 1:52 PM, Artyom
>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>> - Global register for regwptr, so that windowed registers can be
>>>> accessed directly
>>>
>>> looks like it's already implemented?
>>
>> No, this means that a global register (TCG_AREG1) would be designated
>> as regwptr, so that the window registers (%o, %l, %i) would be defined
>> with:
>>
>> cpu_wregs[i] = tcg_global_mem_new(TCG_AREG1, offsetof(...), name).
>>
>> This would need some changes to cwp handling to support TCG_AREG1,
>> maybe also to TCG prologue.
>>
>> Before TCG, this was difficult because the registers were taken by
>> cpu_T0, cpu_T1 and cpu_T2.
>>
>> But it's not clear if this gives any performance gain, because
>> although window registers accesses may get faster (this is also not
>> certain because CPUstate should reside in cache), there is one host
>> register less available and that may mean more host memory accesses.
>
> So, it's only about performance, otherwise the current implementation
> is complete?

Yes. Sparc64/V9 side is less complete.

>>>> - Synthetic instructions
>>> Is it still open?
>>
>> We already handle 'clr' and 'mov'. Code generation is not optimal, for
>> example arithmetic ops with constants/%g0 or things like wrpsr which
>> always does a XOR of the parameters even if they are constants or %g0.
>
> Would the synthetic ops with %g0 produce wrong results?
> Particularly I'm interested if
>
> jmp     %l1, %g4, %g0
>
> may behave other than on a real hw.

No, if rd is %g0, the current PC will not be written anywhere (not by
real HW either). This is handled by the call to gen_movl_TN_reg() at
translate.c:4041, which suppresses the move if rd is zero. The same
applies to other rd writebacks.

This class of instructions ("op %x, %y, %g0") should be close to
optimal, except for degenerate cases like "and %g1, 1, %g0", which
should not generate any code at all. "andcc %g1, 1, %g0" should
generate code, and it does.
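
The %g0 rule can be sketched as a tiny C model. It is a hedged
illustration only: ToyRegs, toy_get() and toy_and_imm() are
hypothetical names, and the register file is flat (no windows). Reads
of %g0 yield zero, a write to rd = %g0 is dropped, and the cc variant
still updates the condition codes, which is why "andcc" with rd = %g0
is not a no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t r[32];   /* flat register file; window handling left out */
    bool zero_flag;   /* Z bit of the integer condition codes */
} ToyRegs;

/* Reads of register 0 (%g0) always return zero. */
static uint32_t toy_get(const ToyRegs *s, unsigned reg)
{
    assert(reg < 32);
    return reg == 0 ? 0 : s->r[reg];
}

/* "and" / "andcc" with an immediate operand.
   Returns true if rd was actually written. */
static bool toy_and_imm(ToyRegs *s, unsigned rs1, uint32_t imm,
                        unsigned rd, bool set_cc)
{
    assert(rd < 32);
    uint32_t result = toy_get(s, rs1) & imm;
    if (set_cc) {
        s->zero_flag = (result == 0);   /* flags change even if rd is %g0 */
    }
    if (rd == 0) {
        return false;                   /* writeback to %g0 is suppressed */
    }
    s->r[rd] = result;
    return true;
}
```

With rd = %g0 and set_cc false the call has no visible effect at all,
which corresponds to the degenerate case that ideally generates no code.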

I meant that "wrpsr %g1, 0" (or %g0) should suppress the useless XOR
of %g1 and 0.
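
One way to picture the missing optimization is a small decision
function that a translator could consult before emitting code. This is
a sketch, not QEMU's translator: WrPlan, Operand2 and plan_wr() are
hypothetical names. It relies only on the fact that x XOR 0 equals x,
so a zero operand (immediate 0 or %g0) makes the XOR unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a translator would need to emit for "wr rs1, operand2, <state reg>". */
typedef enum {
    EMIT_XOR,            /* general case: rs1 XOR operand2 */
    EMIT_MOVE_RS1,       /* operand2 is zero: just copy rs1 */
    EMIT_MOVE_OPERAND2,  /* rs1 is %g0: just copy operand2 */
    EMIT_CONST_ZERO      /* both are zero: store constant 0 */
} WrPlan;

/* Second operand: either a register number or a signed immediate. */
typedef struct {
    bool is_imm;
    unsigned reg;   /* valid when is_imm is false */
    int32_t imm;    /* valid when is_imm is true */
} Operand2;

static bool operand_is_zero(Operand2 op)
{
    return op.is_imm ? (op.imm == 0) : (op.reg == 0);
}

static WrPlan plan_wr(unsigned rs1, Operand2 op2)
{
    assert(rs1 < 32);
    bool rs1_zero = (rs1 == 0);
    bool op2_zero = operand_is_zero(op2);

    if (rs1_zero && op2_zero) {
        return EMIT_CONST_ZERO;
    }
    if (op2_zero) {
        return EMIT_MOVE_RS1;
    }
    if (rs1_zero) {
        return EMIT_MOVE_OPERAND2;
    }
    return EMIT_XOR;
}
```

For "wrpsr %g1, 0" this model returns EMIT_MOVE_RS1, i.e. no XOR is
needed.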

>>>> - Hardware breakpoint/watchpoint support
>>> Is it still open?
>>
>> I think support for these was only found in a few CPU models, so they
>> are not used much. Nobody has also shown any interest or provided a
>> test case.
>
>
> On OBP start-up I see some "write breakpoint reg" messages in the
> debug log. Do they have to do with hardware breakpoint support?
>

Yes, or they could also be MMU breakpoint registers.

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-19 16:43     ` Blue Swirl
@ 2009-08-20  9:44       ` Artyom Tarasenko
  2009-08-20 19:15         ` Blue Swirl
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-20  9:44 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

>> Particularly I'm interested if
>>
>> jmp     %l1, %g4, %g0
>>
>> may behave other than on a real hw.
>
> No, if rd is %g0, the current PC will not be written anywhere (not by
> real HW either).

The reason I asked is that the following two pieces of code work
differently on a real and an emulated SS-5. On the real one, spacel!
does an ASI write and spacel@ does an ASI read; under qemu, spacel!
seems to do nothing and spacel@ returns its second parameter multiplied
by 4. Neither of them even tries to call an [unimplemented] ASI
operation: I ran the tests with MMU and ASI debug turned on.

Real SS-5:

ok 0 0 spacel@ .
Data Access Error
ok 0 20 spacel@ .
0
ok 12345678 0 20 spacel!
ok 0 20 spacel@ .
12345678
ok


qemu SS-5:

ok 0 0 spacel@ .
0
ok 0 20 spacel@ .
80
ok 12345678 0 20 spacel!
ok 0 20 spacel@ .
80
ok

I don't know SPARC asm well enough, but the qemu behavior seems to be
logical: in the first case I see no store op, and there are shifts
which would multiply by 4:

ok see spacel!
code spacel!
ffd26e0c     ld      [%g7], %l2
ffd26e10     add     %g7, 4, %g7
ffd26e14     ld      [%g7], %l0
ffd26e18     add     %g7, 4, %g7
ffd26e1c     sll     %g4, 2, %g4
ffd26e20     call    ffd26e24
ffd26e24     add     %g0, 14, %l1

ok ffd26e24 dis
ffd26e24     add     %g0, 14, %l1
ffd26e28     add     %o7, %l1, %l1
ffd26e2c     jmp     %l1, %g4, %g0
ffd26e30     ba      ffd26f68
ok

ok see spacel@
code spacel@
ffd26830     ld      [%g7], %l0
ffd26834     add     %g7, 4, %g7
ffd26838     sll     %g4, 2, %g4
ffd2683c     call    ffd26840
ffd26840     add     %g0, 14, %l1

ok ffd26840 dis
ffd26840     add     %g0, 14, %l1
ffd26844     add     %o7, %l1, %l1
ffd26848     jmp     %l1, %g4, %g0
ffd2684c     ba      ffd26984


The code is identical on a real and an emulated SS.

It must be the jump, which jumps differently on real hw and under
qemu. Do you see from the code where the jump would jump to, or do
you have a suggestion for how to check where it jumps to on the real
hw?
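
The jump target can be worked out from the listings. The sketch below
rests on two assumptions that the thread does not confirm: the
disassembler prints the immediate "14" in hex, and %g4 holds the ASI
number taken from the Forth stack. spacel_jmp_target() is a
hypothetical helper name. Under those assumptions the target is the
address of the call plus 0x14 plus four times the ASI, which points
into the words right after the "ba".

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the register arithmetic of the listed sequence:
     sll  %g4, 2, %g4
     call <next insn>        ; %o7 = address of the call itself
     add  %g0, 14, %l1       ; immediate read as hex 0x14 (assumption)
     add  %o7, %l1, %l1
     jmp  %l1, %g4, %g0
*/
static uint32_t spacel_jmp_target(uint32_t call_pc, uint32_t asi)
{
    assert((call_pc & 3) == 0);     /* instructions are word aligned */
    uint32_t o7 = call_pc;          /* call stores its own address in %o7 */
    uint32_t l1 = 0x14;             /* add %g0, 14, %l1 */
    l1 = o7 + l1;                   /* add %o7, %l1, %l1 */
    uint32_t g4 = asi << 2;         /* sll %g4, 2, %g4 */
    return l1 + g4;                 /* jmp %l1, %g4, %g0 */
}
```

For spacel! the call sits at ffd26e20, so ASI 0 gives ffd26e34, the
word directly after the "ba" at ffd26e30. For spacel@ with ASI 20 (hex)
the result is ffd268d0. The value 80 printed under qemu equals 20 hex
shifted left by 2, which fits the reading that %g4 is the shifted ASI.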

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-20  9:44       ` Artyom Tarasenko
@ 2009-08-20 19:15         ` Blue Swirl
  2009-08-21  9:58           ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Blue Swirl @ 2009-08-20 19:15 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: qemu-devel

On Thu, Aug 20, 2009 at 12:44 PM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
> It must be the jump, which jumps differently on a real hw and under
> qemu. Do you see from the code where the jump would jump to, or maybe
> you have a suggestion how to check where the jump jumps to on the real
> hw?

The target of the call instruction is also a delay slot instruction
for the call itself. Maybe this case is not handled correctly?

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-20 19:15         ` Blue Swirl
@ 2009-08-21  9:58           ` Artyom Tarasenko
  2009-08-21 12:40             ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-21  9:58 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
> Tarasenko<atar4qemu@googlemail.com> wrote:
>> It must be the jump, which jumps differently on a real hw and under
>> qemu. Do you see from the code where the jump would jump to, or maybe
>> you have a suggestion how to check where the jump jumps to on the real
>> hw?
>
> The target of the call instruction is also a delay slot instruction
> for the call itself. Maybe this case is not handled correctly?

Good idea! I don't know how to test it, though.

And what about "ba" in the delay slot of "jmp"? Is the correct
behavior described somewhere? Would the jump just be ignored? Would it
execute one instruction at the jump destination and then branch? Would
the branch be ignored?

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-21  9:58           ` Artyom Tarasenko
@ 2009-08-21 12:40             ` Artyom Tarasenko
  2009-08-21 19:45               ` Blue Swirl
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-21 12:40 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>> The target of the call instruction is also a delay slot instruction
>> for the call itself. Maybe this case is not handled correctly?
>
> Good idea! Don't know how to test it though.
>
> And what about "ba" in the delay slot of "jmp"? Is the correct
> behavior described somewhere? Would jump just be ignored? Would it
> execute one instruction on jump destination and then branch? Would
> branch be ignored?

Page 55 of The SPARC v8 Architecture Manual
(http://www.sparc.org/standards/V8.pdf) describes this case
explicitly: the CPU should execute one instruction at the jump target
and then branch. Is that what qemu currently does?

I guess gcc doesn't generate such constructs, so Linux & Co. wouldn't
notice if this behavior were not implemented as specified.
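
The behavior described there follows from the PC/nPC pair. The C
sketch below only models that pair: PcPair, step_plain(),
step_delayed_transfer() and dcti_couple_trace() are hypothetical
names, and it assumes that both transfers are taken, that the "ba" is
not annulled, and that the instruction at the jump target is not
itself a control transfer.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t pc;    /* address of the instruction being executed */
    uint32_t npc;   /* address of the instruction executed next */
} PcPair;

/* Ordinary instruction: fall through to the next word. */
static PcPair step_plain(PcPair s)
{
    PcPair next = { s.npc, s.npc + 4 };
    return next;
}

/* Taken delayed control transfer: the instruction at nPC (the delay
   slot) still executes, and only after it does control reach target. */
static PcPair step_delayed_transfer(PcPair s, uint32_t target)
{
    assert((target & 3) == 0);
    PcPair next = { s.npc, target };
    return next;
}

/* Records the four addresses executed, starting at a jmp whose delay
   slot holds a ba. */
static void dcti_couple_trace(uint32_t jmp_pc, uint32_t jmp_target,
                              uint32_t ba_target, uint32_t trace[4])
{
    PcPair s = { jmp_pc, jmp_pc + 4 };

    trace[0] = s.pc;                              /* the jmp */
    s = step_delayed_transfer(s, jmp_target);
    trace[1] = s.pc;                              /* the ba in its delay slot */
    s = step_delayed_transfer(s, ba_target);
    trace[2] = s.pc;                              /* one insn at the jmp target */
    s = step_plain(s);
    trace[3] = s.pc;                              /* then the ba target */
}
```

In this model exactly one instruction at the jump target runs before
execution continues at the branch target.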

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-21 12:40             ` Artyom Tarasenko
@ 2009-08-21 19:45               ` Blue Swirl
  2009-08-21 21:01                 ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Blue Swirl @ 2009-08-21 19:45 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: qemu-devel

On Fri, Aug 21, 2009 at 3:40 PM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
> Page 55 of The SPARC v8 Architecture Manual
> (http://www.sparc.org/standards/V8.pdf) describes this case
> explicitly:
> cpu should execute one instruction on the jump target and then branch.
>  Is it what qemu currently does?

I may be blind, but I don't see the description of this case on that
page. However, I made a small Linux test program to test it:
    .global _start
    .type _start, function
_start:
    mov 1, %o0
    call 1f
#ifdef NOP
    nop
#endif
1:   inc %o0
    mov 1, %g1
    ta 0x10

(and a BSD version:
#include <sys/syscall.h>

    .globl _start
_start:
    mov 1, %o0
    call 1f
#ifdef NOP
    nop
#endif
1:   inc %o0
    mov SYS_exit, %g1
    ta 0
)

Both QEMU and real (Sparc64) hardware exit with return value of 3, so
the inc is re-executed. If I add a nop in the call delay slot, the
return value is 2.
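
These two results are what a plain PC/nPC model predicts. The toy
interpreter below is a sketch, not QEMU code: ToyOp, ToyInsn and
toy_run() are hypothetical, addresses are instruction indices, and
OP_EXIT stands in for the "mov ..., %g1; ta ..." exit sequence,
returning %o0.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_MOV1, OP_CALL, OP_NOP, OP_INC, OP_EXIT } ToyOp;

typedef struct {
    ToyOp op;
    size_t target;   /* instruction index, used by OP_CALL only */
} ToyInsn;

/* Runs prog and returns the value of %o0 when OP_EXIT executes. */
static int toy_run(const ToyInsn *prog, size_t len)
{
    size_t pc = 0, npc = 1;
    int o0 = 0;

    for (;;) {
        assert(pc < len);
        ToyInsn insn = prog[pc];
        size_t next_npc = npc + 1;       /* default: sequential flow */

        switch (insn.op) {
        case OP_MOV1: o0 = 1; break;     /* mov 1, %o0 */
        case OP_INC:  o0 += 1; break;    /* inc %o0 */
        case OP_NOP:  break;
        case OP_CALL:                    /* delayed: only nPC changes */
            next_npc = insn.target;
            break;
        case OP_EXIT: return o0;
        }
        pc = npc;                        /* the delay slot runs first */
        npc = next_npc;
    }
}
```

When the call target is also the delay slot, the inc runs once as the
delay slot and once as the target, so the model returns 3; with a nop
in the delay slot it returns 2.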

* [Qemu-devel] Re: target-sparc/TODO
  2009-08-21 19:45               ` Blue Swirl
@ 2009-08-21 21:01                 ` Artyom Tarasenko
  2009-08-21 21:10                   ` Igor Kovalenko
  2009-08-22  6:51                   ` Blue Swirl
  0 siblings, 2 replies; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-21 21:01 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
> Tarasenko<atar4qemu@googlemail.com> wrote:
>> Page 55 of The SPARC v8 Architecture Manual
>> (http://www.sparc.org/standards/V8.pdf) describes this case
>> explicitly:
>> cpu should execute one instruction on the jump target and then branch.
>>  Is it what qemu currently does?
>
> I may be blind, I don't see the description of this case in that page.

I wasn't referring to the call case, but to the jmp+ba case (the last
two ops in the listing above). This DCTI couple is described on the
pages marked 55-56 (pages 54-54 in a pdf reader). That's the first case
in table 5-12.

> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
> the inc is re-executed. If I add a nop in the call delay slot, the
> return value is 2.

Can you make a similar test, but with ba in the jmp's delay slot?

* Re: [Qemu-devel] Re: target-sparc/TODO
  2009-08-21 21:01                 ` Artyom Tarasenko
@ 2009-08-21 21:10                   ` Igor Kovalenko
  2009-08-21 21:17                     ` Artyom Tarasenko
  2009-08-22  6:51                   ` Blue Swirl
  1 sibling, 1 reply; 18+ messages in thread
From: Igor Kovalenko @ 2009-08-21 21:10 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, qemu-devel

On Sat, Aug 22, 2009 at 1:01 AM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>> Page 55 of The SPARC v8 Architecture Manual
>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>> explicitly:
>>> cpu should execute one instruction on the jump target and then branch.
>>>  Is it what qemu currently does?
>>
>> I may be blind, I don't see the description of this case in that page.
>
> I wasn't referring the call case, but jmp+ba case (two last ops in the
> listing above). This DCTI is described on pages marked 55-56 (pages
> 54-54 in a pdf reader). That's the first case in the table 5-12.
>
>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>> the inc is re-executed. If I add a nop in the call delay slot, the
>> return value is 2.
>
> Can you make a similar test, but with ba in the jmp's delay slot?

SPARC-V8 left as undefined the result of executing a delayed
conditional branch that had a delayed control
transfer in its delay slot...

-- 
Kind regards,
Igor V. Kovalenko


* Re: [Qemu-devel] Re: target-sparc/TODO
  2009-08-21 21:10                   ` Igor Kovalenko
@ 2009-08-21 21:17                     ` Artyom Tarasenko
  0 siblings, 0 replies; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-21 21:17 UTC (permalink / raw)
  To: Igor Kovalenko; +Cc: Blue Swirl, qemu-devel

2009/8/21 Igor Kovalenko <igor.v.kovalenko@gmail.com>:
> On Sat, Aug 22, 2009 at 1:01 AM, Artyom
> Tarasenko<atar4qemu@googlemail.com> wrote:
>> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>>> Particularly I'm interested if
>>>>>>>>>
>>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>>
>>>>>>>>> may behave other than on a real hw.
>>>>>>>>
>>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not by
>>>>>>>> real HW either).
>>>>>>>
>>>>>>> The reason I asked is the two following pieces of code work
>>>>>>> differently on a real and emulated SS-5. On a real one spacel! does an
>>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel! seems
>>>>>>> to do nothing, and spacel@ returns its second parameter multiplied by
>>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>>
>>>>>>> Real SS-5:
>>>>>>>
>>>>>>> ok 0 0 spacel@ .
>>>>>>> Data Access Error
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 0
>>>>>>> ok 12345678 0 20 spacel!
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 12345678
>>>>>>> ok
>>>>>>>
>>>>>>>
>>>>>>> qemu SS-5:
>>>>>>>
>>>>>>> ok 0 0 spacel@ .
>>>>>>> 0
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 80
>>>>>>> ok 12345678 0 20 spacel!
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 80
>>>>>>> ok
>>>>>>>
>>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>>> which would multiply by 4:
>>>>>>>
>>>>>>> ok see spacel!
>>>>>>> code spacel!
>>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>>> ffd26e20     call    ffd26e24
>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>
>>>>>>> ok ffd26e24 dis
>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>>> ffd26e30     ba      ffd26f68
>>>>>>> ok
>>>>>>>
>>>>>>> ok see spacel@
>>>>>>> code spacel@
>>>>>>> ffd26830     ld      [%g7], %l0
>>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>>> ffd2683c     call    ffd26840
>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>
>>>>>>> ok ffd26840 dis
>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>>> ffd2684c     ba      ffd26984
>>>>>>>
>>>>>>>
>>>>>>> The code is identical on a real and emulated SS.
>>>>>>>
>>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>>> qemu. Do you see from the code where the jump would jump to, or maybe
>>>>>>> you have a suggestion how to check where the jump jumps to on the real
>>>>>>> hw?
>>>>>>
>>>>>> The target of the call instruction is also a delay slot instruction
>>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>>
>>>>> Good idea! Don't know how to test it though.
>>>>>
>>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>>> execute one instruction on jump destination and then branch? Would
>>>>> branch be ignored?
>>>>
>>>> Page 55 of The SPARC v8 Architecture Manual
>>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>>> explicitly:
>>>> cpu should execute one instruction on the jump target and then branch.
>>>>  Is it what qemu currently does?
>>>
>>> I may be blind, I don't see the description of this case in that page.
>>
>> I wasn't referring the call case, but jmp+ba case (two last ops in the
>> listing above). This DCTI is described on pages marked 55-56 (pages
>> 54-54 in a pdf reader). That's the first case in the table 5-12.
>>
>>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>>> the inc is re-executed. If I add a nop in the call delay slot, the
>>> return value is 2.
>>
>> Can you make a similar test, but with ba in the jmp's delay slot?
>
> SPARC-V8 left as undefined the result of executing a delayed
> conditional branch that had a delayed control
> transfer in its delay slot...

Have you taken a look at pages 55-56 (pages 54-55 in a PDF reader) of
the V8 manual? In particular, Table 5-12 (the very first case)
describes this case pretty explicitly.

Also, it would be strange if Sun relied on an undefined case in all
of their own firmware.
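
To make the firmware idiom concrete, here is a rough sketch of where the
"jmp %l1, %g4, %g0" in the quoted listings should land. It is only my
reading of the disassembly, under these assumptions: the OBP disassembler
prints immediates in hex (so "14" is 0x14), "call" leaves its own address
in %o7, and %g4 holds the ASI argument before the shift. The function name
dispatch_target is made up for illustration.

```c
#include <stdint.h>

/*
 * Address reached by "jmp %l1, %g4, %g0" in the spacel!/spacel@ listings.
 * Assumptions (not confirmed by the thread): immediates are hex,
 * %o7 holds the address of the "call", %g4 holds the ASI number.
 */
uint32_t dispatch_target(uint32_t call_addr, uint32_t asi)
{
    uint32_t l1 = call_addr + 0x14; /* add %g0, 14, %l1 ; add %o7, %l1, %l1 */
    uint32_t g4 = asi << 2;         /* sll %g4, 2, %g4: one 4-byte slot per ASI */
    return l1 + g4;                 /* jmp %l1 + %g4 */
}
```

With the call at ffd26e20 and ASI 0 this gives ffd26e34, the word right
after the "ba" at ffd26e30, which looks like a table with one instruction
per ASI. If that reading is right, the jmp+ba pair is exactly the "execute
one instruction at the jump target, then branch" case.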


* [Qemu-devel] Re: target-sparc/TODO
  2009-08-21 21:01                 ` Artyom Tarasenko
  2009-08-21 21:10                   ` Igor Kovalenko
@ 2009-08-22  6:51                   ` Blue Swirl
  2009-08-22 12:40                     ` Artyom Tarasenko
  1 sibling, 1 reply; 18+ messages in thread
From: Blue Swirl @ 2009-08-22  6:51 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: qemu-devel

On Sat, Aug 22, 2009 at 12:01 AM, Artyom
Tarasenko<atar4qemu@googlemail.com> wrote:
> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>> Particularly I'm interested if
>>>>>>>>
>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>
>>>>>>>> may behave other than on a real hw.
>>>>>>>
>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not by
>>>>>>> real HW either).
>>>>>>
>>>>>> The reason I asked is the two following pieces of code work
>>>>>> differently on a real and emulated SS-5. On a real one spacel! does an
>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel! seems
>>>>>> to do nothing, and spacel@ returns its second parameter multiplied by
>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>
>>>>>> Real SS-5:
>>>>>>
>>>>>> ok 0 0 spacel@ .
>>>>>> Data Access Error
>>>>>> ok 0 20 spacel@ .
>>>>>> 0
>>>>>> ok 12345678 0 20 spacel!
>>>>>> ok 0 20 spacel@ .
>>>>>> 12345678
>>>>>> ok
>>>>>>
>>>>>>
>>>>>> qemu SS-5:
>>>>>>
>>>>>> ok 0 0 spacel@ .
>>>>>> 0
>>>>>> ok 0 20 spacel@ .
>>>>>> 80
>>>>>> ok 12345678 0 20 spacel!
>>>>>> ok 0 20 spacel@ .
>>>>>> 80
>>>>>> ok
>>>>>>
>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>> which would multiply by 4:
>>>>>>
>>>>>> ok see spacel!
>>>>>> code spacel!
>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>> ffd26e20     call    ffd26e24
>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>
>>>>>> ok ffd26e24 dis
>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>> ffd26e30     ba      ffd26f68
>>>>>> ok
>>>>>>
>>>>>> ok see spacel@
>>>>>> code spacel@
>>>>>> ffd26830     ld      [%g7], %l0
>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>> ffd2683c     call    ffd26840
>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>
>>>>>> ok ffd26840 dis
>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>> ffd2684c     ba      ffd26984
>>>>>>
>>>>>>
>>>>>> The code is identical on a real and emulated SS.
>>>>>>
>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>> qemu. Do you see from the code where the jump would jump to, or maybe
>>>>>> you have a suggestion how to check where the jump jumps to on the real
>>>>>> hw?
>>>>>
>>>>> The target of the call instruction is also a delay slot instruction
>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>
>>>> Good idea! Don't know how to test it though.
>>>>
>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>> execute one instruction on jump destination and then branch? Would
>>>> branch be ignored?
>>>
>>> Page 55 of The SPARC v8 Architecture Manual
>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>> explicitly:
>>> cpu should execute one instruction on the jump target and then branch.
>>>  Is it what qemu currently does?
>>
>> I may be blind, I don't see the description of this case in that page.
>
> I wasn't referring the call case, but jmp+ba case (two last ops in the
> listing above). This DCTI is described on pages marked 55-56 (pages
> 54-54 in a pdf reader). That's the first case in the table 5-12.
>
>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>> the inc is re-executed. If I add a nop in the call delay slot, the
>> return value is 2.
>
> Can you make a similar test, but with ba in the jmp's delay slot?

Now we have found a bug! The following program exits with 2 on real
hardware for both the -UBA and -DBA versions, but on QEMU it exits
with 0 for -UBA (and 2 for -DBA)!

#ifdef __OpenBSD__
#include <sys/syscall.h>
#endif

    .globl _start
_start:
    clr %o0
#ifdef BA
    ba first
#else
    set first, %g1
    jmp %g1
#endif
     ba second
    /* should not be executed: */
    or %o0, 1, %o0
second:
#ifdef __OpenBSD__
    mov SYS_exit, %g1
    ta 0
#else
    mov 1, %g1
    ta 0x10
#endif
first:
    or %o0, 2, %o0
    /* should not be executed: */
    ba second
     or %o0, 4, %o0

qemu.log reveals that in the -UBA case, instead of 'or %o0, 2, %o0',
the first 'clr %o0' is executed:
IN:
0x00010054:  mov  %g0, %o0
0x00010058:  sethi  %hi(0x10000), %g1
0x0001005c:  or  %g1, 0x74, %g1 ! 0x10074
0x00010060:  jmp  %g1
0x00010064:  b  0x1006c

--------------
IN:
0x00010054:  mov  %g0, %o0

--------------
IN:
0x0001006c:  mov  1, %g1
0x00010070:  ta  0x10

For -DBA the log is OK:
IN:
0x00010054:  mov  %g0, %o0
0x00010058:  b  0x1006c
0x0001005c:  b  0x10064

--------------
IN:
0x0001006c:  or  %o0, 2, %o0

--------------
IN:
0x00010064:  mov  1, %g1
0x00010068:  ta  0x10
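
For reference, the expected result can be reproduced with a small PC/nPC
model. This is only a sketch, not QEMU code: the instruction kinds, the
struct and the function names are invented here, the addresses are taken
from the logs above, and only the effect on %o0 is modelled. Every control
transfer is treated as a non-annulled delayed transfer that replaces nPC.

```c
#include <stdint.h>

enum kind { OTHER, CLR, OR_IMM, JUMP, EXIT };

struct insn {
    enum kind kind;
    uint32_t arg;   /* OR_IMM: immediate, JUMP: target address */
};

#define BASE 0x10054u

/* Layout of the -UBA build, starting at 0x10054. */
static const struct insn prog_uba[] = {
    { CLR, 0 },           /* 0x10054 clr %o0            */
    { OTHER, 0 },         /* 0x10058 sethi              */
    { OTHER, 0 },         /* 0x1005c or %g1, 0x74, %g1  */
    { JUMP, 0x10074 },    /* 0x10060 jmp %g1            */
    { JUMP, 0x1006c },    /* 0x10064 ba second          */
    { OR_IMM, 1 },        /* 0x10068 or %o0, 1, %o0     */
    { OTHER, 0 },         /* 0x1006c mov 1, %g1         */
    { EXIT, 0 },          /* 0x10070 ta 0x10            */
    { OR_IMM, 2 },        /* 0x10074 first: or 2        */
    { JUMP, 0x1006c },    /* 0x10078 ba second          */
    { OR_IMM, 4 },        /* 0x1007c or %o0, 4, %o0     */
};

/* Layout of the -DBA build, starting at 0x10054. */
static const struct insn prog_dba[] = {
    { CLR, 0 },           /* 0x10054 clr %o0            */
    { JUMP, 0x1006c },    /* 0x10058 ba first           */
    { JUMP, 0x10064 },    /* 0x1005c ba second          */
    { OR_IMM, 1 },        /* 0x10060 or %o0, 1, %o0     */
    { OTHER, 0 },         /* 0x10064 mov 1, %g1         */
    { EXIT, 0 },          /* 0x10068 ta 0x10            */
    { OR_IMM, 2 },        /* 0x1006c first: or 2        */
    { JUMP, 0x10064 },    /* 0x10070 ba second          */
    { OR_IMM, 4 },        /* 0x10074 or %o0, 4, %o0     */
};

/* Returns %o0 at the exit trap, or -1 if execution leaves the program. */
static int run(const struct insn *prog, unsigned n)
{
    uint32_t pc = BASE, npc = BASE + 4, o0 = 0;

    for (int steps = 0; steps < 64; steps++) {
        unsigned idx = (pc - BASE) / 4;
        if (idx >= n)
            return -1;

        uint32_t next_npc = npc + 4;       /* default: fall through */
        switch (prog[idx].kind) {
        case CLR:    o0 = 0;                  break;
        case OR_IMM: o0 |= prog[idx].arg;     break;
        case JUMP:   next_npc = prog[idx].arg; break; /* delayed transfer */
        case EXIT:   return (int)o0;
        default:     break;
        }
        pc = npc;                          /* the delay slot runs next */
        npc = next_npc;
    }
    return -1;
}

int exit_code_uba(void) { return run(prog_uba, sizeof prog_uba / sizeof prog_uba[0]); }
int exit_code_dba(void) { return run(prog_dba, sizeof prog_dba / sizeof prog_dba[0]); }
```

Both variants come out as 2 in this model, matching the real hardware:
the second transfer sits in the delay slot of the first, one instruction
at "first" is executed, and then control continues at "second".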


* [Qemu-devel] Re: target-sparc/TODO
  2009-08-22  6:51                   ` Blue Swirl
@ 2009-08-22 12:40                     ` Artyom Tarasenko
  2009-08-22 13:30                       ` Robert Reif
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-22 12:40 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2009/8/22 Blue Swirl <blauwirbel@gmail.com>:
> On Sat, Aug 22, 2009 at 12:01 AM, Artyom
> Tarasenko<atar4qemu@googlemail.com> wrote:
>> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>>> Particularly I'm interested if
>>>>>>>>>
>>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>>
>>>>>>>>> may behave other than on a real hw.
>>>>>>>>
>>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not by
>>>>>>>> real HW either).
>>>>>>>
>>>>>>> The reason I asked is the two following pieces of code work
>>>>>>> differently on a real and emulated SS-5. On a real one spacel! does an
>>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel! seems
>>>>>>> to do nothing, and spacel@ returns its second parameter multiplied by
>>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>>
>>>>>>> Real SS-5:
>>>>>>>
>>>>>>> ok 0 0 spacel@ .
>>>>>>> Data Access Error
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 0
>>>>>>> ok 12345678 0 20 spacel!
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 12345678
>>>>>>> ok
>>>>>>>
>>>>>>>
>>>>>>> qemu SS-5:
>>>>>>>
>>>>>>> ok 0 0 spacel@ .
>>>>>>> 0
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 80
>>>>>>> ok 12345678 0 20 spacel!
>>>>>>> ok 0 20 spacel@ .
>>>>>>> 80
>>>>>>> ok
>>>>>>>
>>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>>> which would multiply by 4:
>>>>>>>
>>>>>>> ok see spacel!
>>>>>>> code spacel!
>>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>>> ffd26e20     call    ffd26e24
>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>
>>>>>>> ok ffd26e24 dis
>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>>> ffd26e30     ba      ffd26f68
>>>>>>> ok
>>>>>>>
>>>>>>> ok see spacel@
>>>>>>> code spacel@
>>>>>>> ffd26830     ld      [%g7], %l0
>>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>>> ffd2683c     call    ffd26840
>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>
>>>>>>> ok ffd26840 dis
>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>>> ffd2684c     ba      ffd26984
>>>>>>>
>>>>>>>
>>>>>>> The code is identical on a real and emulated SS.
>>>>>>>
>>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>>> qemu. Do you see from the code where the jump would jump to, or maybe
>>>>>>> you have a suggestion how to check where the jump jumps to on the real
>>>>>>> hw?
>>>>>>
>>>>>> The target of the call instruction is also a delay slot instruction
>>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>>
>>>>> Good idea! Don't know how to test it though.
>>>>>
>>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>>> execute one instruction on jump destination and then branch? Would
>>>>> branch be ignored?
>>>>
>>>> Page 55 of The SPARC v8 Architecture Manual
>>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>>> explicitly:
>>>> cpu should execute one instruction on the jump target and then branch.
>>>>  Is it what qemu currently does?
>>>
>>> I may be blind, I don't see the description of this case in that page.
>>
>> I wasn't referring the call case, but jmp+ba case (two last ops in the
>> listing above). This DCTI is described on pages marked 55-56 (pages
>> 54-54 in a pdf reader). That's the first case in the table 5-12.
>>
>>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>>> the inc is re-executed. If I add a nop in the call delay slot, the
>>> return value is 2.
>>
>> Can you make a similar test, but with ba in the jmp's delay slot?
>
> Now, we have found a bug!

Looks better now!

Probing Memory Bank #0 64 Megabytes of DRAM
Probing Memory Bank #1 64 Megabytes of DRAM
Probing Memory Bank #2 64 Megabytes of DRAM
Probing Memory Bank #3 64 Megabytes of DRAM
Probing Memory Bank #4 Data Access Error
ok

It used to say "Nothing there". What is a bit surprising is that only
256 MiB is found when started with -m 512. Maybe this is the next bug,
this time in ASI.

With -m 128 :
Probing Memory Bank #0 64 Megabytes of DRAM
Probing Memory Bank #1 64 Megabytes of DRAM
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Nothing there
Probing Memory Bank #4 Data Access Error
ok
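
The arithmetic behind the 256 MiB remark, as a tiny sketch. It assumes
every bank is 64 MB, as the probe output reports, and that the -m value
would be spread over consecutive banks; how many banks the machine
actually has is not something I know. The function names are made up.

```c
/* Bank size reported by the probe output, in megabytes. */
#define BANK_MB 64u

/* Banks needed to hold ram_mb megabytes, rounding up to whole banks. */
unsigned banks_needed(unsigned ram_mb)
{
    return (ram_mb + BANK_MB - 1) / BANK_MB;
}

/* Memory accounted for by a given number of successfully probed banks. */
unsigned probed_mb(unsigned banks_found)
{
    return banks_found * BANK_MB;
}
```

So -m 512 would need 8 banks under these assumptions, while the probe
reports 4 banks before the Data Access Error on bank #4. With -m 128 the
2 banks needed are the 2 banks found.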


* Re: [Qemu-devel] Re: target-sparc/TODO
  2009-08-22 12:40                     ` Artyom Tarasenko
@ 2009-08-22 13:30                       ` Robert Reif
  2009-08-22 17:25                         ` Artyom Tarasenko
  0 siblings, 1 reply; 18+ messages in thread
From: Robert Reif @ 2009-08-22 13:30 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5566 bytes --]

Artyom Tarasenko wrote:
> 2009/8/22 Blue Swirl <blauwirbel@gmail.com>:
>   
>> On Sat, Aug 22, 2009 at 12:01 AM, Artyom
>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>     
>>> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>>>       
>>>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>         
>>>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>>>           
>>>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>>>             
>>>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>               
>>>>>>>>>> Particularly I'm interested if
>>>>>>>>>>
>>>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>>>
>>>>>>>>>> may behave other than on a real hw.
>>>>>>>>>>                     
>>>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not by
>>>>>>>>> real HW either).
>>>>>>>>>                   
>>>>>>>> The reason I asked is the two following pieces of code work
>>>>>>>> differently on a real and emulated SS-5. On a real one spacel! does an
>>>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel! seems
>>>>>>>> to do nothing, and spacel@ returns its second parameter multiplied by
>>>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>>>
>>>>>>>> Real SS-5:
>>>>>>>>
>>>>>>>> ok 0 0 spacel@ .
>>>>>>>> Data Access Error
>>>>>>>> ok 0 20 spacel@ .
>>>>>>>> 0
>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>> ok 0 20 spacel@ .
>>>>>>>> 12345678
>>>>>>>> ok
>>>>>>>>
>>>>>>>>
>>>>>>>> qemu SS-5:
>>>>>>>>
>>>>>>>> ok 0 0 spacel@ .
>>>>>>>> 0
>>>>>>>> ok 0 20 spacel@ .
>>>>>>>> 80
>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>> ok 0 20 spacel@ .
>>>>>>>> 80
>>>>>>>> ok
>>>>>>>>
>>>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>>>> which would multiply by 4:
>>>>>>>>
>>>>>>>> ok see spacel!
>>>>>>>> code spacel!
>>>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>>>> ffd26e20     call    ffd26e24
>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>>
>>>>>>>> ok ffd26e24 dis
>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>>>> ffd26e30     ba      ffd26f68
>>>>>>>> ok
>>>>>>>>
>>>>>>>> ok see spacel@
>>>>>>>> code spacel@
>>>>>>>> ffd26830     ld      [%g7], %l0
>>>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>>>> ffd2683c     call    ffd26840
>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>>
>>>>>>>> ok ffd26840 dis
>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>>>> ffd2684c     ba      ffd26984
>>>>>>>>
>>>>>>>>
>>>>>>>> The code is identical on a real and emulated SS.
>>>>>>>>
>>>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>>>> qemu. Do you see from the code where the jump would jump to, or maybe
>>>>>>>> you have a suggestion how to check where the jump jumps to on the real
>>>>>>>> hw?
>>>>>>>>                 
>>>>>>> The target of the call instruction is also a delay slot instruction
>>>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>>>>               
>>>>>> Good idea! Don't know how to test it though.
>>>>>>
>>>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>>>> execute one instruction on jump destination and then branch? Would
>>>>>> branch be ignored?
>>>>>>             
>>>>> Page 55 of The SPARC v8 Architecture Manual
>>>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>>>> explicitly:
>>>>> cpu should execute one instruction on the jump target and then branch.
>>>>>  Is it what qemu currently does?
>>>>>           
>>>> I may be blind, I don't see the description of this case in that page.
>>>>         
>>> I wasn't referring the call case, but jmp+ba case (two last ops in the
>>> listing above). This DCTI is described on pages marked 55-56 (pages
>>> 54-54 in a pdf reader). That's the first case in the table 5-12.
>>>
>>>       
>>>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>>>> the inc is re-executed. If I add a nop in the call delay slot, the
>>>> return value is 2.
>>>>         
>>> Can you make a similar test, but with ba in the jmp's delay slot?
>>>       
>> Now, we have found a bug!
>>     
>
> Looks better now!
>
> Probing Memory Bank #0 64 Megabytes of DRAM
> Probing Memory Bank #1 64 Megabytes of DRAM
> Probing Memory Bank #2 64 Megabytes of DRAM
> Probing Memory Bank #3 64 Megabytes of DRAM
> Probing Memory Bank #4 Data Access Error
> ok
>
> Used to be "Nothing there". What is a bit surprising, is that only 256
> Mib found when started with -m 512. May be a next bug, this time in
> ASI.
>
> With -m 128 :
> Probing Memory Bank #0 64 Megabytes of DRAM
> Probing Memory Bank #1 64 Megabytes of DRAM
> Probing Memory Bank #2 Nothing there
> Probing Memory Bank #3 Nothing there
> Probing Memory Bank #4 Data Access Error
> ok
>
>
>
>   
Here is a very old patch for the eccmemctl that fixes a bug
I found while trying to fix the memory probing problem
that is now fixed.


[-- Attachment #2: eccmemctl.diff.txt --]
[-- Type: text/plain, Size: 665 bytes --]

diff --git a/hw/eccmemctl.c b/hw/eccmemctl.c
index 28519c8..b2bff98 100644
--- a/hw/eccmemctl.c
+++ b/hw/eccmemctl.c
@@ -301,10 +301,11 @@ static void ecc_reset(void *opaque)
 
     if (s->version == ECC_MCC)
         s->regs[ECC_MER] &= ECC_MER_REU;
-    else
-        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_MRR |
-                             ECC_MER_DCI);
-    s->regs[ECC_MDR] = 0x20;
+    else {
+        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_DCI);
+        s->regs[ECC_MER] |= ECC_MER_MRR;
+    }
+    s->regs[ECC_MDR] = 0x40;
     s->regs[ECC_MFSR] = 0;
     s->regs[ECC_VCR] = 0;
     s->regs[ECC_MFAR0] = 0x07c00000;
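
The difference between the old and new reset path for ECC_MER can be
shown in isolation. This is only a sketch: the MER_* masks below are
placeholder values chosen for illustration, not the real ECC_MER_*
definitions from hw/eccmemctl.c, and the two helper functions are made up.

```c
#include <stdint.h>

/* Placeholder bit masks, NOT the values used in hw/eccmemctl.c. */
#define MER_VER  0x0f000000u
#define MER_IMPL 0xf0000000u
#define MER_MRR  0x000003fcu
#define MER_DCI  0x00000800u

/* Old reset: MRR bits survive only if they were already set. */
uint32_t reset_old(uint32_t mer)
{
    return mer & (MER_VER | MER_IMPL | MER_MRR | MER_DCI);
}

/* Patched reset: MRR bits are forced on regardless of the old value. */
uint32_t reset_new(uint32_t mer)
{
    mer &= (MER_VER | MER_IMPL | MER_DCI);
    mer |= MER_MRR;
    return mer;
}
```

Starting from a register with no MRR bits set, the old path leaves them
clear after reset while the patched path sets them all.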


* Re: [Qemu-devel] Re: target-sparc/TODO
  2009-08-22 13:30                       ` Robert Reif
@ 2009-08-22 17:25                         ` Artyom Tarasenko
  2009-08-22 18:46                           ` Robert Reif
  0 siblings, 1 reply; 18+ messages in thread
From: Artyom Tarasenko @ 2009-08-22 17:25 UTC (permalink / raw)
  To: Robert Reif; +Cc: Blue Swirl, qemu-devel

2009/8/22 Robert Reif <reif@earthlink.net>:
> Artyom Tarasenko wrote:
>>
>> 2009/8/22 Blue Swirl <blauwirbel@gmail.com>:
>>
>>>
>>> On Sat, Aug 22, 2009 at 12:01 AM, Artyom
>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>
>>>>
>>>> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>>>>
>>>>>
>>>>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>
>>>>>>
>>>>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>>>>
>>>>>>>
>>>>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Particularly I'm interested if
>>>>>>>>>>>
>>>>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>>>>
>>>>>>>>>>> may behave other than on a real hw.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not
>>>>>>>>>> by
>>>>>>>>>> real HW either).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The reason I asked is the two following pieces of code work
>>>>>>>>> differently on a real and emulated SS-5. On a real one spacel! does
>>>>>>>>> an
>>>>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel!
>>>>>>>>> seems
>>>>>>>>> to do nothing, and spacel@ returns its second parameter multiplied
>>>>>>>>> by
>>>>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>>>>
>>>>>>>>> Real SS-5:
>>>>>>>>>
>>>>>>>>> ok 0 0 spacel@ .
>>>>>>>>> Data Access Error
>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>> 0
>>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>> 12345678
>>>>>>>>> ok
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> qemu SS-5:
>>>>>>>>>
>>>>>>>>> ok 0 0 spacel@ .
>>>>>>>>> 0
>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>> 80
>>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>> 80
>>>>>>>>> ok
>>>>>>>>>
>>>>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>>>>> which would multiply by 4:
>>>>>>>>>
>>>>>>>>> ok see spacel!
>>>>>>>>> code spacel!
>>>>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>>>>> ffd26e20     call    ffd26e24
>>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>>>
>>>>>>>>> ok ffd26e24 dis
>>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>>>>> ffd26e30     ba      ffd26f68
>>>>>>>>> ok
>>>>>>>>>
>>>>>>>>> ok see spacel@
>>>>>>>>> code spacel@
>>>>>>>>> ffd26830     ld      [%g7], %l0
>>>>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>>>>> ffd2683c     call    ffd26840
>>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>>>
>>>>>>>>> ok ffd26840 dis
>>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>>>>> ffd2684c     ba      ffd26984
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The code is identical on a real and emulated SS.
>>>>>>>>>
>>>>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>>>>> qemu. Do you see from the code where the jump would jump to, or
>>>>>>>>> maybe
>>>>>>>>> you have a suggestion how to check where the jump jumps to on the
>>>>>>>>> real
>>>>>>>>> hw?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The target of the call instruction is also a delay slot instruction
>>>>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>>>>>
>>>>>>>
>>>>>>> Good idea! Don't know how to test it though.
>>>>>>>
>>>>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>>>>> execute one instruction on jump destination and then branch? Would
>>>>>>> branch be ignored?
>>>>>>>
>>>>>>
>>>>>> Page 55 of The SPARC v8 Architecture Manual
>>>>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>>>>> explicitly:
>>>>>> cpu should execute one instruction on the jump target and then branch.
>>>>>>  Is it what qemu currently does?
>>>>>>
>>>>>
>>>>> I may be blind, I don't see the description of this case in that page.
>>>>>
>>>>
>>>> I wasn't referring the call case, but jmp+ba case (two last ops in the
>>>> listing above). This DCTI is described on pages marked 55-56 (pages
>>>> 54-54 in a pdf reader). That's the first case in the table 5-12.
>>>>
>>>>
>>>>>
>>>>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>>>>> the inc is re-executed. If I add a nop in the call delay slot, the
>>>>> return value is 2.
>>>>>
>>>>
>>>> Can you make a similar test, but with ba in the jmp's delay slot?
>>>>
>>>
>>> Now, we have found a bug!
>>>
>>
>> Looks better now!
>>
>> Probing Memory Bank #0 64 Megabytes of DRAM
>> Probing Memory Bank #1 64 Megabytes of DRAM
>> Probing Memory Bank #2 64 Megabytes of DRAM
>> Probing Memory Bank #3 64 Megabytes of DRAM
>> Probing Memory Bank #4 Data Access Error
>> ok
>>
>> Used to be "Nothing there". What is a bit surprising, is that only 256
>> Mib found when started with -m 512. May be a next bug, this time in
>> ASI.
>>
>> With -m 128 :
>> Probing Memory Bank #0 64 Megabytes of DRAM
>> Probing Memory Bank #1 64 Megabytes of DRAM
>> Probing Memory Bank #2 Nothing there
>> Probing Memory Bank #3 Nothing there
>> Probing Memory Bank #4 Data Access Error
>> ok
>>
>>
>>
>>
>
> Here is a very old patch for the eccmemctl that fixes a bug
> I found while trying to fix the memory probing problem
> that is now fixed.
>
>
> diff --git a/hw/eccmemctl.c b/hw/eccmemctl.c
> index 28519c8..b2bff98 100644
> --- a/hw/eccmemctl.c
> +++ b/hw/eccmemctl.c
> @@ -301,10 +301,11 @@ static void ecc_reset(void *opaque)
>
>     if (s->version == ECC_MCC)
>         s->regs[ECC_MER] &= ECC_MER_REU;
> -    else
> -        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_MRR |
> -                             ECC_MER_DCI);
> -    s->regs[ECC_MDR] = 0x20;
> +    else {
> +        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_DCI);
> +        s->regs[ECC_MER] |= ECC_MER_MRR;
> +    }
> +    s->regs[ECC_MDR] = 0x40;
>     s->regs[ECC_MFSR] = 0;
>     s->regs[ECC_VCR] = 0;
>     s->regs[ECC_MFAR0] = 0x07c00000;
>

That piece of code is still there, but the patch has no visible
effect on SS-5, SS-10 and SS-20:

Probing Memory Bank #0 64 Megabytes of DRAM
...
Probing Memory Bank #3 64 Megabytes of DRAM
Probing Memory Bank #4 Data Access Error
ok

Do you remember which machine/OBP it was intended for?


* Re: [Qemu-devel] Re: target-sparc/TODO
  2009-08-22 17:25                         ` Artyom Tarasenko
@ 2009-08-22 18:46                           ` Robert Reif
  0 siblings, 0 replies; 18+ messages in thread
From: Robert Reif @ 2009-08-22 18:46 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, qemu-devel

Artyom Tarasenko wrote:
> 2009/8/22 Robert Reif <reif@earthlink.net>:
>   
>> Artyom Tarasenko wrote:
>>     
>>> 2009/8/22 Blue Swirl <blauwirbel@gmail.com>:
>>>
>>>       
>>>> On Sat, Aug 22, 2009 at 12:01 AM, Artyom
>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>
>>>>         
>>>>> 2009/8/21 Blue Swirl <blauwirbel@gmail.com>:
>>>>>
>>>>>           
>>>>>> On Fri, Aug 21, 2009 at 3:40 PM, Artyom
>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>
>>>>>>             
>>>>>>> 2009/8/21 Artyom Tarasenko <atar4qemu@googlemail.com>:
>>>>>>>
>>>>>>>               
>>>>>>>> 2009/8/20 Blue Swirl <blauwirbel@gmail.com>:
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> On Thu, Aug 20, 2009 at 12:44 PM, Artyom
>>>>>>>>> Tarasenko<atar4qemu@googlemail.com> wrote:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>>>> Particularly I'm interested if
>>>>>>>>>>>>
>>>>>>>>>>>> jmp     %l1, %g4, %g0
>>>>>>>>>>>>
>>>>>>>>>>>> may behave other than on a real hw.
>>>>>>>>>>>>
>>>>>>>>>>>>                         
>>>>>>>>>>> No, if rd is %g0, the current PC will not be written anywhere (not
>>>>>>>>>>> by
>>>>>>>>>>> real HW either).
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>> The reason I asked is the two following pieces of code work
>>>>>>>>>> differently on a real and emulated SS-5. On a real one spacel! does
>>>>>>>>>> an
>>>>>>>>>> asi write, and spacel@ does an asi read, and under qemu  spacel!
>>>>>>>>>> seems
>>>>>>>>>> to do nothing, and spacel@ returns its second parameter multiplied
>>>>>>>>>> by
>>>>>>>>>> 4. Both of them don't even try to call an [unimplemented] asi
>>>>>>>>>> operation, I've runned the tests with mmu and asi debug turned on.
>>>>>>>>>>
>>>>>>>>>> Real SS-5:
>>>>>>>>>>
>>>>>>>>>> ok 0 0 spacel@ .
>>>>>>>>>> Data Access Error
>>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>>> 0
>>>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>>> 12345678
>>>>>>>>>> ok
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> qemu SS-5:
>>>>>>>>>>
>>>>>>>>>> ok 0 0 spacel@ .
>>>>>>>>>> 0
>>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>>> 80
>>>>>>>>>> ok 12345678 0 20 spacel!
>>>>>>>>>> ok 0 20 spacel@ .
>>>>>>>>>> 80
>>>>>>>>>> ok
>>>>>>>>>>
>>>>>>>>>> I don't know sparc asm good enogh, but qemu behavior seems to be
>>>>>>>>>> logical: in the first case I see no store op, and there are shifts
>>>>>>>>>> which would multiply by 4:
>>>>>>>>>>
>>>>>>>>>> ok see spacel!
>>>>>>>>>> code spacel!
>>>>>>>>>> ffd26e0c     ld      [%g7], %l2
>>>>>>>>>> ffd26e10     add     %g7, 4, %g7
>>>>>>>>>> ffd26e14     ld      [%g7], %l0
>>>>>>>>>> ffd26e18     add     %g7, 4, %g7
>>>>>>>>>> ffd26e1c     sll     %g4, 2, %g4
>>>>>>>>>> ffd26e20     call    ffd26e24
>>>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>>>>
>>>>>>>>>> ok ffd26e24 dis
>>>>>>>>>> ffd26e24     add     %g0, 14, %l1
>>>>>>>>>> ffd26e28     add     %o7, %l1, %l1
>>>>>>>>>> ffd26e2c     jmp     %l1, %g4, %g0
>>>>>>>>>> ffd26e30     ba      ffd26f68
>>>>>>>>>> ok
>>>>>>>>>>
>>>>>>>>>> ok see spacel@
>>>>>>>>>> code spacel@
>>>>>>>>>> ffd26830     ld      [%g7], %l0
>>>>>>>>>> ffd26834     add     %g7, 4, %g7
>>>>>>>>>> ffd26838     sll     %g4, 2, %g4
>>>>>>>>>> ffd2683c     call    ffd26840
>>>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>>>>
>>>>>>>>>> ok ffd26840 dis
>>>>>>>>>> ffd26840     add     %g0, 14, %l1
>>>>>>>>>> ffd26844     add     %o7, %l1, %l1
>>>>>>>>>> ffd26848     jmp     %l1, %g4, %g0
>>>>>>>>>> ffd2684c     ba      ffd26984
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The code is identical on a real and emulated SS.
>>>>>>>>>>
>>>>>>>>>> It must be the jump, which jumps differently on a real hw and under
>>>>>>>>>> qemu. Do you see from the code where the jump would jump to, or
>>>>>>>>>> maybe
>>>>>>>>>> you have a suggestion how to check where the jump jumps to on the
>>>>>>>>>> real
>>>>>>>>>> hw?
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> The target of the call instruction is also a delay slot instruction
>>>>>>>>> for the call itself. Maybe this case is not handled correctly?
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>> Good idea! Don't know how to test it though.
>>>>>>>>
>>>>>>>> And what about "ba" in the delay slot of "jmp"? Is the correct
>>>>>>>> behavior described somewhere? Would jump just be ignored? Whould it
>>>>>>>> execute one instruction on jump destination and then branch? Would
>>>>>>>> branch be ignored?
>>>>>>>>
>>>>>>>>                 
>>>>>>> Page 55 of The SPARC v8 Architecture Manual
>>>>>>> (http://www.sparc.org/standards/V8.pdf) describes this case
>>>>>>> explicitly:
>>>>>>> cpu should execute one instruction on the jump target and then branch.
>>>>>>>  Is it what qemu currently does?
>>>>>>>
>>>>>>>               
>>>>>> I may be blind, I don't see the description of this case in that page.
>>>>>>
>>>>>>             
>>>>> I wasn't referring the call case, but jmp+ba case (two last ops in the
>>>>> listing above). This DCTI is described on pages marked 55-56 (pages
>>>>> 54-54 in a pdf reader). That's the first case in the table 5-12.
>>>>>
>>>>>
>>>>>           
>>>>>> Both QEMU and real (Sparc64) hardware exit with return value of 3, so
>>>>>> the inc is re-executed. If I add a nop in the call delay slot, the
>>>>>> return value is 2.
>>>>>>
>>>>>>             
>>>>> Can you make a similar test, but with ba in the jmp's delay slot?
>>>>>
>>>>>           
>>>> Now, we have found a bug!
>>>>
>>>>         
>>> Looks better now!
>>>
>>> Probing Memory Bank #0 64 Megabytes of DRAM
>>> Probing Memory Bank #1 64 Megabytes of DRAM
>>> Probing Memory Bank #2 64 Megabytes of DRAM
>>> Probing Memory Bank #3 64 Megabytes of DRAM
>>> Probing Memory Bank #4 Data Access Error
>>> ok
>>>
>>> Used to be "Nothing there". What is a bit surprising, is that only 256
>>> Mib found when started with -m 512. May be a next bug, this time in
>>> ASI.
>>>
>>> With -m 128 :
>>> Probing Memory Bank #0 64 Megabytes of DRAM
>>> Probing Memory Bank #1 64 Megabytes of DRAM
>>> Probing Memory Bank #2 Nothing there
>>> Probing Memory Bank #3 Nothing there
>>> Probing Memory Bank #4 Data Access Error
>>> ok
>>>
>>>
>>>
>>>
>>>       
>> Here is a very old patch for the eccmemctl that fixes a bug
>> I found while trying to fix the memory probing problem
>> that is now fixed.
>>
>>
>> diff --git a/hw/eccmemctl.c b/hw/eccmemctl.c
>> index 28519c8..b2bff98 100644
>> --- a/hw/eccmemctl.c
>> +++ b/hw/eccmemctl.c
>> @@ -301,10 +301,11 @@ static void ecc_reset(void *opaque)
>>
>>     if (s->version == ECC_MCC)
>>         s->regs[ECC_MER] &= ECC_MER_REU;
>> -    else
>> -        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_MRR |
>> -                             ECC_MER_DCI);
>> -    s->regs[ECC_MDR] = 0x20;
>> +    else {
>> +        s->regs[ECC_MER] &= (ECC_MER_VER | ECC_MER_IMPL | ECC_MER_DCI);
>> +        s->regs[ECC_MER] |= ECC_MER_MRR;
>> +    }
>> +    s->regs[ECC_MDR] = 0x40;
>>     s->regs[ECC_MFSR] = 0;
>>     s->regs[ECC_VCR] = 0;
>>     s->regs[ECC_MFAR0] = 0x07c00000;
>>
>>     
>
> The piece of code is still there, but the patch makes no visible
> effect on SS-5, 10 and 20:
>
> Probing Memory Bank #0 64 Megabytes of DRAM
> ...
> Probing Memory Bank #3 64 Megabytes of DRAM
> Probing Memory Bank #4 Data Access Error
> ok
>
> Do you remember, for which machine/OBP was it intended?
>
>
>
>   
It's for the ss10/20.  I just found the bug by stepping through
the code and comparing what it was doing with the chip
manual.  I'm not surprised it didn't help.  It's just a correctness
fix.
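
To make the difference in the else branch concrete, here is a minimal
sketch of the two reset variants from the diff.  The DEMO_* bit masks
and the function names are hypothetical and chosen only for this
illustration; the real ECC_MER_* values are the ones defined in
hw/eccmemctl.c.

```c
#include <stdint.h>

/* Hypothetical bit layout, chosen only for this demo.  These are NOT
 * the real ECC_MER_* values from hw/eccmemctl.c. */
#define DEMO_MER_IMPL 0xf0000000u
#define DEMO_MER_VER  0x0f000000u
#define DEMO_MER_DCI  0x00000800u
#define DEMO_MER_MRR  0x000003fcu

/* Old behaviour: MRR is part of the keep-mask, so the field simply
 * retains whatever value it held before the reset. */
uint32_t demo_reset_old(uint32_t mer)
{
    return mer & (DEMO_MER_VER | DEMO_MER_IMPL | DEMO_MER_MRR |
                  DEMO_MER_DCI);
}

/* Patched behaviour: VER, IMPL and DCI are preserved, and the MRR
 * field is forced to all ones regardless of its previous value. */
uint32_t demo_reset_new(uint32_t mer)
{
    mer &= (DEMO_MER_VER | DEMO_MER_IMPL | DEMO_MER_DCI);
    mer |= DEMO_MER_MRR;
    return mer;
}
```

With a register value whose MRR field is clear, the old variant leaves
the field clear while the patched variant sets it, so the two only
differ for guests that actually read that field after reset.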

You have to be careful when specifying a memory size on
the ss10/20 because a SIMM slot uses 64M regardless of
the actual size of the SIMM.  A smaller SIMM will be mirrored
to fill the entire space, and OBP checks for mirroring to
determine the actual size.  QEMU doesn't do any mirroring
for small SIMMs, which will confuse OBP.  I don't think
that is the issue you are seeing, but it's something to keep in mind.
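
As a rough illustration of the mirroring check described above, the
following sketch models a slot whose smaller module is mirrored across
the whole slot and probes for the point where a write wraps around to
offset 0.  Everything here (struct demo_slot, demo_probe_size, the
scaled-down unit count) is a hypothetical model, not OBP's actual
algorithm and not QEMU code.

```c
#include <stddef.h>
#include <stdint.h>

/* One slot always decodes 64 units (think of a unit as 1M). */
#define DEMO_SLOT_UNITS 64u

/* Hypothetical slot model: a module of simm_units (a non-zero power
 * of two, at most DEMO_SLOT_UNITS) is mirrored across the slot. */
struct demo_slot {
    uint32_t cells[DEMO_SLOT_UNITS];
    size_t simm_units;
};

/* Accesses beyond the module size alias back into the module. */
uint32_t demo_slot_read(const struct demo_slot *s, size_t off)
{
    return s->cells[off % s->simm_units];
}

void demo_slot_write(struct demo_slot *s, size_t off, uint32_t v)
{
    s->cells[off % s->simm_units] = v;
}

/* Write a marker at offset 0, then write a different pattern at each
 * power-of-two offset.  The first offset whose write shows up at
 * offset 0 is where the mirror begins, i.e. the module size. */
size_t demo_probe_size(struct demo_slot *s)
{
    demo_slot_write(s, 0, 0xA5A5A5A5u);
    for (size_t size = 1; size < DEMO_SLOT_UNITS; size <<= 1) {
        demo_slot_write(s, size, 0x5A5A5A5Au);
        if (demo_slot_read(s, 0) == 0x5A5A5A5Au) {
            return size;            /* write wrapped around */
        }
    }
    return DEMO_SLOT_UNITS;         /* no aliasing: full-size module */
}
```

A probe of this kind only finds a smaller size if the emulated slot
really aliases; without mirroring it never observes the wraparound,
which is one way a small SIMM could confuse the firmware.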


* [Qemu-devel] target-sparc/TODO
@ 2010-08-20 19:59 Artyom Tarasenko
  0 siblings, 0 replies; 18+ messages in thread
From: Artyom Tarasenko @ 2010-08-20 19:59 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

2010/8/19 Blue Swirl <blauwirbel@gmail.com>:

>> Is there a list of what is missing? You mentioned the pci bridges
>> before. What else?
>> Shall we maintain the TODO file for everything we find? Including
>> things we may fix within a few days after finding?
>
> Actually, we have a TODO file, see under target-sparc. Thanks for
> reminding me. ;-)

I know that we have it. I asked you about a year ago whether the
things in it were up to date. ;-)
What I mean is that we should really update it every time we find or
fix something missing. For example, here is my current list of sun4m
features/fixes which may or may not be implemented in the near or far
future:

- esp: improve "Select with Attention" / "Select without Attention",
so that it's compatible with NetBSD 1.6-3.0.
- mxcc: implement missing registers so that SS-10/SS-20 OBPs would
work with the default CPU model
- le: improve it so that it passes the OBP loopback test. This would
allow network boot (which is the default option for OBP when running
under qemu because the NVRAM contents are invalid).
- nvram: add the possibility to back it up with a file.
- add option ROM file loading for the graphics card or for any SBus
slot (the second variant is preferable because it can be used to
provide the SS-20 OBP with a fake or real dbri ROM).
- slavio_timer: make it closer to the real HW than to the specification.
The NeXTStep OS relies on the register contents after the mode change
although the documentation states they are undefined.
- fix Solaris 2.2-2.5.1 boot hanging if the day of month is >20 (I am
currently working on it).

The list is unsorted. The "option rom file" point is already in the
TODO file, but without the reasoning.

I think one point in the current TODO is resolved:

- Interrupt routing does not match real HW

I think we are pretty good there, at least for the single-CPU machines.

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


end of thread, other threads:[~2010-08-20 19:59 UTC | newest]

Thread overview: 18+ messages
2009-08-17 10:52 [Qemu-devel] target-sparc/TODO Artyom Tarasenko
2009-08-17 17:35 ` [Qemu-devel] target-sparc/TODO Blue Swirl
2009-08-19 10:17   ` Artyom Tarasenko
2009-08-19 16:43     ` Blue Swirl
2009-08-20  9:44       ` Artyom Tarasenko
2009-08-20 19:15         ` Blue Swirl
2009-08-21  9:58           ` Artyom Tarasenko
2009-08-21 12:40             ` Artyom Tarasenko
2009-08-21 19:45               ` Blue Swirl
2009-08-21 21:01                 ` Artyom Tarasenko
2009-08-21 21:10                   ` Igor Kovalenko
2009-08-21 21:17                     ` Artyom Tarasenko
2009-08-22  6:51                   ` Blue Swirl
2009-08-22 12:40                     ` Artyom Tarasenko
2009-08-22 13:30                       ` Robert Reif
2009-08-22 17:25                         ` Artyom Tarasenko
2009-08-22 18:46                           ` Robert Reif
  -- strict thread matches above, loose matches on Subject: below --
2010-08-20 19:59 [Qemu-devel] target-sparc/TODO Artyom Tarasenko
