[Qemu-devel] [RFC PATCH] s390x-linux-user

qemu-devel.nongnu.org archive mirror

* [Qemu-devel] [RFC PATCH] s390x-linux-user
@ 2009-06-26 16:49 Ulrich Hecht
  2009-06-26 17:17 ` Blue Swirl
  0 siblings, 1 reply; 11+ messages in thread
From: Ulrich Hecht @ 2009-06-26 16:49 UTC (permalink / raw)
  To: qemu-devel


Hi!

Here's an alpha of the S/390 target I'm currently working on. So far, 
only the s390x-linux-user target is supported. No machine emulation, no 
31-bit or 24-bit addressing modes. The S/390 instruction set is 
gargantuan, and I implement instructions as they come along, which means 
that everything not emitted by GCC (and then some) is unimplemented. 
Nonetheless, it runs dynamically linked binaries from SLE11 and most of 
the stuff in /bin, including bash and vim. (You wouldn't believe how 
many binaries there require 128-bit floats...)

Besides the unimplemented instructions, the code still leaves a lot of 
room for improvement, especially for optimization. All condition code 
computation, for instance, is currently done in helper functions.

There is a very peculiar S/390 instruction called "EXECUTE". What it does 
is to take another instruction stored somewhere in memory, logical-OR 
the second byte of the instruction with the LSB of R0 and then execute 
the result, without changing the instruction in memory or the program 
counter. Any idea how to implement this in QEMU? Currently, I'm 
interpreting the couple of instructions that GCC uses EXECUTE with, but 
in the long run that would amount to implementing a second emulator...

CU
Uli

-- 
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)

[-- Attachment #2: s390x-linux-user.patch.gz --]
[-- Type: application/x-gzip, Size: 26899 bytes --]


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 16:49 [Qemu-devel] [RFC PATCH] s390x-linux-user Ulrich Hecht
@ 2009-06-26 17:17 ` Blue Swirl
  2009-06-26 17:40   ` Paul Brook
  2009-06-26 19:07   ` Stuart Brady
  0 siblings, 2 replies; 11+ messages in thread
From: Blue Swirl @ 2009-06-26 17:17 UTC (permalink / raw)
  To: Ulrich Hecht; +Cc: qemu-devel

On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
>  There is a very peculiar S/390 instruction called "EXECUTE". What it does
>  is to take another instruction stored somewhere in memory, logical-OR
>  the second byte of the instruction with the LSB of R0 and then execute
>  the result, without changing the instruction in memory or the program
>  counter. Any idea how to implement this in QEMU? Currently, I'm
>  interpreting the couple of instructions that GCC uses EXECUTE with, but
>  in the long run that would amount to implementing a second emulator...

Maybe something like this: Make a special TB of the EXECUTE
instruction and add LSB of R0 to TB flags for these TBs. Then you can
examine R0, OR and generate code at translation time. The TBs linking
to EXECUTE TB may need to be special too in order to track for R0.
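
A minimal sketch of that proposal, assuming a translation cache keyed on (pc, flags) and following the R0 description quoted above. The flag layout, EXECUTE_FLAG, tb_flags and TbCache are hypothetical names for illustration and are not QEMU's API; the "translated code" is just a string.

```python
from typing import Dict, List, Tuple

EXECUTE_FLAG = 1 << 8  # hypothetical flag bit: "this TB starts at an EXECUTE"


def tb_flags(regs: List[int], at_execute: bool) -> int:
    """Fold the low byte of R0 into the flags, but only for EXECUTE TBs."""
    if not at_execute:
        return 0
    return EXECUTE_FLAG | (regs[0] & 0xFF)


class TbCache:
    """Toy translation cache; real TBs hold host code, these hold a string."""

    def __init__(self) -> None:
        self.tbs: Dict[Tuple[int, int], str] = {}

    def lookup(self, pc: int, flags: int) -> str:
        key = (pc, flags)
        if key not in self.tbs:
            # Cache miss: translate once for this (pc, flags) pair.
            self.tbs[key] = self.translate(pc, flags)
        return self.tbs[key]

    @staticmethod
    def translate(pc: int, flags: int) -> str:
        if flags & EXECUTE_FLAG:
            # The OR mask is known at translation time because it is in the key.
            return f"EXECUTE at {pc:#x}, OR mask {flags & 0xFF:#04x}"
        return f"ordinary TB at {pc:#x}"
```

In this model the same EXECUTE at the same address gets one TB per distinct low byte of R0, so the mask can be treated as a constant while generating code.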


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:17 ` Blue Swirl
@ 2009-06-26 17:40   ` Paul Brook
  2009-06-26 17:46     ` Blue Swirl
  2009-06-26 19:07   ` Stuart Brady
  1 sibling, 1 reply; 11+ messages in thread
From: Paul Brook @ 2009-06-26 17:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl

On Friday 26 June 2009, Blue Swirl wrote:
> On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
> >  There is a very peculiar S/390 instruction called "EXECUTE". What it
> > does is to take another instruction stored somewhere in memory,
> > logical-OR the second byte of the instruction with the LSB of R0 and then
> > execute the result, without changing the instruction in memory or the
> > program counter. Any idea how to implement this in QEMU? Currently, I'm
> > interpreting the couple of instructions that GCC uses EXECUTE with, but
> > in the long run that would amount to implementing a second emulator...
>
> Maybe something like this: Make a special TB of the EXECUTE
> instruction and add LSB of R0 to TB flags for these TBs. Then you can
> examine R0, OR and generate code at translation time. The TBs linking
> to EXECUTE TB may need to be special too in order to track for R0.

That's not sufficient. The results also depend on the referenced instruction.

Paul


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:40   ` Paul Brook
@ 2009-06-26 17:46     ` Blue Swirl
  2009-06-26 17:59       ` Paul Brook
  0 siblings, 1 reply; 11+ messages in thread
From: Blue Swirl @ 2009-06-26 17:46 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On 6/26/09, Paul Brook <paul@codesourcery.com> wrote:
> On Friday 26 June 2009, Blue Swirl wrote:
>  > On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
>  > >  There is a very peculiar S/390 instruction called "EXECUTE". What it
>  > > does is to take another instruction stored somewhere in memory,
>  > > logical-OR the second byte of the instruction with the LSB of R0 and then
>  > > execute the result, without changing the instruction in memory or the
>  > > program counter. Any idea how to implement this in QEMU? Currently, I'm
>  > > interpreting the couple of instructions that GCC uses EXECUTE with, but
>  > > in the long run that would amount to implementing a second emulator...
>  >
>  > Maybe something like this: Make a special TB of the EXECUTE
>  > instruction and add LSB of R0 to TB flags for these TBs. Then you can
>  > examine R0, OR and generate code at translation time. The TBs linking
>  > to EXECUTE TB may need to be special too in order to track for R0.
>
>
> That's not sufficient. The results also depend on the referenced instruction.

Then add the second byte of the referenced instruction to TB flags? Or
maybe just the result of the OR operation for compactness?


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:46     ` Blue Swirl
@ 2009-06-26 17:59       ` Paul Brook
  2009-06-26 18:18         ` Paul Brook
  2009-06-26 18:22         ` Blue Swirl
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Brook @ 2009-06-26 17:59 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On Friday 26 June 2009, Blue Swirl wrote:
> On 6/26/09, Paul Brook <paul@codesourcery.com> wrote:
> > On Friday 26 June 2009, Blue Swirl wrote:
> >  > On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
> >  > >  There is a very peculiar S/390 instruction called "EXECUTE". What
> >  > > it does is to take another instruction stored somewhere in memory,
> >  > > logical-OR the second byte of the instruction with the LSB of R0 and
> >  > > then execute the result, without changing the instruction in memory
> >  > > or the program counter. Any idea how to implement this in QEMU?
> >  > > Currently, I'm interpreting the couple of instructions that GCC uses
> >  > > EXECUTE with, but in the long run that would amount to implementing
> >  > > a second emulator...
> >  >
> >  > Maybe something like this: Make a special TB of the EXECUTE
> >  > instruction and add LSB of R0 to TB flags for these TBs. Then you can
> >  > examine R0, OR and generate code at translation time. The TBs linking
> >  > to EXECUTE TB may need to be special too in order to track for R0.
> >
> > That's not sufficient. The results also depend on the referenced
> > instruction.
>
> Then add the second byte of the referenced instruction to TB flags? Or
> maybe just the result of the OR operation for compactness?

No. You need the whole instruction. Which is fetched from memory, so is not 
easily available when you're checking TB flags.
To do it this way, I think you'd need to split the instruction in two. The 
first part would load the whole instruction from memory, OR it with r0, 
store the resulting whole instruction in an internal CPU pseudo-register, 
and cause another TB lookup. The second would generate code that clears the 
pseudo-register and then executes the instruction that was stored in it.
You'd have to include the whole of the pseudo-register in TB_FLAGS, and I 
doubt you've got enough bits for that.
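
A rough model of the two-phase split, as a sketch only. Cpu.pending stands in for the pseudo-register, and the cache key carries the whole modified instruction, which is exactly the part that would not fit in real TB flags. All names here are hypothetical, "executing" an instruction is reduced to appending it to a list, and the OR mask is passed in as a plain byte rather than read from a particular register.

```python
from typing import Callable, Dict, List, Optional, Tuple

EXECUTE_LEN = 4  # EXECUTE itself is a 4-byte instruction


class Cpu:
    def __init__(self, memory: bytes) -> None:
        self.memory = memory                  # guest memory, never modified here
        self.pc = 0
        self.pending: Optional[bytes] = None  # hypothetical pseudo-register
        self.ran: List[bytes] = []            # stand-in for executing an instruction


Tb = Callable[[Cpu], None]
tb_cache: Dict[Tuple[int, Optional[bytes]], Tb] = {}


def phase_one(cpu: Cpu, target_addr: int, length: int, mask: int) -> None:
    """Load the target, OR the mask into its second byte, park the result."""
    insn = bytearray(cpu.memory[target_addr:target_addr + length])
    insn[1] |= mask & 0xFF
    cpu.pending = bytes(insn)
    # Returning here models ending the TB, which forces another TB lookup.


def translate(pc: int, pending: bytes) -> Tb:
    """Phase two: code that clears the pseudo-register and runs the instruction."""
    def tb(cpu: Cpu) -> None:
        cpu.pending = None
        cpu.ran.append(pending)
        cpu.pc = pc + EXECUTE_LEN  # continue after EXECUTE, not after the target
    return tb


def lookup_tb(cpu: Cpu) -> Tb:
    if cpu.pending is None:
        raise NotImplementedError("ordinary TBs are omitted from this sketch")
    key = (cpu.pc, cpu.pending)  # the whole pseudo-register is part of the key
    if key not in tb_cache:
        tb_cache[key] = translate(cpu.pc, cpu.pending)
    return tb_cache[key]
```

With a 6-byte target instruction the key needs 48 extra bits, which illustrates the "not enough bits" concern.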

OTOH, tweaking the TCG interface so that it works as an interpreter shouldn't 
be all that hard. It's something I've been considering doing for a while, and 
it would mean that you can build both interpreter and translator from the same 
source.

Paul


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:59       ` Paul Brook
@ 2009-06-26 18:18         ` Paul Brook
  2009-06-26 18:22         ` Blue Swirl
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Brook @ 2009-06-26 18:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl

> No. You need the whole instruction. Which is fetched from memory, so is not
> easily available when you're checking TB flags.
> To do it this way, I think you'd need to split the instruction in two. The
> first part would load the whole instruction from memory, OR it with r0,
> store the resulting whole instruction in an internal CPU pseudo-register,
> and cause another TB lookup. The second would generate code that clears the
> pseudo-register and then executes the instruction that was stored in it.
> You'd have to include the whole of the pseudo-register in TB_FLAGS, and I
> doubt you've got enough bits for that.

On second reading I've spotted a way around this. Start with the two-phase 
generation as described above, but make sure the TB is invalidated before the 
next EXECUTE instruction is run. This means that instead of the whole 
instruction in the TB flags you just need a "half way through EXECUTE" bit.

Reliably invalidating the TB may get a bit hairy, but I'm pretty sure it's 
doable.
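
The one-bit variant might look like the sketch below. It assumes the phase-two TB reads the pseudo-register at translation time and that the stale TB is dropped whenever an EXECUTE starts; doing that invalidation reliably in a real translator is the hairy part and is not modelled. Cpu, begin_execute, lookup_tb and the cache layout are hypothetical names, not QEMU's.

```python
from typing import Callable, Dict, List, Optional, Tuple

EXECUTE_LEN = 4  # EXECUTE itself is a 4-byte instruction


class Cpu:
    def __init__(self) -> None:
        self.pc = 0
        self.pending: Optional[bytes] = None  # hypothetical pseudo-register
        self.in_execute = False               # the "half way through EXECUTE" bit
        self.ran: List[bytes] = []            # stand-in for executing an instruction


Tb = Callable[[Cpu], None]
tb_cache: Dict[Tuple[int, bool], Tb] = {}


def begin_execute(cpu: Cpu, modified_insn: bytes) -> None:
    """Phase one: park the already ORed instruction and set the flag bit."""
    # Drop any phase-two TB left over from an earlier run of this EXECUTE,
    # since it was translated from a different pseudo-register value.
    tb_cache.pop((cpu.pc, True), None)
    cpu.pending = modified_insn
    cpu.in_execute = True


def translate(cpu: Cpu) -> Tb:
    if not cpu.in_execute:
        raise NotImplementedError("ordinary TBs are omitted from this sketch")
    insn = cpu.pending  # read once, at translation time
    pc = cpu.pc

    def tb(c: Cpu) -> None:
        c.pending = None
        c.in_execute = False
        c.ran.append(insn)
        c.pc = pc + EXECUTE_LEN

    return tb


def lookup_tb(cpu: Cpu) -> Tb:
    key = (cpu.pc, cpu.in_execute)  # one extra bit instead of a whole instruction
    if key not in tb_cache:
        tb_cache[key] = translate(cpu)
    return tb_cache[key]
```

The cache never holds more than one phase-two TB per EXECUTE address here, at the cost of retranslating on every EXECUTE.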

Paul


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:59       ` Paul Brook
  2009-06-26 18:18         ` Paul Brook
@ 2009-06-26 18:22         ` Blue Swirl
  2009-06-26 18:39           ` Paul Brook
  1 sibling, 1 reply; 11+ messages in thread
From: Blue Swirl @ 2009-06-26 18:22 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On 6/26/09, Paul Brook <paul@codesourcery.com> wrote:
> On Friday 26 June 2009, Blue Swirl wrote:
>  > On 6/26/09, Paul Brook <paul@codesourcery.com> wrote:
>  > > On Friday 26 June 2009, Blue Swirl wrote:
>  > >  > On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
>  > >  > >  There is a very peculiar S/390 instruction called "EXECUTE". What
>  > >  > > it does is to take another instruction stored somewhere in memory,
>  > >  > > logical-OR the second byte of the instruction with the LSB of R0 and
>  > >  > > then execute the result, without changing the instruction in memory
>  > >  > > or the program counter. Any idea how to implement this in QEMU?
>  > >  > > Currently, I'm interpreting the couple of instructions that GCC uses
>  > >  > > EXECUTE with, but in the long run that would amount to implementing
>  > >  > > a second emulator...
>  > >  >
>  > >  > Maybe something like this: Make a special TB of the EXECUTE
>  > >  > instruction and add LSB of R0 to TB flags for these TBs. Then you can
>  > >  > examine R0, OR and generate code at translation time. The TBs linking
>  > >  > to EXECUTE TB may need to be special too in order to track for R0.
>  > >
>  > > That's not sufficient. The results also depend on the referenced
>  > > instruction.
>  >
>  > Then add the second byte of the referenced instruction to TB flags? Or
>  > maybe just the result of the OR operation for compactness?
>
>
> No. You need the whole instruction. Which is fetched from memory, so is not
>  easily available when you're checking TB flags.
>  To do it this way, I think you'd need to split the instruction in two. The
>  first part would load the whole instruction from memory, OR it with r0,
>  store the resulting whole instruction in an internal CPU pseudo-register,
>  and cause another TB lookup. The second would generate code that clears the
>  pseudo-register and then executes the instruction that was stored in it.
>  You'd have to include the whole of the pseudo-register in TB_FLAGS, and I
>  doubt you've got enough bits for that.

How about cs_base then?

>  OTOH, tweaking the TCG interface so that it works as an interpreter shouldn't
>  be all that hard. It's something I've been considering to do for a while, and
>  would mean that you can build both interpreter and translator from the same
>  source.

Like by adding an interpreter TCG target? If it were in C only, it
could also serve as a portable (low performance) translator runtime.


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 18:22         ` Blue Swirl
@ 2009-06-26 18:39           ` Paul Brook
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Brook @ 2009-06-26 18:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl

> >  OTOH, tweaking the TCG interface so that it works as an interpreter
> > shouldn't be all that hard. It's something I've been considering to do
> > for a while, and would mean that you can build both interpreter and
> > translator from the same source.
>
> Like by adding an interpreter TCG target? If it were in C only, it
> could also serve as a portable (low performance) translator runtime.

There are a couple of different options.

You could spit out bytecode (or even some simplified form of an existing ISA) 
then run that through an interpreter. This gets you a portable target, and 
behaves much like a native TCG target.

The alternative is to replace TCG altogether, and have tcg_gen_* perform the 
operation immediately as the code is translated. You need a couple of tricks 
to cope with conditional branches, but as long as you don't allow loops this 
isn't too hairy. This is more invasive, but gives you a pure interpreter, so it 
may be the fastest option for heavily self-modifying guest code.
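
The two options can be contrasted with a toy model in which one translator function is written against a small "gen" interface and driven by two different backends. gen_movi and gen_add are hypothetical stand-ins for the tcg_gen_* family, not real TCG calls, and branches are left out entirely.

```python
from typing import List, Tuple

Op = Tuple  # ("movi", dst, imm) or ("add", dst, a, b)


class BytecodeBackend:
    """Option 1: record ops as bytecode, to be run later by an interpreter."""

    def __init__(self) -> None:
        self.ops: List[Op] = []

    def gen_movi(self, dst: int, imm: int) -> None:
        self.ops.append(("movi", dst, imm))

    def gen_add(self, dst: int, a: int, b: int) -> None:
        self.ops.append(("add", dst, a, b))


def run_bytecode(ops: List[Op], regs: List[int]) -> None:
    """Tiny interpreter for the recorded ops."""
    for op in ops:
        if op[0] == "movi":
            regs[op[1]] = op[2]
        elif op[0] == "add":
            regs[op[1]] = regs[op[2]] + regs[op[3]]


class ImmediateBackend:
    """Option 2: perform each operation as soon as the translator emits it."""

    def __init__(self, regs: List[int]) -> None:
        self.regs = regs

    def gen_movi(self, dst: int, imm: int) -> None:
        self.regs[dst] = imm

    def gen_add(self, dst: int, a: int, b: int) -> None:
        self.regs[dst] = self.regs[a] + self.regs[b]


def translate_insn(backend) -> None:
    """One guest instruction, written once: r2 = 5 + 7."""
    backend.gen_movi(0, 5)
    backend.gen_movi(1, 7)
    backend.gen_add(2, 0, 1)
```

The translator source is shared; only the backend decides whether the result is stored for later or acted on immediately.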

Paul


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 17:17 ` Blue Swirl
  2009-06-26 17:40   ` Paul Brook
@ 2009-06-26 19:07   ` Stuart Brady
  2009-06-26 19:24     ` Paul Brook
  2009-07-03 15:11     ` Ulrich Hecht
  1 sibling, 2 replies; 11+ messages in thread
From: Stuart Brady @ 2009-06-26 19:07 UTC (permalink / raw)
  To: qemu-devel

On Fri, Jun 26, 2009 at 08:17:42PM +0300, Blue Swirl wrote:
> On 6/26/09, Ulrich Hecht <uli@suse.de> wrote:
> > There is a very peculiar S/390 instruction called "EXECUTE". What it does
> > is to take another instruction stored somewhere in memory, logical-OR
> > the second byte of the instruction with the LSB of R0 and then execute
> > the result, without changing the instruction in memory or the program
> > counter. Any idea how to implement this in QEMU? Currently, I'm
> > interpreting the couple of instructions that GCC uses EXECUTE with, but
> > in the long run that would amount to implementing a second emulator...
> 
> Maybe something like this: Make a special TB of the EXECUTE
> instruction and add LSB of R0 to TB flags for these TBs. Then you can
> examine R0, OR and generate code at translation time. The TBs linking
> to EXECUTE TB may need to be special too in order to track for R0.

Stupid idea, I expect, but would it be possible to handle EXECUTE by 
'branching' to the 'instruction stored somewhere in memory', using one
bit to hold the state of R0, and another to indicate that the TB is a 
special EXECUTE TB (i.e. only a single instruction should be decoded,
the LSB of R0 should be ORed, and code must be generated to return to 
the 'caller'), and another bit for the state of the LSB of R0?

Presumably, SMC handling would safely deal with the memory holding that
instruction being written to.  (If all variants of S/390 need precise
SMC handling, I suppose that shouldn't be a problem?)

My only real concern would be that it must not be possible to observe
this behaviour.  (I.e. an interrupt arriving at the 'wrong' moment or 
the EXECUTEd instruction faulting must be properly handled.)

Also, if S/390 has separate read/execute page bits, would access to the
memory location in question still count as 'execution'?  I suppose this
would also be possible to work around, though...

I won't be totally surprised if someone tells me that this would be
completely unworkable, but I'd be interested in learning why, if that
is indeed the case. :-)

Cheers,
-- 
Stuart Brady


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 19:07   ` Stuart Brady
@ 2009-06-26 19:24     ` Paul Brook
  2009-07-03 15:11     ` Ulrich Hecht
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Brook @ 2009-06-26 19:24 UTC (permalink / raw)
  To: qemu-devel

> Stupid idea, I expect, but would it be possible to handle EXECUTE by
> 'branching' to the 'instruction stored somewhere in memory', using one
> bit to hold the state of R0, and another to indicate that the TB is a
> special EXECUTE TB (i.e. only a single instruction should be decoded,
> the LSB of R0 should be ORed, and code must be generated to return to
> the 'caller'), and another bit for the state of the LSB of R0?

I guess s/bit/byte/.

> Presumably, SMC handling would safely deal with the memory holding that
> instruction being written to.  (If all variants of S/390 need precise
> SMC handling, I suppose that shouldn't be a problem?)

You don't need precise SMC here. That's only required if a TB can modify 
itself.

> My only real concern would be that it must not be possible to observe
> this behaviour.  (I.e. an interrupt arriving at the 'wrong' moment or
> the EXECUTEd instruction faulting must be properly handled.)

That's easy to fix. We already do this for other targets (e.g. ARMv7-M 
exception return). You already need an extra TB flag bit to indicate that this 
is part way through an EXECUTE instruction.

> Also, if S/390 has separate read/execute page bits, would access to the
> memory location in question still count as 'execution'?  I suppose this
> would also be possible to work around, though...

This is probably the trickiest bit to get right, especially if you end up 
causing a fault.

Paul


* Re: [Qemu-devel] [RFC PATCH] s390x-linux-user
  2009-06-26 19:07   ` Stuart Brady
  2009-06-26 19:24     ` Paul Brook
@ 2009-07-03 15:11     ` Ulrich Hecht
  1 sibling, 0 replies; 11+ messages in thread
From: Ulrich Hecht @ 2009-07-03 15:11 UTC (permalink / raw)
  To: Stuart Brady; +Cc: qemu-devel

On Friday 26 June 2009, Stuart Brady wrote:
> Also, if S/390 has separate read/execute page bits, would access to
> the memory location in question still count as 'execution'?

I think so:

"The fetching of the target instruction is considered to be an 
instruction fetch for purposes of program-event recording and for 
purposes of reporting access exceptions."

BTW, I got a detail wrong in the description of EXECUTE (not in the 
code): The register used for the OR operation is encoded in the EXECUTE 
instruction, with 0 meaning "don't OR". So it can actually be any 
general-purpose register _except_ R0. See 
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/A2278324.pdf  
page 7-126.
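
A small sketch of the corrected semantics, for illustration only. It assumes the target instruction has already been fetched, takes the register number from the high nibble of the second byte of the 4-byte EXECUTE encoding, and ignores address calculation and exceptions. execute_target is a hypothetical helper name.

```python
from typing import List


def execute_target(ex_insn: bytes, regs: List[int], target: bytes) -> bytes:
    """Return the instruction EXECUTE would run; the copy in memory is untouched."""
    r1 = (ex_insn[1] >> 4) & 0xF  # register field encoded in EXECUTE itself
    if r1 == 0:
        return bytes(target)      # 0 means "don't OR"
    modified = bytearray(target)
    modified[1] |= regs[r1] & 0xFF  # OR the low byte into the second byte
    return bytes(modified)
```

For example, with register 1 ending in 0x07 and a target whose second byte is 0x00, the result has 0x07 in its second byte, while the same target executed with register field 0 comes back unchanged.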

CU
Uli

-- 
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)

