qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
@ 2005-05-09  0:02 Sebastian Kaliszewski
  2005-05-09  0:25 ` André Braga
  2005-05-09  0:40 ` Paul Brook
  0 siblings, 2 replies; 5+ messages in thread
From: Sebastian Kaliszewski @ 2005-05-09  0:02 UTC (permalink / raw)
  To: qemu-devel

Hello!

As I understand it, the problem with dyngen & GCC 3.4 and newer is that even
when using the following macro (line 158 of dyngen-exec.h) in op_*
functions

#define FORCE_RET() asm volatile ("");

GCC still emits multiple exit points in a function.

But did anyone try the following one:

#define FORCE_RET() asm volatile ("" : : : "memory" );

This tells GCC that the asm block clobbers arbitrary memory. If it doesn't
help, then maybe putting in a few instructions will help (increasing the
weight of the code and thus convincing the optimiser not to duplicate the asm
block)?

#define FORCE_RET() asm volatile ("nop; nop; nop; nop" : : : "memory" );

or 

#define FORCE_RET() asm volatile ("ret; ret; ret; ret" : : : "memory" );

If the above fails, then simply search the binary code for such a block
of four instructions (in the case of nops it'd be 0x90909090; in the case of
rets, I don't remember). It's nearly impossible that such an immediate value
would appear inside op_* code, so the only real possibility is a FORCE_RET()
occurrence. (There is also a slim possibility that such code would be an
alignment fill block, but AFAIR gcc is instructed there not to align code, and
AFAIR gcc would not use a block of four one-byte nops anyway; it uses longer
nops in such cases.) Replacing such nops inside the blocks with jumps to the
end is then trivial.
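
A minimal sketch of the search described above, assuming the op_* code is
available as a plain byte buffer. The helper name find_nop_marker is
hypothetical and is not part of dyngen; it only shows the byte scan itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from dyngen): return the offset of the first
 * run of four consecutive 0x90 bytes (the x86 one-byte nop) in
 * code[0..len), or -1 if no such run exists. */
static long find_nop_marker(const uint8_t *code, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (code[i] == 0x90 && code[i + 1] == 0x90 &&
            code[i + 2] == 0x90 && code[i + 3] == 0x90)
            return (long)i;
    }
    return -1;
}
```

Note that a raw byte scan like this does not decode instructions, so it
cannot tell a real nop block from the same bytes appearing inside an
immediate operand or displacement.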

What do you think?

rgds
Sebastian Kaliszewski
-- 
"Never underestimate the power of human stupidity" -- from notebooks of L.L.


* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
  2005-05-09  0:02 [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0 Sebastian Kaliszewski
@ 2005-05-09  0:25 ` André Braga
  2005-05-09  0:40 ` Paul Brook
  1 sibling, 0 replies; 5+ messages in thread
From: André Braga @ 2005-05-09  0:25 UTC (permalink / raw)
  To: qemu-devel

Alternatively, 

volatile 
  inc ax
  dec ax
  inc ax
  dec ax

which is the same size as 4 nops (on x86 assembly), has a net result
of doing nothing (caveat interrupts/preemption), and is *absolutely
illogical* to find in any machine-generated code...

There must be some way to generate similar code on other supported
platforms (I mean: code that's extremely unlikely to be generated by a
machine but could be used as a sentinel code sequence to dyngen), but
feasibility considerations apart, I don't really think this is the
most elegant solution...




* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
  2005-05-09  0:02 [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0 Sebastian Kaliszewski
  2005-05-09  0:25 ` André Braga
@ 2005-05-09  0:40 ` Paul Brook
  2005-05-09  1:55   ` Sebastian Kaliszewski
  1 sibling, 1 reply; 5+ messages in thread
From: Paul Brook @ 2005-05-09  0:40 UTC (permalink / raw)
  To: qemu-devel, sk

On Monday 09 May 2005 01:02, Sebastian Kaliszewski wrote:
> Hello!
>
> As I understand it, the problem with dyngen & GCC 3.4 and newer is that even
> when using the following macro (line 158 of dyngen-exec.h) in op_*
> functions
>
> #define FORCE_RET() asm volatile ("");
>
> GCC still emits multiple exit points in a function.
>
> But did anyone try the following one:
>
> #define FORCE_RET() asm volatile ("" : : : "memory" );
>
> This tells GCC that the asm block clobbers arbitrary memory. If it doesn't
> help, then maybe putting in a few instructions will help (increasing the
> weight of the code and thus convincing the optimiser not to duplicate the
> asm block)?
>
> #define FORCE_RET() asm volatile ("nop; nop; nop; nop" : : : "memory" );


No. The main problem with gcc3.4 was that we weren't using FORCE_RET 
everywhere that we should. This has mostly been fixed now.


The problem with gcc4 is that -fno-reorder-blocks no longer does what we want.
There is no way to force gcc to put the end of the function at the end.

As far as gcc is concerned there's nothing special about the "end" of the 
function. gcc will turn

  if (unlikely)
    something();
  rest_of_function();
  FORCE_RET();
  return;

into

  if (unlikely) goto unlikely_code;
return_from_unlikely:
  rest_of_function();
  return;

unlikely_code:
  something();
  goto return_from_unlikely;

Making rest_of_function bigger won't help.
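
A small test case for looking at this yourself. This is only a sketch: the
names op_example, something, and slow_calls are made up for illustration,
and the FORCE_RET() here is the "memory"-clobber variant proposed earlier
in the thread rather than whatever dyngen-exec.h currently defines. Whether
the unlikely block actually gets moved past the return depends on the gcc
version and flags.

```c
/* The "memory"-clobber variant of FORCE_RET() proposed earlier in the thread. */
#define FORCE_RET() __asm__ __volatile__ ("" : : : "memory")
#define unlikely(x) __builtin_expect(!!(x), 0)

static int slow_calls;

/* Stand-in for the rarely executed path; noinline keeps it a real call. */
static __attribute__((noinline)) void something(void)
{
    slow_calls++;
}

/* Illustrative op: an unlikely branch, then the common code, then
 * FORCE_RET() immediately before the return. */
int op_example(int x)
{
    int r = x;
    if (unlikely(x < 0))
        something();
    r += 1;
    FORCE_RET();
    return r;
}
```

Compiling this with "gcc -O2 -S", with and without -fno-reorder-blocks, and
comparing where the call to something() and the ret end up in the assembly
should show the layout in question.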

> If the above fails, then simply search the binary code for such a block
> of four instructions

This won't work either. The FORCE_RET is before the function epilogue, i.e.
you might have:

op_foo:
    push %ebx
    # function code
    # the assembly from FORCE_RET
    pop %ebx
    ret

If you amputate this at the FORCE_RET, the pop is never executed and you end
up with a stack overflow.

I've got a solution for x86/x86-64 that's 95% complete, using the method I 
suggested in a previous email. I hope to be submitting a patch shortly.
I expect most other hosts (particularly the RISC based ones) to be much 
simpler to fix.

Paul


* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
  2005-05-09  0:40 ` Paul Brook
@ 2005-05-09  1:55   ` Sebastian Kaliszewski
  2005-05-09  2:33     ` Paul Brook
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Kaliszewski @ 2005-05-09  1:55 UTC (permalink / raw)
  To: qemu-devel

So the idea was indeed stupid :)

On Monday 09 May 2005 02:40, Paul Brook wrote:
> No. The main problem with gcc3.4 was that we weren't using FORCE_RET
> everywhere that we should. This has mostly been fixed now.

I see...

[snip]
>
> I've got a solution for x86/x86-64 that's 95% complete, using the method
> I suggested in a previous email. 

So, since ret is too small to be replaced with a jump, you relocate the block
following the ret a few bytes further, and retarget all relevant jumps?

> I hope to be submitting a patch shortly.
> I expect most other hosts (particularly the RISC based ones) to be much
> simpler to fix.

Nice.

rgds
Sebastian Kaliszewski


* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
  2005-05-09  1:55   ` Sebastian Kaliszewski
@ 2005-05-09  2:33     ` Paul Brook
  0 siblings, 0 replies; 5+ messages in thread
From: Paul Brook @ 2005-05-09  2:33 UTC (permalink / raw)
  To: qemu-devel, sk

On Monday 09 May 2005 02:55, Sebastian Kaliszewski wrote:
> So the idea was indeed stupid :)
>
> On Monday 09 May 2005 02:40, Paul Brook wrote:
> > No. The main problem with gcc3.4 was that we weren't using FORCE_RET
> > everywhere that we should. This has mostly been fixed now.
>
> I see...
>
> [snip]
>
> > I've got a solution for x86/x86-64 that's 95% complete, using the method
> > I suggested in a previous email.
>
> So, since ret is too small to be replaced with a jump, you relocate the
> block following the ret a few bytes further, and retarget all relevant jumps?

Almost. I relocate the instructions immediately preceding the ret. I define
FORCE_RET() as "nop; nop; nop; nop" to make sure we always have some code
that can be moved without having to relocate any jumps. In most cases dyngen
can then recognise these nop blocks and remove them from the output.
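
To illustrate only the last step (recognising the nop blocks and dropping
them from the output), here is a rough sketch. It is not the actual patch:
strip_nop_markers is a hypothetical helper, and it assumes the code is a
flat byte buffer in which the four-nop marker never occurs inside another
instruction's operand bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the real dyngen code): copy src[0..len) to dst,
 * skipping every block of four consecutive one-byte nops (0x90).
 * dst must have room for len bytes. Returns the number of bytes written. */
static size_t strip_nop_markers(uint8_t *dst, const uint8_t *src, size_t len)
{
    static const uint8_t marker[4] = {0x90, 0x90, 0x90, 0x90};
    size_t out = 0;
    size_t i = 0;

    while (i < len) {
        if (len - i >= 4 && memcmp(src + i, marker, 4) == 0) {
            i += 4;              /* drop the whole marker block */
            continue;
        }
        dst[out++] = src[i++];   /* copy everything else unchanged */
    }
    return out;
}
```

Any relative branch that crosses a removed block would need its displacement
adjusted; this sketch makes no attempt at that.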

Paul
