* [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
@ 2005-05-09 0:02 Sebastian Kaliszewski
2005-05-09 0:25 ` André Braga
2005-05-09 0:40 ` Paul Brook
0 siblings, 2 replies; 5+ messages in thread
From: Sebastian Kaliszewski @ 2005-05-09 0:02 UTC (permalink / raw)
To: qemu-devel
Hello!
As I understand it, the problem with dyngen & GCC 3.4 and newer is that even
when using the following macro (line 158 of dyngen-exec.h) in op_*
functions
#define FORCE_RET() asm volatile ("");
GCC still emits multiple exit points in a function.
But has anyone tried the following one:
#define FORCE_RET() asm volatile ("" : : : "memory" );
This tells GCC that the asm block clobbers arbitrary memory. If that doesn't
help, then maybe adding a few instructions will (increasing the weight
of the code and so convincing the optimiser not to duplicate the asm block)?
#define FORCE_RET() asm volatile ("nop; nop; nop; nop" : : : "memory" );
or
#define FORCE_RET() asm volatile ("ret; ret; ret; ret" : : : "memory" );
If the above fails, then simply search the binary code for such a block
of four instructions (in the case of nops it'd be 0x90909090, in the case of rets
-- don't remember). It's practically impossible that such an immediate value would
appear inside op_* code, so the only real possibility is a FORCE_RET()
occurrence. (There is also a slim possibility that such code would be an alignment
fill block, but AFAIR gcc is instructed there not to align code, and AFAIR
gcc would not use a block of 4 one-byte nops; it uses longer nops in such
cases.) Replacing such nops inside blocks with jumps to the end is then
trivial.
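A minimal sketch of the search described above, assuming the op's machine code is already available as a byte buffer. The helper name `find_nop_marker` is hypothetical and not part of dyngen. It matches raw bytes without decoding instructions, so it would also match a 0x90909090 immediate or alignment padding, which is exactly the risk discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of dyngen: return the offset of the first
 * run of four 0x90 bytes (the x86 one-byte nop) in code[0..len), or -1 if
 * there is none. No instruction decoding is done. */
static long find_nop_marker(const uint8_t *code, size_t len)
{
    assert(code != NULL || len == 0);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (code[i] == 0x90 && code[i + 1] == 0x90 &&
            code[i + 2] == 0x90 && code[i + 3] == 0x90)
            return (long)i;
    }
    return -1;
}
```

A caller would pass the bytes of one compiled op_* function and treat a non-negative result as the candidate FORCE_RET() position.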
What do you think?
rgds
Sebastian Kaliszewski
--
"Never underestimate the power of human stupidity" -- from notebooks of L.L.
* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
From: André Braga @ 2005-05-09 0:25 UTC (permalink / raw)
To: qemu-devel
Alternatively,
volatile
inc ax
dec ax
inc ax
dec ax
which is the same size as 4 nops (on x86 assembly), has a net result
of doing nothing (caveat interrupts/preemption), and is *absolutely
illogical* to find in any machine-generated code...
There must be some way to generate similar code on other supported
platforms (I mean: code that's extremely unlikely to be generated by a
machine but could be used as a sentinel code sequence to dyngen), but
feasibility considerations apart, I don't really think this is the
most elegant solution...
* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
From: Paul Brook @ 2005-05-09 0:40 UTC (permalink / raw)
To: qemu-devel, sk
On Monday 09 May 2005 01:02, Sebastian Kaliszewski wrote:
> Hello!
>
> As I understand it, the problem with dyngen & GCC 3.4 and newer is that even
> when using the following macro (line 158 of dyngen-exec.h) in op_*
> functions
>
> #define FORCE_RET() asm volatile ("");
>
> GCC still emits multiple exit points in a function.
>
> But has anyone tried the following one:
>
> #define FORCE_RET() asm volatile ("" : : : "memory" );
>
> This tells GCC that the asm block clobbers arbitrary memory. If that doesn't
> help, then maybe adding a few instructions will (increasing the weight
> of the code and so convincing the optimiser not to duplicate the asm block)?
>
> #define FORCE_RET() asm volatile ("nop; nop; nop; nop" : : : "memory" );
No. The main problem with gcc3.4 was that we weren't using FORCE_RET
everywhere that we should. This has mostly been fixed now.
The problem with gcc4 is that -fno-reorder-blocks no longer does what we want.
There is no way to force gcc to put the end of the function at the end.
As far as gcc is concerned there's nothing special about the "end" of the
function. gcc will turn
    if (unlikely)
        something();
    rest_of_function();
    FORCE_RET();
    return;
into
    if (unlikely) goto unlikely_code;
return_from_unlikely:
    rest_of_function();
    return;
unlikely_code:
    something();
    goto return_from_unlikely;
Making rest_of_function bigger won't help.
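For anyone who wants to look at the generated assembly, here is a compilable version of the first fragment. It is only a sketch: `op_example`, `slow_path_calls` and `accumulator` are made-up names, `__builtin_expect` stands in for the "unlikely" test, and FORCE_RET() is the empty asm quoted earlier, written with the double-underscore keywords so it also builds in strict ISO modes.

```c
#include <assert.h>

/* Empty asm statement, as in the FORCE_RET() quoted earlier in the thread. */
#define FORCE_RET() __asm__ __volatile__ ("")

/* Hypothetical stand-ins for the state a real op would touch. */
static int slow_path_calls;
static int accumulator;

static void something(void)        { slow_path_calls++; }
static void rest_of_function(void) { accumulator += 2; }

/* noinline so the function keeps its own body in the assembly output. */
__attribute__((noinline))
void op_example(int cond)
{
    if (__builtin_expect(cond, 0))   /* the "unlikely" branch */
        something();
    rest_of_function();
    FORCE_RET();
}
```

Compiling this with `gcc -O2 -S`, with and without `-fno-reorder-blocks`, and comparing where the `ret` ends up relative to the code for `something()` shows what a given gcc version does with the layout.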
> If the above fails, then simply search the binary code for such a block
> of four instructions
This won't work either. The FORCE_RET is before the function epilogue, i.e. you
might have:
op_foo:
    push %ebx
    # function code
    # the assembly from FORCE_RET
    pop %ebx
    ret
If you amputate this at the FORCE_RET, the pop is never executed and you end up
with a stack overflow.
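The stack imbalance can be illustrated with a toy model. This is not an x86 decoder and not dyngen code: the `enum insn` kinds and both helper names are invented for the illustration, and each push is assumed to take one stack slot.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction kinds, mirroring the op_foo listing above. */
enum insn { PUSH, BODY, MARKER, POP, RET };

/* Net number of stack slots still pushed after "executing" ops[0..n)
 * in order, stopping early at RET. */
static int stack_slots_left(const enum insn *ops, size_t n)
{
    int depth = 0;
    assert(ops != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ops[i] == PUSH)
            depth++;
        else if (ops[i] == POP)
            depth--;
        else if (ops[i] == RET)
            break;
    }
    return depth;
}

/* Number of instructions kept when the op is cut at the first MARKER
 * (the marker itself is dropped); n if there is no marker. */
static size_t cut_at_marker(const enum insn *ops, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (ops[i] == MARKER)
            return i;
    return n;
}
```

For the sequence PUSH, BODY, MARKER, POP, RET the full op leaves zero slots behind, while the version cut at the marker leaves one. Every execution of such a truncated op would leak that slot, which is presumably how a long chain of ops ends in the overflow.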
I've got a solution for x86/x86-64 that's 95% complete, using the method I
suggested in a previous email. I hope to be submitting a patch shortly.
I expect most other hosts (particularly the RISC based ones) to be much
simpler to fix.
Paul
* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
From: Sebastian Kaliszewski @ 2005-05-09 1:55 UTC (permalink / raw)
To: qemu-devel
So the idea was indeed stupid :)
On Monday 09 May 2005 02:40, Paul Brook wrote:
> No. The main problem with gcc3.4 was that we weren't using FORCE_RET
> everywhere that we should. This has mostly been fixed now.
I see...
[snip]
>
> I've got a solution for x86/x86-64 that's 95% complete, using the method
> I suggested in a previous email.
So, since ret is too small to be replaced with a jump, you relocate the block
following the ret a few bytes further, and retarget all relevant jumps?
> I hope to be submitting a patch shortly.
> I expect most other hosts (particularly the RISC based ones) to be much
> simpler to fix.
Nice.
rgds
Sebastian Kaliszewski
* Re: [Qemu-devel] Stupid (probably) idea wrt dyngen & gcc 3.4 & 4.0
From: Paul Brook @ 2005-05-09 2:33 UTC (permalink / raw)
To: qemu-devel, sk
On Monday 09 May 2005 02:55, Sebastian Kaliszewski wrote:
> So the idea was indeed stupid :)
>
> On Monday 09 May 2005 02:40, Paul Brook wrote:
> > No. The main problem with gcc3.4 was that we weren't using FORCE_RET
> > everywhere that we should. This has mostly been fixed now.
>
> I see...
>
> [snip]
>
> > I've got a solution for x86/x86-64 that's 95% complete, using the method
> > I suggested in a previous email.
>
> So, since ret is too small to be replaced with a jump, you relocate the block
> following the ret a few bytes further, and retarget all relevant jumps?
Almost. I relocate the instructions immediately preceding the ret. I define
FORCE_RET() as "nop; nop; nop; nop" to make sure we always have some code
that can be moved without having to relocate any jumps. In most cases dyngen
can then recognise these nop blocks, and remove them from the output.
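To illustrate why four bytes of movable padding are enough: a near `ret` is one byte (0xC3), while `jmp rel32` is five (0xE9 plus a 32-bit displacement). The sketch below handles only the simplest case, where the four nops sit directly in front of the ret, and overwrites those five bytes with a jump. The helper name `patch_ret_to_jmp` is hypothetical; this is one way the padding could be used and is not claimed to be what the patch does. When an epilogue sits between the nops and the ret, the helper simply reports that it found nothing to patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from dyngen. If code[ret_off] is a `ret`
 * (0xC3) directly preceded by four one-byte nops (0x90), overwrite those
 * five bytes with `jmp rel32` to target_off. Returns 1 if patched,
 * 0 if the pattern is not present. */
static int patch_ret_to_jmp(uint8_t *code, size_t len,
                            size_t ret_off, size_t target_off)
{
    assert(code != NULL);
    if (ret_off < 4 || ret_off >= len || target_off > len)
        return 0;
    if (code[ret_off] != 0xC3)
        return 0;
    for (size_t i = ret_off - 4; i < ret_off; i++)
        if (code[i] != 0x90)
            return 0;

    size_t jmp_off = ret_off - 4;
    /* rel32 is measured from the end of the 5-byte jmp, i.e. ret_off + 1. */
    int32_t rel = (int32_t)((int64_t)target_off - (int64_t)(ret_off + 1));
    uint32_t urel = (uint32_t)rel;
    code[jmp_off] = 0xE9;
    for (int i = 0; i < 4; i++)   /* little-endian displacement */
        code[jmp_off + 1 + i] = (uint8_t)(urel >> (8 * i));
    return 1;
}
```

Passing `target_off == len` makes the jump land just past the last byte of the op, which is where the next op would follow if ops are concatenated back to back.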
Paul