Hi all!
From the technical documentation (http://www.usenix.org/publications/library/proceedings/usenix05/tech/freenix/bellard.html)
I read:
The first step is to split each target CPU
instruction into fewer simpler instructions called micro
operations. Each micro operation is implemented by a small
piece of C code. This small C source code is compiled by GCC to an
object file. The micro operations are chosen so that their number
is much smaller (typically a few hundreds) than all the
combinations of instructions and operands of the target CPU. The
translation from target CPU instructions to micro operations is
done entirely with hand coded code.
A compile time tool called dyngen
uses the object file containing the micro operations as input to
generate a dynamic code generator. This dynamic code generator is
invoked at runtime to generate a complete host function which
concatenates several micro operations.
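To make my reading of this concrete, here is a toy Python sketch of the idea. It is not QEMU code: the micro operation names, the `translate_add` helper and the `state` layout are all made up by me, and Python functions stand in for the pre-compiled snippets of host code that dyngen would concatenate.

```python
# Toy model of the dyngen idea. NOT QEMU code: every name here is hypothetical.
# Each "micro operation" is a tiny pre-built snippet working on shared state.

def op_movl_T0_r(state, reg):       # T0 = regs[reg]
    state["T0"] = state["regs"][reg]

def op_movl_T1_r(state, reg):       # T1 = regs[reg]
    state["T1"] = state["regs"][reg]

def op_addl_T0_T1(state, _=None):   # T0 = T0 + T1 (32-bit wraparound)
    state["T0"] = (state["T0"] + state["T1"]) & 0xFFFFFFFF

def op_movl_r_T0(state, reg):       # regs[reg] = T0
    state["regs"][reg] = state["T0"]

def translate_add(dst, src):
    """Hand-written translation of a made-up 'add dst, src' guest instruction."""
    return [
        (op_movl_T0_r, dst),
        (op_movl_T1_r, src),
        (op_addl_T0_T1, None),
        (op_movl_r_T0, dst),
    ]

def generate_host_function(micro_ops):
    """Concatenate micro operations into one callable 'host function'."""
    def host_function(state):
        for op, arg in micro_ops:
            op(state, arg)
    return host_function

state = {"regs": {"eax": 5, "ebx": 7}, "T0": 0, "T1": 0}
fn = generate_host_function(translate_add("eax", "ebx"))
fn(state)
print(state["regs"])
```

The point of the sketch is only that the expensive part (building each snippet) happens ahead of time, and the run-time step just strings snippets together.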
whereas on Wikipedia (http://en.wikipedia.org/wiki/QEMU)
and in other sources I read:
The Tiny Code Generator (TCG) aims to remove
the shortcoming of relying on a particular version of GCC or any compiler, instead
incorporating the compiler (code generator) into other tasks
performed by QEMU in run-time. The whole translation task thus
consists of two parts: blocks of target code (TBs) being
rewritten in TCG ops - a kind of machine-independent
intermediate notation, and subsequently this notation being
compiled for the host's architecture by TCG. Optional optimisation
passes are performed between them.
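As I picture it, those two parts look roughly like the toy Python sketch below. This is not the real TCG API: the guest instructions, the op names (`movi`, `addi`) and the three functions are all my own assumptions, and the "host code" is just text rather than machine code.

```python
# Toy two-stage translator. NOT the real TCG API: every name is hypothetical.

def frontend(guest_block):
    """Part 1: rewrite guest instructions as machine-independent ops."""
    ops = []
    for insn, dst, src in guest_block:
        if insn == "li":            # load immediate
            ops.append(("movi", dst, src))
        elif insn == "addi":        # add immediate to a register
            ops.append(("addi", dst, dst, src))
        else:
            raise ValueError(f"unknown guest instruction: {insn}")
    return ops

def optimize(ops):
    """Optional pass: fold 'movi d, a' followed by 'addi d, d, b' into one movi."""
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if (op[0] == "addi" and prev and prev[0] == "movi"
                and prev[1] == op[1] == op[2]):
            out[-1] = ("movi", op[1], prev[2] + op[3])
        else:
            out.append(op)
    return out

def backend(ops):
    """Part 2: compile ops for a made-up host (emits text, not machine code)."""
    host = []
    for op in ops:
        if op[0] == "movi":
            host.append(f"mov {op[1]}, {op[2]}")
        elif op[0] == "addi":
            host.append(f"add {op[1]}, {op[3]}")
    return host

guest_block = [("li", "r1", 40), ("addi", "r1", 2), ("addi", "r2", 1)]
host_code = backend(optimize(frontend(guest_block)))
print(host_code)
```

Only `backend` would need to change for a different host, which is how I understand the "machine-independent intermediate notation" part.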
- So, I think that the technical documentation is now obsolete,
isn't it?
- The "old way" did much of the work offline (at compile time), compiling the
micro operations into host machine code, whereas, if I understand correctly,
TCG does everything at run time (please correct me if I am wrong!).
So I wonder, how can it be as fast as the previous method (or even
faster)?
- If I understand correctly, the TCG run-time flow is the following:
  - TCG takes the target binary and splits it into translation blocks (TBs),
  - if the TB is not cached, TCG translates it (or rather, the
    target instructions it is composed of) into TCG micro ops,
  - TCG compiles the TCG micro ops into host object code,
  - TCG caches the TB,
  - TCG tries to chain the block with others,
  - TCG copies the TB into the execution buffer,
  - TCG runs it.
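In case it helps to show what I mean, here is that flow as a toy Python sketch. It is not QEMU code: the guest instruction set, the `TB` class, `tb_cache` and all function names are my own inventions, and a Python closure stands in for the generated host code. The step where the TB is copied into the execution buffer has no counterpart here, since the closure is directly callable.

```python
# Toy execution loop for the flow above. NOT QEMU code: all names are hypothetical.

GUEST_CODE = [
    ("addi", "r1", 1),    # pc 0
    ("jmp", 3),           # pc 1: ends the first block
    ("addi", "r1", 100),  # pc 2: never reached
    ("addi", "r1", 10),   # pc 3
    ("halt",),            # pc 4: ends the second block
]

class TB:
    def __init__(self, pc, run):
        self.pc = pc
        self.run = run        # stand-in for the generated host code
        self.next_tb = None   # filled in when the block is chained

tb_cache = {}
translations = 0

def translate(pc):
    """Decode guest instructions from pc up to the end of the block."""
    global translations
    translations += 1
    body = []
    while True:
        insn = GUEST_CODE[pc]
        pc += 1
        if insn[0] == "addi":
            body.append((insn[1], insn[2]))
        elif insn[0] == "jmp":
            next_pc = insn[1]
            break
        elif insn[0] == "halt":
            next_pc = None
            break

    def run(regs):
        for reg, imm in body:
            regs[reg] += imm
        return next_pc
    return run

def find_or_translate(pc):
    """Look the block up in the cache and translate it only on a miss."""
    tb = tb_cache.get(pc)
    if tb is None:
        tb = TB(pc, translate(pc))
        tb_cache[pc] = tb
    return tb

def execute(start_pc, regs):
    tb = find_or_translate(start_pc)
    while True:
        next_pc = tb.run(regs)
        if next_pc is None:
            return
        if tb.next_tb is None:    # chain once, then skip the lookup
            tb.next_tb = find_or_translate(next_pc)
        tb = tb.next_tb

regs = {"r1": 0}
execute(0, regs)
execute(0, regs)   # second run reuses the cached, chained blocks
print(regs, translations)
```

The chaining here only handles a block with one fixed successor, which is enough for the example but surely simpler than the real thing.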
Am I right? Please correct me if I am wrong, as I want to use
that flow scheme to help me understand the code.
Thank you very much in advance!
Stefano B.