From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=57974 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PRAUS-0007Hl-V7
	for qemu-devel@nongnu.org; Fri, 10 Dec 2010 16:27:07 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stefboombastic@gmail.com>) id 1PRAUL-0002qs-0p
	for qemu-devel@nongnu.org; Fri, 10 Dec 2010 16:26:56 -0500
Received: from mail-wy0-f173.google.com ([74.125.82.173]:39939)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stefboombastic@gmail.com>) id 1PRAUK-0002qS-OW
	for qemu-devel@nongnu.org; Fri, 10 Dec 2010 16:26:48 -0500
Received: by wyg36 with SMTP id 36so4238107wyg.4
	for <qemu-devel@nongnu.org>; Fri, 10 Dec 2010 13:26:47 -0800 (PST)
Message-ID: <4D029B13.3050002@gmail.com>
Date: Fri, 10 Dec 2010 22:26:43 +0100
From: Stefano Bonifazi <stefboombastic@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="------------000305050007020207080200"
Subject: [Qemu-devel] TCG flow vs dyngen
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: qemu-devel@nongnu.org

This is a multi-part message in MIME format.
--------------000305050007020207080200
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi all!
  From the technical documentation 
(http://www.usenix.org/publications/library/proceedings/usenix05/tech/freenix/bellard.html) 
I read:

> The first step is to split each target CPU instruction into fewer 
> simpler instructions called /micro operations/. Each micro operation 
> is implemented by a small piece of C code. This small C source code is 
> compiled by GCC to an object file. The micro operations are chosen so 
> that their number is much smaller (typically a few hundreds) than all 
> the combinations of instructions and operands of the target CPU. The 
> translation from target CPU instructions to micro operations is done 
> entirely with hand coded code. 
> A compile time tool called dyngen uses the object file containing the 
> micro operations as input to generate a dynamic code generator. This 
> dynamic code generator is invoked at runtime to generate a complete 
> host function which concatenates several micro operations. 
From Wikipedia (http://en.wikipedia.org/wiki/QEMU) and other sources, 
instead, I read:

> The Tiny Code Generator (TCG) aims to remove the shortcoming of 
> relying on a particular version of GCC 
> <http://en.wikipedia.org/wiki/GNU_Compiler_Collection> or any 
> compiler, instead incorporating the compiler (code generator) into 
> other tasks performed by QEMU in run-time. The whole translation task 
> thus consists of two parts: blocks of target code (/TBs/) being 
> rewritten in *TCG ops* - a kind of machine-independent intermediate 
> notation, and subsequently this notation being compiled for the host's 
> architecture by TCG. Optional optimisation passes are performed 
> between them.
- So, I think that the technical documentation is now obsolete, isn't it?

- The "old way" did much of the work offline (at compile time), 
compiling the micro operations into host machine code, while, if I 
understand correctly, TCG does everything at run time (please correct me 
if I am wrong!). So I wonder: how can it be as fast as the previous 
method, or even faster?

- If I understand correctly, the TCG runtime flow is the following:
     - TCG takes the target binary and splits it into target blocks 
(TBs),
     - if the TB is not cached, TCG translates it (or rather the target 
instructions it is composed of) into TCG micro ops,
     - TCG compiles the TCG micro ops into host object code,
     - TCG caches the TB,
     - TCG tries to chain the block with others,
     - TCG copies the TB into the execution buffer,
     - TCG runs it.
Am I right? Please correct me if I am wrong, as I want to use that 
flow scheme to understand the code.
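
To make the question concrete, this is roughly how I picture the loop, 
as a toy C sketch. All names (TB, find_or_translate, run_guest, ...) are 
my own and hypothetical, not taken from the QEMU sources; the 
translation step is only a stub, and the "copy into the execution 
buffer" step is not modelled at all:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, none is a QEMU API. */

enum { CACHE_SIZE = 16, GUEST_END = 4 };

typedef struct TB {
    uint32_t pc;         /* guest address this block starts at */
    int valid;           /* non-zero once the slot holds a translation */
    uint32_t next_pc;    /* guest address of the successor block */
    struct TB *chained;  /* direct link to the successor, once known */
} TB;

static TB cache[CACHE_SIZE];
int translations;        /* how many times the stub translator ran */

/* Stand-in for "target code -> TCG ops -> host code": in this toy
   guest, the block at pc simply falls through to pc + 1. */
static void translate(TB *tb, uint32_t pc)
{
    tb->pc = pc;
    tb->valid = 1;
    tb->next_pc = pc + 1;
    tb->chained = NULL;
    translations++;
}

/* Look the block up by guest pc; translate only on a cache miss. */
static TB *find_or_translate(uint32_t pc)
{
    TB *tb = &cache[pc % CACHE_SIZE];
    if (!tb->valid || tb->pc != pc)
        translate(tb, pc);
    return tb;
}

/* Run the toy guest from start_pc up to GUEST_END and return the
   number of blocks "executed". */
int run_guest(uint32_t start_pc)
{
    int executed = 0;
    TB *prev = NULL;
    uint32_t pc = start_pc;

    while (pc < GUEST_END) {
        /* Follow the chained link when the previous block has one;
           otherwise go through the cache lookup. */
        TB *tb = (prev && prev->chained) ? prev->chained
                                         : find_or_translate(pc);
        assert(tb->pc == pc);
        if (prev && !prev->chained)
            prev->chained = tb;   /* chain prev -> tb for next time */

        executed++;               /* here the host code would run */
        pc = tb->next_pc;
        prev = tb;
    }
    return executed;
}
```

In this sketch, calling run_guest(0) twice executes four blocks each 
time, but the translations counter stays at 4 after the second call: 
the second run is served from the cache and follows the chained links. 
Is that the right mental model?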
Thank you very much in advance!
Stefano B.


--------------000305050007020207080200--