From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 1 Dec 2011 11:50:24 +0800
From: 陳韋任
Message-ID: <20111201035024.GA88545@cs.nctu.edu.tw>
References: <20111129070343.GA3585@cs.nctu.edu.tw>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Subject: Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
To: Alexander Graf
Cc: qemu-devel@nongnu.org, 陳韋任

Hi Alex,

> Very cool! I was thinking about this for a while myself now. It's
> especially appealing these days since you can do the hotspot
> optimization in a separate thread :).
>
> Especially in system mode, you also need to flush when tb_flush() is
> called though. And you have to make sure to match hflags and segment
> descriptors for the links - otherwise you might end up connecting TBs
> from different processes :).

  I'll check tb_flush again. IIRC, we make the code cache big enough
that there is no need to flush it, but I think we will still have to
deal with it in the end.

  Block linking is done by QEMU and we leave it alone. But I am not
aware that QEMU ever checks hflags and segment descriptors before
linking blocks. Could you point me to where it does?

  Anyway, here is how we form a trace from a set of basic blocks.

1. We insert instrumentation code at the beginning of each TCG block to
   count how many times the block has been executed.

2. When the execution count of a block, say block A, reaches a
   predefined threshold, we follow the run-time execution path to
   collect block B, which follows A, and so on, to form a trace. This
   approach is called NET (Next-Executing Tail) [1].

3. The trace, composed of TCG blocks, is then sent to an LLVM
   translator. The translator generates host binary code for the trace
   into an LLVM code cache, and patches the beginning of block A (in the
   QEMU code cache) so that anyone executing block A will jump to the
   corresponding trace and execute it.

  The above is block-to-trace linking. I think there is no need to check
hflags and segment descriptors there, right? Although I currently limit
the trace length to one basic block (to keep the situation simple), I
think we still don't have to check whether the hflags and segment
descriptors of the blocks in a trace match.

> > successfully, then login and run some benchmark on it. As a very first step, we
> > make a very high threshold on trace building. In other words, a basic block must
> > be executed *many* time to trigger the trace building process. Then we lower the
> > threshold a bit at a time to see how things work. When something goes wrong, we
> > might get kernel panic or the system hangs at some point on the booting process.
> > I have no idea on how to solve this kind of problem. So I'd like to seek for
> > help/experience/suggestion on the mailing list. I just hope I make the whole
> > situation clear to you.
>
> I don't see any better approach to debugging this than the one you're
> already taking. Try to run as many workloads as you can and see if they
> break :). Oh and always make the optimization optional, so that you can
> narrow it down to it and know you didn't hit a generic QEMU bug.
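
  For illustration only, the NET selection in steps 1-3 above, together
with a switch that keeps the optimization optional, can be sketched in a
few lines of Python. This is a toy model, not the framework's real code
and not any QEMU or LLVM API: the class `TraceSelector`, its `threshold`,
`max_len` and `enabled` parameters, and the use of plain strings as block
IDs are all assumptions made up for the sketch. Step 3 (LLVM translation
and patching the head block) is reduced to storing the trace in a
dictionary.

```python
class TraceSelector:
    """Toy NET (Next-Executing Tail) trace selection over block IDs.

    All names and parameters here are hypothetical, for illustration only.
    """

    def __init__(self, threshold=3, max_len=4, enabled=True):
        self.threshold = threshold  # executions before a block counts as hot
        self.max_len = max_len      # cap on the number of blocks per trace
        self.enabled = enabled      # lets the optimization be switched off
        self.counts = {}            # step 1: per-block execution counters
        self.traces = {}            # head block -> tuple of blocks in its trace
        self._recording = None      # blocks collected for the trace being built

    def _finish(self):
        # Stand-in for step 3: a real system would translate the trace and
        # patch the head block; here we only remember which blocks it holds.
        head = self._recording[0]
        self.traces[head] = tuple(self._recording)
        self._recording = None

    def on_block_executed(self, block):
        """Call once per executed block, in execution order."""
        if not self.enabled:
            return
        if self._recording is not None:
            # Step 2: follow the run-time path until the trace loops back to
            # its head, reaches an existing trace, or hits the length cap.
            closes_loop = block == self._recording[0]
            at_cap = len(self._recording) >= self.max_len
            if closes_loop or block in self.traces or at_cap:
                self._finish()
            else:
                self._recording.append(block)
                return
        if block in self.traces:
            return  # execution would jump into the existing trace
        # Step 1: count executions; a hot block becomes a new trace head.
        self.counts[block] = self.counts.get(block, 0) + 1
        if self.counts[block] >= self.threshold:
            self._recording = [block]


if __name__ == "__main__":
    selector = TraceSelector(threshold=3, max_len=4)
    for block in "ABC" * 3 + "A":   # a loop A -> B -> C -> A ...
        selector.on_block_executed(block)
    print(selector.traces)          # {'A': ('A', 'B', 'C')}
```

  Setting `max_len=1` in this model corresponds to limiting a trace to
one basic block, and `enabled=False` leaves every block on the plain
(non-trace) path, which is handy when narrowing down a failure.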

  You mean make the trace optimization optional? We have tested our
framework in LLVM-only mode, which means we replace TCG with LLVM
entirely. It's _very_ slow but works. Which generic QEMU bug do you
mean? We use QEMU 0.13 and rely only on its emulation part right now. Do
recent versions fix major bugs in the emulation engine?

  Thanks for your advice. :-)

[1] http://www.cs.virginia.edu/kim/docs/micro05.pdf

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj