From: "Byron Hawkins"
To: 'Peter Maydell'
Cc: 'QEMU Developer List'
Subject: Re: [Qemu-devel] Running programs that dynamically generate code
Date: Tue, 2 Sep 2014 01:16:57 -0700
Message-ID: <40d301cfc686$453237b0$cf96a710$@uci.edu>
References: <36d601cfc330$5eb23ea0$1c16bbe0$@uci.edu>

> -----Original Message-----
> From: Peter Maydell [mailto:peter.maydell@linaro.org]
> Sent: Friday, August 29, 2014 2:23 AM
> To: Byron Hawkins
> Cc: QEMU Developer List
> Subject: Re: [Qemu-devel] Running programs that dynamically generate code
>
> On 29 August 2014 03:24, Byron Hawkins wrote:
> > Hi, I’m working on a research project to optimize binary translation
> > for target applications that dynamically generate code, such as
> > browser JIT engines. When I run the octane benchmark in Chrome v8
> > under QEMU (i.e., qemu-x86_64), it shows significant overhead compared
> > to a native run. Can someone tell me how QEMU maintains consistency
> > with the target application when it dynamically generates code?
>
> When we generate code from a particular page of guest RAM, we arrange to
> trap into QEMU if/when the guest writes to that page (either using QEMU's
> own softmmu facilities if using the -system- emulators, or by marking the
> page readonly and handling the segfault if using the linux-user emulators). If
> we trap then we flush the cached translation we had for that page and let
> the guest write proceed.
> For guest CPUs like x86 there is some further complication to allow writes to
> the page the guest is currently executing from to behave as the guest
> expects. For guest CPUs like ARM where the architecture requires guest
> code to perform icache/dcache maintenance operations before the writes
> are visible to instruction fetch, we take advantage of that to avoid jumping
> through some of the hoops.
>
> The most obvious cause of overhead compared to a native run would be that
> we emulate all our floating point and SIMD operations, so unless Chrome's
> JIT is sticking strictly to integer operations we're bound to go rather slower.
>
> -- PMM

Thanks Peter. Can you tell me how much translated code is flushed when a
write is detected in an executable region? Is it just the code
translated from the one modified page?
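
To make the question concrete, here is a toy model of the scheme as
described in the quoted reply: translating code from a page arms a write
trap on that page, and a trapped write drops the cached translations
before the write proceeds. This is only a sketch, not QEMU's code. The
class ToyTranslationCache, its method names, and the 4 KiB page size are
hypothetical and chosen for illustration. It flushes only the
translations made from the written page, which is one reading of the
description above and exactly what the question asks to confirm.

```python
PAGE_SIZE = 4096  # assumed guest page size for this toy model


class ToyTranslationCache:
    """Toy model of per-page translation invalidation (not QEMU code)."""

    def __init__(self):
        self.blocks_by_page = {}      # page number -> set of translated block addresses
        self.write_protected = set()  # pages on which a guest write would trap
        self.memory = {}              # guest address -> byte value

    def translate(self, addr):
        """Record a translation for the code at addr and arm a trap on its page."""
        page = addr // PAGE_SIZE
        self.blocks_by_page.setdefault(page, set()).add(addr)
        self.write_protected.add(page)

    def guest_write(self, addr, value):
        """Perform a guest write and return how many translations were flushed."""
        page = addr // PAGE_SIZE
        flushed = 0
        if page in self.write_protected:
            # Trap path: drop every translation made from this page (an
            # assumption of this model), then disarm the trap.
            flushed = len(self.blocks_by_page.pop(page, set()))
            self.write_protected.discard(page)
        # Let the guest write proceed.
        self.memory[addr] = value
        return flushed


cache = ToyTranslationCache()
cache.translate(0x1000)   # two blocks translated from page 1
cache.translate(0x1040)
cache.translate(0x2000)   # one block translated from page 2

print(cache.guest_write(0x1800, 0x90))  # 2: both page-1 translations are dropped
print(cache.guest_write(0x1801, 0x90))  # 0: page 1 no longer traps
print(sorted(cache.blocks_by_page))     # [2]: page 2's translation survives
```

In this model a JIT that repeatedly writes to and re-executes the same
page pays for one flush and one retranslation per round trip. If the
real flush is wider than one page, the cost per write would be
correspondingly higher.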