From: Lior Vernia
Date: Sun, 26 May 2013 08:40:20 +0300
Subject: Re: [Qemu-devel] Potential to accelerate QEMU for specific architectures
To: Andreas Färber
Cc: qemu-devel@nongnu.org, 陳韋任, Richard Henderson
In-Reply-To: <51A10BCA.6000800@suse.de>

Hello,

On Sat, May 25, 2013 at 10:06 PM, Andreas Färber wrote:
> Hi,
>
> On 24.05.2013 21:24, Lior Vernia wrote:
>> I am running x86 applications on an ARM device using QEMU, and found
>> it too slow for my needs.
>
> Before we start going into technical details, what are you trying to
> achieve on a high level and how did you try to do it?
>
> Are you using qemu-system-x86_64 or qemu-x86_64? The latest v1.5.0?

Sorry, right after I wrote the message it occurred to me that I should
have mentioned that I was talking about qemu-system, either x86 or i386.
So far I have only run the Limbo app on a Galaxy SIII with various
images, just to see the capabilities, and was disappointed. Limbo seems
to run QEMU v1.1.0.
If you suspect that it's the JNI wrapping that's causing a lot of the
damage, then we can talk about compiling QEMU for ARM and running it
natively; I just haven't been able to get that to work.

>> This is to be expected, of course, this is
>> not a complaint.
>
> Especially since most people still run on x86 ...
>
>> However, I was wondering whether this could be helped
>> by "overriding" the generic binary translation mechanism and focusing
>> on lower level binary translation just from x86 to ARM.
>>
>> It's clear to me that this isn't a small project, but it might be
>> important enough for me to invest myself in. However, before I jump
>> into it, I wanted to inquire whether this would be worthwhile at all.
>> Does anyone have any estimate as to how big of a gain that could
>> achieve? Or whether a more significant improvement could be achieved
>> by further tweaking that didn't occur to me?

I wanted to add that I've been reading about a Russian startup that's
looking to emulate x86 on ARM at 40% of native speed using dynamic
binary translation (as far as I gather):

http://www.bit-tech.net/news/hardware/2012/10/04/x86-on-arm/1

So this should be possible. And it can't be very much unlike QEMU, can
it?

> ... the tcg/arm/ code does not get a lot of love, so you might be able
> to squeeze some more performance out of it by implementing optional TCG
> ops or optimizing existing implementations. In theory most TCG ops
> should correspond to a machine instruction (where available); there's a
> TCG-level optimizer to create more efficient code, but it's a tradeoff
> between time for code optimization and execution time.
>
> Needless to say that you should enable -O3 optimization (or something)
> for the core C code and not to enable debug features in configure for
> your performance measurements.
> :)
>
> Whatever implementation you experiment with, get familiar with our
> Git-based workflow and try to stay close to qemu.git code or otherwise
> you'll create a fork with little chance of getting integrated into the
> code base - meaning both we don't get your speedups and you don't get
> our latest features and bugfixes. One such example was the attempt to
> use LLVM instead of TCG.

Thanks, but we're getting slightly ahead of ourselves here :) I'd still
want to make sure that QEMU is at fault for the performance and, if
that's the case, that there's potential for real improvement before I
start getting my hands dirty.
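
For what it's worth, the tradeoff mentioned in the quoted text (time spent
optimizing translated code versus time spent executing it) can be put
into a rough back-of-the-envelope model. This is only a sketch: the
function names and all cost figures below are made up for illustration
and are not measurements of QEMU or TCG.

```python
import math


def total_time(translate_cost, exec_cost, runs):
    """Cost of translating a block once and then executing it `runs` times."""
    return translate_cost + exec_cost * runs


def break_even_runs(extra_translate_cost, exec_saving_per_run):
    """Smallest number of executions at which extra optimization is a net win.

    Returns None when the optimization saves nothing per run and therefore
    never pays for itself.
    """
    if exec_saving_per_run <= 0:
        return None
    return math.floor(extra_translate_cost / exec_saving_per_run) + 1


# Hypothetical numbers in arbitrary time units (assumptions, not measurements):
# a baseline translator versus one that spends 500 more units optimizing
# a block in order to save 20 units on every execution of that block.
baseline = total_time(translate_cost=1000, exec_cost=100, runs=26)
optimized = total_time(translate_cost=1500, exec_cost=80, runs=26)
print(break_even_runs(500, 20), baseline, optimized)
```

Under these assumed numbers the extra optimization only wins for blocks
that execute more than 25 times, so whether a heavier optimizer helps
would depend on how hot the translated blocks are in the actual workload.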