From: Paul Brook
Subject: Re: [Qemu-devel] Performance Monitoring
Date: Tue, 20 May 2008 23:06:44 +0100
Message-Id: <200805202306.45277.paul@codesourcery.com>
In-Reply-To: <3000d2e90805201156g30050a68ve9187e3b94341e99@mail.gmail.com>
To: qemu-devel@nongnu.org
Cc: Cheif Jones

> I'm doing a research project in which I want to run an OS under an emulator
> for a period of time and get full CPU opcode statistics (how many times
> every opcode was executed). As far as I understand the Qemu design, it is
> doing "JIT" translation of target opcodes to host opcodes to improve
> performance, and so there is no easy way to count target opcodes (e.g. a
> loop is compiled JIT and runs natively).
>
> Is it possible to disable Qemu's JIT capabilities and get target opcode
> statistics?

You have a couple of options:

- Disable TB (translation block) caching, so code is retranslated every
  time it runs, and do the counting during translation. Performance is
  going to be pretty poor.
- Inject the counters into the translated code. This is probably a bit
  more work, but should perform much better.

With either approach you'll still have issues with exceptions. An MMU fault
aborts a TB early, so if you count per TB, the instructions after the
faulting one get counted without having executed, which skews your
statistics. One possibility is to terminate a TB on every memory access,
like we do for watchpoints.

You probably already know this, but I'd be surprised if the statistics you
get have much, if any, correlation with real-world performance on modern
hardware.

Paul
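
To illustrate the point about faults, here is a toy sketch in Python. It is not QEMU code: the program, `split_into_blocks`, `run`, and the `fault_at` parameter are all hypothetical names invented for the example. It assumes a counter injected at TB entry that accounts for every opcode in the block, and it treats the faulting instruction as attempted.

```python
from collections import Counter

# Toy guest program: (opcode, is_memory_access). Purely illustrative.
PROGRAM = [
    ("mov", False),
    ("add", False),
    ("load", True),
    ("add", False),
    ("store", True),
    ("jmp", False),
]


def split_into_blocks(program, end_on_memory_access):
    """Group instructions into toy translation blocks (TBs).

    A block always ends at a jump. If end_on_memory_access is set,
    it also ends right after every memory access.
    """
    blocks, current = [], []
    for opcode, is_mem in program:
        current.append((opcode, is_mem))
        if opcode == "jmp" or (end_on_memory_access and is_mem):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def run(blocks, fault_at=None):
    """Run the blocks once and return (counted, attempted).

    counted:   what a counter injected at TB entry would record
               (the whole block, as soon as the block is entered).
    attempted: what was actually started before execution stopped.
    fault_at:  program index of an instruction that faults and
               aborts its block (None means no fault).
    """
    counted, attempted = Counter(), Counter()
    index = 0
    for block in blocks:
        counted.update(op for op, _ in block)  # counter fires on TB entry
        for op, _ in block:
            attempted[op] += 1
            if index == fault_at:
                return counted, attempted  # fault aborts the TB early
            index += 1
    return counted, attempted


whole = split_into_blocks(PROGRAM, end_on_memory_access=False)
split = split_into_blocks(PROGRAM, end_on_memory_access=True)

# The "load" at index 2 faults.
print(run(whole, fault_at=2))  # counted includes opcodes that never ran
print(run(split, fault_at=2))  # counted and attempted agree
```

With one large block, the second `add`, the `store`, and the `jmp` are counted even though the fault stopped execution at the `load`. Ending the block at each memory access removes that discrepancy in this toy model, at the cost of many more, smaller blocks.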