From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1M6sW3-0008RC-Hz for qemu-devel@nongnu.org; Wed, 20 May 2009 16:35:55 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1M6sVy-0008PS-EB for qemu-devel@nongnu.org; Wed, 20 May 2009 16:35:55 -0400
Received: from [199.232.76.173] (port=44188 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1M6sVy-0008PN-9L for qemu-devel@nongnu.org; Wed, 20 May 2009 16:35:50 -0400
Received: from csl.cornell.edu ([128.84.224.10]:4463 helo=vlsi.csl.cornell.edu)
    by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
    id 1M6sVx-0008BC-Rn for qemu-devel@nongnu.org; Wed, 20 May 2009 16:35:50 -0400
Date: Wed, 20 May 2009 16:35:25 -0400 (EDT)
From: Vince Weaver
Subject: Re: [Qemu-devel] Instruction counting instrumentation for ARM + initial patch
In-Reply-To: <761ea48b0905200516g47713089g5d0b06f6f94bcd1a@mail.gmail.com>
Message-ID: <20090520162441.L73304@stanley.csl.cornell.edu>
References: <1242745197.24234.7.camel@peak10.cs.hut.fi>
    <200905201148.43631.paul@codesourcery.com>
    <761ea48b0905200516g47713089g5d0b06f6f94bcd1a@mail.gmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
List-Id: qemu-devel.nongnu.org
To: Laurent Desnogues
Cc: Timo Töyry, qemu-devel@nongnu.org

I wonder if a simplistic stats-gathering framework could be added to Qemu.

The problem is that there are currently at least 3 groups of Qemu users:

  1. People who want fast simulation
  2. People who are doing virtualization
  3. People trying to do instrumentation/research

Unfortunately those three groups have conflicting interests. The main
problem is that adding instrumentation infrastructure will either slow
down the common case, or else introduce lots of #ifdefs all over the
code. Neither is very attractive.

It would be nice if a limited instrumentation architecture could be put
into Qemu, one that could be configured out. It would save the various
researchers from each re-implementing it differently.

It would be nice to have:

  1. A way to dump an instruction trace (address, length (for CISC),
     and opcode, CPU# for multi-thread)
  2. A way to dump memory traces (address, length, possibly the value
     loaded/stored, CPU# for multi-thread)
  3. A way to dump basic-block entry/exit

Many of the various research metrics can be gained from these stats.

  #1 and #2 are enough for cache simulators.
  #1 (if post-processed) is enough to get a frequency plot for
     instruction count and type.
  #1 can be used to extrapolate branch-taken statistics for branch
     predictors.
  #3 can be used for basic block vectors, or to get faster instruction
     counts.

I'm not sure if it's possible to do this in a non-intrusive way that
would be acceptable to the other users of Qemu though. Pin manages to
have their null plugin run very fast; at least one of the Spec2k
binaries runs faster translated than it does natively.

Vince
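
The three record types listed above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not existing Qemu code: the struct names, the CONFIG_TRACE_SKETCH switch, the TRACE_HOOK macro, and the text output format are all hypothetical. The macro shows one way a hook could compile to nothing when instrumentation is configured out, so call sites would not need their own #ifdefs.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical build-time switch; not an existing Qemu configure option. */
#ifndef CONFIG_TRACE_SKETCH
#define CONFIG_TRACE_SKETCH 1
#endif

/* With the switch off, the hooked statement is dropped at compile time. */
#if CONFIG_TRACE_SKETCH
#define TRACE_HOOK(stmt) do { stmt; } while (0)
#else
#define TRACE_HOOK(stmt) do { } while (0)
#endif

/* #1: instruction trace record (address, length, opcode, CPU#). */
typedef struct {
    uint64_t addr;
    uint32_t opcode;
    uint8_t  len;      /* instruction length in bytes (matters for CISC) */
    uint8_t  cpu;
} InsnRecord;

/* #2: memory trace record (address, length, value, CPU#). */
typedef struct {
    uint64_t addr;
    uint64_t value;    /* value loaded or stored */
    uint8_t  len;      /* access size in bytes */
    uint8_t  cpu;
    uint8_t  is_store; /* 1 = store, 0 = load */
} MemRecord;

/* #3: basic-block entry/exit record. */
typedef struct {
    uint64_t addr;
    uint8_t  cpu;
    uint8_t  is_exit;  /* 1 = block exit, 0 = block entry */
} BlockRecord;

/* Each formatter writes one text line into buf and returns snprintf's result. */
static inline int trace_format_insn(char *buf, size_t size, const InsnRecord *r)
{
    return snprintf(buf, size, "I cpu=%u addr=0x%" PRIx64 " len=%u op=0x%" PRIx32 "\n",
                    (unsigned)r->cpu, r->addr, (unsigned)r->len, r->opcode);
}

static inline int trace_format_mem(char *buf, size_t size, const MemRecord *r)
{
    return snprintf(buf, size, "M cpu=%u addr=0x%" PRIx64 " len=%u st=%u val=0x%" PRIx64 "\n",
                    (unsigned)r->cpu, r->addr, (unsigned)r->len,
                    (unsigned)r->is_store, r->value);
}

static inline int trace_format_block(char *buf, size_t size, const BlockRecord *r)
{
    return snprintf(buf, size, "B cpu=%u addr=0x%" PRIx64 " %s\n",
                    (unsigned)r->cpu, r->addr, r->is_exit ? "exit" : "entry");
}
```

A call site would then wrap the formatter, for example TRACE_HOOK(trace_format_insn(buf, sizeof buf, &rec)), and a cache simulator or branch-predictor model could post-process the resulting lines offline. A text format is used here only for readability; a real trace would more likely be binary to keep the overhead down.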