From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1JAYDa-0001Vo-0e
	for qemu-devel@nongnu.org; Thu, 03 Jan 2008 17:07:14 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1JAYDZ-0001Ur-Br
	for qemu-devel@nongnu.org; Thu, 03 Jan 2008 17:07:13 -0500
Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1JAYDY-0001Ud-S1
	for qemu-devel@nongnu.org; Thu, 03 Jan 2008 17:07:12 -0500
Received: from mail.codesourcery.com ([65.74.133.4])
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
	(envelope-from <paul@codesourcery.com>) id 1JAYDY-0004s7-61
	for qemu-devel@nongnu.org; Thu, 03 Jan 2008 17:07:12 -0500
From: Paul Brook <paul@codesourcery.com>
Subject: Re: [Qemu-devel] performance monitor
Date: Thu, 3 Jan 2008 22:07:07 +0000
References: <200801032136.27558.clemens.kol@gmx.at>
	<200801032129.09756.paul@codesourcery.com>
	<200801032238.03458.clemens.kol@gmx.at>
In-Reply-To: <200801032238.03458.clemens.kol@gmx.at>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200801032207.08448.paul@codesourcery.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: qemu-devel@nongnu.org
Cc: Clemens Kolbitsch <clemens.kol@gmx.at>

> Does anyone have an idea on how I can measure performance in qemu to a
> somewhat accurate level? I have modified qemu (the memory handling) and the
> linux kernel and want to find out the penalty this introduced... does
> anyone have any comments / ideas on this?

Short answer is you probably can't. And even if you can, I won't believe your 
results unless you've verified them on real hardware :-)

With the exception of some very small embedded cores, modern CPUs have complex 
out-of-order execution pipelines and multi-level cache hierarchies. It's 
common for performance to be dominated by these secondary factors rather than 
raw instruction throughput.

Exactly what features dominate performance is very application specific. 
Determining which factor dominates is unlikely to be something qemu can help 
with.

However, if you know that for your application there's a good correlation 
between performance and L2 cache misses, you could instrument qemu to add 
an L1/L2 cache model. The overhead will be fairly severe (easily 10x slower), 
and will completely screw up any realtime measurements. However, it would 
produce some useful cache use statistics that you could use to guesstimate 
actual performance. This is similar to how cachegrind works. Obviously if your 
application isn't cache bound then these figures will be meaningless.
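
To make the cache model idea concrete, here is a minimal sketch in Python of 
the kind of bookkeeping involved. It is not qemu code and not how cachegrind 
is implemented. The names CacheModel and simulate, the LRU replacement policy 
and the cache sizes are all assumptions chosen for illustration. It only 
counts hits and misses for a list of addresses, with L2 consulted on every L1 
miss.

```python
from collections import OrderedDict


class CacheModel:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.ways = ways
        # Number of sets follows from total size, line size and associativity.
        self.num_sets = size_bytes // (line_bytes * ways)
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Record one access; return True on a hit, False on a miss."""
        line = addr // self.line_bytes
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict the least recently used line
        cache_set[line] = True
        return False


def simulate(trace, l1, l2):
    """Feed a sequence of addresses through L1, falling back to L2 on a miss."""
    for addr in trace:
        if not l1.access(addr):
            l2.access(addr)
```

In an instrumented emulator the address trace would come from hooks on guest 
memory accesses. Here it is just a list, e.g. simulate([0, 4, 64, 0], l1, l2) 
with l1 = CacheModel(64, 16, 1) and l2 = CacheModel(256, 16, 2). The resulting 
hit and miss counters are the sort of statistics you could use for a rough 
estimate, with all the caveats above.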

Paul