From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1DNNxR-0002iF-TM for qemu-devel@nongnu.org; Mon, 18 Apr 2005 00:34:02 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1DNNxF-0002cQ-NS for qemu-devel@nongnu.org; Mon, 18 Apr 2005 00:33:52 -0400 Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1DNNxE-0002bC-W9 for qemu-devel@nongnu.org; Mon, 18 Apr 2005 00:33:49 -0400 Received: from [64.233.184.202] (helo=wproxy.gmail.com) by monty-python.gnu.org with esmtp (Exim 4.34) id 1DNNwM-0004r3-Jf for qemu-devel@nongnu.org; Mon, 18 Apr 2005 00:32:54 -0400 Received: by wproxy.gmail.com with SMTP id 71so2123084wra for ; Sun, 17 Apr 2005 21:31:25 -0700 (PDT) Message-ID: Date: Mon, 18 Apr 2005 00:31:25 -0400 From: Karl Magdsick Subject: Re: [Qemu-devel] Profiling Qemu for speed? In-Reply-To: <2ad73a0504171939154235f5@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline References: <20050418013615.89123.qmail@web51604.mail.yahoo.com> <2ad73a0504171939154235f5@mail.gmail.com> Reply-To: Karl Magdsick , qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?ISO-8859-1?Q?Andr=E9_Braga?= , qemu-devel@nongnu.org Ideally, we could force gcc to implement switch statements as indirect jumps with jump tables inline with the code. However, this may not be possible. I think Nathaniel was just saying that gcc is likely generating several hundred sequential if-else blocks for large switch statements. This gives you O(n) runtime, whereas a function pointer array gives you O(1) runtime. (This assumes each case ends with a break... I haven't looked at the code to which Nathaniel is referring.)=20 Therefore, for sufficiently large switch statements, sequential if-else statements will be slower than a function pointer array. The question is: is n=3D~ 200 sufficiently large? I think only empirical testing has a chance of answering this question in everyone's mind. If you smartly nest your if-else statements, you can at least get O(log n) runtime, with negligible difference in the leading constant as compared to the sequential if-else statements. There's also the possibility of (at startup) dynamically generating an indirect jump along with a jump table inline with the code (in order to minimize page faults, and perhaps utilize the prefectch to get the first few jump addresses in the L2 cache). Of course, as pointed out elsewhere, the dynamic code generator is hopefully only invoked rarely, so speeding up the switch statements may not have a noticeable effect on emulation speed. -Karl On 4/17/05, Andr=E9 Braga wrote: > The problem with table lookups (I'm assuming you're talking about > function pointer vectors) is that they *destroy* spatial locality of > reference that you could otherwise attain by having series of > if-then-else instructions and some clever instruction prefetching > mechanism on modern processors... Not to mention the function call > overhead.