From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <425D7C94.3050300@win4lin.com>
Date: Wed, 13 Apr 2005 16:09:56 -0400
From: "Leonardo E. Reiter"
Subject: Re: [Qemu-devel] qvm86, kqemu and video speed
References: <425A84B0.60200@praguespringpeople.org> <20050412183843.1a3ad827@caprice.artificis.hu> <425C21E8.6090306@bellard.org> <41e41e7a0504131112678e6f08@mail.gmail.com> <425D71E5.3050807@win4lin.com>
In-Reply-To: <425D71E5.3050807@win4lin.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Well, the proof is in the pudding, as they say. In the display adapter
benchmark of the Fresh Diagnose program under Windows 2000, inlining the
glue functions actually slows some things down slightly while improving
others by a small margin. Overall there is no measurable performance
gain when the glue functions are inlined. So I suppose there is no sense
in doing that.
Processors these days are so smart that perhaps the inlining actually
causes cache optimization problems in some cases and has a negative
(instead of the expected positive) performance effect.

- Leo Reiter

Leonardo E. Reiter wrote:
> 3. I think a really simple optimization may be to inline the glue
> functions in vga_template.h, cirrus_vga_rop.h, and cirrus_vga_rop2.h,
> which is very trivial. We tried that a while back and it did improve
> performance a bit - for instance, it shaved 1.5 seconds off the boot
> time of a Windows 2000 session. Windows 2000 likes to display heavy
> graphics, like marquees, etc., while booting in its status dialog boxes,
> which is why the improvement was there I think. Maybe Fabrice or
> someone else can comment as to the possible consequences, other than the
> obvious code size increase of using inline functions (which is not much
> in this case) of inlining those functions. We didn't notice anything
> adverse, but perhaps we weren't looking closely enough :) We will keep
> testing over here, and if all goes well, post a patch that does this
> simple optimization. That is, unless anyone can chime in with a good
> reason not to do this of course.

--
Leonardo E. Reiter
Vice President of Engineering
Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com