Subject: Re: [Qemu-devel] sh : performance problem
From: Lionel Landwerlin
Date: Tue, 03 Mar 2009 23:28:32 +0100
To: qemu-devel@nongnu.org
Reply-To: qemu-devel@nongnu.org
Message-Id: <1236119312.4005.13.camel@coalu.atr>
In-Reply-To: <761ea48b0903031125n5d97462eu15caa552764789d9@mail.gmail.com>
References: <49A6C317.1080202@juno.dti.ne.jp>
 <1236038327.4975.16.camel@coalu.atr>
 <49AD50E2.7000401@juno.dti.ne.jp>
 <1236106677.16018.5.camel@couak.urd44.com>
 <761ea48b0903031125n5d97462eu15caa552764789d9@mail.gmail.com>

On Tuesday 03 March 2009 at 20:25 +0100, Laurent Desnogues wrote:
> On Tue, Mar 3, 2009 at 7:57 PM, Lionel Landwerlin wrote:
> > On Wednesday 04 March 2009 at 00:46 +0900, Shin-ichiro KAWASAKI wrote:
> [...]
> >> sh4 : 5.8 [seconds] O(n) utlb search.
> >> sh4 : 4.6 [seconds] O(log2(n)) utlb search.
> >> sh4 : 4.1 [seconds] O(1) utlb search by Lionel
> >> arm : 0.8 [seconds] (-M versatilepb + Debian ARM)
> >>
> >> Your patch has a nice score!
> >
> > Great :) But we're still far from arm :(
>
> It would be interesting if you could run oprofile of both platforms
> (and even better if that was with call-graph output).
>
> >> > +#if !defined(CONFIG_USER_ONLY)
> >> > +    /* vpn to utlb entry caches (too much space for user emulation) */
> >> > +    uint8_t utlbs_1k[4194304]; /* 2^22 => 4 Mb */
> >> > +    uint8_t utlbs_4k[1048576]; /* 2^20 => 1 Mb */
> >> > +    uint8_t utlbs_64k[65536];  /* 2^16 => 64 Kb */
> >> > +    uint8_t utlbs_1m[4096];    /* 2^12 => 4 Kb */
> >> > +#endif
> >> >  } CPUSH4State;
> >>
> >> Isn't it too gorgeous?
> >> How about allocating them on demand?
> >> I guess sh-linux uses only utlbs_4k[], in general.
> >> If so, 4 Mb utlbs_1k[] is waste.
> >>
> >
> > sh-linux can also use huge pages of 64k and 1M.
> >
> > I think it is important to keep the emulation as close as possible to
> > the real cpu capabilities.
>
> I think Shin-ichiro was rather pointing at the number of 1k pages,
> which are probably not used. OTOH I agree being close to
> CPU capabilities is important.

The problem is that when a virtual address must be resolved, the CPU
emulator cannot know which page size (if any) was used to map that
address. Thus we must handle the worst case...

And after all, 5 MB isn't so much, considering that in system emulation
the guest's physical memory can be mapped anywhere in the host virtual
address space, contrary to user emulation, which has harder
constraints...

By the way, does someone know why there is some kind of "tlb management
code" in exec.c? Does the SH4 architecture have special features that
can't be handled in generic code? Or are we just rewriting some code
that is already there...?
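
To make the worst-case argument above concrete, here is a minimal
sketch in C of an O(1) lookup that probes one table per page size. It
is not the code from the patch: the names UtlbCache, utlb_cache_init,
utlb_cache_map, utlb_cache_unmap, utlb_cache_lookup, utlb_cache_free
and the UTLB_NONE marker are all hypothetical, and the tables are
heap-allocated here only to keep the example standalone. The sketch
assumes 32-bit virtual addresses and the four page sizes named in the
quoted patch (1k, 4k, 64k, 1M).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define UTLB_ENTRIES 64    /* number of UTLB entries on SH4 */
#define UTLB_NONE    0xff  /* hypothetical marker: no entry maps this page */

/* One table per supported page size. */
enum { PAGE_1K, PAGE_4K, PAGE_64K, PAGE_1M, PAGE_SIZES };

/* log2 of each page size, used to turn an address into a page number. */
static const unsigned page_shift[PAGE_SIZES] = { 10, 12, 16, 20 };

typedef struct {
    uint8_t *table[PAGE_SIZES];  /* one byte per virtual page number */
} UtlbCache;

void utlb_cache_free(UtlbCache *c)
{
    for (int i = 0; i < PAGE_SIZES; i++) {
        free(c->table[i]);
        c->table[i] = NULL;
    }
}

/* Allocate all four tables and mark every page as unmapped.
 * Returns 0 on success, -1 if an allocation fails. */
int utlb_cache_init(UtlbCache *c)
{
    for (int i = 0; i < PAGE_SIZES; i++) {
        c->table[i] = NULL;
    }
    for (int i = 0; i < PAGE_SIZES; i++) {
        /* 2^(32 - shift) pages cover a 32-bit address space. */
        size_t n = (size_t)1 << (32 - page_shift[i]);
        c->table[i] = malloc(n);
        if (c->table[i] == NULL) {
            utlb_cache_free(c);
            return -1;
        }
        memset(c->table[i], UTLB_NONE, n);
    }
    return 0;
}

/* Record that UTLB entry `entry` maps the page of size `size` at `vaddr`. */
void utlb_cache_map(UtlbCache *c, int size, uint32_t vaddr, uint8_t entry)
{
    assert(size >= 0 && size < PAGE_SIZES);
    assert(entry < UTLB_ENTRIES);
    c->table[size][vaddr >> page_shift[size]] = entry;
}

/* Forget the mapping of the page of size `size` at `vaddr`. */
void utlb_cache_unmap(UtlbCache *c, int size, uint32_t vaddr)
{
    assert(size >= 0 && size < PAGE_SIZES);
    c->table[size][vaddr >> page_shift[size]] = UTLB_NONE;
}

/* The page size behind `vaddr` is unknown at lookup time, so every
 * table is probed: at most four array reads, whatever the UTLB holds.
 * Returns the UTLB entry index, or -1 on a miss. */
int utlb_cache_lookup(const UtlbCache *c, uint32_t vaddr)
{
    for (int i = 0; i < PAGE_SIZES; i++) {
        uint8_t e = c->table[i][vaddr >> page_shift[i]];
        if (e != UTLB_NONE) {
            return e;
        }
    }
    return -1;
}
```

With these sizes the four tables take 4 MB + 1 MB + 64 KB + 4 KB,
which is where the figure of roughly 5 MB comes from. Dropping the 1k
table would save most of that, but a lookup could then no longer
resolve an address that the guest mapped with a 1k page.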
-- 
Lionel Landwerlin