From: Shin-ichiro KAWASAKI
Date: Thu, 05 Mar 2009 00:39:15 +0900
Subject: Re: [Qemu-devel] sh : performance problem
To: qemu-devel@nongnu.org
Message-ID: <49AEA0A3.8030003@juno.dti.ne.jp>
In-Reply-To: <1236119312.4005.13.camel@coalu.atr>
References: <49A6C317.1080202@juno.dti.ne.jp> <1236038327.4975.16.camel@coalu.atr> <49AD50E2.7000401@juno.dti.ne.jp> <1236106677.16018.5.camel@couak.urd44.com> <761ea48b0903031125n5d97462eu15caa552764789d9@mail.gmail.com> <1236119312.4005.13.camel@coalu.atr>

Lionel Landwerlin wrote:
> On Tuesday, 3 March 2009 at 20:25 +0100, Laurent Desnogues wrote:
>> On Tue, Mar 3, 2009 at 7:57 PM, Lionel Landwerlin wrote:
>>> On Wednesday, 4 March 2009 at 00:46 +0900, Shin-ichiro KAWASAKI wrote:
>> [...]
>>>> sh4 : 5.8 [seconds] O(n) utlb search.
>>>> sh4 : 4.6 [seconds] O(log2(n)) utlb search.
>>>> sh4 : 4.1 [seconds] O(1) utlb search by Lionel
>>>> arm : 0.8 [seconds] (-M versatilepb + Debian ARM)
>>>>
>>>> Your patch has a nice score!
>>> Great :) But we're still far from arm :(
>> It would be interesting if you could run oprofile of both platforms
>> (and even better if that was with call-graph output).
>>
>>>>> +#if !defined(CONFIG_USER_ONLY)
>>>>> +    /* vpn to utlb entry caches (too much space for user emulation) */
>>>>> +    uint8_t utlbs_1k[4194304]; /* 2^22 => 4 Mb */
>>>>> +    uint8_t utlbs_4k[1048576]; /* 2^20 => 1 Mb */
>>>>> +    uint8_t utlbs_64k[65536]; /* 2^16 => 64 Kb */
>>>>> +    uint8_t utlbs_1m[4096]; /* 2^12 => 4 Kb */
>>>>> +#endif
>>>>> } CPUSH4State;
>>>> Isn't it too gorgeous?
>>>> How about allocating them on demand?
>>>> I guess sh-linux uses only utlbs_4k[], in general.
>>>> If so, 4 Mb utlbs_1k[] is waste.
>>>>
>>> sh-linux can also use huge pages of 64k and 1M.
>>>
>>> I think it is important to keep the emulation as close as possible to
>>> the real CPU capabilities.
>> I think Shin-ichiro was rather pointing at the number of 1k pages,
>> which are probably not used. OTOH I agree being close to
>> CPU capabilities is important.
>
> The problem is, when a virtual address space must be resolved, the cpu
> emulator is not able to know what page size has been (or not) used to
> map the address. Thus we must handle the worst case...
>
> And after all, 5Mb isn't so much, considering the fact that the physical
> guest memory, in system emulation, can be mapped anywhere in the host
> virtual address space, contrary to the user emulation that has harder
> constraints...

IMO, on-demand allocation can go into another patch, which would introduce
lines like this:

    if (!env->utlbs_1k)
        env->utlbs_1k = qemu_mallocz(4194304);

But footprint tuning can be left as a future task.
Sorry for being too critical.

Regards,
Shin-ichiro KAWASAKI
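
The on-demand allocation idea discussed above could look roughly like the sketch below. It is not the code from Lionel's patch: the `UTLBCaches` struct, the function names, and the "store entry index + 1, with 0 meaning no entry" encoding are all assumptions made for illustration, and plain `calloc()` stands in for `qemu_mallocz()`. Each table is allocated only when a mapping of that page size is first recorded, while the lookup still checks every page size, since the emulator cannot know in advance which one maps a given address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Page sizes from the quoted patch: 1k, 4k, 64k and 1M. */
enum { PAGE_1K, PAGE_4K, PAGE_64K, PAGE_1M, PAGE_SIZES };

static const unsigned page_shift[PAGE_SIZES] = { 10, 12, 16, 20 };

/* Hypothetical stand-in for the four fixed arrays in CPUSH4State. */
typedef struct {
    uint8_t *table[PAGE_SIZES];   /* NULL until that page size is first used */
} UTLBCaches;

/* Number of virtual page numbers in a 32-bit address space. */
static size_t table_entries(int size_idx)
{
    return (size_t)1 << (32 - page_shift[size_idx]);
}

/* Record that UTLB entry `entry` maps `vaddr`; allocates the table lazily.
 * Returns 0 on success, -1 if the allocation fails. */
static int utlb_cache_set(UTLBCaches *c, int size_idx, uint32_t vaddr,
                          uint8_t entry)
{
    assert(entry < 64);           /* the SH4 UTLB has 64 entries */
    if (!c->table[size_idx]) {
        /* calloc() zero-fills, like qemu_mallocz() in the message above. */
        c->table[size_idx] = calloc(table_entries(size_idx), 1);
        if (!c->table[size_idx])
            return -1;
    }
    /* Assumed encoding: 0 means "no entry", so store index + 1. */
    c->table[size_idx][vaddr >> page_shift[size_idx]] = (uint8_t)(entry + 1);
    return 0;
}

/* Return the UTLB entry index mapping `vaddr`, or -1 if none is cached.
 * Every page size is tried; tables that were never allocated are skipped. */
static int utlb_cache_find(const UTLBCaches *c, uint32_t vaddr)
{
    for (int i = 0; i < PAGE_SIZES; i++) {
        if (!c->table[i])
            continue;
        uint8_t v = c->table[i][vaddr >> page_shift[i]];
        if (v)
            return v - 1;
    }
    return -1;
}

/* Release every table that was allocated. */
static void utlb_caches_free(UTLBCaches *c)
{
    for (int i = 0; i < PAGE_SIZES; i++) {
        free(c->table[i]);
        c->table[i] = NULL;
    }
}
```

With this layout, a guest that only ever maps 4k pages pays for the 1 Mb table alone; the 4 Mb table for 1k pages is never allocated. The cost is one extra NULL check per page size on each lookup.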