Message-ID: <49AE9CAD.1030805@juno.dti.ne.jp>
Date: Thu, 05 Mar 2009 00:22:21 +0900
From: Shin-ichiro KAWASAKI
Subject: Re: [Qemu-devel] sh : performance problem
References: <49A6C317.1080202@juno.dti.ne.jp> <761ea48b0903031125n5d97462eu15caa552764789d9@mail.gmail.com> <1236119312.4005.13.camel@coalu.atr> <200903040259.18791.paul@codesourcery.com>
In-Reply-To: <200903040259.18791.paul@codesourcery.com>
Reply-To: qemu-devel@nongnu.org
To: qemu-devel@nongnu.org
Cc: Lionel Landwerlin

Paul Brook wrote:
>>>> Great :) But we're still far from arm :(
>
>> By the way, does someone know why there is some kind of "tlb management
>> code" in exec.c ??
>>
>> Does the SH4 architecture have special features that can't be handled in
>> a generic code ? Or are we just rewriting some code that is already
>> there ... ?
>
> I think you're missing the most important difference; SH uses a software
> managed TLB, whereas ARM uses a hardware managed TLB.
>
> The main consequence of this is that we don't have to model the actual ARM TLB
> at all, it is never directly visible. We effectively implement an infinitely
> large TLB.
>
> For SH the TLB is programmed directly, so we end up having to maintain two
> TLBs: The qemu TLB and the architectural SH TLB. For correct operation pages
> must be removed from the qemu TLB when they are evicted/replaced in the SH
> TLB. The SH TLB is quite small, and flushing qemu TLB entries is quite
> expensive, so this results in fairly poor performance.
>
> MIPS has a similar problem. However in that case the most common TLB
> operations do not directly expose the TLB state. In particular when setting a
> new TLB entry it is unspecified which TLB entry is replaced. At that point
> the OS can't know which entry was evicted, so we can lie, and not evict pages
> until the guest does something that allows it to determine the exact TLB
> state. In practice this is sufficient to make mips-linux work reasonably well.
>
> I'm not sure if the same is possible for SH. It probably depends whether URC is
> visible to/used by the guest.

Thank you Paul for your clear explanation.
It confirms my guess, and answers my question also.

As I posted in another mail, I tried increasing the number of TLB entries
from 64 to 256 and got a performance improvement. This approach changes the
real hardware specification, including URC, but the same SH-Linux kernel
still works fine. That implies that SH-Linux does not refer to URC, and the
MIPS approach might be the solution for SH too.

Regards,
Shin-ichiro KAWASAKI
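
The difference between evicting eagerly and the MIPS-style lazy eviction
described in the quoted text can be sketched with a toy model in C. This is
only an illustration under stated assumptions, not QEMU code: ToyMMU,
toy_load, toy_read_slot, toy_sync and the sizes are all hypothetical names
and values, the "shadow" array stands in for the qemu TLB, and the slot
argument stands in for whatever replacement counter the hardware would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GUEST_TLB_SIZE 4   /* hypothetical: tiny architectural TLB */
#define SHADOW_MAX     64  /* hypothetical: larger emulator-side TLB */

typedef struct {
    uint32_t guest[GUEST_TLB_SIZE];     /* architectural TLB (page numbers) */
    bool     guest_valid[GUEST_TLB_SIZE];
    uint32_t shadow[SHADOW_MAX];        /* pages still mapped by the emulator */
    int      shadow_len;
    int      flushes;                   /* number of costly shadow evictions */
    bool     lazy;                      /* true: defer evictions */
} ToyMMU;

static bool guest_has(const ToyMMU *m, uint32_t vpn) {
    for (int i = 0; i < GUEST_TLB_SIZE; i++)
        if (m->guest_valid[i] && m->guest[i] == vpn) return true;
    return false;
}

bool toy_shadow_has(const ToyMMU *m, uint32_t vpn) {
    for (int i = 0; i < m->shadow_len; i++)
        if (m->shadow[i] == vpn) return true;
    return false;
}

/* Drop every shadow page that the architectural TLB no longer backs. */
void toy_sync(ToyMMU *m) {
    int kept = 0;
    for (int i = 0; i < m->shadow_len; i++) {
        if (guest_has(m, m->shadow[i]))
            m->shadow[kept++] = m->shadow[i];
        else
            m->flushes++;
    }
    m->shadow_len = kept;
}

void toy_init(ToyMMU *m, bool lazy) {
    memset(m, 0, sizeof(*m));
    m->lazy = lazy;
}

/* Guest installs a page; the guest is assumed not to know which slot
 * was replaced. */
void toy_load(ToyMMU *m, int slot, uint32_t vpn) {
    assert(slot >= 0 && slot < GUEST_TLB_SIZE);
    m->guest[slot] = vpn;
    m->guest_valid[slot] = true;
    if (!m->lazy)
        toy_sync(m);                    /* eager: evict the old page now */
    if (!toy_shadow_has(m, vpn)) {
        if (m->shadow_len == SHADOW_MAX)
            toy_sync(m);                /* reclaim stale pages when full */
        assert(m->shadow_len < SHADOW_MAX);
        m->shadow[m->shadow_len++] = vpn;
    }
}

/* Guest reads a slot back: the exact TLB state becomes observable, so
 * the shadow must match the architectural TLB before answering. */
uint32_t toy_read_slot(ToyMMU *m, int slot) {
    assert(slot >= 0 && slot < GUEST_TLB_SIZE);
    toy_sync(m);
    return m->guest[slot];
}
```

In this model, filling all four slots and then loading a fifth page costs one
flush immediately in eager mode, while in lazy mode the replaced page stays
in the shadow and the flush happens only if the guest later calls something
like toy_read_slot. Whether SH can use the same trick depends, as noted in
the quoted text, on whether the guest ever observes URC.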