In-Reply-To: <5b31733c0906291126l5c4df229pdfd9e7faf88aa292@mail.gmail.com>
References: <5b31733c0906291050w355b2fe0n9ac6f62f3486e47c@mail.gmail.com>
 <761ea48b0906291059g13103602uc678cc318ac63015@mail.gmail.com>
 <5b31733c0906291126l5c4df229pdfd9e7faf88aa292@mail.gmail.com>
Date: Mon, 29 Jun 2009 21:58:56 +0300
Subject: Re: [Qemu-devel] [PATCH 0/3] RFC: TCG ARM optimizations
From: Blue Swirl
To: Filip Navara
Cc: Laurent Desnogues, qemu-devel

On 6/29/09, Filip Navara wrote:
> Lastly, the code generated for softmmu memory loads/stores could
> probably be optimized in some cases. It uses hard-coded registers.
> It's not optimized for multiple stores to adjacent locations (pushing
> multiple registers to stack) and does all the calculations again and
> again. This results not only in recomputing numbers we already have
> (as long as the stack is still on the same guest page), but also in
> huge TBs.
> I imagine that doesn't help the processor cache too much.
> This would probably benefit all targets. In fact I believe the softmmu
> code could be moved out of the TCG target-specific code and into the
> main code (with the possibility to override it with optimized
> version).

Interesting. We could add a new optional TCG instruction op_ld_g2h
(extracted from qemu_ld) that performs the TLB lookup and returns the
host address. When multiple accesses near the same guest address are
detected (how?), the translator can reuse the host address, perform
some math and check whether the guest page is still the same. If it is,
ld_raw can be used; otherwise the host address has to be recalculated.

On the performance side, qemu_ld on a Sparc host uses 9 instructions in
the TLB hit case before the access. Maybe this would lower the number
a bit, but not by much.
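
To make the idea concrete, here is a rough sketch in plain C rather
than in TCG ops. Everything in it is made up for illustration: the toy
direct-mapped TLB, the flat guest_ram array, and the names ld_g2h,
G2HCache, ld32_reuse and load_multiple are not existing QEMU code.
ld_g2h stands in for the proposed op_ld_g2h, and the memcpy from the
host pointer stands in for ld_raw. It assumes aligned 32-bit accesses
and that the TLB is not flushed between the accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define PAGE_MASK (~(uint32_t)(PAGE_SIZE - 1))
#define TLB_SIZE  256

/* Hypothetical toy TLB entry: guest page -> host page base. */
typedef struct {
    uint32_t guest_page;  /* page-aligned guest address, UINT32_MAX if invalid */
    uint8_t *host_page;
} TLBEntry;

static uint8_t guest_ram[16 * PAGE_SIZE];  /* flat guest memory, for the demo only */
static TLBEntry tlb[TLB_SIZE];
static unsigned long tlb_lookups;          /* counts full translations */

static void tlb_reset(void)
{
    for (size_t i = 0; i < TLB_SIZE; i++) {
        tlb[i].guest_page = UINT32_MAX;    /* never equals a page-aligned address */
        tlb[i].host_page = NULL;
    }
    tlb_lookups = 0;
}

/* Stand-in for op_ld_g2h: full TLB lookup, returns the host address. */
static uint8_t *ld_g2h(uint32_t gaddr)
{
    uint32_t page = gaddr & PAGE_MASK;
    TLBEntry *e = &tlb[(gaddr >> PAGE_BITS) % TLB_SIZE];

    tlb_lookups++;
    if (e->guest_page != page) {           /* miss: refill from flat guest RAM */
        assert(page < sizeof(guest_ram));
        e->guest_page = page;
        e->host_page = guest_ram + page;
    }
    return e->host_page + (gaddr & (PAGE_SIZE - 1));
}

/* Last translation, kept by the (hypothetical) translator for reuse. */
typedef struct {
    uint32_t guest_page;
    uint8_t *host_page;
    int valid;
} G2HCache;

/* 32-bit load that reuses the cached host page when the guest page matches. */
static uint32_t ld32_reuse(G2HCache *c, uint32_t gaddr)
{
    uint32_t off = gaddr & (PAGE_SIZE - 1);
    uint32_t val;

    if (!c->valid || c->guest_page != (gaddr & PAGE_MASK)) {
        /* Different page (or first access): redo the full lookup. */
        c->host_page = ld_g2h(gaddr) - off;
        c->guest_page = gaddr & PAGE_MASK;
        c->valid = 1;
    }
    memcpy(&val, c->host_page + off, sizeof(val));  /* the "ld_raw" part */
    return val;
}

/* Load n adjacent words (e.g. popping registers); returns lookups performed. */
static unsigned long load_multiple(uint32_t sp, uint32_t *regs, int n)
{
    G2HCache c = { 0, NULL, 0 };
    unsigned long before = tlb_lookups;

    for (int i = 0; i < n; i++) {
        regs[i] = ld32_reuse(&c, sp + 4u * (uint32_t)i);
    }
    return tlb_lookups - before;
}
```

Loading eight words starting at guest address 0x1ff0 crosses one page
boundary, so this sketch does two full lookups instead of eight. In
generated code the page comparison would itself cost a few host
instructions per access, so whether it beats the existing TLB hit path
is exactly the open question above.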