Message-ID: <48FDDB12.6050701@codemonkey.ws>
Date: Tue, 21 Oct 2008 08:37:22 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH][RFC] Optimize ld[bwlq]_phys and st[bwlq]_phys
References: <1224014348-13765-1-git-send-email-aliguori@us.ibm.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Blue Swirl wrote:
> On 10/14/08, Anthony Liguori wrote:
>
>> This patch optimizes the ld and st functions that operate on physical addresses.
>> Right now, a number of them default to cpu_phys_memory_{read,write} which is
>> very slow. As long as the operations are aligned, it is safe to just lookup
>> the page and directly read/write the data via ld_p or st_p.
>>
>> This patch introduces a common function since all of these functions are
>> roughly the same. I've tested x86 and sparc with Linux and Windows guests.
>>
>> I'm pretty confident that this new code is functionally equivalent but I wanted
>> to have someone else confirm this.
>>
>
> Why there are special cases for lduw and stw?
>

I figured the 'u' in lduw stood for unaligned. The "optimization" only works
if the physical address is aligned, because it doesn't handle the case where a
word would cross a physical page boundary. This is why it falls back to
cpu_physical_memory_rw, which can handle this.

ldub is the same, but since it's a single byte, it can't cross a page boundary,
so I don't think the special casing is necessary (or really possible :-)).

I wasn't sure about stw. The current code didn't assume alignment, and the name
doesn't imply that alignment is necessary. I added a fallback case just to be
on the safe side in case any code depends on it today.

FWIW, I'm still not sure this optimization really makes anything faster.

> I'd add 'inline' to the common functions, otherwise looks OK.
>

In general, I tend to avoid inline unless it's required for correctness (like
in a header file in combination with static). The compiler can usually make
better decisions about inlining than a person can. I'm not at all interested in
starting a long thread about the merits of 'inline', but suffice it to say that
this is not an uncommon view.

Regards,

Anthony Liguori
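
To make the page-boundary argument concrete, here is a minimal sketch in C. It is not the patch and not QEMU's actual code: the toy page table and the names lookup_page, slow_rw, is_aligned, crosses_page, load32 and store32 are all hypothetical, and values are kept in host byte order for simplicity. It only illustrates why a naturally aligned access can go straight to the page, while an unaligned one needs a slower path that can span two pages.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define PAGE_MASK (PAGE_SIZE - 1)
#define NUM_PAGES 4

/* Toy guest RAM (hypothetical): each guest page lives in its own host
 * buffer, so pages adjacent in the guest are not adjacent in the host. */
static uint8_t page_store[NUM_PAGES][PAGE_SIZE];
static const unsigned page_map[NUM_PAGES] = { 2, 0, 3, 1 };

/* Return the host buffer backing the guest page that contains addr. */
static uint8_t *lookup_page(uint64_t addr)
{
    return page_store[page_map[(addr >> PAGE_BITS) % NUM_PAGES]];
}

/* Slow path: one page lookup per byte, so the access may span pages. */
static void slow_rw(uint64_t addr, uint8_t *buf, unsigned len, int is_write)
{
    for (unsigned i = 0; i < len; i++) {
        uint8_t *p = lookup_page(addr + i) + ((addr + i) & PAGE_MASK);
        if (is_write) {
            *p = buf[i];
        } else {
            buf[i] = *p;
        }
    }
}

/* Natural alignment test; len must be a power of two. */
static int is_aligned(uint64_t addr, unsigned len)
{
    return (addr & (len - 1)) == 0;
}

/* True if the access [addr, addr + len) extends past the end of its page. */
static int crosses_page(uint64_t addr, unsigned len)
{
    return (addr & PAGE_MASK) + len > PAGE_SIZE;
}

static uint32_t load32(uint64_t addr)
{
    uint32_t val;

    if (is_aligned(addr, sizeof(val))) {
        /* Fast path: an aligned access stays inside a single page. */
        assert(!crosses_page(addr, sizeof(val)));
        memcpy(&val, lookup_page(addr) + (addr & PAGE_MASK), sizeof(val));
    } else {
        /* Fallback: may straddle two pages, so go byte by byte. */
        slow_rw(addr, (uint8_t *)&val, sizeof(val), 0);
    }
    return val;
}

static void store32(uint64_t addr, uint32_t val)
{
    if (is_aligned(addr, sizeof(val))) {
        assert(!crosses_page(addr, sizeof(val)));
        memcpy(lookup_page(addr) + (addr & PAGE_MASK), &val, sizeof(val));
    } else {
        slow_rw(addr, (uint8_t *)&val, sizeof(val), 1);
    }
}
```

In this sketch, a 4-byte store at PAGE_SIZE - 2 is unaligned and straddles two pages, so it takes the byte-wise fallback; a store at offset 8 is aligned and takes the direct path. A single-byte access can never satisfy crosses_page, which is the reason no fallback is needed for the byte variants.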