From: Paolo Bonzini
To: Gonglei, qemu-devel@nongnu.org
Cc: peter.huangpeng@huawei.com
Subject: Re: [Qemu-devel] [PATCH 0/3] memory: an optimization
Date: Sat, 20 Feb 2016 10:47:36 +0100
Message-ID: <56C83638.2010406@redhat.com>
In-Reply-To: <1455935721-8804-1-git-send-email-arei.gonglei@huawei.com>
References: <1455935721-8804-1-git-send-email-arei.gonglei@huawei.com>

On 20/02/2016 03:35, Gonglei wrote:
> Perf top tells me qemu_get_ram_ptr consumes too many CPU cycles.
>
>> 22.56%  qemu-kvm            [.] address_space_translate
>> 13.29%  qemu-kvm            [.] qemu_get_ram_ptr
>>  4.71%  qemu-kvm            [.] phys_page_find
>>  4.43%  qemu-kvm            [.] address_space_translate_internal
>>  3.47%  libpthread-2.19.so  [.] __pthread_mutex_unlock_usercnt
>>  3.08%  qemu-kvm            [.] qemu_ram_addr_from_host
>>  2.62%  qemu-kvm            [.] address_space_map
>>  2.61%  libc-2.19.so        [.] _int_malloc
>>  2.58%  libc-2.19.so        [.] _int_free
>>  2.38%  libc-2.19.so        [.] malloc
>>  2.06%  libpthread-2.19.so  [.] pthread_mutex_lock
>>  1.68%  libc-2.19.so        [.] malloc_consolidate
>>  1.35%  libc-2.19.so        [.] __memcpy_sse2_unaligned
>>  1.23%  qemu-kvm            [.] lduw_le_phys
>>  1.18%  qemu-kvm            [.] find_next_zero_bit
>>  1.02%  qemu-kvm            [.] object_unref
>
> And Paolo suggested that we can get rid of qemu_get_ram_ptr
> by storing the RAMBlock pointer into the memory region,
> instead of the ram_addr_t value. And after applying this change,
> I got much better performance indeed.

What's the gain like?

I've not reviewed the patch in depth, but what I can say is that I like
it a lot.  It only does the bare minimum needed to provide the
optimization, but this also makes it very simple to understand.  More
cleanups and further optimizations are possible (including removing
mr->ram_addr completely), but your patches really do one thing and do
it well.

Good job!

Paolo
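
For readers following the thread, the idea quoted above (keep a RAMBlock pointer in the memory region instead of a ram_addr_t that has to be looked up on every access) can be sketched roughly as below. This is only a simplified illustration: the struct layouts and the helper names host_ptr_by_lookup() and host_ptr_by_block() are hypothetical stand-ins, not QEMU's real definitions and not code from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT QEMU's real definitions. */
typedef uint64_t ram_addr_t;

typedef struct RAMBlock {
    ram_addr_t offset;        /* start of the block in ram_addr_t space */
    ram_addr_t length;        /* size of the block in bytes */
    uint8_t *host;            /* host memory backing the block */
    struct RAMBlock *next;    /* next block in a simple linked list */
} RAMBlock;

typedef struct MemoryRegion {
    ram_addr_t ram_addr;      /* address-only scheme: must be resolved */
    RAMBlock *ram_block;      /* pointer scheme: block is already known */
} MemoryRegion;

/* Address-only scheme: every access searches for the owning block. */
static uint8_t *host_ptr_by_lookup(RAMBlock *blocks, ram_addr_t addr)
{
    for (RAMBlock *b = blocks; b != NULL; b = b->next) {
        if (addr >= b->offset && addr - b->offset < b->length) {
            return b->host + (addr - b->offset);
        }
    }
    return NULL;              /* no block covers this address */
}

/* Pointer scheme: the region already holds its block, so no search. */
static uint8_t *host_ptr_by_block(const MemoryRegion *mr,
                                  ram_addr_t offset_in_block)
{
    assert(mr->ram_block != NULL);
    assert(offset_in_block < mr->ram_block->length);
    return mr->ram_block->host + offset_in_block;
}
```

In this sketch both helpers return the same host pointer for the same byte; the second simply skips the per-access search, which is the kind of cost the perf output above attributes to qemu_get_ram_ptr.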