From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ulrich Drepper
Subject: Re: [PATCH v2] qemu-kvm: Speed up of the dirty-bitmap-traveling
Date: Wed, 10 Feb 2010 05:10:18 -0800
Message-ID: <4B72B03A.6020208@redhat.com>
References: <4B728FF9.6010707@lab.ntt.co.jp>
In-Reply-To: <4B728FF9.6010707@lab.ntt.co.jp>
To: OHMURA Kei
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Avi Kivity, mtosatti@redhat.com

On 02/10/2010 02:52 AM, OHMURA Kei wrote:
>      for (i = 0; i < len; i++) {
> -        c = bitmap[i];
> -        while (c > 0) {
> -            j = ffsl(c) - 1;
> -            c &= ~(1u << j);
> -            page_number = i * 8 + j;
> -            addr1 = page_number * TARGET_PAGE_SIZE;
> -            addr = offset + addr1;
> -            ram_addr = cpu_get_physical_page_desc(addr);
> -            cpu_physical_memory_set_dirty(ram_addr);
> -            n++;
> +        if (bitmap_ul[i] != 0) {
> +            c = le_bswap(bitmap_ul[i], HOST_LONG_BITS);
> +            while (c > 0) {
> +                j = ffsl(c) - 1;
> +                c &= ~(1ul << j);
> +                page_number = i * HOST_LONG_BITS + j;
> +                addr1 = page_number * TARGET_PAGE_SIZE;
> +                addr = offset + addr1;
> +                ram_addr = cpu_get_physical_page_desc(addr);
> +                cpu_physical_memory_set_dirty(ram_addr);
> +            }

If you're optimizing this code you might want to go all the way. The
compiler might not see through the bswap call, so it cannot tell that c
is non-zero on entry to the inner loop, and it might create unnecessary
data dependencies. That is especially problematic if the bitmap is
really sparse.

Also, the outer test is != while the inner test is >. Be consistent. I
suggest replacing the inner loop with

    do {
        ...
    } while (c != 0);

Depending on how sparsely the bitmap is populated, this might reduce the
number of data dependencies quite a bit.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
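
For illustration only, here is a minimal standalone sketch of the loop shape suggested above: test the raw word once, then walk its set bits in a do/while so the loop condition is only evaluated after a bit has been consumed. It is not the qemu-kvm code. The helper name `collect_dirty_pages`, the `LONG_BITS` macro (standing in for HOST_LONG_BITS) and the output array are hypothetical. The sketch assumes the words are already in host byte order, so the le_bswap step is left out, and it uses the GCC/Clang builtin `__builtin_ctzl` in place of `ffsl(c) - 1`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits per unsigned long; stands in for HOST_LONG_BITS from the patch. */
#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper (not part of qemu-kvm): store the index of every
 * set bit of a host-order bitmap in `out`, up to `cap` entries, and
 * return the total number of set bits seen.
 */
static size_t collect_dirty_pages(const unsigned long *bitmap, size_t len,
                                  size_t *out, size_t cap)
{
    size_t n = 0;

    assert(bitmap != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned long c = bitmap[i];

        /* Single test on the raw word; sparse bitmaps mostly stop here. */
        if (c == 0) {
            continue;
        }

        /* c is known to be non-zero, so no test is needed on entry. */
        do {
            /* Index of the lowest set bit (GCC/Clang builtin, c != 0). */
            size_t j = (size_t)__builtin_ctzl(c);

            c &= c - 1; /* clear the lowest set bit */

            if (n < cap) {
                out[n] = i * LONG_BITS + j; /* page number within the slot */
            }
            n++;
        } while (c != 0);
    }
    return n;
}
```

In the real function, the body of the do/while would compute the address from the page number and call cpu_physical_memory_set_dirty() as in the quoted patch; recording the index here just keeps the example self-contained.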