From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:38590)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1QHzQW-0002kr-7F
	for qemu-devel@nongnu.org; Thu, 05 May 2011 10:21:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1QHzQV-0008UP-2O
	for qemu-devel@nongnu.org; Thu, 05 May 2011 10:21:12 -0400
Received: from mx1.redhat.com ([209.132.183.28]:35255)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1QHzQU-0008UC-RO
	for qemu-devel@nongnu.org; Thu, 05 May 2011 10:21:11 -0400
From: Alex Williamson
In-Reply-To: <20110505132106.GG30119@redhat.com>
References: <20110503182039.28430.26530.stgit@s20.home>
	<20110503183652.28430.83897.stgit@s20.home>
	<20110505132106.GG30119@redhat.com>
Content-Type: text/plain; charset="UTF-8"
Date: Thu, 05 May 2011 08:21:06 -0600
Message-ID: <1304605266.3081.5.camel@x201>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch
	contiguous addresses when playing catchup
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Michael S. Tsirkin"
Cc: jan.kiszka@siemens.com, qemu-devel@nongnu.org, armbru@redhat.com

On Thu, 2011-05-05 at 16:21 +0300, Michael S. Tsirkin wrote:
> On Tue, May 03, 2011 at 12:36:58PM -0600, Alex Williamson wrote:
> > When a phys memory client registers and we play catchup by walking
> > the page tables, we can make a huge improvement in the number of
> > times the set_memory callback is called by batching contiguous
> > pages together.  With a 4G guest, this reduces the number of
> > callbacks at registration from 1048866 to 296.
> > 
> > Signed-off-by: Alex Williamson
> > ---
> > 
> >  exec.c |   38 ++++++++++++++++++++++++++++++++------
> >  1 files changed, 32 insertions(+), 6 deletions(-)
> > 
> > diff --git a/exec.c b/exec.c
> > index bbd5c86..a0678a4 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1741,14 +1741,21 @@ static int cpu_notify_migration_log(int enable)
> >      return 0;
> >  }
> >  
> > +struct last_map {
> > +    target_phys_addr_t start_addr;
> > +    ram_addr_t size;
> 
> A bit worried that ram_addr_t size might conceivably overflow
> (it's just a long, and we could have 4G of ram).  Break it out when
> it fills up?

struct CPUPhysMemoryClient {
    void (*set_memory)(struct CPUPhysMemoryClient *client,
                       target_phys_addr_t start_addr,
                       ram_addr_t size,
                       ram_addr_t phys_offset);

ram_addr_t seems to be the standard for describing these types of
things.  It's an unsigned long, so 4G is only a concern for 32-bit
builds, which don't support that much memory anyway.
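
That said, if wrap on 32-bit ever did become a problem, the accounting
could break a run out early, along the lines you suggest.  A minimal
sketch for discussion (the last_map_account() helper is hypothetical,
not part of this patch):

/* Sketch only: extend the current run while it stays contiguous, but
 * flush it early if adding another page would wrap ram_addr_t. */
static void last_map_account(CPUPhysMemoryClient *client,
                             struct last_map *map,
                             target_phys_addr_t start_addr,
                             ram_addr_t phys_offset)
{
    if (map->size &&
        start_addr == map->start_addr + map->size &&
        phys_offset == map->phys_offset + map->size &&
        map->size <= (ram_addr_t)-1 - TARGET_PAGE_SIZE) {
        map->size += TARGET_PAGE_SIZE;  /* contiguous, no wrap: extend */
        return;
    }
    if (map->size) {
        /* discontiguous, or the run filled up: break it out */
        client->set_memory(client, map->start_addr,
                           map->size, map->phys_offset);
    }
    map->start_addr = start_addr;  /* start a new run from this page */
    map->size = TARGET_PAGE_SIZE;
    map->phys_offset = phys_offset;
}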
Please apply.  Thanks,

Alex

> > +    ram_addr_t phys_offset;
> > +};
> > +
> >  /* The l1_phys_map provides the upper P_L1_BITs of the guest physical
> >   * address.  Each intermediate table provides the next L2_BITs of guest
> >   * physical address space.  The number of levels vary based on host and
> >   * guest configuration, making it efficient to build the final guest
> >   * physical address by seeding the L1 offset and shifting and adding in
> >   * each L2 offset as we recurse through them.
> >   */
> > -static void phys_page_for_each_1(CPUPhysMemoryClient *client,
> > -                                 int level, void **lp, target_phys_addr_t addr)
> > +static void phys_page_for_each_1(CPUPhysMemoryClient *client, int level,
> > +                                 void **lp, target_phys_addr_t addr,
> > +                                 struct last_map *map)
> >  {
> >      int i;
> >  
> > @@ -1760,15 +1767,29 @@ static void phys_page_for_each_1(CPUPhysMemoryClient *client,
> >          addr <<= L2_BITS + TARGET_PAGE_BITS;
> >          for (i = 0; i < L2_SIZE; ++i) {
> >              if (pd[i].phys_offset != IO_MEM_UNASSIGNED) {
> > -                client->set_memory(client, addr | i << TARGET_PAGE_BITS,
> > -                                   TARGET_PAGE_SIZE, pd[i].phys_offset);
> > +                target_phys_addr_t start_addr = addr | i << TARGET_PAGE_BITS;
> > +
> > +                if (map->size &&
> > +                    start_addr == map->start_addr + map->size &&
> > +                    pd[i].phys_offset == map->phys_offset + map->size) {
> > +
> > +                    map->size += TARGET_PAGE_SIZE;
> > +                    continue;
> > +                } else if (map->size) {
> > +                    client->set_memory(client, map->start_addr,
> > +                                       map->size, map->phys_offset);
> > +                }
> > +
> > +                map->start_addr = start_addr;
> > +                map->size = TARGET_PAGE_SIZE;
> > +                map->phys_offset = pd[i].phys_offset;
> >              }
> >          }
> >      } else {
> >          void **pp = *lp;
> >          for (i = 0; i < L2_SIZE; ++i) {
> >              phys_page_for_each_1(client, level - 1, pp + i,
> > -                                 (addr << L2_BITS) | i);
> > +                                 (addr << L2_BITS) | i, map);
> >          }
> >      }
> >  }
> > @@ -1776,9 +1797,14 @@ static void phys_page_for_each_1(CPUPhysMemoryClient *client,
> >  static void phys_page_for_each(CPUPhysMemoryClient *client)
> >  {
> >      int i;
> > +    struct last_map map = { 0 };
> > +
> 
> Nit: just {} is enough.
> 
> >      for (i = 0; i < P_L1_SIZE; ++i) {
> >          phys_page_for_each_1(client, P_L1_SHIFT / L2_BITS - 1,
> > -                             l1_phys_map + i, i);
> > +                             l1_phys_map + i, i, &map);
> > +    }
> > +    if (map.size) {
> > +        client->set_memory(client, map.start_addr, map.size, map.phys_offset);
> >      }
> >  }
> > 
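
For anyone who wants to watch the batching happen, a throwaway client
registered via cpu_register_phys_memory_client() shows it directly.
Sketch only; the log_set_memory()/log_client names are made up here,
and a real client would also fill in the other callbacks (e.g.
sync_dirty_bitmap, migration_log):

/* Logs one line per batched range.  With this patch a 4G guest produces
 * ~296 set_memory calls at registration instead of one per page. */
static void log_set_memory(struct CPUPhysMemoryClient *client,
                           target_phys_addr_t start_addr,
                           ram_addr_t size, ram_addr_t phys_offset)
{
    printf("range " TARGET_FMT_plx " size %lx offset %lx\n",
           start_addr, (unsigned long)size, (unsigned long)phys_offset);
}

static CPUPhysMemoryClient log_client = {
    .set_memory = log_set_memory,
};

static void register_log_client(void)
{
    /* registration replays all current mappings through set_memory */
    cpu_register_phys_memory_client(&log_client);
}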