From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kirti Wankhede
Message-ID: <0326be05-c25b-adab-c31d-b9e337a6247a@nvidia.com>
Date: Thu, 12 Apr 2018 00:34:03 +0530
In-Reply-To: <20180411115511.342c75b9@w520.home>
References: <20180411172014.24711-1-clg@kaod.org> <20180411115511.342c75b9@w520.home>
Subject: Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device
To: Alex Williamson, Cédric Le Goater
Cc: qemu-devel@nongnu.org, Juan Quintela, "Dr . David Alan Gilbert", David Gibson, Yulei Zhang, kevin.tian@intel.com, joonas.lahtinen@linux.intel.com, zhenyuw@linux.intel.com, zhi.a.wang@intel.com

On 4/11/2018 11:25 PM, Alex Williamson wrote:
> [cc +folks working on vfio-mdev migration]
>
> On Wed, 11 Apr 2018 19:20:14 +0200
> Cédric Le Goater wrote:
>
>> Here is some context for this strange change request.
>>
>> On the POWER9 processor, the XIVE interrupt controller can control
>> interrupt sources using MMIO to trigger events, to EOI or to turn off
>> the sources. Priority management and interrupt acknowledgment is also
>> controlled by MMIO in the presenter subengine.
>>
>> These MMIO regions are exposed to guests in QEMU with a set of 'ram
>> device' memory mappings, similarly to VFIO, and the VMAs are populated
>> dynamically with the appropriate pages using a fault handler.
>>
>> But these regions are an issue for migration. We need to discard the
>> associated RAMBlocks from the RAM state on the source VM and let the
>> destination VM rebuild the memory mappings on the new host in the
>> post_load() operation just before resuming the system.
>>
>> This is the goal of the following proposal. Does it make sense? It
>> seems to be working well enough to migrate a running guest, but there
>> might be a better, more subtle approach.
>
> Yulei, is this something you've run into with GVT-g migration? I don't
> see how we can read from or write to ram_device regions in a useful way
> during migration anyway, so the change initially looks correct to me.
> Thanks,
>

I ran into this problem with vGPU migration. I have a very similar patch
in my local branch to test vGPU migration. This patch looks good to me.
Thanks,
Kirti

> Alex
>

>> Signed-off-by: Cédric Le Goater
>> ---
>>  migration/ram.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 0e90efa09236..6404ccd046d8 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -780,6 +780,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>>      unsigned long *bitmap = rb->bmap;
>>      unsigned long next;
>>
>> +    if (memory_region_is_ram_device(rb->mr)) {
>> +        return size;
>> +    }
>> +
>>      if (rs->ram_bulk_stage && start > 0) {
>>          next = start + 1;
>>      } else {
>> @@ -826,6 +830,9 @@ uint64_t ram_pagesize_summary(void)
>>      uint64_t summary = 0;
>>
>>      RAMBLOCK_FOREACH(block) {
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          summary |= block->page_size;
>>      }
>>
>> @@ -850,6 +857,9 @@ static void migration_bitmap_sync(RAMState *rs)
>>      qemu_mutex_lock(&rs->bitmap_mutex);
>>      rcu_read_lock();
>>      RAMBLOCK_FOREACH(block) {
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>>      }
>>      rcu_read_unlock();
>> @@ -1499,6 +1509,10 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>>      size_t pagesize_bits =
>>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>>
>> +    if (memory_region_is_ram_device(pss->block->mr)) {
>> +        return 0;
>> +    }
>> +
>>      do {
>>          tmppages = ram_save_target_page(rs, pss, last_stage);
>>          if (tmppages < 0) {
>> @@ -1588,6 +1602,9 @@ uint64_t ram_bytes_total(void)
>>
>>      rcu_read_lock();
>>      RAMBLOCK_FOREACH(block) {
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          total += block->used_length;
>>      }
>>      rcu_read_unlock();
>> @@ -1643,6 +1660,9 @@ static void ram_save_cleanup(void *opaque)
>>      memory_global_dirty_log_stop();
>>
>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          g_free(block->bmap);
>>          block->bmap = NULL;
>>          g_free(block->unsentmap);
>> @@ -1710,6 +1730,9 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
>>
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          while (run_start < range) {
>>              unsigned long run_end = find_next_bit(bitmap, range, run_start + 1);
>>              ram_discard_range(block->idstr, run_start << TARGET_PAGE_BITS,
>> @@ -1784,8 +1807,13 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>>      int ret;
>>
>>      RAMBLOCK_FOREACH(block) {
>> -        PostcopyDiscardState *pds =
>> -            postcopy_discard_send_init(ms, block->idstr);
>> +        PostcopyDiscardState *pds;
>> +
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>> +
>> +        pds = postcopy_discard_send_init(ms, block->idstr);
>>
>>          /*
>>           * Postcopy sends chunks of bitmap over the wire, but it
>> @@ -1996,6 +2024,10 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>>          unsigned long *bitmap = block->bmap;
>>          unsigned long *unsentmap = block->unsentmap;
>>
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>> +
>>          if (!unsentmap) {
>>              /* We don't have a safe way to resize the sentmap, so
>>               * if the bitmap was resized it will be NULL at this
>> @@ -2151,6 +2183,9 @@ static void ram_list_init_bitmaps(void)
>>      /* Skip setting bitmap if there is no RAM */
>>      if (ram_bytes_total()) {
>>          QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +            if (memory_region_is_ram_device(block->mr)) {
>> +                continue;
>> +            }
>>              pages = block->max_length >> TARGET_PAGE_BITS;
>>              block->bmap = bitmap_new(pages);
>>              bitmap_set(block->bmap, 0, pages);
>> @@ -2227,6 +2262,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>
>>      RAMBLOCK_FOREACH(block) {
>> +        if (memory_region_is_ram_device(block->mr)) {
>> +            continue;
>> +        }
>>          qemu_put_byte(f, strlen(block->idstr));
>>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>>          qemu_put_be64(f, block->used_length);
>