From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44017) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Xo14Z-0000OP-Vg for qemu-devel@nongnu.org; Mon, 10 Nov 2014 21:20:49 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Xo14X-0004o9-Q7 for qemu-devel@nongnu.org; Mon, 10 Nov 2014 21:20:47 -0500 Received: from ozlabs.org ([2401:3900:2:1::2]:55489) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Xo14X-0004nb-6M for qemu-devel@nongnu.org; Mon, 10 Nov 2014 21:20:45 -0500 Date: Tue, 11 Nov 2014 12:39:35 +1100 From: David Gibson Message-ID: <20141111013935.GE15270@voom.redhat.com> References: <1412358473-31398-1-git-send-email-dgilbert@redhat.com> <1412358473-31398-40-git-send-email-dgilbert@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Q8BnQc91gJZX4vDc" Content-Disposition: inline In-Reply-To: <1412358473-31398-40-git-send-email-dgilbert@redhat.com> Subject: Re: [Qemu-devel] [PATCH v4 39/47] postcopy_ram.c: place_page and helpers List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert (git)" Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, lilei@linux.vnet.ibm.com, quintela@redhat.com, cristian.klein@cs.umu.se, qemu-devel@nongnu.org, amit.shah@redhat.com, yanghy@cn.fujitsu.com --Q8BnQc91gJZX4vDc Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Oct 03, 2014 at 06:47:45PM +0100, Dr. David Alan Gilbert (git) wrot= e: > From: "Dr. David Alan Gilbert" >=20 > postcopy_place_page (etc) provide a way for postcopy to place a page > into guests memory atomically (using the new remap_anon_pages syscall). >=20 > Signed-off-by: Dr. David Alan Gilbert > --- > include/migration/migration.h | 2 + > include/migration/postcopy-ram.h | 23 +++++++ > postcopy-ram.c | 145 +++++++++++++++++++++++++++++++++= +++++- > 3 files changed, 168 insertions(+), 2 deletions(-) >=20 > diff --git a/include/migration/migration.h b/include/migration/migration.h > index 5bc01d5..58ac7bf 100644 > --- a/include/migration/migration.h > +++ b/include/migration/migration.h > @@ -96,6 +96,8 @@ struct MigrationIncomingState { > QEMUFile *return_path; > QemuMutex rp_mutex; /* We send replies from multiple threads= */ > PostcopyPMI postcopy_pmi; > + void *postcopy_tmp_page; > + long postcopy_place_skipped; /* Check for incorrect place = ops */ > }; > =20 > MigrationIncomingState *migration_incoming_get_current(void); > diff --git a/include/migration/postcopy-ram.h b/include/migration/postcop= y-ram.h > index 413b670..0210491 100644 > --- a/include/migration/postcopy-ram.h > +++ b/include/migration/postcopy-ram.h > @@ -80,4 +80,27 @@ void postcopy_discard_send_chunk(MigrationState *ms, P= ostcopyDiscardState *pds, > void postcopy_discard_send_finish(MigrationState *ms, > PostcopyDiscardState *pds); > =20 > +/* > + * Place a zero'd page of memory at *host > + * returns 0 on success > + */ > +int postcopy_place_zero_page(MigrationIncomingState *mis, void *host, > + long bitmap_offset); > + > +/* > + * Place a page (from) at (host) efficiently > + * There are restrictions on how 'from' must be mapped, in general be= st > + * to use other postcopy_ routines to allocate. > + * returns 0 on success > + */ > +int postcopy_place_page(MigrationIncomingState *mis, void *host, void *f= rom, > + long bitmap_offset); > + > +/* > + * Allocate a page of memory that can be mapped at a later point in time > + * using postcopy_place_page > + * Returns: Pointer to allocated page > + */ > +void *postcopy_get_tmp_page(MigrationIncomingState *mis, long bitmap_off= set); > + > #endif > diff --git a/postcopy-ram.c b/postcopy-ram.c > index 8b2a035..19d4b20 100644 > --- a/postcopy-ram.c > +++ b/postcopy-ram.c > @@ -229,7 +229,6 @@ static PostcopyPMIState postcopy_pmi_get_state_nolock( > } > =20 > /* Retrieve the state of the given page */ > -__attribute__ (( unused )) /* Until later in patch series */ > static PostcopyPMIState postcopy_pmi_get_state(MigrationIncomingState *m= is, > size_t bitmap_index) > { > @@ -245,7 +244,6 @@ static PostcopyPMIState postcopy_pmi_get_state(Migrat= ionIncomingState *mis, > * Set the page state to the given state if the previous state was as ex= pected > * Return the actual previous state. > */ > -__attribute__ (( unused )) /* Until later in patch series */ > static PostcopyPMIState postcopy_pmi_change_state(MigrationIncomingState= *mis, > size_t bitmap_index, > PostcopyPMIState expected_sta= te, > @@ -464,6 +462,7 @@ static int cleanup_area(const char *block_name, void = *host_addr, > int postcopy_ram_incoming_init(MigrationIncomingState *mis, size_t ram_p= ages) > { > postcopy_pmi_init(mis, ram_pages); > + mis->postcopy_place_skipped =3D -1; > =20 > if (qemu_ram_foreach_block(init_area, mis)) { > return -1; > @@ -482,6 +481,10 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingS= tate *mis) > return -1; > } > =20 > + if (mis->postcopy_tmp_page) { > + munmap(mis->postcopy_tmp_page, getpagesize()); > + mis->postcopy_tmp_page =3D NULL; > + } > return 0; > } > =20 > @@ -551,6 +554,126 @@ int postcopy_ram_enable_notify(MigrationIncomingSta= te *mis) > return 0; > } > =20 > +/* > + * Place a zero'd page of memory at *host > + * returns 0 on success > + * bitmap_offset: Index into the migration bitmaps > + */ > +int postcopy_place_zero_page(MigrationIncomingState *mis, void *host, > + long bitmap_offset) > +{ > + void *tmp =3D postcopy_get_tmp_page(mis, bitmap_offset); > + if (!tmp) { > + return -ENOMEM; > + } > + *(char *)tmp =3D 0; > + return postcopy_place_page(mis, host, tmp, bitmap_offset); > +} > + > +/* > + * Place a target page (from) at (host) efficiently > + * There are restrictions on how 'from' must be mapped, in general be= st > + * to use other postcopy_ routines to allocate. > + * returns 0 on success > + * bitmap_offset: Index into the migration bitmaps > + * > + * Where HPS > TPS it holds off doing the place until the last TP in the= HP > + * and assumes (from, host) point to the last TP in a continuous HP > + */ > +int postcopy_place_page(MigrationIncomingState *mis, void *host, void *f= rom, > + long bitmap_offset) > +{ > + PostcopyPMIState old_state, tmp_state; > + size_t hps =3D sysconf(_SC_PAGESIZE); > + > + /* Only place the page when the last target page within the hp arriv= es */ > + if ((bitmap_offset + 1) & (mis->postcopy_pmi.host_bits - 1)) { > + DPRINTF("%s: Skipping incomplete hp host=3D%p from=3D%p bitmap_o= ffset=3D%lx", > + __func__, host, from, bitmap_offset); > + mis->postcopy_place_skipped =3D bitmap_offset; > + return 0; > + } > + > + /* > + * If we skip a page (above) we should end up placing that page befo= re > + * doing anything with other host pages. > + */ > + if (mis->postcopy_place_skipped !=3D -1) { > + assert((bitmap_offset & ~(mis->postcopy_pmi.host_bits - 1)) =3D= =3D > + (mis->postcopy_place_skipped & > + ~(mis->postcopy_pmi.host_bits - 1))); > + } > + mis->postcopy_place_skipped =3D -1; All the above logic seems like you're making assumptions about exactly how this function will be invoked which are fragile and a layering violation. It seems like these lower level functions should work only in host pages and have the target->host page consolidation up in the protocol handling layer. Better yet would be to build it into the protocol itself that reuqests made by the desination (in destination host page chunks) should be answered by the source as a unit, to avoid the hassle of splitting and recombining host pages. > + /* Adjust pointers to point to start of host page */ > + host =3D (void *)((uintptr_t)host & ~(hps - 1)); > + from =3D (void *)((uintptr_t)from & ~(hps - 1)); > + bitmap_offset -=3D (mis->postcopy_pmi.host_bits - 1); > + > + if (syscall(__NR_remap_anon_pages, host, from, hps, 0) !=3D > + getpagesize()) { > + perror("remap_anon_pages in postcopy_place_page"); > + fprintf(stderr, "host: %p from: %p pmi=3D%d\n", host, from, > + postcopy_pmi_get_state(mis, bitmap_offset)); > + > + return -errno; > + } > + > + tmp_state =3D postcopy_pmi_get_state(mis, bitmap_offset); > + do { > + old_state =3D tmp_state; > + tmp_state =3D postcopy_pmi_change_state(mis, bitmap_offset, old_= state, > + POSTCOPY_PMI_RECEIVED); > + > + } while (old_state !=3D tmp_state); > + > + > + if (old_state =3D=3D POSTCOPY_PMI_REQUESTED) { > + /* TODO: Notify kernel */ > + } > + > + return 0; > +} > + > +/* > + * Returns a target page of memory that can be mapped at a later point i= n time > + * using postcopy_place_page > + * The same address is used repeatedly, postcopy_place_page just takes t= he > + * backing page away. The same address might be re-used, but I don't see anything that actually makes that happen. > + * Returns: Pointer to allocated page > + * > + * Note this is a target page and uses the bitmap_offset to get an offset > + * into a hostpage; since there's only one real temporary host page the = caller > + * is expected to not flip around between pages. > + */ > +void *postcopy_get_tmp_page(MigrationIncomingState *mis, long bitmap_off= set) > +{ > + ptrdiff_t offset; > + > + if (!mis->postcopy_tmp_page) { > + mis->postcopy_tmp_page =3D mmap(NULL, getpagesize(), > + PROT_READ | PROT_WRITE, MAP_PRIVATE | > + MAP_ANONYMOUS, -1, 0); > + if (!mis->postcopy_tmp_page) { > + perror("mapping postcopy tmp page"); > + return NULL; > + } > + if (madvise(mis->postcopy_tmp_page, getpagesize(), MADV_DONTFORK= )) { > + munmap(mis->postcopy_tmp_page, getpagesize()); > + perror("postcpy tmp page DONTFORK"); > + return NULL; > + } > + } > + > + /* > + * Get the offset within the host page based on bitmap_offset. > + */ > + offset =3D (bitmap_offset & (mis->postcopy_pmi.host_bits - 1)) << > + qemu_target_page_bits(); > + > + return (void *)((uint8_t *)mis->postcopy_tmp_page + offset); > +} > + > #else > /* No target OS support, stubs just fail */ > int postcopy_ram_hosttest(void) > @@ -598,6 +721,24 @@ int postcopy_ram_enable_notify(MigrationIncomingStat= e *mis) > { > assert(0); > } > + > +int postcopy_place_zero_page(MigrationIncomingState *mis, void *host, > + long bitmap_offset) > +{ > + assert(0); > +} > + > +int postcopy_place_page(MigrationIncomingState *mis, void *host, void *f= rom, > + long bitmap_offset) > +{ > + assert(0); > +} > + > +void *postcopy_get_tmp_page(MigrationIncomingState *mis, long bitmap_off= set) > +{ > + assert(0); > +} > + > #endif > =20 > /* ---------------------------------------------------------------------= ---- */ --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --Q8BnQc91gJZX4vDc Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJUYWjXAAoJEGw4ysog2bOS8VoQAKHDxUtE15Eseje68c84mPrj fPG67hG4NUtaV4SaxxAKGPODF4qqTG/Jez61/V7MeLREs8rpPmAlqdILa9uL+BGc 1bledyllQtjYtfw1+nrBWnS2VNGX1e8CtxKgBDJLJI/WeKFn9X2J8qM7snC/Khle Zy9+YjCCBbhhvcyYv4hrgWcrr6X39xe6k+P5/M3gHZr0k7PHQJ1g6Hn1JF8Me/g8 SNB6X2qE+PddQpIY1ig76j5woBqEuZcC6edgTQE0+tAXIPjp9FPrx8Viaw6K2IIm SHxFVO3Z9ao8MofNiHEeHsnLCvVb+jwqWVJohKNS0DrtXpSJAEr3E+DXPdsRcxbH aCtkSSCw0h+mB1afmt8IGhvOlp/gIa90nm9Cc7gq551ZYpolTBZyWHHzJ3JxQzED VAewr3AS3XyjK4S4qdgOTyBR7V7LG/LkklQC6SnVhD2ooO+ANxoiIfjVG1BYuqrT UERiMPnh+SDNsax/FOcOfg/feWdv6cVdMuNxW6nKZlMoo0PBHwwJvy9aFo/jXH+w Dd+jion733BXFKAF1NHiCv4Ft9f1x8G/KvIn2eefeU+y2kuno/xgn7uBfH+JNno1 lGMRI1pteMjtfQxDidXQSG+XtlRtCqMPOcjH8xYEngdICRvR3l2djuWww/VAwJBc KaIhPtApQ28VpctT+rC1 =mWsD -----END PGP SIGNATURE----- --Q8BnQc91gJZX4vDc--