From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Perevalov Alexey <alexey.perevalov@hotmail.com>
Cc: Alexey Perevalov <a.perevalov@samsung.com>,
"i.maximets@samsung.com" <i.maximets@samsung.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"peterx@redhat.com" <peterx@redhat.com>,
"quintela@redhat.com" <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 3/3] migration: add bitmap for received page
Date: Fri, 23 Jun 2017 20:41:41 +0100
Message-ID: <20170623194140.GA2547@work-vm>
In-Reply-To: <HE1PR0901MB13240187397F213E9AF03BF18DD80@HE1PR0901MB1324.eurprd09.prod.outlook.com>
* Perevalov Alexey (alexey.perevalov@hotmail.com) wrote:
> On Fri, Jun 23, 2017 at 11:29:42AM +0100, Dr. David Alan Gilbert wrote:
> > * Alexey Perevalov (a.perevalov@samsung.com) wrote:
> > > This patch adds the ability to track already received
> > > pages. It is necessary for calculating vCPU block time in
> > > the postcopy migration feature, and possibly for recovery
> > > after a postcopy migration failure.
> > > It is also necessary for solving the shared memory issue in
> > > postcopy live migration. Information about received pages
> > > will be transferred to the software virtual bridge
> > > (e.g. OVS-VSWITCHD) to avoid fallocate (unmap) for
> > > already received pages. The fallocate syscall is required for
> > > remapped shared memory, because remapping itself blocks
> > > ioctl(UFFDIO_COPY); the ioctl in this case will fail with an
> > > EEXIST error (the struct page exists after remap).
> > >
> > > The bitmap is placed in RAMBlock, like the other postcopy/precopy
> > > related bitmaps.
> > >
> > > Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
> > > ---
> > > include/exec/ram_addr.h | 3 +++
> > > migration/migration.c | 1 +
> > > migration/postcopy-ram.c | 12 ++++++---
> > > migration/ram.c | 66 +++++++++++++++++++++++++++++++++++++++++++++---
> > > migration/ram.h | 6 +++++
> > > 5 files changed, 82 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> > > index 140efa8..67fbb39 100644
> > > --- a/include/exec/ram_addr.h
> > > +++ b/include/exec/ram_addr.h
> > > @@ -47,6 +47,8 @@ struct RAMBlock {
> > > * of the postcopy phase
> > > */
> > > unsigned long *unsentmap;
> > > + /* bitmap of already received pages in postcopy */
> > > + unsigned long *receivedmap;
> > > };
> > >
> > > static inline bool offset_in_ramblock(RAMBlock *b, ram_addr_t offset)
> > > @@ -60,6 +62,7 @@ static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset)
> > > return (char *)block->host + offset;
> > > }
> > >
> > > +unsigned long int ramblock_recv_bitmap_offset(void *host_addr, RAMBlock *rb);
> > > long qemu_getrampagesize(void);
> > > unsigned long last_ram_page(void);
> > > RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 71e38bc..53fbd41 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -143,6 +143,7 @@ MigrationIncomingState *migration_incoming_get_current(void)
> > > qemu_mutex_init(&mis_current.rp_mutex);
> > > qemu_event_init(&mis_current.main_thread_load_event, false);
> > > once = true;
> > > + ramblock_recv_map_init();
> > > }
> > > return &mis_current;
> > > }
> > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > index d1af2c1..5d2b92d 100644
> > > --- a/migration/postcopy-ram.c
> > > +++ b/migration/postcopy-ram.c
> > > @@ -562,8 +562,13 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
> > > }
> > >
> > > static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
> > > - void *from_addr, uint64_t pagesize)
> > > + void *from_addr, uint64_t pagesize, RAMBlock *rb)
> > > {
> > > + /* received page isn't a feature of the blocktime calculation,
> > > + * it's a more general entity, so keep it here,
> > > + * but the gap between the two following operations could be large,
> > > + * and in this case blocktime for such a small interval will be lost */
> > > + ramblock_recv_bitmap_set(host_addr, rb);
> >
> > I have a fun problem here in my world with using the same bitmap for
> > shared memory with the vhost-user client; for that a set bit means
> > that the data has already arrived and we need to do a UFFDIO_WAKE on
> > the client;
> Do you mean vhost-user client?
Yes, I'm doing UFFDIO_WAKE calls on the userfault fd passed to me by
the client.
> > but that means we can't set the bit in this function until
> > the end after we've done the COPY/ZERO.
>
> I have the same problem. I described it to Peter when he asked why
> ramblock_recv_bitmap_set should be closer to the ioctl. But even that
> position doesn't solve the problem.
>
> To repeat it here: I'm sending that bitmap to the vhost-user client, and
> a situation is possible where the bit is set but the page is not yet copied.
> Did you face that, or are you just mentioning it as a potential problem?
A similar problem: I've got the fault thread receiving a fault request
from the UFD; if the bit is set then it sends a WAKE, and if it's not set
then it sends a request back to the source.
If we set the bit before the COPY/ZERO then I could send a WAKE too
early, before the page data is actually in place.
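
To make the ordering concrete, here is a minimal single-threaded sketch,
not QEMU code. Every name in it (sketch_block, sketch_place_page,
sketch_handle_fault) is hypothetical, memcpy stands in for the
UFFDIO_COPY ioctl, and a real implementation would need an atomic bit
set, as the patch does with set_bit_atomic. It only shows that the bit is
set after the copy, so a fault handler that sees the bit can assume the
data is there:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGES     8    /* hypothetical toy block size, in pages */
#define SKETCH_PAGE_SIZE 16   /* hypothetical toy page size, in bytes */

/* Hypothetical stand-in for a RAMBlock and its receivedmap. */
struct sketch_block {
    uint8_t host[SKETCH_PAGES * SKETCH_PAGE_SIZE];
    unsigned long receivedmap;   /* one bit per page */
};

enum sketch_action { SKETCH_WAKE, SKETCH_REQUEST_FROM_SOURCE };

/* Place a page: copy first (stand-in for UFFDIO_COPY), set the bit last. */
static void sketch_place_page(struct sketch_block *b, unsigned page,
                              const uint8_t *from)
{
    memcpy(b->host + page * SKETCH_PAGE_SIZE, from, SKETCH_PAGE_SIZE);
    b->receivedmap |= 1UL << page;
}

/* Fault thread: WAKE if the bit is set, otherwise ask the source. */
static enum sketch_action sketch_handle_fault(struct sketch_block *b,
                                              unsigned page)
{
    bool received = (b->receivedmap >> page) & 1UL;
    return received ? SKETCH_WAKE : SKETCH_REQUEST_FROM_SOURCE;
}
```

With the two statements in sketch_place_page swapped, a fault arriving
between them would get a WAKE for a page that has not been copied yet.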
> If so, we could move ramblock_recv_bitmap_set after the ioctl,
> but we chose this way to avoid the situation where a new page fault happens
> during the ioctl, or between the ioctl and ramblock_recv_bitmap_set, on the same vCPU.
> Or we could introduce 2 bitmaps: copied and received.
It's a shame to need 2 bits. We shouldn't get another fault on the
same page, but I guess we can get one from another CPU on the same page,
which, hmm, is the problem with the stats code.
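
For reference, a rough sketch of what the 2-bitmap variant might look
like. This is an assumption about the proposal, not code from the patch:
the struct and function names are hypothetical, and I am assuming
"received" is set before the copy starts and "copied" only after it
completes, so the WAKE decision would look at "copied" alone:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-block state; neither field name is from the patch. */
struct sketch_maps {
    unsigned long received;   /* assumed: set before the copy starts */
    unsigned long copied;     /* assumed: set after the copy completes */
};

static void sketch_mark_received(struct sketch_maps *m, unsigned page)
{
    m->received |= 1UL << page;
}

static void sketch_mark_copied(struct sketch_maps *m, unsigned page)
{
    m->copied |= 1UL << page;
}

/* True once the page has arrived, even if the copy is still in flight. */
static bool sketch_is_received(const struct sketch_maps *m, unsigned page)
{
    return (m->received >> page) & 1UL;
}

/* True only once the copy has finished, so a WAKE would be safe. */
static bool sketch_can_wake(const struct sketch_maps *m, unsigned page)
{
    return (m->copied >> page) & 1UL;
}
```

The cost is the second bit per page, which is the part that seems a
shame.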
Dave
> >
> > Dave
> >
> > > if (from_addr) {
> > > struct uffdio_copy copy_struct;
> > > copy_struct.dst = (uint64_t)(uintptr_t)host_addr;
> > > @@ -594,7 +599,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
> > > * which would be slightly cheaper, but we'd have to be careful
> > > * of the order of updating our page state.
> > > */
> > > - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, from, pagesize)) {
> > > + if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, from, pagesize, rb)) {
> > > int e = errno;
> > > error_report("%s: %s copy host: %p from: %p (size: %zd)",
> > > __func__, strerror(e), host, from, pagesize);
> > > @@ -616,7 +621,8 @@ int postcopy_place_page_zero(MigrationIncomingState *mis, void *host,
> > > trace_postcopy_place_page_zero(host);
> > >
> > > if (qemu_ram_pagesize(rb) == getpagesize()) {
> > > - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, 0, getpagesize())) {
> > > + if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, 0, getpagesize(),
> > > + rb)) {
> > > int e = errno;
> > > error_report("%s: %s zero host: %p",
> > > __func__, strerror(e), host);
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index f50479d..fad4dbf 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -151,6 +151,41 @@ out:
> > > return ret;
> > > }
> > >
> > > +void ramblock_recv_map_init(void)
> > > +{
> > > + RAMBlock *rb;
> > > +
> > > + RAMBLOCK_FOREACH(rb) {
> > > + unsigned long pages;
> > > + pages = rb->max_length >> TARGET_PAGE_BITS;
> > > + assert(!rb->receivedmap);
> > > + rb->receivedmap = bitmap_new(pages);
> > > + }
> > > +}
> > > +
> > > +unsigned long int ramblock_recv_bitmap_offset(void *host_addr, RAMBlock *rb)
> > > +{
> > > + uint64_t host_addr_offset = (uint64_t)(uintptr_t)(host_addr
> > > + - (void *)rb->host);
> > > + return host_addr_offset >> TARGET_PAGE_BITS;
> > > +}
> > > +
> > > +int ramblock_recv_bitmap_test(void *host_addr, RAMBlock *rb)
> > > +{
> > > + return test_bit(ramblock_recv_bitmap_offset(host_addr, rb),
> > > + rb->receivedmap);
> > > +}
> > > +
> > > +void ramblock_recv_bitmap_set(void *host_addr, RAMBlock *rb)
> > > +{
> > > + set_bit_atomic(ramblock_recv_bitmap_offset(host_addr, rb), rb->receivedmap);
> > > +}
> > > +
> > > +void ramblock_recv_bitmap_clear(void *host_addr, RAMBlock *rb)
> > > +{
> > > + clear_bit(ramblock_recv_bitmap_offset(host_addr, rb), rb->receivedmap);
> > > +}
> > > +
> > > /*
> > > * An outstanding page request, on the source, having been received
> > > * and queued
> > > @@ -1773,6 +1808,18 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
> > > return ret;
> > > }
> > >
> > > +static void ramblock_recv_bitmap_clear_range(uint64_t start, size_t length,
> > > + RAMBlock *rb)
> > > +{
> > > + int i, range_count;
> > > + range_count = length >> TARGET_PAGE_BITS;
> > > + for (i = 0; i < range_count; i++) {
> > > + ramblock_recv_bitmap_clear((void *)((uint64_t)(intptr_t)rb->host +
> > > + start), rb);
> > > + start += TARGET_PAGE_SIZE;
> > > + }
> > > +}
> > > +
> > > /**
> > > * ram_discard_range: discard dirtied pages at the beginning of postcopy
> > > *
> > > @@ -1797,6 +1844,7 @@ int ram_discard_range(const char *rbname, uint64_t start, size_t length)
> > > goto err;
> > > }
> > >
> > > + ramblock_recv_bitmap_clear_range(start, length, rb);
> > > ret = ram_block_discard_range(rb, start, length);
> > >
> > > err:
> > > @@ -2324,8 +2372,14 @@ static int ram_load_setup(QEMUFile *f, void *opaque)
> > >
> > > static int ram_load_cleanup(void *opaque)
> > > {
> > > + RAMBlock *rb;
> > > xbzrle_load_cleanup();
> > > compress_threads_load_cleanup();
> > > +
> > > + RAMBLOCK_FOREACH(rb) {
> > > + g_free(rb->receivedmap);
> > > + rb->receivedmap = NULL;
> > > + }
> > > return 0;
> > > }
> > >
> > > @@ -2513,6 +2567,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > > ram_addr_t addr, total_ram_bytes;
> > > void *host = NULL;
> > > uint8_t ch;
> > > + RAMBlock *rb;
> > >
> > > addr = qemu_get_be64(f);
> > > flags = addr & ~TARGET_PAGE_MASK;
> > > @@ -2520,15 +2575,15 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >
> > > if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
> > > RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
> > > - RAMBlock *block = ram_block_from_stream(f, flags);
> > > + rb = ram_block_from_stream(f, flags);
> > >
> > > - host = host_from_ram_block_offset(block, addr);
> > > + host = host_from_ram_block_offset(rb, addr);
> > > if (!host) {
> > > error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
> > > ret = -EINVAL;
> > > break;
> > > }
> > > - trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
> > > + trace_ram_load_loop(rb->idstr, (uint64_t)addr, flags, host);
> > > }
> > >
> > > switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
> > > @@ -2582,10 +2637,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >
> > > case RAM_SAVE_FLAG_ZERO:
> > > ch = qemu_get_byte(f);
> > > + ramblock_recv_bitmap_set(host, rb);
> > > ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
> > > break;
> > >
> > > case RAM_SAVE_FLAG_PAGE:
> > > + ramblock_recv_bitmap_set(host, rb);
> > > qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
> > > break;
> > >
> > > @@ -2596,10 +2653,13 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > > ret = -EINVAL;
> > > break;
> > > }
> > > +
> > > + ramblock_recv_bitmap_set(host, rb);
> > > decompress_data_with_multi_threads(f, host, len);
> > > break;
> > >
> > > case RAM_SAVE_FLAG_XBZRLE:
> > > + ramblock_recv_bitmap_set(host, rb);
> > > if (load_xbzrle(f, addr, host) < 0) {
> > > error_report("Failed to decompress XBZRLE page at "
> > > RAM_ADDR_FMT, addr);
> > > diff --git a/migration/ram.h b/migration/ram.h
> > > index c081fde..98d68df 100644
> > > --- a/migration/ram.h
> > > +++ b/migration/ram.h
> > > @@ -52,4 +52,10 @@ int ram_discard_range(const char *block_name, uint64_t start, size_t length);
> > > int ram_postcopy_incoming_init(MigrationIncomingState *mis);
> > >
> > > void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
> > > +
> > > +void ramblock_recv_map_init(void);
> > > +int ramblock_recv_bitmap_test(void *host_addr, RAMBlock *rb);
> > > +void ramblock_recv_bitmap_set(void *host_addr, RAMBlock *rb);
> > > +void ramblock_recv_bitmap_clear(void *host_addr, RAMBlock *rb);
> > > +
> > > #endif
> > > --
> > > 1.8.3.1
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK