From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, lvivier@redhat.com, peterx@redhat.com,
berrange@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels
Date: Tue, 8 Aug 2017 12:32:08 +0100 [thread overview]
Message-ID: <20170808113208.GF2081@work-vm> (raw)
In-Reply-To: <87k22e9usf.fsf@secure.mitica>
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> We switch for sending the page number to send real pages.
> >>
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >>
> >> --
> >>
> >> Remove the HACK bit, now we have the function that calculates the size
> >> of a page exported.
> >> ---
> >> migration/migration.c | 14 ++++++++----
> >> migration/ram.c | 59 +++++++++++++++++----------------------------------
> >> 2 files changed, 29 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index e122684..34a34b7 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -1882,13 +1882,14 @@ static void *migration_thread(void *opaque)
> >> /* Used by the bandwidth calcs, updated later */
> >> int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >> int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >> - int64_t initial_bytes = 0;
> >> /*
> >> * The final stage happens when the remaining data is smaller than
> >> * this threshold; it's calculated from the requested downtime and
> >> * measured bandwidth
> >> */
> >> int64_t threshold_size = 0;
> >> + int64_t qemu_file_bytes = 0;
> >> + int64_t multifd_pages = 0;
> >
> > It feels like these changes to the transfer count should be in a
> > separate patch.
>
> Until this patch, we only sent the address number for testing purposes,
> we can change it in the previous patch. I can split the
> qemu_file_bytes, though.
>
> >> int64_t start_time = initial_time;
> >> int64_t end_time;
> >> bool old_vm_running = false;
> >> @@ -1976,9 +1977,13 @@ static void *migration_thread(void *opaque)
> >> }
> >> current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >> if (current_time >= initial_time + BUFFER_DELAY) {
> >> - uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> >> - initial_bytes;
> >> uint64_t time_spent = current_time - initial_time;
> >> + uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
> >> + uint64_t multifd_pages_now = ram_counters.multifd;
> >> + uint64_t transferred_bytes =
> >> + (qemu_file_bytes_now - qemu_file_bytes) +
> >> + (multifd_pages_now - multifd_pages) *
> >> + qemu_target_page_size();
> >
> > If I've followed this right, then ram_counters.multifd is in the main
> > thread not the individual threads, so we should be OK doing that.
>
> Yeap.
>
> >
> >> double bandwidth = (double)transferred_bytes / time_spent;
> >> threshold_size = bandwidth * s->parameters.downtime_limit;
> >>
> >> @@ -1996,7 +2001,8 @@ static void *migration_thread(void *opaque)
> >>
> >> qemu_file_reset_rate_limit(s->to_dst_file);
> >> initial_time = current_time;
> >> - initial_bytes = qemu_ftell(s->to_dst_file);
> >> + qemu_file_bytes = qemu_file_bytes_now;
> >> + multifd_pages = multifd_pages_now;
> >> }
> >> if (qemu_file_rate_limit(s->to_dst_file)) {
> >> /* usleep expects microseconds */
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index b55b243..c78b286 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -468,25 +468,21 @@ static void *multifd_send_thread(void *opaque)
> >> break;
> >> }
> >> if (p->pages.num) {
> >> - int i;
> >> int num;
> >>
> >> num = p->pages.num;
> >> p->pages.num = 0;
> >> qemu_mutex_unlock(&p->mutex);
> >>
> >> - for (i = 0; i < num; i++) {
> >> - if (qio_channel_write(p->c,
> >> - (const char *)&p->pages.iov[i].iov_base,
> >> - sizeof(uint8_t *), &error_abort)
> >> - != sizeof(uint8_t *)) {
> >> - MigrationState *s = migrate_get_current();
> >> + if (qio_channel_writev_all(p->c, p->pages.iov,
> >> + num, &error_abort)
> >> + != num * TARGET_PAGE_SIZE) {
> >> + MigrationState *s = migrate_get_current();
> >
> > Same comments as previous patch; note we should find a way to get
> > the error message logged; not easy since we're in a thread, but
> > we need to find a way to log the errors.
>
> I am open to suggestions how to set errors in a different thread.
The thread function can 'return' a value - could that be an error
pointer consumed when the thread is joined?
I'd take a fprintf if nothing else (although that's not actually safe)
but not an abort on the source side. Ever.
>
> >> @@ -1262,8 +1240,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
> >> offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> >> fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
> >> qemu_put_be16(rs->f, fd_num);
> >> + if (fd_num != UINT16_MAX) {
> >> + qemu_fflush(rs->f);
> >> + }
> >
> > Is that to make sure that the relatively small messages actually get
> > transmitted on the main fd so that the destination starts receiving
> > them?
>
> Yeap.
>
> > I do have a worry there that, since the addresses are going down a
> > single fd we are open to deadlock by the send threads filling up
> > buffers and blocking waiting for the receivers to receive.
>
> I think we are doing the intelligent case here.
> We only sync when we are sure that the package has finished, so we
> should be ok here. If we finish the migration, we call fflush anyways on
> other places, so we can't get stuck as far as I can see.
Dave
> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2017-08-08 11:32 UTC|newest]
Thread overview: 93+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
2017-07-19 15:01 ` Dr. David Alan Gilbert
2017-07-20 7:00 ` Peter Xu
2017-07-20 8:47 ` Daniel P. Berrange
2017-07-24 10:18 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
2017-07-19 13:38 ` Daniel P. Berrange
2017-07-24 11:09 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
2017-07-19 13:44 ` Daniel P. Berrange
2017-08-08 8:40 ` Juan Quintela
2017-08-08 9:25 ` Daniel P. Berrange
2017-07-19 15:42 ` Dr. David Alan Gilbert
2017-07-19 15:43 ` Daniel P. Berrange
2017-07-19 16:04 ` Dr. David Alan Gilbert
2017-07-19 16:08 ` Daniel P. Berrange
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
2017-07-19 15:44 ` Dr. David Alan Gilbert
2017-08-08 8:42 ` Juan Quintela
2017-07-19 17:14 ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
2017-07-19 16:00 ` Dr. David Alan Gilbert
2017-08-08 8:46 ` Juan Quintela
2017-08-08 9:44 ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
2017-07-19 16:49 ` Dr. David Alan Gilbert
2017-08-08 8:58 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
2017-07-19 17:08 ` Dr. David Alan Gilbert
2017-07-21 12:39 ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
2017-07-19 13:56 ` Daniel P. Berrange
2017-07-19 17:35 ` Dr. David Alan Gilbert
2017-08-08 9:35 ` Juan Quintela
2017-08-08 9:54 ` Dr. David Alan Gilbert
2017-07-20 9:34 ` Peter Xu
2017-08-08 9:19 ` Juan Quintela
2017-08-09 8:08 ` Peter Xu
2017-08-09 11:12 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
2017-07-19 19:02 ` Dr. David Alan Gilbert
2017-07-20 8:10 ` Peter Xu
2017-07-20 11:48 ` Dr. David Alan Gilbert
2017-08-08 15:58 ` Juan Quintela
2017-08-08 16:04 ` Juan Quintela
2017-08-09 7:42 ` Peter Xu
2017-08-08 15:56 ` Juan Quintela
2017-08-08 16:30 ` Dr. David Alan Gilbert
2017-08-08 18:02 ` Juan Quintela
2017-08-08 19:14 ` Dr. David Alan Gilbert
2017-08-09 16:48 ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
2017-07-19 13:58 ` Daniel P. Berrange
2017-08-08 11:55 ` Juan Quintela
2017-07-20 9:44 ` Dr. David Alan Gilbert
2017-08-08 12:11 ` Juan Quintela
2017-07-20 9:49 ` Peter Xu
2017-07-20 10:09 ` Peter Xu
2017-08-08 16:06 ` Juan Quintela
2017-08-09 7:48 ` Peter Xu
2017-08-09 8:05 ` Juan Quintela
2017-08-09 8:12 ` Peter Xu
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
2017-07-20 9:58 ` Dr. David Alan Gilbert
2017-08-09 16:48 ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
2017-07-20 10:22 ` Peter Xu
2017-08-08 11:41 ` Juan Quintela
2017-08-09 5:53 ` Peter Xu
2017-07-20 10:29 ` Dr. David Alan Gilbert
2017-08-08 11:51 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
2017-07-20 10:56 ` Dr. David Alan Gilbert
2017-08-08 11:29 ` Juan Quintela
2017-07-20 11:10 ` Peter Xu
2017-08-08 11:30 ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
2017-07-20 11:20 ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
2017-07-20 11:31 ` Dr. David Alan Gilbert
2017-08-08 11:13 ` Juan Quintela
2017-08-08 11:32 ` Dr. David Alan Gilbert [this message]
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
2017-07-20 11:45 ` Dr. David Alan Gilbert
2017-08-08 10:43 ` Juan Quintela
2017-08-08 11:25 ` Dr. David Alan Gilbert
2017-07-21 2:40 ` Peter Xu
2017-08-08 11:40 ` Juan Quintela
2017-08-10 6:49 ` Peter Xu
2017-07-21 6:03 ` Peter Xu
2017-07-21 10:53 ` Juan Quintela
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170808113208.GF2081@work-vm \
--to=dgilbert@redhat.com \
--cc=berrange@redhat.com \
--cc=lvivier@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).