From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Alexey <a.perevalov@samsung.com>
Cc: i.maximets@samsung.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side (CPUMASK)
Date: Mon, 24 Apr 2017 18:13:12 +0100 [thread overview]
Message-ID: <20170424171312.GN2362@work-vm> (raw)
In-Reply-To: <20170422094911.GA6094@aperevalov-ubuntu>
* Alexey (a.perevalov@samsung.com) wrote:
> Hello David,
> this mail just for CPUMASK discussion.
>
> On Fri, Apr 21, 2017 at 01:00:32PM +0100, Dr. David Alan Gilbert wrote:
> > * Alexey Perevalov (a.perevalov@samsung.com) wrote:
> > > This patch provides downtime calculation per vCPU,
> > > as a summary and as a overlapped value for all vCPUs.
> > >
> > > This approach just keeps tree with page fault addr as a key,
> > > and t1-t2 interval of pagefault time and page copy time, with
> > > affected vCPU bit mask.
> > > For more implementation details please see comment to
> > > get_postcopy_total_downtime function.
> > >
> > > Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
> > > ---
> > > include/migration/migration.h | 14 +++
> > > migration/migration.c | 280 +++++++++++++++++++++++++++++++++++++++++-
> > > migration/postcopy-ram.c | 24 +++-
> > > migration/qemu-file.c | 1 -
> > > migration/trace-events | 9 +-
> > > 5 files changed, 323 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > > index 5720c88..5d2c628 100644
> > > --- a/include/migration/migration.h
> > > +++ b/include/migration/migration.h
> > > @@ -123,10 +123,24 @@ struct MigrationIncomingState {
> > >
> > > /* See savevm.c */
> > > LoadStateEntry_Head loadvm_handlers;
> > > +
> > > + /*
> > > + * Tree for keeping postcopy downtime,
> > > + * necessary to calculate correct downtime, during multiple
> > > + * vm suspends, it keeps host page address as a key and
> > > + * DowntimeDuration as a data
> > > + * NULL means kernel couldn't provide process thread id,
> > > + * and QEMU couldn't identify which vCPU raise page fault
> > > + */
> > > + GTree *postcopy_downtime;
> > > };
> > >
> > > MigrationIncomingState *migration_incoming_get_current(void);
> > > void migration_incoming_state_destroy(void);
> > > +void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
> > > +void mark_postcopy_downtime_end(uint64_t addr);
> > > +uint64_t get_postcopy_total_downtime(void);
> > > +void destroy_downtime_duration(gpointer data);
> > >
> > > /*
> > > * An outstanding page request, on the source, having been received
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 79f6425..5bac434 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -38,6 +38,8 @@
> > > #include "io/channel-tls.h"
> > > #include "migration/colo.h"
> > >
> > > +#define DEBUG_VCPU_DOWNTIME 1
> > > +
> > > #define MAX_THROTTLE (32 << 20) /* Migration transfer speed throttling */
> > >
> > > /* Amount of time to allocate to each "chunk" of bandwidth-throttled
> > > @@ -77,6 +79,19 @@ static NotifierList migration_state_notifiers =
> > >
> > > static bool deferred_incoming;
> > >
> > > +typedef struct {
> > > + int64_t begin;
> > > + int64_t end;
> > > + uint64_t *cpus; /* cpus bit mask array, QEMU bit functions support
> > > + bit operation on memory regions, but doesn't check out of range */
> > > +} DowntimeDuration;
> > > +
> > > +typedef struct {
> > > + int64_t tp; /* point in time */
> > > + bool is_end;
> > > + uint64_t *cpus;
> > > +} OverlapDowntime;
> > > +
> > > /*
> > > * Current state of incoming postcopy; note this is not part of
> > > * MigrationIncomingState since it's state is used during cleanup
> > > @@ -117,6 +132,13 @@ MigrationState *migrate_get_current(void)
> > > return ¤t_migration;
> > > }
> > >
> > > +void destroy_downtime_duration(gpointer data)
> > > +{
> > > + DowntimeDuration *dd = (DowntimeDuration *)data;
> > > + g_free(dd->cpus);
> > > + g_free(data);
> > > +}
> > > +
> > > MigrationIncomingState *migration_incoming_get_current(void)
> > > {
> > > static bool once;
> > > @@ -138,10 +160,13 @@ void migration_incoming_state_destroy(void)
> > > struct MigrationIncomingState *mis = migration_incoming_get_current();
> > >
> > > qemu_event_destroy(&mis->main_thread_load_event);
> > > + if (mis->postcopy_downtime) {
> > > + g_tree_destroy(mis->postcopy_downtime);
> > > + mis->postcopy_downtime = NULL;
> > > + }
> > > loadvm_free_handlers(mis);
> > > }
> > >
> > > -
> > > typedef struct {
> > > bool optional;
> > > uint32_t size;
> > > @@ -1754,7 +1779,6 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
> > > */
> > > ms->postcopy_after_devices = true;
> > > notifier_list_notify(&migration_state_notifiers, ms);
> > > -
> >
> > Stray deletion
> >
> > > ms->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - time_at_stop;
> > >
> > > qemu_mutex_unlock_iothread();
> > > @@ -2117,3 +2141,255 @@ PostcopyState postcopy_state_set(PostcopyState new_state)
> > > return atomic_xchg(&incoming_postcopy_state, new_state);
> > > }
> > >
> > > +#define SIZE_TO_KEEP_CPUBITS (1 + smp_cpus/sizeof(guint64))
> >
> > Split out your cpu-sets so that you have an 'alloc_cpu_set',
> > a 'set bit' a 'set all bits', dup etc
> > (I see Linux has cpumask.h that has a 'cpu_set' that's
> > basically the same thing, but we need something portablish.)
> >
> Agree, the way I'm working with cpumask is little bit naive.
> instead of set all_cpumask in case when all vCPU are sleeping with precision
> ((1 << smp_cpus) - 1), I just set ~0 it all, because I didn't use
> functions like cpumask_and.
> If you think, this patch should use cpumask, cpumask patchset/separate
> thread should be introduced before, and then this patchset should be
> rebased on top of it.
Yes, some functions like that - because then it makes this code
clearer; and there's less repetition.
Dave
>
> BR
> Alexey
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2017-04-24 17:13 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20170414131735eucas1p21f1fcadf426789276f567191372f7794@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 0/6] calculate downtime for postcopy live migration Alexey Perevalov
[not found] ` <CGME20170414131738eucas1p28fe4896d7f42d8c5b23cb95312c41eca@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 1/6] userfault: add pid into uffd_msg & update UFFD_FEATURE_* Alexey Perevalov
[not found] ` <CGME20170414131739eucas1p1ea9a6adcdbe8cfe45ac1ff582d28d873@eucas1p1.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 2/6] util: introduce glib-helper.c Alexey Perevalov
2017-04-14 16:05 ` Philippe Mathieu-Daudé
2017-04-17 7:07 ` Alexey
2017-04-21 10:01 ` Dr. David Alan Gilbert
2017-04-21 10:27 ` Peter Maydell
2017-04-21 15:10 ` Alexey
2017-04-21 15:49 ` Peter Maydell
2017-04-25 11:23 ` Dr. David Alan Gilbert
[not found] ` <CGME20170414131739eucas1p27a3eed795ae545efff380d7c5f8358c3@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 3/6] migration: add UFFD_FEATURE_THREAD_ID feature support Alexey Perevalov
2017-04-21 10:24 ` Dr. David Alan Gilbert
2017-04-21 15:22 ` Alexey
2017-04-24 8:03 ` Peter Xu
2017-04-24 8:12 ` Peter Xu
2017-04-24 8:38 ` Alexey
2017-04-24 17:10 ` Dr. David Alan Gilbert
2017-04-25 7:55 ` Alexey
2017-04-25 11:14 ` Dr. David Alan Gilbert
2017-04-25 11:51 ` Alexey Perevalov
[not found] ` <CGME20170414131740eucas1p27eba648b990a93a627265c740e7ff118@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side Alexey Perevalov
2017-04-21 12:00 ` Dr. David Alan Gilbert
2017-04-21 18:47 ` Alexey
2017-04-24 17:11 ` Dr. David Alan Gilbert
2017-04-22 9:49 ` [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side (CPUMASK) Alexey
2017-04-24 17:13 ` Dr. David Alan Gilbert [this message]
2017-04-25 8:24 ` [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side Peter Xu
2017-04-25 10:10 ` Alexey Perevalov
2017-04-25 10:25 ` Peter Xu
2017-04-25 10:47 ` Alexey Perevalov
[not found] ` <CGME20170414131740eucas1p28f240a4e6c78fb56be52f2641c3e5af6@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 5/6] migration: send postcopy downtime back to source Alexey Perevalov
2017-04-24 17:26 ` Dr. David Alan Gilbert
2017-04-25 5:51 ` Alexey
[not found] ` <CGME20170414131741eucas1p2f34e11e4292fef1c50ef63bd3522ad04@eucas1p2.samsung.com>
2017-04-14 13:17 ` [Qemu-devel] [PATCH 6/6] migration: detailed traces for postcopy Alexey Perevalov
2017-04-17 13:32 ` Philippe Mathieu-Daudé
2017-04-24 18:03 ` Dr. David Alan Gilbert
2017-04-17 2:32 ` [Qemu-devel] [PATCH 0/6] calculate downtime for postcopy live migration no-reply
2017-04-17 2:36 ` no-reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170424171312.GN2362@work-vm \
--to=dgilbert@redhat.com \
--cc=a.perevalov@samsung.com \
--cc=i.maximets@samsung.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).