From: Alexey <a.perevalov@samsung.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: i.maximets@samsung.com, f4bug@amsat.org,
Peter Xu <peterx@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH RESEND V3 4/6] migration: add postcopy downtime into MigrationIncommingState
Date: Fri, 05 May 2017 19:25:21 +0300
Message-ID: <20170505162521.GA3847@aperevalov-ubuntu>
In-Reply-To: <20170505141113.GA3293@work-vm>
On Fri, May 05, 2017 at 03:11:14PM +0100, Dr. David Alan Gilbert wrote:
> * Alexey (a.perevalov@samsung.com) wrote:
> > On Tue, May 02, 2017 at 09:51:44AM +0100, Dr. David Alan Gilbert wrote:
> > > * Alexey (a.perevalov@samsung.com) wrote:
> > > > On Fri, Apr 28, 2017 at 05:22:05PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Peter Xu (peterx@redhat.com) wrote:
> > > > > > On Fri, Apr 28, 2017 at 01:03:45PM +0300, Alexey Perevalov wrote:
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > >>diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > > > > > >>index 21e7150..f3688f5 100644
> > > > > > > >>--- a/migration/postcopy-ram.c
> > > > > > > >>+++ b/migration/postcopy-ram.c
> > > > > > > >>@@ -132,6 +132,14 @@ static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
> > > > > > > >> return false;
> > > > > > > >> }
> > > > > > > >>+#ifdef UFFD_FEATURE_THREAD_ID
> > > > > > > >>+ if (mis && UFFD_FEATURE_THREAD_ID & supported_features) {
> > > > > > > >>+ /* kernel supports that feature */
> > > > > > > >>+ mis->downtime_ctx = downtime_context_new();
> > > > > > > >>+ new_features |= UFFD_FEATURE_THREAD_ID;
> > > > > > > >So here I know why in patch 2 new_features == 0...
> > > > > > > >
> > > > > > > >If I were you, I would like the series be done in below 4 patches:
> > > > > > > >
> > > > > > > >1. update header
> > > > > > > >2. introduce THREAD_ID feature, and enable it conditionally
> > > > > > > >3. squash all the downtime thing (downtime context, calculation) in
> > > > > > > > one patch here
> > > > > > > >4. introduce trace
> > > > > > > >
> > > > > > > >IMHO that's clearer and easier for review. But I'm okay with current
> > > > > > > >as well as long as the maintainers (Dave/Juan) won't disagree. :)
> > > > > > > In previous series, David asked me to split one patch into 2
> > > > > > > [Qemu-devel] [PATCH 3/6] migration: add UFFD_FEATURE_THREAD_ID feature
> > > > > > > support
> > > > > > >
> > > > > > > >There seem to be two parts to this:
> > > > > > > > a) Adding the mis parameter to ufd_version_check
> > > > > > > > b) Asking for the feature
> > > > > > >
> > > > > > > >Please split it into two patches.
> > > > > > >
> > > > > > > So in current patch set, I also added re-factoring, which was missed before
> > > > > > > "migration: split ufd_version_check onto receive/request features part"
> > > > > >
> > > > > > Sure. As long as Dave agrees, I'm okay with either way.
> > > > >
> > > > > I'm OK with the split, it pretty much matches what I asked last time I think.
> > > > >
> > > > > The question I still have is how is this memory-expensive feature turned
> > > > > on and off by the user?
> > > > > Also I think Peter had some ideas for simpler data structures, how did
> > > > > that play out?
> > > > Maybe introduce it as extension of MigrationParameter,
> > > > I mean { "execute": "migrate-set-parameters" , "arguments":
> > > > { "calculate-postcopy-downtime": 1 } }
> > >
> > > Use migrate-set-capabilities, they're effectively the same but just booleans.
> >
> > For me it's not so clear, where to set that capability, on destination or on source
> > side. User sets postcopy ram capability on source side, probably on
> > source side user wants to set postcopy-downtime.
> > If I'm not wrong, neither capabilities nor parameters are transferring
> > from source to destination.
>
> Use a capability on the destination specifically for this; it's OK to set capabilities
> on the destination, and actually libvirt already sets some for us.
>
> One question: Now we're using Peter's idea, so you don't have that big tree
> structure, what are the costs now - is it as big a problem as it was?
It was a tree keyed by page address, so in the worst case, with a huge
number of pages (terabytes of RAM and a 4 KiB page size), that structure
was big and consumed a lot of memory. Lookup wasn't so bad thanks to the
tree: every _begin and _end did a search of logarithmic complexity.
Right now _begin is O(1) to fill the necessary fields, because cpu_index
is an array index (but cpu_index has to be looked up by thread_id, see
below), and _end is O(n), where n is the number of CPUs.
The size of the PostcopyDowntime context depends only on the number of
vCPUs. It's about vCPU_number * (one int64_t element of
page_fault_vcpu_time + one uint64_t element of vcpu_addr + one int64_t
element of vcpu_downtime) + int64_t for last_begin + int for the number
of suspended vCPUs + int64_t for total downtime.
For 2046 vCPUs that is 49124 bytes. The algorithm doesn't depend on the
number of pages, only on the number of vCPUs, and obviously in the
common case that number is smaller.
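The arithmetic can be checked with a small helper. This is only a sketch:
downtime_ctx_size is a hypothetical name, not something from the patch
set, and it assumes sizeof(int) == 4 and ignores struct padding and the
three array pointers themselves.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch set: estimated payload size
 * of the downtime context for a given number of vCPUs.
 * Assumes a 4-byte int; padding and the array pointers are not counted.
 */
static size_t downtime_ctx_size(size_t n_vcpus)
{
    size_t per_vcpu = sizeof(int64_t)    /* page_fault_vcpu_time[i] */
                    + sizeof(uint64_t)   /* vcpu_addr[i] */
                    + sizeof(int64_t);   /* vcpu_downtime[i] */
    size_t fixed = sizeof(int64_t)       /* last_begin */
                 + sizeof(int)           /* smp_cpus_down */
                 + sizeof(int64_t);      /* total_downtime */

    return n_vcpus * per_vcpu + fixed;
}
```

With 2046 vCPUs this gives 2046 * 24 + 20 = 49124 bytes.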
To recall:
typedef struct PostcopyDowntimeContext {
    /* time when page fault initiated per vCPU */
    int64_t *page_fault_vcpu_time;
    /* page address per vCPU */
    uint64_t *vcpu_addr;
    /* total downtime */
    int64_t total_downtime;
    /* downtime per vCPU */
    int64_t *vcpu_downtime;
    /* point in time when last page fault was initiated */
    int64_t last_begin;
    /* number of vCPUs that are suspended */
    int smp_cpus_down;
} PostcopyDowntimeContext;
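A rough sketch of how _begin and _end could use these fields, to show
where the O(1) and O(n) come from. This is not the code from the patch
set: the DowntimeSketch type, the sketch_* names, the smp_cpus field, the
explicit "now" argument and the use of address 0 as "not waiting" are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative copy of the context; smp_cpus is an added assumption. */
typedef struct DowntimeSketch {
    int64_t *page_fault_vcpu_time;
    uint64_t *vcpu_addr;
    int64_t *vcpu_downtime;
    int64_t total_downtime;
    int64_t last_begin;
    int smp_cpus_down;
    int smp_cpus;
} DowntimeSketch;

static DowntimeSketch *sketch_new(int smp_cpus)
{
    DowntimeSketch *ctx = calloc(1, sizeof(*ctx));

    if (!ctx) {
        return NULL;
    }
    ctx->smp_cpus = smp_cpus;
    ctx->page_fault_vcpu_time = calloc(smp_cpus, sizeof(int64_t));
    ctx->vcpu_addr = calloc(smp_cpus, sizeof(uint64_t));
    ctx->vcpu_downtime = calloc(smp_cpus, sizeof(int64_t));
    if (!ctx->page_fault_vcpu_time || !ctx->vcpu_addr || !ctx->vcpu_downtime) {
        free(ctx->page_fault_vcpu_time);
        free(ctx->vcpu_addr);
        free(ctx->vcpu_downtime);
        free(ctx);
        return NULL;
    }
    return ctx;
}

/* O(1): cpu_index indexes the per-vCPU arrays directly. */
static void sketch_begin(DowntimeSketch *ctx, uint64_t addr, int cpu_index,
                         int64_t now)
{
    assert(addr != 0 && cpu_index >= 0 && cpu_index < ctx->smp_cpus);
    ctx->vcpu_addr[cpu_index] = addr;
    ctx->page_fault_vcpu_time[cpu_index] = now;
    ctx->last_begin = now;
    ctx->smp_cpus_down++;
}

/* O(n): scan every vCPU to find the ones waiting on addr. */
static void sketch_end(DowntimeSketch *ctx, uint64_t addr, int64_t now)
{
    int all_down = (ctx->smp_cpus_down == ctx->smp_cpus);
    int woken = 0;

    for (int i = 0; i < ctx->smp_cpus; i++) {
        if (ctx->vcpu_addr[i] != addr) {
            continue;
        }
        ctx->vcpu_downtime[i] += now - ctx->page_fault_vcpu_time[i];
        ctx->vcpu_addr[i] = 0;
        woken++;
    }
    /* Count total downtime only for the interval when all vCPUs were down. */
    if (woken && all_down) {
        ctx->total_downtime += now - ctx->last_begin;
    }
    ctx->smp_cpus_down -= woken;
}
```

Nothing here depends on the number of pages; only the arrays of smp_cpus
elements are touched.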
We also need to remember get_mem_fault_cpu_index, where QEMU iterates
over the CPUs on every page fault to look up cpu_index by the process's
thread id. Peter suggested a hash there, but I think a tree will be
enough; servers with about 2000 CPUs are only just starting to be sold.
I have prepared the patch set, but didn't include a tree for the lookup
in get_mem_fault_cpu_index: it's not so clear what to do about the
lifetime of that lookup tree, and whether it should be part of
PostcopyDowntimeContext or general code. BTW, a similar (linear) lookup
by cpu_index is done in qemu_get_cpu, so I think it would be useful to
have macros to construct/destruct a search tree for CPUs keyed by a
unique field.
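As an illustration of a logarithmic thread_id -> cpu_index lookup, here
is a minimal sketch. It uses a sorted array with bsearch() instead of a
real tree or hash, which gives the same O(log n) lookup with plain libc.
CpuThreadEntry and the cpu_index_* functions are hypothetical names, not
QEMU API, and the table would have to be rebuilt whenever the set of vCPU
threads changes (e.g. hotplug).

```c
#include <stdlib.h>

/* Hypothetical mapping entry: one per vCPU thread. */
typedef struct CpuThreadEntry {
    int thread_id;
    int cpu_index;
} CpuThreadEntry;

static int cpu_thread_cmp(const void *a, const void *b)
{
    const CpuThreadEntry *x = a;
    const CpuThreadEntry *y = b;

    return (x->thread_id > y->thread_id) - (x->thread_id < y->thread_id);
}

/* Sort the table once by thread_id; O(n log n). */
static void cpu_index_build(CpuThreadEntry *tab, size_t n)
{
    qsort(tab, n, sizeof(*tab), cpu_thread_cmp);
}

/* O(log n) lookup on every page fault; returns -1 if thread_id is unknown. */
static int cpu_index_lookup(const CpuThreadEntry *tab, size_t n, int thread_id)
{
    CpuThreadEntry key = { .thread_id = thread_id, .cpu_index = -1 };
    const CpuThreadEntry *hit =
        bsearch(&key, tab, n, sizeof(*tab), cpu_thread_cmp);

    return hit ? hit->cpu_index : -1;
}
```

Keeping the table inside the downtime context would tie its lifetime to
postcopy; a general helper would need its own rebuild hook.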
>
> > I wanted to pass in in MIG_CMD_POSTCOPY_ADVISE, but it holds only 2
> > uint64, and they are already occupied.
> > Like with RETURN PATH protocol, MIG couldn't be extended w/o breaking backward
> > compatibility. Length for cmd is transmitted, but compared with
> > predefined len from mig_cmd_args.
> >
> > Maybe just increase QEMU_VM_FILE_VERSION in this case, it will be
> > possible to return downtime back to source by return path.
> > For supporting backward compatibility keep several versions of mig_cmd_args
> > per QEMU_VM_FILE_VERSION.
>
> No, we'll only change file version on some massive improvement.
>
> Dave
>
> >
> > > Dave
> > >
> > > >
> > > > >
> > > > > Dave
> > > > >
> > > > >
> > > > > > --
> > > > > > Peter Xu
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > > >
> > > >
> > > > --
> > > >
> > > > BR
> > > > Alexey
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > >
> >
> > --
> >
> > BR
> > Alexey
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
--
BR
Alexey