All of lore.kernel.org
 help / color / mirror / Atom feed
From: Juergen Gross <jgross@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wei.liu2@citrix.com>
Cc: stefano.stabellini@eu.citrix.com, ian.jackson@eu.citrix.com,
	Ian.Campbell@citrix.com, xen-devel@lists.xen.org
Subject: Re: [PATCH] libxc: try to find last used pfn when migrating
Date: Mon, 30 Nov 2015 13:16:57 +0100	[thread overview]
Message-ID: <565C3E39.6020001@suse.com> (raw)
In-Reply-To: <565C1B3D.3030102@citrix.com>

On 30/11/15 10:47, Andrew Cooper wrote:
> On 30/11/15 08:17, Juergen Gross wrote:
>> On 27/11/15 18:16, Andrew Cooper wrote:
>>> On 27/11/15 15:53, Juergen Gross wrote:
>>>> On 27/11/15 16:33, Wei Liu wrote:
>>>>> On Fri, Nov 27, 2015 at 03:50:53PM +0100, Juergen Gross wrote:
>>>>>> For migration the last used pfn of a guest is needed to size the
>>>>>> logdirty bitmap and as an upper bound of the page loop. Unfortunately
>>>>>> there are pv-kernels advertising a much higher maximum pfn as they
>>>>>> are really using in order to support memory hotplug. This will lead
>>>>>> to allocation of much more memory in Xen tools during migration as
>>>>>> really needed.
>>>>>>
>>>>>> Try to find the last used guest pfn of a pv-domu by scanning the p2m
>>>>>> tree from the last entry towards it's start and search for an entry
>>>>>> not being invalid.
>>>>>>
>>>>>> Normally the mid pages of the p2m tree containing all invalid entries
>>>>>> are being reused, so we can just scan the top page for identical
>>>>>> entries and skip them but the first one.
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>> ---
>>>>>>  tools/libxc/xc_sr_save.c        |  8 ++++----
>>>>>>  tools/libxc/xc_sr_save_x86_pv.c | 34 +++++++++++++++++++++++++++++++---
>>>>>>  2 files changed, 35 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>>>>>> index 0c12e56..22b3f18 100644
>>>>>> --- a/tools/libxc/xc_sr_save.c
>>>>>> +++ b/tools/libxc/xc_sr_save.c
>>>>>> @@ -677,6 +677,10 @@ static int setup(struct xc_sr_context *ctx)
>>>>>>      DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>>>>>                                      &ctx->save.dirty_bitmap_hbuf);
>>>>>>  
>>>>>> +    rc = ctx->save.ops.setup(ctx);
>>>>>> +    if ( rc )
>>>>>> +        goto err;
>>>>>> +
>>>>>>      dirty_bitmap = xc_hypercall_buffer_alloc_pages(
>>>>>>                     xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
>>>>>>      ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
>>>>>> @@ -692,10 +696,6 @@ static int setup(struct xc_sr_context *ctx)
>>>>>>          goto err;
>>>>>>      }
>>>>>>  
>>>>> p2m_size is set in two places for PV guest.  Since we are now relying on
>>>>> ops.setup to set p2m_size, the invocation to xc_domain_nr_gpfns in
>>>>> xc_domain_save should be removed, so that we only have one place to set
>>>>> p2m_size.
>>>> I wasn't sure this is needed in save case only. If it is I can make
>>>> the change, of course. Andrew?
>> I got this one wrong, sorry. I was relating to the xc_domain_nr_gpfns()
>> call in x86_pv_domain_info(). And here the question really applies:
>>
>> Do we need this call at all? I don't think it is needed for the
>> restore operation. At least in my tests the upper pfn bound was always
>> taken from the data stream, not from xc_domain_nr_gpfns().
> 
> x86_pv_domain_info() was the result of attempting to consolidate all the
> random checks for PV guests, for both the save and restore sides,
> although most of the callsites subsequently vanished.
> 
> The function as a whole gets used several times on the restore side, and
> once of save side.  Unilaterally rewriting ctx->x86_pv.max_pfn is not
> the best option, so would probably be best to disband
> x86_pv_domain_info() altogether and hoist the width/p2m checks into the
> relevant callers.

Hmm, this would mean to have the remaining code of x86_pv_domain_info()
two or three times. I think it's better to keep it without the call of
xc_domain_nr_gpfns().

> 
>>
>>> That is most likely a byproduct of this codes somewhat-organic growth
>>> over 18 months.
>>>
>>> The check in xc_domain_save() is a check left over from legacy
>>> migration.  It was present in legacy migration as legacy merged the page
>>> type into the upper 4 bits of the unsigned long pfn, meaning that a pfn
>>> of (2^28)-1 was the INVALID_PFN representation in the 32bit stream.
>>>
>>> The check is less important for migration v2, but I left it in as an
>>> upper sanity check.
>>>
>>> I am not aversed to removing the check if we are no longer going to
>>> believe the result of XENMEM_maximum_gpfn, and requiring that setup()
>>> get and sanity check the domain size.
>> As I'm planning to support migration of larger domains in the future
>> I think the check has to go away. Why not now.
> 
> Ideally, there should still be a sanity check.  Ian's idea of the caller
> passing the expected size seems like a good candidate.

But which size is expected? I think it would make more sense to limit
the needed dom0 memory to some fraction of the guest size. We could use
a default which might be configurable globally or per domain.


Juergen

  reply	other threads:[~2015-11-30 12:16 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-27 14:50 [PATCH] libxc: try to find last used pfn when migrating Juergen Gross
2015-11-27 15:01 ` David Vrabel
2015-11-27 15:11   ` Juergen Gross
2015-11-27 15:18     ` Ian Campbell
2015-11-27 15:20     ` David Vrabel
2015-11-27 15:22     ` Ian Jackson
2015-11-27 15:50       ` Juergen Gross
2015-11-27 15:33 ` Wei Liu
2015-11-27 15:53   ` Juergen Gross
2015-11-27 17:16     ` Andrew Cooper
2015-11-30  8:17       ` Juergen Gross
2015-11-30  9:47         ` Andrew Cooper
2015-11-30 12:16           ` Juergen Gross [this message]
2015-11-27 16:42 ` Andrew Cooper
2015-11-27 16:51   ` Juergen Gross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=565C3E39.6020001@suse.com \
    --to=jgross@suse.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.