From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>
Subject: Re: [PATCH] x86/PV: fix unintended dependency of m2p-strict mode on migration-v2
Date: Mon, 1 Feb 2016 14:07:48 +0000
Message-ID: <56AF66B4.4000307@citrix.com>
In-Reply-To: <56AF69B102000078000CCFB0@prv-mh.provo.novell.com>

On 01/02/16 13:20, Jan Beulich wrote:
> Ping? (I'd really like to get this resolved, so we don't need to
> indefinitely run with non-upstream behavior in our distros.)
>
> Thanks, Jan

My remaining issue is whether this loop gets executed by default.

I realise that there is a difference between legacy and v2 migration,
and that v2 migration by default worked.  If that means we managed to
skip this loop in its entirety for v2, then I am far less concerned
about the overhead.
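
To make the question concrete, here is a toy model of the guard suggested
earlier in the thread: only walk the page list when the kernel L4 is still
missing the RO mapping. This is a sketch under stated assumptions, not Xen
code. The names toy_fixup, toy_fill_ro_mpt, struct page and has_ro_mpt are
all hypothetical stand-ins, and locking and preemption are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page type bits in struct page_info. */
enum page_type { PG_DATA, PG_L4 };

struct page {
    enum page_type type;
    bool has_ro_mpt; /* models the RO_MPT_VIRT_{START,END} L4 entry */
};

/* Toy counterpart of fill_ro_mpt(): true only if the entry was installed. */
static bool toy_fill_ro_mpt(struct page *pg)
{
    if (pg->has_ro_mpt)
        return false;
    pg->has_ro_mpt = true;
    return true;
}

/*
 * Fix up every L4 page, but skip the walk entirely when the kernel L4
 * already carries the RO mapping. Returns the number of pages visited,
 * so 0 means the guard avoided the loop.
 */
static size_t toy_fixup(struct page *pages, size_t n,
                        const struct page *kernel_l4)
{
    size_t visited = 0;

    assert(pages != NULL && kernel_l4 != NULL);

    if (kernel_l4->has_ro_mpt)
        return 0; /* assumed fast path: nothing to fix */

    for (size_t i = 0; i < n; i++) {
        visited++;
        if (pages[i].type == PG_L4)
            toy_fill_ro_mpt(&pages[i]);
    }

    return visited;
}
```

With a guard of this shape, the cost proportional to domain size is only
paid when the kernel L4 was initialized in the wrong mode.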

~Andrew


>
>>>> On 13.01.16 at 17:15, <JBeulich@suse.com> wrote:
>>>>> On 13.01.16 at 17:00, <andrew.cooper3@citrix.com> wrote:
>>> On 13/01/16 15:36, Jan Beulich wrote:
>>>>>>> On 13.01.16 at 16:25, <andrew.cooper3@citrix.com> wrote:
>>>>> On 12/01/16 15:19, Jan Beulich wrote:
>>>>>>>>> On 12.01.16 at 12:55, <andrew.cooper3@citrix.com> wrote:
>>>>>>> On 12/01/16 10:08, Jan Beulich wrote:
>>>>>>>> This went unnoticed until a backport of this to an older Xen got used,
>>>>>>>> causing migration of guests enabling this VM assist to fail, because
>>>>>>>> page table pinning there precedes vCPU context loading, and hence L4
>>>>>>>> tables get initialized for the wrong mode. Fix this by post-processing
>>>>>>>> L4 tables when setting the intended VM assist flags for the guest.
>>>>>>>>
>>>>>>>> Note that this leaves in place a dependency on vCPU 0 getting its guest
>>>>>>>> context restored first, but afaict the logic here is not the only thing
>>>>>>>> depending on that.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>>>>>
>>>>>>>> --- a/xen/arch/x86/domain.c
>>>>>>>> +++ b/xen/arch/x86/domain.c
>>>>>>>> @@ -1067,8 +1067,48 @@ int arch_set_info_guest(
>>>>>>>>          goto out;
>>>>>>>>  
>>>>>>>>      if ( v->vcpu_id == 0 )
>>>>>>>> +    {
>>>>>>>>          d->vm_assist = c(vm_assist);
>>>>>>>>  
>>>>>>>> +        /*
>>>>>>>> +         * In the restore case we need to deal with L4 pages which got
>>>>>>>> +         * initialized with m2p_strict still clear (and which hence lack the
>>>>>>>> +         * correct initial RO_MPT_VIRT_{START,END} L4 entry).
>>>>>>>> +         */
>>>>>>>> +        if ( d != current->domain && VM_ASSIST(d, m2p_strict) &&
>>>>>>>> +             is_pv_domain(d) && !is_pv_32bit_domain(d) &&
>>>>>>>> +             atomic_read(&d->arch.pv_domain.nr_l4_pages) )
>>>>>>>> +        {
>>>>>>>> +            bool_t done = 0;
>>>>>>>> +
>>>>>>>> +            spin_lock_recursive(&d->page_alloc_lock);
>>>>>>>> +
>>>>>>>> +            for ( i = 0; ; )
>>>>>>>> +            {
>>>>>>>> +                struct page_info *page = page_list_remove_head(&d->page_list);
>>>>>>>> +
>>>>>>>> +                if ( page_lock(page) )
>>>>>>>> +                {
>>>>>>>> +                    if ( (page->u.inuse.type_info & PGT_type_mask) ==
>>>>>>>> +                         PGT_l4_page_table )
>>>>>>>> +                        done = !fill_ro_mpt(page_to_mfn(page));
>>>>>>>> +
>>>>>>>> +                    page_unlock(page);
>>>>>>>> +                }
>>>>>>>> +
>>>>>>>> +                page_list_add_tail(page, &d->page_list);
>>>>>>>> +
>>>>>>>> +                if ( done || (!(++i & 0xff) && hypercall_preempt_check()) )
>>>>>>>> +                    break;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            spin_unlock_recursive(&d->page_alloc_lock);
>>>>>>>> +
>>>>>>>> +            if ( !done )
>>>>>>>> +                return -ERESTART;
>>>>>>> This is a long loop.  It is preemptible, but will incur a time delay
>>>>>>> proportional to the size of the domain during the VM downtime. 
>>>>>>>
>>>>>>> Could you defer the loop until after %cr3 has been set up, and only
>>>>>>> enter the loop if the kernel l4 table is missing the RO mappings?  That
>>>>>>> way, domains migrated with migration v2 will skip the loop entirely.
>>>>>> Well, first of all this would be the result only as long as you or
>>>>>> someone else don't re-think and possibly move pinning ahead of
>>>>>> context load again.
>>>>> A second set_context() will unconditionally hit the loop though.
>>>> Right - another argument against making any change to what is
>>>> in the patch right now.
>>> If there are any L4 pages, the current code will unconditionally search
>>> the pagelist on every entry to the function, even when it has already
>>> fixed up the strictness.
>>>
>>> A toolstack can enter this function multiple times for the same vcpu,
>>> by resetting the vcpu state in between.  How much do we care about this
>>> usage?
>> If we cared at all, we'd need to insert another similar piece of
>> code in the reset path (moving L4s back to m2p-relaxed mode).
>>
>> Jan