From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Jürgen Groß" <jgross@suse.com>,
	xen-devel@lists.xensource.com, jbeulich@suse.com,
	konrad.wilk@oracle.com, david.vrabel@citrix.com
Subject: Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list
Date: Fri, 14 Nov 2014 14:59:37 +0000
Message-ID: <546618D9.5070200@citrix.com>
In-Reply-To: <54660E5C.8030107@suse.com>

On 14/11/14 14:14, Jürgen Groß wrote:
> On 11/14/2014 02:56 PM, Andrew Cooper wrote:
>> On 14/11/14 12:53, Juergen Gross wrote:
>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote:
>>>> On 14/11/14 09:37, Juergen Gross wrote:
>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
>>>>> currently contains the mfn of the top level page frame of the 3 level
>>>>> p2m tree, which is used by the Xen tools during saving and restoring
>>>>> (and live migration) of pv domains and for crash dump analysis. With
>>>>> three levels of the p2m tree it is possible to support up to 512 GB of
>>>>> RAM for a 64 bit pv domain.
>>>>>
>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024
>>>>> instead of 512 entries, leading to a limit of 4 TB.
>>>>>
>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped
>>>>> p2m list.
>>>>>
>>>>> This patch expands struct arch_shared_info with a new p2m list virtual
>>>>> address and the mfn of the page table root. The new information is
>>>>> indicated by the domain to be valid by storing ~0UL into
>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this
>>>>> feature by a new flag XENFEAT_virtual_p2m.
>>>>
>>>> How do you envisage this being used?  Are you expecting the tools to do
>>>> manual pagetable walks using xc_map_foreign_xxx() ?
>>>
>>> Yes. Not very different compared to today's mapping via the 3 level
>>> p2m tree. Just another entry format, 4 instead of 3 levels and starting
>>> at an offset.
>>
>> Yes - David and I were discussing this over lunch, and it is not
>> actually very different.
>>
>> In reality, how likely is it that the pages backing this virtual linear
>> array change?
>
> Very unlikely, I think. But not impossible.
>
>> One issue currently is that, during the live part of migration, the
>> toolstack has no way of working out whether the structure of the p2m has
>> changed (intermediate leaves rearranged, or the length increasing).
>>
>> In the case that the VM does change the structure of the p2m under the
>> feet of the toolstack, migration will either blow up in a non-subtle way
>> with a p2m/m2p mismatch, or in a subtle way with the receiving side
>> copying the new p2m over the wrong part of the new domain.
>>
>> I am wondering whether, with this new p2m method, we can take sufficient
>> steps to be able to guarantee mishaps like this can't occur.
>
> This should be easy: I could add a counter in arch_shared_info which is
> incremented whenever a p2m mapping is being changed. The toolstack could
> compare the counter values before start and at end of migration and redo
> the migration (or fail) if they are different. In order to avoid races
> I would have to increment the counter before and after changing the
> mapping.
>

That is insufficient I believe.

Consider:

* Toolstack walks pagetables and maps the frames containing the linear p2m
* Live migration starts
* VM remaps a frame in the middle of the linear p2m
* Live migration continues, but the toolstack has a stale frame in the
  middle of its view of the p2m.
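
As a rough illustration of that sequence, here is a toy model in C. It is
not Xen or libxc code: struct guest_p2m, struct toolstack_view and every
function name are made up for this sketch, and the "frames" are plain
integers. It only shows that a snapshot taken by one pagetable walk goes
stale once the guest remaps a backing frame, and that a counter comparison
reports this only at the point where the toolstack next looks at it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P2M_FRAMES 4  /* hypothetical number of frames backing the linear p2m */

/* Guest side: the machine frame currently backing each page of the p2m. */
struct guest_p2m {
    uint64_t backing_mfn[P2M_FRAMES];
    unsigned long change_count;  /* models the counter proposed in the thread */
};

/* Toolstack side: what one pagetable walk saw before migration started. */
struct toolstack_view {
    uint64_t mapped_mfn[P2M_FRAMES];
    unsigned long count_at_start;
};

/* Toolstack walks the pagetables once and remembers the backing frames. */
void toolstack_snapshot(const struct guest_p2m *g, struct toolstack_view *v)
{
    for (size_t i = 0; i < P2M_FRAMES; i++)
        v->mapped_mfn[i] = g->backing_mfn[i];
    v->count_at_start = g->change_count;
}

/* Guest remaps one backing frame, bumping the counter before and after. */
void guest_remap_frame(struct guest_p2m *g, size_t idx, uint64_t new_mfn)
{
    assert(idx < P2M_FRAMES);
    g->change_count++;
    g->backing_mfn[idx] = new_mfn;
    g->change_count++;
}

/* Number of frames where the toolstack's snapshot no longer matches. */
size_t stale_frames(const struct guest_p2m *g, const struct toolstack_view *v)
{
    size_t n = 0;
    for (size_t i = 0; i < P2M_FRAMES; i++)
        if (v->mapped_mfn[i] != g->backing_mfn[i])
            n++;
    return n;
}

/* The end-of-migration check: has the counter moved since the snapshot? */
int counter_changed(const struct guest_p2m *g, const struct toolstack_view *v)
{
    return g->change_count != v->count_at_start;
}
```

In this model, anything the toolstack reads through mapped_mfn[] between
guest_remap_frame() and the next counter_changed() check comes from the
stale frame.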

As the p2m is almost never expected to change, I think it might be
better to have a flag the toolstack can set to say "The toolstack is
peeking at your p2m behind your back - you must not change its structure."
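
A minimal sketch of what such a flag could look like, again as a toy model
in C rather than real Xen interface code. The struct shared_p2m_info
layout, the toolstack_peeking field and all function names are assumptions
made for illustration; nothing here is an existing arch_shared_info field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define P2M_FRAMES 4  /* hypothetical number of frames backing the linear p2m */

/* Hypothetical shared state visible to both the guest and the toolstack. */
struct shared_p2m_info {
    atomic_bool toolstack_peeking;      /* set while the toolstack has the p2m mapped */
    uint64_t backing_mfn[P2M_FRAMES];   /* frames backing the linear p2m */
};

void shared_p2m_init(struct shared_p2m_info *s, const uint64_t *mfns)
{
    atomic_init(&s->toolstack_peeking, false);
    for (size_t i = 0; i < P2M_FRAMES; i++)
        s->backing_mfn[i] = mfns[i];
}

/* Toolstack: announce that it is about to walk and map the p2m. */
void toolstack_begin_peek(struct shared_p2m_info *s)
{
    atomic_store(&s->toolstack_peeking, true);
}

/* Toolstack: finished with its mappings, structure may change again. */
void toolstack_end_peek(struct shared_p2m_info *s)
{
    atomic_store(&s->toolstack_peeking, false);
}

/*
 * Guest: try to change which frame backs part of the p2m.
 * Returns true if the remap was applied, false if it was refused
 * because the toolstack has declared that it is looking.
 */
bool guest_try_remap_frame(struct shared_p2m_info *s, size_t idx,
                           uint64_t new_mfn)
{
    assert(idx < P2M_FRAMES);
    if (atomic_load(&s->toolstack_peeking))
        return false;
    s->backing_mfn[idx] = new_mfn;
    return true;
}
```

This sketch is deliberately simplistic: the check in
guest_try_remap_frame() and the store in toolstack_begin_peek() can still
race if they run concurrently, so a real design would need to close that
window as well.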

Having just thought this through, I think there is also a race condition
between a VM changing an entry in the p2m and the toolstack verifying the
frames being sent.

~Andrew
