* [PATCH] xen: correctly rebuild mfn list list after migration.
@ 2010-10-12 10:14 Ian Campbell
2010-10-14 0:37 ` Jeremy Fitzhardinge
2010-10-14 1:43 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 10+ messages in thread
From: Ian Campbell @ 2010-10-12 10:14 UTC (permalink / raw)
To: jeremy; +Cc: xen-devel, Ian Campbell
Otherwise the second migration attempt fails because the mfn_list_list
still refers to all the old mfns.
We need to update the entires in both p2m_top_mfn and the mid_mfn
pages which p2m_top_mfn refers to.
In order to do this we need to keep track of the virtual addresses
mapping the p2m_mid_mfn pages since we cannot rely on
mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still
contain the old MFN after a migration, which may now belong to another
domain and hence have a different mapping in the m2p.
Therefore add and maintain a third top level page, p2m_mid_mfn_p[],
which tracks the virtual addresses of the mfns contained in
p2m_top_mfn[].
We also need to update the content of the p2m_mid_missing_mfn page on
resume to refer to the page's new mfn.
p2m_missing does not need updating since the migration process takes
care of the leaf p2m pages for us.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
1 files changed, 38 insertions(+), 11 deletions(-)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 16a8e25..8788064 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
* / \ / \ / /
* p2m p2m p2m p2m p2m p2m p2m ...
*
+ * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p.
+ *
* The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
* maximum representable pseudo-physical address space is:
* P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages
@@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE);
RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
@@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top)
top[i] = virt_to_mfn(p2m_mid_missing_mfn);
}
+static void p2m_mid_mfn_p_init(unsigned long **top)
+{
+ unsigned i;
+
+ for (i = 0; i < P2M_TOP_PER_PAGE; i++)
+ top[i] = p2m_mid_missing_mfn;
+}
+
static void p2m_mid_init(unsigned long **mid)
{
unsigned i;
@@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr);
*/
void xen_build_mfn_list_list(void)
{
- unsigned pfn;
+ unsigned long pfn;
/* Pre-initialize p2m_top_mfn to be completely missing */
if (p2m_top_mfn == NULL) {
p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
p2m_mid_mfn_init(p2m_mid_missing_mfn);
+ p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
+ p2m_mid_mfn_p_init(p2m_mid_mfn_p);
+
p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
p2m_top_mfn_init(p2m_top_mfn);
+ } else {
+ /* Reinitialise, mfn's all change after migration */
+ p2m_mid_mfn_init(p2m_mid_missing_mfn);
}
for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
@@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void)
mid = p2m_top[topidx];
/* Don't bother allocating any mfn mid levels if
- they're just missing */
- if (mid[mididx] == p2m_missing)
+ * they're just missing, just update the stored mfn,
+ * since all could have changed over a migrate.
+ */
+ if (mid == p2m_mid_missing) {
+ p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing);
+ pfn += P2M_MID_PER_PAGE - 1;
continue;
+ }
- mid_mfn = p2m_top_mfn[topidx];
- mid_mfn_p = mfn_to_virt(mid_mfn);
+ mid_mfn_p = p2m_mid_mfn_p[topidx];
+ mid_mfn = virt_to_mfn(mid_mfn_p);
- if (mid_mfn_p == p2m_mid_missing_mfn) {
+ if (mid_mfn_p == p2m_missing) {
/*
* XXX boot-time only! We should never find
* missing parts of the mfn tree after
@@ -340,10 +362,11 @@ void xen_build_mfn_list_list(void)
p2m_mid_mfn_init(mid_mfn_p);
mid_mfn = virt_to_mfn(mid_mfn_p);
-
- p2m_top_mfn[topidx] = mid_mfn;
+ p2m_mid_mfn_p[topidx] = mid_mfn_p;
}
+ p2m_top_mfn[topidx] = mid_mfn;
+
mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]);
}
}
@@ -362,7 +385,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
{
unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list;
unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages);
- unsigned pfn;
+ unsigned long pfn;
xen_max_p2m_pfn = max_pfn;
@@ -452,7 +475,9 @@ static bool alloc_p2m(unsigned long pfn)
}
top_mfn_p = &p2m_top_mfn[topidx];
- mid_mfn = mfn_to_virt(*top_mfn_p);
+ mid_mfn = p2m_mid_mfn_p[topidx];
+
+ BUG_ON(mid_mfn != mfn_to_virt(*top_mfn_p));
if (mid_mfn == p2m_mid_missing_mfn) {
/* Separately check the mid mfn level */
@@ -464,11 +489,13 @@ static bool alloc_p2m(unsigned long pfn)
return false;
p2m_mid_mfn_init(mid_mfn);
-
+
missing_mfn = virt_to_mfn(p2m_mid_missing_mfn);
mid_mfn_mfn = virt_to_mfn(mid_mfn);
if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
free_p2m_page(mid_mfn);
+ else
+ p2m_mid_mfn_p[topidx] = mid_mfn;
}
if (p2m_top[topidx][mididx] == p2m_missing) {
--
1.5.6.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-12 10:14 [PATCH] xen: correctly rebuild mfn list list after migration Ian Campbell
@ 2010-10-14 0:37 ` Jeremy Fitzhardinge
2010-10-14 6:49 ` Ian Campbell
2010-10-14 1:43 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2010-10-14 0:37 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
On 10/12/2010 03:14 AM, Ian Campbell wrote:
> Otherwise the second migration attempt fails because the mfn_list_list
> still refers to all the old mfns.
>
> We need to update the entires in both p2m_top_mfn and the mid_mfn
> pages which p2m_top_mfn refers to.
>
> In order to do this we need to keep track of the virtual addresses
> mapping the p2m_mid_mfn pages since we cannot rely on
> mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still
> contain the old MFN after a migration, which may now belong to another
> domain and hence have a different mapping in the m2p.
>
> Therefore add and maintain a third top level page, p2m_mid_mfn_p[],
> which tracks the virtual addresses of the mfns contained in
> p2m_top_mfn[].
>
> We also need to update the content of the p2m_mid_missing_mfn page on
> resume to refer to the page's new mfn.
>
> p2m_missing does not need updating since the migration process takes
> care of the leaf p2m pages for us.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
> arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
> 1 files changed, 38 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 16a8e25..8788064 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
> * / \ / \ / /
> * p2m p2m p2m p2m p2m p2m p2m ...
> *
> + * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p.
> + *
> * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
> * maximum representable pseudo-physical address space is:
> * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages
> @@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
>
> static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
> static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
> +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE);
>
> RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
> RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
> @@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top)
> top[i] = virt_to_mfn(p2m_mid_missing_mfn);
> }
>
> +static void p2m_mid_mfn_p_init(unsigned long **top)
> +{
> + unsigned i;
> +
> + for (i = 0; i < P2M_TOP_PER_PAGE; i++)
> + top[i] = p2m_mid_missing_mfn;
> +}
> +
> static void p2m_mid_init(unsigned long **mid)
> {
> unsigned i;
> @@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr);
> */
> void xen_build_mfn_list_list(void)
> {
> - unsigned pfn;
> + unsigned long pfn;
>
> /* Pre-initialize p2m_top_mfn to be completely missing */
> if (p2m_top_mfn == NULL) {
> p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
> p2m_mid_mfn_init(p2m_mid_missing_mfn);
>
> + p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
> + p2m_mid_mfn_p_init(p2m_mid_mfn_p);
> +
> p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
> p2m_top_mfn_init(p2m_top_mfn);
> + } else {
> + /* Reinitialise, mfn's all change after migration */
> + p2m_mid_mfn_init(p2m_mid_missing_mfn);
> }
>
> for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
> @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void)
> mid = p2m_top[topidx];
>
> /* Don't bother allocating any mfn mid levels if
> - they're just missing */
> - if (mid[mididx] == p2m_missing)
> + * they're just missing, just update the stored mfn,
> + * since all could have changed over a migrate.
> + */
> + if (mid == p2m_mid_missing) {
> + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing);
> + pfn += P2M_MID_PER_PAGE - 1;
> continue;
> + }
>
> - mid_mfn = p2m_top_mfn[topidx];
> - mid_mfn_p = mfn_to_virt(mid_mfn);
> + mid_mfn_p = p2m_mid_mfn_p[topidx];
> + mid_mfn = virt_to_mfn(mid_mfn_p);
>
> - if (mid_mfn_p == p2m_mid_missing_mfn) {
> + if (mid_mfn_p == p2m_missing) {
> /*
> * XXX boot-time only! We should never find
> * missing parts of the mfn tree after
> @@ -340,10 +362,11 @@ void xen_build_mfn_list_list(void)
> p2m_mid_mfn_init(mid_mfn_p);
>
> mid_mfn = virt_to_mfn(mid_mfn_p);
> -
> - p2m_top_mfn[topidx] = mid_mfn;
> + p2m_mid_mfn_p[topidx] = mid_mfn_p;
> }
>
> + p2m_top_mfn[topidx] = mid_mfn;
> +
> mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]);
> }
> }
> @@ -362,7 +385,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
> {
> unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list;
> unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages);
> - unsigned pfn;
> + unsigned long pfn;
>
> xen_max_p2m_pfn = max_pfn;
>
> @@ -452,7 +475,9 @@ static bool alloc_p2m(unsigned long pfn)
> }
>
> top_mfn_p = &p2m_top_mfn[topidx];
> - mid_mfn = mfn_to_virt(*top_mfn_p);
> + mid_mfn = p2m_mid_mfn_p[topidx];
> +
> + BUG_ON(mid_mfn != mfn_to_virt(*top_mfn_p));
I'm getting this triggering at boot:
PM: Adding info for No Bus:xen!gntdev
------------[ cut here ]------------
kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
invalid opcode: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 6, comm: events/0 Not tainted 2.6.32.24-next #220
RIP: e030:[<ffffffff8100d715>] [<ffffffff8100d715>] set_phys_to_machine+0x12f/0x2d9
RSP: e02b:ffff88001fd8bca0 EFLAGS: 00010206
RAX: ffff88000227d000 RBX: ffffffff8227d000 RCX: 0000000000000016
RDX: ffff880000000000 RSI: 0000000000195a4e RDI: 00000000000207ff
RBP: ffff88001fd8bcf0 R08: 0000000000000003 R09: 00000003981ce0e5
R10: 0000000000000000 R11: 0000000000000003 R12: 00000000000207ff
R13: 0000000000195a4d R14: 0000000000000000 R15: ffffffff8227f000
FS: 0000000000000000(0000) GS:ffff8800051bd000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/0 (pid: 6, threadinfo ffff88001fd8a000, task ffff88001fd88180)
Stack:
2222222222222222 2222222222222222 2222222222222222 ffffffff82278000
<0> 0000000000000001 ffffea0000b2bfa8 00000000000207ff 0000000000000000
<0> 0000000000000200 0000000000000200 ffff88001fd8bdc0 ffffffff812a128e
Call Trace:
[<ffffffff812a128e>] balloon_process+0x20a/0x626
[<ffffffff81069cfb>] ? worker_thread+0x1fc/0x347
[<ffffffff81069d50>] worker_thread+0x251/0x347
[<ffffffff81069cfb>] ? worker_thread+0x1fc/0x347
[<ffffffff812a1084>] ? balloon_process+0x0/0x626
[<ffffffff8106e8ce>] ? autoremove_wake_function+0x0/0x39
[<ffffffff81069aff>] ? worker_thread+0x0/0x347
[<ffffffff8106e5f4>] kthread+0x7f/0x87
[<ffffffff81013e4a>] child_rip+0xa/0x20
[<ffffffff810137d0>] ? restore_args+0x0/0x30
[<ffffffff81013e40>] ? child_rip+0x0/0x20
Code: 83 c8 ff eb 10 48 c1 e0 03 31 d2 48 03 05 c4 c3 75 00 48 8b 00 48 c1 e0 0c 48 ba 00 00 00 00 00 88 ff ff 48 01 d0 48 39 c3 74 04 <0f> 0b eb fe 48 3b 1d 80 68 94 00 0f 85 bd 00 00 00 31 f6 bf d0
RIP [<ffffffff8100d715>] set_phys_to_machine+0x12f/0x2d9
RSP <ffff88001fd8bca0>
---[ end trace 93d72a36b9146f22 ]---
>
> if (mid_mfn == p2m_mid_missing_mfn) {
> /* Separately check the mid mfn level */
> @@ -464,11 +489,13 @@ static bool alloc_p2m(unsigned long pfn)
> return false;
>
> p2m_mid_mfn_init(mid_mfn);
> -
> +
> missing_mfn = virt_to_mfn(p2m_mid_missing_mfn);
> mid_mfn_mfn = virt_to_mfn(mid_mfn);
> if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
> free_p2m_page(mid_mfn);
> + else
> + p2m_mid_mfn_p[topidx] = mid_mfn;
> }
>
> if (p2m_top[topidx][mididx] == p2m_missing) {
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-12 10:14 [PATCH] xen: correctly rebuild mfn list list after migration Ian Campbell
2010-10-14 0:37 ` Jeremy Fitzhardinge
@ 2010-10-14 1:43 ` Jeremy Fitzhardinge
2010-10-14 6:55 ` Ian Campbell
1 sibling, 1 reply; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2010-10-14 1:43 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
On 10/12/2010 03:14 AM, Ian Campbell wrote:
> Otherwise the second migration attempt fails because the mfn_list_list
> still refers to all the old mfns.
>
> We need to update the entires in both p2m_top_mfn and the mid_mfn
> pages which p2m_top_mfn refers to.
>
> In order to do this we need to keep track of the virtual addresses
> mapping the p2m_mid_mfn pages since we cannot rely on
> mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still
> contain the old MFN after a migration, which may now belong to another
> domain and hence have a different mapping in the m2p.
>
> Therefore add and maintain a third top level page, p2m_mid_mfn_p[],
> which tracks the virtual addresses of the mfns contained in
> p2m_top_mfn[].
>
> We also need to update the content of the p2m_mid_missing_mfn page on
> resume to refer to the page's new mfn.
>
> p2m_missing does not need updating since the migration process takes
> care of the leaf p2m pages for us.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
> arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
> 1 files changed, 38 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 16a8e25..8788064 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
> * / \ / \ / /
> * p2m p2m p2m p2m p2m p2m p2m ...
> *
> + * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p.
> + *
> * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
> * maximum representable pseudo-physical address space is:
> * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages
> @@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
>
> static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
> static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
> +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE);
This should be called "p2m_top_mfn_p" to have consistent naming, since
its at the top of the hierarchy and is indexed by topidx. The "mid" in
the name makes it look like it should be indexed by mididx, but then it
wouldn't make sense to have just one of these (since there needs to be a
mid for each entry in top). The fact that it points to mids is
irrelevant (or at least implied by the fact that its a top).
>
> RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
> RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
> @@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top)
> top[i] = virt_to_mfn(p2m_mid_missing_mfn);
> }
>
> +static void p2m_mid_mfn_p_init(unsigned long **top)
> +{
> + unsigned i;
> +
> + for (i = 0; i < P2M_TOP_PER_PAGE; i++)
> + top[i] = p2m_mid_missing_mfn;
> +}
> +
> static void p2m_mid_init(unsigned long **mid)
> {
> unsigned i;
> @@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr);
> */
> void xen_build_mfn_list_list(void)
> {
> - unsigned pfn;
> + unsigned long pfn;
>
> /* Pre-initialize p2m_top_mfn to be completely missing */
> if (p2m_top_mfn == NULL) {
> p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
> p2m_mid_mfn_init(p2m_mid_missing_mfn);
>
> + p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
> + p2m_mid_mfn_p_init(p2m_mid_mfn_p);
> +
> p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
> p2m_top_mfn_init(p2m_top_mfn);
> + } else {
> + /* Reinitialise, mfn's all change after migration */
> + p2m_mid_mfn_init(p2m_mid_missing_mfn);
> }
>
> for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
> @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void)
> mid = p2m_top[topidx];
>
> /* Don't bother allocating any mfn mid levels if
> - they're just missing */
> - if (mid[mididx] == p2m_missing)
> + * they're just missing, just update the stored mfn,
> + * since all could have changed over a migrate.
> + */
> + if (mid == p2m_mid_missing) {
> + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing);
> + pfn += P2M_MID_PER_PAGE - 1;
Why -1? How does this interact with the "pfn += P2M_PER_PAGE" in the
for loop? Should it be "(P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE", with
the pfn += P2M_PER_PAGE moved to the bottom of the loop rather than in
the header?
Is this test even necessary? Is it to save redundant re-testing of the
mid level?
J
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-14 0:37 ` Jeremy Fitzhardinge
@ 2010-10-14 6:49 ` Ian Campbell
2010-10-14 8:51 ` Ian Campbell
0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2010-10-14 6:49 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com
On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
>
>
> I'm getting this triggering at boot:
>
> PM: Adding info for No Bus:xen!gntdev
> ------------[ cut here ]------------
> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
Probably a consequence of the bogus attempt to skip over completely
empty mid levels which you pointed out in your next mail.
I guess I should test 64 bit guests and not just PAE ones and I guess
pre-ballooning is likely to interact as well.
Ian.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-14 1:43 ` Jeremy Fitzhardinge
@ 2010-10-14 6:55 ` Ian Campbell
0 siblings, 0 replies; 10+ messages in thread
From: Ian Campbell @ 2010-10-14 6:55 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com
On Thu, 2010-10-14 at 02:43 +0100, Jeremy Fitzhardinge wrote:
> On 10/12/2010 03:14 AM, Ian Campbell wrote:
> > static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
> > static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
> > +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE);
>
> This should be called "p2m_top_mfn_p" to have consistent naming, since
> its at the top of the hierarchy and is indexed by topidx. The "mid" in
> the name makes it look like it should be indexed by mididx, but then it
> wouldn't make sense to have just one of these (since there needs to be a
> mid for each entry in top). The fact that it points to mids is
> irrelevant (or at least implied by the fact that its a top).
OK.
> > @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void)
> > mid = p2m_top[topidx];
> >
> > /* Don't bother allocating any mfn mid levels if
> > - they're just missing */
> > - if (mid[mididx] == p2m_missing)
> > + * they're just missing, just update the stored mfn,
> > + * since all could have changed over a migrate.
> > + */
> > + if (mid == p2m_mid_missing) {
> > + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing);
> > + pfn += P2M_MID_PER_PAGE - 1;
>
> Why -1? How does this interact with the "pfn += P2M_PER_PAGE" in the
> for loop? Should it be "(P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE", with
> the pfn += P2M_PER_PAGE moved to the bottom of the loop rather than in
> the header?
The -1 was supposed to offset the ++ in the for header, except as you
point out its actually a +=.
I'll figure out what the correct increment should be.
> Is this test even necessary? Is it to save redundant re-testing of the
> mid level?
It avoids descending P2M_MID_PER_PAGE times per p2m_mis_missing top
level entry into leaf entries which we know are going to be p2m_missing
because they are referred to by p2m_mid_missing. Perhaps its a premature
optimisation, if the test and increment end up too complex once I've
fixed them I may just drop it.
Ian.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-14 6:49 ` Ian Campbell
@ 2010-10-14 8:51 ` Ian Campbell
2010-10-14 18:27 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2010-10-14 8:51 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com
On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
> >
> >
> > I'm getting this triggering at boot:
> >
> > PM: Adding info for No Bus:xen!gntdev
> > ------------[ cut here ]------------
> > kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
>
> Probably a consequence of the bogus attempt to skip over completely
> empty mid levels which you pointed out in your next mail.
>
> I guess I should test 64 bit guests and not just PAE ones and I guess
> pre-ballooning is likely to interact as well.
I'm only able to reproduce this by booting ballooned and then ballooning
up, are you doing that or were you seeing it just by booting?
What are your memory and maxmem settings?
I'll obviously fix the issue I'm seeing and hope it also fixes yours but
even better would be to reproduce your failure here.
Ian.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-14 8:51 ` Ian Campbell
@ 2010-10-14 18:27 ` Jeremy Fitzhardinge
2010-10-21 10:10 ` Ian Campbell
0 siblings, 1 reply; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2010-10-14 18:27 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com
On 10/14/2010 01:51 AM, Ian Campbell wrote:
> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
>> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
>>>
>>> I'm getting this triggering at boot:
>>>
>>> PM: Adding info for No Bus:xen!gntdev
>>> ------------[ cut here ]------------
>>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
>> Probably a consequence of the bogus attempt to skip over completely
>> empty mid levels which you pointed out in your next mail.
>>
>> I guess I should test 64 bit guests and not just PAE ones and I guess
>> pre-ballooning is likely to interact as well.
> I'm only able to reproduce this by booting ballooned and then ballooning
> up, are you doing that or were you seeing it just by booting?
>
> What are your memory and maxmem settings?
memory=512, no maxmem.
J
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-14 18:27 ` Jeremy Fitzhardinge
@ 2010-10-21 10:10 ` Ian Campbell
2010-10-21 17:10 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2010-10-21 10:10 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com
On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote:
> On 10/14/2010 01:51 AM, Ian Campbell wrote:
> > On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
> >> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
> >>>
> >>> I'm getting this triggering at boot:
> >>>
> >>> PM: Adding info for No Bus:xen!gntdev
> >>> ------------[ cut here ]------------
> >>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
> >> Probably a consequence of the bogus attempt to skip over completely
> >> empty mid levels which you pointed out in your next mail.
> >>
> >> I guess I should test 64 bit guests and not just PAE ones and I guess
> >> pre-ballooning is likely to interact as well.
> > I'm only able to reproduce this by booting ballooned and then ballooning
> > up, are you doing that or were you seeing it just by booting?
> >
> > What are your memory and maxmem settings?
>
> memory=512, no maxmem.
The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an
address in the direct map and then compared it with a virtual address in
kernel map, IIRC). Here's an update patch which fixes all your previous
comments, this issue and a few other bits and bobs
* s/p2m_mid_mfn_p/p2m_top_mfn_p/g
* Fix BUG_ON in alloc_p2m
* Skip correct number of pfns when mid == p2m_mid_missing
* Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing
Ian.
8<--------
>From 0a771251cf7e31ff056e24feb3507b236be2a827 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@citrix.com>
Date: Thu, 21 Oct 2010 11:00:46 +0100
Subject: [PATCH] xen: correctly rebuild mfn list list after migration.
Otherwise the second migration attempt fails because the mfn_list_list
still refers to all the old mfns.
We need to update the entires in both p2m_top_mfn and the mid_mfn
pages which p2m_top_mfn refers to.
In order to do this we need to keep track of the virtual addresses
mapping the p2m_mid_mfn pages since we cannot rely on
mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still
contain the old MFN after a migration, which may now belong to another
domain and hence have a different mapping in the m2p.
Therefore add and maintain a third top level page, p2m_top_mfn_p[],
which tracks the virtual addresses of the mfns contained in
p2m_top_mfn[].
We also need to update the content of the p2m_mid_missing_mfn page on
resume to refer to the page's new mfn.
p2m_missing does not need updating since the migration process takes
care of the leaf p2m pages for us.
---
arch/x86/xen/mmu.c | 50 +++++++++++++++++++++++++++++++++++++-------------
1 files changed, 37 insertions(+), 13 deletions(-)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 15bbccd..ebb74ec 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -186,6 +186,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
* / \ / \ / /
* p2m p2m p2m p2m p2m p2m p2m ...
*
+ * The p2m_mid_mfn pages are mapped by p2m_top_mfn_p.
+ *
* The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
* maximum representable pseudo-physical address space is:
* P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages
@@ -210,6 +212,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long *, p2m_top_mfn_p, P2M_TOP_PER_PAGE);
RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
@@ -246,6 +249,14 @@ static void p2m_top_mfn_init(unsigned long *top)
top[i] = virt_to_mfn(p2m_mid_missing_mfn);
}
+static void p2m_top_mfn_p_init(unsigned long **top)
+{
+ unsigned i;
+
+ for (i = 0; i < P2M_TOP_PER_PAGE; i++)
+ top[i] = p2m_mid_missing_mfn;
+}
+
static void p2m_mid_init(unsigned long **mid)
{
unsigned i;
@@ -302,33 +313,43 @@ EXPORT_SYMBOL(create_lookup_pte_addr);
*/
void xen_build_mfn_list_list(void)
{
- unsigned pfn;
+ unsigned long pfn;
/* Pre-initialize p2m_top_mfn to be completely missing */
if (p2m_top_mfn == NULL) {
p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
p2m_mid_mfn_init(p2m_mid_missing_mfn);
+ p2m_top_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
+ p2m_top_mfn_p_init(p2m_top_mfn_p);
+
p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
p2m_top_mfn_init(p2m_top_mfn);
+ } else {
+ /* Reinitialise, mfn's all change after migration */
+ p2m_mid_mfn_init(p2m_mid_missing_mfn);
}
for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
unsigned topidx = p2m_top_index(pfn);
unsigned mididx = p2m_mid_index(pfn);
unsigned long **mid;
- unsigned long mid_mfn;
unsigned long *mid_mfn_p;
mid = p2m_top[topidx];
+ mid_mfn_p = p2m_top_mfn_p[topidx];
/* Don't bother allocating any mfn mid levels if
- they're just missing */
- if (mid[mididx] == p2m_missing)
+ * they're just missing, just update the stored mfn,
+ * since all could have changed over a migrate.
+ */
+ if (mid == p2m_mid_missing) {
+ BUG_ON(mididx);
+ BUG_ON(mid_mfn_p != p2m_mid_missing_mfn);
+ p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing_mfn);
+ pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE;
continue;
-
- mid_mfn = p2m_top_mfn[topidx];
- mid_mfn_p = mfn_to_virt(mid_mfn);
+ }
if (mid_mfn_p == p2m_mid_missing_mfn) {
/*
@@ -340,11 +361,10 @@ void xen_build_mfn_list_list(void)
mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
p2m_mid_mfn_init(mid_mfn_p);
- mid_mfn = virt_to_mfn(mid_mfn_p);
-
- p2m_top_mfn[topidx] = mid_mfn;
+ p2m_top_mfn_p[topidx] = mid_mfn_p;
}
+ p2m_top_mfn[topidx] = virt_to_mfn(mid_mfn_p);
mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]);
}
}
@@ -363,7 +383,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
{
unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list;
unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages);
- unsigned pfn;
+ unsigned long pfn;
xen_max_p2m_pfn = max_pfn;
@@ -453,7 +473,9 @@ static bool alloc_p2m(unsigned long pfn)
}
top_mfn_p = &p2m_top_mfn[topidx];
- mid_mfn = mfn_to_virt(*top_mfn_p);
+ mid_mfn = p2m_top_mfn_p[topidx];
+
+ BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p);
if (mid_mfn == p2m_mid_missing_mfn) {
/* Separately check the mid mfn level */
@@ -465,11 +487,13 @@ static bool alloc_p2m(unsigned long pfn)
return false;
p2m_mid_mfn_init(mid_mfn);
-
+
missing_mfn = virt_to_mfn(p2m_mid_missing_mfn);
mid_mfn_mfn = virt_to_mfn(mid_mfn);
if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
free_p2m_page(mid_mfn);
+ else
+ p2m_top_mfn_p[topidx] = mid_mfn;
}
if (p2m_top[topidx][mididx] == p2m_missing) {
--
1.5.6.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-21 10:10 ` Ian Campbell
@ 2010-10-21 17:10 ` Jeremy Fitzhardinge
2010-10-21 17:28 ` Ian Campbell
0 siblings, 1 reply; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2010-10-21 17:10 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com
On 10/21/2010 03:10 AM, Ian Campbell wrote:
> On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote:
>> On 10/14/2010 01:51 AM, Ian Campbell wrote:
>>> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
>>>> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
>>>>> I'm getting this triggering at boot:
>>>>>
>>>>> PM: Adding info for No Bus:xen!gntdev
>>>>> ------------[ cut here ]------------
>>>>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
>>>> Probably a consequence of the bogus attempt to skip over completely
>>>> empty mid levels which you pointed out in your next mail.
>>>>
>>>> I guess I should test 64 bit guests and not just PAE ones and I guess
>>>> pre-ballooning is likely to interact as well.
>>> I'm only able to reproduce this by booting ballooned and then ballooning
>>> up, are you doing that or were you seeing it just by booting?
>>>
>>> What are your memory and maxmem settings?
>> memory=512, no maxmem.
> The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an
> address in the direct map and then compared it with a virtual address in
> kernel map, IIRC). Here's an update patch which fixes all your previous
> comments, this issue and a few other bits and bobs
>
> * s/p2m_mid_mfn_p/p2m_top_mfn_p/g
> * Fix BUG_ON in alloc_p2m
> * Skip correct number of pfns when mid == p2m_mid_missing
> * Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing
OK, it passed my sniff test - it boots and save/restored a few times.
(I added a S-O-B for you.)
Thanks,
J
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH] xen: correctly rebuild mfn list list after migration.
2010-10-21 17:10 ` Jeremy Fitzhardinge
@ 2010-10-21 17:28 ` Ian Campbell
0 siblings, 0 replies; 10+ messages in thread
From: Ian Campbell @ 2010-10-21 17:28 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com
On Thu, 2010-10-21 at 18:10 +0100, Jeremy Fitzhardinge wrote:
> On 10/21/2010 03:10 AM, Ian Campbell wrote:
> > On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote:
> >> On 10/14/2010 01:51 AM, Ian Campbell wrote:
> >>> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
> >>>> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
> >>>>> I'm getting this triggering at boot:
> >>>>>
> >>>>> PM: Adding info for No Bus:xen!gntdev
> >>>>> ------------[ cut here ]------------
> >>>>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!
> >>>> Probably a consequence of the bogus attempt to skip over completely
> >>>> empty mid levels which you pointed out in your next mail.
> >>>>
> >>>> I guess I should test 64 bit guests and not just PAE ones and I guess
> >>>> pre-ballooning is likely to interact as well.
> >>> I'm only able to reproduce this by booting ballooned and then ballooning
> >>> up, are you doing that or were you seeing it just by booting?
> >>>
> >>> What are your memory and maxmem settings?
> >> memory=512, no maxmem.
> > The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an
> > address in the direct map and then compared it with a virtual address in
> > kernel map, IIRC). Here's an update patch which fixes all your previous
> > comments, this issue and a few other bits and bobs
> >
> > * s/p2m_mid_mfn_p/p2m_top_mfn_p/g
> > * Fix BUG_ON in alloc_p2m
> > * Skip correct number of pfns when mid == p2m_mid_missing
> > * Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing
>
> OK, it passed my sniff test - it boots and save/restored a few times.
Great.
> (I added a S-O-B for you.)
Oops, thanks.
Ian.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2010-10-21 17:28 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-12 10:14 [PATCH] xen: correctly rebuild mfn list list after migration Ian Campbell
2010-10-14 0:37 ` Jeremy Fitzhardinge
2010-10-14 6:49 ` Ian Campbell
2010-10-14 8:51 ` Ian Campbell
2010-10-14 18:27 ` Jeremy Fitzhardinge
2010-10-21 10:10 ` Ian Campbell
2010-10-21 17:10 ` Jeremy Fitzhardinge
2010-10-21 17:28 ` Ian Campbell
2010-10-14 1:43 ` Jeremy Fitzhardinge
2010-10-14 6:55 ` Ian Campbell
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.