* [PATCH] xen/balloon: flush unused mappings before updating P2M table
@ 2014-03-14 16:21 Wei Liu
  2014-03-14 18:05 ` David Vrabel
  0 siblings, 1 reply; 5+ messages in thread
From: Wei Liu @ 2014-03-14 16:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, Wei Liu, David Vrabel

The Xen balloon driver updates ballooned-out pages' P2M entries to point
to the scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
page while non-preemptible"), kmap_flush_unused was moved after the
update of the P2M table. In that case, a 32-bit PV guest might end up
with

P2M    X -----> scratch_page
M2P    Y -----> X  (Y is mfn in unused kmap entry)

When PVMMU is consulted, it gets confused and returns the wrong value.
Eventually the guest crashes.

Move the flush before __set_phys_to_machine to fix this.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 drivers/xen/balloon.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 37d06ea..49a809c 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -392,6 +392,10 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 	if (nr_pages > ARRAY_SIZE(frame_list))
 		nr_pages = ARRAY_SIZE(frame_list);
 
+	/* Ensure that ballooned highmem pages don't have kmaps. */
+	kmap_flush_unused();
+	flush_tlb_all();
+
 	for (i = 0; i < nr_pages; i++) {
 		page = alloc_page(gfp);
 		if (page == NULL) {
@@ -432,10 +436,6 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 		balloon_append(pfn_to_page(pfn));
 	}
 
-	/* Ensure that ballooned highmem pages don't have kmaps. */
-	kmap_flush_unused();
-	flush_tlb_all();
-
 	set_xen_guest_handle(reservation.extent_start, frame_list);
 	reservation.nr_extents   = nr_pages;
 	ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
-- 
1.7.10.4

* Re: [PATCH] xen/balloon: flush unused mappings before updating P2M table
  2014-03-14 16:21 [PATCH] xen/balloon: flush unused mappings before updating P2M table Wei Liu
@ 2014-03-14 18:05 ` David Vrabel
  2014-03-14 18:27   ` Wei Liu
  0 siblings, 1 reply; 5+ messages in thread
From: David Vrabel @ 2014-03-14 18:05 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Boris Ostrovsky, Tim Deegan

On 14/03/14 16:21, Wei Liu wrote:
> The Xen balloon driver updates ballooned-out pages' P2M entries to point
> to the scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
> page while non-preemptible"), kmap_flush_unused was moved after the
> update of the P2M table. In that case, a 32-bit PV guest might end up
> with
> 
> P2M    X -----> scratch_page
> M2P    Y -----> X  (Y is mfn in unused kmap entry)
> 
> When PVMMU is consulted, it gets confused and returns the wrong value.
> Eventually the guest crashes.
> 
> Move the flush before __set_phys_to_machine to fix this.

The scrub_page() will immediately repopulate the kmap cache with the MFN
about to be returned to Xen so this isn't the correct place.

I don't understand your description of the problem so I cannot suggest a
correct fix.  What's consulting what?

As an aside, I do think the flush_tlb_all() is unnecessary since Xen
does that for us in the update_va_mapping hypercall.  I think. Tim, can
you confirm?

David

* Re: [PATCH] xen/balloon: flush unused mappings before updating P2M table
  2014-03-14 18:05 ` David Vrabel
@ 2014-03-14 18:27   ` Wei Liu
  2014-03-14 18:44     ` David Vrabel
  0 siblings, 1 reply; 5+ messages in thread
From: Wei Liu @ 2014-03-14 18:27 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, Boris Ostrovsky, Tim Deegan, Wei Liu

On Fri, Mar 14, 2014 at 06:05:50PM +0000, David Vrabel wrote:
> On 14/03/14 16:21, Wei Liu wrote:
> > The Xen balloon driver updates ballooned-out pages' P2M entries to point
> > to the scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
> > page while non-preemptible"), kmap_flush_unused was moved after the
> > update of the P2M table. In that case, a 32-bit PV guest might end up
> > with
> > 
> > P2M    X -----> scratch_page
> > M2P    Y -----> X  (Y is mfn in unused kmap entry)
> > 
> > When PVMMU is consulted, it gets confused and returns the wrong value.
> > Eventually the guest crashes.
> > 
> > Move the flush before __set_phys_to_machine to fix this.
> 
> The scrub_page() will immediately repopulate the kmap cache with the MFN
> about to be returned to Xen so this isn't the correct place.
> 

If XEN_SCRUB_PAGE is not set then scrub_page is a no-op. Even if
XEN_SCRUB_PAGE is set, the call to clear_highpage affects the per-CPU kmap,
not the persistent kmap. kmap_flush_unused affects the persistent kmap.

> I don't understand your description of the problem so I cannot suggest a
> correct fix.  What's consulting what?
> 

kmap_flush_unused consults PVMMU. It goes through all global kmap slots
and tries to clear the unused ones. It calls flush_all_zero_pkmaps, which
calls pte_page, which eventually goes to PVMMU.
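
To make the failure concrete, here is a tiny userspace model of the lookup.
It is only a sketch, not the kernel's code: the p2m/m2p arrays, the TOY_*
constants and the toy_* helpers are all hypothetical names, and it assumes
that the PV MMU translates the MFN found in a PTE back to a PFN through the
M2P and only trusts the answer if the P2M maps that PFN back to the same MFN.

```c
#include <stddef.h>

#define TOY_NR_FRAMES 8
#define TOY_INVALID   (~0UL)

/* Hypothetical stand-ins for the guest's P2M and Xen's M2P tables. */
static unsigned long p2m[TOY_NR_FRAMES];
static unsigned long m2p[TOY_NR_FRAMES];

/* Start with no frame mapped in either direction. */
static void toy_init(void)
{
	for (size_t i = 0; i < TOY_NR_FRAMES; i++) {
		p2m[i] = TOY_INVALID;
		m2p[i] = TOY_INVALID;
	}
}

/* Record a consistent PFN <-> MFN pair in both tables. */
static void toy_map(unsigned long pfn, unsigned long mfn)
{
	if (pfn >= TOY_NR_FRAMES || mfn >= TOY_NR_FRAMES)
		return;
	p2m[pfn] = mfn;
	m2p[mfn] = pfn;
}

/*
 * Translate an MFN read from a PTE back to a PFN. The M2P result is
 * only trusted if the P2M maps that PFN back to the same MFN.
 */
static unsigned long toy_mfn_to_pfn(unsigned long mfn)
{
	unsigned long pfn;

	if (mfn >= TOY_NR_FRAMES)
		return TOY_INVALID;
	pfn = m2p[mfn];
	if (pfn >= TOY_NR_FRAMES || p2m[pfn] != mfn)
		return TOY_INVALID;
	return pfn;
}

/* Model of ballooning out: only the P2M side is redirected. */
static void toy_balloon_out(unsigned long pfn, unsigned long scratch_mfn)
{
	if (pfn < TOY_NR_FRAMES)
		p2m[pfn] = scratch_mfn;
}
```

In this model, after toy_map(2, 5) the call toy_mfn_to_pfn(5) finds PFN 2.
Once toy_balloon_out(2, 7) has redirected the P2M entry, a stale kmap PTE
that still holds MFN 5 no longer translates back to PFN 2, which is the
state shown in the P2M/M2P diagram.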

But I just discovered something new, so this patch can be dropped for the
moment.

Wei.

> As an aside, I do think the flush_tlb_all() is unnecessary since Xen
> does that for us in the update_va_mapping hypercall.  I think. Tim, can
> you confirm?
> 
> David

* Re: [PATCH] xen/balloon: flush unused mappings before updating P2M table
  2014-03-14 18:27   ` Wei Liu
@ 2014-03-14 18:44     ` David Vrabel
  2014-03-14 18:57       ` Wei Liu
  0 siblings, 1 reply; 5+ messages in thread
From: David Vrabel @ 2014-03-14 18:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Boris Ostrovsky, Tim Deegan

On 14/03/14 18:27, Wei Liu wrote:
> On Fri, Mar 14, 2014 at 06:05:50PM +0000, David Vrabel wrote:
>> On 14/03/14 16:21, Wei Liu wrote:
>>> The Xen balloon driver updates ballooned-out pages' P2M entries to point
>>> to the scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
>>> page while non-preemptible"), kmap_flush_unused was moved after the
>>> update of the P2M table. In that case, a 32-bit PV guest might end up
>>> with
>>>
>>> P2M    X -----> scratch_page
>>> M2P    Y -----> X  (Y is mfn in unused kmap entry)
>>>
>>> When PVMMU is consulted, it gets confused and returns the wrong value.
>>> Eventually the guest crashes.
>>>
>>> Move the flush before __set_phys_to_machine to fix this.
>>
>> The scrub_page() will immediately repopulate the kmap cache with the MFN
>> about to be returned to Xen so this isn't the correct place.
>>
> 
> If XEN_SCRUB_PAGE is not set then scrub_page is a no-op. Even if
> XEN_SCRUB_PAGE is set, the call to clear_highpage affects the per-CPU kmap,
> not the persistent kmap. kmap_flush_unused affects the persistent kmap.
> 
>> I don't understand your description of the problem so I cannot suggest a
>> correct fix.  What's consulting what?
>>
> 
> kmap_flush_unused consults PVMMU. It goes through all global kmap slots
> and tries to clear the unused ones. It calls flush_all_zero_pkmaps, which
> calls pte_page, which eventually goes to PVMMU.

Ok, that's a real bug then.  The P2M cannot be changed if there are
still mappings for that PFN.
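
The ordering rule can be sketched as a small userspace model. Everything
here is hypothetical (toy_p2m, toy_pkmap, the toy_* helpers) and is not how
the kernel enforces anything: it assumes a persistent kmap slot simply
remembers the MFN it points at, and it refuses a P2M update while any slot
still references the PFN's current frame.

```c
#include <stddef.h>

#define TOY_NR_FRAMES 8
#define TOY_NR_SLOTS  4
#define TOY_NO_MFN    (~0UL)

/* Hypothetical model of one persistent kmap slot. */
struct toy_pkmap_slot {
	unsigned long mfn;	/* frame the slot's PTE points at, or TOY_NO_MFN */
	int users;		/* 0 means cached but currently unused */
};

static unsigned long toy_p2m[TOY_NR_FRAMES];
static struct toy_pkmap_slot toy_pkmap[TOY_NR_SLOTS];

/* Identity P2M and no cached kmap slots. */
static void toy_reset(void)
{
	for (size_t i = 0; i < TOY_NR_FRAMES; i++)
		toy_p2m[i] = i;
	for (size_t s = 0; s < TOY_NR_SLOTS; s++) {
		toy_pkmap[s].mfn = TOY_NO_MFN;
		toy_pkmap[s].users = 0;
	}
}

/* Drop every cached slot nobody is using (models kmap_flush_unused). */
static void toy_flush_unused(void)
{
	for (size_t s = 0; s < TOY_NR_SLOTS; s++)
		if (toy_pkmap[s].users == 0)
			toy_pkmap[s].mfn = TOY_NO_MFN;
}

/* Returns 0 on success, -1 if a mapping of the PFN's frame still exists. */
static int toy_set_p2m_checked(unsigned long pfn, unsigned long new_mfn)
{
	if (pfn >= TOY_NR_FRAMES)
		return -1;
	for (size_t s = 0; s < TOY_NR_SLOTS; s++)
		if (toy_pkmap[s].mfn == toy_p2m[pfn])
			return -1;
	toy_p2m[pfn] = new_mfn;
	return 0;
}
```

With a stale, unused slot pointing at the frame behind PFN 3,
toy_set_p2m_checked(3, 0) fails; calling toy_flush_unused() first lets the
same update succeed, which is the ordering the patch restores.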

David

* Re: [PATCH] xen/balloon: flush unused mappings before updating P2M table
  2014-03-14 18:44     ` David Vrabel
@ 2014-03-14 18:57       ` Wei Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Wei Liu @ 2014-03-14 18:57 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, Boris Ostrovsky, Tim Deegan, Wei Liu

On Fri, Mar 14, 2014 at 06:44:54PM +0000, David Vrabel wrote:
> On 14/03/14 18:27, Wei Liu wrote:
> > On Fri, Mar 14, 2014 at 06:05:50PM +0000, David Vrabel wrote:
> >> On 14/03/14 16:21, Wei Liu wrote:
> >>> The Xen balloon driver updates ballooned-out pages' P2M entries to point
> >>> to the scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
> >>> page while non-preemptible"), kmap_flush_unused was moved after the
> >>> update of the P2M table. In that case, a 32-bit PV guest might end up
> >>> with
> >>>
> >>> P2M    X -----> scratch_page
> >>> M2P    Y -----> X  (Y is mfn in unused kmap entry)
> >>>
> >>> When PVMMU is consulted, it gets confused and returns the wrong value.
> >>> Eventually the guest crashes.
> >>>
> >>> Move the flush before __set_phys_to_machine to fix this.
> >>
> >> The scrub_page() will immediately repopulate the kmap cache with the MFN
> >> about to be returned to Xen so this isn't the correct place.
> >>
> > 
> > If XEN_SCRUB_PAGE is not set then scrub_page is a no-op. Even if
> > XEN_SCRUB_PAGE is set, the call to clear_highpage affects the per-CPU kmap,
> > not the persistent kmap. kmap_flush_unused affects the persistent kmap.
> > 
> >> I don't understand your description of the problem so I cannot suggest a
> >> correct fix.  What's consulting what?
> >>
> > 
> > kmap_flush_unused consults PVMMU. It goes through all global kmap slots
> > and tries to clear the unused ones. It calls flush_all_zero_pkmaps, which
> > calls pte_page, which eventually goes to PVMMU.
> 
> Ok, that's a real bug then.  The P2M cannot be changed if there are
> still mappings for that PFN.
> 

I happened to trigger it with other tweaks to the balloon driver, but never
managed to trigger it with the original balloon driver.

On second thought, I think this patch is still correct in its own
right. I will resubmit it with an improved commit message.

Wei.

> David
