* [PATCH 02/14] Nested Virtualization: localevent
@ 2010-08-05 15:00 Christoph Egger
  2010-08-05 15:46 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Egger @ 2010-08-05 15:00 UTC (permalink / raw)
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 322 bytes --]


Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>

-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

[-- Attachment #2: xen_nh02_localevent.diff --]
[-- Type: text/x-diff, Size: 3228 bytes --]

# HG changeset patch
# User cegger
# Date 1280925494 -7200
Change local_event_delivery_* to take a vcpu argument.
This is needed as 'current' is not accessible on guest shutdown/destroy.
This fixes spurious Xen crashes on shutdown/destroy with nestedhvm enabled.

diff -r b13ace9a80d8 -r e75662c48917 xen/arch/ia64/xen/hypercall.c
--- a/xen/arch/ia64/xen/hypercall.c
+++ b/xen/arch/ia64/xen/hypercall.c
@@ -309,13 +309,13 @@ ia64_hypercall(struct pt_regs *regs)
 				do_softirq();
 				stop_timer(&v->arch.hlt_timer);
 				/* do_block() calls
-				 * local_event_delivery_enable(),
+				 * local_event_delivery_enable(v),
 				 * but PAL CALL must be called with
 				 * psr.i = 0 and psr.i is unchanged.
 				 * SDM vol.2 Part I 11.10.2
 				 * PAL Calling Conventions.
 				 */
-				local_event_delivery_disable();
+				local_event_delivery_disable(v);
 			}
 			regs->r8 = 0;
 			regs->r9 = 0;
diff -r b13ace9a80d8 -r e75662c48917 xen/common/schedule.c
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -539,7 +539,7 @@ static long do_block(void)
 {
     struct vcpu *v = current;
 
-    local_event_delivery_enable();
+    local_event_delivery_enable(v);
     set_bit(_VPF_blocked, &v->pause_flags);
 
     /* Check for events /after/ blocking: avoids wakeup waiting race. */
diff -r b13ace9a80d8 -r e75662c48917 xen/include/asm-ia64/event.h
--- a/xen/include/asm-ia64/event.h
+++ b/xen/include/asm-ia64/event.h
@@ -48,19 +48,19 @@ static inline int local_events_need_deli
     return event_pending(current);
 }
 
-static inline int local_event_delivery_is_enabled(void)
+static inline int local_event_delivery_is_enabled(struct vcpu *v)
 {
-    return !current->vcpu_info->evtchn_upcall_mask;
+    return !v->vcpu_info->evtchn_upcall_mask;
 }
 
-static inline void local_event_delivery_disable(void)
+static inline void local_event_delivery_disable(struct vcpu *v)
 {
-    current->vcpu_info->evtchn_upcall_mask = 1;
+    v->vcpu_info->evtchn_upcall_mask = 1;
 }
 
-static inline void local_event_delivery_enable(void)
+static inline void local_event_delivery_enable(struct vcpu *v)
 {
-    current->vcpu_info->evtchn_upcall_mask = 0;
+    v->vcpu_info->evtchn_upcall_mask = 0;
 }
 
 static inline int arch_virq_is_global(int virq)
diff -r b13ace9a80d8 -r e75662c48917 xen/include/asm-x86/event.h
--- a/xen/include/asm-x86/event.h
+++ b/xen/include/asm-x86/event.h
@@ -23,19 +23,19 @@ static inline int local_events_need_deli
              !vcpu_info(v, evtchn_upcall_mask)));
 }
 
-static inline int local_event_delivery_is_enabled(void)
+static inline int local_event_delivery_is_enabled(struct vcpu *v)
 {
-    return !vcpu_info(current, evtchn_upcall_mask);
+    return !vcpu_info(v, evtchn_upcall_mask);
 }
 
-static inline void local_event_delivery_disable(void)
+static inline void local_event_delivery_disable(struct vcpu *v)
 {
-    vcpu_info(current, evtchn_upcall_mask) = 1;
+    vcpu_info(v, evtchn_upcall_mask) = 1;
 }
 
-static inline void local_event_delivery_enable(void)
+static inline void local_event_delivery_enable(struct vcpu *v)
 {
-    vcpu_info(current, evtchn_upcall_mask) = 0;
+    vcpu_info(v, evtchn_upcall_mask) = 0;
 }
 
 /* No arch specific virq definition now. Default to global. */

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-05 15:00 [PATCH 02/14] Nested Virtualization: localevent Christoph Egger
@ 2010-08-05 15:46 ` Keir Fraser
  2010-08-05 16:19   ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-08-05 15:46 UTC (permalink / raw)
  To: Christoph Egger, xen-devel@lists.xensource.com

The functions are called local_event_delivery* because they implicitly act
on current. They don't need to take a vcpu parameter. If you find you need a
vcpu parameter then you are using them, or one of their callers,
incorrectly.

 -- Keir

On 05/08/2010 16:00, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> 
> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>


* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-05 15:46 ` Keir Fraser
@ 2010-08-05 16:19   ` Keir Fraser
  2010-08-06  9:17     ` Christoph Egger
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-08-05 16:19 UTC (permalink / raw)
  To: Christoph Egger, xen-devel@lists.xensource.com

I seem to remember we discussed the reason for this a bit some time ago. It
looked to me like you were calling a function that makes sense only on a
running guest (and a locally currently running guest at that) after the
guest was dead, during cleanup/teardown. If I'm remembering correctly then
the fix would be to not do that then. ;-)

 -- Keir

On 05/08/2010 16:46, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> The functions are called local_event_delivery* because they implicitly act
> on current. They don't need to take a vcpu parameter. If you find you need a
> vcpu parameter then you are using them, or one of their callers,
> incorrectly.
> 
>  -- Keir
> 
> On 05/08/2010 16:00, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> 
>> 
>> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
> 


* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-05 16:19   ` Keir Fraser
@ 2010-08-06  9:17     ` Christoph Egger
  2010-08-06 14:02       ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Egger @ 2010-08-06  9:17 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On Thursday 05 August 2010 18:19:14 Keir Fraser wrote:
> I seem to remember we discussed the reason for this a bit some time ago.

Yes, I remember, too.

> It looked to me like you were calling a function that makes sense only on a
> running guest (and a locally currently running guest at that) after the
> guest was dead, during cleanup/teardown. If I'm remembering correctly then
> the fix would be to not do that then. ;-)

I am not sure I understood you correctly, so let me explain in my own words:

This localevent patch on its own is pointless; I agree with you there.
The patch is needed when the level 1 guest was running a level 2 guest
and gets destroyed. During the destroy process Xen may still want to
inject (pending) interrupts/events.
For this reason, nestedhvm_vcpu_destroy() (added in patch 5/14)
does a nestedsvm_vcpu_stgi() to prevent the interrupts/events
from being blocked by hvm_interrupt_blocked() (see patch 9/14),
which would leave the level 1 guest in a zombie state.

nestedsvm_vcpu_stgi() calls local_event_delivery_enable() (only on AMD now).
At the time when nestedsvm_vcpu_stgi() is called from 
nestedhvm_vcpu_destroy(), "current" may be an invalid pointer while
the vcpu pointer is valid.
When local_event_delivery_enable() accesses "current", Xen crashes.
That is why local_event_delivery_enable() needs the vcpu argument.

Christoph



>  -- Keir
>
> On 05/08/2010 16:46, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> > The functions are called local_event_delivery* because they implicitly
> > act on current. They don't need to take a vcpu parameter. If you find you
> > need a vcpu parameter then you are using them, or one of their callers,
> > incorrectly.
> >
> >  -- Keir
> >
> > On 05/08/2010 16:00, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> >> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>





* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-06  9:17     ` Christoph Egger
@ 2010-08-06 14:02       ` Keir Fraser
  2010-08-10  8:01         ` Christoph Egger
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-08-06 14:02 UTC (permalink / raw)
  To: Christoph Egger; +Cc: xen-devel@lists.xensource.com

On 06/08/2010 10:17, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> For this reason, nestedhvm_vcpu_destroy() (added in patch 5/14)
> does a nestedsvm_vcpu_stgi() to prevent the interrupts/events
> from being blocked by hvm_interrupt_blocked() (see patch 9/14)
> and level 1 guest remaining in a zombie state.

Ah, this is the crux of it. You shouldn't need to stgi from the vcpu
destructor. It makes no sense and doing it shouldn't leave you with a zombie
domain. Indeed, vcpu_destroy() is called from the very final domain
destructor -- vcpu_destroy's caller finishes by freeing the domain structure
itself, so not much chance of hanging around as a zombie! I'm assuming you
call nestedhvm_vcpu_destroy() on the vcpu_destroy() path here by the way...
If it's called from some other context then I think its name is misleading
and should be changed.

 -- Keir


* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-06 14:02       ` Keir Fraser
@ 2010-08-10  8:01         ` Christoph Egger
  2010-08-10  8:21           ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Egger @ 2010-08-10  8:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On Friday 06 August 2010 16:02:11 Keir Fraser wrote:
> On 06/08/2010 10:17, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> > For this reason, nestedhvm_vcpu_destroy() (added in patch 5/14)
> > does a nestedsvm_vcpu_stgi() to prevent the interrupts/events
> > from being blocked by hvm_interrupt_blocked() (see patch 9/14)
> > and level 1 guest remaining in a zombie state.
>
> Ah, this is the crux of it. You shouldn't need to stgi from the vcpu
> destructor. It makes no sense and doing it shouldn't leave you with a
> zombie domain.

I backed out the 'localevent' patch in my local tree, removed the stgi
call in the vcpu destructor and ran tests. A lot of things have changed
since the issue was found, and the real bug may already have been fixed
in the meantime.

I haven't seen any issues with those changes in my tests, so the next
patch series I send will have the localevent patch and the stgi call
dropped.

> Indeed, vcpu_destroy() is called from the very final domain 
> destructor -- vcpu_destroy's caller finishes by freeing the domain
> structure itself, so not much chance of hanging around as a zombie! I'm
> assuming you call nestedhvm_vcpu_destroy() on the vcpu_destroy() path here
> by the way...

Yes, your assumption is correct.

> If it's called from some other context then I think its name 
> is misleading and should be changed.
>
>  -- Keir

Christoph



* Re: [PATCH 02/14] Nested Virtualization: localevent
  2010-08-10  8:01         ` Christoph Egger
@ 2010-08-10  8:21           ` Keir Fraser
  0 siblings, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2010-08-10  8:21 UTC (permalink / raw)
  To: Christoph Egger; +Cc: xen-devel@lists.xensource.com

On 10/08/2010 09:01, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

>> Ah, this is the crux of it. You shouldn't need to stgi from the vcpu
>> destructor. It makes no sense and doing it shouldn't leave you with a
>> zombie domain.
> 
> I backed out the 'localevent' patch in my local tree, removed the stgi
> call in the vcpu destructor and run tests. A lot of things have been
> changed since the issue has been found and the real bug might have
> already been fixed in the meantime.
> 
> I haven't seen any issues with that changes in my tests so my next
> patch series I send will have the localevent patch and the stgi call
> dropped.

Thanks. I applied the p2m infrastructure patch that Tim acked, by the way.
Also may as well give Intel a little longer to respond on the common
infrastructure patches.

 -- keir

