Re: [rfc] [patch] more 'long' in the hypervisor interface


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:03 [rfc] [patch] more 'long' in the hypervisor interface Hollis Blanchard
@ 2006-06-28 21:02 ` Keir Fraser
  2006-06-28 21:19   ` Hollis Blanchard
  2006-06-28 21:09 ` Chris Wright
  2006-06-28 21:10 ` Hollis Blanchard
  2 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2006-06-28 21:02 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: xen-devel, xen-ppc-devel


On 28 Jun 2006, at 22:03, Hollis Blanchard wrote:

> Hi Keir, we've come across some more users of 'long' in the hypervisor
> interface: xen/include/public/memory.h. Unlike the dom0_ops, we can't
> just change these to be 64 bits because 32-bit kernels use these
> structures for the balloon driver.
>
> I would like to create a new type, say "legacy_ulong_t", to cover these
> cases and future instances we'll undoubtedly come across. What do you
> think?
>
> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>

Call them xen_ulong_t and they're fine. They're hardly legacy since 
there's no suitable non-legacy replacement right now. Don't forget the 
typedef in the other arch-*.h headers as well.

  -- Keir


* [rfc] [patch] more 'long' in the hypervisor interface
@ 2006-06-28 21:03 Hollis Blanchard
  2006-06-28 21:02 ` Keir Fraser
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-28 21:03 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

Hi Keir, we've come across some more users of 'long' in the hypervisor
interface: xen/include/public/memory.h. Unlike the dom0_ops, we can't
just change these to be 64 bits because 32-bit kernels use these
structures for the balloon driver.

I would like to create a new type, say "legacy_ulong_t", to cover these
cases and future instances we'll undoubtedly come across. What do you
think?

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>

diff -r 10db0f8c710d xen/include/public/arch-x86_64.h
--- a/xen/include/public/arch-x86_64.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/arch-x86_64.h	Wed Jun 28 16:02:21 2006 -0500
@@ -104,6 +104,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define MAX_VIRT_CPUS 32
 
 #ifndef __ASSEMBLY__
+
+typedef unsigned long legacy_ulong_t;
 
 /*
  * int HYPERVISOR_set_segment_base(unsigned int which, unsigned long base)
diff -r 10db0f8c710d xen/include/public/memory.h
--- a/xen/include/public/memory.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/memory.h	Wed Jun 28 16:02:21 2006 -0500
@@ -32,7 +32,7 @@ struct xen_memory_reservation {
     XEN_GUEST_HANDLE(xen_pfn_t) extent_start;
 
     /* Number of extents, and size/alignment of each (2^extent_order pages). */
-    unsigned long  nr_extents;
+    legacy_ulong_t  nr_extents;
     unsigned int   extent_order;
 
     /*
@@ -90,7 +90,7 @@ struct xen_memory_exchange {
      *     command will be non-zero.
      *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
      */
-    unsigned long nr_exchanged;
+    legacy_ulong_t nr_exchanged;
 };
 typedef struct xen_memory_exchange xen_memory_exchange_t;
 DEFINE_XEN_GUEST_HANDLE(xen_memory_exchange_t);
@@ -148,8 +148,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mfn
  */
 #define XENMEM_machphys_mapping     12
 struct xen_machphys_mapping {
-    unsigned long v_start, v_end; /* Start and end virtual addresses.   */
-    unsigned long max_mfn;        /* Maximum MFN that can be looked up. */
+    legacy_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
+    legacy_ulong_t max_mfn;        /* Maximum MFN that can be looked up. */
 };
 typedef struct xen_machphys_mapping xen_machphys_mapping_t;
 DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
@@ -170,7 +170,7 @@ struct xen_add_to_physmap {
     unsigned int space;
 
     /* Index into source mapping space. */
-    unsigned long idx;
+    legacy_ulong_t idx;
 
     /* GPFN where the source mapping page should appear. */
     xen_pfn_t     gpfn;
@@ -188,7 +188,7 @@ struct xen_translate_gpfn_list {
     domid_t domid;
 
     /* Length of list. */
-    unsigned long nr_gpfns;
+    legacy_ulong_t nr_gpfns;
 
     /* List of GPFNs to translate. */
     XEN_GUEST_HANDLE(xen_pfn_t) gpfn_list;


-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:03 [rfc] [patch] more 'long' in the hypervisor interface Hollis Blanchard
  2006-06-28 21:02 ` Keir Fraser
@ 2006-06-28 21:09 ` Chris Wright
  2006-06-28 21:21   ` Hollis Blanchard
  2006-06-28 21:10 ` Hollis Blanchard
  2 siblings, 1 reply; 15+ messages in thread
From: Chris Wright @ 2006-06-28 21:09 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: xen-devel, xen-ppc-devel

* Hollis Blanchard (hollisb@us.ibm.com) wrote:
> Hi Keir, we've come across some more users of 'long' in the hypervisor
> interface: xen/include/public/memory.h. Unlike the dom0_ops, we can't
> just change these to be 64 bits because 32-bit kernels use these
> structures for the balloon driver.
> 
> I would like to create a new type, say "legacy_ulong_t", to cover these
> cases and future instances we'll undoubtedly come across. What do you
> think?
> 
> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
> 
> diff -r 10db0f8c710d xen/include/public/arch-x86_64.h
> --- a/xen/include/public/arch-x86_64.h	Wed Jun 28 15:37:45 2006 -0400
> +++ b/xen/include/public/arch-x86_64.h	Wed Jun 28 16:02:21 2006 -0500
> @@ -104,6 +104,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
>  #define MAX_VIRT_CPUS 32
>  
>  #ifndef __ASSEMBLY__
> +
> +typedef unsigned long legacy_ulong_t;

What is legacy about it?  This looks quite odd, and I don't think it will
build on i386.

thanks,
-chris


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:03 [rfc] [patch] more 'long' in the hypervisor interface Hollis Blanchard
  2006-06-28 21:02 ` Keir Fraser
  2006-06-28 21:09 ` Chris Wright
@ 2006-06-28 21:10 ` Hollis Blanchard
  2 siblings, 0 replies; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-28 21:10 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On Wed, 2006-06-28 at 16:03 -0500, Hollis Blanchard wrote:
> diff -r 10db0f8c710d xen/include/public/arch-x86_64.h
> --- a/xen/include/public/arch-x86_64.h  Wed Jun 28 15:37:45 2006 -0400
> +++ b/xen/include/public/arch-x86_64.h  Wed Jun 28 16:02:21 2006 -0500
> @@ -104,6 +104,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
>  #define MAX_VIRT_CPUS 32
>  
>  #ifndef __ASSEMBLY__
> +
> +typedef unsigned long legacy_ulong_t;
>  
>  /*
>   * int HYPERVISOR_set_segment_base(unsigned int which, unsigned long base)

(Obviously that patch should have included the same for x86_32 and ia64;
I had a typo in the diff command.)

-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:02 ` Keir Fraser
@ 2006-06-28 21:19   ` Hollis Blanchard
  0 siblings, 0 replies; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-28 21:19 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On Wed, 2006-06-28 at 22:02 +0100, Keir Fraser wrote:
> On 28 Jun 2006, at 22:03, Hollis Blanchard wrote:
> 
> > Hi Keir, we've come across some more users of 'long' in the hypervisor
> > interface: xen/include/public/memory.h. Unlike the dom0_ops, we can't
> > just change these to be 64 bits because 32-bit kernels use these
> > structures for the balloon driver.
> >
> > I would like to create a new type, say "legacy_ulong_t", to cover these
> > cases and future instances we'll undoubtedly come across. What do you
> > think?
> >
> > Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
> 
> Call them xen_ulong_t and they're fine. They're hardly legacy since 
> there's no suitable non-legacy replacement right now. Don't forget the 
> typedef in the other arch-*.h headers as well.

Excellent!

The reason I worry about calling it "xen_ulong_t" is that name doesn't
discourage people from using it. In fact, it only makes sense to use
this type in the interface, and even then only in places in the
interface that were 'long' already. In other words, nobody should ever
use this type any more, which is why I called it "legacy" originally. I
could imagine people trying to use it in the future because it seems
like the right thing to do. I'll leave the decision to you.



Define an architecture-specific 'long' type for ABI compatibility.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>

diff -r 10db0f8c710d xen/include/public/arch-ia64.h
--- a/xen/include/public/arch-ia64.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/arch-ia64.h	Wed Jun 28 16:12:48 2006 -0500
@@ -39,6 +39,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define MAX_VIRT_CPUS 64
 
 #ifndef __ASSEMBLY__
+
+typedef unsigned long xen_ulong_t;
 
 #define MAX_NR_SECTION  32  /* at most 32 memory holes */
 struct mm_section {
diff -r 10db0f8c710d xen/include/public/arch-x86_32.h
--- a/xen/include/public/arch-x86_32.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/arch-x86_32.h	Wed Jun 28 16:12:48 2006 -0500
@@ -97,6 +97,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define MAX_VIRT_CPUS 32
 
 #ifndef __ASSEMBLY__
+
+typedef unsigned long xen_ulong_t;
 
 /*
  * Send an array of these to HYPERVISOR_set_trap_table()
diff -r 10db0f8c710d xen/include/public/arch-x86_64.h
--- a/xen/include/public/arch-x86_64.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/arch-x86_64.h	Wed Jun 28 16:12:48 2006 -0500
@@ -104,6 +104,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define MAX_VIRT_CPUS 32
 
 #ifndef __ASSEMBLY__
+
+typedef unsigned long xen_ulong_t;
 
 /*
  * int HYPERVISOR_set_segment_base(unsigned int which, unsigned long base)
diff -r 10db0f8c710d xen/include/public/memory.h
--- a/xen/include/public/memory.h	Wed Jun 28 15:37:45 2006 -0400
+++ b/xen/include/public/memory.h	Wed Jun 28 16:12:48 2006 -0500
@@ -32,7 +32,7 @@ struct xen_memory_reservation {
     XEN_GUEST_HANDLE(xen_pfn_t) extent_start;
 
     /* Number of extents, and size/alignment of each (2^extent_order pages). */
-    unsigned long  nr_extents;
+    xen_ulong_t  nr_extents;
     unsigned int   extent_order;
 
     /*
@@ -90,7 +90,7 @@ struct xen_memory_exchange {
      *     command will be non-zero.
      *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
      */
-    unsigned long nr_exchanged;
+    xen_ulong_t nr_exchanged;
 };
 typedef struct xen_memory_exchange xen_memory_exchange_t;
 DEFINE_XEN_GUEST_HANDLE(xen_memory_exchange_t);
@@ -148,8 +148,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mfn
  */
 #define XENMEM_machphys_mapping     12
 struct xen_machphys_mapping {
-    unsigned long v_start, v_end; /* Start and end virtual addresses.   */
-    unsigned long max_mfn;        /* Maximum MFN that can be looked up. */
+    xen_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
+    xen_ulong_t max_mfn;        /* Maximum MFN that can be looked up. */
 };
 typedef struct xen_machphys_mapping xen_machphys_mapping_t;
 DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
@@ -170,7 +170,7 @@ struct xen_add_to_physmap {
     unsigned int space;
 
     /* Index into source mapping space. */
-    unsigned long idx;
+    xen_ulong_t idx;
 
     /* GPFN where the source mapping page should appear. */
     xen_pfn_t     gpfn;
@@ -188,7 +188,7 @@ struct xen_translate_gpfn_list {
     domid_t domid;
 
     /* Length of list. */
-    unsigned long nr_gpfns;
+    xen_ulong_t nr_gpfns;
 
     /* List of GPFNs to translate. */
     XEN_GUEST_HANDLE(xen_pfn_t) gpfn_list;


-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:09 ` Chris Wright
@ 2006-06-28 21:21   ` Hollis Blanchard
  2006-06-28 21:36     ` Chris Wright
  0 siblings, 1 reply; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-28 21:21 UTC (permalink / raw)
  To: Chris Wright; +Cc: xen-devel, xen-ppc-devel

On Wed, 2006-06-28 at 14:09 -0700, Chris Wright wrote:
> * Hollis Blanchard (hollisb@us.ibm.com) wrote:
> > Hi Keir, we've come across some more users of 'long' in the hypervisor
> > interface: xen/include/public/memory.h. Unlike the dom0_ops, we can't
> > just change these to be 64 bits because 32-bit kernels use these
> > structures for the balloon driver.
> > 
> > I would like to create a new type, say "legacy_ulong_t", to cover these
> > cases and future instances we'll undoubtedly come across. What do you
> > think?
> > 
> > Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
> > 
> > diff -r 10db0f8c710d xen/include/public/arch-x86_64.h
> > --- a/xen/include/public/arch-x86_64.h	Wed Jun 28 15:37:45 2006 -0400
> > +++ b/xen/include/public/arch-x86_64.h	Wed Jun 28 16:02:21 2006 -0500
> > @@ -104,6 +104,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
> >  #define MAX_VIRT_CPUS 32
> >  
> >  #ifndef __ASSEMBLY__
> > +
> > +typedef unsigned long legacy_ulong_t;
> 
> What is legacy about it?  This looks quite odd, and I don't think it will
> build on i386.

Hopefully my other reply cleared things up for you...

It builds fine for me; what problems do you envision?

-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:21   ` Hollis Blanchard
@ 2006-06-28 21:36     ` Chris Wright
  2006-06-28 21:58       ` Hollis Blanchard
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wright @ 2006-06-28 21:36 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Chris Wright, xen-devel, xen-ppc-devel

* Hollis Blanchard (hollisb@us.ibm.com) wrote:
> On Wed, 2006-06-28 at 14:09 -0700, Chris Wright wrote:
> > What is legacy about it?  This looks quite odd, and I don't think it will
> > build on i386.
> 
> Hopefully my other reply cleared things up for you...

Well, the patch by itself doesn't do anything, so are you doing smth
special on ppc?

> It builds fine for me; what problems do you envision?

Just, for example, the missing x86_32 update, which you cared for.


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:36     ` Chris Wright
@ 2006-06-28 21:58       ` Hollis Blanchard
  2006-06-28 22:42         ` [RFC] Erratic mouse in HVM guest Ross Maxfield
  2006-06-28 23:05         ` [rfc] [patch] more 'long' in the hypervisor interface Chris Wright
  0 siblings, 2 replies; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-28 21:58 UTC (permalink / raw)
  To: Chris Wright; +Cc: xen-devel, xen-ppc-devel

On Wed, 2006-06-28 at 14:36 -0700, Chris Wright wrote:
> * Hollis Blanchard (hollisb@us.ibm.com) wrote:
> > On Wed, 2006-06-28 at 14:09 -0700, Chris Wright wrote:
> > > What is legacy about it?  This looks quite odd, and I don't think it will
> > > build on i386.
> > 
> > Hopefully my other reply cleared things up for you...
> 
> Well, the patch by itself doesn't do anything, so are you doing smth
> special on ppc?

We discussed a bit on IRC (developers are welcome to join OFTC #xen),
but to recap for the list...

PPC will have
	typedef uint64_t xen_ulong_t;
That means that the fields in memory.h will keep the same
size/alignment, whether compiled 32- or 64-bit. This is the way the
interface should have been designed in the first place, but we're locked
into the current ABI on x86. However, since PPC has no current users, we
can define the ABI correctly from the start.
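
To make the size point concrete, here is a minimal C sketch. The struct
names and fixed_ulong_t are hypothetical stand-ins, not the real memory.h
definitions. It compares a field declared 'unsigned long' with one declared
uint64_t; only the latter has the same size on 32- and 64-bit builds.
Alignment of a 64-bit integer is itself ABI-dependent, so the sketch checks
size only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fixed-width xen_ulong_t (the PPC choice). */
typedef uint64_t fixed_ulong_t;

/* Hypothetical cut-down struct using the compiler's native 'long'. */
struct demo_native {
    uint16_t      domid;
    unsigned long nr_gpfns;   /* 4 bytes on ILP32, 8 bytes on LP64 */
};

/* Same struct with the fixed-width type. */
struct demo_fixed {
    uint16_t      domid;
    fixed_ulong_t nr_gpfns;   /* 8 bytes on every build */
};

/* Compile-time check: the fixed-width type never changes size. */
static_assert(sizeof(fixed_ulong_t) == 8, "fixed_ulong_t must be 64 bits");

/* Size of the length field when declared as 'unsigned long'. */
size_t native_field_size(void)
{
    struct demo_native s;
    return sizeof s.nr_gpfns;
}

/* Size of the length field when declared as fixed_ulong_t. */
size_t fixed_field_size(void)
{
    struct demo_fixed s;
    return sizeof s.nr_gpfns;
}
```

Building this with -m32 and -m64 should give different results from
native_field_size() but the same result from fixed_field_size().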

-- 
Hollis Blanchard
IBM Linux Technology Center


* [RFC] Erratic mouse in HVM guest
  2006-06-28 21:58       ` Hollis Blanchard
@ 2006-06-28 22:42         ` Ross Maxfield
  2006-06-28 23:05         ` [rfc] [patch] more 'long' in the hypervisor interface Chris Wright
  1 sibling, 0 replies; 15+ messages in thread
From: Ross Maxfield @ 2006-06-28 22:42 UTC (permalink / raw)
  To: xen-devel, xen-devel

To whom it may concern,

For many months, some of us at Novell working on and testing Xen have contended with chaotic mouse behavior in HVM Linux guests.  This ill-mannered mouse, however, appears to be sensitive to certain hardware.  Although I have seen the mouse jump around the screen occasionally on diverse machines, I see it continuously on the Harwich Twin Castle Paxville (3GHz, 8GB, x86_64, 8-way dual-core).  The mouse is completely unusable in the guest, as the slightest mouse event produces wild results: either erratic mouse movement or button presses.

Bug 167187, “Erratic mouse behavior with HVM Linux guest and SDL”, was entered into Novell's Bugzilla on April 17th, 2006, and Intel was informed of the issue.  Because Novell's first release of Xen with SLES offers full support for para-virtualized guests, this HVM-guest issue was put aside until recently, when I began to explore the cause of the mouse problem.  Here's what I've learned.

First, the mouse behaves erratically because the data coming out of /dev/input/mice is jumbled up, out of order actually.  This was rather perplexing because I had been able to determine that qemu was delivering the data in the proper order and, in fact, i8042_interrupt() of linux-2.6.16/drivers/input/serio/i8042.c executing in the HVM guest was also reporting that the data had been read in proper order, yet the processing of the data occurred out of order.

After exploring a number of possible causes for this behavior I discovered an assumption in the kernel code that is true when the kernel is running natively but not necessarily true when hosted by the hypervisor.

I learned that i8042_interrupt() will be polled by the timer interrupt if HZ/20 jiffies have elapsed since the last 8042 interrupt.  So here's what I believe is happening.  Each mouse event generates at least three bytes of data, and each byte generates an interrupt.  When the first interrupt is injected into the guest, as with every interrupt, the kernel masks the interrupt vector in the PIC and then EOIs the PIC before actually handling the interrupt.  This, of course, allows ANY other interrupt to occur save the one currently being serviced.  When i8042_interrupt() is called, it first calls mod_timer() to delay the timer callback another HZ/20, takes a spin_lock_irqsave(), disabling interrupts (interrupts are enabled prior to i8042_interrupt() being called), reads the 8042, obtaining the first byte of data from qemu, and then releases the spinlock.  Immediately after releasing the spinlock, this isr is interrupted by a timer interrupt, which discovers that the 8042's HZ/20 timer has expired; i8042_interrupt() is reentered and runs to completion, as there is no pending timer interrupt.  When the timer interrupt completes, the previously interrupted isr resumes and continues to process what was to be the first byte but now is not.  I have been able to determine that the timer is indeed calling i8042_interrupt() and causing the mis-ordered data.

For the timer interrupt handler to believe that HZ/20 jiffies had expired, at least that much time must have elapsed between i8042_interrupt() releasing the spinlock and calling serio_interrupt() a dozen lines later, suggesting a lengthy hypervisor preemption followed by a timer isr before resuming from the point of preemption.  Or, a considerable amount of time, > HZ/20, expired reading the data from qemu's emulation of port 0x60, followed by a timer isr after the spin_unlock_irqrestore() in i8042_interrupt().  Whichever the case may be, i8042_interrupt() is _assuming_ that HZ/20 jiffies are not going to elapse before its isr completes.  This assumption is probably fair enough for running natively, but not a good assumption when hosted by the current implementation of the hypervisor.
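
The reordering described above can be reproduced with a toy model. This is only a sketch under my own assumptions, not the real driver code: model_interrupt(), the port[] array and the poll_fires flag are all hypothetical. The handler reads a byte, then (optionally) gets re-entered by the poll timer in the window before it hands that byte on, so the nested call delivers a later byte first.

```c
#include <assert.h>
#include <stddef.h>

#define NBYTES 3

/* Bytes in the order the emulated controller presents them. */
static const unsigned char port[NBYTES] = { 0xA1, 0xB2, 0xC3 };
static size_t next_in;                    /* next byte to read from the port  */
static unsigned char delivered[NBYTES];   /* order seen by the input layer    */
static size_t next_out;                   /* next free slot in delivered[]    */

/*
 * Toy model of the handler: read one byte, then deliver it.
 * If poll_fires is non-zero, the poll timer re-enters the handler in the
 * window between the read (lock released) and the delivery.
 */
void model_interrupt(int poll_fires)
{
    unsigned char data;

    if (next_in == NBYTES)
        return;                       /* nothing left to read */

    data = port[next_in++];           /* read done "under the lock" */

    /* Lock released here: this is the window the timer can hit. */
    if (poll_fires)
        model_interrupt(0);           /* nested call runs to completion */

    delivered[next_out++] = data;     /* delivery of the earlier byte is late */
}
```

Calling model_interrupt(1) and then model_interrupt(0) leaves delivered[] holding the second byte ahead of the first, which matches the jumbled stream coming out of /dev/input/mice.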

The question now is: does the hypervisor change to accommodate the assumption, is the assumption removed from the kernel, or is there some other fiendish time-consuming bug yet to be discovered?


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 21:58       ` Hollis Blanchard
  2006-06-28 22:42         ` [RFC] Erratic mouse in HVM guest Ross Maxfield
@ 2006-06-28 23:05         ` Chris Wright
  2006-06-29 14:37           ` Steve Ofsthun
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wright @ 2006-06-28 23:05 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Chris Wright, xen-devel, xen-ppc-devel

* Hollis Blanchard (hollisb@us.ibm.com) wrote:
> We discussed a bit on IRC (developers are welcome to join OFTC #xen),
> but to recap for the list...
> 
> PPC will have
> 	typedef uint64_t xen_ulong_t;
> That means that the fields in memory.h will keep the same
> size/alignment, whether compiled 32- or 64-bit. This is the way the
> interface should have been designed in the first place, but we're locked
> into the current ABI on x86. However, since PPC has no current users, we
> can define the ABI correctly from the start.

I see.  I think it would be nice to work on the ABI such that it makes
sense for the future 32/64 mixed modes.  So I guess I actually agree
with your legacy typedef name ;-)

One issue is that 32-bit userspace effectively has direct access to
64-bit hypercall interface.  This can be handled in the 64-bit kernel by
doing compat translation, by having 32-bit compat hypercall interface
and jumping to right spot on hypercall page, or by having fixed size
structure.  It's not clear to me the value of effectively exposing the
ABI all the way to userspace.
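
As a rough illustration of the first option (compat translation in the
64-bit kernel), here is a minimal C sketch. Both structs are hypothetical,
cut-down layouts loosely modelled on xen_memory_reservation, and
compat_to_native() is an invented name; none of this is existing Xen or
Linux code.

```c
#include <assert.h>
#include <stdint.h>

/* Layout a 32-bit caller would produce (hypothetical, cut-down). */
struct compat_reservation {
    uint32_t extent_start;   /* 32-bit guest pointer                  */
    uint32_t nr_extents;     /* 'unsigned long' on a 32-bit build     */
    uint32_t extent_order;
};

/* Layout the 64-bit side expects (hypothetical, cut-down). */
struct native_reservation {
    uint64_t extent_start;   /* 64-bit pointer                        */
    uint64_t nr_extents;     /* 'unsigned long' on a 64-bit build     */
    uint32_t extent_order;
};

/* Widen each field; unsigned 32-bit values zero-extend to 64 bits. */
struct native_reservation
compat_to_native(const struct compat_reservation *c)
{
    struct native_reservation n;

    n.extent_start = c->extent_start;
    n.nr_extents   = c->nr_extents;
    n.extent_order = c->extent_order;
    return n;
}
```

Every structure that contains a 'long' or a pointer needs a shim like this,
which is the cost a fixed size structure would avoid.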

What is the current plan for 32-bit kernel on 64-bit hv?  In this case
a 32-bit compat hypercall page might be useful, or having fixed size
structure.

My concern is that we'll never make a clean break if we slowly cobble up
the interface with more hacks.  Maybe a forward looking compat interface
would be a good breaking point.

thanks,
-chris


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-28 23:05         ` [rfc] [patch] more 'long' in the hypervisor interface Chris Wright
@ 2006-06-29 14:37           ` Steve Ofsthun
  2006-06-29 17:02             ` Chris Wright
  2006-06-29 18:14             ` Hollis Blanchard
  0 siblings, 2 replies; 15+ messages in thread
From: Steve Ofsthun @ 2006-06-29 14:37 UTC (permalink / raw)
  To: Chris Wright; +Cc: xen-devel, Hollis Blanchard, xen-ppc-devel

Chris Wright wrote:
> * Hollis Blanchard (hollisb@us.ibm.com) wrote:
> 
>>We discussed a bit on IRC (developers are welcome to join OFTC #xen),
>>but to recap for the list...
>>
>>PPC will have
>>	typedef uint64_t xen_ulong_t;
>>That means that the fields in memory.h will keep the same
>>size/alignment, whether compiled 32- or 64-bit. This is the way the
>>interface should have been designed in the first place, but we're locked
>>into the current ABI on x86. However, since PPC has no current users, we
>>can define the ABI correctly from the start.
> 
> 
> I see.  I think it would be nice to work on the ABI such that it makes
> sense for the future 32/64 mixed modes.  So I guess I actually agree
> with your legacy typedef name ;-)

X86 32/64 mixed modes really have 2 independent compatibility issues.  One
is the calling conventions used to pass parameters through the hypercall
interface.  The second is the format of the data structures passed through
the calling conventions to the underlying hypervisor.

Today, we run 32/64 mixed mode HVM guests on a 64 bit hypervisor.  The
hypercall interface was modified to handle both 32-bit and 64-bit calling
conventions.  The underlying hypervisor however only supports 64-bit
structure formats.  A 64-bit guest can continue to use the standard headers
for passing data to hypercalls.  A 32-bit guest must redefine every structure
in the public interfaces to properly pass data to the hypervisor.

We would like to see the 32-bit and 64-bit structure definitions evolve
to a single size invariant version of the interface structures for both
32-bit and 64-bit guests.

> One issue is that 32-bit userspace effectively has direct access to
> 64-bit hypercall interface.  This can be handled in the 64-bit kernel by
> doing compat translation, by having 32-bit compat hypercall interface
> and jumping to right spot on hypercall page, or by having fixed size
> structure.  It's not clear to me the value of effectively exposing the
> ABI all the way to userspace.

I'm not sure I understand your use of the term 'userspace' here.  Do you
mean guest kernel mode, or actual unprivileged user code?

> What is the current plan for 32-bit kernel on 64-bit hv?  In this case
> a 32-bit compat hypercall page might be useful, or having fixed size
> structure.

For X86 there are probably two plans.  For paravirtual guests, there is a
strong desire to formalize the existing ABI.  This will force the 32-bit
and 64-bit ABIs to remain significantly different.  Since the underlying
hypervisors don't allow 32/64 mixed mode guests, there is little reason
to reconcile the two ABIs.  If the ABIs were identical today, you still
couldn't run mixed mode guests.

For HVM guests, the ABI is less established.  I'm not sure anyone but us
(Virtual Iron), is doing much with hypercalls from HVM guests.  We are
currently running paravirtualized drivers in HVM guests.  As the code
matures, we will be posting these patches.

We have had to deal with issues separate from the mechanical ABI issues.
For example, grant table transfers (used by the standard netfront/netback)
don't play well with QEMU's one time direct map of the entire HVM guest
address space.  In addition, the xen support needed by PV drivers is
specific to later 2.6 kernels.  Getting this code to work on older linux
kernels requires some additional work.

> My concern is that we'll never make a clean break if we slowly cobble up
> the interface with more hacks.  Maybe a forward looking compat interface
> would be a good breaking point.

I agree with you on this.  The longer this goes unaddressed, the more work it
will be to fix.

Steve
-- 
Steve Ofsthun - Virtual Iron Software, Inc.


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-29 14:37           ` Steve Ofsthun
@ 2006-06-29 17:02             ` Chris Wright
  2006-06-29 17:55               ` Steve Ofsthun
  2006-06-29 18:14             ` Hollis Blanchard
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wright @ 2006-06-29 17:02 UTC (permalink / raw)
  To: Steve Ofsthun; +Cc: Chris Wright, xen-devel, Hollis Blanchard, xen-ppc-devel

* Steve Ofsthun (sofsthun@virtualiron.com) wrote:
> for passing data to hypercalls.  A 32-bit guest must redefine every 
> structure
> in the public interfaces to properly pass data to the hypervisor.

Exactly, and that breaks the ABI for normal 32-bit guest on 32-bit
hypervisor.  Hence the pain that Hollis is dealing with.

> We would like to see the 32-bit and 64-bit structure definitions evolve
> to a single size invariant version of the interface structures for both
> 32-bit and 64-bit guests.

Yup, that's same interest that Hollis has.  I'm simply suggesting that
mixed mode becomes the vehicle for evolving that interface.

> >One issue is that 32-bit userspace effectively has direct access to
> >64-bit hypercall interface.  This can be handled in the 64-bit kernel by
> >doing compat translation, by having 32-bit compat hypercall interface
> >and jumping to right spot on hypercall page, or by having fixed size
> >structure.  It's not clear to me the value of effectively exposing the
> >ABI all the way to userspace.
> 
> I'm not sure I understand your use of the term 'userspace' here.  Do you
> mean guest kernel mode, or actual unprivileged user code?

Userspace effectively makes hypercalls directly (see privcmd).  The
wider the ABI is exposed, the harder it is to make changes.

> >What is the current plan for 32-bit kernel on 64-bit hv?  In this case
> >a 32-bit compat hypercall page might be useful, or having fixed size
> >structure.
> 
> For X86 there are probably two plans.  For paravirtual guests, there is a
> strong desire to formalize the existing ABI.  This will force the 32-bit
> and 64-bit ABIs to remain significantly different.  Since the underlying
> hypervisors don't allow 32/64 mixed mode guests, there is little reason
> to reconcile the two ABIs.  If the ABIs were identical today, you still
> couldn't run mixed mode guests.

Yes, that's why I suggested compat.

> For HVM guests, the ABI is less established.  I'm not sure anyone but us
> (Virtual Iron), is doing much with hypercalls from HVM guests.  We are
> currently running paravirtualized drivers in HVM guests.  As the code
> matures, we will be posting these patches.

Look forward to seeing that.

> We have had to deal with issues separate from the mechanical ABI issues.
> For example, grant table transfers (used by the standard netfront/netback)
> don't play well with QEMU's one time direct map of the entire HVM guest
> address space.

What did you do here?  Specifically, how much does the PV driver in HVM
guest diverge from normal PV guest?  This will affect merging this code
into Linux which is why I care.

thanks,
-chris


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-29 17:02             ` Chris Wright
@ 2006-06-29 17:55               ` Steve Ofsthun
  0 siblings, 0 replies; 15+ messages in thread
From: Steve Ofsthun @ 2006-06-29 17:55 UTC (permalink / raw)
  To: Chris Wright; +Cc: xen-devel, Hollis Blanchard, xen-ppc-devel

Chris Wright wrote:
> * Steve Ofsthun (sofsthun@virtualiron.com) wrote:
> 
>>For HVM guests, the ABI is less established.  I'm not sure anyone but us
>>(Virtual Iron), is doing much with hypercalls from HVM guests.  We are
>>currently running paravirtualized drivers in HVM guests.  As the code
>>matures, we will be posting these patches.
> 
> 
> Look forward to seeing that.

So am I! ;)

>>We have had to deal with issues separate from the mechanical ABI issues.
>>For example, grant table transfers (used by the standard netfront/netback)
>>don't play well with QEMU's one time direct map of the entire HVM guest
>>address space.
> 
> 
> What did you do here?  Specifically, how much does the PV driver in HVM
> guest diverge from normal PV guest?  This will affect merging this code
> into Linux which is why I care.

We changed the netfront/netback to stop using grant table transfers.  As
an alternative, we preallocate receive buffers in netfront and pass them
as grant references to netback.  As netback receives new packets, the
preallocated buffers are dynamically mapped and the data is copied in.
This additional data copy is probably unacceptable for PV guests.  We
are just starting to analyze the performance impact.
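
The shape of that scheme can be sketched as a toy model in C. All names
here (demo_gref_t, rx_pool, frontend_post_all, backend_deliver) are
hypothetical, and the grant mapping itself is not modelled; a pool index
stands in for the grant reference. The point is only to show where the
extra copy happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_SIZE  64

typedef unsigned int demo_gref_t;   /* hypothetical stand-in for a grant reference */

/* Receive buffers preallocated by the frontend. */
struct rx_pool {
    unsigned char buf[POOL_SIZE][BUF_SIZE];
    size_t        len[POOL_SIZE];      /* bytes of packet data in each buffer */
    int           posted[POOL_SIZE];   /* 1 = offered to backend, still empty */
};

/* Frontend: offer every buffer in the pool to the backend. */
void frontend_post_all(struct rx_pool *p)
{
    for (demo_gref_t g = 0; g < POOL_SIZE; g++) {
        p->posted[g] = 1;
        p->len[g] = 0;
    }
}

/*
 * Backend: copy an incoming packet into the first posted buffer.
 * Returns the reference that was filled, or -1 if the packet is too
 * large or no buffer is available.
 */
int backend_deliver(struct rx_pool *p, const void *pkt, size_t n)
{
    if (n > BUF_SIZE)
        return -1;

    for (demo_gref_t g = 0; g < POOL_SIZE; g++) {
        if (p->posted[g]) {
            memcpy(p->buf[g], pkt, n);   /* the additional data copy */
            p->len[g] = n;
            p->posted[g] = 0;            /* buffer now owned by frontend again */
            return (int)g;
        }
    }
    return -1;
}
```

The memcpy() per packet is the cost in question; a page transfer would
avoid it.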

Two alternatives to this approach are available in the longer term.  One
is to create a new high performance device model in QEMU that reduces the
per packet cost of the emulated NIC.  There has been discussion about this,
but I am not aware of any actual work in progress.   Another approach is
to retain the existing PV network driver and fix the inconsistencies between
QEMU guest mappings and grant table transfer operations.

Steve
-- 
Steve Ofsthun - Virtual Iron Software, Inc.


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-29 14:37           ` Steve Ofsthun
  2006-06-29 17:02             ` Chris Wright
@ 2006-06-29 18:14             ` Hollis Blanchard
  2006-06-29 21:04               ` Steve Ofsthun
  1 sibling, 1 reply; 15+ messages in thread
From: Hollis Blanchard @ 2006-06-29 18:14 UTC (permalink / raw)
  To: Steve Ofsthun; +Cc: Chris Wright, xen-devel, xen-ppc-devel

On Thu, 2006-06-29 at 10:37 -0400, Steve Ofsthun wrote:
> Chris Wright wrote:
> > * Hollis Blanchard (hollisb@us.ibm.com) wrote:
> > 
> >>We discussed a bit on IRC (developers are welcome to join OFTC #xen),
> >>but to recap for the list...
> >>
> >>PPC will have
> >>	typedef uint64_t xen_ulong_t;
> >>That means that the fields in memory.h will keep the same
> >>size/alignment, whether compiled 32- or 64-bit. This is the way the
> >>interface should have been designed in the first place, but we're locked
> >>into the current ABI on x86. However, since PPC has no current users, we
> >>can define the ABI correctly from the start.
> > 
> > 
> > I see.  I think it would be nice to work on the ABI such that it makes
> > sense for the future 32/64 mixed modes.  So I guess I actually agree
> > with your legacy typedef name ;-)
> 
> X86 32/64 mixed modes really have 2 independent compatibility issues.  One
> is the calling conventions used to pass parameters through the hypercall
> interface.  The second is the format of the data structures passed through
> the calling conventions to the underlying hypervisor.
> 
> Today, we run 32/64 mixed mode HVM guests on a 64 bit hypervisor.  The
> hypercall interface was modified to handle both 32-bit and 64-bit calling
> conventions.  The underlying hypervisor however only supports 64-bit
> structure formats.  A 64-bit guest can continue to use the standard headers
> for passing data to hypercalls.  A 32-bit guest must redefine every structure
> in the public interfaces to properly pass data to the hypervisor.

The work I've been doing should cover most of the userland/hypervisor
interface, i.e. everything in libxc. Since it doesn't affect me
personally right now, I haven't been looking at the kernel/hypervisor
interface, though I certainly support similar changes there.

> We would like to see the 32-bit and 64-bit structure definitions evolve
> to a single size invariant version of the interface structures for both
> 32-bit and 64-bit guests.

Definitely.

> > One issue is that 32-bit userspace effectively has direct access to
> > 64-bit hypercall interface.  This can be handled in the 64-bit kernel by
> > doing compat translation, by having 32-bit compat hypercall interface
> > and jumping to right spot on hypercall page, or by having fixed size
> > structure.  It's not clear to me the value of effectively exposing the
> > ABI all the way to userspace.
> 
> I'm not sure I understand your use of the term 'userspace' here.  Do you
> mean guest kernel mode, or actual unprivileged user code?

Unprivileged user code, specifically applications using libxc.

> > What is the current plan for 32-bit kernel on 64-bit hv?  In this case
> > a 32-bit compat hypercall page might be useful, or having fixed size
> > structure.
> 
> For X86 there are probably two plans.  For paravirtual guests, there is a
> strong desire to formalize the existing ABI.  This will force the 32-bit
> and 64-bit ABIs to remain significantly different.  Since the underlying
> hypervisors don't allow 32/64 mixed mode guests, there is little reason
> to reconcile the two ABIs.  If the ABIs were identical today, you still
> couldn't run mixed mode guests.

Not sure I follow here. Identical ABIs would enable mixed mode guests,
even if the current implementation doesn't support that, right? So that
sounds like a good goal.

> For HVM guests, the ABI is less established.  I'm not sure anyone but us
> (Virtual Iron), is doing much with hypercalls from HVM guests.  We are
> currently running paravirtualized drivers in HVM guests.  As the code
> matures, we will be posting these patches.
> 
> We have had to deal with issues separate from the mechanical ABI issues.
> For example, grant table transfers (used by the standard netfront/netback)
> don't play well with QEMU's one time direct map of the entire HVM guest
> address space.  In addition, the xen support needed by PV drivers is
> specific to later 2.6 kernels.  Getting this code to work on older linux
> kernels requires some additional work.
> 
> > My concern is that we'll never make a clean break if we slowly cobble up
> > the interface with more hacks.  Maybe a forward looking compat interface
> > would be a good breaking point.
> 
> I agree with you on this.  The longer this goes unaddressed, the more work it
> will be to fix.

I think we all agree we should make sure all future interfaces are
correct.

By "forward-looking compat interface", I think Chris means a set of new
hypercall numbers that are written for newly designed fixed-layout data
structures. I'm fine with that.

In this case, it looks like all the do_memory_op() functions (e.g.
increase_reservation()) could be directly called by a new
do_memory_op_compat() function. In other cases, some code reorganization
may be necessary. For example, duplicating do_dom0_op() doesn't look
like fun.
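
To make the idea concrete, here is a minimal sketch in C of a fixed-layout
argument structure and a compat entry point that forwards to a shared
worker.  Everything in it is an assumption for illustration: the struct
fields, the SKETCH_OP_INCREASE command number, and the worker's signature
are hypothetical and do not match the real memory.h definitions or the
real increase_reservation().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width type, following the PPC typedef quoted earlier. */
typedef uint64_t xen_ulong_t;

/*
 * Hypothetical fixed-layout argument block.  Only explicitly sized
 * fields are used and the tail is padded by hand, so the layout does
 * not depend on whether the caller is built 32-bit or 64-bit.
 */
struct fixed_reservation {
    xen_ulong_t extent_start;   /* guest address carried as an integer */
    xen_ulong_t nr_extents;
    uint32_t    extent_order;
    uint32_t    address_bits;
    uint16_t    domid;
    uint16_t    pad[3];         /* explicit padding to a 32-byte total */
};

static_assert(sizeof(struct fixed_reservation) == 32,
              "layout must not depend on the compiler's 'long'");
static_assert(offsetof(struct fixed_reservation, nr_extents) == 8,
              "nr_extents must sit at a fixed offset");

enum { SKETCH_OP_INCREASE = 0 };   /* hypothetical command number */

/* Stand-in for the shared worker; pretends every extent was allocated. */
static long increase_reservation_sketch(uint64_t extent_start,
                                        uint64_t nr_extents,
                                        unsigned int extent_order,
                                        uint16_t domid)
{
    (void)extent_start;
    (void)extent_order;
    (void)domid;
    return (long)nr_extents;
}

/* Compat entry point: unpack the fixed layout, call the shared worker. */
static long do_memory_op_compat_sketch(int cmd,
                                       const struct fixed_reservation *r)
{
    switch (cmd) {
    case SKETCH_OP_INCREASE:
        return increase_reservation_sketch(r->extent_start, r->nr_extents,
                                           r->extent_order, r->domid);
    default:
        return -1;   /* unknown command */
    }
}
```

The point of the split is that only the thin entry point knows about the
structure layout; the worker takes plain values and can be shared with
the existing hypercall path.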

-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: [rfc] [patch] more 'long' in the hypervisor interface
  2006-06-29 18:14             ` Hollis Blanchard
@ 2006-06-29 21:04               ` Steve Ofsthun
  0 siblings, 0 replies; 15+ messages in thread
From: Steve Ofsthun @ 2006-06-29 21:04 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Chris Wright, xen-devel, xen-ppc-devel

Hollis Blanchard wrote:
> On Thu, 2006-06-29 at 10:37 -0400, Steve Ofsthun wrote:
> 
>>For X86 there are probably two plans.  For paravirtual guests, there is a
>>strong desire to formalize the existing ABI.  This will force the 32-bit
>>and 64-bit ABIs to remain significantly different.  Since the underlying
>>hypervisors don't allow 32/64 mixed mode guests, there is little reason
>>to reconcile the two ABIs.  If the ABIs were identical today, you still
>>couldn't run mixed mode guests.
> 
> 
> Not sure I follow here. Identical ABIs would enable mixed mode guests,
> even if the current implementation doesn't support that, right? So that
> sounds like a good goal.


Yes, identical ABIs would enable mixed mode guests.  I was just trying to
point out that there are other issues (page table sharing, etc) that would
also need to be addressed before mixed mode PV guests would work.

Steve
-- 
Steve Ofsthun - Virtual Iron Software, Inc.


Thread overview: 15+ messages
2006-06-28 21:03 [rfc] [patch] more 'long' in the hypervisor interface Hollis Blanchard
2006-06-28 21:02 ` Keir Fraser
2006-06-28 21:19   ` Hollis Blanchard
2006-06-28 21:09 ` Chris Wright
2006-06-28 21:21   ` Hollis Blanchard
2006-06-28 21:36     ` Chris Wright
2006-06-28 21:58       ` Hollis Blanchard
2006-06-28 22:42         ` [RFC] Erratic mouse in HVM guest Ross Maxfield
2006-06-28 23:05         ` [rfc] [patch] more 'long' in the hypervisor interface Chris Wright
2006-06-29 14:37           ` Steve Ofsthun
2006-06-29 17:02             ` Chris Wright
2006-06-29 17:55               ` Steve Ofsthun
2006-06-29 18:14             ` Hollis Blanchard
2006-06-29 21:04               ` Steve Ofsthun
2006-06-28 21:10 ` Hollis Blanchard