* passing hypercall parameters by pointer
@ 2005-08-17 19:51 Hollis Blanchard
  0 siblings, 0 replies; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-17 19:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Jimi Xenidis

Many Xen hypercalls pass mlocked pointers as parameters for both input and 
output. For example, xc_get_pfn_list() is a nice one with multiple levels of 
structures/mlocking.

Considering just the tools for the moment, those pointers are userspace 
addresses. Ultimately the hypervisor ends up with that userspace address, from 
which it reads and writes data. This is OK for x86, since userspace, kernel, 
and hypervisor all share the same virtual address space (and userspace has 
carefully mlocked the relevant memory).

On PowerPC though, the hypervisor runs in real mode (no MMU translation).  
Unlike x86, PowerPC exceptions arrive in real mode, and also PowerPC does not 
force a TLB flush when switching between real and virtual modes. So a virtual 
address is pretty much worthless as a hypervisor parameter; performing the 
MMU translation in software is infeasible.

Although it rarely passes parameters by pointer, the way the pSeries 
hypervisor handles this is having the kernel always pass a "pseudo-physical" 
address (to borrow Xen terminology), which is trivially translatable to a 
"machine" address in the hypervisor. The processor has some notion of a large 
(e.g. 64M) chunk of contiguous machine memory, so the hypervisor keeps a 
table of chunks which can be used to translate pseudo-physical addresses.
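
A minimal sketch of that chunk-table translation, assuming 64MB chunks and a
flat per-domain table. The names (chunk_table, pseudo_to_machine) are made up
for illustration and are not the pSeries or Xen interface:

```c
#include <stdint.h>

/* Assumed for illustration: 64MB chunks, small fixed-size table. */
#define CHUNK_SHIFT  26
#define CHUNK_SIZE   (UINT64_C(1) << CHUNK_SHIFT)
#define NR_CHUNKS    4
#define INVALID_ADDR UINT64_MAX

/* Hypothetical table: machine base address of each pseudo-physical chunk. */
struct chunk_table {
    uint64_t machine_base[NR_CHUNKS];
};

/* Translate a pseudo-physical address to a machine address with one
 * table lookup, no MMU involved. */
static uint64_t pseudo_to_machine(const struct chunk_table *t, uint64_t paddr)
{
    uint64_t idx = paddr >> CHUNK_SHIFT;

    if (idx >= NR_CHUNKS)
        return INVALID_ADDR;            /* outside the domain's memory */
    return t->machine_base[idx] + (paddr & (CHUNK_SIZE - 1));
}
```

The point is only that the lookup is a shift, a bounds check, and an add,
which is cheap enough to do in real mode.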

Of course, userspace doesn't know pseudo-physical addresses, only the kernel 
does. So one way or another, to pass parameters by pointer to the PPC 
hypervisor, the kernel is going to need to translate them. That also means  
userspace memory areas will be limited to one page (since virtually 
consecutive pages may not be representable by a single pseudo-physical 
address).

If we're stuck with structure addresses in hypercalls, one possible solution 
is to modify libxc so that all parameter addresses are physical pointers 
within the same page, then pass that page's physical address into the 
hypercall. Something like this:

ulong magicpage_vaddr;
ulong magicpage_paddr;
ulong offset;	/* next free byte within the magic page */

libxc_init() {
#ifdef __powerpc__
	posix_memalign((void **)&magicpage_vaddr, PAGE_SIZE, PAGE_SIZE);
	mlock((void *)magicpage_vaddr, PAGE_SIZE);
	magicpage_paddr = new_translate_syscall(magicpage_vaddr);
#endif
	...
}

xc_get_pfn_list() {
	dom0_op_t *op;
	ulong op_paddr;
	magicalloc((ulong *)&op, &op_paddr, sizeof(dom0_op_t));
	...
}

#ifdef __powerpc__
/* hand out the same bytes under both addresses */
magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes) {
	*usable_addr = magicpage_vaddr + offset;
	*hcall_addr = magicpage_paddr + offset;
	offset += bytes;
}

do_xen_hypercall(ptr) {
	/* rebase from the userspace view to the pseudo-physical view */
	ptr -= magicpage_vaddr - magicpage_paddr;
	do_privcmd(..., ptr);
}
#endif

(Note that this is for discussion only, not a proposed interface.)

Each architecture would provide its own magicalloc and do_xen_hypercall. For 
x86, magicalloc would be malloc+mlock and both pointers would be the same; x86 
do_xen_hypercall would remain unchanged. Basically, any current use of mlock 
in libxc would be replaced with calls to magicalloc.
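
For discussion, a slightly more careful sketch of the allocator side, with the
bounds check the pseudocode omits. It assumes a single 4KB page and that the
kernel has already supplied the pseudo-physical address; struct magic_page and
magicfree_all are invented names:

```c
#include <stddef.h>
#include <stdint.h>

#define MAGIC_PAGE_SIZE 4096u   /* assumed page size */

/* One pinned page seen through two addresses: the one libxc
 * dereferences and the one handed to the hypervisor. */
struct magic_page {
    uintptr_t vaddr;   /* usable by libxc */
    uintptr_t paddr;   /* pseudo-physical, assumed supplied by the kernel */
    size_t    used;    /* bytes already handed out */
};

/* Carve `bytes` out of the page. Returns 0 on success, -1 if the
 * request does not fit in what is left of the page. */
static int magicalloc(struct magic_page *mp, size_t bytes,
                      uintptr_t *usable_addr, uintptr_t *hcall_addr)
{
    size_t aligned = (bytes + 7u) & ~(size_t)7u;   /* keep 8-byte alignment */

    if (aligned < bytes || aligned > MAGIC_PAGE_SIZE - mp->used)
        return -1;
    *usable_addr = mp->vaddr + mp->used;
    *hcall_addr  = mp->paddr + mp->used;
    mp->used += aligned;
    return 0;
}

/* Release everything at once, e.g. after the hypercall returns. */
static void magicfree_all(struct magic_page *mp)
{
    mp->used = 0;
}
```

On x86 the same structure would simply carry vaddr == paddr, so callers would
not need to care which architecture they are on.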

Alternatively, if we're willing to change the embedded pointers in dom0_ops to 
offsets, we do not need to invent a new "translate" system call.

Other suggestions are welcome.

-- 
Hollis Blanchard
IBM Linux Technology Center

* RE: passing hypercall parameters by pointer
@ 2005-08-17 20:44 Ian Pratt
       [not found] ` <mailman.1124311483.4826@unix-os.sc.intel.com>
  2005-08-17 22:04 ` Hollis Blanchard
  0 siblings, 2 replies; 21+ messages in thread
From: Ian Pratt @ 2005-08-17 20:44 UTC (permalink / raw)
  To: Hollis Blanchard, xen-devel; +Cc: Jimi Xenidis

> Many Xen hypercalls pass mlocked pointers as parameters for 
> both input and output. For example, xc_get_pfn_list() is a 
> nice one with multiple levels of structures/mlocking.
> 
> Considering just the tools for the moment, those pointers are 
> userspace addresses. Ultimately the hypervisor ends up with 
> that userspace address, from which it reads and writes data. 
> This is OK for x86, since userspace, kernel, and hypervisor 
> all share the same virtual address space (and userspace has 
> carefully mlocked the relevant memory).
> 
> On PowerPC though, the hypervisor runs in real mode (no MMU 
> translation).  
> Unlike x86, PowerPC exceptions arrive in real mode, and also 
> PowerPC does not force a TLB flush when switching between 
> real and virtual modes. So a virtual address is pretty much 
> worthless as a hypervisor parameter; performing the MMU 
> translation in software is infeasible.

I think I'd prefer to hide all of this by co-operation between the
kernel and the hypervisor's copy to/from user.

The kernel can easily translate a virtual address and length into a list
of pseudo-physical frame numbers and initial offset. Xen's copy from
user function can then use this list when doing its work.
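
A rough sketch of the kernel-side half, assuming 4KB pages. The descriptor
layout and the virt_to_pfn callback are hypothetical stand-ins (the callback
represents the kernel's own page-table walk), and toy_virt_to_pfn exists only
to show that consecutive virtual pages need not map to consecutive frames:

```c
#include <stddef.h>
#include <stdint.h>

#define FL_PAGE_SHIFT 12
#define FL_PAGE_SIZE  ((uintptr_t)1 << FL_PAGE_SHIFT)

/* Hypothetical descriptor handed to the hypervisor in place of a pointer. */
struct frame_list {
    size_t    first_offset;  /* offset of the buffer within the first frame */
    size_t    length;        /* total bytes described */
    size_t    nr_frames;
    uintptr_t pfn[8];        /* pseudo-physical frame numbers (fixed cap here) */
};

/* Stand-in for the kernel's page-table walk; assumed, not a real API. */
typedef uintptr_t (*virt_to_pfn_fn)(uintptr_t vaddr);

/* Toy translation: reverses page order so frames are not contiguous. */
static uintptr_t toy_virt_to_pfn(uintptr_t vaddr)
{
    return 1000 - (vaddr >> FL_PAGE_SHIFT);
}

/* Describe [vaddr, vaddr+len) as a list of frames. Returns 0 on success,
 * -1 if len is zero or the buffer spans more frames than the list holds. */
static int build_frame_list(uintptr_t vaddr, size_t len,
                            virt_to_pfn_fn virt_to_pfn, struct frame_list *fl)
{
    uintptr_t first = vaddr & ~(FL_PAGE_SIZE - 1);
    uintptr_t last;
    size_t n, i;

    if (len == 0)
        return -1;
    last = (vaddr + len - 1) & ~(FL_PAGE_SIZE - 1);
    n = (size_t)((last - first) >> FL_PAGE_SHIFT) + 1;
    if (n > sizeof(fl->pfn) / sizeof(fl->pfn[0]))
        return -1;

    fl->first_offset = vaddr & (FL_PAGE_SIZE - 1);
    fl->length = len;
    fl->nr_frames = n;
    for (i = 0; i < n; i++)
        fl->pfn[i] = virt_to_pfn(first + ((uintptr_t)i << FL_PAGE_SHIFT));
    return 0;
}
```

A 32-byte buffer that straddles a page boundary yields two entries, which is
exactly the case a single translated start address cannot describe.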

Ian



* Re: passing hypercall parameters by pointer
       [not found] ` <mailman.1124311483.4826@unix-os.sc.intel.com>
@ 2005-08-17 21:07   ` Arun Sharma
  2005-08-17 22:11     ` Hollis Blanchard
  0 siblings, 1 reply; 21+ messages in thread
From: Arun Sharma @ 2005-08-17 21:07 UTC (permalink / raw)
  To: Ian Pratt; +Cc: Ling, Xiaofeng, xen-devel, Yu, Ke

Ian Pratt wrote:
>>Many Xen hypercalls pass mlocked pointers as parameters for 
>>both input and output. For example, xc_get_pfn_list() is a 
>>nice one with multiple levels of structures/mlocking.
>>
>>Considering just the tools for the moment, those pointers are 
>>userspace addresses. Ultimately the hypervisor ends up with 
>>that userspace address, from which it reads and writes data. 
>>This is OK for x86, since userspace, kernel, and hypervisor 
>>all share the same virtual address space (and userspace has 
>>carefully mlocked the relevant memory).

This is a problem even on x86 for VMX domains which execute hypercalls 
because of paravirtualized device drivers.

>>
>>On PowerPC though, the hypervisor runs in real mode (no MMU 
>>translation).  
>>Unlike x86, PowerPC exceptions arrive in real mode, and also 
>>PowerPC does not force a TLB flush when switching between 
>>real and virtual modes. So a virtual address is pretty much 
>>worthless as a hypervisor parameter; performing the MMU 
>>translation in software is infeasible.
> 
> 
> I think I'd prefer to hide all of this by co-operation between the
> kernel and the hypervisor's copy to/from user.
>

This is basically what Xiaofeng attempted to do in this patch:

http://article.gmane.org/gmane.comp.emulators.xen.devel/11107

although there the virtual -> pseudo-physical translation is also done in
the hypervisor. Please let us know if the patch is acceptable in light of
your email.

> The kernel can easily translate a virtual address and length into a list
> of pseudo-physical frame numbers and initial offset. Xen's copy from
> user function can then use this list when doing its work. 

The other alternative (which we talked about at OLS) is to use a couple 
of pinned pages for parameter passing - but it doesn't work very well for:

a) Multiple levels of structures/pointers
b) Arguments which may be bigger than a couple of pages 
(xc_get_pfn_list() for a bigmem domain for example).

	-Arun

* Re: passing hypercall parameters by pointer
  2005-08-17 20:44 passing hypercall parameters by pointer Ian Pratt
       [not found] ` <mailman.1124311483.4826@unix-os.sc.intel.com>
@ 2005-08-17 22:04 ` Hollis Blanchard
  1 sibling, 0 replies; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-17 22:04 UTC (permalink / raw)
  To: Ian Pratt; +Cc: Jimi Xenidis, xen-devel

On Wednesday 17 August 2005 15:44, Ian Pratt wrote:
> > Many Xen hypercalls pass mlocked pointers as parameters for
> > both input and output. For example, xc_get_pfn_list() is a
> > nice one with multiple levels of structures/mlocking.
> >
> > Considering just the tools for the moment, those pointers are
> > userspace addresses. Ultimately the hypervisor ends up with
> > that userspace address, from which it reads and writes data.
> > This is OK for x86, since userspace, kernel, and hypervisor
> > all share the same virtual address space (and userspace has
> > carefully mlocked the relevant memory).
> >
> > On PowerPC though, the hypervisor runs in real mode (no MMU
> > translation).
> > Unlike x86, PowerPC exceptions arrive in real mode, and also
> > PowerPC does not force a TLB flush when switching between
> > real and virtual modes. So a virtual address is pretty much
> > worthless as a hypervisor parameter; performing the MMU
> > translation in software is infeasible.
>
> I think I'd prefer to hide all of this by co-operation between the
> kernel and the hypervisor's copy to/from user.
>
> The kernel can easily translate a virtual address and length into a list
> of pseudo-physical frame numbers and initial offset. Xen's copy from
> user function can then use this list when doing its work.

Could you elaborate a little?

Consider this structure:
typedef struct {
    /* IN variables. */
    domid_t       domain;
    memory_t      max_pfns;
    void         *buffer;
    /* OUT variables. */
    memory_t      num_pfns;
} dom0_getmemlist_t;

libxc creates this struct and passes it to the kernel, and the kernel doesn't 
know anything about the internals. Are you saying that privcmd_ioctl() should 
look like this?

    switch ( cmd )
    {
    case IOCTL_PRIVCMD_HYPERCALL:
    {
        privcmd_hypercall_t hypercall;
        dom0_op_t *op = (dom0_op_t *)&hypercall;
  
        if ( copy_from_user(&hypercall, (void *)data, sizeof(hypercall)) )
            return -EFAULT;

        /* NEW switch statement: */
        switch (op->cmd)
        {
        case DOM0_GETMEMLIST:
            op->u.getmemlist.buffer = virt_to_phys(op->u.getmemlist.buffer);
            break;
        case DOM0_SETDOMAININFO:
            ...
        case DOM0_READCONSOLE:
            ...
        }
    }
    break;
    }

Right now the kernel doesn't peer inside the hypercall structures at all.

-- 
Hollis Blanchard
IBM Linux Technology Center

* Re: passing hypercall parameters by pointer
  2005-08-17 21:07   ` Arun Sharma
@ 2005-08-17 22:11     ` Hollis Blanchard
  0 siblings, 0 replies; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-17 22:11 UTC (permalink / raw)
  To: Arun Sharma; +Cc: Jimi Xenidis, Ian Pratt, xen-devel, Yu, Ke, Ling, Xiaofeng

On Wednesday 17 August 2005 16:07, Arun Sharma wrote:
> Ian Pratt wrote:
> >>Many Xen hypercalls pass mlocked pointers as parameters for
> >>both input and output. For example, xc_get_pfn_list() is a
> >>nice one with multiple levels of structures/mlocking.
> >>
> >>Considering just the tools for the moment, those pointers are
> >>userspace addresses. Ultimately the hypervisor ends up with
> >>that userspace address, from which it reads and writes data.
> >>This is OK for x86, since userspace, kernel, and hypervisor
> >>all share the same virtual address space (and userspace has
> >>carefully mlocked the relevant memory).
>
> This is a problem even on x86 for VMX domains which execute hypercalls
> because of para virtualized device drivers.
>
> >>On PowerPC though, the hypervisor runs in real mode (no MMU
> >>translation).
> >>Unlike x86, PowerPC exceptions arrive in real mode, and also
> >>PowerPC does not force a TLB flush when switching between
> >>real and virtual modes. So a virtual address is pretty much
> >>worthless as a hypervisor parameter; performing the MMU
> >>translation in software is infeasible.
> >
> > I think I'd prefer to hide all of this by co-operation between the
> > kernel and the hypervisor's copy to/from user.
>
> This is basically what Xiaofeng attempted to do in this patch:
>
> http://article.gmane.org/gmane.comp.emulators.xen.devel/11107
>
> although the virtual -> pseudo physical is also done in the hypervisor.
> Please let us know if the patch is acceptable in light of your email.

This patch performs MMU translation in software. Even if you like that on 
x86, doing it on PowerPC is considerably more expensive. Just the page 
table lookup could be 16 loads and compares, and that's not counting 
segmentation.
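
To show where the 16 comes from, here is a deliberately simplified model of a
hashed page table search: 8 entries in the primary group plus 8 in the
secondary group. The struct layout is invented for readability (real entries
pack these fields into two words), the hash is a placeholder (the real one
also mixes in the segment's VSID), and the compares counter is
instrumentation for the sketch only:

```c
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_GROUP 8

/* Simplified page table entry; not the real bit layout. */
struct pte {
    uint64_t vpn;            /* virtual page number tag */
    uint64_t rpn;            /* real page number */
    unsigned valid : 1;
    unsigned secondary : 1;  /* set if inserted via the secondary hash */
};

struct htab {
    struct pte *groups;      /* nr_groups * PTES_PER_GROUP entries */
    uint64_t    mask;        /* nr_groups - 1, nr_groups a power of two */
};

static unsigned long compares;   /* counts entries examined */

/* Linear search of one 8-entry group. */
static const struct pte *search_group(const struct htab *h, uint64_t hash,
                                      uint64_t vpn, unsigned secondary)
{
    const struct pte *g = &h->groups[(hash & h->mask) * PTES_PER_GROUP];

    for (int i = 0; i < PTES_PER_GROUP; i++) {
        compares++;
        if (g[i].valid && g[i].secondary == secondary && g[i].vpn == vpn)
            return &g[i];
    }
    return NULL;
}

/* Try the primary group, then the group selected by the complemented hash. */
static const struct pte *htab_lookup(const struct htab *h, uint64_t vpn)
{
    uint64_t hash = vpn;     /* placeholder hash, see note above */
    const struct pte *p = search_group(h, hash, vpn, 0);

    if (!p)
        p = search_group(h, ~hash, vpn, 1);
    return p;
}
```

A miss, or a hit in the last slot of the secondary group, touches all 16
entries, and each of those is a load from memory the hypervisor would have to
reach in real mode.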

> > The kernel can easily translate a virtual address and length into a list
> > of pseudo-physical frame numbers and initial offset. Xen's copy from
> > user function can then use this list when doing its work.
>
> The other alternative (which we talked about at OLS) is to use a couple
> of pinned pages for parameter passing - but it doesn't work very well for:
>
> a) Multiple levels of structures/pointers
> b) Arguments which may be bigger than a couple of pages
> (xc_get_pfn_list() for a bigmem domain for example).

This is pretty much the proposal I sent earlier. The multiple levels of 
pointers can be handled as I showed, by creating an allocator that manages 
those couple of pages.

I have no answer for parameters that are very large, but I wonder how many 
cases there are. For example, DOM0_READCONSOLE could just be limited to 4KB 
reads, and if there's more data than that, call it again. Perhaps there is 
some case-specific solution to xc_get_pfn_list() as well.
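
A sketch of what "call it again" could look like from the caller's side. The
read_chunk callback is a hypothetical stand-in for one page-limited hypercall
(it is not the DOM0_READCONSOLE interface), and toy_console is only there to
drive the loop:

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES 4096u

/* Hypothetical single-page read: copies up to CHUNK_BYTES starting at
 * `offset` into `page` and returns the byte count, 0 at end of data. */
typedef size_t (*read_chunk_fn)(void *ctx, size_t offset, void *page);

/* Toy data source standing in for the hypervisor's console buffer. */
struct toy_console {
    const char *data;
    size_t      len;
};

static size_t toy_read_chunk(void *ctx, size_t offset, void *page)
{
    const struct toy_console *c = ctx;
    size_t n;

    if (offset >= c->len)
        return 0;
    n = c->len - offset;
    if (n > CHUNK_BYTES)
        n = CHUNK_BYTES;
    memcpy(page, c->data + offset, n);
    return n;
}

/* Fill `out` by issuing page-sized reads until the data or the
 * caller's buffer runs out. Returns the number of bytes copied. */
static size_t read_all(read_chunk_fn read_chunk, void *ctx,
                       char *out, size_t out_len)
{
    char page[CHUNK_BYTES];   /* stands in for the single pinned page */
    size_t total = 0;

    while (total < out_len) {
        size_t n = read_chunk(ctx, total, page);

        if (n == 0)
            break;
        if (n > out_len - total)
            n = out_len - total;
        memcpy(out + total, page, n);
        total += n;
    }
    return total;
}
```

Each iteration is a separate call, so the cost is one hypercall per 4KB rather
than one per request.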

-- 
Hollis Blanchard
IBM Linux Technology Center

* RE: passing hypercall parameters by pointer
@ 2005-08-18  0:47 Ling, Xiaofeng
  0 siblings, 0 replies; 21+ messages in thread
From: Ling, Xiaofeng @ 2005-08-18  0:47 UTC (permalink / raw)
  To: Sharma, Arun, Ian Pratt; +Cc: xen-devel, Yu, Ke



Arun Sharma <mailto:arun.sharma@intel.com> wrote:
> Ian Pratt wrote:
> The other alternative (which we talked about at OLS) is to use a
> couple of pinned pages for parameter passing - but it doesn't work
> very well for:  
> 
> a) Multiple levels of structures/pointers
A good example is do_multicall.
A complete implementation needs to enumerate all the hypercalls and 
handle each one that uses pointers.

> b) Arguments which may be bigger than a couple of pages
> (xc_get_pfn_list() for a bigmem domain for example).
> 
> 	-Arun

* RE: passing hypercall parameters by pointer
@ 2005-08-18  6:56 Tian, Kevin
  0 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2005-08-18  6:56 UTC (permalink / raw)
  To: Ian Pratt, Hollis Blanchard, xen-devel; +Cc: Jimi Xenidis

>From: Ian Pratt
>Sent: Thursday, August 18, 2005 4:44 AM
>> On PowerPC though, the hypervisor runs in real mode (no MMU 
>> translation).  
>> Unlike x86, PowerPC exceptions arrive in real mode, and also 
>> PowerPC does not force a TLB flush when switching between 
>> real and virtual modes. So a virtual address is pretty much 
>> worthless as a hypervisor parameter; performing the MMU 
>> translation in software is infeasible.
>
>I think I'd prefer to hide all of this by co-operation between the
>kernel and the hypervisor's copy to/from user.
>
>The kernel can easily translate a virtual address and length into a list
>of pseudo-physical frame numbers and initial offset. Xen's copy from
>user function can then use this list when doing its work.
>
>Ian
>

So this is a common concern for any hypervisor residing in a different
address space from the guest. For PowerPC, it's real mode (hypervisor) vs.
virtual mode (guest). For a VMX domain, the hypervisor has its own monitor
page table, separate from the shadow page table. I expect the final
solution to be uniform too. ;-)

Let me see if I understand your suggestion correctly. Xiaofeng's previous
patch has the following flow when accessing the guest address space:
---hypervisor---
- Search for the gva in the guest page table to get the pfn
- Get the mfn from the pfn
- Map the mfn into the hypervisor's space
- Then directly access the new va'

Your suggestion, then, is to make the gva->pfn search happen in the guest.
The hypervisor would still perform the remaining steps: manipulate the
monitor page table first, then access the new va'. (PowerPC would access
the mfn directly.) Finally, with either option, copy_from/to_user becomes
a memcpy to a new va' with no exception happening.

Now a question arises. The pseudo-physical frame number list is itself
a parameter to the hypervisor, and there's no promise that this list will
be confined to a single page. You also need extra info in this list if
multiple parameters are pointers. How to access this scalable list
effectively seems to be the same puzzle as the subject. For x86, people
may set a maximum limit, but how about 64-bit platforms? A good example,
as always, is get_pfn_list, which always breaks assumptions about the size
of a parameter. ;-)

Thanks,
Kevin

* RE: passing hypercall parameters by pointer
@ 2005-08-18  6:56 Tian, Kevin
  0 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2005-08-18  6:56 UTC (permalink / raw)
  To: Hollis Blanchard, Sharma, Arun
  Cc: Jimi Xenidis, Ian Pratt, xen-devel, Yu, Ke, Ling, Xiaofeng

>From: Hollis Blanchard
>Sent: Thursday, August 18, 2005 6:11 AM
>
>I have no answer for parameters that are very large, but I wonder how many
>cases there are. For example, DOM0_READCONSOLE could just be limited to 4KB
>reads, and if there's more data than that, call it again. Perhaps there is
>some case-specific solution to xc_get_pfn_list() as well.
>

If a hypercall needs to capture a specific context atomically at one point
in time, calling it again several times actually returns mixed contexts
belonging to different points in time. That's not desirable. Even if people
add atomic protection for this type of case, performance will suffer a lot
and there is more risk of deadlock.

Thanks,
Kevin

* RE: passing hypercall parameters by pointer
@ 2005-08-18  6:56 Tian, Kevin
  2005-08-18 15:58 ` Hollis Blanchard
  0 siblings, 1 reply; 21+ messages in thread
From: Tian, Kevin @ 2005-08-18  6:56 UTC (permalink / raw)
  To: Hollis Blanchard, Ian Pratt; +Cc: Jimi Xenidis, xen-devel

>From: Hollis Blanchard
>Sent: Thursday, August 18, 2005 6:05 AM
>        case DOM0_GETMEMLIST:
>            op->u.getmemlist.buffer = virt_to_phys(op->u.getmemlist.buffer);
>            break;

If following Ian's suggestion, you have to create a list of pfns here
instead of only converting the start address. There's no guarantee that
the buffer is limited to one page. ;-)

Thanks,
Kevin

* Re: passing hypercall parameters by pointer
  2005-08-18  6:56 Tian, Kevin
@ 2005-08-18 15:58 ` Hollis Blanchard
  2005-08-19  2:00   ` Jimi Xenidis
  0 siblings, 1 reply; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-18 15:58 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Jimi Xenidis, Ian Pratt, xen-devel

On Aug 18, 2005, at 1:56 AM, Tian, Kevin wrote:

>> From: Hollis Blanchard
>> Sent: Thursday, August 18, 2005 6:05 AM
>>        case DOM0_GETMEMLIST:
>>            op->u.getmemlist.buffer = virt_to_phys(op->u.getmemlist.buffer);
>>            break;
>
> If following Ian's suggestion, you have to create a list of pfns here
> instead of only converting the start address. There's no guarantee that
> the buffer is limited to one page. ;-)

Actually that was an explicitly stated limitation.

But I think I like this scatterlist idea. So for every pointer (buffer 
in the above example), the pseudo-physical address of a scatterlist 
would be passed to the hypervisor instead, and copy_to/from_user would 
expect a scatterlist address instead of a pointer. I think the 
copy_to/from_user and get/put_user API would need to change though: 
you'd need the value, the scatterlist pointer, and an offset into the 
scatterlist.

So x86 would need a slight API change, but could continue without 
dealing with any scatterlists, i.e. no ABI change.
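
To make the API change concrete, a sketch of a copy routine that takes a
scatterlist and an offset instead of a pointer. Everything here is
hypothetical, including the struct layout; frames[] holds directly usable
addresses, standing in for pages the hypervisor has already translated from
pseudo-physical:

```c
#include <stddef.h>
#include <string.h>

#define SG_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical scatterlist describing one guest buffer. */
struct sg_list {
    unsigned char **frames;       /* one entry per page of the buffer */
    size_t          nr_frames;
    size_t          first_offset; /* where the buffer starts in frames[0] */
    size_t          length;       /* total bytes in the buffer */
};

/* Copy `len` bytes starting `offset` bytes into the described buffer.
 * Returns 0 on success, -1 if the range falls outside the buffer. */
static int copy_from_sg(void *dst, const struct sg_list *sg,
                        size_t offset, size_t len)
{
    unsigned char *out = dst;
    size_t pos;

    if (offset > sg->length || len > sg->length - offset)
        return -1;

    pos = sg->first_offset + offset;
    while (len > 0) {
        size_t frame   = pos / SG_PAGE_SIZE;
        size_t in_page = pos % SG_PAGE_SIZE;
        size_t n       = SG_PAGE_SIZE - in_page;   /* bytes left in this page */

        if (n > len)
            n = len;
        if (frame >= sg->nr_frames)
            return -1;
        memcpy(out, sg->frames[frame] + in_page, n);
        out += n;
        pos += n;
        len -= n;
    }
    return 0;
}
```

An x86 build could keep the same signature and treat the scatterlist argument
as a plain virtual address, which is how the ABI would stay unchanged there.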

The PowerPC kernel would need knowledge of every hypercall structure to 
create and translate the scatterlist. I know that's an idea Jimi isn't 
fond of, but it really seems like the best solution here.

-- 
Hollis Blanchard
IBM Linux Technology Center

* Re: passing hypercall parameters by pointer
  2005-08-18 15:58 ` Hollis Blanchard
@ 2005-08-19  2:00   ` Jimi Xenidis
  2005-08-19 10:32     ` Keir Fraser
  0 siblings, 1 reply; 21+ messages in thread
From: Jimi Xenidis @ 2005-08-19  2:00 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Ian Pratt, Tian, Kevin, xen-devel

>>>>> "HB" == Hollis Blanchard <hollisb@us.ibm.com> writes:

hmm let me bubble up my intro :)

 HB> I know that's an idea Jimi isn't fond of, but it really seems
 HB> like the best solution here.

Why I dislike this solution:
  1. Currently, the kernel has no intimate knowledge of the management
     calls.  This is goodness, since it gives us the freedom to
     "innovate" in the management area without impacting the kernel.
     We would now require kernel updates that grok management
     structures, creating more opportunity for versioning chaos and
     bloating the kernel patch.
  2. We are complicating the kernel and the hypervisor in order to
     keep a user app simple.  Does anyone care if a user app suffers
     a little performance impact?  Frankly, I'm much more worried about
     unnecessarily impacting the hypervisor.

I believe a negotiated management area that the application serializes
all arguments into to be a far better solution. The area can be of
arbitrary size, and the added complexity to the application is
trivial.
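
Roughly what I have in mind, as a sketch only. The wire layout below is
invented for illustration (fixed-width fields, the output buffer named by its
offset within the area rather than by a pointer), and it assumes the area
itself is 8-byte aligned:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical serialized form of a getmemlist-style request. */
struct area_getmemlist {
    uint32_t domain;
    uint32_t max_pfns;
    uint32_t buffer_off;   /* offset of the output array within the area */
    uint32_t num_pfns;     /* OUT */
};

/* Application side: lay out the request and reserve room for its output
 * array in `area`. Returns bytes used, or 0 if the area is too small. */
static size_t serialize_getmemlist(void *area, size_t area_len,
                                   uint32_t domain, uint32_t max_pfns)
{
    struct area_getmemlist req;
    size_t buf_off = (sizeof(req) + 7u) & ~(size_t)7u;

    if (buf_off > area_len ||
        max_pfns > (area_len - buf_off) / sizeof(uint64_t))
        return 0;

    req.domain     = domain;
    req.max_pfns   = max_pfns;
    req.buffer_off = (uint32_t)buf_off;
    req.num_pfns   = 0;
    memcpy(area, &req, sizeof(req));
    return buf_off + (size_t)max_pfns * sizeof(uint64_t);
}

/* Other side: resolve the offset against its own mapping of the area,
 * after checking that the array stays inside it. */
static uint64_t *resolve_buffer(void *area, size_t area_len)
{
    struct area_getmemlist req;

    memcpy(&req, area, sizeof(req));
    if (req.buffer_off > area_len ||
        req.max_pfns > (area_len - req.buffer_off) / sizeof(uint64_t))
        return NULL;
    return (uint64_t *)((unsigned char *)area + req.buffer_off);
}
```

Neither side needs to know the other's addresses, and the kernel never has to
look inside the request.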

Am I missing something?
-JX


-- 
 "I got an idea, an idea so smart my head would explode if I even
  began to know what I was talking about." -- Peter Griffin (Family Guy)

* Re: passing hypercall parameters by pointer
  2005-08-19  2:00   ` Jimi Xenidis
@ 2005-08-19 10:32     ` Keir Fraser
  0 siblings, 0 replies; 21+ messages in thread
From: Keir Fraser @ 2005-08-19 10:32 UTC (permalink / raw)
  To: Jimi Xenidis; +Cc: Ian Pratt, Tian, Kevin, xen-devel


On 19 Aug 2005, at 03:00, Jimi Xenidis wrote:

> I believe a negotiated management area that the application serializes
> all arguments into to be a far better solution. The area can be of
> arbitrary size, and the added complexity to the application is
> trivial.
>
> Am I missing something?

This is the correct answer imo. get_pfn_list() needs to die anyway: 
there are better ways to get the list of mfns belonging to a guest (you 
can get the list back from increase_reservation, or you can map the 
guest's pfn->mfn map).

The current mlock() scheme in libxc is screwed anyway -- we 
mlock/munlock regions that may overlap at page granularity. Fixing this 
would lead naturally to a preallocation scheme.
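
To illustrate the overlap: mlock() and munlock() act on whole pages and the
locks do not nest, so two small buffers that happen to share a page interfere
with each other. A minimal check, assuming 4KB pages (the helper names are
made up):

```c
#include <stddef.h>
#include <stdint.h>

#define PG_SIZE 4096u   /* assumed page size */

/* Index of the first page touched by a buffer starting at addr. */
static uintptr_t first_page(uintptr_t addr)
{
    return addr / PG_SIZE;
}

/* Index of the last page touched by [addr, addr+len), len > 0. */
static uintptr_t last_page(uintptr_t addr, size_t len)
{
    return (addr + len - 1) / PG_SIZE;
}

/* True if the two buffers share at least one page, so that munlock()
 * on one would also unlock part of the other. */
static int share_a_page(uintptr_t a, size_t a_len, uintptr_t b, size_t b_len)
{
    return first_page(a) <= last_page(b, b_len) &&
           first_page(b) <= last_page(a, a_len);
}
```

Two 64-byte structures at 0x1000 and 0x1800 share a page, so unlocking the
first while a hypercall is still using the second leaves the second unpinned.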

  -- Keir

* RE: passing hypercall parameters by pointer
@ 2005-08-19 11:34 Ian Pratt
  2005-08-19 11:52 ` Jimi Xenidis
  2005-08-19 12:20 ` Keir Fraser
  0 siblings, 2 replies; 21+ messages in thread
From: Ian Pratt @ 2005-08-19 11:34 UTC (permalink / raw)
  To: Keir Fraser, Jimi Xenidis; +Cc: Tian, Kevin, xen-devel

> The current mlock() scheme in libxc is screwed anyway -- we 
> mlock/munlock regions that may overlap at page granularity. 
> Fixing this would lead naturally to a preallocation scheme.

That's a very good point. For the moment, we should remove all the
munlock() calls for safety. The amount of unnecessary memory we'll end
up pinning will be tiny, so we shouldn't worry about it.

Post 3.0 we can completely redo the dom0 op interface, but the rest of
the hypercall interface will have to remain backward compatible, at
least for x86_*. Since passing by VA is so convenient on the
architectures that support it we may not want to do anything different
on these anyhow.

For VT paravirt drivers I think pre-registration will work fine. The set
of hypercalls we need to support is small anyhow.

Ian

* RE: passing hypercall parameters by pointer
  2005-08-19 11:34 Ian Pratt
@ 2005-08-19 11:52 ` Jimi Xenidis
  2005-08-19 12:17   ` Keir Fraser
  2005-08-19 12:20 ` Keir Fraser
  1 sibling, 1 reply; 21+ messages in thread
From: Jimi Xenidis @ 2005-08-19 11:52 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Tian, Kevin

>>>>> "IP" == Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> writes:

 IP> Post 3.0 we can completely redo the dom0 op interface, but the rest of
 IP> the hypercall interface will have to remain backward compatible, at
 IP> least for x86_*.

Just to clarify, "the rest" refers to hypercalls made from the kernel,
correct?  Any hypercalls using VAs made from user space are at issue here.

 IP> Since passing by VA is so convenient on the architectures that
 IP> support it we may not want to do anything different on these
 IP> anyhow.

I agree, why create a new mapping when a usable one exists.

At least for common kernel code, we will need to wrap such VAs in a
macro so that the "pseudo-physical" is passed in for PPC. I assume
this is reasonable?

-JX


-- 
 "I got an idea, an idea so smart my head would explode if I even
  began to know what I was talking about." -- Peter Griffin (Family Guy)

* Re: passing hypercall parameters by pointer
  2005-08-19 11:52 ` Jimi Xenidis
@ 2005-08-19 12:17   ` Keir Fraser
  2005-08-19 13:57     ` Hollis Blanchard
  0 siblings, 1 reply; 21+ messages in thread
From: Keir Fraser @ 2005-08-19 12:17 UTC (permalink / raw)
  To: Jimi Xenidis; +Cc: Ian Pratt, Tian, Kevin, xen-devel


On 19 Aug 2005, at 12:52, Jimi Xenidis wrote:

> IP> Since passing by VA is so convenient on the architectures that
>  IP> support it we may not want to do anything different on these
>  IP> anyhow.
>
> I agree, why create a new mapping when a usable one exists.
>
> At least for common kernel code, we will need to wrap such VAs in a
> macro so that the "pseudo-physical" is passed in for PPC. I assume
> this is reasonable?

This is all potentially fixable before 3.0 final. Paravirt x86 can 
continue to use guest virtual addresses. The idea would be that the 
registration scheme would essentially create a parameter-passing 
'address space' into which you hook pages of memory. On x86 we would 
map the address space onto regions of kernel va space. On other arches 
we would map the address space onto physical addresses that get mapped 
into Xen's va space. get_user/put_user/copy_from_user/copy_to_user will 
take guest addresses that point into this parameter-passing address 
space.

At least we can scope it out by doing a few hypercalls to start with -- 
probably dom0_ops first and see how it pans out. I think it will work 
quite well...

  -- Keir

* Re: passing hypercall parameters by pointer
  2005-08-19 11:34 Ian Pratt
  2005-08-19 11:52 ` Jimi Xenidis
@ 2005-08-19 12:20 ` Keir Fraser
  1 sibling, 0 replies; 21+ messages in thread
From: Keir Fraser @ 2005-08-19 12:20 UTC (permalink / raw)
  To: Ian Pratt; +Cc: Jimi Xenidis, Tian, Kevin, xen-devel


On 19 Aug 2005, at 12:34, Ian Pratt wrote:

> That's a very good point. For the moment, we should remove all the
> munlock() calls for safety. The amount of unnecessary memory we'll end
> up pinning will be tiny, so we shouldn't worry about it.

The munlock()s indicate where we should deallocate bounce buffers back 
to the pre-reservation pool. We should at least mark those places so we 
don't have to search for them again later.

  -- Keir

* RE: passing hypercall parameters by pointer
@ 2005-08-19 12:41 Ian Pratt
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Pratt @ 2005-08-19 12:41 UTC (permalink / raw)
  To: Keir Fraser, Jimi Xenidis; +Cc: Tian, Kevin, xen-devel

> This is all potentially fixable before 3.0 final. Paravirt 
> x86 can continue to use guest virtual addresses. The idea 
> would be that the registration scheme would essentially 
> create a parameter-passing 'address space' into which you 
> hook pages of memory. On x86 we would map the address space 
> onto regions of kernel va space. On other arches we would map 
> the address space onto physical addresses that get mapped 
> into Xen's va space. 
> get_user/put_user/copy_from_user/copy_to_user will take guest 
> addresses that point into this parameter-passing address space.
> 
> At least we can scope it out by doing a few hypercalls to 
> start with -- probably dom0_ops first and see how it pans 
> out. I think it will work quite well...

I'd be inclined to first go after the ops that are needed for the
paravirtualized drivers (mem_op, grantab_op). Perhaps people could post
a few patch examples for discussion?

NB: This in no way represents a commitment to get this into 3.0-final.
Let's have a look at the patches and decide.

[Right now, anything that isn't fixing bugs or sorting out xenbus/tools
is actually a distraction]

Ian


* Re: passing hypercall parameters by pointer
  2005-08-19 12:17   ` Keir Fraser
@ 2005-08-19 13:57     ` Hollis Blanchard
  2005-08-19 14:35       ` Keir Fraser
  0 siblings, 1 reply; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-19 13:57 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Jimi Xenidis, Ian Pratt, Tian, Kevin, xen-devel

On Aug 19, 2005, at 7:17 AM, Keir Fraser wrote:
>
> On 19 Aug 2005, at 12:52, Jimi Xenidis wrote:
>
>> IP> Since passing by VA is so convenient on the architectures that
>>  IP> support it we may not want to do anything different on these
>>  IP> anyhow.
>>
>> I agree, why create a new mapping when a usable one exists.
>>
>> At least for common kernel code, we will need to wrap such VAs in a
>> macro so that the "pseudo-physical" is passed in for PPC. I assume
>> this is reasonable?
>
> This is all potentially fixable before 3.0 final. Paravirt x86 can 
> continue to use guest virtual addresses. The idea would be that the 
> registration scheme would essentially create a parameter-passing 
> 'address space' into which you hook pages of memory. On x86 we would 
> map the address space onto regions of kernel va space. On other arches 
> we would map the address space onto physical addresses that get mapped 
> into Xen's va space. get_user/put_user/copy_from_user/copy_to_user 
> will take guest addresses that point into this parameter-passing 
> address space.

Could you flesh this out a little more? I *think* what you're saying is 
this (on PowerPC):
- at boot, the kernel notifies Xen of a parameter page
- replace libxc calls to mlock() with register_this_address() (which 
could be a privcmd ioctl)
- register_this_address() stuffs the userspace pointer and 
corresponding pseudo-physical pointer into a table in the parameter 
page
- libxc ignorantly creates its structures with userspace addresses
- once the hypercall arrives in Xen, copy_from_user() is passed the 
userspace address
- copy_from_user() consults the table in the parameter page to 
translate userspace -> pseudo-physical, then translates pseudo-physical 
-> machine

Is that right?
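
To make the steps above concrete, here is a rough sketch of the two-stage
lookup as I understand it. Everything in it is an assumption for
illustration: the table layout, the 4K page size, the 64M chunk size, the
chunk_base[] array and the translate_user_va() name are all hypothetical,
and register_this_address() is only the placeholder name used above.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT       12   /* assumed 4K pages */
#define PAGE_OFFSET_MASK (((uint64_t)1 << PAGE_SHIFT) - 1)
#define CHUNK_SHIFT      26   /* assumed 64M chunks */
#define CHUNK_OFFSET_MASK (((uint64_t)1 << CHUNK_SHIFT) - 1)
#define MAX_ENTRIES      16
#define MAX_CHUNKS       4

/* One row of the table kept in the parameter page (hypothetical layout). */
struct param_entry {
    uint64_t user_page;    /* page-aligned userspace address */
    uint64_t pseudo_page;  /* page-aligned pseudo-physical address */
};

static struct param_entry param_table[MAX_ENTRIES];
static size_t param_count;

/* Machine base address of each pseudo-physical chunk (hypothetical). */
static uint64_t chunk_base[MAX_CHUNKS];

/* Kernel side: record a userspace page and its pseudo-physical page. */
int register_this_address(uint64_t user_va, uint64_t pseudo_phys)
{
    if (param_count == MAX_ENTRIES)
        return -1;
    param_table[param_count].user_page = user_va & ~PAGE_OFFSET_MASK;
    param_table[param_count].pseudo_page = pseudo_phys & ~PAGE_OFFSET_MASK;
    param_count++;
    return 0;
}

/* Hypervisor side: userspace -> pseudo-physical -> machine. */
int translate_user_va(uint64_t user_va, uint64_t *machine)
{
    uint64_t page = user_va & ~PAGE_OFFSET_MASK;

    for (size_t i = 0; i < param_count; i++) {
        if (param_table[i].user_page == page) {
            uint64_t pseudo = param_table[i].pseudo_page |
                              (user_va & PAGE_OFFSET_MASK);
            uint64_t chunk = pseudo >> CHUNK_SHIFT;

            if (chunk >= MAX_CHUNKS)
                return -1;
            *machine = chunk_base[chunk] + (pseudo & CHUNK_OFFSET_MASK);
            return 0;
        }
    }
    return -1;  /* address was never registered */
}
```

Note that the lookup is per page, which matches the earlier point that a
userspace area cannot be assumed contiguous beyond one page.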

-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: passing hypercall parameters by pointer
  2005-08-19 13:57     ` Hollis Blanchard
@ 2005-08-19 14:35       ` Keir Fraser
  2005-08-19 15:18         ` Hollis Blanchard
  0 siblings, 1 reply; 21+ messages in thread
From: Keir Fraser @ 2005-08-19 14:35 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Jimi Xenidis, Ian Pratt, Tian, Kevin, xen-devel


On 19 Aug 2005, at 14:57, Hollis Blanchard wrote:

> Could you flesh this out a little more? I *think* what you're saying 
> is this (on PowerPC):
> - at boot, the kernel notifies Xen of a parameter page

It can be multiple pages, and the mappings can change over time. Think 
of something like set_parameter_page(parameter_address_space_frame, 
physical_address_space_frame) establishing a mapping from parameter 
address space to phys address space.
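
As a rough illustration only (the signature, the table size and the
param_to_phys() helper are assumptions, not an existing interface), the
mapping could be a frame-indexed table that can be re-pointed at any
time:

```c
#include <stdint.h>

#define PARAM_FRAMES 8    /* hypothetical size of the parameter address space */
#define PAGE_SHIFT   12   /* assumed 4K pages */
#define PAGE_OFFSET_MASK (((uint64_t)1 << PAGE_SHIFT) - 1)

static uint64_t phys_frame_of[PARAM_FRAMES];
static unsigned char mapped[PARAM_FRAMES];

/* Establish or replace the mapping for one parameter-space frame. */
int set_parameter_page(uint64_t param_frame, uint64_t phys_frame)
{
    if (param_frame >= PARAM_FRAMES)
        return -1;
    phys_frame_of[param_frame] = phys_frame;
    mapped[param_frame] = 1;
    return 0;
}

/* Translate a parameter-space address to a physical address. */
int param_to_phys(uint64_t param_addr, uint64_t *phys_addr)
{
    uint64_t frame = param_addr >> PAGE_SHIFT;

    if (frame >= PARAM_FRAMES || !mapped[frame])
        return -1;
    *phys_addr = (phys_frame_of[frame] << PAGE_SHIFT) |
                 (param_addr & PAGE_OFFSET_MASK);
    return 0;
}
```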

> - replace libxc calls to mlock() with register_this_address() (which 
> could be a privcmd ioctl)

Yep. I think libxc would request via a privcmd ioctl. The kernel can 
extend the parameter-passing region, or allocate a subsection of the 
existing region, and mmap it into user space. It would also return to 
libxc the range of parameter-passing addresses that have been allocated 
to it.
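
A sketch of the "allocate a subsection" part, assuming a simple bump
allocator over a fixed-size region. The names alloc_param_range() and
struct param_range and the sizes are hypothetical, and freeing is left
out entirely:

```c
#include <stdint.h>

#define REGION_PAGES 16     /* hypothetical size of the parameter-passing region */
#define PAGE_BYTES   4096   /* assumed page size */

/* Range of parameter-passing addresses handed back to libxc: [start, end). */
struct param_range {
    uint64_t start;
    uint64_t end;
};

static uint64_t next_free_page;

/* Carve the next 'pages' pages out of the region, or fail if it is full. */
int alloc_param_range(uint64_t pages, struct param_range *out)
{
    if (pages == 0 || pages > REGION_PAGES - next_free_page)
        return -1;
    out->start = next_free_page * PAGE_BYTES;
    next_free_page += pages;
    out->end = next_free_page * PAGE_BYTES;
    return 0;
}
```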

> - libxc ignorantly creates its structures with userspace addresses

libxc would create structs with parameter-passing addresses.

  -- Keir


* Re: passing hypercall parameters by pointer
  2005-08-19 14:35       ` Keir Fraser
@ 2005-08-19 15:18         ` Hollis Blanchard
  2005-08-19 15:31           ` Keir Fraser
  0 siblings, 1 reply; 21+ messages in thread
From: Hollis Blanchard @ 2005-08-19 15:18 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Jimi Xenidis, Ian Pratt, Tian, Kevin, xen-devel

On Aug 19, 2005, at 9:35 AM, Keir Fraser wrote:
>
> On 19 Aug 2005, at 14:57, Hollis Blanchard wrote:
>
>> - replace libxc calls to mlock() with register_this_address() (which 
>> could be a privcmd ioctl)
>
> Yep. I think libxc would request via a privcmd ioctl. The kernel can 
> extend the parameter-passing region, or allocate a subsection of the 
> existing region, and mmap it into user space. It would also return to 
> libxc the range of parameter-passing addresses that have been 
> allocated to it.
>
>> - libxc ignorantly creates its structures with userspace addresses
>
> libxc would create structs with parameter-passing addresses.

Does "parameter-passing addresses" mean offsets inside the parameter 
passing space?

I think pseudocode is going to be more effective than English here. 
Let's take DOM0_PERFCCONTROL as an example:

main() {
     xc_perfc_desc_t *desc = malloc(sizeof(*desc));
     mlock(desc, sizeof(*desc)); // <------------- [1]
     xc_perfc_control(desc);
}

xc_perfc_control(xc_perfc_desc_t *desc) {
     dom0_op_t dop;

     dop.cmd = DOM0_PERFCCONTROL;
     dop.u.perfccontrol.desc = desc; // <------------ [2]
     do_dom0_op(&dop);
}

Even if you replace malloc/mlock at [1] with a call that maps 
"parameter passing" space into this process, what address will you put 
in the struct at [2]? That would have to be an offset within the 
parameter passing space, right?
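
For instance, if the region were mapped at some base address in the
process, the value stored at [2] would be computed along these lines.
This is only a sketch: struct param_region and param_offset() are
made-up names, and I am assuming the region is one contiguous mapping.

```c
#include <stddef.h>
#include <stdint.h>

/* Where the parameter-passing space is mapped in this process. */
struct param_region {
    unsigned char *base;  /* address returned by the (hypothetical) mapping call */
    size_t len;           /* length of the mapping in bytes */
};

/*
 * Convert a pointer inside the mapped region into an offset within the
 * parameter-passing space. Fails if [ptr, ptr + size) is not fully inside.
 */
int param_offset(const struct param_region *r, const void *ptr,
                 size_t size, uint64_t *offset)
{
    uintptr_t b = (uintptr_t)r->base;
    uintptr_t p = (uintptr_t)ptr;

    if (p < b || size > r->len || p - b > r->len - size)
        return -1;
    *offset = (uint64_t)(p - b);
    return 0;
}
```

The struct at [2] would then carry the offset rather than the process's
own virtual address.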

-- 
Hollis Blanchard
IBM Linux Technology Center


* Re: passing hypercall parameters by pointer
  2005-08-19 15:18         ` Hollis Blanchard
@ 2005-08-19 15:31           ` Keir Fraser
  0 siblings, 0 replies; 21+ messages in thread
From: Keir Fraser @ 2005-08-19 15:31 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: Jimi Xenidis, Ian Pratt, Tian, Kevin, xen-devel


On 19 Aug 2005, at 16:18, Hollis Blanchard wrote:

> Even if you replace malloc/mlock at [1] with a call that maps 
> "parameter passing" space into this process, what address will you put 
> in the struct at [2]? That would have to be an offset within the 
> parameter passing space, right?

Yes.

  -- Keir


end of thread, other threads:[~2005-08-19 15:31 UTC | newest]

Thread overview: 21+ messages
2005-08-17 20:44 passing hypercall parameters by pointer Ian Pratt
     [not found] ` <mailman.1124311483.4826@unix-os.sc.intel.com>
2005-08-17 21:07   ` Arun Sharma
2005-08-17 22:11     ` Hollis Blanchard
2005-08-17 22:04 ` Hollis Blanchard
  -- strict thread matches above, loose matches on Subject: below --
2005-08-19 12:41 Ian Pratt
2005-08-19 11:34 Ian Pratt
2005-08-19 11:52 ` Jimi Xenidis
2005-08-19 12:17   ` Keir Fraser
2005-08-19 13:57     ` Hollis Blanchard
2005-08-19 14:35       ` Keir Fraser
2005-08-19 15:18         ` Hollis Blanchard
2005-08-19 15:31           ` Keir Fraser
2005-08-19 12:20 ` Keir Fraser
2005-08-18  6:56 Tian, Kevin
2005-08-18 15:58 ` Hollis Blanchard
2005-08-19  2:00   ` Jimi Xenidis
2005-08-19 10:32     ` Keir Fraser
2005-08-18  6:56 Tian, Kevin
2005-08-18  6:56 Tian, Kevin
2005-08-18  0:47 Ling, Xiaofeng
2005-08-17 19:51 Hollis Blanchard
