From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: Ian Campbell <Ian.Campbell@eu.citrix.com>
Cc: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>
Subject: Re: [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify
Date: Thu, 22 Sep 2011 14:09:00 -0400 [thread overview]
Message-ID: <4E7B79BC.3090703@tycho.nsa.gov> (raw)
In-Reply-To: <1316680355.23371.35.camel@zakaz.uk.xensource.com>
On 09/22/2011 04:32 AM, Ian Campbell wrote:
> On Wed, 2011-09-21 at 18:07 +0100, Daniel De Graaf wrote:
>> On 09/21/2011 11:25 AM, Ian Campbell wrote:
>
>>> When I map a page how do I know what the offset of it is wrt the file
>>> descriptor? DO I just have to remember how many pages I mapped an *=
>>> 4096?
>>
>> You had the offset at the time you mapped it, because that's what you
>> passed in the offset parameter to mmap(). Just don't lose the number :)
>
> So I guess my followup question is where does the number I pass to mmap
> come from...
>
> /me scrobbles in the code.
>
> Aha, so it is an output from the gntdev/gntalloc ioctls. So how about:
>
> /* IN parameters */
> /*
> * Offset in the file descriptor for a byte within the page. This offset
> * is the result of the IOCTL_GNTDEV_MAP_GRANT_REF and is the same as
> * is used with mmap().
> */
>
Sounds good.
>>>>>> +};
>>>>>> +
>>>>>> +/* Clear (set to zero) the byte specified by index */
>>>>>> +#define UNMAP_NOTIFY_CLEAR_BYTE 0x1
>>>>>> +/* Send an interrupt on the indicated event channel */
>>>>>> +#define UNMAP_NOTIFY_SEND_EVENT 0x2
>>>>>> +
>>>>>> #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
>>>>>> diff --git a/tools/libxc/xc_gnttab.c b/tools/libxc/xc_gnttab.c
>>>>>> index 4f55fce..3d3c63b 100644
>>>>>> --- a/tools/libxc/xc_gnttab.c
>>>>>> +++ b/tools/libxc/xc_gnttab.c
>>>>>> @@ -18,6 +18,7 @@
>>>>>> */
>>>>>>
>>>>>> #include "xc_private.h"
>>>>>> +#include <errno.h>
>>>>>>
>>>>>> int xc_gnttab_op(xc_interface *xch, int cmd, void * op, int op_size, int count)
>>>>>> {
>>>>>> @@ -174,6 +175,28 @@ void *xc_gnttab_map_domain_grant_refs(xc_gnttab *xcg,
>>>>>> count, domid, refs, prot);
>>>>>> }
>>>>>>
>>>>>> +void *xc_gnttab_map_grant_ref_notify(xc_gnttab *xcg,
>>>>>> + uint32_t domid,
>>>>>> + uint32_t ref,
>>>>>> + uint32_t notify_offset,
>>>>>> + evtchn_port_t notify_port,
>>>>>> + int *notify_result)
>>>>>> +{
>>>>>> + if (xcg->ops->u.gnttab.map_grant_ref_notify)
>>>>>> + return xcg->ops->u.gnttab.map_grant_ref_notify(xcg, xcg->ops_handle,
>>>>>> + domid, ref, notify_offset, notify_port, notify_result);
>>>>>> + else {
>>>>>> + void* area = xcg->ops->u.gnttab.map_grant_ref(xcg, xcg->ops_handle,
>>>>>> + domid, ref, PROT_READ|PROT_WRITE);
>>>>>> + if (area && notify_result) {
>>>>>> + *notify_result = -1;
>>>>>> + errno = ENOSYS;
>>>>>> + }
>>>>>> + return area;
>>>>>> + }
>>>>>
>>>>> I think the new public interface is fine but do we really need a new
>>>>> internal interface here?
>>>>>
>>>>> I think you can just add the notify_* arguments to the existing OSDEP
>>>>> function and have those OS backends which don't implement that feature
>>>>> return ENOSYS if notify_offset != 0 (or ~0 or whatever invalid value
>>>>> works).
>>>>>
>>>>> Why doesn't the *_notify variant take a prot argument?
>>>>
>>>> At least for the byte-clear portion of the notify, the page must be writable
>>>> or the notify will not work. I suppose an event channel alone could be used
>>>> for a read-only mapping, but normally the unmapping of a read-only mapping is
>>>> not an important event - although I guess you could contrive a use for this
>>>> type of notificaiton, so there's no reason not to allow it.
>>>
>>> If you combine this the two functions then returning EINVAL for attempts
>>> to map without PROT_WRITE (or whatever perm is necessary) would be
>>> reasonable IMHO.
>>>
>>
>> The ioctl already prevents you from requesting the impossible, so this should
>> just work.
>
> Even better. I see the check in the gntdev driver but not in the
> gntalloc one, is that right?
>
Yes. The unmap notification will always work in gntalloc because the
shared page is owned locally, and is always writable there; read-only
refers to the mapping domain's ability to write.
>
>>> I hadn't noticed that we have both map_gratn_ref and map_grant_refs,
>>> that seems like an area which could be cleaned up. (not saying you
>>> should, just noticing it)
>>
>> Since I'm already rewriting the osdep layer functions, I think I can replace
>> all 3-4 existing map functions with a single function.
>
> Awesome, thanks!
>
>> It looks like the current 2.6.32.x xen kernels also aren't returning EAGAIN,
>> so I'm unsure as to why this support was added. The commit in question is
>> 20689:fe42b16855aa by Grzegorz Milos (committed by Keir Fraser), but I don't
>> see any discussion on xen-devel for it.
>
> IIRC it was related to the page sharing patches which can cause grant
> hypercalls to return EAGAIN if the granted page is swapped or shared. I
> think this can only happen to backend/dom0 type operations.
>
> I think the reason it needs to return to the guest is that the paging
> daemon may be in the same domain and having the only vcpu block in the
> hypercall would deadlock.
>
>> It's also unclear why repeating the
>> request every millisecond in an infinite loop is better than letting the
>> caller handle an -EAGAIN.
>
> Yeah, the millisecond thing is pretty gross, something like
> sched_yield() might be a bit more palatable (if it's portable enough,
> although sleep could be a fallback if not)
>
> I suppose that hiding the EAGAIN handling in the library was just
> thought to be convenient, compared with changing all the existing users.
>
> Ian.
>
It sounds to me like this is best solved in the kernel, although it would
still have to invoke some kind of yield operation since I assume the kernel
can't tell when the hypercall will not block (ideally you would be able to
put the invoking process to sleep pending the cross-domain page fault).
For now, it sounds like the best solution is to keep the usleep-based loop
in for all gntdev invocations. Using sched_yield will cause the CPU to spin
while the page is pending from disk or possibly while waiting for dom0 to
be scheduled (assuming a domU-domU vchan).
--
Daniel De Graaf
National Security Agency
next prev parent reply other threads:[~2011-09-22 18:09 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-19 14:38 [PATCH] libvchan: interdomain communications library Daniel De Graaf
2011-08-22 7:40 ` Vasiliy G Tolstov
2011-08-24 19:28 ` Konrad Rzeszutek Wilk
2011-08-22 9:15 ` Ian Campbell
2011-08-24 18:52 ` Daniel De Graaf
2011-08-25 10:27 ` Tim Deegan
2011-08-25 18:36 ` Jeremy Fitzhardinge
2011-08-24 18:52 ` [PATCH v2] " Daniel De Graaf
2011-08-26 14:01 ` Ian Jackson
2011-08-29 18:48 ` [PATCH v3] " Daniel De Graaf
2011-08-30 10:32 ` Ian Campbell
2011-08-31 19:17 ` Daniel De Graaf
2011-09-01 16:28 ` Ian Campbell
2011-09-01 16:47 ` Daniel De Graaf
2011-09-01 16:56 ` Ian Jackson
2011-09-01 17:46 ` Daniel De Graaf
2011-09-01 16:22 ` [PATCH v4 0/3] " Daniel De Graaf
2011-09-01 16:22 ` [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify Daniel De Graaf
2011-09-01 19:29 ` Konrad Rzeszutek Wilk
2011-09-01 16:22 ` [PATCH 2/3] libxc: add xc_gntshr_* functions Daniel De Graaf
2011-09-01 19:24 ` Konrad Rzeszutek Wilk
2011-09-01 16:22 ` [PATCH 3/3] libvchan: interdomain communications library Daniel De Graaf
2011-09-19 22:43 ` [PATCH v5 0/3] Daniel De Graaf
2011-09-19 22:43 ` [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify Daniel De Graaf
2011-09-21 10:03 ` Ian Campbell
2011-09-21 15:02 ` Daniel De Graaf
2011-09-21 15:25 ` Ian Campbell
2011-09-21 17:07 ` Daniel De Graaf
2011-09-22 8:32 ` Ian Campbell
2011-09-22 18:09 ` Daniel De Graaf [this message]
2011-09-19 22:43 ` [PATCH 2/3] libxc: add xc_gntshr_* functions Daniel De Graaf
2011-09-21 10:13 ` Ian Campbell
2011-09-19 22:43 ` [PATCH 3/3] libvchan: interdomain communications library Daniel De Graaf
2011-09-21 10:53 ` Ian Campbell
2011-09-21 16:31 ` Daniel De Graaf
2011-09-22 8:18 ` Ian Campbell
2011-09-21 13:44 ` Ian Campbell
2011-09-22 22:14 ` [PATCH v6 0/3] libxenvchan: " Daniel De Graaf
2011-09-22 22:14 ` [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify Daniel De Graaf
2011-09-30 9:16 ` Ian Campbell
2011-09-30 14:12 ` Ian Jackson
2011-09-30 14:17 ` Ian Campbell
2011-09-22 22:14 ` [PATCH 2/3] libxc: add xc_gntshr_* functions Daniel De Graaf
2011-09-22 22:14 ` [PATCH 3/3] libvchan: interdomain communications library Daniel De Graaf
2011-09-30 7:51 ` [PATCH v6 0/3] libxenvchan: " Vasiliy Tolstov
2011-09-30 8:28 ` Vasiliy Tolstov
2011-09-30 14:40 ` Daniel De Graaf
2011-11-24 20:02 ` Anil Madhavapeddy
2011-11-25 16:53 ` Daniel De Graaf
2011-11-25 16:54 ` [PATCH] libxc: Fix checks on grant notify arguments Daniel De Graaf
2011-12-01 18:20 ` Ian Jackson
2011-11-25 16:56 ` [PATCH 1/2] xen/events: prevent calling evtchn_get on invlaid channels Daniel De Graaf
2011-11-25 16:56 ` [PATCH 2/2] xen/gntalloc: release grant references on page free Daniel De Graaf
2011-11-25 18:37 ` [PATCH] xen/gntalloc: fix reference counts on multi-page mappings Daniel De Graaf
2011-09-30 8:34 ` [PATCH v6 0/3] libxenvchan: interdomain communications library Ian Campbell
2011-09-30 8:37 ` [PATCH] libxc: osdep: report missing backends in common code Ian Campbell
2011-10-06 18:44 ` [PATCH v6 0/3] libxenvchan: interdomain communications library Ian Jackson
2011-10-07 8:41 ` Roger Pau Monné
2011-10-07 9:15 ` Keir Fraser
2011-10-07 9:48 ` Ian Jackson
2011-10-07 10:22 ` Roger Pau Monné
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E7B79BC.3090703@tycho.nsa.gov \
--to=dgdegra@tycho.nsa.gov \
--cc=Ian.Campbell@eu.citrix.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=Stefano.Stabellini@eu.citrix.com \
--cc=konrad.wilk@oracle.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.