From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Adin Scannell <adin@gridcentric.com>
Cc: olaf@aepfle.de, xen-devel@lists.xensource.com,
Adin Scannell <adin@scannell.ca>,
andres@gridcentric.ca, JBeulich@suse.com,
Konrad Rzeszutek Wilk <konrad@darnok.org>
Subject: Re: [PATCH 3/3] Port of mmap_batch_v2 to support paging in Xen
Date: Sat, 17 Dec 2011 16:29:51 -0500 [thread overview]
Message-ID: <20111217212951.GA17479@phenom.dumpdata.com> (raw)
In-Reply-To: <CAAJKtqrcMv+XfN_-X8O-FqbifOZOqp8i3FQa5qPg+05-PiZtqg@mail.gmail.com>
On Sat, Dec 17, 2011 at 11:51:11AM -0500, Adin Scannell wrote:
> Hi Konrad,
>
> Thanks for the quick turnaround. I'll incorporate your feedback and
> resend. Some NOTES are inline below.
>
> On Sat, Dec 17, 2011 at 9:40 AM, Konrad Rzeszutek Wilk
> <konrad@darnok.org> wrote:
> > On Fri, Dec 16, 2011 at 10:22:21PM -0500, Adin Scannell wrote:
> >> This wasn't ported from any patch, but was rewritten based on the XCP 2.6.32
> >> tree. The code structure is significantly different and this patch mirrors the
> >> existing Linux code.
> >>
> >> The primary reason for need the V2 interface is to support foreign mappings
> >> (i.e. qemu) of paged-out pages. The libxc code will already retry mappings
> >> when an ENOENT is returned. The V2 interface provides a richer error value,
> >> so the user-space code is capable of handling these errors specifically.
> >
> > Can you give more details on how to use paged-out pages. Perhaps a
> > pointer to the xen's docs?
>
> Hrm, in usual fashion the docs are a bit lacking.
>
> Simply: the kernel has to do nothing to support paging (other than the
> backend drivers which need to handle the grant EAGAIN case -- other
> patch). The only reason the V2 interface is needed here is to pass
Hm I did not see the netback one? Should that also incorporate this?
> back full error codes from the mmu_update()'s. Xen and libxc have a
> mutual understanding that when you receive an ENOENT error code, you
> back off and try again.
What you described is pretty good. Please do include it in the git
description. Thx
>
> >>
> >> Signed-off-by: Adin Scannell <adin@scannell.ca>
> >>
> >> Index: linux/drivers/xen/xenfs/privcmd.c
> >> ===================================================================
> >> ---
> >> drivers/xen/xenfs/privcmd.c | 90 ++++++++++++++++++++++++++++++++++++++++++-
> >
> > So that file just moved to drivers/xen/privcmd.c
>
> Of course it has :)
Well, we can't have it easy can we! :-)
>
> >> include/xen/privcmd.h | 10 +++++
> >> 2 files changed, 99 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/drivers/xen/xenfs/privcmd.c b/drivers/xen/xenfs/privcmd.c
> >> index dbd3b16..21cbb5a 100644
> >> --- a/drivers/xen/xenfs/privcmd.c
> >> +++ b/drivers/xen/xenfs/privcmd.c
> >> @@ -70,7 +70,7 @@ static void free_page_list(struct list_head *pages)
> >> */
> >> static int gather_array(struct list_head *pagelist,
> >> unsigned nelem, size_t size,
> >> - void __user *data)
> >> + const void __user *data)
> >> {
> >> unsigned pageidx;
> >> void *pagedata;
> >> @@ -245,6 +245,15 @@ struct mmap_batch_state {
> >> xen_pfn_t __user *user;
> >> };
> >>
> >> +struct mmap_batch_v2_state {
> >> + domid_t domain;
> >> + unsigned long va;
> >> + struct vm_area_struct *vma;
> >> + int paged_out;
> >
> > Should this be unsigned int?
>
> Noted below.
>
> >> +
> >> + int __user *err;
> >> +};
> >> +
> >> static int mmap_batch_fn(void *data, void *state)
> >> {
> >> xen_pfn_t *mfnp = data;
> >> @@ -260,6 +269,20 @@ static int mmap_batch_fn(void *data, void *state)
> >> return 0;
> >> }
> >>
> >> +static int mmap_batch_v2_fn(void *data, void *state)
> >> +{
> >> + xen_pfn_t *mfnp = data;
> >> + struct mmap_batch_v2_state *st = state;
> >> +
> >> + int rc = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, *mfnp, 1,
> >> + st->vma->vm_page_prot, st->domain);
> >
> > You don't want to check that st is not NULL?
>
> This could be an assertion as it's only used in the
> ioctl_mmap_batch_v2 function where it's guaranteed to pass in a
> non-null state.
>
> I'll add an additional patch to the queue that adds these assertions
> to both mmap_batch_fn and mmap_batch_fn_v2.
>
> >> + st->paged_out++;
> > Is it possible that this ends overflowing and hitting 0?
>
> I don't think this is an issue practically, as the only way to trigger
> this would be to be on a 64bit machine and map an ungodly number of
> pages with a single mmap_batch (MAX_INT). For correctness though, I
> can update this variable and the err variable in mmap_batch_state
> which suffers from the same problem -- turn them into unsigned long.
> This will be included in the additional patch.
> >> + m.arr);
> >> +
> >> + if (ret || list_empty(&pagelist))
> >> + goto out;
> >> +
> >> + down_write(&mm->mmap_sem);
> >> +
> >> + vma = find_vma(mm, m.addr);
> >> + ret = -EINVAL;
> >> + /* We allow multiple shots here, because this interface
> >> + * is used by libxc and mappings for specific pages will
> >> + * be retried when pages are paged-out (ENOENT). */
> >> + if (!vma ||
> >> + vma->vm_ops != &privcmd_vm_ops ||
> >> + (m.addr < vma->vm_start) ||
> >> + ((m.addr + (nr_pages << PAGE_SHIFT)) > vma->vm_end)) {
> >> + up_write(&mm->mmap_sem);
> >> + goto out;
> >> + }
> >> +
> >> + state.domain = m.dom;
> >
> > Should you check the m.dom for incorrect ones? Like -1? or DOMID_IO?
>
> I followed the example for mmap_batch, which just tries the call and
> returns whatever error Xen gives. I think that's the right approach
> for these.
OK. I think a another patch to actually check for that in every other
ioclt in this code might be worth it.
>
> Cheers!
> -Adin
next prev parent reply other threads:[~2011-12-17 21:29 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-17 3:22 [PATCH] Add necessary bits to pvops Linux for mapping paged-out pages Adin Scannell
2011-12-17 3:22 ` [PATCH 1/3] Make xen_remap_domain_mfn_range return value meaningful in case of error Adin Scannell
2011-12-17 3:22 ` [PATCH 2/3] Handle GNTST_eagain in kernel drivers Adin Scannell
2011-12-17 14:30 ` Konrad Rzeszutek Wilk
2011-12-17 16:53 ` Adin Scannell
2011-12-17 21:31 ` Konrad Rzeszutek Wilk
2012-01-02 16:06 ` Olaf Hering
2012-01-03 18:19 ` Konrad Rzeszutek Wilk
2012-01-03 18:40 ` Olaf Hering
2012-01-03 18:48 ` Konrad Rzeszutek Wilk
2011-12-17 3:22 ` [PATCH 3/3] Port of mmap_batch_v2 to support paging in Xen Adin Scannell
2011-12-17 14:40 ` Konrad Rzeszutek Wilk
2011-12-17 16:51 ` Adin Scannell
2011-12-17 21:29 ` Konrad Rzeszutek Wilk [this message]
2011-12-17 3:49 ` [PATCH] Add necessary bits to pvops Linux for mapping paged-out pages Adin Scannell
2011-12-17 14:16 ` Konrad Rzeszutek Wilk
2011-12-17 14:16 ` Konrad Rzeszutek Wilk
2012-01-02 16:06 ` Olaf Hering
-- strict thread matches above, loose matches on Subject: below --
2011-12-20 6:36 [PATCH 0/3] Add kernel bits necessary to suport Xen paging Adin Scannell
2011-12-20 6:36 ` [PATCH 3/3] Port of mmap_batch_v2 to support paging in Xen Adin Scannell
2011-12-20 18:11 ` Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111217212951.GA17479@phenom.dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=JBeulich@suse.com \
--cc=adin@gridcentric.com \
--cc=adin@scannell.ca \
--cc=andres@gridcentric.ca \
--cc=konrad@darnok.org \
--cc=olaf@aepfle.de \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.