From: Robert Jennings <rcj@linux.vnet.ibm.com>
To: Dave Hansen <dave@sr71.net>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, Alexander Viro <viro@zeniv.linux.org.uk>,
Rik van Riel <riel@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Matt Helsley <matt.helsley@gmail.com>,
Anthony Liguori <anthony@codemonkey.ws>,
Michael Roth <mdroth@linux.vnet.ibm.com>,
Lei Li <lilei@linux.vnet.ibm.com>,
Leonardo Garcia <lagarcia@linux.vnet.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 2/2] vmsplice: Add limited zero copy to vmsplice
Date: Tue, 8 Oct 2013 12:35:21 -0500 [thread overview]
Message-ID: <20131008173521.GA6129@linux.vnet.ibm.com> (raw)
In-Reply-To: <525436A5.20808@sr71.net>
* Dave Hansen (dave@sr71.net) wrote:
> On 10/07/2013 01:21 PM, Robert C Jennings wrote:
> > + if (!buf->offset && (buf->len == PAGE_SIZE) &&
> > + (buf->flags & PIPE_BUF_FLAG_GIFT) && (sd->flags & SPLICE_F_MOVE)) {
> > + struct page *page = buf->page;
> > + struct mm_struct *mm;
> > + struct vm_area_struct *vma;
> > + spinlock_t *ptl;
> > + pte_t *ptep, pte;
> > + unsigned long useraddr;
> > +
> > + if (!PageAnon(page))
> > + goto copy;
> > + if (PageCompound(page))
> > + goto copy;
> > + if (PageHuge(page) || PageTransHuge(page))
> > + goto copy;
> > + if (page_mapped(page))
> > + goto copy;
>
> I'd really like to see some comments about those cases. You touched on
> page_mapped() above, but could you replicate some of that in a comment?
Yes, I'll add comments in the code for these cases.
> Also, considering that this is being targeted at QEMU VMs, I would
> imagine that you're going to want to support PageTransHuge() in here
> pretty fast. Do you anticipate that being very much trouble? Have you
> planned for it in here?
My focus with this patch set was to get agreement on the change in the
first patch of the vmsplice syscall flags to perform page flipping rather
than copying.
I am working on support of PageTransHuge() but it is not complete.
It reworks this function to coalesce PAGE_SIZE pipe buffers into THP-sized
units and operate on those.
> > + useraddr = (unsigned long)sd->u.userptr;
> > + mm = current->mm;
> > +
> > + ret = -EAGAIN;
> > + down_read(&mm->mmap_sem);
> > + vma = find_vma_intersection(mm, useraddr, useraddr + PAGE_SIZE);
>
> If oyu are only doing these a page at a time, why bother with
> find_vma_intersection()? Why not a plain find_vma()?
You're correct, I can change this to use find_vma().
> Also, if we fail to find a VMA, won't this return -EAGAIN? That seems
> like a rather uninformative error code to get returned back out to
> userspace, especially since retrying won't help.
Yes, -EAGAIN is not good for this case, I will use -EFAULT.
> > + if (IS_ERR_OR_NULL(vma))
> > + goto up_copy;
> > + if (!vma->anon_vma) {
> > + ret = anon_vma_prepare(vma);
> > + if (ret)
> > + goto up_copy;
> > + }
>
> The first thing anon_vma_prepare() does is check vma->anon_vma. This
> extra check seems unnecessary.
I'll fix this, thanks.
> > + zap_page_range(vma, useraddr, PAGE_SIZE, NULL);
> > + ret = lock_page_killable(page);
> > + if (ret)
> > + goto up_copy;
> > + ptep = get_locked_pte(mm, useraddr, &ptl);
> > + if (!ptep)
> > + goto unlock_up_copy;
> > + pte = *ptep;
> > + if (pte_present(pte))
> > + goto unlock_up_copy;
> > + get_page(page);
> > + page_add_anon_rmap(page, vma, useraddr);
> > + pte = mk_pte(page, vma->vm_page_prot);
>
> 'pte' is getting used for two different things here, which makes it a
> bit confusing. I'd probably just skip this first assignment and
> directly do:
>
> if (pte_present(*ptep))
> goto unlock_up_copy;
I'll fix this, thanks.
> > + set_pte_at(mm, useraddr, ptep, pte);
> > + update_mmu_cache(vma, useraddr, ptep);
> > + pte_unmap_unlock(ptep, ptl);
> > + ret = 0;
> > +unlock_up_copy:
> > + unlock_page(page);
> > +up_copy:
> > + up_read(&mm->mmap_sem);
> > + if (!ret) {
> > + ret = sd->len;
> > + goto out;
> > + }
> > + /* else ret < 0 and we should fallback to copying */
> > + VM_BUG_ON(ret > 0);
> > + }
>
> This also screams to be broken out in to a helper function instead of
> just being thrown in with the existing code.
You're right, it's very self-contained already. I'll pull it out.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-10-08 17:35 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-07 20:21 [PATCH 0/2] vmpslice support for zero-copy gifting of pages Robert C Jennings
2013-10-07 20:21 ` [PATCH 1/2] vmsplice: unmap gifted pages for recipient Robert C Jennings
2013-10-08 16:14 ` Dave Hansen
2013-10-08 19:48 ` Robert Jennings
2013-10-08 21:22 ` Dave Hansen
2013-10-08 16:23 ` Dave Hansen
2013-10-17 13:54 ` Robert Jennings
2013-10-17 10:20 ` Vlastimil Babka
2013-10-17 13:48 ` Robert Jennings
2013-10-18 8:10 ` Vlastimil Babka
2013-10-07 20:21 ` [PATCH 2/2] vmsplice: Add limited zero copy to vmsplice Robert C Jennings
2013-10-08 16:45 ` Dave Hansen
2013-10-08 17:35 ` Robert Jennings [this message]
2013-10-17 11:23 ` Vlastimil Babka
2013-10-17 13:44 ` Robert Jennings
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131008173521.GA6129@linux.vnet.ibm.com \
--to=rcj@linux.vnet.ibm.com \
--cc=aarcange@redhat.com \
--cc=anthony@codemonkey.ws \
--cc=dave@sr71.net \
--cc=lagarcia@linux.vnet.ibm.com \
--cc=lilei@linux.vnet.ibm.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=matt.helsley@gmail.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=riel@redhat.com \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).