All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave@sr71.net>
To: Robert C Jennings <rcj@linux.vnet.ibm.com>, linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Rik van Riel <riel@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Matt Helsley <matt.helsley@gmail.com>,
	Anthony Liguori <anthony@codemonkey.ws>,
	Michael Roth <mdroth@linux.vnet.ibm.com>,
	Lei Li <lilei@linux.vnet.ibm.com>,
	Leonardo Garcia <lagarcia@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 2/2] vmsplice: Add limited zero copy to vmsplice
Date: Tue, 08 Oct 2013 09:45:25 -0700	[thread overview]
Message-ID: <525436A5.20808@sr71.net> (raw)
In-Reply-To: <1381177293-27125-3-git-send-email-rcj@linux.vnet.ibm.com>

On 10/07/2013 01:21 PM, Robert C Jennings wrote:
> +	if (!buf->offset && (buf->len == PAGE_SIZE) &&
> +	    (buf->flags & PIPE_BUF_FLAG_GIFT) && (sd->flags & SPLICE_F_MOVE)) {
> +		struct page *page = buf->page;
> +		struct mm_struct *mm;
> +		struct vm_area_struct *vma;
> +		spinlock_t *ptl;
> +		pte_t *ptep, pte;
> +		unsigned long useraddr;
> +
> +		if (!PageAnon(page))
> +			goto copy;
> +		if (PageCompound(page))
> +			goto copy;
> +		if (PageHuge(page) || PageTransHuge(page))
> +			goto copy;
> +		if (page_mapped(page))
> +			goto copy;

I'd really like to see some comments about those cases.  You touched on
page_mapped() above, but could you replicate some of that in a comment?

Also, considering that this is being targeted at QEMU VMs, I would
imagine that you're going to want to support PageTransHuge() in here
pretty fast.  Do you anticipate that being very much trouble?  Have you
planned for it in here?

> +		useraddr = (unsigned long)sd->u.userptr;
> +		mm = current->mm;
> +
> +		ret = -EAGAIN;
> +		down_read(&mm->mmap_sem);
> +		vma = find_vma_intersection(mm, useraddr, useraddr + PAGE_SIZE);

If oyu are only doing these a page at a time, why bother with
find_vma_intersection()?  Why not a plain find_vma()?

Also, if we fail to find a VMA, won't this return -EAGAIN?  That seems
like a rather uninformative error code to get returned back out to
userspace, especially since retrying won't help.

> +		if (IS_ERR_OR_NULL(vma))
> +			goto up_copy;
> +		if (!vma->anon_vma) {
> +			ret = anon_vma_prepare(vma);
> +			if (ret)
> +				goto up_copy;
> +		}

The first thing anon_vma_prepare() does is check vma->anon_vma.  This
extra check seems unnecessary.

> +		zap_page_range(vma, useraddr, PAGE_SIZE, NULL);
> +		ret = lock_page_killable(page);
> +		if (ret)
> +			goto up_copy;
> +		ptep = get_locked_pte(mm, useraddr, &ptl);
> +		if (!ptep)
> +			goto unlock_up_copy;
> +		pte = *ptep;
> +		if (pte_present(pte))
> +			goto unlock_up_copy;
> +		get_page(page);
> +		page_add_anon_rmap(page, vma, useraddr);
> +		pte = mk_pte(page, vma->vm_page_prot);

'pte' is getting used for two different things here, which makes it a
bit confusing.  I'd probably just skip this first assignment and
directly do:

		if (pte_present(*ptep))
			goto unlock_up_copy;

> +		set_pte_at(mm, useraddr, ptep, pte);
> +		update_mmu_cache(vma, useraddr, ptep);
> +		pte_unmap_unlock(ptep, ptl);
> +		ret = 0;
> +unlock_up_copy:
> +		unlock_page(page);
> +up_copy:
> +		up_read(&mm->mmap_sem);
> +		if (!ret) {
> +			ret = sd->len;
> +			goto out;
> +		}
> +		/* else ret < 0 and we should fallback to copying */
> +		VM_BUG_ON(ret > 0);
> +	}

This also screams to be broken out in to a helper function instead of
just being thrown in with the existing code.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Hansen <dave@sr71.net>
To: Robert C Jennings <rcj@linux.vnet.ibm.com>, linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Rik van Riel <riel@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Matt Helsley <matt.helsley@gmail.com>,
	Anthony Liguori <anthony@codemonkey.ws>,
	Michael Roth <mdroth@linux.vnet.ibm.com>,
	Lei Li <lilei@linux.vnet.ibm.com>,
	Leonardo Garcia <lagarcia@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 2/2] vmsplice: Add limited zero copy to vmsplice
Date: Tue, 08 Oct 2013 09:45:25 -0700	[thread overview]
Message-ID: <525436A5.20808@sr71.net> (raw)
In-Reply-To: <1381177293-27125-3-git-send-email-rcj@linux.vnet.ibm.com>

On 10/07/2013 01:21 PM, Robert C Jennings wrote:
> +	if (!buf->offset && (buf->len == PAGE_SIZE) &&
> +	    (buf->flags & PIPE_BUF_FLAG_GIFT) && (sd->flags & SPLICE_F_MOVE)) {
> +		struct page *page = buf->page;
> +		struct mm_struct *mm;
> +		struct vm_area_struct *vma;
> +		spinlock_t *ptl;
> +		pte_t *ptep, pte;
> +		unsigned long useraddr;
> +
> +		if (!PageAnon(page))
> +			goto copy;
> +		if (PageCompound(page))
> +			goto copy;
> +		if (PageHuge(page) || PageTransHuge(page))
> +			goto copy;
> +		if (page_mapped(page))
> +			goto copy;

I'd really like to see some comments about those cases.  You touched on
page_mapped() above, but could you replicate some of that in a comment?

Also, considering that this is being targeted at QEMU VMs, I would
imagine that you're going to want to support PageTransHuge() in here
pretty fast.  Do you anticipate that being very much trouble?  Have you
planned for it in here?

> +		useraddr = (unsigned long)sd->u.userptr;
> +		mm = current->mm;
> +
> +		ret = -EAGAIN;
> +		down_read(&mm->mmap_sem);
> +		vma = find_vma_intersection(mm, useraddr, useraddr + PAGE_SIZE);

If oyu are only doing these a page at a time, why bother with
find_vma_intersection()?  Why not a plain find_vma()?

Also, if we fail to find a VMA, won't this return -EAGAIN?  That seems
like a rather uninformative error code to get returned back out to
userspace, especially since retrying won't help.

> +		if (IS_ERR_OR_NULL(vma))
> +			goto up_copy;
> +		if (!vma->anon_vma) {
> +			ret = anon_vma_prepare(vma);
> +			if (ret)
> +				goto up_copy;
> +		}

The first thing anon_vma_prepare() does is check vma->anon_vma.  This
extra check seems unnecessary.

> +		zap_page_range(vma, useraddr, PAGE_SIZE, NULL);
> +		ret = lock_page_killable(page);
> +		if (ret)
> +			goto up_copy;
> +		ptep = get_locked_pte(mm, useraddr, &ptl);
> +		if (!ptep)
> +			goto unlock_up_copy;
> +		pte = *ptep;
> +		if (pte_present(pte))
> +			goto unlock_up_copy;
> +		get_page(page);
> +		page_add_anon_rmap(page, vma, useraddr);
> +		pte = mk_pte(page, vma->vm_page_prot);

'pte' is getting used for two different things here, which makes it a
bit confusing.  I'd probably just skip this first assignment and
directly do:

		if (pte_present(*ptep))
			goto unlock_up_copy;

> +		set_pte_at(mm, useraddr, ptep, pte);
> +		update_mmu_cache(vma, useraddr, ptep);
> +		pte_unmap_unlock(ptep, ptl);
> +		ret = 0;
> +unlock_up_copy:
> +		unlock_page(page);
> +up_copy:
> +		up_read(&mm->mmap_sem);
> +		if (!ret) {
> +			ret = sd->len;
> +			goto out;
> +		}
> +		/* else ret < 0 and we should fallback to copying */
> +		VM_BUG_ON(ret > 0);
> +	}

This also screams to be broken out in to a helper function instead of
just being thrown in with the existing code.


  reply	other threads:[~2013-10-08 16:45 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-07 20:21 [PATCH 0/2] vmpslice support for zero-copy gifting of pages Robert C Jennings
2013-10-07 20:21 ` Robert C Jennings
2013-10-07 20:21 ` [PATCH 1/2] vmsplice: unmap gifted pages for recipient Robert C Jennings
2013-10-07 20:21   ` Robert C Jennings
2013-10-08 16:14   ` Dave Hansen
2013-10-08 16:14     ` Dave Hansen
2013-10-08 19:48     ` Robert Jennings
2013-10-08 19:48       ` Robert Jennings
2013-10-08 21:22       ` Dave Hansen
2013-10-08 21:22         ` Dave Hansen
2013-10-08 16:23   ` Dave Hansen
2013-10-08 16:23     ` Dave Hansen
2013-10-17 13:54     ` Robert Jennings
2013-10-17 13:54       ` Robert Jennings
2013-10-17 10:20   ` Vlastimil Babka
2013-10-17 10:20     ` Vlastimil Babka
2013-10-17 13:48     ` Robert Jennings
2013-10-17 13:48       ` Robert Jennings
2013-10-18  8:10       ` Vlastimil Babka
2013-10-18  8:10         ` Vlastimil Babka
2013-10-07 20:21 ` [PATCH 2/2] vmsplice: Add limited zero copy to vmsplice Robert C Jennings
2013-10-07 20:21   ` Robert C Jennings
2013-10-08 16:45   ` Dave Hansen [this message]
2013-10-08 16:45     ` Dave Hansen
2013-10-08 17:35     ` Robert Jennings
2013-10-08 17:35       ` Robert Jennings
2013-10-17 11:23   ` Vlastimil Babka
2013-10-17 11:23     ` Vlastimil Babka
2013-10-17 13:44     ` Robert Jennings
2013-10-17 13:44       ` Robert Jennings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=525436A5.20808@sr71.net \
    --to=dave@sr71.net \
    --cc=aarcange@redhat.com \
    --cc=anthony@codemonkey.ws \
    --cc=lagarcia@linux.vnet.ibm.com \
    --cc=lilei@linux.vnet.ibm.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matt.helsley@gmail.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=rcj@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.