virtualization.lists.linux-foundation.org archive mirror
From: Rusty Russell <rusty@rustcorp.com.au>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Max Krasnyansky <maxk@qualcomm.com>,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 5/5] tun: vringfd xmit support.
Date: Sat, 19 Apr 2008 01:15:15 +1000
Message-ID: <200804190115.15983.rusty@rustcorp.com.au>
In-Reply-To: <20080418043120.ff78eab5.akpm@linux-foundation.org>

On Friday 18 April 2008 21:31:20 Andrew Morton wrote:
> On Fri, 18 Apr 2008 14:43:24 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
> > +		/* How many pages will this take? */
> > +		npages = 1 + (base + len - 1)/PAGE_SIZE - base/PAGE_SIZE;
>
> Brain hurts.  I hope you got that right.

I tested it when I wrote it, but just wrote a tester again:

base            len     npages
0               1       1
0xfff           1       1
0x1000          1       1
0               4096    1
0x1             4096    2
0xfff           4096    2
0x1000          4096    1
0xfffff000      4096    1
0xfffff000      4097    4293918722
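
A minimal sketch of such a tester follows. It is not the program used for the
table above: it assumes 4096-byte pages and 32-bit unsigned arithmetic (which
is what the wrapped value in the last row implies), and the names npages_for()
and print_npages_table() are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* The patch's expression, evaluated in 32-bit unsigned arithmetic. */
static uint32_t npages_for(uint32_t base, uint32_t len)
{
	return 1 + (base + len - 1) / PAGE_SIZE - base / PAGE_SIZE;
}

/* Print one row per (base, len) pair from the table in the mail. */
static void print_npages_table(FILE *out)
{
	static const struct { uint32_t base, len; } cases[] = {
		{ 0, 1 }, { 0xfff, 1 }, { 0x1000, 1 },
		{ 0, 4096 }, { 0x1, 4096 }, { 0xfff, 4096 }, { 0x1000, 4096 },
		{ 0xfffff000u, 4096 },
		{ 0xfffff000u, 4097 },	/* base + len - 1 wraps past 2^32 */
	};
	size_t i;

	fprintf(out, "base            len     npages\n");
	for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
		fprintf(out, "%#-16" PRIx32 "%-8" PRIu32 "%" PRIu32 "\n",
			cases[i].base, cases[i].len,
			npages_for(cases[i].base, cases[i].len));
}
```

Calling print_npages_table(stdout) from main() prints the rows. The last case
is the interesting one: base + len - 1 wraps to 0, so the result is
1 - 0xfffff modulo 2^32.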

> > +		if (unlikely(num_pg + npages > MAX_SKB_FRAGS)) {
> > +			err = -ENOSPC;
> > +			goto fail;
> > +		}
> > +		n = get_user_pages(current, current->mm, base, npages,
> > +				   0, 0, pages, NULL);
>
> What is the maximum number of pages which an unprivileged user can
> concurrently pin with this code?

Since only root can open the tun device, it's currently OK.  The old code
kmalloced and copied: is there some mm-fu reason why pinning userspace memory
is worse?

But I actually think it's OK even for non-root, since these become skbs, which
means they either go into an outgoing device queue or a socket queue which is
accounted for exactly for this reason. 

> > +		if (unlikely(n < 0)) {
> > +			err = n;
> > +			goto fail;
> > +		}
> > +
> > +		/* Transfer pages to the frag array */
> > +		for (j = 0; j < n; j++) {
> > +			f[num_pg].page = pages[j];
> > +			if (j == 0) {
> > +				f[num_pg].page_offset = offset_in_page(base);
> > +				f[num_pg].size = min(len, PAGE_SIZE -
> > +						     f[num_pg].page_offset);
> > +			} else {
> > +				f[num_pg].page_offset = 0;
> > +				f[num_pg].size = min(len, PAGE_SIZE);
> > +			}
> > +			len -= f[num_pg].size;
> > +			base += f[num_pg].size;
> > +			num_pg++;
> > +		}
>
> This loop is a fancy way of doing
>
> 		num_pg = n;

Damn, you had me reworking this until I realized why I wrote it that way.
It's not equivalent: we're inside a loop, doing one iovec array element at a
time, so num_pg accumulates across elements.
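
A userspace sketch of the point about the outer loop, under stated
assumptions: struct frag, add_element(), MAX_FRAGS and the 4096-byte page size
are hypothetical stand-ins, not the kernel's skb_frag_t or the patch's code.
It mirrors the transfer loop for one iovec element and shows that num_pg keeps
growing from one element to the next, so it only equals n for the first one.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define MAX_FRAGS 18	/* hypothetical cap, standing in for MAX_SKB_FRAGS */

struct frag {		/* hypothetical stand-in for one skb fragment */
	uint32_t page_offset;
	uint32_t size;
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Append the fragments for one iovec element (len must be non-zero).
 * Returns the number of pages the element covers, or 0 if they would
 * not fit in the array.
 */
static unsigned add_element(struct frag *f, unsigned *num_pg,
			    uint32_t base, uint32_t len)
{
	unsigned n = 1 + (base + len - 1) / PAGE_SIZE - base / PAGE_SIZE;
	unsigned j;

	if (*num_pg + n > MAX_FRAGS)
		return 0;

	for (j = 0; j < n; j++) {
		struct frag *cur = &f[*num_pg];

		/* Only the first page of an element starts mid-page. */
		cur->page_offset = (j == 0) ? base % PAGE_SIZE : 0;
		cur->size = min_u32(len, PAGE_SIZE - cur->page_offset);
		len -= cur->size;
		base += cur->size;
		(*num_pg)++;
	}
	return n;
}
```

With two elements, say (base 0xff0, len 0x20) followed by (base 0x3000,
len 100), the first call adds two fragments and the second adds one, leaving
num_pg at 3 while the last n was 1.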

> > +		if (unlikely(n != npages)) {
> > +			err = -EFAULT;
> > +			goto fail;
> > +		}
>
> why not do this immediately after running get_user_pages()?

To simplify the failure path.  Hmm, I would use release_pages here...

> > +fail:
> > +	for (i = 0; i < num_pg; i++)
> > +		put_page(f[i].page);
>
> release_pages() could be a tad more efficient, but it's only error-path.

... but I didn't know that existed.  Had to include pagemap.h, and it's not
exported.  It seems to be a useful interface; see patch.
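
The "simplify the failure path" point can be illustrated with a userspace
mock. Everything here is a hypothetical stand-in: struct page is a bare
refcount, mock_pin() plays the role of get_user_pages() returning fewer pages
than requested, and mock_release() plays the role of the put_page() loop or
release_pages(). Because the pinned pages are moved into the frag array before
the n != npages check, one cleanup pass over that array drops every reference.

```c
#include <stddef.h>

#define MAX_BATCH 8	/* hypothetical per-call limit for this sketch */

struct page {		/* hypothetical stand-in: just a reference count */
	int refcount;
};

/* Pin min(want, avail) pages from pool; returns how many were pinned. */
static int mock_pin(struct page *pool, struct page **pages, int want, int avail)
{
	int i, n = want < avail ? want : avail;

	for (i = 0; i < n; i++) {
		pool[i].refcount++;
		pages[i] = &pool[i];
	}
	return n;
}

/* Drop one reference on each of nr pages. */
static void mock_release(struct page **pages, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		pages[i]->refcount--;
}

/*
 * Transfer first, check afterwards. The caller must size frags for
 * *num_pg + npages entries. Returns 0 on success, -1 on failure with
 * nothing left pinned.
 */
static int build_frags(struct page *pool, int npages, int avail,
		       struct page **frags, int *num_pg)
{
	struct page *pages[MAX_BATCH];
	int j, n;

	if (npages > MAX_BATCH)
		return -1;

	n = mock_pin(pool, pages, npages, avail);
	for (j = 0; j < n; j++)
		frags[(*num_pg)++] = pages[j];

	if (n != npages) {
		/* Single undo path: everything pinned is already in frags. */
		mock_release(frags, *num_pg);
		*num_pg = 0;
		return -1;
	}
	return 0;
}
```

Checking immediately after the pin call would instead need a second cleanup
loop over the temporary pages array in addition to the one over the frags.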

Cheers,
Rusty.

Subject: Export release_pages; nice undo for get_user_pages.

Andrew Morton suggests tun/tap use release_pages, but it's not
exported.  It's not clear to me why this is in swap.c, but it exists
even without CONFIG_SWAP, so that's OK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r abd2ad431e5c mm/swap.c
--- a/mm/swap.c	Sat Apr 19 00:34:54 2008 +1000
+++ b/mm/swap.c	Sat Apr 19 01:11:40 2008 +1000
@@ -346,6 +346,7 @@ void release_pages(struct page **pages, 
 
 	pagevec_free(&pages_to_free);
 }
+EXPORT_SYMBOL(release_pages);
 
 /*
  * The pages which we're about to release may be in the deferred lru-addition
