From: Roland Dreier <roland@topspin.com>
To: Timur Tabi <timur.tabi@ammasso.com>
Cc: Andrew Morton <akpm@osdl.org>,
hch@infradead.org, hozer@hozed.org, linux-kernel@vger.kernel.org,
openib-general@openib.org
Subject: Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation
Date: Mon, 25 Apr 2005 06:15:10 -0700
Message-ID: <52is2bvvz5.fsf@topspin.com>
In-Reply-To: 426BABF4.3050205@ammasso.com
Timur> With mlock(), we don't need to use get_user_pages() at all.
Timur> Arjan tells me the only time an mlocked page can move is
Timur> with hot (un)plug of memory, but that isn't supported on
Timur> the systems that we support. We actually prefer mlock()
Timur> over get_user_pages(), because if the process dies, the
Timur> locks automatically go away too.
There actually is another way pages can move, with both
get_user_pages() and mlock(): copy-on-write after a fork(). If
userspace does a fork(), then all PTEs are marked read-only, and if
the original process touches the page after the fork(), a new page
will be allocated and mapped at the original virtual address.
This is actually a pretty big pain, because the only good solution
seems to be for the kernel to mark these registered regions as
VM_DONTCOPY. Right now this means that driver code ends up monkeying
with vm_flags for user vmas.
Does it seem reasonable to add a new system call to let userspace mark
memory it doesn't want copied into forked processes? Something like

    long sys_mark_nocopy(unsigned long addr, size_t len, int mark)

which would set VM_DONTCOPY if mark != 0, and clear it if mark == 0.
A better name would be gratefully accepted...
Then to register memory for RDMA, userspace would call
sys_mark_nocopy() (with appropriate accounting to handle possibly
overlapping regions) and the kernel would call get_user_pages(). The
get_user_pages() is of course required because the kernel can't trust
userspace to keep the pages locked. mlock() would no longer be
necessary. We can trust userspace to call sys_mark_nocopy() as
needed, because a process can only hurt itself and its children by
misusing the sys_mark_nocopy() call.
If this seems reasonable then I can code a patch.
- R.