qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] rdma: don't make pages writeable if not requested
@ 2013-03-21  6:18 Michael S. Tsirkin
  2013-03-21  6:55 ` Roland Dreier
  2013-03-21 12:23 ` Michael R. Hines
  0 siblings, 2 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-03-21  6:18 UTC
  To: Michael R. Hines
  Cc: Roland Dreier, qemu-devel, linux-rdma, Yishai Hadas, linux-kernel,
	Hal Rosenstock, Sean Hefty, Christoph Lameter

core/umem.c seems to pass the arguments to get_user_pages
in the reverse order: it sets the writeable flag and
breaks COW for MAP_SHARED if and only if the hardware needs to
write the page.
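
For reference, a minimal userspace sketch of how the two call forms in
the diff below map onto the write and force parameters. It assumes the
eight-argument get_user_pages(tsk, mm, start, nr_pages, write, force,
pages, vmas) signature of kernels from this period. The MODEL_FOLL_*
constants and the helper names are hypothetical stand-ins, not kernel
definitions, and the model covers only how the two int arguments are
turned into flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FOLL_WRITE / FOLL_FORCE bits;
 * the numeric values are illustrative only. */
enum {
    MODEL_FOLL_WRITE = 1 << 0,
    MODEL_FOLL_FORCE = 1 << 1,
};

/* Models only the write/force handling: each nonzero argument sets its flag. */
unsigned int model_gup_flags(int write, int force)
{
    unsigned int flags = 0;

    if (write)
        flags |= MODEL_FOLL_WRITE;
    if (force)
        flags |= MODEL_FOLL_FORCE;
    return flags;
}

/* Call form on the '-' line of the diff: write = 1, force = !umem->writable. */
unsigned int flags_minus_line(bool umem_writable)
{
    return model_gup_flags(1, !umem_writable);
}

/* Call form on the '+' line of the diff: write = !umem->writable, force = 1. */
unsigned int flags_plus_line(bool umem_writable)
{
    return model_gup_flags(!umem_writable, 1);
}
```

In this model the '-' form always requests write access, while the '+'
form always sets force and requests write access only when
umem->writable is 0.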

This breaks memory overcommit for users such as KVM:
each time we register a page in order to send it to the remote side,
the registration breaks COW.  It seems that for applications that only
have REMOTE_READ permission, there is no reason to break COW at all.

If the page that is COW has lots of copies, this makes the user process
quickly exceed the cgroups memory limit.  This makes RDMA mostly useless
for virtualization, thus the stable tag.
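
As a rough illustration of why breaking COW costs memory, here is a
small userspace sketch (not the RDMA registration path itself). It
assumes Linux with fork() and anonymous MAP_PRIVATE mappings, and the
function name cow_write_stays_private is made up for this example. A
page is shared between parent and child after fork(); the first write
gives the writer its own copy, so every sharer that is forced to write
ends up with a separate physical page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a write made by a forked child to a MAP_PRIVATE page is
 * invisible to the parent (the write landed in a private copy), 0 if
 * the parent sees the child's write, and -1 on setup failure.
 */
int cow_write_stays_private(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    size_t len = (size_t)page_size;
    char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(page, 'A', len);     /* populate the page before forking */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, len);
        return -1;
    }
    if (pid == 0) {
        page[0] = 'B';          /* write fault: the sharing is broken here */
        _exit(page[0] == 'B' ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(page, len);
        return -1;
    }

    int parent_unchanged = (page[0] == 'A');
    munmap(page, len);
    return parent_unchanged;
}
```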

Reported-by: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Note: compile-tested only, I don't have RDMA hardware at the moment.
Michael, could you please try this patch (also fixing your
userspace code not to request write access) and report back?

Note 2: grepping for get_user_pages in the infiniband drivers turns up
lots of callers that set write to 1 unconditionally.
These might be bugs too and should be checked.

 drivers/infiniband/core/umem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index a841123..5929598 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -152,7 +152,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		ret = get_user_pages(current, current->mm, cur_base,
 				     min_t(unsigned long, npages,
 					   PAGE_SIZE / sizeof (struct page *)),
-				     1, !umem->writable, page_list, vma_list);
+				     !umem->writable, 1, page_list, vma_list);
 
 		if (ret < 0)
 			goto out;
-- 
MST


