linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Mitko Haralanov
	<mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jubin John <jubin.john-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: [PATCH 13/16] IB/hfi1: Add pin query function
Date: Tue, 08 Mar 2016 11:15:28 -0800	[thread overview]
Message-ID: <20160308191527.30542.76026.stgit@scvm10.sc.intel.com> (raw)
In-Reply-To: <20160308191210.30542.91885.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>

From: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

System administrators can use the locked memory
ulimit setting to set the maximum amount of memory
a user can lock/pin. However, this setting alone is not
enough to guarantee good operation of the hfi1 driver
due to the fact that the setting does not have fine
enough granularity to account for the limit being used
by multiple user processes and caches.

Therefore, a better limiting algorithm is needed. This
is where the new hfi1_can_pin_pages() function and the
cache_size module parameter come in.

The function works by looking at the ulimit and cache_size
value to compute a cache size. The algorithm examines the
ulimit value and, if it is not "unlimited", computes a
per-cache limit based on the number of configured user
contexts.

After that, the lower of the two - cache_size and computed
per-cache limit - is used.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Jubin John <jubin.john-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/hfi.h        |    1 +
 drivers/infiniband/hw/hfi1/user_pages.c |   52 +++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 2107cdc..ff3b37a 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1664,6 +1664,7 @@ void shutdown_led_override(struct hfi1_pportdata *ppd);
  */
 #define DEFAULT_RCVHDR_ENTSIZE 32
 
+bool hfi1_can_pin_pages(struct hfi1_devdata *, u32, u32);
 int hfi1_acquire_user_pages(unsigned long, size_t, bool, struct page **);
 void hfi1_release_user_pages(struct page **, size_t, bool);
 
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index 3bf8108..bd7a8ab 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -48,22 +48,62 @@
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/device.h>
+#include <linux/module.h>
 
 #include "hfi.h"
 
-int hfi1_acquire_user_pages(unsigned long vaddr, size_t npages, bool writable,
-			    struct page **pages)
+static unsigned long cache_size = 256;
+module_param(cache_size, ulong, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(cache_size, "Send and receive side cache size limit (in MB)");
+
+/*
+ * Determine whether the caller can pin pages.
+ *
+ * This function should be used in the implementation of buffer caches.
+ * The cache implementation should call this function prior to attempting
+ * to pin buffer pages in order to determine whether they should do so.
+ * The function computes cache limits based on the configured ulimit and
+ * cache size. Use of this function is especially important for caches
+ * which are not limited in any other way (e.g. by HW resources) and, thus,
+ * could keeping caching buffers.
+ *
+ */
+bool hfi1_can_pin_pages(struct hfi1_devdata *dd, u32 nlocked, u32 npages)
 {
-	unsigned long pinned, lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	unsigned long ulimit = rlimit(RLIMIT_MEMLOCK), pinned, cache_limit,
+		size = (cache_size * (1UL << 20)); /* convert to bytes */
+	unsigned usr_ctxts = dd->num_rcv_contexts - dd->first_user_ctxt;
 	bool can_lock = capable(CAP_IPC_LOCK);
-	int ret;
+
+	/*
+	 * Calculate per-cache size. The calculation below uses only a quarter
+	 * of the available per-context limit. This leaves space for other
+	 * pinning. Should we worry about shared ctxts?
+	 */
+	cache_limit = (ulimit / usr_ctxts) / 4;
+
+	/* If ulimit isn't set to "unlimited" and is smaller than cache_size. */
+	if (ulimit != (-1UL) && size > cache_limit)
+		size = cache_limit;
+
+	/* Convert to number of pages */
+	size = DIV_ROUND_UP(size, PAGE_SIZE);
 
 	down_read(&current->mm->mmap_sem);
 	pinned = current->mm->pinned_vm;
 	up_read(&current->mm->mmap_sem);
 
-	if (pinned + npages > lock_limit && !can_lock)
-		return -ENOMEM;
+	/* First, check the absolute limit against all pinned pages. */
+	if (pinned + npages >= ulimit && !can_lock)
+		return false;
+
+	return ((nlocked + npages) <= size) || can_lock;
+}
+
+int hfi1_acquire_user_pages(unsigned long vaddr, size_t npages, bool writable,
+			    struct page **pages)
+{
+	int ret;
 
 	ret = get_user_pages_fast(vaddr, npages, writable, pages);
 	if (ret < 0)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2016-03-08 19:15 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-08 19:14 [PATCH 00/16] IB/hfi1: Add a page pinning cache for PSM sdma Dennis Dalessandro
     [not found] ` <20160308191210.30542.91885.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-03-08 19:14   ` [PATCH 01/16] IB/hfi1: Re-factor MMU notification code Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 02/16] IB/hfi1: Allow MMU function execution in IRQ context Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 03/16] IB/hfi1: Prevent NULL pointer dereference Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 04/16] IB/hfi1: Allow remove MMU callbacks to free nodes Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 05/16] IB/hfi1: Remove the use of add/remove RB function pointers Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 06/16] IB/hfi1: Notify remove MMU/RB callback of calling context Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 07/16] IB/hfi1: Use interval RB trees Dennis Dalessandro
2016-03-08 19:14   ` [PATCH 08/16] IB/hfi1: Add MMU tracing Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 09/16] IB/hfi1: Remove compare callback Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 10/16] IB/hfi1: Add filter callback Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 11/16] IB/hfi1: Adjust last address values for intervals Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 12/16] IB/hfi1: Implement SDMA-side buffer caching Dennis Dalessandro
2016-03-08 19:15   ` Dennis Dalessandro [this message]
2016-03-08 19:15   ` [PATCH 14/16] IB/hfi1: Specify mm when releasing pages Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 15/16] IB/hfi1: Switch to using the pin query function Dennis Dalessandro
2016-03-08 19:15   ` [PATCH 16/16] IB/hfi1: Add SDMA cache eviction algorithm Dennis Dalessandro
2016-03-08 20:56   ` [PATCH 00/16] IB/hfi1: Add a page pinning cache for PSM sdma Or Gerlitz
     [not found]     ` <CAJ3xEMj8gz6Xw1bqA677fC8=U2ktLQ6b4W20SeNrztN7u6UKZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-09  0:21       ` Dennis Dalessandro
     [not found]         ` <20160309002157.GA20105-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-09  5:07           ` Leon Romanovsky
     [not found]             ` <20160309050731.GM13396-2ukJVAZIZ/Y@public.gmane.org>
2016-03-09 19:21               ` Dennis Dalessandro
     [not found]                 ` <20160309192109.GB15031-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-10  4:56                   ` Leon Romanovsky
     [not found]                     ` <20160310045609.GC1977-2ukJVAZIZ/Y@public.gmane.org>
2016-03-14  7:09                       ` Leon Romanovsky
     [not found]                         ` <20160314070951.GB26456-2ukJVAZIZ/Y@public.gmane.org>
2016-03-14 12:01                           ` Dennis Dalessandro
     [not found]                             ` <20160314120152.GA30838-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-14 12:27                               ` Leon Romanovsky
2016-03-14 21:19                               ` Or Gerlitz
     [not found]                                 ` <CAJ3xEMiSgXFT46r0Y7DZGr3mpJFSLrtYrMdz+DbCXDozbMYZRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-15  1:20                                   ` Dennis Dalessandro
     [not found]                                     ` <20160315012049.GA30055-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-15  7:03                                       ` Or Gerlitz
     [not found]                                         ` <CAJ3xEMhToYOBMo_nNi2fFpTexUmUqb3zBX7zWWYjCDfPWpWn-w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-15 19:37                                           ` Dennis Dalessandro
     [not found]                                             ` <20160315193731.GA20662-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-16 15:53                                               ` Or Gerlitz
     [not found]                                                 ` <CAJ3xEMjWxdwM0wQR85_-wHpPhYSWz1W0T8H_nDjnp2JS7G29Qw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-16 17:23                                                   ` Dennis Dalessandro
     [not found]                                                     ` <20160316172309.GB26530-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-03-22  7:33                                                       ` Or Gerlitz
     [not found]                                                         ` <CAJ3xEMg5OPBT63=VV4EKnZUWSHGatDRL0n7O4iBVrN9bMUw5dg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-22 11:46                                                           ` Dennis Dalessandro
2016-03-10  7:11                   ` Or Gerlitz
2016-03-09 14:47           ` Or Gerlitz
     [not found]             ` <CAJ3xEMhcs-6vDzsNk9nJKoNxTWnfaUtDmNHhLp1EPyerjnw2Xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-09 15:19               ` Dennis Dalessandro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160308191527.30542.76026.stgit@scvm10.sc.intel.com \
    --to=dennis.dalessandro-ral2jqcrhueavxtiumwx3w@public.gmane.org \
    --cc=dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=jubin.john-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).