From: Greg KH <gregkh@linuxfoundation.org>
To: ira.weiny@intel.com
Cc: devel@driverdev.osuosl.org, linux-rdma@vger.kernel.org,
Mitko Haralanov <mitko.haralanov@intel.com>,
dennis.dalessandro@intel.com, dledford@redhat.com
Subject: Re: [PATCH v3 14/23] staging/rdma/hfi1: Implement Expected Receive TID caching
Date: Tue, 27 Oct 2015 17:22:10 +0900
Message-ID: <20151027082210.GA7383@kroah.com>
In-Reply-To: <1445869729-7507-15-git-send-email-ira.weiny@intel.com>
On Mon, Oct 26, 2015 at 10:28:40AM -0400, ira.weiny@intel.com wrote:
> From: Mitko Haralanov <mitko.haralanov@intel.com>
>
> Expected receives work by user-space libraries (PSM) calling into the
> driver with information about the user's receive buffer and having the driver
> DMA-map that buffer and program the HFI to receive data directly into it.
>
> This is an expensive operation as it requires the driver to pin the pages which
> the user's buffer maps to, DMA-map them, and then program the HFI.
>
> When the receive is complete, user-space libraries have to call into the driver
> again so the buffer is removed from the HFI, un-mapped, and the pages unpinned.
>
> All of these operations are expensive, considering that a lot of applications
> (especially micro-benchmarks) use the same buffer over and over.
>
> In order to get better performance for user-space applications, it is highly
> beneficial that they don't continuously call into the driver to register and
> unregister the same buffer. Rather, they can register the buffer and cache it
> for future work. The buffer can be unregistered when it is freed by the user.
>
> This change implements such buffer caching by making use of the kernel's MMU
> notifier API. User-space libraries call into the driver only when they need to
> register a new buffer.
>
> Once a buffer is registered, it stays programmed into the HFI until the kernel
> notifies the driver that the buffer has been freed by the user. At that time,
> the user-space library is notified and it can do the necessary work to remove
> the buffer from its cache.
>
> Buffers which have been invalidated by the kernel are not automatically removed
> from the HFI and do not have their pages unpinned. Buffers are only completely
> removed when the user-space libraries call into the driver to free them. This
> is done to ensure that any ongoing transfers into that buffer are complete.
> This is important when a buffer is not completely freed but rather it is
> shrunk. The user-space library could still have uncompleted transfers into the
> remaining buffer.
>
> With this feature, it is important that systems are set up with reasonable
> limits for the amount of lockable memory. Keeping the limit at "unlimited" (as
> we've done up to this point) may result in jobs being killed by the kernel's
> OOM killer due to them taking up excessive amounts of memory.
>
> Reviewed-by: Arthur Kepner <arthur.kepner@intel.com>
> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>
> ---
> Changes from V2:
> Fix random Kconfig 0-day build error
> Fix leak of random memory to user space caught by Dan Carpenter
> Separate out pointer bug fix into a previous patch
> Change error checks in case statement per Dan's comments
>
> drivers/staging/rdma/hfi1/Kconfig | 1 +
> drivers/staging/rdma/hfi1/Makefile | 2 +-
> drivers/staging/rdma/hfi1/common.h | 15 +-
> drivers/staging/rdma/hfi1/file_ops.c | 490 ++-----------
> drivers/staging/rdma/hfi1/hfi.h | 43 +-
> drivers/staging/rdma/hfi1/init.c | 5 +-
> drivers/staging/rdma/hfi1/trace.h | 132 ++--
> drivers/staging/rdma/hfi1/user_exp_rcv.c | 1171 ++++++++++++++++++++++++++++++
> drivers/staging/rdma/hfi1/user_exp_rcv.h | 82 +++
> drivers/staging/rdma/hfi1/user_pages.c | 110 +--
> drivers/staging/rdma/hfi1/user_sdma.c | 13 +
> drivers/staging/rdma/hfi1/user_sdma.h | 10 +-
> include/uapi/rdma/hfi/hfi1_user.h | 42 +-
> 13 files changed, 1481 insertions(+), 635 deletions(-)
> create mode 100644 drivers/staging/rdma/hfi1/user_exp_rcv.c
> create mode 100644 drivers/staging/rdma/hfi1/user_exp_rcv.h
This is way too big to review properly, please break it up into
reviewable chunks.
thanks,
greg k-h
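
For readers following the quoted description, the buffer lifecycle it
outlines (register once, reuse from the cache, mark invalid on a kernel
notification, release only on an explicit free) can be sketched as a toy
user-space model. This is not the driver code from the patch: the names
TidCache, CachedBuffer, register, invalidate and free are hypothetical,
and driver_calls merely counts the simulated calls into the driver.

```python
from dataclasses import dataclass, field


@dataclass
class CachedBuffer:
    start: int                 # address of the first byte of the user buffer
    length: int                # buffer size in bytes
    invalidated: bool = False  # set once the kernel reports the range as freed


@dataclass
class TidCache:
    """Toy model of the caching scheme (hypothetical, not the hfi1 code)."""
    entries: dict = field(default_factory=dict)
    driver_calls: int = 0      # simulated register/free calls into the driver

    def register(self, start, length):
        """Return the cached buffer, calling the 'driver' only on a miss."""
        key = (start, length)
        buf = self.entries.get(key)
        if buf is not None and not buf.invalidated:
            return buf         # cache hit: no pin/map/program work needed
        if buf is not None:
            self.free(buf)     # stale entry must be released before reuse
        self.driver_calls += 1  # stands in for pin + DMA-map + program
        buf = CachedBuffer(start, length)
        self.entries[key] = buf
        return buf

    def invalidate(self, start, length):
        """Model a kernel notification for the range [start, start + length).

        Overlapping buffers are only marked; they stay registered so that
        transfers still in flight can complete.
        """
        end = start + length
        hit = [b for b in self.entries.values()
               if b.start < end and start < b.start + b.length]
        for b in hit:
            b.invalidated = True
        return hit             # what user space would be told about

    def free(self, buf):
        """Explicit free from user space: the only path that drops an entry."""
        del self.entries[(buf.start, buf.length)]
        self.driver_calls += 1  # stands in for unprogram + unmap + unpin
```

In this model, registering the same (start, length) pair twice costs one
driver call, and an invalidated buffer remains in entries until free() is
called on it, mirroring the behaviour described in the quoted text.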