From: Jason Gunthorpe <jgg@nvidia.com>
To: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: leonro@nvidia.com, Dean Luick <dean.luick@cornelisnetworks.com>,
Brendan Cunningham <bcunningham@cornelisnetworks.com>,
linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-rc v2] IB/hfi1: Fix wrong mmu_node used for user SDMA packet after invalidate
Date: Thu, 1 Jun 2023 14:42:41 -0300
Message-ID: <ZHjYkX1P8MSlFt7m@nvidia.com>
In-Reply-To: <168451393605.3700681.13493776139032178861.stgit@awfm-02.cornelisnetworks.com>

On Fri, May 19, 2023 at 12:32:16PM -0400, Dennis Dalessandro wrote:
> From: Brendan Cunningham <bcunningham@cornelisnetworks.com>
>
> The hfi1 user SDMA pinned-page cache will leave a stale cache entry when
> the cache-entry's virtual address range is invalidated but that cache
> entry is in-use by an outstanding SDMA request.
>
> Subsequent user SDMA requests with buffers in or spanning the virtual
> address range of the stale cache entry will result in packets
> constructed from the wrong memory, the physical pages pointed to by the
> stale cache entry.
>
> To fix this, remove mmu_rb_node cache entries from the mmu_rb_handler
> cache independent of the cache entry's refcount. Add 'struct kref
> refcount' to struct mmu_rb_node and manage mmu_rb_node lifetime with
> kref_get() and kref_put().
>
> mmu_rb_node.refcount makes sdma_mmu_node.refcount redundant. Remove
> 'atomic_t refcount' from struct sdma_mmu_node and change sdma_mmu_node
> code to use mmu_rb_node.refcount.
>
> Move the mmu_rb_handler destructor call after a
> wait-for-SDMA-request-completion call so mmu_rb_nodes that need
> mmu_rb_handler's workqueue to queue themselves up for destruction from
> an interrupt context may do so.
>
> Fixes: f48ad614c100 ("IB/hfi1: Move driver out of staging")
> Fixes: 00cbce5cbf88 ("IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests")
>
> Reviewed-by: Dean Luick <dean.luick@cornelisnetworks.com>
> Signed-off-by: Brendan Cunningham <bcunningham@cornelisnetworks.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> ---
> changes since v1: Update Fixes SHA
> ---
> drivers/infiniband/hw/hfi1/ipoib_tx.c | 4 -
> drivers/infiniband/hw/hfi1/mmu_rb.c | 101 ++++++++++++++---------
> drivers/infiniband/hw/hfi1/mmu_rb.h | 3 +
> drivers/infiniband/hw/hfi1/sdma.c | 23 ++++-
> drivers/infiniband/hw/hfi1/sdma.h | 47 +++++++----
> drivers/infiniband/hw/hfi1/sdma_txreq.h | 2
> drivers/infiniband/hw/hfi1/user_sdma.c | 137 ++++++++++++-------------------
> drivers/infiniband/hw/hfi1/user_sdma.h | 1
> drivers/infiniband/hw/hfi1/vnic_sdma.c | 4 -
> 9 files changed, 177 insertions(+), 145 deletions(-)

This is too big for -rc, but I took it to for-next.

Thanks,
Jason
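
For readers following the thread, here is a minimal userspace sketch of the lifetime scheme the quoted patch description outlines: the cache holds one reference per entry, each in-flight request holds another, and invalidation unlinks the entry from the cache regardless of how many references remain. This is not the hfi1 code. Every name below (struct node, cache_insert(), cache_lookup(), cache_invalidate(), demo()) is hypothetical, and a plain int counter stands in for 'struct kref', so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted pinned-page cache entry. */
struct node {
	unsigned long addr;
	unsigned long len;
	int refcount;		/* plays the role of 'struct kref refcount' */
	int *freed_flag;	/* lets the demo observe the final release */
};

#define CACHE_SLOTS 4
static struct node *cache[CACHE_SLOTS];

static void node_get(struct node *n)
{
	n->refcount++;
}

/* Drop one reference; the last holder frees the node. */
static void node_put(struct node *n)
{
	if (--n->refcount == 0) {
		if (n->freed_flag)
			*n->freed_flag = 1;
		free(n);
	}
}

/* Insert a new entry; the cache itself owns the initial reference. */
static struct node *cache_insert(unsigned long addr, unsigned long len,
				 int *freed_flag)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		if (cache[i])
			continue;
		struct node *n = malloc(sizeof(*n));
		if (!n)
			return NULL;
		n->addr = addr;
		n->len = len;
		n->refcount = 1;
		n->freed_flag = freed_flag;
		cache[i] = n;
		return n;
	}
	return NULL;
}

/* Find an entry covering addr; the caller gets its own reference. */
static struct node *cache_lookup(unsigned long addr)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct node *n = cache[i];
		if (n && addr >= n->addr && addr - n->addr < n->len) {
			node_get(n);
			return n;
		}
	}
	return NULL;
}

/*
 * Unlink every overlapping entry no matter how many users it has,
 * then drop only the cache's reference. In-flight users keep the
 * node alive until they call node_put().
 */
static void cache_invalidate(unsigned long addr, unsigned long len)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct node *n = cache[i];
		if (n && n->addr < addr + len && addr < n->addr + n->len) {
			cache[i] = NULL;
			node_put(n);
		}
	}
}

/* Returns 0 when the scenario behaves as described above. */
int demo(void)
{
	int freed = 0;

	if (!cache_insert(0x1000, 0x1000, &freed))
		return -1;

	/* An outstanding request takes a reference on the entry. */
	struct node *in_flight = cache_lookup(0x1800);
	if (!in_flight)
		return -1;

	cache_invalidate(0x1000, 0x1000);

	if (cache_lookup(0x1800) != NULL)
		return 1;	/* stale entry still reachable */
	if (freed)
		return 2;	/* freed while a request still uses it */

	node_put(in_flight);	/* request completes, last reference */
	return freed ? 0 : 3;
}
```

In this model a later lookup can never hit the invalidated range, which is the stale-entry case the patch description is about, while the outstanding request still sees valid memory until it drops its own reference. The sketch frees the node inline on the last put, so it does not model the deferred, workqueue-based destruction from interrupt context that the description also mentions.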