From: Jason Gunthorpe <jgg@nvidia.com>
To: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: <dledford@redhat.com>, <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-rc] IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE
Date: Wed, 29 Jul 2020 15:54:53 -0300
Message-ID: <20200729185453.GA278576@nvidia.com>
In-Reply-To: <20200728183848.22226.29132.stgit@awfm-01.aw.intel.com>

On Tue, Jul 28, 2020 at 02:38:48PM -0400, Mike Marciniszyn wrote:
> The lookaside count is improperly initialized to the size of the
> Receive Queue with the additional +1. In the traces below, the
> RQ size is 384, so the count was set to 385.
>
> The lookaside count is then rarely refreshed. Note the high and
> incorrect count in the trace below:
>
> rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c
> qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385
> rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
>
> The head,tail indicate there is only one RWQE posted although the count
> says 385 and we correctly return the element 0.
>
> The next call to rvt_get_rwqe with the decremented count:
>
> rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9058 wr_id 0 qpn c
> qpt 2 pid 3018 num_sge 0 head 1 tail 1, count 384
> rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
>
> Note that the RQ is empty (head == tail) yet we return the RWQE at tail 1,
> which is not valid because of the bogus high count.
>
> Best case, the RWQE has never been posted and the rc logic sees an RWQE
> that is too small (all zeros) and puts the QP into an error state.
>
> In the worst case, a server slow at posting receive buffers might fool
> rvt_get_rwqe() into fetching an old RWQE and corrupt memory.
>
> Fix by deleting the faulty initialization code, creating an
> inline to fetch the posted count, and converting all callers to
> use the new inline.
>
> Fixes: f592ae3c999f ("IB/rdmavt: Fracture single lock used for posting and processing RWQEs")
> Reported-by: Zhaojuan Guo <zguo@redhat.com>
> Cc: <stable@vger.kernel.org> # 5.4.x
> Reviewed-by: Kaike Wan <kaike.wan@intel.com>
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
> Tested-by: Honggang Li <honli@redhat.com>
> ---
> drivers/infiniband/sw/rdmavt/qp.c | 33 ++++-----------------------------
> drivers/infiniband/sw/rdmavt/rc.c | 4 +---
> include/rdma/rdmavt_qp.h | 19 +++++++++++++++++++
> 3 files changed, 24 insertions(+), 32 deletions(-)

Applied to for-rc, thanks.

Jason
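
For readers following the quoted description, here is a minimal
sketch of deriving the posted count from head and tail instead of
trusting a stale cached value. It is not the code from the patch: the
names posted_count() and ring_consume() are hypothetical, and the ring
size of 385 slots used in the note below is an assumption taken from
the "384 plus 1" figure in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the rdmavt inline: number of entries posted
 * but not yet consumed in a ring of `size` slots. `head` is the
 * producer index and `tail` the consumer index; both are assumed to
 * lie in [0, size).
 */
static uint32_t posted_count(uint32_t size, uint32_t head, uint32_t tail)
{
	assert(head < size && tail < size);

	if (head >= tail)
		return head - tail;
	/* head has wrapped around past the end of the ring */
	return size - tail + head;
}

/*
 * Hypothetical consumer: returns 1 and advances *tail when an entry is
 * available, returns 0 and leaves *tail alone when the ring is empty.
 * The count is recomputed from head and tail on every call, so an
 * empty ring (head == tail) can never hand out an entry.
 */
static int ring_consume(uint32_t size, uint32_t head, uint32_t *tail)
{
	if (posted_count(size, head, *tail) == 0)
		return 0;

	*tail = (*tail + 1 == size) ? 0 : *tail + 1;
	return 1;
}
```

With size 385, the first trace (head 1, tail 0) gives a count of 1 and
the second (head 1, tail 1) gives 0, so the second fetch would be
refused rather than returning the unposted entry at tail 1.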