From: <gregkh@linuxfoundation.org>
To: michael.j.ruhl@intel.com, dennis.dalessandro@intel.com,
jgg@mellanox.com, mike.marciniszyn@intel.com,
stable@vger.kernel.org
Cc: <stable@vger.kernel.org>
Subject: FAILED: patch "[PATCH] IB/rdmavt: Fix concurrency panics in QP post_send and modify" failed to apply to 4.14-stable tree
Date: Wed, 20 Mar 2019 20:57:19 +0100 [thread overview]
Message-ID: <155311183995182@kroah.com> (raw)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d757c60eca9b22f4d108929a24401e0fdecda0b1 Mon Sep 17 00:00:00 2001
From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Date: Tue, 26 Feb 2019 08:45:25 -0800
Subject: [PATCH] IB/rdmavt: Fix concurrency panics in QP post_send and modify
to error
The RC/UC code path can go through a software loopback. In this code path
the receive side QP is manipulated.
If two threads are working on the QP receive side (i.e. post_send, and
modify_qp to an error state), QP information can be corrupted.
(post_send via loopback)
set r_sge
loop
update r_sge
(modify_qp)
take r_lock
update r_sge <---- r_sge is now incorrect
(post_send)
update r_sge <---- crash, etc.
...
This can lead to one of the two following crashes:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: hfi1_copy_sge+0xf1/0x2e0 [hfi1]
PGD 8000001fe6a57067 PUD 1fd9e0c067 PMD 0
Call Trace:
ruc_loopback+0x49b/0xbc0 [hfi1]
hfi1_do_send+0x38e/0x3e0 [hfi1]
_hfi1_do_send+0x1e/0x20 [hfi1]
process_one_work+0x17f/0x440
worker_thread+0x126/0x3c0
kthread+0xd1/0xe0
ret_from_fork_nospec_begin+0x21/0x21
or:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: rvt_clear_mr_refs+0x45/0x370 [rdmavt]
PGD 80000006ae5eb067 PUD ef15d0067 PMD 0
Call Trace:
rvt_error_qp+0xaa/0x240 [rdmavt]
rvt_modify_qp+0x47f/0xaa0 [rdmavt]
ib_security_modify_qp+0x8f/0x400 [ib_core]
ib_modify_qp_with_udata+0x44/0x70 [ib_core]
modify_qp.isra.23+0x1eb/0x2b0 [ib_uverbs]
ib_uverbs_modify_qp+0xaa/0xf0 [ib_uverbs]
ib_uverbs_write+0x272/0x430 [ib_uverbs]
vfs_write+0xc0/0x1f0
SyS_write+0x7f/0xf0
system_call_fastpath+0x1c/0x21
Fix by using the appropriate locking on the receiving QP.
Fixes: 15703461533a ("IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavt")
Cc: <stable@vger.kernel.org> #v4.9+
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 337365806261..a34b9a2a32b6 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -2790,6 +2790,18 @@ again:
}
EXPORT_SYMBOL(rvt_copy_sge);
+static enum ib_wc_status loopback_qp_drop(struct rvt_ibport *rvp,
+ struct rvt_qp *sqp)
+{
+ rvp->n_pkt_drops++;
+ /*
+ * For RC, the requester would timeout and retry so
+ * shortcut the timeouts and just signal too many retries.
+ */
+ return sqp->ibqp.qp_type == IB_QPT_RC ?
+ IB_WC_RETRY_EXC_ERR : IB_WC_SUCCESS;
+}
+
/**
* ruc_loopback - handle UC and RC loopback requests
* @sqp: the sending QP
@@ -2862,17 +2874,14 @@ again:
}
spin_unlock_irqrestore(&sqp->s_lock, flags);
- if (!qp || !(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK) ||
+ if (!qp) {
+ send_status = loopback_qp_drop(rvp, sqp);
+ goto serr_no_r_lock;
+ }
+ spin_lock_irqsave(&qp->r_lock, flags);
+ if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK) ||
qp->ibqp.qp_type != sqp->ibqp.qp_type) {
- rvp->n_pkt_drops++;
- /*
- * For RC, the requester would timeout and retry so
- * shortcut the timeouts and just signal too many retries.
- */
- if (sqp->ibqp.qp_type == IB_QPT_RC)
- send_status = IB_WC_RETRY_EXC_ERR;
- else
- send_status = IB_WC_SUCCESS;
+ send_status = loopback_qp_drop(rvp, sqp);
goto serr;
}
@@ -3030,6 +3039,7 @@ do_write:
wqe->wr.send_flags & IB_SEND_SOLICITED);
send_comp:
+ spin_unlock_irqrestore(&qp->r_lock, flags);
spin_lock_irqsave(&sqp->s_lock, flags);
rvp->n_loop_pkts++;
flush_send:
@@ -3056,6 +3066,7 @@ rnr_nak:
}
if (sqp->s_rnr_retry_cnt < 7)
sqp->s_rnr_retry--;
+ spin_unlock_irqrestore(&qp->r_lock, flags);
spin_lock_irqsave(&sqp->s_lock, flags);
if (!(ib_rvt_state_ops[sqp->state] & RVT_PROCESS_RECV_OK))
goto clr_busy;
@@ -3084,6 +3095,8 @@ err:
rvt_rc_error(qp, wc.status);
serr:
+ spin_unlock_irqrestore(&qp->r_lock, flags);
+serr_no_r_lock:
spin_lock_irqsave(&sqp->s_lock, flags);
rvt_send_complete(sqp, wqe, send_status);
if (sqp->ibqp.qp_type == IB_QPT_RC) {
reply other threads:[~2019-03-20 19:57 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=155311183995182@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=dennis.dalessandro@intel.com \
--cc=jgg@mellanox.com \
--cc=michael.j.ruhl@intel.com \
--cc=mike.marciniszyn@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.