From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linuxfoundation.org ([140.211.169.12]:54387 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S964813AbcJTOYg (ORCPT ); Thu, 20 Oct 2016 10:24:36 -0400
Subject: Patch "IB/hfi1: Fix defered ack race with qp destroy" has been added to the 4.7-stable tree
To: mike.marciniszyn@intel.com, dennis.dalessandro@intel.com, dledford@redhat.com, gregkh@linuxfoundation.org
Cc: ,
From:
Date: Thu, 20 Oct 2016 16:24:21 +0200
Message-ID: <14769734613711@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID:

This is a note to let you know that I've just added the patch titled

    IB/hfi1: Fix defered ack race with qp destroy

to the 4.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     ib-hfi1-fix-defered-ack-race-with-qp-destroy.patch
and it can be found in the queue-4.7 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.


>>From 72f53af2651957b0b9d6dead72a393eaf9a2c3be Mon Sep 17 00:00:00 2001
From: Mike Marciniszyn
Date: Sun, 25 Sep 2016 07:41:46 -0700
Subject: IB/hfi1: Fix defered ack race with qp destroy

From: Mike Marciniszyn

commit 72f53af2651957b0b9d6dead72a393eaf9a2c3be upstream.

There is a bug in the deferred ack stuff that causes a race with the
destroy of a QP.

A packet causes a deferred ack to be pended by putting the QP
into an rcd queue.

A return from the driver interrupt processing will process that rcd
queue of QPs and attempt to do a direct send of the ack. At this point
no locks are held and the above QP could now be put in the reset
state in the qp destroy logic. A refcount protects the QP while it
is in the rcd queue so it isn't going anywhere yet.

If the direct send fails to allocate a pio buffer,
hfi1_schedule_send() is called to trigger sending an ack from the
send engine. There is no state test in that code path.

The refcount is then dropped from the driver.c caller
potentially allowing the qp destroy to continue from its
refcount wait in parallel with the workqueue scheduling of the qp.

Reviewed-by: Dennis Dalessandro
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
Signed-off-by: Greg Kroah-Hartman

---
 drivers/infiniband/hw/hfi1/rc.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -889,8 +889,10 @@ void hfi1_send_rc_ack(struct hfi1_ctxtda
 	return;
 
 queue_ack:
-	this_cpu_inc(*ibp->rvp.rc_qacks);
 	spin_lock_irqsave(&qp->s_lock, flags);
+	if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
+		goto unlock;
+	this_cpu_inc(*ibp->rvp.rc_qacks);
 	qp->s_flags |= RVT_S_ACK_PENDING | RVT_S_RESP_PENDING;
 	qp->s_nak_state = qp->r_nak_state;
 	qp->s_ack_psn = qp->r_ack_psn;
@@ -899,6 +901,7 @@ queue_ack:
 
 	/* Schedule the send tasklet. */
 	hfi1_schedule_send(qp);
+unlock:
 	spin_unlock_irqrestore(&qp->s_lock, flags);
 }
 


Patches currently in stable-queue which might be from mike.marciniszyn@intel.com are

queue-4.7/ib-hfi1-fix-defered-ack-race-with-qp-destroy.patch
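
For readers who want to see the shape of the fix outside the driver, the
following is a minimal user-space sketch of the pattern the patch applies:
take the lock, test the QP state, and only then count, flag, and schedule
the ack. Every sketch_* and SKETCH_* name is a hypothetical stand-in and
not part of hfi1 or rdmavt. The spinlock is modelled by a plain flag
because the sketch is single-threaded, and the schedule call is modelled
by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's QP states and flag bits. */
enum sketch_qp_state { SKETCH_QPS_RESET, SKETCH_QPS_RTS };

#define SKETCH_PROCESS_RECV_OK 0x1u
#define SKETCH_ACK_PENDING     0x1u
#define SKETCH_RESP_PENDING    0x2u

/* Per-state capability table, loosely modelled on ib_rvt_state_ops[]. */
static const unsigned sketch_state_ops[] = {
	[SKETCH_QPS_RESET] = 0,
	[SKETCH_QPS_RTS]   = SKETCH_PROCESS_RECV_OK,
};

struct sketch_qp {
	enum sketch_qp_state state;
	unsigned s_flags;
	bool locked;        /* models qp->s_lock (assumption: no real locking) */
	unsigned scheduled; /* counts where hfi1_schedule_send() would be called */
};

static unsigned sketch_rc_qacks; /* models the per-CPU rc_qacks counter */

/* Queue an ack only if the QP state still permits receive processing. */
static void sketch_queue_ack(struct sketch_qp *qp)
{
	assert(qp != NULL && !qp->locked);
	qp->locked = true;                 /* spin_lock_irqsave() in the patch */
	if (!(sketch_state_ops[qp->state] & SKETCH_PROCESS_RECV_OK))
		goto unlock;               /* QP is being torn down: do nothing */
	sketch_rc_qacks++;                 /* counted only when actually queued */
	qp->s_flags |= SKETCH_ACK_PENDING | SKETCH_RESP_PENDING;
	qp->scheduled++;                   /* stands in for scheduling the send */
unlock:
	qp->locked = false;                /* spin_unlock_irqrestore() */
}
```

Calling sketch_queue_ack() on a QP in SKETCH_QPS_RESET leaves its flags,
its schedule count, and the counter untouched, while a QP in
SKETCH_QPS_RTS gets both pending flags and one scheduled send. Doing the
state test under the lock matters in the real driver because the destroy
path changes the state concurrently; the sketch only illustrates the
ordering, not the concurrency.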