From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Dean Luick <dean.luick@intel.com>,
Mike Marciniszyn <mike.marciniszyn@intel.com>,
Roland Dreier <roland@purestorage.com>
Subject: [ 34/49] IPoIB: Fix send lockup due to missed TX completion
Date: Tue, 26 Mar 2013 16:01:31 -0700
Message-ID: <20130326225843.099781572@linuxfoundation.org>
In-Reply-To: <20130326225839.554028294@linuxfoundation.org>

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Marciniszyn <mike.marciniszyn@intel.com>

commit 1ee9e2aa7b31427303466776f455d43e5e3c9275 upstream.

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

The patch doesn't fully resolve the issue because if more than half
the tx_outstanding's were UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch uses the IB_CQ_REPORT_MISSED_EVENTS on the
ib_req_notify_cq(). If the rc is above 0, the UD send cq completion
callback is called directly to re-arm the send completion timer.

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -753,9 +753,13 @@ void ipoib_cm_send(struct net_device *de
 		if (++priv->tx_outstanding == ipoib_sendq_size) {
 			ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
 				  tx->qp->qp_num);
-			if (ib_req_notify_cq(priv->send_cq, IB_CQ_NEXT_COMP))
-				ipoib_warn(priv, "request notify on send CQ failed\n");
 			netif_stop_queue(dev);
+			rc = ib_req_notify_cq(priv->send_cq,
+				IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
+			if (rc < 0)
+				ipoib_warn(priv, "request notify on send CQ failed\n");
+			else if (rc)
+				ipoib_send_comp_handler(priv->send_cq, dev);
 		}
 	}
 }
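
For readers unfamiliar with the pattern, the control flow of the hunk above
can be sketched in user space. This is only an illustration, not kernel code:
every mock_* name, the flag values, and the struct layouts are hypothetical
stand-ins invented for this sketch. The one behaviour borrowed from the patch
is the three-way return convention it relies on: negative for failure, zero
when the CQ was armed with nothing outstanding, and positive when
completions may already be sitting in the CQ and so will not raise a new
event.

```c
#include <stdio.h>

/* Hypothetical flag values; the real IB_CQ_* constants live in the kernel. */
enum mock_cq_flags {
	MOCK_CQ_NEXT_COMP            = 1 << 0,
	MOCK_CQ_REPORT_MISSED_EVENTS = 1 << 1,
};

/* Toy completion queue: tracks unpolled completions and whether it is armed. */
struct mock_cq {
	int pending;	/* completions already queued but not yet polled */
	int armed;	/* set once a notification has been requested */
	int broken;	/* when non-zero, simulate a failing notify request */
};

/* Toy net device state touched by the TX-ring-full path. */
struct mock_dev {
	int queue_stopped;	/* stands in for netif_stop_queue() */
	int poll_scheduled;	/* stands in for re-arming the send completion timer */
};

/* Arm the CQ; report already-queued completions only if the caller asked. */
static int mock_req_notify_cq(struct mock_cq *cq, unsigned int flags)
{
	if (cq->broken)
		return -1;
	cq->armed = 1;
	if ((flags & MOCK_CQ_REPORT_MISSED_EVENTS) && cq->pending > 0)
		return 1;
	return 0;
}

/* Stand-in for the UD send completion callback: just schedule a poll. */
static void mock_send_comp_handler(struct mock_cq *cq, struct mock_dev *dev)
{
	(void)cq;
	dev->poll_scheduled = 1;
}

/* Mirrors the patched branch: stop the queue, arm the CQ, act on the result. */
static int mock_tx_ring_full(struct mock_cq *cq, struct mock_dev *dev)
{
	int rc;

	dev->queue_stopped = 1;
	rc = mock_req_notify_cq(cq, MOCK_CQ_NEXT_COMP |
				    MOCK_CQ_REPORT_MISSED_EVENTS);
	if (rc < 0)
		fprintf(stderr, "request notify on send CQ failed\n");
	else if (rc)
		mock_send_comp_handler(cq, dev);
	return rc;
}
```

In this model, a CQ that already holds completions when the ring fills
returns a positive value, so the handler runs immediately rather than
waiting for an event that would never arrive. Without the
REPORT_MISSED_EVENTS flag the mock returns zero in that same situation,
which corresponds to the stalled queue the commit message describes.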