stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Zhu Yanjun <yanjunz@mellanox.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	Jason Gunthorpe <jgg@mellanox.com>
Subject: [PATCH 4.19 30/38] RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
Date: Tue, 18 Feb 2020 20:55:16 +0100	[thread overview]
Message-ID: <20200218190422.105374079@linuxfoundation.org> (raw)
In-Reply-To: <20200218190418.536430858@linuxfoundation.org>

From: Zhu Yanjun <yanjunz@mellanox.com>

commit 8ac0e6641c7ca14833a2a8c6f13d8e0a435e535c upstream.

When run stress tests with RXE, the following Call Traces often occur

  watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
  ...
  Call Trace:
  <IRQ>
  create_object+0x3f/0x3b0
  kmem_cache_alloc_node_trace+0x129/0x2d0
  __kmalloc_reserve.isra.52+0x2e/0x80
  __alloc_skb+0x83/0x270
  rxe_init_packet+0x99/0x150 [rdma_rxe]
  rxe_requester+0x34e/0x11a0 [rdma_rxe]
  rxe_do_task+0x85/0xf0 [rdma_rxe]
  tasklet_action_common.isra.21+0xeb/0x100
  __do_softirq+0xd0/0x298
  irq_exit+0xc5/0xd0
  smp_apic_timer_interrupt+0x68/0x120
  apic_timer_interrupt+0xf/0x20
  </IRQ>
  ...

The root cause is that tasklet is actually a softirq. In a tasklet
handler, another softirq handler is triggered. Usually these softirq
handlers run on the same cpu core. So this will cause "soft lockup Bug".

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200212072635.682689-8-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/infiniband/sw/rxe/rxe_comp.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -329,7 +329,7 @@ static inline enum comp_state check_ack(
 					qp->comp.psn = pkt->psn;
 					if (qp->req.wait_psn) {
 						qp->req.wait_psn = 0;
-						rxe_run_task(&qp->req.task, 1);
+						rxe_run_task(&qp->req.task, 0);
 					}
 				}
 				return COMPST_ERROR_RETRY;
@@ -457,7 +457,7 @@ static void do_complete(struct rxe_qp *q
 	 */
 	if (qp->req.wait_fence) {
 		qp->req.wait_fence = 0;
-		rxe_run_task(&qp->req.task, 1);
+		rxe_run_task(&qp->req.task, 0);
 	}
 }
 
@@ -473,7 +473,7 @@ static inline enum comp_state complete_a
 		if (qp->req.need_rd_atomic) {
 			qp->comp.timeout_retry = 0;
 			qp->req.need_rd_atomic = 0;
-			rxe_run_task(&qp->req.task, 1);
+			rxe_run_task(&qp->req.task, 0);
 		}
 	}
 
@@ -719,7 +719,7 @@ int rxe_completer(void *arg)
 							RXE_CNT_COMP_RETRY);
 					qp->req.need_retry = 1;
 					qp->comp.started_retry = 1;
-					rxe_run_task(&qp->req.task, 1);
+					rxe_run_task(&qp->req.task, 0);
 				}
 
 				if (pkt) {



  parent reply	other threads:[~2020-02-18 19:57 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-18 19:54 [PATCH 4.19 00/38] 4.19.105-stable review Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 01/38] Input: synaptics - switch T470s to RMI4 by default Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 02/38] Input: synaptics - enable SMBus on ThinkPad L470 Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 03/38] Input: synaptics - remove the LEN0049 dmi id from topbuttonpad list Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 04/38] ALSA: usb-audio: Fix UAC2/3 effect unit parsing Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 05/38] ALSA: hda/realtek - Fix silent output on MSI-GL73 Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 06/38] ALSA: usb-audio: Apply sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 07/38] arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 08/38] arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 09/38] ALSA: usb-audio: sound: usb: usb true/false for bool return type Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 10/38] ALSA: usb-audio: Add clock validity quirk for Denon MC7000/MCX8000 Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 11/38] ext4: dont assume that mmp_nodename/bdevname have NUL Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 12/38] ext4: fix support for inode sizes > 1024 bytes Greg Kroah-Hartman
2020-02-18 19:54 ` [PATCH 4.19 13/38] ext4: fix checksum errors with indexed dirs Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 14/38] ext4: add cond_resched() to ext4_protect_reserved_inode Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 15/38] ext4: improve explanation of a mount failure caused by a misconfigured kernel Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 16/38] Btrfs: fix race between using extent maps and merging them Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 17/38] btrfs: ref-verify: fix memory leaks Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 18/38] btrfs: print message when tree-log replay starts Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 19/38] btrfs: log message when rw remount is attempted with unclean tree-log Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 20/38] ARM: npcm: Bring back GPIOLIB support Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 21/38] arm64: ssbs: Fix context-switch when SSBS is present on all CPUs Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 22/38] KVM: nVMX: Use correct root level for nested EPT shadow page tables Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 23/38] perf/x86/amd: Add missing L2 misses event spec to AMD Family 17hs event map Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 24/38] nvme: fix the parameter order for nvme_get_log in nvme_get_fw_slot_info Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 25/38] IB/hfi1: Acquire lock to release TID entries when user file is closed Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 26/38] IB/hfi1: Close window for pq and request coliding Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 27/38] IB/rdmavt: Reset all QPs when the device is shut down Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 28/38] RDMA/core: Fix invalid memory access in spec_filter_size Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 29/38] RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create Greg Kroah-Hartman
2020-02-18 19:55 ` Greg Kroah-Hartman [this message]
2020-02-18 19:55 ` [PATCH 4.19 31/38] RDMA/core: Fix protection fault in get_pkey_idx_qp_list Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 32/38] s390/time: Fix clk type in get_tod_clock Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 33/38] perf/x86/intel: Fix inaccurate period in context switch for auto-reload Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 34/38] hwmon: (pmbus/ltc2978) Fix PMBus polling of MFR_COMMON definitions Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 35/38] NFSv4.1 make cachethis=no for writes Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 36/38] jbd2: move the clearing of b_modified flag to the journal_unmap_buffer() Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 37/38] jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer Greg Kroah-Hartman
2020-02-18 19:55 ` [PATCH 4.19 38/38] KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging Greg Kroah-Hartman
2020-02-18 23:37 ` [PATCH 4.19 00/38] 4.19.105-stable review shuah
2020-02-19  2:39 ` Naresh Kamboju
2020-02-19 11:06 ` Jon Hunter
2020-02-19 18:08 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200218190422.105374079@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=jgg@mellanox.com \
    --cc=leonro@mellanox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=yanjunz@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).