From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: bigeasy@linutronix.de
Cc: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Clark Williams <williams@redhat.com>,
	Dean Luick <dean.luick@intel.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Julia Cartwright <julia@ni.com>, Kaike Wan <kaike.wan@intel.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	linux-rdma@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Sebastian Andrzej Siewior <sebastian.siewior@linutronix.de>,
	Sebastian Sanchez <sebastian.sanchez@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH 2/2] IB/hfi1: Handle packets in the threaded handler only
Date: Tue,  3 Oct 2017 12:49:20 -0300
Message-ID: <20171003154920.31566-3-acme@kernel.org>
In-Reply-To: <20171003154920.31566-1-acme@kernel.org>
From: Arnaldo Carvalho de Melo <acme@redhat.com>
The hfi1 driver calls request_threaded_irq() with both a hard irq handler
and a thread function:
      handler = receive_context_interrupt;
      thread = receive_context_thread;
      request_threaded_irq(me->msix.vector, handler, thread, 0, me->name, arg);
It tries to process packets in the hard irq handler,
receive_context_interrupt(), and only wakes up the thread (by returning
IRQ_WAKE_THREAD) when the number of packets available in the NIC crosses
some threshold, trying to balance latency and bandwidth.
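The split described above can be modelled in plain userspace C. This is only
a sketch of the idea, not driver code: every model_* name, the struct layout
and the fixed packet budget are assumptions made for illustration, and the
real threshold logic lives behind rcd->do_interrupt().

```c
#include <assert.h>

/* Userspace stand-ins for two of the kernel's irqreturn_t values. */
enum model_irqreturn { MODEL_IRQ_HANDLED, MODEL_IRQ_WAKE_THREAD };

/* Hypothetical receive context: only counters, no hardware state. */
struct model_rcd {
	int pending;		/* packets waiting in the receive queue */
	int handled_hard;	/* packets consumed by the hard irq half */
	int handled_thread;	/* packets consumed by the thread half */
};

/*
 * Hard irq half: consume at most 'budget' packets. If packets remain,
 * ask for the thread to be woken instead of staying in hard irq context.
 */
static enum model_irqreturn model_hard_handler(struct model_rcd *rcd, int budget)
{
	assert(budget >= 0);

	while (rcd->pending > 0 && budget-- > 0) {
		rcd->pending--;
		rcd->handled_hard++;
	}

	return rcd->pending > 0 ? MODEL_IRQ_WAKE_THREAD : MODEL_IRQ_HANDLED;
}

/* Thread half: drain whatever the hard irq half left behind. */
static void model_thread_handler(struct model_rcd *rcd)
{
	while (rcd->pending > 0) {
		rcd->pending--;
		rcd->handled_thread++;
	}
}

/* Deliver one interrupt: run the hard half, then the thread if requested. */
static void model_deliver_irq(struct model_rcd *rcd, int budget)
{
	if (model_hard_handler(rcd, budget) == MODEL_IRQ_WAKE_THREAD)
		model_thread_handler(rcd);
}
```

In this model a small burst is handled entirely by the hard half (low
latency), while a burst larger than the budget is finished by the thread
(bandwidth without monopolizing hard irq context).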
But on a CONFIG_PREEMPT_RT_FULL kernel spinlocks are sleeping locks, so
taking them from the hard irq handler (receive_context_interrupt) causes
BUGs like this:
  [ 1002.740581] hfi1 0000:21:00.0: hfi1_0: set_link_state: current ARMED, new ACTIVE
  [ 1002.740583] hfi1 0000:21:00.0: hfi1_0: logical state changed to PORT_ACTIVE (0x4)
  [ 1002.740599] hfi1 0000:21:00.0: hfi1_0: send_idle_message: sending idle message 0x203
  [ 1002.741873] hfi1 0000:21:00.0: hfi1_0: read_idle_message: read idle message 0x203
  [ 1002.741874] hfi1 0000:21:00.0: hfi1_0: handle_sma_message: SMA message 0x2
  [ 1002.741923] hfi1 0000:21:00.0: hfi1_0: Switching to NO_DMA_RTAIL
  [ 1004.744192] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes ready
  [ 1167.907754] ------------[ cut here ]------------
  [ 1167.907756] kernel BUG at kernel/rtmutex.c:902!
  [ 1167.907758] invalid opcode: 0000 [#1] PREEMPT SMP
  <SNIP>
  [ 1167.907805] CPU: 10 PID: 1505 Comm: hfi1_cq0 Not tainted 3.10.0-708.rt56.635.test.el7.x86_64 #1
  <SNIP>
  [ 1167.907823] Call Trace:
  [ 1167.907826]  <IRQ>
  [ 1167.907850]  [<ffffffffc06e4981>] ? hfi1_rvt_get_rwqe+0x141/0x400 [hfi1]
  [ 1167.907852]  [<ffffffff816b7625>] rt_spin_lock+0x25/0x30
  [ 1167.907856]  [<ffffffff810aa774>] queue_kthread_work+0x24/0x60
  [ 1167.907861]  [<ffffffffc068845b>] rvt_cq_enter+0x17b/0x250 [rdmavt]
  [ 1167.907869]  [<ffffffffc06e391a>] hfi1_rc_rcv+0x67a/0x1260 [hfi1]
  [ 1167.907878]  [<ffffffffc06fefc8>] hfi1_ib_rcv+0x2c8/0x400 [hfi1]
  [ 1167.907886]  [<ffffffffc06c381c>] process_receive_ib+0x6c/0x150 [hfi1]
  [ 1167.907888]  [<ffffffff810cee9d>] ? enqueue_pushable_task+0x6d/0x90
  [ 1167.907895]  [<ffffffffc06c1f31>] handle_receive_interrupt_nodma_rtail+0x161/0x310 [hfi1]
  [ 1167.907914]  [<ffffffffc06b49d3>] receive_context_interrupt+0x53/0x390 [hfi1]
  [ 1167.907917]  [<ffffffff8112fb26>] __handle_irq_event_percpu+0x56/0x240
  [ 1167.907919]  [<ffffffff816b7616>] ? rt_spin_lock+0x16/0x30
  [ 1167.907920]  [<ffffffff8112fd59>] handle_irq_event_percpu+0x49/0xa0
  [ 1167.907922]  [<ffffffff8112fe28>] handle_irq_event+0x78/0xb0
  [ 1167.907924]  [<ffffffff81132d29>] handle_edge_irq+0x99/0x1a0
  [ 1167.907926]  [<ffffffff8101ea7b>] handle_irq+0xbb/0x150
  [ 1167.907929]  [<ffffffff816c298d>] do_IRQ+0x4d/0xe0
  [ 1167.907931]  [<ffffffff816b7fad>] common_interrupt+0x6d/0x6d
  [ 1167.907931]  <EOI>
  [ 1167.907932]  [<ffffffff816b7616>] ? rt_spin_lock+0x16/0x30
  [ 1167.907934]  [<ffffffff810aaa55>] ? kthread_worker_fn+0xb5/0x170
  [ 1167.907935]  [<ffffffff810aa9a0>] ? flush_kthread_work+0x130/0x130
  [ 1167.907937]  [<ffffffff810aabdf>] kthread+0xcf/0xe0
  [ 1167.907938]  [<ffffffff810aab10>] ? kthread_worker_fn+0x170/0x170
  [ 1167.907940]  [<ffffffff816c0498>] ret_from_fork+0x58/0x90
  [ 1167.907941]  [<ffffffff810aab10>] ? kthread_worker_fn+0x170/0x170
  [ 1167.907951] Code: 90 e8 eb f0 ff ff e9 d4 fd ff ff 66 0f 1f 44 00 00 e8 db f0 ff ff eb b6 0f 0b 0f 1f 80 00 00 00 00 e8 0b f7 a3 ff e8 46 86 9c ff <0f> 0b 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 65 4c 8b 3c
  [ 1167.907952] RIP  [<ffffffff816b62fa>] rt_spin_lock_slowlock+0x34a/0x350
  [ 1167.907952]  RSP <ffff880c3f403ad0>
To get it to work on RT, just keep the prologue that clears the chip receive
interrupt and immediately return IRQ_WAKE_THREAD, deferring all packet
processing, with its locking, to the thread.
With this change, test systems are able to pass traffic over this hardware
using a CONFIG_PREEMPT_RT_FULL patched kernel without triggering these BUGs.
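The effect of deferring everything to the thread can be illustrated with a
small userspace model. It is a sketch under stated assumptions, not the
driver: the model_* names, the 'rt' flag standing in for
CONFIG_PREEMPT_RT_FULL and the 'defer_on_rt' switch are all hypothetical, and
the counter only marks where the real kernel would hit the BUG shown above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for two of the kernel's irqreturn_t values. */
enum model_irqreturn { MODEL_IRQ_HANDLED, MODEL_IRQ_WAKE_THREAD };

struct model_ctx {
	bool rt;		/* stands in for CONFIG_PREEMPT_RT_FULL */
	bool in_hardirq;	/* true while the hard irq half is running */
	int pending;		/* packets waiting to be processed */
	int processed;		/* packets processed so far */
	int bad_lock_calls;	/* sleeping lock taken in hard irq context */
};

/* Packet processing takes a lock; on RT that lock may sleep. */
static void model_process_packets(struct model_ctx *c)
{
	assert(c->pending >= 0);

	if (c->rt && c->in_hardirq)
		c->bad_lock_calls++;	/* the real kernel BUGs at this point */

	c->processed += c->pending;
	c->pending = 0;
}

/* Hard irq half: with defer_on_rt set, RT only requests the thread. */
static enum model_irqreturn model_hard_handler(struct model_ctx *c,
					       bool defer_on_rt)
{
	/* The prologue would run here in both configurations. */
	if (c->rt && defer_on_rt)
		return MODEL_IRQ_WAKE_THREAD;

	model_process_packets(c);
	return MODEL_IRQ_HANDLED;
}

/* Deliver one interrupt, tracking which context the work runs in. */
static void model_deliver_irq(struct model_ctx *c, bool defer_on_rt)
{
	enum model_irqreturn ret;

	c->in_hardirq = true;
	ret = model_hard_handler(c, defer_on_rt);
	c->in_hardirq = false;

	if (ret == MODEL_IRQ_WAKE_THREAD)
		model_process_packets(c);	/* thread context: may sleep */
}
```

With rt set and defer_on_rt clear, the model records one bad lock call per
interrupt; with defer_on_rt set, the same packets are processed with none.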
Cc: Clark Williams <williams@redhat.com>
Cc: Dean Luick <dean.luick@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Julia Cartwright <julia@ni.com>
Cc: Kaike Wan <kaike.wan@intel.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <sebastian.siewior@linutronix.de>
Cc: Sebastian Sanchez <sebastian.sanchez@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 drivers/infiniband/hw/hfi1/chip.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 121a4c920f1b..733a00d8ea4c 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8226,15 +8226,17 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
 {
 	struct hfi1_ctxtdata *rcd = data;
 	struct hfi1_devdata *dd = rcd->dd;
-	int disposition;
-	int present;
 
 	trace_hfi1_receive_interrupt(dd, rcd->ctxt);
 	this_cpu_inc(*dd->int_counter);
 	aspm_ctx_disable(rcd);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+	return IRQ_WAKE_THREAD;
+#else
+{
 	/* receive interrupt remains blocked while processing packets */
-	disposition = rcd->do_interrupt(rcd, 0);
+	int disposition = rcd->do_interrupt(rcd, 0), present;
 
 	/*
 	 * Too many packets were seen while processing packets in this
@@ -8257,6 +8259,8 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
 
 	return IRQ_HANDLED;
 }
+#endif
+}
 
 /*
  * Receive packet thread handler.  This expects to be invoked with the
-- 
2.13.6