public inbox for linux-rt-users@vger.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Clark Williams <williams@redhat.com>,
	Dean Luick <dean.luick@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jubin John <jubin.john@intel.com>,
	Kaike Wan <kaike.wan@intel.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Sebastian Andrzej Siewior <sebastian.siewior@linutronix.de>,
	Sebastian Sanchez <sebastian.sanchez@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-rt-users@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: [RFC+PATCH] Infiniband hfi1 + PREEMPT_RT_FULL issues
Date: Mon, 25 Sep 2017 11:49:49 -0300	[thread overview]
Message-ID: <20170925144949.GP29668@kernel.org> (raw)

Hi,

	I'm trying to get an Infiniband test case working with the RT
kernel, and ended over tripping over this case:

	In drivers/infiniband/hw/hfi1/pio.c sc_buffer_alloc() disables
preemption that will be reenabled by either pio_copy() or
seg_pio_copy_end().

	But before disabling preemption it grabs a spin lock that will
be dropped after it disables preemption, which ends up triggering a
warning in migrate_disable() later on.

  spin_lock_irqsave(&sc->alloc_lock)
    migrate_disable() ++p->migrate_disable -> 2
  preempt_disable()
  spin_unlock_irqrestore(&sc->alloc_lock)
    migrate_enable() in_atomic(), so just returns, migrate_disable stays at 2
  spin_lock_irqsave(some other lock) -> b00m

    And the WARN_ON code ends up tripping over this over and over in
    log_store().
    
    Sequence captured via ftrace_dump_on_oops + crash utility 'dmesg'
    command.
    
    [512258.613862] sm-3297 16 .....11 359465349134644: sc_buffer_alloc <-hfi1_verbs_send_pio
    [512258.613876] sm-3297 16 .....11 359465349134719: migrate_disable <-sc_buffer_alloc
    [512258.613890] sm-3297 16 .....12 359465349134798: rt_spin_lock <-sc_buffer_alloc
    [512258.613903] sm-3297 16 ....112 359465349135481: rt_spin_unlock <-sc_buffer_alloc
    [512258.613916] sm-3297 16 ....112 359465349135556: migrate_enable <-sc_buffer_alloc
    [512258.613935] sm-3297 16 ....112 359465349135788: seg_pio_copy_start <-hfi1_verbs_send_pio
    [512258.613954] sm-3297 16 ....112 359465349136273: update_sge <-hfi1_verbs_send_pio
    [512258.613981] sm-3297 16 ....112 359465349136373: seg_pio_copy_mid <-hfi1_verbs_send_pio
    [512258.613999] sm-3297 16 ....112 359465349136873: update_sge <-hfi1_verbs_send_pio
    [512258.614017] sm-3297 16 ....112 359465349136956: seg_pio_copy_mid <-hfi1_verbs_send_pio
    [512258.614035] sm-3297 16 ....112 359465349137221: seg_pio_copy_end <-hfi1_verbs_send_pio
    [512258.614048] sm-3297 16 .....12 359465349137360: migrate_disable <-hfi1_verbs_send_pio
    [512258.614065] sm-3297 16 .....12 359465349137476: warn_slowpath_null <-migrate_disable
    [512258.614081] sm-3297 16 .....12 359465349137564: __warn <-warn_slowpath_null
    [512258.614088] sm-3297 16 .....12 359465349137958: printk <-__warn
    [512258.614096] sm-3297 16 .....12 359465349138055: vprintk_default <-printk
    [512258.614104] sm-3297 16 .....12 359465349138144: vprintk_emit <-vprintk_default
    [512258.614111] sm-3297 16 d....12 359465349138312: _raw_spin_lock <-vprintk_emit
    [512258.614119] sm-3297 16 d...112 359465349138789: log_store <-vprintk_emit
    [512258.614127] sm-3297 16 .....12 359465349139068: migrate_disable <-vprintk_emit

I'm wondering if turning this sc->alloc_lock to a raw_spin_lock is the
right solution, which I'm afraid its not, as there are places where it
is held and then the code goes on to grab other non-raw spinlocks...

I got this patch in my test branch and it makes the test case go further
before splatting on other problems with infiniband + PREEMPT_RT_FULL,
but as I said, I fear its not the right solution, ideas?

The kernel I'm seing this is RHEL's + the PREEMPT_RT_FULL patch:

Linux version 3.10.0-709.rt56.636.test.el7.x86_64 (acme@seventh) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #
1 SMP PREEMPT RT Wed Sep 20 18:04:55 -03 2017

I will try and build with the latest PREEMPT_RT_FULL patch, but the
infiniband codebase in RHEL seems to be up to what is upstream and
I just looked at patches-4.11.12-rt14/add_migrate_disable.patch and that
WARN_ON_ONCE(p->migrate_disable_atomic) is still there :-\

- Arnaldo

commit 7ec7d80c7f46bb04da5f39836096de4c0ddde71a
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Wed Sep 6 10:30:08 2017 -0300

    infiniband: Convert per-NUMA send_context->alloc_lock to a raw spinlock

diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c
index 615be68e40b3..8f28f8fe842d 100644
--- a/drivers/infiniband/hw/hfi1/pio.c
+++ b/drivers/infiniband/hw/hfi1/pio.c
@@ -744,7 +744,7 @@ struct send_context *sc_alloc(struct hfi1_devdata *dd, int type,
 	sc->dd = dd;
 	sc->node = numa;
 	sc->type = type;
-	spin_lock_init(&sc->alloc_lock);
+	raw_spin_lock_init(&sc->alloc_lock);
 	spin_lock_init(&sc->release_lock);
 	spin_lock_init(&sc->credit_ctrl_lock);
 	INIT_LIST_HEAD(&sc->piowait);
@@ -929,13 +929,13 @@ void sc_disable(struct send_context *sc)
 		return;
 
 	/* do all steps, even if already disabled */
-	spin_lock_irqsave(&sc->alloc_lock, flags);
+	raw_spin_lock_irqsave(&sc->alloc_lock, flags);
 	reg = read_kctxt_csr(sc->dd, sc->hw_context, SC(CTRL));
 	reg &= ~SC(CTRL_CTXT_ENABLE_SMASK);
 	sc->flags &= ~SCF_ENABLED;
 	sc_wait_for_packet_egress(sc, 1);
 	write_kctxt_csr(sc->dd, sc->hw_context, SC(CTRL), reg);
-	spin_unlock_irqrestore(&sc->alloc_lock, flags);
+	raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 
 	/*
 	 * Flush any waiters.  Once the context is disabled,
@@ -1232,7 +1232,7 @@ int sc_enable(struct send_context *sc)
 	 * worry about locking since the releaser will not do anything
 	 * if the context accounting values have not changed.
 	 */
-	spin_lock_irqsave(&sc->alloc_lock, flags);
+	raw_spin_lock_irqsave(&sc->alloc_lock, flags);
 	sc_ctrl = read_kctxt_csr(dd, sc->hw_context, SC(CTRL));
 	if ((sc_ctrl & SC(CTRL_CTXT_ENABLE_SMASK)))
 		goto unlock; /* already enabled */
@@ -1303,7 +1303,7 @@ int sc_enable(struct send_context *sc)
 	sc->flags |= SCF_ENABLED;
 
 unlock:
-	spin_unlock_irqrestore(&sc->alloc_lock, flags);
+	raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 
 	return ret;
 }
@@ -1361,9 +1361,9 @@ void sc_stop(struct send_context *sc, int flag)
 	sc->flags |= flag;
 
 	/* stop buffer allocations */
-	spin_lock_irqsave(&sc->alloc_lock, flags);
+	raw_spin_lock_irqsave(&sc->alloc_lock, flags);
 	sc->flags &= ~SCF_ENABLED;
-	spin_unlock_irqrestore(&sc->alloc_lock, flags);
+	raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 	wake_up(&sc->halt_wait);
 }
 
@@ -1391,9 +1391,9 @@ struct pio_buf *sc_buffer_alloc(struct send_context *sc, u32 dw_len,
 	int trycount = 0;
 	u32 head, next;
 
-	spin_lock_irqsave(&sc->alloc_lock, flags);
+	raw_spin_lock_irqsave(&sc->alloc_lock, flags);
 	if (!(sc->flags & SCF_ENABLED)) {
-		spin_unlock_irqrestore(&sc->alloc_lock, flags);
+		raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 		goto done;
 	}
 
@@ -1402,7 +1402,7 @@ struct pio_buf *sc_buffer_alloc(struct send_context *sc, u32 dw_len,
 	if (blocks > avail) {
 		/* not enough room */
 		if (unlikely(trycount))	{ /* already tried to get more room */
-			spin_unlock_irqrestore(&sc->alloc_lock, flags);
+			raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 			goto done;
 		}
 		/* copy from receiver cache line and recalculate */
@@ -1458,7 +1458,7 @@ struct pio_buf *sc_buffer_alloc(struct send_context *sc, u32 dw_len,
 	 */
 	smp_wmb();
 	sc->sr_head = next;
-	spin_unlock_irqrestore(&sc->alloc_lock, flags);
+	raw_spin_unlock_irqrestore(&sc->alloc_lock, flags);
 
 	/* finish filling in the buffer outside the lock */
 	pbuf->start = sc->base_addr + fill_wrap * PIO_BLOCK_SIZE;
diff --git a/drivers/infiniband/hw/hfi1/pio.h b/drivers/infiniband/hw/hfi1/pio.h
index 867e5ffc3595..06dfc6f81fd5 100644
--- a/drivers/infiniband/hw/hfi1/pio.h
+++ b/drivers/infiniband/hw/hfi1/pio.h
@@ -112,7 +112,7 @@ struct send_context {
 	u8  group;			/* credit return group */
 
 	/* allocator fields */
-	spinlock_t alloc_lock ____cacheline_aligned_in_smp;
+	raw_spinlock_t alloc_lock ____cacheline_aligned_in_smp;
 	u32 sr_head;			/* shadow ring head */
 	unsigned long fill;		/* official alloc count */
 	unsigned long alloc_free;	/* copy of free (less cache thrash) */

             reply	other threads:[~2017-09-25 14:49 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-25 14:49 Arnaldo Carvalho de Melo [this message]
2017-09-25 19:45 ` [RFC+PATCH] Infiniband hfi1 + PREEMPT_RT_FULL issues Arnaldo Carvalho de Melo
     [not found] ` <20170925144949.GP29668-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-09-25 20:15   ` Steven Rostedt
     [not found]     ` <20170925161528.52d34769-ZM9ACYiE99GSuEeoRQArULNAH6kLmebB@public.gmane.org>
2017-09-26 13:15       ` Arnaldo Carvalho de Melo
     [not found]         ` <20170926131529.GB25735-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-09-26 17:55           ` Marciniszyn, Mike
2017-09-26 18:28             ` Arnaldo Carvalho de Melo
2017-09-26 21:11               ` Steven Rostedt
2017-09-25 21:14   ` Julia Cartwright
2017-09-26 14:10     ` Marciniszyn, Mike
     [not found]       ` <32E1700B9017364D9B60AED9960492BC3441B3BD-RjuIdWtd+YbTXloPLtfHfbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-09-26 14:56         ` Steven Rostedt
2017-09-26 21:00       ` Julia Cartwright
2017-10-06  9:23       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170925144949.GP29668@kernel.org \
    --to=acme@kernel.org \
    --cc=dean.luick@intel.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=dledford@redhat.com \
    --cc=jubin.john@intel.com \
    --cc=kaike.wan@intel.com \
    --cc=leonro@mellanox.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sebastian.sanchez@intel.com \
    --cc=sebastian.siewior@linutronix.de \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox