From: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
To: Oliver <oohall@gmail.com>
Cc: manvanth <manvanth@linux.vnet.ibm.com>,
	sim <sim@linux.vnet.ibm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	maurosr <maurosr@linux.vnet.ibm.com>
Subject: Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload
Date: Thu, 15 Nov 2018 16:40:43 +0530
Message-ID: <1542280243.15177.2.camel@abdul>
In-Reply-To: <1537784366.26347.15.camel@abdul.in.ibm.com>

On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
> On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
> > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
> > <abdhalee@linux.vnet.ibm.com> wrote:
> > > Greetings,
> > >
> > > The bnx2x module load/unload test results in continuous hard LOCKUP traces
> > > on my bare-metal powerpc machine running the mainline 4.19.0-rc4 kernel.
> > >
> > > The instruction address points to:
> > >
> > > 0xc00000000009d048 is in opal_interrupt
> > > (arch/powerpc/platforms/powernv/opal-irqchip.c:133).
> > > 128
> > > 129     static irqreturn_t opal_interrupt(int irq, void *data)
> > > 130     {
> > > 131             __be64 events;
> > > 132
> > > 133             opal_handle_interrupt(virq_to_hw(irq), &events);
> > > 134             last_outstanding_events = be64_to_cpu(events);
> > > 135             if (opal_have_pending_events())
> > > 136                     opal_wake_poller();
> > > 137
> > >
> > > trace:
> > > bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
> > > bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... fp[7] 306
> > > bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, Flow control: none
> > > bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, Flow control: none
> > > bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)
> > > bnx2x 0008:01:00.0: msix capability found
> > > bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.0: part number 0-0-0-0
> > > bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > > bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
> > > bnx2x 0008:01:00.1: msix capability found
> > > bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.1: part number 0-0-0-0
> > > bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... fp[7] 276
> > > bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
> > > bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > > bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
> > > bnx2x 0008:01:00.2: msix capability found
> > > bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.2: part number 0-0-0-0
> > > bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... fp[7] 286
> > > bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
> > 
> > 
> > > watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
> > > watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms ago)
> > 
> > Ouch, 13 seconds in OPAL. Looks like we trip the hard lockup detector
> > once the thread comes back into the kernel so we're not completely
> > stuck. At a guess there's some contention on a lock in OPAL due to the
> > bind/unbind loop, but I'm not sure why that would be happening.
> > 
> > Can you give us a copy of the OPAL log? (/sys/firmware/opal/msglog)
> 
> Oliver, thanks for looking into this. I have sent a private mail with
> the logs attached (the file was 1MB).
> 

Oliver, any luck with the logs I sent?

The warnings also show up on 4.20.0-rc2-next-20181114.

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre




