linuxppc-dev.lists.ozlabs.org archive mirror
From: Michael Ellerman <mpe@ellerman.id.au>
To: Sachin Sant <sachinp@linux.vnet.ibm.com>, linuxppc-dev@ozlabs.org
Cc: ego@linux.vnet.ibm.com, linux-next@vger.kernel.org
Subject: Re: [powerpc]WARN : arch/powerpc/platforms/powernv/smp.c:160
Date: Mon, 26 Aug 2019 13:29:30 +1000
Message-ID: <87imqk4nad.fsf@concordia.ellerman.id.au>
In-Reply-To: <AB1A20B4-523B-491E-AB89-124AD2810C17@linux.vnet.ibm.com>

Sachin Sant <sachinp@linux.vnet.ibm.com> writes:
> linux-next is currently broken on POWER8 non virtualized. Kernel
> fails to reach login prompt with following kernel warning
> repeatedly shown during boot.

I don't see it on my test systems.

The backtrace makes it look like you're doing CPU hot_un_plug during
boot, which seems a bit odd.

Or possibly it's just that the cpu_is_offline() test in do_idle() is
returning true due to some bug.

> The problem dates back at least to next-20190816.

A bisect would be helpful obviously :)

> [   40.285606] WARNING: CPU: 1 PID: 0 at arch/powerpc/platforms/powernv/smp.c:160 pnv_smp_cpu_kill_self+0x50/0x2d0
> [   40.285609] Modules linked in: kvm_hv kvm sunrpc dm_mirror dm_region_hash dm_log dm_mod ses enclosure scsi_transport_sas sg ipmi_powernv ipmi_devintf powernv_rng uio_pdrv_genirq uio leds_powernv ipmi_msghandler powernv_op_panel ibmpowernv ip_tables ext4 mbcache jbd2 sd_mod ipr tg3 libata ptp pps_core
> [   40.285643] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.3.0-rc5-next-20190823-autotest-autotest #1
> [   40.285644] NIP:  c0000000000b5f40 LR: c000000000055498 CTR: c0000000000b5ef0
> [   40.285646] REGS: c0000007f5527980 TRAP: 0700   Not tainted  (5.3.0-rc5-next-20190823-autotest-autotest)
> [   40.285646] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24004028  XER: 00000000
> [   40.285650] CFAR: c000000000055494 IRQMASK: 1 
> [   40.285650] GPR00: c000000000055498 c0000007f5527c10 c00000000148b200 0000000000000000 
> [   40.285650] GPR04: 0000000000000000 c0000007fa897d80 c0000007fa90c800 00000007f9980000 
> [   40.285650] GPR08: 0000000000000000 0000000000000001 0000000000000000 c0000007fa90c800 
> [   40.285650] GPR12: c0000000000b5ef0 c0000007ffffee00 0000000000000800 c000000ffffc11d0 
> [   40.285650] GPR16: 0000000000000001 c000000001035280 0000000000000000 c0000000015303c0 
> [   40.285650] GPR20: c000000000052d60 0000000000000001 c0000007f54cd800 c0000007f54cd880 
> [   40.285650] GPR24: 0000000000080000 c0000007f54cd800 c0000000014bdf78 c0000000014c20d8 
> [   40.285650] GPR28: 0000000000000002 c0000000014c2538 0000000000000001 c0000007f54cd800 
> [   40.285662] NIP [c0000000000b5f40] pnv_smp_cpu_kill_self+0x50/0x2d0
> [   40.285664] LR [c000000000055498] cpu_die+0x48/0x64
> [   40.285665] Call Trace:
> [   40.285667] [c0000007f5527c10] [c000000000f85f10] ppc64_tlb_batch+0x0/0x1220 (unreliable)
> [   40.285669] [c0000007f5527df0] [c000000000055498] cpu_die+0x48/0x64
> [   40.285672] [c0000007f5527e10] [c0000000000226a0] arch_cpu_idle_dead+0x20/0x40
> [   40.285674] [c0000007f5527e30] [c00000000016bd2c] do_idle+0x37c/0x3f0
> [   40.285676] [c0000007f5527ed0] [c00000000016bfac] cpu_startup_entry+0x3c/0x50
> [   40.285678] [c0000007f5527f00] [c000000000055198] start_secondary+0x638/0x680
> [   40.285680] [c0000007f5527f90] [c00000000000ac5c] start_secondary_prolog+0x10/0x14
> [   40.285680] Instruction dump:
> [   40.285681] fb61ffd8 fb81ffe0 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821fe21 e90d1178 
> [   40.285684] f9010198 39000000 892d0988 792907e0 <0b090000> 39200002 7d210164 39200003 
> [   40.285687] ---[ end trace 72c90a064122d9e4 ]---

That WARN shouldn't really kill the boot, do you see anything else?

> Relevant code snippet :
> 156         /*
> 157          * This hard disables local interrupts, ensuring we have no lazy
> 158          * irqs pending.
> 159          */
> 160         WARN_ON(irqs_disabled());  <<===
> 161         hard_irq_disable();
> 162         WARN_ON(lazy_irq_pending());

Even via the path shown above I think we should have IRQs enabled, but I
guess not.
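
To make the path in the backtrace concrete, here is a toy userspace model
of it. It is only a sketch, not kernel code: struct toy_cpu and every
toy_* function are hypothetical names made up for illustration, and the
only things taken from the report are the call chain (do_idle ->
arch_cpu_idle_dead -> cpu_die -> pnv_smp_cpu_kill_self) and the quoted
lines 160-161 of smp.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for per-CPU state; not a kernel structure. */
struct toy_cpu {
	bool offline;		/* what cpu_is_offline() would report */
	bool irqs_disabled;	/* what irqs_disabled() would report */
	int warnings;		/* number of WARN_ON() conditions that fired */
};

/* Models WARN_ON(): records the hit and carries on, it does not stop the CPU. */
static void toy_warn_on(struct toy_cpu *cpu, bool cond)
{
	if (cond)
		cpu->warnings++;
}

/* Models the quoted lines 160-161 of smp.c. */
static void toy_kill_self(struct toy_cpu *cpu)
{
	toy_warn_on(cpu, cpu->irqs_disabled);	/* WARN_ON(irqs_disabled()) */
	cpu->irqs_disabled = true;		/* hard_irq_disable() */
}

/*
 * Models one pass through the idle loop. The dead path is taken only if
 * the CPU reports offline. Returns true if that path was taken.
 */
static bool toy_do_idle_once(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	if (!cpu->offline)
		return false;
	toy_kill_self(cpu);
	return true;
}
```

In this model the warning fires only when both things are true at once:
the CPU reports offline, and IRQs are already disabled on entry to the
kill-self step. An online CPU never reaches the check, and an offline
CPU that arrives with IRQs enabled passes it silently.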

cheers

Thread overview: 6+ messages
2019-08-24 16:04 [powerpc]WARN : arch/powerpc/platforms/powernv/smp.c:160 Sachin Sant
2019-08-26  3:29 ` Michael Ellerman [this message]
2019-08-26 10:12   ` Sachin Sant
2019-08-27 11:28     ` Sachin Sant
2019-08-26  9:09 ` Gautham R Shenoy
2019-08-26 10:13   ` Sachin Sant
