From: mark.rutland@arm.com (Mark Rutland)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v33 00/14] add kdump support
Date: Fri, 17 Mar 2017 16:24:21 +0000
Message-ID: <20170317162421.GK5940@leverpostej>
In-Reply-To: <1489765628.17202.59.camel@infradead.org>
On Fri, Mar 17, 2017 at 03:47:08PM +0000, David Woodhouse wrote:
> On Fri, 2017-03-17 at 15:33 +0000, Mark Rutland wrote:
> No, in this case the CPUs *were* offlined correctly, or at least "as
> designed", by smp_send_crash_stop(). And if that hadn't worked, as
> verified by *its* synchronisation method based on the atomic_t
> waiting_for_crash_ipi, then *it* would have complained for itself:
>
> 	if (atomic_read(&waiting_for_crash_ipi) > 0)
> 		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> 			   cpumask_pr_args(cpu_online_mask));
>
> It's just that smp_send_crash_stop() (or more specifically
> ipi_cpu_crash_stop()) doesn't touch the online cpu mask. Unlike the
> ARM32 equivalent function machine_crash_nonpanic_core(), which does.
>
> It wasn't clear if that was *intentional*, to allow the original
> contents of the online mask before the crash to be seen in the
> resulting vmcore... or purely an accident.

Looking at this, there's a larger mess.

The waiting_for_crash_ipi dance only tells us if CPUs have taken the
IPI, not whether they've been offlined (i.e. actually left the kernel).
We need something closer to the usual cpu_{disable,die,kill} dance,
clearing online as appropriate.

If CPUs haven't left the kernel, we still need to warn about that.

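
To make the distinction above concrete, here is a small userspace model in C. It is only a sketch and not kernel code: struct crash_model, the model_*() helpers, and the left_kernel[] array are all hypothetical names invented for illustration. It contrasts a handler that only acknowledges the IPI with one that also clears the online state and records whether the CPU actually left the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical userspace model of the state discussed above; not kernel code. */
struct crash_model {
	unsigned int crashing_cpu;   /* CPU that triggered the crash */
	int waiting_for_crash_ipi;   /* CPUs that have not yet taken the IPI */
	bool online[NR_CPUS];        /* stand-in for cpu_online_mask */
	bool left_kernel[NR_CPUS];   /* assumed record of "CPU really went away" */
};

static void model_init(struct crash_model *m, unsigned int crashing_cpu)
{
	assert(crashing_cpu < NR_CPUS);
	m->crashing_cpu = crashing_cpu;
	m->waiting_for_crash_ipi = NR_CPUS - 1;
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		m->online[cpu] = true;
		m->left_kernel[cpu] = false;
	}
}

/* Acknowledge-only behaviour: the counter drops, the online state is untouched. */
static void model_ipi_ack_only(struct crash_model *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	m->waiting_for_crash_ipi--;
}

/*
 * Closer in spirit to a disable/die/kill sequence: clear the online bit,
 * acknowledge, and record whether the CPU managed to leave the kernel.
 */
static void model_ipi_offline(struct crash_model *m, unsigned int cpu, bool died)
{
	assert(cpu < NR_CPUS);
	m->online[cpu] = false;
	m->waiting_for_crash_ipi--;
	m->left_kernel[cpu] = died;
}

/* Number of CPUs still marked online. */
static int model_count_online(const struct crash_model *m)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		n += m->online[cpu];
	return n;
}

/* Secondary CPUs that took the IPI path but never left the kernel. */
static int model_count_stuck(const struct crash_model *m)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu != m->crashing_cpu && !m->left_kernel[cpu])
			n++;
	return n;
}
```

In this model, running model_ipi_ack_only() on every secondary CPU drives the counter to zero while model_count_online() still reports all four CPUs. With model_ipi_offline(), only the crashing CPU remains online, and model_count_stuck() gives the set that would still deserve a warning.
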
> FWIW if I trigger a crash on CPU 1 my kdump (still 4.9.8+v32) doesn't work.
> I end up booting the kdump kernel on CPU#1 and then it gets distinctly unhappy...
>
> [    0.000000] Booting Linux on physical CPU 0x1
> ...
> [    0.017125] Detected PIPT I-cache on CPU1
> [    0.017138] GICv3: CPU1: found redistributor 0 region 0:0x00000000f0280000
> [    0.017147] CPU1: Booted secondary processor [411fd073]
> [    0.017339] Detected PIPT I-cache on CPU2
> [    0.017347] GICv3: CPU2: found redistributor 2 region 0:0x00000000f02c0000
> [    0.017354] CPU2: Booted secondary processor [411fd073]
> [    0.017537] Detected PIPT I-cache on CPU3
> [    0.017545] GICv3: CPU3: found redistributor 3 region 0:0x00000000f02e0000
> [    0.017551] CPU3: Booted secondary processor [411fd073]
> [    0.017576] Brought up 4 CPUs
> [    0.017587] SMP: Total of 4 processors activated.
> ...
> [   31.745809] INFO: rcu_sched detected stalls on CPUs/tasks:
> [   31.751299]  1-...: (30 GPs behind) idle=c90/0/0 softirq=0/0 fqs=0
> [   31.757557]  2-...: (30 GPs behind) idle=608/0/0 softirq=0/0 fqs=0
> [   31.763814]  3-...: (30 GPs behind) idle=604/0/0 softirq=0/0 fqs=0
> [   31.770069]  (detected by 0, t=5252 jiffies, g=-270, c=-271, q=0)
> [   31.776161] Task dump for CPU 1:
> [   31.779381] swapper/1       R  running task        0     0      1 0x00000080
> [   31.786446] Task dump for CPU 2:
> [   31.789666] swapper/2       R  running task        0     0      1 0x00000080
> [   31.796725] Task dump for CPU 3:
> [   31.799945] swapper/3       R  running task        0     0      1 0x00000080
>
> Is some of that platform-specific?

That sounds like timer interrupts aren't being taken.

Given that the CPUs have come up, my suspicion would be that the GIC's
been left in some odd state that the kdump kernel hasn't managed to
recover from.

Marc may have an idea.

Thanks,
Mark.