From: Qian Cai <cai@lca.pw>
To: "AKASHI, Takahiro" <takahiro.akashi@linaro.org>,
James Morse <james.morse@arm.com>,
Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: arm64: kdump broken on a large CPU system
Date: Wed, 12 Dec 2018 17:37:04 -0500
Message-ID: <1544654224.18411.11.camel@lca.pw>
In-Reply-To: <e4a3456b-6a75-4564-a49f-0532d0b35726@lca.pw>
On Tue, 2018-12-11 at 23:39 -0500, Qian Cai wrote:
> [+ kexec@lists.infradead.org]
>
> The debugging progress so far...
>
> Waiting up to 5 minutes for other CPUs to stop in crash_smp_send_stop() made no
> difference.
>
> With the "dev" branch of this tree [1], it is possible to print out messages from
> purgatory when passing something like "--port=0x602B0000
> --port-lsr=0x602B0000,0x80" to kexec. However, even enable_dcache() in
> setup_arch() hangs forever on this machine (it works fine on another
> arm64 server - Cortex-A72). After removing only enable_dcache() /
> disable_dcache() from setup_arch() etc. without removing the printf() lines, it
> did print out,
>
> I'm in purgatory
> purgatory: entry=0000000090080000
> purgatory: dtb=0000000092d50000
> purgatory: D-cache Enabled before SHA verification
> purgatory: D-cache Disabled after SHA verification
>
> So, this confirmed that it must hang somewhere in arm64/kernel/head.S (.stext)
> or the early part of start_kernel() before earlycon is initialized.
>
> Also confirmed that passing nr_cpus=64 in the first kernel would again make
> everything work fine with this new kexec.
>
> Since enable_dcache() hangs as well, I suspect this has something to do
> with enabling the MMU (i.e., .stext -> __primary_switch -> __enable_mmu) coupled
> with some sort of per-CPU data where the number of CPUs matters.
Still debugging a hang when enabling the MMU (enable_dcache) in purgatory [1],
which may provide some clues about the later hang in the 2nd kernel.

	dsb nshst
	tlbi alle2
	dsb nsh
	isb
	bl get_ips_bits
	lsl x1, x0, #TCR_IPS_EL2_SHIFT
	orr x1, x1, x7
	mov x0, x6
	ldr x2, =MEMORY_ATTRIBUTES
	msr mair_el2, x2
	msr tcr_el2, x1
	msr ttbr0_el2, x0
	isb
	mrs x0, sctlr_el2
	ldr x3, =SCTLR_ELx_FLAGS
	orr x0, x0, x3
	msr sctlr_el2, x0	<--- hung right on this instruction.

Without CONFIG_ARM64_VHE (i.e., running in EL1), it is able to get through
enable_dcache(), but it still hangs somewhere later in the 2nd kernel.

	dsb nshst
	tlbi vmalle1
	dsb nsh
	isb
	bl get_ips_bits
	lsl x1, x0, #TCR_IPS_EL1_SHIFT
	orr x1, x1, x7
	mov x0, x6
	ldr x2, =MEMORY_ATTRIBUTES
	msr mair_el1, x2
	msr tcr_el1, x1
	msr ttbr0_el1, x0
	isb
	mrs x0, sctlr_el1
	ldr x3, =SCTLR_ELx_FLAGS
	orr x0, x0, x3
	msr sctlr_el1, x0
	isb

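
As a rough model of what the two sequences above compute before the final
write to SCTLR, the register values can be sketched in Python. This is only an
illustration: the bit positions are the architectural ones (TCR_EL1.IPS at
bits [34:32], TCR_EL2.PS at bits [18:16] in the non-VHE layout, SCTLR M/C/I at
bits 0/2/12), but the actual values of TCR_IPS_EL*_SHIFT and SCTLR_ELx_FLAGS
in the purgatory code are assumed here, and compose_tcr() / enable_bits() are
hypothetical helper names.

```python
# Architectural bit positions; whether the purgatory constants match these
# exactly is an assumption made for this sketch.
TCR_IPS_EL1_SHIFT = 32  # TCR_EL1.IPS occupies bits [34:32]
TCR_IPS_EL2_SHIFT = 16  # TCR_EL2.PS occupies bits [18:16] (non-VHE layout)

SCTLR_M = 1 << 0   # MMU enable
SCTLR_C = 1 << 2   # data cache enable
SCTLR_I = 1 << 12  # instruction cache enable

MASK64 = 0xFFFFFFFFFFFFFFFF


def compose_tcr(ips_bits, base_flags, shift):
    """Model of: lsl x1, x0, #shift ; orr x1, x1, x7 (hypothetical helper)."""
    return ((ips_bits << shift) | base_flags) & MASK64


def enable_bits(sctlr, flags):
    """Model of: orr x0, x0, x3 (hypothetical helper)."""
    return (sctlr | flags) & MASK64


if __name__ == "__main__":
    ips = 0b101  # example physical-address-size encoding, chosen arbitrarily
    print(hex(compose_tcr(ips, 0, TCR_IPS_EL1_SHIFT)))
    print(hex(compose_tcr(ips, 0, TCR_IPS_EL2_SHIFT)))
    print(hex(enable_bits(0, SCTLR_M | SCTLR_C | SCTLR_I)))
```

The only difference between the EL2 and EL1 paths in this model is where the
address-size field lands in TCR; the write that sets the M bit is the one
that hangs in the EL2 case.
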
One data point about this system is that it has 4 threads per core. Every 2
cores share the same L1 and L2 caches, so each of those is shared by 8 CPUs.
All CPUs share the same L3 cache.

Hence, I wonder if this is caused by incomplete cache/TLB invalidation that
left stale entries (or uninitialised junk which just happens to look valid)
present before the MMU is turned on.

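
For reference, the sharing arithmetic behind that data point works out as
follows. This is a minimal sketch using the lscpu numbers from the original
report; the 2-cores-per-L1/L2 grouping is taken from the description above
rather than measured, and the variable names are only illustrative.

```python
# Topology figures from the lscpu output in the original report.
sockets = 2
cores_per_socket = 32
threads_per_core = 4

# Assumption from the description above: 2 cores share one L1/L2.
cores_per_cache_group = 2

total_cpus = sockets * cores_per_socket * threads_per_core
cpus_per_cache_group = cores_per_cache_group * threads_per_core
cache_groups = total_cpus // cpus_per_cache_group

if __name__ == "__main__":
    print("total CPUs:", total_cpus)
    print("CPUs sharing one L1/L2:", cpus_per_cache_group)
    print("L1/L2 sharing groups:", cache_groups)
```
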
[1] https://github.com/pratyushanand/kexec-tools/blob/devel/purgatory/arch/arm64/cache.S
>
> Right now, I think I need to find a way to print directly to the pl011 serial
> console while debugging this assembly code (something like CONFIG_DEBUG_LL for
> arm64), so it can be used to locate exactly where it hangs. Otherwise, I am
> shooting in the dark.
>
> [1] https://github.com/pratyushanand/kexec-tools
>
> === original email ===
>
> On this HPE Apollo 70 arm64 server with 256 CPUs, triggering a crash dump just
> hangs (4.20-rc6 as well as 4.18). It was confirmed that execution got as far
> as entering __cpu_soft_restart(),
>
> __crash_kexec
> machine_kexec
> cpu_soft_restart
> restart
> __cpu_soft_restart
>
> The earlycon was enabled but there was no output from the 2nd kernel, so it is
> pretty much stuck somewhere in the assembly code in arm64/kernel/head.S or the
> early part of start_kernel() before earlycon is initialized.
>
> It turned out this has something to do with nr_cpus in the 1st kernel, although
> the 2nd kernel always has nr_cpus=1 [1]. It was tested with both
> crashkernel=512M and 768M.
>
> nr_cpus <= 96   GOOD (2nd kernel was up in 2-3 mins.)
> nr_cpus=256     BAD  (2nd kernel was NOT up after 1 hour.)
> nr_cpus=127     BAD  (2nd kernel was NOT up after 10 mins.)
>
> I also tested with and without CONFIG_ARM64_VHE (i.e., el2_switch), which made
> no difference.
>
> [1] KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 swiotlb=noforce reset_devices"
>
> I am still figuring out a way to debug this assembly code to find where it
> actually hangs, but the server is hooked up to a conserver that is not able to
> generate any sysrq and I have no shell access to the conserver, so it seems a
> bit difficult to use kgdb or kdb in this case.
>
> CPU information,
>
> # lscpu
> Architecture:        aarch64
> Byte Order:          Little Endian
> CPU(s):              256
> On-line CPU(s) list: 0-255
> Thread(s) per core:  4
> Core(s) per socket:  32
> Socket(s):           2
> NUMA node(s):        2
> Vendor ID:           Cavium
> Model:               1
> Model name:          ThunderX2 99xx
> Stepping:            0x1
> BogoMIPS:            400.00
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            256K
> L3 cache:            32768K
> NUMA node0 CPU(s):   0-127
> NUMA node1 CPU(s):   128-255
> Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid
>                      asimdrdm