linux-arm-kernel.lists.infradead.org archive mirror
From: Marc Zyngier <maz@kernel.org>
To: Tim Harvey <tharvey@gateworks.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Robert Richter <rrichter@marvell.com>,
	Will Deacon <will@kernel.org>,
	Sunil Goutham <sgoutham@marvell.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: ARM64_SW_TTBR0_PAN enabled causing hangs on OcteonTX
Date: Tue, 25 Feb 2020 16:27:41 +0000
Message-ID: <da8f38078ef8805200b102a1d24da4ae@kernel.org>
In-Reply-To: <CAJ+vNU3XVNkdHXbq-KJaRecSxpPxboVW5Cx7zVEv64Gm1dt+Vg@mail.gmail.com>

On 2020-02-25 16:13, Tim Harvey wrote:
> On Mon, Feb 24, 2020 at 5:55 PM Marc Zyngier <maz@kernel.org> wrote:
>> 
>> On 2020-02-25 01:16, Tim Harvey wrote:
>> > On Mon, Feb 24, 2020 at 4:50 PM Marc Zyngier <maz@kernel.org> wrote:
>> >>
>> >> Tim,
>> >>
>> >> On 2020-02-25 00:35, Tim Harvey wrote:
>> >> > Greetings,
>> >> >
>> >> > I'm trying to understand why enabling CONFIG_ARM64_SW_TTBR0_PAN on an
>> >> > OcteonTX (CN80XX) SoC would cause the kernel to hang.
>> >> >
>> >> > Here's what I'm seeing using arch/arm64/defconfig +
>> >> > CONFIG_ARM64_SW_TTBR0_PAN=y on a Gateworks Newport board with a
>> >> > CN8030-1500BG676-SCP-P12-G SoC using the Marvell SDK-10.1.1.0 boot
>> >> > firmware:
>> >> >
>> >> > Starting kernel ...
>> >> >
>> >> > [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x430f0a22]
>> >> > [    0.000000] Linux version 5.5.0-00001-g2028a3b (tharvey@tharvey) (gcc version 7.3.0 (Marvell Inc. Version: Marvell GCC7 build 238.0)) #2 SMP PREEMPT Mon Feb 24 16:20:24 PST 2020
>> >> > [    0.000000] Machine model: Gateworks Newport CN80XX GW6404
>> >> > [    0.000000] efi: Getting EFI parameters from FDT:
>> >> > [    0.000000] efi: UEFI not found.
>> >> > [    0.000000] cma: Reserved 64 MiB at 0x000000007c000000
>> >> > [    0.000000] NUMA: NODE_DATA [mem 0x7bbe5100-0x7bbe6fff]
>> >> > [    0.000000] Zone ranges:
>> >> > [    0.000000]   DMA      [mem 0x0000000000500000-0x000000003fffffff]
>> >> > [    0.000000]   DMA32    [mem 0x0000000040000000-0x000000007fffffff]
>> >> > [    0.000000]   Normal   empty
>> >> > [    0.000000] Movable zone start for each node
>> >> > [    0.000000] Early memory node ranges
>> >> > [    0.000000]   node   0: [mem 0x0000000000500000-0x000000007fffffff]
>> >> > [    0.000000] Initmem setup node 0 [mem 0x0000000000500000-0x000000007fffffff]
>> >> > [    0.000000] On node 0 totalpages: 523008
>> >> > [    0.000000]   DMA zone: 4076 pages used for memmap
>> >> > [    0.000000]   DMA zone: 0 pages reserved
>> >> > [    0.000000]   DMA zone: 260864 pages, LIFO batch:63
>> >> > [    0.000000]   DMA32 zone: 4096 pages used for memmap
>> >> > [    0.000000]   DMA32 zone: 262144 pages, LIFO batch:63
>> >> > [    0.000000] psci: probing for conduit method from DT.
>> >> > [    0.000000] psci: PSCIv1.1 detected in firmware.
>> >> > [    0.000000] psci: Using standard PSCI v0.2 function IDs
>> >> > [    0.000000] psci: Trusted OS resident on physical CPU 0x0
>> >> > [    0.000000] psci: SMC Calling Convention v1.1
>> >> > [    0.000000] percpu: Embedded 22 pages/cpu s53016 r8192 d28904 u90112
>> >> > [    0.000000] pcpu-alloc: s53016 r8192 d28904 u90112 alloc=22*4096
>> >> > [    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
>> >> > [    0.000000] Detected VIPT I-cache on CPU0
>> >> > [    0.000000] CPU features: detected: GIC system register CPU interface
>> >> > [    0.000000] CPU features: detected: Cavium erratum 30115
>> >> > [    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
>> >>
>> >> If this CPU is just another version of TX1, KPTI shouldn't get
>> >> enabled on this HW, as it definitely breaks (see erratum 27456 and
>> >> its consequences).
>> >> Can you please enable CONFIG_CAVIUM_ERRATUM_27456 and report back?
>> >>
>> >
>> > Marc,
>> >
>> > This is a CN8030 Pass 1.2 part so erratum 27456 does appear to be
>> > needed and it is indeed enabled already in the kernel by default.
>> 
>> And yet the kernel doesn't seem to detect an affected silicon.
>> Can you please apply the following patch and report what happens
>> (including the full dmesg):
>> 
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index 703ad0a84f99..c0890d882e56 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -672,7 +672,7 @@ const struct midr_range cavium_erratum_27456_cpus[] = {
>>         /* Cavium ThunderX, T88 pass 1.x - 2.1 */
>>         MIDR_RANGE(MIDR_THUNDERX, 0, 0, 1, 1),
>>         /* Cavium ThunderX, T81 pass 1.0 */
>> -       MIDR_REV(MIDR_THUNDERX_81XX, 0, 0),
>> +       MIDR_ALL_VERSIONS(MIDR_THUNDERX_81XX),
>>         {},
>>   };
>>   #endif
>> 
> 
> Marc,
> 
> That does enable the erratum, disable KPTI and boot properly, but I
> misread the erratum and it shouldn't be needed for T81 pass 1.2... the
> erratum is documented as only needed for pass 1.0.

Can you then remove the patch *and* disable KPTI?

TX1 is broken beyond recognition and KPTI is known to explode on this HW
(which is why we disable KPTI on it). We always attributed it to this
erratum, but in the absence of any help from Cavium to identify the
problem, we just keyed it on that.

*IF* this HW is indeed unaffected by it, then it is probably the mix of
KPTI and SWPAN that triggers the issue. If my suspicion is correct,
you'll need to have a chat with Cavium/Marvell to understand what is
happening there.
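
For reference, here is a minimal userspace sketch of why the unpatched
table does not match this board. It decodes the MIDR value printed in
the boot log (0x430f0a22) using the architectural MIDR_EL1 field layout
and checks it against a revision-only entry and an all-versions entry,
mirroring the MIDR_REV() vs. MIDR_ALL_VERSIONS() change in the patch.
Everything prefixed sketch_ is a hypothetical name for illustration and
is not the kernel's implementation. The mapping "pass X.Y = variant X-1,
revision Y" is an assumption inferred from the comments in the patch
("T81 pass 1.0" next to MIDR_REV(..., 0, 0)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural MIDR_EL1 fields: implementer [31:24], variant [23:20],
 * part number [15:4], revision [3:0]. */
static uint32_t sketch_implementer(uint32_t midr) { return (midr >> 24) & 0xffu; }
static uint32_t sketch_variant(uint32_t midr)     { return (midr >> 20) & 0xfu; }
static uint32_t sketch_partnum(uint32_t midr)     { return (midr >> 4) & 0xfffu; }
static uint32_t sketch_revision(uint32_t midr)    { return midr & 0xfu; }

/* Hypothetical stand-in for a MIDR range entry: one CPU model plus an
 * inclusive range of (variant << 4 | revision) values. */
struct sketch_midr_range {
	uint32_t implementer;
	uint32_t partnum;
	uint32_t rv_min;
	uint32_t rv_max;
};

/* 0x43 is the Cavium implementer code and 0x0a2 the part number decoded
 * from the boot log's MIDR; both are taken from that value. */
static const struct sketch_midr_range sketch_t81_pass_1_0 = {
	0x43, 0x0a2, 0x00, 0x00	/* variant 0, revision 0 only */
};
static const struct sketch_midr_range sketch_t81_all_versions = {
	0x43, 0x0a2, 0x00, 0xff	/* any variant, any revision */
};

/* True when the MIDR names the same model and its variant/revision
 * falls inside the entry's inclusive range. */
static bool sketch_midr_in_range(uint32_t midr,
				 const struct sketch_midr_range *r)
{
	uint32_t rv = (sketch_variant(midr) << 4) | sketch_revision(midr);

	assert(r->rv_min <= r->rv_max);
	if (sketch_implementer(midr) != r->implementer ||
	    sketch_partnum(midr) != r->partnum)
		return false;
	return rv >= r->rv_min && rv <= r->rv_max;
}
```

With midr = 0x430f0a22 the revision field is 2, so the pass 1.0 entry
does not match while the all-versions entry does, which is consistent
with the erratum (and the KPTI opt-out keyed on it) only kicking in once
the patch is applied.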

         M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thread overview: 14+ messages
2020-02-25  0:35 ARM64_SW_TTBR0_PAN enabled causing hangs on OcteonTX Tim Harvey
2020-02-25  0:50 ` Marc Zyngier
2020-02-25  1:16   ` Tim Harvey
2020-02-25  1:55     ` Marc Zyngier
2020-02-25 16:13       ` Tim Harvey
2020-02-25 16:27         ` Marc Zyngier [this message]
2020-02-25 16:35           ` Robert Richter
2020-02-25 17:00             ` Marc Zyngier
2020-02-25 17:04               ` Tim Harvey
2020-02-25 17:19                 ` Marc Zyngier
2020-02-25 17:29                   ` Tim Harvey
2020-02-25 17:47                     ` Marc Zyngier
2020-02-25 17:11             ` Tim Harvey
2020-03-10 14:51             ` Tim Harvey