From: "Heiko Stübner" <heiko@sntech.de>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
vicencb@gmail.com, linux-rockchip@lists.infradead.org,
andre.przywara@arm.com,
Philipp Richter <richterphilipp.pops@gmail.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: aarch64 Kernel Panic Asynchronous SError Interrupt on large file IO
Date: Mon, 07 Oct 2019 15:38:01 +0200
Message-ID: <2769202.trDOcCdrXg@diego>
In-Reply-To: <7f659a93-81e1-65f3-8239-537307f34f42@arm.com>
On Monday, 7 October 2019, 13:51:37 CEST, Robin Murphy wrote:
> On 06/10/2019 14:13, Heiko Stuebner wrote:
> > On Sunday, 6 October 2019, 01:45:23 CEST, Robin Murphy wrote:
> >> On 2019-08-19 11:43 am, Will Deacon wrote:
> >>> On Mon, Aug 19, 2019 at 11:07:14AM +0100, Catalin Marinas wrote:
> >>>> On Sat, Aug 17, 2019 at 03:12:41PM +0200, Philipp Richter wrote:
> >>>>> I added "memtest=4" to the kernel cmdline and I'm very quickly getting
> >>>>> an "Internal error: synchronous external abort" panic.
> >>>> [...]
> >>>>> [ 0.000000] early_memtest: # of tests: 4
> >>>>> [ 0.000000] 0x0000000000200000 - 0x0000000002080000 pattern aaaaaaaaaaaaaaaa
> >>>>> [ 0.000000] 0x0000000003a95000 - 0x00000000f8400000 pattern aaaaaaaaaaaaaaaa
> >>>>> [ 0.000000] Internal error: synchronous external abort: 96000210 [#1] SMP
> >>>>
> >>>> At least it's a synchronous error ;).
> >>>>
> >>>>> [ 0.000000] pc : early_memtest+0x16c/0x23c
> >>>> [...]
> >>>>> [ 0.000000] Code: d2800002 d2800001 eb0400bf 54000309 (f9400080)
> >>>>
> >>>> decodecode says:
> >>>>
> >>>> 0: d2800002 mov x2, #0x0 // #0
> >>>> 4: d2800001 mov x1, #0x0 // #0
> >>>> 8: eb0400bf cmp x5, x4
> >>>> c: 54000309 b.ls 0x6c // b.plast
> >>>> 10:* f9400080 ldr x0, [x4] <-- trapping instruction
> >>>>
> >>>> I guess that's the read of *p in memtest(). Writing *p probably
> >>>> generates asynchronous errors if you haven't seen it yet.
> >>>>
> >>>>> Is my board completely broken ? :(
> >>>>
> >>>> One possibility is that you don't have any memory where you think there
> >>>> is, so the mapping just doesn't translate to any valid physical
> >>>> location.
> >>>>
> >>>> Can you add some printk(addr) in do_sea() to see if it always faults on
> >>>> the same address?
> >>>
> >>> Alternatively, just run it a few more times and see if the register dump
> >>> changes. Currently we've got:
> >>>
> >>> [ 0.000000] x5 : ffff8000f8400000 x4 : ffff800008400000
> >>> [ 0.000000] x3 : 0000000008400000 x2 : 0000000000000000
> >>> [ 0.000000] x1 : 0000000000000000 x0 : aaaaaaaaaaaaaaaa
> >>>
> >>> so I'd guess that x3 is the faulting pa. The faulting (linear) VAs in the
> >>> original report were 0xffff800009c74aa8 and 0xffff800009c08390, which is
> >>> still a way way off from this one :/
> >>>
> >>> Looking at the TRM for the rk3328, there's 4gb of ram starting at pa 0x0,
> >>> so maybe some of it has been configured as secure or the memory controller
> >>> hasn't been properly initialised?
> >>
> >> FWIW I've noticed my RK3399 board doing this too, now that I've started
> >> using it in anger. I'm using a hacky firmware comprising upstream U-Boot
> >> munged with the Rockchip miniloader and downstream Trusted Firmware
> >> binaries,
> >
> > any reason for that combination? For example the rockpro64 got ddr4 support
> > in upstream uboot recently.
>
> Not really; it's just the "works well enough" setup that made distro
> boot usable before the SPL support went upstream, and (other than
> hacking in the CPU PLL initialisation which otherwise gets lost in that
> combination) I haven't touched it since.
>
> [ for now I've just hacked a reserved-memory node into my DT... one day
> I'll get round to firmware tinkering ;) ]
>
>
> >> and it looks like that mismatch is the root of this problem.
> >> Booting a different image based on the BSP U-boot shows that that's
> >> passing a memory node with the range 0x8400000-0x9600000 entirely carved
> >> out, so this is presumably claimed by the secure firmware/TEE and set to
> >> abort Non-Secure accesses.
> >
> > As TEE on PX30 is also one of my current projects, I've stumbled over that
> > memory issue. At least OP-TEE can get passed a location for a dtb during
> > startup which it then would modify to add a reserved section for its memory.
> >
> > But that dtb generally is not the one the kernel will actually use, but
> > instead only the one used by uboot. extlinux, tftp or whatever will normally
> > load and use a new dtb for the kernel which will likely not get that memory
> > reservation automatically?
> >
> > I'm not yet sure how this is supposed to work in an all-upstream
> > configuration - I'm running upstream u-boot + upstream TF-A + upstream
> > OP-Tee in my project environment right now.
>
> As far as I understand, U-Boot is still responsible for generating the
> memory node in whatever DTB it loads and passes to the kernel, so it
> should still be able to adjust that accordingly. Presumably U-Boot needs
> to discover any firmware/TEE reservations early on to avoid touching any
> Secure memory itself, so it should just need to keep track of them until
> finalising the kernel DTB.

Yeah, that's similar to what I've discovered so far :-D .

SPL loads u-boot.itb, which should contain u-boot, tf-a, tee and dt.
[vendor tf-a might do that differently though]

It passes the dt address as a parameter to both tf-a and optee, which
may then add stuff, like optee adding the firmware node + reserved-memory
sections.

This dt is then the basis for the main u-boot, to be found at gd->fdt_blob.
So u-boot will need to discover and transplant the optee firmware + optee
reserved-memory sections to any later dt that gets loaded.

Which is what I'll be looking at next ;-) .
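
To illustrate why that transplant matters, here is a minimal sketch (plain
Python, not U-Boot or kernel code) of the range arithmetic only. carve_out()
is a hypothetical helper. It treats ranges as half-open, which is an
assumption, since the thread does not say whether 0x9600000 is inclusive. It
also assumes the carve-out Robin saw from the BSP U-Boot applies to the board
in the quoted memtest log.

```python
# Hypothetical helper: illustrates the range arithmetic only.
def carve_out(regions, reserved):
    """Remove each half-open [start, end) reserved range from the regions."""
    for r_start, r_end in reserved:
        remaining = []
        for start, end in regions:
            # No overlap: keep the region unchanged.
            if r_end <= start or end <= r_start:
                remaining.append((start, end))
                continue
            # Keep whatever lies below and above the reserved range.
            if start < r_start:
                remaining.append((start, r_start))
            if r_end < end:
                remaining.append((r_end, end))
        regions = remaining
    return regions


# Second range scanned by early_memtest in the quoted log.
memtest_range = [(0x03A95000, 0xF8400000)]
# Range carved out by the BSP U-Boot memory node (end assumed exclusive).
secure_range = [(0x08400000, 0x09600000)]

usable = carve_out(memtest_range, secure_range)
for start, end in usable:
    print(f"{start:#010x} - {end:#010x}")

# x3 in the quoted register dump, guessed there to be the faulting pa.
faulting_pa = 0x08400000
still_mapped = any(start <= faulting_pa < end for start, end in usable)
print("faulting pa still usable:", still_mapped)
```

With the carve-out applied, the guessed faulting pa is the first address of
the removed range, so a kernel DT that carries the reservation would never
have memtest touch it.
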
Heiko