Re: Unstable Kernel behavior on an ARM based board

From: Thierry Reding @ 2019-03-04 14:25 UTC
  To: Embedded Engineer
  Cc: linux-tegra, Vladimir Murzin, linux-arm-kernel, Jon Hunter

On Mon, Mar 04, 2019 at 05:25:28PM +0500, Embedded Engineer wrote:
> On Mon, Mar 4, 2019 at 3:26 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> >
> > You can try in-kernel memtest:
> >
> > - CONFIG_MEMTEST=y
> > - pass memtest in kernel's command line
> >
> 
> Thanks Vladimir, I tried running mtest in u-boot as suggested by
> Clemens and memtest in the kernel as suggested by you. Neither test
> showed any errors, however the board sometimes hangs at "Starting
> kernel ...". The following logs were obtained when it booted but
> ended in a crash:
> 
> https://pastebin.com/sZZjUcbh
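
To double-check that the memtest option quoted above actually reached the
kernel, the command line can be inspected after boot. The following is
only a minimal sketch: the helper name parse_memtest is hypothetical, and
it assumes the documented "memtest=<integer>" form of the parameter, with
a bare "memtest" leaving the pattern count up to the kernel.

```python
from typing import Optional, Tuple


def parse_memtest(cmdline: str) -> Tuple[bool, Optional[int]]:
    """Report whether a kernel command line requests the in-kernel memtest.

    Returns (enabled, passes). passes is None when no explicit count was
    given, in which case the kernel chooses the count itself.
    """
    enabled, passes = False, None
    for token in cmdline.split():
        if token == "memtest":
            # Bare option: enabled, count left to the kernel.
            enabled, passes = True, None
        elif token.startswith("memtest="):
            value = token.split("=", 1)[1]
            try:
                # Base 0 accepts decimal as well as 0x-prefixed values.
                passes = int(value, 0)
            except ValueError:
                continue  # malformed value: ignore this token
            # memtest=0 means the test is disabled.
            enabled = passes > 0
    return enabled, passes
```

On the running board this could be fed the contents of /proc/cmdline, for
example parse_memtest(open("/proc/cmdline").read()). Note that this only
shows what was requested; CONFIG_MEMTEST=y is still needed for the kernel
to act on it.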

Other than the memory corruption issue, this looks like a fairly regular
boot. It's not clear whether the crash of your /sbin/init is related to
any memory issues. The earlier boot log that you had posted showed that
it was failing to mount the root filesystem and dropped you to a
maintenance shell, so that could be an indication that something isn't
right about the root filesystem. Or it could indicate that something is
wrong when loading files from the root filesystem.

The earlier log showed EMEM address decode errors, which are odd because
the addresses clearly lie in regions that should be system memory. EMEM
address decode errors usually only happen if the memory controller thinks
you are trying to access memory outside of system memory.
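
As a quick sanity check on the addresses reported in such errors, they can
be compared against the memory ranges the kernel was told about. This is a
rough sketch only: the DRAM base of 0x80000000 and the 2 GiB size are
assumptions for a Jetson TK1 class board, and both the SYSTEM_MEMORY table
and the in_system_memory helper are made up for illustration. The real
values should come from the memory node of the device tree in use.

```python
from typing import List, Tuple

# Assumed layout (not taken from the logs): one DRAM bank starting at
# 0x8000_0000 and spanning 2 GiB. Replace with the board's real ranges.
SYSTEM_MEMORY: List[Tuple[int, int]] = [(0x8000_0000, 2 * 1024**3)]


def in_system_memory(addr: int,
                     ranges: List[Tuple[int, int]] = SYSTEM_MEMORY) -> bool:
    """Return True if addr falls inside any (base, size) range."""
    return any(base <= addr < base + size for base, size in ranges)
```

An address for which this returns True but which still triggers a decode
error would suggest that the memory controller's idea of the memory size
differs from what the device tree describes.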

The good news is that I think you're pretty close. The memory corruption
is somewhat worrying, but at the same time it's unlikely that you'd get
as far as you do if your memory timings are completely off. However, I
think we need to gather more information to narrow down what's going
wrong.

All of the memory-related configuration is part of a file called the
BCT. It would be very useful if you could provide that file. Also, it
looks like you're using the Jetson TK1 device tree to boot, so can I
assume you haven't modified it at all?

Other bits of information that would be good to know are how you are
generating the BCT and your boot images, what exactly you do to flash
the board and which release of L4T you use.

Perhaps also try to run a recent linux-next just to exclude any issues
that may have been part of the 4.8.0-rc7 that you tested.

Also adding Jon and linux-tegra for a broader audience.

Thanks,
Thierry
