From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
To: Embedded Engineer <embed786@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
Vladimir Murzin <vladimir.murzin@arm.com>,
Jon Hunter <jonathanh@nvidia.com>,
Thierry Reding <thierry.reding@gmail.com>,
linux-tegra@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: Unstable Kernel behavior on an ARM based board
Date: Tue, 5 Mar 2019 11:22:26 +0000
Message-ID: <20190305112226.rhbl3dwopmip45ja@shell.armlinux.org.uk>
In-Reply-To: <CA+_ZnZTeZLyCcjZduQODzjWxTpU96AefzvTBDFbq2CSjVQxONg@mail.gmail.com>
On Tue, Mar 05, 2019 at 03:29:26PM +0500, Embedded Engineer wrote:
> On Tue, Mar 5, 2019 at 3:07 PM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > Please apply this patch so we can see the (ptrval) values. Thanks.
>
> Please find below logs after applying patch:
>
> https://pastebin.com/6TaBxPX5
So we have a pattern here:
tegra-udc 7d000000.usb: dma_pool_alloc ci_hw_qh, ec056080 (corrupted)
00000000: c0 00 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000010: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000020: 80 00 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000030: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
tegra-udc 7d000000.usb: dma_pool_alloc ci_hw_qh, ec056140 (corrupted)
00000000: 80 01 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000010: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000020: 40 01 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 @...............
00000030: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
tegra-udc 7d000000.usb: dma_pool_alloc ci_hw_qh, ec0561c0 (corrupted)
00000000: 00 02 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000010: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000020: 40 03 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 @...............
00000030: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
tegra-udc 7d000000.usb: dma_pool_alloc ci_hw_qh, ec056200 (corrupted)
00000000: 40 02 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 @...............
00000010: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
00000020: 40 05 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 @...............
00000030: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................
and so it goes on.
The first four bytes are the offset to the next free block of memory in
this page, so can be ignored. The remaining bytes should all be 0xa7,
but in every one of these dumps the word at offset 32 is corrupted with
what looks to be a similar kind of offset.
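As a rough illustration of that reading of the dumps, here is a small
Python sketch that parses the first dump above and reports which bytes
past the leading next-offset word are not 0xa7. The helper names
(parse_dump, corrupted_offsets) are made up for this example, and it
assumes the words are stored little-endian, as the byte order in the
dumps suggests.

```python
POISON = 0xA7       # fill byte the dumps show for untouched pool memory
NEXT_WORD_LEN = 4   # first four bytes of a free block hold the next-free offset


def parse_dump(lines):
    """Turn 'OFFSET: b0 b1 ... b15 ascii' dump lines into raw bytes."""
    data = bytearray()
    for line in lines:
        _, _, rest = line.partition(":")
        # The first 16 whitespace-separated tokens are the hex bytes;
        # the trailing token is the ASCII rendering and is ignored.
        data.extend(int(tok, 16) for tok in rest.split()[:16])
    return bytes(data)


def corrupted_offsets(block):
    """Offsets past the next-free word whose byte is not the poison value."""
    return [i for i in range(NEXT_WORD_LEN, len(block)) if block[i] != POISON]


# Dump of the object at ec056080, copied from the log above.
dump = [
    "00000000: c0 00 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................",
    "00000010: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................",
    "00000020: 80 00 00 00 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................",
    "00000030: a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 a7 ................",
]

block = parse_dump(dump)
next_free = int.from_bytes(block[:NEXT_WORD_LEN], "little")
bad = corrupted_offsets(block)
corrupt_word = int.from_bytes(block[32:36], "little")

print(f"next free offset: {next_free:#x}")
print("corrupted byte offsets:", [hex(i) for i in bad])
print(f"word at offset 32: {corrupt_word:#x}")
```

For this dump the only bytes that differ from 0xa7 are the four at
offsets 0x20 to 0x23.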
We dump 0x40 bytes per object, which, going by the code, means the
objects in this pool are 0x40 bytes in size. The table below lists the
object offset, the next offset, and the corruption at offset 32.
Corruption1 is from your latest log; corruption2 is derived from your
previous log, using the next pointer to tie the two together:
object offset   next     corruption1   corruption2
0x0080          0x00c0   0x00000080    0x00000080
0x0140          0x0180   0x00000140    0x00000100
0x01c0          0x0200   0x00000340    0x000001c0
0x0200          0x0240   0x00000540    0x000001c0
0x0280          0x02c0   0x00000340    0x00000300
0x0340          0x0380   0x00000540    0x00000140
0x03c0          0x0400   0x00000540    0x00000300
0x0400          0x0440   0x000003c0    0x00000140
0x0480          0x04c0   0x00000540    0x000003c0
0x0540          0x0580   0x00000480    0x00000540
0x05c0          0x0600   0x000005c0    0x000005c0
0x0600          0x0640   0x00000500    0x000005c0
0x0680          0x06c0   0x00000740    0x00000680
??????          0x0780                 0x00000740
0x07c0          0x0800   0x000007c0    0x00000700
The corruption looks very much like offset values, except they do not
seem to follow any rhyme or reason. They also appear to be different
on each boot.
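To put some numbers behind "looks very much like offset values", the
following Python sketch checks three things against the table: that
each next pointer is the object offset plus 0x40, that every corrupting
value is a multiple of the 0x40 object size, and how many rows differ
between the two logs. It is only a sanity check on the tabulated data.
The row with the unknown object offset is assumed here to carry its
single value in the corruption2 column, since the object address was
not visible in the previous log.

```python
OBJECT_SIZE = 0x40  # object size inferred from the 0x40-byte dumps

# (object offset, next, corruption1, corruption2); None marks a value
# that the table does not give.
rows = [
    (0x0080, 0x00C0, 0x0080, 0x0080),
    (0x0140, 0x0180, 0x0140, 0x0100),
    (0x01C0, 0x0200, 0x0340, 0x01C0),
    (0x0200, 0x0240, 0x0540, 0x01C0),
    (0x0280, 0x02C0, 0x0340, 0x0300),
    (0x0340, 0x0380, 0x0540, 0x0140),
    (0x03C0, 0x0400, 0x0540, 0x0300),
    (0x0400, 0x0440, 0x03C0, 0x0140),
    (0x0480, 0x04C0, 0x0540, 0x03C0),
    (0x0540, 0x0580, 0x0480, 0x0540),
    (0x05C0, 0x0600, 0x05C0, 0x05C0),
    (0x0600, 0x0640, 0x0500, 0x05C0),
    (0x0680, 0x06C0, 0x0740, 0x0680),
    (None,   0x0780, None,   0x0740),  # assumption: value is corruption2
    (0x07C0, 0x0800, 0x07C0, 0x0700),
]

# The next pointer of each free object should point at the adjacent object.
next_is_adjacent = all(
    nxt == obj + OBJECT_SIZE for obj, nxt, _, _ in rows if obj is not None
)

# Do the corrupting values look like object offsets within the page?
values = [v for _, _, c1, c2 in rows for v in (c1, c2) if v is not None]
all_block_aligned = all(v % OBJECT_SIZE == 0 for v in values)

# How many objects saw a different corrupting value on the two boots?
differing = sum(
    1 for _, _, c1, c2 in rows
    if c1 is not None and c2 is not None and c1 != c2
)

print("next == object + 0x40 everywhere:", next_is_adjacent)
print("all corrupting values 0x40-aligned:", all_block_aligned)
print("rows differing between the two logs:", differing)
```

Both checks hold for the data above, so the stray words are at least
consistent with being offsets of 0x40-byte objects, even though their
ordering does not match anything obvious.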
Given that the sequence here when a pool allocation occurs is:
1. allocate DMA coherent page
2. memset entire page with 0xa7
3. write next offsets
4. initialise 'offset' to zero (offset of first free object)
5. add page to pool's list of pages
6. allocate first object, updating offset to the next free offset read
from the first word of the object.
then when the next allocation request comes along, we allocate the
next object in the same way as step 6. At the point of allocating the
third object, we find that there is corruption in the third object at
0x20 bytes into it - or 0xa0 bytes into the page.
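The sequence above can be modelled in a few lines of Python. This is a
simplified sketch of the steps as described here, not the kernel's
dmapool code: the PoolPage class, the 4096-byte page size and the
little-endian 32-bit next word are assumptions made for illustration,
step 5 (the pool's page list) is left out, and there is no handling of
a full page. The stray write stands in for whatever agent is corrupting
memory.

```python
PAGE_SIZE = 4096    # assumed page size
OBJECT_SIZE = 0x40  # object size inferred from the dumps
POISON = 0xA7       # fill byte written over a freshly allocated page


class PoolPage:
    """Toy model of one page of a DMA pool (illustrative only)."""

    def __init__(self):
        # Steps 1-2: allocate the page and fill it with the poison byte.
        self.mem = bytearray([POISON]) * PAGE_SIZE
        # Step 3: the first word of each object holds the next free offset.
        for off in range(0, PAGE_SIZE, OBJECT_SIZE):
            self.mem[off:off + 4] = (off + OBJECT_SIZE).to_bytes(4, "little")
        # Step 4: the first free object is at offset zero.
        self.offset = 0

    def alloc(self):
        # Step 6: hand out the current object and advance 'offset' to the
        # next free offset read from the object's first word.
        obj = self.offset
        self.offset = int.from_bytes(self.mem[obj:obj + 4], "little")
        return obj

    def poison_damage(self, obj):
        """Offsets within the object, past the next word, that are not poison."""
        return [i for i in range(4, OBJECT_SIZE) if self.mem[obj + i] != POISON]


page = PoolPage()
first = page.alloc()
second = page.alloc()

# Something other than the pool code writes a word at 0xa0 into the page.
page.mem[0xA0:0xA4] = (0x80).to_bytes(4, "little")

third = page.alloc()
damage = page.poison_damage(third)

print(f"objects at {first:#x}, {second:#x}, {third:#x}")
print("damaged offsets in third object:", [hex(i) for i in damage])
print(f"first damaged byte is at page offset {third + damage[0]:#x}")
```

In this model nothing in the allocation path itself ever writes to
offset 0x20 of an object, which is why the corruption has to come from
somewhere outside these steps.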
Now, what does the driver that's allocating these do with them? That
is done via init_eps() in drivers/usb/chipidea/udc.c, which doesn't do
anything with the allocated memory. This is the only place that the
driver allocates from this DMA pool, which is done in a loop, so we
know that the objects allocated from this pool will be in relatively
quick succession.
So this does not make sense.
I really doubt that there is anything wrong with the kernel - this USB
driver is used on other SoCs (such as iMX6) and does not exhibit this
problem there, and it also works on the Tegra TK1 platform.
You are definitely seeing memory corruption here - but given what the
above looks like, I'd put forward another possible scenario - maybe
u-boot or something else is leaving a USB controller or some other DMA
agent active, which is writing over memory while the kernel is trying
to boot, resulting in memory corruption.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up