From: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: "Willy Tarreau" <w@1wt.eu>, "Andrew Lunn" <andrew@lunn.ch>,
	"Jason Cooper" <jason@lakedaemon.net>,
	netdev@vger.kernel.org, "Ethan Tuttle" <ethan@ethantuttle.com>,
	"Ezequiel Garcia" <ezequiel.garcia@free-electrons.com>,
	"Gregory Clément" <gregory.clement@free-electrons.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: mvneta: oops in __rcu_read_lock on mirabox
Date: Mon, 16 Sep 2013 18:24:50 +0200
Message-ID: <20130916182450.639084c6@skate>
In-Reply-To: <20130916162209.GL12758@n2100.arm.linux.org.uk>

Russell,

On Mon, 16 Sep 2013 17:22:09 +0100, Russell King - ARM Linux wrote:

> One seemed to be a single bit error in an instruction inside the kernel
> image.  The other was what seems to be an impossible abort.
> 
> I still don't see how we could end up with a prefetch abort inside memset()
> due to the kernel domain being inaccessible, but still be able to get
> an oops out, especially when we dump out the memory for the faulting
> instruction by accessing that memory via that apparently inaccessible
> domain while running the code which dumps that memory also under this
> apparently inaccessible domain.  If the domain containing the kernel
> really was inaccessible, the system would be completely dead.
> 
> The only possibilities I can come up with for that is that abort was
> caused by something spurious happening at the hardware level causing
> corruption of the instruction TLB (corrupting the domain index stored
> in the I-TLB) or other CPU control hardware causing it to spuriously
> generate that fault.
> 
> As the domain field in the page table L1 entries covers bit 8, and the
> single bit error with the instruction was also bit 8, maybe there's a
> design weakness on data line bit 8 causing marginal operation.
> 
> To add to this, the abort given in this report gives an IFSR value of
> 0x409, which equates to "Synchronous parity error on memory access"
> in ARMv7.  The other value (0x400) equates to "TLB conflict abort"
> which can only happen with LPAE support enabled...  So this is just
> getting more weird!
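
For readers following the IFSR decoding and the bit 8 observation quoted
above, here is a minimal sketch of the arithmetic. It assumes the ARMv7
short-descriptor layout (FS[4] in IFSR bit 10, FS[3:0] in bits 3:0, and
the domain field in bits [8:5] of a first-level descriptor). The table
lists only the two encodings mentioned in this thread, and the function
names are made up for illustration, not taken from the kernel.

```python
# Subset of ARMv7 short-descriptor fault status encodings, keyed by FS[4:0].
# Only the two values discussed in this thread are listed (assumption:
# short-descriptor format, i.e. IFSR bit 9 is clear).
FAULT_STATUS = {
    0b10000: "TLB conflict abort",
    0b11001: "Synchronous parity error on memory access",
}

# Domain field of a first-level (L1) descriptor: bits [8:5].
L1_DOMAIN_MASK = 0xF << 5


def fault_status(ifsr: int) -> int:
    """Assemble FS[4:0] from IFSR bit 10 (FS[4]) and bits 3:0 (FS[3:0])."""
    return (((ifsr >> 10) & 0x1) << 4) | (ifsr & 0xF)


def describe(ifsr: int) -> str:
    """Return a description for the fault status, if it is in the table."""
    fs = fault_status(ifsr)
    return FAULT_STATUS.get(fs, f"unlisted FS 0b{fs:05b}")


def flips_domain(bit: int) -> bool:
    """True if a single-bit error at this position lands in the domain field."""
    return bool((1 << bit) & L1_DOMAIN_MASK)


if __name__ == "__main__":
    for value in (0x409, 0x400):
        print(f"IFSR {value:#05x}: {describe(value)}")
    print("bit 8 inside L1 domain field:", flips_domain(8))
```

The second helper only shows that bit 8 is the top bit of the 4-bit
domain field, so a single flipped bit there would change the domain index.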

Could this be caused by bit flips in RAM due to bad timings,
overheating, or that kind of thing?

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
