From: Tony Lindgren <tony@atomide.com>
To: Felipe Balbi <balbi@ti.com>
Cc: Nishanth Menon <nm@ti.com>, Marc Zyngier <marc.zyngier@arm.com>,
	Olof Johansson <olof@lixom.net>,
	linux-omap@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: New l3-noc error with CPUFREQ_DT built-in with v4.0-rc1
Date: Mon, 23 Feb 2015 18:35:06 -0800
Message-ID: <20150224023505.GQ32521@atomide.com>
In-Reply-To: <20150224022433.GB5198@saruman.tx.rr.com>

* Felipe Balbi <balbi@ti.com> [150223 18:28]:
> Hi,
> 
> On Mon, Feb 23, 2015 at 05:59:04PM -0800, Tony Lindgren wrote:
> > * Tony Lindgren <tony@atomide.com> [150223 16:09]:
> > > Hi Nishanth,
> > > 
> > > Olof told me about a new L3 error happening on omap5-uevm with
> > > v4.0-rc1:
> > > 
> > > WARNING: CPU: 0 PID: 0 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x214/0x340()
> > > 4000000.ocp:L3 Custom Error: MASTER MPU TARGET L4PER2 (Idle): Data Access in Supervisor mode during Functional access
> > > ...
> > > 
> > > I tried bisecting this with no luck, but narrowed it down to
> > > having CONFIG_CPUFREQ_DT=y causing it, while =m won't trigger
> > > it. This got changed by commit 40d1746d2eee ("ARM:
> > > omap2plus_defconfig: use CONFIG_CPUFREQ_DT").
> > > 
> > > Any ideas?
> > 
> > Hmm so setting CONFIG_CPUFREQ_DT=m in arch/arm/configs/omap2plus_defconfig
> > produces the same output with make omap2plus_defconfig as with =y.. So
> > CPUFREQ_DT can't be the real cause of the problem.
> > 
> > It's now looking like the l3-noc warning does not get triggered on
> > every boot.
> > 
> > It also seems the zImage triggering the error does not trigger the
> > error on every boot. To trigger the error, it seems the device needs to
> > be powered down for at least 10 or so seconds between the boots.
> > So far no luck reproducing the error on v3.19.
> > 
> > The easy way to reproduce is to power down omap5 for at least 10 seconds,
> > make omap2plus_defconfig on v4.0-rc1 and boot it.
> > 
> > And so far it looks like next-20150204 works and next-20150209
> > failed right away. But of course I would not trust anything
> > at this point :)
> 
> Got a log of the failure? Is it pointing to a device or one of the L4s?

Well, mostly just the MASTER MPU TARGET L4PER2 line; the stack dump that
follows it is really just the stack dump of l3_interrupt_handler.
 
> Might be worth booting with just the bare minimum (UART & timers) and
> disable everything else. You might need to build busybox and append that
> to the kernel so you don't need to rely on MMC/USB/etc for rootfs.
> 
> After that, you could start enabling modules one by one (as modules, not
> built-in) and loading them one by one to see which one causes the
> failure. Big PITA, I know, but I can't think of any other way to go
> about this.

It seems the best way to deal with this is to make l3_handle_target
actually show the address where the error happened, so we can narrow
it down to a single device.
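
Once an address is printed, mapping it back to a device is a simple
range lookup. A minimal sketch of that step, assuming a hand-written
table: the device names, base addresses and sizes below are
hypothetical placeholders, not values taken from the OMAP5 TRM or
from the driver.

```python
from typing import Optional

# Hypothetical address map: (name, base, size). These entries are
# placeholders for illustration only and would need to be replaced
# with the real ranges for the target interconnect.
ADDRESS_MAP = [
    ("device_a", 0x48000000, 0x1000),
    ("device_b", 0x48001000, 0x1000),
    ("device_c", 0x48010000, 0x4000),
]


def find_device(addr: int) -> Optional[str]:
    """Return the name of the device whose range contains addr, or None."""
    for name, base, size in ADDRESS_MAP:
        # Ranges are half-open: base is included, base + size is not.
        if base <= addr < base + size:
            return name
    return None


if __name__ == "__main__":
    # Example lookup for an address that would come from the error log.
    print(find_device(0x48001004))
```

An address that falls outside every range returns None, which would
itself be a hint that the table is incomplete.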

Regards,

Tony
