From: Dinh Nguyen <dinguyen-yzvPICuk2ABMcg4IHK0kFoH6Mc4MB0Vx@public.gmane.org>
To: Russell King - ARM Linux
	<linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>,
	Dinh Nguyen <dinh.linux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Andrew Lunn <andrew-g2DYL2Zd6BY@public.gmane.org>,
	Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>,
	Linux-sh list <linux-sh-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Gregory Clement
	<gregory.clement-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>,
	Thierry Reding
	<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Alexandre Courbot
	<gnurou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Florian Fainelli
	<f.fainelli-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Magnus Damm <magnus.damm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Michal Simek
	<michal.simek-gjFFaj9aHVfQT0dZR+AlfA@public.gmane.org>,
	Wei Xu <xuwei5-C8/M+/jPZTeaMJb+Lgu22Q@public.gmane.org>,
	"open list:ARM/Rockchip SoC..."
	<linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>,
	Geert Uytterhoeven
	<geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org>,
	"linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org"
	<linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>,
	bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w@public.gmane.org,
	Sebastian Hesselbarth
	<sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Jason Cooper <jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org>,
	Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>,
	Marc Carino <marc.ceeeee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>,
	Gregory Fong
	<gregory.0xf0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	linux-tegra@vger
Subject: Re: [PATCH] ARM: v7 setup function should invalidate L1 cache
Date: Wed, 17 Jun 2015 17:12:45 -0500
Message-ID: <5581F0DD.60408@opensource.altera.com>
In-Reply-To: <20150617213006.GC7557-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>

On 06/17/2015 04:30 PM, Russell King - ARM Linux wrote:
> On Wed, Jun 17, 2015 at 03:35:13PM -0500, Dinh Nguyen wrote:
>> On Mon, Jun 1, 2015 at 6:50 AM, Geert Uytterhoeven <geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org> wrote:
>>> Hi Russell,
>>>
>>> On Mon, Jun 1, 2015 at 12:53 PM, Russell King - ARM Linux
>>> <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org> wrote:
>>>> On Mon, Jun 01, 2015 at 12:41:01PM +0200, Geert Uytterhoeven wrote:
>>>>> FWIW, I have the feeling this has a slight influence on boot reliability on
>>>>> two of my boards:
>>>>>   - r8a7740/armadillo, which is known to suffer from a cache-related bug in
>>>>>     its bootloader, seems to have a higher chance of booting successfully on
>>>>>     cold boot,
>>>>>   - sh73a0/kzm9g, which has known cache-issues with secondary CPU boot up,
>>>>>     seems to have a lower chance of booting successfully.
>>>>>
>>>>> No time to spend all week turning this into a statistically significant test
>>>>> project... The reset button is my friend...
>>>>
>>>> Damn it, you sent this right after I merged and pushed out this change in
>>>> my for-arm-soc branch, and was just about to send it to the arm-soc people.
>>>> What excellent timing you have. :)
>>>
>>> Don't worry, I didn't send that email to make you postpone this change.
>>> Given the fuzziness of reproduction, and the flakiness (esp. on Armadillo)
>>> of the boot loader, and these are old SoCs, please go ahead.
>>>
>>>> What happens on the kzm9g if you revert the mach-shmobile changes?
>>>
>>> Seems to make no difference.
>>>
>>>> For armadillo, do you use the decompressor?  That should be doing all the
>>>> cache cleaning already, prior to the kernel being entered.
>>>
>>> I think so.
>>>
>>> Corruption pattern ranges from lock up, over "Error: unrecognized/unsupported
>>> machine ID", to booting almost completely, but lacking a few devices due to
>>> a corrupted DTB. Been like that as long as I remember, i.e. since I got the
>>> board ca. 1 year ago. Boots fine (100%) with kexec.
>>>
>>
>> It seems like this patch is causing the SoCFPGA to not boot with SMP
>> reliably. About 1 out of every 10 reboots, I'm seeing the boot failure
>> below. The error seems to only happen when I do a cold or warm reboot,
>> but never occurs during a power-up. If I revert this patch, or put
>> back the call to v7_invalidate_l1 in socfpga_secondary_startup, then
>> it's able to boot 100% of the time.
> 
> It really sucks that you're only just testing this change now, because
> I've frozen my tree, and removing it for the next merge window is going
> to be an entirely non-trivial matter.  You were copied on the original
> patch, which you failed to test... I can't say I have _much_ sympathy
> for a bug report at this point in time.
> 

I apologize for not catching this error when I tested this patch. I did
test it when you first sent it out, but I probably didn't do a stress
test. Sometimes the reboot fails on the 1st attempt, sometimes on the
9th.

I only caught this error when I was testing my recent changes to use
CPU_METHOD_OF_DECLARE.

For my part, I don't think you need to revert this patch, but could a
fix go in for an -rcX?

Dinh
