From: Marc Zyngier <maz@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: linux-arm-kernel@lists.infradead.org,
Florian Fainelli <f.fainelli@gmail.com>,
Jason Cooper <jason@lakedaemon.net>,
Scott Branden <sbranden@broadcom.com>,
Ray Jui <rjui@broadcom.com>, Stefan Wahren <wahrenst@gmx.net>,
Serge Schneider <serge@raspberrypi.org>,
linux-kernel@vger.kernel.org, Phil Elwell <phil@raspberrypi.org>,
Matthias Brugger <mbrugger@suse.com>,
bcm-kernel-feedback-list@broadcom.com,
linux-rpi-kernel@lists.infradead.org,
Martin Sperl <kernel@martin.sperl.org>,
Thomas Gleixner <tglx@linutronix.de>,
Nicolas Saenz Julienne <nsaenzjulienne@suse.de>,
Kristina Brooks <notstina@gmail.com>
Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader
Date: Wed, 12 Feb 2020 08:13:29 +0000 [thread overview]
Message-ID: <d49218987c0d1d573aaa3bcccf44ffe3@kernel.org> (raw)
In-Reply-To: <8be2f3e95fb29abdf80240f2b8a38621c42eb2a9.1581327911.git.lukas@wunner.de>
Hi Lukas,
Thanks for the update on this.
On 2020-02-10 09:52, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled
> when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt is enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"
> ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt is enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.
While the story is interesting, it doesn't really belong to a commit
message.
Please trim it down to something along the lines of:
- The RPi bootloader is a bit crap, as it leaves IRQs and FIQs enabled
and for the OS to deal with the consequences
- The kernel driver is not great either, as it doesn't properly
initialize
the interrupt state, resulting in both IRQ and FIQ misfiring and
resulting
in bizarre behaviours
- Properly initializing the irqchip fixes the issue. Add a couple a
warnings
for a good measure, so that people realize their favourite toy comes
with
sub-par SW.
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: Serge Schneider <serge@raspberrypi.org>
> Cc: Kristina Brooks <notstina@gmail.com>
> Cc: stable@vger.kernel.org
> ---
> Changes since v1:
> * Use "relaxed" MMIO accessors to avoid memory barriers (Marc)
> * Use u32 instead of int for register access (Marc)
> * Quiesce FIQ as well (Marc)
> * Quiesce IRQs after mapping them for better readability
> * Drop alternative approach from commit message (Marc)
>
> Link to v1:
> https://lore.kernel.org/lkml/988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de/
>
> drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c
> b/drivers/irqchip/irq-bcm2835.c
> index 418245d31921..63539c88ac3a 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -61,6 +61,7 @@
> | SHORTCUT1_MASK | SHORTCUT2_MASK)
>
> #define REG_FIQ_CONTROL 0x0c
> +#define REG_FIQ_ENABLE 0x80
>
> #define NR_BANKS 3
> #define IRQS_PER_BANK 32
> @@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct
> device_node *node,
> {
> void __iomem *base;
> int irq, b, i;
> + u32 reg;
>
> base = of_iomap(node, 0);
> if (!base)
> @@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct
> device_node *node,
> handle_level_irq);
> irq_set_probe(irq);
> }
> +
> + reg = readl_relaxed(intc.enable[b]);
> + if (reg) {
> + writel_relaxed(reg, intc.disable[b]);
> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, ®);
> + }
> + }
> +
> + reg = readl_relaxed(base + REG_FIQ_CONTROL);
> + if (reg & REG_FIQ_ENABLE) {
> + writel_relaxed(0, base + REG_FIQ_CONTROL);
> + pr_err(FW_BUG "Bootloader left fiq enabled\n");
> }
>
> if (is_2836) {
It otherwise looks good. You can either resend it with a fixed commit
message,
or provide me with a commit message that I can stick there while
applying it.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Jason Cooper <jason@lakedaemon.net>,
Nicolas Saenz Julienne <nsaenzjulienne@suse.de>,
Florian Fainelli <f.fainelli@gmail.com>,
Ray Jui <rjui@broadcom.com>,
Scott Branden <sbranden@broadcom.com>,
bcm-kernel-feedback-list@broadcom.com,
linux-kernel@vger.kernel.org,
linux-rpi-kernel@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
Serge Schneider <serge@raspberrypi.org>,
Kristina Brooks <notstina@gmail.com>,
Stefan Wahren <wahrenst@gmx.net>,
Matthias Brugger <mbrugger@suse.com>,
Martin Sperl <kernel@martin.sperl.org>,
Phil Elwell <phil@raspberrypi.org>
Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader
Date: Wed, 12 Feb 2020 08:13:29 +0000 [thread overview]
Message-ID: <d49218987c0d1d573aaa3bcccf44ffe3@kernel.org> (raw)
In-Reply-To: <8be2f3e95fb29abdf80240f2b8a38621c42eb2a9.1581327911.git.lukas@wunner.de>
Hi Lukas,
Thanks for the update on this.
On 2020-02-10 09:52, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled
> when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt is enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"
> ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt is enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.
While the story is interesting, it doesn't really belong to a commit
message.
Please trim it down to something along the lines of:
- The RPi bootloader is a bit crap, as it leaves IRQs and FIQs enabled
and for the OS to deal with the consequences
- The kernel driver is not great either, as it doesn't properly
initialize
the interrupt state, resulting in both IRQ and FIQ misfiring and
resulting
in bizarre behaviours
- Properly initializing the irqchip fixes the issue. Add a couple a
warnings
for a good measure, so that people realize their favourite toy comes
with
sub-par SW.
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: Serge Schneider <serge@raspberrypi.org>
> Cc: Kristina Brooks <notstina@gmail.com>
> Cc: stable@vger.kernel.org
> ---
> Changes since v1:
> * Use "relaxed" MMIO accessors to avoid memory barriers (Marc)
> * Use u32 instead of int for register access (Marc)
> * Quiesce FIQ as well (Marc)
> * Quiesce IRQs after mapping them for better readability
> * Drop alternative approach from commit message (Marc)
>
> Link to v1:
> https://lore.kernel.org/lkml/988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de/
>
> drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c
> b/drivers/irqchip/irq-bcm2835.c
> index 418245d31921..63539c88ac3a 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -61,6 +61,7 @@
> | SHORTCUT1_MASK | SHORTCUT2_MASK)
>
> #define REG_FIQ_CONTROL 0x0c
> +#define REG_FIQ_ENABLE 0x80
>
> #define NR_BANKS 3
> #define IRQS_PER_BANK 32
> @@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct
> device_node *node,
> {
> void __iomem *base;
> int irq, b, i;
> + u32 reg;
>
> base = of_iomap(node, 0);
> if (!base)
> @@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct
> device_node *node,
> handle_level_irq);
> irq_set_probe(irq);
> }
> +
> + reg = readl_relaxed(intc.enable[b]);
> + if (reg) {
> + writel_relaxed(reg, intc.disable[b]);
> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, ®);
> + }
> + }
> +
> + reg = readl_relaxed(base + REG_FIQ_CONTROL);
> + if (reg & REG_FIQ_ENABLE) {
> + writel_relaxed(0, base + REG_FIQ_CONTROL);
> + pr_err(FW_BUG "Bootloader left fiq enabled\n");
> }
>
> if (is_2836) {
It otherwise looks good. You can either resend it with a fixed commit
message,
or provide me with a commit message that I can stick there while
applying it.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
next prev parent reply other threads:[~2020-02-12 8:13 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-07 15:46 [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader Lukas Wunner
2020-02-07 15:46 ` Lukas Wunner
2020-02-07 16:11 ` Marc Zyngier
2020-02-07 16:11 ` Marc Zyngier
2020-02-10 9:52 ` [PATCH v2] " Lukas Wunner
2020-02-10 9:52 ` Lukas Wunner
2020-02-12 4:47 ` Florian Fainelli
2020-02-12 4:47 ` Florian Fainelli
2020-02-12 8:13 ` Marc Zyngier [this message]
2020-02-12 8:13 ` Marc Zyngier
2020-02-12 12:36 ` Lukas Wunner
2020-02-12 12:55 ` Nicolas Saenz Julienne
2020-02-12 12:55 ` Nicolas Saenz Julienne
2020-02-23 17:59 ` Stefan Wahren
2020-02-23 17:59 ` Stefan Wahren
2020-02-23 18:24 ` Lukas Wunner
2020-02-24 9:21 ` Stefan Wahren
2020-02-24 9:21 ` Stefan Wahren
2020-02-25 9:50 ` [PATCH v4] " Lukas Wunner
2020-02-25 9:50 ` Lukas Wunner
2020-03-29 20:26 ` [tip: irq/core] " tip-bot2 for Lukas Wunner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d49218987c0d1d573aaa3bcccf44ffe3@kernel.org \
--to=maz@kernel.org \
--cc=bcm-kernel-feedback-list@broadcom.com \
--cc=f.fainelli@gmail.com \
--cc=jason@lakedaemon.net \
--cc=kernel@martin.sperl.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rpi-kernel@lists.infradead.org \
--cc=lukas@wunner.de \
--cc=mbrugger@suse.com \
--cc=notstina@gmail.com \
--cc=nsaenzjulienne@suse.de \
--cc=phil@raspberrypi.org \
--cc=rjui@broadcom.com \
--cc=sbranden@broadcom.com \
--cc=serge@raspberrypi.org \
--cc=tglx@linutronix.de \
--cc=wahrenst@gmx.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.