From: Marc Zyngier <maz@kernel.org>
To: Cyril Brulebois <kibi@debian.org>
Cc: Johan Hovold <johan+linaro@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	x86@kernel.org, platform-driver-x86@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Dmitry Torokhov <dtor@chromium.org>,
	Jon Hunter <jonathanh@nvidia.com>,
	Hsin-Yi Wang <hsinyi@chromium.org>,
	Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Subject: Re: [PATCH v6 06/20] irqdomain: Fix mapping-creation race
Date: Wed, 08 Mar 2023 14:53:48 +0000
Message-ID: <86zg8nxpj7.wl-maz@kernel.org>
In-Reply-To: <20230308144105.di552lbogqv2s7fk@mraw.org>

On Wed, 08 Mar 2023 14:41:05 +0000,
Cyril Brulebois <kibi@debian.org> wrote:
> 
> Hi Johan,
> 
> And thanks so much for this patch series.
> 
> Johan Hovold <johan+linaro@kernel.org> (2023-02-13):
> > Parallel probing of devices that share interrupts (e.g. when a driver
> > uses asynchronous probing) can currently result in two mappings being
> > created for the same hardware interrupt due to missing serialisation.
> > 
> > Make sure to hold the irq_domain_mutex when creating mappings so that
> > looking for an existing mapping before creating a new one is done
> > atomically.
> 
> Just for information: This patch fixes a long-standing regression
> regarding Raspberry Pi devices, which have been failing to boot (at
> least reliably) due to MMC timeouts for a long while; I think that
> started between v5.17 and v5.19, but I couldn't bisect at the time
> (I was already chasing some other regression).
> 
> Example bug report:
>   https://bugs.debian.org/1019700
> 
> Before trying to pinpoint when the regression appeared, I've checked
> these versions, with a Debian testing userspace as of 2023-03-07:
>  - v6.1.12: affected.
>  - v6.2: affected.
>  - v6.3-rc1: not affected.
> 
> A bisect between v6.2 and v6.3-rc1 led me to this patch specifically.
> Seeing how it's part of a patch series, and how previous patches are
> preliminary ones, I've checked whether cherry-picking the first 6 patches
> on top of v6.1.15 fixes the problem there too, and it does
> (git cherry-pick v6.2-rc4..601363cc08da25747feb87c55573dd54de91d66a).
> 
> 
> With the following systems:
>  - Pi 4 B, using external storage (SD card),
>  - CM4 Lite on CM4 IO Board, using external storage (SD card),
>  - CM4 on CM4 IO Board, using internal storage (eMMC),
> 
> I've been able to verify that v6.1.12 (baseline in Debian testing)
> triggers this MMC timeout issue, while v6.1.15 + the aforementioned
> range of cherry-picked commits no longer triggers this issue.
> 
> (Methodology: cold boot, then reboot 20 times, monitoring via serial
> console to keep HDMI out of the equation; affected systems stop
> booting after 1-4 boots; unaffected systems boot and reboot just fine
> every time.)
> 
> 
> This looks like a critical bugfix for Raspberry Pi users.
> 
> Seeing the stable@ mention is about 4.8, I suppose this is going to be
> considered for a wide range of kernels already… but I'm happy to dig
> into this further to pinpoint when the regression appeared, if that's
> helpful.

If you have an interest in these patches being backported, may I
suggest you look at the backporting failures that have been
reported[1]?

Note that now that 4.9 is out of the picture, nothing is going to be
backported to anything older than 4.14.

Thanks,

	M.

[1] https://lore.kernel.org/r/167812853717924@kroah.com

-- 
Without deviation from the norm, progress is not possible.
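
For readers following the quoted commit message: the race it describes
is a classic check-then-create problem. The sketch below is a userspace
illustration using pthreads, not the kernel code. All names in it
(revmap, domain_mutex, create_mapping, run_parallel_probes) are
hypothetical stand-ins, and domain_mutex only plays the role that
irq_domain_mutex plays in the patch. The point it tries to show is that
the lookup and the creation happen under the same lock, so two parallel
"probes" for the same hardware interrupt end up with a single mapping.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_HWIRQS 8

/* Hypothetical reverse map: hwirq -> virq, where 0 means "not mapped". */
static unsigned int revmap[NR_HWIRQS];
static unsigned int next_virq = 1;
static unsigned int mappings_created;

/* Stand-in for the role irq_domain_mutex plays in the patch. */
static pthread_mutex_t domain_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look up an existing mapping and create one only if none exists.
 * Both steps run under domain_mutex, so they are atomic with respect
 * to other callers.
 */
static unsigned int create_mapping(unsigned int hwirq)
{
	unsigned int virq;

	assert(hwirq < NR_HWIRQS);

	pthread_mutex_lock(&domain_mutex);
	virq = revmap[hwirq];
	if (!virq) {
		virq = next_virq++;
		revmap[hwirq] = virq;
		mappings_created++;
	}
	pthread_mutex_unlock(&domain_mutex);

	return virq;
}

/* Thread body: a "probe" that asks for a mapping of shared hwirq 3. */
static void *probe(void *arg)
{
	unsigned int *result = arg;

	*result = create_mapping(3);
	return NULL;
}

/*
 * Run two probes in parallel, store the virq each one obtained, and
 * return the total number of mappings created so far.
 */
static unsigned int run_parallel_probes(unsigned int *virq_a,
					unsigned int *virq_b)
{
	pthread_t a, b;

	pthread_create(&a, NULL, probe, virq_a);
	pthread_create(&b, NULL, probe, virq_b);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	return mappings_created;
}
```

Calling run_parallel_probes() from a small main() and building with
-pthread should leave both callers with the same virq and exactly one
mapping created. If the lock were taken only around the creation step
and not the lookup, both threads could observe "not mapped" and each
create a mapping, which is the kind of duplicate the commit message
describes.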

Thread overview: 73+ messages
2023-02-13 10:42 [PATCH v6 00/20] irqdomain: fix mapping race and rework locking Johan Hovold
2023-02-13 10:42 ` Johan Hovold
2023-02-13 10:42 ` [PATCH v6 01/20] irqdomain: Fix association race Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 02/20] irqdomain: Fix disassociation race Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 03/20] irqdomain: Drop bogus fwspec-mapping error handling Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 04/20] irqdomain: Look for existing mapping only once Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 05/20] irqdomain: Refactor __irq_domain_alloc_irqs() Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 06/20] irqdomain: Fix mapping-creation race Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-03-08 14:41   ` Cyril Brulebois
2023-03-08 14:41     ` Cyril Brulebois
2023-03-08 14:53     ` Marc Zyngier [this message]
2023-03-08 14:53       ` Marc Zyngier
2023-03-09  7:32     ` Johan Hovold
2023-03-09  7:32       ` Johan Hovold
2023-02-13 10:42 ` [PATCH v6 07/20] irqdomain: Fix domain registration race Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Marc Zyngier
2023-02-13 10:42 ` [PATCH v6 08/20] irqdomain: Drop revmap mutex Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 09/20] irqdomain: Drop dead domain-name assignment Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 10/20] irqdomain: Drop leftover brackets Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 11/20] irqdomain: Clean up irq_domain_push/pop_irq() Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 12/20] x86/ioapic: Use irq_domain_create_hierarchy() Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 13/20] x86/uv: " Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 14/20] irqchip/alpine-msi: Use irq_domain_add_hierarchy() Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 15/20] irqchip/gic-v2m: Use irq_domain_create_hierarchy() Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 16/20] irqchip/gic-v3-its: " Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:42 ` [PATCH v6 17/20] irqchip/gic-v3-mbi: " Johan Hovold
2023-02-13 10:42   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:43 ` [PATCH v6 18/20] irqchip/loongson-pch-msi: " Johan Hovold
2023-02-13 10:43   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:43 ` [PATCH v6 19/20] irqchip/mvebu-odmi: " Johan Hovold
2023-02-13 10:43   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-02-13 10:43 ` [PATCH v6 20/20] irqdomain: Switch to per-domain locking Johan Hovold
2023-02-13 10:43   ` Johan Hovold
2023-02-13 19:40   ` [irqchip: irq/irqchip-next] " irqchip-bot for Johan Hovold
2023-03-07 13:51   ` [PATCH v6 20/20] " David Woodhouse
2023-03-07 13:51     ` David Woodhouse
2023-03-07 14:06     ` Juergen Gross
2023-03-07 14:06       ` Juergen Gross
2023-03-07 14:18       ` David Woodhouse
2023-03-07 14:18         ` David Woodhouse
