From: Bjorn Helgaas <helgaas@kernel.org>
To: "René Rebe" <rene@exactco.de>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
	Riccardo Mottola <riccardo.mottola@libero.it>,
	Manivannan Sadhasivam <mani@kernel.org>,
	Brian Norris <briannorris@chromium.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Lukas Wunner <lukas@wunner.de>,
	Mario Limonciello <mario.limonciello@amd.com>
Subject: Re: [PATCH] PCI: Fix PCI bridges not to go to D3Hot on older RISC systems
Date: Tue, 2 Dec 2025 11:28:37 -0600
Message-ID: <20251202172837.GA3078292@bhelgaas>
In-Reply-To: <20251202.174007.745614442598214100.rene@exactco.de>

[+cc Mani, Brian (a5fb3ff63287 authors), Rafael, Lukas, Mario]

On Tue, Dec 02, 2025 at 05:40:07PM +0100, René Rebe wrote:
> Commit a5fb3ff63287 ("PCI: Allow PCI bridges to go to D3Hot on all
> non-x86") was bisected to break various non-x86 RISC Unix systems,
> e.g. sparc64, see two example oopses below. Fix by only allowing D3Hot
> on modern ARM64, PPC64 and RISCV ISAs besides new enough x86.

I think we need some kind of analysis of what is happening to the PCI
devices here.  I don't know why the CPU architecture per se would be
related to PCI power management.

pci_bridge_d3_possible() is already a barely maintainable hodgepodge
of random things that work and don't work.  Generally speaking, most
of those cases relate to firmware.
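
To make the behavioral change in the quoted hunk concrete, here is a
minimal userspace sketch (not kernel code) of the final check before
and after the proposed patch.  The struct cfg flags and the bios_year
parameter are hypothetical stand-ins for IS_ENABLED(CONFIG_*) and
dmi_get_bios_year(); the assumption that a platform without DMI, such
as sparc64, yields a BIOS year below 2015 is mine and not stated in
the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's IS_ENABLED(CONFIG_*) checks. */
struct cfg {
	bool x86, arm64, ppc64, riscv;
};

/* Condition before the proposed patch (the '-' line of the hunk). */
static bool d3_allowed_old(const struct cfg *c, int bios_year)
{
	return !c->x86 || bios_year >= 2015;
}

/* Condition after the proposed patch (the '+' line of the hunk). */
static bool d3_allowed_new(const struct cfg *c, int bios_year)
{
	return c->arm64 || c->ppc64 || c->riscv || bios_year >= 2015;
}

/* Walk through a few illustrative platforms. */
static void self_check(void)
{
	struct cfg sparc64 = { 0 };             /* none of the listed ISAs */
	struct cfg arm64 = { .arm64 = true };
	struct cfg x86 = { .x86 = true };

	/* Assumed: no usable DMI BIOS year on sparc64, modeled as 0. */
	assert(d3_allowed_old(&sparc64, 0));    /* allowed since a5fb3ff63287 */
	assert(!d3_allowed_new(&sparc64, 0));   /* blocked by the proposal */

	assert(d3_allowed_old(&arm64, 0));
	assert(d3_allowed_new(&arm64, 0));

	assert(!d3_allowed_old(&x86, 2012));
	assert(!d3_allowed_new(&x86, 2012));
	assert(d3_allowed_new(&x86, 2016));
}
```

The sketch only shows that the proposal flips the decision by ISA; it
says nothing about why the affected bridges misbehave, which is the
analysis asked for above.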

> Sun Blade 1000:
> ERROR(0): Cheetah error trap taken afsr[0010080005000000] afar[000007f900800000] TL1(0)
> ERROR(0): TPC[100a05a4] TNPC[100a05a8] O7[42acc8] TSTATE[4411001603]
> ERROR(0):
> TPC<MakeIocReady+0xc/0x278 [mptbase]>
> ERROR(0): M_SYND(0),  E_SYND(0), Privileged
> ERROR(0): Highest priority error (0000080000000000) "Bus error response from system bus"
> ERROR(0): D-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000]
> ERROR(0): D-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[0000000000000000]
> ERROR(0): I-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000] u[0000000000000000] l[0000000000000000]
> ERROR(0): I-cache INSN0[0000000000000000] INSN1[0000000000000000] INSN2[0000000000000000] INSN3[0000000000000000]
> ERROR(0): I-cache INSN4[0000000000000000] INSN5[0000000000000000] INSN6[0000000000000000] INSN7[0000000000000000]
> ERROR(0): E-cache idx[b08040] tag[000000001e008fa0]
> ERROR(0): E-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[ffffffffffffffff]
> Kernel panic - not syncing: Irrecoverable deferred error trap.
> CPU: 0 UID: 0 PID: 46 Comm: (udev-worker) Not tainted 6.14.0-rc1-00001-ga5fb3ff63287 #18
> Call Trace:
> [<00000000004294b0>] panic+0xf0/0x370
> [<0000000000435bc4>] cheetah_deferred_handler+0x2c8/0x2d8
> [<0000000000405e88>] c_deferred+0x18/0x24
> [<00000000100a05a4>] MakeIocReady+0xc/0x278 [mptbase]

I assume both of these crashes are related to the
CHIPREG_READ32(&ioc->chip->Doorbell) in mpt_GetIocState(), e.g., maybe
that PCI read failed because an upstream bridge was not in D0 and
therefore treated the read as an unsupported request.
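
The suspected failure mode can be sketched as a small userspace
simulation (again not kernel code).  All names here (sim_bridge,
sim_dev, sim_read32) are hypothetical, and the all-ones value for a
failed read is an assumption borrowed from how many platforms report
such errors; on the sparc64 systems in the traces the failed read
raises an error trap instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum pwr_state { PWR_D0, PWR_D3HOT };

/* Hypothetical model of an upstream bridge and a device behind it. */
struct sim_bridge {
	enum pwr_state state;
};

struct sim_dev {
	struct sim_bridge *upstream;
	uint32_t doorbell;              /* models the Doorbell register */
};

/*
 * Model an MMIO read of the device's doorbell register.
 * Returns false when the upstream bridge is not in D0, i.e. when the
 * bridge would not forward the read to its secondary side.
 */
static bool sim_read32(const struct sim_dev *dev, uint32_t *val)
{
	assert(dev && dev->upstream && val);

	if (dev->upstream->state != PWR_D0) {
		*val = UINT32_MAX;      /* assumed error value: all ones */
		return false;
	}
	*val = dev->doorbell;
	return true;
}

/* Bring the simulated bridge back to D0 before touching the device. */
static void sim_resume(struct sim_bridge *bridge)
{
	bridge->state = PWR_D0;
}
```

In this model the read only succeeds once sim_resume() has been called
on the bridge, which mirrors the expectation that the upstream bridge
is in D0 before the driver's probe path touches device registers.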

> [<00000000100a089c>] mpt_do_ioc_recovery+0x8c/0x1054 [mptbase]
> [<000000001009f2d4>] mpt_attach+0x920/0xa68 [mptbase]
> [<000000001012424c>] mptsas_probe+0x8/0x3e8 [mptsas]
> [<0000000000788308>] local_pci_probe+0x24/0x70
> [<0000000000788dac>] pci_device_probe+0x1c0/0x1d0
> [<000000000082633c>] really_probe+0x13c/0x29c
> [<0000000000826590>] __driver_probe_device+0xf4/0x104
> [<0000000000826614>] driver_probe_device+0x24/0xa0
> [<000000000082683c>] __driver_attach+0xe8/0x104
> [<0000000000824da0>] bus_for_each_dev+0x58/0x84
> [<0000000000825508>] bus_add_driver+0xdc/0x1f8
> [<0000000000827110>] driver_register+0x70/0x120
> 
> Niagara T1:
> mptsas 0000:07:00.0: Unable to change power state from D3cold to D0, device inaccessible
> NON-RESUMABLE ERROR: Reporting on cpu 31
> NON-RESUMABLE ERROR: TPC [0x0000000010184034] <MakeIocReady+0x10/0x298 [mptbase]>
> NON-RESUMABLE ERROR: RAW [1f10000000000007:0000000e3179235c:0000000202000004:000000ea00300000
> NON-RESUMABLE ERROR: 00000000001f0000:0000000000000000:0000000000000000:0000000000000000]
> NON-RESUMABLE ERROR: handle [0x1f10000000000007] stick [0x0000000e3179235c]
> NON-RESUMABLE ERROR: type [precise nonresumable]
> NON-RESUMABLE ERROR: attrs [0x02000004] < PIO sp-faulted priv >
> NON-RESUMABLE ERROR: raddr [0x000000ea00300000]
> Kernel panic - not syncing: Non-resumable error.
> CPU: 31 UID: 0 PID: 367 Comm: (udev-worker) Not tainted 6.16.12+3-sparc64-smp #1 NONE  Debian 6.16.12-2+sparc64.1
> Call Trace:
> [<00000000004373c4>] dump_stack+0x8/0x18
> [<0000000000429540>] panic+0xf4/0x398
> [<000000000043afcc>] sun4v_nonresum_error+0x16c/0x240
> [<0000000000406eb8>] sun4v_nonres_mondo+0xc8/0xd8
> [<0000000010184034>] MakeIocReady+0x10/0x298 [mptbase]
> [<00000000101844b4>] mpt_do_ioc_recovery+0x9c/0x1110 [mptbase]
> [<00000000101836f8>] mpt_attach+0xb58/0xd20 [mptbase]
> [<0000000010287f30>] mptsas_probe+0x10/0x440 [mptsas]
> [<0000000000b3fab0>] local_pci_probe+0x30/0x80
> [<0000000000b405d4>] pci_device_probe+0xb4/0x240
> [<0000000000bfd348>] really_probe+0xc8/0x400
> [<0000000000bfd70c>] __driver_probe_device+0x8c/0x160
> [<0000000000bfd8c8>] driver_probe_device+0x28/0x100
> [<0000000000bfdb7c>] __driver_attach+0xbc/0x1e0
> [<0000000000bfacfc>] bus_for_each_dev+0x5c/0xc0
> [<0000000000bfcafc>] driver_attach+0x1c/0x40
> Press Stop-A (L1-A) from sun keyboard or send break
> twice on console to return to the boot prom
> 
> Fixes: a5fb3ff63287 ("PCI: Allow PCI bridges to go to D3Hot on all non-x86")
> Signed-off-by: René Rebe <rene@exactco.de>
> ---
> Tested on Sun Blade 1000, and shipping in all T2/Linux builds since 2025-08-01
> ---
>  drivers/pci/pci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b14dd064006c..7619d2cfa66d 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3033,9 +3033,9 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>  
>  		/*
>  		 * Out of caution, we only allow PCIe ports from 2015 or newer
> -		 * into D3 on x86.
> +		 * into D3 or other modern ISAs only.
>  		 */
> -		if (!IS_ENABLED(CONFIG_X86) || dmi_get_bios_year() >= 2015)
> +		if (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_PPC64) || IS_ENABLED(CONFIG_RISCV) || dmi_get_bios_year() >= 2015)
>  			return true;
>  		break;
>  	}
> -- 
> 2.52.0
> 
> -- 
> René Rebe, ExactCODE GmbH, Berlin, Germany
> https://exactco.de https://t2linux.com https://patreon.com/renerebe
