From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 531052BDC06; Tue, 24 Jun 2025 12:48:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.9 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750769337; cv=none; b=J5Y4Q53qArOhsAdtsnS2g1y032b5pP95n7sjdR0mxs0d7bBnXPD3wJzGYDWHlQ8oGkpZcY96nHtjF24aCers15kl0AtMORZO29fgqszhdJgBHcrzx9mlznc0T530pf+yx2MsW2GheSzjRIMVX883ELZeRAwlhqLcIdsDLCXHBtk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750769337; c=relaxed/simple; bh=8Ix2Yu8WfDxcmsb51j41qezjfHG5UnQRJivXeCgJV40=; h=From:Date:To:cc:Subject:In-Reply-To:Message-ID:References: MIME-Version:Content-Type; b=l9V0R99QiNk2AtNnXd4YJ3XgYm1oD8XBgGyVCFerOExOnuU9cUfqCvXmwTeXiOqnmvaOpaSqbYZasVi+iHAGxB0z0DEDJVppAtia3CHMytsBTP0l0JzRXkn6aDG6l2HtAHAVSn8Mp6lKVBLQ/QRc7ovr6bAAgZ0d+jxmFJ7C7+M= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=LEYhdUxa; arc=none smtp.client-ip=198.175.65.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="LEYhdUxa" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1750769329; x=1782305329; h=from:date:to:cc:subject:in-reply-to:message-id: references:mime-version:content-id; bh=8Ix2Yu8WfDxcmsb51j41qezjfHG5UnQRJivXeCgJV40=; b=LEYhdUxasnxs0mPpjC1hfljlFVnk5Q5JTXFftjvtFHHOnVuyCGlcBfvH ZG+P1OBq4FpREQC5ccUd6wH4qZiS+YLSl/q1pyh/7y0WQ08Re590vS0SV OAOGzx2ziMfh8jXAktpeb1FISDeWN6DiJiZVnAnXaJK7p2tUy44JyOn6y Kw3KO1iQmrpP7lmKF+09HlMaggV2cSCvjXhYCQSGauV1NFOn6vpbyQdA0 zewi124vSXluba8m75PAk9ESua3fbwZg9Rmz6Ho119ryEEhXzI0X88CLC CBJKGSmL50szwKqokX6yJEvdKXYTXIRfzu+lWXf5cMQho65xcBf00x9el g==; X-CSE-ConnectionGUID: I+behXDNT+KoBStNzTnOuw== X-CSE-MsgGUID: mPewiGJGQpiPyoK3g3aH6Q== X-IronPort-AV: E=McAfee;i="6800,10657,11474"; a="75542890" X-IronPort-AV: E=Sophos;i="6.16,261,1744095600"; d="scan'208";a="75542890" Received: from orviesa001.jf.intel.com ([10.64.159.141]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2025 05:48:43 -0700 X-CSE-ConnectionGUID: Brx7dDenRYOS+3lnySp+Eg== X-CSE-MsgGUID: qwY2Y/LoRSSJUDW6JMqxsQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,261,1744095600"; d="scan'208";a="189090767" Received: from ijarvine-mobl1.ger.corp.intel.com (HELO localhost) ([10.245.245.16]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2025 05:48:39 -0700 From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= Date: Tue, 24 Jun 2025 15:48:36 +0300 (EEST) To: D Scott Phillips cc: Bjorn Helgaas , linux-pci@vger.kernel.org, =?ISO-8859-2?Q?Micha=B3_Winiarski?= , Igor Mammedov , LKML , Mika Westerberg Subject: Re: [PATCH 24/25] PCI: Perform reset_resource() and build fail list in sync In-Reply-To: <86plf0lgit.fsf@scott-ph-mail.amperecomputing.com> Message-ID: <78aed75b-137b-b48e-3699-20fcde248680@linux.intel.com> References: <20241216175632.4175-1-ilpo.jarvinen@linux.intel.com> <20241216175632.4175-25-ilpo.jarvinen@linux.intel.com> <86plf0lgit.fsf@scott-ph-mail.amperecomputing.com> Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: multipart/mixed; BOUNDARY="8323328-1510153036-1750767620=:943" Content-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --8323328-1510153036-1750767620=:943 Content-Type: text/plain; CHARSET=ISO-8859-15 Content-Transfer-Encoding: QUOTED-PRINTABLE Content-ID: <7466805e-576c-2257-9292-d520bd7c339b@linux.intel.com> On Wed, 18 Jun 2025, D Scott Phillips wrote: > Ilpo J=E4rvinen writes: >=20 > > Resetting resource is problematic as it prevent attempting to allocate > > the resource later, unless something in between restores the resource. > > Similarly, if fail_head does not contain all resources that were reset, > > those resource cannot be restored later. > > > > The entire reset/restore cycle adds complexity and leaving resources > > into reseted state causes issues to other code such as for checks done > > in pci_enable_resources(). Take a small step towards not resetting > > resources by delaying reset until the end of resource assignment and > > build failure list (fail_head) in sync with the reset to avoid leaving > > behind resources that cannot be restored (for the case where the caller > > provides fail_head in the first place to allow restore somewhere in the > > callchain, as is not all callers pass non-NULL fail_head). > > > > The Expansion ROM check is temporarily left in place while building the > > failure list until the upcoming change which reworks optional resource > > handling. > > > > Ideally, whole resource reset could be removed but doing that in a big > > step would make the impact non-tractable due to complexity of all > > related code. > > > > Signed-off-by: Ilpo J=E4rvinen >=20 > Hi Ilpo, I'm seeing a crash on arm64 at boot that I bisected to this > change. I don't think it's the same as the other issues reported. I've > confirmed the crash is still there after your follow up patches. The > crash itself is below[1]. Thanks for the report. I'm sorry about the delay, I was away a few days (and hoping all the gremlins in that resource change would already have=20 been finally solved but it seems my hopes were premature). > It looks like the problem begins when: >=20 > amdgpu_device_resize_fb_bar() > pci_resize_resource() > pci_reassign_bridge_resources() > __pci_bus_size_bridges() >=20 > adds pci_hotplug_io_size to `realloc_head`. The io resource allocation > has failed earlier because the root port doesn't have an io window[2]. >=20 > Then with this patch, pci_reassign_bridge_resources()'s call to > __pci_bridge_assign_resources() now returns the io added space for > hotplug in the `failed` list where the old code dropped it and did not. >=20 > That sends pci_reassign_bridge_resources() into the `cleanup:` path, > where I think the cleanup code doesn't properly release the resources > that were assigned by __pci_bridge_assign_resources() and there's a > conflict reported in pci_claim_resource() where a restored resource is > found as conflicting with itself: >=20 > > pcieport 000d:00:01.0: bridge window [mem 0x340000000000-0x340017ffffff= 64bit pref]: can't claim; address conflict with PCI Bus 000d:01 [mem 0x340= 000000000-0x340017ffffff 64bit pref] >=20 > Setting `pci=3Dhpiosize=3D0` avoids this crash, as does this change: >=20 > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c > index 16d5d390599a..59ece11702da 100644 > --- a/drivers/pci/setup-bus.c > +++ b/drivers/pci/setup-bus.c > @@ -2442,7 +2442,7 @@ int pci_reassign_bridge_resources(struct pci_dev *b= ridge, unsigned long type) > =09LIST_HEAD(saved); > =09LIST_HEAD(added); > =09LIST_HEAD(failed); > -=09unsigned int i; > +=09unsigned int i, relevant_fails; > =09int ret; > =20 > =09down_read(&pci_bus_sem); > @@ -2490,7 +2490,16 @@ int pci_reassign_bridge_resources(struct pci_dev *= bridge, unsigned long type) > =09__pci_bridge_assign_resources(bridge, &added, &failed); > =09BUG_ON(!list_empty(&added)); > =20 > -=09if (!list_empty(&failed)) { > +=09relevant_fails =3D 0; > +=09list_for_each_entry(dev_res, &failed, list) { > +=09=09restore_dev_resource(dev_res); > +=09=09if (((dev_res->res->flags ^ type) & PCI_RES_TYPE_MASK) =3D=3D 0) > +=09=09=09relevant_fails++; > +=09} > +=09free_list(&failed); > + > +=09/* Cleanup if we had failures in resources of interest */ > +=09if (relevant_fails !=3D 0) { > =09=09ret =3D -ENOSPC; > =09=09goto cleanup; > =09} > @@ -2509,11 +2518,6 @@ int pci_reassign_bridge_resources(struct pci_dev *= bridge, unsigned long type) > =09return 0; > =20 > cleanup: > -=09/* Restore size and flags */ > -=09list_for_each_entry(dev_res, &failed, list) > -=09=09restore_dev_resource(dev_res); > -=09free_list(&failed); > - > =09/* Revert to the old configuration */ > =09list_for_each_entry(dev_res, &saved, list) { > =09=09struct resource *res =3D dev_res->res; >=20 > I don't know this code well enough to know if that changes is completely > bonkers or what. It will take me a bit of time to assimilate all the information too, this= =20 code has astronomical complexity and a failure due to resize is yet=20 another angle to it. I'll get back to you once I've figured more out. > Maybe a change that gets zero as pci_hotplug_io_size > for root ports that don't have an io window would be better? Similar to this, I personally don't like having a non-zero IO window for=20 the cases where there is no io ports required for any devices. On dual=20 socket systems here, it always leads to exhausting the io port space and=20 the later probed devices that would actually have io port resources will=20 not get them. While those failures are not critical/important in my case,= =20 it seems pointless to allocate IO window just for that sake of potential=20 hotplug, hotplug could just get it when there's a true need for the IO window (and nothing bogusly reserved all io port space). So IMO, a patch to change the default hotplug IO window to zero would=20 make sense which effectively matches what you provided on the cmdline. But= =20 this is said before fully understanding what goes wrong on the resize=20 path, it may need corrections too. But regardless of that, I think zeroing= =20 HP IO window is worthwhile on its own right to avoid exhausting the IO=20 port space for nothing. > Any other > ideas, or other information about the crash that I could provide? I suppose the actual crash itself is because the driver does make some=20 invalid assumption about the resources without checking the resources=20 first (I've not had time to locate and check the driver code yet). > [1]: Crash: > > SError Interrupt on CPU0, code 0x00000000be000411 -- SError > > CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.15.2+ #16 PREEMPT(= undef) > > Hardware name: Adlink Ampere Altra Developer Platform/Ampere Altra Deve= loper Platform, BIOS TianoCore 2.09.100.00 (SYS: 2.10.20221028) 04/30/2 > > Workqueue: events work_for_cpu_fn > > pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=3D--) > > pc : _raw_spin_lock_irqsave+0x34/0xb0 > > lr : __wake_up+0x30/0x80 > > sp : ffff800080003e20 > > x29: ffff800080003e20 x28: ffff07ff808b0000 x27: 0000000000050260 > > x26: 0000000000000001 x25: ffffcc5decad6bd8 x24: ffffcc5decc8b5b8 > > x23: ffff080138a10000 x22: ffff080138a00000 x21: 0000000000000003 > > x20: ffff080138a14dc8 x19: 0000000000000000 x18: 006808018e775f03 > > x17: ffff3bc1330d2000 x16: ffffcc5e3a520ed8 x15: 8daad055c6e77021 > > x14: f231631cf9328575 x13: 0e51168d06a6cf91 x12: f5db8c23b764520c > > x11: 0000000000000040 x10: 0000000000000000 x9 : ffffcc5e3a520f08 > > x8 : 0000000000002113 x7 : 0000000000000000 x6 : 0000000000000000 > > x5 : 0000000000000000 x4 : ffff800080003e08 x3 : 00000000000000c0 > > x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffff080138a14dc8 > > Kernel panic - not syncing: Asynchronous SError Interrupt > > CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.15.2+ #16 PREEMPT(= undef) > > Hardware name: Adlink Ampere Altra Developer Platform/Ampere Altra Deve= loper Platform, BIOS TianoCore 2.09.100.00 (SYS: 2.10.20221028) 04/30/2 > > Workqueue: events work_for_cpu_fn > > Call trace: > > show_stack+0x30/0x98 (C) > > dump_stack_lvl+0x7c/0xa0 > > dump_stack+0x18/0x2c > > panic+0x184/0x3c0 > > nmi_panic+0x90/0xa0 > > arm64_serror_panic+0x6c/0x90 > > do_serror+0x58/0x60 > > el1h_64_error_handler+0x38/0x60 > > el1h_64_error+0x84/0x88 > > _raw_spin_lock_irqsave+0x34/0xb0 (P) > > amdgpu_ih_process+0xf0/0x150 [amdgpu] > > amdgpu_irq_handler+0x34/0xa0 [amdgpu] > > __handle_irq_event_percpu+0x60/0x248 > > handle_irq_event+0x4c/0xc0 > > handle_fasteoi_irq+0xa0/0x1c8 > > handle_irq_desc+0x3c/0x68 > > generic_handle_domain_irq+0x24/0x40 > > __gic_handle_irq_from_irqson.isra.0+0x15c/0x260 > > gic_handle_irq+0x28/0x80 > > call_on_irq_stack+0x24/0x30 > > do_interrupt_handler+0x88/0xa0 > > el1_interrupt+0x44/0xd0 > > el1h_64_irq_handler+0x18/0x28 > > el1h_64_irq+0x84/0x88 > > amdgpu_device_rreg.part.0+0x4c/0x190 [amdgpu] (P) > > amdgpu_device_rreg+0x24/0x40 [amdgpu] > > psp_wait_for+0x88/0xd8 [amdgpu] > > psp_v11_0_bootloader_load_component+0x164/0x1b0 [amdgpu] > > psp_v11_0_bootloader_load_kdb+0x20/0x40 [amdgpu] > > psp_hw_start+0x5c/0x580 [amdgpu] > > psp_load_fw+0x9c/0x280 [amdgpu] > > psp_hw_init+0x44/0xa0 [amdgpu] > > amdgpu_device_fw_loading+0xf8/0x358 [amdgpu] > > amdgpu_device_ip_init+0x684/0x990 [amdgpu] > > amdgpu_device_init+0xba4/0x1038 [amdgpu] > > amdgpu_driver_load_kms+0x20/0xb8 [amdgpu] > > amdgpu_pci_probe+0x1f8/0x7f8 [amdgpu] > > local_pci_probe+0x44/0xb0 > > work_for_cpu_fn+0x24/0x40 > > process_one_work+0x17c/0x410 > > worker_thread+0x254/0x388 > > kthread+0x10c/0x128 > > ret_from_fork+0x10/0x20 > > Kernel Offset: 0x4c5dba380000 from 0xffff800080000000 > > PHYS_OFFSET: 0x80000000 > > CPU features: 0x0800,000042e0,01202650,8241720b > > Memory Limit: none > > ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]--- >=20 > [2]: boot snippet > > ACPI: PCI Root Bridge [PCI1] (domain 000d [bus 00-ff]) > > acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segment= s MSI EDR HPX-Type3] > > acpi PNP0A08:01: _OSC: platform does not support [PCIeHotplug PME LTR D= PC] > > acpi PNP0A08:01: _OSC: OS now controls [PCIeCapability] > > acpi PNP0A08:01: FADT indicates ASPM is unsupported, using BIOS configu= ration > > acpi PNP0A08:01: MCFG quirk: ECAM at [mem 0x37fff0000000-0x37ffffffffff= ] for [bus 00-ff] with pci_32b_read_ops > > acpi PNP0A08:01: ECAM area [mem 0x37fff0000000-0x37ffffffffff] reserved= by PNP0C02:00 > > acpi PNP0A08:01: ECAM at [mem 0x37fff0000000-0x37ffffffffff] for [bus 0= 0-ff] > > PCI host bridge to bus 000d:00 > > pci_bus 000d:00: root bus resource [mem 0x50000000-0x5fffffff window] > > pci_bus 000d:00: root bus resource [mem 0x340000000000-0x37ffdfffffff w= indow] > > pci_bus 000d:00: root bus resource [bus 00-ff] > > pci 000d:00:00.0: [1def:e100] type 00 class 0x060000 conventional PCI e= ndpoint > > pci 000d:00:01.0: [1def:e101] type 01 class 0x060400 PCIe Root Port > > pci 000d:00:01.0: PCI bridge to [bus 01-03] > > pci 000d:00:01.0: bridge window [io 0xe000-0xefff] > > pci 000d:00:01.0: bridge window [mem 0x50000000-0x502fffff] > > pci 000d:00:01.0: bridge window [mem 0x340000000000-0x3400101fffff 64= bit pref] > > pci 000d:00:01.0: supports D1 D2 > > pci 000d:00:01.0: PME# supported from D0 D1 D3hot > > ... > > pci 000d:00:01.0: bridge window [mem 0x340000000000-0x340017ffffff 64bi= t pref]: assigned > > pci 000d:00:01.0: bridge window [mem 0x50000000-0x502fffff]: assigned > > pci 000d:00:01.0: bridge window [io size 0x1000]: can't assign; no spa= ce > > pci 000d:00:01.0: bridge window [io size 0x1000]: failed to assign > > pci 000d:00:01.0: bridge window [io size 0x1000]: can't assign; no spa= ce > > pci 000d:00:01.0: bridge window [io size 0x1000]: failed to assign >=20 --=20 i. --8323328-1510153036-1750767620=:943--