From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: David Matlack <dmatlack@google.com>
Cc: iommu@lists.linux.dev, kexec@lists.infradead.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-pci@vger.kernel.org,
	Adithya Jayachandran <ajayachandra@nvidia.com>,
	Alexander Graf <graf@amazon.com>,
	Alex Williamson <alex@shazbot.org>,
	Bjorn Helgaas <bhelgaas@google.com>, Chris Li <chrisl@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Joerg Roedel <joro@8bytes.org>,
	Jonathan Corbet <corbet@lwn.net>, Josh Hilke <jrhilke@google.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Lukas Wunner <lukas@wunner.de>, Mike Rapoport <rppt@kernel.org>,
	Parav Pandit <parav@nvidia.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Pranjal Shrivastava <praan@google.com>,
	Pratyush Yadav <pratyush@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Samiullah Khawaja <skhawaja@google.com>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Will Deacon <will@kernel.org>, William Tu <witu@nvidia.com>,
	Yi Liu <yi.l.liu@intel.com>,
	jacob.pan@linux.microsoft.com
Subject: Re: [PATCH v4 05/11] PCI: liveupdate: Inherit bus numbers during Live Update
Date: Mon, 27 Apr 2026 11:47:45 -0700
Message-ID: <20260427114745.00000656@linux.microsoft.com>
In-Reply-To: <20260423212316.3431746-6-dmatlack@google.com>

Hi David,

On Thu, 23 Apr 2026 21:23:09 +0000
David Matlack <dmatlack@google.com> wrote:

> Inherit bus numbers from the previous kernel during a Live Update when
> one or more PCI devices are being preserved.
> 
> During a Live Update, preserved devices must be allowed to continue
> performing memory transactions so the kernel cannot change the fabric
> topology, including bus numbers, since that would require disabling
> and flushing any memory transactions first.
> 
> To keep things simple, inherit the secondary and subordinate bus
> numbers on all bridges if any PCI devices were preserved (i.e. even
> bridges without any downstream endpoints that were preserved). This
> avoids accidentally assigning a bridge a new window that overlaps
> with a preserved device that is downstream of a different bridge.
> 
> If a bridge is enumerated with a broken topology or has no bus numbers
> set during a Live Update, refuse to assign it new bus numbers and
> refuse to enumerate devices below it. This is a safety measure to
> prevent topology conflicts.
> 
> Require that CONFIG_CARDBUS is not enabled to enable
> CONFIG_PCI_LIVEUPDATE since inheriting bus numbers on PCI-to-CardBus
> bridges requires additional work but is not a priority at the moment.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |  6 +++-
>  drivers/pci/Kconfig                           |  2 +-
>  drivers/pci/liveupdate.c                      | 28 +++++++++++++++++++
>  drivers/pci/probe.c                           | 21 +++++++++++---
>  include/linux/pci.h                           |  1 +
>  5 files changed, 52 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index cf3807641d89..f412a4b77fb7 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5156,7 +5156,11 @@ Kernel parameters
>  				explicitly which ones they are.
>  		assign-busses	[X86] Always assign all PCI bus
>  				numbers ourselves, overriding
> -				whatever the firmware may have done.
> +				whatever the firmware may have done. Ignored
> +				during a Live Update, where the kernel must
> +				inherit the PCI topology (including bus numbers)
> +				to avoid interrupting ongoing memory
> +				transactions of preserved devices.
>  		usepirqmask	[X86] Honor the possible IRQ mask stored
>  				in the BIOS $PIR table. This is needed on
>  				some systems with broken BIOSes, notably
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 08398cbe970c..6ef457ff9d08 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -330,7 +330,7 @@ config VGA_ARB_MAX_GPUS
>  
>  config PCI_LIVEUPDATE
>  	bool "PCI Live Update Support (EXPERIMENTAL)"
> -	depends on PCI && LIVEUPDATE
> +	depends on PCI && LIVEUPDATE && !CARDBUS
>  	help
>  	  Enable PCI core support for preserving PCI devices across Live
>  	  Update. This, in combination with support in a device's driver,
> diff --git a/drivers/pci/liveupdate.c b/drivers/pci/liveupdate.c
> index c0a30d16d9b8..cf8cff134a75 100644
> --- a/drivers/pci/liveupdate.c
> +++ b/drivers/pci/liveupdate.c
> @@ -93,6 +93,19 @@
>   * bound to the correct driver. i.e. The PCI core does not protect against a
>   * device getting preserved by driver A in the outgoing kernel and then getting
>   * bound to driver B in the incoming kernel.
> + *
> + * BDF Stability
> + * =============
> + *
> + * The PCI core guarantees that incoming preserved devices can be identified by
> + * the same bus, device, and function numbers as prior to kexec. To accomplish
> + * this, the PCI core always inherits the secondary and subordinate bus numbers
> + * assigned to bridges during enumeration, rather than assigning new ones (the
> + * PCI core assumes that the previous kernel established a sane topology).
> + *
> + * If a misconfigured or unconfigured bridge is encountered during enumeration
> + * while there are incoming preserved devices, it's secondary and subordinate
> + * bus numbers will be cleared and devices below it will not be enumerated.
>   */
>  
>  #define pr_fmt(fmt) "PCI: liveupdate: " fmt
> @@ -354,6 +367,21 @@ void pci_liveupdate_setup_device(struct pci_dev *dev)
>  	if (!xa)
>  		return;
>  
> +	/*
> +	 * During a Live Update, preserved devices are allowed to continue
> +	 * performing memory transactions. The kernel must not change the fabric
> +	 * topology, including bus numbers, since that would require disabling
> +	 * and flushing any memory transactions first.
> +	 *
> +	 * To keep things simple, inherit the secondary and subordinate bus
> +	 * numbers on _all_ bridges if _any_ PCI devices were preserved (i.e.
> +	 * even bridges without any downstream endpoints that were preserved).
> +	 * This avoids accidentally assigning a bridge a new window that
> +	 * overlaps with a preserved device that is downstream of a different
> +	 * bridge.
> +	 */
> +	dev->liveupdate_inherit_buses = true;
> +
This flag never gets cleared after the incoming kernel boots up. What if
the user does a manual rescan via sysfs? i.e.
# echo 1 > /sys/bus/pci/rescan
pcibios_assign_all_busses() will never get called for this device, and
the rescan may hit this:
	if (dev->liveupdate_inherit_buses) {
		pci_err(dev, "Cannot reconfigure bridge during Live Update!\n");

So, maybe clear it in pci_liveupdate_finish()?
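
To make the concern concrete, here is a rough userspace model of the
flag's effect on a later rescan. It is only a sketch, not kernel code:
the model_* names are made up for this example, model_assign_all_busses
stands in for pcibios_assign_all_busses(), and model_liveupdate_finish()
assumes that the finish path is a reasonable place to clear the flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the single struct pci_dev bit added by this patch. */
struct model_dev {
	bool liveupdate_inherit_buses;
};

/* Stand-in for the result of pcibios_assign_all_busses(); a plain flag here. */
static bool model_assign_all_busses;

/* Mirrors the logic of pci_should_assign_new_buses() in the patch. */
static bool model_should_assign_new_buses(const struct model_dev *dev)
{
	if (dev->liveupdate_inherit_buses)
		return false;

	return model_assign_all_busses;
}

/*
 * Mirrors the new bail-out in pci_scan_bridge_extend(): an unconfigured
 * or broken bridge may only be given bus numbers if the flag is clear.
 */
static bool model_can_reconfigure_bridge(const struct model_dev *dev)
{
	return !dev->liveupdate_inherit_buses;
}

/* Assumption: clear the flag once the Live Update has finished. */
static void model_liveupdate_finish(struct model_dev *dev)
{
	dev->liveupdate_inherit_buses = false;
}
```

With the flag set, the model never assigns new bus numbers and refuses
to reconfigure an unconfigured bridge. After model_liveupdate_finish()
both checks fall back to the normal behavior, which is what a manual
rescan would want.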

>  	key = pci_ser_xa_key(pci_domain_nr(dev->bus), pci_dev_id(dev));
>  	dev_ser = xa_load(xa, key);
>  
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 938a28e4a7a0..fa26f4170add 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1374,6 +1374,14 @@ bool pci_ea_fixed_busnrs(struct pci_dev *dev, u8 *sec, u8 *sub)
>  	return true;
>  }
>  
> +static bool pci_should_assign_new_buses(struct pci_dev *dev)
> +{
> +	if (dev->liveupdate_inherit_buses)
> +		return false;
> +
> +	return pcibios_assign_all_busses();
> +}
> +
>  /*
>   * pci_scan_bridge_extend() - Scan buses behind a bridge
>   * @bus: Parent bus the bridge is on
> @@ -1401,6 +1409,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  				  int max, unsigned int available_buses,
>  				  int pass)
>  {
> +	const bool assign_new_buses = pci_should_assign_new_buses(dev);
>  	struct pci_bus *child;
>  	u32 buses;
>  	u16 bctl;
> @@ -1453,8 +1462,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  		goto out;
>  	}
>  
> -	if ((secondary || subordinate) &&
> -	    !pcibios_assign_all_busses() && !broken) {
> +	if ((secondary || subordinate) && !assign_new_buses && !broken) {
>  		unsigned int cmax, buses;
>  
>  		/*
> @@ -1496,8 +1504,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  		 * do in the second pass.
>  		 */
>  		if (!pass) {
> -			if (pcibios_assign_all_busses() || broken)
> -
> +			if (assign_new_buses || broken)
>  				/*
>  				 * Temporarily disable forwarding of the
>  				 * configuration cycles on all bridges in
> @@ -1511,6 +1518,12 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  			goto out;
>  		}
>  
> +		if (dev->liveupdate_inherit_buses) {
> +			pci_err(dev, "Cannot reconfigure bridge during Live Update!\n");
> +			pci_err(dev, "Downstream devices will not be enumerated!\n");
> +			goto out;
> +		}
> +
>  		/* Clear errors */
>  		pci_write_config_word(dev, PCI_STATUS, 0xffff);
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index dd6b26ca9462..9a602b322e3c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -511,6 +511,7 @@ struct pci_dev {
>  	unsigned int	rom_bar_overlap:1;	/* ROM BAR disable broken */
>  	unsigned int	rom_attr_enabled:1;	/* Display of ROM attribute enabled? */
>  	unsigned int	non_mappable_bars:1;	/* BARs can't be mapped to user-space  */
> +	unsigned int	liveupdate_inherit_buses:1; /* Inherit bus numbers due to Live Update */
>  	pci_dev_flags_t dev_flags;
>  	atomic_t	enable_cnt;	/* pci_enable_device has been called */


