Date: Mon, 1 Dec 2025 18:54:11 +0000
From: David Matlack
To: Jason Gunthorpe
Cc: Pasha Tatashin, Lukas Wunner, Alex Williamson, Adithya Jayachandran,
    Alex Mastro, Alistair Popple, Andrew Morton, Bjorn Helgaas, Chris Li,
    David Rientjes, Jacob Pan, Josh Hilke, Kevin Tian, kvm@vger.kernel.org,
    Leon Romanovsky, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-pci@vger.kernel.org,
    Mike Rapoport, Parav Pandit, Philipp Stanner, Pratyush Yadav,
    Saeed Mahameed, Samiullah Khawaja, Shuah Khan, Tomita Moeko,
    Vipin Sharma, William Tu, Yi Liu, Yunxiang Li, Zhu Yanjun
Subject: Re: [PATCH 02/21] PCI: Add API to track PCI devices preserved across Live Update
References: <20251126193608.2678510-1-dmatlack@google.com>
    <20251126193608.2678510-3-dmatlack@google.com>
    <20251130005113.GB760268@nvidia.com>
    <20251201132934.GA1075897@nvidia.com>
In-Reply-To: <20251201132934.GA1075897@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2025-12-01 09:29 AM, Jason Gunthorpe wrote:
> On Sat, Nov 29, 2025 at 08:20:34PM -0500, Pasha Tatashin wrote:
> > On Sat, Nov 29, 2025 at 7:51 PM Jason Gunthorpe wrote:
> > >
> > > On Sat, Nov 29, 2025 at 11:34:49AM +0100, Lukas Wunner wrote:
> > > > On Wed, Nov 26, 2025 at 07:35:49PM +0000, David Matlack wrote:
> > > > > Add an API to enable the PCI subsystem to track all devices that are
> > > > > preserved across a Live Update, including both incoming devices (passed
> > > > > from the previous kernel) and outgoing devices (passed to the next
> > > > > kernel).
> > > > >
> > > > > Use PCI segment number and BDF to keep track of devices across Live
> > > > > Update. This means the kernel must keep both identifiers constant across
> > > > > a Live Update for any preserved device.
> > > >
> > > > While bus numbers will *usually* stay the same across next and previous
> > > > kernel, there are exceptions. E.g. if "pci=assign-busses" is specified
> > > > on the command line, the kernel will re-assign bus numbers on every boot.
> > >
> > > Stuff like this has to be disabled for this live update stuff, if the
> > > bus numbers are changed it will break the active use of the iommu
> > > across the kexec.
> > >
> > > So while what you say is all technically true, I'm not sure this is
> > > necessary.
> >
> > I agree. However, Lukas's comment made me wonder about the future: if
> > we eventually need to preserve non-PCI devices (like a TPM), should we
> > be designing a common identification mechanism for all buses now? Or
> > should we settle on BDF for PCI and invent stable identifiers for
> > other bus types as they become necessary?
>
> Well, at least PCI subsystem should use BDF..
>
> You are probably right that the matching of preserved data to a struct
> device should be more general though.

Lukas' suggestion would also make it more reliable to detect bus numbers
changing during a Live Update. We can play whack-a-mole with things like
assign-busses, but there will be a risk that we miss something or
something changes in the future.

Perhaps it would make sense to rely on BDF in the PCI subsystem in the
short term and enforce bus number stability manually (e.g. see patch at
the bottom), and then explore stable device paths as a future improvement
to make PCI device preservation more reliable and also to enable other
bus types?

To handle pci=assign-busses, perhaps something like this? Are there any
other places where the kernel could change busses?

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0ce98e18b5a8..2e1e1aa385a8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1331,6 +1331,20 @@ static bool pci_ea_fixed_busnrs(struct pci_dev *dev, u8 *sec, u8 *sub)
 	return true;
 }
 
+static bool pci_assign_all_busses(void)
+{
+	/*
+	 * During a Live Update, do not assign new bus numbers. Use bus numbers
+	 * assigned by the firmware and the previous kernel. Bus numbers must
+	 * remain constant so that devices preserved across the Live Update can
+	 * use the IOMMU uninterrupted.
+	 */
+	if (liveupdate_count())
+		return false;
+
+	return pcibios_assign_all_busses();
+}
+
 /*
  * pci_scan_bridge_extend() - Scan buses behind a bridge
  * @bus: Parent bus the bridge is on
@@ -1404,7 +1418,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
 			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
 
-	if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
+	if ((secondary || subordinate) && !pci_assign_all_busses() &&
 	    !is_cardbus && !broken) {
 		unsigned int cmax, buses;
 
@@ -1441,13 +1455,16 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 		if (subordinate > max)
 			max = subordinate;
 	} else {
+		pci_WARN_ONCE(dev, liveupdate_count(),
+			      "Assigning new bus numbers during a Live Update! [%u %u %u %u]\n",
+			      secondary, subordinate, is_cardbus, broken);
 
 		/*
 		 * We need to assign a number to this bus which we always
 		 * do in the second pass.
 		 */
 		if (!pass) {
-			if (pcibios_assign_all_busses() || broken || is_cardbus)
+			if (pci_assign_all_busses() || broken || is_cardbus)
 
 				/*
 				 * Temporarily disable forwarding of the
@@ -1522,7 +1539,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 							max+i+1))
 					break;
 				while (parent->parent) {
-					if ((!pcibios_assign_all_busses()) &&
+					if ((!pci_assign_all_busses()) &&
 					    (parent->busn_res.end > max) &&
 					    (parent->busn_res.end <= max+i)) {
 						j = 1;
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index b913d63eab5f..87a4982d0eb1 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -219,6 +219,7 @@ struct liveupdate_flb {
 
 /* Return true if live update orchestrator is enabled */
 bool liveupdate_enabled(void);
+int liveupdate_count(void);
 
 /* Called during kexec to tell LUO that entered into reboot */
 int liveupdate_reboot(void);
@@ -241,6 +242,11 @@ static inline bool liveupdate_enabled(void)
 	return false;
 }
 
+static inline int liveupdate_count(void)
+{
+	return 0;
+}
+
 static inline int liveupdate_reboot(void)
 {
 	return 0;
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index 69298d82f404..2f273397bd41 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -256,6 +256,13 @@ bool liveupdate_enabled(void)
 {
 	return luo_global.enabled;
 }
+EXPORT_SYMBOL_GPL(liveupdate_enabled);
+
+int liveupdate_count(void)
+{
+	return luo_global.liveupdate_num;
+}
+EXPORT_SYMBOL_GPL(liveupdate_count);
 
 /**
  * DOC: LUO ioctl Interface
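
For anyone who wants to reason about the effect of the wrapper without
booting a kernel, here is a rough user-space model of the decision the
patch changes. It is only a sketch: struct boot_state and the model_*()
helpers are made-up names for illustration, and the two fields simply
stand in for the return values of liveupdate_count() and
pcibios_assign_all_busses().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not part of the patch. */
struct boot_state {
	int liveupdate_count;	/* models the value of liveupdate_count() */
	bool arch_assign_all;	/* models pcibios_assign_all_busses() */
};

/* Models the pci_assign_all_busses() wrapper added in probe.c. */
static bool model_assign_all_busses(const struct boot_state *s)
{
	assert(s);

	/* During a Live Update, never ask for wholesale reassignment. */
	if (s->liveupdate_count)
		return false;

	return s->arch_assign_all;
}

/*
 * Models the first branch of pci_scan_bridge_extend(): true means the
 * bridge keeps the bus numbers it already has, false means it falls
 * through to the path that assigns new ones.
 */
static bool model_keep_bus_numbers(const struct boot_state *s,
				   unsigned int secondary,
				   unsigned int subordinate,
				   bool is_cardbus, bool broken)
{
	return (secondary || subordinate) && !model_assign_all_busses(s) &&
	       !is_cardbus && !broken;
}
```

In this model, a Live Update boot keeps the existing numbers even when
the architecture would otherwise reassign them, but a bridge that is
unconfigured, CardBus, or flagged broken still lands in the reassignment
path. That remaining case is the one the pci_WARN_ONCE() in the patch is
meant to surface.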