Date: Tue, 24 Feb 2026 19:05:41 +0000
From: Pranjal Shrivastava
To: David Matlack
Cc: Alex Williamson, Adithya Jayachandran, Alexander Graf, Alex Mastro,
	Alistair Popple, Andrew Morton, Ankit Agrawal, Bjorn Helgaas, Chris Li,
	David Rientjes, Jacob Pan, Jason Gunthorpe, Jonathan Corbet, Josh Hilke,
	Kevin Tian, kexec@lists.infradead.org, kvm@vger.kernel.org,
	Leon Romanovsky, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
	linux-pci@vger.kernel.org, Lukas Wunner, Michał Winiarski, Mike Rapoport,
	Parav Pandit, Pasha Tatashin, Pratyush Yadav, Raghavendra Rao Ananta,
	Rodrigo Vivi, Saeed Mahameed, Samiullah Khawaja, Shuah Khan,
	Thomas Hellström, Tomita Moeko, Vipin Sharma, Vivek Kasireddy,
	William Tu, Yi Liu, Zhu Yanjun
Subject: Re: [PATCH v2 02/22] PCI: Add API to track PCI devices preserved across Live Update
References: <20260129212510.967611-1-dmatlack@google.com> <20260129212510.967611-3-dmatlack@google.com>

On Tue, Feb 24, 2026 at 07:02:56PM +0000, Pranjal Shrivastava wrote:
> On Tue, Feb 24, 2026 at 09:33:28AM -0800, David Matlack wrote:
> > On Tue, Feb 24, 2026 at 1:18 AM Pranjal Shrivastava wrote:
> > > On Thu, Jan 29, 2026 at 09:24:49PM +0000, David Matlack wrote:
> > > > + * Copyright (c) 2025, Google LLC.
> > >
> > > Nit: Should these be 2026 now?
> >
> > Yes! Thanks for catching that.
> >
> > > > +int pci_liveupdate_outgoing_preserve(struct pci_dev *dev)
> > > > +{
> > > > +	struct pci_dev_ser new = INIT_PCI_DEV_SER(dev);
> > > > +	struct pci_ser *ser;
> > > > +	int i, ret;
> > > > +
> > > > +	/* Preserving VFs is not supported yet. */
> > > > +	if (dev->is_virtfn)
> > > > +		return -EINVAL;
> > > > +
> > > > +	guard(mutex)(&pci_flb_outgoing_lock);
> > > > +
> > > > +	if (dev->liveupdate_outgoing)
> > > > +		return -EBUSY;
> > > > +
> > > > +	ret = liveupdate_flb_get_outgoing(&pci_liveupdate_flb, (void **)&ser);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	if (ser->nr_devices == ser->max_nr_devices)
> > > > +		return -E2BIG;
> > >
> > > I'm wondering how (or if) this handles hot-plugged devices?
> > > max_nr_devices is calculated based on for_each_pci_dev at the time of
> > > the first preservation.. what happens if a device is hotplugged after
> > > the first device is preserved but before the second one is, does
> > > max_nr_devices become stale? Since ser->max_nr_devices will not reflect
> > > the actual possible device count, potentially leading to an unnecessary
> > > -E2BIG failure?
> >
> > Yes, it's possible to run out of space to preserve devices if devices are
> > hot-plugged and then preserved.
> > But I think it's better to defer
> > handling such a use-case until it exists (unless you see an obvious simple
> > solution). So far I am not seeing preserving hot-plugged devices
> > across Live Update as a high priority use-case to support.
> >
>
> Ack. If we aren't supporting preservation for hot-plug at this point,
> let's mention that somewhere? Maybe just a little comment or the kdoc?
>
> > > > +u32 pci_liveupdate_incoming_nr_devices(void)
> > > > +{
> > > > +	struct pci_ser *ser;
> > > > +	int ret;
> > > > +
> > > > +	ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > > > +	if (ret)
> > > > +		return 0;
> > >
> > > Masking this error looks troubling. In the following patch, I see that
> > > the retval 0 is treated as a fresh boot, but the IOMMU mappings for that
> > > BDF might still be preserved? Which could lead to DMA aliasing issues,
> > > without a hint of what happened since we don't even log anything.
> >
> > All of the non-0 errors indicate there are 0 incoming devices at the
> > time of the call, so I think returning 0 is appropriate.
> >
> > - EOPNOTSUPP: Live Update is not enabled.
> > - ENODATA: Live Update is finished (all incoming devices have been restored).
> > - ENOENT: No PCI data was preserved across the Live Update.
> >

The flb_retrieve_one seems to call:

	err = flb->ops->retrieve(&args);

which could be anything honestly.. since the luo_core doesn't scream
about it, maybe the caller should?

Thanks,
Praan

> > None of these cover the case where an IOMMU mapping for BDF X is
> > preserved, but device X is not preserved. This is a case we should
> > handle in some way... but here is not that place.
> >
> > >
> > > Maybe we could have something like the following:
> > >
> > > int pci_liveupdate_incoming_nr_devices(void)
> > > {
> > > 	struct pci_ser *ser;
> > > 	int ret;
> > >
> > > 	ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > > 	if (ret) {
> > > 		if (ret != -ENOENT)
> > > 			pr_warn("PCI: Failed to retrieve preservation list: %d\n", ret);
> >
> > This would cause this warning to get printed if Live Update was
> > disabled, or if no PCI devices were preserved. But both of those are
> > not error scenarios.
> >
>
> I agree, the snippet was just an example. What I'm trying to say here
> is, what if the retval is -ENOMEM / -ENODATA, the existing code will
> treat it as a fresh boot because it believes there are no incoming
> devices. However, since this was an incoming device which failed to be
> retrieved, there's a chance that its IOMMU mapping was preserved too.
> By returning 0, the PCI core will feel free to rebalance bus numbers or
> reassign BARs. For instance, if the IOMMU already inherited mappings for
> BDF 02:00.0, but the PCI core (due to this masked error) reassigns a
> different device to that BDF, we face DMA aliasing or IOMMU faults.
> Am I missing some context here?
>
> > > > +void pci_liveupdate_setup_device(struct pci_dev *dev)
> > > > +{
> > > > +	struct pci_ser *ser;
> > > > +	int ret;
> > > > +
> > > > +	ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > > > +	if (ret)
> > > > +		return;
> > >
> > > We should log something here either at info / debug level since the
> > > error isn't bubbled up and the luo_core doesn't scream about it either.
> >
> > Any error from liveupdate_flb_get_incoming() simply means there are no
> > incoming devices. So I don't think there's any error to report in
> > dmesg.
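
To make the disagreement above concrete, here is a minimal userspace sketch
(not kernel code and not part of the patch) of telling the "no incoming
devices" errors apart from unexpected ones. The helper names
incoming_err_is_benign() and nr_devices_or_zero() are hypothetical. The set
of benign codes is taken from the list quoted earlier in the thread, and
whether -ENODATA belongs in it is exactly what is being debated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: true if an error from the incoming lookup only
 * means "there is nothing to restore", per the list quoted in the thread.
 */
static bool incoming_err_is_benign(int err)
{
	switch (err) {
	case -EOPNOTSUPP:	/* Live Update is not enabled */
	case -ENODATA:		/* Live Update finished, all devices restored */
	case -ENOENT:		/* no PCI data was preserved */
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical stand-in for the counting function: 'err' plays the role
 * of the lookup's return value, 'nr' the preserved device count. Any
 * error still yields 0, but unexpected ones are reported first.
 */
static unsigned int nr_devices_or_zero(int err, unsigned int nr)
{
	assert(err <= 0);	/* kernel-style: 0 or a negative errno */

	if (err) {
		if (!incoming_err_is_benign(err))
			fprintf(stderr,
				"PCI: unexpected error retrieving preserved devices: %d\n",
				err);
		return 0;
	}
	return nr;
}
```

With this shape, a call such as nr_devices_or_zero(-ENOMEM, 0) still returns
0 but leaves a trace, while -ENOENT and -EOPNOTSUPP stay silent.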
> >
> > > > +	dev->liveupdate_incoming = !!pci_ser_find(ser, dev);
> > >
> > > This feels a little hacky, shall we go for something like:
> > >
> > > dev->liveupdate_incoming = (pci_ser_find(ser, dev) != NULL); ?
> >
> > In my experience in the kernel (mostly from KVM), explicit comparison
> > to NULL is less preferred to treating a pointer as a boolean. But I'm
> > ok with following whatever is the locally preferred style for this
> > kind of check.
> >
>
> No strong feelings there, I see both being used in drivers/pci.
>
> > > > @@ -582,6 +583,10 @@ struct pci_dev {
> > > >  	u8 tph_mode; /* TPH mode */
> > > >  	u8 tph_req_type; /* TPH requester type */
> > > >  #endif
> > > > +#ifdef CONFIG_LIVEUPDATE
> > > > +	unsigned int liveupdate_incoming:1; /* Preserved by previous kernel */
> > > > +	unsigned int liveupdate_outgoing:1; /* Preserved for next kernel */
> > > > +#endif
> > > >  };
> > >
> > > This would start another anon bitfield container, should we move this
> > > above within the existing bitfield? If we've run pahole and found this
> > > to be better, then this should be fine.
> >
> > Yeah I simply appended these new fields to the very end of the struct.
> > If we care about optimizing the packing of struct pci_dev I can find a
> > better place to put it.
>
> If you have pahole handy, it would be great to see if these can slide
> into an existing hole. If not, no big deal for v3.. we can keep it as is
>
> Thanks,
> Praan
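
The bitfield placement question can be illustrated with a small standalone
sketch. The two structs below are hypothetical stand-ins, not struct pci_dev:
one appends the new flags after an unrelated member, the other groups them
with an existing bitfield. Actual layout is ABI-dependent, which is why
pahole output on the real struct is the thing to check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: new flags appended after a pointer member. */
struct dev_appended {
	unsigned int	is_virtfn:1;	/* existing bitfield */
	void		*priv;		/* unrelated member in between */
	unsigned int	lu_incoming:1;	/* starts a new storage unit */
	unsigned int	lu_outgoing:1;
};

/* Hypothetical stand-in: new flags grouped with the existing bitfield. */
struct dev_grouped {
	unsigned int	is_virtfn:1;
	unsigned int	lu_incoming:1;	/* shares the existing storage unit */
	unsigned int	lu_outgoing:1;
	void		*priv;
};

/* Grouping should never make the struct larger on common ABIs. */
static_assert(sizeof(struct dev_grouped) <= sizeof(struct dev_appended),
	      "grouping bitfields should not grow the struct");

/* Bytes saved by grouping the flags; the exact value depends on the ABI. */
static size_t bytes_saved_by_grouping(void)
{
	return sizeof(struct dev_appended) - sizeof(struct dev_grouped);
}
```

In the real struct the new flags follow two u8 members rather than a
pointer, so they may already fall into existing padding; the sketch only
shows why placement can matter.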