From: Samiullah Khawaja <skhawaja@google.com>
To: David Matlack <dmatlack@google.com>
Cc: kexec@lists.infradead.org, linux-doc@vger.kernel.org,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-pci@vger.kernel.org,
	 Adithya Jayachandran <ajayachandra@nvidia.com>,
	Alexander Graf <graf@amazon.com>,
	 Alex Williamson <alex@shazbot.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	 Chris Li <chrisl@kernel.org>,
	David Rientjes <rientjes@google.com>,
	 Jacob Pan <jacob.pan@linux.microsoft.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	 Jonathan Corbet <corbet@lwn.net>,
	Josh Hilke <jrhilke@google.com>,
	 Leon Romanovsky <leonro@nvidia.com>,
	Lukas Wunner <lukas@wunner.de>, Mike Rapoport <rppt@kernel.org>,
	 Parav Pandit <parav@nvidia.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	 Pranjal Shrivastava <praan@google.com>,
	Pratyush Yadav <pratyush@kernel.org>,
	 Saeed Mahameed <saeedm@nvidia.com>,
	Shuah Khan <skhan@linuxfoundation.org>,
	 Vipin Sharma <vipinsh@google.com>, William Tu <witu@nvidia.com>,
	Yi Liu <yi.l.liu@intel.com>
Subject: Re: [PATCH v6 03/12] PCI: liveupdate: Track incoming preserved PCI devices
Date: Tue, 16 Jun 2026 22:38:04 +0000
Message-ID: <ajHOd1lkBJTySVif@google.com>
In-Reply-To: <CALzav=eo=UwoTNTYM8Z7uKoihxfB7NtVP701qidVgoqyBKhUig@mail.gmail.com>

On Tue, Jun 16, 2026 at 03:20:33PM -0700, David Matlack wrote:
>On Tue, Jun 16, 2026 at 1:09 PM Samiullah Khawaja <skhawaja@google.com> wrote:
>>

[snip]

>>
>> Hmm.. This is interesting, so the KHO state is freed and it cannot be
>> reused. I see you already pointed out that we are putting an LUO policy
>> to say that the retry is not allowed.
>>
>> But what should be the behaviour of liveupdate in this regard? Let the
>> system boot in a normal way? This might break other subsystems as they
>> might depend on PCIe restoring state properly. Also I think some of the
>> PCIe state, like device-id, BAR addresses, ACLs etc, might be used as
>> source of truth by other components.
>>
>> For example, let's say FLB retrieve() of PCIe fails, but succeeds for
>> VFIO/IOMMU, now VFIO/IOMMU are restoring state of a device that is not
>> restored/preserved?
>>
>> Should this be considered fatal?
>
>If PCI FLB retrieve fails then there are certain things that cannot be
>guaranteed, such as BDF (B specifically) remaining constant. This
>could lead to memory corruption as the IOMMU may have live
>translations in place for those specific RequesterIDs. And, in the
>future, preserved devices may be doing P2P which depends on BARs not
>moving. If the PCI core cannot retrieve the FLB saved by the previous
>kernel, it cannot make these guarantees.

Yes, this is what I was worried about.
>
>So yeah I think you're right that PCI core should treat FLB retrieve
>as fatal and just panic.

This sounds great.
>
>> > }
>> >
>> > static void pci_flb_finish(struct liveupdate_flb_op_args *args)
>> > {
>> >-      kho_restore_free(args->obj);
>> >+      struct pci_flb_incoming *incoming = args->obj;
>> >+
>> >+      xa_destroy(&incoming->xa);
>> >+      kho_restore_free(incoming->ser);
>> >+      kfree(incoming);
>> > }
>> >
>> > static struct liveupdate_flb_ops pci_liveupdate_flb_ops = {
>> >@@ -270,6 +335,91 @@ void pci_liveupdate_unpreserve(struct pci_dev *dev)
>> > }
>> > EXPORT_SYMBOL_GPL(pci_liveupdate_unpreserve);
>> >
>> >+static struct pci_flb_incoming *pci_liveupdate_flb_get_incoming(void)
>> >+{
>> >+      struct pci_flb_incoming *incoming = NULL;
>> >+      int ret;
>> >+
>> >+      ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&incoming);
>> >+
>> >+      /* Live Update is not enabled. */
>> >+      if (ret == -EOPNOTSUPP)
>> >+              return NULL;
>> >+
>> >+      /* Live Update is enabled, but there is no incoming FLB data. */
>> >+      if (ret == -ENODATA)
>> >+              return NULL;
>> >+
>> >+      /*
>> >+       * Live Update is enabled and there is incoming FLB data, but none of it
>> >+       * matches pci_liveupdate_flb.compatible.
>> >+       *
>> >+       * This could mean that no PCI FLB data was passed by the previous
>> >+       * kernel, but it could also mean the previous kernel used a different
>> >+       * compatibility string (i.e. a different ABI).
>> >+       */
>> >+      if (ret == -ENOENT) {
>> >+              pr_info_once("No incoming FLB matched %s\n", pci_liveupdate_flb.compatible);
>> >+              return NULL;
>> >+      }
>> >+
>> >+      /*
>> >+       * There is incoming FLB data that matches pci_liveupdate_flb.compatible
>> >+       * but it cannot be retrieved.
>> >+       */
>> >+      if (ret) {
>> >+              WARN_ONCE(ret, "Failed to retrieve incoming FLB data\n");
>>
>> I think this should probably be considered fatal as mentioned above or
>> the caller of this function should get an error so it can fail. I think
>> retrieval of preserved state should generally not fail unless there is
>> memory corruption or the ABI is incompatible.
>
>Yeah. I think I will just call panic() here to cover all cases.

We have an LUO-specific panic macro/function that you can use:

luo_restore_fail()
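
To make the split discussed above concrete, here is a minimal userspace
sketch of how the return codes could be classified: the three "nothing to
restore" codes fall through to a normal boot, and any other error is
treated as fatal. The enum and flb_classify() are hypothetical names used
only for illustration, not part of the patch. In the kernel, the fatal
branch is where the panic (or the LUO helper) would go.

```c
#include <errno.h>

/* Hypothetical classification of an FLB retrieve return code. */
enum flb_outcome {
	FLB_OK,		/* incoming data was retrieved */
	FLB_ABSENT,	/* nothing to restore, boot normally */
	FLB_FATAL,	/* matching data exists but cannot be retrieved */
};

/* Illustration only: mirrors the error checks in the quoted hunk. */
static enum flb_outcome flb_classify(int ret)
{
	switch (ret) {
	case 0:
		return FLB_OK;
	case -EOPNOTSUPP:	/* Live Update is not enabled */
	case -ENODATA:		/* enabled, but no incoming FLB data */
	case -ENOENT:		/* no FLB matched the compatible string */
		return FLB_ABSENT;
	default:		/* any other error: state cannot be trusted */
		return FLB_FATAL;
	}
}
```

With this shape the caller never sees a "soft" failure for preserved
state: it either has data, has none, or the boot stops.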
>
>> >+              return NULL;
>> >+      }
>> >+
>> >+      return incoming;
>> >+}
>> >+
>>
>> [snip]
>> >+
>> >+static inline bool pci_liveupdate_is_incoming(struct pci_dev *dev)
>> >+{
>> >+      return false;
>> >+}
>> > #endif
>> >
>> > #endif /* LINUX_PCI_LIVEUPDATE_H */
>> >--
>> >2.54.0.746.g67dd491aae-goog
>> >
>>
>> Sami

