From: Samiullah Khawaja <skhawaja@google.com>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
	Joerg Roedel <joro@8bytes.org>,  Will Deacon <will@kernel.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	 Robin Murphy <robin.murphy@arm.com>,
	Kevin Tian <kevin.tian@intel.com>,
	 Alex Williamson <alex@shazbot.org>,
	Shuah Khan <shuah@kernel.org>,
	iommu@lists.linux.dev,  linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, Pratyush Yadav <pratyush@kernel.org>,
	 Pasha Tatashin <pasha.tatashin@soleen.com>,
	David Matlack <dmatlack@google.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Pranjal Shrivastava <praan@google.com>,
	 Vipin Sharma <vipinsh@google.com>
Subject: Re: [PATCH v3 07/18] iommu/vt-d: Implement device and iommu preserve/unpreserve ops
Date: Mon, 22 Jun 2026 19:19:11 +0000
Message-ID: <ajl7xfcoKsohfpPJ@google.com>
In-Reply-To: <f3e2fef9-4f8c-4032-8d7b-007f47b3fb61@linux.intel.com>

On Mon, Jun 22, 2026 at 09:50:36AM +0800, Baolu Lu wrote:
>On 6/15/26 07:37, Samiullah Khawaja wrote:
>>Add implementation of the device and iommu preservation in a separate
>>file. Also set the device and iommu preserve/unpreserve ops in the
>>struct iommu_ops.
>>
>>Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
>>---
>>  MAINTAINERS                      |   8 ++
>>  drivers/iommu/intel/Makefile     |   1 +
>>  drivers/iommu/intel/iommu.c      |   8 +-
>>  drivers/iommu/intel/iommu.h      |  28 +++++
>>  drivers/iommu/intel/liveupdate.c | 175 +++++++++++++++++++++++++++++++
>>  include/linux/kho/abi/iommu.h    |  21 ++++
>>  6 files changed, 239 insertions(+), 2 deletions(-)
>>  create mode 100644 drivers/iommu/intel/liveupdate.c
>>

[snip]

>>-
>>  /*
>>   * Take a root_entry and return the Lower Context Table Pointer (LCTP)
>>   * if marked present.
>>@@ -3926,6 +3925,11 @@ const struct iommu_ops intel_iommu_ops = {
>>  	.is_attach_deferred	= intel_iommu_is_attach_deferred,
>>  	.def_domain_type	= device_def_domain_type,
>>  	.page_response		= intel_iommu_page_response,
>>+#ifdef CONFIG_IOMMU_LIVEUPDATE
>>+	.preserve_device	= intel_iommu_preserve_device,
>>+	.preserve		= intel_iommu_preserve,
>>+	.unpreserve		= intel_iommu_unpreserve,
>>+#endif
>
>Any reason why an unpreserve_device callback is missing here?

I add unpreserve_device in a later patch, when PASID support is
introduced, because in this patch the context tables are unpreserved
globally during unpreserve(iommu).

But I agree this looks incomplete and I will add a stub
unpreserve_device in this patch.
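
For illustration only, a user-space sketch of what pairing every preserve
op with an unpreserve op could look like. All names here (ops_sketch,
dev_sketch, the sketch_* functions) are hypothetical stand-ins and not the
kernel's struct iommu_ops or its real callback signatures.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device; not a kernel type. */
struct dev_sketch {
	int preserved;
};

/* Hypothetical ops table with symmetric preserve/unpreserve hooks. */
struct ops_sketch {
	int  (*preserve_device)(struct dev_sketch *dev);
	void (*unpreserve_device)(struct dev_sketch *dev);
};

static int sketch_preserve_device(struct dev_sketch *dev)
{
	dev->preserved = 1;
	return 0;
}

/*
 * Stub: in this sketch the per-device state is released elsewhere
 * (globally), so there is nothing to undo per device yet. The hook
 * exists only so the ops table is symmetric.
 */
static void sketch_unpreserve_device(struct dev_sketch *dev)
{
	(void)dev;
}

static const struct ops_sketch sketch_ops = {
	.preserve_device	= sketch_preserve_device,
	.unpreserve_device	= sketch_unpreserve_device,
};
```

With the stub in place a caller can invoke unpreserve_device
unconditionally instead of checking for a NULL hook.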
>
>>  };
>>  static void quirk_iommu_igfx(struct pci_dev *dev)

[snip]

>>+
>>+static void unpreserve_context_table(struct intel_iommu *iommu,
>>+				     struct iommu_hw_ser *ser,
>>+				     u8 bus, u8 devfn)
>>+{
>>+	struct context_entry *context;
>>+
>>+	spin_lock(&iommu->lock);
>>+	context = iommu_context_addr(iommu, bus, devfn, 0);
>>+	spin_unlock(&iommu->lock);
>
>The spinlock is dropped immediately after reading the address pointer.
>If this is guaranteed to be safe, please add a comment to explain why a
>UAF or race is avoided here. Otherwise, the locking scope needs to be
>extended to protect both the pointer lookup and use.

In the Intel VT-d driver, context tables are never freed once they are
allocated during runtime, as they can be shared across multiple devices.
So I took the lock here to protect against concurrent allocations inside
iommu_context_addr(). Once the address is read, it is safe to use
without holding the lock until the DMAR unit itself is torn down.

I will add a comment explaining this here.
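
A minimal user-space sketch of the pattern described above: the lock only
serializes the lookup against concurrent lazy allocation, and the returned
pointer stays valid after unlock because tables are never freed before
teardown. Everything here (unit_sketch, table_sketch, table_addr, the
atomic_flag spinlock) is a hypothetical stand-in, not the VT-d driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_TABLES 4	/* arbitrary size for the sketch */

struct table_sketch {
	uint64_t entries[8];
};

struct unit_sketch {
	atomic_flag lock;			/* toy spinlock */
	struct table_sketch *tables[NR_TABLES];	/* lazily allocated, never freed */
};

static void unit_init(struct unit_sketch *u)
{
	atomic_flag_clear(&u->lock);
	for (size_t i = 0; i < NR_TABLES; i++)
		u->tables[i] = NULL;
}

static void unit_lock(struct unit_sketch *u)
{
	while (atomic_flag_test_and_set_explicit(&u->lock, memory_order_acquire))
		;	/* spin */
}

static void unit_unlock(struct unit_sketch *u)
{
	atomic_flag_clear_explicit(&u->lock, memory_order_release);
}

/* Caller must hold the lock. Allocates the table only when alloc != 0. */
static struct table_sketch *table_addr(struct unit_sketch *u, unsigned int idx,
				       int alloc)
{
	assert(idx < NR_TABLES);
	if (!u->tables[idx] && alloc)
		u->tables[idx] = calloc(1, sizeof(*u->tables[idx]));
	return u->tables[idx];
}

static struct table_sketch *lookup_table(struct unit_sketch *u, unsigned int idx)
{
	struct table_sketch *t;

	/* The lock protects the lookup against a concurrent allocation. */
	unit_lock(u);
	t = table_addr(u, idx, 0);
	unit_unlock(u);

	/*
	 * Using t after unlock is safe only under the stated invariant:
	 * a table, once allocated, is not freed until the unit is torn down.
	 */
	return t;
}
```

If tables could be freed at runtime, the lock (or a reference) would have
to be held across the use of the pointer as well.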
>
>>+	if (context && is_context_table_preserved(iommu, ser, bus, devfn)) {
>>+		iommu_unpreserve_pages(context);
>>+		clear_bit(CONTEXT_TABLE_PRESERVED_BIT(bus, devfn),
>>+			  (unsigned long *)&ser->intel.context_tables_bitmap[0]);
>>+	}
>>+}
>>+
>>+static int preserve_context_table(struct intel_iommu *iommu,
>>+				  struct iommu_hw_ser *ser,
>>+				  u8 bus, u8 devfn)
>>+{
>>+	struct context_entry *context;
>>+	int ret;
>>+
>>+	spin_lock(&iommu->lock);
>>+	context = iommu_context_addr(iommu, bus, devfn, 0);
>>+	spin_unlock(&iommu->lock);
>
>Ditto.

Answered above.
>
>>+	if (context && !is_context_table_preserved(iommu, ser, bus, devfn)) {
>>+		ret = iommu_preserve_pages(context);
>>+		if (ret)
>>+			return ret;
>>+
>>+		set_bit(CONTEXT_TABLE_PRESERVED_BIT(bus, devfn),
>>+			(unsigned long *)&ser->intel.context_tables_bitmap[0]);
>>+	}
>>+
>>+	return 0;
>>+}
>>+

[snip]

>>diff --git a/include/linux/kho/abi/iommu.h b/include/linux/kho/abi/iommu.h
>>index d06f251a2df4..ad760c497e13 100644
>>--- a/include/linux/kho/abi/iommu.h
>>+++ b/include/linux/kho/abi/iommu.h
>>@@ -80,6 +80,7 @@
>>   */
>>  enum iommu_type_ser {
>>  	IOMMU_INVALID,
>>+	IOMMU_INTEL,
>>  };
>>  #define IOMMU_SER_FLAG_DELETED	(1 << 0)
>>@@ -140,16 +141,36 @@ struct iommu_device_ser {
>>  	struct iommu_dev_map_ser domain_iommu_ser;
>>  } __packed;
>>+/**
>>+ * struct iommu_intel_ser - Serialized state of an Intel IOMMU instance
>>+ * @restored: Whether IOMMU state is restored
>>+ * @phys_addr: Physical address of the IOMMU register base
>>+ * @root_table: Physical address of the root entry table
>>+ * @context_tables_bitmap: Bitmap representing the context tables that are
>>+ * preserved.
>>+ */
>>+struct iommu_intel_ser {
>>+	u8 restored;
>>+	u8 padding[7];
>>+	u64 phys_addr;
>>+	u64 root_table;
>>+	u64 context_tables_bitmap[8]; /* Tracks up to 512 context tables */
>
>To avoid open-coded magic numbers, how about something like,
>
>#define VTD_PRESERVED_BITMAP_LONGS  DIV_ROUND_UP(512, BITS_PER_LONG_LONG)
>u64 context_tables_bitmap[VTD_PRESERVED_BITMAP_LONGS];
>
>?

This looks great to me. I will update this here.
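
As a quick check of the arithmetic, a user-space sketch of the suggested
sizing. DIV_ROUND_UP and BITS_PER_LONG_LONG are redefined locally as
stand-ins for the kernel macros. VTD_PRESERVED_CONTEXT_TABLES, the struct
name and the bit-index mapping in ctx_table_bit() are assumptions for
illustration; the real CONTEXT_TABLE_PRESERVED_BIT() is not shown in this
thread. The 512 is taken to be 256 buses times two context tables per bus.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same names. */
#define BITS_PER_LONG_LONG	(sizeof(long long) * CHAR_BIT)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical name for the magic number 512. */
#define VTD_PRESERVED_CONTEXT_TABLES	512
#define VTD_PRESERVED_BITMAP_LONGS \
	DIV_ROUND_UP(VTD_PRESERVED_CONTEXT_TABLES, BITS_PER_LONG_LONG)

struct intel_ser_sketch {
	uint8_t  restored;
	uint8_t  padding[7];
	uint64_t phys_addr;
	uint64_t root_table;
	uint64_t context_tables_bitmap[VTD_PRESERVED_BITMAP_LONGS];
};

/* The bitmap must hold exactly one bit per tracked context table. */
_Static_assert(sizeof(((struct intel_ser_sketch *)0)->context_tables_bitmap) *
	       CHAR_BIT == VTD_PRESERVED_CONTEXT_TABLES,
	       "bitmap size does not match the number of context tables");

/* Assumed mapping: two tables per bus, selected by the top bit of devfn. */
static inline unsigned int ctx_table_bit(uint8_t bus, uint8_t devfn)
{
	return (unsigned int)bus * 2 + (devfn >= 0x80 ? 1 : 0);
}

static inline void ctx_bitmap_set(uint64_t *bm, unsigned int bit)
{
	assert(bit < VTD_PRESERVED_CONTEXT_TABLES);
	bm[bit / 64] |= UINT64_C(1) << (bit % 64);
}

static inline void ctx_bitmap_clear(uint64_t *bm, unsigned int bit)
{
	assert(bit < VTD_PRESERVED_CONTEXT_TABLES);
	bm[bit / 64] &= ~(UINT64_C(1) << (bit % 64));
}

static inline int ctx_bitmap_test(const uint64_t *bm, unsigned int bit)
{
	assert(bit < VTD_PRESERVED_CONTEXT_TABLES);
	return (bm[bit / 64] >> (bit % 64)) & 1;
}
```

Deriving the array length from a named count keeps the ABI struct and any
bounds checks tied to one definition.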
>
>>+};
>>+
>>  /**
>>   * struct iommu_hw_ser - Serialized state of an IOMMU instance
>>   * @hdr: Common object header
>>   * @token: Unique token for the IOMMU
>>   * @type: IOMMU type serialized state belongs to
>>+ * @intel: Intel specific serialization data
>>   */
>>  struct iommu_hw_ser {
>>  	struct iommu_hdr_ser hdr;
>>  	u64 token;
>>  	u64 type;
>>+	union {
>>+		struct iommu_intel_ser intel;
>>+	};
>>  } __packed;
>>  /**
>
>Thanks,
>baolu
>

Thanks,
Sami
