From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 27 Apr 2026 17:56:28 +0000
In-Reply-To: <20260427175633.1978233-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260427175633.1978233-1-skhawaja@google.com>
X-Mailer: git-send-email 2.54.0.545.g6539524ca2-goog
Message-ID: <20260427175633.1978233-12-skhawaja@google.com>
Subject: [PATCH v2 11/16] iommu/vt-d: preserve PASID table of preserved device
From: Samiullah Khawaja
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon, Jason Gunthorpe
Cc: Samiullah Khawaja, Robin Murphy, Kevin Tian, Alex Williamson,
	Shuah Khan, iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, Saeed Mahameed, Adithya Jayachandran,
	Parav Pandit, Leon Romanovsky, William Tu, Pratyush Yadav,
	Pasha Tatashin, David Matlack, Andrew Morton, Chris Li,
	Pranjal Shrivastava, Vipin Sharma, YiFei Zhu
Content-Type: text/plain; charset="UTF-8"

In scalable mode the PASID table is used to fetch the I/O page tables.
Preserve and restore the PASID table of the preserved devices.

Signed-off-by: Samiullah Khawaja
---
 drivers/iommu/intel/iommu.c      |   5 +-
 drivers/iommu/intel/iommu.h      |  12 +++
 drivers/iommu/intel/liveupdate.c | 141 +++++++++++++++++++++++++++++++
 drivers/iommu/intel/pasid.c      |   7 +-
 drivers/iommu/intel/pasid.h      |   9 ++
 include/linux/kho/abi/iommu.h    |  13 +++
 6 files changed, 184 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b90757164cd8..6d42051dcf7c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2951,8 +2951,10 @@ static int clear_unpreserve_context_entry_fn(struct device *dev,
 	if (!info)
 		return 0;
 
-	if (dev_is_pci(dev) && dev_iommu_preserved_state(dev))
+	if (dev_is_pci(dev) && dev_iommu_preserved_state(dev)) {
+		pasid_cleanup_preserved_table(dev);
 		return 0;
+	}
 
 	domain_context_clear(info);
 	return 0;
@@ -4013,6 +4015,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.page_response		= intel_iommu_page_response,
 #ifdef CONFIG_IOMMU_LIVEUPDATE
 	.preserve_device	= intel_iommu_preserve_device,
+	.unpreserve_device	= intel_iommu_unpreserve_device,
 	.preserve		= intel_iommu_preserve,
 	.unpreserve		= intel_iommu_unpreserve,
 #endif
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 8e37acf7de12..62076a1a0b4d 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1290,12 +1290,15 @@ static inline int iopf_for_domain_replace(struct iommu_domain *new,
 #ifdef CONFIG_IOMMU_LIVEUPDATE
 int intel_iommu_preserve_device(struct device *dev,
 				struct iommu_device_ser *device_ser);
+void intel_iommu_unpreserve_device(struct device *dev,
+				   struct iommu_device_ser *device_ser);
 int intel_iommu_preserve(struct iommu_device *iommu,
 			 struct iommu_hw_ser *iommu_ser);
 void intel_iommu_unpreserve(struct iommu_device *iommu,
 			    struct iommu_hw_ser *iommu_ser);
 void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
 					       struct iommu_hw_ser *iommu_ser);
+void pasid_cleanup_preserved_table(struct device *dev);
 #else
 static inline int intel_iommu_preserve_device(struct device *dev,
 					      struct iommu_device_ser *device_ser)
@@ -1303,6 +1306,11 @@ static inline int intel_iommu_preserve_device(struct device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline void intel_iommu_unpreserve_device(struct device *dev,
+						 struct iommu_device_ser *device_ser)
+{
+}
+
 static inline int intel_iommu_preserve(struct iommu_device *iommu,
 				       struct iommu_hw_ser *iommu_ser)
 {
@@ -1318,6 +1326,10 @@ static inline void intel_iommu_liveupdate_restore_root_table(struct intel_iommu
 							     struct iommu_hw_ser *iommu_ser)
 {
 }
+
+static inline void pasid_cleanup_preserved_table(struct device *dev)
+{
+}
 #endif
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
index 50a63812533f..404b485e97b9 100644
--- a/drivers/iommu/intel/liveupdate.c
+++ b/drivers/iommu/intel/liveupdate.c
@@ -14,6 +14,7 @@
 #include
 
 #include "iommu.h"
+#include "pasid.h"
 #include "../iommu-pages.h"
 
 static void unpreserve_iommu_context_table(struct intel_iommu *iommu, int end)
@@ -140,10 +141,96 @@ void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
 	iommu_for_each_preserved_device(_restore_used_domain_ids, iommu);
 }
 
+enum pasid_lu_op {
+	PASID_LU_OP_PRESERVE = 1,
+	PASID_LU_OP_UNPRESERVE,
+	PASID_LU_OP_RESTORE,
+	PASID_LU_OP_FREE,
+};
+
+static int pasid_lu_do_op(void *table, enum pasid_lu_op op)
+{
+	int ret = 0;
+
+	switch (op) {
+	case PASID_LU_OP_PRESERVE:
+		ret = iommu_preserve_page(table);
+		break;
+	case PASID_LU_OP_UNPRESERVE:
+		iommu_unpreserve_page(table);
+		break;
+	case PASID_LU_OP_RESTORE:
+		iommu_restore_page(virt_to_phys(table));
+		break;
+	case PASID_LU_OP_FREE:
+		iommu_free_pages(table);
+		break;
+	}
+
+	return ret;
+}
+
+static int pasid_lu_handle_pd(struct pasid_dir_entry *dir, enum pasid_lu_op op)
+{
+	struct pasid_entry *table;
+	int ret;
+
+	/* Only preserve first table for NO_PASID. */
+	table = get_pasid_table_from_pde(&dir[0]);
+	if (!table)
+		return -EINVAL;
+
+	ret = pasid_lu_do_op(table, op);
+	if (ret)
+		return ret;
+
+	ret = pasid_lu_do_op(dir, op);
+	if (ret)
+		goto err;
+
+	return 0;
+err:
+	if (op == PASID_LU_OP_PRESERVE)
+		pasid_lu_do_op(table, PASID_LU_OP_UNPRESERVE);
+
+	return ret;
+}
+
+void pasid_cleanup_preserved_table(struct device *dev)
+{
+	struct pasid_table *pasid_table;
+	struct pasid_dir_entry *dir;
+	struct pasid_entry *table;
+	size_t dir_size;
+
+	pasid_table = intel_pasid_get_table(dev);
+	if (!pasid_table)
+		return;
+
+	dir = pasid_table->table;
+	table = get_pasid_table_from_pde(&dir[0]);
+	if (!table)
+		return;
+
+	/* Clear everything except the first entry in table. */
+	memset(&table[1], 0, SZ_4K - sizeof(*table));
+
+	/* Use the folio order to calculate the size of Pasid Directory */
+	dir_size = (1 << (folio_order(virt_to_folio(dir)) + PAGE_SHIFT));
+
+	/* Clear everything except the first entry in directory */
+	memset(&dir[1], 0, dir_size - sizeof(struct pasid_dir_entry));
+
+	clflush_cache_range(&table[0], SZ_4K);
+	clflush_cache_range(&dir[0], dir_size);
+}
+
 int intel_iommu_preserve_device(struct device *dev,
 				struct iommu_device_ser *device_ser)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct pasid_table *pasid_table;
+	int ret;
 
 	if (!dev_is_pci(dev)) {
 		dev_err(dev, "Cannot preserve non-PCI device\n");
@@ -155,9 +242,45 @@ int intel_iommu_preserve_device(struct device *dev,
 
 	device_ser->domain_iommu_ser.attachment_id =
 		domain_id_iommu(info->domain, info->iommu);
+
+	if (!sm_supported(info->iommu))
+		return 0;
+
+	pasid_table = intel_pasid_get_table(dev);
+	if (!pasid_table)
+		return -EINVAL;
+
+	ret = pasid_lu_handle_pd(pasid_table->table, PASID_LU_OP_PRESERVE);
+	if (ret)
+		return ret;
+
+	device_ser->intel.pasid_table = virt_to_phys(pasid_table->table);
+	device_ser->intel.max_pasid = pasid_table->max_pasid;
 	return 0;
 }
 
+void intel_iommu_unpreserve_device(struct device *dev,
+				   struct iommu_device_ser *device_ser)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct pasid_table *pasid_table;
+
+	if (!dev_is_pci(dev))
+		return;
+
+	if (!info)
+		return;
+
+	if (!sm_supported(info->iommu))
+		return;
+
+	pasid_table = intel_pasid_get_table(dev);
+	if (!pasid_table)
+		return;
+
+	pasid_lu_handle_pd(pasid_table->table, PASID_LU_OP_UNPRESERVE);
+}
+
 int intel_iommu_preserve(struct iommu_device *iommu_dev,
 			 struct iommu_hw_ser *ser)
 {
@@ -194,3 +317,21 @@ void intel_iommu_unpreserve(struct iommu_device *iommu_dev,
 	unpreserve_iommu_context_table(iommu, ROOT_ENTRY_NR);
 	iommu_unpreserve_page(iommu->root_entry);
 }
+
+void *intel_pasid_try_restore_table(struct device *dev, u64 max_pasid)
+{
+	struct iommu_device_ser *ser = dev_iommu_restored_state(dev);
+
+	if (!ser)
+		return NULL;
+
+	BUG_ON(pasid_lu_handle_pd(phys_to_virt(ser->intel.pasid_table),
+				  PASID_LU_OP_RESTORE));
+	if (WARN_ON_ONCE(ser->intel.max_pasid != max_pasid)) {
+		pasid_lu_handle_pd(phys_to_virt(ser->intel.pasid_table),
+				   PASID_LU_OP_FREE);
+		return NULL;
+	}
+
+	return phys_to_virt(ser->intel.pasid_table);
+}
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 89541b74ab8c..5cac8e95f73b 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -60,8 +60,11 @@ int intel_pasid_alloc_table(struct device *dev)
 
 	size = max_pasid >> (PASID_PDE_SHIFT - 3);
 	order = size ? get_order(size) : 0;
-	dir = iommu_alloc_pages_node_sz(info->iommu->node, GFP_KERNEL,
-					1 << (order + PAGE_SHIFT));
+
+	dir = intel_pasid_try_restore_table(dev, 1 << (order + PAGE_SHIFT + 3));
+	if (!dir)
+		dir = iommu_alloc_pages_node_sz(info->iommu->node, GFP_KERNEL,
+						1 << (order + PAGE_SHIFT));
 	if (!dir) {
 		kfree(pasid_table);
 		return -ENOMEM;
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index 48d3bb6b68de..44e673a4ad8f 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -301,6 +301,15 @@ static inline void pasid_set_eafe(struct pasid_entry *pe)
 
 extern unsigned int intel_pasid_max_id;
 int intel_pasid_alloc_table(struct device *dev);
+#ifdef CONFIG_IOMMU_LIVEUPDATE
+void *intel_pasid_try_restore_table(struct device *dev, u64 max_pasid);
+#else
+static inline void *intel_pasid_try_restore_table(struct device *dev,
+						  u64 max_pasid)
+{
+	return NULL;
+}
+#endif
 void intel_pasid_free_table(struct device *dev);
 struct pasid_table *intel_pasid_get_table(struct device *dev);
 int intel_pasid_setup_first_level(struct intel_iommu *iommu, struct device *dev,
diff --git a/include/linux/kho/abi/iommu.h b/include/linux/kho/abi/iommu.h
index 5ffedf0dbd5a..5eeb1e0c9bce 100644
--- a/include/linux/kho/abi/iommu.h
+++ b/include/linux/kho/abi/iommu.h
@@ -119,6 +119,16 @@ struct iommu_dev_map_ser {
 	u64 iommu_phys;
 } __packed;
 
+/**
+ * struct iommu_device_intel_ser - Intel specific state of serialized device
+ * @pasid_table: Physical address of pasid table
+ * @max_pasid: Maximum supported pasid
+ */
+struct iommu_device_intel_ser {
+	u64 pasid_table;
+	u64 max_pasid;
+} __packed;
+
 /**
  * struct iommu_device_ser - Serialized state of a device
  * @hdr: Common object header
@@ -131,6 +141,9 @@ struct iommu_device_ser {
 	u32 devid;
 	u32 pci_domain_nr;
 	struct iommu_dev_map_ser domain_iommu_ser;
+	union {
+		struct iommu_device_intel_ser intel;
+	};
 } __packed;
 
 /**
-- 
2.54.0.545.g6539524ca2-goog