public inbox for iommu@lists.linux-foundation.org
 help / color / mirror / Atom feed
From: Vipin Sharma <vipinsh@google.com>
To: Samiullah Khawaja <skhawaja@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
	 Lu Baolu <baolu.lu@linux.intel.com>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	 Jason Gunthorpe <jgg@ziepe.ca>,
	Robin Murphy <robin.murphy@arm.com>,
	 Kevin Tian <kevin.tian@intel.com>,
	Alex Williamson <alex@shazbot.org>,
	 Shuah Khan <shuah@kernel.org>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	 kvm@vger.kernel.org, Saeed Mahameed <saeedm@nvidia.com>,
	 Adithya Jayachandran <ajayachandra@nvidia.com>,
	Parav Pandit <parav@nvidia.com>,
	 Leon Romanovsky <leonro@nvidia.com>,
	William Tu <witu@nvidia.com>,
	 Pratyush Yadav <pratyush@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	 David Matlack <dmatlack@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Chris Li <chrisl@kernel.org>,
	Pranjal Shrivastava <praan@google.com>,
	 YiFei Zhu <zhuyifei@google.com>
Subject: Re: [PATCH 02/14] iommu: Implement IOMMU core liveupdate skeleton
Date: Tue, 17 Mar 2026 12:58:27 -0700	[thread overview]
Message-ID: <20260316221854.GC1768676.vipinsh@google.com> (raw)
In-Reply-To: <20260203220948.2176157-3-skhawaja@google.com>

On Tue, Feb 03, 2026 at 10:09:36PM +0000, Samiullah Khawaja wrote:
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 4926a43118e6..c0632cb5b570 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -389,6 +389,9 @@ static struct dev_iommu *dev_iommu_get(struct device *dev)
>  
>  	mutex_init(&param->lock);
>  	dev->iommu = param;
> +#ifdef CONFIG_IOMMU_LIVEUPDATE
> +	dev->iommu->device_ser = NULL;
> +#endif

This should already be NULL due to kzalloc() above.

>  	return param;
>  }
>  
> diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
> index 6189ba32ff2c..83eb609b3fd7 100644
> --- a/drivers/iommu/liveupdate.c
> +++ b/drivers/iommu/liveupdate.c
> @@ -11,6 +11,7 @@
>  #include <linux/liveupdate.h>
>  #include <linux/iommu-lu.h>
>  #include <linux/iommu.h>
> +#include <linux/pci.h>
>  #include <linux/errno.h>
>  
>  static void iommu_liveupdate_restore_objs(u64 next)
> @@ -175,3 +176,328 @@ int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler)
>  	return liveupdate_unregister_flb(handler, &iommu_flb);
>  }
>  EXPORT_SYMBOL(iommu_liveupdate_unregister_flb);
> +
> +int iommu_for_each_preserved_device(iommu_preserved_device_iter_fn fn,
> +				    void *arg)
> +{
> +	struct iommu_lu_flb_obj *obj;
> +	struct devices_ser *devices;
> +	int ret, i, idx;
> +
> +	ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
> +	if (ret)
> +		return -ENOENT;
> +
> +	devices = __va(obj->ser->devices_phys);
> +	for (i = 0, idx = 0; i < obj->ser->nr_devices; ++i, ++idx) {
> +		if (idx >= MAX_DEVICE_SERS) {
> +			devices = __va(devices->objs.next_objs);
> +			idx = 0;
> +		}
> +
> +		if (devices->devices[idx].obj.deleted)
> +			continue;
> +
> +		ret = fn(&devices->devices[idx], arg);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(iommu_for_each_preserved_device);

IMHO, with all the obj, ser, devices->devices[..], it is little harder
to follow the code. I will recommend to reconsider naming.

Also, should this function be introduced in the patch where it is
getting used? Other changes in this patch are already big and complex.
Same for iommu_get_device_preserved_data() and
iommu_get_preserved_data().

I think this patch can be split in three.
Patch 1: Preserve iommu_domain
Patch 2: Preserve pci device and iommu device
Patch 3: The helper functions I mentioned above.


> +
> +static inline bool device_ser_match(struct device_ser *match,
> +				    struct pci_dev *pdev)
> +{
> +	return match->devid == pci_dev_id(pdev) && match->pci_domain == pci_domain_nr(pdev->bus);

Nit: s/match/device_ser for readability or something similar.

> +}
> +
> +struct device_ser *iommu_get_device_preserved_data(struct device *dev)
> +{
> +	struct iommu_lu_flb_obj *obj;
> +	struct devices_ser *devices;
> +	int ret, i, idx;
> +
> +	if (!dev_is_pci(dev))
> +		return NULL;
> +
> +	ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
> +	if (ret)
> +		return NULL;
> +
> +	devices = __va(obj->ser->devices_phys);
> +	for (i = 0, idx = 0; i < obj->ser->nr_devices; ++i, ++idx) {
> +		if (idx >= MAX_DEVICE_SERS) {
> +			devices = __va(devices->objs.next_objs);
> +			idx = 0;
> +		}
> +
> +		if (devices->devices[idx].obj.deleted)
> +			continue;
> +
> +		if (device_ser_match(&devices->devices[idx], to_pci_dev(dev))) {
> +			devices->devices[idx].obj.incoming = true;
> +			return &devices->devices[idx];
> +		}
> +	}
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL(iommu_get_device_preserved_data);
> +
> +struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type)
> +{
> +	struct iommu_lu_flb_obj *obj;
> +	struct iommus_ser *iommus;
> +	int ret, i, idx;
> +
> +	ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
> +	if (ret)
> +		return NULL;
> +
> +	iommus = __va(obj->ser->iommus_phys);
> +	for (i = 0, idx = 0; i < obj->ser->nr_iommus; ++i, ++idx) {
> +		if (idx >= MAX_IOMMU_SERS) {
> +			iommus = __va(iommus->objs.next_objs);
> +			idx = 0;
> +		}
> +
> +		if (iommus->iommus[idx].obj.deleted)
> +			continue;
> +
> +		if (iommus->iommus[idx].token == token &&
> +		    iommus->iommus[idx].type == type)
> +			return &iommus->iommus[idx];
> +	}
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL(iommu_get_preserved_data);
> +
> +static int reserve_obj_ser(struct iommu_objs_ser **objs_ptr, u64 max_objs)
> +{
> +	struct iommu_objs_ser *next_objs, *objs = *objs_ptr;
> +	int idx;
> +
> +	if (objs->nr_objs == max_objs) {
> +		next_objs = kho_alloc_preserve(PAGE_SIZE);
> +		if (IS_ERR(next_objs))
> +			return PTR_ERR(next_objs);
> +
> +		objs->next_objs = virt_to_phys(next_objs);
> +		objs = next_objs;
> +		*objs_ptr = objs;
> +		objs->nr_objs = 0;
> +		objs->next_objs = 0;
> +	}
> +
> +	idx = objs->nr_objs++;
> +	return idx;

I think we should rename these variables, it is difficult to comprehend
the code.

> +}
> +
> +int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser)
> +{
> +	struct iommu_domain_ser *domain_ser;
> +	struct iommu_lu_flb_obj *flb_obj;
> +	int idx, ret;
> +
> +	if (!domain->ops->preserve)
> +		return -EOPNOTSUPP;
> +
> +	ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
> +	if (ret)
> +		return ret;
> +
> +	guard(mutex)(&flb_obj->lock);
> +	idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->iommu_domains,
> +			      MAX_IOMMU_DOMAIN_SERS);
> +	if (idx < 0)
> +		return idx;
> +
> +	domain_ser = &flb_obj->iommu_domains->iommu_domains[idx];
> +	idx = flb_obj->ser->nr_domains++;

In for loops above, nr_domains are compared with 'i', and idx is for
index inside a page. Lets not reuse idx here for nr_domains, keep it
consistent for easier understanding. May be use something more
descriptive than 'i' like global_idx and local_idx?

> +	domain_ser->obj.idx = idx;
> +	domain_ser->obj.ref_count = 1;
> +
> +	ret = domain->ops->preserve(domain, domain_ser);
> +	if (ret) {
> +		domain_ser->obj.deleted = true;
> +		return ret;
> +	}
> +
> +	domain->preserved_state = domain_ser;
> +	*ser = domain_ser;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(iommu_domain_preserve);
> +
> +static int iommu_preserve_locked(struct iommu_device *iommu)
> +{
> +	struct iommu_lu_flb_obj *flb_obj;
> +	struct iommu_ser *iommu_ser;
> +	int idx, ret;

Should there be lockdep asserts in these locked version APIs?

> +
> +}
> +
> +static void iommu_unpreserve_locked(struct iommu_device *iommu)
> +{
> +	struct iommu_ser *iommu_ser = iommu->outgoing_preserved_state;
> +
> +	iommu_ser->obj.ref_count--;

Should there be a null check?

> +	if (iommu_ser->obj.ref_count)
> +		return;
> +
> +	iommu->outgoing_preserved_state = NULL;
> +	iommu->ops->unpreserve(iommu, iommu_ser);
> +	iommu_ser->obj.deleted = true;
> +}
> +
> +int iommu_preserve_device(struct iommu_domain *domain,
> +			  struct device *dev, u64 token)
> +{
> +	struct iommu_lu_flb_obj *flb_obj;
> +	struct device_ser *device_ser;
> +	struct dev_iommu *iommu;
> +	struct pci_dev *pdev;
> +	int ret, idx;
> +
> +	if (!dev_is_pci(dev))
> +		return -EOPNOTSUPP;
> +
> +	if (!domain->preserved_state)
> +		return -EINVAL;
> +
> +	pdev = to_pci_dev(dev);
> +	iommu = dev->iommu;
> +	if (!iommu->iommu_dev->ops->preserve_device ||
> +	    !iommu->iommu_dev->ops->preserve)

iommu_preserve_locked() is already checking for preserve(), we can just
check preserve_device here.

> +		return -EOPNOTSUPP;
> +
> +	ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
> +	if (ret)
> +		return ret;
> +
> +	guard(mutex)(&flb_obj->lock);
> +	idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->devices,
> +			      MAX_DEVICE_SERS);
> +	if (idx < 0)
> +		return idx;
> +
> +	device_ser = &flb_obj->devices->devices[idx];
> +	idx = flb_obj->ser->nr_devices++;
> +	device_ser->obj.idx = idx;
> +	device_ser->obj.ref_count = 1;
> +
> +	ret = iommu_preserve_locked(iommu->iommu_dev);
> +	if (ret) {
> +		device_ser->obj.deleted = true;
> +		return ret;
> +	}
> +
> +	device_ser->domain_iommu_ser.domain_phys = __pa(domain->preserved_state);
> +	device_ser->domain_iommu_ser.iommu_phys = __pa(iommu->iommu_dev->outgoing_preserved_state);
> +	device_ser->devid = pci_dev_id(pdev);
> +	device_ser->pci_domain = pci_domain_nr(pdev->bus);
> +	device_ser->token = token;
> +
> +	ret = iommu->iommu_dev->ops->preserve_device(dev, device_ser);
> +	if (ret) {
> +		device_ser->obj.deleted = true;
> +		iommu_unpreserve_locked(iommu->iommu_dev);
> +		return ret;
> +	}
> +
> +	dev->iommu->device_ser = device_ser;
> +	return 0;
> +}
> +
> +void iommu_unpreserve_device(struct iommu_domain *domain, struct device *dev)
> +{
> +	struct iommu_lu_flb_obj *flb_obj;
> +	struct device_ser *device_ser;
> +	struct dev_iommu *iommu;
> +	struct pci_dev *pdev;
> +	int ret;
> +
> +	if (!dev_is_pci(dev))
> +		return;
> +
> +	pdev = to_pci_dev(dev);
> +	iommu = dev->iommu;
> +	if (!iommu->iommu_dev->ops->unpreserve_device ||
> +	    !iommu->iommu_dev->ops->unpreserve)
> +		return;
> +
> +	ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
> +	if (WARN_ON(ret))

Why WARN_ON here and not other places? Do we need it?

> +		return;
> +
> +	guard(mutex)(&flb_obj->lock);
> +	device_ser = dev_iommu_preserved_state(dev);
> +	if (WARN_ON(!device_ser))

Can't we just silently ignore this?

> +		return;
> +
> +	iommu->iommu_dev->ops->unpreserve_device(dev, device_ser);
> +	dev->iommu->device_ser = NULL;
> +
> +	iommu_unpreserve_locked(iommu->iommu_dev);
> +}

  parent reply	other threads:[~2026-03-17 19:58 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03 22:09 [PATCH 00/14] iommu: Add live update state preservation Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 01/14] iommu: Implement IOMMU LU FLB callbacks Samiullah Khawaja
2026-03-11 21:07   ` Pranjal Shrivastava
2026-03-12 16:43     ` Samiullah Khawaja
2026-03-12 23:43       ` Pranjal Shrivastava
2026-03-13 16:47         ` Samiullah Khawaja
2026-03-13 15:36       ` Pranjal Shrivastava
2026-03-13 16:58         ` Samiullah Khawaja
2026-04-10 13:51     ` Jason Gunthorpe
2026-03-16 22:54   ` Vipin Sharma
2026-03-17  1:06     ` Samiullah Khawaja
2026-03-23 23:27       ` Vipin Sharma
2026-02-03 22:09 ` [PATCH 02/14] iommu: Implement IOMMU core liveupdate skeleton Samiullah Khawaja
2026-03-12 23:10   ` Pranjal Shrivastava
2026-03-13 18:42     ` Samiullah Khawaja
2026-03-17 20:09       ` Pranjal Shrivastava
2026-03-17 20:13         ` Samiullah Khawaja
2026-03-17 20:23           ` Pranjal Shrivastava
2026-03-17 21:03             ` Vipin Sharma
2026-03-18 18:51               ` Pranjal Shrivastava
2026-03-18 17:49             ` Samiullah Khawaja
2026-03-17 19:58   ` Vipin Sharma [this message]
2026-03-17 20:33     ` Samiullah Khawaja
2026-03-24 19:06       ` Vipin Sharma
2026-03-24 19:45         ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 03/14] liveupdate: luo_file: Add internal APIs for file preservation Samiullah Khawaja
2026-03-18 10:00   ` Pranjal Shrivastava
2026-03-18 16:54     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 04/14] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages Samiullah Khawaja
2026-03-03 16:42   ` Ankit Soni
2026-03-03 18:41     ` Samiullah Khawaja
2026-03-20 17:27       ` Pranjal Shrivastava
2026-03-20 18:12         ` Samiullah Khawaja
2026-04-10 14:13           ` Jason Gunthorpe
2026-04-10 22:13             ` Samiullah Khawaja
2026-03-17 20:59   ` Vipin Sharma
2026-03-20  9:28     ` Pranjal Shrivastava
2026-03-20 18:27       ` Samiullah Khawaja
2026-03-20 11:01     ` Pranjal Shrivastava
2026-03-20 18:56       ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 05/14] iommupt: Implement preserve/unpreserve/restore callbacks Samiullah Khawaja
2026-03-20 21:57   ` Pranjal Shrivastava
2026-03-23 16:41     ` Samiullah Khawaja
2026-04-10 14:16     ` Jason Gunthorpe
2026-04-10 23:02       ` Samiullah Khawaja
2026-04-10 23:16         ` Jason Gunthorpe
2026-02-03 22:09 ` [PATCH 06/14] iommu/vt-d: Implement device and iommu preserve/unpreserve ops Samiullah Khawaja
2026-03-19 16:04   ` Vipin Sharma
2026-03-19 16:27     ` Samiullah Khawaja
2026-03-20 23:01   ` Pranjal Shrivastava
2026-03-21 13:27     ` Pranjal Shrivastava
2026-03-23 18:32     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 07/14] iommu/vt-d: Restore IOMMU state and reclaimed domain ids Samiullah Khawaja
2026-03-19 20:54   ` Vipin Sharma
2026-03-20  1:05     ` Samiullah Khawaja
2026-03-22 19:51   ` Pranjal Shrivastava
2026-03-23 19:33     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 08/14] iommu: Restore and reattach preserved domains to devices Samiullah Khawaja
2026-03-10  5:16   ` Ankit Soni
2026-03-10 21:47     ` Samiullah Khawaja
2026-03-22 21:59   ` Pranjal Shrivastava
2026-03-23 18:02     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 09/14] iommu/vt-d: preserve PASID table of preserved device Samiullah Khawaja
2026-03-23 18:19   ` Pranjal Shrivastava
2026-03-23 18:51     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 10/14] iommufd-lu: Implement ioctl to let userspace mark an HWPT to be preserved Samiullah Khawaja
2026-03-19 23:35   ` Vipin Sharma
2026-03-20  0:40     ` Samiullah Khawaja
2026-03-20 23:34       ` Vipin Sharma
2026-03-23 16:24         ` Samiullah Khawaja
2026-03-25 14:37   ` Pranjal Shrivastava
2026-03-25 17:31     ` Samiullah Khawaja
2026-03-25 18:55       ` Pranjal Shrivastava
2026-03-25 20:19         ` Samiullah Khawaja
2026-03-25 20:36           ` Pranjal Shrivastava
2026-03-25 20:46             ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 11/14] iommufd-lu: Persist iommu hardware pagetables for live update Samiullah Khawaja
2026-02-25 23:47   ` Samiullah Khawaja
2026-03-03  5:56   ` Ankit Soni
2026-03-03 18:51     ` Samiullah Khawaja
2026-03-23 20:28   ` Vipin Sharma
2026-03-23 21:34     ` Samiullah Khawaja
2026-03-25 20:08   ` Pranjal Shrivastava
2026-03-25 20:32     ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 12/14] iommufd: Add APIs to preserve/unpreserve a vfio cdev Samiullah Khawaja
2026-03-23 20:59   ` Vipin Sharma
2026-03-23 21:38     ` Samiullah Khawaja
2026-03-25 20:24   ` Pranjal Shrivastava
2026-03-25 20:41     ` Samiullah Khawaja
2026-03-25 21:23       ` Pranjal Shrivastava
2026-03-26  0:16         ` Samiullah Khawaja
2026-02-03 22:09 ` [PATCH 13/14] vfio/pci: Preserve the iommufd state of the " Samiullah Khawaja
2026-02-17  4:18   ` Ankit Soni
2026-03-03 18:35     ` Samiullah Khawaja
2026-03-23 21:17   ` Vipin Sharma
2026-03-23 22:07     ` Samiullah Khawaja
2026-03-24 20:30       ` Vipin Sharma
2026-03-25 20:55   ` Pranjal Shrivastava
2026-02-03 22:09 ` [PATCH 14/14] iommufd/selftest: Add test to verify iommufd preservation Samiullah Khawaja
2026-03-23 22:18   ` Vipin Sharma
2026-03-27 18:32     ` Samiullah Khawaja
2026-03-25 21:05   ` Pranjal Shrivastava
2026-03-27 18:25     ` Samiullah Khawaja
2026-03-27 18:40       ` Samiullah Khawaja

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260316221854.GC1768676.vipinsh@google.com \
    --to=vipinsh@google.com \
    --cc=ajayachandra@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex@shazbot.org \
    --cc=baolu.lu@linux.intel.com \
    --cc=chrisl@kernel.org \
    --cc=dmatlack@google.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=parav@nvidia.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=praan@google.com \
    --cc=pratyush@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=saeedm@nvidia.com \
    --cc=shuah@kernel.org \
    --cc=skhawaja@google.com \
    --cc=will@kernel.org \
    --cc=witu@nvidia.com \
    --cc=zhuyifei@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox