public inbox for linux-kernel@vger.kernel.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: kevin.tian@intel.com, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	yi.l.liu@intel.com
Subject: Re: [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle
Date: Wed, 23 Oct 2024 13:59:05 -0300	[thread overview]
Message-ID: <20241023165905.GI864191@nvidia.com> (raw)
In-Reply-To: <ZxikJwzq8rLPgtmS@Asurada-Nvidia>

On Wed, Oct 23, 2024 at 12:22:15AM -0700, Nicolin Chen wrote:
> On Thu, Sep 05, 2024 at 03:01:19PM -0300, Jason Gunthorpe wrote:
> > On Tue, Aug 27, 2024 at 10:02:06AM -0700, Nicolin Chen wrote:
> > > The iommufd core provides a lookup helper for an IOMMU driver to find
> > > a device pointer by the device's per-viommu virtual ID. Yet a driver
> > > may need an inverted lookup to find a device's per-viommu virtual ID
> > > by a device pointer, e.g. when reporting virtual IRQs/events back to
> > > user space. In this case, it'd be unsafe for the iommufd core to do
> > > such an inverted lookup, as the driver can't track the lifecycle of a
> > > viommu object or a vdev_id object.
> > > 
> > > Meanwhile, some HW can even support virtual device ID lookup via a
> > > HW-accelerated virtualization feature. E.g. the Tegra241 CMDQV HW can
> > > execute vanilla guest-issued SMMU commands containing a virtual Stream
> > > ID, but requires software to configure a link between the virtual
> > > Stream ID and the physical Stream ID via HW registers. So not only
> > > does the iommufd core need a vdev_id lookup table; drivers will want
> > > one too.
> > > 
> > > Given the two justifications above, the best practice is to provide
> > > a pair of set_vdev_id/unset_vdev_id ops in the viommu ops, so a
> > > driver can implement them to control a vdev_id's lifecycle, and
> > > configure the HW properly if required.
> > 
> > I think the lifecycle rules should be much simpler.
> > 
> > If a nested domain is attached to a STE/RID/device, then the vIOMMU
> > affiliated with that nested domain is pinned while the STE is in place.
> > 
> > So the driver only needs to provide locking around attach changing the
> > STE's vIOMMU vs async operations translating from a STE to a
> > vIOMMU. This can be a simple driver lock of some kind, i.e. an rwlock
> > across the STE table.
> > 
> > Generally that is how all the async events should work: go from the
> > STE to the vIOMMU to an iommufd callback to the iommufd event
> > queue. iommufd will translate the struct device from the driver to an
> > idev_id (or maybe even a vdev_id) the same basic way the PRI code works.
> 
> I am trying to draft something following this, and here is what
> it would look like:
> 
> ------------------------draft---------------------------
> struct arm_smmu_master {
> 	....
> +	struct rw_semaphore lock;

It would be called vsmmu_lock

> +	struct arm_vsmmu *vsmmu;
> 	....
> };
> 
> ->attach_dev() {
> 	down_write(&master->lock);
> 	if (domain->type == IOMMU_DOMAIN_NESTED)
> 		master->vsmmu = to_smmu_domain(domain)->vsmmu;
> 	else
> 		master->vsmmu = NULL;
> 	up_write(&master->lock);
> }
> 
> isr() {
> 	down_read(&master->lock);
> 	if (master->vsmmu) {
> 		xa_lock(&master->vsmmu->core.vdevs);
> 		vdev = iommufd_viommu_dev_to_vdev(&master->vsmmu->core,
> 						  master->dev);
> 		if (vdev) {
> 			struct iommu_virq_arm_smmuv3 virq_data = evt;
> 
> 			virq_data.evt[0] &= ~EVTQ_0_SID;
> 			virq_data.evt[0] |= FIELD_PREP(EVTQ_0_SID, vdev->id);
> 			/* Don't return here: that would leak both locks */
> 			rc = iommufd_viommu_report_irq(
> 					vdev->viommu,
> 					IOMMU_VIRQ_TYPE_ARM_SMMUV3, &virq_data,
> 					sizeof(virq_data));
> 		} else {
> 			rc = -ENOENT;
> 		}
> 		xa_unlock(&master->vsmmu->core.vdevs);
> 	}
> 	up_read(&master->lock);
> }
> --------------------------------------------------------

This looks reasonable

> [Comparison]      | This v1                 | Draft
> 1. Adds to master | A lock and vdev ptr     | A lock and viommu ptr
> 2. Set/unset ptr  | In ->vdevice_alloc/free | In all ->attach_dev
> 3. Do dev_to_vdev | master->vdev->id        | attach_handle?

The set/unset ops have the major issue that they can get out of sync
with the domain. The only time things should be routed to the viommu
is when a viommu-related domain is attached.

The lock on attach can be reduced:

  iommu_group_mutex_assert(dev);
  if (domain->type == IOMMU_DOMAIN_NESTED)
  	new_vsmmu = to_smmu_domain(domain)->vsmmu;
  else
  	new_vsmmu = NULL;
  if (new_vsmmu != master->vsmmu) {
  	down_write(&master->lock);
  	master->vsmmu = new_vsmmu;
  	up_write(&master->lock);
  }

And you'd stick this in prepare or commit..

> Both solutions need a driver-level lock and an extra pointer in
> the master structure. And both ISR routines require that driver-
> level lock to avoid races against attach_dev vs. vdev alloc/free.
> Overall, taking step 3 into consideration, it seems that letting
> the master lock & hold the vdev pointer (i.e. this v1) is simpler?

I'm not sure the vdev pointer should even be visible to the drivers..

> As for the implementation of iommufd_viommu_dev_to_vdev(), I read
> the attach_handle part in the PRI code, yet I don't see the lock
> that protects the handle returned by iommu_attach_handle_get() in
> iommu_report_device_fault/find_fault_handler().

It is the xa_lock and some rules about flushing irqs and work queues
before allowing the dev to disappear:

>   "Callers are required to synchronize the call of
>    iommu_attach_handle_get() with domain attachment
>    and detachment. The attach handle can only be used
>    during its life cycle."

> But the caller iommu_report_device_fault() is an async event that
> cannot guarantee the lifecycle. Would you please shed some light?

The iopf detach function will act as a barrier to ensure that all
the async work has completed, sort of like how RCU works.

But here, I think it is pretty simple, isn't it?

When you update the master->vsmmu you can query the vsmmu to get the
vdev id of that master, then store it in the master struct and forward
it to iommufd_viommu_report_irq(). That could even search the
xarray since attach is not a performance path.

Then it is locked under the master->lock

Jason

Thread overview: 17+ messages
2024-08-27 17:02 [PATCH v1 00/10] iommufd: Add VIOMMU infrastructure (Part-2 VIRQ) Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 01/10] iommufd: Rename IOMMUFD_OBJ_FAULT to IOMMUFD_OBJ_EVENT_IOPF Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 02/10] iommufd: Rename fault.c to event.c Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 03/10] iommufd: Add IOMMUFD_OBJ_EVENT_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle Nicolin Chen
2024-09-05 18:01   ` Jason Gunthorpe
2024-10-08 17:39     ` Nicolin Chen
2024-10-23  7:22     ` Nicolin Chen
2024-10-23 16:59       ` Jason Gunthorpe [this message]
2024-10-23 18:54         ` Nicolin Chen
2024-10-28 12:58           ` Jason Gunthorpe
2024-08-27 17:02 ` [PATCH v1 05/10] iommufd/viommu: Add iommufd_vdev_id_to_dev helper Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 06/10] iommufd/viommu: Add iommufd_viommu_report_irq helper Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 07/10] iommufd/selftest: Implement mock_viommu_set/unset_vdev_id Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 08/10] iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for VIRQ coverage Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 09/10] iommufd/selftest: Add EVENT_VIRQ test coverage Nicolin Chen
2024-08-27 17:02 ` [PATCH v1 10/10] iommu/arm-smmu-v3: Report virtual IRQ for device in user space Nicolin Chen
