devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Cc: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
	robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	heiko@sntech.de, nicolas.dufresne@collabora.com,
	iommu@lists.linux.dev, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-rockchip@lists.infradead.org, kernel@collabora.com
Subject: Re: [PATCH v3 3/5] iommu: Add verisilicon IOMMU driver
Date: Fri, 20 Jun 2025 09:05:09 -0300	[thread overview]
Message-ID: <20250620120509.GA39770@ziepe.ca> (raw)
In-Reply-To: <e034a111-93eb-42e8-a533-46553b4f5588@collabora.com>

On Fri, Jun 20, 2025 at 10:57:49AM +0200, Benjamin Gaignard wrote:
> 
> Le 19/06/2025 à 18:59, Jason Gunthorpe a écrit :
> > On Thu, Jun 19, 2025 at 06:27:52PM +0200, Benjamin Gaignard wrote:
> > > Le 19/06/2025 à 15:47, Jason Gunthorpe a écrit :
> > > > Ugh. This ignores the gfp flags that are passed into map because you
> > > > have to force atomic due to the spinlock that shouldn't be there :(
> > > > This means it does not set GFP_KERNEL_ACCOUNT when required. It would
> > > > be better to continue to use the passed in GFP flags but override them
> > > > to atomic mode.
> > > I will add a gfp_t parameter and use it like that:
> > > page_table = iommu_alloc_pages_sz(gfp | GFP_ATOMIC | GFP_DMA32, SPAGE_SIZE);
> > This will or together GFP_ATOMIC and GFP_KERNEL, I don't think that
> > works..
> 
> I have test it, I don't see conflicts or errors. What worries you ?

Just looking at the bitmaps I'm not sure? Did you run with lockdep?

> > > > > +static int vsi_iommu_attach_device(struct iommu_domain *domain,
> > > > > +				   struct device *dev)
> > > > > +{
> > > > > +	struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
> > > > > +	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
> > > > > +	unsigned long flags;
> > > > > +	int ret;
> > > > > +
> > > > > +	if (WARN_ON(!iommu))
> > > > > +		return -ENODEV;
> > > > > +
> > > > > +	/* iommu already attached */
> > > > > +	if (iommu->domain == domain)
> > > > > +		return 0;
> > > > > +
> > > > > +	ret = vsi_iommu_identity_attach(&vsi_identity_domain, dev);
> > > > > +	if (ret)
> > > > > +		return ret;
> > > > Hurm, this is actually quite bad, now that it is clear the HW is in an
> > > > identity mode it is actually a security problem for VFIO to switch the
> > > > translation to identity during attach_device. I'd really prefer new
> > > > drivers don't make this mistake.
> > > > 
> > > > It seems the main thing motivating this is the fact a linked list has
> > > > only a single iommu->node so you can't attach the iommu to both the
> > > > new/old domain and atomically update the page table base.
> > > > 
> > > > Is it possible for the HW to do a blocking behavior? That would be an
> > > > easy fix.. You should always be able to force this by allocating a
> > > > shared top page table level during probe time and making it entirely
> > > > empty while staying always in the paging mode. Maybe there is a less
> > > > expensive way.
> > > > 
> > > > Otherwise you probably have work more like the other drivers and
> > > > allocate a struct for each attachment so you can have the iommu
> > > > attached two domains during the switch over and never drop to an
> > > > identity mode.
> > > I will remove the switch to identity domain and it will works fine.
> > You'll loose invalidations..
> > 
> > Maybe the easiest thing is to squish vsi_iommu_enable() and reorganize
> > it so that the spinlock is held across the register programming and
> > then you can atomically under the lock change the registers and change
> > the linked list. The register write cannot fail so this is fine.
> > 
> > This should probably also flush the iotlb inside the lock.
> 
> I will try to summarize:
> in vsi_iommu_attach_device() I should:
> - take the lock
> - do nothing if the domain is the same.
> - if iommu is used (ie powered up):
>   - update the registers to enable the iommu
>   - flush
>   - update the link list
> - update iommu->domain
> - release the lock

That sounds believable, yes.. Though can you do the "powered up" checks
inside the spinlock - are they sleeping? Can they be done before the
spinlock?

> in vsi_iommu_identity_attach() I should:
> - take the lock
> - do nothing if the domain is the same.
> - if iommu is used (ie powered up):
>   - disable the iommu
>   - remove the node from the list
> - update iommu->domain
> - release the lock

And maybe flush too? How does the caching hw work at this point? You
can't have stale entries in the cache when you switch back to an
enabled/translating configuration. So either the HW flushes implicitly
or you need to add a flush somewhere..
 
> vsi_iommu_suspend() and vsi_iommu_resume() will also have to take the lock
> before calling vsi_iommu_disable() and vsi_iommu_enable().

Yes, if they use iommu->domain that seems good

If the above locking is a problem then I'd use the group mutex instead
during resume/suspend. The attach functions are already called with
the group mutex held.

> Do I have to switch to identity domain in vsi_iommu_attach_device()
> before applying the requested domain ?

No, that is what creates the security problem. What you want is to
update the HW registers in a way that the HW just changes hitlessly
from one translation to another, then flush the cache.

Jason

  reply	other threads:[~2025-06-20 12:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-19 13:12 [PATCH v3 0/5] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
2025-06-19 13:12 ` [PATCH v3 1/5] dt-bindings: vendor-prefixes: Add Verisilicon Benjamin Gaignard
2025-06-19 13:12 ` [PATCH v3 2/5] dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU Benjamin Gaignard
2025-06-19 14:19   ` Sebastian Reichel
2025-06-19 15:02     ` Conor Dooley
2025-06-20  9:54     ` Benjamin Gaignard
2025-06-19 13:12 ` [PATCH v3 3/5] iommu: Add verisilicon IOMMU driver Benjamin Gaignard
2025-06-19 13:47   ` Jason Gunthorpe
2025-06-19 16:27     ` Benjamin Gaignard
2025-06-19 16:59       ` Jason Gunthorpe
2025-06-20  8:57         ` Benjamin Gaignard
2025-06-20 12:05           ` Jason Gunthorpe [this message]
2025-06-20 13:52             ` Benjamin Gaignard
2025-06-20 14:42               ` Benjamin Gaignard
2025-06-20 16:35                 ` Jason Gunthorpe
2025-06-20 16:36               ` Jason Gunthorpe
2025-06-20 19:37   ` Robin Murphy
2025-06-20 20:45     ` Lucas Stach
2025-06-23 12:05       ` Robin Murphy
2025-06-23 14:03     ` Benjamin Gaignard
2025-06-19 13:12 ` [PATCH v3 4/5] arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588 Benjamin Gaignard
2025-06-19 13:12 ` [PATCH v3 5/5] arm64: defconfig: enable Verisilicon IOMMU Benjamin Gaignard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250620120509.GA39770@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=benjamin.gaignard@collabora.com \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=heiko@sntech.de \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=kernel@collabora.com \
    --cc=krzk+dt@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rockchip@lists.infradead.org \
    --cc=nicolas.dufresne@collabora.com \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).