From: Jason Gunthorpe <jgg@nvidia.com>
To: Michael Shavit <mshavit@google.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	linux-arm-kernel@lists.infradead.org,
	Robin Murphy <robin.murphy@arm.com>,
	Will Deacon <will@kernel.org>, Nicolin Chen <nicolinc@nvidia.com>,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
Date: Thu, 16 Nov 2023 12:28:09 -0400
Message-ID: <ZVZDGRd6m/FoLmNi@nvidia.com>
In-Reply-To: <CAKHBV24X4-Mf3R8OqwBdAJf9P7UD9fep2fOmJVrcx_rkfABxBA@mail.gmail.com>

On Wed, Nov 15, 2023 at 11:15:23PM +0800, Michael Shavit wrote:
> On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > This was needed because the STE code required the STE to be in
> > ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code
> > can automatically handle all transitions we can remove this step
> > from the attach_dev flow.
> >
> > A few small bugs exist because of this:
> >
> > 1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
> >    then there will be a moment where the STE points at BYPASS. Since
> >    this can be done by VFIO/IOMMUFD it is a small security race.
> >
> > 2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
> >    regions will temporarily become BLOCKED. We'd like drivers to
> >    work in a way that allows IOMMU_RESV_DIRECT to be continuously
> >    functional during these transitions.
> >
> > Make arm_smmu_release_device() put the STE back to the correct
> > ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT was ignored on
> > this path.
> >
> > Notice this subtly depends on the prior arm_smmu_asid_lock change as the
> > STE must be put to non-paging before removing the device from the linked
> > list to avoid races with arm_smmu_share_asid().
> 
> I'm a little confused by this comment. Is this suggesting that
> arm_smmu_detach_dev had a race condition before the arm_smmu_asid_lock
> changes, since it deletes the list entry before deactivating the STE
> that uses the domain and without grabbing the asid_lock, thus allowing
> a gap where the ASID might be re-acquired by an SVA domain while an
> STE with that ASID is still live on this device? Wouldn't that belong
> on the asid_lock patch instead if so?

I wasn't intending to say there is an existing bug, this was more to
point out why it was organized like this, and why it is OK to remove
the detach manipulation of the STE considering races with share_asid.

However, I agree that the code in rc1 is troubled, and that it is fixed
in the prior patch:

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
	list_del(&master->domain_head);
	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

^^^^ Prevents arm_smmu_update_ctx_desc_devices() from storing to the STE
     However the STE is still pointing at the ASID

	master->domain = NULL;
	master->ats_enabled = false;
	arm_smmu_install_ste_for_dev(master);

^^^^ Now the STE is gone, so the CD becomes unreferenced

	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);

^^^^ Now the CD is non-valid

I was primarily concerned with corrupting the CD, i.e. that share_asid
would race and un-clear the write_ctx_desc(). That is prevented by the
ordering above.

However, I agree the above is still problematic because there is a
short time window where the ASID can be installed in two CDs with two
different translations. I suppose there is a security issue where this
could corrupt the IOTLB.
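
As a rough illustration of that window, here is a minimal single-threaded
model of the three rc1 detach steps quoted above. The struct, enum, and
helper names are hypothetical and do not exist in the driver; the model
only tracks whether the master is still on the devices list, whether the
STE still references the CD, and whether the CD is valid.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one master during the rc1 detach path. */
struct detach_state {
	bool on_devices_list;  /* master still reachable via the domain's list */
	bool ste_points_at_cd; /* STE still references the CD and its ASID */
	bool cd_valid;         /* CD entry has not been cleared yet */
};

/* The three steps of the quoted code, in the order rc1 performs them. */
enum detach_step { STEP_LIST_DEL, STEP_INSTALL_STE, STEP_CLEAR_CD };

static void detach_apply(struct detach_state *s, enum detach_step step)
{
	switch (step) {
	case STEP_LIST_DEL:
		s->on_devices_list = false;
		break;
	case STEP_INSTALL_STE:
		s->ste_points_at_cd = false;
		break;
	case STEP_CLEAR_CD:
		s->cd_valid = false;
		break;
	}
}

/*
 * True while the ASID is still live through this master's STE but the
 * master can no longer be found by walking the devices list.
 */
static bool asid_window_open(const struct detach_state *s)
{
	return !s->on_devices_list && s->ste_points_at_cd && s->cd_valid;
}
```

In this model the window opens right after STEP_LIST_DEL and closes at
STEP_INSTALL_STE, which matches the gap between the first two annotated
blocks above.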

This is all fixed in this series too by having more robust locking. So
this does deserve a note in the commit message for the earlier patch
about this issue.

> > @@ -2852,9 +2846,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
> >  static void arm_smmu_release_device(struct device *dev)
> >  {
> >         struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > +       struct arm_smmu_ste target;
> >
> >         if (WARN_ON(arm_smmu_master_sva_enabled(master)))
> >                 iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
> > +
> > +       /* Put the STE back to what arm_smmu_init_strtab() sets */
> 
> Hmmmm, it seems like checking iommu->require_direct may put STEs in
> bypass in scenarios where arm_smmu_init_strtab() wouldn't have.
> arm_smmu_init_strtab is calling iort_get_rmr_sids to pick streams to
> put into bypass, but IIUC iommu->require_direct also applies to
> dts-based reserved-memory regions, not just iort.

Indeed, that actually looks like a little bug as the DT should
technically be the same behavior as the iort. I'm going to ignore it
:)

> I'm not very familiar with the history behind disable_bypass; why is
> putting an entire stream into bypass the correct behavior if a
> reserved-memory (which may be for a small finite region) exists?

This specific reserved memory region is requesting a 1:1 translation
for a chunk of IOVA. This translation is being used by some agent
outside Linux's knowledge and the desire is for the translation to
always be in effect.

So, if we put the STE to ABORT then the translation will stop working
with unknown side effects.

This is also why we install the translation in the DMA domain and
block use of VFIO if these are set - to ensure the 1:1 translation is
always there.
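
A minimal sketch of the resulting choice at release time, assuming the
decision depends only on the disable_bypass option and on
iommu->require_direct as discussed above. The enum and the helper name
are hypothetical stand-ins, not the driver's actual code:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two STE configurations discussed here. */
enum ste_cfg { STE_ABORT, STE_BYPASS };

/*
 * Pick the STE to leave behind when the device is released.
 *
 * disable_bypass: models the driver option of the same name.
 * require_direct: models iommu->require_direct, i.e. the device has a
 *                 reserved region that needs its 1:1 translation to stay
 *                 in effect.
 *
 * ABORT is chosen only when bypass is disabled and no such region exists;
 * otherwise the stream stays in BYPASS so the 1:1 mapping keeps working.
 */
static enum ste_cfg release_ste_cfg(bool disable_bypass, bool require_direct)
{
	if (disable_bypass && !require_direct)
		return STE_ABORT;
	return STE_BYPASS;
}
```

For example, release_ste_cfg(true, true) yields STE_BYPASS: the reserved
region wins over disable_bypass, which is the behavior questioned above.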

Thanks,
Jason

