public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Dmytro Maluka <dmaluka@chromium.org>,
	Samiullah Khawaja <skhawaja@google.com>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/8] iommu/vt-d: Add entry_sync support for PASID entry updates
Date: Mon, 9 Mar 2026 10:41:16 -0300	[thread overview]
Message-ID: <20260309134116.GE3717316@nvidia.com> (raw)
In-Reply-To: <20260309060648.276762-3-baolu.lu@linux.intel.com>

On Mon, Mar 09, 2026 at 02:06:42PM +0800, Lu Baolu wrote:
> +static void intel_pasid_get_used(const u128 *entry, u128 *used)
> +{
> +	struct pasid_entry *pe = (struct pasid_entry *)entry;
> +	struct pasid_entry *ue = (struct pasid_entry *)used;
> +	u16 pgtt;
> +
> +	/* Initialize used bits to 0. */
> +	memset(ue, 0, sizeof(*ue));
> +
> +	/* Present bit always matters. */
> +	ue->val[0] |= PASID_PTE_PRESENT;
> +
> +	/* Nothing more for non-present entries. */
> +	if (!(pe->val[0] & PASID_PTE_PRESENT))
> +		return;
> +
> +	pgtt = pasid_pte_get_pgtt(pe);
> +	switch (pgtt) {
> +	case PASID_ENTRY_PGTT_FL_ONLY:
> +		/* AW, PGTT */
> +		ue->val[0] |= GENMASK_ULL(4, 2) | GENMASK_ULL(8, 6);
> +		/* DID, PWSNP, PGSNP */
> +		ue->val[1] |= GENMASK_ULL(24, 23) | GENMASK_ULL(15, 0);
> +		/* FSPTPTR, FSPM */
> +		ue->val[2] |= GENMASK_ULL(63, 12) | GENMASK_ULL(3, 2);

This would be an excellent time to properly add these constants :(

/* 9.6 Scalable-Mode PASID Table Entry */
#define SM_PASID0_P		BIT_U64(0)
#define SM_PASID0_FPD		BIT_U64(1)
#define SM_PASID0_AW		GENMASK_U64(4, 2)
#define SM_PASID0_SSEE		BIT_U64(5)
#define SM_PASID0_PGTT		GENMASK_U64(8, 6)
#define SM_PASID0_SSADE		BIT_U64(9)
#define SM_PASID0_SSPTPTR	GENMASK_U64(63, 12)

#define SM_PASID1_DID		GENMASK_U64(15, 0)
#define SM_PASID1_PWSNP		BIT_U64(23)
#define SM_PASID1_PGSNP		BIT_U64(24)
#define SM_PASID1_CD		BIT_U64(25)
#define SM_PASID1_EMTE		BIT_U64(26)
#define SM_PASID1_PAT		GENMASK_U64(63, 32)

#define SM_PASID2_SRE		BIT_U64(0)
#define SM_PASID2_ERE		BIT_U64(1)
#define SM_PASID2_FSPM		GENMASK_U64(3, 2)
#define SM_PASID2_WPE		BIT_U64(4)
#define SM_PASID2_NXE		BIT_U64(5)
#define SM_PASID2_SMEP		BIT_U64(6)
#define SM_PASID2_EAFE		BIT_U64(7)
#define SM_PASID2_FSPTPTR	GENMASK_U64(63, 12)

> +static void intel_pasid_sync(struct entry_sync_writer128 *writer)
> +{
> +	struct intel_pasid_writer *p_writer = container_of(writer,
> +			struct intel_pasid_writer, writer);
> +	struct intel_iommu *iommu = p_writer->iommu;
> +	struct device *dev = p_writer->dev;
> +	bool was_present, is_present;
> +	u32 pasid = p_writer->pasid;
> +	struct pasid_entry *pte;
> +	u16 old_did, old_pgtt;
> +
> +	pte = intel_pasid_get_entry(dev, pasid);
> +	was_present = p_writer->was_present;
> +	is_present = pasid_pte_is_present(pte);
> +	old_did = pasid_get_domain_id(&p_writer->orig_pte);
> +	old_pgtt = pasid_pte_get_pgtt(&p_writer->orig_pte);
> +
> +	/* Update the last present state: */
> +	p_writer->was_present = is_present;
> +
> +	if (!ecap_coherent(iommu->ecap))
> +		clflush_cache_range(pte, sizeof(*pte));
> +
> +	/* Sync for "P=0" to "P=1": */
> +	if (!was_present) {
> +		if (is_present)
> +			pasid_flush_caches(iommu, pte, pasid,
> +					   pasid_get_domain_id(pte));
> +
> +		return;
> +	}
> +
> +	/* Sync for "P=1" to "P=1": */
> +	if (is_present) {
> +		intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
> +		return;
> +	}
> +
> +	/* Sync for "P=1" to "P=0": */
> +	pasid_cache_invalidation_with_pasid(iommu, old_did, pasid);

Why all this logic? All this different stuff does is meddle with the
IOTLB and it should not seen below.

If the sync is called it should just always call
pasid_cache_invalidation_with_pasid(), that's it.

Writer has already eliminated all cases where sync isn't needed.

> +	if (old_pgtt == PASID_ENTRY_PGTT_PT || old_pgtt == PASID_ENTRY_PGTT_FL_ONLY)
> +		qi_flush_piotlb(iommu, old_did, pasid, 0, -1, 0);
> +	else
> +		iommu->flush.flush_iotlb(iommu, old_did, 0, 0, DMA_TLB_DSI_FLUSH);
> +	devtlb_invalidation_with_pasid(iommu, dev, pasid);

The IOTLB should already be clean'd before the new entry using the
cache tag is programmed. Cleaning it after the entry is live is buggy.

The writer logic ensures it never sees a corrupted entry, so the clean
cache tag cannot be mangled during the writing process.

The way ARM is structured has the cache tags clean if they are in the
allocator bitmap, so when the driver fetches a new tag and starts
using it is clean and non cleaning is needed

When it frees a tag it cleans it and then returns it to the allocator.

ATC invalidations should always be done after the PASID entry is
written. During a hitless update both translations are unpredictably
combined, this is unavoidable and OK.

Jason


  reply	other threads:[~2026-03-09 13:41 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-09  6:06 [PATCH 0/8] iommu/vt-d: Hitless PASID updates via entry_sync Lu Baolu
2026-03-09  6:06 ` [PATCH 1/8] iommu: Lift and generalize the STE/CD update code from SMMUv3 Lu Baolu
2026-03-09 23:33   ` Samiullah Khawaja
2026-03-10  0:06     ` Samiullah Khawaja
2026-03-14  8:13       ` Baolu Lu
2026-03-16  9:51         ` Will Deacon
2026-03-18  3:10           ` Baolu Lu
2026-03-23 12:55             ` Jason Gunthorpe
2026-03-24  5:30               ` Baolu Lu
2026-03-16 16:35         ` Samiullah Khawaja
2026-03-18  3:23           ` Baolu Lu
2026-03-13  5:39   ` Nicolin Chen
2026-03-16  6:24     ` Baolu Lu
2026-03-23 12:59       ` Jason Gunthorpe
2026-03-24  5:49         ` Baolu Lu
2026-03-09  6:06 ` [PATCH 2/8] iommu/vt-d: Add entry_sync support for PASID entry updates Lu Baolu
2026-03-09 13:41   ` Jason Gunthorpe [this message]
2026-03-11  8:42     ` Baolu Lu
2026-03-11 12:23       ` Jason Gunthorpe
2026-03-12  7:51         ` Baolu Lu
2026-03-12  7:50     ` Baolu Lu
2026-03-12 11:44       ` Jason Gunthorpe
2026-03-15  8:11         ` Baolu Lu
2026-03-23 13:07           ` Jason Gunthorpe
2026-03-24  6:22             ` Baolu Lu
2026-03-24 12:53               ` Jason Gunthorpe
2026-03-09  6:06 ` [PATCH 3/8] iommu/vt-d: Require CMPXCHG16B for PASID support Lu Baolu
2026-03-09 13:42   ` Jason Gunthorpe
2026-03-12  7:59     ` Baolu Lu
2026-03-09  6:06 ` [PATCH 4/8] iommu/vt-d: Add trace events for PASID entry sync updates Lu Baolu
2026-03-09  6:06 ` [PATCH 5/8] iommu/vt-d: Use intel_pasid_write() for first-stage setup Lu Baolu
2026-03-09  6:06 ` [PATCH 6/8] iommu/vt-d: Use intel_pasid_write() for second-stage setup Lu Baolu
2026-03-09  6:06 ` [PATCH 7/8] iommu/vt-d: Use intel_pasid_write() for pass-through setup Lu Baolu
2026-03-09  6:06 ` [PATCH 8/8] iommu/vt-d: Use intel_pasid_write() for nested setup Lu Baolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260309134116.GE3717316@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=dmaluka@chromium.org \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=skhawaja@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox