All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
	jean-philippe@linaro.org, miko.lenczewski@arm.com,
	balbirs@nvidia.com, peterz@infradead.org, smostafa@google.com,
	kevin.tian@intel.com, praan@google.com, zhangzekun11@huawei.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: Re: [PATCH rfcv1 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
Date: Wed, 27 Aug 2025 15:49:23 -0300	[thread overview]
Message-ID: <20250827184923.GC2206304@nvidia.com> (raw)
In-Reply-To: <8c4c5aec144f65cfd1fcef2eafb395876dac97ec.1755131672.git.nicolinc@nvidia.com>

On Wed, Aug 13, 2025 at 06:25:38PM -0700, Nicolin Chen wrote:
> +void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
> +			       unsigned long iova, size_t size,
> +			       unsigned int granule, bool leaf)
> +{
> +	struct arm_smmu_cmdq_batch cmds = {};
> +	struct arm_smmu_invs *invs;
> +	bool retried = false;
> +	size_t i;
> +
> +	/*
> +	 * An invalidation request must follow some IOPTE change and then load
> +	 * the invalidation array In the meantime, a domain attachment mutates
> +	 * the array and then stores an STE/CD asking SMMU HW to acquire those
> +	 * changed IOPTEs. In other word, these two are interdependent and can
> +	 * race.
> +	 *
> +	 * In a race, the RCU design (with its underlying memory barriers) can
> +	 * ensure the invalidations array to always get updated before loaded.
> +	 *
> +	 * smp_mb() is used here, paired with the smp_mb() following the array
> +	 * update in a concurrent attach, to ensure:
> +	 *  - HW sees the new IOPTEs if it walks after STE installation
> +	 *  - Invalidation thread sees the updated array with the new ASID.
> +	 *
> +	 *  [CPU0]                        | [CPU1]
> +	 *                                |
> +	 *  change IOPTEs and TLB flush:  |
> +	 *  arm_smmu_domain_inv_range() { | arm_smmu_install_new_domain_invs {
> +	 *    ...                         |   rcu_assign_pointer(new_invs);
> +	 *    smp_mb(); // ensure IOPTEs  |   smp_mb(); // ensure new_invs
> +	 *    ...                         |   kfree_rcu(old_invs, rcu);
> +	 *    // load invalidation array  | }
> +	 *    invs = rcu_dereference();   | arm_smmu_install_ste_for_dev {
> +	 *                                |   STE = TTB0 // read new IOPTEs
> +	 */
> +	smp_mb();
> +
> +	rcu_read_lock();
> +again:
> +	invs = rcu_dereference(smmu_domain->invs);
> +
> +	/* A concurrent attachment might have changed the array. Do a respin */
> +	if (unlikely(!read_trylock(&invs->rwlock)))
> +		goto again;
> +	/* Only one retry. Otherwise, it would soft lockup on an empty array */
> +	if (!retried && unlikely(!invs->num_invs)) {
> +		read_unlock(&invs->rwlock);
> +		retried = true;
> +		goto again;
> +	}

This has missed the point, it was to not get the unless we have
ATS. Something like this:


	rcu_read_lock();

	while (true) {
		invs = rcu_dereference(smmu_domain->invs);

		/*
		 * Avoid locking unless ATS is being used. No ATS invalidate can
		 * be going on after a domain is detached.
		 */
		locked = false;
		if (invs->has_ats || READ_ONCE(invs->old)) {
			read_lock(&invs->rwlock);
			if (invs->old) {
				read_unlock(&invs->rwlock);
				continue;
			}
			locked = true;
		}
		break;
	}

	num_invs = READ_ONCE(num_invs);
	for (i = 0; i != num_invs; i++) {

> +		case INV_TYPE_S2_VMID_S1_CLEAR:
> +			/* CMDQ_OP_TLBI_S12_VMALL already flushed S1 entries */
> +			if (arm_smmu_inv_size_too_big(cur->smmu, size, granule))
> +				continue;
> +			/* Just a single CMDQ_OP_TLBI_NH_ALL, no batching */
> +			cmd.tlbi.vmid = cur->id;
> +			arm_smmu_cmdq_issue_cmd_with_sync(cur->smmu, &cmd);

This should just stick it in the batch, the batch was already flushed:

	/* The batch for S2 TLBI must be done before nested S1 ASIDs */
	if (next->type == INV_TYPE_S2_VMID_S1_CLEAR)
		return true;

That needs a fixup too:

	/* The batch for S2 TLBI must be done before nested S1 ASIDs */
	if (cur->type != INV_TYPE_S2_VMID_S1_CLEAR &&
	    next->type == INV_TYPE_S2_VMID_S1_CLEAR)
		return true;

It makes sense to bundle all the NH_ALL into one batch if there is more
than one.

But this arm_smmu_invs_end_batch() no longer works right since the

		if (refcount_read(&cur->users) == 0)
			continue;

Was added..

Jason


  reply	other threads:[~2025-08-27 20:24 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-14  1:25 [PATCH rfcv1 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2025-08-26 19:50   ` Jason Gunthorpe
2025-08-27  0:49     ` Nicolin Chen
2025-08-27 16:48   ` Jason Gunthorpe
2025-08-27 17:19     ` Nicolin Chen
2025-08-28 12:37       ` Jason Gunthorpe
2025-08-27 20:00   ` Jason Gunthorpe
2025-09-06  8:16     ` Nicolin Chen
2025-09-08 15:51       ` Jason Gunthorpe
2025-09-08 18:20         ` Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
2025-08-26 19:56   ` Jason Gunthorpe
2025-09-06  7:45     ` Nicolin Chen
2025-09-08 15:36       ` Jason Gunthorpe
2025-08-14  1:25 ` [PATCH rfcv1 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
2025-08-27 18:21   ` Jason Gunthorpe
2025-09-06  7:52     ` Nicolin Chen
2025-09-06  8:20     ` Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
2025-08-27 18:49   ` Jason Gunthorpe [this message]
2025-09-06  8:12     ` Nicolin Chen
2025-09-08 15:39       ` Jason Gunthorpe
2025-09-08 18:19         ` Nicolin Chen
2025-09-08 18:24           ` Jason Gunthorpe
2025-09-08 18:45             ` Nicolin Chen
2025-08-14  1:25 ` [PATCH rfcv1 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250827184923.GC2206304@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=balbirs@nvidia.com \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miko.lenczewski@arm.com \
    --cc=nicolinc@nvidia.com \
    --cc=patches@lists.linux.dev \
    --cc=peterz@infradead.org \
    --cc=praan@google.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    --cc=zhangzekun11@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.