From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Nicolin Chen <nicolinc@nvidia.com>,
will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, skolothumtho@nvidia.com,
praan@google.com, xueshuai@linux.alibaba.com
Subject: Re: [PATCH rc v3 1/4] iommu/arm-smmu-v3: Add ignored bits to fix STE update sequence
Date: Tue, 16 Dec 2025 22:58:33 +0000 [thread overview]
Message-ID: <aUHkGCEeu7L5uCMR@google.com> (raw)
In-Reply-To: <20251216000952.GA6079@nvidia.com>
On Mon, Dec 15, 2025 at 08:09:52PM -0400, Jason Gunthorpe wrote:
> On Sun, Dec 14, 2025 at 10:32:35PM +0000, Mostafa Saleh wrote:
> > > * Figure out if we can do a hitless update of entry to become target. Returns a
> > > * bit mask where 1 indicates that qword needs to be set disruptively.
> > > @@ -1094,13 +1100,22 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
> > > {
> > > __le64 target_used[NUM_ENTRY_QWORDS] = {};
> > > __le64 cur_used[NUM_ENTRY_QWORDS] = {};
> > > + __le64 ignored[NUM_ENTRY_QWORDS] = {};
> >
> > I think we can avoid extra stack allocation for another STE, if we make
> > the function update cur_used directly, but no strong opinion.
>
> It does more than just mask cur_used, it also adjusts ignored:
>
> > > + /*
> > > + * Ignored is only used for bits that are used by both entries,
> > > + * otherwise it is sequenced according to the unused entry.
> > > + */
> > > + ignored[i] &= target_used[i] & cur_used[i];
>
> Which also explains this:
I haven't tested that, but I was thinking about (applies on top of the patches)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 72ba41591fdb..9981eefcf0da 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1083,7 +1083,7 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_ste_used);
VISIBLE_IF_KUNIT
-void arm_smmu_get_ste_ignored(__le64 *ignored_bits)
+void arm_smmu_get_ste_ignored(__le64 *used)
{
/*
* MEV does not meaningfully impact the operation of the HW, it only
@@ -1093,17 +1093,14 @@ void arm_smmu_get_ste_ignored(__le64 *ignored_bits)
*
* Note: Software must expect, and be able to deal with, coalesced
* fault records even when MEV == 0.
- */
- ignored_bits[1] |= cpu_to_le64(STRTAB_STE_1_MEV);
-
- /*
+ *
* EATS is used to reject and control the ATS behavior of the device. If
* we are changing it away from 0 then we already trust the device to
* use ATS properly and we have sequenced the device's ATS enable in PCI
* config space to prevent it from issuing ATS while we are changing
* EATS.
*/
- ignored_bits[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
+ used[1] &= ~cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_MEV);
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_ste_ignored);
@@ -1119,22 +1116,15 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
{
__le64 target_used[NUM_ENTRY_QWORDS] = {};
__le64 cur_used[NUM_ENTRY_QWORDS] = {};
- __le64 ignored[NUM_ENTRY_QWORDS] = {};
u8 used_qword_diff = 0;
unsigned int i;
writer->ops->get_used(entry, cur_used);
writer->ops->get_used(target, target_used);
- if (writer->ops->get_ignored)
- writer->ops->get_ignored(ignored);
+ if (writer->ops->filter_ignored)
+ writer->ops->filter_ignored(cur_used);
for (i = 0; i != NUM_ENTRY_QWORDS; i++) {
- /*
- * Ignored is only used for bits that are used by both entries,
- * otherwise it is sequenced according to the unused entry.
- */
- ignored[i] &= target_used[i] & cur_used[i];
-
/*
* Check that masks are up to date, the make functions are not
* allowed to set a bit to 1 if the used function doesn't say it
@@ -1142,8 +1132,6 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
*/
WARN_ON_ONCE(target[i] & ~target_used[i]);
- /* Bits can change because they are not currently being used */
- cur_used[i] &= ~ignored[i];
unused_update[i] = (entry[i] & cur_used[i]) |
(target[i] & ~cur_used[i]);
/*
@@ -1575,7 +1563,7 @@ static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = {
.sync = arm_smmu_ste_writer_sync_entry,
.get_used = arm_smmu_get_ste_used,
- .get_ignored = arm_smmu_get_ste_ignored,
+ .filter_ignored = arm_smmu_get_ste_ignored,
};
static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d5f0e5407b9f..97b995974049 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -900,7 +900,7 @@ struct arm_smmu_entry_writer {
struct arm_smmu_entry_writer_ops {
void (*get_used)(const __le64 *entry, __le64 *used);
- void (*get_ignored)(__le64 *ignored_bits);
+ void (*filter_ignored)(__le64 *used);
void (*sync)(struct arm_smmu_entry_writer *writer);
};
And we only clear the bits from cur_used, there is no need to for the
other mask ignored (ignored[i] &= target_used[i] & cur_used[i])
- If an ignored bit is not in cur_used it will not impact
"cur_used[i] &= ~ignored[i]" as it must be already zero
- If an ignored bit is not in target_used, it doesn't really matter,
we can ignore it anyway, as it is safe to do so.
Anyway, I am not fixated on that though, extra 64B on the stack is not that bad.
>
> > I have some mixed feelings about this, having get_used(), then get_ignored()
> > with the same bits set seems confusing to me, specially the get_ignored()
> > loops back to update cur_used, which is set from get_used()
>
> The same bits are set because of the above - we need to know what the
> actual used bits are to decide if we need to rely on the ignored rule
> to do the update.
>
> > My initial though was just to remove this bit from get_used() + some changes
> > to checks setting bits that are not used would be enough, and the semantics
> > of get_used() can be something as:
> > “Return bits used by the updated translation regime that MUST be observed
> > atomically” and in that case we can ignore things as MEV as it doesn’t
> > impact the translation.
>
> Aside from the above this would cause problems with the validation
> assertions, so it is not a great idea.
Yes, that's why I didn't like this, it had to hack the validation logic.
Thanks,
Mostafa
>
> > However, this approach makes it a bit explicit which bits are ignored, if we
> > keep this logic, I think changing the name of get_ignored() might help, to
> > something as "get_allowed_break()" or "get_update_safe()"?
>
> update_safe sounds good to me
>
> Jason
next prev parent reply other threads:[~2025-12-16 22:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-10 2:45 [PATCH rc v3 0/4] : iommu/arm-smmu-v3: Fix hitless STE update in nesting cases Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 1/4] iommu/arm-smmu-v3: Add ignored bits to fix STE update sequence Nicolin Chen
2025-12-14 22:32 ` Mostafa Saleh
2025-12-15 20:51 ` Nicolin Chen
2025-12-16 22:49 ` Mostafa Saleh
2025-12-16 0:09 ` Jason Gunthorpe
2025-12-16 20:46 ` Nicolin Chen
2025-12-16 22:58 ` Mostafa Saleh [this message]
2025-12-17 0:23 ` Jason Gunthorpe
2025-12-10 2:45 ` [PATCH rc v3 2/4] iommu/arm-smmu-v3: Ignore STE MEV when computing the " Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 3/4] iommu/arm-smmu-v3: Ignore STE EATS " Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 4/4] iommu/arm-smmu-v3-test: Add nested s1bypass/s1dssbypass coverage Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aUHkGCEeu7L5uCMR@google.com \
--to=smostafa@google.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nicolinc@nvidia.com \
--cc=praan@google.com \
--cc=robin.murphy@arm.com \
--cc=skolothumtho@nvidia.com \
--cc=will@kernel.org \
--cc=xueshuai@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).