From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Nicolin Chen <nicolinc@nvidia.com>,
will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, skolothumtho@nvidia.com,
praan@google.com, xueshuai@linux.alibaba.com
Subject: Re: [PATCH rc v3 1/4] iommu/arm-smmu-v3: Add ignored bits to fix STE update sequence
Date: Tue, 16 Dec 2025 22:58:33 +0000 [thread overview]
Message-ID: <aUHkGCEeu7L5uCMR@google.com> (raw)
In-Reply-To: <20251216000952.GA6079@nvidia.com>
On Mon, Dec 15, 2025 at 08:09:52PM -0400, Jason Gunthorpe wrote:
> On Sun, Dec 14, 2025 at 10:32:35PM +0000, Mostafa Saleh wrote:
> > > * Figure out if we can do a hitless update of entry to become target. Returns a
> > > * bit mask where 1 indicates that qword needs to be set disruptively.
> > > @@ -1094,13 +1100,22 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
> > > {
> > > __le64 target_used[NUM_ENTRY_QWORDS] = {};
> > > __le64 cur_used[NUM_ENTRY_QWORDS] = {};
> > > + __le64 ignored[NUM_ENTRY_QWORDS] = {};
> >
> > I think we can avoid extra stack allocation for another STE, if we make
> > the function update cur_used directly, but no strong opinion.
>
> It does more than just mask cur_used, it also adjusts ignored:
>
> > > + /*
> > > + * Ignored is only used for bits that are used by both entries,
> > > + * otherwise it is sequenced according to the unused entry.
> > > + */
> > > + ignored[i] &= target_used[i] & cur_used[i];
>
> Which also explains this:
I haven't tested that, but I was thinking about (applies on top of the patches)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 72ba41591fdb..9981eefcf0da 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1083,7 +1083,7 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_ste_used);
VISIBLE_IF_KUNIT
-void arm_smmu_get_ste_ignored(__le64 *ignored_bits)
+void arm_smmu_get_ste_ignored(__le64 *used)
{
/*
* MEV does not meaningfully impact the operation of the HW, it only
@@ -1093,17 +1093,14 @@ void arm_smmu_get_ste_ignored(__le64 *ignored_bits)
*
* Note: Software must expect, and be able to deal with, coalesced
* fault records even when MEV == 0.
- */
- ignored_bits[1] |= cpu_to_le64(STRTAB_STE_1_MEV);
-
- /*
+ *
* EATS is used to reject and control the ATS behavior of the device. If
* we are changing it away from 0 then we already trust the device to
* use ATS properly and we have sequenced the device's ATS enable in PCI
* config space to prevent it from issuing ATS while we are changing
* EATS.
*/
- ignored_bits[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
+ used[1] &= ~cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_MEV);
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_ste_ignored);
@@ -1119,22 +1116,15 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
{
__le64 target_used[NUM_ENTRY_QWORDS] = {};
__le64 cur_used[NUM_ENTRY_QWORDS] = {};
- __le64 ignored[NUM_ENTRY_QWORDS] = {};
u8 used_qword_diff = 0;
unsigned int i;
writer->ops->get_used(entry, cur_used);
writer->ops->get_used(target, target_used);
- if (writer->ops->get_ignored)
- writer->ops->get_ignored(ignored);
+ if (writer->ops->filter_ignored)
+ writer->ops->filter_ignored(cur_used);
for (i = 0; i != NUM_ENTRY_QWORDS; i++) {
- /*
- * Ignored is only used for bits that are used by both entries,
- * otherwise it is sequenced according to the unused entry.
- */
- ignored[i] &= target_used[i] & cur_used[i];
-
/*
* Check that masks are up to date, the make functions are not
* allowed to set a bit to 1 if the used function doesn't say it
@@ -1142,8 +1132,6 @@ static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
*/
WARN_ON_ONCE(target[i] & ~target_used[i]);
- /* Bits can change because they are not currently being used */
- cur_used[i] &= ~ignored[i];
unused_update[i] = (entry[i] & cur_used[i]) |
(target[i] & ~cur_used[i]);
/*
@@ -1575,7 +1563,7 @@ static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = {
.sync = arm_smmu_ste_writer_sync_entry,
.get_used = arm_smmu_get_ste_used,
- .get_ignored = arm_smmu_get_ste_ignored,
+ .filter_ignored = arm_smmu_get_ste_ignored,
};
static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d5f0e5407b9f..97b995974049 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -900,7 +900,7 @@ struct arm_smmu_entry_writer {
struct arm_smmu_entry_writer_ops {
void (*get_used)(const __le64 *entry, __le64 *used);
- void (*get_ignored)(__le64 *ignored_bits);
+ void (*filter_ignored)(__le64 *used);
void (*sync)(struct arm_smmu_entry_writer *writer);
};
And we only clear the bits from cur_used, there is no need to for the
other mask ignored (ignored[i] &= target_used[i] & cur_used[i])
- If an ignored bit is not in cur_used it will not impact
"cur_used[i] &= ~ignored[i]" as it must be already zero
- If an ignored bit is not in target_used, it doesn't really matter,
we can ignore it anyway, as it is safe to do so.
Anyway, I am not fixated on that though, extra 64B on the stack is not that bad.
>
> > I have some mixed feelings about this, having get_used(), then get_ignored()
> > with the same bits set seems confusing to me, specially the get_ignored()
> > loops back to update cur_used, which is set from get_used()
>
> The same bits are set because of the above - we need to know what the
> actual used bits are to decide if we need to rely on the ignored rule
> to do the update.
>
> > My initial though was just to remove this bit from get_used() + some changes
> > to checks setting bits that are not used would be enough, and the semantics
> > of get_used() can be something as:
> > “Return bits used by the updated translation regime that MUST be observed
> > atomically” and in that case we can ignore things as MEV as it doesn’t
> > impact the translation.
>
> Aside from the above this would cause problems with the validation
> assertions, so it is not a great idea.
Yes, that's why I didn't like this, it had to hack the validation logic.
Thanks,
Mostafa
>
> > However, this approach makes it a bit explicit which bits are ignored, if we
> > keep this logic, I think changing the name of get_ignored() might help, to
> > something as "get_allowed_break()" or "get_update_safe()"?
>
> update_safe sounds good to me
>
> Jason
next prev parent reply other threads:[~2025-12-16 22:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-10 2:45 [PATCH rc v3 0/4] : iommu/arm-smmu-v3: Fix hitless STE update in nesting cases Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 1/4] iommu/arm-smmu-v3: Add ignored bits to fix STE update sequence Nicolin Chen
2025-12-14 22:32 ` Mostafa Saleh
2025-12-15 20:51 ` Nicolin Chen
2025-12-16 22:49 ` Mostafa Saleh
2025-12-16 0:09 ` Jason Gunthorpe
2025-12-16 20:46 ` Nicolin Chen
2025-12-16 22:58 ` Mostafa Saleh [this message]
2025-12-17 0:23 ` Jason Gunthorpe
2025-12-10 2:45 ` [PATCH rc v3 2/4] iommu/arm-smmu-v3: Ignore STE MEV when computing the " Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 3/4] iommu/arm-smmu-v3: Ignore STE EATS " Nicolin Chen
2025-12-10 2:45 ` [PATCH rc v3 4/4] iommu/arm-smmu-v3-test: Add nested s1bypass/s1dssbypass coverage Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aUHkGCEeu7L5uCMR@google.com \
--to=smostafa@google.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nicolinc@nvidia.com \
--cc=praan@google.com \
--cc=robin.murphy@arm.com \
--cc=skolothumtho@nvidia.com \
--cc=will@kernel.org \
--cc=xueshuai@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.