From: Jason Gunthorpe <jgg@nvidia.com>
To: Michael Shavit <mshavit@google.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>, Nicolin Chen <nicolinc@nvidia.com>
Subject: Re: [PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
Date: Wed, 18 Oct 2023 09:24:35 -0300
Message-ID: <20231018122435.GS3952@nvidia.com>
In-Reply-To: <CAKHBV24gTmBTEVGx__mSs9bQ93DLTZ-VDbpE24S87kgXS+nHhw@mail.gmail.com>
On Wed, Oct 18, 2023 at 06:54:10PM +0800, Michael Shavit wrote:
> > + } else if (!step_change) {
> > + /* cur == target, so all done */
> > + if (memcmp(cur, target, sizeof(*cur)) == 0)
> > + return true;
> Shouldn't this be len * sizeof(*cur)?
Ugh, yes, thank you. An earlier version had cur be a 'struct
arm_smmu_ste', I missed this when I changed it to allow reuse for the
CD path...
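To make the sizing bug concrete, here is a minimal userspace sketch. It assumes cur and target point at arrays of 64-bit words (uint64_t stands in for __le64), and both helper names are hypothetical, not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Buggy form: with cur being a pointer to a 64-bit word, sizeof(*cur)
 * is 8 bytes, so only the first word of the entry is compared.
 */
static bool entry_equal_first_word_only(const uint64_t *cur,
					const uint64_t *target)
{
	return memcmp(cur, target, sizeof(*cur)) == 0;
}

/* Fixed form: compare all len words of the entry. */
static bool entry_equal(const uint64_t *cur, const uint64_t *target,
			size_t len)
{
	return memcmp(cur, target, len * sizeof(*cur)) == 0;
}
```

With two entries that differ only in their last word, the first helper reports them equal while the second does not, which is exactly the early "all done" return the quoted hunk would take by mistake.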
> > + case STRTAB_STE_0_CFG_S1_TRANS:
> > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> > + STRTAB_STE_0_S1CTXPTR_MASK |
> > + STRTAB_STE_0_S1CDMAX);
> > + used_bits->data[1] |=
> > + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> > + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> > + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> > +
> > + if (FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
> > + STRTAB_STE_1_S1DSS_BYPASS)
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
>
> Although the driver only explicitly sets SHCFG for bypass streams, my
> reading of the spec is it is also accessed for S1 and S2 STEs:
> "The SMMU might convey attributes input from a device through this
> process, so that the device might influence the final transaction
> access, and input attributes might be overridden on a per-device basis
> using the MTCFG/MemAttr, SHCFG, ALLOCCFG STE fields. The input
> attribute, modified by these fields, is primarily useful for setting
> the resulting output access attribute when both stage 1 and stage 2
> translation is bypassed (no translation table descriptors to determine
> attribute) but can also be useful for stage 2-only configurations in
> which a device stream might have finer knowledge about the required
> access behavior than the general virtual machine-global stage 2
> translation tables."
Hm.. I struggled with this for a while.
There is some kind of issue here: we cannot have it both ways, where
the S1 translation on a PASID needs SHCFG=0 and the S1DSS_BYPASS needs
SHCFG=1. Either the S1 PASID ignores the field, e.g. because the IOPTE
supersedes it (what this patch assumes), or the S1DSS doesn't need it, or
we cannot use S1DSS at all.
Let me see if we can get a deeper understanding here, it is a good
point.
> > + case STRTAB_STE_0_CFG_S2_TRANS:
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> > + used_bits->data[2] |=
> > + cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
> > + STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
> > + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
> > + used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
> > + break;
> > +
> > + default:
> > + memset(used_bits, 0xFF, sizeof(*used_bits));
>
> Can we consider a WARN here since this driver only ever uses one of
> the above 4 values and we probably have a programming error if we see
> something else.
Ok
> > +static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
> > + const struct arm_smmu_ste *target,
> > + const struct arm_smmu_ste *target_used)
> > +{
> > + struct arm_smmu_ste cur_used;
> > + struct arm_smmu_ste step;
> > +
> > + arm_smmu_get_ste_used(cur, &cur_used);
> > + return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
> > + target_used->data, step.data,
>
> What's up with requiring callers to allocate and provide step.data if
> it's not used by any of the arm_smmu_write_entry_step callers?
arm_smmu_write_entry_step requires a temporary buffer of len 64-bit
words - since variable-length stack arrays (i.e. alloca) are forbidden
in the kernel, and kmalloc would be silly, the simplest solution was to
have the caller allocate it and then pass it in.
Alternatively we could have a max size temporary array inside
arm_smmu_write_entry_step() with some static asserts, but I thought
that was less clear.
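The two options can be sketched side by side in userspace C. This is only an illustration under stated assumptions: the function names, the ENTRY_MAX_WORDS constant and the word counts are hypothetical, uint64_t stands in for __le64, and the memcpy calls are placeholders for the real step computation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STE_WORDS	8	/* illustrative entry size in 64-bit words */
#define ENTRY_MAX_WORDS	8	/* hypothetical upper bound for option B */

/* Option A: the caller owns the scratch space and passes it in. */
static void entry_step_caller_scratch(uint64_t *cur, const uint64_t *target,
				      uint64_t *scratch, size_t len)
{
	/* Placeholder for the real logic: stage the update, then apply it. */
	memcpy(scratch, target, len * sizeof(*scratch));
	memcpy(cur, scratch, len * sizeof(*cur));
}

/* Option B: fixed-size scratch inside the helper, guarded by asserts. */
_Static_assert(STE_WORDS <= ENTRY_MAX_WORDS, "STE does not fit in scratch");

static void entry_step_internal_scratch(uint64_t *cur, const uint64_t *target,
					size_t len)
{
	uint64_t scratch[ENTRY_MAX_WORDS];

	assert(len <= ENTRY_MAX_WORDS);	/* runtime guard for the bound */
	memcpy(scratch, target, len * sizeof(*scratch));
	memcpy(cur, scratch, len * sizeof(*cur));
}
```

Option A keeps the helper free of any knowledge about entry sizes at the cost of one extra argument; option B hides the buffer but needs one static assert per entry type that shares the helper.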
> > + cpu_to_le64(STRTAB_STE_0_V),
> This also looks a bit strange at this stage since CD entries aren't
> yet supported..... but sure.
Yeah, this function shim is for the later patch that adds one of these
for CD. Don't want to go and change stuff twice.
For reference the CD function from a later patch is:
static bool arm_smmu_write_cd_step(struct arm_smmu_cd *cur,
				   const struct arm_smmu_cd *target,
				   const struct arm_smmu_cd *target_used)
{
	struct arm_smmu_cd cur_used;
	struct arm_smmu_cd step;

	arm_smmu_get_cd_used(cur, &cur_used);
	return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
					 target_used->data, step.data,
					 cpu_to_le64(CTXDESC_CD_0_V),
					 ARRAY_SIZE(cur->data));
}
Thanks,
Jason