From: Jason Gunthorpe <jgg@nvidia.com>
To: Michael Shavit <mshavit@google.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>, Nicolin Chen <nicolinc@nvidia.com>
Subject: Re: [PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
Date: Wed, 18 Oct 2023 10:04:55 -0300 [thread overview]
Message-ID: <20231018130455.GU3952@nvidia.com> (raw)
In-Reply-To: <CAKHBV27G1PLZ5gkYkOOCZdD9waZOZypkWqk4_HSR4XXGFT-+LA@mail.gmail.com>
On Wed, Oct 18, 2023 at 07:05:49PM +0800, Michael Shavit wrote:
> > FWIW, I found the most difficult part the used bit calculation, not
> > the update algorithm. Difficult because it is hard to read and find in
> > the spec when things are INGORED, but it is a "straightforward" job of
> > finding INGORED cases and making the used bits 0.
>
> The update algorithm is the part I'm finding much harder to read and
> review :) . arm_smmu_write_entry_step in particular is hard to read
> through; on top of which there's some subtle dependencies between loop
> iterations that weren't obvious to grok:
Yes, you have it right, it is basically a classic greedy
algorithm. Let's improve the comment.
> * Relying on the used_bits to be recomputed after the first iteration
> where V=0 was set to 0 so that more bits can now be set.
> * The STE having to be synced between iterations to prevent broken STE
> reads by the SMMU (there's a comment somewhere else in arm-smmu-v3.c
> that would fit nicely here instead). But the caller is responsible for
> calling this between iterations for some reason (supposedly to support
> CD entries as well in the next series)
Yes, for CD entry support.
How about:
/*
* This algorithm updates any STE/CD to any value without creating a situation
* where the HW can percieve a corrupted entry. HW is only required to have a 64
* bit atomicity with stores from the CPU, while entires are many 64 bit values
* big.
*
* The algorithm works by evolving the entry toward the target in a series of
* steps. Each step synchronizes with the HW so that the HW can not see an entry
* torn across two steps. Upon each call cur/cur_used reflect the current
* synchronized value seen by the HW.
*
* During each step the HW can observe a torn entry that has any combination of
* the step's old/new 64 bit words. The algorithm objective is for the HW
* behavior to always be one of current behavior, V=0, or new behavior, during
* each step, and across all steps.
*
* At each step one of three actions is choosen to evolve cur to target:
* - Update all unused bits with their target values.
* This relies on the IGNORED behavior described in the specification
* - Update a single 64-bit value
* - Update all unused bits and set V=0
*
* The last two actions will cause cur_used to change, which will then allow the
* first action on the next step.
*
* In the most general case we can make any update in three steps:
* - Disrupting the entry (V=0)
* - Fill now unused bits, all bits except V
* - Make valid (V=1), single 64 bit store
*
* However this disrupts the HW while it is happening. There are several
* interesting cases where a STE/CD can be updated without disturbing the HW
* because only a small number of bits are changing (S1DSS, CONFIG, etc) or
* because the used bits don't intersect. We can detect this by calculating how
* many 64 bit values need update after adjusting the unused bits and skip the
* V=0 process.
*/
static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
Jason
WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Michael Shavit <mshavit@google.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>, Nicolin Chen <nicolinc@nvidia.com>
Subject: Re: [PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
Date: Wed, 18 Oct 2023 10:04:55 -0300 [thread overview]
Message-ID: <20231018130455.GU3952@nvidia.com> (raw)
In-Reply-To: <CAKHBV27G1PLZ5gkYkOOCZdD9waZOZypkWqk4_HSR4XXGFT-+LA@mail.gmail.com>
On Wed, Oct 18, 2023 at 07:05:49PM +0800, Michael Shavit wrote:
> > FWIW, I found the most difficult part the used bit calculation, not
> > the update algorithm. Difficult because it is hard to read and find in
> > the spec when things are INGORED, but it is a "straightforward" job of
> > finding INGORED cases and making the used bits 0.
>
> The update algorithm is the part I'm finding much harder to read and
> review :) . arm_smmu_write_entry_step in particular is hard to read
> through; on top of which there's some subtle dependencies between loop
> iterations that weren't obvious to grok:
Yes, you have it right, it is basically a classic greedy
algorithm. Let's improve the comment.
> * Relying on the used_bits to be recomputed after the first iteration
> where V=0 was set to 0 so that more bits can now be set.
> * The STE having to be synced between iterations to prevent broken STE
> reads by the SMMU (there's a comment somewhere else in arm-smmu-v3.c
> that would fit nicely here instead). But the caller is responsible for
> calling this between iterations for some reason (supposedly to support
> CD entries as well in the next series)
Yes, for CD entry support.
How about:
/*
* This algorithm updates any STE/CD to any value without creating a situation
* where the HW can percieve a corrupted entry. HW is only required to have a 64
* bit atomicity with stores from the CPU, while entires are many 64 bit values
* big.
*
* The algorithm works by evolving the entry toward the target in a series of
* steps. Each step synchronizes with the HW so that the HW can not see an entry
* torn across two steps. Upon each call cur/cur_used reflect the current
* synchronized value seen by the HW.
*
* During each step the HW can observe a torn entry that has any combination of
* the step's old/new 64 bit words. The algorithm objective is for the HW
* behavior to always be one of current behavior, V=0, or new behavior, during
* each step, and across all steps.
*
* At each step one of three actions is choosen to evolve cur to target:
* - Update all unused bits with their target values.
* This relies on the IGNORED behavior described in the specification
* - Update a single 64-bit value
* - Update all unused bits and set V=0
*
* The last two actions will cause cur_used to change, which will then allow the
* first action on the next step.
*
* In the most general case we can make any update in three steps:
* - Disrupting the entry (V=0)
* - Fill now unused bits, all bits except V
* - Make valid (V=1), single 64 bit store
*
* However this disrupts the HW while it is happening. There are several
* interesting cases where a STE/CD can be updated without disturbing the HW
* because only a small number of bits are changing (S1DSS, CONFIG, etc) or
* because the used bits don't intersect. We can detect this by calculating how
* many 64 bit values need update after adjusting the unused bits and skip the
* V=0 process.
*/
static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
Jason
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-10-18 13:05 UTC|newest]
Thread overview: 134+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-11 0:33 [PATCH 00/19] Update SMMUv3 to the modern iommu API (part 1/2) Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-13 10:37 ` Will Deacon
2023-10-13 10:37 ` Will Deacon
2023-10-13 14:00 ` Jason Gunthorpe
2023-10-13 14:00 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-12 8:10 ` Michael Shavit
2023-10-12 8:10 ` Michael Shavit
2023-10-12 12:16 ` Jason Gunthorpe
2023-10-12 12:16 ` Jason Gunthorpe
2023-10-18 11:05 ` Michael Shavit
2023-10-18 11:05 ` Michael Shavit
2023-10-18 13:04 ` Jason Gunthorpe [this message]
2023-10-18 13:04 ` Jason Gunthorpe
2023-10-20 8:23 ` Michael Shavit
2023-10-20 8:23 ` Michael Shavit
2023-10-20 11:39 ` Jason Gunthorpe
2023-10-20 11:39 ` Jason Gunthorpe
2023-10-23 8:36 ` Michael Shavit
2023-10-23 8:36 ` Michael Shavit
2023-10-23 12:05 ` Jason Gunthorpe
2023-10-23 12:05 ` Jason Gunthorpe
2023-12-15 20:26 ` Michael Shavit
2023-12-15 20:26 ` Michael Shavit
2023-12-17 13:03 ` Jason Gunthorpe
2023-12-17 13:03 ` Jason Gunthorpe
2023-12-18 12:35 ` Michael Shavit
2023-12-18 12:35 ` Michael Shavit
2023-12-18 12:42 ` Michael Shavit
2023-12-18 12:42 ` Michael Shavit
2023-12-19 13:42 ` Michael Shavit
2023-12-19 13:42 ` Michael Shavit
2023-12-25 12:17 ` Michael Shavit
2023-12-25 12:17 ` Michael Shavit
2023-12-25 12:58 ` Michael Shavit
2023-12-25 12:58 ` Michael Shavit
2023-12-27 15:33 ` Jason Gunthorpe
2023-12-27 15:33 ` Jason Gunthorpe
2023-12-27 15:46 ` Jason Gunthorpe
2023-12-27 15:46 ` Jason Gunthorpe
2024-01-02 8:08 ` Michael Shavit
2024-01-02 8:08 ` Michael Shavit
2024-01-02 14:48 ` Jason Gunthorpe
2024-01-02 14:48 ` Jason Gunthorpe
2024-01-03 16:52 ` Michael Shavit
2024-01-03 16:52 ` Michael Shavit
2024-01-03 17:50 ` Jason Gunthorpe
2024-01-03 17:50 ` Jason Gunthorpe
2024-01-06 8:36 ` [PATCH] " Michael Shavit
2024-01-06 8:36 ` Michael Shavit
2024-01-06 8:36 ` [PATCH] iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry_step() Michael Shavit
2024-01-06 8:36 ` Michael Shavit
2024-01-10 13:34 ` Jason Gunthorpe
2024-01-10 13:34 ` Jason Gunthorpe
2024-01-06 8:36 ` [PATCH] iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry Michael Shavit
2024-01-06 8:36 ` Michael Shavit
2024-01-12 16:36 ` Jason Gunthorpe
2024-01-12 16:36 ` Jason Gunthorpe
2024-01-16 9:23 ` Michael Shavit
2024-01-16 9:23 ` Michael Shavit
2024-01-10 13:10 ` [PATCH] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
2024-01-10 13:10 ` Jason Gunthorpe
2024-01-06 8:50 ` [PATCH 04/19] " Michael Shavit
2024-01-06 8:50 ` Michael Shavit
2024-01-12 19:45 ` Jason Gunthorpe
2024-01-12 19:45 ` Jason Gunthorpe
2024-01-03 15:42 ` Michael Shavit
2024-01-03 15:42 ` Michael Shavit
2024-01-03 15:49 ` Jason Gunthorpe
2024-01-03 15:49 ` Jason Gunthorpe
2024-01-03 16:47 ` Michael Shavit
2024-01-03 16:47 ` Michael Shavit
2024-01-02 8:13 ` Michael Shavit
2024-01-02 8:13 ` Michael Shavit
2024-01-02 14:48 ` Jason Gunthorpe
2024-01-02 14:48 ` Jason Gunthorpe
2023-10-18 10:54 ` Michael Shavit
2023-10-18 10:54 ` Michael Shavit
2023-10-18 12:24 ` Jason Gunthorpe
2023-10-18 12:24 ` Jason Gunthorpe
2023-10-19 23:03 ` Jason Gunthorpe
2023-10-19 23:03 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-24 2:44 ` Michael Shavit
2023-10-24 2:44 ` Michael Shavit
2023-10-24 2:48 ` Michael Shavit
2023-10-24 2:48 ` Michael Shavit
2023-10-24 11:50 ` Jason Gunthorpe
2023-10-24 11:50 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-12 9:01 ` Michael Shavit
2023-10-12 9:01 ` Michael Shavit
2023-10-12 12:34 ` Jason Gunthorpe
2023-10-12 12:34 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-18 11:06 ` Michael Shavit
2023-10-18 11:06 ` Michael Shavit
2023-10-18 12:26 ` Jason Gunthorpe
2023-10-18 12:26 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
2023-10-11 0:33 ` [PATCH 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Jason Gunthorpe
2023-10-11 0:33 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231018130455.GU3952@nvidia.com \
--to=jgg@nvidia.com \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mshavit@google.com \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.