* Re: [PATCH 2/3] dmaengine: xilinx: xdma: Fix synchronization issue
From: Lizhi Hou @ 2024-03-27 17:46 UTC (permalink / raw)
To: Louis Chauvet, Brian Xu, Raj Kumar Rampelli, Vinod Koul,
Michal Simek, Jan Kuliga, Miquel Raynal
Cc: dmaengine, linux-arm-kernel, linux-kernel, stable
In-Reply-To: <20240327-digigram-xdma-fixes-v1-2-45f4a52c0283@bootlin.com>
On 3/27/24 02:58, Louis Chauvet wrote:
> The current xdma_synchronize method does not properly wait for the last
> transfer to be done. Due to limitations of the XDMA engine, it is not
> possible to stop a transfer in the middle of a descriptor. Said
> otherwise, if a stop is requested at the end of descriptor "N" and the OS
> is fast enough, the DMA controller will effectively stop immediately.
> However, if the OS is slightly too slow to request the stop and the DMA
> engine starts descriptor "N+1", the N+1 transfer will be performed until
> its end. This means that after a terminate_all, the last descriptor must
> remain valid and the synchronization must wait for this last descriptor to
> be terminated.
>
> Fixes: 855c2e1d1842 ("dmaengine: xilinx: xdma: Rework xdma_terminate_all()")
> Fixes: f5c392d106e7 ("dmaengine: xilinx: xdma: Add terminate_all/synchronize callbacks")
> Cc: stable@vger.kernel.org
> Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com>
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
> drivers/dma/xilinx/xdma-regs.h | 3 +++
> drivers/dma/xilinx/xdma.c | 26 ++++++++++++++++++--------
> 2 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xdma-regs.h b/drivers/dma/xilinx/xdma-regs.h
> index 98f5f6fb9ff9..6ad08878e938 100644
> --- a/drivers/dma/xilinx/xdma-regs.h
> +++ b/drivers/dma/xilinx/xdma-regs.h
> @@ -117,6 +117,9 @@ struct xdma_hw_desc {
> CHAN_CTRL_IE_WRITE_ERROR | \
> CHAN_CTRL_IE_DESC_ERROR)
>
> +/* bits of the channel status register */
> +#define XDMA_CHAN_STATUS_BUSY BIT(0)
> +
> #define XDMA_CHAN_STATUS_MASK CHAN_CTRL_START
>
> #define XDMA_CHAN_ERROR_MASK (CHAN_CTRL_IE_DESC_ALIGN_MISMATCH | \
> diff --git a/drivers/dma/xilinx/xdma.c b/drivers/dma/xilinx/xdma.c
> index b9788aa8f6b7..5a3a3293b21b 100644
> --- a/drivers/dma/xilinx/xdma.c
> +++ b/drivers/dma/xilinx/xdma.c
> @@ -71,6 +71,8 @@ struct xdma_chan {
> enum dma_transfer_direction dir;
> struct dma_slave_config cfg;
> u32 irq;
> + struct completion last_interrupt;
> + bool stop_requested;
> };
>
> /**
> @@ -376,6 +378,8 @@ static int xdma_xfer_start(struct xdma_chan *xchan)
> return ret;
>
> xchan->busy = true;
> + xchan->stop_requested = false;
> + reinit_completion(&xchan->last_interrupt);
If stop_requested is true, it should not start another transfer. So I
suggest adding
    if (xchan->stop_requested)
        return -ENODEV;
at the beginning of xdma_xfer_start(), which is already protected by the
chan lock.
>
> return 0;
> }
> @@ -387,7 +391,6 @@ static int xdma_xfer_start(struct xdma_chan *xchan)
> static int xdma_xfer_stop(struct xdma_chan *xchan)
> {
> int ret;
> - u32 val;
> struct xdma_device *xdev = xchan->xdev_hdl;
>
> /* clear run stop bit to prevent any further auto-triggering */
> @@ -395,13 +398,7 @@ static int xdma_xfer_stop(struct xdma_chan *xchan)
> CHAN_CTRL_RUN_STOP);
> if (ret)
> return ret;
The two lines above ("if (ret)" and "return ret;") can be removed with your change.
> -
> - /* Clear the channel status register */
> - ret = regmap_read(xdev->rmap, xchan->base + XDMA_CHAN_STATUS_RC, &val);
> - if (ret)
> - return ret;
> -
> - return 0;
> + return ret;
> }
>
> /**
> @@ -474,6 +471,8 @@ static int xdma_alloc_channels(struct xdma_device *xdev,
> xchan->xdev_hdl = xdev;
> xchan->base = base + i * XDMA_CHAN_STRIDE;
> xchan->dir = dir;
> + xchan->stop_requested = false;
> + init_completion(&xchan->last_interrupt);
>
> ret = xdma_channel_init(xchan);
> if (ret)
> @@ -521,6 +520,7 @@ static int xdma_terminate_all(struct dma_chan *chan)
> spin_lock_irqsave(&xdma_chan->vchan.lock, flags);
>
> xdma_chan->busy = false;
> + xdma_chan->stop_requested = true;
> vd = vchan_next_desc(&xdma_chan->vchan);
> if (vd) {
> list_del(&vd->node);
> @@ -542,6 +542,13 @@ static int xdma_terminate_all(struct dma_chan *chan)
> static void xdma_synchronize(struct dma_chan *chan)
> {
> struct xdma_chan *xdma_chan = to_xdma_chan(chan);
> + struct xdma_device *xdev = xdma_chan->xdev_hdl;
> + int st = 0;
> +
> + /* If the engine continues running, wait for the last interrupt */
> + regmap_read(xdev->rmap, xdma_chan->base + XDMA_CHAN_STATUS, &st);
> + if (st & XDMA_CHAN_STATUS_BUSY)
> + wait_for_completion_timeout(&xdma_chan->last_interrupt, msecs_to_jiffies(1000));
I suggest adding an error message for the timeout case.
>
> vchan_synchronize(&xdma_chan->vchan);
> }
> @@ -876,6 +883,9 @@ static irqreturn_t xdma_channel_isr(int irq, void *dev_id)
> u32 st;
> bool repeat_tx;
>
> + if (xchan->stop_requested)
> + complete(&xchan->last_interrupt);
> +
This should be moved to the end of the function to make sure processing
of the previous transfer has completed:
out:
if (xchan->stop_requested) {
xchan->busy = false;
complete(&xchan->last_interrupt);
}
spin_unlock(&xchan->vchan.lock);
return IRQ_HANDLED;
Thanks,
Lizhi
> spin_lock(&xchan->vchan.lock);
>
> /* get submitted request */
>
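The review comments above can be sketched in user space. This is only a rough model, not driver code: `model_chan`, `model_xfer_start()`, `model_terminate_all()` and `model_isr()` are hypothetical stand-ins, a plain bool replaces `struct completion`, and the channel lock is omitted. It illustrates the two suggested behaviours: refusing to start a transfer while a stop is pending, and signalling the "last interrupt" only after the descriptor has been processed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the driver's struct xdma_chan. */
struct model_chan {
	bool busy;
	bool stop_requested;
	bool last_interrupt_done;	/* stands in for struct completion */
	int processed;			/* descriptors handled by the ISR */
};

/* Refuse to start while a stop is pending, as the review suggests. */
static int model_xfer_start(struct model_chan *c)
{
	if (c->stop_requested)
		return -ENODEV;
	c->busy = true;
	c->last_interrupt_done = false;
	return 0;
}

/* terminate_all only records the request; the engine may still finish N+1. */
static void model_terminate_all(struct model_chan *c)
{
	c->busy = false;
	c->stop_requested = true;
}

/* Process the finished descriptor first, signal completion last. */
static void model_isr(struct model_chan *c)
{
	c->processed++;
	if (c->stop_requested) {
		c->busy = false;
		c->last_interrupt_done = true;
	}
}
```

In this model a synchronize step would simply wait until `last_interrupt_done` becomes true; the real driver would use wait_for_completion_timeout() as in the patch.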
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [PATCH 4/4] kprobes: Remove core dependency on modules
From: Jarkko Sakkinen @ 2024-03-27 17:46 UTC (permalink / raw)
To: Masami Hiramatsu, Mark Rutland
Cc: linux-kernel, agordeev, anil.s.keshavamurthy, aou, bp,
catalin.marinas, dave.hansen, davem, gor, hca, jcalvinowens,
linux-arm-kernel, mingo, mpe, naveen.n.rao, palmer, paul.walmsley,
tglx, will
In-Reply-To: <20240327090155.873f1ed32700dbdb75f8eada@kernel.org>
On Wed Mar 27, 2024 at 2:01 AM EET, Masami Hiramatsu (Google) wrote:
> On Tue, 26 Mar 2024 17:38:18 +0000
> Mark Rutland <mark.rutland@arm.com> wrote:
>
> > On Tue, Mar 26, 2024 at 07:13:51PM +0200, Jarkko Sakkinen wrote:
> > > On Tue Mar 26, 2024 at 6:36 PM EET, Mark Rutland wrote:
> >
> > > > +#ifdef CONFIG_MODULES
> > > > /* Check if 'p' is probing a module. */
> > > > *probed_mod = __module_text_address((unsigned long) p->addr);
> > > > if (*probed_mod) {
> > > > @@ -1605,6 +1606,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
> > > > ret = -ENOENT;
> > > > }
> > > > }
> > > > +#endif
> > >
> > > This can be scoped a bit more (see v7 of my patch set).
> >
> > > > +#ifdef CONFIG_MODULES
> > > > static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > > > {
> > > > char *p;
> > > > @@ -129,6 +130,9 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > > >
> > > > return ret;
> > > > }
> > > > +#else
> > > > +#define trace_kprobe_module_exist(tk) false /* aka a module never exists */
> > > > +#endif /* CONFIG_MODULES */
> > > >
> > > > static bool trace_kprobe_is_busy(struct dyn_event *ev)
> > > > {
> > > > @@ -670,6 +674,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
> > > > return ret;
> > > > }
> > > >
> > > > +#ifdef CONFIG_MODULES
> > > > /* Module notifier call back, checking event on the module */
> > > > static int trace_kprobe_module_callback(struct notifier_block *nb,
> > > > unsigned long val, void *data)
> > > > @@ -699,6 +704,9 @@ static int trace_kprobe_module_callback(struct notifier_block *nb,
> > > >
> > > > return NOTIFY_DONE;
> > > > }
> > > > +#else
> > > > +#define trace_kprobe_module_callback (NULL)
> > > > +#endif /* CONFIG_MODULES */
> > >
> > > The last two CONFIG_MODULES sections could be combined. This was also in
> > > v7.
> >
> > > Other than that, lgtm.
> >
> > Great! I've folded your v7 changes in, and pushed that out to:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=kprobes/without-modules
> >
> > I'll hold off sending that out to the list until other folk have had a chance
> > to comment.
>
> Yeah, the updated one looks good to me too.
>
> Thanks!
As for RISC-V:
Tested-by: Jarkko Sakkinen <jarkko@kernel.org> # arch/riscv
I'm fine with adding it to all patches, because it would be hard to
attach the Tested-by to any specific patch (e.g. if this was a syscall
I would give the Tested-by just for that patch).
Just adding a disclaimer, because depending on the subsystem people
are more or less strict with this tag :-)
BR, Jarkko
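For readers skimming the quoted hunks, the stub pattern under discussion can be sketched outside the kernel. This is an illustration under stated assumptions: `MODEL_CONFIG_MODULES`, `struct model_probe` and `model_probe_module_exist()` are hypothetical names, and the "real" branch only checks whether the symbol names a module, which is simpler than what trace_kprobe_module_exist() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical switch standing in for the kernel's CONFIG_MODULES. */
/* #define MODEL_CONFIG_MODULES 1 */

struct model_probe {
	const char *symbol;	/* "module:function" or plain "function" */
};

#ifdef MODEL_CONFIG_MODULES
/* Simplified check: does the probe name a module ("mod:sym")? */
static bool model_probe_module_exist(const struct model_probe *p)
{
	return strchr(p->symbol, ':') != NULL;
}
#else
/* Stub: without module support, a module never exists. */
#define model_probe_module_exist(p) ((void)(p), false)
#endif
```

Callers stay free of #ifdef blocks: with the switch left undefined, every call collapses to a constant false at compile time.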
* [PATCH] Input: stmpe - drop driver owner assignment
From: Krzysztof Kozlowski @ 2024-03-27 17:46 UTC (permalink / raw)
To: Dmitry Torokhov, Maxime Coquelin, Alexandre Torgue, linux-input,
linux-stm32, linux-arm-kernel, linux-kernel
Cc: Krzysztof Kozlowski
Core in platform_driver_register() already sets the .owner, so driver
does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
drivers/input/keyboard/stmpe-keypad.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/input/keyboard/stmpe-keypad.c b/drivers/input/keyboard/stmpe-keypad.c
index 2013c0afd0c3..ef2f44027894 100644
--- a/drivers/input/keyboard/stmpe-keypad.c
+++ b/drivers/input/keyboard/stmpe-keypad.c
@@ -413,7 +413,6 @@ static void stmpe_keypad_remove(struct platform_device *pdev)
static struct platform_driver stmpe_keypad_driver = {
.driver.name = "stmpe-keypad",
- .driver.owner = THIS_MODULE,
.probe = stmpe_keypad_probe,
.remove_new = stmpe_keypad_remove,
};
--
2.34.1
* Re: [PATCH 1/1] arm64: mm: swap: support THP_SWAP on hardware with MTE
From: Catalin Marinas @ 2024-03-27 17:58 UTC (permalink / raw)
To: Ryan Roberts
Cc: David Hildenbrand, Matthew Wilcox, Barry Song, will, akpm, hughd,
linux-mm, linux-arm-kernel, chrisl, mark.rutland, steven.price,
Barry Song, Kemeng Shi, Anshuman Khandual, Peter Collingbourne,
Yosry Ahmed, Peter Xu, Lorenzo Stoakes, Mike Rapoport (IBM),
Aneesh Kumar K.V, Rick Edgecombe
In-Reply-To: <46ba09c5-4186-4e03-81cc-ca27c0301fef@arm.com>
On Wed, Mar 27, 2024 at 03:13:18PM +0000, Ryan Roberts wrote:
> On 27/03/2024 14:57, David Hildenbrand wrote:
> > On 27.03.24 15:53, Matthew Wilcox wrote:
> >> On Sat, Mar 23, 2024 at 12:41:36AM +1300, Barry Song wrote:
> >>> Commit d0637c505f8a1 ("arm64: enable THP_SWAP for arm64") brings up
> >>> THP_SWAP on ARM64, but it doesn't enable THP_SWAP on hardware with
> >>> MTE as the MTE code works with the assumption tags save/restore is
> >>> always handling a folio with only one page.
> >>>
> >>> The limitation should be removed as more and more ARM64 SoCs have
> >>> this feature. Co-existence of MTE and THP_SWAP becomes more and
> >>> more important.
> >>>
> >>> This patch makes MTE tags saving support large folios, then we don't
> >>> need to split large folios into base pages for swapping out on ARM64
> >>> SoCs with MTE any more.
> >>
> >> Can we go further than this patch and only support PG_mte_tagged and
> >> PG_mte_lock on folio->flags instead of page->flags? We're down to using
> >
> > I think we discussed that already and what I learned is that it "gets a bit
> > complicated". But I'm hoping we can get that discussion started again.
>
> The original conversation starts here:
> https://lore.kernel.org/linux-mm/fb34d312-1049-4932-8f2b-d7f33cfc297c@arm.com/
>
> The issue is that you can have a large folio mapped to user space, and user
> space only wants to activate MTE for a portion of it. So at that point, you
> either have to deal with only part of it being tagged (as we do today with the
> per-page flag) or you have to split the folio.
It needs splitting since the PROT_MTE property ends up in the pte as a
memory attribute. So we can't have a THP mapping with PROT_MTE but only
specific pages tagged.
I had an attempt last year to only keep the PG_mte_tagged flag in the
head page but I recall folio_copy() got in the way since it was calling
copy_highpage() on individual pages and the arm64 code was not seeing
the head PG_mte_tagged. I think it can be worked around but I got
distracted and forgot about this.
--
Catalin
* [PATCH v6 10/29] iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Only the attach callers can perform an allocation for the CD table entry.
The other callers must not do so: they do not have the correct locking and
they cannot sleep. Split up the functions so this is clear.
arm_smmu_get_cd_ptr() will return a pointer to a CD table entry without
doing any kind of allocation.
arm_smmu_alloc_cd_ptr() will allocate the table and any required
leaf.
A following patch will add lockdep assertions to arm_smmu_alloc_cd_ptr()
once the restructuring is completed and arm_smmu_alloc_cd_ptr() is never
called in the wrong context.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 61 +++++++++++++--------
1 file changed, 39 insertions(+), 22 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54571f2a4acd5b..2ea4fe9d6594bc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -103,6 +103,7 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_device *smmu);
+static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
static void parse_driver_options(struct arm_smmu_device *smmu)
{
@@ -1212,29 +1213,51 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid)
{
- __le64 *l1ptr;
- unsigned int idx;
struct arm_smmu_l1_ctx_desc *l1_desc;
- struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
+ if (!cd_table->cdtab)
+ return NULL;
+
if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
return (struct arm_smmu_cd *)(cd_table->cdtab +
ssid * CTXDESC_CD_DWORDS);
- idx = ssid >> CTXDESC_SPLIT;
- l1_desc = &cd_table->l1_desc[idx];
- if (!l1_desc->l2ptr) {
- if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
- return NULL;
+ l1_desc = &cd_table->l1_desc[ssid / CTXDESC_L2_ENTRIES];
+ if (!l1_desc->l2ptr)
+ return NULL;
+ return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
+}
- l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
- arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
- /* An invalid L1CD can be cached */
- arm_smmu_sync_cd(master, ssid, false);
+static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
+{
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
+ struct arm_smmu_device *smmu = master->smmu;
+
+ if (!cd_table->cdtab) {
+ if (arm_smmu_alloc_cd_tables(master))
+ return NULL;
}
- idx = ssid & (CTXDESC_L2_ENTRIES - 1);
- return &l1_desc->l2ptr[idx];
+
+ if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) {
+ unsigned int idx = ssid >> CTXDESC_SPLIT;
+ struct arm_smmu_l1_ctx_desc *l1_desc;
+
+ l1_desc = &cd_table->l1_desc[idx];
+ if (!l1_desc->l2ptr) {
+ __le64 *l1ptr;
+
+ if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+ return NULL;
+
+ l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
+ arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
+ /* An invalid L1CD can be cached */
+ arm_smmu_sync_cd(master, ssid, false);
+ }
+ }
+ return arm_smmu_get_cd_ptr(master, ssid);
}
struct arm_smmu_cd_writer {
@@ -1362,7 +1385,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
return -E2BIG;
- cd_table_entry = arm_smmu_get_cd_ptr(master, ssid);
+ cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid);
if (!cd_table_entry)
return -ENOMEM;
@@ -2689,13 +2712,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_cd target_cd;
struct arm_smmu_cd *cdptr;
- if (!master->cd_table.cdtab) {
- ret = arm_smmu_alloc_cd_tables(master);
- if (ret)
- goto out_list_del;
- }
-
- cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
if (!cdptr) {
ret = -ENOMEM;
goto out_list_del;
--
2.43.2
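The get/alloc split described in this patch can be illustrated with a small user-space model. It is a sketch, not the driver's code: `model_cd_table`, `model_get_cd_ptr()` and `model_alloc_cd_ptr()` are hypothetical, the table sizes are made up, and calloc() stands in for the DMA-coherent allocation of a leaf table. Only the two-level case is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_L2_ENTRIES 4	/* hypothetical leaf size, not CTXDESC_L2_ENTRIES */
#define MODEL_L1_ENTRIES 4

struct model_cd { uint64_t data[8]; };

struct model_cd_table {
	struct model_cd *l2[MODEL_L1_ENTRIES];	/* NULL until a leaf is allocated */
};

/* Lookup only: never allocates, so it suits callers that cannot sleep. */
static struct model_cd *model_get_cd_ptr(struct model_cd_table *t, uint32_t ssid)
{
	struct model_cd *leaf;

	if (ssid >= MODEL_L1_ENTRIES * MODEL_L2_ENTRIES)
		return NULL;
	leaf = t->l2[ssid / MODEL_L2_ENTRIES];
	if (!leaf)
		return NULL;
	return &leaf[ssid % MODEL_L2_ENTRIES];
}

/* Attach path: allocate the missing leaf, then reuse the lookup. */
static struct model_cd *model_alloc_cd_ptr(struct model_cd_table *t, uint32_t ssid)
{
	uint32_t idx = ssid / MODEL_L2_ENTRIES;

	if (idx >= MODEL_L1_ENTRIES)
		return NULL;
	if (!t->l2[idx]) {
		t->l2[idx] = calloc(MODEL_L2_ENTRIES, sizeof(struct model_cd));
		if (!t->l2[idx])
			return NULL;
	}
	return model_get_cd_ptr(t, ssid);
}
```

Keeping the allocation in one function means a NULL from the lookup variant always signals "not allocated yet" rather than an allocation failure.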
* [PATCH v6 09/29] iommu/arm-smmu-v3: Consolidate clearing a CD table entry
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
A cleared entry is all 0's. Make arm_smmu_clear_cd() do this sequence.
If we are clearing an entry and for some reason it is not already
allocated in the CD table then something has gone wrong.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 ++++++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
3 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index d159f60480935e..7cf286f7a009fb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -569,7 +569,7 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
mutex_lock(&sva_lock);
- arm_smmu_write_ctx_desc(master, id, NULL);
+ arm_smmu_clear_cd(master, id);
list_for_each_entry(t, &master->bonds, list) {
if (t->mm == mm) {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fd1d4d774a7cf2..54571f2a4acd5b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1314,6 +1314,19 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
target->data[3] = cpu_to_le64(cd->mair);
}
+void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
+{
+ struct arm_smmu_cd target = {};
+ struct arm_smmu_cd *cdptr;
+
+ if (!master->cd_table.cdtab)
+ return;
+ cdptr = arm_smmu_get_cd_ptr(master, ssid);
+ if (WARN_ON(!cdptr))
+ return;
+ arm_smmu_write_cd_entry(master, ssid, cdptr, &target);
+}
+
static void arm_smmu_clean_cd_entry(struct arm_smmu_cd *target)
{
struct arm_smmu_cd used = {};
@@ -2698,9 +2711,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
case ARM_SMMU_DOMAIN_S2:
arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
arm_smmu_install_ste_for_dev(master, &target);
- if (master->cd_table.cdtab)
- arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
- NULL);
+ arm_smmu_clear_cd(master, IOMMU_NO_PASID);
break;
}
@@ -2748,8 +2759,7 @@ static int arm_smmu_attach_dev_ste(struct device *dev,
* arm_smmu_domain->devices to avoid races updating the same context
* descriptor from arm_smmu_share_asid().
*/
- if (master->cd_table.cdtab)
- arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
+ arm_smmu_clear_cd(master, IOMMU_NO_PASID);
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 919f9f717bd3b2..d32da11058aab6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -749,6 +749,7 @@ extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
extern struct arm_smmu_ctx_desc quiet_cd;
+void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
--
2.43.2
* [PATCH v6 22/29] iommu: Add ops->domain_alloc_sva()
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Make a new op that receives the device and the mm_struct that the SVA
domain should be created for. Unlike domain_alloc_paging() the dev
argument is never NULL here.
This allows drivers to fully initialize the SVA domain and allocate the
mmu_notifier during allocation. It allows the notifier lifetime to follow
the lifetime of the iommu_domain.
Since we have only one call site, upgrade the new op to return ERR_PTR
instead of NULL.
Change SMMUv3 to use the new op.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 +++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +++++-
drivers/iommu/iommu-sva.c | 16 +++++++++++-----
include/linux/iommu.h | 3 +++
5 files changed, 27 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 3e7aad0960bfd2..e337e40ac5de31 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -659,17 +659,20 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
.free = arm_smmu_sva_domain_free
};
-struct iommu_domain *arm_smmu_sva_domain_alloc(unsigned type)
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm)
{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
- if (type != IOMMU_DOMAIN_SVA)
- return ERR_PTR(-EOPNOTSUPP);
-
smmu_domain = arm_smmu_domain_alloc();
if (IS_ERR(smmu_domain))
return ERR_CAST(smmu_domain);
+
+ smmu_domain->domain.type = IOMMU_DOMAIN_SVA;
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
+ smmu_domain->smmu = smmu;
return &smmu_domain->domain;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3d9109ad60c19c..7b001afda17aa8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3302,8 +3302,8 @@ static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
- .domain_alloc = arm_smmu_sva_domain_alloc,
.domain_alloc_paging = arm_smmu_domain_alloc_paging,
+ .domain_alloc_sva = arm_smmu_sva_domain_alloc,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
.device_group = arm_smmu_device_group,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 9db84d5940466a..107a39f1dfe869 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -791,7 +791,8 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
-struct iommu_domain *arm_smmu_sva_domain_alloc(unsigned int type);
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm);
void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
@@ -837,5 +838,8 @@ static inline void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
ioasid_t id)
{
}
+
+#define arm_smmu_sva_domain_alloc NULL
+
#endif /* CONFIG_ARM_SMMU_V3_SVA */
#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 640acc804e8cdc..18a35e798b729c 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -108,8 +108,8 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
/* Allocate a new domain and set it on device pasid. */
domain = iommu_sva_domain_alloc(dev, mm);
- if (!domain) {
- ret = -ENOMEM;
+ if (IS_ERR(domain)) {
+ ret = PTR_ERR(domain);
goto out_free_handle;
}
@@ -283,9 +283,15 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_domain *domain;
- domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
- if (!domain)
- return NULL;
+ if (ops->domain_alloc_sva) {
+ domain = ops->domain_alloc_sva(dev, mm);
+ if (IS_ERR(domain))
+ return domain;
+ } else {
+ domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
+ if (!domain)
+ return ERR_PTR(-ENOMEM);
+ }
domain->type = IOMMU_DOMAIN_SVA;
mmgrab(mm);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2e925b5eba534c..8aabe83af8f266 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -518,6 +518,7 @@ static inline int __iommu_copy_struct_from_user_array(
* Upon failure, ERR_PTR must be returned.
* @domain_alloc_paging: Allocate an iommu_domain that can be used for
* UNMANAGED, DMA, and DMA_FQ domain types.
+ * @domain_alloc_sva: Allocate an iommu_domain for Shared Virtual Addressing.
* @probe_device: Add device to iommu driver handling
* @release_device: Remove device from iommu driver handling
* @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -558,6 +559,8 @@ struct iommu_ops {
struct device *dev, u32 flags, struct iommu_domain *parent,
const struct iommu_user_data *user_data);
struct iommu_domain *(*domain_alloc_paging)(struct device *dev);
+ struct iommu_domain *(*domain_alloc_sva)(struct device *dev,
+ struct mm_struct *mm);
struct iommu_device *(*probe_device)(struct device *dev);
void (*release_device)(struct device *dev);
--
2.43.2
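The switch from a NULL return to ERR_PTR, with a fallback to the legacy op, can be sketched in user space. The helpers below are simplified re-creations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), and `model_ops`, `model_domain` and `model_sva_domain_alloc()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space re-creation of the kernel's ERR_PTR helpers, simplified. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long model_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_domain { int type; };

struct model_ops {
	/* New-style op: returns a domain or an encoded error, never NULL. */
	struct model_domain *(*alloc_sva)(void);
	/* Old-style op: returns a domain or NULL. */
	struct model_domain *(*alloc_legacy)(void);
};

/* Prefer the new op; translate the legacy NULL into an encoded -ENOMEM. */
static struct model_domain *model_sva_domain_alloc(const struct model_ops *ops)
{
	struct model_domain *d;

	if (ops->alloc_sva)
		return ops->alloc_sva();	/* may already be an error pointer */

	d = ops->alloc_legacy();
	if (!d)
		return model_err_ptr(-ENOMEM);
	return d;
}

/* Tiny driver callbacks used to exercise each path. */
static struct model_domain model_dom;
static struct model_domain *model_ok(void) { return &model_dom; }
static struct model_domain *model_null(void) { return NULL; }
static struct model_domain *model_unsupported(void)
{
	return model_err_ptr(-EOPNOTSUPP);
}
```

With this shape the single caller tests the result with one IS_ERR-style check and can propagate the driver's own error code instead of a blanket -ENOMEM.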
* [PATCH v6 11/29] iommu/arm-smmu-v3: Allocate the CD table entry in advance
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Avoid arm_smmu_attach_dev() having to undo the changes to the
smmu_domain->devices list: acquire the cdptr earlier so we don't need to
handle that error.
Now there is a clear break in arm_smmu_attach_dev() where all the
prep-work has been done non-disruptively and we commit to making the HW
change, which cannot fail.
This completes transforming arm_smmu_attach_dev() so that it does not
disturb the HW if it fails.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 +++++++--------------
1 file changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2ea4fe9d6594bc..2bf55ed4e32ced 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2663,6 +2663,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master;
+ struct arm_smmu_cd *cdptr;
if (!fwspec)
return -ENOENT;
@@ -2691,6 +2692,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (ret)
return ret;
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
+ if (!cdptr)
+ return -ENOMEM;
+ }
+
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2710,13 +2717,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1: {
struct arm_smmu_cd target_cd;
- struct arm_smmu_cd *cdptr;
-
- cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
- if (!cdptr) {
- ret = -ENOMEM;
- goto out_list_del;
- }
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
@@ -2733,16 +2733,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
arm_smmu_enable_ats(master, smmu_domain);
- goto out_unlock;
-
-out_list_del:
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del_init(&master->domain_head);
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
-
-out_unlock:
mutex_unlock(&arm_smmu_asid_lock);
- return ret;
+ return 0;
}
static int arm_smmu_attach_dev_ste(struct device *dev,
--
2.43.2
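The "do the fallible work first, then commit" shape this patch arrives at can be shown with a user-space sketch. All names here (`model_master`, `model_attach()`, `model_alloc_fn`) are hypothetical, and two booleans stand in for the devices-list entry and the hardware state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_master {
	bool on_domain_list;	/* stands in for the smmu_domain->devices entry */
	bool hw_programmed;	/* stands in for the STE/CD written to hardware */
};

/* Hypothetical allocator hook so the example can force a failure. */
typedef void *(*model_alloc_fn)(void);

static int model_attach(struct model_master *m, model_alloc_fn alloc_cd)
{
	/* Step 1: everything that can fail happens before any state changes. */
	void *cdptr = alloc_cd();

	if (!cdptr)
		return -ENOMEM;	/* nothing to undo */

	/* Step 2: commit. From here on no step can fail. */
	m->on_domain_list = true;
	m->hw_programmed = true;
	return 0;
}

static int model_slot;
static void *model_alloc_ok(void) { return &model_slot; }
static void *model_alloc_fail(void) { return NULL; }
```

Because the only failure point precedes every state change, the error path needs no unwinding label, which mirrors the removal of out_list_del in the diff above.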
* [PATCH v6 21/29] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Currently the smmu_domain->devices list is unused for SVA domains.
Fill it in with the SSID and master of every arm_smmu_set_pasid()
using the same logic as the RID attach.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 +++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9611ac239fea8c..3d9109ad60c19c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2586,7 +2586,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
/* The domain can be NULL only when processing the first attach */
if (!domain)
return NULL;
- if (domain->type & __IOMMU_DOMAIN_PAGING)
+ if ((domain->type & __IOMMU_DOMAIN_PAGING) ||
+ domain->type == IOMMU_DOMAIN_SVA)
return to_smmu_domain(domain);
return NULL;
}
@@ -2792,7 +2793,15 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct attach_state state = {
+ /*
+ * For now the core code prevents calling this when a domain is
+ * already attached, no need to set old_domain.
+ */
+ .ssid = pasid,
+ };
struct arm_smmu_cd *cdptr;
+ int ret;
/* The core code validates pasid */
@@ -2802,14 +2811,30 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
return -ENOMEM;
+
+ mutex_lock(&arm_smmu_asid_lock);
+ ret = arm_smmu_attach_prepare(master, &smmu_domain->domain, &state);
+ if (ret)
+ goto out_unlock;
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
- return 0;
+
+ arm_smmu_attach_commit(master, &state);
+
+out_unlock:
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
}
void arm_smmu_remove_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
{
+ mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
+ if (master->ats_enabled)
+ arm_smmu_atc_inv_master(master, pasid);
+ arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
+ mutex_unlock(&arm_smmu_asid_lock);
}
static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
--
2.43.2
* [PATCH v6 08/29] iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Introduce arm_smmu_make_s1_cd() to build the CD from the paging S1 domain,
and reorganize all the places programming S1 domain CD table entries to
call it.
Split arm_smmu_update_s1_domain_cd_entry() from
arm_smmu_update_ctx_desc_devices() so that the S1 path has its own call
chain separate from the unrelated SVA path.
arm_smmu_update_s1_domain_cd_entry() only works on S1 domains
attached to RIDs and refreshes all their CDs.
Remove the forced clear of the CD during S1 domain attach;
arm_smmu_write_cd_entry() will do this automatically if necessary.
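
The "make, then write" split can be pictured with a small userspace sketch.
Everything named ex_* is hypothetical, and the writer is a deliberately
simplified stand-in: it only passes through V=0 when more than one qword of a
live, valid entry changes. The driver's real writer is more elaborate; this
just shows why the caller no longer needs to clear the entry itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EX_CD_QWORDS 8
#define EX_CD_V      (1ull << 31)   /* valid bit position assumed for the sketch */

struct ex_cd { uint64_t data[EX_CD_QWORDS]; };
struct ex_domain { uint16_t asid; uint64_t ttbr; uint64_t mair; };

/* Pure builder: computes the desired entry, touches no live table. */
void ex_make_s1_cd(struct ex_cd *target, const struct ex_domain *d)
{
	memset(target, 0, sizeof(*target));
	target->data[0] = EX_CD_V | ((uint64_t)d->asid << 48);
	target->data[1] = d->ttbr;
	target->data[3] = d->mair;
}

/*
 * Installs target into the live entry and returns true if the entry had to
 * pass through an invalid (V=0) state on the way.
 */
bool ex_write_cd(struct ex_cd *live, const struct ex_cd *target)
{
	unsigned int i, changed = 0;
	bool broke = false;

	for (i = 0; i < EX_CD_QWORDS; i++)
		if (live->data[i] != target->data[i])
			changed++;

	/* More than one qword changes on a valid entry: invalidate first. */
	if (changed > 1 && (live->data[0] & EX_CD_V)) {
		live->data[0] &= ~EX_CD_V;
		broke = true;
	}

	for (i = 1; i < EX_CD_QWORDS; i++)
		live->data[i] = target->data[i];
	live->data[0] = target->data[0];	/* qword holding V goes last */
	return broke;
}
```

A caller builds the target on its stack with ex_make_s1_cd() and hands it to
ex_write_cd(); whether an intermediate invalid state is needed is the writer's
decision, not the caller's.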
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 25 +++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 60 +++++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +++
3 files changed, 76 insertions(+), 18 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 41b44baef15e80..d159f60480935e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -53,6 +53,29 @@ static void arm_smmu_update_ctx_desc_devices(struct arm_smmu_domain *smmu_domain
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
}
+static void
+arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_master *master;
+ struct arm_smmu_cd target_cd;
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+ list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ struct arm_smmu_cd *cdptr;
+
+ /* S1 domains only support RID attachment right now */
+ cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ if (WARN_ON(!cdptr))
+ continue;
+
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
+ &target_cd);
+ }
+ spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
+
/*
* Check if the CPU ASID is available on the SMMU side. If a private context
* descriptor is using it, try to replace it.
@@ -96,7 +119,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
* be some overlap between use of both ASIDs, until we invalidate the
* TLB.
*/
- arm_smmu_update_ctx_desc_devices(smmu_domain, IOMMU_NO_PASID, cd);
+ arm_smmu_update_s1_domain_cd_entry(smmu_domain);
/* Invalidate TLB entries previously associated with that context */
arm_smmu_tlb_inv_asid(smmu, asid);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 453437ca4bfc2b..fd1d4d774a7cf2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1209,8 +1209,8 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
WRITE_ONCE(*dst, cpu_to_le64(val));
}
-static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
- u32 ssid)
+struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
__le64 *l1ptr;
unsigned int idx;
@@ -1273,9 +1273,9 @@ static const struct arm_smmu_entry_writer_ops arm_smmu_cd_writer_ops = {
.v_bit = cpu_to_le64(CTXDESC_CD_0_V),
};
-static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
- struct arm_smmu_cd *cdptr,
- const struct arm_smmu_cd *target)
+void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
+ struct arm_smmu_cd *cdptr,
+ const struct arm_smmu_cd *target)
{
struct arm_smmu_cd_writer cd_writer = {
.writer = {
@@ -1288,6 +1288,32 @@ static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data);
}
+void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
+
+ memset(target, 0, sizeof(*target));
+
+ target->data[0] = cpu_to_le64(
+ cd->tcr |
+#ifdef __BIG_ENDIAN
+ CTXDESC_CD_0_ENDI |
+#endif
+ CTXDESC_CD_0_V |
+ CTXDESC_CD_0_AA64 |
+ (master->stall_enabled ? CTXDESC_CD_0_S : 0) |
+ CTXDESC_CD_0_R |
+ CTXDESC_CD_0_A |
+ CTXDESC_CD_0_ASET |
+ FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid)
+ );
+
+ target->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
+ target->data[3] = cpu_to_le64(cd->mair);
+}
+
static void arm_smmu_clean_cd_entry(struct arm_smmu_cd *target)
{
struct arm_smmu_cd used = {};
@@ -2646,29 +2672,29 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
switch (smmu_domain->stage) {
- case ARM_SMMU_DOMAIN_S1:
+ case ARM_SMMU_DOMAIN_S1: {
+ struct arm_smmu_cd target_cd;
+ struct arm_smmu_cd *cdptr;
+
if (!master->cd_table.cdtab) {
ret = arm_smmu_alloc_cd_tables(master);
if (ret)
goto out_list_del;
- } else {
- /*
- * arm_smmu_write_ctx_desc() relies on the entry being
- * invalid to work, clear any existing entry.
- */
- ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
- NULL);
- if (ret)
- goto out_list_del;
}
- ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
- if (ret)
+ cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ if (!cdptr) {
+ ret = -ENOMEM;
goto out_list_del;
+ }
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
+ &target_cd);
arm_smmu_make_cdtable_ste(&target, master);
arm_smmu_install_ste_for_dev(master, &target);
break;
+ }
case ARM_SMMU_DOMAIN_S2:
arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
arm_smmu_install_ste_for_dev(master, &target);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7078ed569fd4d3..919f9f717bd3b2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -749,6 +749,15 @@ extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
extern struct arm_smmu_ctx_desc quiet_cd;
+struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid);
+void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain);
+void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
+ struct arm_smmu_cd *cdptr,
+ const struct arm_smmu_cd *target);
+
int arm_smmu_write_ctx_desc(struct arm_smmu_master *smmu_master, int ssid,
struct arm_smmu_ctx_desc *cd);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
--
2.43.2
* [PATCH v6 13/29] iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Half the code was living in arm_smmu_domain_finalise_s1(), just move it
here and take the values directly from the pgtbl_ops instead of storing
copies.
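
For readers unfamiliar with the FIELD_PREP() idiom used in the diff below, here
is a userspace approximation. ex_field_prep() and EX_GENMASK() are stand-ins
written for this note, not the kernel macros, and the field positions are
assumptions that should be checked against arm-smmu-v3.h.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous mask covering bits h..l inclusive (0 <= l <= h <= 63). */
#define EX_GENMASK(h, l) (((~0ull) >> (63 - (h))) & ((~0ull) << (l)))

/* Field positions assumed for the sketch. */
#define EX_TCR_T0SZ  EX_GENMASK(5, 0)
#define EX_TCR_TG0   EX_GENMASK(7, 6)
#define EX_TCR_IRGN0 EX_GENMASK(9, 8)
#define EX_TCR_ORGN0 EX_GENMASK(11, 10)
#define EX_TCR_SH0   EX_GENMASK(13, 12)
#define EX_TCR_IPS   EX_GENMASK(34, 32)

/* Shift val up to the lowest set bit of mask, then clip it to the mask. */
uint64_t ex_field_prep(uint64_t mask, uint64_t val)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!((mask >> shift) & 1))
		shift++;
	return (val << shift) & mask;
}

/* Hypothetical mirror of the TCR fields read from the page table config. */
struct ex_tcr { unsigned int tsz, tg, irgn, orgn, sh, ips; };

/* Pack the fields directly from the config instead of from a cached copy. */
uint64_t ex_pack_tcr(const struct ex_tcr *tcr)
{
	return ex_field_prep(EX_TCR_T0SZ, tcr->tsz) |
	       ex_field_prep(EX_TCR_TG0, tcr->tg) |
	       ex_field_prep(EX_TCR_IRGN0, tcr->irgn) |
	       ex_field_prep(EX_TCR_ORGN0, tcr->orgn) |
	       ex_field_prep(EX_TCR_SH0, tcr->sh) |
	       ex_field_prep(EX_TCR_IPS, tcr->ips);
}
```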
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++++++++-------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 --
2 files changed, 18 insertions(+), 32 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index af5ebedf0f0beb..49e51bc1a5c788 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1313,15 +1313,25 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
+ const struct io_pgtable_cfg *pgtbl_cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
+ typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr =
+ &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
- cd->tcr |
+ FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
#ifdef __BIG_ENDIAN
CTXDESC_CD_0_ENDI |
#endif
+ CTXDESC_CD_0_TCR_EPD1 |
CTXDESC_CD_0_V |
+ FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
CTXDESC_CD_0_AA64 |
(master->stall_enabled ? CTXDESC_CD_0_S : 0) |
CTXDESC_CD_0_R |
@@ -1329,9 +1339,9 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
CTXDESC_CD_0_ASET |
FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid)
);
-
- target->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
- target->data[3] = cpu_to_le64(cd->mair);
+ target->data[1] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.ttbr &
+ CTXDESC_CD_1_TTB0_MASK);
+ target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.mair);
}
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
@@ -2286,13 +2296,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
}
static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain,
- struct io_pgtable_cfg *pgtbl_cfg)
+ struct arm_smmu_domain *smmu_domain)
{
int ret;
u32 asid;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
- typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
refcount_set(&cd->refs, 1);
@@ -2300,31 +2308,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
mutex_lock(&arm_smmu_asid_lock);
ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- if (ret)
- goto out_unlock;
-
cd->asid = (u16)asid;
- cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
- cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
- FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) |
- FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) |
- FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
- FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
- FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
- CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
- cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair;
-
- mutex_unlock(&arm_smmu_asid_lock);
- return 0;
-
-out_unlock:
mutex_unlock(&arm_smmu_asid_lock);
return ret;
}
static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain,
- struct io_pgtable_cfg *pgtbl_cfg)
+ struct arm_smmu_domain *smmu_domain)
{
int vmid;
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
@@ -2348,8 +2338,7 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable_ops *pgtbl_ops;
int (*finalise_stage_fn)(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain,
- struct io_pgtable_cfg *pgtbl_cfg);
+ struct arm_smmu_domain *smmu_domain);
/* Restrict the stage to what we can actually support */
if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
@@ -2392,7 +2381,7 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
smmu_domain->domain.geometry.force_aperture = true;
- ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg);
+ ret = finalise_stage_fn(smmu, smmu_domain);
if (ret < 0) {
free_io_pgtable_ops(pgtbl_ops);
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5aefb0ee2b9bb7..7bafec4c0c2fac 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc {
struct arm_smmu_ctx_desc {
u16 asid;
- u64 ttbr;
- u64 tcr;
- u64 mair;
refcount_t refs;
struct mm_struct *mm;
--
2.43.2
* [PATCH v6 16/29] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The core code allows the domain to be changed on the fly without a forced
stop in BLOCKED/IDENTITY. In this flow the driver should just continually
maintain the ATS with no change while the STE is updated.
ATS relies on a linked list smmu_domain->devices to keep track of which
masters have the domain programmed, but this list is also used by
arm_smmu_share_asid(), unrelated to ATS.
Create two new functions to encapsulate this combined logic:
arm_smmu_attach_prepare()
<caller generates and sets the STE>
arm_smmu_attach_commit()
The two functions can sequence both enabling ATS and disabling across
the STE store. Have every update of the STE use this sequence.
Installing a S1/S2 domain always enables the ATS if the PCIe device
supports it.
The enable flow is now ordered differently to allow it to be hitless:
1) Add the master to the new smmu_domain->devices list
2) Program the STE
3) Enable ATS at PCIe
4) Remove the master from the old smmu_domain
This flow ensures that invalidations to either domain will generate an ATC
invalidation to the device while the STE is being switched. Thus we don't
need to turn off the ATS anymore for correctness.
The disable flow is the reverse:
1) Disable ATS at PCIe
2) Program the STE
3) Invalidate the ATC
4) Remove the master from the old smmu_domain
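
The two orderings above can be mimicked in a toy model, given here only as a
reading aid. All ex_* names are hypothetical; list membership is reduced to a
counter per domain, the STE write is left to the caller, and ATC invalidation
and nr_ats_masters are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_domain { int nr_devices; };	/* stands in for the devices list */

struct ex_master {
	bool ats_supported;
	bool ats_enabled;	/* what the driver believes */
	bool pci_ats_on;	/* state at the PCIe endpoint */
};

struct ex_attach_state {
	bool want_ats;
	bool disable_ats;
	struct ex_domain *old_domain;
};

/* Before the STE write: join the new domain, turn ATS off early if needed. */
void ex_attach_prepare(struct ex_master *m, struct ex_domain *new_domain,
		       struct ex_attach_state *s)
{
	s->want_ats = !s->disable_ats && m->ats_supported;

	if (new_domain)
		new_domain->nr_devices++;	/* now on both old and new */

	if (!s->want_ats && m->ats_enabled)
		m->pci_ats_on = false;		/* disable flow, step 1 */
}

/* After the STE write: turn ATS on late if needed, leave the old domain. */
void ex_attach_commit(struct ex_master *m, struct ex_attach_state *s)
{
	if (s->want_ats && !m->ats_enabled)
		m->pci_ats_on = true;		/* enable flow, step 3 */
	m->ats_enabled = s->want_ats;

	if (s->old_domain) {
		assert(s->old_domain->nr_devices > 0);
		s->old_domain->nr_devices--;	/* last step in both flows */
	}
}
```

Between ex_attach_prepare() and ex_attach_commit() the master is counted on
both domains, which is the property the commit message relies on for hitless
switching.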
Move the nr_ats_masters adjustments to be close to the list
manipulations. It is a count of the number of ATS enabled masters
currently in the list. This is strictly before and after the STE/CD are
revised, and done under the list's spin_lock.
This is part of the bigger picture to allow changing the RID domain while
a PASID is in use. If a SVA PASID is relying on ATS to function then
changing the RID domain cannot just temporarily toggle ATS off without
also wrecking the SVA PASID. The new infrastructure here is organized so
that the PASID attach/detach flows will make use of it as well in
following patches.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 202 ++++++++++++++------
1 file changed, 144 insertions(+), 58 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4411706019f036..fb18d9d500aeba 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1538,7 +1538,8 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
}
static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master,
+ bool ats_enabled)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1561,7 +1562,7 @@ static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
STRTAB_STE_1_S1STALLD :
0) |
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
@@ -1589,7 +1590,8 @@ static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+ struct arm_smmu_domain *smmu_domain,
+ bool ats_enabled)
{
struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
const struct io_pgtable_cfg *pgtbl_cfg =
@@ -1605,7 +1607,7 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
target->data[1] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
FIELD_PREP(STRTAB_STE_1_SHCFG,
STRTAB_STE_1_SHCFG_INCOMING));
@@ -2455,22 +2457,16 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
}
-static void arm_smmu_enable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+static void arm_smmu_enable_ats(struct arm_smmu_master *master)
{
size_t stu;
struct pci_dev *pdev;
struct arm_smmu_device *smmu = master->smmu;
- /* Don't enable ATS at the endpoint if it's not enabled in the STE */
- if (!master->ats_enabled)
- return;
-
/* Smallest Translation Unit: log2 of the smallest supported granule */
stu = __ffs(smmu->pgsize_bitmap);
pdev = to_pci_dev(master->dev);
- atomic_inc(&smmu_domain->nr_ats_masters);
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
@@ -2479,22 +2475,6 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master,
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
-static void arm_smmu_disable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
-{
- if (!master->ats_enabled)
- return;
-
- pci_disable_ats(to_pci_dev(master->dev));
- /*
- * Ensure ATS is disabled at the endpoint before we issue the
- * ATC invalidation via the SMMU.
- */
- wmb();
- arm_smmu_atc_inv_master(master);
- atomic_dec(&smmu_domain->nr_ats_masters);
-}
-
static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
{
int ret;
@@ -2558,39 +2538,147 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
return NULL;
}
-static void arm_smmu_detach_dev(struct arm_smmu_master *master)
+/*
+ * If the domain uses the smmu_domain->devices list return the arm_smmu_domain
+ * structure, otherwise NULL. These domains track attached devices so they can
+ * issue invalidations.
+ */
+static struct arm_smmu_domain *
+to_smmu_domain_devices(struct iommu_domain *domain)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ /* The domain can be NULL only when processing the first attach */
+ if (!domain)
+ return NULL;
+ if (domain->type & __IOMMU_DOMAIN_PAGING)
+ return to_smmu_domain(domain);
+ return NULL;
+}
+
+static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
+ struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
- struct arm_smmu_domain *smmu_domain;
unsigned long flags;
- if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
+ if (!smmu_domain)
return;
- smmu_domain = to_smmu_domain(domain);
- arm_smmu_disable_ats(master, smmu_domain);
-
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
master_domain = arm_smmu_find_master_domain(smmu_domain, master);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
+ if (master->ats_enabled)
+ atomic_dec(&smmu_domain->nr_ats_masters);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
- master->ats_enabled = false;
+struct attach_state {
+ bool want_ats;
+ bool disable_ats;
+ struct iommu_domain *old_domain;
+};
+
+/*
+ * Prepare to attach a domain to a master. If disable_ats is not set this will
+ * turn on ATS if supported. smmu_domain can be NULL if the domain being
+ * attached does not have a page table and does not require invalidation
+ * tracking.
+ */
+static int arm_smmu_attach_prepare(struct arm_smmu_master *master,
+ struct iommu_domain *domain,
+ struct attach_state *state)
+{
+ struct arm_smmu_domain *smmu_domain =
+ to_smmu_domain_devices(domain);
+ struct arm_smmu_master_domain *master_domain;
+ unsigned long flags;
+
+ /*
+ * arm_smmu_share_asid() must not see two domains pointing to the same
+ * arm_smmu_master_domain contents otherwise it could randomly write one
+ * or the other to the CD.
+ */
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ state->want_ats = !state->disable_ats && arm_smmu_ats_supported(master);
+
+ if (smmu_domain) {
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
+ /*
+ * During prepare we want the current smmu_domain and new
+ * smmu_domain to be in the devices list before we change any
+ * HW. This ensures that both domains will send ATS
+ * invalidations to the master until we are done.
+ *
+ * It is tempting to make this list only track masters that are
+ * using ATS, but arm_smmu_share_asid() also uses this to change
+ * the ASID of a domain, unrelated to ATS.
+ *
+ * Notice if we are re-attaching the same domain then the list
+ * will have two identical entries and commit will remove only
+ * one of them.
+ */
+ spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+ if (state->want_ats)
+ atomic_inc(&smmu_domain->nr_ats_masters);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
+ spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ }
+
+ if (!state->want_ats && master->ats_enabled) {
+ pci_disable_ats(to_pci_dev(master->dev));
+ /*
+ * This is probably overkill, but the config write for disabling
+ * ATS should complete before the STE is configured to generate
+ * UR to avoid AER noise.
+ */
+ wmb();
+ }
+ return 0;
+}
+
+/*
+ * Commit is done after the STE/CD are configured with the EATS setting. It
+ * completes synchronizing the PCI device's ATC and finishes manipulating the
+ * smmu_domain->devices list.
+ */
+static void arm_smmu_attach_commit(struct arm_smmu_master *master,
+ struct attach_state *state)
+{
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (state->want_ats && !master->ats_enabled) {
+ arm_smmu_enable_ats(master);
+ } else if (master->ats_enabled) {
+ /*
+ * The translation has changed, flush the ATC. At this point the
+ * SMMU is translating for the new domain and both the old&new
+ * domain will issue invalidations.
+ */
+ arm_smmu_atc_inv_master(master);
+ }
+ master->ats_enabled = state->want_ats;
+
+ arm_smmu_remove_master_domain(master, state->old_domain);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
int ret = 0;
- unsigned long flags;
struct arm_smmu_ste target;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- struct arm_smmu_master_domain *master_domain;
+ struct attach_state state = {
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2627,11 +2715,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
- master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
- if (!master_domain)
- return -ENOMEM;
- master_domain->master = master;
-
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2640,13 +2723,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
*/
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_detach_dev(master);
-
- master->ats_enabled = arm_smmu_ats_supported(master);
-
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master_domain->devices_elm, &smmu_domain->devices);
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ ret = arm_smmu_attach_prepare(master, domain, &state);
+ if (ret) {
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
+ }
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1: {
@@ -2655,18 +2736,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master);
+ arm_smmu_make_cdtable_ste(&target, master, state.want_ats);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
case ARM_SMMU_DOMAIN_S2:
- arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+ arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
+ state.want_ats);
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
break;
}
- arm_smmu_enable_ats(master, smmu_domain);
+ arm_smmu_attach_commit(master, &state);
mutex_unlock(&arm_smmu_asid_lock);
return 0;
}
@@ -2695,10 +2777,13 @@ void arm_smmu_remove_pasid(struct arm_smmu_master *master,
arm_smmu_clear_cd(master, pasid);
}
-static int arm_smmu_attach_dev_ste(struct device *dev,
- struct arm_smmu_ste *ste)
+static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev, struct arm_smmu_ste *ste)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct attach_state state = {
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
if (arm_smmu_master_sva_enabled(master))
return -EBUSY;
@@ -2716,9 +2801,10 @@ static int arm_smmu_attach_dev_ste(struct device *dev,
* the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
* F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
*/
- arm_smmu_detach_dev(master);
-
+ state.disable_ats = true;
+ arm_smmu_attach_prepare(master, domain, &state);
arm_smmu_install_ste_for_dev(master, ste);
+ arm_smmu_attach_commit(master, &state);
mutex_unlock(&arm_smmu_asid_lock);
/*
@@ -2736,7 +2822,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_bypass_ste(&ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2754,7 +2840,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
--
2.43.2
* [PATCH v6 12/29] iommu/arm-smmu-v3: Move the CD generation for SVA into a function
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Pull all the calculations for building the CD table entry for a mm_struct
into arm_smmu_make_sva_cd().
Call it in the two places installing the SVA CD table entry.
Open code the last caller of arm_smmu_update_ctx_desc_devices() and remove
the function.
Remove arm_smmu_write_ctx_desc() since all callers are gone. Add the
locking assertions to arm_smmu_alloc_cd_ptr() since
arm_smmu_update_ctx_desc_devices() was the last problematic caller.
Remove quiet_cd since all users are gone; arm_smmu_make_sva_cd() creates
the same value.
The behavior of quiet_cd changes slightly: the old implementation edited
the CD in place to set CTXDESC_CD_0_TCR_EPD0 assuming it was a SVA CD
entry. This version generates a full CD entry with a 0 TTB0 and relies on
arm_smmu_write_cd_entry() to install it hitlessly.
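
A compact way to see the two kinds of entry is to look only at the bits that
control the TTB0 walk. The sketch below is an illustration under assumptions:
the ex_* helpers are hypothetical, the TG0 encodings (4K=0, 64K=1, 16K=2) and
the bit positions for TG0 and EPD0 are assumed rather than taken from the
headers, and all other CD fields are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define EX_TCR_TG0_SHIFT 6		/* assumed position of TG0 */
#define EX_TCR_EPD0      (1ull << 14)	/* assumed position of EPD0 */

/* Translate a CPU page size into the granule encoding used by the sketch. */
unsigned int ex_page_size_to_tg0(unsigned long page_size)
{
	assert(page_size == 4096 || page_size == 16384 || page_size == 65536);
	if (page_size == 65536)
		return 1;
	if (page_size == 16384)
		return 2;
	return 0;
}

/* T0SZ is the number of upper VA bits that are not translated. */
unsigned int ex_t0sz(unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits <= 64);
	return 64 - va_bits;
}

/*
 * Walk-control bits for an SVA entry: real walk parameters when an mm is
 * present, otherwise only EPD0 so that every TTB0 access faults.
 */
uint64_t ex_sva_walk_bits(int have_mm, unsigned int va_bits,
			  unsigned long page_size)
{
	if (!have_mm)
		return EX_TCR_EPD0;
	return (uint64_t)ex_t0sz(va_bits) |
	       ((uint64_t)ex_page_size_to_tg0(page_size) << EX_TCR_TG0_SHIFT);
}
```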
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 156 +++++++++++-------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 103 +-----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +-
3 files changed, 108 insertions(+), 158 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 7cf286f7a009fb..80a7d559ef2d3f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -34,25 +34,6 @@ struct arm_smmu_bond {
static DEFINE_MUTEX(sva_lock);
-/*
- * Write the CD to the CD tables for all masters that this domain is attached
- * to. Note that this is only used to update existing CD entries in the target
- * CD table, for which it's assumed that arm_smmu_write_ctx_desc can't fail.
- */
-static void arm_smmu_update_ctx_desc_devices(struct arm_smmu_domain *smmu_domain,
- int ssid,
- struct arm_smmu_ctx_desc *cd)
-{
- struct arm_smmu_master *master;
- unsigned long flags;
-
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
- arm_smmu_write_ctx_desc(master, ssid, cd);
- }
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
-}
-
static void
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
@@ -128,11 +109,86 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
return NULL;
}
+static u64 page_size_to_cd(void)
+{
+ static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K ||
+ PAGE_SIZE == SZ_64K);
+ if (PAGE_SIZE == SZ_64K)
+ return ARM_LPAE_TCR_TG0_64K;
+ if (PAGE_SIZE == SZ_16K)
+ return ARM_LPAE_TCR_TG0_16K;
+ return ARM_LPAE_TCR_TG0_4K;
+}
+
+static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct mm_struct *mm, u16 asid)
+{
+ u64 par;
+
+ memset(target, 0, sizeof(*target));
+
+ par = cpuid_feature_extract_unsigned_field(
+ read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1),
+ ID_AA64MMFR0_EL1_PARANGE_SHIFT);
+
+ target->data[0] = cpu_to_le64(
+ CTXDESC_CD_0_TCR_EPD1 |
+#ifdef __BIG_ENDIAN
+ CTXDESC_CD_0_ENDI |
+#endif
+ CTXDESC_CD_0_V |
+ FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par) |
+ CTXDESC_CD_0_AA64 |
+ (master->stall_enabled ? CTXDESC_CD_0_S : 0) |
+ CTXDESC_CD_0_R |
+ CTXDESC_CD_0_A |
+ CTXDESC_CD_0_ASET |
+ FIELD_PREP(CTXDESC_CD_0_ASID, asid));
+
+ /*
+ * If no MM is passed then this creates an SVA entry that faults
+ * everything. arm_smmu_write_cd_entry() can hitlessly go between these
+ * two entry types since TTB0 is ignored by HW when EPD0 is set.
+ */
+ if (mm) {
+ target->data[0] |= cpu_to_le64(
+ FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ,
+ 64ULL - vabits_actual) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_TG0, page_size_to_cd()) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0,
+ ARM_LPAE_TCR_RGN_WBWA) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0,
+ ARM_LPAE_TCR_RGN_WBWA) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS));
+
+ target->data[1] = cpu_to_le64(virt_to_phys(mm->pgd) &
+ CTXDESC_CD_1_TTB0_MASK);
+ } else {
+ target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_EPD0);
+
+ /*
+ * Disable stall and immediately generate an abort if stall
+ * disable is permitted. This speeds up cleanup for an unclean
+ * exit if the device is still doing a lot of DMA.
+ */
+ if (master->stall_enabled &&
+ !(master->smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
+ target->data[0] &=
+ cpu_to_le64(~(CTXDESC_CD_0_S | CTXDESC_CD_0_R));
+ }
+
+ /*
+ * MAIR value is pretty much constant and global, so we can just get it
+ * from the current CPU register
+ */
+ target->data[3] = cpu_to_le64(read_sysreg(mair_el1));
+}
+
static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
{
u16 asid;
int err = 0;
- u64 tcr, par, reg;
struct arm_smmu_ctx_desc *cd;
struct arm_smmu_ctx_desc *ret = NULL;
@@ -166,39 +222,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
if (err)
goto out_free_asid;
- tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - vabits_actual) |
- FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
- FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
- FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) |
- CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
-
- switch (PAGE_SIZE) {
- case SZ_4K:
- tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_4K);
- break;
- case SZ_16K:
- tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_16K);
- break;
- case SZ_64K:
- tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_64K);
- break;
- default:
- WARN_ON(1);
- err = -EINVAL;
- goto out_free_asid;
- }
-
- reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
- par = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_EL1_PARANGE_SHIFT);
- tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par);
-
- cd->ttbr = virt_to_phys(mm->pgd);
- cd->tcr = tcr;
- /*
- * MAIR value is pretty much constant and global, so we can just get it
- * from the current CPU register
- */
- cd->mair = read_sysreg(mair_el1);
cd->asid = asid;
cd->mm = mm;
@@ -276,6 +299,8 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_master *master;
+ unsigned long flags;
mutex_lock(&sva_lock);
if (smmu_mn->cleared) {
@@ -287,8 +312,19 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
* but disable translation.
*/
- arm_smmu_update_ctx_desc_devices(smmu_domain, mm_get_enqcmd_pasid(mm),
- &quiet_cd);
+ spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+ list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ struct arm_smmu_cd target;
+ struct arm_smmu_cd *cdptr;
+
+ cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm));
+ if (WARN_ON(!cdptr))
+ continue;
+ arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid);
+ arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr,
+ &target);
+ }
+ spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
@@ -383,6 +419,8 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
struct mm_struct *mm)
{
int ret;
+ struct arm_smmu_cd target;
+ struct arm_smmu_cd *cdptr;
struct arm_smmu_bond *bond;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -409,9 +447,13 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
goto err_free_bond;
}
- ret = arm_smmu_write_ctx_desc(master, pasid, bond->smmu_mn->cd);
- if (ret)
+ cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm));
+ if (!cdptr) {
+ ret = -ENOMEM;
goto err_put_notifier;
+ }
+ arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
+ arm_smmu_write_cd_entry(master, pasid, cdptr, &target);
list_add(&bond->list, &master->bonds);
return 0;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2bf55ed4e32ced..af5ebedf0f0beb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -89,12 +89,6 @@ struct arm_smmu_option_prop {
DEFINE_XARRAY_ALLOC1(arm_smmu_asid_xa);
DEFINE_MUTEX(arm_smmu_asid_lock);
-/*
- * Special value used by SVA when a process dies, to quiesce a CD without
- * disabling it.
- */
-struct arm_smmu_ctx_desc quiet_cd = { 0 };
-
static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
@@ -1206,7 +1200,7 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
CTXDESC_L1_DESC_V;
- /* See comment in arm_smmu_write_ctx_desc() */
+ /* The HW has 64 bit atomicity with stores to the L2 CD table */
WRITE_ONCE(*dst, cpu_to_le64(val));
}
@@ -1229,12 +1223,15 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
}
-static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid)
+struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
+ might_sleep();
+ iommu_group_mutex_assert(master->dev);
+
if (!cd_table->cdtab) {
if (arm_smmu_alloc_cd_tables(master))
return NULL;
@@ -1350,91 +1347,6 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
arm_smmu_write_cd_entry(master, ssid, cdptr, &target);
}
-static void arm_smmu_clean_cd_entry(struct arm_smmu_cd *target)
-{
- struct arm_smmu_cd used = {};
- int i;
-
- arm_smmu_get_cd_used(target->data, used.data);
- for (i = 0; i != ARRAY_SIZE(target->data); i++)
- target->data[i] &= used.data[i];
-}
-
-int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
- struct arm_smmu_ctx_desc *cd)
-{
- /*
- * This function handles the following cases:
- *
- * (1) Install primary CD, for normal DMA traffic (SSID = IOMMU_NO_PASID = 0).
- * (2) Install a secondary CD, for SID+SSID traffic.
- * (3) Update ASID of a CD. Atomically write the first 64 bits of the
- * CD, then invalidate the old entry and mappings.
- * (4) Quiesce the context without clearing the valid bit. Disable
- * translation, and ignore any translation fault.
- * (5) Remove a secondary CD.
- */
- u64 val;
- bool cd_live;
- struct arm_smmu_cd target;
- struct arm_smmu_cd *cdptr = &target;
- struct arm_smmu_cd *cd_table_entry;
- struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
- struct arm_smmu_device *smmu = master->smmu;
-
- if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
- return -E2BIG;
-
- cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid);
- if (!cd_table_entry)
- return -ENOMEM;
-
- target = *cd_table_entry;
- val = le64_to_cpu(cdptr->data[0]);
- cd_live = !!(val & CTXDESC_CD_0_V);
-
- if (!cd) { /* (5) */
- val = 0;
- } else if (cd == &quiet_cd) { /* (4) */
- if (!(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
- val &= ~(CTXDESC_CD_0_S | CTXDESC_CD_0_R);
- val |= CTXDESC_CD_0_TCR_EPD0;
- } else if (cd_live) { /* (3) */
- val &= ~CTXDESC_CD_0_ASID;
- val |= FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid);
- /*
- * Until CD+TLB invalidation, both ASIDs may be used for tagging
- * this substream's traffic
- */
- } else { /* (1) and (2) */
- cdptr->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
- cdptr->data[2] = 0;
- cdptr->data[3] = cpu_to_le64(cd->mair);
-
- val = cd->tcr |
-#ifdef __BIG_ENDIAN
- CTXDESC_CD_0_ENDI |
-#endif
- CTXDESC_CD_0_R | CTXDESC_CD_0_A |
- (cd->mm ? 0 : CTXDESC_CD_0_ASET) |
- CTXDESC_CD_0_AA64 |
- FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
- CTXDESC_CD_0_V;
-
- if (cd_table->stall_enabled)
- val |= CTXDESC_CD_0_S;
- }
- cdptr->data[0] = cpu_to_le64(val);
- /*
- * Since the above is updating the CD entry based on the current value
- * without zeroing unused bits it needs fixing before being passed to
- * the programming logic.
- */
- arm_smmu_clean_cd_entry(&target);
- arm_smmu_write_cd_entry(master, ssid, cd_table_entry, &target);
- return 0;
-}
-
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
{
int ret;
@@ -1443,7 +1355,6 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
- cd_table->stall_enabled = master->stall_enabled;
cd_table->s1cdmax = master->ssid_bits;
max_contexts = 1 << cd_table->s1cdmax;
@@ -1541,7 +1452,7 @@ arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
val |= FIELD_PREP(STRTAB_L1_DESC_SPAN, desc->span);
val |= desc->l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK;
- /* See comment in arm_smmu_write_ctx_desc() */
+ /* The HW has 64 bit atomicity with stores to the L2 STE table */
WRITE_ONCE(*dst, cpu_to_le64(val));
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d32da11058aab6..5aefb0ee2b9bb7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -608,8 +608,6 @@ struct arm_smmu_ctx_desc_cfg {
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
- /* Whether CD entries in this table have the stall bit set. */
- u8 stall_enabled:1;
};
struct arm_smmu_s2_cfg {
@@ -747,11 +745,12 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
-extern struct arm_smmu_ctx_desc quiet_cd;
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
+struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain);
@@ -759,8 +758,6 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
-int arm_smmu_write_ctx_desc(struct arm_smmu_master *smmu_master, int ssid,
- struct arm_smmu_ctx_desc *cd);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v6 20/29] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Currently the SVA domain is a naked struct iommu_domain; allocate a struct
arm_smmu_domain instead.
This is necessary to be able to use the struct arm_smmu_master_domain
mechanism.
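For readers less familiar with the embedding pattern this relies on, here
is a minimal userspace sketch. All names in it (base_domain,
driver_domain, to_driver_domain() and so on) are hypothetical stand-ins,
not the driver's types. It shows why the free path must convert back to
the containing structure before freeing once the core-visible domain is a
member of a larger allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the core's generic domain object. */
struct base_domain {
	int type;
};

/* Stand-in for the driver's domain, which embeds the generic one. */
struct driver_domain {
	int driver_state;
	struct base_domain domain;
};

/* Same idea as container_of(): step back from the member to its parent. */
static struct driver_domain *to_driver_domain(struct base_domain *dom)
{
	assert(dom != NULL);
	return (struct driver_domain *)((char *)dom -
					offsetof(struct driver_domain, domain));
}

/* Allocate the full driver object but hand out only the embedded part. */
static struct base_domain *driver_domain_alloc(int type)
{
	struct driver_domain *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->domain.type = type;
	return &d->domain;
}

/* Free the allocation that was actually made, not the embedded member. */
static void driver_domain_free(struct base_domain *dom)
{
	free(to_driver_domain(dom));
}
```

Freeing the member pointer directly would only be correct if the member
sat at offset zero, which is why the sketch always goes through the
conversion helper.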
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 21 +++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 38 ++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +-
3 files changed, 36 insertions(+), 27 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 24e7cf759bbc35..3e7aad0960bfd2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -637,7 +637,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
}
arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
if (ret) {
list_del(&bond->list);
arm_smmu_mmu_notifier_put(bond->smmu_mn);
@@ -651,7 +651,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(domain);
+ kfree(to_smmu_domain(domain));
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -659,14 +659,17 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
.free = arm_smmu_sva_domain_free
};
-struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+struct iommu_domain *arm_smmu_sva_domain_alloc(unsigned type)
{
- struct iommu_domain *domain;
+ struct arm_smmu_domain *smmu_domain;
- domain = kzalloc(sizeof(*domain), GFP_KERNEL);
- if (!domain)
- return NULL;
- domain->ops = &arm_smmu_sva_domain_ops;
+ if (type != IOMMU_DOMAIN_SVA)
+ return ERR_PTR(-EOPNOTSUPP);
- return domain;
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
+ smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
+
+ return &smmu_domain->domain;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0376c1bda8d8fa..9611ac239fea8c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2270,23 +2270,10 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
}
}
-static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
-{
-
- if (type == IOMMU_DOMAIN_SVA)
- return arm_smmu_sva_domain_alloc();
- return ERR_PTR(-EOPNOTSUPP);
-}
-
-static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
+struct arm_smmu_domain *arm_smmu_domain_alloc(void)
{
struct arm_smmu_domain *smmu_domain;
- /*
- * Allocate the domain and initialise some of its data structures.
- * We can't really do anything meaningful until we've added a
- * master.
- */
smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
if (!smmu_domain)
return ERR_PTR(-ENOMEM);
@@ -2296,6 +2283,23 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
spin_lock_init(&smmu_domain->devices_lock);
INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+ return smmu_domain;
+}
+
+static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
+{
+ struct arm_smmu_domain *smmu_domain;
+
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
+
+ /*
+ * Allocate the domain and initialise some of its data structures.
+ * We can't really do anything meaningful until we've added a
+ * master.
+ */
+
if (dev) {
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
int ret;
@@ -2309,7 +2313,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
return &smmu_domain->domain;
}
-static void arm_smmu_domain_free(struct iommu_domain *domain)
+static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
@@ -3273,7 +3277,7 @@ static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
- .domain_alloc = arm_smmu_domain_alloc,
+ .domain_alloc = arm_smmu_sva_domain_alloc,
.domain_alloc_paging = arm_smmu_domain_alloc_paging,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
@@ -3295,7 +3299,7 @@ static struct iommu_ops arm_smmu_ops = {
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys = arm_smmu_iova_to_phys,
.enable_nesting = arm_smmu_enable_nesting,
- .free = arm_smmu_domain_free,
+ .free = arm_smmu_domain_free_paging,
}
};
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3da131e0173e1f..9db84d5940466a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -757,6 +757,8 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
+struct arm_smmu_domain *arm_smmu_domain_alloc(void);
+
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
@@ -789,7 +791,7 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
-struct iommu_domain *arm_smmu_sva_domain_alloc(void);
+struct iommu_domain *arm_smmu_sva_domain_alloc(unsigned int type);
void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
--
2.43.2
* [PATCH v6 29/29] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The SVA cleanup made the SSID logic entirely general so all we need to do
is call it with the correct cd table entry for a S1 domain.
This is slightly tricky because of the ASID and how the locking works. The
simple fix is to just update the ASID once we get the right locks.
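The ASID fix-up amounts to a clear-then-set of one bitfield in word 0 of
the CD. A standalone sketch of that step, working on a CPU-endian value
rather than the __le64 the driver stores, might look like the following.
The helper name cd_fixup_asid() is invented for this example and the
field position (bits 63:48) is an assumption to be checked against
CTXDESC_CD_0_ASID.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the ASID field in CD word 0 (bits 63:48). */
#define CD0_ASID_SHIFT	48
#define CD0_ASID_MASK	(UINT64_C(0xffff) << CD0_ASID_SHIFT)

/*
 * Hypothetical helper: replace whatever ASID the caller put in word0 with
 * the value read under the lock, leaving all other bits untouched.
 */
static uint64_t cd_fixup_asid(uint64_t word0, uint16_t current_asid)
{
	word0 &= ~CD0_ASID_MASK;
	word0 |= (uint64_t)current_asid << CD0_ASID_SHIFT;
	assert((word0 & CD0_ASID_MASK) >> CD0_ASID_SHIFT == current_asid);
	return word0;
}
```

Because the caller-built value is overwritten unconditionally, it does not
matter whether the ASID it read without the lock was stale.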
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 45 +++++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
2 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f87525225c8a50..59f24602e24d68 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1332,8 +1332,6 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr =
&pgtbl_cfg->arm_lpae_s1_cfg.tcr;
- lockdep_assert_held(&master->smmu->asid_lock);
-
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
@@ -2765,6 +2763,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_cd target_cd;
+ int ret = 0;
+
+ mutex_lock(&smmu_domain->init_mutex);
+ if (!smmu_domain->smmu)
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu);
+ else if (smmu_domain->smmu != smmu)
+ ret = -EINVAL;
+ mutex_unlock(&smmu_domain->init_mutex);
+ if (ret)
+ return ret;
+
+ if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+ return -EINVAL;
+
+ /*
+ * We can read cd.asid outside the lock because arm_smmu_set_pasid()
+ * will fix it
+ */
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ return arm_smmu_set_pasid(master, to_smmu_domain(domain), id,
+ &target_cd);
+}
+
static void arm_smmu_update_ste(struct arm_smmu_master *master,
struct iommu_domain *sid_domain,
bool want_ats)
@@ -2792,7 +2820,7 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd)
+ struct arm_smmu_cd *cd)
{
struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct attach_state state = {
@@ -2824,6 +2852,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (ret)
goto out_unlock;
+ /*
+ * We don't want to obtain the asid_lock too early, so fix up the
+ * caller-set ASID under the lock in case it changed.
+ */
+ cd->data[0] &= ~cpu_to_le64(CTXDESC_CD_0_ASID);
+ cd->data[0] |= cpu_to_le64(
+ FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->cd.asid));
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
arm_smmu_update_ste(master, sid_domain, state.want_ats);
@@ -2840,7 +2876,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
struct arm_smmu_domain *smmu_domain;
struct iommu_domain *domain;
- domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
+ domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
if (WARN_ON(IS_ERR(domain)) || !domain)
return;
@@ -3362,6 +3398,7 @@ static struct iommu_ops arm_smmu_ops = {
.owner = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = arm_smmu_attach_dev,
+ .set_dev_pasid = arm_smmu_s1_set_dev_pasid,
.map_pages = arm_smmu_map_pages,
.unmap_pages = arm_smmu_unmap_pages,
.flush_iotlb_all = arm_smmu_flush_iotlb_all,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 853cd17d06e671..890fc6628d5e0b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -768,7 +768,7 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd);
+ struct arm_smmu_cd *cd);
void arm_smmu_remove_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
--
2.43.2
* [PATCH v6 14/29] iommu/arm-smmu-v3: Start building a generic PASID layer
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by
callers that already constructed the arm_smmu_cd they wish to program.
These functions will encapsulate the shared logic to set up a CD entry that
will be shared by SVA and S1 domain cases.
Prior fixes had already moved most of this logic up into
__arm_smmu_sva_bind(); move it to its final home.
Following patches will relieve some of the remaining SVA restrictions:
- The RID domain is a S1 domain and has already set up the STE to point to
the CD table
- The programmed PASID is the mm_get_enqcmd_pasid()
- Nothing changes while SVA is running (sva_enable)
SVA invalidation will still iterate over the S1 domain's master list;
later patches will resolve that.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++-
3 files changed, 67 insertions(+), 31 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 80a7d559ef2d3f..095d11df2a1966 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -415,29 +415,27 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
arm_smmu_free_shared_cd(cd);
}
-static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
- struct mm_struct *mm)
+static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
+ struct mm_struct *mm)
{
int ret;
- struct arm_smmu_cd target;
- struct arm_smmu_cd *cdptr;
struct arm_smmu_bond *bond;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct arm_smmu_domain *smmu_domain;
if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
smmu_domain = to_smmu_domain(domain);
if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
if (!master || !master->sva_enabled)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
bond = kzalloc(sizeof(*bond), GFP_KERNEL);
if (!bond)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
bond->mm = mm;
@@ -447,22 +445,12 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
goto err_free_bond;
}
- cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm));
- if (!cdptr) {
- ret = -ENOMEM;
- goto err_put_notifier;
- }
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, pasid, cdptr, &target);
-
list_add(&bond->list, &master->bonds);
- return 0;
+ return bond;
-err_put_notifier:
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
err_free_bond:
kfree(bond);
- return ret;
+ return ERR_PTR(ret);
}
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
@@ -609,10 +597,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct arm_smmu_bond *bond = NULL, *t;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
+
mutex_lock(&sva_lock);
-
- arm_smmu_clear_cd(master, id);
-
list_for_each_entry(t, &master->bonds, list) {
if (t->mm == mm) {
bond = t;
@@ -631,17 +618,33 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
- int ret = 0;
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct mm_struct *mm = domain->mm;
+ struct arm_smmu_bond *bond;
+ struct arm_smmu_cd target;
+ int ret;
if (mm_get_enqcmd_pasid(mm) != id)
return -EINVAL;
mutex_lock(&sva_lock);
- ret = __arm_smmu_sva_bind(dev, id, mm);
- mutex_unlock(&sva_lock);
+ bond = __arm_smmu_sva_bind(dev, mm);
+ if (IS_ERR(bond)) {
+ mutex_unlock(&sva_lock);
+ return PTR_ERR(bond);
+ }
- return ret;
+ arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
+ ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ if (ret) {
+ list_del(&bond->list);
+ arm_smmu_mmu_notifier_put(bond->smmu_mn);
+ kfree(bond);
+ mutex_unlock(&sva_lock);
+ return ret;
+ }
+ mutex_unlock(&sva_lock);
+ return 0;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 49e51bc1a5c788..3922478799e130 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1223,8 +1223,8 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
}
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid)
+static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -2417,6 +2417,10 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
int i, j;
struct arm_smmu_device *smmu = master->smmu;
+ master->cd_table.in_ste =
+ FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
+ STRTAB_STE_0_CFG_S1_TRANS;
+
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
struct arm_smmu_ste *step =
@@ -2637,6 +2641,30 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd)
+{
+ struct arm_smmu_cd *cdptr;
+
+ /* The core code validates pasid */
+
+ if (!master->cd_table.in_ste)
+ return -ENODEV;
+
+ cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
+ if (!cdptr)
+ return -ENOMEM;
+ arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ return 0;
+}
+
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+{
+ arm_smmu_clear_cd(master, pasid);
+}
+
static int arm_smmu_attach_dev_ste(struct device *dev,
struct arm_smmu_ste *ste)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7bafec4c0c2fac..a3b94b839ee927 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,6 +602,7 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
@@ -746,8 +747,6 @@ extern struct mutex arm_smmu_asid_lock;
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain);
@@ -755,6 +754,12 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd);
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
+
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 05/29] iommu/arm-smmu-v3: Add a type for the CD entry
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Instead of passing a naked __le64 * around to represent a CD table entry,
wrap it in a "struct arm_smmu_cd" with an array of the correct size. This
makes it much clearer which functions will comprise the "CD API".
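As a rough illustration of why the typed wrapper helps (a standalone sketch, not the driver code: struct cd_entry, cd_ptr_raw() and cd_ptr_typed() are hypothetical names, and uint64_t stands in for __le64), the element type carries the stride so callers no longer scale the index by hand:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CD_DWORDS 8 /* same value as CTXDESC_CD_DWORDS in the patch */

/* Stand-in for struct arm_smmu_cd; uint64_t replaces the kernel's __le64. */
struct cd_entry {
	uint64_t data[CD_DWORDS];
};

/* The wrapper must not change the in-memory layout of the table. */
static_assert(sizeof(struct cd_entry) == CD_DWORDS * sizeof(uint64_t),
	      "cd_entry is expected to have no padding");

/* Old style: every caller has to remember the CD_DWORDS scaling. */
static uint64_t *cd_ptr_raw(uint64_t *table, unsigned int ssid)
{
	return table + (size_t)ssid * CD_DWORDS;
}

/* New style: plain array indexing, the compiler applies the stride. */
static struct cd_entry *cd_ptr_typed(struct cd_entry *table, unsigned int ssid)
{
	return &table[ssid];
}
```

Both helpers resolve to the same address for a given ssid; only the second makes it a compile-time type error to pass some unrelated 64-bit pointer as a CD.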
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++-
2 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7480f70701a045..6f62a38052b504 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1214,7 +1214,8 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
WRITE_ONCE(*dst, cpu_to_le64(val));
}
-static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
+static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
__le64 *l1ptr;
unsigned int idx;
@@ -1223,7 +1224,8 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
- return cd_table->cdtab + ssid * CTXDESC_CD_DWORDS;
+ return (struct arm_smmu_cd *)(cd_table->cdtab +
+ ssid * CTXDESC_CD_DWORDS);
idx = ssid >> CTXDESC_SPLIT;
l1_desc = &cd_table->l1_desc[idx];
@@ -1237,7 +1239,7 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
arm_smmu_sync_cd(master, ssid, false);
}
idx = ssid & (CTXDESC_L2_ENTRIES - 1);
- return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
+ return &l1_desc->l2ptr[idx];
}
int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
@@ -1256,7 +1258,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
*/
u64 val;
bool cd_live;
- __le64 *cdptr;
+ struct arm_smmu_cd *cdptr;
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1267,7 +1269,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
if (!cdptr)
return -ENOMEM;
- val = le64_to_cpu(cdptr[0]);
+ val = le64_to_cpu(cdptr->data[0]);
cd_live = !!(val & CTXDESC_CD_0_V);
if (!cd) { /* (5) */
@@ -1284,9 +1286,9 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
* this substream's traffic
*/
} else { /* (1) and (2) */
- cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
- cdptr[2] = 0;
- cdptr[3] = cpu_to_le64(cd->mair);
+ cdptr->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
+ cdptr->data[2] = 0;
+ cdptr->data[3] = cpu_to_le64(cd->mair);
/*
* STE may be live, and the SMMU might read dwords of this CD in any
@@ -1318,7 +1320,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
* field within an aligned 64-bit span of a structure can be altered
* without first making the structure invalid.
*/
- WRITE_ONCE(cdptr[0], cpu_to_le64(val));
+ WRITE_ONCE(cdptr->data[0], cpu_to_le64(val));
arm_smmu_sync_cd(master, ssid, true);
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 23baf117e7e4b5..7078ed569fd4d3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -282,6 +282,11 @@ struct arm_smmu_ste {
#define CTXDESC_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 12)
#define CTXDESC_CD_DWORDS 8
+
+struct arm_smmu_cd {
+ __le64 data[CTXDESC_CD_DWORDS];
+};
+
#define CTXDESC_CD_0_TCR_T0SZ GENMASK_ULL(5, 0)
#define CTXDESC_CD_0_TCR_TG0 GENMASK_ULL(7, 6)
#define CTXDESC_CD_0_TCR_IRGN0 GENMASK_ULL(9, 8)
@@ -591,7 +596,7 @@ struct arm_smmu_ctx_desc {
};
struct arm_smmu_l1_ctx_desc {
- __le64 *l2ptr;
+ struct arm_smmu_cd *l2ptr;
dma_addr_t l2ptr_dma;
};
--
2.43.2
* [PATCH v6 27/29] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The HW supports this; use the S1DSS bits to configure the behavior
of SSID=0, which is the RID's translation.
If SSIDs are currently being used in the CD table then just update the
S1DSS bits in the STE, remove the master_domain and leave ATS alone.
For iommufd the driver design has a small problem that all the unused CD
table entries are set with V=0 which will generate an event if VFIO
userspace tries to use the CD entry. This patch extends this problem to
include the RID as well if PASID is being used.
For BLOCKED with used PASIDs the
F_STREAM_DISABLED (STRTAB_STE_1_S1DSS_TERMINATE) event is generated on
untagged traffic and a substream CD table entry with V=0 (removed pasid)
will generate C_BAD_CD. Arguably there is no advantage to using S1DSS over
the CD entry 0 with V=0.
As we don't yet support PASID in iommufd this is a problem to resolve
later, possibly by using EPD0 for unused CD table entries instead of V=0,
and not using S1DSS for BLOCKED.
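The decision described above can be summarised in a small standalone sketch. It is only a model of the flow in this patch: the enum and struct names are hypothetical, the enum values are not meant to match the hardware field encodings, and the real code builds a full STE rather than returning a summary.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; values do not reflect the STE encoding. */
enum s1dss { S1DSS_TERMINATE, S1DSS_BYPASS, S1DSS_SSID0 };
enum rid_domain { RID_BLOCKED, RID_IDENTITY };
enum ste_kind { STE_ABORT, STE_BYPASS, STE_CDTABLE };

struct ste_choice {
	enum ste_kind kind;
	enum s1dss s1dss;          /* only meaningful for STE_CDTABLE */
	bool ats_may_stay_enabled; /* false models state.disable_ats = true */
};

/*
 * Pick the STE to install when the RID is attached to IDENTITY or BLOCKED.
 * If PASIDs are still using the CD table, keep a cdtable STE and express
 * the RID behaviour through S1DSS; otherwise use the plain bypass/abort STE.
 */
static struct ste_choice choose_ste(enum rid_domain rid, bool ssids_in_use)
{
	struct ste_choice c = { STE_ABORT, S1DSS_TERMINATE, false };

	assert(rid == RID_BLOCKED || rid == RID_IDENTITY);

	if (ssids_in_use) {
		c.kind = STE_CDTABLE;
		c.s1dss = (rid == RID_IDENTITY) ? S1DSS_BYPASS : S1DSS_TERMINATE;
		c.ats_may_stay_enabled = true;
	} else {
		c.kind = (rid == RID_IDENTITY) ? STE_BYPASS : STE_ABORT;
	}
	return c;
}
```

Keeping the choice in one function mirrors how arm_smmu_attach_dev_ste() takes the desired S1DSS value from its two callers.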
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 65 +++++++++++++++------
1 file changed, 47 insertions(+), 18 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 5a2c6d099008ed..69b628c4aaacdf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1003,6 +1003,14 @@ static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
STRTAB_STE_1_EATS);
used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
+
+ /*
+ * See 13.5 Summary of attribute/permission configuration fields
+ * for the SHCFG behavior.
+ */
+ if (FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent[1])) ==
+ STRTAB_STE_1_S1DSS_BYPASS)
+ used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
}
/* S2 translates */
@@ -1531,7 +1539,7 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- bool ats_enabled)
+ bool ats_enabled, unsigned int s1dss)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1545,7 +1553,7 @@ static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
target->data[1] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+ FIELD_PREP(STRTAB_STE_1_S1DSS, s1dss) |
FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
@@ -1556,6 +1564,10 @@ static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_1_EATS,
ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ if (s1dss == STRTAB_STE_1_S1DSS_BYPASS)
+ target->data[1] |= cpu_to_le64(FIELD_PREP(
+ STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
* To support BTM the streamworld needs to match the
@@ -2732,7 +2744,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master, state.want_ats);
+ arm_smmu_make_cdtable_ste(&target, master, state.want_ats,
+ STRTAB_STE_1_S1DSS_SSID0);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
@@ -2809,8 +2822,10 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
mutex_unlock(&master->smmu->asid_lock);
}
-static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
- struct device *dev, struct arm_smmu_ste *ste)
+static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev,
+ struct arm_smmu_ste *ste,
+ unsigned int s1dss)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct attach_state state = {
@@ -2818,9 +2833,6 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.ssid = IOMMU_NO_PASID,
};
- if (arm_smmu_ssids_in_use(&master->cd_table))
- return -EBUSY;
-
/*
* Do not allow any ASID to be changed while are working on the STE,
* otherwise we could miss invalidations.
@@ -2828,14 +2840,29 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
mutex_lock(&master->smmu->asid_lock);
/*
- * The SMMU does not support enabling ATS with bypass/abort. When the
- * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
- * and Translated transactions are denied as though ATS is disabled for
- * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
- * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
+ * If the CD table is not in use we can use the provided STE, otherwise
+ * we use a cdtable STE with the provided S1DSS.
*/
- state.disable_ats = true;
- arm_smmu_attach_prepare(master, domain, &state);
+ if (arm_smmu_ssids_in_use(&master->cd_table)) {
+ /*
+ * If a CD table has to be present then we need to run with ATS
+ * on even though the RID will fail ATS queries with UR. This is
+ * because we have no idea what the PASID's need.
+ */
+ arm_smmu_attach_prepare(master, domain, &state);
+ arm_smmu_make_cdtable_ste(ste, master, state.want_ats, s1dss);
+ } else {
+ /*
+ * The SMMU does not support enabling ATS with bypass/abort.
+ * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
+ * Translation Requests and Translated transactions are denied
+ * as though ATS is disabled for the stream (STE.EATS == 0b00),
+ * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events
+ * (IHI0070Ea 5.2 Stream Table Entry).
+ */
+ state.disable_ats = true;
+ arm_smmu_attach_prepare(master, domain, &state);
+ }
arm_smmu_install_ste_for_dev(master, ste);
arm_smmu_attach_commit(master, &state);
mutex_unlock(&master->smmu->asid_lock);
@@ -2846,7 +2873,6 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
* descriptor from arm_smmu_share_asid().
*/
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
- return 0;
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
@@ -2855,7 +2881,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_bypass_ste(&ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2873,7 +2900,9 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_TERMINATE);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
--
2.43.2
* [PATCH v6 24/29] iommu/arm-smmu-v3: Consolidate freeing the ASID/VMID
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The SMMUv3 IOTLB is tagged with a VMID/ASID cache tag. Any time the
underlying translation is changed these need to be invalidated. At boot
time the IOTLB starts out empty and all cache tags are available for
allocation.
When a tag is taken out of the allocator the code assumes the IOTLB
doesn't reference it, and immediately programs it into a STE/CD. If the
cache is referencing the tag then it will have stale data and the IOMMU
will become incoherent.
Thus, whenever an ASID/VMID is freed back to the allocator we need to know
that the IOTLB doesn't have any references to it.
Invalidation is a bit inconsistent: the SVA code open codes an
invalidation prior to freeing, while the paging code does it by running
through:
  arm_smmu_domain_free()
    free_io_pgtable_ops()
      io_pgtable_tlb_flush_all()
        arm_smmu_tlb_inv_context()
Make arm_smmu_tlb_inv_context() able to invalidate all the domain types
and call it from a new arm_smmu_domain_free_id() which also puts back the
ID. Lightly reorganize things so arm_smmu_domain_free_id() is the only
place that does the final flush prior to ASID/VMID free and that
arm_smmu_tlb_inv_context() provides full invalidation for both
arm_smmu_flush_iotlb_all() and arm_smmu_domain_free_id().
Remove the iommu_flush_ops::tlb_flush_all to avoid duplicate invalidation
on free. Nothing else calls this besides free_io_pgtable_ops().
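The invariant being enforced can be shown with a toy model (a sketch under simplifying assumptions: struct tag_model and the tag_*() helpers are hypothetical, and a per-tag flag stands in for "the IOTLB may hold entries with this tag"). The point is that there is exactly one free path and it always invalidates before handing the tag back:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TAGS 8

struct tag_model {
	bool allocated[NUM_TAGS];
	bool cached[NUM_TAGS]; /* cache may hold entries using this tag */
};

/* Allocation relies on the invariant: a free tag is never cached. */
static int tag_alloc(struct tag_model *m)
{
	for (int i = 1; i < NUM_TAGS; i++) { /* tag 0 is treated as "none" */
		if (!m->allocated[i]) {
			assert(!m->cached[i]);
			m->allocated[i] = true;
			return i;
		}
	}
	return -1;
}

/* Translating with a tag may populate the cache. */
static void tag_use(struct tag_model *m, int tag)
{
	assert(tag > 0 && tag < NUM_TAGS && m->allocated[tag]);
	m->cached[tag] = true;
}

/* Models a full invalidation of everything carrying the tag. */
static void tag_invalidate(struct tag_model *m, int tag)
{
	m->cached[tag] = false;
}

/* The only free path: flush first, then return the tag to the allocator. */
static void tag_free(struct tag_model *m, int tag)
{
	if (tag <= 0 || tag >= NUM_TAGS)
		return;
	tag_invalidate(m, tag);
	m->allocated[tag] = false;
}
```

If any caller returned a tag without the flush, the assert in tag_alloc() would fire on reuse, which is the toy equivalent of the stale-IOTLB problem described above.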
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 9 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 68 ++++++++++++-------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
3 files changed, 46 insertions(+), 32 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index fd0b9f230f89e3..9ec1a5869ac3b2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -367,18 +367,13 @@ static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- /*
- * Ensure the ASID is empty in the iommu cache before allowing reuse.
- */
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
-
/*
* Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
* still be called/running at this point. We allow the ASID to be
* reused, and if there is a race then it just suffers harmless
* unnecessary invalidation.
*/
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+ arm_smmu_domain_free_id(smmu_domain);
/*
* Actual free is defered to the SRCU callback
@@ -423,7 +418,7 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
return &smmu_domain->domain;
err_asid:
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+ arm_smmu_domain_free_id(smmu_domain);
err_free:
kfree(smmu_domain);
return ERR_PTR(ret);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a82a5e4a13bb44..888972c97f56e1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2057,27 +2057,19 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
}
/* IO_PGTABLE API */
-static void arm_smmu_tlb_inv_context(void *cookie)
+static void arm_smmu_tlb_inv_context(struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_cmdq_ent cmd;
- /*
- * NOTE: when io-pgtable is in non-strict mode, we may get here with
- * PTEs previously cleared by unmaps on the current CPU not yet visible
- * to the SMMU. We are relying on the dma_wmb() implicit during cmd
- * insertion to guarantee those are observed before the TLBI. Do be
- * careful, 007.
- */
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1 ||
+ smmu_domain->domain.type == IOMMU_DOMAIN_SVA)) {
arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
- } else {
- cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
- cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
+ } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2) {
+ cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
+ cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
- arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
@@ -2211,7 +2203,6 @@ static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
}
static const struct iommu_flush_ops arm_smmu_flush_ops = {
- .tlb_flush_all = arm_smmu_tlb_inv_context,
.tlb_flush_walk = arm_smmu_tlb_inv_walk,
.tlb_add_page = arm_smmu_tlb_inv_page_nosync,
};
@@ -2275,25 +2266,42 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
return &smmu_domain->domain;
}
-static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
+/*
+ * Return the domain's ASID or VMID back to the allocator. No ID in the
+ * allocator has any IOTLB entries referencing it.
+ */
+void arm_smmu_domain_free_id(struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
- free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+ arm_smmu_tlb_inv_context(smmu_domain);
- /* Free the ASID or VMID */
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1 ||
+ smmu_domain->domain.type == IOMMU_DOMAIN_SVA) &&
+ smmu_domain->cd.asid) {
/* Prevent SVA from touching the CD while we're freeing it */
mutex_lock(&arm_smmu_asid_lock);
xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
mutex_unlock(&arm_smmu_asid_lock);
- } else {
- struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
- if (cfg->vmid)
- ida_free(&smmu->vmid_map, cfg->vmid);
+ } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 &&
+ smmu_domain->s2_cfg.vmid) {
+ ida_free(&smmu->vmid_map, smmu_domain->s2_cfg.vmid);
}
+}
+static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+ /*
+ * At this point the page table is not programmed into any STE/CD and
+ * there is no possible concurrent HW walker running due to prior STE/CD
+ * invalidations. However entries tagged with the ASID/VMID may still be
+ * in the IOTLB. Invalidating the IOTLB should fully serialize any
+ * concurrent dirty bit write back before freeing the PTE memory.
+ */
+ arm_smmu_domain_free_id(smmu_domain);
+ free_io_pgtable_ops(smmu_domain->pgtbl_ops);
kfree(smmu_domain);
}
@@ -2914,8 +2922,18 @@ static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- if (smmu_domain->smmu)
+ /*
+ * NOTE: when io-pgtable is in non-strict mode, we may get here with
+ * PTEs previously cleared by unmaps on the current CPU not yet visible
+ * to the SMMU. We are relying on the dma_wmb() implicit during cmd
+ * insertion to guarantee those are observed before the TLBI. Do be
+ * careful, 007.
+ */
+
+ if (smmu_domain->smmu) {
arm_smmu_tlb_inv_context(smmu_domain);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
+ }
}
static void arm_smmu_iotlb_sync(struct iommu_domain *domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3516869954ea33..a711a659576a95 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -772,6 +772,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
void arm_smmu_remove_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
+void arm_smmu_domain_free_id(struct arm_smmu_domain *smmu_domain);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
--
2.43.2
* [PATCH v6 19/29] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Allow creating and managing arm_smmu_master_domain's with a non-zero SSID
through the arm_smmu_attach_*() family of functions. This triggers ATC
invalidation for the correct SSID in PASID cases and tracks the
per-attachment SSID in the struct arm_smmu_master_domain.
Generalize arm_smmu_attach_remove() to be able to remove SSID's as well by
ensuring the ATC for the PASID is flushed properly.
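To make the per-attachment tracking concrete, here is a simplified standalone model (all names are hypothetical; the driver uses a linked list protected by devices_lock rather than a fixed array). Attachments are keyed by the pair (master, ssid), so removing a PASID attachment leaves the RID attachment of the same master untouched:

```c
#include <assert.h>
#include <stddef.h>

#define NO_PASID 0u /* stand-in for IOMMU_NO_PASID */
#define MAX_ATTACH 8

struct attachment {
	int master_id;     /* hypothetical handle for a device */
	unsigned int ssid; /* NO_PASID for the RID attachment */
	int in_use;
};

struct domain_model {
	struct attachment att[MAX_ATTACH];
};

/* Look up by (master, ssid), as the lookup does once ssid is threaded in. */
static struct attachment *find_attachment(struct domain_model *d,
					  int master_id, unsigned int ssid)
{
	for (size_t i = 0; i < MAX_ATTACH; i++) {
		struct attachment *a = &d->att[i];

		if (a->in_use && a->master_id == master_id && a->ssid == ssid)
			return a;
	}
	return NULL;
}

/* Returns 0 on success, -1 if the table is full. */
static int add_attachment(struct domain_model *d, int master_id,
			  unsigned int ssid)
{
	assert(d != NULL);
	for (size_t i = 0; i < MAX_ATTACH; i++) {
		if (!d->att[i].in_use) {
			d->att[i] = (struct attachment){ master_id, ssid, 1 };
			return 0;
		}
	}
	return -1;
}

/* Remove only the attachment matching both master and ssid. */
static void remove_attachment(struct domain_model *d, int master_id,
			      unsigned int ssid)
{
	struct attachment *a = find_attachment(d, master_id, ssid);

	if (a)
		a->in_use = 0;
}
```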
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++++++++++-------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a77467f8837515..0376c1bda8d8fa 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2003,13 +2003,14 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
cmd->atc.size = log2_span;
}
-static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
+static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
+ ioasid_t ssid)
{
int i;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;
- arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
+ arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
cmds.num = 0;
for (i = 0; i < master->num_streams; i++) {
@@ -2500,7 +2501,7 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
- arm_smmu_atc_inv_master(master);
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
if (pci_enable_ats(pdev, stu))
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
@@ -2587,7 +2588,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
}
static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
- struct iommu_domain *domain)
+ struct iommu_domain *domain,
+ ioasid_t ssid)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
@@ -2597,8 +2599,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master,
- IOMMU_NO_PASID);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
@@ -2612,6 +2613,7 @@ struct attach_state {
bool want_ats;
bool disable_ats;
struct iommu_domain *old_domain;
+ ioasid_t ssid;
};
/*
@@ -2643,6 +2645,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_master *master,
if (!master_domain)
return -ENOMEM;
master_domain->master = master;
+ master_domain->ssid = state->ssid;
/*
* During prepare we want the current smmu_domain and new
@@ -2695,11 +2698,14 @@ static void arm_smmu_attach_commit(struct arm_smmu_master *master,
* SMMU is translating for the new domain and both the old&new
* domain will issue invalidations.
*/
- arm_smmu_atc_inv_master(master);
+ if (state->want_ats)
+ arm_smmu_atc_inv_master(master, state->ssid);
+ else
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
}
master->ats_enabled = state->want_ats;
- arm_smmu_remove_master_domain(master, state->old_domain);
+ arm_smmu_remove_master_domain(master, state->old_domain, state->ssid);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2711,6 +2717,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct attach_state state = {
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2807,6 +2814,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct attach_state state = {
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
if (arm_smmu_ssids_in_use(&master->cd_table))
--
2.43.2
* [PATCH v6 00/29] Update SMMUv3 to the modern iommu API (part 2/3)
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
Continuing the work of part 1 this focuses on the CD, PASID and SVA
components:
- attach_dev failure does not change the HW configuration.
- Full PASID API support including:
    - S1/SVA domains attached to PASIDs
    - IDENTITY/BLOCKED/S1 attached to RID
    - Change of the RID domain while PASIDs are attached
- Streamlined SVA support using the core infrastructure
- Hitless, whenever possible, change between two domains
Making the CD programming work like the new STE programming allows
untangling some of the confusing SVA flows. From there the focus is on
building out the core infrastructure for dealing with PASID and CD
entries, then keeping track of unique SSID's for ATS invalidation.
The ATS ordering is generalized so that the PASID flow can use it and put
into a form where it is fully hitless, whenever possible. Care is taken to
ensure that ATC flushes are present after any change in translation.
Finally we simply kill the entire outdated SVA mmu_notifier implementation
in one shot and switch it over to the newly created generic PASID & CD
code. This avoids the messy and confusing approach of trying to
incrementally untangle this in place. The new code is small and simple
enough this is much better than trying to figure out smaller steps.
Once SVA is resting on the right CD code it is straightforward to make the
PASID interface functionally complete.
It achieves the same goals as the several series from Michael and the S1DSS
series from Nicolin that were trying to improve portions of the API.
This is on github:
https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
v6:
- Rebase on v6.9-rc1
- Remove arm_smmu_entry_writer_ops->num_entry_qwords and just use
NUM_ENTRY_QWORDS for CD & STE in all places
- Split arm_smmu_get_cd_ptr() into arm_smmu_alloc_cd_ptr() and call the
allocation one only from attach paths
- Remove cd_table.used_sid
- Fix order of EPD1 in arm_smmu_write_cd_entry
- Do not double invalidate during domain free by removing
iommu_flush_ops->tlb_flush_all. Consolidate all the ID invalidation
code to arm_smmu_tlb_inv_context()
- Use the right old domain in arm_smmu_attach_commit() for PASID cases
- Rename arm_smmu_domain_free() to arm_smmu_domain_free_paging()
- ssid should be an ioasid_t not a u16
- Scrub the CD target instead of the no_used_check temporary
- Reorder the patches around arm_smmu_alloc/get_cd_ptr()
- Add the temporary arm_smmu_write_cd_entry() calls to
arm_smmu_sva_set_dev_pasid() with error handling
- Move some of the hunks for the in_set/etc tracking around so that
use_sid can go away. Use in_ste instead of arm_smmu_is_s1_domain()
v5: https://lore.kernel.org/r/0-v5-9a37e0c884ce+31e3-smmuv3_newapi_p2_jgg@nvidia.com
- Rebase on v6.8-rc7 & Will's tree
- Accommodate the SVA rc patch removing the master list iteration
- Move the kfree(to_smmu_domain(domain)) hunk to the right patch
- Move S1DSS get_used hunk to "Allow IDENTITY/BLOCKED to be set while
PASID is used"
v4: https://lore.kernel.org/r/0-v4-e7091cdd9e8d+43b1-smmuv3_newapi_p2_jgg@nvidia.com
- Rebase on v6.8-rc1, adjust to use mm_get_enqcmd_pasid() and eventually
remove all references from ARM. Move the new ARM_SMMU_FEAT_STALL_FORCE
stuff to arm_smmu_make_sva_cd()
- Adjust to use the new shared STE/CD writer logic. Disable some of the
sanity checks for the interior of the series
- Return ERR_PTR from domain_alloc functions
- Move the ATS disablement flow into arm_smmu_attach_prepare()/commit()
which lets all the STE update flows use the same sequence. This is
needed for nesting in part 3
- Put ssid in attach_state
- Replace to_smmu_domain_safe() with to_smmu_domain_devices()
v3: https://lore.kernel.org/r/0-v3-9083a9368a5c+23fb-smmuv3_newapi_p2_jgg@nvidia.com
- Rebase on the latest part 1
- update comments and commit messages
- Fix error exit in arm_smmu_set_pasid()
- Fix inverted logic for btm_invalidation
- Add missing ATC invalidation on mm release
- Add a big comment explaining that BTM is not enabled and what is
missing to enable it.
v2: https://lore.kernel.org/r/0-v2-16665a652079+5947-smmuv3_newapi_p2_jgg@nvidia.com
- Rebased on iommufd + Joerg's tree
- Use sid_smmu_domain consistently to refer to the domain attached to the
device (eg the PCIe RID)
- Rework how arm_smmu_attach_*() and callers flow to be more careful
about ordering around ATC invalidation. The ATC must be invalidated
after it is impossible to establish stale entries.
- ATS disable is now entirely part of arm_smmu_attach_dev_ste(), which is
the only STE type that ever disables ATS.
- Remove the 'existing_master_domain' optimization, the code is
functionally fine without it.
- Whitespace, spelling, and checkpatch related items
- Fixed wrong value stored in the xa for the BTM flows
- Use pasid more consistently instead of id
v1: https://lore.kernel.org/r/0-v1-afbb86647bbd+5-smmuv3_newapi_p2_jgg@nvidia.com
Jason Gunthorpe (29):
iommu: Validate the PASID in iommu_attach_device_pasid()
iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V
iommu/arm-smmu-v3: Do not allow a SVA domain to be set on the wrong
PASID
iommu/arm-smmu-v3: Do not ATC invalidate the entire domain
iommu/arm-smmu-v3: Add a type for the CD entry
iommu/arm-smmu-v3: Add an ops indirection to the STE code
iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
iommu/arm-smmu-v3: Move the CD generation for S1 domains into a
function
iommu/arm-smmu-v3: Consolidate clearing a CD table entry
iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
iommu/arm-smmu-v3: Allocate the CD table entry in advance
iommu/arm-smmu-v3: Move the CD generation for SVA into a function
iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
iommu/arm-smmu-v3: Start building a generic PASID layer
iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
iommu/arm-smmu-v3: Make changing domains be hitless for ATS
iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*()
interface
iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
iommu: Add ops->domain_alloc_sva()
iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
iommu/arm-smmu-v3: Consolidate freeing the ASID/VMID
iommu/arm-smmu-v3: Move the arm_smmu_asid_xa to per-smmu like vmid
iommu/arm-smmu-v3: Bring back SVA BTM support
iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is
used
iommu/arm-smmu-v3: Allow a PASID to be set when RID is
IDENTITY/BLOCKED
iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 637 +++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1118 +++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 76 +-
drivers/iommu/iommu-sva.c | 16 +-
drivers/iommu/iommu.c | 11 +-
include/linux/iommu.h | 3 +
6 files changed, 1086 insertions(+), 775 deletions(-)
base-commit: 4cece764965020c22cff7665b18a012006359095
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH v6 01/16] regulator: dt-bindings: describe the PMU module of the QCA6390 package
From: Krzysztof Kozlowski @ 2024-03-27 18:17 UTC (permalink / raw)
To: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Kalle Valo,
Bjorn Andersson, Konrad Dybcio, Liam Girdwood, Mark Brown,
Catalin Marinas, Will Deacon, Bjorn Helgaas, Saravana Kannan,
Geert Uytterhoeven, Arnd Bergmann, Neil Armstrong,
Marek Szyprowski, Alex Elder, Srini Kandagatla,
Greg Kroah-Hartman, Abel Vesa, Manivannan Sadhasivam,
Lukas Wunner, Dmitry Baryshkov
Cc: linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <20240325131624.26023-2-brgl@bgdev.pl>
On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> The QCA6390 package contains discrete modules for WLAN and Bluetooth. They
> are powered by the Power Management Unit (PMU) that takes inputs from the
> host and provides LDO outputs. This document describes this module.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Can you start using b4?
This is a friendly reminder during the review process.
It looks like you received a tag and forgot to add it.
If you do not know the process, here is a short explanation:
Please add Acked-by/Reviewed-by/Tested-by tags when posting new
versions, under or above your Signed-off-by tag. Tag is "received", when
provided in a message replied to you on the mailing list. Tools like b4
can help here. However, there's no need to repost patches *only* to add
the tags. The upstream maintainer will do that for tags received on the
version they apply.
https://elixir.bootlin.com/linux/v6.5-rc3/source/Documentation/process/submitting-patches.rst#L577
If a tag was not added on purpose, please state why and what changed.
Best regards,
Krzysztof
* Re: [PATCH v3] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
From: Andrew Morton @ 2024-03-27 18:17 UTC (permalink / raw)
To: Huang Shijie
Cc: gregkh, patches, rafael, paul.walmsley, palmer, aou, yury.norov,
kuba, vschneid, mingo, vbabka, rppt, tglx, jpoimboe, ndesaulniers,
mikelley, mhiramat, arnd, linux-kernel, linux-riscv,
linux-arm-kernel, catalin.marinas, will, mark.rutland, mpe,
linuxppc-dev, chenhuacai, jiaxun.yang, linux-mips, cl
In-Reply-To: <20240126064451.5465-1-shijie@os.amperecomputing.com>
On Fri, 26 Jan 2024 14:44:51 +0800 Huang Shijie <shijie@os.amperecomputing.com> wrote:
> During the kernel booting, the generic cpu_to_node() is called too early in
> arm64, powerpc and riscv when CONFIG_NUMA is enabled.
>
> There are at least four places in the common code where
> the generic cpu_to_node() is called before it is initialized:
> 1.) early_trace_init() in kernel/trace/trace.c
> 2.) sched_init() in kernel/sched/core.c
> 3.) init_sched_fair_class() in kernel/sched/fair.c
> 4.) workqueue_init_early() in kernel/workqueue.c
>
> In order to fix the bug, the patch introduces early_numa_node_init()
> which is called after smp_prepare_boot_cpu() in start_kernel.
> early_numa_node_init will initialize the "numa_node" as soon as
> the early_cpu_to_node() is ready, before the cpu_to_node() is called
> for the first time.
What are the userspace-visible runtime effects of this bug?
* Re: [PATCH v6 02/16] regulator: dt-bindings: describe the PMU module of the WCN7850 package
From: Krzysztof Kozlowski @ 2024-03-27 18:19 UTC (permalink / raw)
To: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Kalle Valo,
Bjorn Andersson, Konrad Dybcio, Liam Girdwood, Mark Brown,
Catalin Marinas, Will Deacon, Bjorn Helgaas, Saravana Kannan,
Geert Uytterhoeven, Arnd Bergmann, Neil Armstrong,
Marek Szyprowski, Alex Elder, Srini Kandagatla,
Greg Kroah-Hartman, Abel Vesa, Manivannan Sadhasivam,
Lukas Wunner, Dmitry Baryshkov
Cc: linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <20240325131624.26023-3-brgl@bgdev.pl>
On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> + then:
> + required:
> + - vdd-supply
> + - vddio-supply
> + - vddaon-supply
> + - vdddig-supply
> + - vddrfa1p2-supply
> + - vddrfa1p8-supply
I assume vddio1p2 is not required on purpose.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Best regards,
Krzysztof
* Re: [PATCH v6 05/16] dt-bindings: net: wireless: describe the ath12k PCI module
From: Krzysztof Kozlowski @ 2024-03-27 18:21 UTC (permalink / raw)
To: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Kalle Valo,
Bjorn Andersson, Konrad Dybcio, Liam Girdwood, Mark Brown,
Catalin Marinas, Will Deacon, Bjorn Helgaas, Saravana Kannan,
Geert Uytterhoeven, Arnd Bergmann, Neil Armstrong,
Marek Szyprowski, Alex Elder, Srini Kandagatla,
Greg Kroah-Hartman, Abel Vesa, Manivannan Sadhasivam,
Lukas Wunner, Dmitry Baryshkov
Cc: linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <20240325131624.26023-6-brgl@bgdev.pl>
On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Add device-tree bindings for the ATH12K module found in the WCN7850
> package.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
> .../bindings/net/wireless/qcom,ath12k.yaml | 100 ++++++++++++++++++
> 1 file changed, 100 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/net/wireless/qcom,ath12k.yaml
>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Best regards,
Krzysztof