From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>,
Michael Shavit <mshavit@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
patches@lists.linux.dev, Ryan Roberts <ryan.roberts@arm.com>
Subject: Re: [PATCH 6/7] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
Date: Tue, 4 Jun 2024 16:14:50 +0000 [thread overview]
Message-ID: <Zl89eleAaHzdNYjV@google.com> (raw)
In-Reply-To: <6-v1-1b720dce51d1+4f44-smmuv3_tidy_jgg@nvidia.com>
Hi Jason,
On Mon, Jun 03, 2024 at 07:31:32PM -0300, Jason Gunthorpe wrote:
> The top of the 2 level CD table is (at most) 1024 entries big, and two
> high order allocations are required. One of __le64 which is programmed
> into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the
> CPU pointer (16k).
>
> There are two copies of the l2ptr_dma, one is stored in the struct
> arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to
> use. Instead of storing two copies just decode the value from the __le64.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 ++++++++++-----------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
> 2 files changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 3f2e0462433d2d..7a6c9aac4cd450 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1167,28 +1167,19 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
> arm_smmu_cmdq_batch_submit(smmu, &cmds);
> }
>
> -static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
> - struct arm_smmu_l1_ctx_desc *l1_desc)
> +static void arm_smmu_write_cd_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
> {
> - size_t size = CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd);
> -
> - l1_desc->l2ptr = dma_alloc_coherent(smmu->dev, size,
> - &l1_desc->l2ptr_dma, GFP_KERNEL);
> - if (!l1_desc->l2ptr)
> - return -ENOMEM;
> - return 0;
> -}
> -
> -static void arm_smmu_write_cd_l1_desc(__le64 *dst,
> - struct arm_smmu_l1_ctx_desc *l1_desc)
> -{
> - u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
> - CTXDESC_L1_DESC_V;
> + u64 val = (l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V;
>
> /* The HW has 64 bit atomicity with stores to the L2 CD table */
> WRITE_ONCE(*dst, cpu_to_le64(val));
> }
>
> +static dma_addr_t arm_smmu_cd_l1_get_desc(__le64 *src)
> +{
> + return le64_to_cpu(*src) & CTXDESC_L1_DESC_L2PTR_MASK;
> +}
> +
> struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
> u32 ssid)
> {
> @@ -1227,13 +1218,18 @@ struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
>
> l1_desc = &cd_table->l1_desc[idx];
> if (!l1_desc->l2ptr) {
> + dma_addr_t l2ptr_dma;
> __le64 *l1ptr;
>
> - if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> + l1_desc->l2ptr = dma_alloc_coherent(
> + smmu->dev,
> + CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd),
> + &l2ptr_dma, GFP_KERNEL);
> + if (!l1_desc->l2ptr)
> return NULL;
>
> l1ptr = &cd_table->cdtab.l1_desc[idx];
> - arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
> + arm_smmu_write_cd_l1_desc(l1ptr, l2ptr_dma);
> /* An invalid L1CD can be cached */
> arm_smmu_sync_cd(master, ssid, false);
> }
> @@ -1406,7 +1402,8 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
>
> dma_free_coherent(smmu->dev, size,
> cd_table->l1_desc[i].l2ptr,
> - cd_table->l1_desc[i].l2ptr_dma);
> + arm_smmu_cd_l1_get_desc(
> + &cd_table->cdtab.l1_desc[i]));
> }
> kfree(cd_table->l1_desc);
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 21c1acf34dd29c..1ffe2fdfd3755f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -587,7 +587,6 @@ struct arm_smmu_ctx_desc {
>
> struct arm_smmu_l1_ctx_desc {
> struct arm_smmu_cd *l2ptr;
> - dma_addr_t l2ptr_dma;
Maybe now we can also get rid of arm_smmu_l1_ctx_desc, and embed l2ptr directly
in arm_smmu_ctx_desc_cfg?
> };
>
> struct arm_smmu_ctx_desc_cfg {
> --
> 2.45.2
>
Thanks,
Mostafa
WARNING: multiple messages have this Message-ID (diff)
From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>,
Michael Shavit <mshavit@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
patches@lists.linux.dev, Ryan Roberts <ryan.roberts@arm.com>
Subject: Re: [PATCH 6/7] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
Date: Tue, 4 Jun 2024 16:14:50 +0000 [thread overview]
Message-ID: <Zl89eleAaHzdNYjV@google.com> (raw)
In-Reply-To: <6-v1-1b720dce51d1+4f44-smmuv3_tidy_jgg@nvidia.com>
Hi Jason,
On Mon, Jun 03, 2024 at 07:31:32PM -0300, Jason Gunthorpe wrote:
> The top of the 2 level CD table is (at most) 1024 entries big, and two
> high order allocations are required. One of __le64 which is programmed
> into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the
> CPU pointer (16k).
>
> There are two copies of the l2ptr_dma, one is stored in the struct
> arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to
> use. Instead of storing two copies just decode the value from the __le64.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 ++++++++++-----------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
> 2 files changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 3f2e0462433d2d..7a6c9aac4cd450 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1167,28 +1167,19 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
> arm_smmu_cmdq_batch_submit(smmu, &cmds);
> }
>
> -static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
> - struct arm_smmu_l1_ctx_desc *l1_desc)
> +static void arm_smmu_write_cd_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
> {
> - size_t size = CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd);
> -
> - l1_desc->l2ptr = dma_alloc_coherent(smmu->dev, size,
> - &l1_desc->l2ptr_dma, GFP_KERNEL);
> - if (!l1_desc->l2ptr)
> - return -ENOMEM;
> - return 0;
> -}
> -
> -static void arm_smmu_write_cd_l1_desc(__le64 *dst,
> - struct arm_smmu_l1_ctx_desc *l1_desc)
> -{
> - u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
> - CTXDESC_L1_DESC_V;
> + u64 val = (l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V;
>
> /* The HW has 64 bit atomicity with stores to the L2 CD table */
> WRITE_ONCE(*dst, cpu_to_le64(val));
> }
>
> +static dma_addr_t arm_smmu_cd_l1_get_desc(__le64 *src)
> +{
> + return le64_to_cpu(*src) & CTXDESC_L1_DESC_L2PTR_MASK;
> +}
> +
> struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
> u32 ssid)
> {
> @@ -1227,13 +1218,18 @@ struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
>
> l1_desc = &cd_table->l1_desc[idx];
> if (!l1_desc->l2ptr) {
> + dma_addr_t l2ptr_dma;
> __le64 *l1ptr;
>
> - if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> + l1_desc->l2ptr = dma_alloc_coherent(
> + smmu->dev,
> + CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd),
> + &l2ptr_dma, GFP_KERNEL);
> + if (!l1_desc->l2ptr)
> return NULL;
>
> l1ptr = &cd_table->cdtab.l1_desc[idx];
> - arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
> + arm_smmu_write_cd_l1_desc(l1ptr, l2ptr_dma);
> /* An invalid L1CD can be cached */
> arm_smmu_sync_cd(master, ssid, false);
> }
> @@ -1406,7 +1402,8 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
>
> dma_free_coherent(smmu->dev, size,
> cd_table->l1_desc[i].l2ptr,
> - cd_table->l1_desc[i].l2ptr_dma);
> + arm_smmu_cd_l1_get_desc(
> + &cd_table->cdtab.l1_desc[i]));
> }
> kfree(cd_table->l1_desc);
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 21c1acf34dd29c..1ffe2fdfd3755f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -587,7 +587,6 @@ struct arm_smmu_ctx_desc {
>
> struct arm_smmu_l1_ctx_desc {
> struct arm_smmu_cd *l2ptr;
> - dma_addr_t l2ptr_dma;
Maybe now we can also get rid of arm_smmu_l1_ctx_desc, and embed l2ptr directly
in arm_smmu_ctx_desc_cfg?
> };
>
> struct arm_smmu_ctx_desc_cfg {
> --
> 2.45.2
>
Thanks,
Mostafa
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-06-04 16:14 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-03 22:31 [PATCH 0/7] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-03 22:31 ` [PATCH 1/7] iommu/arm-smmu-v3: Split struct arm_smmu_strtab_cfg.strtab Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 8:32 ` Nicolin Chen
2024-06-04 8:32 ` Nicolin Chen
2024-06-04 12:59 ` Jason Gunthorpe
2024-06-04 12:59 ` Jason Gunthorpe
2024-06-04 18:28 ` Nicolin Chen
2024-06-04 18:28 ` Nicolin Chen
2024-06-04 19:02 ` Jason Gunthorpe
2024-06-04 19:02 ` Jason Gunthorpe
2024-06-04 19:28 ` Nicolin Chen
2024-06-04 19:28 ` Nicolin Chen
2024-06-04 15:52 ` Mostafa Saleh
2024-06-04 15:52 ` Mostafa Saleh
2024-06-05 23:51 ` Jason Gunthorpe
2024-06-05 23:51 ` Jason Gunthorpe
2024-06-03 22:31 ` [PATCH 2/7] iommu/arm-smmu-v3: Do not zero the strtab twice Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 15:56 ` Mostafa Saleh
2024-06-04 15:56 ` Mostafa Saleh
2024-06-05 21:22 ` Jason Gunthorpe
2024-06-05 21:22 ` Jason Gunthorpe
2024-06-03 22:31 ` [PATCH 3/7] iommu/arm-smmu-v3: Shrink the strtab l1_desc array Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 16:01 ` Mostafa Saleh
2024-06-04 16:01 ` Mostafa Saleh
2024-06-03 22:31 ` [PATCH 4/7] iommu/arm-smmu-v3: Split struct arm_smmu_ctx_desc_cfg.cdtab Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 16:07 ` Mostafa Saleh
2024-06-04 16:07 ` Mostafa Saleh
2024-06-06 23:59 ` Jason Gunthorpe
2024-06-06 23:59 ` Jason Gunthorpe
2024-06-03 22:31 ` [PATCH 5/7] iommu/arm-smmu-v3: Do not use devm for the cd table allocations Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-03 22:31 ` [PATCH 6/7] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 16:14 ` Mostafa Saleh [this message]
2024-06-04 16:14 ` Mostafa Saleh
2024-06-03 22:31 ` [PATCH 7/7] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
2024-06-03 22:31 ` Jason Gunthorpe
2024-06-04 16:22 ` Mostafa Saleh
2024-06-04 16:22 ` Mostafa Saleh
2024-06-03 22:41 ` [PATCH 0/7] Tidy some minor things in the stream table/cd table area Nicolin Chen
2024-06-03 22:41 ` Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zl89eleAaHzdNYjV@google.com \
--to=smostafa@google.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mshavit@google.com \
--cc=nicolinc@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=robin.murphy@arm.com \
--cc=ryan.roberts@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.