From: Jason Gunthorpe <jgg@nvidia.com>
To: Alim Akhtar <alim.akhtar@samsung.com>,
Alyssa Rosenzweig <alyssa@rosenzweig.io>,
Albert Ou <aou@eecs.berkeley.edu>,
asahi@lists.linux.dev, Lu Baolu <baolu.lu@linux.intel.com>,
David Woodhouse <dwmw2@infradead.org>,
Heiko Stuebner <heiko@sntech.de>,
iommu@lists.linux.dev, Jernej Skrabec <jernej.skrabec@gmail.com>,
Jonathan Hunter <jonathanh@nvidia.com>,
Joerg Roedel <joro@8bytes.org>,
Krzysztof Kozlowski <krzk@kernel.org>,
linux-arm-kernel@lists.infradead.org,
linux-riscv@lists.infradead.org,
linux-rockchip@lists.infradead.org,
linux-samsung-soc@vger.kernel.org, linux-sunxi@lists.linux.dev,
linux-tegra@vger.kernel.org,
Marek Szyprowski <m.szyprowski@samsung.com>,
Hector Martin <marcan@marcan.st>,
Palmer Dabbelt <palmer@dabbelt.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Robin Murphy <robin.murphy@arm.com>,
Samuel Holland <samuel@sholland.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
Sven Peter <sven@svenpeter.dev>,
Thierry Reding <thierry.reding@gmail.com>,
Tomasz Jeznach <tjeznach@rivosinc.com>,
Krishna Reddy <vdumpa@nvidia.com>, Chen-Yu Tsai <wens@csie.org>,
Will Deacon <will@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>,
Joerg Roedel <jroedel@suse.de>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
patches@lists.linux.dev, David Rientjes <rientjes@google.com>,
Matthew Wilcox <willy@infradead.org>
Subject: [PATCH v3 16/23] iommu/pages: Allow sub page sizes to be passed into the allocator
Date: Tue, 25 Feb 2025 15:39:33 -0400
Message-ID: <16-v3-e797f4dc6918+93057-iommu_pages_jgg@nvidia.com>
In-Reply-To: <0-v3-e797f4dc6918+93057-iommu_pages_jgg@nvidia.com>
Drivers generally have a specific idea of what their HW structure size
should be. In many cases this is related to PAGE_SIZE, but not always.
ARM64, for example, allows a 4K IO page table size on a system that uses
64K CPU pages.

Currently there is no good support for sub page allocations, but make the
API accommodate them by accepting a sub page size from the caller and
rounding up internally.

This is done by moving away from order as the size input and using size:

  size == 1 << (order + PAGE_SHIFT)

Following patches convert drivers away from using order and try to specify
allocation sizes independent of PAGE_SIZE.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
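As a rough illustration of the rounding described in the commit message, the
userspace sketch below models how a byte size maps to a page order and back.
It is not kernel code: MODEL_PAGE_SHIFT and the model_* helpers are
hypothetical names chosen for this note, and the page shift of 16 is an
assumption that stands in for a 64K-page ARM64 configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for the kernel's PAGE_SHIFT on a 64K-page system. */
#define MODEL_PAGE_SHIFT 16u
#define MODEL_PAGE_SIZE ((size_t)1 << MODEL_PAGE_SHIFT)

/*
 * Hypothetical model of get_order(): the smallest order for which
 * (MODEL_PAGE_SIZE << order) covers 'size'. Only meaningful for size >= 1
 * and for sizes small enough that the shift does not overflow.
 */
static unsigned int model_get_order(size_t size)
{
	unsigned int order = 0;

	while ((MODEL_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Bytes the modelled allocator would hand back for a 'size' byte request. */
static size_t model_alloc_bytes(size_t size)
{
	return MODEL_PAGE_SIZE << model_get_order(size);
}

/* The legacy conversion from the commit message: size == 1 << (order + PAGE_SHIFT). */
static size_t model_order_to_size(unsigned int order)
{
	return (size_t)1 << (order + MODEL_PAGE_SHIFT);
}
```

With these assumptions a 4K request, such as a 4K IO page table, has order 0
and still consumes one whole 64K page, which matches the "sub page
allocations result in a full page" comment in the patch. Converting an order
to a size and back yields the same order, so the order based wrappers keep
their behaviour.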
drivers/iommu/iommu-pages.c | 29 +++++++++++++++---------
drivers/iommu/iommu-pages.h | 44 ++++++++++++++++++++++++++++++++-----
include/linux/iommu.h | 6 ++---
3 files changed, 61 insertions(+), 18 deletions(-)
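The kernel-doc in the patch below states that the returned memory is zeroed,
rounded up to a power of two, and aligned to its own size. A minimal userspace
sketch of that contract, under stated assumptions, could look like the
following. SKETCH_PAGE_SIZE and the sketch_* functions are hypothetical names,
C11 aligned_alloc() stands in for the folio allocator, and the alignment
modelled here is virtual rather than physical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a 4K CPU page size; the kernel value depends on the config. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round 'size' up to a power of two that is at least one full page. */
static size_t sketch_round_size(size_t size)
{
	size_t bytes = SKETCH_PAGE_SIZE;

	while (bytes < size)
		bytes <<= 1;
	return bytes;
}

/*
 * Hypothetical model of a size based allocation: the block is zeroed and
 * aligned to its rounded size. Returns NULL on allocation failure.
 */
static void *sketch_alloc_pages_sz(size_t size)
{
	size_t bytes = sketch_round_size(size);
	void *virt = aligned_alloc(bytes, bytes);

	if (!virt)
		return NULL;
	memset(virt, 0, bytes);
	return virt;
}

/* As in the patch, freeing takes only the address, not the size or order. */
static void sketch_free_pages(void *virt)
{
	free(virt);
}
```

A caller that needs a 5000 byte structure would get an 8192 byte block
aligned to 8192, while a 256 byte request still gets a full 4096 byte page.
The caller passes the size its hardware needs and leaves the rounding to the
allocator.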
diff --git a/drivers/iommu/iommu-pages.c b/drivers/iommu/iommu-pages.c
index a7eed09420a231..4cc77fddfeeb47 100644
--- a/drivers/iommu/iommu-pages.c
+++ b/drivers/iommu/iommu-pages.c
@@ -23,24 +23,32 @@ IOPTDESC_MATCH(memcg_data, memcg_data);
static_assert(sizeof(struct ioptdesc) <= sizeof(struct page));
/**
- * iommu_alloc_pages_node - Allocate a zeroed page of a given order from
- * specific NUMA node
+ * iommu_alloc_pages_node_sz - Allocate a zeroed page of a given size from
+ * specific NUMA node
* @nid: memory NUMA node id
* @gfp: buddy allocator flags
- * @order: page order
+ * @size: Memory size to allocate, rounded up to a power of 2
*
- * Returns the virtual address of the allocated page. The page must be
- * freed either by calling iommu_free_pages() or via iommu_put_pages_list().
+ * Returns the virtual address of the allocated page. The page must be freed
+ * either by calling iommu_free_pages() or via iommu_put_pages_list(). The
+ * returned allocation is round_up_pow_two(size) big, and is physically aligned
+ * to its size.
*/
-void *iommu_alloc_pages_node(int nid, gfp_t gfp, unsigned int order)
+void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size)
{
- const unsigned long pgcnt = 1UL << order;
+ unsigned long pgcnt;
struct folio *folio;
+ unsigned int order;
/* This uses page_address() on the memory. */
if (WARN_ON(gfp & __GFP_HIGHMEM))
return NULL;
+ /*
+ * Currently sub page allocations result in a full page being returned.
+ */
+ order = get_order(size);
+
/*
* __folio_alloc_node() does not handle NUMA_NO_NODE like
* alloc_pages_node() did.
@@ -61,12 +69,13 @@ void *iommu_alloc_pages_node(int nid, gfp_t gfp, unsigned int order)
* This is necessary for the proper accounting as IOMMU state can be
* rather large, i.e. multiple gigabytes in size.
*/
+ pgcnt = 1UL << order;
mod_node_page_state(folio_pgdat(folio), NR_IOMMU_PAGES, pgcnt);
lruvec_stat_mod_folio(folio, NR_SECONDARY_PAGETABLE, pgcnt);
return folio_address(folio);
}
-EXPORT_SYMBOL_GPL(iommu_alloc_pages_node);
+EXPORT_SYMBOL_GPL(iommu_alloc_pages_node_sz);
static void __iommu_free_desc(struct ioptdesc *iopt)
{
@@ -82,7 +91,7 @@ static void __iommu_free_desc(struct ioptdesc *iopt)
* iommu_free_pages - free pages
* @virt: virtual address of the page to be freed.
*
- * The page must have have been allocated by iommu_alloc_pages_node()
+ * The page must have been allocated by iommu_alloc_pages_node_sz()
*/
void iommu_free_pages(void *virt)
{
@@ -96,7 +105,7 @@ EXPORT_SYMBOL_GPL(iommu_free_pages);
* iommu_put_pages_list - free a list of pages.
* @list: The list of pages to be freed
*
- * Frees a list of pages allocated by iommu_alloc_pages_node().
+ * Frees a list of pages allocated by iommu_alloc_pages_node_sz().
*/
void iommu_put_pages_list(struct iommu_pages_list *list)
{
diff --git a/drivers/iommu/iommu-pages.h b/drivers/iommu/iommu-pages.h
index f4578f252e2580..3c4575d637da6d 100644
--- a/drivers/iommu/iommu-pages.h
+++ b/drivers/iommu/iommu-pages.h
@@ -46,14 +46,14 @@ static inline struct ioptdesc *virt_to_ioptdesc(void *virt)
return folio_ioptdesc(virt_to_folio(virt));
}
-void *iommu_alloc_pages_node(int nid, gfp_t gfp, unsigned int order);
+void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size);
void iommu_free_pages(void *virt);
void iommu_put_pages_list(struct iommu_pages_list *list);
/**
* iommu_pages_list_add - add the page to a iommu_pages_list
* @list: List to add the page to
- * @virt: Address returned from iommu_alloc_pages_node()
+ * @virt: Address returned from iommu_alloc_pages_node_sz()
*/
static inline void iommu_pages_list_add(struct iommu_pages_list *list,
void *virt)
@@ -84,16 +84,48 @@ static inline bool iommu_pages_list_empty(struct iommu_pages_list *list)
return list_empty(&list->pages);
}
+/**
+ * iommu_alloc_pages_node - Allocate a zeroed page of a given order from
+ * specific NUMA node
+ * @nid: memory NUMA node id
+ * @gfp: buddy allocator flags
+ * @order: page order
+ *
+ * Returns the virtual address of the allocated page.
+ * Prefer to use iommu_alloc_pages_node_sz()
+ */
+static inline void *iommu_alloc_pages_node(int nid, gfp_t gfp,
+ unsigned int order)
+{
+ return iommu_alloc_pages_node_sz(nid, gfp, 1 << (order + PAGE_SHIFT));
+}
+
/**
* iommu_alloc_pages - allocate a zeroed page of a given order
* @gfp: buddy allocator flags
* @order: page order
*
* returns the virtual address of the allocated page
+ * Prefer to use iommu_alloc_pages_sz()
*/
static inline void *iommu_alloc_pages(gfp_t gfp, int order)
{
- return iommu_alloc_pages_node(NUMA_NO_NODE, gfp, order);
+ return iommu_alloc_pages_node_sz(NUMA_NO_NODE, gfp,
+ 1 << (order + PAGE_SHIFT));
+}
+
+/**
+ * iommu_alloc_pages_sz - Allocate a zeroed page of a given size
+ * @gfp: buddy allocator flags
+ * @size: Memory size to allocate, this is rounded up to a power of 2
+ *
+ * Uses NUMA_NO_NODE, see iommu_alloc_pages_node_sz().
+ *
+ * Returns the virtual address of the allocated page.
+ */
+static inline void *iommu_alloc_pages_sz(gfp_t gfp, size_t size)
+{
+ return iommu_alloc_pages_node_sz(NUMA_NO_NODE, gfp, size);
}
/**
@@ -102,10 +134,11 @@ static inline void *iommu_alloc_pages(gfp_t gfp, int order)
* @gfp: buddy allocator flags
*
* returns the virtual address of the allocated page
+ * Prefer to use iommu_alloc_pages_node_sz()
*/
static inline void *iommu_alloc_page_node(int nid, gfp_t gfp)
{
- return iommu_alloc_pages_node(nid, gfp, 0);
+ return iommu_alloc_pages_node_sz(nid, gfp, PAGE_SIZE);
}
/**
@@ -113,10 +146,11 @@ static inline void *iommu_alloc_page_node(int nid, gfp_t gfp)
* @gfp: buddy allocator flags
*
* returns the virtual address of the allocated page
+ * Prefer to use iommu_alloc_pages_sz()
*/
static inline void *iommu_alloc_page(gfp_t gfp)
{
- return iommu_alloc_pages_node(NUMA_NO_NODE, gfp, 0);
+ return iommu_alloc_pages_node_sz(NUMA_NO_NODE, gfp, PAGE_SIZE);
}
#endif /* __IOMMU_PAGES_H */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 166d8e1bcb100d..b74c9f3dbcce1d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -327,9 +327,9 @@ typedef unsigned int ioasid_t;
#define IOMMU_DIRTY_NO_CLEAR (1 << 0)
/*
- * Pages allocated through iommu_alloc_pages_node() can be placed on this list
- * using iommu_pages_list_add(). Note: ONLY pages from iommu_alloc_pages_node()
- * can be used this way!
+ * Pages allocated through iommu_alloc_pages_node_sz() can be placed on this
+ * list using iommu_pages_list_add(). Note: ONLY pages from
+ * iommu_alloc_pages_node_sz() can be used this way!
*/
struct iommu_pages_list {
struct list_head pages;
--
2.43.0