From: Feng Tang <feng.tang@linux.alibaba.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>,
Robin Murphy <robin.murphy@arm.com>,
Ying Huang <ying.huang@linux.alibaba.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>,
Liam.Howlett@oracle.com, Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
linux-mm@kvack.org, Christoph Hellwig <hch@lst.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
Feng Tang <feng.tang@linux.alibaba.com>,
Changrong Chen <chenchangrong.ccr@alibaba-inc.com>
Subject: [PATCH v3] dma-contiguous: setup default numa cma area if not configured explicitly
Date: Tue, 28 Apr 2026 14:05:50 +0800 [thread overview]
Message-ID: <20260428060550.7167-1-feng.tang@linux.alibaba.com> (raw)
There was a report on a multi-NUMA-node ARM server that when the IOMMU
is disabled, dma_alloc_coherent() always returns memory from node 0,
even for devices attached to other nodes, while the same API returns
node-local DMA memory when the IOMMU is on.
The reason is that when the IOMMU is disabled, dma_alloc_coherent()
takes the direct path and calls dma_alloc_contiguous(). The system has
no explicit cma setting (such as per-numa cma), only the default 64MB
cma reserved area (on node 0), from which the kernel tries to allocate
first.
Robin Murphy suggested setting up per-numa cma or disabling cma, which
did solve the issue. There is still a concern, though, that customers
without much kernel knowledge could silently suffer from this, since
some architectures enable a cma area by default in most Linux
distributions (not an issue for x86, which sets CONFIG_CMA_SIZE_MBYTES
to 0 by default).
One idea is to extend the current cma reservation policy for platforms
with CONFIG_DMA_NUMA_CMA=y: if numa cma is not explicitly configured
(via either the 'numa_cma' or 'cma_pernuma' parameter), set it up using
the size of the default 'dma_contiguous_default_area', skipping the
NUMA node where 'dma_contiguous_default_area' itself resides. This
keeps the default behavior of single-NUMA-node platforms unchanged.
To get the node info of a cma area, add a helper function and set the
node in the cma code.
Reported-by: Changrong Chen <chenchangrong.ccr@alibaba-inc.com>
Suggested-by: Ying Huang <ying.huang@linux.alibaba.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
---
Changelog:
since v2:
* set up the numa cma areas following the default cma, while
skipping the node that holds the default cma (Robin Murphy)
* add the cma_get_nid() helper and related code
* add reporter info
since v1:
* don't use the original way of adding alloc_pages_node()
before trying default cma node (Robin Murphy)
* setup default numa cma area if not configured (Ying Huang)
v2: https://lore.kernel.org/lkml/20260423095243.14239-1-feng.tang@linux.alibaba.com/
v1: https://lore.kernel.org/lkml/20260414090310.92055-1-feng.tang@linux.alibaba.com/
include/linux/cma.h | 1 +
kernel/dma/contiguous.c | 14 ++++++++++++--
mm/cma.c | 11 ++++++++++-
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 8555d38a97b1..acc9ecdf28e1 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -26,6 +26,7 @@ extern unsigned long totalcma_pages;
extern phys_addr_t cma_get_base(const struct cma *cma);
extern unsigned long cma_get_size(const struct cma *cma);
extern const char *cma_get_name(const struct cma *cma);
+extern int cma_get_nid(const struct cma *cma);
extern int __init cma_declare_contiguous_nid(phys_addr_t base,
phys_addr_t size, phys_addr_t limit,
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 03f52bd17120..ae6d856c5559 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -136,6 +136,7 @@ static struct cma *dma_contiguous_numa_area[MAX_NUMNODES];
static phys_addr_t numa_cma_size[MAX_NUMNODES] __initdata;
static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
static phys_addr_t pernuma_size_bytes __initdata;
+static bool numa_cma_configured;
static int __init early_numa_cma(char *p)
{
@@ -164,6 +165,7 @@ static int __init early_numa_cma(char *p)
break;
}
+ numa_cma_configured = true;
return 0;
}
early_param("numa_cma", early_numa_cma);
@@ -171,6 +173,7 @@ early_param("numa_cma", early_numa_cma);
static int __init early_cma_pernuma(char *p)
{
pernuma_size_bytes = memparse(p, &p);
+ numa_cma_configured = true;
return 0;
}
early_param("cma_pernuma", early_cma_pernuma);
@@ -221,6 +224,13 @@ static void __init dma_numa_cma_reserve(void)
ret, nid);
}
+ if (!numa_cma_configured && dma_contiguous_default_area) {
+ if (nid != cma_get_nid(dma_contiguous_default_area))
+ numa_cma_size[nid] = cma_get_size(dma_contiguous_default_area);
+ else
+ dma_contiguous_numa_area[nid] = dma_contiguous_default_area;
+ }
+
if (numa_cma_size[nid]) {
cma = &dma_contiguous_numa_area[nid];
@@ -255,8 +265,6 @@ void __init dma_contiguous_reserve(phys_addr_t limit)
phys_addr_t selected_limit = limit;
bool fixed = false;
- dma_numa_cma_reserve();
-
pr_debug("%s(limit %08lx)\n", __func__, (unsigned long)limit);
if (size_cmdline != -1) {
@@ -312,6 +320,8 @@ void __init dma_contiguous_reserve(phys_addr_t limit)
if (ret)
pr_warn("Couldn't queue default CMA region for heap creation.");
}
+
+ dma_numa_cma_reserve();
}
void __weak
diff --git a/mm/cma.c b/mm/cma.c
index c7ca567f4c5c..3bbfafeaf6c1 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -54,6 +54,11 @@ const char *cma_get_name(const struct cma *cma)
}
EXPORT_SYMBOL_GPL(cma_get_name);
+int cma_get_nid(const struct cma *cma)
+{
+ return cma->nid;
+}
+
static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
unsigned int align_order)
{
@@ -511,7 +516,11 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t *basep,
return ret;
}
- (*res_cma)->nid = nid;
+ if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
+ (*res_cma)->nid = early_pfn_to_nid((*res_cma)->ranges[0].base_pfn);
+ else
+ (*res_cma)->nid = nid;
+
*basep = base;
return 0;
--
2.39.5 (Apple Git-154)
Thread overview: 7+ messages
2026-04-28 6:05 Feng Tang [this message]
2026-04-28 7:52 ` [PATCH v3] dma-contiguous: setup default numa cma area if not configured explicitly David Hildenbrand (Arm)
2026-04-28 8:37 ` Feng Tang
2026-04-28 9:03 ` Feng Tang
2026-05-01 18:51 ` David Hildenbrand (Arm)
2026-05-06 15:46 ` Feng Tang
2026-05-01 5:57 ` kernel test robot