From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, david@redhat.com, vbabka@suse.cz,
lorenzo.stoakes@oracle.com, liam.howlett@oracle.com,
alexandru.elisei@arm.com, peterx@redhat.com, hannes@cmpxchg.org,
mhocko@kernel.org, m.szyprowski@samsung.com,
iamjoonsoo.kim@lge.com, mina86@mina86.com, axboe@kernel.dk,
viro@zeniv.linux.org.uk, brauner@kernel.org, hch@infradead.org,
jack@suse.cz, hbathini@linux.ibm.com, sourabhjain@linux.ibm.com,
ritesh.list@gmail.com, aneesh.kumar@kernel.org,
bhelgaas@google.com, sj@kernel.org, fvdl@google.com,
ziy@nvidia.com, yuzhao@google.com, minchan@kernel.org,
surenb@google.com, linux-mm@kvack.org,
linuxppc-dev@lists.ozlabs.org, linux-block@vger.kernel.org,
linux-fsdevel@vger.kernel.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, Minchan Kim <minchan@google.com>
Subject: [RFC 2/3] mm: introduce GCMA
Date: Thu, 20 Mar 2025 10:39:30 -0700
Message-ID: <20250320173931.1583800-3-surenb@google.com>
In-Reply-To: <20250320173931.1583800-1-surenb@google.com>
From: Minchan Kim <minchan@google.com>

This patch introduces GCMA (Guaranteed Contiguous Memory Allocator), a
cleancache backend that reserves a region of memory at boot and donates
it to cleancache for storing clean file-backed pages. GCMA aims to
guarantee that contiguous memory allocations succeed, with low and
deterministic allocation latency.

Notes:
The idea was originally posted by SeongJae Park and Minchan Kim [1].
Minchan later reworked it for use in Android, as a reference for
Android vendors [2].

[1] https://lwn.net/Articles/619865/
[2] https://android-review.googlesource.com/q/topic:%22gcma_6.12%22

Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
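Not part of the commit message: for reviewers new to the scheme, here is
a minimal userspace sketch of the donate / take-back cycle described
above. Everything prefixed toy_ or TOY_ is hypothetical and exists only
for illustration; the real backend goes through the cleancache API and
does not track page state this way. The point it tries to show is why
taking pages back can be cheap: whatever the cache stored is clean, so
it can be dropped without writeback or migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum toy_page_state {
	TOY_IDLE,	/* donated to the cache, currently holding nothing */
	TOY_CACHED,	/* donated and holding a clean file-backed page */
	TOY_ALLOCATED,	/* taken back by a contiguous allocation */
};

#define TOY_NR_PAGES 16

struct toy_area {
	enum toy_page_state state[TOY_NR_PAGES];
};

/* Boot-time step: every reserved page starts out donated to the cache. */
static void toy_area_init(struct toy_area *area)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		area->state[i] = TOY_IDLE;
}

/* The cache may use any donated page to keep a clean copy of file data. */
static bool toy_cache_store(struct toy_area *area, size_t idx)
{
	if (idx >= TOY_NR_PAGES || area->state[idx] == TOY_ALLOCATED)
		return false;
	area->state[idx] = TOY_CACHED;
	return true;
}

/*
 * Take back [start, start + count). Cached contents are simply dropped,
 * since clean data can be re-read from backing storage later.
 * Returns the number of cached pages that were discarded.
 */
static size_t toy_alloc_range(struct toy_area *area, size_t start, size_t count)
{
	size_t dropped = 0;

	for (size_t i = start; i < start + count && i < TOY_NR_PAGES; i++) {
		if (area->state[i] == TOY_CACHED)
			dropped++;
		area->state[i] = TOY_ALLOCATED;
	}
	return dropped;
}

/* Hand [start, start + count) back to the cache once the user is done. */
static void toy_free_range(struct toy_area *area, size_t start, size_t count)
{
	for (size_t i = start; i < start + count && i < TOY_NR_PAGES; i++)
		area->state[i] = TOY_IDLE;
}
```

In this model an allocation never fails and never waits on I/O, which
is the property the commit message is after. The model deliberately
ignores concurrency and the case of allocating a page twice.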
 include/linux/gcma.h |  12 ++++
 mm/Kconfig           |  15 +++++
 mm/Makefile          |   1 +
 mm/gcma.c            | 155 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 183 insertions(+)
 create mode 100644 include/linux/gcma.h
 create mode 100644 mm/gcma.c
diff --git a/include/linux/gcma.h b/include/linux/gcma.h
new file mode 100644
index 000000000000..2ce40fcc74a5
--- /dev/null
+++ b/include/linux/gcma.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __GCMA_H__
+#define __GCMA_H__
+
+#include <linux/types.h>
+
+int gcma_register_area(const char *name,
+		       unsigned long start_pfn, unsigned long count);
+void gcma_alloc_range(unsigned long start_pfn, unsigned long count);
+void gcma_free_range(unsigned long start_pfn, unsigned long count);
+
+#endif /* __GCMA_H__ */
diff --git a/mm/Kconfig b/mm/Kconfig
index d6ebf0fb0432..85268ef901b6 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1002,6 +1002,21 @@ config CMA_AREAS
If unsure, leave the default value "8" in UMA and "20" in NUMA.
+config GCMA
+	bool "GCMA (Guaranteed Contiguous Memory Allocator)"
+	depends on CLEANCACHE
+	help
+	  This enables the Guaranteed Contiguous Memory Allocator to allow
+	  low latency guaranteed contiguous memory allocations. Memory
+	  reserved by GCMA is donated to cleancache to be used as pagecache
+	  extension. Once GCMA allocation is requested, necessary pages are
+	  taken back from the cleancache and used to satisfy the request.
+	  Cleancache guarantees low latency successful allocation as long
+	  as the total size of GCMA allocations does not exceed the size of
+	  the memory donated to the cleancache.
+
+	  If unsure, say "N".
+
config MEM_SOFT_DIRTY
bool "Track memory changes"
depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
diff --git a/mm/Makefile b/mm/Makefile
index 084dbe9edbc4..2173d395d371 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -149,3 +149,4 @@ obj-$(CONFIG_EXECMEM) += execmem.o
obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
+obj-$(CONFIG_GCMA) += gcma.o
diff --git a/mm/gcma.c b/mm/gcma.c
new file mode 100644
index 000000000000..263e63da0c89
--- /dev/null
+++ b/mm/gcma.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * GCMA (Guaranteed Contiguous Memory Allocator)
+ *
+ */
+
+#define pr_fmt(fmt) "gcma: " fmt
+
+#include <linux/cleancache.h>
+#include <linux/gcma.h>
+#include <linux/hashtable.h>
+#include <linux/highmem.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+
+#define MAX_GCMA_AREAS 64
+#define GCMA_AREA_NAME_MAX_LEN 32
+
+struct gcma_area {
+	int area_id;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
+	char name[GCMA_AREA_NAME_MAX_LEN];
+};
+
+static struct gcma_area areas[MAX_GCMA_AREAS];
+static atomic_t nr_gcma_area = ATOMIC_INIT(0);
+static DEFINE_SPINLOCK(gcma_area_lock);
+
+static void alloc_page_range(struct gcma_area *area,
+			     unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long scanned = 0;
+	unsigned long pfn;
+	struct page *page;
+	int err;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
+		if (!(++scanned % XA_CHECK_SCHED))
+			cond_resched();
+
+		page = pfn_to_page(pfn);
+		err = cleancache_backend_get_folio(area->area_id, page_folio(page));
+		VM_BUG_ON(err);
+	}
+}
+
+static void free_page_range(struct gcma_area *area,
+			    unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long scanned = 0;
+	unsigned long pfn;
+	struct page *page;
+	int err;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
+		if (!(++scanned % XA_CHECK_SCHED))
+			cond_resched();
+
+		page = pfn_to_page(pfn);
+		err = cleancache_backend_put_folio(area->area_id,
+						   page_folio(page));
+		VM_BUG_ON(err);
+	}
+}
+
+int gcma_register_area(const char *name,
+		       unsigned long start_pfn, unsigned long count)
+{
+	LIST_HEAD(folios);
+	unsigned long i;
+	int area_id, nr_area;
+	int ret = 0;
+
+	for (i = 0; i < count; i++) {
+		struct folio *folio;
+
+		folio = page_folio(pfn_to_page(start_pfn + i));
+		list_add(&folio->lru, &folios);
+	}
+
+	area_id = cleancache_register_backend(name, &folios);
+	if (area_id < 0)
+		return area_id;
+
+	spin_lock(&gcma_area_lock);
+
+	nr_area = atomic_read(&nr_gcma_area);
+	if (nr_area < MAX_GCMA_AREAS) {
+		struct gcma_area *area = &areas[nr_area];
+
+		area->area_id = area_id;
+		area->start_pfn = start_pfn;
+		area->end_pfn = start_pfn + count;
+		strscpy(area->name, name);
+		/* Ensure above stores complete before we increase the count */
+		atomic_set_release(&nr_gcma_area, nr_area + 1);
+	} else {
+		ret = -ENOMEM;
+	}
+
+	spin_unlock(&gcma_area_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(gcma_register_area);
+
+void gcma_alloc_range(unsigned long start_pfn, unsigned long count)
+{
+	int nr_area = atomic_read_acquire(&nr_gcma_area);
+	unsigned long end_pfn = start_pfn + count;
+	struct gcma_area *area;
+	int i;
+
+	for (i = 0; i < nr_area; i++) {
+		unsigned long s_pfn, e_pfn;
+
+		area = &areas[i];
+		if (area->end_pfn <= start_pfn)
+			continue;
+
+		if (area->start_pfn >= end_pfn)
+			continue;
+
+		s_pfn = max(start_pfn, area->start_pfn);
+		e_pfn = min(end_pfn, area->end_pfn);
+		alloc_page_range(area, s_pfn, e_pfn);
+	}
+}
+EXPORT_SYMBOL_GPL(gcma_alloc_range);
+
+void gcma_free_range(unsigned long start_pfn, unsigned long count)
+{
+	int nr_area = atomic_read_acquire(&nr_gcma_area);
+	unsigned long end_pfn = start_pfn + count;
+	struct gcma_area *area;
+	int i;
+
+	for (i = 0; i < nr_area; i++) {
+		unsigned long s_pfn, e_pfn;
+
+		area = &areas[i];
+		if (area->end_pfn <= start_pfn)
+			continue;
+
+		if (area->start_pfn >= end_pfn)
+			continue;
+
+		s_pfn = max(start_pfn, area->start_pfn);
+		e_pfn = min(end_pfn, area->end_pfn);
+		free_page_range(area, s_pfn, e_pfn);
+	}
+}
+EXPORT_SYMBOL_GPL(gcma_free_range);
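
A note on the range arithmetic in gcma_alloc_range() and
gcma_free_range(): both the request and each registered area are
half-open pfn ranges, [start_pfn, end_pfn), and the per-area work is the
intersection of the two. The userspace sketch below isolates that
clamping step so it can be checked on its own. struct pfn_range and
pfn_range_intersect() are hypothetical names used for illustration and
are not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type for illustration; it is not part of the patch. */
struct pfn_range {
	unsigned long start;	/* first pfn in the range */
	unsigned long end;	/* one past the last pfn (exclusive) */
};

/*
 * Intersect two half-open ranges. Returns false when they share no pfn,
 * which includes the case where one range ends exactly where the other
 * begins. On success the overlap is written to *out.
 */
static bool pfn_range_intersect(struct pfn_range a, struct pfn_range b,
				struct pfn_range *out)
{
	if (a.end <= b.start || b.end <= a.start)
		return false;

	out->start = a.start > b.start ? a.start : b.start;
	out->end = a.end < b.end ? a.end : b.end;
	return true;
}
```

For example, a request for [150, 300) against an area covering
[100, 200) yields [150, 200), while a request for [200, 300) against
the same area yields nothing because the two ranges only touch.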
--
2.49.0.rc1.451.g8f38331e32-goog
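
A note on the locking in mm/gcma.c: gcma_register_area() fills in an
areas[] slot under gcma_area_lock and then publishes it with
atomic_set_release(), while the alloc/free paths read the count with
atomic_read_acquire() and take no lock. The same publish/consume
pattern can be sketched in userspace with C11 atomics, as below. The
toy_ names are hypothetical, and the sketch assumes, as the patch does,
that slots are only ever appended and never modified or removed after
publication.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_MAX_SLOTS 4

/* Hypothetical stand-in for a registered area. */
struct toy_slot {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

static struct toy_slot toy_slots[TOY_MAX_SLOTS];
static atomic_int toy_nr_slots;

/*
 * Writer side. Callers are assumed to serialise against each other (the
 * patch uses a spinlock for this); the release store only orders the slot
 * initialisation before the new count becomes visible to readers.
 * Returns the slot index, or -1 when the table is full.
 */
static int toy_publish(unsigned long start, unsigned long end)
{
	int n = atomic_load_explicit(&toy_nr_slots, memory_order_relaxed);

	if (n >= TOY_MAX_SLOTS)
		return -1;

	toy_slots[n].start = start;
	toy_slots[n].end = end;
	atomic_store_explicit(&toy_nr_slots, n + 1, memory_order_release);
	return n;
}

/*
 * Reader side, lockless. The acquire load pairs with the release store, so
 * every slot below the returned count is fully initialised.
 */
static const struct toy_slot *toy_find(unsigned long pfn)
{
	int n = atomic_load_explicit(&toy_nr_slots, memory_order_acquire);

	for (int i = 0; i < n; i++) {
		if (pfn >= toy_slots[i].start && pfn < toy_slots[i].end)
			return &toy_slots[i];
	}
	return NULL;
}
```

A reader that runs concurrently with a registration sees either the old
count or the new one. In both cases it only walks slots whose fields
were written before the count that covers them was stored.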