* [PATCH v2 0/6] Remove some lruvec page accounting functions
@ 2023-12-28  8:57 Matthew Wilcox (Oracle)
  2023-12-28  8:57 ` [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions Matthew Wilcox (Oracle)
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

Some functions are now unused; remove them.  Make
__mod_lruvec_page_state() unused and then remove it.

Based on next-20231222.  Boot tested only (as next-20231222 has some
infelicities with regard to my testing setup).

v2: Redo the slub conversion.  Avoid using folio_alloc(), but do cast
the struct page pointer that we get back from alloc_pages() to a folio
pointer.  Since we know it's a head page, this all works marvellously,
and will continue to work until we can actually split folios, slabs and
pages apart, at which time it will have to be redone.  I tried a few
different ways to make slab ignorant of folios and it's quite hard to
do right now.  Look for a patch series to make that possible soon.
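
For reference, the pattern that both slub patches end up with is:

	/* we know the page returned is a head page, so the cast is
	 * (for now) safe */
	folio = (struct folio *)alloc_pages_node(node, flags, order);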

I wasn't expecting the previous version to make the next merge window, but
since Andrew decided he wanted it, here's what I'm currently thinking ...

Matthew Wilcox (Oracle) (6):
  mm: Remove inc/dec lruvec page state functions
  slub: Use alloc_pages_node() in alloc_slab_page()
  slub: Use folio APIs in free_large_kmalloc()
  slub: Use a folio in __kmalloc_large_node
  mm/khugepaged: Use a folio more in collapse_file()
  mm/memcontrol: Remove __mod_lruvec_page_state()

 include/linux/vmstat.h | 60 +++++++++++++-----------------------------
 mm/khugepaged.c        | 16 +++++------
 mm/memcontrol.c        |  9 +++----
 mm/slub.c              | 20 ++++++--------
 4 files changed, 38 insertions(+), 67 deletions(-)

-- 
2.43.0




* [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2024-01-02 11:02   ` Vlastimil Babka
  2023-12-28  8:57 ` [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

All callers of these have been converted to their folio equivalents.
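
For example (sketching the conversion pattern, with NR_SHMEM purely as
an illustration), a call site that did

	__inc_lruvec_page_state(page, NR_SHMEM);

is now written as

	__lruvec_stat_add_folio(folio, NR_SHMEM);

which adjusts the counter by folio_nr_pages() rather than by 1, so it
also does the right thing for large folios.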

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/vmstat.h | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index fed855bae6d8..147ae73e0ee7 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -597,18 +597,6 @@ static inline void mod_lruvec_page_state(struct page *page,
 
 #endif /* CONFIG_MEMCG */
 
-static inline void __inc_lruvec_page_state(struct page *page,
-					   enum node_stat_item idx)
-{
-	__mod_lruvec_page_state(page, idx, 1);
-}
-
-static inline void __dec_lruvec_page_state(struct page *page,
-					   enum node_stat_item idx)
-{
-	__mod_lruvec_page_state(page, idx, -1);
-}
-
 static inline void __lruvec_stat_mod_folio(struct folio *folio,
 					   enum node_stat_item idx, int val)
 {
@@ -627,18 +615,6 @@ static inline void __lruvec_stat_sub_folio(struct folio *folio,
 	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
 }
 
-static inline void inc_lruvec_page_state(struct page *page,
-					 enum node_stat_item idx)
-{
-	mod_lruvec_page_state(page, idx, 1);
-}
-
-static inline void dec_lruvec_page_state(struct page *page,
-					 enum node_stat_item idx)
-{
-	mod_lruvec_page_state(page, idx, -1);
-}
-
 static inline void lruvec_stat_mod_folio(struct folio *folio,
 					 enum node_stat_item idx, int val)
 {
-- 
2.43.0




* [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
  2023-12-28  8:57 ` [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
                     ` (2 more replies)
  2023-12-28  8:57 ` [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc() Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

For no apparent reason, we were open-coding alloc_pages_node() in
this function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/slub.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 35aa706dc318..342545775df6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2187,11 +2187,7 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
 	struct slab *slab;
 	unsigned int order = oo_order(oo);
 
-	if (node == NUMA_NO_NODE)
-		folio = (struct folio *)alloc_pages(flags, order);
-	else
-		folio = (struct folio *)__alloc_pages_node(node, flags, order);
-
+	folio = (struct folio *)alloc_pages_node(node, flags, order);
 	if (!folio)
 		return NULL;
 
-- 
2.43.0




* [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc()
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
  2023-12-28  8:57 ` [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions Matthew Wilcox (Oracle)
  2023-12-28  8:57 ` [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page() Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:10   ` Vlastimil Babka
  2023-12-28  8:57 ` [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

Save a few calls to compound_head() by using the folio APIs
directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/slub.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 342545775df6..58b4936f2a29 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4375,9 +4375,9 @@ static void free_large_kmalloc(struct folio *folio, void *object)
 	kasan_kfree_large(object);
 	kmsan_kfree_large(object);
 
-	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
+	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
 			      -(PAGE_SIZE << order));
-	__free_pages(folio_page(folio, 0), order);
+	folio_put(folio);
 }
 
 /**
-- 
2.43.0




* [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2023-12-28  8:57 ` [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc() Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:21   ` Vlastimil Babka
  2023-12-28  8:57 ` [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file() Matthew Wilcox (Oracle)
  2023-12-28  8:57 ` [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state() Matthew Wilcox (Oracle)
  5 siblings, 2 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

Mirror the code in free_large_kmalloc() and alloc_pages_node()
and use a folio directly.  Avoid the use of folio_alloc() as
that will set up an rmappable folio which we do not want here.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/slub.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 58b4936f2a29..71d5840de65d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3915,7 +3915,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_node);
  */
 static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-	struct page *page;
+	struct folio *folio;
 	void *ptr = NULL;
 	unsigned int order = get_order(size);
 
@@ -3923,10 +3923,10 @@ static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
 		flags = kmalloc_fix_flags(flags);
 
 	flags |= __GFP_COMP;
-	page = alloc_pages_node(node, flags, order);
-	if (page) {
-		ptr = page_address(page);
-		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
+	folio = (struct folio *)alloc_pages_node(node, flags, order);
+	if (folio) {
+		ptr = folio_address(folio);
+		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
 				      PAGE_SIZE << order);
 	}
 
-- 
2.43.0




* [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file()
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2023-12-28  8:57 ` [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2023-12-28 21:10   ` Zi Yan
  2024-01-02 11:24   ` Vlastimil Babka
  2023-12-28  8:57 ` [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state() Matthew Wilcox (Oracle)
  5 siblings, 2 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

This function is not yet fully converted to the folio API, but
this removes a few uses of old APIs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 13c6eadbeda3..b9b0742e4d9a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2126,23 +2126,23 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		xas_lock_irq(&xas);
 	}
 
-	nr = thp_nr_pages(hpage);
+	folio = page_folio(hpage);
+	nr = folio_nr_pages(folio);
 	if (is_shmem)
-		__mod_lruvec_page_state(hpage, NR_SHMEM_THPS, nr);
+		__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, nr);
 	else
-		__mod_lruvec_page_state(hpage, NR_FILE_THPS, nr);
+		__lruvec_stat_mod_folio(folio, NR_FILE_THPS, nr);
 
 	if (nr_none) {
-		__mod_lruvec_page_state(hpage, NR_FILE_PAGES, nr_none);
+		__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr_none);
 		/* nr_none is always 0 for non-shmem. */
-		__mod_lruvec_page_state(hpage, NR_SHMEM, nr_none);
+		__lruvec_stat_mod_folio(folio, NR_SHMEM, nr_none);
 	}
 
 	/*
 	 * Mark hpage as uptodate before inserting it into the page cache so
 	 * that it isn't mistaken for an fallocated but unwritten page.
 	 */
-	folio = page_folio(hpage);
 	folio_mark_uptodate(folio);
 	folio_ref_add(folio, HPAGE_PMD_NR - 1);
 
@@ -2152,7 +2152,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 
 	/* Join all the small entries into a single multi-index entry. */
 	xas_set_order(&xas, start, HPAGE_PMD_ORDER);
-	xas_store(&xas, hpage);
+	xas_store(&xas, folio);
 	WARN_ON_ONCE(xas_error(&xas));
 	xas_unlock_irq(&xas);
 
@@ -2163,7 +2163,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	retract_page_tables(mapping, start);
 	if (cc && !cc->is_khugepaged)
 		result = SCAN_PTE_MAPPED_HUGEPAGE;
-	unlock_page(hpage);
+	folio_unlock(folio);
 
 	/*
 	 * The collapse has succeeded, so free the old pages.
-- 
2.43.0




* [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state()
  2023-12-28  8:57 [PATCH v2 0/6] Remove some lruvec page accounting functions Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2023-12-28  8:57 ` [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file() Matthew Wilcox (Oracle)
@ 2023-12-28  8:57 ` Matthew Wilcox (Oracle)
  2023-12-28 21:24   ` Zi Yan
                     ` (2 more replies)
  5 siblings, 3 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-28  8:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Johannes Weiner,
	Vlastimil Babka, Hyeonggon Yoo

There are no more callers of __mod_lruvec_page_state(), so convert
the implementation to __lruvec_stat_mod_folio(), removing two calls to
compound_head() (one explicit, one hidden inside page_memcg()).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/vmstat.h | 36 ++++++++++++++++++------------------
 mm/memcontrol.c        |  9 ++++-----
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 147ae73e0ee7..343906a98d6e 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -556,19 +556,25 @@ static inline void mod_lruvec_state(struct lruvec *lruvec,
 	local_irq_restore(flags);
 }
 
-void __mod_lruvec_page_state(struct page *page,
+void __lruvec_stat_mod_folio(struct folio *folio,
 			     enum node_stat_item idx, int val);
 
-static inline void mod_lruvec_page_state(struct page *page,
+static inline void lruvec_stat_mod_folio(struct folio *folio,
 					 enum node_stat_item idx, int val)
 {
 	unsigned long flags;
 
 	local_irq_save(flags);
-	__mod_lruvec_page_state(page, idx, val);
+	__lruvec_stat_mod_folio(folio, idx, val);
 	local_irq_restore(flags);
 }
 
+static inline void mod_lruvec_page_state(struct page *page,
+					 enum node_stat_item idx, int val)
+{
+	lruvec_stat_mod_folio(page_folio(page), idx, val);
+}
+
 #else
 
 static inline void __mod_lruvec_state(struct lruvec *lruvec,
@@ -583,10 +589,16 @@ static inline void mod_lruvec_state(struct lruvec *lruvec,
 	mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
 }
 
-static inline void __mod_lruvec_page_state(struct page *page,
-					   enum node_stat_item idx, int val)
+static inline void __lruvec_stat_mod_folio(struct folio *folio,
+					 enum node_stat_item idx, int val)
 {
-	__mod_node_page_state(page_pgdat(page), idx, val);
+	__mod_node_page_state(folio_pgdat(folio), idx, val);
+}
+
+static inline void lruvec_stat_mod_folio(struct folio *folio,
+					 enum node_stat_item idx, int val)
+{
+	mod_node_page_state(folio_pgdat(folio), idx, val);
 }
 
 static inline void mod_lruvec_page_state(struct page *page,
@@ -597,12 +609,6 @@ static inline void mod_lruvec_page_state(struct page *page,
 
 #endif /* CONFIG_MEMCG */
 
-static inline void __lruvec_stat_mod_folio(struct folio *folio,
-					   enum node_stat_item idx, int val)
-{
-	__mod_lruvec_page_state(&folio->page, idx, val);
-}
-
 static inline void __lruvec_stat_add_folio(struct folio *folio,
 					   enum node_stat_item idx)
 {
@@ -615,12 +621,6 @@ static inline void __lruvec_stat_sub_folio(struct folio *folio,
 	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
 }
 
-static inline void lruvec_stat_mod_folio(struct folio *folio,
-					 enum node_stat_item idx, int val)
-{
-	mod_lruvec_page_state(&folio->page, idx, val);
-}
-
 static inline void lruvec_stat_add_folio(struct folio *folio,
 					 enum node_stat_item idx)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 36bb18d7b397..7a759554bec6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -891,16 +891,15 @@ void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 		__mod_memcg_lruvec_state(lruvec, idx, val);
 }
 
-void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx,
+void __lruvec_stat_mod_folio(struct folio *folio, enum node_stat_item idx,
 			     int val)
 {
-	struct page *head = compound_head(page); /* rmap on tail pages */
 	struct mem_cgroup *memcg;
-	pg_data_t *pgdat = page_pgdat(page);
+	pg_data_t *pgdat = folio_pgdat(folio);
 	struct lruvec *lruvec;
 
 	rcu_read_lock();
-	memcg = page_memcg(head);
+	memcg = folio_memcg(folio);
 	/* Untracked pages have no memcg, no lruvec. Update only the node */
 	if (!memcg) {
 		rcu_read_unlock();
@@ -912,7 +911,7 @@ void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx,
 	__mod_lruvec_state(lruvec, idx, val);
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(__mod_lruvec_page_state);
+EXPORT_SYMBOL(__lruvec_stat_mod_folio);
 
 void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
 {
-- 
2.43.0




* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2023-12-28  8:57 ` [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page() Matthew Wilcox (Oracle)
@ 2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:06   ` Vlastimil Babka
  2024-07-09 17:12   ` Christoph Lameter (Ampere)
  2 siblings, 0 replies; 25+ messages in thread
From: David Rientjes @ 2023-12-28 20:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo

On Thu, 28 Dec 2023, Matthew Wilcox (Oracle) wrote:

> For no apparent reason, we were open-coding alloc_pages_node() in
> this function.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: David Rientjes <rientjes@google.com>



* Re: [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc()
  2023-12-28  8:57 ` [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc() Matthew Wilcox (Oracle)
@ 2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:10   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: David Rientjes @ 2023-12-28 20:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo

On Thu, 28 Dec 2023, Matthew Wilcox (Oracle) wrote:

> Save a few calls to compound_head() by using the folio APIs
> directly.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: David Rientjes <rientjes@google.com>



* Re: [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node
  2023-12-28  8:57 ` [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node Matthew Wilcox (Oracle)
@ 2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:21   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: David Rientjes @ 2023-12-28 20:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo

On Thu, 28 Dec 2023, Matthew Wilcox (Oracle) wrote:

> Mirror the code in free_large_kmalloc() and alloc_pages_node()
> and use a folio directly.  Avoid the use of folio_alloc() as
> that will set up an rmappable folio which we do not want here.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: David Rientjes <rientjes@google.com>



* Re: [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file()
  2023-12-28  8:57 ` [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file() Matthew Wilcox (Oracle)
@ 2023-12-28 21:10   ` Zi Yan
  2024-01-02 11:24   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Zi Yan @ 2023-12-28 21:10 UTC (permalink / raw)
  To: "Matthew Wilcox (Oracle)"
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo


On 28 Dec 2023, at 3:57, Matthew Wilcox (Oracle) wrote:

> This function is not yet fully converted to the folio API, but
> this removes a few uses of old APIs.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/khugepaged.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>


--
Best Regards,
Yan, Zi



* Re: [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state()
  2023-12-28  8:57 ` [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state() Matthew Wilcox (Oracle)
@ 2023-12-28 21:24   ` Zi Yan
  2023-12-28 22:31   ` Shakeel Butt
  2024-01-02 11:56   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2023-12-28 21:24 UTC (permalink / raw)
  To: "Matthew Wilcox (Oracle)"
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo


On 28 Dec 2023, at 3:57, Matthew Wilcox (Oracle) wrote:

> There are no more callers of __mod_lruvec_page_state(), so convert
> the implementation to __lruvec_stat_mod_folio(), removing two calls to
> compound_head() (one explicit, one hidden inside page_memcg()).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/vmstat.h | 36 ++++++++++++++++++------------------
>  mm/memcontrol.c        |  9 ++++-----
>  2 files changed, 22 insertions(+), 23 deletions(-)

LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi



* Re: [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state()
  2023-12-28  8:57 ` [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state() Matthew Wilcox (Oracle)
  2023-12-28 21:24   ` Zi Yan
@ 2023-12-28 22:31   ` Shakeel Butt
  2024-01-02 11:56   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Shakeel Butt @ 2023-12-28 22:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo

On Thu, Dec 28, 2023 at 08:57:48AM +0000, Matthew Wilcox (Oracle) wrote:
> There are no more callers of __mod_lruvec_page_state(), so convert
> the implementation to __lruvec_stat_mod_folio(), removing two calls to
> compound_head() (one explicit, one hidden inside page_memcg()).
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions
  2023-12-28  8:57 ` [PATCH v2 1/6] mm: Remove inc/dec lruvec page state functions Matthew Wilcox (Oracle)
@ 2024-01-02 11:02   ` Vlastimil Babka
  0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:02 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> All callers of these have been converted to their folio equivalents.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  include/linux/vmstat.h | 24 ------------------------
>  1 file changed, 24 deletions(-)
> 
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index fed855bae6d8..147ae73e0ee7 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -597,18 +597,6 @@ static inline void mod_lruvec_page_state(struct page *page,
>  
>  #endif /* CONFIG_MEMCG */
>  
> -static inline void __inc_lruvec_page_state(struct page *page,
> -					   enum node_stat_item idx)
> -{
> -	__mod_lruvec_page_state(page, idx, 1);
> -}
> -
> -static inline void __dec_lruvec_page_state(struct page *page,
> -					   enum node_stat_item idx)
> -{
> -	__mod_lruvec_page_state(page, idx, -1);
> -}
> -
>  static inline void __lruvec_stat_mod_folio(struct folio *folio,
>  					   enum node_stat_item idx, int val)
>  {
> @@ -627,18 +615,6 @@ static inline void __lruvec_stat_sub_folio(struct folio *folio,
>  	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
>  }
>  
> -static inline void inc_lruvec_page_state(struct page *page,
> -					 enum node_stat_item idx)
> -{
> -	mod_lruvec_page_state(page, idx, 1);
> -}
> -
> -static inline void dec_lruvec_page_state(struct page *page,
> -					 enum node_stat_item idx)
> -{
> -	mod_lruvec_page_state(page, idx, -1);
> -}
> -
>  static inline void lruvec_stat_mod_folio(struct folio *folio,
>  					 enum node_stat_item idx, int val)
>  {




* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2023-12-28  8:57 ` [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page() Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
@ 2024-01-02 11:06   ` Vlastimil Babka
  2024-07-09 17:12   ` Christoph Lameter (Ampere)
  2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:06 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> For no apparent reason, we were open-coding alloc_pages_node() in
> this function.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/slub.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 35aa706dc318..342545775df6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2187,11 +2187,7 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
>  	struct slab *slab;
>  	unsigned int order = oo_order(oo);
>  
> -	if (node == NUMA_NO_NODE)
> -		folio = (struct folio *)alloc_pages(flags, order);
> -	else
> -		folio = (struct folio *)__alloc_pages_node(node, flags, order);
> -
> +	folio = (struct folio *)alloc_pages_node(node, flags, order);
>  	if (!folio)
>  		return NULL;
>  




* Re: [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc()
  2023-12-28  8:57 ` [PATCH v2 3/6] slub: Use folio APIs in free_large_kmalloc() Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
@ 2024-01-02 11:10   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:10 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> Save a few calls to compound_head() by using the folio APIs
> directly.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/slub.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 342545775df6..58b4936f2a29 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4375,9 +4375,9 @@ static void free_large_kmalloc(struct folio *folio, void *object)
>  	kasan_kfree_large(object);
>  	kmsan_kfree_large(object);
>  
> -	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
> +	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
>  			      -(PAGE_SIZE << order));
> -	__free_pages(folio_page(folio, 0), order);
> +	folio_put(folio);
>  }
>  
>  /**




* Re: [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node
  2023-12-28  8:57 ` [PATCH v2 4/6] slub: Use a folio in __kmalloc_large_node Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
@ 2024-01-02 11:21   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:21 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> Mirror the code in free_large_kmalloc() and alloc_pages_node()
> and use a folio directly.  Avoid the use of folio_alloc() as
> that will set up an rmappable folio which we do not want here.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/slub.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 58b4936f2a29..71d5840de65d 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3915,7 +3915,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_node);
>   */
>  static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
>  {
> -	struct page *page;
> +	struct folio *folio;
>  	void *ptr = NULL;
>  	unsigned int order = get_order(size);
>  
> @@ -3923,10 +3923,10 @@ static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
>  		flags = kmalloc_fix_flags(flags);
>  
>  	flags |= __GFP_COMP;
> -	page = alloc_pages_node(node, flags, order);
> -	if (page) {
> -		ptr = page_address(page);
> -		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
> +	folio = (struct folio *)alloc_pages_node(node, flags, order);
> +	if (folio) {
> +		ptr = folio_address(folio);
> +		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
>  				      PAGE_SIZE << order);
>  	}
>  




* Re: [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file()
  2023-12-28  8:57 ` [PATCH v2 5/6] mm/khugepaged: Use a folio more in collapse_file() Matthew Wilcox (Oracle)
  2023-12-28 21:10   ` Zi Yan
@ 2024-01-02 11:24   ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> This function is not yet fully converted to the folio API, but
> this removes a few uses of old APIs.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>




* Re: [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state()
  2023-12-28  8:57 ` [PATCH v2 6/6] mm/memcontrol: Remove __mod_lruvec_page_state() Matthew Wilcox (Oracle)
  2023-12-28 21:24   ` Zi Yan
  2023-12-28 22:31   ` Shakeel Butt
@ 2024-01-02 11:56   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-01-02 11:56 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton
  Cc: linux-mm, Johannes Weiner, Hyeonggon Yoo

On 12/28/23 09:57, Matthew Wilcox (Oracle) wrote:
> There are no more callers of __mod_lruvec_page_state(), so convert
> the implementation to __lruvec_stat_mod_folio(), removing two calls to
> compound_head() (one explicit, one hidden inside page_memcg()).
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>





* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2023-12-28  8:57 ` [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page() Matthew Wilcox (Oracle)
  2023-12-28 20:37   ` David Rientjes
  2024-01-02 11:06   ` Vlastimil Babka
@ 2024-07-09 17:12   ` Christoph Lameter (Ampere)
  2024-07-10 10:35     ` Vlastimil Babka
  2 siblings, 1 reply; 25+ messages in thread
From: Christoph Lameter (Ampere) @ 2024-07-09 17:12 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Vlastimil Babka,
	Hyeonggon Yoo

On Thu, 28 Dec 2023, Matthew Wilcox (Oracle) wrote:

> For no apparent reason, we were open-coding alloc_pages_node() in
> this function.

The reason is that alloc_pages() follows memory policies, cgroup
restrictions, etc., and alloc_pages_node() does not.

With this patch, cgroup restrictions, memory policies, etc. no longer
work in the slab allocator.

Please revert this patch.

> diff --git a/mm/slub.c b/mm/slub.c
> index 35aa706dc318..342545775df6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2187,11 +2187,7 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
> 	struct slab *slab;
> 	unsigned int order = oo_order(oo);
>
> -	if (node == NUMA_NO_NODE)
> -		folio = (struct folio *)alloc_pages(flags, order);
> -	else
> -		folio = (struct folio *)__alloc_pages_node(node, flags, order);
> -
> +	folio = (struct folio *)alloc_pages_node(node, flags, order);
> 	if (!folio)
> 		return NULL;
>
> -- 
> 2.43.0
>
>



* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2024-07-09 17:12   ` Christoph Lameter (Ampere)
@ 2024-07-10 10:35     ` Vlastimil Babka
  2024-07-10 16:43       ` Christoph Lameter (Ampere)
  0 siblings, 1 reply; 25+ messages in thread
From: Vlastimil Babka @ 2024-07-10 10:35 UTC (permalink / raw)
  To: Christoph Lameter (Ampere), Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Johannes Weiner, Hyeonggon Yoo

On 7/9/24 7:12 PM, Christoph Lameter (Ampere) wrote:
> On Thu, 28 Dec 2023, Matthew Wilcox (Oracle) wrote:
> 
>> For no apparent reason, we were open-coding alloc_pages_node() in
>> this function.
> 
> The reason is that alloc_pages() follows memory policies, cgroup
> restrictions, etc., and alloc_pages_node() does not.
> 
> With this patch, cgroup restrictions, memory policies, etc. no longer
> work in the slab allocator.

The only difference is memory policy from get_task_policy(), and the rest is
the same, right?

> Please revert this patch.

But this only affects new slab page allocation, while getting objects from
existing slabs isn't subject to memory policies, so now it's at least
consistent? Do you have some use case where it matters?
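
(For context: the allocation fastpath hands out whatever the current
per-cpu slab contains, with no node check; very roughly

	object = c->freelist;
	if (object) {
		c->freelist = get_freepointer(s, object);
		return object;
	}

so objects taken from an already-allocated slab never consult a
mempolicy.)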

>> diff --git a/mm/slub.c b/mm/slub.c
>> index 35aa706dc318..342545775df6 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2187,11 +2187,7 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
>> 	struct slab *slab;
>> 	unsigned int order = oo_order(oo);
>>
>> -	if (node == NUMA_NO_NODE)
>> -		folio = (struct folio *)alloc_pages(flags, order);
>> -	else
>> -		folio = (struct folio *)__alloc_pages_node(node, flags, order);
>> -
>> +	folio = (struct folio *)alloc_pages_node(node, flags, order);
>> 	if (!folio)
>> 		return NULL;
>>
>> -- 
>> 2.43.0
>>
>>




* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2024-07-10 10:35     ` Vlastimil Babka
@ 2024-07-10 16:43       ` Christoph Lameter (Ampere)
  2024-07-11  7:54         ` Vlastimil Babka
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Lameter (Ampere) @ 2024-07-10 16:43 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Matthew Wilcox (Oracle), Andrew Morton, linux-mm, Johannes Weiner,
	Hyeonggon Yoo

On Wed, 10 Jul 2024, Vlastimil Babka wrote:

>> With this patch, cgroup restrictions, memory policies, etc. no longer
>> work in the slab allocator.
>
> The only difference is memory policy from get_task_policy(), and the rest is
> the same, right?

There are also the cpuset/cgroup restrictions via the zonelists that are
bypassed by removing alloc_pages().

> But this only affects new slab page allocation, while getting objects from
> existing slabs isn't subject to memory policies, so now it's at least
> consistent? Do you have some use case where it matters?

Yes, this means you cannot redirect kmalloc-based kernel metadata
allocation when, for example, creating cgroups for another NUMA node.
This affects all kernel metadata allocation during syscalls that used
to be controllable via NUMA methods.

SLAB implemented memory allocation policies per object. SLUB moved that
to implement these policies only when allocating a page frame. If this
patch is left in, then there won't be any support for memory allocation
policies left in the slab allocators.

We have some internal patches now that implement memory policies on a
per-object basis for SLUB here.

This is a 10-15% regression on various benchmarks when objects like the 
scheduler statistics structures are misplaced.




* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2024-07-10 16:43       ` Christoph Lameter (Ampere)
@ 2024-07-11  7:54         ` Vlastimil Babka
  2024-07-11 18:04           ` Christoph Lameter (Ampere)
  0 siblings, 1 reply; 25+ messages in thread
From: Vlastimil Babka @ 2024-07-11  7:54 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Matthew Wilcox (Oracle), Andrew Morton, linux-mm, Johannes Weiner,
	Hyeonggon Yoo

On 7/10/24 6:43 PM, Christoph Lameter (Ampere) wrote:
> On Wed, 10 Jul 2024, Vlastimil Babka wrote:
> 
>>> With this patch, cgroup restrictions, memory policies, etc. no longer
>>> work in the slab allocator.
>>
>> The only difference is memory policy from get_task_policy(), and the rest is
>> the same, right?
> 
> There are also the cpuset/cgroup restrictions via the zonelists that are 
> bypassed by removing alloc_pages()

AFAICS cpusets are handled on a level that's reached by both paths, i.e.
prepare_alloc_pages(), and I see nothing that would make switching to
alloc_pages_node() bypass it. Am I missing something?
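
(Paraphrasing the two entry points from memory, so not exact kernel
source: with CONFIG_NUMA, alloc_pages() consults get_task_policy()
before picking a node, whereas

	static inline struct page *alloc_pages_node(int nid, gfp_t gfp,
						    unsigned int order)
	{
		if (nid == NUMA_NO_NODE)
			nid = numa_mem_id();

		return __alloc_pages_node(nid, gfp, order);
	}

simply falls back to the local node; both paths then funnel through
__alloc_pages() and prepare_alloc_pages(), which is where the cpuset
handling sits.)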

>> But this only affects new slab page allocation, while getting objects from
>> existing slabs isn't subject to memory policies, so now it's at least
>> consistent? Do you have some use case where it matters?
> 
> Yes, this means you cannot redirect kmalloc-based kernel metadata
> allocation when, for example, creating cgroups for another NUMA node.
> This affects all kernel metadata allocation during syscalls that used
> to be controllable via NUMA methods.
> 
> SLAB implemented memory allocation policies per object. SLUB moved that
> to implement these policies only when allocating a page frame. If this
> patch is left in, then there won't be any support for memory allocation
> policies left in the slab allocators.
> 
> We have some internal patches now that implement memory policies on a
> per-object basis for SLUB here.
> 
> This is a 10-15% regression on various benchmarks when objects like the 
> scheduler statistics structures are misplaced.

I believe it would be best if you submitted a patch with all that
reasoning. Thanks!



* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2024-07-11  7:54         ` Vlastimil Babka
@ 2024-07-11 18:04           ` Christoph Lameter (Ampere)
  2024-07-12  7:47             ` Vlastimil Babka
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Lameter (Ampere) @ 2024-07-11 18:04 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Matthew Wilcox (Oracle), Andrew Morton, linux-mm, Johannes Weiner,
	Hyeonggon Yoo

On Thu, 11 Jul 2024, Vlastimil Babka wrote:

>> There are also the cpuset/cgroup restrictions via the zonelists that are
>> bypassed by removing alloc_pages()
>
> AFAICS cpusets are handled on a level that's reached by both paths, i.e.
> prepare_alloc_pages(), and I see nothing that would make switching to
> alloc_pages_node() bypass it. Am I missing something?

You are correct. cpuset/cgroup restrictions also apply to 
alloc_pages_node().

>> We have some internal patches now that implement memory policies on a
>> per-object basis for SLUB here.
>>
>> This is a 10-15% regression on various benchmarks when objects like the
>> scheduler statistics structures are misplaced.
>
> I believe it would be best if you submitted a patch with all that
> reasoning. Thanks!

Turns out those performance issues are related to the fact that NUMA
locality is only considered at the folio level for slab allocation.
Individual object allocations are not subject to it.

The performance issue comes about in the following way:

Two kernel threads run on the same cpu using the same slab cache. One of 
them keeps on allocating from a different node via kmalloc_node() and the 
other is using kmalloc(). Then the kmalloc_node() thread will always 
ensure that the per cpu slab is from the other node.

The other thread will use kmalloc(), which does not check which node the
per cpu slab is from. Therefore the kmalloc() thread can continually be
served objects that are not local. That is not good and causes 
misplacement of objects.
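
In other words (a hypothetical pair of kernel threads, both running on
the same CPU and using the same kmalloc size class; remote_node is just
a stand-in for some non-local node):

	/* thread A: keeps refilling the shared per cpu slab from a
	 * remote node */
	a = kmalloc_node(64, GFP_KERNEL, remote_node);

	/* thread B: takes whatever the per cpu slab currently holds,
	 * without checking its node, so it keeps getting remote
	 * objects */
	b = kmalloc(64, GFP_KERNEL);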

But that issue is separate from this commit, and we see the same
regression before this commit.

This patch still needs to be reverted, since the rationale for the patch
is not right and it disables memory policy support. It results in the
strange situation that memory policies are used in get_any_partial() in
slub but no longer during allocation.




* Re: [PATCH v2 2/6] slub: Use alloc_pages_node() in alloc_slab_page()
  2024-07-11 18:04           ` Christoph Lameter (Ampere)
@ 2024-07-12  7:47             ` Vlastimil Babka
  0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2024-07-12  7:47 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Matthew Wilcox (Oracle), Andrew Morton, linux-mm, Johannes Weiner,
	Hyeonggon Yoo

On 7/11/24 8:04 PM, Christoph Lameter (Ampere) wrote:
> On Thu, 11 Jul 2024, Vlastimil Babka wrote:
> 
>>> There are also the cpuset/cgroup restrictions via the zonelists that are
>>> bypassed by removing alloc_pages()
>>
>> AFAICS cpusets are handled on a level that's reached by both paths, i.e.
>> prepare_alloc_pages(), and I see nothing that would make switching to
>> alloc_pages_node() bypass it. Am I missing something?
> 
> You are correct. cpuset/cgroup restrictions also apply to 
> alloc_pages_node().
> 
>>> We have some internal patches now that implement memory policies on a per
>>> object basis for SLUB here.
>>>
>>> This is a 10-15% regression on various benchmarks when objects like the
>>> scheduler statistics structures are misplaced.
>>
>> I believe it would be best if you submitted a patch with all that
>> reasoning. Thanks!

I still believe that :)

> Turns out those performance issues are related to the fact that NUMA
> locality is only considered at the folio level for slab allocation.
> Individual object allocations are not subject to it.
> 
> The performance issue comes about in the following way:
> 
> Two kernel threads run on the same cpu using the same slab cache. One of 
> them keeps on allocating from a different node via kmalloc_node() and the 
> other is using kmalloc(). Then the kmalloc_node() thread will always 
> ensure that the per cpu slab is from the other node.
> 
> The other thread will use kmalloc(), which does not check which node the
> per cpu slab is from. Therefore the kmalloc() thread can continually be
> served objects that are not local. That is not good and causes 
> misplacement of objects.
> 
> But that issue is separate from this commit, and we see the same
> regression before this commit.
> 
> This patch still needs to be reverted, since the rationale for the patch
> is not right and it disables memory policy support. It results in the
> strange situation that memory policies are used in get_any_partial() in
> slub but no longer during allocation.
> 


