* [PATCH v4 00/32] Separate struct slab from struct page
@ 2022-01-04 0:10 Vlastimil Babka
[not found] ` <20220104001046.12263-1-vbabka-AlSwsSmVLrQ@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2022-01-04 0:10 UTC (permalink / raw)
To: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim,
Pekka Enberg
Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner,
Roman Gushchin, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw,
Vlastimil Babka, Alexander Potapenko, Andrey Konovalov,
Andrey Ryabinin, Andy Lutomirski, Borislav Petkov,
cgroups-u79uwXL29TY76Z2rM5mHXA, Dave Hansen, Dmitry Vyukov,
H. Peter Anvin, Ingo Molnar, Julia Lawall,
kasan-dev-/JYPxA39Uh5TLH3MbocFFw, Luis Chamberlain, Marco Elver,
Michal Hocko, Mi
Folks from non-slab subsystems are Cc'd only on patches affecting them, and
on this cover letter.
Series also available in git, based on 5.16-rc6:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v4r2
The plan is to submit this as a pull request; the previous versions have
been in linux-next since v2 in early December. This v4 has been in
linux-next since Dec 22:
https://lore.kernel.org/all/f3a83708-3f3c-a634-7bee-dcfcaaa7f36e-AlSwsSmVLrQ@public.gmane.org/
I planned to post it on the mailing list for any final review in January, so
this is it. Only reviewed/tested tags from Hyeonggon Yoo have been added
meanwhile.
Changes from v3:
https://lore.kernel.org/all/4c3dfdfa-2e19-a9a7-7945-3d75bc87ca05-AlSwsSmVLrQ@public.gmane.org/
- rebase to 5.16-rc6 to avoid a conflict with mainline
- collect acks/reviews/tested-by from Johannes, Roman, Hyeonggon Yoo -
thanks!
- in patch "mm/slub: Convert detached_freelist to use a struct slab"
renamed free_nonslab_page() to free_large_kmalloc() and used folio there,
as suggested by Roman
- in "mm/memcg: Convert slab objcgs from struct page to struct slab"
changed one caller of slab_objcgs_check() to slab_objcgs() as suggested
by Johannes, realized the other caller should also be changed, and removed
slab_objcgs_check() completely.
Initial version from Matthew Wilcox:
https://lore.kernel.org/all/20211004134650.4031813-1-willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org/
LWN coverage of the above:
https://lwn.net/Articles/871982/
This is originally an offshoot of the folio work by Matthew. One of the more
complex parts of the struct page definition is the set of fields used by the
slab allocators. It would be good for the MM in general if struct slab were
its own data type, and it also helps to prevent tail pages from slipping in
anywhere.
As Matthew requested in his proof of concept series, I have taken over the
development of this series, so it's a mix of patches from him (often modified
by me) and my own.
One big difference is the use of coccinelle to perform the relatively trivial
parts of the conversions automatically and at once, instead of a larger number
of smaller incremental reviewable steps. Thanks to Julia Lawall and Luis
Chamberlain for all their help!
Another notable difference (based also on review feedback) is that I don't
represent large kmalloc allocations with a struct slab, as they are not
really slabs but use the page allocator directly. When going from an object
address to a struct slab, the code first tests the folio slab flag, and only
if it's set does it convert to struct slab. This makes the struct slab type
stronger.
Finally, although Matthew's version didn't use any of the folio work, the
initial folio support has been merged meanwhile, so my version builds on top
of it where appropriate. This eliminates some of the redundant
compound_head() calls performed e.g. when testing the slab flag.
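The compound_head() saving can be sketched with a toy model (hypothetical
simplified types; real folios encode the head-page relationship
differently):

```c
#include <assert.h>

/* Toy model: tail pages point at their head page; a folio pointer is
 * by definition already the head page. */
struct page {
	struct page *head;
	int slab_flag;
};

struct folio {
	struct page page;
};

static struct page *compound_head(struct page *page)
{
	return page->head;
}

/* A page-based test must resolve the head page first... */
static int PageSlab(struct page *page)
{
	return compound_head(page)->slab_flag;
}

/* ...while a folio-based test can read the flag directly. */
static int folio_test_slab(const struct folio *folio)
{
	return folio->page.slab_flag;
}
```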
To sum up, after this series, the struct page fields used by slab allocators
are moved from struct page to a new struct slab, which uses the same
physical storage. The availability of the fields is further distinguished by
the selected slab allocator implementation. The advantages include:
- Similar to folios, if the slab is of order > 0, struct slab is always
guaranteed to correspond to the head page. Additionally it's guaranteed to
be an actual slab page, not a large kmalloc allocation. This removes
uncertainty and potential for bugs.
- It's not possible to accidentally use fields of the slab implementation that's
not configured.
- Other subsystems cannot use slab's fields in struct page anymore (some
existing non-slab usages had to be adjusted in this series), so slab
implementations have more freedom in rearranging them in the struct slab.
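A toy example of the type-safety advantage (hypothetical simplified types
modeled on the real page_slab() helper; not the kernel definitions):

```c
#include <assert.h>

/* Distinct types: a helper taking struct slab * can no longer be handed
 * an arbitrary struct page * by accident -- the compiler rejects it. */
struct page {
	unsigned long flags;
};

struct slab {
	struct page page;	/* overlays the same storage */
	unsigned int objects;
};

/* Explicit conversion, modeled on the real page_slab() helper. */
static struct slab *page_slab(struct page *page)
{
	return (struct slab *)page;
}

static unsigned int objs_per_slab(const struct slab *slab)
{
	return slab->objects;
}
```

A call like objs_per_slab(page) now fails to compile; callers must spell out
objs_per_slab(page_slab(page)), so every page-to-slab transition is explicit
and auditable.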
Hyeonggon Yoo (1):
mm/slob: Remove unnecessary page_mapcount_reset() function call
Matthew Wilcox (Oracle) (14):
mm: Split slab into its own type
mm: Convert [un]account_slab_page() to struct slab
mm: Convert virt_to_cache() to use struct slab
mm: Convert __ksize() to struct slab
mm: Use struct slab in kmem_obj_info()
mm: Convert check_heap_object() to use struct slab
mm/slub: Convert detached_freelist to use a struct slab
mm/slub: Convert kfree() to use a struct slab
mm/slub: Convert print_page_info() to print_slab_info()
mm/slub: Convert pfmemalloc_match() to take a struct slab
mm/slob: Convert SLOB to use struct slab and struct folio
mm/kasan: Convert to struct folio and struct slab
zsmalloc: Stop using slab fields in struct page
bootmem: Use page->index instead of page->freelist
Vlastimil Babka (17):
mm: add virt_to_folio() and folio_address()
mm/slab: Dissolve slab_map_pages() in its caller
mm/slub: Make object_err() static
mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
mm/slub: Convert alloc_slab_page() to return a struct slab
mm/slub: Convert __free_slab() to use struct slab
mm/slub: Convert most struct page to struct slab by spatch
mm/slub: Finish struct page to struct slab conversion
mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
mm/slab: Convert most struct page to struct slab by spatch
mm/slab: Finish struct page to struct slab conversion
mm: Convert struct page to struct slab in functions used by other
subsystems
mm/memcg: Convert slab objcgs from struct page to struct slab
mm/kfence: Convert kfence_guarded_alloc() to struct slab
mm/sl*b: Differentiate struct slab fields by sl*b implementations
mm/slub: Simplify struct slab slabs field definition
mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only
when enabled
arch/x86/mm/init_64.c | 2 +-
include/linux/bootmem_info.h | 2 +-
include/linux/kasan.h | 9 +-
include/linux/memcontrol.h | 48 --
include/linux/mm.h | 12 +
include/linux/mm_types.h | 10 +-
include/linux/slab.h | 8 -
include/linux/slab_def.h | 16 +-
include/linux/slub_def.h | 29 +-
mm/bootmem_info.c | 7 +-
mm/kasan/common.c | 27 +-
mm/kasan/generic.c | 8 +-
mm/kasan/kasan.h | 1 +
mm/kasan/quarantine.c | 2 +-
mm/kasan/report.c | 13 +-
mm/kasan/report_tags.c | 10 +-
mm/kfence/core.c | 17 +-
mm/kfence/kfence_test.c | 6 +-
mm/memcontrol.c | 47 +-
mm/slab.c | 456 +++++++------
mm/slab.h | 305 +++++++--
mm/slab_common.c | 14 +-
mm/slob.c | 62 +-
mm/slub.c | 1177 +++++++++++++++++-----------------
mm/sparse.c | 2 +-
mm/usercopy.c | 13 +-
mm/zsmalloc.c | 18 +-
27 files changed, 1263 insertions(+), 1058 deletions(-)
--
2.34.1
^ permalink raw reply	[flat|nested] 9+ messages in thread
[parent not found: <20220104001046.12263-1-vbabka-AlSwsSmVLrQ@public.gmane.org>]
* [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems [not found] ` <20220104001046.12263-1-vbabka-AlSwsSmVLrQ@public.gmane.org> @ 2022-01-04 0:10 ` Vlastimil Babka [not found] ` <20220104001046.12263-23-vbabka-AlSwsSmVLrQ@public.gmane.org> 2022-01-04 0:10 ` [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab Vlastimil Babka 1 sibling, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2022-01-04 0:10 UTC (permalink / raw) To: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Vlastimil Babka, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasan-dev-/JYPxA39Uh5TLH3MbocFFw, cgroups-u79uwXL29TY76Z2rM5mHXA KASAN, KFENCE and memcg interact with SLAB or SLUB internals through functions nearest_obj(), obj_to_index() and objs_per_slab() that use struct page as parameter. This patch converts it to struct slab including all callers, through a coccinelle semantic patch. // Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c // Note: needs coccinelle 1.1.1 to avoid breaking whitespace @@ @@ -objs_per_slab_page( +objs_per_slab( ... ) { ... } @@ @@ -objs_per_slab_page( +objs_per_slab( ... ) @@ identifier fn =~ "obj_to_index|objs_per_slab"; @@ fn(..., - const struct page *page + const struct slab *slab ,...) { <... ( - page_address(page) + slab_address(slab) | - page + slab ) ...> } @@ identifier fn =~ "nearest_obj"; @@ fn(..., - struct page *page + const struct slab *slab ,...) { <... 
( - page_address(page) + slab_address(slab) | - page + slab ) ...> } @@ identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab"; expression E; @@ fn(..., ( - slab_page(E) + E | - virt_to_page(E) + virt_to_slab(E) | - virt_to_head_page(E) + virt_to_slab(E) | - page + page_slab(page) ) ,...) Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> Reviewed-by: Andrey Konovalov <andreyknvl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> Cc: Julia Lawall <julia.lawall-MZpvjPyXg2s@public.gmane.org> Cc: Luis Chamberlain <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Cc: Andrey Ryabinin <ryabinin.a.a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Cc: Alexander Potapenko <glider-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Cc: Andrey Konovalov <andreyknvl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Cc: Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Cc: Marco Elver <elver-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Cc: <kasan-dev-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> --- include/linux/slab_def.h | 16 ++++++++-------- include/linux/slub_def.h | 18 +++++++++--------- mm/kasan/common.c | 4 ++-- mm/kasan/generic.c | 2 +- mm/kasan/report.c | 2 +- mm/kasan/report_tags.c | 2 +- mm/kfence/kfence_test.c | 4 ++-- mm/memcontrol.c | 4 ++-- mm/slab.c | 10 +++++----- mm/slab.h | 4 ++-- mm/slub.c | 2 +- 11 files changed, 34 insertions(+), 34 deletions(-) diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h index 3aa5e1e73ab6..e24c9aff6fed 100644 --- a/include/linux/slab_def.h +++ b/include/linux/slab_def.h @@ -87,11 +87,11 @@ struct kmem_cache { struct kmem_cache_node *node[MAX_NUMNODES]; }; -static inline void *nearest_obj(struct 
kmem_cache *cache, struct page *page, +static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab, void *x) { - void *object = x - (x - page->s_mem) % cache->size; - void *last_object = page->s_mem + (cache->num - 1) * cache->size; + void *object = x - (x - slab->s_mem) % cache->size; + void *last_object = slab->s_mem + (cache->num - 1) * cache->size; if (unlikely(object > last_object)) return last_object; @@ -106,16 +106,16 @@ static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, * reciprocal_divide(offset, cache->reciprocal_buffer_size) */ static inline unsigned int obj_to_index(const struct kmem_cache *cache, - const struct page *page, void *obj) + const struct slab *slab, void *obj) { - u32 offset = (obj - page->s_mem); + u32 offset = (obj - slab->s_mem); return reciprocal_divide(offset, cache->reciprocal_buffer_size); } -static inline int objs_per_slab_page(const struct kmem_cache *cache, - const struct page *page) +static inline int objs_per_slab(const struct kmem_cache *cache, + const struct slab *slab) { - if (is_kfence_address(page_address(page))) + if (is_kfence_address(slab_address(slab))) return 1; return cache->num; } diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index 8a9c2876ca89..33c5c0e3bd8d 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -158,11 +158,11 @@ static inline void sysfs_slab_release(struct kmem_cache *s) void *fixup_red_left(struct kmem_cache *s, void *p); -static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, +static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab, void *x) { - void *object = x - (x - page_address(page)) % cache->size; - void *last_object = page_address(page) + - (page->objects - 1) * cache->size; + void *object = x - (x - slab_address(slab)) % cache->size; + void *last_object = slab_address(slab) + + (slab->objects - 1) * cache->size; void *result = (unlikely(object > last_object)) 
? last_object : object; result = fixup_red_left(cache, result); @@ -178,16 +178,16 @@ static inline unsigned int __obj_to_index(const struct kmem_cache *cache, } static inline unsigned int obj_to_index(const struct kmem_cache *cache, - const struct page *page, void *obj) + const struct slab *slab, void *obj) { if (is_kfence_address(obj)) return 0; - return __obj_to_index(cache, page_address(page), obj); + return __obj_to_index(cache, slab_address(slab), obj); } -static inline int objs_per_slab_page(const struct kmem_cache *cache, - const struct page *page) +static inline int objs_per_slab(const struct kmem_cache *cache, + const struct slab *slab) { - return page->objects; + return slab->objects; } #endif /* _LINUX_SLUB_DEF_H */ diff --git a/mm/kasan/common.c b/mm/kasan/common.c index 8428da2aaf17..6a1cd2d38bff 100644 --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_cache *cache, /* For caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */ #ifdef CONFIG_SLAB /* For SLAB assign tags based on the object index in the freelist. 
*/ - return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object); + return (u8)obj_to_index(cache, virt_to_slab(object), (void *)object); #else /* * For SLUB assign a random tag during slab creation, otherwise reuse @@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object, if (is_kfence_address(object)) return false; - if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) != + if (unlikely(nearest_obj(cache, virt_to_slab(object), object) != object)) { kasan_report_invalid_free(tagged_object, ip); return true; diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c index 84a038b07c6f..5d0b79416c4e 100644 --- a/mm/kasan/generic.c +++ b/mm/kasan/generic.c @@ -339,7 +339,7 @@ static void __kasan_record_aux_stack(void *addr, bool can_alloc) return; cache = page->slab_cache; - object = nearest_obj(cache, page, addr); + object = nearest_obj(cache, page_slab(page), addr); alloc_meta = kasan_get_alloc_meta(cache, object); if (!alloc_meta) return; diff --git a/mm/kasan/report.c b/mm/kasan/report.c index 0bc10f452f7e..e00999dc6499 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag) if (page && PageSlab(page)) { struct kmem_cache *cache = page->slab_cache; - void *object = nearest_obj(cache, page, addr); + void *object = nearest_obj(cache, page_slab(page), addr); describe_object(cache, object, addr, tag); } diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c index 8a319fc16dab..06c21dd77493 100644 --- a/mm/kasan/report_tags.c +++ b/mm/kasan/report_tags.c @@ -23,7 +23,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info) page = kasan_addr_to_page(addr); if (page && PageSlab(page)) { cache = page->slab_cache; - object = nearest_obj(cache, page, (void *)addr); + object = nearest_obj(cache, page_slab(page), (void *)addr); alloc_meta = kasan_get_alloc_meta(cache, object); if (alloc_meta) { diff --git 
a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c index 695030c1fff8..f7276711d7b9 100644 --- a/mm/kfence/kfence_test.c +++ b/mm/kfence/kfence_test.c @@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat * even for KFENCE objects; these are required so that * memcg accounting works correctly. */ - KUNIT_EXPECT_EQ(test, obj_to_index(s, page, alloc), 0U); - KUNIT_EXPECT_EQ(test, objs_per_slab_page(s, page), 1); + KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U); + KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1); if (policy == ALLOCATE_ANY) return alloc; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 2ed5f2a0879d..f7b789e692a0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2819,7 +2819,7 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, gfp_t gfp, bool new_page) { - unsigned int objects = objs_per_slab_page(s, page); + unsigned int objects = objs_per_slab(s, page_slab(page)); unsigned long memcg_data; void *vec; @@ -2881,7 +2881,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p) struct obj_cgroup *objcg; unsigned int off; - off = obj_to_index(page->slab_cache, page, p); + off = obj_to_index(page->slab_cache, page_slab(page), p); objcg = page_objcgs(page)[off]; if (objcg) return obj_cgroup_memcg(objcg); diff --git a/mm/slab.c b/mm/slab.c index 547ed068a569..c13258116791 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1559,7 +1559,7 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp) struct slab *slab = virt_to_slab(objp); unsigned int objnr; - objnr = obj_to_index(cachep, slab_page(slab), objp); + objnr = obj_to_index(cachep, slab, objp); if (objnr) { objp = index_to_obj(cachep, slab, objnr - 1); realobj = (char *)objp + obj_offset(cachep); @@ -2529,7 +2529,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slab) static void slab_put_obj(struct 
kmem_cache *cachep, struct slab *slab, void *objp) { - unsigned int objnr = obj_to_index(cachep, slab_page(slab), objp); + unsigned int objnr = obj_to_index(cachep, slab, objp); #if DEBUG unsigned int i; @@ -2716,7 +2716,7 @@ static void *cache_free_debugcheck(struct kmem_cache *cachep, void *objp, if (cachep->flags & SLAB_STORE_USER) *dbg_userword(cachep, objp) = (void *)caller; - objnr = obj_to_index(cachep, slab_page(slab), objp); + objnr = obj_to_index(cachep, slab, objp); BUG_ON(objnr >= cachep->num); BUG_ON(objp != index_to_obj(cachep, slab, objnr)); @@ -3662,7 +3662,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) objp = object - obj_offset(cachep); kpp->kp_data_offset = obj_offset(cachep); slab = virt_to_slab(objp); - objnr = obj_to_index(cachep, slab_page(slab), objp); + objnr = obj_to_index(cachep, slab, objp); objp = index_to_obj(cachep, slab, objnr); kpp->kp_objp = objp; if (DEBUG && cachep->flags & SLAB_STORE_USER) @@ -4180,7 +4180,7 @@ void __check_heap_object(const void *ptr, unsigned long n, /* Find and validate object. */ cachep = slab->slab_cache; - objnr = obj_to_index(cachep, slab_page(slab), (void *)ptr); + objnr = obj_to_index(cachep, slab, (void *)ptr); BUG_ON(objnr >= cachep->num); /* Find offset within object. 
*/ diff --git a/mm/slab.h b/mm/slab.h index 039babfde2fe..bca9181e96d7 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -483,7 +483,7 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, continue; } - off = obj_to_index(s, page, p[i]); + off = obj_to_index(s, page_slab(page), p[i]); obj_cgroup_get(objcg); page_objcgs(page)[off] = objcg; mod_objcg_state(objcg, page_pgdat(page), @@ -522,7 +522,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig, else s = s_orig; - off = obj_to_index(s, page, p[i]); + off = obj_to_index(s, page_slab(page), p[i]); objcg = objcgs[off]; if (!objcg) continue; diff --git a/mm/slub.c b/mm/slub.c index cc64ba9d9963..ddf21c7a381a 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4342,7 +4342,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) #else objp = objp0; #endif - objnr = obj_to_index(s, slab_page(slab), objp); + objnr = obj_to_index(s, slab, objp); kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp); objp = base + s->size * objnr; kpp->kp_objp = objp; -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
[parent not found: <20220104001046.12263-23-vbabka-AlSwsSmVLrQ@public.gmane.org>]
* Re: [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems [not found] ` <20220104001046.12263-23-vbabka-AlSwsSmVLrQ@public.gmane.org> @ 2022-01-05 2:12 ` Roman Gushchin [not found] ` <YdT+qU4xgQeZc/jP-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org> 0 siblings, 1 reply; 9+ messages in thread From: Roman Gushchin @ 2022-01-05 2:12 UTC (permalink / raw) To: Vlastimil Babka Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasan-dev-/JYPxA39Uh5TLH3MbocFFw, cgroups-u79uwXL29TY76Z2rM5mHXA On Tue, Jan 04, 2022 at 01:10:36AM +0100, Vlastimil Babka wrote: > KASAN, KFENCE and memcg interact with SLAB or SLUB internals through > functions nearest_obj(), obj_to_index() and objs_per_slab() that use > struct page as parameter. This patch converts it to struct slab > including all callers, through a coccinelle semantic patch. > > // Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c > // Note: needs coccinelle 1.1.1 to avoid breaking whitespace > > @@ > @@ > > -objs_per_slab_page( > +objs_per_slab( > ... > ) > { ... } > > @@ > @@ > > -objs_per_slab_page( > +objs_per_slab( > ... > ) > > @@ > identifier fn =~ "obj_to_index|objs_per_slab"; > @@ > > fn(..., > - const struct page *page > + const struct slab *slab > ,...) > { > <... > ( > - page_address(page) > + slab_address(slab) > | > - page > + slab > ) > ...> > } > > @@ > identifier fn =~ "nearest_obj"; > @@ > > fn(..., > - struct page *page > + const struct slab *slab > ,...) > { > <... 
> ( > - page_address(page) > + slab_address(slab) > | > - page > + slab > ) > ...> > } > > @@ > identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab"; > expression E; > @@ > > fn(..., > ( > - slab_page(E) > + E > | > - virt_to_page(E) > + virt_to_slab(E) > | > - virt_to_head_page(E) > + virt_to_slab(E) > | > - page > + page_slab(page) > ) > ,...) > > Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> > Reviewed-by: Andrey Konovalov <andreyknvl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > Cc: Julia Lawall <julia.lawall-MZpvjPyXg2s@public.gmane.org> > Cc: Luis Chamberlain <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Andrey Ryabinin <ryabinin.a.a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Cc: Alexander Potapenko <glider-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Cc: Andrey Konovalov <andreyknvl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Cc: Dmitry Vyukov <dvyukov-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Cc: Marco Elver <elver-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Cc: <kasan-dev-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> > Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> > --- > include/linux/slab_def.h | 16 ++++++++-------- > include/linux/slub_def.h | 18 +++++++++--------- > mm/kasan/common.c | 4 ++-- > mm/kasan/generic.c | 2 +- > mm/kasan/report.c | 2 +- > mm/kasan/report_tags.c | 2 +- > mm/kfence/kfence_test.c | 4 ++-- > mm/memcontrol.c | 4 ++-- > mm/slab.c | 10 +++++----- > mm/slab.h | 4 ++-- > mm/slub.c | 2 +- > 11 files changed, 34 insertions(+), 34 deletions(-) > > diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h > index 3aa5e1e73ab6..e24c9aff6fed 100644 > --- a/include/linux/slab_def.h > +++ 
b/include/linux/slab_def.h > @@ -87,11 +87,11 @@ struct kmem_cache { > struct kmem_cache_node *node[MAX_NUMNODES]; > }; > > -static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, > +static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab, > void *x) > { > - void *object = x - (x - page->s_mem) % cache->size; > - void *last_object = page->s_mem + (cache->num - 1) * cache->size; > + void *object = x - (x - slab->s_mem) % cache->size; > + void *last_object = slab->s_mem + (cache->num - 1) * cache->size; > > if (unlikely(object > last_object)) > return last_object; > @@ -106,16 +106,16 @@ static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, > * reciprocal_divide(offset, cache->reciprocal_buffer_size) > */ > static inline unsigned int obj_to_index(const struct kmem_cache *cache, > - const struct page *page, void *obj) > + const struct slab *slab, void *obj) > { > - u32 offset = (obj - page->s_mem); > + u32 offset = (obj - slab->s_mem); > return reciprocal_divide(offset, cache->reciprocal_buffer_size); > } > > -static inline int objs_per_slab_page(const struct kmem_cache *cache, > - const struct page *page) > +static inline int objs_per_slab(const struct kmem_cache *cache, > + const struct slab *slab) Nice! It looks indeed better. 
> { > - if (is_kfence_address(page_address(page))) > + if (is_kfence_address(slab_address(slab))) > return 1; > return cache->num; > } > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h > index 8a9c2876ca89..33c5c0e3bd8d 100644 > --- a/include/linux/slub_def.h > +++ b/include/linux/slub_def.h > @@ -158,11 +158,11 @@ static inline void sysfs_slab_release(struct kmem_cache *s) > > void *fixup_red_left(struct kmem_cache *s, void *p); > > -static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, > +static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab, > void *x) { > - void *object = x - (x - page_address(page)) % cache->size; > - void *last_object = page_address(page) + > - (page->objects - 1) * cache->size; > + void *object = x - (x - slab_address(slab)) % cache->size; > + void *last_object = slab_address(slab) + > + (slab->objects - 1) * cache->size; > void *result = (unlikely(object > last_object)) ? last_object : object; > > result = fixup_red_left(cache, result); > @@ -178,16 +178,16 @@ static inline unsigned int __obj_to_index(const struct kmem_cache *cache, > } > > static inline unsigned int obj_to_index(const struct kmem_cache *cache, > - const struct page *page, void *obj) > + const struct slab *slab, void *obj) > { > if (is_kfence_address(obj)) > return 0; > - return __obj_to_index(cache, page_address(page), obj); > + return __obj_to_index(cache, slab_address(slab), obj); > } > > -static inline int objs_per_slab_page(const struct kmem_cache *cache, > - const struct page *page) > +static inline int objs_per_slab(const struct kmem_cache *cache, > + const struct slab *slab) > { > - return page->objects; > + return slab->objects; > } > #endif /* _LINUX_SLUB_DEF_H */ > diff --git a/mm/kasan/common.c b/mm/kasan/common.c > index 8428da2aaf17..6a1cd2d38bff 100644 > --- a/mm/kasan/common.c > +++ b/mm/kasan/common.c > @@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_cache *cache, > /* For 
caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */ > #ifdef CONFIG_SLAB > /* For SLAB assign tags based on the object index in the freelist. */ > - return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object); > + return (u8)obj_to_index(cache, virt_to_slab(object), (void *)object); > #else > /* > * For SLUB assign a random tag during slab creation, otherwise reuse > @@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object, > if (is_kfence_address(object)) > return false; > > - if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) != > + if (unlikely(nearest_obj(cache, virt_to_slab(object), object) != > object)) { > kasan_report_invalid_free(tagged_object, ip); > return true; > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c > index 84a038b07c6f..5d0b79416c4e 100644 > --- a/mm/kasan/generic.c > +++ b/mm/kasan/generic.c > @@ -339,7 +339,7 @@ static void __kasan_record_aux_stack(void *addr, bool can_alloc) > return; > > cache = page->slab_cache; > - object = nearest_obj(cache, page, addr); > + object = nearest_obj(cache, page_slab(page), addr); > alloc_meta = kasan_get_alloc_meta(cache, object); > if (!alloc_meta) > return; > diff --git a/mm/kasan/report.c b/mm/kasan/report.c > index 0bc10f452f7e..e00999dc6499 100644 > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag) > > if (page && PageSlab(page)) { > struct kmem_cache *cache = page->slab_cache; > - void *object = nearest_obj(cache, page, addr); > + void *object = nearest_obj(cache, page_slab(page), addr); s/tab/space > > describe_object(cache, object, addr, tag); > } > diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c > index 8a319fc16dab..06c21dd77493 100644 > --- a/mm/kasan/report_tags.c > +++ b/mm/kasan/report_tags.c > @@ -23,7 +23,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info) > page = kasan_addr_to_page(addr); > 
if (page && PageSlab(page)) { > cache = page->slab_cache; > - object = nearest_obj(cache, page, (void *)addr); > + object = nearest_obj(cache, page_slab(page), (void *)addr); > alloc_meta = kasan_get_alloc_meta(cache, object); > > if (alloc_meta) { > diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c > index 695030c1fff8..f7276711d7b9 100644 > --- a/mm/kfence/kfence_test.c > +++ b/mm/kfence/kfence_test.c > @@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat > * even for KFENCE objects; these are required so that > * memcg accounting works correctly. > */ > - KUNIT_EXPECT_EQ(test, obj_to_index(s, page, alloc), 0U); > - KUNIT_EXPECT_EQ(test, objs_per_slab_page(s, page), 1); > + KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U); > + KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1); > > if (policy == ALLOCATE_ANY) > return alloc; > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 2ed5f2a0879d..f7b789e692a0 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2819,7 +2819,7 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, > int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, > gfp_t gfp, bool new_page) > { > - unsigned int objects = objs_per_slab_page(s, page); > + unsigned int objects = objs_per_slab(s, page_slab(page)); > unsigned long memcg_data; > void *vec; > > @@ -2881,7 +2881,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p) > struct obj_cgroup *objcg; > unsigned int off; > > - off = obj_to_index(page->slab_cache, page, p); > + off = obj_to_index(page->slab_cache, page_slab(page), p); > objcg = page_objcgs(page)[off]; > if (objcg) > return obj_cgroup_memcg(objcg); > diff --git a/mm/slab.c b/mm/slab.c > index 547ed068a569..c13258116791 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -1559,7 +1559,7 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp) > struct slab *slab = virt_to_slab(objp); > unsigned 
int objnr; > > - objnr = obj_to_index(cachep, slab_page(slab), objp); > + objnr = obj_to_index(cachep, slab, objp); > if (objnr) { > objp = index_to_obj(cachep, slab, objnr - 1); > realobj = (char *)objp + obj_offset(cachep); > @@ -2529,7 +2529,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slab) > static void slab_put_obj(struct kmem_cache *cachep, > struct slab *slab, void *objp) > { > - unsigned int objnr = obj_to_index(cachep, slab_page(slab), objp); > + unsigned int objnr = obj_to_index(cachep, slab, objp); > #if DEBUG > unsigned int i; > > @@ -2716,7 +2716,7 @@ static void *cache_free_debugcheck(struct kmem_cache *cachep, void *objp, > if (cachep->flags & SLAB_STORE_USER) > *dbg_userword(cachep, objp) = (void *)caller; > > - objnr = obj_to_index(cachep, slab_page(slab), objp); > + objnr = obj_to_index(cachep, slab, objp); > > BUG_ON(objnr >= cachep->num); > BUG_ON(objp != index_to_obj(cachep, slab, objnr)); > @@ -3662,7 +3662,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) > objp = object - obj_offset(cachep); > kpp->kp_data_offset = obj_offset(cachep); > slab = virt_to_slab(objp); > - objnr = obj_to_index(cachep, slab_page(slab), objp); > + objnr = obj_to_index(cachep, slab, objp); > objp = index_to_obj(cachep, slab, objnr); > kpp->kp_objp = objp; > if (DEBUG && cachep->flags & SLAB_STORE_USER) > @@ -4180,7 +4180,7 @@ void __check_heap_object(const void *ptr, unsigned long n, > > /* Find and validate object. */ > cachep = slab->slab_cache; > - objnr = obj_to_index(cachep, slab_page(slab), (void *)ptr); > + objnr = obj_to_index(cachep, slab, (void *)ptr); > BUG_ON(objnr >= cachep->num); > > /* Find offset within object. 
*/ > diff --git a/mm/slab.h b/mm/slab.h > index 039babfde2fe..bca9181e96d7 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -483,7 +483,7 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, > continue; > } > > - off = obj_to_index(s, page, p[i]); > + off = obj_to_index(s, page_slab(page), p[i]); > obj_cgroup_get(objcg); > page_objcgs(page)[off] = objcg; > mod_objcg_state(objcg, page_pgdat(page), > @@ -522,7 +522,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig, > else > s = s_orig; > > - off = obj_to_index(s, page, p[i]); > + off = obj_to_index(s, page_slab(page), p[i]); > objcg = objcgs[off]; > if (!objcg) > continue; > diff --git a/mm/slub.c b/mm/slub.c > index cc64ba9d9963..ddf21c7a381a 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -4342,7 +4342,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) > #else > objp = objp0; > #endif > - objnr = obj_to_index(s, slab_page(slab), objp); > + objnr = obj_to_index(s, slab, objp); > kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp); > objp = base + s->size * objnr; > kpp->kp_objp = objp; > -- > 2.34.1 > Reviewed-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org> Thanks! ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems [not found] ` <YdT+qU4xgQeZc/jP-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org> @ 2022-01-05 16:39 ` Vlastimil Babka 0 siblings, 0 replies; 9+ messages in thread From: Vlastimil Babka @ 2022-01-05 16:39 UTC (permalink / raw) To: Roman Gushchin Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasan-dev-/JYPxA39Uh5TLH3MbocFFw, cgroups-u79uwXL29TY76Z2rM5mHXA On 1/5/22 03:12, Roman Gushchin wrote: > On Tue, Jan 04, 2022 at 01:10:36AM +0100, Vlastimil Babka wrote: >> --- a/mm/kasan/report.c >> +++ b/mm/kasan/report.c >> @@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag) >> >> if (page && PageSlab(page)) { >> struct kmem_cache *cache = page->slab_cache; >> - void *object = nearest_obj(cache, page, addr); >> + void *object = nearest_obj(cache, page_slab(page), addr); > s/tab/space Yeah it was pointed out earlier that the tab was already there but only this change made it stand out. Fixing that up here would go against the automated spatch conversion, so it's done in later manual patch that also touches this line. >> 2.34.1 >> > > Reviewed-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org> > > Thanks! Thanks! ^ permalink raw reply [flat|nested] 9+ messages in thread
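The KASAN hunk discussed above passes `page_slab(page)` into `nearest_obj()`, which maps an arbitrary address inside a slab back to the start of the object containing it. A minimal sketch of that index arithmetic follows; all names here are illustrative stand-ins, not the kernel's implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a slab: base address, fixed object size, object count. */
struct toy_slab {
	uintptr_t base;
	size_t obj_size;
	unsigned int objects;
};

/*
 * Round an address inside the slab down to the start of its object,
 * mirroring what nearest_obj() does when KASAN builds a report. The
 * address is never dereferenced, only used for arithmetic.
 */
static uintptr_t toy_nearest_obj(const struct toy_slab *s, uintptr_t addr)
{
	size_t off = addr - s->base;
	unsigned int idx = off / s->obj_size;

	if (idx >= s->objects)		/* clamp to the last object */
		idx = s->objects - 1;
	return s->base + (uintptr_t)idx * s->obj_size;
}
```

The struct-slab conversion changes only the type of the second argument to `nearest_obj()`; the offset-division itself is unchanged.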
* [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab [not found] ` <20220104001046.12263-1-vbabka-AlSwsSmVLrQ@public.gmane.org> 2022-01-04 0:10 ` [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems Vlastimil Babka @ 2022-01-04 0:10 ` Vlastimil Babka [not found] ` <20220104001046.12263-24-vbabka-AlSwsSmVLrQ@public.gmane.org> 1 sibling, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2022-01-04 0:10 UTC (permalink / raw) To: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Vlastimil Babka, Michal Hocko, Vladimir Davydov, cgroups-u79uwXL29TY76Z2rM5mHXA page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages so convert all the related infrastructure to struct slab. Also use struct folio instead of struct page when resolving object pointers. This is not just mechanistic changing of types and names. Now in mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret the folio as a real slab instead of a large kmalloc, instead of relying on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check(). Similarly in memcg_slab_free_hook() where we can encounter kmalloc_large() pages (here the folio slab flag check is implied by virt_to_slab()). As a result, page_objcgs_check() can be dropped instead of converted. To avoid include cycles, move the inline definition of slab_objcgs() from memcontrol.h to mm/slab.h. 
Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> --- include/linux/memcontrol.h | 48 ------------------------ mm/memcontrol.c | 47 ++++++++++++----------- mm/slab.h | 76 ++++++++++++++++++++++++++------------ 3 files changed, 79 insertions(+), 92 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 0c5c403f4be6..e34112f6a369 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -536,45 +536,6 @@ static inline bool folio_memcg_kmem(struct folio *folio) return folio->memcg_data & MEMCG_DATA_KMEM; } -/* - * page_objcgs - get the object cgroups vector associated with a page - * @page: a pointer to the page struct - * - * Returns a pointer to the object cgroups vector associated with the page, - * or NULL. This function assumes that the page is known to have an - * associated object cgroups vector. It's not safe to call this function - * against pages, which might have an associated memory cgroup: e.g. - * kernel stack pages. - */ -static inline struct obj_cgroup **page_objcgs(struct page *page) -{ - unsigned long memcg_data = READ_ONCE(page->memcg_data); - - VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), page); - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); - - return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); -} - -/* - * page_objcgs_check - get the object cgroups vector associated with a page - * @page: a pointer to the page struct - * - * Returns a pointer to the object cgroups vector associated with the page, - * or NULL. This function is safe to use if the page can be directly associated - * with a memory cgroup. 
- */ -static inline struct obj_cgroup **page_objcgs_check(struct page *page) -{ - unsigned long memcg_data = READ_ONCE(page->memcg_data); - - if (!memcg_data || !(memcg_data & MEMCG_DATA_OBJCGS)) - return NULL; - - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); - - return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); -} #else static inline bool folio_memcg_kmem(struct folio *folio) @@ -582,15 +543,6 @@ static inline bool folio_memcg_kmem(struct folio *folio) return false; } -static inline struct obj_cgroup **page_objcgs(struct page *page) -{ - return NULL; -} - -static inline struct obj_cgroup **page_objcgs_check(struct page *page) -{ - return NULL; -} #endif static inline bool PageMemcgKmem(struct page *page) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f7b789e692a0..f4fdd5675991 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2816,31 +2816,31 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, rcu_read_unlock(); } -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, - gfp_t gfp, bool new_page) +int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, + gfp_t gfp, bool new_slab) { - unsigned int objects = objs_per_slab(s, page_slab(page)); + unsigned int objects = objs_per_slab(s, slab); unsigned long memcg_data; void *vec; gfp &= ~OBJCGS_CLEAR_MASK; vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp, - page_to_nid(page)); + slab_nid(slab)); if (!vec) return -ENOMEM; memcg_data = (unsigned long) vec | MEMCG_DATA_OBJCGS; - if (new_page) { + if (new_slab) { /* - * If the slab page is brand new and nobody can yet access - * it's memcg_data, no synchronization is required and - * memcg_data can be simply assigned. + * If the slab is brand new and nobody can yet access its + * memcg_data, no synchronization is required and memcg_data can + * be simply assigned. 
*/ - page->memcg_data = memcg_data; - } else if (cmpxchg(&page->memcg_data, 0, memcg_data)) { + slab->memcg_data = memcg_data; + } else if (cmpxchg(&slab->memcg_data, 0, memcg_data)) { /* - * If the slab page is already in use, somebody can allocate - * and assign obj_cgroups in parallel. In this case the existing + * If the slab is already in use, somebody can allocate and + * assign obj_cgroups in parallel. In this case the existing * objcg vector should be reused. */ kfree(vec); @@ -2865,26 +2865,31 @@ int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, */ struct mem_cgroup *mem_cgroup_from_obj(void *p) { - struct page *page; + struct folio *folio; if (mem_cgroup_disabled()) return NULL; - page = virt_to_head_page(p); + folio = virt_to_folio(p); /* * Slab objects are accounted individually, not per-page. * Memcg membership data for each individual object is saved in * the page->obj_cgroups. */ - if (page_objcgs_check(page)) { - struct obj_cgroup *objcg; + if (folio_test_slab(folio)) { + struct obj_cgroup **objcgs; + struct slab *slab; unsigned int off; - off = obj_to_index(page->slab_cache, page_slab(page), p); - objcg = page_objcgs(page)[off]; - if (objcg) - return obj_cgroup_memcg(objcg); + slab = folio_slab(folio); + objcgs = slab_objcgs(slab); + if (!objcgs) + return NULL; + + off = obj_to_index(slab->slab_cache, slab, p); + if (objcgs[off]) + return obj_cgroup_memcg(objcgs[off]); return NULL; } @@ -2896,7 +2901,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p) * page_memcg_check(page) will guarantee that a proper memory * cgroup pointer or NULL will be returned. 
*/ - return page_memcg_check(page); + return page_memcg_check(folio_page(folio, 0)); } __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) diff --git a/mm/slab.h b/mm/slab.h index bca9181e96d7..36e0022d8267 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -412,15 +412,36 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla } #ifdef CONFIG_MEMCG_KMEM -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, - gfp_t gfp, bool new_page); +/* + * slab_objcgs - get the object cgroups vector associated with a slab + * @slab: a pointer to the slab struct + * + * Returns a pointer to the object cgroups vector associated with the slab, + * or NULL. This function assumes that the slab is known to have an + * associated object cgroups vector. It's not safe to call this function + * against slabs with underlying pages, which might have an associated memory + * cgroup: e.g. kernel stack pages. + */ +static inline struct obj_cgroup **slab_objcgs(struct slab *slab) +{ + unsigned long memcg_data = READ_ONCE(slab->memcg_data); + + VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), + slab_page(slab)); + VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab)); + + return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); +} + +int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, + gfp_t gfp, bool new_slab); void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, enum node_stat_item idx, int nr); -static inline void memcg_free_page_obj_cgroups(struct page *page) +static inline void memcg_free_slab_cgroups(struct slab *slab) { - kfree(page_objcgs(page)); - page->memcg_data = 0; + kfree(slab_objcgs(slab)); + slab->memcg_data = 0; } static inline size_t obj_full_size(struct kmem_cache *s) @@ -465,7 +486,7 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags, size_t size, void **p) { - struct page *page; + struct slab *slab; unsigned long 
off; size_t i; @@ -474,19 +495,19 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, for (i = 0; i < size; i++) { if (likely(p[i])) { - page = virt_to_head_page(p[i]); + slab = virt_to_slab(p[i]); - if (!page_objcgs(page) && - memcg_alloc_page_obj_cgroups(page, s, flags, + if (!slab_objcgs(slab) && + memcg_alloc_slab_cgroups(slab, s, flags, false)) { obj_cgroup_uncharge(objcg, obj_full_size(s)); continue; } - off = obj_to_index(s, page_slab(page), p[i]); + off = obj_to_index(s, slab, p[i]); obj_cgroup_get(objcg); - page_objcgs(page)[off] = objcg; - mod_objcg_state(objcg, page_pgdat(page), + slab_objcgs(slab)[off] = objcg; + mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), obj_full_size(s)); } else { obj_cgroup_uncharge(objcg, obj_full_size(s)); @@ -501,7 +522,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig, struct kmem_cache *s; struct obj_cgroup **objcgs; struct obj_cgroup *objcg; - struct page *page; + struct slab *slab; unsigned int off; int i; @@ -512,43 +533,52 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig, if (unlikely(!p[i])) continue; - page = virt_to_head_page(p[i]); - objcgs = page_objcgs_check(page); + slab = virt_to_slab(p[i]); + /* we could be given a kmalloc_large() object, skip those */ + if (!slab) + continue; + + objcgs = slab_objcgs(slab); if (!objcgs) continue; if (!s_orig) - s = page->slab_cache; + s = slab->slab_cache; else s = s_orig; - off = obj_to_index(s, page_slab(page), p[i]); + off = obj_to_index(s, slab, p[i]); objcg = objcgs[off]; if (!objcg) continue; objcgs[off] = NULL; obj_cgroup_uncharge(objcg, obj_full_size(s)); - mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s), + mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), -obj_full_size(s)); obj_cgroup_put(objcg); } } #else /* CONFIG_MEMCG_KMEM */ +static inline struct obj_cgroup **slab_objcgs(struct slab *slab) +{ + return NULL; +} + static inline struct mem_cgroup 
*memcg_from_slab_obj(void *ptr) { return NULL; } -static inline int memcg_alloc_page_obj_cgroups(struct page *page, +static inline int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, gfp_t gfp, - bool new_page) + bool new_slab) { return 0; } -static inline void memcg_free_page_obj_cgroups(struct page *page) +static inline void memcg_free_slab_cgroups(struct slab *slab) { } @@ -587,7 +617,7 @@ static __always_inline void account_slab(struct slab *slab, int order, struct kmem_cache *s, gfp_t gfp) { if (memcg_kmem_enabled() && (s->flags & SLAB_ACCOUNT)) - memcg_alloc_page_obj_cgroups(slab_page(slab), s, gfp, true); + memcg_alloc_slab_cgroups(slab, s, gfp, true); mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s), PAGE_SIZE << order); @@ -597,7 +627,7 @@ static __always_inline void unaccount_slab(struct slab *slab, int order, struct kmem_cache *s) { if (memcg_kmem_enabled()) - memcg_free_page_obj_cgroups(slab_page(slab)); + memcg_free_slab_cgroups(slab); mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s), -(PAGE_SIZE << order)); -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
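The `new_slab` branch in `memcg_alloc_slab_cgroups()` above relies on a classic install-or-reuse pattern: allocate a vector, try to publish it with a compare-and-swap, and free it if another CPU won the race. A minimal userspace sketch of that pattern using C11 atomics (toy names, not the kernel API):

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy slab carrying only the memcg_data word from the real struct slab. */
struct toy_slab {
	_Atomic uintptr_t memcg_data;
};

/*
 * Install an objcg-style vector, mimicking memcg_alloc_slab_cgroups():
 * a brand-new slab can be assigned directly, otherwise the first caller
 * to cmpxchg a non-zero value wins and losers free their copy and reuse
 * the winner's. Returns 0 on success, -1 on allocation failure.
 */
static int toy_install_objcgs(struct toy_slab *slab, unsigned int objects,
			      int new_slab)
{
	void *vec = calloc(objects, sizeof(void *));
	uintptr_t expected = 0;

	if (!vec)
		return -1;

	if (new_slab) {
		/* Nobody else can see memcg_data yet: plain assignment. */
		atomic_store_explicit(&slab->memcg_data, (uintptr_t)vec,
				      memory_order_relaxed);
	} else if (!atomic_compare_exchange_strong(&slab->memcg_data,
						   &expected,
						   (uintptr_t)vec)) {
		/* Lost the race: another thread installed a vector first. */
		free(vec);
	}
	return 0;
}
```

The patch changes only which structure carries `memcg_data` (`struct slab` instead of `struct page`); the synchronization scheme is untouched.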
* Re: [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab [not found] ` <20220104001046.12263-24-vbabka-AlSwsSmVLrQ@public.gmane.org> @ 2022-01-05 2:41 ` Roman Gushchin [not found] ` <YdUFXUbeYGdFbVbq-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org> 2022-01-05 2:55 ` Roman Gushchin 1 sibling, 1 reply; 9+ messages in thread From: Roman Gushchin @ 2022-01-05 2:41 UTC (permalink / raw) To: Vlastimil Babka Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Michal Hocko, Vladimir Davydov, cgroups-u79uwXL29TY76Z2rM5mHXA On Tue, Jan 04, 2022 at 01:10:37AM +0100, Vlastimil Babka wrote: > page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages > so convert all the related infrastructure to struct slab. Also use > struct folio instead of struct page when resolving object pointers. > > This is not just mechanistic changing of types and names. Now in > mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret > the folio as a real slab instead of a large kmalloc, instead of relying > on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check(). > Similarly in memcg_slab_free_hook() where we can encounter > kmalloc_large() pages (here the folio slab flag check is implied by > virt_to_slab()). As a result, page_objcgs_check() can be dropped instead > of converted. > > To avoid include cycles, move the inline definition of slab_objcgs() > from memcontrol.h to mm/slab.h. 
> > Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> > Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> > --- > include/linux/memcontrol.h | 48 ------------------------ > mm/memcontrol.c | 47 ++++++++++++----------- > mm/slab.h | 76 ++++++++++++++++++++++++++------------ > 3 files changed, 79 insertions(+), 92 deletions(-) > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > index 0c5c403f4be6..e34112f6a369 100644 > --- a/include/linux/memcontrol.h > +++ b/include/linux/memcontrol.h > @@ -536,45 +536,6 @@ static inline bool folio_memcg_kmem(struct folio *folio) > return folio->memcg_data & MEMCG_DATA_KMEM; > } > > -/* > - * page_objcgs - get the object cgroups vector associated with a page > - * @page: a pointer to the page struct > - * > - * Returns a pointer to the object cgroups vector associated with the page, > - * or NULL. This function assumes that the page is known to have an > - * associated object cgroups vector. It's not safe to call this function > - * against pages, which might have an associated memory cgroup: e.g. > - * kernel stack pages. > - */ > -static inline struct obj_cgroup **page_objcgs(struct page *page) > -{ > - unsigned long memcg_data = READ_ONCE(page->memcg_data); > - > - VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), page); > - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); > - > - return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); > -} > - > -/* > - * page_objcgs_check - get the object cgroups vector associated with a page > - * @page: a pointer to the page struct > - * > - * Returns a pointer to the object cgroups vector associated with the page, > - * or NULL. 
This function is safe to use if the page can be directly associated > - * with a memory cgroup. > - */ > -static inline struct obj_cgroup **page_objcgs_check(struct page *page) > -{ > - unsigned long memcg_data = READ_ONCE(page->memcg_data); > - > - if (!memcg_data || !(memcg_data & MEMCG_DATA_OBJCGS)) > - return NULL; > - > - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); > - > - return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); > -} > > #else > static inline bool folio_memcg_kmem(struct folio *folio) > @@ -582,15 +543,6 @@ static inline bool folio_memcg_kmem(struct folio *folio) > return false; > } > > -static inline struct obj_cgroup **page_objcgs(struct page *page) > -{ > - return NULL; > -} > - > -static inline struct obj_cgroup **page_objcgs_check(struct page *page) > -{ > - return NULL; > -} > #endif > > static inline bool PageMemcgKmem(struct page *page) > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index f7b789e692a0..f4fdd5675991 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2816,31 +2816,31 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, > rcu_read_unlock(); > } > > -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, > - gfp_t gfp, bool new_page) > +int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, > + gfp_t gfp, bool new_slab) > { > - unsigned int objects = objs_per_slab(s, page_slab(page)); > + unsigned int objects = objs_per_slab(s, slab); > unsigned long memcg_data; > void *vec; > > gfp &= ~OBJCGS_CLEAR_MASK; > vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp, > - page_to_nid(page)); > + slab_nid(slab)); > if (!vec) > return -ENOMEM; > > memcg_data = (unsigned long) vec | MEMCG_DATA_OBJCGS; > - if (new_page) { > + if (new_slab) { > /* > - * If the slab page is brand new and nobody can yet access > - * it's memcg_data, no synchronization is required and > - * memcg_data can be simply assigned. 
> + * If the slab is brand new and nobody can yet access its > + * memcg_data, no synchronization is required and memcg_data can > + * be simply assigned. > */ > - page->memcg_data = memcg_data; > - } else if (cmpxchg(&page->memcg_data, 0, memcg_data)) { > + slab->memcg_data = memcg_data; > + } else if (cmpxchg(&slab->memcg_data, 0, memcg_data)) { > /* > - * If the slab page is already in use, somebody can allocate > - * and assign obj_cgroups in parallel. In this case the existing > + * If the slab is already in use, somebody can allocate and > + * assign obj_cgroups in parallel. In this case the existing > * objcg vector should be reused. > */ > kfree(vec); > @@ -2865,26 +2865,31 @@ int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, > */ > struct mem_cgroup *mem_cgroup_from_obj(void *p) > { > - struct page *page; > + struct folio *folio; > > if (mem_cgroup_disabled()) > return NULL; > > - page = virt_to_head_page(p); > + folio = virt_to_folio(p); > > /* > * Slab objects are accounted individually, not per-page. > * Memcg membership data for each individual object is saved in > * the page->obj_cgroups. ^^^^^^^^^^^^^^^^^ slab->memcg_data > */ > - if (page_objcgs_check(page)) { > - struct obj_cgroup *objcg; > + if (folio_test_slab(folio)) { > + struct obj_cgroup **objcgs; > + struct slab *slab; > unsigned int off; > > - off = obj_to_index(page->slab_cache, page_slab(page), p); > - objcg = page_objcgs(page)[off]; > - if (objcg) > - return obj_cgroup_memcg(objcg); > + slab = folio_slab(folio); > + objcgs = slab_objcgs(slab); > + if (!objcgs) > + return NULL; > + > + off = obj_to_index(slab->slab_cache, slab, p); > + if (objcgs[off]) > + return obj_cgroup_memcg(objcgs[off]); > > return NULL; > } There is a comment below, which needs some changes: /* * page_memcg_check() is used here, because page_has_obj_cgroups() * check above could fail because the object cgroups vector wasn't set * at that moment, but it can be set concurrently. 
* page_memcg_check(page) will guarantee that a proper memory * cgroup pointer or NULL will be returned. */ In reality the folio's slab flag can be cleared before releasing the objcgs \ vector. It seems that there is no such possibility at setting the flag, it's always set before allocating and assigning the objcg vector. > @@ -2896,7 +2901,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p) > * page_memcg_check(page) will guarantee that a proper memory > * cgroup pointer or NULL will be returned. > */ > - return page_memcg_check(page); > + return page_memcg_check(folio_page(folio, 0)); > } > > __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) > diff --git a/mm/slab.h b/mm/slab.h > index bca9181e96d7..36e0022d8267 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -412,15 +412,36 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla > } > > #ifdef CONFIG_MEMCG_KMEM > -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, > - gfp_t gfp, bool new_page); > +/* > + * slab_objcgs - get the object cgroups vector associated with a slab > + * @slab: a pointer to the slab struct > + * > + * Returns a pointer to the object cgroups vector associated with the slab, > + * or NULL. This function assumes that the slab is known to have an > + * associated object cgroups vector. It's not safe to call this function > + * against slabs with underlying pages, which might have an associated memory > + * cgroup: e.g. kernel stack pages. Hm, is it still true? I don't think so. It must be safe to call it for any slab now. The rest looks good to me, please feel free to add Reviewed-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org> after fixing these comments. Thanks! ^ permalink raw reply [flat|nested] 9+ messages in thread
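The review above turns on how `memcg_data` packs a vector pointer together with low flag bits, and on what the now-dropped `page_objcgs_check()` tested versus what `slab_objcgs()` assumes. A toy sketch of that tagged-pointer encoding (the flag values here are made up; the real ones are `MEMCG_DATA_OBJCGS`/`MEMCG_DATA_KMEM` in memcontrol.h):

```c
#include <stdint.h>

#define TOY_DATA_OBJCGS	0x1UL
#define TOY_DATA_KMEM	0x2UL
#define TOY_FLAGS_MASK	0x3UL

/* As in slab_objcgs(): mask off the flag bits to recover the vector. */
static void **toy_slab_objcgs(uintptr_t memcg_data)
{
	return (void **)(memcg_data & ~TOY_FLAGS_MASK);
}

/*
 * The dropped page_objcgs_check() tested this bit explicitly; after the
 * conversion, folio_test_slab() (or a non-NULL virt_to_slab()) takes over
 * that role, since only slabs carry an objcg vector.
 */
static int toy_has_objcgs(uintptr_t memcg_data)
{
	return (memcg_data & TOY_DATA_OBJCGS) != 0;
}
```

This packing works because the vector comes from `kcalloc_node()` and is therefore aligned well beyond the two flag bits.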
* Re: [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab [not found] ` <YdUFXUbeYGdFbVbq-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org> @ 2022-01-05 17:08 ` Vlastimil Babka [not found] ` <56c7a92b-f830-93dd-3fd0-4b6f1eb723ef-AlSwsSmVLrQ@public.gmane.org> 0 siblings, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2022-01-05 17:08 UTC (permalink / raw) To: Roman Gushchin Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw, Michal Hocko, Vladimir Davydov, cgroups-u79uwXL29TY76Z2rM5mHXA On 1/5/22 03:41, Roman Gushchin wrote: > On Tue, Jan 04, 2022 at 01:10:37AM +0100, Vlastimil Babka wrote: >> page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages >> so convert all the related infrastructure to struct slab. Also use >> struct folio instead of struct page when resolving object pointers. >> >> This is not just mechanistic changing of types and names. Now in >> mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret >> the folio as a real slab instead of a large kmalloc, instead of relying >> on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check(). >> Similarly in memcg_slab_free_hook() where we can encounter >> kmalloc_large() pages (here the folio slab flag check is implied by >> virt_to_slab()). As a result, page_objcgs_check() can be dropped instead >> of converted. >> >> To avoid include cycles, move the inline definition of slab_objcgs() >> from memcontrol.h to mm/slab.h. 
>> >> Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> >> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> >> Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> >> Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >> Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> >> /* >> * Slab objects are accounted individually, not per-page. >> * Memcg membership data for each individual object is saved in >> * the page->obj_cgroups. > ^^^^^^^^^^^^^^^^^ > slab->memcg_data Good catch, fixed. >> */ >> - if (page_objcgs_check(page)) { >> - struct obj_cgroup *objcg; >> + if (folio_test_slab(folio)) { >> + struct obj_cgroup **objcgs; >> + struct slab *slab; >> unsigned int off; >> >> - off = obj_to_index(page->slab_cache, page_slab(page), p); >> - objcg = page_objcgs(page)[off]; >> - if (objcg) >> - return obj_cgroup_memcg(objcg); >> + slab = folio_slab(folio); >> + objcgs = slab_objcgs(slab); >> + if (!objcgs) >> + return NULL; >> + >> + off = obj_to_index(slab->slab_cache, slab, p); >> + if (objcgs[off]) >> + return obj_cgroup_memcg(objcgs[off]); >> >> return NULL; >> } > > There is a comment below, which needs some changes: > /* > * page_memcg_check() is used here, because page_has_obj_cgroups() > * check above could fail because the object cgroups vector wasn't set > * at that moment, but it can be set concurrently. > * page_memcg_check(page) will guarantee that a proper memory > * cgroup pointer or NULL will be returned. > */ > > In reality the folio's slab flag can be cleared before releasing the objcgs \ > vector. It seems that there is no such possibility at setting the flag, > it's always set before allocating and assigning the objcg vector. You're right. 
I'm changing it to: * page_memcg_check() is used here, because in theory we can encounter * a folio where the slab flag has been cleared already, but * slab->memcg_data has not been freed yet * page_memcg_check(page) will guarantee that a proper memory * cgroup pointer or NULL will be returned. I wrote "in theory" because AFAICS it implies a race as we would have to be freeing a slab and at the same time query an object address. We probably could have used the non-check version, but at this point I don't want to make any functional changes besides these comment fixes. I assume your patch on top would cover it? >> @@ -2896,7 +2901,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p) >> * page_memcg_check(page) will guarantee that a proper memory >> * cgroup pointer or NULL will be returned. >> */ >> - return page_memcg_check(page); >> + return page_memcg_check(folio_page(folio, 0)); >> } >> >> __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) >> diff --git a/mm/slab.h b/mm/slab.h >> index bca9181e96d7..36e0022d8267 100644 >> --- a/mm/slab.h >> +++ b/mm/slab.h >> @@ -412,15 +412,36 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla >> } >> >> #ifdef CONFIG_MEMCG_KMEM >> -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, >> - gfp_t gfp, bool new_page); >> +/* >> + * slab_objcgs - get the object cgroups vector associated with a slab >> + * @slab: a pointer to the slab struct >> + * >> + * Returns a pointer to the object cgroups vector associated with the slab, >> + * or NULL. This function assumes that the slab is known to have an >> + * associated object cgroups vector. It's not safe to call this function >> + * against slabs with underlying pages, which might have an associated memory >> + * cgroup: e.g. kernel stack pages. > > Hm, is it still true? I don't think so. It must be safe to call it for any > slab now. Right, forgot to update after removing the _check variant. 
Changing to: * Returns a pointer to the object cgroups vector associated with the slab, * or NULL if no such vector has been associated yet. > The rest looks good to me, please feel free to add > Reviewed-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org> > after fixing these comments. Thanks! > Thanks! ^ permalink raw reply [flat|nested] 9+ messages in thread
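The comment being reworded above describes the lookup flow of `mem_cgroup_from_obj()` after the conversion: if the folio is a slab, consult the per-object objcg vector (which may legitimately be NULL); otherwise, or in the theoretical race where the slab flag was already cleared, fall back to the folio-level memcg. A toy sketch of that control flow, with all types and names invented for illustration:

```c
#include <stddef.h>

struct toy_memcg { int id; };

/* Collapses folio/slab state into one struct for the sketch. */
struct toy_folio {
	int is_slab;			/* stands in for folio_test_slab() */
	struct toy_memcg **objcgs;	/* stands in for slab_objcgs(); may be NULL */
	struct toy_memcg *folio_memcg;	/* stands in for page_memcg_check() */
};

static struct toy_memcg *toy_memcg_from_obj(struct toy_folio *f,
					    unsigned int off)
{
	if (f->is_slab) {
		/* Vector not set yet (or object unaccounted): NULL. */
		if (!f->objcgs)
			return NULL;
		return f->objcgs[off];
	}
	/* Not (or no longer) a slab: fall back to the folio's memcg. */
	return f->folio_memcg;
}
```

The real function additionally computes `off` via `obj_to_index()` from the object pointer; here it is passed in directly to keep the sketch short.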
* Re: [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab
  [not found] ` <56c7a92b-f830-93dd-3fd0-4b6f1eb723ef-AlSwsSmVLrQ@public.gmane.org>
@ 2022-01-06  3:36 ` Roman Gushchin
  0 siblings, 0 replies; 9+ messages in thread
From: Roman Gushchin @ 2022-01-06  3:36 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim,
	Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton,
	Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw,
	Michal Hocko, Vladimir Davydov, cgroups-u79uwXL29TY76Z2rM5mHXA

On Wed, Jan 05, 2022 at 06:08:45PM +0100, Vlastimil Babka wrote:
> On 1/5/22 03:41, Roman Gushchin wrote:
> > On Tue, Jan 04, 2022 at 01:10:37AM +0100, Vlastimil Babka wrote:
> >> page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages
> >> so convert all the related infrastructure to struct slab. Also use
> >> struct folio instead of struct page when resolving object pointers.
> >>
> >> This is not just mechanistic changing of types and names. Now in
> >> mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret
> >> the folio as a real slab instead of a large kmalloc, instead of relying
> >> on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check().
> >> Similarly in memcg_slab_free_hook() where we can encounter
> >> kmalloc_large() pages (here the folio slab flag check is implied by
> >> virt_to_slab()). As a result, page_objcgs_check() can be dropped instead
> >> of converted.
> >>
> >> To avoid include cycles, move the inline definition of slab_objcgs()
> >> from memcontrol.h to mm/slab.h.
> >>
> >> Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
> >> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> >> Cc: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> >> Cc: Vladimir Davydov <vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> Cc: <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> >> 	/*
> >> 	 * Slab objects are accounted individually, not per-page.
> >> 	 * Memcg membership data for each individual object is saved in
> >> 	 * the page->obj_cgroups.
> >         ^^^^^^^^^^^^^^^^^
> >         slab->memcg_data
>
> Good catch, fixed.
>
> >> 	 */
> >> -	if (page_objcgs_check(page)) {
> >> -		struct obj_cgroup *objcg;
> >> +	if (folio_test_slab(folio)) {
> >> +		struct obj_cgroup **objcgs;
> >> +		struct slab *slab;
> >> 		unsigned int off;
> >>
> >> -		off = obj_to_index(page->slab_cache, page_slab(page), p);
> >> -		objcg = page_objcgs(page)[off];
> >> -		if (objcg)
> >> -			return obj_cgroup_memcg(objcg);
> >> +		slab = folio_slab(folio);
> >> +		objcgs = slab_objcgs(slab);
> >> +		if (!objcgs)
> >> +			return NULL;
> >> +
> >> +		off = obj_to_index(slab->slab_cache, slab, p);
> >> +		if (objcgs[off])
> >> +			return obj_cgroup_memcg(objcgs[off]);
> >>
> >> 		return NULL;
> >> 	}
> >
> > There is a comment below, which needs some changes:
> > /*
> >  * page_memcg_check() is used here, because page_has_obj_cgroups()
> >  * check above could fail because the object cgroups vector wasn't set
> >  * at that moment, but it can be set concurrently.
> >  * page_memcg_check(page) will guarantee that a proper memory
> >  * cgroup pointer or NULL will be returned.
> >  */
> >
> > In reality the folio's slab flag can be cleared before releasing the objcgs
> > vector. It seems that there is no such possibility at setting the flag,
> > it's always set before allocating and assigning the objcg vector.
>
> You're right. I'm changing it to:
>
>  * page_memcg_check() is used here, because in theory we can encounter
>  * a folio where the slab flag has been cleared already, but
>  * slab->memcg_data has not been freed yet
>  * page_memcg_check(page) will guarantee that a proper memory
>  * cgroup pointer or NULL will be returned.
>
> I wrote "in theory" because AFAICS it implies a race as we would have to be
> freeing a slab and at the same time query an object address. We probably
> could have used the non-check version, but at this point I don't want to
> make any functional changes besides these comment fixes.

Sounds good to me.

> I assume your patch on top would cover it?

I tried to master it and remembered why we have this bit in place: there is
a /proc/kpagecgroup interface which just scans over pages and reads their
memcg data. It has zero control over the lifetime of pages, so it's prone
to all kinds of races with setting and clearing the slab flag. So it's
probably better to leave the MEMCG_DATA_OBJCGS bit in place.

> >> @@ -2896,7 +2901,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
> >> 	 * page_memcg_check(page) will guarantee that a proper memory
> >> 	 * cgroup pointer or NULL will be returned.
> >> 	 */
> >> -	return page_memcg_check(page);
> >> +	return page_memcg_check(folio_page(folio, 0));
> >> }
> >>
> >> __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
> >> diff --git a/mm/slab.h b/mm/slab.h
> >> index bca9181e96d7..36e0022d8267 100644
> >> --- a/mm/slab.h
> >> +++ b/mm/slab.h
> >> @@ -412,15 +412,36 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla
> >> }
> >>
> >> #ifdef CONFIG_MEMCG_KMEM
> >> -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
> >> -				 gfp_t gfp, bool new_page);
> >> +/*
> >> + * slab_objcgs - get the object cgroups vector associated with a slab
> >> + * @slab: a pointer to the slab struct
> >> + *
> >> + * Returns a pointer to the object cgroups vector associated with the slab,
> >> + * or NULL. This function assumes that the slab is known to have an
> >> + * associated object cgroups vector. It's not safe to call this function
> >> + * against slabs with underlying pages, which might have an associated memory
> >> + * cgroup: e.g. kernel stack pages.
> >
> > Hm, is it still true? I don't think so. It must be safe to call it for any
> > slab now.
>
> Right, forgot to update after removing the _check variant.
> Changing to:
>
>  * Returns a pointer to the object cgroups vector associated with the slab,
>  * or NULL if no such vector has been associated yet.

Perfect! Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread
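The slab_objcgs() accessor discussed above relies on memcg_data being a tagged pointer: the address of the objcgs vector with the MEMCG_DATA_OBJCGS flag set in its low bits. A minimal userspace C sketch of that encoding follows; the struct layouts and setter are simplified stand-ins (not the kernel's actual definitions), and only the flag values mirror the page_memcg_data_flags enum in 5.16:

```c
#include <assert.h>
#include <stddef.h>

/* Low-bit flags stored in memcg_data, mirroring enum page_memcg_data_flags. */
#define MEMCG_DATA_OBJCGS     (1UL << 0)
#define MEMCG_DATA_KMEM       (1UL << 1)
#define MEMCG_DATA_FLAGS_MASK ((1UL << 2) - 1)

struct obj_cgroup { int dummy; };

/* Illustrative stand-in for struct slab: only the field under discussion. */
struct slab {
	unsigned long memcg_data;
};

/* Install a per-object cgroup vector, tagging the pointer with the
 * MEMCG_DATA_OBJCGS flag (hypothetical helper for illustration). */
static void slab_set_objcgs(struct slab *slab, struct obj_cgroup **vec)
{
	slab->memcg_data = (unsigned long)vec | MEMCG_DATA_OBJCGS;
}

/* Sketch of slab_objcgs(): mask off the flag bits to recover the vector
 * pointer; a zero memcg_data yields NULL (no vector associated yet). */
static struct obj_cgroup **slab_objcgs(const struct slab *slab)
{
	return (struct obj_cgroup **)(slab->memcg_data & ~MEMCG_DATA_FLAGS_MASK);
}
```

The tagging works because the vector comes from the allocator and is always at least word-aligned, leaving the two low bits free for flags.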
* Re: [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab
  [not found] ` <20220104001046.12263-24-vbabka-AlSwsSmVLrQ@public.gmane.org>
  2022-01-05  2:41 ` Roman Gushchin
@ 2022-01-05  2:55 ` Roman Gushchin
  1 sibling, 0 replies; 9+ messages in thread
From: Roman Gushchin @ 2022-01-05  2:55 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim,
	Pekka Enberg, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton,
	Johannes Weiner, Hyeonggon Yoo, patches-cunTk1MwBs/YUNznpcFYbw,
	Michal Hocko, Vladimir Davydov, cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue, Jan 04, 2022 at 01:10:37AM +0100, Vlastimil Babka wrote:
> page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages
> so convert all the related infrastructure to struct slab. Also use
> struct folio instead of struct page when resolving object pointers.
>
> This is not just mechanistic changing of types and names. Now in
> mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret
> the folio as a real slab instead of a large kmalloc, instead of relying
> on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check().
> Similarly in memcg_slab_free_hook() where we can encounter
> kmalloc_large() pages (here the folio slab flag check is implied by
> virt_to_slab()). As a result, page_objcgs_check() can be dropped instead
> of converted.

Btw, it seems that with some minimal changes we can drop the whole thing
with the changing of the lower bit and rely on the slab page flag.

I'll prepare a patch on top of your series.

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread
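The obj_to_index() call in the converted hunks above maps an object pointer back to its slot in the per-slab objcgs vector. The arithmetic can be sketched in userspace as follows; the struct definitions are simplified stand-ins for illustration, and the real kernel helper avoids the runtime division by using reciprocal_divide() on a precomputed per-cache reciprocal:

```c
#include <assert.h>

/* Minimal stand-in for struct kmem_cache: only the object stride matters. */
struct kmem_cache_sketch {
	unsigned int size; /* object size including any per-object metadata */
};

/* Minimal stand-in for struct slab. */
struct slab_sketch {
	void *addr; /* address of the first object in the slab */
};

/* Sketch of obj_to_index(): the object's byte offset within the slab,
 * divided by the object stride, gives its index into the objcgs vector. */
static unsigned int obj_to_index_sketch(const struct kmem_cache_sketch *s,
					const struct slab_sketch *slab,
					const void *p)
{
	return (unsigned int)(((const char *)p - (const char *)slab->addr)
			      / s->size);
}
```

This index is what both the old page_objcgs(page)[off] and the new slab_objcgs(slab)[off] lookups are driven by; only the type carrying memcg_data changed.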
end of thread, other threads:[~2022-01-06 3:36 UTC | newest]
Thread overview: 9+ messages
2022-01-04 0:10 [PATCH v4 00/32] Separate struct slab from struct page Vlastimil Babka
[not found] ` <20220104001046.12263-1-vbabka-AlSwsSmVLrQ@public.gmane.org>
2022-01-04 0:10 ` [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems Vlastimil Babka
[not found] ` <20220104001046.12263-23-vbabka-AlSwsSmVLrQ@public.gmane.org>
2022-01-05 2:12 ` Roman Gushchin
[not found] ` <YdT+qU4xgQeZc/jP-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2022-01-05 16:39 ` Vlastimil Babka
2022-01-04 0:10 ` [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab Vlastimil Babka
[not found] ` <20220104001046.12263-24-vbabka-AlSwsSmVLrQ@public.gmane.org>
2022-01-05 2:41 ` Roman Gushchin
[not found] ` <YdUFXUbeYGdFbVbq-cx5fftMpWqeCjSd+JxjunQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2022-01-05 17:08 ` Vlastimil Babka
[not found] ` <56c7a92b-f830-93dd-3fd0-4b6f1eb723ef-AlSwsSmVLrQ@public.gmane.org>
2022-01-06 3:36 ` Roman Gushchin
2022-01-05 2:55 ` Roman Gushchin