Linux cgroups development
 help / color / mirror / Atom feed
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Harry Yoo <harry@kernel.org>
Cc: Hao Li <hao.li@linux.dev>, Christoph Lameter <cl@gentwo.org>,
	 David Rientjes <rientjes@google.com>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	 Suren Baghdasaryan <surenb@google.com>,
	Alexei Starovoitov <ast@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	 Alexander Potapenko <glider@google.com>,
	Marco Elver <elver@google.com>,
	 Dmitry Vyukov <dvyukov@google.com>,
	kasan-dev@googlegroups.com,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	 "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Subject: [PATCH v3 00/15] mm/slab: introduce alloc_flags and slab_alloc_context
Date: Mon, 15 Jun 2026 13:54:33 +0200	[thread overview]
Message-ID: <20260615-slab_alloc_flags-v3-0-ce1146d140fb@kernel.org> (raw)

This series is now in slab/for-next, based on the slab-for-7.2 tag that
was sent as first PR to Linus. Posting new version due to many
accumulated changes, for final rounds of review. The plan is to send a
second slab PR with this early next week, if nothing explodes.

Git: https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags

The slab implementation currently relies on gfp flags to convey
some context information internally:

- The absence of both __GFP_RECLAIM flags is interpreted as "cannot spin
  on locks", and intended to be used by kmalloc_nolock(). But false
  positives are possible e.g. during early boot where gfp_allowed_mask
  clears __GFP_RECLAIM from all allocations. This leads to unnecessary
  allocation failures and workarounds such as fd3634312a04 ("debugobject:
  Make it work with deferred page initialization - again").

- __GFP_NO_OBJ_EXT exists and takes up valuable bit in the gfp flags
  space, only to prevent recursive kmalloc() allocations for obj_ext
  arrays and sheaves.

The page allocator uses its internal alloc_flags to convey various
context information, including ALLOC_TRYLOCK (meaning "cannot spin").
This series copies that concept for the slab allocator, with its own
slab-specific internal flags:

- SLAB_ALLOC_DEFAULT - no extra flags (the value is 0), but explicit
- SLAB_ALLOC_NOLOCK - do not spin on locks (used by kmalloc_nolock())
- SLAB_ALLOC_NEW_SLAB - replacing existing 'bool new_slab' parameter
			for allocating obj_ext arrays
- SLAB_ALLOC_NO_RECURSE - replacing usage of __GFP_NO_OBJ_EXT

To reduce the amount of parameters in various internal functions, we
additionally introduce slab_alloc_context (also inspired by page
allocator's alloc_context) for passing a number of existing arguments
and the new alloc_flags:

/* Structure holding extra parameters for slab allocations */
struct slab_alloc_context {
	unsigned long caller_addr;
	size_t orig_size;
	unsigned int alloc_flags;
	struct list_lru *lru;
};

This also replaces the existing struct partial_context.

The last necessary piece is kmalloc_flags() which can take the
alloc_flags in addition to gfp flags and is intended for the recursive
allocations of sheaves and obj_ext arrays, so that both
SLAB_ALLOC_NOLOCK and SLAB_ALLOC_NO_RECURSE can be communicated.
Internally it decides between kmalloc_nolock() and normal kmalloc()
depending on SLAB_ALLOC_NOLOCK.

The rest of the series is gradually expanding the usage of both
alloc_flags and slab_alloc_context as necessary, with bits of
refactoring. Then, __GFP_NO_OBJ_EXT is removed completely.

Note that some usage of gfpflags_allow_spinning() relying on absence of
__GFP_RECLAIM remains outside of slab (and page allocator) in memcg,
page_owner and stackdepot code. These can thus yield false-positive
decisions that spinning is not allowed, but should not result in
important allocations failing anymore.

Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
Changes in v3:
- Applied R-b tags from Harry, Hao, Suren (thanks!)
- Former Patch 1 "mm/slab: do not limit zeroing to orig_size when only
  red zoning is enabled" fast tracked as a fix to slab-for-7.2 PR.
- Patch 1: refactor kasan_init handling (Harry).
- Constify struct slab_alloc_context usage eveywhere (Suren)
- Rename SLAB_ALLOC_TRYLOCK to SLAB_ALLOC_NOLOCK (Suren, Alexei)
- Reorder patches 5 and 6 (formerly 6 7) (Suren)
- Move trynode_flags refactoring from 7 to 6 to avoid bisection
  hazard.
- In Patch 14, support temporarily both __GFP_NO_OBJ_EXT and
  SLAB_ALLOC_NO_RECURSE to prevent obj_ext -> sheaves -> obj_ext
  recursion (Sashiko)
- Expand OBJCGS_CLEAR_MASK to allow kmalloc_nolock() warnings
  (Hao Li, Shengming Hu).
- Link to v2: https://patch.msgid.link/20260610-slab_alloc_flags-v2-0-7190909db118@kernel.org

Changes in v2:
- Due to Sashiko review, drop the idea of zeroing orig_size
  unconditionally, as it can break krealloc(). Thanks to that found a
  pre-existing bug fixed by the new Patch 1. The kfence zeroing related
  cleanup is implemented differently in Patch 2.
- Prevent nested kmalloc_nolock warnings due to added gfp flags
  (Sashiko)
- Fix a pre-existing issue with opportunistic slab allocation from the
  target node only effectively dropping __GFP_NOMEMALLOC and __GFP_RECLAIM.
  (Sashiko)
- Move kmalloc_flags() definitions to mm/slab.h (per Harry).
- Link to v1: https://patch.msgid.link/20260609-slab_alloc_flags-v1-0-2bf4a4b9b526@kernel.org

---
Vlastimil Babka (SUSE) (15):
      mm/slab: do not init any kfence objects on allocation
      mm/slab: stop inlining __slab_alloc_node()
      mm/slab: introduce slab_alloc_context
      mm/slab: introduce alloc_flags and SLAB_ALLOC_NOLOCK
      mm/slab: replace struct partial_context with slab_alloc_context
      mm/slab: add alloc_flags to slab_alloc_context
      mm/slab: pass alloc_flags to new slab allocation
      mm/slab: pass alloc_flags through slab_post_alloc_hook() chain
      mm/slab: replace slab_alloc_node() parameters with slab_alloc_context
      mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags
      mm/slab: pass slab_alloc_context to __do_kmalloc_node()
      mm/slab: allow __GFP_NOMEMALLOC and __GFP_NOWARN for kmalloc_nolock()
      mm/slab: introduce kmalloc_flags()
      mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts()
      mm/slab: replace __GFP_NO_OBJ_EXT with SLAB_ALLOC_NO_RECURSE for sheaves

 include/linux/slab.h |   5 +-
 mm/kfence/core.c     |   2 +-
 mm/memcontrol.c      |   5 +-
 mm/slab.h            |  29 ++-
 mm/slub.c            | 488 +++++++++++++++++++++++++++++++--------------------
 5 files changed, 329 insertions(+), 200 deletions(-)
---
base-commit: dfdfd58cce1c3f5df8733b64595448996c08e424
change-id: 20260601-slab_alloc_flags-25c782b0c57c

Best regards,
--  
Vlastimil Babka (SUSE) <vbabka@kernel.org>


             reply	other threads:[~2026-06-15 11:54 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-15 11:54 Vlastimil Babka (SUSE) [this message]
2026-06-15 11:54 ` [PATCH v3 01/15] mm/slab: do not init any kfence objects on allocation Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 02/15] mm/slab: stop inlining __slab_alloc_node() Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 03/15] mm/slab: introduce slab_alloc_context Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 04/15] mm/slab: introduce alloc_flags and SLAB_ALLOC_NOLOCK Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 05/15] mm/slab: replace struct partial_context with slab_alloc_context Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 06/15] mm/slab: add alloc_flags to slab_alloc_context Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 07/15] mm/slab: pass alloc_flags to new slab allocation Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 08/15] mm/slab: pass alloc_flags through slab_post_alloc_hook() chain Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 09/15] mm/slab: replace slab_alloc_node() parameters with slab_alloc_context Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 10/15] mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 11/15] mm/slab: pass slab_alloc_context to __do_kmalloc_node() Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 12/15] mm/slab: allow __GFP_NOMEMALLOC and __GFP_NOWARN for kmalloc_nolock() Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 13/15] mm/slab: introduce kmalloc_flags() Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 14/15] mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts() Vlastimil Babka (SUSE)
2026-06-15 11:54 ` [PATCH v3 15/15] mm/slab: replace __GFP_NO_OBJ_EXT with SLAB_ALLOC_NO_RECURSE for sheaves Vlastimil Babka (SUSE)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260615-slab_alloc_flags-v3-0-ce1146d140fb@kernel.org \
    --to=vbabka@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=ast@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=cl@gentwo.org \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hao.li@linux.dev \
    --cc=harry@kernel.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox