* [PATCH v3 0/3] fork: Page operation cleanups in the fork code
@ 2025-05-09 6:29 Linus Walleij
2025-05-09 6:29 ` [PATCH v3 1/3] fork: Clean-up ifdef logic around stack allocation Linus Walleij
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Linus Walleij @ 2025-05-09 6:29 UTC (permalink / raw)
To: Andrew Morton, linux-mm, Pasha Tatashin, Mateusz Guzik; +Cc: Linus Walleij
This patch set consists of outtakes from a 1 year+ old
patch set from Pasha, which all stand on their own.
See:
https://lore.kernel.org/all/20240311164638.2015063-1-pasha.tatashin@soleen.com/
These are good cleanups for readability
so I split these off, rebased on v6.15-rc1, addressed review
comments and send them separately.
All mentions of dynamic stack are removed from the patch
set as we have no idea whether that will go anywhere.
This is mostly MM related so when the patches are ready
I expect they would land in Andrew's patch stack.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
Changes in v3:
- Drop patches 3 and 5.
- Patch 3 probably only makes sense in the context of
implementing dynamic stack sizing.
- Patch 5 is better addressed after adding the per-arch helper
clear_pahes() to clear more than one page in a go, so I
will wait for this to happen and propose an updated version
at that point.
- Link to v2: https://lore.kernel.org/r/20250507-fork-fixes-v2-0-82ab1e42cde3@linaro.org
Changes in v2:
- Fix subject on patch 2/5
- Fix bisect problem in BUG() guard in patch 2/5
- Move back to using a local nr_pages variable in
patch 3/5 for performance concerns.
- Use preferred patch augment format.
- Link to v1: https://lore.kernel.org/r/20250506-fork-fixes-v1-0-bd35b63f0f1b@linaro.org
---
Pasha Tatashin (3):
fork: Clean-up ifdef logic around stack allocation
fork: Clean-up naming of vm_stack/vm_struct variables in vmap stacks code
fork: check charging success before zeroing stack
kernel/fork.c | 88 +++++++++++++++++++++++++++++------------------------------
1 file changed, 43 insertions(+), 45 deletions(-)
---
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
change-id: 20250504-fork-fixes-9378b6c57873
Best regards,
--
Linus Walleij <linus.walleij@linaro.org>
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v3 1/3] fork: Clean-up ifdef logic around stack allocation
2025-05-09 6:29 [PATCH v3 0/3] fork: Page operation cleanups in the fork code Linus Walleij
@ 2025-05-09 6:29 ` Linus Walleij
2025-05-09 6:29 ` [PATCH v3 2/3] fork: Clean-up naming of vm_stack/vm_struct variables in vmap stacks code Linus Walleij
2025-05-09 6:29 ` [PATCH v3 3/3] fork: check charging success before zeroing stack Linus Walleij
2 siblings, 0 replies; 4+ messages in thread
From: Linus Walleij @ 2025-05-09 6:29 UTC (permalink / raw)
To: Andrew Morton, linux-mm, Pasha Tatashin, Mateusz Guzik; +Cc: Linus Walleij
From: Pasha Tatashin <pasha.tatashin@soleen.com>
There is unneeded OR in the ifdef functions that are used to allocate
and free kernel stacks based on direct map or vmap.
Therefore, clean up by Changing the order so OR is no longer needed.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-3-pasha.tatashin@soleen.com
[linus.walleij@linaro.org: Rebased]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
kernel/fork.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index c4b26cd8998b8e7b2b516e0bb0b1d4676ff644dc..7b9e1ad141baaeb158b1807ea9fc3ef246f5f3a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -185,13 +185,7 @@ static inline void free_task_struct(struct task_struct *tsk)
kmem_cache_free(task_struct_cachep, tsk);
}
-/*
- * Allocate pages if THREAD_SIZE is >= PAGE_SIZE, otherwise use a
- * kmemcache based allocator.
- */
-# if THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)
-
-# ifdef CONFIG_VMAP_STACK
+#ifdef CONFIG_VMAP_STACK
/*
* vmalloc() is a bit slow, and calling vfree() enough times will force a TLB
* flush. Try to minimize the number of calls by caching stacks.
@@ -342,7 +336,13 @@ static void free_thread_stack(struct task_struct *tsk)
tsk->stack_vm_area = NULL;
}
-# else /* !CONFIG_VMAP_STACK */
+#else /* !CONFIG_VMAP_STACK */
+
+/*
+ * Allocate pages if THREAD_SIZE is >= PAGE_SIZE, otherwise use a
+ * kmemcache based allocator.
+ */
+#if THREAD_SIZE >= PAGE_SIZE
static void thread_stack_free_rcu(struct rcu_head *rh)
{
@@ -374,8 +374,7 @@ static void free_thread_stack(struct task_struct *tsk)
tsk->stack = NULL;
}
-# endif /* CONFIG_VMAP_STACK */
-# else /* !(THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)) */
+#else /* !(THREAD_SIZE >= PAGE_SIZE) */
static struct kmem_cache *thread_stack_cache;
@@ -414,7 +413,8 @@ void thread_stack_cache_init(void)
BUG_ON(thread_stack_cache == NULL);
}
-# endif /* THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK) */
+#endif /* THREAD_SIZE >= PAGE_SIZE */
+#endif /* CONFIG_VMAP_STACK */
/* SLAB cache for signal_struct structures (tsk->signal) */
static struct kmem_cache *signal_cachep;
--
2.49.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v3 2/3] fork: Clean-up naming of vm_stack/vm_struct variables in vmap stacks code
2025-05-09 6:29 [PATCH v3 0/3] fork: Page operation cleanups in the fork code Linus Walleij
2025-05-09 6:29 ` [PATCH v3 1/3] fork: Clean-up ifdef logic around stack allocation Linus Walleij
@ 2025-05-09 6:29 ` Linus Walleij
2025-05-09 6:29 ` [PATCH v3 3/3] fork: check charging success before zeroing stack Linus Walleij
2 siblings, 0 replies; 4+ messages in thread
From: Linus Walleij @ 2025-05-09 6:29 UTC (permalink / raw)
To: Andrew Morton, linux-mm, Pasha Tatashin, Mateusz Guzik; +Cc: Linus Walleij
From: Pasha Tatashin <pasha.tatashin@soleen.com>
There are two data types: "struct vm_struct" and "struct vm_stack" that
have the same local variable names: vm_stack, or vm, or s, which makes
the code confusing to read.
Change the code so the naming is consistent:
struct vm_struct is always called vm_area
struct vm_stack is always called vm_stack
One change altering vfree(vm_stack) to vfree(vm_area->addr) may look
like a semantic change but it is not: vm_area->addr points to the
vm_stack. This was done to improve readability.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-4-pasha.tatashin@soleen.com
[linus.walleij@linaro.org: Rebased and added new users of the variable names, address review comments]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
kernel/fork.c | 60 +++++++++++++++++++++++++++++------------------------------
1 file changed, 29 insertions(+), 31 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 7b9e1ad141baaeb158b1807ea9fc3ef246f5f3a7..8b8457562740c114c640a8cc230876f6a286b246 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -198,14 +198,14 @@ struct vm_stack {
struct vm_struct *stack_vm_area;
};
-static bool try_release_thread_stack_to_cache(struct vm_struct *vm)
+static bool try_release_thread_stack_to_cache(struct vm_struct *vm_area)
{
unsigned int i;
for (i = 0; i < NR_CACHED_STACKS; i++) {
struct vm_struct *tmp = NULL;
- if (this_cpu_try_cmpxchg(cached_stacks[i], &tmp, vm))
+ if (this_cpu_try_cmpxchg(cached_stacks[i], &tmp, vm_area))
return true;
}
return false;
@@ -214,11 +214,12 @@ static bool try_release_thread_stack_to_cache(struct vm_struct *vm)
static void thread_stack_free_rcu(struct rcu_head *rh)
{
struct vm_stack *vm_stack = container_of(rh, struct vm_stack, rcu);
+ struct vm_struct *vm_area = vm_stack->stack_vm_area;
if (try_release_thread_stack_to_cache(vm_stack->stack_vm_area))
return;
- vfree(vm_stack);
+ vfree(vm_area->addr);
}
static void thread_stack_delayed_free(struct task_struct *tsk)
@@ -231,32 +232,32 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
static int free_vm_stack_cache(unsigned int cpu)
{
- struct vm_struct **cached_vm_stacks = per_cpu_ptr(cached_stacks, cpu);
+ struct vm_struct **cached_vm_stack_areas = per_cpu_ptr(cached_stacks, cpu);
int i;
for (i = 0; i < NR_CACHED_STACKS; i++) {
- struct vm_struct *vm_stack = cached_vm_stacks[i];
+ struct vm_struct *vm_area = cached_vm_stack_areas[i];
- if (!vm_stack)
+ if (!vm_area)
continue;
- vfree(vm_stack->addr);
- cached_vm_stacks[i] = NULL;
+ vfree(vm_area->addr);
+ cached_vm_stack_areas[i] = NULL;
}
return 0;
}
-static int memcg_charge_kernel_stack(struct vm_struct *vm)
+static int memcg_charge_kernel_stack(struct vm_struct *vm_area)
{
int i;
int ret;
int nr_charged = 0;
- BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
+ BUG_ON(vm_area->nr_pages != THREAD_SIZE / PAGE_SIZE);
for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
- ret = memcg_kmem_charge_page(vm->pages[i], GFP_KERNEL, 0);
+ ret = memcg_kmem_charge_page(vm_area->pages[i], GFP_KERNEL, 0);
if (ret)
goto err;
nr_charged++;
@@ -264,38 +265,35 @@ static int memcg_charge_kernel_stack(struct vm_struct *vm)
return 0;
err:
for (i = 0; i < nr_charged; i++)
- memcg_kmem_uncharge_page(vm->pages[i], 0);
+ memcg_kmem_uncharge_page(vm_area->pages[i], 0);
return ret;
}
static int alloc_thread_stack_node(struct task_struct *tsk, int node)
{
- struct vm_struct *vm;
+ struct vm_struct *vm_area;
void *stack;
int i;
for (i = 0; i < NR_CACHED_STACKS; i++) {
- struct vm_struct *s;
-
- s = this_cpu_xchg(cached_stacks[i], NULL);
-
- if (!s)
+ vm_area = this_cpu_xchg(cached_stacks[i], NULL);
+ if (!vm_area)
continue;
/* Reset stack metadata. */
- kasan_unpoison_range(s->addr, THREAD_SIZE);
+ kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
- stack = kasan_reset_tag(s->addr);
+ stack = kasan_reset_tag(vm_area->addr);
/* Clear stale pointers from reused stack. */
memset(stack, 0, THREAD_SIZE);
- if (memcg_charge_kernel_stack(s)) {
- vfree(s->addr);
+ if (memcg_charge_kernel_stack(vm_area)) {
+ vfree(vm_area->addr);
return -ENOMEM;
}
- tsk->stack_vm_area = s;
+ tsk->stack_vm_area = vm_area;
tsk->stack = stack;
return 0;
}
@@ -311,8 +309,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
if (!stack)
return -ENOMEM;
- vm = find_vm_area(stack);
- if (memcg_charge_kernel_stack(vm)) {
+ vm_area = find_vm_area(stack);
+ if (memcg_charge_kernel_stack(vm_area)) {
vfree(stack);
return -ENOMEM;
}
@@ -321,7 +319,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
* free_thread_stack() can be called in interrupt context,
* so cache the vm_struct.
*/
- tsk->stack_vm_area = vm;
+ tsk->stack_vm_area = vm_area;
stack = kasan_reset_tag(stack);
tsk->stack = stack;
return 0;
@@ -517,11 +515,11 @@ void vm_area_free(struct vm_area_struct *vma)
static void account_kernel_stack(struct task_struct *tsk, int account)
{
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
- struct vm_struct *vm = task_stack_vm_area(tsk);
+ struct vm_struct *vm_area = task_stack_vm_area(tsk);
int i;
for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
- mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB,
+ mod_lruvec_page_state(vm_area->pages[i], NR_KERNEL_STACK_KB,
account * (PAGE_SIZE / 1024));
} else {
void *stack = task_stack_page(tsk);
@@ -537,12 +535,12 @@ void exit_task_stack_account(struct task_struct *tsk)
account_kernel_stack(tsk, -1);
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
- struct vm_struct *vm;
+ struct vm_struct *vm_area;
int i;
- vm = task_stack_vm_area(tsk);
+ vm_area = task_stack_vm_area(tsk);
for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
- memcg_kmem_uncharge_page(vm->pages[i], 0);
+ memcg_kmem_uncharge_page(vm_area->pages[i], 0);
}
}
--
2.49.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v3 3/3] fork: check charging success before zeroing stack
2025-05-09 6:29 [PATCH v3 0/3] fork: Page operation cleanups in the fork code Linus Walleij
2025-05-09 6:29 ` [PATCH v3 1/3] fork: Clean-up ifdef logic around stack allocation Linus Walleij
2025-05-09 6:29 ` [PATCH v3 2/3] fork: Clean-up naming of vm_stack/vm_struct variables in vmap stacks code Linus Walleij
@ 2025-05-09 6:29 ` Linus Walleij
2 siblings, 0 replies; 4+ messages in thread
From: Linus Walleij @ 2025-05-09 6:29 UTC (permalink / raw)
To: Andrew Morton, linux-mm, Pasha Tatashin, Mateusz Guzik; +Cc: Linus Walleij
From: Pasha Tatashin <pasha.tatashin@soleen.com>
No need to do zero cached stack if memcg charge fails, so move the
charging attempt before the memset operation.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-6-pasha.tatashin@soleen.com
[linus.walleij@linaro.org: Rebased]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
kernel/fork.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 8b8457562740c114c640a8cc230876f6a286b246..d6907c49ee87a96a10e231496d27bd29d50be881 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -280,6 +280,11 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
if (!vm_area)
continue;
+ if (memcg_charge_kernel_stack(vm_area)) {
+ vfree(vm_area->addr);
+ return -ENOMEM;
+ }
+
/* Reset stack metadata. */
kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
@@ -288,11 +293,6 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
/* Clear stale pointers from reused stack. */
memset(stack, 0, THREAD_SIZE);
- if (memcg_charge_kernel_stack(vm_area)) {
- vfree(vm_area->addr);
- return -ENOMEM;
- }
-
tsk->stack_vm_area = vm_area;
tsk->stack = stack;
return 0;
--
2.49.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-05-09 6:29 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-09 6:29 [PATCH v3 0/3] fork: Page operation cleanups in the fork code Linus Walleij
2025-05-09 6:29 ` [PATCH v3 1/3] fork: Clean-up ifdef logic around stack allocation Linus Walleij
2025-05-09 6:29 ` [PATCH v3 2/3] fork: Clean-up naming of vm_stack/vm_struct variables in vmap stacks code Linus Walleij
2025-05-09 6:29 ` [PATCH v3 3/3] fork: check charging success before zeroing stack Linus Walleij
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).