From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: zhongjinji@honor.com
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@suse.com,
rientjes@google.com, shakeel.butt@linux.dev, npache@redhat.com,
linux-kernel@vger.kernel.org, tglx@linutronix.de,
mingo@redhat.com, peterz@infradead.org, dvhart@infradead.org,
dave@stgolabs.net, andrealmeid@igalia.com,
liam.howlett@oracle.com, liulu.liu@honor.com, feng.han@honor.com
Subject: Re: [PATCH v4 3/3] mm/oom_kill: Have the OOM reaper and exit_mmap() traverse the maple tree in opposite orders
Date: Fri, 15 Aug 2025 15:29:24 +0100
Message-ID: <8e20a389-9733-4882-85a0-b244046b8b51@lucifer.local>
In-Reply-To: <20250814135555.17493-4-zhongjinji@honor.com>
On Thu, Aug 14, 2025 at 09:55:55PM +0800, zhongjinji@honor.com wrote:
> From: zhongjinji <zhongjinji@honor.com>
>
> When a process is OOM killed, if the OOM reaper and the thread running
> exit_mmap() execute at the same time, both will traverse the vma's maple
> tree along the same path. They may easily unmap the same vma, causing them
> to compete for the pte spinlock. This increases unnecessary load, causing
> the execution time of the OOM reaper and the thread running exit_mmap() to
> increase.

You're not giving any numbers, and this seems pretty niche. Are you really
exiting that many processes with the reaper running at the exact same time
that this is an issue? And waiting on a spinlock, at that?

This commit message is very unconvincing.

>
> When a process exits, exit_mmap() traverses the vma's maple tree from low to high
> address. To reduce the chance of unmapping the same vma simultaneously,
> the OOM reaper should traverse vma's tree from high to low address. This reduces
> lock contention when unmapping the same vma.

Are they going to run through and do their work in exactly the same amount
of time, or might one 'run past' the other so that you still have an issue?

This seems very vague and timing dependent and, again, not convincing.

>
> Signed-off-by: zhongjinji <zhongjinji@honor.com>
> ---
> include/linux/mm.h | 3 +++
> mm/oom_kill.c | 9 +++++++--
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0c44bb8ce544..b665ea3c30eb 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -923,6 +923,9 @@ static inline void vma_iter_set(struct vma_iterator *vmi, unsigned long addr)
> #define for_each_vma_range(__vmi, __vma, __end) \
> while (((__vma) = vma_find(&(__vmi), (__end))) != NULL)
>
> +#define for_each_vma_reverse(__vmi, __vma) \
> + while (((__vma) = vma_prev(&(__vmi))) != NULL)

Please don't casually add an undocumented public VMA iterator hidden in a
patch doing something else :)

Won't this skip the first VMA? I'm not sure this is really worth having as a
general thing anyway, as not many people want to iterate in reverse.

> +
> #ifdef CONFIG_SHMEM
> /*
> * The vma_is_shmem is not inline because it is used only by slow
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 7ae4001e47c1..602d6836098a 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -517,7 +517,7 @@ static bool __oom_reap_task_mm(struct mm_struct *mm)
> {
> struct vm_area_struct *vma;
> bool ret = true;
> - VMA_ITERATOR(vmi, mm, 0);
> + VMA_ITERATOR(vmi, mm, ULONG_MAX);
>
> /*
> * Tell all users of get_user/copy_from_user etc... that the content
> @@ -527,7 +527,12 @@ static bool __oom_reap_task_mm(struct mm_struct *mm)
> */
> set_bit(MMF_UNSTABLE, &mm->flags);
>
> - for_each_vma(vmi, vma) {
> + /*
> + * When two tasks unmap the same vma at the same time, they may contend for the
> + * pte spinlock. To avoid traversing the same vma as exit_mmap unmap, traverse
> + * the vma maple tree in reverse order.
> + */

Except you won't necessarily avoid anything: if one walker is faster than
the other it'll run ahead, plus of course there'll be a cross-over point
where they share the same PTE anyway.

I feel like maybe you've got a fairly specific situation that indicates an
issue elsewhere, and you're maybe solving the wrong problem here?

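
To make the timing dependence concrete, here is a toy userspace model. It is
not kernel code and none of it comes from the patch: struct walker,
make_walker() and overlap_ticks() are made-up names, the "VMAs" are just
array slots 0..n-1, and each walker is assumed to spend a fixed number of
ticks per slot. It only counts the ticks in which both walkers sit on the
same slot, as a rough stand-in for the window in which they could contend.

```c
/*
 * Toy model only (hypothetical names, not kernel APIs).
 * Assumption: a fixed per-slot cost. A real walker's cost varies per VMA
 * and is cheaper when the other walker has already unmapped it.
 */
struct walker {
	int pos;	/* slot currently being "unmapped" */
	int step;	/* +1 = low to high, -1 = high to low */
	int cost;	/* ticks spent on each slot */
	int left;	/* ticks left on the current slot */
};

static struct walker make_walker(int pos, int step, int cost)
{
	struct walker w = { pos, step, cost, cost };

	return w;
}

static int walker_done(const struct walker *w, int n)
{
	return w->pos < 0 || w->pos >= n;
}

/* Advance one tick; move to the next slot once the current one is done. */
static void walker_tick(struct walker *w)
{
	if (--w->left == 0) {
		w->pos += w->step;
		w->left = w->cost;
	}
}

/* Number of ticks in which both walkers are on the same slot. */
static int overlap_ticks(int n, struct walker a, struct walker b)
{
	int overlap = 0;

	/* Once either walker finishes, no further overlap is possible. */
	while (!walker_done(&a, n) && !walker_done(&b, n)) {
		if (a.pos == b.pos)
			overlap++;
		walker_tick(&a);
		walker_tick(&b);
	}
	return overlap;
}
```

With 9 slots, two same-direction walkers at equal cost 2 overlap for all 18
ticks, but with costs 1 and 3 they overlap for a single tick before the
faster one runs ahead. Opposite-direction walkers at equal cost 2 still
overlap for 2 ticks on the middle slot, and for 0 ticks with 8 slots. In this
sketch the overlap is driven by relative timing as much as by direction.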
> + for_each_vma_reverse(vmi, vma) {
> if (vma->vm_flags & (VM_HUGETLB|VM_PFNMAP))
> continue;
>
> --
> 2.17.1
>
>