Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()

From: Michal Hocko <mhocko@suse.com>
To: Lance Yang <ioworker0@gmail.com>
Cc: akpm@linux-foundation.org, zokeefe@google.com, david@redhat.com,
	songmuchun@bytedance.com, shy828301@gmail.com, peterx@redhat.com,
	mknyszek@google.com, minchan@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()
Date: Fri, 19 Jan 2024 13:51:38 +0100
Message-ID: <ZapwWuVTIDeI3W8A@tiehlicka>
In-Reply-To: <CAK1f24k+=Sskotbct+yGxpDKNv=qyXPkww5i2kaqfzwaUVO_GQ@mail.gmail.com>

On Fri 19-01-24 10:03:05, Lance Yang wrote:
> Hey Michal,
> 
> Thanks for taking the time to review!
> 
> On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Thu 18-01-24 20:03:46, Lance Yang wrote:
> > [...]
> >
> > Before we discuss the semantics, let's focus on the use case.
> >
> > > Use Cases
> > >
> > > An immediate user of this new functionality is the Go runtime heap allocator
> > > that manages memory in hugepage-sized chunks. In the past, whether it was a
> > > newly allocated chunk through mmap() or a reused chunk released by
> > > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > > respectively. However, both approaches resulted in performance issues; for
> > > both scenarios, there could be entries into direct reclaim and/or compaction,
> > > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
> >
> > IIUC the primary reason is the cost of the huge page allocation which
> > can be really high if the memory is heavily fragmented and it is called
> > synchronously from the process directly, correct? Can that be worked
> 
> Yes, that's correct.
> 
> > around by process_madvise and performing the operation from a different
> > context? Are there any other reasons to have a different mode?
> 
> In latency-sensitive scenarios, some applications aim to enhance performance
> by utilizing huge pages as much as possible. At the same time, in case of
> allocation failure, they prefer a quick return without triggering direct memory
> reclamation and compaction.

Could you elaborate some more on why?

> > I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> > e.g. non blocking one to make sure that the caller doesn't really block
> > on resource contention (be it locks or memory availability) because that
> > matches our non-blocking interface in other areas. But having a LIGHT
> > operation sounds really vague: the exact semantics would be
> > implementation-specific and might change over time. Non-blocking has
> > clear semantics, but it is not clear whether that is what you
> > really need/want.
> 
> Could you provide me with some suggestions regarding the naming of a
> more relaxed (opportunistic) MADV_COLLAPSE?

Naming is not all that important at this stage (it could be
MADV_COLLAPSE_NOBLOCK, for example). The primary question is whether
non-blocking behavior in general is what is desired, or whether the
implementation should try, but not too hard.
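
For context, the existing synchronous path looks roughly like the sketch
below when driven from userspace. This is only an illustration and not
code from the patch: hpage_align_up() and try_collapse() are made-up
helper names, the 2 MiB huge page size is an assumption (the x86-64
default), and the fallback define of MADV_COLLAPSE mirrors the value in
the uapi headers of kernels that have it (6.1 and later). A call like
this may enter direct reclaim and/or compaction in the caller's context,
which is the stall being discussed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* uapi value; older libc headers may lack it */
#endif

/* Assumption: PMD-sized huge pages are 2 MiB, as on x86-64. */
#define HPAGE_SIZE ((size_t)2 << 20)

/* Round addr up to the next huge-page boundary (hypothetical helper). */
uintptr_t hpage_align_up(uintptr_t addr)
{
	return (addr + HPAGE_SIZE - 1) & ~(uintptr_t)(HPAGE_SIZE - 1);
}

/*
 * Synchronously ask the kernel to collapse [addr, addr + len) into huge
 * pages (hypothetical helper). Returns 0 on success or a negative errno.
 * The call blocks in the caller's context, so a fragmented system can
 * make it slow; failure is not fatal, the range simply stays backed by
 * base pages.
 */
int try_collapse(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;
	return -errno;
}
```

A caller would pass a huge-page-aligned range and treat any negative
return (e.g. -EINVAL on kernels without MADV_COLLAPSE) as "keep using
base pages".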

-- 
Michal Hocko
SUSE Labs
