From: Roman Gushchin <roman.gushchin@linux.dev>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm: readahead: make thp readahead conditional to mmap_miss logic
Date: Wed, 1 Oct 2025 12:47:19 +0000
Message-ID: <aN0i107IF0oQ_PQb@google.com>
In-Reply-To: <hm56lqfeqcpjjpwkzuo4ktv7ayt763htehpi7ie2d47q52gm3w@mgbj42ivvv5g>

On Wed, Oct 01, 2025 at 01:35:39PM +0200, Jan Kara wrote:
> On Tue 30-09-25 07:48:15, Roman Gushchin wrote:
> > Commit 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
> > introduced a special handling for VM_HUGEPAGE mappings: even if the
> > readahead is disabled, 1 or 2 HPAGE_PMD_ORDER pages are
> > allocated.
> > 
> > This change causes a significant regression for containers with a
> > tight memory.max limit, if VM_HUGEPAGE is widely used. Prior to this
> > commit, mmap_miss logic would eventually lead to the readahead
> > disablement, effectively reducing the memory pressure in the
> > cgroup. With this change the kernel is trying to allocate 1-2 huge
> > pages for each fault, no matter if these pages are used or not
> > before being evicted, increasing the memory pressure multi-fold.
> > 
> > To fix the regression, let's make the new VM_HUGEPAGE handling
> > conditional on the mmap_miss check, but keep it independent of
> > ra->ra_pages.
> > This way the main intention of commit 4687fdbb805a ("mm/filemap:
> > Support VM_HUGEPAGE for file mappings") stays intact, but the
> > regression is resolved.
> > 
> > The logic behind this change is simple: even if a user explicitly
> > requests using huge pages to back the file mapping (using VM_HUGEPAGE
> > flag), under a very strong memory pressure it's better to fall back
> > to ordinary pages.
> > 
> > Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
> > Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: linux-mm@kvack.org
> 
> It would be good to get confirmation from Matthew that indeed this
> preserves what he had in mind with commit 4687fdbb805a92 but the change
> looks good to me.

Hi Jan!

Matthew and I had a chat about this issue last week at the Kernel Recipes
conference and broadly agreed on this approach. An explicit Ack from him
would of course still be appreciated.

Long-term, it would be great to use a better metric for memory pressure
here, e.g. PSI, but that is far from trivial.

> Feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Thank you!
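
For anyone skimming the thread, here is a rough userspace model of the
decision described in the commit message. It is only a sketch: struct
ra_model, sync_fault_decision(), the RA_* values and MISS_LIMIT are
made-up names for illustration, the threshold value is an assumption, and
the ordering of the checks is a simplification rather than a copy of the
patch.

```c
#include <stdbool.h>

/* Hypothetical miss threshold, chosen for illustration only. */
#define MISS_LIMIT 100

/* Hypothetical stand-in for the per-file readahead state. */
struct ra_model {
	unsigned int ra_pages;  /* 0 means regular readahead is disabled */
	unsigned int mmap_miss; /* running count of page cache misses */
};

enum ra_action { RA_NONE, RA_HUGE, RA_REGULAR };

/* Decide what a synchronous mmap fault should read ahead in this model. */
static enum ra_action sync_fault_decision(struct ra_model *ra, bool vm_hugepage)
{
	/* Count the miss, saturating so the counter cannot grow unbounded. */
	if (ra->mmap_miss < MISS_LIMIT * 10)
		ra->mmap_miss++;

	/* Far more misses than hits: skip readahead, huge pages included. */
	if (ra->mmap_miss > MISS_LIMIT)
		return RA_NONE;

	/* The huge page path does not depend on ra_pages. */
	if (vm_hugepage)
		return RA_HUGE;

	/* Regular readahead still honours ra_pages == 0. */
	if (!ra->ra_pages)
		return RA_NONE;

	return RA_REGULAR;
}
```

With ra_pages == 0, a VM_HUGEPAGE fault in this model still gets huge page
readahead while the miss counter is low, and falls back to no readahead
once the counter crosses the limit.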


Thread overview: 5+ messages
2025-09-30  5:48 [PATCH] mm: readahead: make thp readahead conditional to mmap_miss logic Roman Gushchin
2025-10-01 11:35 ` Jan Kara
2025-10-01 12:47   ` Roman Gushchin [this message]
2025-10-04 13:08 ` Dev Jain
2025-10-06  8:20   ` Jan Kara
