From: Roman Gushchin <roman.gushchin@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Liu Shixin <liushixin2@huawei.com>
Subject: Re: [PATCH] mm: consider disabling readahead if there are signs of thrashing
Date: Thu, 10 Jul 2025 15:54:17 -0700
Message-ID: <8734b3ac86.fsf@linux.dev>
In-Reply-To: <20250710135713.916a4898fb02f595206ac861@linux-foundation.org> (Andrew Morton's message of "Thu, 10 Jul 2025 13:57:13 -0700")
Andrew Morton <akpm@linux-foundation.org> writes:
> On Thu, 10 Jul 2025 12:52:32 -0700 Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
>> We've noticed in production that under a very heavy memory pressure
>> the readahead behavior becomes unstable causing spikes in memory
>> pressure and CPU contention on zone locks.
>>
>> The current mmap_miss heuristics considers minor pagefaults as a
>> good reason to decrease mmap_miss and conditionally start async
>> readahead. This creates a vicious cycle: asynchronous readahead
>> loads more pages, which in turn causes more minor pagefaults.
>> This problem is especially pronounced when multiple threads of
>> an application fault on consecutive pages of an evicted executable,
>> aggressively lowering the mmap_miss counter and preventing readahead
>> from being disabled.
>>
>> To improve the logic let's check for !uptodate and workingset
>> folios in do_async_mmap_readahead(). The presence of such pages
>> is a strong indicator of thrashing, which is also used by the
>> delay accounting code, e.g. in folio_wait_bit_common(). So instead
>> of decreasing mmap_miss and lower chances to disable readahead,
>> let's do the opposite and bump it by MMAP_LOTSAMISS / 2.
>
> Are there any testing results to share?
Nothing from production yet, but it makes a big difference in the
reproducer I use (authored by Greg Thelen), which basically runs a
huge binary with twice as many threads as CPUs in a very constrained
memory cgroup. Without this change the system oscillates between
performing more or less well and being completely stuck on zone lock
contention, with 256 threads all competing for a small number of
pages. With this change the system is pretty stable once it reaches
the point where readahead is disabled.
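
To make the counter logic from the quoted patch description concrete,
here is a rough userspace model of it. This is only a sketch under
stated assumptions, not the patch itself: struct ra_model and the
model_*() helpers are hypothetical names, the threshold value of 100
and the MMAP_LOTSAMISS * 10 cap are assumed to mirror mm/filemap.c,
and the real code operates on folios and file_ra_state rather than
on plain booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror the threshold used in mm/filemap.c. */
#define MMAP_LOTSAMISS 100

/* Hypothetical stand-in for the mmap_miss field of struct file_ra_state. */
struct ra_model {
	unsigned int mmap_miss;
};

/* Readahead is considered worthwhile while misses stay at or below the threshold. */
static bool model_readahead_enabled(const struct ra_model *ra)
{
	assert(ra != NULL);
	return ra->mmap_miss <= MMAP_LOTSAMISS;
}

/*
 * Model of the async (minor fault) path with the proposed change.
 * A folio that is !uptodate or marked workingset is treated as a sign
 * of thrashing, so the counter is bumped by MMAP_LOTSAMISS / 2 instead
 * of being decremented. The upper cap is an assumption of this sketch.
 */
static void model_async_fault(struct ra_model *ra, bool uptodate, bool workingset)
{
	assert(ra != NULL);

	if (!uptodate || workingset) {
		ra->mmap_miss += MMAP_LOTSAMISS / 2;
		if (ra->mmap_miss > MMAP_LOTSAMISS * 10)
			ra->mmap_miss = MMAP_LOTSAMISS * 10;
		return;
	}

	/* Ordinary page cache hit: count it against previous misses. */
	if (ra->mmap_miss)
		ra->mmap_miss--;
}
```

In this model, three thrashing-looking minor faults in a row starting
from zero are enough to cross the threshold, while each ordinary hit
afterwards only takes the counter down by one, so a burst of hits from
many threads no longer re-enables readahead immediately.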
>
> What sort of workloads might be harmed by this change?
I hope none, but maybe I'm missing something.
>
> We do seem to be thrashing around (heh) with these readahead
> heuristics. Lots of potential for playing whack-a-mole.
>
> Should we make the readahead code more observable? We don't seem to
> have much in the way of statistics, counters, etc. And no tracepoints,
> which is surprising.
I think readahead is another good mm candidate for eventual
bpf-ization (the first being OOM killer policies, which I'm working
on). For example, I can easily see a policy specific to a file format
making a large difference.

In this particular case I guess we could disable readahead based
on memory PSI metrics, potentially all in bpf.

Thanks