From: Mel Gorman <mgorman@techsingularity.net>
To: Shakeel Butt <shakeelb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Alexey Avramov <hakavlad@inbox.lv>,
	Rik van Riel <riel@surriel.com>, Mike Galbraith <efault@gmx.de>,
	Darrick Wong <djwong@kernel.org>,
	regressions@lists.linux.dev,
	Linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
Date: Thu, 2 Dec 2021 16:52:20 +0000
Message-ID: <20211202165220.GZ3366@techsingularity.net>
In-Reply-To: <CALvZod6am_QrZCSf_de6eyzbOtKnWuL1CQZVn+srQVt20cnpFg@mail.gmail.com>

On Thu, Dec 02, 2021 at 08:30:51AM -0800, Shakeel Butt wrote:
> Hi Mel,
> 
> On Thu, Dec 2, 2021 at 7:07 AM Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
> > problems due to reclaim throttling for excessive lengths of time.
> > In Alexey's case, a memory hog that should go OOM quickly stalls for
> > several minutes before stalling. In Mike and Darrick's cases, a small
> > memcg environment stalled excessively even though the system had enough
> > memory overall.
> >
> > Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being
> > made") introduced the problem although commit a19594ca4a8b ("mm/vmscan:
> > increase the timeout if page reclaim is not making progress") made it
> > worse. Systems at or near an OOM state that cannot be recovered must
> > reach OOM quickly and memcg should kill tasks if a memcg is near OOM.
> >
> 
> Is there a reason we can't simply revert 69392a403f49 instead of adding
> more code/heuristics? Looking more into 69392a403f49, I don't think the
> code and commit message are in sync.
> 
> For the memcg reclaim, instead of just removing congestion_wait or
> replacing it with schedule_timeout in mem_cgroup_force_empty(), why
> change the behavior of all memcg reclaim? Also this patch effectively
> reverts that behavior of 69392a403f49.
> 

It doesn't fully revert it, but I did consider reverting it. The reason
I preserved it is that the original intent was to throttle somewhat
when progress is not being made, to avoid a premature OOM, and I wanted to
preserve that characteristic. Right now, this is the least harmful way
of doing it.

As for memcg, I removed the NOTHROTTLE because the primary reason why a
memcg might fail to make progress is excessive writeback and that should
still throttle. Completely failing to make progress in a memcg is most
likely due to a memcg-OOM.
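
The distinction above can be sketched roughly as follows. This is an
illustration only, not the code in the patch: struct reclaim_result,
enum throttle_decision and should_throttle() are hypothetical names
invented for this example, and the real decision in mm/vmscan.c depends
on more state than is shown here.

```c
#include <stdbool.h>

/* Hypothetical summary of one reclaim pass; not a kernel structure. */
struct reclaim_result {
	unsigned long nr_reclaimed;	/* pages freed by this pass */
	unsigned long nr_writeback;	/* pages skipped because they were under writeback */
	bool memcg_reclaim;		/* true if reclaim targets a memcg, not the whole system */
};

/* Hypothetical outcomes; the kernel uses its own throttle reasons. */
enum throttle_decision {
	NO_THROTTLE,
	THROTTLE_WRITEBACK,
	THROTTLE_NOPROGRESS,
};

static enum throttle_decision should_throttle(const struct reclaim_result *r)
{
	/* Progress was made, so there is nothing to wait for. */
	if (r->nr_reclaimed > 0)
		return NO_THROTTLE;

	/* Excessive writeback should still throttle, memcg or not. */
	if (r->nr_writeback > 0)
		return THROTTLE_WRITEBACK;

	/*
	 * A memcg that makes no progress at all is most likely at
	 * memcg-OOM, so stalling only delays the inevitable kill.
	 */
	if (r->memcg_reclaim)
		return NO_THROTTLE;

	/* Global reclaim: stall briefly to avoid a premature OOM. */
	return THROTTLE_NOPROGRESS;
}
```

Under these assumptions, a memcg pass with nr_reclaimed == 0 and
nr_writeback == 0 is not throttled, while the same result from global
direct reclaim is.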

> For direct reclaimers under global pressure, why is page allocator a bad
> place for stalling on no progress reclaim? IMHO the callers of the
> reclaim should decide what to do if reclaim is not making progress.

Because it's a layering violation and the caller has little direct control
over the reclaim retry logic. The page allocator has no visibility into
why reclaim failed, only that it did fail.

-- 
Mel Gorman
SUSE Labs

