From: Michal Hocko <mhocko@suse.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Liu Shixin <liushixin2@huawei.com>, Yu Zhao <yuzhao@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Huang Ying <ying.huang@intel.com>,
	Sachin Sant <sachinp@linux.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
Date: Wed, 22 Nov 2023 14:19:37 +0100
Message-ID: <ZV3_6UH28KMt0ZDb@tiehlicka>
In-Reply-To: <CAJD7tka0=JR1s0OzQ0+H8ksFhvB2aBHXx_2-hVc97Enah9DqGQ@mail.gmail.com>

On Wed 22-11-23 02:39:15, Yosry Ahmed wrote:
> On Wed, Nov 22, 2023 at 2:09 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 22-11-23 09:52:42, Michal Hocko wrote:
> > > On Tue 21-11-23 22:44:32, Yosry Ahmed wrote:
> > > > On Tue, Nov 21, 2023 at 10:41 PM Liu Shixin <liushixin2@huawei.com> wrote:
> > > > >
> > > > >
> > > > > On 2023/11/21 21:00, Michal Hocko wrote:
> > > > > > On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> > > > > >
> > > > > > However, in swapcache_only mode, the scan count still increased when scan
> > > > > > non-swapcache pages because there are large number of non-swapcache pages
> > > > > > and rare swapcache pages in swapcache_only mode, and if the non-swapcache
> > > > > > is skipped and do not count, the scan of pages in isolate_lru_folios() can
> > > > > > eventually lead to hung task, just as Sachin reported [2].
> > > > > > I find this paragraph really confusing! I guess what you meant to say is
> > > > > > that a real swapcache_only is problematic because it can end up not
> > > > > > making any progress, correct?
> > > > > > This paragraph is meant to explain why swapcache_only is checked after scan += nr_pages;
> > > > > >
> > > > > > AFAIU you have addressed that problem by making swapcache_only anon LRU
> > > > > > specific, right? That would be certainly more robust as you can still
> > > > > > reclaim from file LRUs. I cannot say I like that because swapcache_only
> > > > > > is a bit confusing and I do not think we want to grow more special
> > > > > > > purpose reclaim types. Would it be possible/reasonable to put
> > > > > > > swapcache pages on the file LRU instead?
> > > > > It looks like a good idea, but I'm not sure if it's possible. I can try it, is there anything to
> > > > > pay attention to?
> > > >
> > > > I think this might be more intrusive than we think. Every time a page
> > > > is added to or removed from the swap cache, we will need to move it
> > > > between LRUs. All pages on the anon LRU will need to go through the
> > > > file LRU before being reclaimed. I think this might be too big of a
> > > > change to achieve this patch's goal.
> > >
> > > TBH I am not really sure how complex that might turn out to be.
> > > Swapcache tends to be full of subtle issues. So you might be right but
> > > it would be better to know _why_ this is not possible before we end up
> > > fishing for a couple of swapcache pages on a potentially huge anon LRU to
> > > isolate them. Think of TB sized machines in this context.
> >
> > Forgot to mention that it is not really far-fetched to compare this
> > to MADV_FREE pages. Those are anonymous but we do not want to keep them
> > on the anon LRU because we want to age them independently of swap
> > availability as they are just dropped during reclaim. Not too much
> > different from swapcache pages. There are more constraints on those but
> > fundamentally this is the same problem, no?
> 
> I agree it's not a first, but swap cache pages are more complicated
> because they can go back and forth, unlike MADV_FREE pages which
> usually go on a one way ticket AFAICT.

Yes swapcache pages are indeed more complicated but most of the time
they just go away as well, no? MADV_FREE pages can be reinstated as well
if they are written to. So fundamentally they are not that different.

> Also pages going into the swap
> cache can be much more common than MADV_FREE pages for a lot of
> workloads. I am not sure how different reclaim heuristics will react
> to such mobility between the LRUs, and the fact that all pages will
> now only get evicted through the file LRU. The anon LRU will
> essentially become an LRU that feeds the file LRU. Also, the more
> pages we move between LRUs, the more ordering violations we introduce,
> as we may put colder pages in front of hotter pages or vice versa.

Well, traditionally the file LRU has been maintaining page cache or
easily disposable pages like MADV_FREE (which can be considered a cache
as well). Swapcache is a form of a page cache as well.
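
To make the placement question concrete, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct toy_page, enum toy_lru
and both functions are hypothetical names made up for illustration. It
assumes the current rule is roughly "pages that are not swap backed live
on the file LRU" (which is how MADV_FREE pages end up there once their
swap-backed state is cleared) and contrasts it with the idea of also
treating swapcache pages as file LRU residents.

```c
#include <stdbool.h>

/* Toy model only: the two properties that matter for this discussion. */
struct toy_page {
	bool swapbacked;	/* needs swap space to be reclaimed */
	bool in_swapcache;	/* already has a swap slot assigned */
};

enum toy_lru { TOY_LRU_ANON, TOY_LRU_FILE };

/*
 * Assumed current rule: anything that is not swap backed goes to the
 * file LRU. MADV_FREE pages have swapbacked cleared, so they land here.
 */
static enum toy_lru toy_lru_current(const struct toy_page *p)
{
	return p->swapbacked ? TOY_LRU_ANON : TOY_LRU_FILE;
}

/*
 * Hypothetical rule from this thread: swapcache pages are treated as
 * cache and placed on the file LRU as well.
 */
static enum toy_lru toy_lru_proposed(const struct toy_page *p)
{
	if (p->in_swapcache)
		return TOY_LRU_FILE;
	return toy_lru_current(p);
}
```

Under the assumed current rule a swapcache page (swapbacked and
in_swapcache both set) sits on the anon LRU; under the hypothetical rule
it would move to the file LRU whenever in_swapcache flips, which is the
mobility between LRUs being debated.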

> All in all, I am not saying it's a bad idea or not possible, I am just
> saying it's probably more complicated than MADV_FREE, and adding more
> cases where pages move between LRUs could introduce problems (or make
> existing problems more visible).

Do we want to start adding filtered anon scan for a certain type of
pages? Because this is the question here AFAICS. This might seem an
easier solution but I would argue that it is a less predictable one.
It is not unusual that a huge anon LRU would contain only very few
swapcache pages.
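
A minimal sketch of why a filtered scan is hard to predict, again as a
userspace toy model rather than kernel code (struct toy_scan_result and
toy_filtered_scan() are hypothetical names). The LRU is modelled as an
array of flags, and skipped pages count toward the scan budget, which is
my reading of the "scan += nr_pages" ordering discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_scan_result {
	size_t scanned;		/* pages looked at, matching or not */
	size_t isolated;	/* swapcache pages actually found */
};

/*
 * Walk the modelled LRU and pick only swapcache pages. Every page
 * visited is charged to the budget, so the walk is bounded but may
 * find nothing at all.
 */
static struct toy_scan_result
toy_filtered_scan(const bool *is_swapcache, size_t nr_pages, size_t nr_to_scan)
{
	struct toy_scan_result res = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_pages && res.scanned < nr_to_scan; i++) {
		res.scanned++;
		if (is_swapcache[i])
			res.isolated++;
	}
	return res;
}
```

With a single swapcache page at index 6 of an 8 page list, a budget of 4
gives scanned = 4 and isolated = 0, while a budget of 8 gives scanned = 8
and isolated = 1. Either the batch makes no progress or the whole list
has to be walked; not charging skipped pages to the budget would remove
the bound entirely.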

That being said, I might be missing some obvious or less obvious reasons
why this is a completely bad idea. Swapcache is indeed subtle.
-- 
Michal Hocko
SUSE Labs


