Re: [PATCH] mm: swap: async free swap slot cache entries

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: Chris Li <chrisl@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Wei Xu <weixugc@google.com>, Yu Zhao <yuzhao@google.com>,
	Greg Thelen <gthelen@google.com>,
	Chun-Tse Shao <ctshao@google.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	Brain Geffon <bgeffon@google.com>,
	Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Huang Ying <ying.huang@intel.com>, Nhat Pham <nphamcs@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kairui Song <kasong@tencent.com>,
	Zhongkun He <hezhongkun.hzk@bytedance.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Barry Song <v-songbaohua@oppo.com>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
Date: Fri, 22 Dec 2023 15:16:37 -0800	[thread overview]
Message-ID: <ZYYY1VBKdLHH-Kl3@google.com> (raw)
In-Reply-To: <20231222115208.ab4d2aeacdafa4158b14e532@linux-foundation.org>

On Fri, Dec 22, 2023 at 11:52:08AM -0800, Andrew Morton wrote:
> On Thu, 21 Dec 2023 22:25:39 -0800 Chris Li <chrisl@kernel.org> wrote:
> 
> > We discovered that 1% swap page fault is 100us+ while 50% of
> > the swap fault is under 20us.
> > 
> > Further investigation show that a large portion of the time
> > spent in the free_swap_slots() function for the long tail case.
> > 
> > The percpu cache of swap slots is freed in a batch of 64 entries
> > inside free_swap_slots(). These cache entries are accumulated
> > from previous page faults, which may not be related to the current
> > process.
> > 
> > Doing the batch free in the page fault handler causes longer
> > tail latencies and penalizes the current process.
> > 
> > Move free_swap_slots() outside of the swapin page fault handler into an
> > async work queue to avoid such long tail latencies.
> 
> This will require a larger amount of total work than the current

Yes, there will be a tiny little bit of extra overhead to schedule the job
on to the other work queue.

> scheme.  So we're trading that off against better latency.
> 
> Why is this a good tradeoff?

That is a very good question. Both Hugh and Wei had asked me similar questions
before. +Hugh.

The TL;DR is that it makes the swap more palleralizedable.

Because morden computers typically have more than one CPU and the CPU utilization
is rarely reached to 100%. We are actually not trading the latency for some one
run slower. Most of the time the real impact is that the current swapin page fault
can return quicker so more work can submit to the kernel sooner, at the same time
the other idle CPU can pick up the non latency critical work of freeing of the
swap slot cache entries. The net effect is that we speed things up and increase
the overall system utilization rather than slow things down.

The test result of chromebook and Google production server should be able to show
that it is beneficial to both laptop and server workloads, making them more responsive
in swap related workload.

Chris

next prev parent reply	other threads:[~2023-12-22 23:16 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-22  6:25 [PATCH] mm: swap: async free swap slot cache entries Chris Li
2023-12-22 19:52 ` Andrew Morton
2023-12-22 23:16   ` Chris Li [this message]
2023-12-23  6:11     ` David Rientjes
2023-12-23 16:51       ` Chris Li
2023-12-24  3:01         ` David Rientjes
2023-12-24 18:15           ` Chris Li
2023-12-24 21:13             ` David Rientjes
2023-12-24 22:06               ` Chris Li
2023-12-24 22:20                 ` David Rientjes
2023-12-28 15:34                 ` Yosry Ahmed
2023-12-25  7:07     ` Huang, Ying
2024-02-01  0:43       ` Chris Li
2023-12-23  1:44 ` Nhat Pham
2023-12-23  4:41   ` Chris Li
2023-12-28 15:33 ` Yosry Ahmed
2024-02-01  0:57   ` Chris Li
2024-02-01  1:21     ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZYYY1VBKdLHH-Kl3@google.com \
    --to=chrisl@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bgeffon@google.com \
    --cc=ctshao@google.com \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hezhongkun.hzk@bytedance.com \
    --cc=hughd@google.com \
    --cc=kasong@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=nphamcs@gmail.com \
    --cc=shikemeng@huaweicloud.com \
    --cc=surenb@google.com \
    --cc=v-songbaohua@oppo.com \
    --cc=weixugc@google.com \
    --cc=ying.huang@intel.com \
    --cc=yosryahmed@google.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox