From: David Hildenbrand <david@redhat.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
Matthew Wilcox <willy@infradead.org>,
Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
ying.huang@intel.com, baolin.wang@linux.alibaba.com,
chrisl@kernel.org, hannes@cmpxchg.org, hughd@google.com,
kaleshsingh@google.com, kasong@tencent.com,
linux-kernel@vger.kernel.org, mhocko@suse.com,
minchan@kernel.org, nphamcs@gmail.com, senozhatsky@chromium.org,
shakeel.butt@linux.dev, shy828301@gmail.com, surenb@google.com,
v-songbaohua@oppo.com, xiang@kernel.org, yosryahmed@google.com
Subject: Re: [PATCH v5 4/4] mm: Introduce per-thpsize swapin control policy
Date: Tue, 30 Jul 2024 10:47:17 +0200
Message-ID: <f61235d6-5d33-4853-a498-72db2fb13b10@redhat.com>
In-Reply-To: <f0c7f061-6284-4fe5-8cbf-93281070895b@arm.com>

On 30.07.24 10:36, Ryan Roberts wrote:
> On 29/07/2024 04:52, Matthew Wilcox wrote:
>> On Fri, Jul 26, 2024 at 09:46:18PM +1200, Barry Song wrote:
>>> A user space interface can be implemented to select different swap-in
>>> order policies, similar to the mTHP allocation order policy. We need
>>> a distinct policy because the performance characteristics of memory
>>> allocation differ significantly from those of swap-in. For example,
>>> SSD read speeds can be much slower than memory allocation. With
>>> policy selection, I believe we can implement mTHP swap-in for
>>> non-SWAP_SYNCHRONOUS scenarios as well. However, users need to understand
>>> the implications of their choices. I think that it's better to start
>>> with at least "always" and "never". I believe that we will add "auto" in the
>>> future to tune automatically, which can eventually become the default.
>>
>> I strongly disagree. Use the same sysctl as the other anonymous memory
>> allocations.
>
> I vaguely recall arguing in the past that just because the user has requested 2M
> THP that doesn't mean it's the right thing to do for performance to swap-in the
> whole 2M in one go. That's potentially a pretty huge latency, depending on where
> the backend is, and it could be a waste of IO if the application never touches
> most of the 2M. Although the fact that the application hinted for a 2M THP in
> the first place hopefully means that they are storing objects that need to be
> accessed at similar times. Today it will be swapped in page-by-page then
> eventually collapsed by khugepaged.
>
> But I think those arguments become weaker as the THP size gets smaller. 16K/64K
> swap-in will likely yield significant performance improvements, and I think
> Barry has numbers for this?
>
> So I guess we have a few options:
>
> - Just use the same sysfs interface as for anon allocation, and see if anyone
> reports performance regressions. Investigate one of the options below if an
> issue is raised. That's the simplest and cleanest approach, I think.
>
> - New sysfs interface as Barry has implemented; nobody really wants more
> controls if it can be helped.
>
> - Hardcode a size limit (e.g. 64K); I've tried this in a few different contexts
> and never got any traction.
>
> - Secret option 4: Can we allocate a full-size folio but only choose to swap-in
> to it bit-by-bit? You would need a way to mark which pages of the folio are
> valid (e.g. per-page flag) but I guess that's a non-starter given the strategy to
> remove per-page flags?

Maybe we could allocate a bitmap for folios in the swapcache to store
that information (folio->private).

But I am not convinced that is the right thing to do.

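To make the bitmap idea a bit more concrete, here is a rough userspace
sketch, not kernel code. It assumes one bit per base page, with the bitmap
playing the role of whatever would hang off folio->private. The toy_folio
type and all toy_* helpers are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a large swapcache folio; NOT struct folio. */
struct toy_folio {
	unsigned int nr_pages;	/* base pages covered by the folio */
	unsigned long *valid;	/* assumed role of a bitmap in folio->private */
};

/* Allocate an all-zero bitmap: no page has been swapped in yet. */
static int toy_folio_init(struct toy_folio *f, unsigned int nr_pages)
{
	size_t words = (nr_pages + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD;

	f->nr_pages = nr_pages;
	f->valid = calloc(words, sizeof(unsigned long));
	return f->valid ? 0 : -1;
}

/* Record that the page at index idx now holds swapped-in data. */
static void toy_folio_mark_valid(struct toy_folio *f, unsigned int idx)
{
	assert(idx < f->nr_pages);
	f->valid[idx / TOY_BITS_PER_WORD] |= 1UL << (idx % TOY_BITS_PER_WORD);
}

/* Has the page at index idx been swapped in? */
static bool toy_folio_page_valid(const struct toy_folio *f, unsigned int idx)
{
	assert(idx < f->nr_pages);
	return (f->valid[idx / TOY_BITS_PER_WORD] &
		(1UL << (idx % TOY_BITS_PER_WORD))) != 0;
}

/* Count how many pages of the folio are populated so far. */
static unsigned int toy_folio_count_valid(const struct toy_folio *f)
{
	unsigned int i, n = 0;

	for (i = 0; i < f->nr_pages; i++)
		n += toy_folio_page_valid(f, i);
	return n;
}

static void toy_folio_destroy(struct toy_folio *f)
{
	free(f->valid);
	f->valid = NULL;
}
```

A 2M folio with 4K base pages would use toy_folio_init(&f, 512), i.e. a
64-byte bitmap on a 64-bit machine. The sketch says nothing about locking
or about how a fault on a not-yet-valid page would be handled, which is
where the real complexity would be.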
If we know some basic properties of the backend, can't we automatically
make a pretty good decision regarding the folio size to use? E.g., slow
disk, avoid 2M ...

Avoiding sysctls if possible here would really be preferable...

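As a rough illustration of what such an automatic decision could look
like, here is a userspace sketch. Everything in it is an assumption made
for the example: the toy_backend classification, the per-backend order
caps (the 64K cap simply reuses the figure mentioned above), and the idea
of passing the enabled anon mTHP orders as a bitmask. None of this is an
existing kernel interface.

```c
#include <assert.h>

/* Hypothetical backend classes; the kernel has no such enum. */
enum toy_backend {
	TOY_BACKEND_ZRAM,	/* fast, synchronous, memory-backed */
	TOY_BACKEND_SSD,	/* moderate read latency */
	TOY_BACKEND_HDD,	/* slow, seek-bound */
};

/*
 * Largest swap-in order allowed per backend. The values are made up
 * for illustration and assume 4K base pages:
 * order 9 = 2M, order 4 = 64K, order 0 = single page.
 */
static unsigned int toy_max_swapin_order(enum toy_backend backend)
{
	switch (backend) {
	case TOY_BACKEND_ZRAM:
		return 9;
	case TOY_BACKEND_SSD:
		return 4;
	case TOY_BACKEND_HDD:
		return 0;
	}
	return 0;
}

/*
 * Pick the highest order that is both enabled by the existing anon
 * policy (bit N set in enabled_orders means order N is allowed) and
 * not above the backend cap. Falls back to order 0.
 */
static unsigned int toy_pick_swapin_order(unsigned long enabled_orders,
					  enum toy_backend backend)
{
	unsigned int order;

	for (order = toy_max_swapin_order(backend); order > 0; order--)
		if (enabled_orders & (1UL << order))
			return order;
	return 0;
}
```

With orders 2, 4 and 9 enabled, this picks order 9 for the zram-like
backend, order 4 for the SSD and order 0 for the slow disk, all without
a separate swap-in knob.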
--
Cheers,
David / dhildenb