From: Kairui Song <ryncsn@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Chris Li <chrisl@kernel.org>, linux-mm <linux-mm@kvack.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	 Hugh Dickins <hughd@google.com>, Baoquan He <bhe@redhat.com>,
	Nhat Pham <nphamcs@gmail.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	 Ying Huang <ying.huang@linux.alibaba.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 David Hildenbrand <david@redhat.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>,  LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 8/9] mm, swap: implement dynamic allocation of swap table
Date: Wed, 3 Sep 2025 10:13:07 +0800
Message-ID: <CAMgjq7Apnxx2GbTG5_dA+Vy1wGaxMHuS9APp0nny8tNURX8jDA@mail.gmail.com>
In-Reply-To: <CAGsJ_4we4ZfNqJ+v7+=0hjNKLakJ-s8qtRsGo_kp0R_th7Xvkw@mail.gmail.com>

On Wed, Sep 3, 2025 at 8:03 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, Sep 3, 2025 at 1:17 AM Chris Li <chrisl@kernel.org> wrote:
> >
> > On Tue, Sep 2, 2025 at 4:15 AM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > On Sat, Aug 23, 2025 at 3:21 AM Kairui Song <ryncsn@gmail.com> wrote:
> > > >
> > > > From: Kairui Song <kasong@tencent.com>
> > > >
> > > > Now that the swap table is cluster based, a free cluster can free its
> > > > table, since no one should modify it.
> > > >
> > > > There could be speculative readers, such as swap cache lookups; protect
> > > > them by making them RCU safe. Every swap table should be filled with null
> > > > entries before it is freed, so such readers will see either a NULL pointer
> > > > or a null-filled table that is being lazily freed.
> > > >
> > > > On allocation, allocate the table when a cluster is used by any order.
> > > >
> > >
> > > Might be a silly question.
> > >
> > > Just curious—what happens if the allocation fails? Does the swap-out
> > > operation also fail? We sometimes encounter strange issues when memory is
> > > very limited, especially if the reclamation path itself needs to allocate
> > > memory.
> > >
> > > Assume a case where we want to swap out a folio using clusterN. We then
> > > attempt to swap out the following folios with the same clusterN. But if
> > > the allocation of the swap_table keeps failing, what will happen?
> >
> > I think this is the same behavior as when XArray fails to allocate a node
> > because there is no memory. The swap allocator will fail to isolate this
> > cluster and get a NULL ci pointer as the return value. It will then try
> > the other cluster lists, e.g. non_full, fragment etc.
>
> What I’m actually concerned about is that we keep iterating on this
> cluster. If we try others, that sounds good.
>
> > If all of them fail, folio_alloc_swap() will return -ENOMEM, which
> > will propagate back to the swap-out attempt and then to the shrink folio
> > list. It will put this page back on the LRU.
> >
> > The shrink folio list either frees enough memory (happy path) or is not
> > able to free enough memory, which will cause an OOM kill.
> >
> > I believe XArray previously also returned -ENOMEM when inserting a
> > pointer if it could not allocate a node to hold that pointer. It has
> > the same error propagation path. We did not change that.
>
> Yes, I agree there was an -ENOMEM, but the difference is that we
> are making a much larger allocation now :-)
>
> One option is to organize every 4 or 8 swap slots into a group for
> allocating or freeing the swap table. This way, we avoid the worst
> case where a single unfreed slot consumes a whole swap table, and
> the allocation size also becomes smaller. However, it’s unclear
> whether the memory savings justify the added complexity and effort.
>
> Anyway, I’m glad to see the current swap_table moving towards merge
> and look forward to running it on various devices. This should help
> us see if it causes any real issues.

Thanks for the insightful review.

I do plan to implement a shrinker that compacts the swap tables of idle or
full clusters under memory pressure. It will be done at the very end;
things will be much cleaner by then, so it will be easier to do. For now,
memory usage already seems quite good.
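
As an aside for readers following the thread, the fallback behavior Chris
describes in the quoted text can be sketched in userspace C. This is only a
rough model under stated assumptions, not the kernel code: struct cluster,
table_alloc(), isolate_cluster(), alloc_slot(), simulate_oom and
SLOTS_PER_CLUSTER are all hypothetical names, and each "cluster list" is
reduced to a single cluster. It shows a cluster being skipped when its table
cannot be allocated, and -ENOMEM being returned only after every candidate
has failed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_CLUSTER 512	/* assumed cluster size, illustrative only */

/* Hypothetical stand-in for a swap cluster; not the kernel's structure. */
struct cluster {
	unsigned long *table;	/* per-cluster swap table, NULL until first use */
	unsigned int used;	/* number of slots handed out so far */
};

/* Test hook: when true, every table allocation fails (memory pressure). */
static bool simulate_oom;

static unsigned long *table_alloc(void)
{
	if (simulate_oom)
		return NULL;
	return calloc(SLOTS_PER_CLUSTER, sizeof(unsigned long));
}

/*
 * Make a cluster usable. Returns NULL if the cluster is full or if its
 * table is missing and cannot be allocated, so the caller moves on.
 */
static struct cluster *isolate_cluster(struct cluster *ci)
{
	if (ci->used >= SLOTS_PER_CLUSTER)
		return NULL;
	if (!ci->table) {
		ci->table = table_alloc();
		if (!ci->table)
			return NULL;
	}
	return ci;
}

/*
 * Walk the candidates in order and take a slot from the first cluster
 * that can be isolated. Returns a global slot index, or -ENOMEM if no
 * candidate could be used.
 */
static long alloc_slot(struct cluster **lists, size_t nr_lists)
{
	for (size_t i = 0; i < nr_lists; i++) {
		struct cluster *ci = isolate_cluster(lists[i]);

		if (!ci)
			continue;	/* do not retry this cluster, try the next */
		return (long)(i * SLOTS_PER_CLUSTER + ci->used++);
	}
	return -ENOMEM;
}
```

In this model a failed table allocation never causes repeated iteration on
the same cluster within one call; the caller only sees -ENOMEM once all
candidates are exhausted, which is the point at which the real swap-out path
would put the folio back on the LRU.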

>
> Thanks
> Barry
>
