From: Kairui Song <ryncsn@gmail.com>
To: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	 Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Chris Li <chrisl@kernel.org>, Nhat Pham <nphamcs@gmail.com>,
	 Baoquan He <bhe@redhat.com>, Barry Song <baohua@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] mm/shmem, swap: avoid redundant Xarray lookup during swapin
Date: Wed, 18 Jun 2025 11:07:25 +0800
Message-ID: <CAMgjq7BR=99KDiSy7o_L0u_DYsnZunyokPc6FycrdExSdrdB_w@mail.gmail.com>
In-Reply-To: <17bdc50c-1b2c-bb3b-f828-bd9ce93ea086@huaweicloud.com>

On Wed, Jun 18, 2025 at 10:49 AM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> on 6/18/2025 2:35 AM, Kairui Song wrote:
> > From: Kairui Song <kasong@tencent.com>
> >
> > Currently shmem calls xa_get_order to get the swap radix entry order,
> > requiring a full tree walk. This can be easily combined with the swap
> > entry value checking (shmem_confirm_swap) to avoid the duplicated
> > lookup, which should improve the performance.
> >
> > Signed-off-by: Kairui Song <kasong@tencent.com>
> > ---
> >  mm/shmem.c | 33 ++++++++++++++++++++++++---------
> >  1 file changed, 24 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 4e7ef343a29b..0ad49e57f736 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -505,15 +505,27 @@ static int shmem_replace_entry(struct address_space *mapping,
> >
> >  /*
> >   * Sometimes, before we decide whether to proceed or to fail, we must check
> > - * that an entry was not already brought back from swap by a racing thread.
> > + * that an entry was not already brought back or split by a racing thread.
> >   *
> >   * Checking folio is not enough: by the time a swapcache folio is locked, it
> >   * might be reused, and again be swapcache, using the same swap as before.
> > + * Returns the swap entry's order if it still presents, else returns -1.
> >   */
> > -static bool shmem_confirm_swap(struct address_space *mapping,
> > -                            pgoff_t index, swp_entry_t swap)
> > +static int shmem_swap_check_entry(struct address_space *mapping, pgoff_t index,
> > +                               swp_entry_t swap)
> >  {
> > -     return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
> > +     XA_STATE(xas, &mapping->i_pages, index);
> > +     int ret = -1;
> > +     void *entry;
> > +
> > +     rcu_read_lock();
> > +     do {
> > +             entry = xas_load(&xas);
> > +             if (entry == swp_to_radix_entry(swap))
> > +                     ret = xas_get_order(&xas);
> > +     } while (xas_retry(&xas, entry));
> > +     rcu_read_unlock();
> > +     return ret;
> >  }
> >
> >  /*
> > @@ -2256,16 +2268,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
> >               return -EIO;
> >
> >       si = get_swap_device(swap);
> > -     if (!si) {
> > -             if (!shmem_confirm_swap(mapping, index, swap))
> > +     order = shmem_swap_check_entry(mapping, index, swap);
> > +     if (unlikely(!si)) {
> > +             if (order < 0)
> >                       return -EEXIST;
> >               else
> >                       return -EINVAL;
> >       }
> > +     if (unlikely(order < 0)) {
> > +             put_swap_device(si);
> > +             return -EEXIST;
> > +     }
> Can we re-arrange the code block as following:
>         order = shmem_swap_check_entry(mapping, index, swap);
>         if (unlikely(order < 0))
>                 return -EEXIST;
>
>         si = get_swap_device(swap);
>         if (!si) {
>                 return -EINVAL;
> ...

Hi, thanks for the suggestion.

This may lead to a slightly higher chance of getting -EINVAL when it
should return -EEXIST, leading to user space errors.

For example, suppose this CPU gets interrupted right after `order =
shmem_swap_check_entry(mapping, index, swap);`, and another CPU then
does a swapoff on the device. Next, we get `si = NULL` here, but the
entry has already been swapped in, so we should return -EEXIST, not
-EINVAL.

The chance is really low, so this is a minor issue. We could do a
`goto failed` on (!si) here, but that would make the logic under
`failed:` more complex. So I'd prefer not to change the original
behaviour, which looks more correct.
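
To make the interleaving above concrete, here is a minimal userspace
sketch of the two orderings. It is a toy model under stated assumptions,
not kernel code: every `toy_*` name is hypothetical, the order is always
0, a racing swapoff is modelled as "entry swapped in, then device gone"
in a single step, and device reference counting is ignored.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel API. */
struct toy_state {
	bool entry_present;	/* swap entry is still in the mapping */
	bool device_alive;	/* swap device has not been swapped off */
};

/* Models a racing swapoff: the entry is swapped in, then the device goes. */
void toy_swapoff(struct toy_state *s)
{
	s->entry_present = false;
	s->device_alive = false;
}

/* Stand-in for shmem_swap_check_entry(): the order (0 here), or -1. */
int toy_check_entry(const struct toy_state *s)
{
	return s->entry_present ? 0 : -1;
}

/* Stand-in for get_swap_device(): false plays the role of si == NULL. */
bool toy_get_device(const struct toy_state *s)
{
	return s->device_alive;
}

/* Ordering kept by the patch: look up the device, then check the entry. */
int toy_patch_order(bool race_between_steps)
{
	struct toy_state s = { .entry_present = true, .device_alive = true };
	bool si;
	int order;

	si = toy_get_device(&s);
	if (race_between_steps)
		toy_swapoff(&s);
	order = toy_check_entry(&s);
	if (!si)
		return order < 0 ? -EEXIST : -EINVAL;
	if (order < 0)
		return -EEXIST;	/* the real code also drops the device here */
	return 0;
}

/* Suggested ordering: check the entry first, then look up the device. */
int toy_suggested_order(bool race_between_steps)
{
	struct toy_state s = { .entry_present = true, .device_alive = true };
	int order;

	order = toy_check_entry(&s);
	if (order < 0)
		return -EEXIST;
	if (race_between_steps)
		toy_swapoff(&s);
	if (!toy_get_device(&s))
		return -EINVAL;	/* stale check: the entry is already gone */
	return 0;
}
```

In this model, a swapoff that lands between the two steps makes
`toy_suggested_order(true)` return -EINVAL even though the entry is
gone, while `toy_patch_order(true)` still returns -EEXIST because the
entry check runs last.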


