From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Matthew Wilcox <willy@infradead.org>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Chris Li <chrisl@kernel.org>, Nhat Pham <nphamcs@gmail.com>,
	Baoquan He <bhe@redhat.com>, Barry Song <baohua@kernel.org>,
	linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>
Subject: [PATCH v4 5/9] mm/shmem, swap: avoid false positive swap cache lookup
Date: Sat,  5 Jul 2025 02:17:44 +0800
Message-ID: <20250704181748.63181-6-ryncsn@gmail.com>
In-Reply-To: <20250704181748.63181-1-ryncsn@gmail.com>

From: Kairui Song <kasong@tencent.com>

If a shmem read request's index points to the middle of a large swap
entry, shmem swapin does the swap cache lookup using the large swap
entry's starting value (the first sub swap entry of this large
entry). This leads to false positive lookup results if only the first
few swap entries are cached but the actual requested swap entry
pointed to by index is uncached. This is not a rare event, as swap
readahead always tries to cache order 0 folios when possible.

Currently, when this occurs, shmem splits the large entry, aborts
due to a mismatching folio swap value, then retries the swapin from
the beginning, which wastes CPU and adds wrong info to the
readahead statistics.

This can be optimized easily by doing the lookup using the right
swap entry value.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
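Note (illustration only, not part of the patch): the sub entry
calculation described above can be modelled in user space as below.
This is a minimal sketch under stated assumptions: ROUND_DOWN,
struct model_swp_entry, sub_entry_offset() and sub_entry() are
hypothetical stand-ins for the kernel's round_down(), swp_entry_t,
and the swp_entry()/swp_offset() arithmetic in the hunk, not the
kernel definitions themselves.

```c
#include <assert.h>

/* Stand-in for the kernel's round_down(); y must be a power of two. */
#define ROUND_DOWN(x, y) ((x) & ~((y) - 1))

/* Hypothetical model of a swap entry: a device type plus an offset. */
struct model_swp_entry {
	unsigned int type;
	unsigned long offset;
};

/* Distance of index from the start of its naturally aligned large entry. */
static unsigned long sub_entry_offset(unsigned long index, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	return index - ROUND_DOWN(index, 1UL << order);
}

/*
 * Given the first sub entry of a large entry, return the sub entry
 * that index actually refers to. This is the value the swap cache
 * lookup should use.
 */
static struct model_swp_entry sub_entry(struct model_swp_entry head,
					unsigned long index,
					unsigned int order)
{
	struct model_swp_entry sub = head;

	sub.offset += sub_entry_offset(index, order);
	return sub;
}
```

For example, with order 4 (16 pages) and index 37, the large entry
starts at index 32, so the offset is 5. If the large entry's first
swap offset were 1000, the lookup should use offset 1005; looking up
offset 1000 instead may hit a folio cached by readahead that does
not cover index 37.
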
 mm/shmem.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 217264315842..2ab214e2771c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2274,14 +2274,15 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-	swap = index_entry = radix_to_swp_entry(*foliop);
+	index_entry = radix_to_swp_entry(*foliop);
+	swap = index_entry;
 	*foliop = NULL;
 
-	if (is_poisoned_swp_entry(swap))
+	if (is_poisoned_swp_entry(index_entry))
 		return -EIO;
 
-	si = get_swap_device(swap);
-	order = shmem_confirm_swap(mapping, index, swap);
+	si = get_swap_device(index_entry);
+	order = shmem_confirm_swap(mapping, index, index_entry);
 	if (unlikely(!si)) {
 		if (order < 0)
 			return -EEXIST;
@@ -2293,6 +2294,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EEXIST;
 	}
 
+	/* index may point to the middle of a large entry, get the sub entry */
+	if (order) {
+		offset = index - round_down(index, 1 << order);
+		swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+	}
+
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
@@ -2305,8 +2312,10 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 		/* Skip swapcache for synchronous device. */
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
-			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
+			folio = shmem_swap_alloc_folio(inode, vma, index,
+						       index_entry, order, gfp);
 			if (!IS_ERR(folio)) {
+				swap = index_entry;
 				skip_swapcache = true;
 				goto alloced;
 			}
@@ -2320,17 +2329,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			if (error == -EEXIST)
 				goto failed;
 		}
-
-		/*
-		 * Now swap device can only swap in order 0 folio, it is
-		 * necessary to recalculate the new swap entry based on
-		 * the offset, as the swapin index might be unalgined.
-		 */
-		if (order) {
-			offset = index - round_down(index, 1 << order);
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-		}
-
+		/* Cached swapin with readahead, only supports order 0 */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
 			error = -ENOMEM;
-- 
2.50.0



Thread overview: 17+ messages
2025-07-04 18:17 [PATCH v4 0/9] mm/shmem, swap: bugfix and improvement of mTHP swap-in Kairui Song
2025-07-04 18:17 ` [PATCH v4 1/9] mm/shmem, swap: improve cached mTHP handling and fix potential hung Kairui Song
2025-07-04 18:17 ` [PATCH v4 2/9] mm/shmem, swap: avoid redundant Xarray lookup during swapin Kairui Song
2025-07-04 18:17 ` [PATCH v4 3/9] mm/shmem, swap: tidy up THP swapin checks Kairui Song
2025-07-04 18:17 ` [PATCH v4 4/9] mm/shmem, swap: tidy up swap entry splitting Kairui Song
2025-07-06  3:35   ` Baolin Wang
2025-07-06 11:50     ` Kairui Song
2025-07-04 18:17 ` Kairui Song [this message]
2025-07-07  7:53   ` [PATCH v4 5/9] mm/shmem, swap: avoid false positive swap cache lookup Baolin Wang
2025-07-07  8:04     ` Kairui Song
2025-07-08  6:00       ` Baolin Wang
2025-07-04 18:17 ` [PATCH v4 6/9] mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO Kairui Song
2025-07-07  8:05   ` Baolin Wang
2025-07-04 18:17 ` [PATCH v4 7/9] mm/shmem, swap: simplify swapin path and result handling Kairui Song
2025-07-07  8:14   ` Baolin Wang
2025-07-04 18:17 ` [PATCH v4 8/9] mm/shmem, swap: simplify swap entry and index calculation of large swapin Kairui Song
2025-07-04 18:17 ` [PATCH v4 9/9] mm/shmem, swap: fix major fault counting Kairui Song
