From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Matthew Wilcox, Hugh Dickins, Chris Li,
	David Hildenbrand, Yosry Ahmed, "Huang, Ying", Nhat Pham,
	Johannes Weiner, Baolin Wang, Baoquan He, Barry Song,
	Kalesh Singh, Kemeng Shi, Tim Chen, Ryan Roberts,
	linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH 20/28] mm, swap: check swap table directly for checking cache
Date: Thu, 15 May 2025 04:17:20 +0800
Message-ID: <20250514201729.48420-21-ryncsn@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250514201729.48420-1-ryncsn@gmail.com>
References: <20250514201729.48420-1-ryncsn@gmail.com>
Reply-To: Kairui Song
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Kairui Song

Instead of looking at the swap map, check the swap table directly to
tell if a swap entry has a cache. This prepares for removing
SWAP_HAS_CACHE.

Signed-off-by: Kairui Song
---
 mm/memory.c     | 12 +++++------
 mm/swap.h       |  6 ++++++
 mm/swap_state.c | 11 ++++++++++
 mm/swapfile.c   | 54 +++++++++++++++++++++++--------------------
 4 files changed, 48 insertions(+), 35 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a70624a55aa2..a9a548575e72 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4314,15 +4314,15 @@ static struct folio *__alloc_swap_folio(struct vm_fault *vmf)
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static inline int non_swapcache_batch(swp_entry_t entry, int max_nr)
+static inline int non_swapcache_batch(swp_entry_t entry, unsigned int max_nr)
 {
-	struct swap_info_struct *si = swp_info(entry);
-	pgoff_t offset = swp_offset(entry);
-	int i;
+	unsigned int i;
 
 	for (i = 0; i < max_nr; i++) {
-		if ((si->swap_map[offset + i] & SWAP_HAS_CACHE))
-			return i;
+		/* Page table lock pins the swap entries / swap device */
+		if (swap_cache_check_folio(entry))
+			break;
+		entry.val++;
 	}
 
 	return i;
diff --git a/mm/swap.h b/mm/swap.h
index 467996dafbae..2ae4624a0e48 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -186,6 +186,7 @@ static inline struct address_space *swap_address_space(swp_entry_t entry)
 extern struct folio *swap_cache_get_folio(swp_entry_t entry);
 extern struct folio *swap_cache_add_folio(swp_entry_t entry,
 		struct folio *folio, void **shadow, bool swapin);
+extern bool swap_cache_check_folio(swp_entry_t entry);
 extern void *swap_cache_get_shadow(swp_entry_t entry);
 
 /* Below helpers requires the caller to lock the swap cluster. */
 extern void __swap_cache_del_folio(swp_entry_t entry,
@@ -395,6 +396,11 @@ static inline void *swap_cache_get_shadow(swp_entry_t end)
 	return NULL;
 }
 
+static inline bool swap_cache_check_folio(swp_entry_t entry)
+{
+	return false;
+}
+
 static inline unsigned int folio_swap_flags(struct folio *folio)
 {
 	return 0;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index c8bb16835612..ea6a1741db5c 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -266,6 +266,17 @@ struct folio *swap_cache_get_folio(swp_entry_t entry)
 	return folio;
 }
 
+/*
+ * Check if a swap entry has a folio cached; may return a false positive.
+ * The caller must hold a reference to the swap device or pin it in other ways.
+ */
+bool swap_cache_check_folio(swp_entry_t entry)
+{
+	swp_te_t swp_te;
+	swp_te = __swap_table_get(swp_cluster(entry), swp_offset(entry));
+	return swp_te_is_folio(swp_te);
+}
+
 /*
  * If we are the only user, then try to free up the swap cache.
  *
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ef233466725e..0f2a499ff2c9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -181,15 +181,19 @@ static long swap_usage_in_pages(struct swap_info_struct *si)
 #define TTRS_FULL		0x4
 
 static bool swap_only_has_cache(struct swap_info_struct *si,
-			      unsigned long offset, int nr_pages)
+				struct swap_cluster_info *ci,
+				unsigned long offset, int nr_pages)
 {
 	unsigned char *map = si->swap_map + offset;
 	unsigned char *map_end = map + nr_pages;
+	swp_te_t entry;
 
 	do {
+		entry = __swap_table_get(ci, offset);
 		VM_BUG_ON(!(*map & SWAP_HAS_CACHE));
-		if (*map != SWAP_HAS_CACHE)
+		if (*map)
 			return false;
+		offset++;
 	} while (++map < map_end);
 
 	return true;
@@ -247,11 +251,11 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
 
 	/*
	 * It's safe to delete the folio from swap cache only if the folio's
-	 * swap_map is HAS_CACHE only, which means the slots have no page table
+	 * entry is swap cache only, which means the slots have no page table
 	 * reference or pending writeback, and can't be allocated to others.
 	 */
 	ci = swap_lock_cluster(si, offset);
-	need_reclaim = swap_only_has_cache(si, offset, nr_pages);
+	need_reclaim = swap_only_has_cache(si, ci, offset, nr_pages);
 	swap_unlock_cluster(ci);
 	if (!need_reclaim)
 		goto out_unlock;
@@ -660,29 +664,21 @@ static bool cluster_reclaim_range(struct swap_info_struct *si,
 	spin_unlock(&ci->lock);
 	do {
-		switch (READ_ONCE(map[offset])) {
-		case 0:
-			offset++;
+		if (swap_count(READ_ONCE(map[offset])))
 			break;
-		case SWAP_HAS_CACHE:
-			nr_reclaim = __try_to_reclaim_swap(si, offset, TTRS_ANYWAY);
-			if (nr_reclaim > 0)
-				offset += nr_reclaim;
-			else
-				goto out;
+		nr_reclaim = __try_to_reclaim_swap(si, offset, TTRS_ANYWAY);
+		if (nr_reclaim > 0)
+			offset += nr_reclaim;
+		else if (nr_reclaim < 1)
 			break;
-		default:
-			goto out;
-		}
-	} while (offset < end);
-out:
+	} while (++offset < end);
 	spin_lock(&ci->lock);
 
 	/*
 	 * Recheck the range no matter reclaim succeeded or not, the slot
 	 * could have been be freed while we are not holding the lock.
 	 */
 	for (offset = start; offset < end; offset++)
-		if (READ_ONCE(map[offset]))
+		if (map[offset] || !swp_te_is_null(__swap_table_get(ci, offset)))
 			return false;
 
 	return true;
@@ -700,16 +696,13 @@ static bool cluster_scan_range(struct swap_info_struct *si,
 		return true;
 
 	for (offset = start; offset < end; offset++) {
-		switch (READ_ONCE(map[offset])) {
-		case 0:
-			continue;
-		case SWAP_HAS_CACHE:
+		if (swap_count(map[offset]))
+			return false;
+		if (swp_te_is_folio(__swap_table_get(ci, offset))) {
+			VM_WARN_ON_ONCE(!(map[offset] & SWAP_HAS_CACHE));
 			if (!vm_swap_full())
 				return false;
 			*need_reclaim = true;
-			continue;
-		default:
-			return false;
 		}
 	}
 
@@ -821,7 +814,8 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
 		to_scan--;
 
 		while (offset < end) {
-			if (READ_ONCE(map[offset]) == SWAP_HAS_CACHE) {
+			if (!swap_count(map[offset]) &&
+			    swp_te_is_folio(__swap_table_get(ci, offset))) {
 				spin_unlock(&ci->lock);
 				nr_reclaim = __try_to_reclaim_swap(si, offset,
 								   TTRS_ANYWAY);
@@ -1590,7 +1584,7 @@ void __swap_cache_put_entries(struct swap_info_struct *si,
 			      struct swap_cluster_info *ci,
 			      swp_entry_t entry, unsigned int size)
 {
-	if (swap_only_has_cache(si, swp_offset(entry), size))
+	if (swap_only_has_cache(si, ci, swp_offset(entry), size))
 		swap_free_entries(si, ci, swp_offset(entry), size);
 	else
 		for (int i = 0; i < size; i++, entry.val++)
@@ -1802,6 +1796,7 @@ void do_put_swap_entries(swp_entry_t entry, int nr)
 	struct swap_info_struct *si;
 	bool any_only_cache = false;
 	unsigned long offset;
+	swp_te_t swp_te;
 
 	si = get_swap_device(entry);
 	if (WARN_ON_ONCE(!si))
@@ -1826,7 +1821,8 @@ void do_put_swap_entries(swp_entry_t entry, int nr)
 	 */
 	for (offset = start_offset; offset < end_offset; offset += nr) {
 		nr = 1;
-		if (READ_ONCE(si->swap_map[offset]) == SWAP_HAS_CACHE) {
+		swp_te = __swap_table_get(swp_offset_cluster(si, offset), offset);
+		if (!swap_count(si->swap_map[offset]) && swp_te_is_folio(swp_te)) {
 			/*
 			 * Folios are always naturally aligned in swap so
 			 * advance forward to the next boundary. Zero means no
-- 
2.49.0
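
For readers outside the context of the series: the predicate this patch
switches to is "the swap table slot points to a folio, and the map byte
holds no remaining swap count". The standalone C sketch below models
that check in userspace; swp_te_t, the even/odd entry encoding, and the
helper bodies here are simplified stand-ins assumed for illustration,
not the series' actual kernel definitions.

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	#define SWAP_COUNT_MASK	0x3f	/* low bits of a swap_map byte hold the count */

	/* Stand-in for the series' swap table entry type. */
	typedef struct { uintptr_t val; } swp_te_t;

	/* Assumed encoding: 0 = empty, odd = shadow, even non-zero = folio. */
	static bool swp_te_is_folio(swp_te_t te)
	{
		return te.val && !(te.val & 1);
	}

	/* Page table references recorded in a swap_map byte. */
	static unsigned char swap_count(unsigned char map)
	{
		return map & SWAP_COUNT_MASK;
	}

	/*
	 * Model of the new swap_only_has_cache(): every slot in the range
	 * must cache a folio (per the table) and hold no count (per the map).
	 */
	static bool swap_only_has_cache(const unsigned char *map,
					const swp_te_t *table,
					unsigned long offset, int nr_pages)
	{
		for (int i = 0; i < nr_pages; i++) {
			if (!swp_te_is_folio(table[offset + i]))
				return false;	/* nothing cached in this slot */
			if (swap_count(map[offset + i]))
				return false;	/* still referenced by a page table */
		}
		return true;
	}

	int main(void)
	{
		unsigned char map[3] = { 0, 0, 1 };	/* slot 2 keeps a count */
		swp_te_t table[3] = { { 0x1000 }, { 0x2000 }, { 0x3000 } };

		printf("%d\n", swap_only_has_cache(map, table, 0, 2));	/* 1 */
		printf("%d\n", swap_only_has_cache(map, table, 0, 3));	/* 0 */
		return 0;
	}

The same model explains the false-positive caveat on
swap_cache_check_folio(): read without the cluster lock, the table
entry can change under the reader and is only a hint, which is why
non_swapcache_batch() relies on the page table lock to pin the entries.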