From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kairui Song <kasong@tencent.com>
Date: Tue, 21 Apr 2026 14:16:53 +0800
Subject: [PATCH v3 09/12] mm, swap: consolidate cluster allocation helpers
Message-Id: <20260421-swap-table-p4-v3-9-2f23759a76bc@tencent.com>
References: <20260421-swap-table-p4-v3-0-2f23759a76bc@tencent.com>
In-Reply-To: <20260421-swap-table-p4-v3-0-2f23759a76bc@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Zi Yan, Baolin Wang, Barry Song,
 Hugh Dickins, Chris Li, Kemeng Shi, Nhat Pham, Baoquan He,
 Johannes Weiner, Youngjun Park, Chengming Zhou, Roman Gushchin,
 Shakeel Butt, Muchun Song, Qi Zheng, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, Kairui Song, Yosry Ahmed, Lorenzo Stoakes,
 Dev Jain, Lance Yang, Michal Hocko, Suren Baghdasaryan, Axel Rasmussen
Reply-To: kasong@tencent.com
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Swap cluster
table management is spread across several narrow helpers. As a result,
the allocation and fallback sequences are open-coded in multiple places.

A few more per-cluster tables will be added soon, so avoid duplicating
these sequences per table type. Fold the existing pairs into
cluster-oriented helpers, and rename for consistency.

No functional change, only a few sanity checks are slightly adjusted.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/swapfile.c | 110 ++++++++++++++++++++++++++--------------------------------
 1 file changed, 49 insertions(+), 61 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8d3d22c463f3..2d16aa89a4fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -411,20 +411,7 @@ static inline unsigned int cluster_offset(struct swap_info_struct *si,
 	return cluster_index(si, ci) * SWAPFILE_CLUSTER;
 }
 
-static struct swap_table *swap_table_alloc(gfp_t gfp)
-{
-	struct folio *folio;
-
-	if (!SWP_TABLE_USE_PAGE)
-		return kmem_cache_zalloc(swap_table_cachep, gfp);
-
-	folio = folio_alloc(gfp | __GFP_ZERO, 0);
-	if (folio)
-		return folio_address(folio);
-	return NULL;
-}
-
-static void swap_table_free_folio_rcu_cb(struct rcu_head *head)
+static void swap_cluster_free_table_folio_rcu_cb(struct rcu_head *head)
 {
 	struct folio *folio;
 
@@ -432,15 +419,46 @@ static void swap_table_free_folio_rcu_cb(struct rcu_head *head)
 	folio_put(folio);
 }
 
-static void swap_table_free(struct swap_table *table)
+static void swap_cluster_free_table(struct swap_cluster_info *ci)
 {
+	struct swap_table *table;
+
+	table = (struct swap_table *)rcu_dereference_protected(ci->table, true);
+	if (!table)
+		return;
+
+	rcu_assign_pointer(ci->table, NULL);
 	if (!SWP_TABLE_USE_PAGE) {
 		kmem_cache_free(swap_table_cachep, table);
 		return;
 	}
 
 	call_rcu(&(folio_page(virt_to_folio(table), 0)->rcu_head),
-		 swap_table_free_folio_rcu_cb);
+		 swap_cluster_free_table_folio_rcu_cb);
+}
+
+static int swap_cluster_alloc_table(struct swap_cluster_info *ci, gfp_t gfp)
+{
+	struct swap_table *table = NULL;
+	struct folio *folio;
+
+	/* The cluster must be empty and not on any list during allocation. */
+	VM_WARN_ON_ONCE(ci->flags || !cluster_is_empty(ci));
+	if (rcu_access_pointer(ci->table))
+		return 0;
+
+	if (SWP_TABLE_USE_PAGE) {
+		folio = folio_alloc(gfp | __GFP_ZERO, 0);
+		if (folio)
+			table = folio_address(folio);
+	} else {
+		table = kmem_cache_zalloc(swap_table_cachep, gfp);
+	}
+	if (!table)
+		return -ENOMEM;
+
+	rcu_assign_pointer(ci->table, table);
+	return 0;
 }
 
 /*
@@ -471,27 +489,15 @@ static void swap_cluster_assert_empty(struct swap_cluster_info *ci,
 	WARN_ON_ONCE(nr == SWAPFILE_CLUSTER && ci->extend_table);
 }
 
-static void swap_cluster_free_table(struct swap_cluster_info *ci)
-{
-	struct swap_table *table;
-
-	/* Only empty cluster's table is allow to be freed */
-	lockdep_assert_held(&ci->lock);
-	table = (void *)rcu_dereference_protected(ci->table, true);
-	rcu_assign_pointer(ci->table, NULL);
-
-	swap_table_free(table);
-}
-
 /*
  * Allocate swap table for one cluster. Attempt an atomic allocation first,
  * then fallback to sleeping allocation.
  */
 static struct swap_cluster_info *
-swap_cluster_alloc_table(struct swap_info_struct *si,
+swap_cluster_populate(struct swap_info_struct *si,
 			 struct swap_cluster_info *ci)
 {
-	struct swap_table *table;
+	int ret;
 
 	/*
 	 * Only cluster isolation from the allocator does table allocation.
@@ -502,14 +508,9 @@ swap_cluster_alloc_table(struct swap_info_struct *si,
 	lockdep_assert_held(&si->global_cluster_lock);
 	lockdep_assert_held(&ci->lock);
 
-	/* The cluster must be free and was just isolated from the free list. */
-	VM_WARN_ON_ONCE(ci->flags || !cluster_is_empty(ci));
-
-	table = swap_table_alloc(__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN);
-	if (table) {
-		rcu_assign_pointer(ci->table, table);
+	if (!swap_cluster_alloc_table(ci, __GFP_HIGH | __GFP_NOMEMALLOC |
+				      __GFP_NOWARN))
 		return ci;
-	}
 
 	/*
 	 * Try a sleep allocation. Each isolated free cluster may cause
@@ -521,7 +522,8 @@ swap_cluster_alloc_table(struct swap_info_struct *si,
 	spin_unlock(&si->global_cluster_lock);
 	local_unlock(&percpu_swap_cluster.lock);
 
-	table = swap_table_alloc(__GFP_HIGH | __GFP_NOMEMALLOC | GFP_KERNEL);
+	ret = swap_cluster_alloc_table(ci, __GFP_HIGH | __GFP_NOMEMALLOC |
+				       GFP_KERNEL);
 
 	/*
 	 * Back to atomic context. We might have migrated to a new CPU with a
@@ -536,20 +538,11 @@ swap_cluster_alloc_table(struct swap_info_struct *si,
 	spin_lock(&si->global_cluster_lock);
 	spin_lock(&ci->lock);
 
-	/* Nothing except this helper should touch a dangling empty cluster. */
-	if (WARN_ON_ONCE(cluster_table_is_alloced(ci))) {
-		if (table)
-			swap_table_free(table);
-		return ci;
-	}
-
-	if (!table) {
+	if (ret) {
 		move_cluster(si, ci, &si->free_clusters, CLUSTER_FLAG_FREE);
 		spin_unlock(&ci->lock);
 		return NULL;
 	}
-
-	rcu_assign_pointer(ci->table, table);
 	return ci;
 }
 
@@ -621,12 +614,11 @@ static struct swap_cluster_info *isolate_lock_cluster(
 	}
 	spin_unlock(&si->lock);
 
-	if (found && !cluster_table_is_alloced(found)) {
-		/* Only an empty free cluster's swap table can be freed. */
-		VM_WARN_ON_ONCE(flags != CLUSTER_FLAG_FREE);
+	/* Cluster's table is freed when and only when it's on the free list. */
+	if (found && flags == CLUSTER_FLAG_FREE) {
 		VM_WARN_ON_ONCE(list != &si->free_clusters);
-		VM_WARN_ON_ONCE(!cluster_is_empty(found));
-		return swap_cluster_alloc_table(si, found);
+		VM_WARN_ON_ONCE(cluster_table_is_alloced(found));
+		return swap_cluster_populate(si, found);
 	}
 
 	return found;
@@ -769,7 +761,6 @@ static int swap_cluster_setup_bad_slot(struct swap_info_struct *si,
 	unsigned int ci_off = offset % SWAPFILE_CLUSTER;
 	unsigned long idx = offset / SWAPFILE_CLUSTER;
 	struct swap_cluster_info *ci;
-	struct swap_table *table;
 	int ret = 0;
 
 	/* si->max may got shrunk by swap swap_activate() */
@@ -790,12 +781,9 @@ static int swap_cluster_setup_bad_slot(struct swap_info_struct *si,
 	}
 
 	ci = cluster_info + idx;
-	if (!ci->table) {
-		table = swap_table_alloc(GFP_KERNEL);
-		if (!table)
-			return -ENOMEM;
-		rcu_assign_pointer(ci->table, table);
-	}
+	/* Need to allocate swap table first for initial bad slot marking. */
+	if (!ci->count && swap_cluster_alloc_table(ci, GFP_KERNEL))
+		return -ENOMEM;
 	spin_lock(&ci->lock);
 	/* Check for duplicated bad swap slots. */
 	if (__swap_table_xchg(ci, ci_off, SWP_TB_BAD) != SWP_TB_NULL) {
@@ -2992,7 +2980,7 @@ static void free_swap_cluster_info(struct swap_cluster_info *cluster_info,
 		ci = cluster_info + i;
 		/* Cluster with bad marks count will have a remaining table */
 		spin_lock(&ci->lock);
-		if (rcu_dereference_protected(ci->table, true)) {
+		if (cluster_table_is_alloced(ci)) {
 			swap_cluster_assert_empty(ci, 0, SWAPFILE_CLUSTER, true);
 			swap_cluster_free_table(ci);
 		}
-- 
2.53.0