All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>, Chris Li <chrisl@kernel.org>,
	Nhat Pham <nphamcs@gmail.com>,
	Yosry Ahmed <yosry.ahmed@linux.dev>,
	David Hildenbrand <david@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Youngjun Park <youngjun.park@lge.com>,
	Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>
Subject: Re: [PATCH v4 13/19] mm, swap: remove workaround for unsynchronized swap map cache state
Date: Thu, 18 Dec 2025 11:37:25 +0800	[thread overview]
Message-ID: <aUN29S2CDv0KbfXj@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20251205-swap-table-p2-v4-13-cb7e28a26a40@tencent.com>

On 12/05/25 at 03:29am, Kairui Song wrote:
> From: Kairui Song <kasong@tencent.com>
> 
> Remove the "skip if exists" check from commit a65b0e7607ccb ("zswap:
> make shrinking memcg-aware"). It was needed because there is a tiny time
> window between setting the SWAP_HAS_CACHE bit and actually adding the
> folio to the swap cache. If a user is trying to add the folio into the
> swap cache but another user was interrupted after setting SWAP_HAS_CACHE
> but hasn't added the folio to the swap cache yet, it might lead to a
> deadlock.
> 
> We have moved the bit setting to the same critical section as adding the
> folio, so this is no longer needed. Remove it and clean it up.
> 
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
>  mm/swap.h       |  2 +-
>  mm/swap_state.c | 27 ++++++++++-----------------
>  mm/zswap.c      |  2 +-
>  3 files changed, 12 insertions(+), 19 deletions(-)

Reviewed-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/mm/swap.h b/mm/swap.h
> index b5075a1aee04..6777b2ab9d92 100644
> --- a/mm/swap.h
> +++ b/mm/swap.h
> @@ -260,7 +260,7 @@ int swap_cache_add_folio(struct folio *folio, swp_entry_t entry,
>  void swap_cache_del_folio(struct folio *folio);
>  struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_flags,
>  				     struct mempolicy *mpol, pgoff_t ilx,
> -				     bool *alloced, bool skip_if_exists);
> +				     bool *alloced);
>  /* Below helpers require the caller to lock and pass in the swap cluster. */
>  void __swap_cache_del_folio(struct swap_cluster_info *ci,
>  			    struct folio *folio, swp_entry_t entry, void *shadow);
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index df7df8b75e52..1a69ba3be87f 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -445,8 +445,6 @@ void swap_update_readahead(struct folio *folio, struct vm_area_struct *vma,
>   * @folio: folio to be added.
>   * @gfp: memory allocation flags for charge, can be 0 if @charged if true.
>   * @charged: if the folio is already charged.
> - * @skip_if_exists: if the slot is in a cached state, return NULL.
> - *                  This is an old workaround that will be removed shortly.
>   *
>   * Update the swap_map and add folio as swap cache, typically before swapin.
>   * All swap slots covered by the folio must have a non-zero swap count.
> @@ -457,8 +455,7 @@ void swap_update_readahead(struct folio *folio, struct vm_area_struct *vma,
>   */
>  static struct folio *__swap_cache_prepare_and_add(swp_entry_t entry,
>  						  struct folio *folio,
> -						  gfp_t gfp, bool charged,
> -						  bool skip_if_exists)
> +						  gfp_t gfp, bool charged)
>  {
>  	struct folio *swapcache = NULL;
>  	void *shadow;
> @@ -478,7 +475,7 @@ static struct folio *__swap_cache_prepare_and_add(swp_entry_t entry,
>  		 * might return a folio that is irrelevant to the faulting
>  		 * entry because @entry is aligned down. Just return NULL.
>  		 */
> -		if (ret != -EEXIST || skip_if_exists || folio_test_large(folio))
> +		if (ret != -EEXIST || folio_test_large(folio))
>  			goto failed;
>  
>  		swapcache = swap_cache_get_folio(entry);
> @@ -511,8 +508,6 @@ static struct folio *__swap_cache_prepare_and_add(swp_entry_t entry,
>   * @mpol: NUMA memory allocation policy to be applied
>   * @ilx: NUMA interleave index, for use only when MPOL_INTERLEAVE
>   * @new_page_allocated: sets true if allocation happened, false otherwise
> - * @skip_if_exists: if the slot is a partially cached state, return NULL.
> - *                  This is a workaround that would be removed shortly.
>   *
>   * Allocate a folio in the swap cache for one swap slot, typically before
>   * doing IO (e.g. swap in or zswap writeback). The swap slot indicated by
> @@ -525,8 +520,7 @@ static struct folio *__swap_cache_prepare_and_add(swp_entry_t entry,
>   */
>  struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_mask,
>  				     struct mempolicy *mpol, pgoff_t ilx,
> -				     bool *new_page_allocated,
> -				     bool skip_if_exists)
> +				     bool *new_page_allocated)
>  {
>  	struct swap_info_struct *si = __swap_entry_to_info(entry);
>  	struct folio *folio;
> @@ -547,8 +541,7 @@ struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_mask,
>  	if (!folio)
>  		return NULL;
>  	/* Try add the new folio, returns existing folio or NULL on failure. */
> -	result = __swap_cache_prepare_and_add(entry, folio, gfp_mask,
> -					      false, skip_if_exists);
> +	result = __swap_cache_prepare_and_add(entry, folio, gfp_mask, false);
>  	if (result == folio)
>  		*new_page_allocated = true;
>  	else
> @@ -577,7 +570,7 @@ struct folio *swapin_folio(swp_entry_t entry, struct folio *folio)
>  	unsigned long nr_pages = folio_nr_pages(folio);
>  
>  	entry = swp_entry(swp_type(entry), round_down(offset, nr_pages));
> -	swapcache = __swap_cache_prepare_and_add(entry, folio, 0, true, false);
> +	swapcache = __swap_cache_prepare_and_add(entry, folio, 0, true);
>  	if (swapcache == folio)
>  		swap_read_folio(folio, NULL);
>  	return swapcache;
> @@ -605,7 +598,7 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  
>  	mpol = get_vma_policy(vma, addr, 0, &ilx);
>  	folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
> -					&page_allocated, false);
> +				       &page_allocated);
>  	mpol_cond_put(mpol);
>  
>  	if (page_allocated)
> @@ -724,7 +717,7 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  		/* Ok, do the async read-ahead now */
>  		folio = swap_cache_alloc_folio(
>  			swp_entry(swp_type(entry), offset), gfp_mask, mpol, ilx,
> -			&page_allocated, false);
> +			&page_allocated);
>  		if (!folio)
>  			continue;
>  		if (page_allocated) {
> @@ -742,7 +735,7 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  skip:
>  	/* The page was likely read above, so no need for plugging here */
>  	folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
> -					&page_allocated, false);
> +				       &page_allocated);
>  	if (unlikely(page_allocated))
>  		swap_read_folio(folio, NULL);
>  	return folio;
> @@ -847,7 +840,7 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
>  				continue;
>  		}
>  		folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
> -						&page_allocated, false);
> +					       &page_allocated);
>  		if (si)
>  			put_swap_device(si);
>  		if (!folio)
> @@ -869,7 +862,7 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
>  skip:
>  	/* The folio was likely read above, so no need for plugging here */
>  	folio = swap_cache_alloc_folio(targ_entry, gfp_mask, mpol, targ_ilx,
> -					&page_allocated, false);
> +				       &page_allocated);
>  	if (unlikely(page_allocated))
>  		swap_read_folio(folio, NULL);
>  	return folio;
> diff --git a/mm/zswap.c b/mm/zswap.c
> index a7a2443912f4..d8a33db9d3cc 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1015,7 +1015,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  
>  	mpol = get_task_policy(current);
>  	folio = swap_cache_alloc_folio(swpentry, GFP_KERNEL, mpol,
> -				       NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
> +				       NO_INTERLEAVE_INDEX, &folio_was_allocated);
>  	put_swap_device(si);
>  	if (!folio)
>  		return -ENOMEM;
> 
> -- 
> 2.52.0
> 



  reply	other threads:[~2025-12-18  3:37 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-04 19:29 [PATCH v4 00/19] mm, swap: swap table phase II: unify swapin use swap cache and cleanup flags Kairui Song
2025-12-04 19:29 ` [PATCH v4 01/19] mm, swap: rename __read_swap_cache_async to swap_cache_alloc_folio Kairui Song
2025-12-11  1:01   ` Baoquan He
2025-12-04 19:29 ` [PATCH v4 02/19] mm, swap: split swap cache preparation loop into a standalone helper Kairui Song
2025-12-04 19:29 ` [PATCH v4 03/19] mm, swap: never bypass the swap cache even for SWP_SYNCHRONOUS_IO Kairui Song
2025-12-04 19:29 ` [PATCH v4 04/19] mm, swap: always try to free swap cache for SWP_SYNCHRONOUS_IO devices Kairui Song
2025-12-04 19:29 ` [PATCH v4 05/19] mm, swap: simplify the code and reduce indention Kairui Song
2025-12-04 19:29 ` [PATCH v4 06/19] mm, swap: free the swap cache after folio is mapped Kairui Song
2025-12-11  4:21   ` Baoquan He
2025-12-04 19:29 ` [PATCH v4 07/19] mm/shmem: never bypass the swap cache for SWP_SYNCHRONOUS_IO Kairui Song
2025-12-04 19:29 ` [PATCH v4 08/19] mm/shmem, swap: remove SWAP_MAP_SHMEM Kairui Song
2025-12-04 19:29 ` [PATCH v4 09/19] mm, swap: swap entry of a bad slot should not be considered as swapped out Kairui Song
2025-12-15  3:57   ` Baoquan He
2025-12-15  4:12     ` Kairui Song
2025-12-04 19:29 ` [PATCH v4 10/19] mm, swap: consolidate cluster reclaim and usability check Kairui Song
2025-12-15  4:12   ` Baoquan He
2025-12-15  4:38     ` Kairui Song
2025-12-17 11:15       ` Baoquan He
2025-12-17 18:30         ` Kairui Song
2025-12-18  3:33           ` Baoquan He
2025-12-04 19:29 ` [PATCH v4 11/19] mm, swap: split locked entry duplicating into a standalone helper Kairui Song
2025-12-17 11:22   ` Baoquan He
2025-12-17 18:37     ` Kairui Song
2025-12-19 17:26       ` Kairui Song
2025-12-20  4:00         ` Baoquan He
2025-12-04 19:29 ` [PATCH v4 12/19] mm, swap: use swap cache as the swap in synchronize layer Kairui Song
2025-12-18  3:31   ` Baoquan He
2025-12-18  3:40     ` Kairui Song
2025-12-04 19:29 ` [PATCH v4 13/19] mm, swap: remove workaround for unsynchronized swap map cache state Kairui Song
2025-12-18  3:37   ` Baoquan He [this message]
2025-12-04 19:29 ` [PATCH v4 14/19] mm, swap: cleanup swap entry management workflow Kairui Song
2025-12-04 19:29 ` [PATCH v4 15/19] mm, swap: add folio to swap cache directly on allocation Kairui Song
2025-12-04 19:29 ` [PATCH v4 16/19] mm, swap: check swap table directly for checking cache Kairui Song
2025-12-04 19:29 ` [PATCH v4 17/19] mm, swap: clean up and improve swap entries freeing Kairui Song
2025-12-04 19:29 ` [PATCH v4 18/19] mm, swap: drop the SWAP_HAS_CACHE flag Kairui Song
2025-12-04 19:29 ` [PATCH v4 19/19] mm, swap: remove no longer needed _swap_info_get Kairui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aUN29S2CDv0KbfXj@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=chrisl@kernel.org \
    --cc=david@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kasong@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=nphamcs@gmail.com \
    --cc=ryncsn@gmail.com \
    --cc=shikemeng@huaweicloud.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yosry.ahmed@linux.dev \
    --cc=youngjun.park@lge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.