Linux-mm Archive on lore.kernel.org
From: Yosry Ahmed <yosry@kernel.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: akpm@linux-foundation.org, chrisl@kernel.org, kasong@tencent.com,
	 hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	 shakeel.butt@linux.dev, david@kernel.org, muchun.song@linux.dev,
	 shikemeng@huaweicloud.com, baoquan.he@linux.dev,
	baohua@kernel.org, youngjun.park@lge.com,
	 chengming.zhou@linux.dev, ljs@kernel.org, liam@infradead.org,
	vbabka@kernel.org,  rppt@kernel.org, surenb@google.com,
	qi.zheng@linux.dev, axelrasmussen@google.com,
	 yuanchu@google.com, weixugc@google.com, riel@surriel.com,
	gourry@gourry.net,  haowenchao22@gmail.com, kernel-team@meta.com,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [RFC PATCH v2 3/7] mm, swap: support physical swap as a vswap backend
Date: Tue, 23 Jun 2026 00:23:46 +0000
Message-ID: <ajnRulrxAKnZavOl@google.com>
In-Reply-To: <20260612193738.2183968-4-nphamcs@gmail.com>

On Fri, Jun 12, 2026 at 12:37:34PM -0700, Nhat Pham wrote:
> Add physical swap as a backend for the virtual swap layer.
> 
> With physical swap backing, vswap can allocate a physical slot on
> demand when needed: as a fallback for zswap_store failures, or as
> the destination for zswap writeback.
> 
> Each vswap entry's physical slot is tracked via a Pointer-tagged
> swap_table entry on the physical cluster (rmap back to the vswap
> entry).
> 
> Suggested-by: Kairui Song <kasong@tencent.com>
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>
> ---
[..]
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 466f8a182716..5daff7a25f67 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -993,6 +993,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  	struct folio *folio;
>  	struct mempolicy *mpol;
>  	struct swap_info_struct *si;
> +	swp_entry_t phys = {};
>  	int ret = 0;
>  
>  	/* try to allocate swap cache folio */
> @@ -1000,16 +1001,6 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  	if (!si)
>  		return -EEXIST;
>  
> -	/*
> -	 * Vswap entries have no physical backing - writeback would fail
> -	 * and SIGBUS the caller. Bail before we waste a swap-cache folio
> -	 * allocation.
> -	 */
> -	if (si->flags & SWP_VSWAP) {
> -		put_swap_device(si);
> -		return -EINVAL;
> -	}
> -
>  	mpol = get_task_policy(current);
>  	folio = swap_cache_alloc_folio(swpentry, GFP_KERNEL, BIT(0), NULL, mpol,
>  				       NO_INTERLEAVE_INDEX);
> @@ -1028,40 +1019,78 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  	/*
>  	 * folio is locked, and the swapcache is now secured against
>  	 * concurrent swapping to and from the slot, and concurrent
> -	 * swapoff so we can safely dereference the zswap tree here.
> -	 * Verify that the swap entry hasn't been invalidated and recycled
> -	 * behind our backs, to avoid overwriting a new swap folio with
> -	 * old compressed data. Only when this is successful can the entry
> -	 * be dereferenced.
> +	 * swapoff so we can safely dereference the zswap tree (or vswap
> +	 * vtable) here. Verify that the swap entry hasn't been
> +	 * invalidated and recycled behind our backs, to avoid overwriting
> +	 * a new swap folio with old compressed data. Only when this is
> +	 * successful can the entry be dereferenced.
>  	 */
> -	tree = swap_zswap_tree(swpentry);
> -	if (entry != xa_load(tree, offset)) {
> -		ret = -ENOMEM;
> -		goto out;
> +	if (swap_is_vswap(si)) {
> +		if (entry != vswap_zswap_load(swpentry)) {
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +		/*
> +		 * Allocate physical backing BEFORE decompress - if it fails,
> +		 * no wasted work. folio_realloc_swap sets vtable to PHYS,
> +		 * overwriting ZSWAP - the old entry pointer is only held
> +		 * by the caller now.
> +		 */
> +		phys = folio_realloc_swap(folio);
> +		if (!phys.val) {
> +			ret = -ENOMEM;
> +			goto out;
> +		}

I haven't looked through the rest of the series, but are there use cases
for calling folio_realloc_swap() without calling vswap_zswap_load()
first? If not, I wonder whether the realloc_swap API should take the
swpentry directly and do the load internally, something like
vswap_alloc_phys(swpentry, folio).
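
For illustration only, a combined helper could look roughly like the
user-space model below. None of this is taken from the series:
vswap_alloc_phys() is just the name floated above, the extra "expected"
argument is an assumption (the helper needs something to compare the
loaded entry against), and vtable[], phys_alloc() and the struct layouts
are simplified stand-ins for the real kernel structures. Locking and
errno-style return codes are left out.

```c
/*
 * User-space model, NOT kernel code. Every type and helper here is a
 * hypothetical stand-in used only to show the shape of the API.
 */
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* size of the toy vswap table */
#define NR_PHYS  2	/* number of toy physical slots */

typedef struct { unsigned long val; } swp_entry_t;
struct zswap_entry { int dummy; };
struct folio { swp_entry_t swap; };	/* vswap entry the folio caches */

enum vswap_backend { VSWAP_NONE, VSWAP_ZSWAP, VSWAP_PHYS };

struct vswap_slot {
	enum vswap_backend type;
	struct zswap_entry *zentry;	/* valid when type == VSWAP_ZSWAP */
	swp_entry_t phys;		/* valid when type == VSWAP_PHYS */
};

static struct vswap_slot vtable[NR_SLOTS];
static unsigned long next_phys = 1;	/* val 0 means "no slot" */

/* Return the zswap entry backing @entry, or NULL if it is not in zswap. */
static struct zswap_entry *vswap_zswap_load(swp_entry_t entry)
{
	struct vswap_slot *slot = &vtable[entry.val];

	return slot->type == VSWAP_ZSWAP ? slot->zentry : NULL;
}

/* Toy physical allocator: hands out slots 1..NR_PHYS, then fails. */
static swp_entry_t phys_alloc(void)
{
	swp_entry_t phys = { 0 };

	if (next_phys <= NR_PHYS)
		phys.val = next_phys++;
	return phys;
}

/*
 * Revalidate and allocate in one call. Returns the new physical slot,
 * or an entry with val == 0 if the vswap slot no longer holds
 * @expected or no physical slot is available. The table is only
 * modified on success.
 */
static swp_entry_t vswap_alloc_phys(swp_entry_t entry, struct folio *folio,
				    struct zswap_entry *expected)
{
	swp_entry_t phys = { 0 };

	assert(entry.val < NR_SLOTS);
	assert(folio != NULL && folio->swap.val == entry.val);

	/* The slot must still point at the entry being written back. */
	if (!expected || vswap_zswap_load(entry) != expected)
		return phys;

	phys = phys_alloc();
	if (!phys.val)
		return phys;

	/* Switch the backend from ZSWAP to PHYS. */
	vtable[entry.val].type = VSWAP_PHYS;
	vtable[entry.val].zentry = NULL;
	vtable[entry.val].phys = phys;
	return phys;
}
```

With that shape, the vswap branch in zswap_writeback_entry() would
reduce to a single call and one failure check, and callers could not
allocate physical backing without revalidating the entry first.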


Thread overview: 14+ messages
2026-06-12 19:37 [RFC PATCH v2 0/7] mm, swap: Virtual Swap Space (Swap Table Edition) Nhat Pham
2026-06-12 19:37 ` [RFC PATCH v2 1/7] mm, swap: add virtual swap device infrastructure Nhat Pham
2026-06-12 19:37 ` [RFC PATCH v2 2/7] mm, swap: support zswap and zeroswap as vswap backends Nhat Pham
2026-06-23  0:15   ` Yosry Ahmed
2026-06-23  0:18   ` Yosry Ahmed
2026-06-12 19:37 ` [RFC PATCH v2 3/7] mm, swap: support physical swap as a vswap backend Nhat Pham
2026-06-23  0:23   ` Yosry Ahmed [this message]
2026-06-12 19:37 ` [RFC PATCH v2 4/7] mm, swap: only charge physical swap entries Nhat Pham
2026-06-12 19:37 ` [RFC PATCH v2 5/7] mm, swap: add debugfs counters for vswap Nhat Pham
2026-06-12 19:37 ` [RFC PATCH v2 6/7] mm, swap: defer memcg_table allocation on physical clusters Nhat Pham
2026-06-12 19:37 ` [RFC PATCH v2 7/7] mm, swap: widen swap_info_struct max/pages to unsigned long Nhat Pham
2026-06-14  8:20 ` [RFC PATCH v2 0/7] mm, swap: Virtual Swap Space (Swap Table Edition) YoungJun Park
2026-06-15  2:38   ` Nhat Pham
     [not found]     ` <CAO9r8zPj5EH8Mbpc6N+d1u2eEgoV33f+4s=v-84gaobAodPtUw@mail.gmail.com>
     [not found]       ` <ajCm44rYpLOKCQ43@yjaykim-PowerEdge-T330>
2026-06-16 12:15         ` Nhat Pham
