All of lore.kernel.org
 help / color / mirror / Atom feed
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Barry Song (Xiaomi)" <baohua@kernel.org>
Cc: akpm@linux-foundation.org, axelrasmussen@google.com,
	baolin.wang@linux.alibaba.com, dev.jain@arm.com,
	kasong@tencent.com, lance.yang@linux.dev, liam@infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
	npache@redhat.com, qi.zheng@linux.dev, ryan.roberts@arm.com,
	shakeel.butt@linux.dev, weixugc@google.com, yuanchu@google.com,
	zhaonanzhe@xiaomi.com, ziy@nvidia.com
Subject: Re: [RFC PATCH] mm: Avoiding split large folios if swap has no space
Date: Mon, 22 Jun 2026 10:58:20 +0200	[thread overview]
Message-ID: <4aa8350e-712f-4380-b3bf-2ff06cf2a35d@kernel.org> (raw)
In-Reply-To: <20260620081017.89085-1-baohua@kernel.org>

On 6/20/26 10:10, Barry Song (Xiaomi) wrote:
> On Fri, Jun 19, 2026 at 10:04 PM David Hildenbrand (Arm) <david@kernel.org> wrote:
> [...]
>>>       /*
>>>        * The page can not be swapped.
>>>        *
>>> @@ -1280,6 +1289,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>
>>>                               if (!folio_test_large(folio))
>>>                                       goto activate_locked_split;
>>> +                             if (!__can_reclaim_anon_pages(memcg, sc))
>>> +                                     goto activate_locked_split;
>>
>> Why are we even trying to allocate swap space if we cannot reclaim such pages?
>> Makes we wonder whether we would want to have that check earlier, before the
>> folio_alloc_swap().
>>
>> Any downsides?
> 
> I don't think there are any obvious downsides there. One issue is that
> the memcg may not be passed from reclaim_pages(), so memcg would
> always be NULL. However, the folio could still belong to a memcg
> whose swap quota has been exhausted. In that case, my
> __can_reclaim_anon_pages() will fail when checking whether we can
> swap out. But switching to folio_memcg() also seems awkward.
> 
> So I feel Kairui’s suggestion [1] might be the best approach. In
> folio_alloc_swap(), we return -EAGAIN to tell vmscan.c that
> we can split the folio and retry the swap-out.
> only when there are sufficient swap slots and sufficient memcg swap
> quota do we return -EAGAIN, allowing vmscan to perform a split.
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 78b49b0658ad..62e2c506ccae 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1755,6 +1755,9 @@ int folio_alloc_swap(struct folio *folio)
>  			VM_WARN_ON_ONCE(1);
>  			return -EINVAL;
>  		}
> +
> +		if (get_nr_swap_pages() < (1 << order))
> +			return -ENOMEM;

I guess we could be clearer with the return value:

-> !get_nr_swap_pages() -> -ENOSPC / -ENOMEM

(no space at all)

-> get_nr_swap_pages() < (1 << order) -> -E2BIG

(there is some space, but not for the full thing)

But now I wonder whether we would also want to check "is there any free swap
space", not just "is there any swap".


Essentially, try returning -E2BIG if there is the chance to swap out after
split, and  -ENOSPC / -ENOMEM if a split wouldn't help.

>  	}
>  
>  again:
> @@ -1769,11 +1772,13 @@ int folio_alloc_swap(struct folio *folio)
>  	}
>  
>  	/* Need to call this even if allocation failed, for MEMCG_SWAP_FAIL. */
> -	if (unlikely(mem_cgroup_try_charge_swap(folio)))
> +	if (unlikely(mem_cgroup_try_charge_swap(folio))) {
>  		swap_cache_del_folio(folio);
> +		return -ENOMEM;

Here we wouldn't have the information whether we could charge after a split.

So that would require a rework to signal this more cleanly to the caller.

> +	}
>  
>  	if (unlikely(!folio_test_swapcache(folio)))
> -		return -ENOMEM;
> +		return -EAGAIN;
>  
>  	return 0;
>  }
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 299b5d9e8836..63e8578454ea 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1257,6 +1257,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  		 */
>  		if (folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>  				!folio_test_swapcache(folio)) {
> +			int ret;
> +
>  			if (!(sc->gfp_mask & __GFP_IO))
>  				goto keep_locked;
>  			if (folio_maybe_dma_pinned(folio))
> @@ -1275,10 +1277,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  				    split_folio_to_list(folio, folio_list))
>  					goto activate_locked;
>  			}
> -			if (folio_alloc_swap(folio)) {
> +			if ((ret = folio_alloc_swap(folio))) {

I prefer doing the assignment outside the conditional.

>  				int __maybe_unused order = folio_order(folio);
>  
> -				if (!folio_test_large(folio))
> +				if (!folio_test_large(folio) || ret != -EAGAIN)
>  					goto activate_locked_split;
>  				/* Fallback to swap normal pages */
>  				if (split_folio_to_list(folio, folio_list))
> 
> What’s your view on this, David?

I guess returning from folio_alloc_swap() whether a split could allow for
swapout (e.g., -E2BIG) would be reasonable.

To catch all the cases where it makes a difference:
* No free swap space (split won't work)
* Some free swap space (split would work)
* Sufficient free swap space, but fragmented (split would work)
* No memcg space (split won't work)
* Some memcg space (split would work)

-- 
Cheers,

David


  parent reply	other threads:[~2026-06-22  8:58 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18 22:17 [RFC PATCH] mm: Avoiding split large folios if swap has no space Barry Song (Xiaomi)
2026-06-18 23:46 ` Nico Pache
2026-06-19  0:59   ` Barry Song
2026-06-19 14:01 ` David Hildenbrand (Arm)
2026-06-19 23:01   ` Barry Song
2026-06-19 14:04 ` David Hildenbrand (Arm)
2026-06-20  8:10   ` Barry Song (Xiaomi)
2026-06-22  3:04     ` Baolin Wang
2026-06-22  3:36       ` Barry Song
2026-06-22  4:06         ` Baolin Wang
2026-06-22  8:58     ` David Hildenbrand (Arm) [this message]
2026-06-19 19:17 ` Kairui Song
2026-06-19 22:42   ` Barry Song (Xiaomi)
2026-06-20  7:19     ` Kairui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4aa8350e-712f-4380-b3bf-2ff06cf2a35d@kernel.org \
    --to=david@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=dev.jain@arm.com \
    --cc=kasong@tencent.com \
    --cc=lance.yang@linux.dev \
    --cc=liam@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=npache@redhat.com \
    --cc=qi.zheng@linux.dev \
    --cc=ryan.roberts@arm.com \
    --cc=shakeel.butt@linux.dev \
    --cc=weixugc@google.com \
    --cc=yuanchu@google.com \
    --cc=zhaonanzhe@xiaomi.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.