All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Nhat Pham <nphamcs@gmail.com>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, stable@vger.kernel.org
Subject: Re: [PATCH v2 mm-hotfixes] mm/zswap: fix inconsistent charging when zswap_store_page() fails
Date: Tue, 28 Jan 2025 19:03:57 +0000	[thread overview]
Message-ID: <Z5kqHT1x-_0qtduA@google.com> (raw)
In-Reply-To: <20250128185507.2176-1-42.hyeyoo@gmail.com>

On Wed, Jan 29, 2025 at 03:55:07AM +0900, Hyeonggon Yoo wrote:
> Commit b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()")
> skips charging any zswapped base pages when it failed to zswap the entire
> folio.
> 
> However, when some base pages are zswapped but it failed to zswap
> the entire folio, the zswap operation is rolled back.
> When freeing zswap entries for those pages, zswap_entry_free() uncharges
> the pages that were not previously charged, causing zswap charging to
> become inconsistent.
> 
> This inconsistency triggers two warnings with following steps:
>   # On a machine with 64GiB of RAM and 36GiB of zswap
>   $ stress-ng --bigheap 2 # wait until the OOM-killer kills stress-ng
>   $ sudo reboot
> 
>   Two warnings are:
>     in mm/memcontrol.c:163, function obj_cgroup_release():
>       WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1));
> 
>     in mm/page_counter.c:60, function page_counter_cancel():
>       if (WARN_ONCE(new < 0, "page_counter underflow: %ld nr_pages=%lu\n",
> 	  new, nr_pages))
> 
> While objcg events should only be accounted for when the entire folio is
> zswapped, objcg charging should be performed regardlessly.
> Fix accordingly.
> 
> After resolving the inconsistency, these warnings disappear.
> 
> Fixes: b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
> 
> v1->v2:
> 
>  Fixed objcg events being accounted for on zswap failure.
> 
>  Fixed the incorrect description. I misunderstood that the base pages are
>  going to be stored in zswap, but their zswap entries are freed immediately.
> 
>  Added a comment on why it charges pages that are going to be removed
>  from zswap.
> 
>  mm/zswap.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 6504174fbc6a..10b30ac46deb 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1568,20 +1568,26 @@ bool zswap_store(struct folio *folio)
>  
>  		bytes = zswap_store_page(page, objcg, pool);
>  		if (bytes < 0)
> -			goto put_pool;
> +			goto charge_zswap;
>  		compressed_bytes += bytes;
>  	}
>  
> -	if (objcg) {
> -		obj_cgroup_charge_zswap(objcg, compressed_bytes);
> +	if (objcg)
>  		count_objcg_events(objcg, ZSWPOUT, nr_pages);
> -	}
>  
>  	atomic_long_add(nr_pages, &zswap_stored_pages);
>  	count_vm_events(ZSWPOUT, nr_pages);
>  
>  	ret = true;
>  
> +charge_zswap:
> +	/*
> +	 * Charge zswapped pages even when it failed to zswap the entire folio,
> +	 * because zswap_entry_free() will uncharge them anyway.
> +	 * Otherwise zswap charging will become inconsistent.
> +	 */
> +	if (objcg)
> +		obj_cgroup_charge_zswap(objcg, compressed_bytes);

Thanks for fixing this!

Having to charge just to uncharge right after is annoying. Ideally we'd
just clear entry->objcg if we fail before charging, but we don't have a
direct reference to the entries here and another tree lookup is not
ideal either.

I guess we may be able to improve this handling once [1] lands, as we
can move the charging logic into zswap_store_folio() where we'd have
access to the entries.

For now, would the control flow be easier if we move the charge ahead of
the zswap_store_page() loop instead? There is an existing if (objcg)
block there as well.

[1]https://lore.kernel.org/linux-mm/20241221063119.29140-12-kanchana.p.sridhar@intel.com/

>  put_pool:
>  	zswap_pool_put(pool);
>  put_objcg:
> -- 
> 2.47.1
> 
> 

  reply	other threads:[~2025-01-28 19:04 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-28 18:55 [PATCH v2 mm-hotfixes] mm/zswap: fix inconsistent charging when zswap_store_page() fails Hyeonggon Yoo
2025-01-28 19:03 ` Yosry Ahmed [this message]
2025-01-28 19:19   ` Sridhar, Kanchana P
2025-01-28 19:09 ` Sridhar, Kanchana P
2025-01-28 19:19   ` Yosry Ahmed
2025-01-29  5:48     ` Hyeonggon Yoo
2025-01-29  6:40   ` Hyeonggon Yoo
2025-01-29  7:56     ` Sridhar, Kanchana P

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z5kqHT1x-_0qtduA@google.com \
    --to=yosry.ahmed@linux.dev \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=chengming.zhou@linux.dev \
    --cc=hannes@cmpxchg.org \
    --cc=kanchana.p.sridhar@intel.com \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.