All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
To: Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org>,
	Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] memcg: charge before adding to swapcache on swapin
Date: Fri, 19 Feb 2021 19:34:29 -0500	[thread overview]
Message-ID: <YDBZFY8WnLewRqLg@cmpxchg.org> (raw)
In-Reply-To: <20210219224405.1544597-1-shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On Fri, Feb 19, 2021 at 02:44:05PM -0800, Shakeel Butt wrote:
> Currently the kernel adds the page, allocated for swapin, to the
> swapcache before charging the page. This is fine but now we want a
> per-memcg swapcache stat which is essential for folks who wants to
> transparently migrate from cgroup v1's memsw to cgroup v2's memory and
> swap counters.
> 
> To correctly maintain the per-memcg swapcache stat, one option which
> this patch has adopted is to charge the page before adding it to
> swapcache. One challenge in this option is the failure case of
> add_to_swap_cache() on which we need to undo the mem_cgroup_charge().
> Specifically undoing mem_cgroup_uncharge_swap() is not simple.
> 
> This patch circumvent this specific issue by removing the failure path
> of  add_to_swap_cache() by providing __GFP_NOFAIL. Please note that in
> this specific situation ENOMEM was the only possible failure of
> add_to_swap_cache() which is removed by using __GFP_NOFAIL.
> 
> Another option was to use __mod_memcg_lruvec_state(NR_SWAPCACHE) in
> mem_cgroup_charge() but then we need to take of the do_swap_page() case
> where synchronous swap devices bypass the swapcache. The do_swap_page()
> already does hackery to set and reset PageSwapCache bit to make
> mem_cgroup_charge() execute the swap accounting code and then we would
> need to add additional parameter to tell to not touch NR_SWAPCACHE stat
> as that code patch bypass swapcache.
> 
> This patch added memcg charging API explicitly foe swapin pages and
> cleaned up do_swap_page() to not set and reset PageSwapCache bit.
> 
> Signed-off-by: Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

The patch makes sense to me. While it extends the charge interface, I
actually quite like that it charges the page earlier - before putting
it into wider circulation. It's a step in the right direction.

But IMO the semantics of mem_cgroup_charge_swapin_page() are a bit too
fickle: the __GFP_NOFAIL in add_to_swap_cache() works around it, but
having a must-not-fail-after-this line makes the code tricky to work
on and error prone.

It would be nicer to do a proper transaction sequence.

> @@ -497,16 +497,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  	__SetPageLocked(page);
>  	__SetPageSwapBacked(page);
>  
> -	/* May fail (-ENOMEM) if XArray node allocation failed. */
> -	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow)) {
> -		put_swap_page(page, entry);
> +	if (mem_cgroup_charge_swapin_page(page, NULL, gfp_mask, entry))
>  		goto fail_unlock;
> -	}
>  
> -	if (mem_cgroup_charge(page, NULL, gfp_mask)) {
> -		delete_from_swap_cache(page);
> -		goto fail_unlock;
> -	}
> +	/*
> +	 * Use __GFP_NOFAIL to not worry about undoing the changes done by
> +	 * mem_cgroup_charge_swapin_page() on failure of add_to_swap_cache().
> +	 */
> +	add_to_swap_cache(page, entry,
> +			  (gfp_mask|__GFP_NOFAIL) & GFP_RECLAIM_MASK, &shadow);

How about:

	mem_cgroup_charge_swapin_page()
	add_to_swap_cache()
	mem_cgroup_finish_swapin_page()

where finish_swapin_page() only uncharges the swap entry (on cgroup1)
once the swap->memory transition is complete?

Otherwise the patch looks good to me.

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>, Roman Gushchin <guro@fb.com>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] memcg: charge before adding to swapcache on swapin
Date: Fri, 19 Feb 2021 19:34:29 -0500	[thread overview]
Message-ID: <YDBZFY8WnLewRqLg@cmpxchg.org> (raw)
In-Reply-To: <20210219224405.1544597-1-shakeelb@google.com>

On Fri, Feb 19, 2021 at 02:44:05PM -0800, Shakeel Butt wrote:
> Currently the kernel adds the page, allocated for swapin, to the
> swapcache before charging the page. This is fine but now we want a
> per-memcg swapcache stat which is essential for folks who wants to
> transparently migrate from cgroup v1's memsw to cgroup v2's memory and
> swap counters.
> 
> To correctly maintain the per-memcg swapcache stat, one option which
> this patch has adopted is to charge the page before adding it to
> swapcache. One challenge in this option is the failure case of
> add_to_swap_cache() on which we need to undo the mem_cgroup_charge().
> Specifically undoing mem_cgroup_uncharge_swap() is not simple.
> 
> This patch circumvent this specific issue by removing the failure path
> of  add_to_swap_cache() by providing __GFP_NOFAIL. Please note that in
> this specific situation ENOMEM was the only possible failure of
> add_to_swap_cache() which is removed by using __GFP_NOFAIL.
> 
> Another option was to use __mod_memcg_lruvec_state(NR_SWAPCACHE) in
> mem_cgroup_charge() but then we need to take of the do_swap_page() case
> where synchronous swap devices bypass the swapcache. The do_swap_page()
> already does hackery to set and reset PageSwapCache bit to make
> mem_cgroup_charge() execute the swap accounting code and then we would
> need to add additional parameter to tell to not touch NR_SWAPCACHE stat
> as that code patch bypass swapcache.
> 
> This patch added memcg charging API explicitly foe swapin pages and
> cleaned up do_swap_page() to not set and reset PageSwapCache bit.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

The patch makes sense to me. While it extends the charge interface, I
actually quite like that it charges the page earlier - before putting
it into wider circulation. It's a step in the right direction.

But IMO the semantics of mem_cgroup_charge_swapin_page() are a bit too
fickle: the __GFP_NOFAIL in add_to_swap_cache() works around it, but
having a must-not-fail-after-this line makes the code tricky to work
on and error prone.

It would be nicer to do a proper transaction sequence.

> @@ -497,16 +497,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  	__SetPageLocked(page);
>  	__SetPageSwapBacked(page);
>  
> -	/* May fail (-ENOMEM) if XArray node allocation failed. */
> -	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow)) {
> -		put_swap_page(page, entry);
> +	if (mem_cgroup_charge_swapin_page(page, NULL, gfp_mask, entry))
>  		goto fail_unlock;
> -	}
>  
> -	if (mem_cgroup_charge(page, NULL, gfp_mask)) {
> -		delete_from_swap_cache(page);
> -		goto fail_unlock;
> -	}
> +	/*
> +	 * Use __GFP_NOFAIL to not worry about undoing the changes done by
> +	 * mem_cgroup_charge_swapin_page() on failure of add_to_swap_cache().
> +	 */
> +	add_to_swap_cache(page, entry,
> +			  (gfp_mask|__GFP_NOFAIL) & GFP_RECLAIM_MASK, &shadow);

How about:

	mem_cgroup_charge_swapin_page()
	add_to_swap_cache()
	mem_cgroup_finish_swapin_page()

where finish_swapin_page() only uncharges the swap entry (on cgroup1)
once the swap->memory transition is complete?

Otherwise the patch looks good to me.


  parent reply	other threads:[~2021-02-20  0:34 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-19 22:44 [PATCH] memcg: charge before adding to swapcache on swapin Shakeel Butt
     [not found] ` <20210219224405.1544597-1-shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2021-02-19 22:49   ` Shakeel Butt
2021-02-19 22:49     ` Shakeel Butt
2021-02-20  0:34   ` Johannes Weiner [this message]
2021-02-20  0:34     ` Johannes Weiner
     [not found]     ` <YDBZFY8WnLewRqLg-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2021-02-22 18:45       ` Shakeel Butt
2021-02-22 18:45         ` Shakeel Butt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDBZFY8WnLewRqLg@cmpxchg.org \
    --to=hannes-druugvl0lcnafugrpc6u6w@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=guro-b10kYP2dOMg@public.gmane.org \
    --cc=hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.