From: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
To: Sebastian Andrzej Siewior
	<bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	"Andrew Morton"
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	"Michal Hocko" <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>,
	"Peter Zijlstra" <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	"Thomas Gleixner" <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	"Vladimir Davydov"
	<vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	"Waiman Long" <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH v2 3/4] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed.
Date: Mon, 14 Feb 2022 11:46:00 -0500
Message-ID: <YgqHSIa/WvJSXERe@cmpxchg.org>
In-Reply-To: <20220211223537.2175879-4-bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>

On Fri, Feb 11, 2022 at 11:35:36PM +0100, Sebastian Andrzej Siewior wrote:
> The per-CPU counter are modified with the non-atomic modifier. The
> consistency is ensured by disabling interrupts for the update.
> On non PREEMPT_RT configuration this works because acquiring a
> spinlock_t typed lock with the _irq() suffix disables interrupts. On
> PREEMPT_RT configurations the RMW operation can be interrupted.
> 
> Another problem is that mem_cgroup_swapout() expects to be invoked with
> disabled interrupts because the caller has to acquire a spinlock_t which
> is acquired with disabled interrupts. Since spinlock_t never disables
> interrupts on PREEMPT_RT the interrupts are never disabled at this
> point.
> 
> The code is never called from in_irq() context on PREEMPT_RT therefore
> disabling preemption during the update is sufficient on PREEMPT_RT.
> The sections which explicitly disable interrupts can remain on
> PREEMPT_RT because the sections remain short and they don't involve
> sleeping locks (memcg_check_events() is doing nothing on PREEMPT_RT).
> 
> Disable preemption during update of the per-CPU variables which do not
> explicitly disable interrupts.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> ---
>  mm/memcontrol.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c1caa662946dc..466466f285cea 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -705,6 +705,8 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
>  	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
>  	memcg = pn->memcg;
>  
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +		preempt_disable();
>  	/* Update memcg */
>  	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
>  
> @@ -712,6 +714,8 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
>  	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
>  
>  	memcg_rstat_updated(memcg, val);
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +		preempt_enable();
>  }

I notice you didn't annotate __mod_memcg_state(). I suppose that is
because it's called with explicit local_irq_disable(), and that
disables preemption on rt? And you only need another preempt_disable()
for stacks that rely on coming from spin_lock_irq(save)?

That makes sense, but it's difficult to maintain. It'll easily break
if somebody adds more memory accounting sites that may also rely on an
irq-disabled spinlock somewhere.

So better to make this an unconditional locking protocol:

static void memcg_stats_lock(void)
{
#ifdef CONFIG_PREEMPT_RT
	preempt_disable();
#else
	VM_BUG_ON(!irqs_disabled());
#endif
}

static void memcg_stats_unlock(void)
{
#ifdef CONFIG_PREEMPT_RT
	preempt_enable();
#endif
}

and always use these around the counter updates.
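
To illustrate the idea, here is a rough userspace model of how a counter
update site might look once wrapped in these helpers. It is only a sketch
and not kernel code: model_preempt_rt, the *_model variables, the stubbed
preempt_disable()/preempt_enable()/irqs_disabled(), and
mod_memcg_lruvec_state_model() are all hypothetical stand-ins, and assert()
takes the place of VM_BUG_ON().

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_PREEMPT_RT; set to 0 to model !RT. */
static const int model_preempt_rt = 1;

static int preempt_depth_model;  /* models the preempt-disable nesting depth */
static int irqs_off_model;       /* models whether interrupts are disabled */

/* Userspace stubs for the kernel primitives used by the helpers. */
static void preempt_disable(void) { preempt_depth_model++; }
static void preempt_enable(void)  { preempt_depth_model--; }
static int irqs_disabled(void)    { return irqs_off_model; }

static void memcg_stats_lock(void)
{
	if (model_preempt_rt)
		preempt_disable();
	else
		assert(irqs_disabled());  /* models VM_BUG_ON(!irqs_disabled()) */
}

static void memcg_stats_unlock(void)
{
	if (model_preempt_rt)
		preempt_enable();
}

/* Plain variables standing in for the per-CPU memcg and lruvec counters. */
static long memcg_state_model;
static long lruvec_state_model;

/* Every counter update site takes the same lock/unlock pair. */
static void mod_memcg_lruvec_state_model(int val)
{
	memcg_stats_lock();
	memcg_state_model += val;   /* models __this_cpu_add() on the memcg */
	lruvec_state_model += val;  /* models __this_cpu_add() on the lruvec */
	memcg_stats_unlock();
}
```

The point of the pairing is that a new accounting site cannot silently
depend on the caller's spin_lock_irq(): on !RT the assertion catches a
caller that left interrupts enabled, and on RT the update is always
covered by the preempt-disabled section.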
