From: Michal Hocko <mhocko@suse.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vladimir Davydov" <vdavydov.dev@gmail.com>,
	"Waiman Long" <longman@redhat.com>
Subject: Re: [PATCH 3/4] mm/memcg: Add a local_lock_t for IRQ and TASK object.
Date: Wed, 26 Jan 2022 16:20:36 +0100
Message-ID: <YfFmxH1IXeegNOa9@dhcp22.suse.cz>
In-Reply-To: <20220125164337.2071854-4-bigeasy@linutronix.de>

On Tue 25-01-22 17:43:36, Sebastian Andrzej Siewior wrote:
> The members of the per-CPU structure memcg_stock_pcp are protected
> either by disabling interrupts or by disabling preemption if the
> invocation occurred in process context.
> Disabling interrupts protects most of the structure excluding task_obj
> while disabling preemption protects only task_obj.
> This scheme is incompatible with PREEMPT_RT because it creates atomic
> context in which actions are performed which require preemptible
> context. One example is obj_cgroup_release().
> 
> The IRQ-disable and preempt-disable sections can be replaced with
> local_lock_t, which preserves the explicit disabling of interrupts while
> keeping the code preemptible on PREEMPT_RT.
> 
> The task_obj has been added for performance reasons on non-preemptible
> kernels where preempt_disable() is a NOP. On the PREEMPT_RT preemption
> model preempt_disable() is always implemented. Also there are no memory
> allocations in_irq() context and softirqs are processed in (preemptible)
> process context. Therefore it makes sense to avoid using task_obj.
> 
> Don't use task_obj on PREEMPT_RT and replace manual disabling of
> interrupts with a local_lock_t. This change requires some factoring:
> 
> - drain_obj_stock() drops a reference on obj_cgroup which leads to an
>   invocation of obj_cgroup_release() if it is the last object. This in
>   turn leads to recursive locking of the local_lock_t. To avoid this,
>   obj_cgroup_release() is invoked outside of the locked section.
> 
> - drain_obj_stock() gets a memcg_stock_pcp passed if the stock_lock has been
>   acquired (instead of the task_obj_lock) to avoid recursive locking later
>   in refill_stock().
> 
> - drain_all_stock() disables preemption via get_cpu() and then invokes
>   drain_local_stock() if it is the local CPU to avoid scheduling a worker
>   (which invokes the same function). Disabling preemption here is
>   problematic due to the sleeping locks in drain_local_stock().
>   This can be avoided by always scheduling a worker, even for the local
>   CPU. Using cpus_read_lock() stabilizes cpu_online_mask which ensures
>   that no worker is scheduled for an offline CPU. Since there is no
>   flush_work(), it is still possible that a worker is invoked on the wrong
>   CPU but it is okay since it operates always on the local-CPU data.
> 
> - drain_local_stock() is always invoked as a worker so it can be optimized
>   by removing in_task() (it is always true) and avoiding the "irq_save"
>   variant because interrupts are always enabled here. Operating on
>   task_obj first allows acquiring the local_lock_t without lockdep
>   complaints.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
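
The first bullet in the description above (drop the last reference only
after leaving the locked section) can be illustrated with a minimal
userspace sketch. This is not the kernel code and not taken from the
patch: struct obj, struct stock, the toy flag-based lock and every
function name below are hypothetical stand-ins, and the sketch assumes
that the release path needs the same lock, as the description states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct stock;

/* Hypothetical stand-in for an obj_cgroup: a refcounted object. */
struct obj {
	int refcount;
	struct stock *owner;
};

/* Hypothetical stand-in for memcg_stock_pcp with a toy non-recursive lock. */
struct stock {
	bool locked;        /* models the local_lock_t being held */
	struct obj *cached; /* object whose reference the stock holds */
	int released;       /* number of completed releases */
	int recursions;     /* number of attempted recursive acquisitions */
};

/* Toy lock: reports re-entry instead of deadlocking. */
static bool stock_lock(struct stock *s)
{
	if (s->locked) {
		s->recursions++;
		return false;
	}
	s->locked = true;
	return true;
}

static void stock_unlock(struct stock *s)
{
	s->locked = false;
}

/* Release path that needs the stock lock (assumption, see lead-in). */
static void obj_release(struct obj *o)
{
	struct stock *s = o->owner;

	if (!stock_lock(s))
		return; /* a real lock would deadlock or trip lockdep here */
	s->released++;
	stock_unlock(s);
	free(o);
}

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		obj_release(o);
}

/* Problematic ordering: the last reference is dropped under the lock. */
static void drain_inside_lock(struct stock *s)
{
	struct obj *old;

	stock_lock(s);
	old = s->cached;
	s->cached = NULL;
	if (old)
		obj_put(old);
	stock_unlock(s);
}

/* Deferred ordering: detach under the lock, drop the reference afterwards. */
static void drain_outside_lock(struct stock *s)
{
	struct obj *old;

	stock_lock(s);
	old = s->cached;
	s->cached = NULL;
	stock_unlock(s);

	if (old)
		obj_put(old);
}
```

In this toy model, drain_inside_lock() records an attempted recursive
acquisition and never completes the release, while drain_outside_lock()
completes it with the lock already dropped.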

I do not see any obvious problem with this patch. The code is ugly as
hell, though a large part of that is because of the weird locking
scheme we already have. I've had a look at 559271146efc ("mm/memcg:
optimize user context object stock access") and while I agree that it
makes sense to optimize for user context, I do not really see any numbers
justifying the awkward locking scheme. Is this complexity really worth
it?
-- 
Michal Hocko
SUSE Labs
