From: "Leonardo Brás" <leobras@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
Date: Fri, 27 Jan 2023 16:29:37 -0300	[thread overview]
Message-ID: <029147be35b5173d5eb10c182e124ac9d2f1f0ba.camel@redhat.com> (raw)
In-Reply-To: <Y9OZezjUPITtEvTx@dhcp22.suse.cz>

On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > [...]
> > > > > I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> > > > > the accuracy of memory limits and slightly increase the memory footprint (all
> > > > > those dying memcgs...), but the impact will be limited. Actually it is limited
> > > > > by the number of cpus.
> > > > 
> > > > I was discussing this same idea with Marcelo yesterday morning.
> > > > 
> > > > The questions we had on the topic were:
> > > > a - About how many pages the pcp cache will hold before draining them itself? 
> > > 
> > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification: the cache
> > > doesn't really hold any pages. It is a mere counter of how many charges
> > > have already been accounted to the memcg page counter. So it is not really
> > > consuming a proportional amount of resources. It just pins the
> > > corresponding memcg. Have a look at consume_stock and refill_stock.
> > 
> > I see. Thanks for pointing that out!
> > 
> > So in the worst-case scenario the memcg would have reserved 64 pages * (numcpus - 1)
> 
> s@numcpus@num_isolated_cpus@

I was thinking of the worst-case scenario as (ncpus - 1) cpus being isolated.
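
Spelling out the numbers: each cpu's stock can pre-account up to
MEMCG_CHARGE_BATCH = 64 pages, so with 64 KB pages that is 64 * 64 KB = 4 MB per
cpu, and 255 isolated cpus could pin up to 255 * 4 MB = 1020 MB, i.e. the
1 GB - 4 MB of the powerpc example below.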

> 
> > that are not getting used, and may cause an 'earlier' OOM if this amount is
> > needed but can't be freed.
> 
> s@OOM@memcg OOM@
 
> > Staying with the worst case, and supposing a big powerpc machine with 256 CPUs,
> > each holding 64 pages of 64k => 1GB of memory - 4MB (one cpu actually using
> > resources). It's starting to get too big, but still ok for a machine this size.
> 
> It is more about the memcg limit rather than the size of the machine.
> Again, let's focus on the actual usecase. What is the usual memcg setup with
> those isolcpus?

I understand it's about the limit, not the actually allocated memory. When I
mention the machine size, I mean what is expected to be acceptable to a user on
that machine.

> 
> > The thing is that it can present an odd behavior: 
> > You have a cgroup created before, now empty, you try to run a given
> > application, and it hits OOM.
> 
> The application would either consume those cached charges or flush them
> if it is running in a different memcg. Or what do you have in mind?

1 - Create a memcg with a VM inside, multiple vcpus pinned to isolated cpus.
2 - Run a multi-cpu task inside the VM; it allocates memory on every CPU and
    keeps charges in the pcp caches.
3 - Try to run a single-cpu task (pinned?) inside the VM, which uses almost all
    the available memory.
4 - memcg OOM.

Does it make sense?
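
To make the mechanism explicit, here is a very rough sketch of how I picture the
per-cpu stock (simplified, not the actual mm/memcontrol.c code; just the
consume_stock()/refill_stock() idea described above, with locking omitted):

struct memcg_stock_pcp {
	struct mem_cgroup *cached;	/* memcg whose charges are pre-accounted */
	unsigned int nr_pages;		/* charges taken from the counter in advance */
};

static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);

	/* Fast path: reuse charges already accounted to this memcg. */
	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
		stock->nr_pages -= nr_pages;
		return true;
	}
	return false;
}

static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);

	/* Leftover charges stay here, pinning 'memcg' until this cpu drains. */
	if (stock->cached != memcg) {
		drain_stock(stock);	/* give the old memcg its charges back */
		stock->cached = memcg;
	}
	stock->nr_pages += nr_pages;

	if (stock->nr_pages > MEMCG_CHARGE_BATCH)
		drain_stock(stock);
}

In step 2 above each vcpu's stock ends up holding up to MEMCG_CHARGE_BATCH
charges for the memcg, and since those vcpus sit on isolated cpus nothing comes
around to drain them, so step 3 can hit the limit earlier than expected.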


> 
> > You then restart the cgroup, run the same application without an issue.
> > 
> > Even though it looks like a good option, this can be perceived by the user
> > as instability.
> > 
> > > 
> > > > b - Would it also cache any kind of bigger page, or huge page, in the same way?
> > > 
> > > The above should answer this as well as those following up I hope. If
> > > not let me know.
> > 
> > IIUC we are talking about normal pages, is that it?
> 
> We are talking about memcg charges and those have page granularity.
> 

Thanks for the info!
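
As a side note, the way I read Roman's suggestion of opting out of stock draining
for isolated cpus is something roughly like the sketch below (just an
illustration, not a patch; cpu_is_isolated() is a placeholder for whatever
isolation/housekeeping test we would actually use):

static void drain_all_stock(struct mem_cgroup *root_memcg)
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
		struct mem_cgroup *memcg = stock->cached;

		/*
		 * Opt out for isolated cpus: never schedule drain work there.
		 * At most MEMCG_CHARGE_BATCH charges per such cpu stay cached
		 * (and keep the memcg pinned) until the cpu itself reuses or
		 * drains them.
		 */
		if (cpu_is_isolated(cpu))
			continue;

		if (memcg && stock->nr_pages &&
		    mem_cgroup_is_descendant(memcg, root_memcg))
			schedule_work_on(cpu, &stock->work);
	}
}

That would keep isolated cpus free of drain work, at the cost Roman mentioned:
up to MEMCG_CHARGE_BATCH pages worth of charges per isolated cpu stay accounted,
and the cached memcg stays pinned, until that cpu touches its own stock.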

Also, thanks for the feedback!
Leo



