From: Roman Gushchin <roman.gushchin@linux.dev>
To: Michal Hocko <mhocko@suse.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Shakeel Butt <shakeelb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, Tejun Heo <tj@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, bpf <bpf@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>, linux-mm <linux-mm@kvack.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.
Date: Tue, 12 Jul 2022 19:27:28 -0700
Message-ID: <Ys4tkFkBZ+jEyCk9@castle>
In-Reply-To: <Ys0lXfWKtwYlVrzK@dhcp22.suse.cz>

On Tue, Jul 12, 2022 at 09:40:13AM +0200, Michal Hocko wrote:
> On Mon 11-07-22 21:39:14, Alexei Starovoitov wrote:
> > On Mon, Jul 11, 2022 at 02:15:07PM +0200, Michal Hocko wrote:
> > > On Sun 10-07-22 07:32:13, Shakeel Butt wrote:
> > > > On Sat, Jul 09, 2022 at 10:26:23PM -0700, Alexei Starovoitov wrote:
> > > > > On Fri, Jul 8, 2022 at 2:55 PM Shakeel Butt <shakeelb@google.com> wrote:
> > > > [...]
> > > > > >
> > > > > > Most probably Michal's comment was on free objects sitting in the caches
> > > > > > (also pointed out by Yosry). Should we drain them on memory pressure /
> > > > > > OOM or should we ignore them as the amount of memory is not significant?
> > > > > 
> > > > > Are you suggesting to design a shrinker for 0.01% of the memory
> > > > > consumed by bpf?
> > > > 
> > > > No, just claim that the memory sitting on such caches is insignificant.
> > > 
> > > yes, that is not really clear from the patch description. Earlier you
> > > have said that the memory consumed might go into GBs. If that is a
> > > memory that is actively used and not really reclaimable then bad luck.
> > > There are other users like that in the kernel and this is not a new
> > > problem. I think it would really help to add a counter to describe both
> > > the overall memory claimed by the bpf allocator and actively used
> > > portion of it. If you use our standard vmstat infrastructure then we can
> > > easily show that information in the OOM report.
> > 
> > OOM report can potentially be extended with info about bpf consumed
> > memory, but it's not clear whether it will help OOM analysis.
> 
> If GBs of memory can be sitting there then it is surely an interesting
> information to have when seeing OOM. One of the big shortcomings of the
> OOM analysis is unaccounted memory.
> 
> > bpftool map show
> > prints all map data already.
> > Some devs use bpf to inspect bpf maps for finer details in run-time.
> > drgn scripts pull that data from crash dumps.
> > There is no need for new counters.
> > The idea of bpf specific counters/limits was rejected by memcg folks.
> 
> I would argue that integration into vmstat is useful not only for oom
> analysis but also for regular health check scripts watching /proc/vmstat
> content. I do not think most of those generic tools are BPF aware. So
> unless there is a good reason to not account this memory there then I
> would vote for adding them. They are cheap and easy to integrate.
>  
> > > OK, thanks for the clarification. There is still one thing that is not
> > > really clear to me. Without a proper ownership bound to any process why
> > > is it desired/helpful to account the memory to a memcg?
> > 
> > The first step is to have a limit. memcg provides it.
> 
> I am sorry but this doesn't really explain it. Could you elaborate
> please? Is the limit supposed to protect against adversaries? Or is it
> just to prevent from accidental runaways? Is it purely for accounting
> purposes?
> 
> > > We have discussed something similar in a different email thread and I
> > > still didn't manage to find time to put all the parts together. But if
> > > the initiator (or however you call the process which loads the program)
> > > exits then this might be the last process in the specific cgroup and so
> > > it can be offlined and mostly invisible to an admin.
> > 
> > Roman already sent reparenting fix:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20220711162827.184743-1-roman.gushchin@linux.dev/

Just to be clear:
for the actual memory backing bpf maps (slabs, percpu or vmallocs),
reparenting was implemented several years ago. Nothing is changing here.

This patch only adds reparenting to the map->memcg pointer (by replacing it
with an objcg), which affects *new* allocations that happen after
the deletion of the cgroup. This would help to reduce the number of dying
cgroups, but probably not significantly, which is why it wasn't implemented
from the start.
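
To illustrate the map->memcg vs. objcg distinction, here is a toy userspace
model, not kernel code. The class and function names (Memcg, Objcg, BpfMap,
offline) are made up for the sketch. It only models where *new* charges land
after the cgroup goes away; the reparenting of already-charged memory is not
modeled.

```python
class Memcg:
    """Toy stand-in for a memory cgroup (hypothetical, not the kernel struct)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.online = True
        self.charged = 0  # bytes charged directly to this cgroup in the toy


class Objcg:
    """Indirection handle: holders keep the handle, the target can be swapped."""
    def __init__(self, memcg):
        self.memcg = memcg


class BpfMap:
    """Toy map that remembers an objcg handle instead of a raw memcg pointer."""
    def __init__(self, objcg):
        self.objcg = objcg

    def alloc(self, nbytes):
        # New allocations are charged to whatever the handle points at now.
        target = self.objcg.memcg
        target.charged += nbytes
        return target


def offline(memcg, objcgs):
    """Delete a cgroup: redirect its handles to the parent (assumed non-root)."""
    memcg.online = False
    for handle in objcgs:
        if handle.memcg is memcg:
            handle.memcg = memcg.parent


parent = Memcg("parent")
child = Memcg("child", parent)
handle = Objcg(child)
bpf_map = BpfMap(handle)

before = bpf_map.alloc(4096)   # charged to the child while it is online
offline(child, [handle])
after = bpf_map.alloc(4096)    # charged to the parent after deletion
print(before.name, after.name)
```

With a raw memcg pointer in the map, the second allocation would keep charging
the deleted cgroup and pin it; with the handle, nothing in the map refers to
the child any more.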

Thanks!

