linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: "Tejun Heo" <tj@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"JP Kobryn" <inwardvessel@gmail.com>,
	bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	"Meta kernel team" <kernel-team@meta.com>
Subject: Re: [RFC PATCH 3/3] cgroup: make css_rstat_updated nmi safe
Date: Tue, 6 May 2025 09:41:04 +0000	[thread overview]
Message-ID: <aBnZMBJ-OOEXvpUa@google.com> (raw)
In-Reply-To: <6u7ccequ5ye3e4iqblcdeqsigindo3xjpsvkdb6hyaw7cpjddc@u2ujv7ymlxc6>

On Thu, May 01, 2025 at 03:10:20PM -0700, Shakeel Butt wrote:
> On Wed, Apr 30, 2025 at 06:14:28AM -0700, Yosry Ahmed wrote:
> [...]
> > > +
> > > +	if (!_css_rstat_cpu_trylock(css, cpu, &flags)) {
> > 
> > 
> > IIUC this trylock will only fail if a BPF program runs in NMI context
> > and tries to update cgroup stats, interrupting a context that is already
> > holding the lock (i.e. updating or flushing stats).
> > 
> 
> Correct (though note that flushing side can be on a different CPU).
> 
> > How often does this happen in practice tho? Is it worth the complexity?
> 
> This is about correctness, so even a chance of occurance need the
> solution.

Right, my question was more about the need to special case NMIs, see
below.

> 
> > 
> > I wonder if it's better if we make css_rstat_updated() inherently
> > lockless instead.
> > 
> > What if css_rstat_updated() always just adds to a lockless tree,
> 
> Here I assume you meant lockless list instead of tree.

Yeah, in a sense. I meant using lockless lists to implement the rstat
tree instead of normal linked lists.

> 
> > and we
> > defer constructing the proper tree to the flushing side? This should
> > make updates generally faster and avoids locking or disabling interrupts
> > in the fast path. We essentially push more work to the flushing side.
> > 
> > We may be able to consolidate some of the code too if all the logic
> > manipulating the tree is on the flushing side.
> > 
> > WDYT? Am I missing something here?
> > 
> 
> Yes this can be done but I don't think we need to tie that to current
> series. I think we can start with lockless in the nmi context and then
> iteratively make css_rstat_updated() lockless for all contexts.

My question is basically whether it would be simpler to actually make it
all lockless than special casing NMIs. With this patch we have two
different paths and a deferred list that we process at a later point. I
think it may be simpler if we just make it all lockless to begin with.
Then we would have a single path and no special deferred processing.

WDYT?


  reply	other threads:[~2025-05-06  9:41 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-29  6:12 [RFC PATCH 0/3] cgroup: nmi safe css_rstat_updated Shakeel Butt
2025-04-29  6:12 ` [RFC PATCH 1/3] llist: add list_add_iff_not_on_list() Shakeel Butt
2025-04-30 12:44   ` [RFC PATCH 1/3] llist: add list_add_iff_not_on_list()g Yosry Ahmed
2025-04-29  6:12 ` [RFC PATCH 2/3] cgroup: support to enable nmi-safe css_rstat_updated Shakeel Butt
2025-04-29  6:12 ` [RFC PATCH 3/3] cgroup: make css_rstat_updated nmi safe Shakeel Butt
2025-04-30 13:14   ` Yosry Ahmed
2025-05-01 22:10     ` Shakeel Butt
2025-05-06  9:41       ` Yosry Ahmed [this message]
2025-05-06 19:30         ` Shakeel Butt
2025-05-07  6:52           ` Yosry Ahmed
2025-04-29  6:12 ` [OFFLIST PATCH 1/2] cgroup: use separate rstat trees for each subsystem Shakeel Butt
2025-04-29  6:12   ` [OFFLIST PATCH 2/2] cgroup: use subsystem-specific rstat locks to avoid contention Shakeel Butt
2025-04-29  6:15     ` Shakeel Butt
2025-05-21 22:23       ` Klara Modin
2025-05-21 22:29         ` Tejun Heo
2025-05-21 23:23         ` Shakeel Butt
2025-05-21 23:33           ` Shakeel Butt
2025-05-21 23:47             ` JP Kobryn
2025-05-21 23:50               ` Shakeel Butt
2025-05-21 23:52                 ` JP Kobryn
2025-05-21 23:47             ` Shakeel Butt
2025-04-29  6:15   ` [OFFLIST PATCH 1/2] cgroup: use separate rstat trees for each subsystem Shakeel Butt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBnZMBJ-OOEXvpUa@google.com \
    --to=yosry.ahmed@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=ast@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=bpf@vger.kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=inwardvessel@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=tj@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).