From: Alexey Gladkov <legion@kernel.org>
To: Jann Horn <jannh@google.com>
Cc: Chen Ridong <chenridong@huaweicloud.com>,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	lorenzo.stoakes@oracle.com, vbabka@suse.cz, pfalcato@suse.de,
	bigeasy@linutronix.de, paulmck@kernel.org, chenridong@huawei.com,
	roman.gushchin@linux.dev, brauner@kernel.org, pmladek@suse.com,
	geert@linux-m68k.org, mingo@kernel.org, rrangel@chromium.org,
	francesco@valla.it, kpsingh@kernel.org,
	guoweikang.kernel@gmail.com, link@vivo.com,
	viro@zeniv.linux.org.uk, neil@brown.name, nichen@iscas.ac.cn,
	tglx@linutronix.de, frederic@kernel.org, peterz@infradead.org,
	oleg@redhat.com, joel.granados@kernel.org, linux@weissschuh.net,
	avagin@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, lujialin4@huawei.com,
	"Serge E. Hallyn" <serge@hallyn.com>,
	David Howells <dhowells@redhat.com>
Subject: Re: [RFC next v2 0/2] ucounts: turn the atomic rlimit to percpu_counter
Date: Mon, 19 May 2025 23:01:27 +0200
Message-ID: <aCucJ9731YzaZI5b@example.org>
In-Reply-To: <CAG48ez2bFhYYj2qkJk3j5t=3VwYUH4sSMuohyC=MfrRw-bv22g@mail.gmail.com>

On Mon, May 19, 2025 at 09:32:17PM +0200, Jann Horn wrote:
> On Mon, May 19, 2025 at 3:25 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
> > From: Chen Ridong <chenridong@huawei.com>
> >
> > The will-it-scale test case signal1 [1] has been observed, and the test
> > results reveal that the signal-sending system call does not scale linearly.
> > To further investigate this issue, we initiated a series of tests by
> > launching varying numbers of dockers and closely monitored the throughput
> > of each individual docker. The detailed test outcomes are presented as
> > follows:
> >
> >         | Dockers     |1      |4      |8      |16     |32     |64     |
> >         | Throughput  |380068 |353204 |308948 |306453 |180659 |129152 |
> >
> > The data clearly demonstrates a discernible trend: as the quantity of
> > dockers increases, the throughput per container progressively declines.
> 
> But is that actually a problem? Do you have real workloads that
> concurrently send so many signals, or create inotify watches so
> quickly, that this has an actual performance impact?
> 
> > In-depth analysis has identified the root cause of this performance
> > degradation. The ucounts module conducts statistics on rlimit, which
> > involves a significant number of atomic operations. These atomic
> > operations, when acting on the same variable, trigger a substantial number
> > of cache misses or remote accesses, ultimately resulting in a drop in
> > performance.
> 
> You're probably running into the namespace-associated ucounts here? So
> the issue is probably that Docker creates all your containers with the
> same owner UID (EUID at namespace creation), causing them all to
> account towards a single ucount, while normally outside of containers,
> each RUID has its own ucount instance?
> 
> Sharing of rlimits between containers is probably normally undesirable
> even without the cacheline bouncing, because it means that too much
> resource usage in one container can cause resource allocations in
> another container to fail... so I think the real problem here is at a
> higher level, in the namespace setup code. Maybe root should be able
> to create a namespace that doesn't inherit ucount limits of its owner
> UID, or something like that...

If we allow rlimits not to be inherited in the userns being created, the
user will be able to bypass their rlimits by running a fork bomb inside
the new userns.

Or did I miss your point?

In init_user_ns all rlimits that are bound to it are set to RLIM_INFINITY.
So root can only reduce rlimits.

https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c#n1091
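
To illustrate the concern, here is a toy userspace model of hierarchical
limit charging. It is only a sketch, not the kernel's code: `struct level`
and `charge()` are hypothetical names, the counters are plain longs rather
than atomics, and LONG_MAX stands in for RLIM_INFINITY. Each charge is
applied at every level from the namespace up to the root, so a limit set
on an ancestor still applies inside a child. A namespace hooked directly
under the root, skipping its owner's level, is no longer bounded by the
owner's limit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one level of a limit hierarchy. */
struct level {
	struct level *parent;	/* NULL for the root (initial) level */
	long count;		/* usage currently charged at this level */
	long max;		/* limit here; LONG_MAX models RLIM_INFINITY */
};

/*
 * Charge one unit at every level from `l` up to the root.
 * If any level is already at its limit, undo the partial charge
 * and report failure.
 */
static bool charge(struct level *l)
{
	struct level *iter, *undo;

	assert(l != NULL);

	for (iter = l; iter; iter = iter->parent) {
		if (iter->count >= iter->max) {
			/* Roll back the levels charged before the failing one. */
			for (undo = l; undo != iter; undo = undo->parent)
				undo->count--;
			return false;
		}
		iter->count++;
	}
	return true;
}
```

With a root level at LONG_MAX, a user level limited to 2, and a namespace
whose parent is the user level, the third charge() in the namespace fails.
If the namespace's parent is the root instead, charges keep succeeding and
the user level's count never changes.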

-- 
Rgrds, legion



