* Re: [PATCH v11 3/9] Use atomic_t for ucounts reference counting
@ 2021-04-27 22:36 kernel test robot
0 siblings, 0 replies; 3+ messages in thread
From: kernel test robot @ 2021-04-27 22:36 UTC (permalink / raw)
To: kbuild
CC: kbuild-all(a)lists.01.org
In-Reply-To: <94d1dbecab060a6b116b0a2d1accd8ca1bbb4f5f.1619094428.git.legion@kernel.org>
References: <94d1dbecab060a6b116b0a2d1accd8ca1bbb4f5f.1619094428.git.legion@kernel.org>
TO: legion(a)kernel.org
TO: LKML <linux-kernel@vger.kernel.org>
TO: Kernel Hardening <kernel-hardening@lists.openwall.com>
TO: Linux Containers <containers@lists.linux-foundation.org>
TO: linux-mm(a)kvack.org
CC: Alexey Gladkov <legion@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: "Eric W . Biederman" <ebiederm@xmission.com>
CC: Jann Horn <jannh@google.com>
Hi,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on kselftest/next]
[also build test WARNING on linux/master linus/master v5.12 next-20210427]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/legion-kernel-org/Count-rlimits-in-each-user-namespace/20210427-162857
base: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
:::::: branch date: 14 hours ago
:::::: commit date: 14 hours ago
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
cocci warnings: (new ones prefixed by >>)
>> kernel/ucount.c:188:5-24: atomic_dec_and_test variation before object free at line 190.
vim +188 kernel/ucount.c
0c1ee4f3f49c0f Alexey Gladkov 2021-04-22 183
0c1ee4f3f49c0f Alexey Gladkov 2021-04-22 184 void put_ucounts(struct ucounts *ucounts)
f6b2db1a3e8d14 Eric W. Biederman 2016-08-08 185 {
880a38547ff087 Nikolay Borisov 2017-01-20 186 unsigned long flags;
880a38547ff087 Nikolay Borisov 2017-01-20 187
33b16ca5b8cf91 Alexey Gladkov 2021-04-22 @188 if (atomic_dec_and_test(&ucounts->count)) {
880a38547ff087 Nikolay Borisov 2017-01-20 189 spin_lock_irqsave(&ucounts_lock, flags);
f6b2db1a3e8d14 Eric W. Biederman 2016-08-08 @190 hlist_del_init(&ucounts->node);
880a38547ff087 Nikolay Borisov 2017-01-20 191 spin_unlock_irqrestore(&ucounts_lock, flags);
f6b2db1a3e8d14 Eric W. Biederman 2016-08-08 192 kfree(ucounts);
f6b2db1a3e8d14 Eric W. Biederman 2016-08-08 193 }
33b16ca5b8cf91 Alexey Gladkov 2021-04-22 194 }
f6b2db1a3e8d14 Eric W. Biederman 2016-08-08 195
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
* [PATCH v11 0/9] Count rlimits in each user namespace
@ 2021-04-22 12:27 legion
2021-04-22 12:27 ` legion
0 siblings, 1 reply; 3+ messages in thread
From: legion @ 2021-04-22 12:27 UTC (permalink / raw)
To: LKML, Kernel Hardening, Linux Containers, linux-mm
Cc: Jens Axboe, Kees Cook, Jann Horn, Linus Torvalds, Oleg Nesterov,
Eric W . Biederman, Andrew Morton, Alexey Gladkov,
Christian Brauner
From: Alexey Gladkov <legion@kernel.org>
Preface
-------
These patches bind the rlimit counters to a user within a user namespace.
This patch set can be applied on top of:
git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.12-rc4
Problem
-------
The implementation of the RLIMIT_NPROC, RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and
RLIMIT_MSGQUEUE rlimits places the counters in user_struct [1]. These limits are
global across processes and persist for the lifetime of the process, even if
the processes are in different user namespaces.
To illustrate the impact of rlimits, let's say there is a program that does not
fork. Some service-A wants to run this program as user X in multiple containers.
Since the program never forks, the service wants to set RLIMIT_NPROC=1.
  service-A
   \- program (uid=1000, container1, rlimit_nproc=1)
   \- program (uid=1000, container2, rlimit_nproc=1)
service-A sets RLIMIT_NPROC=1 and runs the program in container1. When
service-A tries to run the program with RLIMIT_NPROC=1 in container2, it fails
because user X already has one running process.
The problem is not that the limit from container1 affects container2. The
problem is that the limit is checked against the global counter, which reflects
the number of processes in all containers.
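The failure can be modelled with a minimal userspace sketch. This is not the
kernel code: struct user_counter and try_start_process() are hypothetical names
used only for illustration, and the check is simplified to a single comparison
against one counter shared by every container that runs as the same uid.

```c
#include <stdbool.h>

/* Hypothetical model: one process counter per uid, shared by all containers. */
struct user_counter {
	unsigned int uid;
	long processes;
};

/*
 * Simplified stand-in for the RLIMIT_NPROC check: the limit is compared
 * against the global per-uid counter, regardless of which container asks.
 */
static bool try_start_process(struct user_counter *user, long rlimit_nproc)
{
	if (user->processes >= rlimit_nproc)
		return false;
	user->processes++;
	return true;
}
```

With a single counter for uid 1000, the first call with rlimit_nproc=1
(container1) succeeds and the second call (container2) is refused, even though
container2 itself runs no process yet.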
This problem can be worked around by using a different user for each container,
but then we face a different problem: uid mapping when transferring files from
one container to another.
Eric W. Biederman mentioned this issue [2][3].
Introduced changes
------------------
To address the problem, we bind the rlimit counters to the user namespace. Each
counter reflects the number of processes for a given uid in a given user
namespace. The result is a tree of rlimit counters with the biggest value at the
root (aka init_user_ns). The limit is considered exceeded if it is exceeded at
any level up the tree.
[1]: https://lore.kernel.org/containers/87imd2incs.fsf@x220.int.ebiederm.org/
[2]: https://lists.linuxfoundation.org/pipermail/containers/2020-August/042096.html
[3]: https://lists.linuxfoundation.org/pipermail/containers/2020-October/042524.html
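
The tree of counters described under "Introduced changes" can be sketched in
plain C as follows. This is a simplified, non-atomic model under stated
assumptions: struct ns_counter, counter_inc() and counter_dec() are hypothetical
names, and each node carries its own maximum. An increment walks from the leaf
to the root and is rolled back if any level is already at its maximum.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-(uid, namespace) counter; parent is NULL at the root. */
struct ns_counter {
	struct ns_counter *parent;
	long count;
	long max;
};

/* Charge one unit at every level; undo the partial charge on failure. */
static bool counter_inc(struct ns_counter *leaf)
{
	struct ns_counter *iter, *undo;

	for (iter = leaf; iter; iter = iter->parent) {
		if (iter->count >= iter->max)
			goto fail;
		iter->count++;
	}
	return true;
fail:
	/* Levels below the failing one were already incremented. */
	for (undo = leaf; undo != iter; undo = undo->parent)
		undo->count--;
	return false;
}

/* Release one unit at every level from the leaf up to the root. */
static void counter_dec(struct ns_counter *leaf)
{
	struct ns_counter *iter;

	for (iter = leaf; iter; iter = iter->parent)
		iter->count--;
}
```

In this model two containers with a limit of 1 each can both start one process
as the same uid, while the root counter still bounds the total.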
Changelog
---------
v11:
* Reverted most of the changes in signal.c to fix performance issues and remove
unnecessary memory allocations.
* Fixed issue found by lkp robot (again).
v10:
* Fixed memory leak in __sigqueue_alloc.
* Handled the unlikely situation in which all consumers return ucounts at once.
* Addressed other review comments from Eric W. Biederman.
v9:
* Used a negative value to check that the ucounts->count is close to overflow.
* Rebased onto v5.12-rc4.
v8:
* Used atomic_t for ucounts reference counting. Also added counter overflow
check (thanks to Linus Torvalds for the idea).
* Fixed other issues found by lkp-tests project in the patch that reimplements
RLIMIT_MEMLOCK on top of ucounts.
v7:
* Fixed issues found by lkp-tests project in the patch that reimplements
RLIMIT_MEMLOCK on top of ucounts.
v6:
* Fixed issues found by lkp-tests project.
* Rebased onto v5.11.
v5:
* Split the first commit into two commits: change ucounts.count type to atomic_long_t
and add ucounts to cred. These commits were merged by mistake during the rebase.
* Renamed __get_ucounts() to alloc_ucounts().
* Moved the cred.ucounts update out of commit_creds(), as it did not allow
errors to be handled there.
* Added error handling of set_cred_ucounts().
v4:
* Reverted the type change of ucounts.count to refcount_t.
* Fixed typo in the kernel/cred.c
v3:
* Added get_ucounts() function to increase the reference count. The existing
get_ucounts() function was renamed to __get_ucounts().
* Changed the type of ucounts.count from atomic_t to refcount_t.
* Dropped 'const' from set_cred_ucounts() arguments.
* Fixed a bug with freeing the cred structure after calling cred_alloc_blank().
* Commit messages have been updated.
* Added selftest.
v2:
* RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and RLIMIT_MSGQUEUE are migrated to ucounts.
* Added ucounts for the (uid, user namespace) pair to cred.
* Added the ability to increase ucount by more than 1.
v1:
* After discussion with Eric W. Biederman, I increased the size of ucounts to
atomic_long_t.
* Added ucount_max to avoid the fork bomb.
--
Alexey Gladkov (9):
Increase size of ucounts to atomic_long_t
Add a reference to ucounts for each cred
Use atomic_t for ucounts reference counting
Reimplement RLIMIT_NPROC on top of ucounts
Reimplement RLIMIT_MSGQUEUE on top of ucounts
Reimplement RLIMIT_SIGPENDING on top of ucounts
Reimplement RLIMIT_MEMLOCK on top of ucounts
kselftests: Add test to check for rlimit changes in different user
namespaces
ucounts: Set ucount_max to the largest positive value the type can
hold
fs/exec.c | 6 +-
fs/hugetlbfs/inode.c | 16 +-
fs/proc/array.c | 2 +-
include/linux/cred.h | 4 +
include/linux/hugetlb.h | 4 +-
include/linux/mm.h | 4 +-
include/linux/sched/user.h | 7 -
include/linux/shmem_fs.h | 2 +-
include/linux/signal_types.h | 4 +-
include/linux/user_namespace.h | 31 +++-
ipc/mqueue.c | 40 ++---
ipc/shm.c | 26 +--
kernel/cred.c | 50 +++++-
kernel/exit.c | 2 +-
kernel/fork.c | 18 +-
kernel/signal.c | 25 +--
kernel/sys.c | 14 +-
kernel/ucount.c | 116 ++++++++++---
kernel/user.c | 3 -
kernel/user_namespace.c | 9 +-
mm/memfd.c | 4 +-
mm/mlock.c | 22 ++-
mm/mmap.c | 4 +-
mm/shmem.c | 10 +-
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/rlimits/.gitignore | 2 +
tools/testing/selftests/rlimits/Makefile | 6 +
tools/testing/selftests/rlimits/config | 1 +
.../selftests/rlimits/rlimits-per-userns.c | 161 ++++++++++++++++++
29 files changed, 467 insertions(+), 127 deletions(-)
create mode 100644 tools/testing/selftests/rlimits/.gitignore
create mode 100644 tools/testing/selftests/rlimits/Makefile
create mode 100644 tools/testing/selftests/rlimits/config
create mode 100644 tools/testing/selftests/rlimits/rlimits-per-userns.c
--
2.29.3
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers
* [PATCH v11 3/9] Use atomic_t for ucounts reference counting
2021-04-22 12:27 [PATCH v11 0/9] Count rlimits in each user namespace legion
@ 2021-04-22 12:27 ` legion
0 siblings, 0 replies; 3+ messages in thread
From: legion @ 2021-04-22 12:27 UTC (permalink / raw)
To: LKML, Kernel Hardening, Linux Containers, linux-mm
Cc: Jens Axboe, Kees Cook, Jann Horn, Linus Torvalds, Oleg Nesterov,
Eric W . Biederman, Andrew Morton, Alexey Gladkov,
Christian Brauner
From: Alexey Gladkov <legion@kernel.org>
The current implementation of the ucounts reference counter requires the
use of spin_lock. We're going to use get_ucounts() in more
performance-critical areas, such as the handling of RLIMIT_SIGPENDING.
With this change, spin_lock is needed only when we want to modify the hashtable.
v10:
* Always try to put ucounts if we cannot increase ucounts->count.
This covers the case in which all consumers return ucounts at once.
v9:
* Use a negative value to check that the ucounts->count is close to
overflow.
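
The get/put scheme described above can be sketched in userspace with C11
atomics. This is only an approximation under stated assumptions: struct obj,
obj_new(), obj_get() and obj_put() are hypothetical names, atomic_fetch_add()
stands in for the kernel's atomic_add_negative(), and the hashtable and its lock
are left out. A get that pushes the counter into negative territory is treated
as an overflow: the reference is dropped again and NULL is returned.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ucounts; only the refcount is modelled. */
struct obj {
	atomic_int count;
};

/* Allocate an object holding one reference, or return NULL on failure. */
static struct obj *obj_new(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		atomic_init(&o->count, 1);
	return o;
}

/* Drop one reference; free the object and return true if it was the last. */
static bool obj_put(struct obj *o)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&o->count, 1) == 1) {
		free(o);
		return true;
	}
	return false;
}

/* Take one reference, or return NULL if the counter went negative. */
static struct obj *obj_get(struct obj *o)
{
	int prev;

	if (!o)
		return NULL;
	prev = atomic_fetch_add(&o->count, 1);
	/* The new value is negative if prev was INT_MAX (wrap) or below -1. */
	if (prev == INT_MAX || prev < -1) {
		obj_put(o);
		return NULL;
	}
	return o;
}
```

Because the failed get is always followed by a put, the counter returns to its
previous value and the object can still be released once the other holders drop
their references.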
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
include/linux/user_namespace.h | 4 +--
kernel/ucount.c | 53 ++++++++++++----------------------
2 files changed, 21 insertions(+), 36 deletions(-)
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index f71b5a4a3e74..d84cc2c0b443 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -92,7 +92,7 @@ struct ucounts {
 	struct hlist_node node;
 	struct user_namespace *ns;
 	kuid_t uid;
-	int count;
+	atomic_t count;
 	atomic_long_t ucount[UCOUNT_COUNTS];
 };
@@ -104,7 +104,7 @@ void retire_userns_sysctls(struct user_namespace *ns);
 struct ucounts *inc_ucount(struct user_namespace *ns, kuid_t uid, enum ucount_type type);
 void dec_ucount(struct ucounts *ucounts, enum ucount_type type);
 struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid);
-struct ucounts *get_ucounts(struct ucounts *ucounts);
+struct ucounts * __must_check get_ucounts(struct ucounts *ucounts);
 void put_ucounts(struct ucounts *ucounts);
 #ifdef CONFIG_USER_NS
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 50cc1dfb7d28..365865f368ec 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -11,7 +11,7 @@
 struct ucounts init_ucounts = {
 	.ns = &init_user_ns,
 	.uid = GLOBAL_ROOT_UID,
-	.count = 1,
+	.count = ATOMIC_INIT(1),
 };
 #define UCOUNTS_HASHTABLE_BITS 10
@@ -139,6 +139,15 @@ static void hlist_add_ucounts(struct ucounts *ucounts)
 	spin_unlock_irq(&ucounts_lock);
 }
+struct ucounts *get_ucounts(struct ucounts *ucounts)
+{
+	if (ucounts && atomic_add_negative(1, &ucounts->count)) {
+		put_ucounts(ucounts);
+		ucounts = NULL;
+	}
+	return ucounts;
+}
+
 struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid)
 {
 	struct hlist_head *hashent = ucounts_hashentry(ns, uid);
@@ -155,7 +164,7 @@ struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid)
 		new->ns = ns;
 		new->uid = uid;
-		new->count = 0;
+		atomic_set(&new->count, 1);
 		spin_lock_irq(&ucounts_lock);
 		ucounts = find_ucounts(ns, uid, hashent);
@@ -163,33 +172,12 @@ struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid)
 			kfree(new);
 		} else {
 			hlist_add_head(&new->node, hashent);
-			ucounts = new;
+			spin_unlock_irq(&ucounts_lock);
+			return new;
 		}
 	}
-	if (ucounts->count == INT_MAX)
-		ucounts = NULL;
-	else
-		ucounts->count += 1;
 	spin_unlock_irq(&ucounts_lock);
-	return ucounts;
-}
-
-struct ucounts *get_ucounts(struct ucounts *ucounts)
-{
-	unsigned long flags;
-
-	if (!ucounts)
-		return NULL;
-
-	spin_lock_irqsave(&ucounts_lock, flags);
-	if (ucounts->count == INT_MAX) {
-		WARN_ONCE(1, "ucounts: counter has reached its maximum value");
-		ucounts = NULL;
-	} else {
-		ucounts->count += 1;
-	}
-	spin_unlock_irqrestore(&ucounts_lock, flags);
-
+	ucounts = get_ucounts(ucounts);
 	return ucounts;
 }
@@ -197,15 +185,12 @@ void put_ucounts(struct ucounts *ucounts)
 {
 	unsigned long flags;
-	spin_lock_irqsave(&ucounts_lock, flags);
-	ucounts->count -= 1;
-	if (!ucounts->count)
+	if (atomic_dec_and_test(&ucounts->count)) {
+		spin_lock_irqsave(&ucounts_lock, flags);
 		hlist_del_init(&ucounts->node);
-	else
-		ucounts = NULL;
-	spin_unlock_irqrestore(&ucounts_lock, flags);
-
-	kfree(ucounts);
+		spin_unlock_irqrestore(&ucounts_lock, flags);
+		kfree(ucounts);
+	}
 }
 static inline bool atomic_long_inc_below(atomic_long_t *v, int u)
--
2.29.3