* Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts
@ 2021-08-17  4:03 Ma, XinjianX
  2021-08-17 15:47 ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Ma, XinjianX @ 2021-08-17 4:03 UTC (permalink / raw)
  To: legion@kernel.org, linux-kselftest@vger.kernel.org
  Cc: lkp, linux-kselftest@vger.kernel.org, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, ebiederm@xmission.com, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org

Hi Alexey,

When the lkp team ran the kernel selftests, we found that after this series of patches the testcase mqueue: mq_perf_tests in kselftest fails with the following message.

If you confirm and fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot lkp@intel.com

```
# selftests: mqueue: mq_perf_tests
#
# Initial system state:
# Using queue path: /mq_perf_tests
# RLIMIT_MSGQUEUE(soft): 819200
# RLIMIT_MSGQUEUE(hard): 819200
# Maximum Message Size: 8192
# Maximum Queue Size: 10
# Nice value: 0
#
# Adjusted system state for testing:
# RLIMIT_MSGQUEUE(soft): (unlimited)
# RLIMIT_MSGQUEUE(hard): (unlimited)
# Maximum Message Size: 16777216
# Maximum Queue Size: 65530
# Nice value: -20
# Continuous mode: (disabled)
# CPUs to pin: 3
# ./mq_perf_tests: mq_open() at 296: Too many open files
not ok 2 selftests: mqueue: mq_perf_tests # exit=1
```

Test env:
rootfs: debian-10
gcc version: 9

------
Thanks
Ma Xinjian

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts
  2021-08-17  4:03 [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts Ma, XinjianX
@ 2021-08-17 15:47 ` Eric W. Biederman
  2021-08-18 13:11 ` Alexey Gladkov
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2021-08-17 15:47 UTC (permalink / raw)
  To: Ma, XinjianX
  Cc: legion@kernel.org, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org

"Ma, XinjianX" <xinjianx.ma@intel.com> writes:

> Hi Alexey,
>
> When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests
> in kselftest failed with following message.

Which kernel was this run against?

Where can the mq_perf_tests that you ran and had problems with be found?

During your run, were you using user namespaces as part of your test environment?

The error message "Too many open files" corresponds to the error code EMFILE, which is the error code that is returned when the rlimit is reached.

One possibility is that your test environment was run in a user namespace, and so you wound up limited by the rlimit of the user who created the user namespace, as it stood at the point of user namespace creation.

At this point, if you can give us enough information to look into this and attempt to reproduce it, that would be appreciated.

> If you confirm and fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot lkp@intel.com > > ``` > # selftests: mqueue: mq_perf_tests > # > # Initial system state: > # Using queue path: /mq_perf_tests > # RLIMIT_MSGQUEUE(soft): 819200 > # RLIMIT_MSGQUEUE(hard): 819200 > # Maximum Message Size: 8192 > # Maximum Queue Size: 10 > # Nice value: 0 > # > # Adjusted system state for testing: > # RLIMIT_MSGQUEUE(soft): (unlimited) > # RLIMIT_MSGQUEUE(hard): (unlimited) > # Maximum Message Size: 16777216 > # Maximum Queue Size: 65530 > # Nice value: -20 > # Continuous mode: (disabled) > # CPUs to pin: 3 > # ./mq_perf_tests: mq_open() at 296: Too many open files > not ok 2 selftests: mqueue: mq_perf_tests # exit=1 > ``` > > Test env: > rootfs: debian-10 > gcc version: 9 > > ------ > Thanks > Ma Xinjian Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
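
A side note on the error code discussed above: the errno constant is spelled EMFILE, and glibc prints it as "Too many open files". It is easy to confuse with ENFILE ("Too many open files in system"), which refers to a system-wide limit instead. The following is only a minimal sketch to tell the two apart; errno_symbol() is a hypothetical helper name, not something from the test or the kernel.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: map the two easily confused errno values to
 * their symbolic names. Returns NULL for anything else.
 */
const char *errno_symbol(int err)
{
	switch (err) {
	case EMFILE:
		/* Per-process limit; per this thread, also what mq_open()
		 * reports when the RLIMIT_MSGQUEUE accounting refuses. */
		return "EMFILE";
	case ENFILE:
		/* System-wide limit on open files. */
		return "ENFILE";
	default:
		return NULL;
	}
}
```

Passing the errno saved right after a failed mq_open() to such a helper makes it clear which of the two limits was hit.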
* Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts 2021-08-17 15:47 ` Eric W. Biederman @ 2021-08-18 13:11 ` Alexey Gladkov 2021-08-19 1:50 ` Ma, XinjianX 2021-08-19 15:10 ` Eric W. Biederman 0 siblings, 2 replies; 10+ messages in thread From: Alexey Gladkov @ 2021-08-18 13:11 UTC (permalink / raw) To: Eric W. Biederman Cc: Ma, XinjianX, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org On Tue, Aug 17, 2021 at 10:47:14AM -0500, Eric W. Biederman wrote: > "Ma, XinjianX" <xinjianx.ma@intel.com> writes: > > > Hi Alexey, > > > > When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests > > in kselftest failed with following message. > > Which kernel was this run against? > > Where can the mq_perf_tests that you ran and had problems with be found? > > During your run were you using user namespaces as part of your test > environment? > > The error message too many files corresponds to the error code EMFILES > which is the error code that is returned when the rlimit is reached. > > One possibility is that your test environment was run in a user > namespace and so you wound up limited by rlimit of the user who created > the user namespace at the point of user namespace creation. > > At this point if you can give us enough information to look into this > and attempt to reproduce it that would be appreciated. I was able to reproduce it on master without using user namespace. 
I suspect that the maximum value is not assigned here [1]: set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE)); [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c#n832 > > If you confirm and fix the issue, kindly add following tag as appropriate > > Reported-by: kernel test robot lkp@intel.com > > > > ``` > > # selftests: mqueue: mq_perf_tests > > # > > # Initial system state: > > # Using queue path: /mq_perf_tests > > # RLIMIT_MSGQUEUE(soft): 819200 > > # RLIMIT_MSGQUEUE(hard): 819200 > > # Maximum Message Size: 8192 > > # Maximum Queue Size: 10 > > # Nice value: 0 > > # > > # Adjusted system state for testing: > > # RLIMIT_MSGQUEUE(soft): (unlimited) > > # RLIMIT_MSGQUEUE(hard): (unlimited) > > # Maximum Message Size: 16777216 > > # Maximum Queue Size: 65530 > > # Nice value: -20 > > # Continuous mode: (disabled) > > # CPUs to pin: 3 > > # ./mq_perf_tests: mq_open() at 296: Too many open files > > not ok 2 selftests: mqueue: mq_perf_tests # exit=1 > > ``` > > > > Test env: > > rootfs: debian-10 > > gcc version: 9 > > > > ------ > > Thanks > > Ma Xinjian > > Eric > -- Rgrds, legion ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts 2021-08-18 13:11 ` Alexey Gladkov @ 2021-08-19 1:50 ` Ma, XinjianX 2021-08-19 15:10 ` Eric W. Biederman 1 sibling, 0 replies; 10+ messages in thread From: Ma, XinjianX @ 2021-08-19 1:50 UTC (permalink / raw) To: Alexey Gladkov, Eric W. Biederman Cc: linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org > -----Original Message----- > From: Alexey Gladkov <legion@kernel.org> > Sent: Wednesday, August 18, 2021 9:11 PM > To: Eric W. Biederman <ebiederm@xmission.com> > Cc: Ma, XinjianX <xinjianx.ma@intel.com>; linux-kselftest@vger.kernel.org; > lkp <lkp@intel.com>; akpm@linux-foundation.org; axboe@kernel.dk; > christian.brauner@ubuntu.com; containers@lists.linux-foundation.org; > jannh@google.com; keescook@chromium.org; kernel- > hardening@lists.openwall.com; linux-kernel@vger.kernel.org; linux- > mm@kvack.org; oleg@redhat.com; torvalds@linux-foundation.org > Subject: Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of > ucounts > > On Tue, Aug 17, 2021 at 10:47:14AM -0500, Eric W. Biederman wrote: > > "Ma, XinjianX" <xinjianx.ma@intel.com> writes: > > > > > Hi Alexey, > > > > > > When lkp team run kernel selftests, we found after these series of > > > patches, testcase mqueue: mq_perf_tests in kselftest failed with > following message. > > > > Which kernel was this run against? > > > > Where can the mq_perf_tests that you ran and had problems with be > found? > > > > During your run were you using user namespaces as part of your test > > environment? > > > > The error message too many files corresponds to the error code EMFILES > > which is the error code that is returned when the rlimit is reached. 
> > > > One possibility is that your test environment was run in a user > > namespace and so you wound up limited by rlimit of the user who > > created the user namespace at the point of user namespace creation. > > > > At this point if you can give us enough information to look into this > > and attempt to reproduce it that would be appreciated. > > I was able to reproduce it on master without using user namespace. > I suspect that the maximum value is not assigned here [1]: > > set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, > task_rlimit(&init_task, RLIMIT_MSGQUEUE)); > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kerne > l/fork.c#n832 Thank you for confirming the issue. And will you plan to fix this issue? If it's your plan, kindly add following tag as appropriate Reported-by: kernel test robot <xinjianx.ma@intel.com> > > > > If you confirm and fix the issue, kindly add following tag as > > > appropriate > > > Reported-by: kernel test robot lkp@intel.com > > > > > > ``` > > > # selftests: mqueue: mq_perf_tests > > > # > > > # Initial system state: > > > # Using queue path: /mq_perf_tests > > > # RLIMIT_MSGQUEUE(soft): 819200 > > > # RLIMIT_MSGQUEUE(hard): 819200 > > > # Maximum Message Size: 8192 > > > # Maximum Queue Size: 10 > > > # Nice value: 0 > > > # > > > # Adjusted system state for testing: > > > # RLIMIT_MSGQUEUE(soft): (unlimited) > > > # RLIMIT_MSGQUEUE(hard): (unlimited) > > > # Maximum Message Size: 16777216 > > > # Maximum Queue Size: 65530 > > > # Nice value: -20 > > > # Continuous mode: (disabled) > > > # CPUs to pin: 3 > > > # ./mq_perf_tests: mq_open() at 296: Too many open files not ok 2 > > > selftests: mqueue: mq_perf_tests # exit=1 > > > ``` > > > > > > Test env: > > > rootfs: debian-10 > > > gcc version: 9 > > > > > > ------ > > > Thanks > > > Ma Xinjian > > > > Eric > > > > -- > Rgrds, legion ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts 2021-08-18 13:11 ` Alexey Gladkov 2021-08-19 1:50 ` Ma, XinjianX @ 2021-08-19 15:10 ` Eric W. Biederman 2021-08-19 17:26 ` Alexey Gladkov 1 sibling, 1 reply; 10+ messages in thread From: Eric W. Biederman @ 2021-08-19 15:10 UTC (permalink / raw) To: Alexey Gladkov Cc: Ma, XinjianX, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org Alexey Gladkov <legion@kernel.org> writes: > On Tue, Aug 17, 2021 at 10:47:14AM -0500, Eric W. Biederman wrote: >> "Ma, XinjianX" <xinjianx.ma@intel.com> writes: >> >> > Hi Alexey, >> > >> > When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests >> > in kselftest failed with following message. >> >> Which kernel was this run against? >> >> Where can the mq_perf_tests that you ran and had problems with be found? >> >> During your run were you using user namespaces as part of your test >> environment? >> >> The error message too many files corresponds to the error code EMFILES >> which is the error code that is returned when the rlimit is reached. >> >> One possibility is that your test environment was run in a user >> namespace and so you wound up limited by rlimit of the user who created >> the user namespace at the point of user namespace creation. >> >> At this point if you can give us enough information to look into this >> and attempt to reproduce it that would be appreciated. > > I was able to reproduce it on master without using user namespace. 
> I suspect that the maximum value is not assigned here [1]:
>
> set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE));
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c#n832

The rlimits for init_task are set to INIT_RLIMITS.
In INIT_RLIMITS, RLIMIT_MSGQUEUE is set to MQ_BYTES_MAX.

So that definitely means that, as the code is currently constructed, the rlimit cannot be effectively raised.

So it looks like we are just silly and preventing the initial rlimits from being raised.

So we probably want to do something like:

diff --git a/kernel/fork.c b/kernel/fork.c
index bc94b2cc5995..557ce0083ba3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -825,13 +825,13 @@ void __init fork_init(void)
 	init_task.signal->rlim[RLIMIT_SIGPENDING] =
 		init_task.signal->rlim[RLIMIT_NPROC];
 
+	/* For non-rlimit ucounts make their default limit max_threads/2 */
 	for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++)
 		init_user_ns.ucount_max[i] = max_threads/2;
 
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, task_rlimit(&init_task, RLIMIT_NPROC));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, task_rlimit(&init_task, RLIMIT_MEMLOCK));
+	/* In init_user_ns default rlimit to be the only limit */
+	for (; i < UCOUNT_COUNTS; i++)
+		set_rlimit_ucount_max(&init_user_ns, i, RLIMIT_INFINITY);
 
 #ifdef CONFIG_VMAP_STACK
 	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",

Eric

* Re: [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts 2021-08-19 15:10 ` Eric W. Biederman @ 2021-08-19 17:26 ` Alexey Gladkov 2021-08-23 21:06 ` [PATCH] ucounts: Fix regression preventing increasing of rlimits in init_user_ns Eric W. Biederman 0 siblings, 1 reply; 10+ messages in thread From: Alexey Gladkov @ 2021-08-19 17:26 UTC (permalink / raw) To: Eric W. Biederman Cc: Ma, XinjianX, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org On Thu, Aug 19, 2021 at 10:10:26AM -0500, Eric W. Biederman wrote: > Alexey Gladkov <legion@kernel.org> writes: > > > On Tue, Aug 17, 2021 at 10:47:14AM -0500, Eric W. Biederman wrote: > >> "Ma, XinjianX" <xinjianx.ma@intel.com> writes: > >> > >> > Hi Alexey, > >> > > >> > When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests > >> > in kselftest failed with following message. > >> > >> Which kernel was this run against? > >> > >> Where can the mq_perf_tests that you ran and had problems with be found? > >> > >> During your run were you using user namespaces as part of your test > >> environment? > >> > >> The error message too many files corresponds to the error code EMFILES > >> which is the error code that is returned when the rlimit is reached. > >> > >> One possibility is that your test environment was run in a user > >> namespace and so you wound up limited by rlimit of the user who created > >> the user namespace at the point of user namespace creation. > >> > >> At this point if you can give us enough information to look into this > >> and attempt to reproduce it that would be appreciated. > > > > I was able to reproduce it on master without using user namespace. 
> > I suspect that the maximum value is not assigned here [1]: > > > > set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE)); > > > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c#n832 > > The rlimits for init_task are set to INIT_RLIMITS. > In INIT_RLIMITS RLIMIT_MSGQUEUE is set to MQ_MAX_BYTES > > So that definitely means that as the code is current constructed the > rlimit can not be effectively raised. > > So it looks like we are just silly and preventing the initial rlimits > from being raised. > > So we probably want to do something like: Damn, you are faster than me! :) > diff --git a/kernel/fork.c b/kernel/fork.c > index bc94b2cc5995..557ce0083ba3 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -825,13 +825,13 @@ void __init fork_init(void) > init_task.signal->rlim[RLIMIT_SIGPENDING] = > init_task.signal->rlim[RLIMIT_NPROC]; > > + /* For non-rlimit ucounts make their default limit max_threads/2 */ > for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++) > init_user_ns.ucount_max[i] = max_threads/2; > > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, task_rlimit(&init_task, RLIMIT_NPROC)); > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE)); > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING)); > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, task_rlimit(&init_task, RLIMIT_MEMLOCK)); > + /* In init_user_ns default rlimit to be the only limit */ > + for (; i < UCOUNT_COUNTS; i++) > + set_rlimit_ucount_max(&init_user_ns, i, RLIMIT_INFINITY); s/RLIMIT_INFINITY/RLIM_INFINITY/ > > #ifdef CONFIG_VMAP_STACK > cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache", > Acked-by: Alexey Gladkov <legion@kernel.org> I cannot complete this test on my laptop. On 4Gb, the test ends with oom-killer. 
But with this patch, the test definitely gets past the point where it previously failed.

-- 
Rgrds, legion

* [PATCH] ucounts: Fix regression preventing increasing of rlimits in init_user_ns
  2021-08-19 17:26 ` Alexey Gladkov
@ 2021-08-23 21:06 ` Eric W. Biederman
  2021-08-24  1:19 ` Ma, XinjianX
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2021-08-23 21:06 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Ma, XinjianX, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org

"Ma, XinjianX" <xinjianx.ma@intel.com> reported:

> When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests
> in kselftest failed with following message.
>
> # selftests: mqueue: mq_perf_tests
> #
> # Initial system state:
> # Using queue path: /mq_perf_tests
> # RLIMIT_MSGQUEUE(soft): 819200
> # RLIMIT_MSGQUEUE(hard): 819200
> # Maximum Message Size: 8192
> # Maximum Queue Size: 10
> # Nice value: 0
> #
> # Adjusted system state for testing:
> # RLIMIT_MSGQUEUE(soft): (unlimited)
> # RLIMIT_MSGQUEUE(hard): (unlimited)
> # Maximum Message Size: 16777216
> # Maximum Queue Size: 65530
> # Nice value: -20
> # Continuous mode: (disabled)
> # CPUs to pin: 3
> # ./mq_perf_tests: mq_open() at 296: Too many open files
> not ok 2 selftests: mqueue: mq_perf_tests # exit=1
>
> Test env:
> rootfs: debian-10
> gcc version: 9

After investigation, the problem turned out to be that ucount_max for the rlimits in init_user_ns was being set to the initial rlimit value. The practical problem is that ucount_max provides a limit that applications inside the user namespace cannot exceed, which means in practice that rlimits that have been converted to use the ucount infrastructure could not be raised above their initial values.

Solve this by setting the relevant values of ucount_max to RLIM_INFINITY. A limit in init_user_ns is pointless, so the code should allow the values to grow as large as possible without risking an underflow or an overflow.

As the ltp test case was a bit of a pain, I have reproduced the rlimit failure and tested the fix with the following little C program:

> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <mqueue.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <string.h>
> #include <stdlib.h>
> #include <limits.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
> 	struct mq_attr mq_attr;
> 	struct rlimit rlim;
> 	mqd_t mqd;
> 	int ret;
>
> 	/* Show the limits the process starts with. */
> 	ret = getrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	printf("RLIMIT_MSGQUEUE %lu %lu\n",
> 	       (unsigned long)rlim.rlim_cur, (unsigned long)rlim.rlim_max);
>
> 	/* Raise both the soft and the hard limit to unlimited. */
> 	rlim.rlim_cur = RLIM_INFINITY;
> 	rlim.rlim_max = RLIM_INFINITY;
> 	ret = setrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	/* Ask for a queue far larger than the initial rlimit allows. */
> 	memset(&mq_attr, 0, sizeof(struct mq_attr));
> 	mq_attr.mq_maxmsg = 65536 - 1;
> 	mq_attr.mq_msgsize = 16*1024*1024 - 1;
>
> 	mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600, &mq_attr);
> 	if (mqd == (mqd_t)-1) {
> 		fprintf(stderr, "mq_open failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	ret = mq_close(mqd);
> 	if (ret) {
> 		fprintf(stderr, "mq_close failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	return EXIT_SUCCESS;
> }

Fixes: 6e52a9f0532f ("Reimplement RLIMIT_MSGQUEUE on top of ucounts")
Fixes: d7c9e99aee48 ("Reimplement RLIMIT_MEMLOCK on top of ucounts")
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
Reported-by: kernel test robot lkp@intel.com
Acked-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

This is a simplified version of my previous change that I have tested and will push out to linux-next and then to Linus shortly.

 kernel/fork.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index bc94b2cc5995..44f4c2d83763 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -828,10 +828,10 @@ void __init fork_init(void)
 	for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++)
 		init_user_ns.ucount_max[i] = max_threads/2;
 
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, task_rlimit(&init_task, RLIMIT_NPROC));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, task_rlimit(&init_task, RLIMIT_MEMLOCK));
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, RLIM_INFINITY);
 
 #ifdef CONFIG_VMAP_STACK
 	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",
-- 
2.20.1

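
To make the failure mode described in the commit message easier to follow, here is a small user-space model of it. This is a sketch under stated assumptions, not the kernel code: struct model_ns, model_charge(), model_uncharge() and model_mq_create() are hypothetical names, and the model only mirrors the idea that a charge is checked against a per-namespace maximum at every level up to the root and then against the task's own rlimit.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical model of one user namespace level; not a kernel structure. */
struct model_ns {
	struct model_ns *parent;	/* NULL for the initial namespace */
	long long max;			/* stands in for ucount_max of one type */
	long long count;		/* bytes currently charged at this level */
};

/*
 * Charge v at every level from ns up to the root. Returns the new count
 * at the starting level, or LLONG_MAX if any level went over its max.
 * The model assumes values small enough that the additions cannot overflow.
 */
long long model_charge(struct model_ns *ns, long long v)
{
	long long ret = 0;

	for (struct model_ns *it = ns; it; it = it->parent) {
		it->count += v;
		if (it->count < 0 || it->count > it->max)
			ret = LLONG_MAX;
		else if (it == ns)
			ret = it->count;
	}
	return ret;
}

/* Undo a charge at every level. */
void model_uncharge(struct model_ns *ns, long long v)
{
	for (struct model_ns *it = ns; it; it = it->parent)
		it->count -= v;
}

/*
 * Returns 0 if a queue of the given size is accepted, -1 if it is refused
 * (the thread reports EMFILE for the real mq_open() in that case).
 */
int model_mq_create(struct model_ns *ns, long long task_rlimit, long long bytes)
{
	long long total = model_charge(ns, bytes);

	if (total == LLONG_MAX || total > task_rlimit) {
		model_uncharge(ns, bytes);
		return -1;
	}
	return 0;
}
```

In this model, a root namespace whose max is 819200 (the initial RLIMIT_MSGQUEUE value from the report) refuses a 1000000-byte request even when task_rlimit is LLONG_MAX, which is the regression. With the root max set to LLONG_MAX, standing in for RLIM_INFINITY, the same request is accepted and only the task's own rlimit decides.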
* RE: [PATCH] ucounts: Fix regression preventing increasing of rlimits in init_user_ns 2021-08-23 21:06 ` [PATCH] ucounts: Fix regression preventing increasing of rlimits in init_user_ns Eric W. Biederman @ 2021-08-24 1:19 ` Ma, XinjianX 2021-08-24 3:24 ` Eric W. Biederman 0 siblings, 1 reply; 10+ messages in thread From: Ma, XinjianX @ 2021-08-24 1:19 UTC (permalink / raw) To: Eric W. Biederman, Alexey Gladkov Cc: linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org > -----Original Message----- > From: Eric W. Biederman <ebiederm@xmission.com> > Sent: Tuesday, August 24, 2021 5:07 AM > To: Alexey Gladkov <legion@kernel.org> > Cc: Ma, XinjianX <xinjianx.ma@intel.com>; linux-kselftest@vger.kernel.org; > lkp <lkp@intel.com>; akpm@linux-foundation.org; axboe@kernel.dk; > christian.brauner@ubuntu.com; containers@lists.linux-foundation.org; > jannh@google.com; keescook@chromium.org; kernel- > hardening@lists.openwall.com; linux-kernel@vger.kernel.org; linux- > mm@kvack.org; oleg@redhat.com; torvalds@linux-foundation.org > Subject: [PATCH] ucounts: Fix regression preventing increasing of rlimits in > init_user_ns > > > "Ma, XinjianX" <xinjianx.ma@intel.com> reported: > > > When lkp team run kernel selftests, we found after these series of > > patches, testcase mqueue: mq_perf_tests in kselftest failed with following > message. 
> > > > # selftests: mqueue: mq_perf_tests > > # > > # Initial system state: > > # Using queue path: /mq_perf_tests > > # RLIMIT_MSGQUEUE(soft): 819200 > > # RLIMIT_MSGQUEUE(hard): 819200 > > # Maximum Message Size: 8192 > > # Maximum Queue Size: 10 > > # Nice value: 0 > > # > > # Adjusted system state for testing: > > # RLIMIT_MSGQUEUE(soft): (unlimited) > > # RLIMIT_MSGQUEUE(hard): (unlimited) > > # Maximum Message Size: 16777216 > > # Maximum Queue Size: 65530 > > # Nice value: -20 > > # Continuous mode: (disabled) > > # CPUs to pin: 3 > > # ./mq_perf_tests: mq_open() at 296: Too many open files not ok 2 > > selftests: mqueue: mq_perf_tests # exit=1 ``` > > > > Test env: > > rootfs: debian-10 > > gcc version: 9 > > After investigation the problem turned out to be that ucount_max for the > rlimits in init_user_ns was being set to the initial rlimit value. > The practical problem is that ucount_max provides a limit that applications > inside the user namespace can not exceed. Which means in practice that > rlimits that have been converted to use the ucount infrastructure were not > able to exceend their initial rlimits. > > Solve this by setting the relevant values of ucount_max to RLIM_INIFINITY. A > limit in init_user_ns is pointless so the code should allow the values to grow > as large as possible without riscking an underflow or an overflow. 
> > As the ltp test case was a bit of a pain I have reproduced the rlimit failure and > tested the fix with the following little C program: > > #include <stdio.h> > > #include <fcntl.h> > > #include <sys/stat.h> > > #include <mqueue.h> > > #include <sys/time.h> > > #include <sys/resource.h> > > #include <errno.h> > > #include <string.h> > > #include <stdlib.h> > > #include <limits.h> > > #include <unistd.h> > > > > int main(int argc, char **argv) > > { > > struct mq_attr mq_attr; > > struct rlimit rlim; > > mqd_t mqd; > > int ret; > > > > ret = getrlimit(RLIMIT_MSGQUEUE, &rlim); > > if (ret != 0) { > > fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n", > strerror(errno)); > > exit(EXIT_FAILURE); > > } > > printf("RLIMIT_MSGQUEUE %lu %lu\n", > > rlim.rlim_cur, rlim.rlim_max); > > rlim.rlim_cur = RLIM_INFINITY; > > rlim.rlim_max = RLIM_INFINITY; > > ret = setrlimit(RLIMIT_MSGQUEUE, &rlim); > > if (ret != 0) { > > fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY) > failed: %s\n", strerror(errno)); > > exit(EXIT_FAILURE); > > } > > > > memset(&mq_attr, 0, sizeof(struct mq_attr)); > > mq_attr.mq_maxmsg = 65536 - 1; > > mq_attr.mq_msgsize = 16*1024*1024 - 1; > > > > mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600, > &mq_attr); > > if (mqd == (mqd_t)-1) { > > fprintf(stderr, "mq_open failed: %s\n", strerror(errno)); > > exit(EXIT_FAILURE); > > } > > ret = mq_close(mqd); > > if (ret) { > > fprintf(stderr, "mq_close failed; %s\n", strerror(errno)); > > exit(EXIT_FAILURE); > > } > > > > return EXIT_SUCCESS; > > } > > Fixes: 6e52a9f0532f ("Reimplement RLIMIT_MSGQUEUE on top of ucounts") > Fixes: d7c9e99aee48 ("Reimplement RLIMIT_MEMLOCK on top of ucounts") > Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts") > Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") > Reported-by: kernel test robot lkp@intel.com Sorry, but <> around email address is needed Reported-by: kernel test robot <lkp@intel.com> > Acked-by: Alexey 
Gladkov <legion@kernel.org> > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > --- > > This is a simplified version of my previous change that I have tested and will > push out to linux-next and then to Linus shortly. > > kernel/fork.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c index bc94b2cc5995..44f4c2d83763 > 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -828,10 +828,10 @@ void __init fork_init(void) > for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++) > init_user_ns.ucount_max[i] = max_threads/2; > > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, > task_rlimit(&init_task, RLIMIT_NPROC)); > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, > task_rlimit(&init_task, RLIMIT_MSGQUEUE)); > - set_rlimit_ucount_max(&init_user_ns, > UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING)); > - set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, > task_rlimit(&init_task, RLIMIT_MEMLOCK)); > + set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, > RLIM_INFINITY); > + set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, > RLIM_INFINITY); > + set_rlimit_ucount_max(&init_user_ns, > UCOUNT_RLIMIT_SIGPENDING, RLIM_INFINITY); > + set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, > RLIM_INFINITY); > > #ifdef CONFIG_VMAP_STACK > cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, > "fork:vm_stack_cache", > -- > 2.20.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ucounts: Fix regression preventing increasing of rlimits in init_user_ns 2021-08-24 1:19 ` Ma, XinjianX @ 2021-08-24 3:24 ` Eric W. Biederman 0 siblings, 0 replies; 10+ messages in thread From: Eric W. Biederman @ 2021-08-24 3:24 UTC (permalink / raw) To: Ma, XinjianX Cc: Alexey Gladkov, linux-kselftest@vger.kernel.org, lkp, akpm@linux-foundation.org, axboe@kernel.dk, christian.brauner@ubuntu.com, containers@lists.linux-foundation.org, jannh@google.com, keescook@chromium.org, kernel-hardening@lists.openwall.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com, torvalds@linux-foundation.org "Ma, XinjianX" <xinjianx.ma@intel.com> writes: >> -----Original Message----- >> From: Eric W. Biederman <ebiederm@xmission.com> >> ... >> Reported-by: kernel test robot lkp@intel.com > Sorry, but <> around email address is needed > Reported-by: kernel test robot <lkp@intel.com> The change is already tested and pushed out so I really don't want to mess with it. Especially as I am aiming to send it to Linus on Wednesday after it has had a chance to pass through linux-next and whatever automated tests are there. What does copying and pasting the Reported-by: tag as included in your original report cause to break? At this point I suspect that the danger of fat fingering something far outweighs whatever benefits might be gained by surrounding the email address with <> marks. Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v11 0/9] Count rlimits in each user namespace
@ 2021-04-22 12:27 legion
  2021-04-22 12:27 ` [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts legion
  0 siblings, 1 reply; 10+ messages in thread
From: legion @ 2021-04-22 12:27 UTC (permalink / raw)
  To: LKML, Kernel Hardening, Linux Containers, linux-mm
  Cc: Alexey Gladkov, Andrew Morton, Christian Brauner,
	Eric W . Biederman, Jann Horn, Jens Axboe, Kees Cook,
	Linus Torvalds, Oleg Nesterov

From: Alexey Gladkov <legion@kernel.org>

Preface
-------
These patches are for binding the rlimit counters to a user in a user
namespace. This patch set can be applied on top of:

git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.12-rc4

Problem
-------
The RLIMIT_NPROC, RLIMIT_MEMLOCK, RLIMIT_SIGPENDING, RLIMIT_MSGQUEUE rlimits
implementation places the counters in user_struct [1]. These limits are global
between processes and persist for the lifetime of the process, even if
processes are in different user namespaces.

To illustrate the impact of rlimits, let's say there is a program that does not
fork. Some service-A wants to run this program as user X in multiple containers.
Since the program never forks, the service wants to set RLIMIT_NPROC=1.

service-A
 \- program (uid=1000, container1, rlimit_nproc=1)
 \- program (uid=1000, container2, rlimit_nproc=1)

The service-A sets RLIMIT_NPROC=1 and runs the program in container1. When the
service-A tries to run a program with RLIMIT_NPROC=1 in container2 it fails
since user X already has one running process.

The problem is not that the limit from container1 affects container2. The
problem is that the limit is verified against the global counter that reflects
the number of processes in all containers.

This problem can be worked around by using different users for each container
but in this case we face a different problem of uid mapping when transferring
files from one container to another.

Eric W. Biederman mentioned this issue [2][3].

Introduced changes
------------------
To address the problem, we bind rlimit counters to user namespace. Each counter
reflects the number of processes in a given uid in a given user namespace. The
result is a tree of rlimit counters with the biggest value at the root (aka
init_user_ns). The limit is considered exceeded if it's exceeded up in the tree.

[1]: https://lore.kernel.org/containers/87imd2incs.fsf@x220.int.ebiederm.org/
[2]: https://lists.linuxfoundation.org/pipermail/containers/2020-August/042096.html
[3]: https://lists.linuxfoundation.org/pipermail/containers/2020-October/042524.html

Changelog
---------
v11:
* Revert most of changes in signal.c to fix performance issues and remove
  unnecessary memory allocations.
* Fixed issue found by lkp robot (again).

v10:
* Fixed memory leak in __sigqueue_alloc.
* Handled an unlikely situation when all consumers will return ucounts at once.
* Addressed other review comments from Eric W. Biederman.

v9:
* Used a negative value to check that the ucounts->count is close to overflow.
* Rebased onto v5.12-rc4.

v8:
* Used atomic_t for ucounts reference counting. Also added counter overflow
  check (thanks to Linus Torvalds for the idea).
* Fixed other issues found by lkp-tests project in the patch that Reimplements
  RLIMIT_MEMLOCK on top of ucounts.

v7:
* Fixed issues found by lkp-tests project in the patch that Reimplements
  RLIMIT_MEMLOCK on top of ucounts.

v6:
* Fixed issues found by lkp-tests project.
* Rebased onto v5.11.

v5:
* Split the first commit into two commits: change ucounts.count type to
  atomic_long_t and add ucounts to cred. These commits were merged by mistake
  during the rebase.
* The __get_ucounts() renamed to alloc_ucounts().
* The cred.ucounts update has been moved from commit_creds() as it did not
  allow errors to be handled.
* Added error handling of set_cred_ucounts().

v4:
* Reverted the type change of ucounts.count to refcount_t.
* Fixed typo in the kernel/cred.c

v3:
* Added get_ucounts() function to increase the reference count. The existing
  get_counts() function renamed to __get_ucounts().
* The type of ucounts.count changed from atomic_t to refcount_t.
* Dropped 'const' from set_cred_ucounts() arguments.
* Fixed a bug with freeing the cred structure after calling cred_alloc_blank().
* Commit messages have been updated.
* Added selftest.

v2:
* RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and RLIMIT_MSGQUEUE are migrated to ucounts.
* Added ucounts for pair uid and user namespace into cred.
* Added the ability to increase ucount by more than 1.

v1:
* After discussion with Eric W. Biederman, I increased the size of ucounts to
  atomic_long_t.
* Added ucount_max to avoid the fork bomb.

--

Alexey Gladkov (9):
  Increase size of ucounts to atomic_long_t
  Add a reference to ucounts for each cred
  Use atomic_t for ucounts reference counting
  Reimplement RLIMIT_NPROC on top of ucounts
  Reimplement RLIMIT_MSGQUEUE on top of ucounts
  Reimplement RLIMIT_SIGPENDING on top of ucounts
  Reimplement RLIMIT_MEMLOCK on top of ucounts
  kselftests: Add test to check for rlimit changes in different user namespaces
  ucounts: Set ucount_max to the largest positive value the type can hold

 fs/exec.c                                     |   6 +-
 fs/hugetlbfs/inode.c                          |  16 +-
 fs/proc/array.c                               |   2 +-
 include/linux/cred.h                          |   4 +
 include/linux/hugetlb.h                       |   4 +-
 include/linux/mm.h                            |   4 +-
 include/linux/sched/user.h                    |   7 -
 include/linux/shmem_fs.h                      |   2 +-
 include/linux/signal_types.h                  |   4 +-
 include/linux/user_namespace.h                |  31 +++-
 ipc/mqueue.c                                  |  40 ++---
 ipc/shm.c                                     |  26 +--
 kernel/cred.c                                 |  50 +++++-
 kernel/exit.c                                 |   2 +-
 kernel/fork.c                                 |  18 +-
 kernel/signal.c                               |  25 +--
 kernel/sys.c                                  |  14 +-
 kernel/ucount.c                               | 116 ++++++++++---
 kernel/user.c                                 |   3 -
 kernel/user_namespace.c                       |   9 +-
 mm/memfd.c                                    |   4 +-
 mm/mlock.c                                    |  22 ++-
 mm/mmap.c                                     |   4 +-
 mm/shmem.c                                    |  10 +-
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/rlimits/.gitignore    |   2 +
 tools/testing/selftests/rlimits/Makefile      |   6 +
 tools/testing/selftests/rlimits/config        |   1 +
 .../selftests/rlimits/rlimits-per-userns.c    | 161 ++++++++++++++++++
 29 files changed, 467 insertions(+), 127 deletions(-)
 create mode 100644 tools/testing/selftests/rlimits/.gitignore
 create mode 100644 tools/testing/selftests/rlimits/Makefile
 create mode 100644 tools/testing/selftests/rlimits/config
 create mode 100644 tools/testing/selftests/rlimits/rlimits-per-userns.c

--
2.29.3

^ permalink raw reply	[flat|nested] 10+ messages in thread
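The "tree of rlimit counters" described in the cover letter can be sketched in plain C. This is a simplified user-space model under stated assumptions, not the code from the series: struct counter_node, model_inc() and model_dec() are hypothetical names, each node carries its own cap, and locking, atomics and overflow handling are left out. The LONG_MAX return value on failure follows the check visible in patch 5/9.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical model of one (uid, user namespace) counter. */
struct counter_node {
	struct counter_node *parent;	/* counter one level up, NULL at the root */
	long count;			/* current usage at this level */
	long max;			/* cap that applies at this level */
};

/*
 * Add v at every level from node up to the root. Returns the new value at
 * the starting level, or LONG_MAX if any level ended up above its cap.
 * The charge is not undone here; the caller is expected to call
 * model_dec() after a failure.
 */
static long model_inc(struct counter_node *node, long v)
{
	struct counter_node *it;
	long ret = 0;

	for (it = node; it != NULL; it = it->parent) {
		it->count += v;
		if (it->count > it->max)
			ret = LONG_MAX;
		else if (it == node)
			ret = it->count;
	}
	return ret;
}

/* Remove v from every level from node up to the root. */
static void model_dec(struct counter_node *node, long v)
{
	struct counter_node *it;

	for (it = node; it != NULL; it = it->parent)
		it->count -= v;
}
```

With a root node capped at, say, 100 and two child nodes capped at 1 each (standing in for container1 and container2), model_inc() succeeds once in each child, and the root then reads 2. A second increment in either child returns LONG_MAX and has to be rolled back with model_dec(). That is the service-A scenario from the Problem section, with the limit checked per container rather than against one global count.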
* [PATCH v11 5/9] Reimplement RLIMIT_MSGQUEUE on top of ucounts
  2021-04-22 12:27 [PATCH v11 0/9] Count rlimits in each user namespace legion
@ 2021-04-22 12:27 ` legion
  0 siblings, 0 replies; 10+ messages in thread
From: legion @ 2021-04-22 12:27 UTC (permalink / raw)
  To: LKML, Kernel Hardening, Linux Containers, linux-mm
  Cc: Alexey Gladkov, Andrew Morton, Christian Brauner,
	Eric W . Biederman, Jann Horn, Jens Axboe, Kees Cook,
	Linus Torvalds, Oleg Nesterov

From: Alexey Gladkov <legion@kernel.org>

The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 include/linux/sched/user.h     |  4 ----
 include/linux/user_namespace.h |  1 +
 ipc/mqueue.c                   | 40 ++++++++++++++++++----------------
 kernel/fork.c                  |  1 +
 kernel/ucount.c                |  1 +
 kernel/user_namespace.c        |  1 +
 6 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
index d33d867ad6c1..8a34446681aa 100644
--- a/include/linux/sched/user.h
+++ b/include/linux/sched/user.h
@@ -18,10 +18,6 @@ struct user_struct {
 #endif
 #ifdef CONFIG_EPOLL
 	atomic_long_t epoll_watches; /* The number of file descriptors currently watched */
-#endif
-#ifdef CONFIG_POSIX_MQUEUE
-	/* protected by mq_lock	*/
-	unsigned long mq_bytes;	/* How many bytes can be allocated to mqueue? */
 #endif
 	unsigned long locked_shm; /* How many pages of mlocked shm ? */
 	unsigned long unix_inflight;	/* How many files in flight in unix sockets */
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index d5bb4abb8f3e..21ad1ad1b990 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -51,6 +51,7 @@ enum ucount_type {
 	UCOUNT_INOTIFY_WATCHES,
 #endif
 	UCOUNT_RLIMIT_NPROC,
+	UCOUNT_RLIMIT_MSGQUEUE,
 	UCOUNT_COUNTS,
 };
 
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 8031464ed4ae..461fcf8c873d 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -144,7 +144,7 @@ struct mqueue_inode_info {
 	struct pid *notify_owner;
 	u32 notify_self_exec_id;
 	struct user_namespace *notify_user_ns;
-	struct user_struct *user;	/* user who created, for accounting */
+	struct ucounts *ucounts;	/* user who created, for accounting */
 	struct sock *notify_sock;
 	struct sk_buff *notify_cookie;
 
@@ -292,7 +292,6 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		struct ipc_namespace *ipc_ns, umode_t mode,
 		struct mq_attr *attr)
 {
-	struct user_struct *u = current_user();
 	struct inode *inode;
 	int ret = -ENOMEM;
 
@@ -321,7 +320,7 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		info->notify_owner = NULL;
 		info->notify_user_ns = NULL;
 		info->qsize = 0;
-		info->user = NULL;	/* set when all is ok */
+		info->ucounts = NULL;	/* set when all is ok */
 		info->msg_tree = RB_ROOT;
 		info->msg_tree_rightmost = NULL;
 		info->node_cache = NULL;
@@ -371,19 +370,23 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		if (mq_bytes + mq_treesize < mq_bytes)
 			goto out_inode;
 		mq_bytes += mq_treesize;
-		spin_lock(&mq_lock);
-		if (u->mq_bytes + mq_bytes < u->mq_bytes ||
-		    u->mq_bytes + mq_bytes > rlimit(RLIMIT_MSGQUEUE)) {
+		info->ucounts = get_ucounts(current_ucounts());
+		if (info->ucounts) {
+			long msgqueue;
+
+			spin_lock(&mq_lock);
+			msgqueue = inc_rlimit_ucounts(info->ucounts, UCOUNT_RLIMIT_MSGQUEUE, mq_bytes);
+			if (msgqueue == LONG_MAX || msgqueue > rlimit(RLIMIT_MSGQUEUE)) {
+				dec_rlimit_ucounts(info->ucounts, UCOUNT_RLIMIT_MSGQUEUE, mq_bytes);
+				spin_unlock(&mq_lock);
+				put_ucounts(info->ucounts);
+				info->ucounts = NULL;
+				/* mqueue_evict_inode() releases info->messages */
+				ret = -EMFILE;
+				goto out_inode;
+			}
 			spin_unlock(&mq_lock);
-			/* mqueue_evict_inode() releases info->messages */
-			ret = -EMFILE;
-			goto out_inode;
 		}
-		u->mq_bytes += mq_bytes;
-		spin_unlock(&mq_lock);
-
-		/* all is ok */
-		info->user = get_uid(u);
 	} else if (S_ISDIR(mode)) {
 		inc_nlink(inode);
 		/* Some things misbehave if size == 0 on a directory */
@@ -497,7 +500,6 @@ static void mqueue_free_inode(struct inode *inode)
 static void mqueue_evict_inode(struct inode *inode)
 {
 	struct mqueue_inode_info *info;
-	struct user_struct *user;
 	struct ipc_namespace *ipc_ns;
 	struct msg_msg *msg, *nmsg;
 	LIST_HEAD(tmp_msg);
@@ -520,8 +522,7 @@ static void mqueue_evict_inode(struct inode *inode)
 		free_msg(msg);
 	}
 
-	user = info->user;
-	if (user) {
+	if (info->ucounts) {
 		unsigned long mq_bytes, mq_treesize;
 
 		/* Total amount of bytes accounted for the mqueue */
@@ -533,7 +534,7 @@ static void mqueue_evict_inode(struct inode *inode)
 					  info->attr.mq_msgsize);
 
 		spin_lock(&mq_lock);
-		user->mq_bytes -= mq_bytes;
+		dec_rlimit_ucounts(info->ucounts, UCOUNT_RLIMIT_MSGQUEUE, mq_bytes);
 		/*
 		 * get_ns_from_inode() ensures that the
 		 * (ipc_ns = sb->s_fs_info) is either a valid ipc_ns
@@ -543,7 +544,8 @@ static void mqueue_evict_inode(struct inode *inode)
 		if (ipc_ns)
 			ipc_ns->mq_queues_count--;
 		spin_unlock(&mq_lock);
-		free_uid(user);
+		put_ucounts(info->ucounts);
+		info->ucounts = NULL;
 	}
 	if (ipc_ns)
 		put_ipc_ns(ipc_ns);
diff --git a/kernel/fork.c b/kernel/fork.c
index d8a4956463ae..85c6094f5a48 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -823,6 +823,7 @@ void __init fork_init(void)
 		init_user_ns.ucount_max[i] = max_threads/2;
 
 	init_user_ns.ucount_max[UCOUNT_RLIMIT_NPROC] = task_rlimit(&init_task, RLIMIT_NPROC);
+	init_user_ns.ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = task_rlimit(&init_task, RLIMIT_MSGQUEUE);
 
 #ifdef CONFIG_VMAP_STACK
 	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 6caa56f7dec8..6e6f936a5963 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -80,6 +80,7 @@ static struct ctl_table user_table[] = {
 	UCOUNT_ENTRY("max_inotify_instances"),
 	UCOUNT_ENTRY("max_inotify_watches"),
 #endif
+	{ },
 	{ },
 	{ }
 };
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2434b13b02e5..cc90d5203acf 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -122,6 +122,7 @@ int create_user_ns(struct cred *new)
 		ns->ucount_max[i] = INT_MAX;
 	}
 	ns->ucount_max[UCOUNT_RLIMIT_NPROC] = rlimit(RLIMIT_NPROC);
+	ns->ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = rlimit(RLIMIT_MSGQUEUE);
 	ns->ucounts = ucounts;
 
 	/* Inherit USERNS_SETGROUPS_ALLOWED from our parent */
--
2.29.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread
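The accounting pattern in the mqueue_get_inode() hunk (wraparound check, charge, compare against the limit, undo on failure) can be tried out in user space. The following is a rough sketch only: model_mq_account() and its parameters are hypothetical, the counter is a plain unsigned long instead of a ucounts entry, and there is no locking. The error codes mirror the ones visible in the hunk (ret starts as -ENOMEM and becomes -EMFILE when the limit is hit).

```c
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical model of the queue accounting step.
 *   counter     - bytes already charged to the user (stand-in for ucounts)
 *   mq_bytes    - payload bytes for the new queue
 *   mq_treesize - bookkeeping overhead for the new queue
 *   limit       - the caller's RLIMIT_MSGQUEUE
 * Returns 0 on success or a negative errno value.
 */
static int model_mq_account(unsigned long *counter, unsigned long mq_bytes,
			    unsigned long mq_treesize, unsigned long limit)
{
	unsigned long before = *counter;

	/* Unsigned wraparound check, as in the context lines of the hunk. */
	if (mq_bytes + mq_treesize < mq_bytes)
		return -ENOMEM;
	mq_bytes += mq_treesize;

	/* Charge first, then check; undo the charge if the check fails. */
	*counter += mq_bytes;
	if (*counter < before || *counter > limit) {
		*counter -= mq_bytes;
		return -EMFILE;
	}
	return 0;
}
```

Using the defaults from the test log (10 messages of 8192 bytes, limit 819200) plus an arbitrary overhead of 1000 bytes, the first call succeeds and leaves 82920 bytes charged. A later request for a further 819200 bytes returns -EMFILE and leaves the counter unchanged, which is the error that mq_perf_tests printed as "Too many open files".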