From: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
Willy Tarreau <w@1wt.eu>,
Vegard Nossum
<vegard.nossum-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
socketpair-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
Tetsuo Handa
<penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>,
Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>,
Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 1/2] pipe: check limits only when increasing pipe capacity
Date: Fri, 19 Aug 2016 17:07:26 +1200
Message-ID: <88c159a1-0d4f-c776-90b1-5532b7fd4616@gmail.com>
In-Reply-To: <86c85cff-7fee-cded-386a-e1d518573dda-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Andrew,
thanks for picking up this patch series in -mm. Please drop it.
After discussions with Vegard, I have something better now.
Cheers,
Michael
On 08/16/2016 11:10 PM, Michael Kerrisk (man-pages) wrote:
> When changing a pipe's capacity with fcntl(F_SETPIPE_SZ), various
> limits defined by /proc/sys/fs/pipe-* files are checked to see
> if unprivileged users are exceeding limits on memory consumption.
>
> While documenting and testing the operation of these limits I noticed
> that, as currently implemented, these checks can lead to cases where
> a user can increase a pipe's capacity and is then unable to decrease
> the capacity. The origin of the problem is two-fold:
>
> (1) When increasing the pipe capacity, the checks against the limits
> in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against
> existing consumption, and exclude the memory required for the
> increased pipe capacity. The new increase in pipe capacity
> can then push the total memory used by the user for pipes
> (possibly far) over a limit.
>
> (2) The limit checks are performed even when the new pipe capacity
> is less than the existing pipe capacity. This can lead to
> problems if a user sets a large pipe capacity, and then the
> limits are lowered, with the result that the user will no
> longer be able to decrease the pipe capacity.
>
> The simple solution given by this patch is to perform the checks
> only when the pipe capacity is being increased. The patch does not
> address the broken check in (1), which allows a user to (one-time)
> set a pipe capacity that pushes the user's consumption over the user
> pipe limits. A change to fix that check is proposed in a subsequent
> patch. I've separated the two fixes because the second fix is a
> little more complex, and could possibly (though this is unlikely)
> break existing user-space. The current patch implements the simple fix
> that carries little risk and seems obviously correct: allowing an
> unprivileged user always to decrease a pipe's capacity.
>
> The program below can be used to demonstrate the problem, and the
> effect of the fix. The program takes one or more command-line
> arguments. The first argument specifies the number of pipes
> that the program should create. The remaining arguments are,
> alternately, pipe capacities that should be set using
> fcntl(F_SETPIPE_SZ), and sleep intervals (in seconds) between
> the fcntl() operations. (The sleep intervals make it possible
> to change the limits between fcntl() operations.)
>
> Running this program on an unpatched kernel, we first set some limits:
>
> # getconf PAGESIZE
> 4096
> # echo 0 > /proc/sys/fs/pipe-user-pages-soft
> # echo 1000000000 > /proc/sys/fs/pipe-max-size
> # echo 10000 > /proc/sys/fs/pipe-user-pages-hard # 40.96 MB
>
> Now perform two fcntl(F_SETPIPE_SZ) operations on a single pipe:
> first set a pipe capacity (10MB), then sleep for a few seconds,
> during which time the hard limit is lowered, and then set the pipe
> capacity to a smaller amount (5MB):
>
> # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 &
> [1] 748
> # Loop 1: set pipe capacity to 10000000 bytes
> F_SETPIPE_SZ returned 16777216
> Sleeping 15 seconds
>
> # echo 1000 > /proc/sys/fs/pipe-user-pages-hard # 4.096 MB
>
> # Loop 2: set pipe capacity to 5000000 bytes
> Loop 2, pipe 0: F_SETPIPE_SZ failed: fcntl: Operation not permitted
>
> In this case, the user should be able to lower the pipe capacity.
>
> With a kernel that has the patch below, the second fcntl()
> succeeds:
>
> # echo 0 > /proc/sys/fs/pipe-user-pages-soft
> # echo 1000000000 > /proc/sys/fs/pipe-max-size
> # echo 10000 > /proc/sys/fs/pipe-user-pages-hard
> # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 &
> [1] 3215
> # Loop 1: set pipe capacity to 10000000 bytes
> F_SETPIPE_SZ returned 16777216
> Sleeping 15 seconds
>
> # echo 1000 > /proc/sys/fs/pipe-user-pages-hard
>
> # Loop 2: set pipe capacity to 5000000 bytes
> F_SETPIPE_SZ returned 8388608
>
> 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
>
> /* test_F_SETPIPE_SZ.c
>
> (C) 2016, Michael Kerrisk; licensed under GNU GPL version 2 or later
>
> Test operation of fcntl(F_SETPIPE_SZ) for setting pipe capacity
> and interactions with limits defined by /proc/sys/fs/pipe-* files.
> */
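>
> #define _GNU_SOURCE             /* For F_SETPIPE_SZ */
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <fcntl.h>
>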
> int
> main(int argc, char *argv[])
> {
>     int (*pfd)[2];
>     int npipes;
>     int pcap, rcap;
>     int j, p, s, stime, loop;
>
>     if (argc < 2) {
>         fprintf(stderr, "Usage: %s num-pipes "
>                 "[pipe-capacity sleep-time]...\n", argv[0]);
>         exit(EXIT_FAILURE);
>     }
>
>     npipes = atoi(argv[1]);
>
>     pfd = calloc(npipes, sizeof (int [2]));
>     if (pfd == NULL) {
>         perror("calloc");
>         exit(EXIT_FAILURE);
>     }
>
>     for (j = 0; j < npipes; j++) {
>         if (pipe(pfd[j]) == -1) {
>             fprintf(stderr, "Loop %d: pipe() failed: ", j);
>             perror("pipe");
>             exit(EXIT_FAILURE);
>         }
>     }
>
>     for (j = 2; j < argc; j += 2) {
>         loop = j / 2;
>         pcap = atoi(argv[j]);
>         printf(" Loop %d: set pipe capacity to %d bytes\n", loop, pcap);
>
>         for (p = 0; p < npipes; p++) {
>             s = fcntl(pfd[p][0], F_SETPIPE_SZ, pcap);
>             if (s == -1) {
>                 fprintf(stderr, " Loop %d, pipe %d: F_SETPIPE_SZ "
>                         "failed: ", loop, p);
>                 perror("fcntl");
>                 exit(EXIT_FAILURE);
>             }
>
>             if (p == 0) {
>                 printf(" F_SETPIPE_SZ returned %d\n", s);
>                 rcap = s;
>             } else {
>                 if (s != rcap) {
>                     fprintf(stderr, " Loop %d, pipe %d: F_SETPIPE_SZ "
>                             "unexpected return: %d\n", loop, p, s);
>                     exit(EXIT_FAILURE);
>                 }
>             }
>
>             stime = (j + 1 < argc) ? atoi(argv[j + 1]) : 0;
>             if (stime > 0) {
>                 printf(" Sleeping %d seconds\n", stime);
>                 sleep(stime);
>             }
>         }
>     }
>
>     exit(EXIT_SUCCESS);
> }
>
> 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
>
> Cc: Willy Tarreau <w@1wt.eu>
> Cc: Vegard Nossum <vegard.nossum-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Cc: socketpair-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
> Cc: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
> Cc: Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>
> Cc: Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Signed-off-by: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> fs/pipe.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/fs/pipe.c b/fs/pipe.c
> index 4ebe6b2..a98ebca 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -1122,14 +1122,23 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
> if (!nr_pages)
> goto out;
>
> - if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
> - ret = -EPERM;
> - goto out;
> - } else if ((too_many_pipe_buffers_hard(pipe->user) ||
> - too_many_pipe_buffers_soft(pipe->user)) &&
> - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
> - ret = -EPERM;
> - goto out;
> + /*
> + * If trying to increase the pipe capacity, check that an
> + * unprivileged user is not trying to exceed various limits.
> + * (Decreasing the pipe capacity is always permitted, even
> + * if the user is currently over a limit.)
> + */
> + if (nr_pages > pipe->buffers) {
> + if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
> + ret = -EPERM;
> + goto out;
> + } else if ((too_many_pipe_buffers_hard(pipe->user) ||
> + too_many_pipe_buffers_soft(pipe->user)) &&
> + !capable(CAP_SYS_RESOURCE) &&
> + !capable(CAP_SYS_ADMIN)) {
> + ret = -EPERM;
> + goto out;
> + }
> }
> ret = pipe_set_size(pipe, nr_pages);
> break;
>
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Thread overview: 12+ messages
2016-08-16 11:10 [PATCH 1/2] pipe: check limits only when increasing pipe capacity Michael Kerrisk (man-pages)
2016-08-16 11:14 ` [PATCH 2/2] pipe: make pipe user buffer limit checks more precise Michael Kerrisk (man-pages)
2016-08-16 12:07 ` Vegard Nossum
2016-08-16 20:21 ` Michael Kerrisk (man-pages)
[not found] ` <1532b6c4-c618-348c-d36a-9679d5d5a1b4-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-08-16 22:00 ` Vegard Nossum
[not found] ` <57B38CF7.5080803-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-08-17 8:02 ` Michael Kerrisk (man-pages)
2016-08-17 19:34 ` Vegard Nossum
[not found] ` <57B4BC5B.9050405-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-08-17 19:41 ` Michael Kerrisk (man-pages)
[not found] ` <55f54f95-f614-179e-db4b-912adf2199bb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-08-17 19:51 ` Vegard Nossum
[not found] ` <db82480c-7956-b89d-1f4e-ba2c94f4067e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-08-19 5:07 ` Michael Kerrisk (man-pages)
[not found] ` <86c85cff-7fee-cded-386a-e1d518573dda-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-08-16 11:55 ` [PATCH 1/2] pipe: check limits only when increasing pipe capacity Vegard Nossum
2016-08-19 5:07 ` Michael Kerrisk (man-pages) [this message]