From: Andrey Wagin <avagin@gmail.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>,
"Serge E. Hallyn" <serge@hallyn.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Kees Cook <keescook@chromium.org>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH] [RFC] mnt: restrict a number of "struct mnt"
Date: Thu, 20 Jun 2013 01:35:32 +0400
Message-ID: <20130619213532.GA31165@gmail.com>
In-Reply-To: <CANaxB-zvPMZe932nAeyOCbdHmmXa76f12BRv0X0W7F0ULZMkTA@mail.gmail.com>
On Tue, Jun 18, 2013 at 02:56:51AM +0400, Andrey Wagin wrote:
> 2013/6/17 Eric W. Biederman <ebiederm@xmission.com>:
> > So for anyone seriously worried about this kind of thing in general we
> > already have the memory control group, which is quite capable of
> > limiting this kind of thing,
>
> > and it limits all memory allocations not just mount.
>
> And that is the problem: we can't limit a particular slab. Let's
> imagine a real container with 4 GB of RAM. What is a reasonable kernel
> memory limit for it? I set it to 64 MB (that may not be enough for a
> real CT, but it is enough to make the host inaccessible for some minutes).
>
> $ mkdir /sys/fs/cgroup/memory/test
> $ echo $((64 << 20)) > /sys/fs/cgroup/memory/test/memory.kmem.limit_in_bytes
> $ unshare -m
> $ echo $$ > /sys/fs/cgroup/memory/test/tasks
> $ mount --make-rprivate /
> $ mount -t tmpfs xxx /mnt
> $ mount --make-shared /mnt
> $ time bash -c 'set -m; for i in `seq 30`; do mount --bind /mnt
> `mktemp -d /mnt/test.XXXXXX` & done; for i in `seq 30`; do wait;
> done'
> real 0m23.141s
> user 0m0.016s
> sys 0m22.881s
>
> While the last script is running, nobody can read /proc/mounts or
> mount anything. I don't think that users of other containers will
> be glad. This problem is not so significant compared with unmounting
> this tree.
>
> $ strace -T umount -l /mnt
> umount("/mnt", MNT_DETACH) = 0 <548.898244>
> The host is inaccessible: it writes soft lockup messages to the
> kernel log and eats 100% CPU.
Eric, do you agree that:
* This is a problem
* We currently have no mechanism to prevent this problem
* We need to find a way to prevent this problem
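
Since the concern here is the number of mounts, a quick way to watch that number may help. This is only a sketch: count_mounts is a hypothetical helper name, and it assumes a mountinfo-style file with one line per mount, such as /proc/self/mountinfo. Note that reading that file may itself stall while the reproducer above is running.

```shell
#!/bin/sh
# count_mounts [FILE]: print the number of mount entries listed in FILE.
# FILE defaults to /proc/self/mountinfo, which has one line per mount
# in the caller's mount namespace. (Hypothetical helper, not part of
# any existing tool.)
count_mounts() {
    # Redirecting into wc prints only the count, without a file name;
    # tr strips the padding that some wc implementations add.
    wc -l < "${1:-/proc/self/mountinfo}" | tr -d ' '
}
```

Running count_mounts with no argument inside the unshared namespace, before and after the bind-mount loop, shows how fast the table grows.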
>
>
> >
> > Is there some reason we want to go down the path of adding and tuning
> > static limits all over the kernel? As opposed to streamlining the memory
> > control group so it is low overhead and everyone that cares can use it?
>
> The memory control group doesn't help in this case... I need to look
> at this code in more detail; maybe we can limit the depth of nested
> mount points.
The complexity of the umount algorithm does not depend on the depth of
nested mounts; it depends on the number of mounts, and in some cases the
complexity is O(n^2).
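For example:

#!/bin/bash
# $1 is the number of bind mounts created by each loop
mount -t tmpfs xxx /mnt
mount --make-shared /mnt
mkdir /mnt/tmp
mount -t tmpfs xxx /mnt/tmp
mkdir /mnt/d
# bind /mnt/tmp onto $1 fresh directories under /mnt/d
for ((i = 0; i < $1; i++)); do
    d=`mktemp -d /mnt/d/xxx.XXXXXX`
    mount --bind /mnt/tmp $d || break
done
mkdir /mnt/tmp/d
# bind /mnt/tmp/d onto $1 fresh directories inside /mnt/tmp
for ((i = 0; i < $1; i++)); do
    d=`mktemp -d /mnt/tmp/xxx.XXXXXX`
    mount --bind /mnt/tmp/d $d || break
done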
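
To put the O(n^2) claim in numbers, here is a back-of-the-envelope sketch. It assumes a simplified cost model in which unmounting each of n mounts needs a walk over up to n other mounts. This is an assumption for illustration, not a count of what the kernel actually does, and quadratic_steps is a hypothetical name.

```shell
#!/bin/sh
# quadratic_steps N: print N*N, the step count under the simplified
# model "each of N mounts triggers a walk over up to N mounts".
# (Illustrative assumption only, not measured kernel behaviour.)
quadratic_steps() {
    echo $(( $1 * $1 ))
}

# Doubling the number of mounts quadruples the modelled work.
for n in 100 200 400; do
    echo "$n mounts -> $(quadratic_steps "$n") steps"
done
```

Under this model, going from 100 to 400 mounts raises the work from 10000 to 160000 steps, which is why the total number of mounts matters rather than the nesting depth.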

perf data for umount -l /mnt:

    29.60%  dbus-daemon  [kernel.kallsyms]  [k] __ticket_spin_lock
            |
            --- __ticket_spin_lock
                lg_local_lock
                path_init
                path_openat
                do_filp_open
                do_sys_open
                SyS_openat
                system_call_fastpath
                __openat64_nocancel
                0x747379732f312d73

    20.20%  umount  [kernel.kallsyms]  [k] propagation_next
            |
            --- propagation_next
               |
               |--65.35%-- umount_tree
               |          SyS_umount
               |          system_call_fastpath
               |          __umount2
               |          __libc_start_main
               |
                --34.65%-- propagate_umount
                          umount_tree
                          SyS_umount
                          system_call_fastpath
                          __umount2
                          __libc_start_main

    17.81%  umount  [kernel.kallsyms]  [k] __lookup_mnt
            |
            --- __lookup_mnt
               |
               |--82.78%-- propagate_umount
               |          umount_tree
               |          SyS_umount
               |          system_call_fastpath
               |          __umount2
               |          __libc_start_main
               |
                --17.22%-- umount_tree
                          SyS_umount
                          system_call_fastpath
                          __umount2
                          __libc_start_main