From: Petr Mladek <pmladek@suse.com>
To: Topi Miettinen <toiwoton@gmail.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Tejun Heo <tj@kernel.org>, lkml <linux-kernel@vger.kernel.org>,
luto@kernel.org, Kees Cook <keescook@chromium.org>,
Jonathan Corbet <corbet@lwn.net>, Li Zefan <lizefan@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Serge Hallyn <serge.hallyn@canonical.com>,
James Morris <james.l.morris@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Howells <dhowells@redhat.com>,
David Woodhouse <David.Woodhouse@intel.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
"open list:CAPABILITIES" <linux-security-module@vger.kernel.org>
Subject: Re: [PATCH] capabilities: add capability cgroup controller
Date: Fri, 8 Jul 2016 11:13:32 +0200 [thread overview]
Message-ID: <20160708091332.GD3556@pathway.suse.cz> (raw)
In-Reply-To: <e81401b8-a731-9c22-f5a3-ce25eefb1471@gmail.com>
On Thu 2016-07-07 20:27:13, Topi Miettinen wrote:
> On 07/07/16 09:16, Petr Mladek wrote:
> > On Sun 2016-07-03 15:08:07, Topi Miettinen wrote:
> >> The attached patch would make any uses of capabilities generate audit
> >> messages. It works for simple tests as you can see from the commit
> >> message, but unfortunately the call to audit_cgroup_list() deadlocks the
> >> system when booting a full blown OS. There's no deadlock when the call
> >> is removed.
> >>
> >> I guess that in some cases, cgroup_mutex and/or css_set_lock could be
> >> already held earlier before entering audit_cgroup_list(). Holding the
> >> locks is however required by task_cgroup_from_root(). Is there any way
> >> to avoid this? For example, only print some kind of cgroup ID numbers
> >> (are there unique and stable IDs, available without locks?) for those
> >> cgroups where the task is registered in the audit message?
> >
> > I am not sure if anyone know what really happens here. I suggest to
> > enable lockdep. It might detect possible deadlock even before it
> > really happens, see Documentation/locking/lockdep-design.txt
> >
> > It can be enabled by
> >
> > CONFIG_PROVE_LOCKING=y
> >
> > It depends on
> >
> > CONFIG_DEBUG_KERNEL=y
> >
> > and maybe some more options, see lib/Kconfig.debug
>
> Thanks a lot! I caught this stack dump:
>
> starting version 230
> [ 3.416647] ------------[ cut here ]------------
> [ 3.417310] WARNING: CPU: 0 PID: 95 at
> /home/topi/d/linux.git/kernel/locking/lockdep.c:2871
> lockdep_trace_alloc+0xb4/0xc0
> [ 3.417605] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
> [ 3.417923] Modules linked in:
> [ 3.418288] CPU: 0 PID: 95 Comm: systemd-udevd Not tainted 4.7.0-rc5+ #97
> [ 3.418444] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Debian-1.8.2-1 04/01/2014
> [ 3.418726] 0000000000000086 000000007970f3b0 ffff88000016fb00
> ffffffff813c9c45
> [ 3.418993] ffff88000016fb50 0000000000000000 ffff88000016fb40
> ffffffff81091e9b
> [ 3.419176] 00000b3705e2c798 0000000000000046 0000000000000410
> 00000000ffffffff
> [ 3.419374] Call Trace:
> [ 3.419511] [<ffffffff813c9c45>] dump_stack+0x67/0x92
> [ 3.419644] [<ffffffff81091e9b>] __warn+0xcb/0xf0
> [ 3.419745] [<ffffffff81091f1f>] warn_slowpath_fmt+0x5f/0x80
> [ 3.419868] [<ffffffff810e9a84>] lockdep_trace_alloc+0xb4/0xc0
> [ 3.419988] [<ffffffff8120dc42>] kmem_cache_alloc_node+0x42/0x600
> [ 3.420156] [<ffffffff8110432d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [ 3.420170] [<ffffffff8163183b>] __alloc_skb+0x5b/0x1d0
> [ 3.420170] [<ffffffff81144f6b>] audit_log_start+0x29b/0x480
> [ 3.420170] [<ffffffff810a2925>] ? __lock_task_sighand+0x95/0x270
> [ 3.420170] [<ffffffff81145cc9>] audit_log_cap_use+0x39/0xf0
> [ 3.420170] [<ffffffff8109cd75>] ns_capable+0x45/0x70
> [ 3.420170] [<ffffffff8109cdb7>] capable+0x17/0x20
> [ 3.420170] [<ffffffff812a2f50>] oom_score_adj_write+0x150/0x2f0
> [ 3.420170] [<ffffffff81230997>] __vfs_write+0x37/0x160
> [ 3.420170] [<ffffffff810e33b7>] ? update_fast_ctr+0x17/0x30
> [ 3.420170] [<ffffffff810e3449>] ? percpu_down_read+0x49/0x90
> [ 3.420170] [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
> [ 3.420170] [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
> [ 3.420170] [<ffffffff81231048>] vfs_write+0xb8/0x1b0
> [ 3.420170] [<ffffffff812533c6>] ? __fget_light+0x66/0x90
> [ 3.420170] [<ffffffff81232078>] SyS_write+0x58/0xc0
> [ 3.420170] [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300
> [ 3.420170] [<ffffffff81849c9a>] entry_SYSCALL64_slow_path+0x25/0x25
> [ 3.420170] ---[ end trace fb586899fb556a5e ]---
> [ 3.447922] random: systemd-udevd urandom read with 3 bits of entropy
> available
> [ 4.014078] clocksource: Switched to clocksource tsc
> Begin: Loading essential drivers ... done.
>
> This is with qemu and the boot continues normally. With real computer,
> there's no such output and system just seems to freeze.
>
> Could it be possible that the deadlock happens because there's some IO
> towards /sys/fs/cgroup, which causes a capability check and that in turn
> causes locking problems when we try to print cgroup list?
The above warning is printed by the code from
kernel/locking/lockdep.c:2871
static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags)
{
[...]
/* We're only interested __GFP_FS allocations for now */
if (!(gfp_mask & __GFP_FS))
return;
/*
* Oi! Can't be having __GFP_FS allocations with IRQs disabled.
*/
if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
return;
The backtrace shows that your new audit_log_cap_use() is called
from vfs_write(). You might try to use audit_log_start() with
GFP_NOFS instead of GFP_KERNEL.
Note that this is rather intuitive advice. I still need to learn a lot
about memory management and kernel in general to be more sure about
a correct solution.
Best Regards,
Petr
next prev parent reply other threads:[~2016-07-08 9:13 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-23 15:07 [PATCH] capabilities: add capability cgroup controller Topi Miettinen
2016-06-23 21:03 ` Kees Cook
2016-06-23 21:38 ` Tejun Heo
2016-06-24 0:22 ` Topi Miettinen
2016-06-24 15:48 ` Tejun Heo
2016-06-24 15:59 ` Serge E. Hallyn
2016-06-24 16:35 ` Tejun Heo
2016-06-24 16:59 ` Serge E. Hallyn
2016-06-24 17:21 ` Eric W. Biederman
2016-06-24 17:39 ` Serge E. Hallyn
2016-06-26 19:03 ` Topi Miettinen
2016-06-28 4:57 ` Eric W. Biederman
2016-07-02 11:20 ` Topi Miettinen
2016-06-24 17:24 ` Tejun Heo
2016-06-26 19:14 ` Topi Miettinen
2016-06-26 22:26 ` Tejun Heo
2016-06-27 14:54 ` Serge E. Hallyn
2016-06-27 19:10 ` Topi Miettinen
2016-06-27 19:17 ` Tejun Heo
2016-06-27 19:49 ` Serge E. Hallyn
2016-07-03 15:08 ` Topi Miettinen
2016-07-03 16:13 ` [PATCH] capabilities: audit capability use kbuild test robot
2016-07-07 9:16 ` [PATCH] capabilities: add capability cgroup controller Petr Mladek
2016-07-07 20:27 ` Topi Miettinen
2016-07-08 9:13 ` Petr Mladek [this message]
2016-07-09 16:38 ` Topi Miettinen
2016-07-10 9:04 ` Topi Miettinen
2016-06-23 23:46 ` Andrew Morton
2016-06-24 1:14 ` Topi Miettinen
2016-06-24 4:15 ` Andy Lutomirski
2016-06-25 18:00 ` Djalal Harouni
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160708091332.GD3556@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=David.Woodhouse@intel.com \
--cc=akpm@linux-foundation.org \
--cc=ard.biesheuvel@linaro.org \
--cc=cgroups@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=dhowells@redhat.com \
--cc=ebiederm@xmission.com \
--cc=hannes@cmpxchg.org \
--cc=james.l.morris@oracle.com \
--cc=keescook@chromium.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-security-module@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=luto@kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=serge.hallyn@canonical.com \
--cc=serge@hallyn.com \
--cc=tj@kernel.org \
--cc=toiwoton@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).