From: Balbir Singh <bsingharora@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>,
Brent Lovelace <brent.lovelace@candelatech.com>
Cc: linux-kernel@vger.kernel.org, lizefan@huawei.com, tj@kernel.org
Subject: Re: Bisected: Kernel deadlock bug related to cgroups.
Date: Wed, 31 Aug 2016 20:33:53 +1000 [thread overview]
Message-ID: <a65eb516-b317-e035-0189-8fd725e62e9d@gmail.com> (raw)
In-Reply-To: <20160831101639.GA3919@redhat.com>
On 31/08/16 20:16, Oleg Nesterov wrote:
> On 08/30, Brent Lovelace wrote:
>>
>> I found a kernel deadlock regression bug introduced in the 4.4 kernel.
> ...
>> I bisected to this commit id:
>> ----------------------------------------------------------------------------------
>> commit c9e75f0492b248aeaa7af8991a6fc9a21506bc96
>> Author: Oleg Nesterov <oleg@redhat.com>
>> Date: Fri Nov 27 19:57:19 2015 +0100
>>
>> cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()
>
> Thanks Brent!
>
>> systemd D ffff88007dfcfd10 0 1 0 0x00000000
>> ffff88007dfcfd10 00ff88007dfcfcd8 0000000000000001 ffff88007e296f80
>> ffff88007dff8000 ffff88007dfd0000 ffffffff82bae220 ffff880036758c00
>> fffffffffffffff2 ffff88005e327b00 ffff88007dfcfd28 ffffffff816c571f
>> Call Trace:
>> [<ffffffff816c571f>] schedule+0x7a/0x8f
>> [<ffffffff81126ed2>] percpu_down_write+0xad/0xc4
>> [<ffffffff811220c8>] ? wake_up_atomic_t+0x25/0x25
>> [<ffffffff81169b4d>] __cgroup_procs_write+0x72/0x229
>> [<ffffffff8112b2eb>] ? lock_acquire+0x103/0x18f
>
> so it sleeps in wait_event() waiting for active readers, and the new
> readers will block. In particular, do_exit() will block.
>
New readers/writers both, usually this is a sign that a reader is
already holding the percpu rwsem.
>> kworker/u8:2 D ffff880036befb58 0 185 2 0x00000000
>> Workqueue: netns cleanup_net
>> ffff880036befb58 00ff880036befbd0 ffffffff00000002 ffff88007e316f80
>> ffff8800783e8000 ffff880036bf0000 ffff88005917bed0 ffff8800783e8000
>> ffffffff816c8953 ffff88005917bed8 ffff880036befb70 ffffffff816c571f
>> Call Trace:
>> [<ffffffff816c8953>] ? usleep_range+0x3a/0x3a
>> [<ffffffff816c571f>] schedule+0x7a/0x8f
>> [<ffffffff816c8982>] schedule_timeout+0x2f/0xd8
>> [<ffffffff816c9204>] ? _raw_spin_unlock_irq+0x27/0x3f
>> [<ffffffff816c8953>] ? usleep_range+0x3a/0x3a
>> [<ffffffff81129ea2>] ? trace_hardirqs_on_caller+0x16f/0x18b
>> [<ffffffff816c5f44>] do_wait_for_common+0xf0/0x127
>> [<ffffffff816c5f44>] ? do_wait_for_common+0xf0/0x127
>> [<ffffffff811101c3>] ? wake_up_q+0x42/0x42
>> [<ffffffff816c5ff2>] wait_for_common+0x36/0x50
>> [<ffffffff816c6024>] wait_for_completion+0x18/0x1a
>> [<ffffffff81106b5f>] kthread_stop+0xc8/0x217
>> [<ffffffffa06d440c>] pg_net_exit+0xbc/0x112 [pktgen]
>> [<ffffffff81604fa2>] ops_exit_list+0x3d/0x4e
>> [<ffffffff81606123>] cleanup_net+0x19f/0x234
>> [<ffffffff81101aa3>] process_one_work+0x237/0x46b
>> [<ffffffff8110216d>] worker_thread+0x1e7/0x292
>> [<ffffffff81101f86>] ? rescuer_thread+0x285/0x285
>> [<ffffffff81106ded>] kthread+0xc4/0xcc
>> [<ffffffff81106d29>] ? kthread_parkme+0x1f/0x1f
>> [<ffffffff816c9d6f>] ret_from_fork+0x3f/0x70
>> [<ffffffff81106d29>] ? kthread_parkme+0x1f/0x1f
>> 3 locks held by kworker/u8:2/185:
>> #0: ("%s""netns"){.+.+.+}, at: [<ffffffff811019ad>]
>> process_one_work+0x141/0x46b
>> #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff811019ad>]
>> process_one_work+0x141/0x46b
>> #2: (net_mutex){+.+.+.}, at: [<ffffffff81605ffe>] cleanup_net+0x7a/0x234
>
> Note that it sleeps with net_mutex held. Probably waiting for kpktgend_*
> below.
>
>> vsftpd D ffff880054867c68 0 4352 2611 0x00000000
>> ffff880054867c68 00ff88005933a480 ffff880000000000 ffff88007e216f80
>> ffff88005933a480 ffff880054868000 0000000000000246 ffff880054867cc0
>> ffff88005933a480 ffffffff81cea268 ffff880054867c80 ffffffff816c571f
>> Call Trace:
>> [<ffffffff816c571f>] schedule+0x7a/0x8f
>> [<ffffffff816c59b1>] schedule_preempt_disabled+0x10/0x19
>> [<ffffffff816c64c0>] mutex_lock_nested+0x1c0/0x3a0
>> [<ffffffff81606233>] ? copy_net_ns+0x7b/0xf8
>> [<ffffffff81606233>] copy_net_ns+0x7b/0xf8
>> [<ffffffff81606233>] ? copy_net_ns+0x7b/0xf8
>> [<ffffffff811073c9>] create_new_namespaces+0xfc/0x16b
>> [<ffffffff8110759c>] copy_namespaces+0x164/0x186
>> [<ffffffff810ea6b1>] copy_process+0x10d2/0x195d
>> [<ffffffff810eb094>] _do_fork+0x8c/0x2fb
>> [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
>> [<ffffffff810eb375>] SyS_clone+0x14/0x16
>> [<ffffffff816c99b6>] entry_SYSCALL_64_fastpath+0x16/0x76
>> 2 locks held by vsftpd/4352:
>> #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff810e9b97>]
>> copy_process+0x5b8/0x195d
>> #1: (net_mutex){+.+.+.}, at: [<ffffffff81606233>] copy_net_ns+0x7b/0xf8
>
> This waits for net_mutex held by kworker/u8:2 above. And with
> cgroup_threadgroup_rwsem acquired for reading, that is why systemd
> above hangs.
>
>> kpktgend_0 D ffff88005917bce8 0 4354 2 0x00000000
>> ffff88005917bce8 00ffffffa06d5d06 ffff880000000000 ffff88007e216f80
>> ffff88007a4ec900 ffff88005917c000 ffff88007a4ec900 ffffffffa06d5d06
>> ffff88005917bed0 0000000000000000 ffff88005917bd00 ffffffff816c571f
>> Call Trace:
>> [<ffffffffa06d5d06>] ? pg_net_init+0x346/0x346 [pktgen]
>> [<ffffffff816c571f>] schedule+0x7a/0x8f
>> [<ffffffff816c85ee>] rwsem_down_read_failed+0xdc/0xf8
>> [<ffffffff8136d634>] call_rwsem_down_read_failed+0x14/0x30
>> [<ffffffff8136d634>] ? call_rwsem_down_read_failed+0x14/0x30
>> [<ffffffff810f91e0>] ? exit_signals+0x17/0x103
>> [<ffffffff81126dc4>] ? percpu_down_read+0x4d/0x5f
>> [<ffffffff810f91e0>] exit_signals+0x17/0x103
>> [<ffffffff810ed342>] do_exit+0x105/0x9a4
>> [<ffffffffa06d5d06>] ? pg_net_init+0x346/0x346 [pktgen]
>> [<ffffffff81106df5>] kthread+0xcc/0xcc
>> [<ffffffff81106d29>] ? kthread_parkme+0x1f/0x1f
>> [<ffffffff816c9d6f>] ret_from_fork+0x3f/0x70
>> [<ffffffff81106d29>] ? kthread_parkme+0x1f/0x1f
>> 1 lock held by kpktgend_0/4354:
>> #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff810f91e0>]
>
> it can't take cgroup_threadgroup_rwsem for reading, so it can't exit,
> and that is why kworker/u8:2 hangs.
>
>> kpktgend_1 D ffff88007a4e3ce8 0 4355 2 0x00000000
> ...
>> kpktgend_2 D ffff8800549f7ce8 0 4356 2 0x00000000
> ...
>> kpktgend_3 D ffff88005e2b7ce8 0 4357 2 0x00000000
> ...
>
> The same.
>
> Could you try the recent 568ac888215c7fb2fab "cgroup: reduce read
> locked section of cgroup_threadgroup_rwsem during fork" patch?
> Attached below.
>
>
> With this patch copy_net_ns() should be called outside of
> cgroup_threadgroup_rwsem, the deadlock should hopefully go away.
>
> Thanks,
Yes, I would be interested in seeing if this race goes away. Thanks for
the pointer Oleg!
Balbir Singh.
next prev parent reply other threads:[~2016-08-31 10:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-30 18:39 Bisected: Kernel deadlock bug related to cgroups Brent Lovelace
2016-08-31 10:16 ` Oleg Nesterov
2016-08-31 10:33 ` Balbir Singh [this message]
2016-08-31 23:19 ` Brent Lovelace
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a65eb516-b317-e035-0189-8fd725e62e9d@gmail.com \
--to=bsingharora@gmail.com \
--cc=brent.lovelace@candelatech.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=oleg@redhat.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.