From: Juri Lelli <juri.lelli@redhat.com>
To: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: peterz@infradead.org, lizefan@huawei.com, mingo@redhat.com,
rostedt@goodmis.org, claudio@evidence.eu.com, bristot@redhat.com,
tommaso.cucinotta@santannapisa.it, luca.abeni@santannapisa.it,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V3 04/10] sched/core: Prevent race condition between cpuset and __sched_setscheduler()
Date: Wed, 14 Feb 2018 12:27:21 +0100
Message-ID: <20180214112721.GT12979@localhost.localdomain>
In-Reply-To: <20180214104935.GS12979@localhost.localdomain>
On 14/02/18 11:49, Juri Lelli wrote:
> On 14/02/18 11:36, Juri Lelli wrote:
> > Hi Mathieu,
> >
> > On 13/02/18 13:32, Mathieu Poirier wrote:
> > > No synchronisation mechanism exists between the cpuset subsystem and calls
> > > to function __sched_setscheduler(). As such it is possible that new root
> > > domains are created on the cpuset side while a deadline acceptance test
> > > is carried out in __sched_setscheduler(), leading to a potential oversell
> > > of CPU bandwidth.
> > >
> > > By making available the cpuset_mutex to the core scheduler it is possible
> > > to prevent situations such as the one described above from happening.
> > >
> > > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > > ---
> >
> > [...]
> >
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index f727c3d0064c..0d8badcf1f0f 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -4176,6 +4176,13 @@ static int __sched_setscheduler(struct task_struct *p,
> > > }
> > >
> > > /*
> > > + * Make sure we don't race with the cpuset subsystem where root
> > > + * domains can be rebuilt or modified while operations like DL
> > > + * admission checks are carried out.
> > > + */
> > > + cpuset_lock();
> > > +
> > > + /*
> >
> > Mmm, I'm afraid we can't do this. __sched_setscheduler() might be called
> > from interrupt context by normalize_rt_tasks().
>
> Maybe conditionally grabbing it only when pi is true would do? I guess we
> don't care much about domains in the sysrq case.
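
To make the idea above concrete, here is a rough userspace model of the
conditional locking. It is only a sketch, not the actual patch: a pthread
mutex stands in for cpuset_mutex, setscheduler_model() and
lock_acquisitions are made-up names, and it assumes the sysrq path is the
caller that passes pi == false.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the kernel's cpuset_mutex (assumption: userspace model only). */
static pthread_mutex_t cpuset_mutex_model = PTHREAD_MUTEX_INITIALIZER;

/* Counts lock acquisitions, purely so the behaviour can be observed. */
static int lock_acquisitions;

/*
 * Hypothetical model of __sched_setscheduler(): callers passing
 * pi == false (assumed here to be the sysrq path, which may run in
 * interrupt context) skip the sleeping lock entirely.
 */
static int setscheduler_model(bool pi)
{
	if (pi) {
		pthread_mutex_lock(&cpuset_mutex_model);
		lock_acquisitions++;
	}

	/* ... DL admission checks would go here ... */

	if (pi)
		pthread_mutex_unlock(&cpuset_mutex_model);

	return 0;
}
```

Calling setscheduler_model(true) takes and releases the mutex, while
setscheduler_model(false) never touches it.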
Oops, just got this lockdep warning. :/
--->8---
[ 0.020203] ======================================================
[ 0.020946] WARNING: possible circular locking dependency detected
[ 0.021000] 4.16.0-rc1+ #64 Not tainted
[ 0.021000] ------------------------------------------------------
[ 0.021000] swapper/0/1 is trying to acquire lock:
[ 0.021000] (cpu_hotplug_lock.rw_sem){.+.+}, at: [<000000007164d41d>] smpboot_register_percpu_thread_cpumask+0x2d/0x100
[ 0.021000]
[ 0.021000] but task is already holding lock:
[ 0.021000] (cpuset_mutex){+.+.}, at: [<000000008529a52c>] __sched_setscheduler+0xb5/0x8b0
[ 0.021000]
[ 0.021000] which lock already depends on the new lock.
[ 0.021000]
[ 0.021000]
[ 0.021000] the existing dependency chain (in reverse order) is:
[ 0.021000]
[ 0.021000] -> #2 (cpuset_mutex){+.+.}:
[ 0.021000] __sched_setscheduler+0xb5/0x8b0
[ 0.021000] _sched_setscheduler+0x6c/0x80
[ 0.021000] __kthread_create_on_node+0x10e/0x170
[ 0.021000] kthread_create_on_node+0x37/0x40
[ 0.021000] kthread_create_on_cpu+0x27/0x90
[ 0.021000] __smpboot_create_thread.part.3+0x64/0xe0
[ 0.021000] smpboot_register_percpu_thread_cpumask+0x91/0x100
[ 0.021000] spawn_ksoftirqd+0x37/0x40
[ 0.021000] do_one_initcall+0x3b/0x160
[ 0.021000] kernel_init_freeable+0x118/0x258
[ 0.021000] kernel_init+0xa/0x100
[ 0.021000] ret_from_fork+0x3a/0x50
[ 0.021000]
[ 0.021000] -> #1 (smpboot_threads_lock){+.+.}:
[ 0.021000] smpboot_register_percpu_thread_cpumask+0x3b/0x100
[ 0.021000] spawn_ksoftirqd+0x37/0x40
[ 0.021000] do_one_initcall+0x3b/0x160
[ 0.021000] kernel_init_freeable+0x118/0x258
[ 0.021000] kernel_init+0xa/0x100
[ 0.021000] ret_from_fork+0x3a/0x50
[ 0.021000]
[ 0.021000] -> #0 (cpu_hotplug_lock.rw_sem){.+.+}:
[ 0.021000] cpus_read_lock+0x3e/0x80
[ 0.021000] smpboot_register_percpu_thread_cpumask+0x2d/0x100
[ 0.021000] lockup_detector_init+0x3e/0x74
[ 0.021000] kernel_init_freeable+0x146/0x258
[ 0.021000] kernel_init+0xa/0x100
[ 0.021000] ret_from_fork+0x3a/0x50
[ 0.021000]
[ 0.021000] other info that might help us debug this:
[ 0.021000]
[ 0.021000] Chain exists of:
[ 0.021000] cpu_hotplug_lock.rw_sem --> smpboot_threads_lock --> cpuset_mutex
[ 0.021000]
[ 0.021000] Possible unsafe locking scenario:
[ 0.021000]
[ 0.021000]        CPU0                    CPU1
[ 0.021000]        ----                    ----
[ 0.021000]   lock(cpuset_mutex);
[ 0.021000]                                lock(smpboot_threads_lock);
[ 0.021000]                                lock(cpuset_mutex);
[ 0.021000]   lock(cpu_hotplug_lock.rw_sem);
[ 0.021000]
[ 0.021000] *** DEADLOCK ***
[ 0.021000]
[ 0.021000] 1 lock held by swapper/0/1:
[ 0.021000] #0: (cpuset_mutex){+.+.}, at: [<000000008529a52c>] __sched_setscheduler+0xb5/0x8b0
[ 0.021000]
[ 0.021000] stack backtrace:
[ 0.021000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1+ #64
[ 0.021000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
[ 0.021000] Call Trace:
[ 0.021000] dump_stack+0x85/0xc5
[ 0.021000] print_circular_bug.isra.38+0x1ce/0x1db
[ 0.021000] __lock_acquire+0x1278/0x1320
[ 0.021000] ? sched_clock_local+0x12/0x80
[ 0.021000] ? lock_acquire+0x9f/0x1f0
[ 0.021000] lock_acquire+0x9f/0x1f0
[ 0.021000] ? smpboot_register_percpu_thread_cpumask+0x2d/0x100
[ 0.021000] cpus_read_lock+0x3e/0x80
[ 0.021000] ? smpboot_register_percpu_thread_cpumask+0x2d/0x100
[ 0.021000] smpboot_register_percpu_thread_cpumask+0x2d/0x100
[ 0.021000] ? set_debug_rodata+0x11/0x11
[ 0.021000] lockup_detector_init+0x3e/0x74
[ 0.021000] kernel_init_freeable+0x146/0x258
[ 0.021000] ? rest_init+0xd0/0xd0
[ 0.021000] kernel_init+0xa/0x100
[ 0.021000] ret_from_fork+0x3a/0x50
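
The report above boils down to a cycle in the lock-order graph. A toy
userspace model of that check, using the three locks named in the chain,
might look like the following. This is a sketch for illustration and not
how lockdep is actually implemented; the enum, dep[][] matrix,
reachable() and add_dependency() are all made-up names.

```c
#include <stdbool.h>

/* The three locks from the reported chain (illustrative identifiers). */
enum lock_id { CPU_HOTPLUG_LOCK, SMPBOOT_THREADS_LOCK, CPUSET_MUTEX, NR_LOCKS };

/* dep[a][b] is true when b has been acquired while holding a (a -> b). */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reachable(enum lock_id from, enum lock_id to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquire 'next' while holding 'held'". Returns false, and records
 * nothing, if the new edge would close a cycle in the graph.
 */
static bool add_dependency(enum lock_id held, enum lock_id next)
{
	bool seen[NR_LOCKS] = { false };

	if (reachable(next, held, seen))
		return false;
	dep[held][next] = true;
	return true;
}
```

Recording cpu_hotplug_lock -> smpboot_threads_lock and then
smpboot_threads_lock -> cpuset_mutex succeeds, but a later attempt to take
cpu_hotplug_lock while holding cpuset_mutex is rejected, which is the
situation the warning describes.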