* [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task()
@ 2014-09-22 18:36 Kirill Tkhai
2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai
` (5 more replies)
0 siblings, 6 replies; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai
From: Kirill Tkhai <ktkhai@parallels.com>
Combine two branches which do the same.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
---
kernel/sched/fair.c | 16 ++--------------
1 file changed, 2 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2a1e6ac..420bc98 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5292,24 +5292,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
if (!tsk_cache_hot)
tsk_cache_hot = migrate_degrades_locality(p, env);
- if (migrate_improves_locality(p, env)) {
-#ifdef CONFIG_SCHEDSTATS
+ if (migrate_improves_locality(p, env) || !tsk_cache_hot ||
+ env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
if (tsk_cache_hot) {
schedstat_inc(env->sd, lb_hot_gained[env->idle]);
schedstat_inc(p, se.statistics.nr_forced_migrations);
}
-#endif
- return 1;
- }
-
- if (!tsk_cache_hot ||
- env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
-
- if (tsk_cache_hot) {
- schedstat_inc(env->sd, lb_hot_gained[env->idle]);
- schedstat_inc(p, se.statistics.nr_forced_migrations);
- }
-
return 1;
}
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai @ 2014-09-22 18:36 ` Kirill Tkhai 2014-09-23 15:06 ` Peter Zijlstra 2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai ` (4 subsequent siblings) 5 siblings, 1 reply; 18+ messages in thread From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai From: Kirill Tkhai <ktkhai@parallels.com> Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, may pull a task when it's in the middle of schedule(). CPU1(task1 calls schedule) CPU2 ... schedule() ... idle_balance() ... load_balance() ... ... schedule() ... prepare_lock_switch() ... raw_spin_unlock(&rq1->lock) ... ... raw_spin_lock(&rq1->lock) ... detach_tasks(); ... can_migrate_task(task1) ... attach_tasks(); <--- move task1 to rq2 ... raw_spin_unlock(&rq1->lock) ... context_switch() <--- switch to task1's stack ... ... (using task1's stack) (using task1's stack) ... ... context_switch() ... Parallel use of a single stack is not a good idea. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Cc: <stable@vger.kernel.org> # Should this go to stable? --- kernel/sched/core.c | 11 +++-------- kernel/sched/deadline.c | 7 ++++++- kernel/sched/fair.c | 3 +++ kernel/sched/rt.c | 7 ++++++- kernel/sched/sched.h | 16 ++++++++++++++++ 5 files changed, 34 insertions(+), 10 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2a93b87..5b864e9 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1700,15 +1700,10 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) #ifdef CONFIG_SMP /* - * If the owning (remote) cpu is still in the middle of schedule() with - * this task as prev, wait until its done referencing the task. + * Note, that p is dequeued at the moment. But it still + * may be "prev" in the middle of schedule() on other cpu. */ - while (p->on_cpu) - cpu_relax(); - /* - * Pairs with the smp_wmb() in finish_lock_switch(). - */ - smp_rmb(); + cpu_relax__while_on_cpu(p); p->sched_contributes_to_load = !!task_contributes_to_load(p); p->state = TASK_WAKING; diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index aaa5abb..ea0ba33 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1364,7 +1364,9 @@ static int push_dl_task(struct rq *rq) next_task = task; goto retry; } - +#ifdef __ARCH_WANT_UNLOCKED_CTXSW + cpu_relax__while_on_cpu(next_task); +#endif deactivate_task(rq, next_task, 0); set_task_cpu(next_task, later_rq->cpu); activate_task(later_rq, next_task, 0); @@ -1451,6 +1453,9 @@ static int pull_dl_task(struct rq *this_rq) ret = 1; +#ifdef __ARCH_WANT_UNLOCKED_CTXSW + cpu_relax__while_on_cpu(p); +#endif deactivate_task(src_rq, p, 0); set_task_cpu(p, this_cpu); activate_task(this_rq, p, 0); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 420bc98..80c5064 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5298,6 +5298,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env) schedstat_inc(env->sd, lb_hot_gained[env->idle]); schedstat_inc(p, se.statistics.nr_forced_migrations); } +#ifdef __ARCH_WANT_UNLOCKED_CTXSW + cpu_relax__while_on_cpu(p); +#endif return 1; } diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 2e6a774..de356b0 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1734,7 +1734,9 @@ static int push_rt_task(struct rq *rq) next_task = task; goto retry; } - +#ifdef __ARCH_WANT_UNLOCKED_CTXSW + cpu_relax__while_on_cpu(next_task); +#endif deactivate_task(rq, next_task, 0); set_task_cpu(next_task, lowest_rq->cpu); activate_task(lowest_rq, next_task, 0); @@ -1823,6 +1825,9 @@ static int pull_rt_task(struct rq *this_rq) ret = 1; +#ifdef __ARCH_WANT_UNLOCKED_CTXSW + cpu_relax__while_on_cpu(p); +#endif deactivate_task(src_rq, p, 0); set_task_cpu(p, this_cpu); activate_task(this_rq, p, 0); diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 1bc6aad..9c07d72 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1034,6 +1034,22 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) #endif /* __ARCH_WANT_UNLOCKED_CTXSW */ /* + * If the owning (remote) cpu is still in the middle of schedule() with + * this task as prev, wait until its done referencing the task. + */ +static inline void cpu_relax__while_on_cpu(struct task_struct *p) +{ +#ifdef CONFIG_SMP + while (p->on_cpu) + cpu_relax(); + /* + * Pairs with the smp_wmb() in finish_lock_switch(). + */ + smp_rmb(); +#endif +} + +/* * wake flags */ #define WF_SYNC 0x01 /* waker goes to sleep after wakeup */ ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW 2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai @ 2014-09-23 15:06 ` Peter Zijlstra 0 siblings, 0 replies; 18+ messages in thread From: Peter Zijlstra @ 2014-09-23 15:06 UTC (permalink / raw) To: Kirill Tkhai Cc: linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, linux, ralf, tony.luck, linux-mips, linux-ia64 On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: > From: Kirill Tkhai <ktkhai@parallels.com> > > Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, > may pull a task when it's in the middle of schedule(). > > CPU1(task1 calls schedule) CPU2 > ... schedule() > ... idle_balance() > ... load_balance() > ... ... > schedule() ... > prepare_lock_switch() ... > raw_spin_unlock(&rq1->lock) ... > ... raw_spin_lock(&rq1->lock) > ... detach_tasks(); > ... can_migrate_task(task1) > ... attach_tasks(); <--- move task1 to rq2 > ... raw_spin_unlock(&rq1->lock) > ... context_switch() <--- switch to task1's stack > ... ... > (using task1's stack) (using task1's stack) > ... ... > context_switch() ... > > > Parallel use of a single stack is not a good idea. Indeed it is, but how about we do this instead? --- Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW Kirill found that there's a subtle race in the __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the entire exception because neither arch that uses it seems to actually still require it. Boot tested on mips64el (qemu) only. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- arch/ia64/include/asm/processor.h | 1 - arch/mips/include/asm/processor.h | 6 ------ kernel/sched/core.c | 6 ------ kernel/sched/sched.h | 30 ------------------------------ 4 files changed, 43 deletions(-) diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h index c736713..ce53c50 100644 --- a/arch/ia64/include/asm/processor.h +++ b/arch/ia64/include/asm/processor.h @@ -19,7 +19,6 @@ #include <asm/ptrace.h> #include <asm/ustack.h> -#define __ARCH_WANT_UNLOCKED_CTXSW #define ARCH_HAS_PREFETCH_SWITCH_STACK #define IA64_NUM_PHYS_STACK_REG 96 diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index 05f0843..f1df4cb 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p); #define ARCH_HAS_PREFETCHW #define prefetchw(x) __builtin_prefetch((x), 1, 1) -/* - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP - * systems. - */ -#define __ARCH_WANT_UNLOCKED_CTXSW - #endif #endif /* _ASM_PROCESSOR_H */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2a93b87..ccbafb0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) */ post_schedule(rq); -#ifdef __ARCH_WANT_UNLOCKED_CTXSW - /* In this case, finish_task_switch does not reenable preemption */ - preempt_enable(); -#endif if (current->set_child_tid) put_user(task_pid_vnr(current), current->set_child_tid); } @@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev, * of the scheduler it's an obvious special-case), so we * do an early lockdep release here: */ -#ifndef __ARCH_WANT_UNLOCKED_CTXSW spin_release(&rq->lock.dep_map, 1, _THIS_IP_); -#endif context_tracking_task_switch(prev, next); /* Here we just switch the register state and the stack. */ diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 1bc6aad..d87f122 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p) # define finish_arch_post_lock_switch() do { } while (0) #endif -#ifndef __ARCH_WANT_UNLOCKED_CTXSW static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) { #ifdef CONFIG_SMP @@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) raw_spin_unlock_irq(&rq->lock); } -#else /* __ARCH_WANT_UNLOCKED_CTXSW */ -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) -{ -#ifdef CONFIG_SMP - /* - * We can optimise this out completely for !SMP, because the - * SMP rebalancing from interrupt is the only thing that cares - * here. - */ - next->on_cpu = 1; -#endif - raw_spin_unlock(&rq->lock); -} - -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) -{ -#ifdef CONFIG_SMP - /* - * After ->on_cpu is cleared, the task can be moved to a different CPU. - * We must ensure this doesn't happen until the switch is completely - * finished. - */ - smp_wmb(); - prev->on_cpu = 0; -#endif - local_irq_enable(); -} -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */ - /* * wake flags */ ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW @ 2014-09-23 15:06 ` Peter Zijlstra 0 siblings, 0 replies; 18+ messages in thread From: Peter Zijlstra @ 2014-09-23 15:06 UTC (permalink / raw) To: Kirill Tkhai Cc: linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, linux, ralf, tony.luck, linux-mips, linux-ia64 On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: > From: Kirill Tkhai <ktkhai@parallels.com> > > Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, > may pull a task when it's in the middle of schedule(). > > CPU1(task1 calls schedule) CPU2 > ... schedule() > ... idle_balance() > ... load_balance() > ... ... > schedule() ... > prepare_lock_switch() ... > raw_spin_unlock(&rq1->lock) ... > ... raw_spin_lock(&rq1->lock) > ... detach_tasks(); > ... can_migrate_task(task1) > ... attach_tasks(); <--- move task1 to rq2 > ... raw_spin_unlock(&rq1->lock) > ... context_switch() <--- switch to task1's stack > ... ... > (using task1's stack) (using task1's stack) > ... ... > context_switch() ... > > > Parallel use of a single stack is not a good idea. Indeed it is, but how about we do this instead? --- Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW Kirill found that there's a subtle race in the __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the entire exception because neither arch that uses it seems to actually still require it. Boot tested on mips64el (qemu) only. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- arch/ia64/include/asm/processor.h | 1 - arch/mips/include/asm/processor.h | 6 ------ kernel/sched/core.c | 6 ------ kernel/sched/sched.h | 30 ------------------------------ 4 files changed, 43 deletions(-) diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h index c736713..ce53c50 100644 --- a/arch/ia64/include/asm/processor.h +++ b/arch/ia64/include/asm/processor.h @@ -19,7 +19,6 @@ #include <asm/ptrace.h> #include <asm/ustack.h> -#define __ARCH_WANT_UNLOCKED_CTXSW #define ARCH_HAS_PREFETCH_SWITCH_STACK #define IA64_NUM_PHYS_STACK_REG 96 diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index 05f0843..f1df4cb 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p); #define ARCH_HAS_PREFETCHW #define prefetchw(x) __builtin_prefetch((x), 1, 1) -/* - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP - * systems. - */ -#define __ARCH_WANT_UNLOCKED_CTXSW - #endif #endif /* _ASM_PROCESSOR_H */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2a93b87..ccbafb0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) */ post_schedule(rq); -#ifdef __ARCH_WANT_UNLOCKED_CTXSW - /* In this case, finish_task_switch does not reenable preemption */ - preempt_enable(); -#endif if (current->set_child_tid) put_user(task_pid_vnr(current), current->set_child_tid); } @@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev, * of the scheduler it's an obvious special-case), so we * do an early lockdep release here: */ -#ifndef __ARCH_WANT_UNLOCKED_CTXSW spin_release(&rq->lock.dep_map, 1, _THIS_IP_); -#endif context_tracking_task_switch(prev, next); /* Here we just switch the register state and the stack. */ diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 1bc6aad..d87f122 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p) # define finish_arch_post_lock_switch() do { } while (0) #endif -#ifndef __ARCH_WANT_UNLOCKED_CTXSW static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) { #ifdef CONFIG_SMP @@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) raw_spin_unlock_irq(&rq->lock); } -#else /* __ARCH_WANT_UNLOCKED_CTXSW */ -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) -{ -#ifdef CONFIG_SMP - /* - * We can optimise this out completely for !SMP, because the - * SMP rebalancing from interrupt is the only thing that cares - * here. - */ - next->on_cpu = 1; -#endif - raw_spin_unlock(&rq->lock); -} - -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) -{ -#ifdef CONFIG_SMP - /* - * After ->on_cpu is cleared, the task can be moved to a different CPU. - * We must ensure this doesn't happen until the switch is completely - * finished. - */ - smp_wmb(); - prev->on_cpu = 0; -#endif - local_irq_enable(); -} -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */ - /* * wake flags */ ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW 2014-09-23 15:06 ` Peter Zijlstra @ 2014-09-23 15:16 ` Kirill Tkhai -1 siblings, 0 replies; 18+ messages in thread From: Kirill Tkhai @ 2014-09-23 15:16 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Kirill Tkhai, oleg@redhat.com, linux@roeck-us.net, ralf@linux-mips.org, tony.luck@intel.com, linux-mips@linux-mips.org, linux-ia64@vger.kernel.org 23.09.2014, 19:06, "Peter Zijlstra" <peterz@infradead.org>: > On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: >> šFrom: Kirill Tkhai <ktkhai@parallels.com> >> >> šArchitectures, which define __ARCH_WANT_UNLOCKED_CTXSW, >> šmay pull a task when it's in the middle of schedule(). >> >> šCPU1(task1 calls schedule) šššššššššššCPU2 >> š... ššššššššššššššššššššššššššššššššššschedule() >> š... šššššššššššššššššššššššššššššššššššššidle_balance() >> š... ššššššššššššššššššššššššššššššššššššššššload_balance() >> š... ššššššššššššššššššššššššššššššššššššššššššš... >> šschedule() šššššššššššššššššššššššššššššššššššš... >> ššššprepare_lock_switch() šššššššššššššššššššššš... >> šššššššraw_spin_unlock(&rq1->lock) ššššššššššššš... >> ššššššš... šššššššššššššššššššššššššššššššššššššraw_spin_lock(&rq1->lock) >> ššššššš... ššššššššššššššššššššššššššššššššššššššššdetach_tasks(); >> ššššššš... šššššššššššššššššššššššššššššššššššššššššššcan_migrate_task(task1) >> ššššššš... ššššššššššššššššššššššššššššššššššššššššattach_tasks(); <--- move task1 to rq2 >> ššššššš... šššššššššššššššššššššššššššššššššššššraw_spin_unlock(&rq1->lock) >> ššššššš... šššššššššššššššššššššššššššššššcontext_switch() <--- switch to task1's stack >> ššššššš... ššššššššššššššššššššššššššššššš... >> šššš(using task1's stack) šššššššššššššššš(using task1's stack) >> šššš... šššššššššššššššššššššššššššššššššš... >> ššššcontext_switch() ššššššššššššššššššššš... >> >> šParallel use of a single stack is not a good idea. > > Indeed it is, but how about we do this instead? Completely agree, looks good for me. > --- > Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW > > Kirill found that there's a subtle race in the > __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the > entire exception because neither arch that uses it seems to actually > still require it. > > Boot tested on mips64el (qemu) only. > > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: Guenter Roeck <linux@roeck-us.net> > Cc: Ralf Baechle <ralf@linux-mips.org> > Cc: Tony Luck <tony.luck@intel.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kirill Tkhai <ktkhai@parallels.com> (one more review of ia64 part) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW @ 2014-09-23 15:16 ` Kirill Tkhai 0 siblings, 0 replies; 18+ messages in thread From: Kirill Tkhai @ 2014-09-23 15:16 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Kirill Tkhai, oleg@redhat.com, linux@roeck-us.net, ralf@linux-mips.org, tony.luck@intel.com, linux-mips@linux-mips.org, linux-ia64@vger.kernel.org 23.09.2014, 19:06, "Peter Zijlstra" <peterz@infradead.org>: > On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: >> From: Kirill Tkhai <ktkhai@parallels.com> >> >> Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, >> may pull a task when it's in the middle of schedule(). >> >> CPU1(task1 calls schedule) CPU2 >> ... schedule() >> ... idle_balance() >> ... load_balance() >> ... ... >> schedule() ... >> prepare_lock_switch() ... >> raw_spin_unlock(&rq1->lock) ... >> ... raw_spin_lock(&rq1->lock) >> ... detach_tasks(); >> ... can_migrate_task(task1) >> ... attach_tasks(); <--- move task1 to rq2 >> ... raw_spin_unlock(&rq1->lock) >> ... context_switch() <--- switch to task1's stack >> ... ... >> (using task1's stack) (using task1's stack) >> ... ... >> context_switch() ... >> >> Parallel use of a single stack is not a good idea. > > Indeed it is, but how about we do this instead? Completely agree, looks good for me. > --- > Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW > > Kirill found that there's a subtle race in the > __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the > entire exception because neither arch that uses it seems to actually > still require it. > > Boot tested on mips64el (qemu) only. > > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: Guenter Roeck <linux@roeck-us.net> > Cc: Ralf Baechle <ralf@linux-mips.org> > Cc: Tony Luck <tony.luck@intel.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kirill Tkhai <ktkhai@parallels.com> (one more review of ia64 part) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW 2014-09-23 15:06 ` Peter Zijlstra @ 2014-09-23 19:45 ` Guenter Roeck -1 siblings, 0 replies; 18+ messages in thread From: Guenter Roeck @ 2014-09-23 19:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Kirill Tkhai, linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, ralf, tony.luck, linux-mips, linux-ia64 On Tue, Sep 23, 2014 at 05:06:41PM +0200, Peter Zijlstra wrote: > On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: > > From: Kirill Tkhai <ktkhai@parallels.com> > > > > Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, > > may pull a task when it's in the middle of schedule(). > > > > CPU1(task1 calls schedule) CPU2 > > ... schedule() > > ... idle_balance() > > ... load_balance() > > ... ... > > schedule() ... > > prepare_lock_switch() ... > > raw_spin_unlock(&rq1->lock) ... > > ... raw_spin_lock(&rq1->lock) > > ... detach_tasks(); > > ... can_migrate_task(task1) > > ... attach_tasks(); <--- move task1 to rq2 > > ... raw_spin_unlock(&rq1->lock) > > ... context_switch() <--- switch to task1's stack > > ... ... > > (using task1's stack) (using task1's stack) > > ... ... > > context_switch() ... > > > > > > Parallel use of a single stack is not a good idea. > > Indeed it is, but how about we do this instead? > > --- > Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW > > Kirill found that there's a subtle race in the > __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the > entire exception because neither arch that uses it seems to actually > still require it. > > Boot tested on mips64el (qemu) only. > > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: Guenter Roeck <linux@roeck-us.net> > Cc: Ralf Baechle <ralf@linux-mips.org> > Cc: Tony Luck <tony.luck@intel.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> > --- > arch/ia64/include/asm/processor.h | 1 - > arch/mips/include/asm/processor.h | 6 ------ > kernel/sched/core.c | 6 ------ > kernel/sched/sched.h | 30 ------------------------------ You might also want to clean up any references to the define if you remove it. Documentation/scheduler/sched-arch.txt:you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file arch/blackfin/kernel/entry.S: * since Blackfin does not define __ARCH_WANT_UNLOCKED_CTXSW, so Guenter > 4 files changed, 43 deletions(-) > > diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h > index c736713..ce53c50 100644 > --- a/arch/ia64/include/asm/processor.h > +++ b/arch/ia64/include/asm/processor.h > @@ -19,7 +19,6 @@ > #include <asm/ptrace.h> > #include <asm/ustack.h> > > -#define __ARCH_WANT_UNLOCKED_CTXSW > #define ARCH_HAS_PREFETCH_SWITCH_STACK > > #define IA64_NUM_PHYS_STACK_REG 96 > diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h > index 05f0843..f1df4cb 100644 > --- a/arch/mips/include/asm/processor.h > +++ b/arch/mips/include/asm/processor.h > @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p); > #define ARCH_HAS_PREFETCHW > #define prefetchw(x) __builtin_prefetch((x), 1, 1) > > -/* > - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP > - * systems. > - */ > -#define __ARCH_WANT_UNLOCKED_CTXSW > - > #endif > > #endif /* _ASM_PROCESSOR_H */ > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 2a93b87..ccbafb0 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) > */ > post_schedule(rq); > > -#ifdef __ARCH_WANT_UNLOCKED_CTXSW > - /* In this case, finish_task_switch does not reenable preemption */ > - preempt_enable(); > -#endif > if (current->set_child_tid) > put_user(task_pid_vnr(current), current->set_child_tid); > } > @@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev, > * of the scheduler it's an obvious special-case), so we > * do an early lockdep release here: > */ > -#ifndef __ARCH_WANT_UNLOCKED_CTXSW > spin_release(&rq->lock.dep_map, 1, _THIS_IP_); > -#endif > > context_tracking_task_switch(prev, next); > /* Here we just switch the register state and the stack. */ > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h > index 1bc6aad..d87f122 100644 > --- a/kernel/sched/sched.h > +++ b/kernel/sched/sched.h > @@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p) > # define finish_arch_post_lock_switch() do { } while (0) > #endif > > -#ifndef __ARCH_WANT_UNLOCKED_CTXSW > static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) > { > #ifdef CONFIG_SMP > @@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) > raw_spin_unlock_irq(&rq->lock); > } > > -#else /* __ARCH_WANT_UNLOCKED_CTXSW */ > -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) > -{ > -#ifdef CONFIG_SMP > - /* > - * We can optimise this out completely for !SMP, because the > - * SMP rebalancing from interrupt is the only thing that cares > - * here. > - */ > - next->on_cpu = 1; > -#endif > - raw_spin_unlock(&rq->lock); > -} > - > -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) > -{ > -#ifdef CONFIG_SMP > - /* > - * After ->on_cpu is cleared, the task can be moved to a different CPU. > - * We must ensure this doesn't happen until the switch is completely > - * finished. > - */ > - smp_wmb(); > - prev->on_cpu = 0; > -#endif > - local_irq_enable(); > -} > -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */ > - > /* > * wake flags > */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW @ 2014-09-23 19:45 ` Guenter Roeck 0 siblings, 0 replies; 18+ messages in thread From: Guenter Roeck @ 2014-09-23 19:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Kirill Tkhai, linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, ralf, tony.luck, linux-mips, linux-ia64 On Tue, Sep 23, 2014 at 05:06:41PM +0200, Peter Zijlstra wrote: > On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote: > > From: Kirill Tkhai <ktkhai@parallels.com> > > > > Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW, > > may pull a task when it's in the middle of schedule(). > > > > CPU1(task1 calls schedule) CPU2 > > ... schedule() > > ... idle_balance() > > ... load_balance() > > ... ... > > schedule() ... > > prepare_lock_switch() ... > > raw_spin_unlock(&rq1->lock) ... > > ... raw_spin_lock(&rq1->lock) > > ... detach_tasks(); > > ... can_migrate_task(task1) > > ... attach_tasks(); <--- move task1 to rq2 > > ... raw_spin_unlock(&rq1->lock) > > ... context_switch() <--- switch to task1's stack > > ... ... > > (using task1's stack) (using task1's stack) > > ... ... > > context_switch() ... > > > > > > Parallel use of a single stack is not a good idea. > > Indeed it is, but how about we do this instead? > > --- > Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW > > Kirill found that there's a subtle race in the > __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the > entire exception because neither arch that uses it seems to actually > still require it. > > Boot tested on mips64el (qemu) only. > > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: Guenter Roeck <linux@roeck-us.net> > Cc: Ralf Baechle <ralf@linux-mips.org> > Cc: Tony Luck <tony.luck@intel.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> > --- > arch/ia64/include/asm/processor.h | 1 - > arch/mips/include/asm/processor.h | 6 ------ > kernel/sched/core.c | 6 ------ > kernel/sched/sched.h | 30 ------------------------------ You might also want to clean up any references to the define if you remove it. Documentation/scheduler/sched-arch.txt:you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file arch/blackfin/kernel/entry.S: * since Blackfin does not define __ARCH_WANT_UNLOCKED_CTXSW, so Guenter > 4 files changed, 43 deletions(-) > > diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h > index c736713..ce53c50 100644 > --- a/arch/ia64/include/asm/processor.h > +++ b/arch/ia64/include/asm/processor.h > @@ -19,7 +19,6 @@ > #include <asm/ptrace.h> > #include <asm/ustack.h> > > -#define __ARCH_WANT_UNLOCKED_CTXSW > #define ARCH_HAS_PREFETCH_SWITCH_STACK > > #define IA64_NUM_PHYS_STACK_REG 96 > diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h > index 05f0843..f1df4cb 100644 > --- a/arch/mips/include/asm/processor.h > +++ b/arch/mips/include/asm/processor.h > @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p); > #define ARCH_HAS_PREFETCHW > #define prefetchw(x) __builtin_prefetch((x), 1, 1) > > -/* > - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP > - * systems. > - */ > -#define __ARCH_WANT_UNLOCKED_CTXSW > - > #endif > > #endif /* _ASM_PROCESSOR_H */ > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 2a93b87..ccbafb0 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) > */ > post_schedule(rq); > > -#ifdef __ARCH_WANT_UNLOCKED_CTXSW > - /* In this case, finish_task_switch does not reenable preemption */ > - preempt_enable(); > -#endif > if (current->set_child_tid) > put_user(task_pid_vnr(current), current->set_child_tid); > } > @@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev, > * of the scheduler it's an obvious special-case), so we > * do an early lockdep release here: > */ > -#ifndef __ARCH_WANT_UNLOCKED_CTXSW > spin_release(&rq->lock.dep_map, 1, _THIS_IP_); > -#endif > > context_tracking_task_switch(prev, next); > /* Here we just switch the register state and the stack. */ > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h > index 1bc6aad..d87f122 100644 > --- a/kernel/sched/sched.h > +++ b/kernel/sched/sched.h > @@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p) > # define finish_arch_post_lock_switch() do { } while (0) > #endif > > -#ifndef __ARCH_WANT_UNLOCKED_CTXSW > static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) > { > #ifdef CONFIG_SMP > @@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) > raw_spin_unlock_irq(&rq->lock); > } > > -#else /* __ARCH_WANT_UNLOCKED_CTXSW */ > -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) > -{ > -#ifdef CONFIG_SMP > - /* > - * We can optimise this out completely for !SMP, because the > - * SMP rebalancing from interrupt is the only thing that cares > - * here. > - */ > - next->on_cpu = 1; > -#endif > - raw_spin_unlock(&rq->lock); > -} > - > -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) > -{ > -#ifdef CONFIG_SMP > - /* > - * After ->on_cpu is cleared, the task can be moved to a different CPU. > - * We must ensure this doesn't happen until the switch is completely > - * finished. > - */ > - smp_wmb(); > - prev->on_cpu = 0; > -#endif > - local_irq_enable(); > -} > -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */ > - > /* > * wake flags > */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* [tip:sched/core] sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW 2014-09-23 15:06 ` Peter Zijlstra ` (2 preceding siblings ...) (?) @ 2014-09-24 14:56 ` tip-bot for Peter Zijlstra -1 siblings, 0 replies; 18+ messages in thread From: tip-bot for Peter Zijlstra @ 2014-09-24 14:56 UTC (permalink / raw) To: linux-tip-commits Cc: mingo, torvalds, peterz, tony.luck, ralf, akpm, tglx, davidlohr, hpa, linux-kernel, james.hogan, qais.yousef, keescook, paul.burton, fenghua.yu, tkhai Commit-ID: c55f5158f5606f8a62e694b7e009f59b92ac6258 Gitweb: http://git.kernel.org/tip/c55f5158f5606f8a62e694b7e009f59b92ac6258 Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Tue, 23 Sep 2014 17:06:41 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:05 +0200 sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW Kirill found that there's a subtle race in the __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the entire exception because neither arch that uses it seems to actually still require it. Boot tested on mips64el (qemu) only. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kirill Tkhai <tkhai@yandex.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Tony Luck <tony.luck@intel.com> Cc: oleg@redhat.com Cc: linux@roeck-us.net Cc: linux-ia64@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20140923150641.GH3312@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/ia64/include/asm/processor.h | 1 - arch/mips/include/asm/processor.h | 6 ------ kernel/sched/core.c | 6 ------ kernel/sched/sched.h | 30 ------------------------------ 4 files changed, 43 deletions(-) diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h index c736713..ce53c50 100644 --- a/arch/ia64/include/asm/processor.h +++ b/arch/ia64/include/asm/processor.h @@ -19,7 +19,6 @@ #include <asm/ptrace.h> #include <asm/ustack.h> -#define __ARCH_WANT_UNLOCKED_CTXSW #define ARCH_HAS_PREFETCH_SWITCH_STACK #define IA64_NUM_PHYS_STACK_REG 96 diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index 05f0843..f1df4cb 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p); #define ARCH_HAS_PREFETCHW #define prefetchw(x) __builtin_prefetch((x), 1, 1) -/* - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP - * systems. - */ -#define __ARCH_WANT_UNLOCKED_CTXSW - #endif #endif /* _ASM_PROCESSOR_H */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index d65566d..5b0eac9 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2331,10 +2331,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) */ post_schedule(rq); -#ifdef __ARCH_WANT_UNLOCKED_CTXSW - /* In this case, finish_task_switch does not reenable preemption */ - preempt_enable(); -#endif if (current->set_child_tid) put_user(task_pid_vnr(current), current->set_child_tid); } @@ -2377,9 +2373,7 @@ context_switch(struct rq *rq, struct task_struct *prev, * of the scheduler it's an obvious special-case), so we * do an early lockdep release here: */ -#ifndef __ARCH_WANT_UNLOCKED_CTXSW spin_release(&rq->lock.dep_map, 1, _THIS_IP_); -#endif context_tracking_task_switch(prev, next); /* Here we just switch the register state and the stack. */ diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 16e1ca9..6130251 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -975,7 +975,6 @@ static inline int task_on_rq_migrating(struct task_struct *p) # define finish_arch_post_lock_switch() do { } while (0) #endif -#ifndef __ARCH_WANT_UNLOCKED_CTXSW static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) { #ifdef CONFIG_SMP @@ -1013,35 +1012,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) raw_spin_unlock_irq(&rq->lock); } -#else /* __ARCH_WANT_UNLOCKED_CTXSW */ -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next) -{ -#ifdef CONFIG_SMP - /* - * We can optimise this out completely for !SMP, because the - * SMP rebalancing from interrupt is the only thing that cares - * here. - */ - next->on_cpu = 1; -#endif - raw_spin_unlock(&rq->lock); -} - -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) -{ -#ifdef CONFIG_SMP - /* - * After ->on_cpu is cleared, the task can be moved to a different CPU. - * We must ensure this doesn't happen until the switch is completely - * finished. - */ - smp_wmb(); - prev->on_cpu = 0; -#endif - local_irq_enable(); -} -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */ - /* * wake flags */ ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai @ 2014-09-22 18:36 ` Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai ` (3 subsequent siblings) 5 siblings, 1 reply; 18+ messages in thread From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai From: Kirill Tkhai <ktkhai@parallels.com> dl_bw_of() dereferences rq->rd which has to have RCU read lock held. Probability of use-after-free isn't zero here. Also add lockdep assert into dl_bw_cpus(). Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Cc: <stable@vger.kernel.org> # v3.14+ --- kernel/sched/core.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 5b864e9..a300fce 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1989,6 +1989,8 @@ unsigned long to_ratio(u64 period, u64 runtime) #ifdef CONFIG_SMP inline struct dl_bw *dl_bw_of(int i) { + rcu_lockdep_assert(rcu_read_lock_sched_held(), + "sched RCU must be held"); return &cpu_rq(i)->rd->dl_bw; } @@ -1997,6 +1999,8 @@ static inline int dl_bw_cpus(int i) struct root_domain *rd = cpu_rq(i)->rd; int cpus = 0; + rcu_lockdep_assert(rcu_read_lock_sched_held(), + "sched RCU must be held"); for_each_cpu_and(i, rd->span, cpu_active_mask) cpus++; @@ -7623,6 +7627,8 @@ static int sched_dl_global_constraints(void) int cpu, ret = 0; unsigned long flags; + rcu_read_lock(); + /* * Here we want to check the bandwidth not being set to some * value smaller than the currently allocated bandwidth in @@ -7644,6 +7650,8 @@ static int sched_dl_global_constraints(void) break; } + rcu_read_unlock(); + return ret; } @@ -7659,6 +7667,7 @@ static void sched_dl_do_global(void) if (global_rt_runtime() != RUNTIME_INF) new_bw = to_ratio(global_rt_period(), global_rt_runtime()); + rcu_read_lock(); /* * FIXME: As above... */ @@ -7669,6 +7678,7 @@ static void sched_dl_do_global(void) dl_b->bw = new_bw; raw_spin_unlock_irqrestore(&dl_b->lock, flags); } + rcu_read_unlock(); } static int sched_rt_global_validate(void) ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:sched/core] sched: Use dl_bw_of() under RCU read lock 2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai @ 2014-09-24 14:57 ` tip-bot for Kirill Tkhai 0 siblings, 0 replies; 18+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, paulmck, tglx Commit-ID: 66339c31bc3978d5fff9c4b4cb590a861def4db2 Gitweb: http://git.kernel.org/tip/66339c31bc3978d5fff9c4b4cb590a861def4db2 Author: Kirill Tkhai <ktkhai@parallels.com> AuthorDate: Mon, 22 Sep 2014 22:36:24 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:09 +0200 sched: Use dl_bw_of() under RCU read lock dl_bw_of() dereferences rq->rd which has to have RCU read lock held. Probability of use-after-free isn't zero here. Also add lockdep assert into dl_bw_cpus(). Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v3.14+ Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140922183624.11015.71558.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/core.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 5b0eac9..f0adb03 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2021,6 +2021,8 @@ unsigned long to_ratio(u64 period, u64 runtime) #ifdef CONFIG_SMP inline struct dl_bw *dl_bw_of(int i) { + rcu_lockdep_assert(rcu_read_lock_sched_held(), + "sched RCU must be held"); return &cpu_rq(i)->rd->dl_bw; } @@ -2029,6 +2031,8 @@ static inline int dl_bw_cpus(int i) struct root_domain *rd = cpu_rq(i)->rd; int cpus = 0; + rcu_lockdep_assert(rcu_read_lock_sched_held(), + "sched RCU must be held"); for_each_cpu_and(i, rd->span, cpu_active_mask) cpus++; @@ -7645,6 +7649,8 @@ static int sched_dl_global_constraints(void) int cpu, ret = 0; unsigned long flags; + rcu_read_lock(); + /* * Here we want to check the bandwidth not being set to some * value smaller than the currently allocated bandwidth in @@ -7666,6 +7672,8 @@ static int sched_dl_global_constraints(void) break; } + rcu_read_unlock(); + return ret; } @@ -7681,6 +7689,7 @@ static void sched_dl_do_global(void) if (global_rt_runtime() != RUNTIME_INF) new_bw = to_ratio(global_rt_period(), global_rt_runtime()); + rcu_read_lock(); /* * FIXME: As above... */ @@ -7691,6 +7700,7 @@ static void sched_dl_do_global(void) dl_b->bw = new_bw; raw_spin_unlock_irqrestore(&dl_b->lock, flags); } + rcu_read_unlock(); } static int sched_rt_global_validate(void) ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai @ 2014-09-22 18:36 ` Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] sched: cleanup: Rename 'out_unlock' to ' out_free_new_mask' tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai ` (2 subsequent siblings) 5 siblings, 1 reply; 18+ messages in thread From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai From: Kirill Tkhai <ktkhai@parallels.com> Nothing is locked there, so label's name only confuses a reader. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> --- kernel/sched/core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index a300fce..3b07710 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4003,14 +4003,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) rcu_read_lock(); if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) { rcu_read_unlock(); - goto out_unlock; + goto out_free_new_mask; } rcu_read_unlock(); } retval = security_task_setscheduler(p); if (retval) - goto out_unlock; + goto out_free_new_mask; cpuset_cpus_allowed(p, cpus_allowed); @@ -4028,7 +4028,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) { retval = -EBUSY; - goto out_unlock; + goto out_free_new_mask; } } #endif @@ -4047,7 +4047,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) goto again; } } -out_unlock: +out_free_new_mask: free_cpumask_var(new_mask); out_free_cpus_allowed: free_cpumask_var(cpus_allowed); ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:sched/core] sched: cleanup: Rename 'out_unlock' to ' out_free_new_mask' 2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai @ 2014-09-24 14:57 ` tip-bot for Kirill Tkhai 0 siblings, 0 replies; 18+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw) To: linux-tip-commits; +Cc: linux-kernel, ktkhai, hpa, mingo, peterz, tglx Commit-ID: 16303ab2fe214635240a8f57cad2cd29792d4e3b Gitweb: http://git.kernel.org/tip/16303ab2fe214635240a8f57cad2cd29792d4e3b Author: Kirill Tkhai <ktkhai@parallels.com> AuthorDate: Mon, 22 Sep 2014 22:36:30 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:10 +0200 sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' Nothing is locked there, so label's name only confuses a reader. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140922183630.11015.59500.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index f0adb03..316127a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4029,14 +4029,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) rcu_read_lock(); if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) { rcu_read_unlock(); - goto out_unlock; + goto out_free_new_mask; } rcu_read_unlock(); } retval = security_task_setscheduler(p); if (retval) - goto out_unlock; + goto out_free_new_mask; cpuset_cpus_allowed(p, cpus_allowed); @@ -4054,7 +4054,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) { retval = -EBUSY; - goto out_unlock; + goto out_free_new_mask; } } #endif @@ -4073,7 +4073,7 @@ again: goto again; } } -out_unlock: +out_free_new_mask: free_cpumask_var(new_mask); out_free_cpus_allowed: free_cpumask_var(cpus_allowed); ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai ` (2 preceding siblings ...) 2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai @ 2014-09-22 18:36 ` Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai 2014-09-24 14:56 ` [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() tip-bot for Kirill Tkhai 5 siblings, 1 reply; 18+ messages in thread From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai From: Kirill Tkhai <ktkhai@parallels.com> Probability of use-after-free isn't zero in this place. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Cc: <stable@vger.kernel.org> # v3.14+ --- kernel/sched/core.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 3b07710..643ee99 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4023,13 +4023,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) * root_domain. */ #ifdef CONFIG_SMP - if (task_has_dl_policy(p)) { - const struct cpumask *span = task_rq(p)->rd->span; - - if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) { + if (task_has_dl_policy(p) && dl_bandwidth_enabled()) { + rcu_read_lock(); + if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) { retval = -EBUSY; + rcu_read_unlock(); goto out_free_new_mask; } + rcu_read_unlock(); } #endif again: ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:sched/core] sched: Use rq->rd in sched_setaffinity() under RCU read lock 2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai @ 2014-09-24 14:57 ` tip-bot for Kirill Tkhai 0 siblings, 0 replies; 18+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, paulmck, tglx Commit-ID: f1e3a0932f3a9554371792a7daaf1e0eb19f66d5 Gitweb: http://git.kernel.org/tip/f1e3a0932f3a9554371792a7daaf1e0eb19f66d5 Author: Kirill Tkhai <ktkhai@parallels.com> AuthorDate: Mon, 22 Sep 2014 22:36:36 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:11 +0200 sched: Use rq->rd in sched_setaffinity() under RCU read lock Probability of use-after-free isn't zero in this place. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v3.14+ Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140922183636.11015.83611.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/core.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 316127a..b5349fe 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4049,13 +4049,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) * root_domain. */ #ifdef CONFIG_SMP - if (task_has_dl_policy(p)) { - const struct cpumask *span = task_rq(p)->rd->span; - - if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) { + if (task_has_dl_policy(p) && dl_bandwidth_enabled()) { + rcu_read_lock(); + if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) { retval = -EBUSY; + rcu_read_unlock(); goto out_free_new_mask; } + rcu_read_unlock(); } #endif again: ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai ` (3 preceding siblings ...) 2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai @ 2014-09-22 18:36 ` Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-24 14:56 ` [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() tip-bot for Kirill Tkhai 5 siblings, 1 reply; 18+ messages in thread From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai From: Kirill Tkhai <ktkhai@parallels.com> Some time ago PREEMPT_NEED_RESCHED was implemented, so reschedule technics is a little more difficult now. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> --- kernel/sched/rt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index de356b0..c322071 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -2077,7 +2077,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued) for_each_sched_rt_entity(rt_se) { if (rt_se->run_list.prev != rt_se->run_list.next) { requeue_task_rt(rq, p, 0); - set_tsk_need_resched(p); + resched_curr(rq); return; } } ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:sched/core] sched/rt: Use resched_curr() in task_tick_rt() 2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai @ 2014-09-24 14:57 ` tip-bot for Kirill Tkhai 0 siblings, 0 replies; 18+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, tglx Commit-ID: 8aa6f0ebf41b5fdd186276394bf07e7bd6884d94 Gitweb: http://git.kernel.org/tip/8aa6f0ebf41b5fdd186276394bf07e7bd6884d94 Author: Kirill Tkhai <ktkhai@parallels.com> AuthorDate: Mon, 22 Sep 2014 22:36:43 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:12 +0200 sched/rt: Use resched_curr() in task_tick_rt() Some time ago PREEMPT_NEED_RESCHED was implemented, so reschedule technics is a little more difficult now. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140922183642.11015.66039.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/rt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 2e6a774..87ea5bf 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -2072,7 +2072,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued) for_each_sched_rt_entity(rt_se) { if (rt_se->run_list.prev != rt_se->run_list.next) { requeue_task_rt(rq, p, 0); - set_tsk_need_resched(p); + resched_curr(rq); return; } } ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai ` (4 preceding siblings ...) 2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai @ 2014-09-24 14:56 ` tip-bot for Kirill Tkhai 5 siblings, 0 replies; 18+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-09-24 14:56 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, tglx Commit-ID: 7a96c231ca23f0f5622852307df4209afc502ec3 Gitweb: http://git.kernel.org/tip/7a96c231ca23f0f5622852307df4209afc502ec3 Author: Kirill Tkhai <ktkhai@parallels.com> AuthorDate: Mon, 22 Sep 2014 22:36:12 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Sep 2014 14:47:07 +0200 sched/fair: Remove duplicate code from can_migrate_task() Combine two branches which do the same. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140922183612.11015.64200.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/fair.c | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 8cb32f8..10a5a28 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5315,24 +5315,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env) if (!tsk_cache_hot) tsk_cache_hot = migrate_degrades_locality(p, env); - if (migrate_improves_locality(p, env)) { -#ifdef CONFIG_SCHEDSTATS + if (migrate_improves_locality(p, env) || !tsk_cache_hot || + env->sd->nr_balance_failed > env->sd->cache_nice_tries) { if (tsk_cache_hot) { schedstat_inc(env->sd, lb_hot_gained[env->idle]); schedstat_inc(p, se.statistics.nr_forced_migrations); } -#endif - return 1; - } - - if (!tsk_cache_hot || - env->sd->nr_balance_failed > env->sd->cache_nice_tries) { - - if (tsk_cache_hot) { - schedstat_inc(env->sd, lb_hot_gained[env->idle]); - schedstat_inc(p, se.statistics.nr_forced_migrations); - } - return 1; } ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-09-24 14:58 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai 2014-09-23 15:06 ` [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW Peter Zijlstra 2014-09-23 15:06 ` Peter Zijlstra 2014-09-23 15:16 ` Kirill Tkhai 2014-09-23 15:16 ` Kirill Tkhai 2014-09-23 19:45 ` Guenter Roeck 2014-09-23 19:45 ` Guenter Roeck 2014-09-24 14:56 ` [tip:sched/core] sched, mips, ia64: " tip-bot for Peter Zijlstra 2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] sched: cleanup: Rename 'out_unlock' to ' out_free_new_mask' tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai 2014-09-24 14:57 ` [tip:sched/core] " tip-bot for Kirill Tkhai 2014-09-24 14:56 ` [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() tip-bot for Kirill Tkhai
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.