* [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
@ 2013-11-27 15:59 Kirill Tkhai
2013-12-12 10:30 ` Kirill Tkhai
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Kirill Tkhai @ 2013-11-27 15:59 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, stable
This patch touches the RT group scheduling case.

The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
wrong, because a priority change at a child level does not guarantee that the
priority is the highest across the whole rq. So, this leak makes RT balancing
unusable.

A short example: the task with the highest priority among all of the rq's RT
tasks (no other task has the same priority) wakes up on a throttled rt_rq.
The rq's cpupri is set to the equivalent of the task's priority, but the real
rq->rt.highest_prio.curr corresponds to a lower priority.

The patch below fixes the problem.

It looks like all versions have this bug, so I CC'ed the stable mailing list.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: stable@vger.kernel.org
---
kernel/sched/rt.c | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7d57275..1c40655 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -901,6 +901,13 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && prio < prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, prio);
 }
@@ -910,6 +917,13 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && rt_rq->highest_prio.curr != prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, rt_rq->highest_prio.curr);
 }
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
2013-11-27 15:59 [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities Kirill Tkhai
@ 2013-12-12 10:30 ` Kirill Tkhai
2013-12-13 15:42 ` Peter Zijlstra
2013-12-18 10:32 ` [tip:sched/core] sched/rt: Fix rq's cpupri leak while enqueue/ dequeue " tip-bot for Kirill Tkhai
2 siblings, 0 replies; 10+ messages in thread
From: Kirill Tkhai @ 2013-12-12 10:30 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt,
stable@vger.kernel.org
Ping!
27.11.2013, 19:59, "Kirill Tkhai" <tkhai@yandex.ru>:
> This patch touches the RT group scheduling case.
>
> The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
> priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
> wrong, because a priority change at a child level does not guarantee that the
> priority is the highest across the whole rq. So, this leak makes RT balancing
> unusable.
>
> A short example: the task with the highest priority among all of the rq's RT
> tasks (no other task has the same priority) wakes up on a throttled rt_rq.
> The rq's cpupri is set to the equivalent of the task's priority, but the real
> rq->rt.highest_prio.curr corresponds to a lower priority.
>
> The patch below fixes the problem.
>
> It looks like all versions have this bug, so I CC'ed the stable mailing list.
>
> Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: Steven Rostedt <rostedt@goodmis.org>
> CC: stable@vger.kernel.org
>---
> kernel/sched/rt.c | 14 ++++++++++++++
> 1 files changed, 14 insertions(+), 0 deletions(-)
>diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>index 7d57275..1c40655 100644
>--- a/kernel/sched/rt.c
>+++ b/kernel/sched/rt.c
>@@ -901,6 +901,13 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
> {
> struct rq *rq = rq_of_rt_rq(rt_rq);
>
>+#ifdef CONFIG_RT_GROUP_SCHED
>+ /*
>+ * Change rq's cpupri only if rt_rq is the top queue.
>+ */
>+ if (&rq->rt != rt_rq)
>+ return;
>+#endif
> if (rq->online && prio < prev_prio)
> cpupri_set(&rq->rd->cpupri, rq->cpu, prio);
> }
>@@ -910,6 +917,13 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
> {
> struct rq *rq = rq_of_rt_rq(rt_rq);
>
>+#ifdef CONFIG_RT_GROUP_SCHED
>+ /*
>+ * Change rq's cpupri only if rt_rq is the top queue.
>+ */
>+ if (&rq->rt != rt_rq)
>+ return;
>+#endif
> if (rq->online && rt_rq->highest_prio.curr != prev_prio)
> cpupri_set(&rq->rd->cpupri, rq->cpu, rt_rq->highest_prio.curr);
> }
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
2013-11-27 15:59 [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities Kirill Tkhai
2013-12-12 10:30 ` Kirill Tkhai
@ 2013-12-13 15:42 ` Peter Zijlstra
2013-12-17 12:02 ` Kirill Tkhai
2013-12-18 10:32 ` [tip:sched/core] sched/rt: Fix rq's cpupri leak while enqueue/ dequeue " tip-bot for Kirill Tkhai
2 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2013-12-13 15:42 UTC (permalink / raw)
To: Kirill Tkhai
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Steven Rostedt, stable
On Wed, Nov 27, 2013 at 07:59:13PM +0400, Kirill Tkhai wrote:
> This patch touches the RT group scheduling case.
>
> The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
> priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
> wrong, because a priority change at a child level does not guarantee that the
> priority is the highest across the whole rq. So, this leak makes RT balancing
> unusable.
>
> A short example: the task with the highest priority among all of the rq's RT
> tasks (no other task has the same priority) wakes up on a throttled rt_rq.
> The rq's cpupri is set to the equivalent of the task's priority, but the real
> rq->rt.highest_prio.curr corresponds to a lower priority.
>
> The patch below fixes the problem.
>
> It looks like all versions have this bug, so I CC'ed the stable mailing list.
Yeah, I think this is right.
cpupri stuff should indeed only be changed for the top level group.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
2013-12-13 15:42 ` Peter Zijlstra
@ 2013-12-17 12:02 ` Kirill Tkhai
0 siblings, 0 replies; 10+ messages in thread
From: Kirill Tkhai @ 2013-12-17 12:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: linux-kernel@vger.kernel.org, Steven Rostedt,
stable@vger.kernel.org
13.12.2013, 19:42, "Peter Zijlstra" <peterz@infradead.org>:
> On Wed, Nov 27, 2013 at 07:59:13PM +0400, Kirill Tkhai wrote:
>
>> This patch touches the RT group scheduling case.
>>
>> The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
>> priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
>> wrong, because a priority change at a child level does not guarantee that the
>> priority is the highest across the whole rq. So, this leak makes RT balancing
>> unusable.
>>
>> A short example: the task with the highest priority among all of the rq's RT
>> tasks (no other task has the same priority) wakes up on a throttled rt_rq.
>> The rq's cpupri is set to the equivalent of the task's priority, but the real
>> rq->rt.highest_prio.curr corresponds to a lower priority.
>>
>> The patch below fixes the problem.
>>
>> It looks like all versions have this bug, so I CC'ed the stable mailing list.
>
> Yeah, I think this is right.
>
> cpupri stuff should indeed only be changed for the top level group.
Ingo, are you going to apply this patch? Or will you give any comments?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
2013-12-17 12:02 ` Kirill Tkhai
@ 2013-12-17 12:46 ` Peter Zijlstra
-1 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2013-12-17 12:46 UTC (permalink / raw)
To: Kirill Tkhai
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Steven Rostedt,
stable@vger.kernel.org
On Tue, Dec 17, 2013 at 04:02:58PM +0400, Kirill Tkhai wrote:
>
>
> 13.12.2013, 19:42, "Peter Zijlstra" <peterz@infradead.org>:
> > On Wed, Nov 27, 2013 at 07:59:13PM +0400, Kirill Tkhai wrote:
> >
> >> This patch touches the RT group scheduling case.
> >>
> >> The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
> >> priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
> >> wrong, because a priority change at a child level does not guarantee that the
> >> priority is the highest across the whole rq. So, this leak makes RT balancing
> >> unusable.
> >>
> >> A short example: the task with the highest priority among all of the rq's RT
> >> tasks (no other task has the same priority) wakes up on a throttled rt_rq.
> >> The rq's cpupri is set to the equivalent of the task's priority, but the real
> >> rq->rt.highest_prio.curr corresponds to a lower priority.
> >>
> >> The patch below fixes the problem.
> >>
> >> It looks like all versions have this bug, so I CC'ed the stable mailing list.
> >
> > Yeah, I think this is right.
> >
> > cpupri stuff should indeed only be changed for the top level group.
>
> Ingo, are you going to apply this patch? Or will you give any comments?
I queued it, Ingo should get it through me somewhere today if all things
go well.
Thanks
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
2013-12-17 12:46 ` Peter Zijlstra
@ 2013-12-17 13:08 ` Kirill Tkhai
-1 siblings, 0 replies; 10+ messages in thread
From: Kirill Tkhai @ 2013-12-17 13:08 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Steven Rostedt,
stable@vger.kernel.org
17.12.2013, 16:47, "Peter Zijlstra" <peterz@infradead.org>:
> On Tue, Dec 17, 2013 at 04:02:58PM +0400, Kirill Tkhai wrote:
>
>> 13.12.2013, 19:42, "Peter Zijlstra" <peterz@infradead.org>:
>>> On Wed, Nov 27, 2013 at 07:59:13PM +0400, Kirill Tkhai wrote:
>>>> This patch touches the RT group scheduling case.
>>>>
>>>> The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
>>>> priority, while the rt_rq passed to them may not be the top-level rt_rq. This is
>>>> wrong, because a priority change at a child level does not guarantee that the
>>>> priority is the highest across the whole rq. So, this leak makes RT balancing
>>>> unusable.
>>>>
>>>> A short example: the task with the highest priority among all of the rq's RT
>>>> tasks (no other task has the same priority) wakes up on a throttled rt_rq.
>>>> The rq's cpupri is set to the equivalent of the task's priority, but the real
>>>> rq->rt.highest_prio.curr corresponds to a lower priority.
>>>>
>>>> The patch below fixes the problem.
>>>>
>>>> It looks like all versions have this bug, so I CC'ed the stable mailing list.
>>> Yeah, I think this is right.
>>>
>>> cpupri stuff should indeed only be changed for the top level group.
>> Ingo, are you going to apply this patch? Or will you give any comments?
>
> I queued it, Ingo should get it through me somewhere today if all things
> go well.
>
> Thanks
Thanks, Peter
^ permalink raw reply [flat|nested] 10+ messages in thread
* [tip:sched/core] sched/rt: Fix rq's cpupri leak while enqueue/ dequeue child RT entities
2013-11-27 15:59 [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities Kirill Tkhai
2013-12-12 10:30 ` Kirill Tkhai
2013-12-13 15:42 ` Peter Zijlstra
@ 2013-12-18 10:32 ` tip-bot for Kirill Tkhai
2 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Kirill Tkhai @ 2013-12-18 10:32 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, peterz, tkhai, tglx
Commit-ID: 757dfcaa41844595964f1220f1d33182dae49976
Gitweb: http://git.kernel.org/tip/757dfcaa41844595964f1220f1d33182dae49976
Author: Kirill Tkhai <tkhai@yandex.ru>
AuthorDate: Wed, 27 Nov 2013 19:59:13 +0400
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 Dec 2013 15:08:44 +0100
sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
This patch touches the RT group scheduling case.
The functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global)
rq's priority, while the rt_rq passed to them may not be the top-level
rt_rq. This is wrong, because a priority change at a child level does not
guarantee that the priority is the highest across the whole rq. So, this
leak makes RT balancing unusable.
A short example: the task with the highest priority among all of the rq's
RT tasks (no other task has the same priority) wakes up on a throttled
rt_rq. The rq's cpupri is set to the equivalent of the task's priority,
but the real rq->rt.highest_prio.curr corresponds to a lower priority.
The patch below fixes the problem.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/49231385567953@web4m.yandex.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/rt.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7d57275..1c40655 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -901,6 +901,13 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && prio < prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, prio);
 }
@@ -910,6 +917,13 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && rt_rq->highest_prio.curr != prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, rt_rq->highest_prio.curr);
 }
^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-12-18 10:32 UTC | newest]
Thread overview: 10+ messages
2013-11-27 15:59 [PATCH] sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities Kirill Tkhai
2013-12-12 10:30 ` Kirill Tkhai
2013-12-13 15:42 ` Peter Zijlstra
2013-12-17 12:02 ` Kirill Tkhai
2013-12-17 12:46 ` Peter Zijlstra
2013-12-17 13:08 ` Kirill Tkhai
2013-12-18 10:32 ` [tip:sched/core] sched/rt: Fix rq's cpupri leak while enqueue/ dequeue " tip-bot for Kirill Tkhai