* [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-16 21:30 Kirill Tkhai 2014-05-19 13:12 ` Juri Lelli 0 siblings, 1 reply; 17+ messages in thread From: Kirill Tkhai @ 2014-05-16 21:30 UTC (permalink / raw) To: linux-kernel; +Cc: peterz, tkhai, mingo, stable, juri.lelli The race is in unlocked task_rq() access. In pair with parallel call of sched_setaffinity() it may be a reason of corruption of internal rq's data. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: Juri Lelli <juri.lelli@gmail.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> # v3.14 --- kernel/sched/deadline.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..ffb023a 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,16 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (unlikely(rq != task_rq(p))) { + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-16 21:30 [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() Kirill Tkhai @ 2014-05-19 13:12 ` Juri Lelli 2014-05-19 19:31 ` Kirill Tkhai 0 siblings, 1 reply; 17+ messages in thread From: Juri Lelli @ 2014-05-19 13:12 UTC (permalink / raw) To: Kirill Tkhai; +Cc: linux-kernel, peterz, mingo, stable On Sat, 17 May 2014 01:30:03 +0400 Kirill Tkhai <tkhai@yandex.ru> wrote: > The race is in unlocked task_rq() access. In pair with parallel > call of sched_setaffinity() it may be a reason of corruption > of internal rq's data. > Sure, the thing can happen! > Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> > CC: Juri Lelli <juri.lelli@gmail.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Ingo Molnar <mingo@redhat.com> > Cc: <stable@vger.kernel.org> # v3.14 > --- > kernel/sched/deadline.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 800e99b..ffb023a 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -513,9 +513,16 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > struct sched_dl_entity, > dl_timer); > struct task_struct *p = dl_task_of(dl_se); > - struct rq *rq = task_rq(p); > + struct rq *rq; We could maybe add a comment here, in line with what we have below, to document why we need this. Thanks, - Juri > +again: > + rq = task_rq(p); > raw_spin_lock(&rq->lock); > > + if (unlikely(rq != task_rq(p))) { > + raw_spin_unlock(&rq->lock); > + goto again; > + } > + > /* > * We need to take care of a possible races here. In fact, the > * task might have changed its scheduling policy to something > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-19 13:12 ` Juri Lelli @ 2014-05-19 19:31 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-19 19:31 UTC (permalink / raw) To: Juri Lelli Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, stable@vger.kernel.org 19.05.2014, 17:11, "Juri Lelli" <juri.lelli@gmail.com>: > On Sat, 17 May 2014 01:30:03 +0400 > Kirill Tkhai <tkhai@yandex.ru> wrote: > >> The race is in unlocked task_rq() access. In pair with parallel >> call of sched_setaffinity() it may be a reason of corruption >> of internal rq's data. > > Sure, the thing can happen! [snipped] >> @@ -513,9 +513,16 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >> struct sched_dl_entity, >> dl_timer); >> struct task_struct *p = dl_task_of(dl_se); >> - struct rq *rq = task_rq(p); >> + struct rq *rq; > > We could maybe add a comment here, in line with what we have below, to > document why we need this. How about this? (I added comment and rewrote changelog). [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). This case is an exception. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: Juri Lelli <juri.lelli@gmail.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> # v3.14 diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..c0a6921 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (unlikely(rq != task_rq(p))) { + /* Task was moved, retrying. */ + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-19 19:31 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-19 19:31 UTC (permalink / raw) To: Juri Lelli Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, stable@vger.kernel.org 19.05.2014, 17:11, "Juri Lelli" <juri.lelli@gmail.com>: > On Sat, 17 May 2014 01:30:03 +0400 > Kirill Tkhai <tkhai@yandex.ru> wrote: > >> О©╫The race is in unlocked task_rq() access. In pair with parallel >> О©╫call of sched_setaffinity() it may be a reason of corruption >> О©╫of internal rq's data. > > Sure, the thing can happen! [snipped] >> О©╫@@ -513,9 +513,16 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct sched_dl_entity, >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫dl_timer); >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct task_struct *p = dl_task_of(dl_se); >> О©╫- struct rq *rq = task_rq(p); >> О©╫+ struct rq *rq; > > We could maybe add a comment here, in line with what we have below, to > document why we need this. How about this? (I added comment and rewrote changelog). [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). This case is an exception. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: Juri Lelli <juri.lelli@gmail.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> # v3.14 diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..c0a6921 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (unlikely(rq != task_rq(p))) { + /* Task was moved, retrying. */ + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-19 19:31 ` Kirill Tkhai (?) @ 2014-05-20 0:00 ` Peter Zijlstra 2014-05-20 5:08 ` Kirill Tkhai -1 siblings, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2014-05-20 0:00 UTC (permalink / raw) To: Kirill Tkhai Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 577 bytes --] On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > struct sched_dl_entity, > dl_timer); > struct task_struct *p = dl_task_of(dl_se); > - struct rq *rq = task_rq(p); > + struct rq *rq; > +again: > + rq = task_rq(p); > raw_spin_lock(&rq->lock); > > + if (unlikely(rq != task_rq(p))) { > + /* Task was moved, retrying. */ > + raw_spin_unlock(&rq->lock); > + goto again; > + } > + That thing is called: rq = __task_rq_lock(p); [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 0:00 ` Peter Zijlstra @ 2014-05-20 5:08 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 5:08 UTC (permalink / raw) To: Peter Zijlstra Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > >> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >> struct sched_dl_entity, >> dl_timer); >> struct task_struct *p = dl_task_of(dl_se); >> - struct rq *rq = task_rq(p); >> + struct rq *rq; >> +again: >> + rq = task_rq(p); >> raw_spin_lock(&rq->lock); >> >> + if (unlikely(rq != task_rq(p))) { >> + /* Task was moved, retrying. */ >> + raw_spin_unlock(&rq->lock); >> + goto again; >> + } >> + > > That thing is called: rq = __task_rq_lock(p); But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. Should we change it? Kirill ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-20 5:08 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 5:08 UTC (permalink / raw) To: Peter Zijlstra Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > >> О©╫@@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct sched_dl_entity, >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫dl_timer); >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct task_struct *p = dl_task_of(dl_se); >> О©╫- struct rq *rq = task_rq(p); >> О©╫+ struct rq *rq; >> О©╫+again: >> О©╫+ rq = task_rq(p); >> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫raw_spin_lock(&rq->lock); >> >> О©╫+ if (unlikely(rq != task_rq(p))) { >> О©╫+ /* Task was moved, retrying. */ >> О©╫+ raw_spin_unlock(&rq->lock); >> О©╫+ goto again; >> О©╫+ } >> О©╫+ > > That thing is called: rq = __task_rq_lock(p); But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. Should we change it? Kirill ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 5:08 ` Kirill Tkhai @ 2014-05-20 6:07 ` Kirill Tkhai -1 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 6:07 UTC (permalink / raw) To: Peter Zijlstra Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 09:08, "Kirill Tkhai" <tkhai@yandex.ru>: > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > >> On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: >>> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >>> struct sched_dl_entity, >>> dl_timer); >>> struct task_struct *p = dl_task_of(dl_se); >>> - struct rq *rq = task_rq(p); >>> + struct rq *rq; >>> +again: >>> + rq = task_rq(p); >>> raw_spin_lock(&rq->lock); >>> >>> + if (unlikely(rq != task_rq(p))) { >>> + /* Task was moved, retrying. */ >>> + raw_spin_unlock(&rq->lock); >>> + goto again; >>> + } >>> + >> That thing is called: rq = __task_rq_lock(p); > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > Should we change it? Or make something like this? static inline struct rq *_task_rq_lock(struct task_struct *p) { lockdep_assert_held(&p->pi_lock); return __task_rq_lock(p); } Thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-20 6:07 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 6:07 UTC (permalink / raw) To: Peter Zijlstra Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 09:08, "Kirill Tkhai" <tkhai@yandex.ru>: > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > >> О©╫On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: >>> О©╫О©╫@@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct sched_dl_entity, >>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫dl_timer); >>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct task_struct *p = dl_task_of(dl_se); >>> О©╫О©╫- struct rq *rq = task_rq(p); >>> О©╫О©╫+ struct rq *rq; >>> О©╫О©╫+again: >>> О©╫О©╫+ rq = task_rq(p); >>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫raw_spin_lock(&rq->lock); >>> >>> О©╫О©╫+ if (unlikely(rq != task_rq(p))) { >>> О©╫О©╫+ /* Task was moved, retrying. */ >>> О©╫О©╫+ raw_spin_unlock(&rq->lock); >>> О©╫О©╫+ goto again; >>> О©╫О©╫+ } >>> О©╫О©╫+ >> О©╫That thing is called: rq = __task_rq_lock(p); > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > Should we change it? Or make something like this? static inline struct rq *_task_rq_lock(struct task_struct *p) { lockdep_assert_held(&p->pi_lock); return __task_rq_lock(p); } Thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 5:08 ` Kirill Tkhai @ 2014-05-20 7:53 ` Peter Zijlstra -1 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2014-05-20 7:53 UTC (permalink / raw) To: Kirill Tkhai Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: > > > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > > > >> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > >> struct sched_dl_entity, > >> dl_timer); > >> struct task_struct *p = dl_task_of(dl_se); > >> - struct rq *rq = task_rq(p); > >> + struct rq *rq; > >> +again: > >> + rq = task_rq(p); > >> raw_spin_lock(&rq->lock); > >> > >> + if (unlikely(rq != task_rq(p))) { > >> + /* Task was moved, retrying. */ > >> + raw_spin_unlock(&rq->lock); > >> + goto again; > >> + } > >> + > > > > That thing is called: rq = __task_rq_lock(p); > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > Should we change it? Ok, so now that I'm awake ;-) So the trivial problem as described by your initial changelog isn't right, because we cannot call sched_setaffinity() on deadline tasks, or rather we can, but we can't actually change the affinity mask. Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way. That said, no leave it as you proposed, adding a *task_rq_lock() variant without lockdep assert in will only confuse things, as normally we really should be also taking ->pi_lock. The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-20 7:53 ` Peter Zijlstra 0 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2014-05-20 7:53 UTC (permalink / raw) To: Kirill Tkhai Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: > > > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > > > >> �@@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > >> �������������������������������������������������������struct sched_dl_entity, > >> �������������������������������������������������������dl_timer); > >> ����������struct task_struct *p = dl_task_of(dl_se); > >> �- struct rq *rq = task_rq(p); > >> �+ struct rq *rq; > >> �+again: > >> �+ rq = task_rq(p); > >> ����������raw_spin_lock(&rq->lock); > >> > >> �+ if (unlikely(rq != task_rq(p))) { > >> �+ /* Task was moved, retrying. */ > >> �+ raw_spin_unlock(&rq->lock); > >> �+ goto again; > >> �+ } > >> �+ > > > > That thing is called: rq = __task_rq_lock(p); > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > Should we change it? Ok, so now that I'm awake ;-) So the trivial problem as described by your initial changelog isn't right, because we cannot call sched_setaffinity() on deadline tasks, or rather we can, but we can't actually change the affinity mask. Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way. That said, no leave it as you proposed, adding a *task_rq_lock() variant without lockdep assert in will only confuse things, as normally we really should be also taking ->pi_lock. The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 7:53 ` Peter Zijlstra @ 2014-05-20 8:17 ` Juri Lelli -1 siblings, 0 replies; 17+ messages in thread From: Juri Lelli @ 2014-05-20 8:17 UTC (permalink / raw) To: Peter Zijlstra Cc: Kirill Tkhai, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org Hi, On Tue, 20 May 2014 09:53:15 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: > > > > > > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > > > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > > > > > >> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > > >> struct sched_dl_entity, > > >> dl_timer); > > >> struct task_struct *p = dl_task_of(dl_se); > > >> - struct rq *rq = task_rq(p); > > >> + struct rq *rq; > > >> +again: > > >> + rq = task_rq(p); > > >> raw_spin_lock(&rq->lock); > > >> > > >> + if (unlikely(rq != task_rq(p))) { > > >> + /* Task was moved, retrying. */ > > >> + raw_spin_unlock(&rq->lock); > > >> + goto again; > > >> + } > > >> + > > > > > > That thing is called: rq = __task_rq_lock(p); > > > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > > Should we change it? > > Ok, so now that I'm awake ;-) > > So the trivial problem as described by your initial changelog isn't > right, because we cannot call sched_setaffinity() on deadline tasks, or > rather we can, but we can't actually change the affinity mask. > Well, if we disable AC we can. And I was able to recreate that race in that case. > Now I suppose the problem can still actually happen when you change the > root domain and trigger a effective affinity change that way. > Yeah, I think here too. > That said, no leave it as you proposed, adding a *task_rq_lock() variant > without lockdep assert in will only confuse things, as normally we > really should be also taking ->pi_lock. > > The only reason we don't strictly need ->pi_lock now is because we're > guaranteed to have p->state == TASK_RUNNING here and are thus free of > ttwu races. Maybe we could add this as part of the comment. Thanks, - Juri ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-20 8:17 ` Juri Lelli 0 siblings, 0 replies; 17+ messages in thread From: Juri Lelli @ 2014-05-20 8:17 UTC (permalink / raw) To: Peter Zijlstra Cc: Kirill Tkhai, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org Hi, On Tue, 20 May 2014 09:53:15 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: > > > > > > 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: > > > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: > > > > > >> �@@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) > > >> �������������������������������������������������������struct sched_dl_entity, > > >> �������������������������������������������������������dl_timer); > > >> ����������struct task_struct *p = dl_task_of(dl_se); > > >> �- struct rq *rq = task_rq(p); > > >> �+ struct rq *rq; > > >> �+again: > > >> �+ rq = task_rq(p); > > >> ����������raw_spin_lock(&rq->lock); > > >> > > >> �+ if (unlikely(rq != task_rq(p))) { > > >> �+ /* Task was moved, retrying. */ > > >> �+ raw_spin_unlock(&rq->lock); > > >> �+ goto again; > > >> �+ } > > >> �+ > > > > > > That thing is called: rq = __task_rq_lock(p); > > > > But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. > > Should we change it? > > Ok, so now that I'm awake ;-) > > So the trivial problem as described by your initial changelog isn't > right, because we cannot call sched_setaffinity() on deadline tasks, or > rather we can, but we can't actually change the affinity mask. > Well, if we disable AC we can. And I was able to recreate that race in that case. > Now I suppose the problem can still actually happen when you change the > root domain and trigger a effective affinity change that way. > Yeah, I think here too. > That said, no leave it as you proposed, adding a *task_rq_lock() variant > without lockdep assert in will only confuse things, as normally we > really should be also taking ->pi_lock. > > The only reason we don't strictly need ->pi_lock now is because we're > guaranteed to have p->state == TASK_RUNNING here and are thus free of > ttwu races. Maybe we could add this as part of the comment. Thanks, - Juri ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 8:17 ` Juri Lelli @ 2014-05-20 9:33 ` Kirill Tkhai -1 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 9:33 UTC (permalink / raw) To: Juri Lelli, Peter Zijlstra Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 12:16, "Juri Lelli" <juri.lelli@gmail.com>: > Hi, > > On Tue, 20 May 2014 09:53:15 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > >> On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: >>> 20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: >>>> On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: >>>>> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >>>>> struct sched_dl_entity, >>>>> dl_timer); >>>>> struct task_struct *p = dl_task_of(dl_se); >>>>> - struct rq *rq = task_rq(p); >>>>> + struct rq *rq; >>>>> +again: >>>>> + rq = task_rq(p); >>>>> raw_spin_lock(&rq->lock); >>>>> >>>>> + if (unlikely(rq != task_rq(p))) { >>>>> + /* Task was moved, retrying. */ >>>>> + raw_spin_unlock(&rq->lock); >>>>> + goto again; >>>>> + } >>>>> + >>>> That thing is called: rq = __task_rq_lock(p); >>> But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. >>> Should we change it? >> Ok, so now that I'm awake ;-) >> >> So the trivial problem as described by your initial changelog isn't >> right, because we cannot call sched_setaffinity() on deadline tasks, or >> rather we can, but we can't actually change the affinity mask. > > Well, if we disable AC we can. And I was able to recreate that race in > that case. > >> Now I suppose the problem can still actually happen when you change the >> root domain and trigger a effective affinity change that way. > > Yeah, I think here too. > >> That said, no leave it as you proposed, adding a *task_rq_lock() variant >> without lockdep assert in will only confuse things, as normally we >> really should be also taking ->pi_lock. >> >> The only reason we don't strictly need ->pi_lock now is because we're >> guaranteed to have p->state == TASK_RUNNING here and are thus free of >> ttwu races. > > Maybe we could add this as part of the comment. Peter, Juri, thanks for comment. Hope, I understood you right :) [PATCH] sched/dl: Fix race in dl_task_timer() Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. Juri Lelli reports he got this race when dl_bandwidth_enabled() was not set. Other thing, pointed by Peter Zijlstra: "Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way". To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). We do not need pi_lock locked here. This case is an exception (PeterZ): "The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races". Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: Juri Lelli <juri.lelli@gmail.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> # v3.14 --- kernel/sched/deadline.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..14bc348 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (rq != task_rq(p)) { + /* Task was moved, retrying. */ + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() @ 2014-05-20 9:33 ` Kirill Tkhai 0 siblings, 0 replies; 17+ messages in thread From: Kirill Tkhai @ 2014-05-20 9:33 UTC (permalink / raw) To: Juri Lelli, Peter Zijlstra Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org 20.05.2014, 12:16, "Juri Lelli" <juri.lelli@gmail.com>: > Hi, > > On Tue, 20 May 2014 09:53:15 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > >> О©╫On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote: >>> О©╫20.05.2014, 04:00, "Peter Zijlstra" <peterz@infradead.org>: >>>> О©╫On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote: >>>>> О©╫О©╫@@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) >>>>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct sched_dl_entity, >>>>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫dl_timer); >>>>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫struct task_struct *p = dl_task_of(dl_se); >>>>> О©╫О©╫- struct rq *rq = task_rq(p); >>>>> О©╫О©╫+ struct rq *rq; >>>>> О©╫О©╫+again: >>>>> О©╫О©╫+ rq = task_rq(p); >>>>> О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫raw_spin_lock(&rq->lock); >>>>> >>>>> О©╫О©╫+ if (unlikely(rq != task_rq(p))) { >>>>> О©╫О©╫+ /* Task was moved, retrying. */ >>>>> О©╫О©╫+ raw_spin_unlock(&rq->lock); >>>>> О©╫О©╫+ goto again; >>>>> О©╫О©╫+ } >>>>> О©╫О©╫+ >>>> О©╫That thing is called: rq = __task_rq_lock(p); >>> О©╫But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert. >>> О©╫Should we change it? >> О©╫Ok, so now that I'm awake ;-) >> >> О©╫So the trivial problem as described by your initial changelog isn't >> О©╫right, because we cannot call sched_setaffinity() on deadline tasks, or >> О©╫rather we can, but we can't actually change the affinity mask. > > Well, if we disable AC we can. And I was able to recreate that race in > that case. > >> О©╫Now I suppose the problem can still actually happen when you change the >> О©╫root domain and trigger a effective affinity change that way. > > Yeah, I think here too. > >> О©╫That said, no leave it as you proposed, adding a *task_rq_lock() variant >> О©╫without lockdep assert in will only confuse things, as normally we >> О©╫really should be also taking ->pi_lock. >> >> О©╫The only reason we don't strictly need ->pi_lock now is because we're >> О©╫guaranteed to have p->state == TASK_RUNNING here and are thus free of >> О©╫ttwu races. > > Maybe we could add this as part of the comment. Peter, Juri, thanks for comment. Hope, I understood you right :) [PATCH] sched/dl: Fix race in dl_task_timer() Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. Juri Lelli reports he got this race when dl_bandwidth_enabled() was not set. Other thing, pointed by Peter Zijlstra: "Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way". To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). We do not need pi_lock locked here. This case is an exception (PeterZ): "The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races". Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: Juri Lelli <juri.lelli@gmail.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> # v3.14 --- kernel/sched/deadline.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..14bc348 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (rq != task_rq(p)) { + /* Task was moved, retrying. */ + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() 2014-05-20 9:33 ` Kirill Tkhai (?) @ 2014-05-21 7:29 ` Peter Zijlstra -1 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2014-05-21 7:29 UTC (permalink / raw) To: Kirill Tkhai Cc: Juri Lelli, linux-kernel@vger.kernel.org, mingo@redhat.com, stable@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1272 bytes --] On Tue, May 20, 2014 at 01:33:42PM +0400, Kirill Tkhai wrote: > [PATCH] sched/dl: Fix race in dl_task_timer() > > Throttled task is still on rq, and it may be moved to other cpu > if user is playing with sched_setaffinity(). Therefore, unlocked > task_rq() access makes the race. > > Juri Lelli reports he got this race when dl_bandwidth_enabled() > was not set. > > Other thing, pointed by Peter Zijlstra: > > "Now I suppose the problem can still actually happen when > you change the root domain and trigger a effective affinity > change that way". > > To fix that we do the same as made in __task_rq_lock(). We do not > use __task_rq_lock() itself, because it has a useful lockdep check, > which is not correct in case of dl_task_timer(). We do not need > pi_lock locked here. This case is an exception (PeterZ): > > "The only reason we don't strictly need ->pi_lock now is because > we're guaranteed to have p->state == TASK_RUNNING here and are > thus free of ttwu races". > > Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> > CC: Juri Lelli <juri.lelli@gmail.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Ingo Molnar <mingo@redhat.com> > Cc: <stable@vger.kernel.org> # v3.14 > --- thanks Kirill! [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* [tip:sched/urgent] sched/dl: Fix race in dl_task_timer() 2014-05-20 9:33 ` Kirill Tkhai (?) (?) @ 2014-06-05 14:33 ` tip-bot for Kirill Tkhai -1 siblings, 0 replies; 17+ messages in thread From: tip-bot for Kirill Tkhai @ 2014-06-05 14:33 UTC (permalink / raw) To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, torvalds, peterz, tkhai, tglx Commit-ID: 0f397f2c90ce68821ee864c2c53baafe78de765d Gitweb: http://git.kernel.org/tip/0f397f2c90ce68821ee864c2c53baafe78de765d Author: Kirill Tkhai <tkhai@yandex.ru> AuthorDate: Tue, 20 May 2014 13:33:42 +0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Thu, 5 Jun 2014 11:51:12 +0200 sched/dl: Fix race in dl_task_timer() Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. Juri Lelli reports he got this race when dl_bandwidth_enabled() was not set. Other thing, pointed by Peter Zijlstra: "Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way". To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). We do not need pi_lock locked here. This case is an exception (PeterZ): "The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races". Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v3.14+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/3056991400578422@web14g.yandex.ru Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/deadline.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 800e99b..14bc348 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer) struct sched_dl_entity, dl_timer); struct task_struct *p = dl_task_of(dl_se); - struct rq *rq = task_rq(p); + struct rq *rq; +again: + rq = task_rq(p); raw_spin_lock(&rq->lock); + if (rq != task_rq(p)) { + /* Task was moved, retrying. */ + raw_spin_unlock(&rq->lock); + goto again; + } + /* * We need to take care of a possible races here. In fact, the * task might have changed its scheduling policy to something ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-06-05 14:34 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-05-16 21:30 [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity() Kirill Tkhai 2014-05-19 13:12 ` Juri Lelli 2014-05-19 19:31 ` Kirill Tkhai 2014-05-19 19:31 ` Kirill Tkhai 2014-05-20 0:00 ` Peter Zijlstra 2014-05-20 5:08 ` Kirill Tkhai 2014-05-20 5:08 ` Kirill Tkhai 2014-05-20 6:07 ` Kirill Tkhai 2014-05-20 6:07 ` Kirill Tkhai 2014-05-20 7:53 ` Peter Zijlstra 2014-05-20 7:53 ` Peter Zijlstra 2014-05-20 8:17 ` Juri Lelli 2014-05-20 8:17 ` Juri Lelli 2014-05-20 9:33 ` Kirill Tkhai 2014-05-20 9:33 ` Kirill Tkhai 2014-05-21 7:29 ` Peter Zijlstra 2014-06-05 14:33 ` [tip:sched/urgent] sched/dl: Fix race in dl_task_timer() tip-bot for Kirill Tkhai
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.