From: Oleg Nesterov <oleg@tv-sign.ru>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Esben Nielsen <nielsen.esben@googlemail.com>
Cc: linux-kernel@vger.kernel.org
Subject: rt_mutex_timed_lock() vs hrtimer_wakeup() race ?
Date: Sun, 30 Jul 2006 08:36:05 +0400 [thread overview]
Message-ID: <20060730043605.GA2894@oleg> (raw)
I am trying to get some understanding of rt_mutex, but I'm afraid it's
not possible without your help...
// runs on CPU 0
rt_mutex_slowlock:
// main loop
for (;;) {
...
if (timeout && !timeout->task) {
ret = -ETIMEDOUT;
break;
}
...
schedule();
...
set_current_state(state);
}
What if timeout->timer is fired on CPU 1 right before set_current_state() ?
hrtimer_wakeup() does:
timeout->task = NULL; <----- [1]
spin_lock(runqueues->lock);
task->state = TASK_RUNNING; <----- [2]
(task->array != NULL, so try_to_wake_up() just goes to out_running)
If my understanding correct, [1] may slip into the critical section (because
spin_lock() is not a wmb), so that CPU 0 will see [2] but not [1]. In that
case rt_mutex_slowlock() can miss the timeout (set_current_state()->mb()
doesn't help).
Of course, this race (even _if_ I am right) is pure theoretical, but probably
we need smp_wmb() after [1] in hrtimer_wakeup().
Note that do_nanosleep() is ok, hrtimer_base->lock provides a necessary
serialization. In fact, I think it can use __set_current_state(), because
both hrtimer_start() and run_hrtimer_queue() do lock/unlock of base->lock.
Another question, task_blocks_on_rt_mutex() does get_task_struct() and checks
owner->pi_blocked_on != NULL under owner->pi_lock. Why ? RT_MUTEX_HAS_WAITERS
bit is set, we are holding ->wait_lock, so the 'owner' can't go away until
we drop ->wait_lock. I think we can drop owner->pi_lock right after
__rt_mutex_adjust_prio(owner), we can't miss owner->pi_blocked_on != NULL
if it was true before we take owner->pi_lock, and this is the case we should
worry about, yes?
In other words (because I myself can't parse the paragraph above :), could
you explain me why this patch is not correct:
--- rtmutex.c~ 2006-07-30 05:15:38.000000000 +0400
+++ rtmutex.c 2006-07-30 05:41:44.000000000 +0400
@@ -407,7 +407,7 @@ static int task_blocks_on_rt_mutex(struc
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex_waiter *top_waiter = waiter;
unsigned long flags;
- int boost = 0, res;
+ int res;
spin_lock_irqsave(¤t->pi_lock, flags);
__rt_mutex_adjust_prio(current);
@@ -431,24 +431,20 @@ static int task_blocks_on_rt_mutex(struc
plist_add(&waiter->pi_list_entry, &owner->pi_waiters);
__rt_mutex_adjust_prio(owner);
- if (owner->pi_blocked_on) {
- boost = 1;
- /* gets dropped in rt_mutex_adjust_prio_chain()! */
- get_task_struct(owner);
- }
spin_unlock_irqrestore(&owner->pi_lock, flags);
+
+ if (owner->pi_blocked_on)
+ goto boost;
}
else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
- spin_lock_irqsave(&owner->pi_lock, flags);
- if (owner->pi_blocked_on) {
- boost = 1;
- /* gets dropped in rt_mutex_adjust_prio_chain()! */
- get_task_struct(owner);
- }
- spin_unlock_irqrestore(&owner->pi_lock, flags);
+ if (owner->pi_blocked_on)
+ goto boost;
}
- if (!boost)
- return 0;
+
+ return 0;
+boost:
+ /* gets dropped in rt_mutex_adjust_prio_chain()! */
+ get_task_struct(owner);
spin_unlock(&lock->wait_lock);
----------------------------------------------------------
The same question for remove_waiter()/rt_mutex_adjust_pi().
The last (stupid) one,
wake_up_new_task:
if (unlikely(!current->array))
__activate_task(p, rq);
(offtopic) Is it really possible to have current->array == NULL here?
else {
p->prio = current->prio;
What if current was pi-boosted so that rt_prio(current->prio) == 1,
who will de-boost the child?
p->normal_prio = current->normal_prio;
Why? p->normal_prio was calculated by effective_prio() above, could you
explain why that value is not ok?
Thanks,
Oleg.
next reply other threads:[~2006-07-30 0:36 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-07-30 4:36 Oleg Nesterov [this message]
2006-07-30 22:23 ` rt_mutex_timed_lock() vs hrtimer_wakeup() race ? Steven Rostedt
2006-08-01 0:12 ` Oleg Nesterov
2006-07-31 20:47 ` Steven Rostedt
2006-08-01 7:58 ` Thomas Gleixner
2006-08-01 12:07 ` Steven Rostedt
2006-08-01 12:52 ` Thomas Gleixner
2006-08-01 13:21 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060730043605.GA2894@oleg \
--to=oleg@tv-sign.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=nielsen.esben@googlemail.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.