[RT] bad BUG_ON in rtmutex.c

* [RT] bad BUG_ON in rtmutex.c
@ 2006-04-18  1:48 Steven Rostedt
  2006-04-18 12:20 ` Daniel Walker
  2006-04-18 14:13 ` [PATCH -rt] Remove false BUG_ON from rtmutex.c Steven Rostedt
  0 siblings, 2 replies; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18  1:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, LKML

I believe the following BUG_ON can produce false positives.  Not sure if
this would be a problem though if the case was true.

Here in rt_mutex_adjust_prio_chain, line 236 (2.6.16-rt16):

	/*
	 * When deadlock detection is off then we check, if further
	 * priority adjustment is necessary.
	 */
	if (!detect_deadlock && waiter->list_entry.prio == task->prio) {
		BUG_ON(waiter->pi_list_entry.prio != waiter->list_entry.prio);
		goto out_unlock_pi;
	}

This is only protected by the waiter->task->pi_lock.

Here's the race:

We have process A blocked on some lock L1 (its owner doesn't matter).
A owns L2 and L3.

Say process B is blocked on lock L2 and process C is blocked on L3. We
can also say that B and C are of lower priority than A.

Process B owns lock L4 and process C owns lock L5.

Process D, which has a higher priority than A, comes along and blocks on
lock L4.

At the same time, on another CPU, process E blocks on L5, and E just
happens to have the same priority as D.

Here's a view of this scenario.

  L1 <=blocks= A
                 <=owns= L2
                           <=blocks= B <=owns= L4
                                                  <=blocks= D
                 <=owns= L3
                           <=blocks= C <=owns= L5
                                                  <=blocks= E

Remember both D and E are running on two different CPUs with the same
priority, but both are higher than A and the priority boosting is in
effect.

D climbs the chain and finally gets to task == A and lock == L1.  We then
get to this part of the code:

	/* Requeue the waiter */
	plist_del(&waiter->list_entry, &lock->wait_list);
	waiter->list_entry.prio = task->prio;
	plist_add(&waiter->list_entry, &lock->wait_list);

	/* Release the task */
	spin_unlock_irqrestore(&task->pi_lock, flags);
	put_task_struct(task);

	/* Grab the next task */
	task = rt_mutex_owner(lock);
	spin_lock_irqsave(&task->pi_lock, flags);

After we unlock the task->pi_lock, waiter->list_entry.prio is equal to
task->prio, but waiter->pi_list_entry.prio does not yet equal
waiter->list_entry.prio.  At this moment we hold only the L1->wait_lock.
And to make matters worse, interrupts can now be on.

Let's say that before the above happened, process E was going up its
chain, and the above happened just as it reached:

 retry:
	/*
	 * Task can not go away as we did a get_task() before !
	 */
	spin_lock_irqsave(&task->pi_lock, flags);

And E blocked on the task->pi_lock (since task == A, and D had the
lock).

Now when D releases the pi_lock, E can continue, but it gets to the
problematic comparison:

	if (!detect_deadlock && waiter->list_entry.prio == task->prio) {
		BUG_ON(waiter->pi_list_entry.prio != waiter->list_entry.prio);
		goto out_unlock_pi;
	}

Remember that D has the same prio as E, so when E hits this point,
waiter->list_entry.prio will equal task->prio (boosted by D).  But when E
enters the if block, pi_list_entry.prio hasn't been updated yet by D, so
the BUG_ON condition is true in a perfectly legitimate state.
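
The interleaving above can be replayed in a small single-threaded model.
This is only a sketch: the struct and function names (model_task,
model_waiter, walker_requeue and so on) are hypothetical stand-ins, not
kernel code, the priority values 50 and 10 are made up, and the locking is
reduced to comments marking where D drops A's pi_lock and E picks it up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_task {
	int prio;		/* lower value = higher priority */
};

struct model_waiter {
	int list_prio;		/* models waiter->list_entry.prio */
	int pi_list_prio;	/* models waiter->pi_list_entry.prio */
};

/* D, step 1: requeue the waiter on the lock (done under task->pi_lock). */
static void walker_requeue(struct model_waiter *w, const struct model_task *t)
{
	w->list_prio = t->prio;
}

/* D, step 2: sync the pi list entry, only after pi_lock was dropped. */
static void walker_update_pi_list(struct model_waiter *w)
{
	w->pi_list_prio = w->list_prio;
}

/* The early-exit test E performs once it holds task->pi_lock. */
static bool walk_may_stop(const struct model_waiter *w,
			  const struct model_task *t)
{
	return w->list_prio == t->prio;
}

/* The condition inside the BUG_ON under discussion. */
static bool bug_on_condition(const struct model_waiter *w)
{
	return w->pi_list_prio != w->list_prio;
}

/* Replays the interleaving; returns true if E would have hit the BUG_ON. */
static bool replay_race(void)
{
	struct model_task a = { .prio = 50 };
	struct model_waiter a_on_l1 = { .list_prio = 50, .pi_list_prio = 50 };
	bool fired;

	a.prio = 10;			/* A is boosted to D's prio (D == E) */
	walker_requeue(&a_on_l1, &a);	/* D, holding A's pi_lock */

	/* D drops A's pi_lock here and E, which was waiting, acquires it. */
	fired = walk_may_stop(&a_on_l1, &a) && bug_on_condition(&a_on_l1);

	walker_update_pi_list(&a_on_l1);	/* D finishes its update */
	assert(!bug_on_condition(&a_on_l1));	/* state is consistent again */
	return fired;
}
```

In this model replay_race() returns true: E sees the two prio fields
disagree inside the window, even though D brings them back in sync right
afterwards.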

So the question now is: is this a real bug?

-- Steve


* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18  1:48 [RT] bad BUG_ON in rtmutex.c Steven Rostedt
@ 2006-04-18 12:20 ` Daniel Walker
  2006-04-18 12:34   ` Steven Rostedt
  2006-04-18 14:13 ` [PATCH -rt] Remove false BUG_ON from rtmutex.c Steven Rostedt
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Walker @ 2006-04-18 12:20 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Mon, 2006-04-17 at 21:48 -0400, Steven Rostedt wrote:
...
> 
> So the question now is: is this a real bug?

It seems like a possible scenario. So if the false BUG_ON() needlessly
kills a perfectly running system, then it must be a bug. It's the case
of the buggy BUG_ON ;) !

Daniel



* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 12:20 ` Daniel Walker
@ 2006-04-18 12:34   ` Steven Rostedt
  2006-04-18 13:11     ` Daniel Walker
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 12:34 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 18 Apr 2006, Daniel Walker wrote:

> On Mon, 2006-04-17 at 21:48 -0400, Steven Rostedt wrote:
> ...
> >
> > So the question now is: is this a real bug?
>
> It seems like a possible scenario . So if the false BUG_ON() needlessly
> kills a perfectly running system, then it must be a bug. It's the case
> of the buggy BUG_ON ;) !
>

It was late when I was writing that.  I reread my email today, and realize
that there are a few confusing statements there.  That last one being one :)

I meant to say:

  So the question is now: Is that case in BUG_ON a real bug?

The BUG_ON bugging a normal system _is_ a bug.

-- Steve


* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 12:34   ` Steven Rostedt
@ 2006-04-18 13:11     ` Daniel Walker
  2006-04-18 13:50       ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Walker @ 2006-04-18 13:11 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 08:34 -0400, Steven Rostedt wrote:
> On Tue, 18 Apr 2006, Daniel Walker wrote:
> 
> > On Mon, 2006-04-17 at 21:48 -0400, Steven Rostedt wrote:
> > ...
> > >
> > > So the question now is: is this a real bug?
> >
> > It seems like a possible scenario . So if the false BUG_ON() needlessly
> > kills a perfectly running system, then it must be a bug. It's the case
> > of the buggy BUG_ON ;) !
> >
> 
> It was late when I was writing that.  I reread my email today, and realize
> that there's a few confusing statements there.  That last one being one :)

Yeah, it was a bit confusing.

> I meant to say:
> 
>   So the question is now: Is that case in BUG_ON a real bug?
> 
> The BUG_ON bugging a normal system _is_ a bug.

Something in the code bothered me right around the block you
referenced. 

Specifically, when it drops the pi_lock, then takes it again, then does
plist_add to the pi_waiters (during the "Boost the owner" section in
rt_mutex_adjust_prio_chain()). Since the pi_lock was dropped, you could
get a priority change which would lead to a bogus value in
waiter->pi_list_entry.prio.

I was looking over the code, and it seems like once all the chain
adjusting bottoms out you would end up with the correct priorities in
the waiter structures, because whatever task made the priority
adjustment would just end up resetting the pi_waiters during its
adjustment process. (Seems like there's room for optimization
though.)

Daniel


* Re: [PATCH -rt] Remove false BUG_ON from rtmutex.c
  2006-04-18 14:13 ` [PATCH -rt] Remove false BUG_ON from rtmutex.c Steven Rostedt
@ 2006-04-18 13:12   ` Ingo Molnar
  0 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-04-18 13:12 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Thomas Gleixner, LKML, Daniel Walker


* Steven Rostedt <rostedt@goodmis.org> wrote:

> Here's a patch to remove the BUG_ON in rtmutex.c.  I previously showed 
> that the condition in that particular BUG_ON can legitimately be the 
> case.

thanks, applied.

	Ingo


* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 13:11     ` Daniel Walker
@ 2006-04-18 13:50       ` Steven Rostedt
  2006-04-18 13:55         ` Steven Rostedt
  2006-04-18 14:09         ` Daniel Walker
  0 siblings, 2 replies; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 13:50 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 06:11 -0700, Daniel Walker wrote:

> 
> Something in the code bothered me right around the block you
> referenced. 
> 
> Specifically when it drops the pi_lock , then takes it again, then does
> plist_add to the pi_waiters ( during the "Boost the owner" section in
> rt_mutex_adjust_prio_chain() ). Since the pi_lock was dropped you could
> get an priority change which would lead to a bogus value in
> waiter->pi_list_entry.prio .

It's not really bogus. It just won't match the task->prio.  The
waiter->pi_list_entry.prio is set to waiter->list_entry.prio and that's
what you really need to match.  But you are right that the prio could
have changed.  Whoever changed the prio should also be updating the
chain, so whoever finishes should have the chain set up properly.

> 
> I was looking over the code, and it seems like once all the chain
> adjusting bottoms out you would end up with the correct priorities in
> the waiter structures .. Cause whatever task made the priority
> adjustment would just end up resetting the pi_waiters during it's
> adjustment process. (Seems like there's room for optimization
> though ..) 

I guess I just reiterated above what you are saying here.  Not sure if
this can be optimized.  You're talking about optimizing a case that
would seldom happen, but in doing so you stand a great chance of slowing
down the normal case.

To keep latencies down, we are letting the PI chain walk be preempted,
by releasing locks.  It's understood that the chain can then change
while walking (big debate about this between Ingo, tglx and Esben).  But
at the end, we decided on it being better to have latencies down, and
just make adjustments when they arise.  This also keeps the latencies
bounded, since with the old way it was harder to know the worst case
(PI chain creep).

BUT!  I need to take another good look at the code, and maybe my
previous example of the failed BUG_ON is really a clue that there exists
a deeper bug.  If the processes D and E from my last example were of
different priorities, but still higher than A, could the end result be
setting A to the lower of the two?  This would be a bug, because then A
would not inherit the correct priority!

-- Steve


* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 14:51             ` Daniel Walker
@ 2006-04-18 13:52               ` Ingo Molnar
  2006-04-18 15:06               ` Steven Rostedt
  1 sibling, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-04-18 13:52 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Steven Rostedt, Thomas Gleixner, LKML


* Daniel Walker <dwalker@mvista.com> wrote:

> > But, as PI matures, it seems to be more and more acceptable.
> 
> 	I read an article on priority ceiling as another method of doing 
> this. Priority ceiling doesn't seem better, but at the same time I 
> can't imagine how you'd implement it in Linux, or not in a straight 
> forward way .

it's already implemented and can be done in userspace: userspace can do
it by doing a sys_sched_setscheduler() call when entering the critical
section, and another one when exiting it. (PI is obviously faster
because there the futex fastpath can be pure-userspace.)
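
A minimal sketch of that userspace approach, assuming the POSIX
sched_setscheduler() wrapper around the syscall. The helper names
ceiling_enter() and ceiling_exit(), the ceiling_state struct and the choice
of SCHED_FIFO for the raised policy are illustrative assumptions, not
something taken from this thread. Raising to a real-time policy normally
needs suitable privileges, so the calls can fail with EPERM.

```c
#include <assert.h>
#include <errno.h>	/* the sched_* calls set errno on failure */
#include <sched.h>

/* Hypothetical helper state: what to restore when leaving the section. */
struct ceiling_state {
	int policy;
	struct sched_param param;
};

/*
 * Save the caller's current policy and priority, then raise the caller
 * to SCHED_FIFO at the lock's ceiling priority.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int ceiling_enter(struct ceiling_state *saved, int ceiling_prio)
{
	struct sched_param raised = { .sched_priority = ceiling_prio };

	saved->policy = sched_getscheduler(0);
	if (saved->policy < 0)
		return -1;
	if (sched_getparam(0, &saved->param) < 0)
		return -1;
	return sched_setscheduler(0, SCHED_FIFO, &raised);
}

/* Restore whatever policy and priority ceiling_enter() saved. */
static int ceiling_exit(const struct ceiling_state *saved)
{
	return sched_setscheduler(0, saved->policy, &saved->param);
}
```

A caller would wrap the critical section as ceiling_enter(&s, prio),
then the protected work, then ceiling_exit(&s). That is two system calls
per section even when nobody contends for the lock, which is the cost the
message contrasts with the PI futex fastpath.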

	Ingo


* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 13:50       ` Steven Rostedt
@ 2006-04-18 13:55         ` Steven Rostedt
  2006-04-18 14:09         ` Daniel Walker
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 13:55 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 09:50 -0400, Steven Rostedt wrote:

> BUT!  I need to take another good look at the code, and maybe my
> previous example of the failed BUG_ON is really a clue that there exists
> a deeper bug.  If the processes D and E from my last example were of
> different priorities, but still higher than A, could the end result be
> setting A to the lower of the two?  This would be a bug, because then A
> would not inherit the correct priority!

OK, this shouldn't be a problem (answering my own question ;).

The setting of the task's prio is done by __rt_mutex_adjust_prio(task),
which sets the task's prio to that of the highest-prio task blocked on a
lock owned by "task", or to "task"'s original prio if that is higher.

So never mind.
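
The rule described above can be sketched as a pure function. This is a
hypothetical model, not the kernel's __rt_mutex_adjust_prio(): the names
effective_prio() and higher_prio() are made up, and the model scans every
waiter instead of looking at a sorted list's top entry. It follows the
kernel convention that a lower number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Lower numeric value means higher priority. */
static int higher_prio(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical model: given a task's own (normal) prio and the prios of
 * all tasks blocked on locks it owns, return the prio it should run at.
 */
static int effective_prio(int normal_prio, const int *waiter_prios, size_t n)
{
	int prio = normal_prio;
	size_t i;

	for (i = 0; i < n; i++)
		prio = higher_prio(prio, waiter_prios[i]);
	return prio;
}
```

Because the result is recomputed from the full set of waiters, the order
in which D and E finish their walks does not matter in this model: with
waiter prios 20 and 10 and an own prio of 50, the task ends up at 10
either way.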

-- Steve




* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 13:50       ` Steven Rostedt
  2006-04-18 13:55         ` Steven Rostedt
@ 2006-04-18 14:09         ` Daniel Walker
  2006-04-18 14:32           ` Steven Rostedt
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Walker @ 2006-04-18 14:09 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 09:50 -0400, Steven Rostedt wrote:
> On Tue, 2006-04-18 at 06:11 -0700, Daniel Walker wrote:
> 

> 
> > 
> > I was looking over the code, and it seems like once all the chain
> > adjusting bottoms out you would end up with the correct priorities in
> > the waiter structures .. Cause whatever task made the priority
> > adjustment would just end up resetting the pi_waiters during it's
> > adjustment process. (Seems like there's room for optimization
> > though ..) 
> 
> I guess I just reiterated above what you are saying here.  Not sure if
> this can be optimized.  You're talking about optimizing a case that
> would seldom happen, but in doing so you stand a great chance of slowing
> down the normal case.

I'm not really working on it, but on a bigger SMP machine it might not
be that uncommon. PI is running on all tasks now, isn't it? It seems
like a simple check could be at the same point as the BUG_ON() we're
talking about. <speculation> If that BUG_ON() triggers, it means the
task is walking the lock chain at the same time as another task on the
same chain, so you could make the lower prio chain stop at that point,
no?

> To keep latencies down, we are letting the PI chain walk be preempted,
> by releasing locks.  It's understood that the chain can then change
> while walking (big debate about this between Ingo, tglx and Esben).  But
> at the end, we decided on it being better to have latencies down, and
> just make adjustments when they arise.  This also keeps the latencies
> bounded, since the old way was harder to know the worst case (PI chain
> creep).

I can imagine. Seems like PI is always a point of controversy.

> BUT!  I need to take another good look at the code, and maybe my
> previous example of the failed BUG_ON is really a clue that there exists
> a deeper bug.  If the processes D and E from my last example were of
> different priorities, but still higher than A, could the end result be
> setting A to the lower of the two?  This would be a bug, because then A
> would not inherit the correct priority!

Did that BUG_ON ever trigger for you? I don't know how you would end up
in that state without at least two chain walkers on different CPUs.

Daniel



* [PATCH -rt] Remove false BUG_ON from rtmutex.c
  2006-04-18  1:48 [RT] bad BUG_ON in rtmutex.c Steven Rostedt
  2006-04-18 12:20 ` Daniel Walker
@ 2006-04-18 14:13 ` Steven Rostedt
  2006-04-18 13:12   ` Ingo Molnar
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 14:13 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, LKML, Daniel Walker

Ingo,

Here's a patch to remove the BUG_ON in rtmutex.c.  I previously showed
that the condition in that particular BUG_ON can legitimately be the
case.

Once again, if you have processes A, B, C, D, and E holding the
following locks in this scenario:

 L1 <=blocks= A
               <=owns= L2 <=blocks= B <=owns= L4 <=blocks= D
               <=owns= L3 <=blocks= C <=owns= L5 <=blocks= E

Where the priorities of these tasks are

    B,C < A < D = E

B and C are less than A and A is less than D and E where D and E are
equal (actually it probably works when D and E are not equal too).

As D and E climb the chain, there's a very slight race condition that
could allow for the condition in the offending BUG_ON to be true.

This patch removes that BUG_ON.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6.16-rt16/kernel/rtmutex.c
===================================================================
--- linux-2.6.16-rt16.orig/kernel/rtmutex.c	2006-04-17 14:49:43.000000000 -0400
+++ linux-2.6.16-rt16/kernel/rtmutex.c	2006-04-18 09:57:54.000000000 -0400
@@ -232,10 +232,8 @@ static int rt_mutex_adjust_prio_chain(ta
 	 * When deadlock detection is off then we check, if further
 	 * priority adjustment is necessary.
 	 */
-	if (!detect_deadlock && waiter->list_entry.prio == task->prio) {
-		BUG_ON(waiter->pi_list_entry.prio != waiter->list_entry.prio);
+	if (!detect_deadlock && waiter->list_entry.prio == task->prio)
 		goto out_unlock_pi;
-	}
 
 	lock = waiter->lock;
 	if (!spin_trylock(&lock->wait_lock)) {




* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 14:09         ` Daniel Walker
@ 2006-04-18 14:32           ` Steven Rostedt
  2006-04-18 14:51             ` Daniel Walker
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 14:32 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 07:09 -0700, Daniel Walker wrote:
> On Tue, 2006-04-18 at 09:50 -0400, Steven Rostedt wrote:
> > On Tue, 2006-04-18 at 06:11 -0700, Daniel Walker wrote:
> > 
> 
> > 
> > > 
> > > I was looking over the code, and it seems like once all the chain
> > > adjusting bottoms out you would end up with the correct priorities in
> > > the waiter structures .. Cause whatever task made the priority
> > > adjustment would just end up resetting the pi_waiters during it's
> > > adjustment process. (Seems like there's room for optimization
> > > though ..) 
> > 
> > I guess I just reiterated above what you are saying here.  Not sure if
> > this can be optimized.  You're talking about optimizing a case that
> > would seldom happen, but in doing so you stand a great chance of slowing
> > down the normal case.
> 
> I'm not really working on it, but on a bigger SMP machine it might not
> be that uncommon .. PI is running on all tasks now isn't it? It seems
> like a simple check could be at the same point as the BUG_ON() we're
> talking about .. <speculation> If that BUG_ON() triggers , it means the
> task is walking the lock chain at the same time as another task on the
> same chain, so you could make the lower prio chain stop at that point ..
> no?

Actually, where that BUG_ON was is the exiting of the chain walk. So it
does stop.  It's the higher priority task that needs to be continuing
the chain walk for that problem to occur.  So really, it already does
what you suggest :)

> 
> > To keep latencies down, we are letting the PI chain walk be preempted,
> > by releasing locks.  It's understood that the chain can then change
> > while walking (big debate about this between Ingo, tglx and Esben).  But
> > at the end, we decided on it being better to have latencies down, and
> > just make adjustments when they arise.  This also keeps the latencies
> > bounded, since the old way was harder to know the worst case (PI chain
> > creep).
> 
> I can imagine. Seems like PI is always a point of controversy .

But, as PI matures, it seems to be more and more acceptable.

> 
> > BUT!  I need to take another good look at the code, and maybe my
> > previous example of the failed BUG_ON is really a clue that there exists
> > a deeper bug.  If the processes D and E from my last example were of
> > different priorities, but still higher than A, could the end result be
> > setting A to the lower of the two?  This would be a bug, because then A
> > would not inherit the correct priority!
> 
> Did that BUG_ON ever trigger for you? I don't know how you would end up
> in that state without at least two chain walkers on different cpu's ..

Yes it did, but I need to clarify.  I have a custom kernel that is
based on Ingo's -rt patch, and I hit this bug while touching other parts
of the PI list.  When looking at fixing it, I realized that the BUG_ON
condition was legitimate.

Also, I always test on SMP (then I test on UP) and the chain walkers
were on two CPUs.

-- Steve




* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 14:32           ` Steven Rostedt
@ 2006-04-18 14:51             ` Daniel Walker
  2006-04-18 13:52               ` Ingo Molnar
  2006-04-18 15:06               ` Steven Rostedt
  0 siblings, 2 replies; 15+ messages in thread
From: Daniel Walker @ 2006-04-18 14:51 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 10:32 -0400, Steven Rostedt wrote:

> 
> Actually, where that BUG_ON was is the exiting of the chain walk. So it
> does stop.  It's the higher priority task that needs to be continuing
> the chain walk for that problem to occur.  So really, it already does
> what you suggest :)

I bet you could test for that condition in some other spots too. Like
when it adds to the pi_waiters, you could test if the priorities are
out of sync.

> > 
> > > To keep latencies down, we are letting the PI chain walk be preempted,
> > > by releasing locks.  It's understood that the chain can then change
> > > while walking (big debate about this between Ingo, tglx and Esben).  But
> > > at the end, we decided on it being better to have latencies down, and
> > > just make adjustments when they arise.  This also keeps the latencies
> > > bounded, since the old way was harder to know the worst case (PI chain
> > > creep).
> > 
> > I can imagine. Seems like PI is always a point of controversy .
> 
> But, as PI matures, it seems to be more and more acceptable.

	I read an article on priority ceiling as another method of doing this.
Priority ceiling doesn't seem better, but at the same time I can't
imagine how you'd implement it in Linux, at least not in a
straightforward way.

> Also, I always test on SMP (then I test on UP) and the chain walkers
> were on two CPUs.

Yeah, best policy, I've learned.

Daniel



* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 14:51             ` Daniel Walker
  2006-04-18 13:52               ` Ingo Molnar
@ 2006-04-18 15:06               ` Steven Rostedt
  2006-04-18 16:14                 ` Daniel Walker
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 15:06 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML


On Tue, 18 Apr 2006, Daniel Walker wrote:

> On Tue, 2006-04-18 at 10:32 -0400, Steven Rostedt wrote:
>
> >
> > Actually, where that BUG_ON was is the exiting of the chain walk. So it
> > does stop.  It's the higher priority task that needs to be continuing
> > the chain walk for that problem to occur.  So really, it already does
> > what you suggest :)
>
> I bet you could test for that condition in some other spots too . Like
> when it adds to the pi_waiters , you could test if the priorities are
> out of sync ..

You mean the other places in rt_mutex_adjust_prio_chain?  It already
checks once per iteration; anything more is just overkill.

>
> > >
> > > > To keep latencies down, we are letting the PI chain walk be preempted,
> > > > by releasing locks.  It's understood that the chain can then change
> > > > while walking (big debate about this between Ingo, tglx and Esben).  But
> > > > at the end, we decided on it being better to have latencies down, and
> > > > just make adjustments when they arise.  This also keeps the latencies
> > > > bounded, since the old way was harder to know the worst case (PI chain
> > > > creep).
> > >
> > > I can imagine. Seems like PI is always a point of controversy .
> >
> > But, as PI matures, it seems to be more and more acceptable.
>
> 	I read an article on priority ceiling as another method of doing this.
> Priority ceiling doesn't seem better, but at the same time I can't
> imagine how you'd implement it in Linux, or not in a straight forward
> way .

Actually, I always thought that running PREEMPT_DESKTOP with soft and hard
IRQs as threads was priority ceiling.  It's just that all locks have the
priority of MAX_RT_PRIO (no preemption allowed).  OK, this doesn't apply
to mutexes, but it does apply to spin_locks. :)

-- Steve



* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 15:06               ` Steven Rostedt
@ 2006-04-18 16:14                 ` Daniel Walker
  2006-04-18 16:24                   ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Walker @ 2006-04-18 16:14 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 11:06 -0400, Steven Rostedt wrote:
> On Tue, 18 Apr 2006, Daniel Walker wrote:
> 
> > On Tue, 2006-04-18 at 10:32 -0400, Steven Rostedt wrote:
> >
> > >
> > > Actually, where that BUG_ON was is the exiting of the chain walk. So it
> > > does stop.  It's the higher priority task that needs to be continuing
> > > the chain walk for that problem to occur.  So really, it already does
> > > what you suggest :)
> >
> > I bet you could test for that condition in some other spots too . Like
> > when it adds to the pi_waiters , you could test if the priorities are
> > out of sync ..
> 
> You mean the other places in rt_mutex_adjust_prio_chain?  It already
> checks once an iteration, anything more is just over kill.

Yeah, sounds good.

> Actually, I always thought that running PREEMPT_DESKTOP with soft and hard
> IRQS as threads was priority ceiling.  It's just that all locks have the
> priority of MAX_RT_PRIO (no preemption allowed).  OK, this doesn't apply
> to mutexes, but it does apply for spin_locks. :)

Interesting way to look at it.

Reminds me of the RT read/write locks: only one reader or one writer at a
time, so it's really just a mutex.

Daniel 



* Re: [RT] bad BUG_ON in rtmutex.c
  2006-04-18 16:14                 ` Daniel Walker
@ 2006-04-18 16:24                   ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2006-04-18 16:24 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Tue, 2006-04-18 at 09:14 -0700, Daniel Walker wrote:

> 
> > Actually, I always thought that running PREEMPT_DESKTOP with soft and hard
> > IRQS as threads was priority ceiling.  It's just that all locks have the
> > priority of MAX_RT_PRIO (no preemption allowed).  OK, this doesn't apply
> > to mutexes, but it does apply for spin_locks. :)
> 
> Interesting way to look at it .
> 
> Reminds me of the RT read/write locks, only one read or one writer at a
> time, so it's really just a mutex ..
> 

Well, read/write doesn't work well with PI (or latencies, for that
matter).  But rw_locks have one advantage over a normal rt_mutex, and that
is they are self-recursive, i.e. one rw_lock can be taken over and over
again (as read) by the same process, as long as it releases it the same
number of times.
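
The combination described here (a single owner at a time, but recursive
for that owner on the read side) can be pictured with a toy model. This
is only an illustration under assumed names (model_rwlock,
model_read_trylock, model_read_unlock); it is not the -rt rw_lock
implementation and it leaves out the write side and any blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one owner at a time, with a read recursion depth. */
struct model_rwlock {
	int owner;	/* id of the owning task, or -1 when free */
	int read_depth;	/* how many times the owner took it for read */
};

/* Succeeds if the lock is free or already read-held by the same task. */
static bool model_read_trylock(struct model_rwlock *l, int task_id)
{
	if (l->owner == -1) {
		l->owner = task_id;
		l->read_depth = 1;
		return true;
	}
	if (l->owner == task_id) {
		l->read_depth++;	/* recursive read by the owner */
		return true;
	}
	return false;			/* held by someone else */
}

/* Each successful read lock must be paired with one unlock. */
static void model_read_unlock(struct model_rwlock *l, int task_id)
{
	assert(l->owner == task_id && l->read_depth > 0);
	if (--l->read_depth == 0)
		l->owner = -1;		/* last unlock frees the lock */
}
```

In the model, a second task can take the lock only after the owner has
unlocked as many times as it locked.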

-- Steve



