public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* lock order in O(1) scheduler
@ 2002-01-10  5:10 kevin
  2002-01-10  5:26 ` Robert Love
  2002-01-10  5:51 ` Robert Love
  0 siblings, 2 replies; 8+ messages in thread
From: kevin @ 2002-01-10  5:10 UTC (permalink / raw)
  To: mingo, linux-kernel


Hi Ingo,

I was looking through the new O(1) scheduler (found in linux-2.5.2-pre11),
when I came upon the following code in try_to_wake_up():

        lock_task_rq(rq, p, flags);
        p->state = TASK_RUNNING;
        if (!p->array) {
                if (!rt_task(p) && synchronous && (smp_processor_id() < p->cpu)) {
                        spin_lock(&this_rq()->lock);
                        p->cpu = smp_processor_id();
                        activate_task(p, this_rq());
                        spin_unlock(&this_rq()->lock);
                } else {

I was unable to figure out what the logic of the '(smp_processor_id() <
p->cpu)' test is..  (Why should the CPU number of the process being awoken
matter?)  My best guess is that this is to enforce a locking invariant -
but if so, isn't this test backwards?  If p->cpu > current->cpu then
p->cpu's runqueue is locked first followed by this_rq - locking greatest to
least, where the rest of the code does least to greatest..

Also, this code in set_cpus_allowed() looks bogus:

        if (target_cpu < smp_processor_id()) {
                spin_lock_irq(&target_rq->lock);
                spin_lock(&this_rq->lock);
        } else {
                spin_lock_irq(&target_rq->lock);
                spin_lock(&this_rq->lock);
        }

The lock order is the same regardless of the if statement..


-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:10 lock order in O(1) scheduler kevin
@ 2002-01-10  5:26 ` Robert Love
  2002-01-10  5:29   ` David S. Miller
                     ` (3 more replies)
  2002-01-10  5:51 ` Robert Love
  1 sibling, 4 replies; 8+ messages in thread
From: Robert Love @ 2002-01-10  5:26 UTC (permalink / raw)
  To: kevin; +Cc: mingo, linux-kernel

On Thu, 2002-01-10 at 00:10, kevin@koconnor.net wrote:

> I was unable to figure out what the logic of the '(smp_processor_id() <
> p->cpu)' test is..  (Why should the CPU number of the process being awoken
> matter?)  My best guess is that this is to enforce a locking invariant -
> but if so, isn't this test backwards?  If p->cpu > current->cpu then
> p->cpu's runqueue is locked first followed by this_rq - locking greatest to
> least, where the rest of the code does least to greatest..

Not so sure of the validity, but it is to respect lock order.  Locking
order is to obtain the locks lowest CPU id first to prevent AB/BA
deadlock.  See the comment above the runqueue data structure for
explanation.

> Also, this code in set_cpus_allowed() looks bogus:
> 
>         if (target_cpu < smp_processor_id()) {
>                 spin_lock_irq(&target_rq->lock);
>                 spin_lock(&this_rq->lock);
>         } else {
>                 spin_lock_irq(&target_rq->lock);
>                 spin_lock(&this_rq->lock);
>         }

This is certainly wrong, I noticed this earlier today.  The unlocking
order is not respected either, I suspect.

I believe the code should be:

         if (target_cpu < smp_processor_id()) {
                 spin_lock_irq(&target_rq->lock);
                 spin_lock(&this_rq->lock);
         } else {
                 spin_lock_irq(&this_rq->lock);
                 spin_lock(&target_rq->lock);
         }

Not so sure about unlocking.  Ingo?

	Robert Love


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:26 ` Robert Love
@ 2002-01-10  5:29   ` David S. Miller
  2002-01-10  5:49     ` Robert Love
  2002-01-10  5:38   ` Davide Libenzi
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2002-01-10  5:29 UTC (permalink / raw)
  To: rml; +Cc: kevin, mingo, linux-kernel

   From: Robert Love <rml@tech9.net>
   Date: 10 Jan 2002 00:26:08 -0500
   
   Not so sure about unlocking.  Ingo?

Unlocking order doesn't matter wrt. ABBA deadlock.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:26 ` Robert Love
  2002-01-10  5:29   ` David S. Miller
@ 2002-01-10  5:38   ` Davide Libenzi
  2002-01-10  6:29   ` kevin
  2002-01-10 13:11   ` Ingo Molnar
  3 siblings, 0 replies; 8+ messages in thread
From: Davide Libenzi @ 2002-01-10  5:38 UTC (permalink / raw)
  To: Robert Love; +Cc: kevin, Ingo Molnar, lkml

On 10 Jan 2002, Robert Love wrote:

> On Thu, 2002-01-10 at 00:10, kevin@koconnor.net wrote:
>
> > I was unable to figure out what the logic of the '(smp_processor_id() <
> > p->cpu)' test is..  (Why should the CPU number of the process being awoken
> > matter?)  My best guess is that this is to enforce a locking invariant -
> > but if so, isn't this test backwards?  If p->cpu > current->cpu then
> > p->cpu's runqueue is locked first followed by this_rq - locking greatest to
> > least, where the rest of the code does least to greatest..
>
> Not so sure of the validity, but it is to respect lock order.  Locking
> order is to obtain the locks lowest CPU id first to prevent AB/BA
> deadlock.  See the comment above the runqueue data structure for
> explanation.
>
> > Also, this code in set_cpus_allowed() looks bogus:
> >
> >         if (target_cpu < smp_processor_id()) {
> >                 spin_lock_irq(&target_rq->lock);
> >                 spin_lock(&this_rq->lock);
> >         } else {
> >                 spin_lock_irq(&target_rq->lock);
> >                 spin_lock(&this_rq->lock);
> >         }
>
> This is certainly wrong, I noticed this earlier today.  The unlocking
> order is not respected either, I suspect.
>
> I believe the code should be:
>
>          if (target_cpu < smp_processor_id()) {
>                  spin_lock_irq(&target_rq->lock);
>                  spin_lock(&this_rq->lock);
>          } else {
>                  spin_lock_irq(&this_rq->lock);
>                  spin_lock(&target_rq->lock);
>          }

Yes, this is a classical example of the famous cut-and-paste bug :)


>
> Not so sure about unlocking.  Ingo?

Unlocking doesn't matter.




- Davide



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:29   ` David S. Miller
@ 2002-01-10  5:49     ` Robert Love
  0 siblings, 0 replies; 8+ messages in thread
From: Robert Love @ 2002-01-10  5:49 UTC (permalink / raw)
  To: David S. Miller; +Cc: kevin, mingo, linux-kernel

On Thu, 2002-01-10 at 00:29, David S. Miller wrote:

> Unlocking order doesn't matter wrt. ABBA deadlock.

Indeed.  Thank you.

Anyhow, Ingo, here is a patch for the typo in set_cpus_allowed:

diff -urN linux-2.5.2-pre10/ linux/
--- linux-2.5.2-pre10/kernel/sched.c	Tue Jan  8 00:26:17 2002
+++ linux/kernel/sched.c	Thu Jan 10 00:41:38 2002
@@ -813,8 +813,8 @@
 		spin_lock_irq(&target_rq->lock);
 		spin_lock(&this_rq->lock);
 	} else {
-		spin_lock_irq(&target_rq->lock);
-		spin_lock(&this_rq->lock);
+		spin_lock_irq(&this_rq->lock);
+		spin_lock(&target_rq->lock);
 	}
 	dequeue_task(p, p->array);
 	this_rq->nr_running--;

	Robert Love


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:10 lock order in O(1) scheduler kevin
  2002-01-10  5:26 ` Robert Love
@ 2002-01-10  5:51 ` Robert Love
  1 sibling, 0 replies; 8+ messages in thread
From: Robert Love @ 2002-01-10  5:51 UTC (permalink / raw)
  To: kevin; +Cc: mingo, linux-kernel

On Thu, 2002-01-10 at 00:10, kevin@koconnor.net wrote:

> I was unable to figure out what the logic of the '(smp_processor_id() <
> p->cpu)' test is..  (Why should the CPU number of the process being awoken
> matter?)  My best guess is that this is to enforce a locking invariant -
> but if so, isn't this test backwards?  If p->cpu > current->cpu then
> p->cpu's runqueue is locked first followed by this_rq - locking greatest to
> least, where the rest of the code does least to greatest..

OK, I replied I was unsure of the validity, but looking this over, I now
suspect it is wrong.

The test should be (smp_processor_id() > p->cpu).  Thus it would be safe
to lock this_rq since it is of a lower cpu id than p's rq.

	Robert Love


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:26 ` Robert Love
  2002-01-10  5:29   ` David S. Miller
  2002-01-10  5:38   ` Davide Libenzi
@ 2002-01-10  6:29   ` kevin
  2002-01-10 13:11   ` Ingo Molnar
  3 siblings, 0 replies; 8+ messages in thread
From: kevin @ 2002-01-10  6:29 UTC (permalink / raw)
  To: Robert Love; +Cc: mingo, linux-kernel

On Thu, Jan 10, 2002 at 12:26:08AM -0500, Robert Love wrote:
> On Thu, 2002-01-10 at 00:10, kevin@koconnor.net wrote:
> 
> > I was unable to figure out what the logic of the '(smp_processor_id() <
> > p->cpu)' test is..  (Why should the CPU number of the process being awoken
> > matter?)  My best guess is that this is to enforce a locking invariant -
> > but if so, isn't this test backwards?  If p->cpu > current->cpu then
> > p->cpu's runqueue is locked first followed by this_rq - locking greatest to
> > least, where the rest of the code does least to greatest..
> 
> Not so sure of the validity, but it is to respect lock order.
[...]

Right.  It is a confusing though - depending on the value of p->cpu
relative to current->cpu, the synchronous flag may be ignored.  Since the
relationship between p->cpu and current->cpu is essentially random, this
makes the behavior of the synchronous flag random - why bother?

I can see that re-grabbing the locks in the proper order would be a pain.
Also, after a quick grep, it appears only fs/pipe.c is interested in the
wake_up_sync() variant.  Maybe this feature should just be culled?

-Kevin


The try_to_wake_up function in pre11 (just for reference):


static int try_to_wake_up(task_t * p, int synchronous)
{
        unsigned long flags;
        int success = 0;
        runqueue_t *rq;

        lock_task_rq(rq, p, flags);
        p->state = TASK_RUNNING;
        if (!p->array) {
                if (!rt_task(p) && synchronous && (smp_processor_id() < p->cpu)) {
                        spin_lock(&this_rq()->lock);
                        p->cpu = smp_processor_id();
                        activate_task(p, this_rq());
                        spin_unlock(&this_rq()->lock);
                } else {
                        activate_task(p, rq);
                        if ((rq->curr == rq->idle) ||
                                        (p->prio < rq->curr->prio))
                                resched_task(rq->curr);
                }
                success = 1;
        }
        unlock_task_rq(rq, p, flags);
        return success;
}

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lock order in O(1) scheduler
  2002-01-10  5:26 ` Robert Love
                     ` (2 preceding siblings ...)
  2002-01-10  6:29   ` kevin
@ 2002-01-10 13:11   ` Ingo Molnar
  3 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2002-01-10 13:11 UTC (permalink / raw)
  To: Robert Love; +Cc: kevin, linux-kernel


On 10 Jan 2002, Robert Love wrote:

> I believe the code should be:
>
>          if (target_cpu < smp_processor_id()) {
>                  spin_lock_irq(&target_rq->lock);
>                  spin_lock(&this_rq->lock);
>          } else {
>                  spin_lock_irq(&this_rq->lock);
>                  spin_lock(&target_rq->lock);
>          }
>
> Not so sure about unlocking.  Ingo?

yep, correct, good catch!

the unlocking order does not matter much.

	Ingo


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2002-01-10 11:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-01-10  5:10 lock order in O(1) scheduler kevin
2002-01-10  5:26 ` Robert Love
2002-01-10  5:29   ` David S. Miller
2002-01-10  5:49     ` Robert Love
2002-01-10  5:38   ` Davide Libenzi
2002-01-10  6:29   ` kevin
2002-01-10 13:11   ` Ingo Molnar
2002-01-10  5:51 ` Robert Love

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox