* AS spin lock bugs
@ 2003-11-13 10:38 Jens Axboe
  2003-11-13 10:52 ` Nick Piggin
  2003-11-13 10:52 ` Jens Axboe
  0 siblings, 2 replies; 8+ messages in thread
From: Jens Axboe @ 2003-11-13 10:38 UTC
  To: Nick Piggin, Linux Kernel; +Cc: Linus Torvalds

Hi,

Was looking at io tracking for cfq, and I think I found some spin lock
bugs in current as (current BK). as_update_iohist() runs from
add_request which is typically in process context. It could be run with
interrupts disabled though, either driver private stuff or using the
generic block layer tagging.

Anyways, as_update_iohist() grabs aic->lock without disabling
interrupts, while as_completed_request() typically runs at interrupt
time and grabs the same lock. Deadlock.

To be safe, both need to use the flags saving lock variants.
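
As a rough illustration of why the flags-saving variants matter when the caller may already have interrupts disabled, here is a toy, single-threaded userspace model. It is not kernel code: model_irqs_enabled, model_irq_save(), model_irq_restore() and the two model_section_*() helpers are hypothetical names invented for this sketch. They only mimic how a save/restore pair differs from an unconditional disable/enable pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded model of the local CPU's interrupt-enable state.
 * Every name here is hypothetical; none of this is kernel code. */
static bool model_irqs_enabled = true;

/* Modelled on the save half: remember the current state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	return flags;
}

/* Modelled on the restore half: put back whatever state was saved. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

/* Critical section written with the flags-saving pattern. */
static void model_section_irqsave(void)
{
	unsigned long flags = model_irq_save();

	assert(!model_irqs_enabled);	/* interrupts are off inside */
	model_irq_restore(flags);	/* caller's state comes back unchanged */
}

/* Critical section written with unconditional disable/enable. */
static void model_section_irq(void)
{
	model_irqs_enabled = false;
	model_irqs_enabled = true;	/* enables even if the caller had them off */
}
```

If a caller enters model_section_irqsave() with interrupts already off, they are still off on return. model_section_irq() would turn them back on behind the caller's back, which is the hazard the flags-saving variants avoid.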

===== drivers/block/as-iosched.c 1.28 vs edited =====
--- 1.28/drivers/block/as-iosched.c	Mon Nov 10 06:12:07 2003
+++ edited/drivers/block/as-iosched.c	Thu Nov 13 11:37:47 2003
@@ -783,14 +783,14 @@
 {
 	struct as_rq *arq = RQ_DATA(rq);
 	int data_dir = arq->is_sync;
-	unsigned long thinktime;
+	unsigned long thinktime, flags;
 	sector_t seek_dist;
 
 	if (aic == NULL)
 		return;
 
 	if (data_dir == REQ_SYNC) {
-		spin_lock(&aic->lock);
+		spin_lock_irqsave(&aic->lock, flags);
 
 		if (test_bit(AS_TASK_IORUNNING, &aic->state)
 				&& !atomic_read(&aic->nr_queued)
@@ -844,7 +844,7 @@
 		aic->seek_total = (aic->seek_total>>1)
 					+ (aic->seek_total>>2);
 
-		spin_unlock(&aic->lock);
+		spin_unlock_irqrestore(&aic->lock, flags);
 	}
 }
 
@@ -909,6 +909,7 @@
 	struct as_data *ad = q->elevator.elevator_data;
 	struct as_rq *arq = RQ_DATA(rq);
 	struct as_io_context *aic;
+	unsigned long flags;
 
 	WARN_ON(!list_empty(&rq->queuelist));
 
@@ -959,12 +960,12 @@
 	if (!aic)
 		return;
 
-	spin_lock(&aic->lock);
+	spin_lock_irqsave(&aic->lock, flags);
 	if (arq->is_sync == REQ_SYNC) {
 		set_bit(AS_TASK_IORUNNING, &aic->state);
 		aic->last_end_request = jiffies;
 	}
-	spin_unlock(&aic->lock);
+	spin_unlock_irqrestore(&aic->lock, flags);
 
 	put_io_context(arq->io_context);
 }

-- 
Jens Axboe



* Re: AS spin lock bugs
  2003-11-13 10:38 AS spin lock bugs Jens Axboe
@ 2003-11-13 10:52 ` Nick Piggin
  2003-11-13 10:59   ` Jens Axboe
  2003-11-13 10:52 ` Jens Axboe
  1 sibling, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2003-11-13 10:52 UTC
  To: Jens Axboe; +Cc: Linux Kernel, Linus Torvalds



Jens Axboe wrote:

>Hi,
>
>Was looking at io tracking for cfq, and I think I found some spin lock
>bugs in current as (current BK). as_update_iohist() runs from
>add_request which is typically in process context. It could be run with
>interrupts disabled though, either driver private stuff or using the
>generic block layer tagging.
>
>Anyways, as_update_iohist() grabs aic->lock without disabling
>interrupts, while as_completed_request() typically runs at interrupt
>time and grabs the same lock. Deadlock.
>
>To be safe, both need to use the flags saving lock variants.
>

Hi Jens,
I was hoping everything ran under the queue lock which should always
have interrupts off on the local CPU. The lock in question is to prevent
an as_completed_request on one queue from racing with as_update_iohist
on another. Each would be on a different CPU.

Maybe I'm wrong, did you actually see misbehaviour?




* Re: AS spin lock bugs
  2003-11-13 10:38 AS spin lock bugs Jens Axboe
  2003-11-13 10:52 ` Nick Piggin
@ 2003-11-13 10:52 ` Jens Axboe
  2003-11-13 10:59   ` Nick Piggin
  1 sibling, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2003-11-13 10:52 UTC
  To: Nick Piggin, Linux Kernel; +Cc: Linus Torvalds

On Thu, Nov 13 2003, Jens Axboe wrote:
> @@ -959,12 +960,12 @@
>  	if (!aic)
>  		return;
>  
> -	spin_lock(&aic->lock);
> +	spin_lock_irqsave(&aic->lock, flags);
>  	if (arq->is_sync == REQ_SYNC) {
>  		set_bit(AS_TASK_IORUNNING, &aic->state);
>  		aic->last_end_request = jiffies;
>  	}
> -	spin_unlock(&aic->lock);
> +	spin_unlock_irqrestore(&aic->lock, flags);
>  
>  	put_io_context(arq->io_context);
>  }

BTW, this looks bogus. Why do you need any locking there?

-- 
Jens Axboe



* Re: AS spin lock bugs
  2003-11-13 10:52 ` Nick Piggin
@ 2003-11-13 10:59   ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2003-11-13 10:59 UTC
  To: Nick Piggin; +Cc: Linux Kernel, Linus Torvalds

On Thu, Nov 13 2003, Nick Piggin wrote:
> 
> 
> Jens Axboe wrote:
> 
> >Hi,
> >
> >Was looking at io tracking for cfq, and I think I found some spin lock
> >bugs in current as (current BK). as_update_iohist() runs from
> >add_request which is typically in process context. It could be run with
> >interrupts disabled though, either driver private stuff or using the
> >generic block layer tagging.
> >
> >Anyways, as_update_iohist() grabs aic->lock without disabling
> >interrupts, while as_completed_request() typically runs at interrupt
> >time and grabs the same lock. Deadlock.
> >
> >To be safe, both need to use the flags saving lock variants.
> >
> 
> Hi Jens,
> I was hoping everything ran under the queue lock which should always
> have interrupts off on the local CPU. The lock in question is to prevent
> an as_completed_request on one queue from racing with as_update_iohist
> on another. Each would be on a different CPU.

Ah yes you are right. The queue lock will be held in both places.

> Maybe I'm wrong, did you actually see misbehaviour?

Nope, just looking over the code. What about the second lock, why is
that needed? I don't see that protecting anything.

-- 
Jens Axboe



* Re: AS spin lock bugs
  2003-11-13 10:52 ` Jens Axboe
@ 2003-11-13 10:59   ` Nick Piggin
  2003-11-13 11:01     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2003-11-13 10:59 UTC
  To: Jens Axboe; +Cc: Linux Kernel, Linus Torvalds



Jens Axboe wrote:

>On Thu, Nov 13 2003, Jens Axboe wrote:
>
>>@@ -959,12 +960,12 @@
>> 	if (!aic)
>> 		return;
>> 
>>-	spin_lock(&aic->lock);
>>+	spin_lock_irqsave(&aic->lock, flags);
>> 	if (arq->is_sync == REQ_SYNC) {
>> 		set_bit(AS_TASK_IORUNNING, &aic->state);
>> 		aic->last_end_request = jiffies;
>> 	}
>>-	spin_unlock(&aic->lock);
>>+	spin_unlock_irqrestore(&aic->lock, flags);
>> 
>> 	put_io_context(arq->io_context);
>> }
>>
>
>BTW, this looks bogus. Why do you need any locking there?
>

To prevent a request completion on another queue on another CPU from
racing with request insertion: last_end_request is undefined if the
flag is not set. I guess you could flip the statements and put a
smp_mb between them. Probably not worth the trouble though.
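
A minimal userspace sketch of that lockless alternative, written with C11 atomics rather than kernel primitives, might look like the following. Everything in it is an assumption made for illustration: struct aic_sketch, sketch_completed_request() and sketch_read_last_end() are hypothetical names, and a release/acquire pair stands in for the smp_mb() mentioned above. The idea is only that the timestamp is written first and the flag second, so a reader that sees the flag also sees a defined timestamp.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields discussed in this thread. */
struct aic_sketch {
	atomic_bool io_running;			/* plays the role of AS_TASK_IORUNNING */
	_Atomic unsigned long last_end_request;	/* defined only once io_running is set */
};

/* Completion side: store the timestamp first, then publish the flag. */
static void sketch_completed_request(struct aic_sketch *aic, unsigned long now)
{
	assert(aic != NULL);
	atomic_store_explicit(&aic->last_end_request, now,
			      memory_order_relaxed);
	/* Release ordering: the timestamp store cannot move after this. */
	atomic_store_explicit(&aic->io_running, true, memory_order_release);
}

/* Insertion side: read the timestamp only after observing the flag.
 * Returns false when the flag is not set and *when is left untouched. */
static bool sketch_read_last_end(struct aic_sketch *aic, unsigned long *when)
{
	assert(aic != NULL && when != NULL);
	/* Acquire ordering pairs with the release store above. */
	if (!atomic_load_explicit(&aic->io_running, memory_order_acquire))
		return false;
	*when = atomic_load_explicit(&aic->last_end_request,
				     memory_order_relaxed);
	return true;
}
```

A reader would call sketch_read_last_end() and use the timestamp only when it returns true. Whether this is worth it over the spin lock is exactly the trade-off questioned above.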




* Re: AS spin lock bugs
  2003-11-13 10:59   ` Nick Piggin
@ 2003-11-13 11:01     ` Jens Axboe
  2003-11-13 11:10       ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2003-11-13 11:01 UTC
  To: Nick Piggin; +Cc: Linux Kernel, Linus Torvalds

On Thu, Nov 13 2003, Nick Piggin wrote:
> 
> 
> Jens Axboe wrote:
> 
> >On Thu, Nov 13 2003, Jens Axboe wrote:
> >
> >>@@ -959,12 +960,12 @@
> >>	if (!aic)
> >>		return;
> >>
> >>-	spin_lock(&aic->lock);
> >>+	spin_lock_irqsave(&aic->lock, flags);
> >>	if (arq->is_sync == REQ_SYNC) {
> >>		set_bit(AS_TASK_IORUNNING, &aic->state);
> >>		aic->last_end_request = jiffies;
> >>	}
> >>-	spin_unlock(&aic->lock);
> >>+	spin_unlock_irqrestore(&aic->lock, flags);
> >>
> >>	put_io_context(arq->io_context);
> >>}
> >>
> >
> >BTW, this looks bogus. Why do you need any locking there?
> >
> 
> To prevent a request completion on another queue on another CPU from
> racing with request insertion: last_end_request is undefined if the
> flag is not set. I guess you could flip the statements and put a
> smp_mb between them. Probably not worth the trouble though.

No, better to make it explicit; it probably doesn't matter much in
real life. Thanks for the clarifications.

-- 
Jens Axboe



* Re: AS spin lock bugs
  2003-11-13 11:01     ` Jens Axboe
@ 2003-11-13 11:10       ` Jens Axboe
  2003-11-13 11:16         ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2003-11-13 11:10 UTC
  To: Nick Piggin; +Cc: Linux Kernel, Linus Torvalds

On Thu, Nov 13 2003, Jens Axboe wrote:
> On Thu, Nov 13 2003, Nick Piggin wrote:
> > 
> > 
> > Jens Axboe wrote:
> > 
> > >On Thu, Nov 13 2003, Jens Axboe wrote:
> > >
> > >>@@ -959,12 +960,12 @@
> > >>	if (!aic)
> > >>		return;
> > >>
> > >>-	spin_lock(&aic->lock);
> > >>+	spin_lock_irqsave(&aic->lock, flags);
> > >>	if (arq->is_sync == REQ_SYNC) {
> > >>		set_bit(AS_TASK_IORUNNING, &aic->state);
> > >>		aic->last_end_request = jiffies;
> > >>	}
> > >>-	spin_unlock(&aic->lock);
> > >>+	spin_unlock_irqrestore(&aic->lock, flags);
> > >>
> > >>	put_io_context(arq->io_context);
> > >>}
> > >>
> > >
> > >BTW, this looks bogus. Why do you need any locking there?
> > >
> > 
> > To prevent a request completion on another queue on another CPU from
> > racing with request insertion: last_end_request is undefined if the
> > flag is not set. I guess you could flip the statements and put a
> > smp_mb between them. Probably not worth the trouble though.
> 
> No, better to make it explicit; it probably doesn't matter much in
> real life. Thanks for the clarifications.

Ah, it would be clearer as:

	if (arq->is_sync == REQ_SYNC) {
		spin_lock(&aic->lock);
		set_bit(AS_TASK_IORUNNING, &aic->state);
		aic->last_end_request = jiffies;
		spin_unlock(&aic->lock);
	}

Then it doesn't need comments :)

-- 
Jens Axboe



* Re: AS spin lock bugs
  2003-11-13 11:10       ` Jens Axboe
@ 2003-11-13 11:16         ` Nick Piggin
  0 siblings, 0 replies; 8+ messages in thread
From: Nick Piggin @ 2003-11-13 11:16 UTC
  To: Jens Axboe; +Cc: Linux Kernel, Linus Torvalds



Jens Axboe wrote:

>On Thu, Nov 13 2003, Jens Axboe wrote:
>
>>On Thu, Nov 13 2003, Nick Piggin wrote:
>>
>>>
>>>Jens Axboe wrote:
>>>
>>>
>>>>On Thu, Nov 13 2003, Jens Axboe wrote:
>>>>
>>>>
>>>>>@@ -959,12 +960,12 @@
>>>>>	if (!aic)
>>>>>		return;
>>>>>
>>>>>-	spin_lock(&aic->lock);
>>>>>+	spin_lock_irqsave(&aic->lock, flags);
>>>>>	if (arq->is_sync == REQ_SYNC) {
>>>>>		set_bit(AS_TASK_IORUNNING, &aic->state);
>>>>>		aic->last_end_request = jiffies;
>>>>>	}
>>>>>-	spin_unlock(&aic->lock);
>>>>>+	spin_unlock_irqrestore(&aic->lock, flags);
>>>>>
>>>>>	put_io_context(arq->io_context);
>>>>>}
>>>>>
>>>>>
>>>>BTW, this looks bogus. Why do you need any locking there?
>>>>
>>>>
>>>To prevent a request completion on another queue on another CPU from
>>>racing with request insertion: last_end_request is undefined if the
>>>flag is not set. I guess you could flip the statements and put a
>>>smp_mb between them. Probably not worth the trouble though.
>>>
>>No, better to make it explicit; it probably doesn't matter much in
>>real life. Thanks for the clarifications.
>>
>
>Ah, it would be clearer as:
>
>	if (arq->is_sync == REQ_SYNC) {
>		spin_lock(&aic->lock);
>		set_bit(AS_TASK_IORUNNING, &aic->state);
>		aic->last_end_request = jiffies;
>		spin_unlock(&aic->lock);
>	}
>
>Then it doesn't need comments :)
>

Yeah, that was a bit silly of me; I see why you got confused. I
have actually fixed this up in mm3, so it should get through to
Linus sometime after 2.6.0.




Thread overview: 8+ messages
2003-11-13 10:38 AS spin lock bugs Jens Axboe
2003-11-13 10:52 ` Nick Piggin
2003-11-13 10:59   ` Jens Axboe
2003-11-13 10:52 ` Jens Axboe
2003-11-13 10:59   ` Nick Piggin
2003-11-13 11:01     ` Jens Axboe
2003-11-13 11:10       ` Jens Axboe
2003-11-13 11:16         ` Nick Piggin
