public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* cpqarray broken since 2.5.19
@ 2002-07-21 15:28 Adam Kropelin
  2002-07-24 13:39 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Adam Kropelin @ 2002-07-21 15:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: axboe

The cpqarray driver seems to have been broken around 2.5.19 with the
blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire
system when it tries to do partition detection. The bits from the 2.5.19 patch
which seem to relate are:

> @@ -916,6 +915,7 @@
>       goto queue_next;
>
>  startio:
> +     blk_stop_queue(q);
>       start_io(h);
>  }
>
> @@ -1066,8 +1066,8 @@
>       /*
>        * See if we can queue up some more IO
>        */
> -     do_ida_request(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
>       spin_unlock_irqrestore(IDA_LOCK(h->ctlr), flags);
> +     blk_start_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
>  }
>
>  /*

Simply reverting these changes allows the driver to successfully do partition
detect, but it quickly hangs if any significant amount of I/O is attempted. The
hang in this case seems to just affect processes trying to do I/O on the array;
it is not a whole-system-deadlock.

Test machine is SMP ppro.

--Adam


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-21 15:28 cpqarray broken since 2.5.19 Adam Kropelin
@ 2002-07-24 13:39 ` Jens Axboe
  2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
  2002-07-25  0:32   ` Adam Kropelin
  0 siblings, 2 replies; 16+ messages in thread
From: Jens Axboe @ 2002-07-24 13:39 UTC (permalink / raw)
  To: Adam Kropelin; +Cc: linux-kernel

On Sun, Jul 21 2002, Adam Kropelin wrote:
> The cpqarray driver seems to have been broken around 2.5.19 with the
> blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire
> system when it tries to do partition detection. The bits from the 2.5.19 patch
> which seem to relate are:
> 
> > @@ -916,6 +915,7 @@
> >       goto queue_next;
> >
> >  startio:
> > +     blk_stop_queue(q);
> >       start_io(h);
> >  }
> >
> > @@ -1066,8 +1066,8 @@
> >       /*
> >        * See if we can queue up some more IO
> >        */
> > -     do_ida_request(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> >       spin_unlock_irqrestore(IDA_LOCK(h->ctlr), flags);
> > +     blk_start_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> >  }
> >
> >  /*
> 
> Simply reverting these changes allows the driver to successfully do
> partition detect, but it quickly hangs if any significant amount of
> I/O is attempted. The hang in this case seems to just affect processes
> trying to do I/O on the array; it is not a whole-system-deadlock.
> 
> Test machine is SMP ppro.

Thanks for the report. Could you just kill the spin_lock/unlock in
blk_stop_queue() in drivers/block/ll_rw_blk.c and see if it works?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 13:39 ` Jens Axboe
@ 2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:09     ` Jens Axboe
  2002-07-24 14:11     ` Marcin Dalecki
  2002-07-25  0:32   ` Adam Kropelin
  1 sibling, 2 replies; 16+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-07-24 14:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Adam Kropelin, linux-kernel


On Wed, 24 Jul 2002, Jens Axboe wrote:

> On Sun, Jul 21 2002, Adam Kropelin wrote:
> > The cpqarray driver seems to have been broken around 2.5.19 with the
> > blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire
> > system when it tries to do partition detection. The bits from the 2.5.19 patch
> > which seem to relate are:
> >
> > > @@ -916,6 +915,7 @@
> > >       goto queue_next;
> > >
> > >  startio:
> > > +     blk_stop_queue(q);
> > >       start_io(h);
> > >  }
> > >
> > > @@ -1066,8 +1066,8 @@
> > >       /*
> > >        * See if we can queue up some more IO
> > >        */
> > > -     do_ida_request(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> > >       spin_unlock_irqrestore(IDA_LOCK(h->ctlr), flags);
> > > +     blk_start_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> > >  }
> > >
> > >  /*
> >
> > Simply reverting these changes allows the driver to successfully do
> > partition detect, but it quickly hangs if any significant amount of
> > I/O is attempted. The hang in this case seems to just affect processes
> > trying to do I/O on the array; it is not a whole-system-deadlock.
> >
> > Test machine is SMP ppro.
>
> Thanks for the report. Could you just kill the spin_lock/unlock in
> blk_stop_queue() in drivers/block/ll_rw_blk.c and see if it works?
>
> --
> Jens Axboe

Jens, the same is in cciss.c.
Please remove locking from blk_stop_queue() (as you suggested) or intrduce
unlocking in request_functions.

--
Bartlomiej


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
@ 2002-07-24 14:09     ` Jens Axboe
  2002-07-24 14:11     ` Marcin Dalecki
  1 sibling, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2002-07-24 14:09 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Adam Kropelin, linux-kernel

On Wed, Jul 24 2002, Bartlomiej Zolnierkiewicz wrote:
> 
> On Wed, 24 Jul 2002, Jens Axboe wrote:
> 
> > On Sun, Jul 21 2002, Adam Kropelin wrote:
> > > The cpqarray driver seems to have been broken around 2.5.19 with the
> > > blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire
> > > system when it tries to do partition detection. The bits from the 2.5.19 patch
> > > which seem to relate are:
> > >
> > > > @@ -916,6 +915,7 @@
> > > >       goto queue_next;
> > > >
> > > >  startio:
> > > > +     blk_stop_queue(q);
> > > >       start_io(h);
> > > >  }
> > > >
> > > > @@ -1066,8 +1066,8 @@
> > > >       /*
> > > >        * See if we can queue up some more IO
> > > >        */
> > > > -     do_ida_request(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> > > >       spin_unlock_irqrestore(IDA_LOCK(h->ctlr), flags);
> > > > +     blk_start_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + h->ctlr));
> > > >  }
> > > >
> > > >  /*
> > >
> > > Simply reverting these changes allows the driver to successfully do
> > > partition detect, but it quickly hangs if any significant amount of
> > > I/O is attempted. The hang in this case seems to just affect processes
> > > trying to do I/O on the array; it is not a whole-system-deadlock.
> > >
> > > Test machine is SMP ppro.
> >
> > Thanks for the report. Could you just kill the spin_lock/unlock in
> > blk_stop_queue() in drivers/block/ll_rw_blk.c and see if it works?
> >
> > --
> > Jens Axboe
> 
> Jens, the same is in cciss.c.
> Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> unlocking in request_functions.

I just fixed both of them in my BK and pushed it on. I opted for adding
a __blk_stop_queue() as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:09     ` Jens Axboe
@ 2002-07-24 14:11     ` Marcin Dalecki
  2002-07-24 14:19       ` Jens Axboe
  2002-07-24 14:21       ` Bartlomiej Zolnierkiewicz
  1 sibling, 2 replies; 16+ messages in thread
From: Marcin Dalecki @ 2002-07-24 14:11 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Jens Axboe, Adam Kropelin, linux-kernel


> 
> Jens, the same is in cciss.c.
> Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> unlocking in request_functions.
> 
Bartek I think the removal is just for reassertion that the
locking is the problem. You can't remove it easly from
blk_stop_queue() unless you make it mandatory that blk_stop_queue
has to be run with the lock already held. Or in other words
basically -> Don't use blk_stop_queue() outside of ->request_fn.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:11     ` Marcin Dalecki
@ 2002-07-24 14:19       ` Jens Axboe
  2002-07-24 14:41         ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:21       ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2002-07-24 14:19 UTC (permalink / raw)
  To: martin; +Cc: Bartlomiej Zolnierkiewicz, Adam Kropelin, linux-kernel

On Wed, Jul 24 2002, Marcin Dalecki wrote:
> 
> >
> >Jens, the same is in cciss.c.
> >Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> >unlocking in request_functions.
> >
> Bartek I think the removal is just for reassertion that the
> locking is the problem. You can't remove it easly from
> blk_stop_queue() unless you make it mandatory that blk_stop_queue
> has to be run with the lock already held. Or in other words
> basically -> Don't use blk_stop_queue() outside of ->request_fn.

Of couse Bart is advocating just making sure that every caller of
blk_stop_queue() _has_ the queue_lock before calling it, not removing
the locking there.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:11     ` Marcin Dalecki
  2002-07-24 14:19       ` Jens Axboe
@ 2002-07-24 14:21       ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:27         ` Marcin Dalecki
  1 sibling, 1 reply; 16+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-07-24 14:21 UTC (permalink / raw)
  To: martin; +Cc: Jens Axboe, Adam Kropelin, linux-kernel


On Wed, 24 Jul 2002, Marcin Dalecki wrote:

>
> >
> > Jens, the same is in cciss.c.
> > Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> > unlocking in request_functions.
> >
> Bartek I think the removal is just for reassertion that the
> locking is the problem. You can't remove it easly from
> blk_stop_queue() unless you make it mandatory that blk_stop_queue
> has to be run with the lock already held. Or in other words
> basically -> Don't use blk_stop_queue() outside of ->request_fn.

Yep, that how it should be only used.
However you are right these stop/start need some checking.

About idea of using QUEUE_FLAG_STOPPED as IDE_BUSY right now is no go
and will never be.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:21       ` Bartlomiej Zolnierkiewicz
@ 2002-07-24 14:27         ` Marcin Dalecki
  2002-07-24 14:45           ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 16+ messages in thread
From: Marcin Dalecki @ 2002-07-24 14:27 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Jens Axboe, linux-kernel

Bartlomiej Zolnierkiewicz wrote:
> On Wed, 24 Jul 2002, Marcin Dalecki wrote:
> 
> 
>>>Jens, the same is in cciss.c.
>>>Please remove locking from blk_stop_queue() (as you suggested) or intrduce
>>>unlocking in request_functions.
>>>
>>
>>Bartek I think the removal is just for reassertion that the
>>locking is the problem. You can't remove it easly from
>>blk_stop_queue() unless you make it mandatory that blk_stop_queue
>>has to be run with the lock already held. Or in other words
>>basically -> Don't use blk_stop_queue() outside of ->request_fn.
> 
> 
> Yep, that how it should be only used.
> However you are right these stop/start need some checking.
> 
> About idea of using QUEUE_FLAG_STOPPED as IDE_BUSY right now is no go
> and will never be.

Hold on please. Becouse if you think one step further ->
not blocking blk_stop_queue() in do_request or more
precisely at places where the IDE_BUSY get's set 8-) you suddenly get 
completely rid of it if you replace the "back-calls" to do_request() in 
ata_irq_request() and ide_timer_expiry() with blk_start_queue()...
No direct manipulation whatsever.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:19       ` Jens Axboe
@ 2002-07-24 14:41         ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:44           ` Jens Axboe
  2002-07-24 14:46           ` Marcin Dalecki
  0 siblings, 2 replies; 16+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-07-24 14:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: martin, Adam Kropelin, linux-kernel


On Wed, 24 Jul 2002, Jens Axboe wrote:

> On Wed, Jul 24 2002, Marcin Dalecki wrote:
> >
> > >
> > >Jens, the same is in cciss.c.
> > >Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> > >unlocking in request_functions.
> > >
> > Bartek I think the removal is just for reassertion that the
> > locking is the problem. You can't remove it easly from
> > blk_stop_queue() unless you make it mandatory that blk_stop_queue
> > has to be run with the lock already held. Or in other words
> > basically -> Don't use blk_stop_queue() outside of ->request_fn.
>
> Of couse Bart is advocating just making sure that every caller of
> blk_stop_queue() _has_ the queue_lock before calling it, not removing
> the locking there.
>
> --
> Jens Axboe

And I'm also advocating for __blk_start_queue() ideal for usage in
ata_end_request(). And moving spin_lock scope to cover test_and_set_bit()
in blk_start_queue() (for coherency and avoiding spurious calls to
q->request_fn() ).

However IDE_BUSY -> QUEUE_STOPPED_FLAG is braindamaged idea.

--
Bartlomiej


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:41         ` Bartlomiej Zolnierkiewicz
@ 2002-07-24 14:44           ` Jens Axboe
  2002-07-24 14:46           ` Marcin Dalecki
  1 sibling, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2002-07-24 14:44 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: martin, Adam Kropelin, linux-kernel

On Wed, Jul 24 2002, Bartlomiej Zolnierkiewicz wrote:
> 
> On Wed, 24 Jul 2002, Jens Axboe wrote:
> 
> > On Wed, Jul 24 2002, Marcin Dalecki wrote:
> > >
> > > >
> > > >Jens, the same is in cciss.c.
> > > >Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> > > >unlocking in request_functions.
> > > >
> > > Bartek I think the removal is just for reassertion that the
> > > locking is the problem. You can't remove it easly from
> > > blk_stop_queue() unless you make it mandatory that blk_stop_queue
> > > has to be run with the lock already held. Or in other words
> > > basically -> Don't use blk_stop_queue() outside of ->request_fn.
> >
> > Of couse Bart is advocating just making sure that every caller of
> > blk_stop_queue() _has_ the queue_lock before calling it, not removing
> > the locking there.
> >
> > --
> > Jens Axboe
> 
> And I'm also advocating for __blk_start_queue() ideal for usage in
> ata_end_request(). And moving spin_lock scope to cover test_and_set_bit()
> in blk_start_queue() (for coherency and avoiding spurious calls to
> q->request_fn() ).

Feel free to send me the patch to compliment the __blk_stop_queue() part
(which is in axboe@master:/home/axboe/BK/linux-2.5-block, btw).

> However IDE_BUSY -> QUEUE_STOPPED_FLAG is braindamaged idea.

I agree.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:27         ` Marcin Dalecki
@ 2002-07-24 14:45           ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 16+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-07-24 14:45 UTC (permalink / raw)
  To: martin; +Cc: Jens Axboe, linux-kernel


On Wed, 24 Jul 2002, Marcin Dalecki wrote:

> Bartlomiej Zolnierkiewicz wrote:
> > On Wed, 24 Jul 2002, Marcin Dalecki wrote:
> >
> >
> >>>Jens, the same is in cciss.c.
> >>>Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> >>>unlocking in request_functions.
> >>>
> >>
> >>Bartek I think the removal is just for reassertion that the
> >>locking is the problem. You can't remove it easly from
> >>blk_stop_queue() unless you make it mandatory that blk_stop_queue
> >>has to be run with the lock already held. Or in other words
> >>basically -> Don't use blk_stop_queue() outside of ->request_fn.
> >
> >
> > Yep, that how it should be only used.
> > However you are right these stop/start need some checking.
> >
> > About idea of using QUEUE_FLAG_STOPPED as IDE_BUSY right now is no go
> > and will never be.
>
> Hold on please. Becouse if you think one step further ->
> not blocking blk_stop_queue() in do_request or more
> precisely at places where the IDE_BUSY get's set 8-) you suddenly get
> completely rid of it if you replace the "back-calls" to do_request() in
> ata_irq_request() and ide_timer_expiry() with blk_start_queue()...
> No direct manipulation whatsever.

If you think it is good idea, go for it, as I don't work on your 2.5 IDE
anymore. Game is over, I will start my tree (based on 2.4) soon.

Don't say later you haven't been warned.

--
Bartlomiej


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:41         ` Bartlomiej Zolnierkiewicz
  2002-07-24 14:44           ` Jens Axboe
@ 2002-07-24 14:46           ` Marcin Dalecki
  2002-07-24 15:15             ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 16+ messages in thread
From: Marcin Dalecki @ 2002-07-24 14:46 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Jens Axboe, linux-kernel

Bartlomiej Zolnierkiewicz wrote:
> On Wed, 24 Jul 2002, Jens Axboe wrote:
> 
> 
>>On Wed, Jul 24 2002, Marcin Dalecki wrote:
>>
>>>>Jens, the same is in cciss.c.
>>>>Please remove locking from blk_stop_queue() (as you suggested) or intrduce
>>>>unlocking in request_functions.
>>>>
>>>
>>>Bartek I think the removal is just for reassertion that the
>>>locking is the problem. You can't remove it easly from
>>>blk_stop_queue() unless you make it mandatory that blk_stop_queue
>>>has to be run with the lock already held. Or in other words
>>>basically -> Don't use blk_stop_queue() outside of ->request_fn.
>>
>>Of couse Bart is advocating just making sure that every caller of
>>blk_stop_queue() _has_ the queue_lock before calling it, not removing
>>the locking there.
>>
>>--
>>Jens Axboe
> 
> 
> And I'm also advocating for __blk_start_queue() ideal for usage in
> ata_end_request(). And moving spin_lock scope to cover test_and_set_bit()
> in blk_start_queue() (for coherency and avoiding spurious calls to
> q->request_fn() )

You mean this:

void blk_start_queue(request_queue_t *q)
{
	if (test_bit(QUEUE_FLAG_STOPPED, &q->queue_flags)) {
		unsigned long flags;

		spin_lock_irqsave(q->queue_lock, flags);
		if (!test_bit(QUEUE_FLAG_STOPPED, &q->queue_flags)) {
			spin_unlock_irqrestore(q->queue_lock, flags);
			return;
		}
		clear_bit(QUEUE_FLAG_STOPPED, &q->queue_flags);
		if (!elv_queue_empty(q))
			q->request_fn(q);
		spin_unlock_irqrestore(q->queue_lock, flags);
	}
}

Becouse this is avoiding checking for spinlock in the case
the queue is not stopped.

The spinlock free variant isn't needed right.

> However IDE_BUSY -> QUEUE_STOPPED_FLAG is braindamaged idea.

You should never see it. Think of it as a mind bridge between
IDE_BUSY and queue plug and unplug please. Becouse the purpose
of IDE_BUSY *is* to effectively stall queue processing for
the time of internally issued request. OK?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 14:46           ` Marcin Dalecki
@ 2002-07-24 15:15             ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 16+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-07-24 15:15 UTC (permalink / raw)
  To: martin; +Cc: Jens Axboe, linux-kernel


On Wed, 24 Jul 2002, Marcin Dalecki wrote:

> Bartlomiej Zolnierkiewicz wrote:
> > On Wed, 24 Jul 2002, Jens Axboe wrote:
> >
> >
> >>On Wed, Jul 24 2002, Marcin Dalecki wrote:
> >>
> >>>>Jens, the same is in cciss.c.
> >>>>Please remove locking from blk_stop_queue() (as you suggested) or intrduce
> >>>>unlocking in request_functions.
> >>>>
> >>>
> >>>Bartek I think the removal is just for reassertion that the
> >>>locking is the problem. You can't remove it easly from
> >>>blk_stop_queue() unless you make it mandatory that blk_stop_queue
> >>>has to be run with the lock already held. Or in other words
> >>>basically -> Don't use blk_stop_queue() outside of ->request_fn.
> >>
> >>Of couse Bart is advocating just making sure that every caller of
> >>blk_stop_queue() _has_ the queue_lock before calling it, not removing
> >>the locking there.
> >>
> >>--
> >>Jens Axboe
> >
> >
> > And I'm also advocating for __blk_start_queue() ideal for usage in
> > ata_end_request(). And moving spin_lock scope to cover test_and_set_bit()
> > in blk_start_queue() (for coherency and avoiding spurious calls to
> > q->request_fn() )
>
> You mean this:
>
> void blk_start_queue(request_queue_t *q)
> {
> 	if (test_bit(QUEUE_FLAG_STOPPED, &q->queue_flags)) {
> 		unsigned long flags;
>
> 		spin_lock_irqsave(q->queue_lock, flags);
> 		if (!test_bit(QUEUE_FLAG_STOPPED, &q->queue_flags)) {
> 			spin_unlock_irqrestore(q->queue_lock, flags);
> 			return;
> 		}
> 		clear_bit(QUEUE_FLAG_STOPPED, &q->queue_flags);
> 		if (!elv_queue_empty(q))
> 			q->request_fn(q);
> 		spin_unlock_irqrestore(q->queue_lock, flags);
> 	}
> }

No I mean full locking version. Look at usage context - queue will be
almost (== queueing drivers) stopped.

> Becouse this is avoiding checking for spinlock in the case
> the queue is not stopped.
>
> The spinlock free variant isn't needed right.

Is needed/helpful.

> > However IDE_BUSY -> QUEUE_STOPPED_FLAG is braindamaged idea.
>
> You should never see it. Think of it as a mind bridge between
> IDE_BUSY and queue plug and unplug please. Becouse the purpose
> of IDE_BUSY *is* to effectively stall queue processing for
> the time of internally issued request. OK?

I don't care == I'm tired of bullshit.

--
Bartlomiej


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-24 13:39 ` Jens Axboe
  2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
@ 2002-07-25  0:32   ` Adam Kropelin
  2002-07-25 10:39     ` Jens Axboe
  1 sibling, 1 reply; 16+ messages in thread
From: Adam Kropelin @ 2002-07-25  0:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

> On Sun, Jul 21 2002, Adam Kropelin wrote:
> > The cpqarray driver seems to have been broken around 2.5.19 with the
> > blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire

On Wed, Jul 24, 2002 at 03:39:59PM +0200, Jens Axboe wrote:
> Thanks for the report. Could you just kill the spin_lock/unlock in
> blk_stop_queue() in drivers/block/ll_rw_blk.c and see if it works?

Hi, Jens,

I killed the spin_lock/unlock as you directed (and made no other changes
to the tree). Result is the same as before: hard lock on partition detect.
The first time I tried it I got an oops but didn't have the serial console
up so it didn't get captured. (Stupid, stupid) Subsequent attempts have just
the hard lock with no oops.

--Adam

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-25  0:32   ` Adam Kropelin
@ 2002-07-25 10:39     ` Jens Axboe
  2002-07-26  0:30       ` Adam Kropelin
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2002-07-25 10:39 UTC (permalink / raw)
  To: Adam Kropelin; +Cc: linux-kernel

On Wed, Jul 24 2002, Adam Kropelin wrote:
> > On Sun, Jul 21 2002, Adam Kropelin wrote:
> > > The cpqarray driver seems to have been broken around 2.5.19 with the
> > > blk_start_queue/blk_stop_queue changes. As-is, cpqarray deadlocks the entire
> 
> On Wed, Jul 24, 2002 at 03:39:59PM +0200, Jens Axboe wrote:
> > Thanks for the report. Could you just kill the spin_lock/unlock in
> > blk_stop_queue() in drivers/block/ll_rw_blk.c and see if it works?
> 
> Hi, Jens,
> 
> I killed the spin_lock/unlock as you directed (and made no other changes
> to the tree). Result is the same as before: hard lock on partition detect.
> The first time I tried it I got an oops but didn't have the serial console
> up so it didn't get captured. (Stupid, stupid) Subsequent attempts have just
> the hard lock with no oops.

Could you please repeat the expirement with the 2.5.28 just released,
and record any oopses :-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: cpqarray broken since 2.5.19
  2002-07-25 10:39     ` Jens Axboe
@ 2002-07-26  0:30       ` Adam Kropelin
  0 siblings, 0 replies; 16+ messages in thread
From: Adam Kropelin @ 2002-07-26  0:30 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

On Thu, Jul 25, 2002 at 12:39:40PM +0200, Jens Axboe wrote:
> Could you please repeat the expirement with the 2.5.28 just released,
> and record any oopses :-)

Done. Same behavior. It oopses when I insmod if I build as a module. Just
hardlocks, no oops, if I build it in statically. Below is the decoded oops
from the module case. I couldn't get module symbols since it dies when I
insmod so let me know if the trace looks fishy.

--Adam
 
ksymoops 2.4.1 on i686 2.5.28.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.5.28/ (default)
     -m /boot/System.map-2.5.28 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Warning (compare_maps): ksyms_base symbol GPLONLY___wake_up_sync not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_idle_cpu not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_set_cpus_allowed not found in System.map.  Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 00000000
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0010:[<ca89b725>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010046
eax: 00000000   ebx: c784fe80   ecx: 00000001   edx: 00000000
esi: c889cc00   edi: c889cc00   ebp: c9679e20   esp: c9679db8
ds: 0018   es: 0018   ss: 0018
Stack: 00000002 00000286 c0365e40 00000000 00000001 c784fe80 24000001 00000005 
       c9679e20 c0108e9a 00000005 c889cc00 c9679e20 c9679e20 c03608a0 00000005 
       c784fe80 c0109082 00000005 c9679e20 c784fe80 00000000 c8fbf980 c8fbf980 
Call Trace: [<c0108e9a>] [<c0109082>] [<c0107978>] [<c0167a6c>] [<c0153a1f>] 
   [<c012bf10>] [<c0262ec1>] [<c01319f3>] [<c0131b9c>] [<c0148bf3>] [<c02674d7>] 
   [<c0163360>] [<c013b094>] [<c0148f99>] [<c02635d4>] [<c013b213>] [<c0107033>] 
Code: f0 fe 08 0f 88 fa 14 00 00 83 e1 01 0f 84 01 02 00 00 e9 ea 

>>EIP; ca89b725 <END_OF_CODE+a4c0185/????>   <=====
Trace; c0108e9a <handle_IRQ_event+3a/60>
Trace; c0109082 <do_IRQ+a2/110>
Trace; c0107978 <common_interrupt+18/20>
Trace; c0167a6c <.text.lock.inode+eb/ff>
Trace; c0153a1f <__mark_inode_dirty+2f/c0>
Trace; c012bf10 <generic_file_write+3b0/760>
Trace; c0262ec1 <sys_recvfrom+a1/100>
Trace; c01319f3 <__alloc_pages+53/1f0>
Trace; c0131b9c <__get_free_pages+c/50>
Trace; c0148bf3 <__pollwait+33/a0>
Trace; c02674d7 <datagram_poll+27/d3>
Trace; c0163360 <ext3_file_write+0/60>
Trace; c013b094 <do_readv_writev+204/2e0>
Trace; c0148f99 <do_select+259/270>
Trace; c02635d4 <sys_socketcall+154/200>
Trace; c013b213 <sys_writev+43/54>
Trace; c0107033 <syscall_call+7/b>
Code;  ca89b725 <END_OF_CODE+a4c0185/????>
00000000 <_EIP>:
Code;  ca89b725 <END_OF_CODE+a4c0185/????>   <=====
   0:   f0 fe 08                  lock decb (%eax)   <=====
Code;  ca89b728 <END_OF_CODE+a4c0188/????>
   3:   0f 88 fa 14 00 00         js     1503 <_EIP+0x1503> ca89cc28 <END_OF_CODE+a4c1688/????>
Code;  ca89b72e <END_OF_CODE+a4c018e/????>
   9:   83 e1 01                  and    $0x1,%ecx
Code;  ca89b731 <END_OF_CODE+a4c0191/????>
   c:   0f 84 01 02 00 00         je     213 <_EIP+0x213> ca89b938 <END_OF_CODE+a4c0398/????>
Code;  ca89b737 <END_OF_CODE+a4c0197/????>
  12:   e9 ea 00 00 00            jmp    101 <_EIP+0x101> ca89b826 <END_OF_CODE+a4c0286/????>

 <0>Kernel panic: Aiee, killing interrupt handler!

5 warnings issued.  Results may not be reliable.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2002-07-26  0:27 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-07-21 15:28 cpqarray broken since 2.5.19 Adam Kropelin
2002-07-24 13:39 ` Jens Axboe
2002-07-24 14:07   ` Bartlomiej Zolnierkiewicz
2002-07-24 14:09     ` Jens Axboe
2002-07-24 14:11     ` Marcin Dalecki
2002-07-24 14:19       ` Jens Axboe
2002-07-24 14:41         ` Bartlomiej Zolnierkiewicz
2002-07-24 14:44           ` Jens Axboe
2002-07-24 14:46           ` Marcin Dalecki
2002-07-24 15:15             ` Bartlomiej Zolnierkiewicz
2002-07-24 14:21       ` Bartlomiej Zolnierkiewicz
2002-07-24 14:27         ` Marcin Dalecki
2002-07-24 14:45           ` Bartlomiej Zolnierkiewicz
2002-07-25  0:32   ` Adam Kropelin
2002-07-25 10:39     ` Jens Axboe
2002-07-26  0:30       ` Adam Kropelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox