[PATCH] SCSI: Fix some locking issues

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] SCSI: Fix some locking issues
@ 2008-06-29 11:38 Elias Oltmanns
  2008-07-01 21:37 ` Elias Oltmanns
  0 siblings, 1 reply; 24+ messages in thread
From: Elias Oltmanns @ 2008-06-29 11:38 UTC (permalink / raw)
  To: linux-scsi; +Cc: stable

Hi all,

the following patch applies to 2.6.26-rc8. However, the same issues
exist in the 2.6.25.y stable tree and the patch applies with offsets to
2.6.25.9; I haven't checked 2.6.24 and before.

Regards,

Elias


From: Elias Oltmanns <eo@nebensachen.de>
Subject: SCSI: Fix some locking issues

Make sure that host_blocked is consistently protected by the host_lock.
Similarly, device_block has to be protected by the queue_lock.  Also,
blk_plug_device() has to be called with the queue_lock held in
scsi_host_queue_ready().

Cc: stable@kernel.org
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
---

 drivers/scsi/scsi.c     |    6 ++++++
 drivers/scsi/scsi_lib.c |   13 +++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 110e776..b8b7982 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -826,8 +826,10 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 {
 	struct scsi_device *sdev = cmd->device;
 	struct Scsi_Host *shost = sdev->host;
+	struct request_queue *q = sdev->request_queue;
 	struct scsi_driver *drv;
 	unsigned int good_bytes;
+	unsigned long flags;
 
 	scsi_device_unbusy(sdev);
 
@@ -839,8 +841,12 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 	 *
 	 * XXX(hch): What about locking?
          */
+	spin_lock_irqsave(shost->host_lock, flags);
         shost->host_blocked = 0;
+	spin_unlock(shost->host_lock);
+	spin_lock(q->queue_lock);
         sdev->device_blocked = 0;
+	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	/*
 	 * If we have valid sense information, then some kind of recovery
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a82d2fe..65d0c39 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -133,10 +133,15 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 	 * if a command is requeued with no other commands outstanding
 	 * either for the device or for the host.
 	 */
-	if (reason == SCSI_MLQUEUE_HOST_BUSY)
+	if (reason == SCSI_MLQUEUE_HOST_BUSY) {
+		spin_lock_irqsave(host->host_lock, flags);
 		host->host_blocked = host->max_host_blocked;
-	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
+		spin_unlock_irqrestore(host->host_lock, flags);
+	} else if (reason == SCSI_MLQUEUE_DEVICE_BUSY) {
+		spin_lock_irqsave(q->queue_lock, flags);
 		device->device_blocked = device->max_device_blocked;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
 
 	/*
 	 * Decrement the counters, since these commands are no longer
@@ -1320,7 +1325,11 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 				printk("scsi%d unblocking host at zero depth\n",
 					shost->host_no));
 		} else {
+			spin_unlock(shost->host_lock);
+			spin_lock(q->queue_lock);
 			blk_plug_device(q);
+			spin_unlock(q->queue_lock);
+			spin_lock(shost->host_lock);
 			return 0;
 		}
 	}

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-06-29 11:38 [PATCH] SCSI: Fix some locking issues Elias Oltmanns
@ 2008-07-01 21:37 ` Elias Oltmanns
  2008-07-02  1:55   ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-01 21:37 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, stable

Hi James,

sorry for bothering you but I've just noticed that the patch below has
neither been scheduled for the stable review, nor queued up for Linus.
May be you just don't consider this serious enough for these trees but I
wanted to make sure that the situation will be dealt with eventualy. The
patch applies to 2.6.26-rc8.

Regards,

Elias


From: Elias Oltmanns <eo@nebensachen.de>
Subject: SCSI: Fix some locking issues

Make sure that host_blocked is consistently protected by the host_lock.
Similarly, device_block has to be protected by the queue_lock.  Also,
blk_plug_device() has to be called with the queue_lock held in
scsi_host_queue_ready().

Cc: stable@kernel.org
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
---

 drivers/scsi/scsi.c     |    6 ++++++
 drivers/scsi/scsi_lib.c |   13 +++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 110e776..b8b7982 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -826,8 +826,10 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 {
 	struct scsi_device *sdev = cmd->device;
 	struct Scsi_Host *shost = sdev->host;
+	struct request_queue *q = sdev->request_queue;
 	struct scsi_driver *drv;
 	unsigned int good_bytes;
+	unsigned long flags;
 
 	scsi_device_unbusy(sdev);
 
@@ -839,8 +841,12 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 	 *
 	 * XXX(hch): What about locking?
          */
+	spin_lock_irqsave(shost->host_lock, flags);
         shost->host_blocked = 0;
+	spin_unlock(shost->host_lock);
+	spin_lock(q->queue_lock);
         sdev->device_blocked = 0;
+	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	/*
 	 * If we have valid sense information, then some kind of recovery
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a82d2fe..65d0c39 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -133,10 +133,15 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 	 * if a command is requeued with no other commands outstanding
 	 * either for the device or for the host.
 	 */
-	if (reason == SCSI_MLQUEUE_HOST_BUSY)
+	if (reason == SCSI_MLQUEUE_HOST_BUSY) {
+		spin_lock_irqsave(host->host_lock, flags);
 		host->host_blocked = host->max_host_blocked;
-	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
+		spin_unlock_irqrestore(host->host_lock, flags);
+	} else if (reason == SCSI_MLQUEUE_DEVICE_BUSY) {
+		spin_lock_irqsave(q->queue_lock, flags);
 		device->device_blocked = device->max_device_blocked;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
 
 	/*
 	 * Decrement the counters, since these commands are no longer
@@ -1320,7 +1325,11 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 				printk("scsi%d unblocking host at zero depth\n",
 					shost->host_no));
 		} else {
+			spin_unlock(shost->host_lock);
+			spin_lock(q->queue_lock);
 			blk_plug_device(q);
+			spin_unlock(q->queue_lock);
+			spin_lock(shost->host_lock);
 			return 0;
 		}
 	}

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-01 21:37 ` Elias Oltmanns
@ 2008-07-02  1:55   ` James Bottomley
  2008-07-02  7:08     ` Elias Oltmanns
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-02  1:55 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi, stable

On Tue, 2008-07-01 at 23:37 +0200, Elias Oltmanns wrote:
> Hi James,
> 
> sorry for bothering you but I've just noticed that the patch below has
> neither been scheduled for the stable review, nor queued up for Linus.
> May be you just don't consider this serious enough for these trees but I
> wanted to make sure that the situation will be dealt with eventualy. The
> patch applies to 2.6.26-rc8.

OK, well at first glance, the locking around device_blocked and
host_blocked looks pointless.  What are the failure traces you're using
to decide they need spinlock protection?

The blk_plug_queue change looks reasonable ... however, blk_plug_queue
itself looks like it might not entirely need the queue lock ... I need
to investigate more closely.

James

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02  1:55   ` James Bottomley
@ 2008-07-02  7:08     ` Elias Oltmanns
  2008-07-02 11:50       ` Jens Axboe
  2008-07-02 14:46       ` James Bottomley
  0 siblings, 2 replies; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-02  7:08 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, stable

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Tue, 2008-07-01 at 23:37 +0200, Elias Oltmanns wrote:
>> Hi James,
>
>> 
>> sorry for bothering you but I've just noticed that the patch below has
>> neither been scheduled for the stable review, nor queued up for Linus.
>> May be you just don't consider this serious enough for these trees but I
>> wanted to make sure that the situation will be dealt with eventualy. The
>> patch applies to 2.6.26-rc8.
>
> OK, well at first glance, the locking around device_blocked and
> host_blocked looks pointless.  What are the failure traces you're using
> to decide they need spinlock protection?

scsi_queue_insert() as well as scsi_finish_command() can be called at
any time as part of regular command completion or error handling. There
is no reason why the ->request_fn() for the same device or for another
device on the same host should not be in progress at the same time.

>
> The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> itself looks like it might not entirely need the queue lock ... I need
> to investigate more closely.

Well, I rather think it does. We have to serialise access to the
unplug_timer and there is a call to __set_bit() which, as I understand,
requires the calling function to ensure atomicity.

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02  7:08     ` Elias Oltmanns
@ 2008-07-02 11:50       ` Jens Axboe
  2008-07-02 14:49         ` James Bottomley
  2008-07-02 14:46       ` James Bottomley
  1 sibling, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2008-07-02 11:50 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: James Bottomley, linux-scsi, stable

On Wed, Jul 02 2008, Elias Oltmanns wrote:
> > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> > itself looks like it might not entirely need the queue lock ... I need
> > to investigate more closely.
> 
> Well, I rather think it does. We have to serialise access to the
> unplug_timer and there is a call to __set_bit() which, as I understand,
> requires the calling function to ensure atomicity.

Yep, blk_plug_device() needs to be called with the queue lock held.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02  7:08     ` Elias Oltmanns
  2008-07-02 11:50       ` Jens Axboe
@ 2008-07-02 14:46       ` James Bottomley
  2008-07-02 15:59         ` Elias Oltmanns
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-02 14:46 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi

On Wed, 2008-07-02 at 09:08 +0200, Elias Oltmanns wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Tue, 2008-07-01 at 23:37 +0200, Elias Oltmanns wrote:
> >> Hi James,
> >
> >> 
> >> sorry for bothering you but I've just noticed that the patch below has
> >> neither been scheduled for the stable review, nor queued up for Linus.
> >> May be you just don't consider this serious enough for these trees but I
> >> wanted to make sure that the situation will be dealt with eventualy. The
> >> patch applies to 2.6.26-rc8.
> >
> > OK, well at first glance, the locking around device_blocked and
> > host_blocked looks pointless.  What are the failure traces you're using
> > to decide they need spinlock protection?
> 
> scsi_queue_insert() as well as scsi_finish_command() can be called at
> any time as part of regular command completion or error handling. There
> is no reason why the ->request_fn() for the same device or for another
> device on the same host should not be in progress at the same time.

So would I be correct in deducing you haven't seen an observed failure?

The reason no locks are necessary is that there's no race to mediate.
The checks are only is it set or not ... unless we get down to zero
depth in which case the decrements are done under lock.

> > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> > itself looks like it might not entirely need the queue lock ... I need
> > to investigate more closely.
> 
> Well, I rather think it does. We have to serialise access to the
> unplug_timer and there is a call to __set_bit() which, as I understand,
> requires the calling function to ensure atomicity.

It does at the moment ... it just looks like it could make use of
test_and_set_bit() to avoid the requirement.  The access to the timer
uses mod_timer() which is specifically designed not to require
serialisation.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 11:50       ` Jens Axboe
@ 2008-07-02 14:49         ` James Bottomley
  2008-07-02 18:45           ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-02 14:49 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Elias Oltmanns, linux-scsi, stable

On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> On Wed, Jul 02 2008, Elias Oltmanns wrote:
> > > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> > > itself looks like it might not entirely need the queue lock ... I need
> > > to investigate more closely.
> > 
> > Well, I rather think it does. We have to serialise access to the
> > unplug_timer and there is a call to __set_bit() which, as I understand,
> > requires the calling function to ensure atomicity.
> 
> Yep, blk_plug_device() needs to be called with the queue lock held.

That's what the comment says ... but if you replaced the test_bit with
an atomic operation then the rest of it does look to be in no need of
serialisation ... unless there's something I missed?

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 14:46       ` James Bottomley
@ 2008-07-02 15:59         ` Elias Oltmanns
  2008-07-02 16:23           ` Matthew Wilcox
  2008-07-02 16:32           ` James Bottomley
  0 siblings, 2 replies; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-02 15:59 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Wed, 2008-07-02 at 09:08 +0200, Elias Oltmanns wrote:
>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
>> > On Tue, 2008-07-01 at 23:37 +0200, Elias Oltmanns wrote:
>> >> Hi James,
>> >
>> >> 
>> >> sorry for bothering you but I've just noticed that the patch below has
>> >> neither been scheduled for the stable review, nor queued up for Linus.
>> >> May be you just don't consider this serious enough for these trees but I
>> >> wanted to make sure that the situation will be dealt with eventualy. The
>> >> patch applies to 2.6.26-rc8.
>> >
>> > OK, well at first glance, the locking around device_blocked and
>> > host_blocked looks pointless.  What are the failure traces you're using
>> > to decide they need spinlock protection?
>> 
>> scsi_queue_insert() as well as scsi_finish_command() can be called at
>> any time as part of regular command completion or error handling. There
>> is no reason why the ->request_fn() for the same device or for another
>> device on the same host should not be in progress at the same time.
>
> So would I be correct in deducing you haven't seen an observed failure?

Yes, I don't even have an SMP machine.

>
> The reason no locks are necessary is that there's no race to mediate.
> The checks are only is it set or not ...

I'm not sure whether that is of any consequence. Don't get me wrong, I
really don't know and you may well be right. But how exactly does
decrementing from 2 to 1 work? Do we know for sure that there will
always be at least one bit set so reading that address will reliably
return a non zero value?

> unless we get down to zero depth in which case the decrements are done
> under lock.

Sorry, but this simply doesn't resolve the matter at hand.
scsi_finish_command() can change (host|device)_blocked values to zero at
any time currently *not* protected by any lock. In much the same way
scsi_queue_insert() can change these values from zero to something else
at any time.

>
>> > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
>> > itself looks like it might not entirely need the queue lock ... I need
>> > to investigate more closely.
>> 
>> Well, I rather think it does. We have to serialise access to the
>> unplug_timer and there is a call to __set_bit() which, as I understand,
>> requires the calling function to ensure atomicity.
>
> It does at the moment ... it just looks like it could make use of
> test_and_set_bit() to avoid the requirement.  The access to the timer
> uses mod_timer() which is specifically designed not to require
> serialisation.

Concurrent calls to mod_timer() are alright; I'm not so sure what
happens when del_timer() is called at the same time (haven't checked
though, so you might be right here).

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 15:59         ` Elias Oltmanns
@ 2008-07-02 16:23           ` Matthew Wilcox
  2008-07-03  7:12             ` Elias Oltmanns
  2008-07-02 16:32           ` James Bottomley
  1 sibling, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2008-07-02 16:23 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: James Bottomley, linux-scsi

On Wed, Jul 02, 2008 at 05:59:31PM +0200, Elias Oltmanns wrote:
> > The reason no locks are necessary is that there's no race to mediate.
> > The checks are only is it set or not ...
> 
> I'm not sure whether that is of any consequence. Don't get me wrong, I
> really don't know and you may well be right. But how exactly does
> decrementing from 2 to 1 work? Do we know for sure that there will
> always be at least one bit set so reading that address will reliably
> return a non zero value?

The assumption we make (and it is believed to be true on all SMP systems)
is that a write to a naturally aligned memory location that is sized <=
sizeof(long) is atomic.  That is, a reader will get either the previous
value or the subsequent value, not a mixture.  The RCU code relies
heavily on this assumption.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 15:59         ` Elias Oltmanns
  2008-07-02 16:23           ` Matthew Wilcox
@ 2008-07-02 16:32           ` James Bottomley
  2008-07-03  7:25             ` Elias Oltmanns
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-02 16:32 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi

On Wed, 2008-07-02 at 17:59 +0200, Elias Oltmanns wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Wed, 2008-07-02 at 09:08 +0200, Elias Oltmanns wrote:
> >> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> >
> >> > On Tue, 2008-07-01 at 23:37 +0200, Elias Oltmanns wrote:
> >> >> Hi James,
> >> >
> >> >> 
> >> >> sorry for bothering you but I've just noticed that the patch below has
> >> >> neither been scheduled for the stable review, nor queued up for Linus.
> >> >> May be you just don't consider this serious enough for these trees but I
> >> >> wanted to make sure that the situation will be dealt with eventualy. The
> >> >> patch applies to 2.6.26-rc8.
> >> >
> >> > OK, well at first glance, the locking around device_blocked and
> >> > host_blocked looks pointless.  What are the failure traces you're using
> >> > to decide they need spinlock protection?
> >> 
> >> scsi_queue_insert() as well as scsi_finish_command() can be called at
> >> any time as part of regular command completion or error handling. There
> >> is no reason why the ->request_fn() for the same device or for another
> >> device on the same host should not be in progress at the same time.
> >
> > So would I be correct in deducing you haven't seen an observed failure?
> 
> Yes, I don't even have an SMP machine.
> 
> >
> > The reason no locks are necessary is that there's no race to mediate.
> > The checks are only is it set or not ...
> 
> I'm not sure whether that is of any consequence. Don't get me wrong, I
> really don't know and you may well be right. But how exactly does
> decrementing from 2 to 1 work? Do we know for sure that there will
> always be at least one bit set so reading that address will reliably
> return a non zero value?

both the check and the decrement are under a lock already. But anyway,
for an int, we always read an integral value (that's a guarantee of the
CPU architecture)

> > unless we get down to zero depth in which case the decrements are done
> > under lock.
> 
> Sorry, but this simply doesn't resolve the matter at hand.
> scsi_finish_command() can change (host|device)_blocked values to zero at
> any time currently *not* protected by any lock. In much the same way
> scsi_queue_insert() can change these values from zero to something else
> at any time.

Look more closely at the requirements for the decrements:  There have to
be no outstanding commands: nothing can be in scsi_finish_command for
the device (or the host for host_blocked).  Likewise,
scsi_queue_insert() is called either for retry return (but nothing
outstanding, so can't) or for queuecommand() failure.

The way locking changes in the queue function introduces a race ... the
block check to the queuecommand.  Introducing additional locking around
the variable setting can't mediate that race, so the code has to be
prepared for it (which it is).

> >
> >> > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> >> > itself looks like it might not entirely need the queue lock ... I need
> >> > to investigate more closely.
> >> 
> >> Well, I rather think it does. We have to serialise access to the
> >> unplug_timer and there is a call to __set_bit() which, as I understand,
> >> requires the calling function to ensure atomicity.
> >
> > It does at the moment ... it just looks like it could make use of
> > test_and_set_bit() to avoid the requirement.  The access to the timer
> > uses mod_timer() which is specifically designed not to require
> > serialisation.
> 
> Concurrent calls to mod_timer() are alright; I'm not so sure what
> happens when del_timer() is called at the same time (haven't checked
> though, so you might be right here).

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 14:49         ` James Bottomley
@ 2008-07-02 18:45           ` Jens Axboe
  2008-07-02 20:18             ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2008-07-02 18:45 UTC (permalink / raw)
  To: James Bottomley; +Cc: Elias Oltmanns, linux-scsi, stable

On Wed, Jul 02 2008, James Bottomley wrote:
> On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> > On Wed, Jul 02 2008, Elias Oltmanns wrote:
> > > > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> > > > itself looks like it might not entirely need the queue lock ... I need
> > > > to investigate more closely.
> > > 
> > > Well, I rather think it does. We have to serialise access to the
> > > unplug_timer and there is a call to __set_bit() which, as I understand,
> > > requires the calling function to ensure atomicity.
> > 
> > Yep, blk_plug_device() needs to be called with the queue lock held.
> 
> That's what the comment says ... but if you replaced the test_bit with
> an atomic operation then the rest of it does look to be in no need of
> serialisation ... unless there's something I missed?

Indeed, but then you would have to use atomic bitops everywhere and that
is the bit we moved away from.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 18:45           ` Jens Axboe
@ 2008-07-02 20:18             ` James Bottomley
  2008-07-03  7:53               ` Elias Oltmanns
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-02 20:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Elias Oltmanns, linux-scsi

On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
> On Wed, Jul 02 2008, James Bottomley wrote:
> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> > > On Wed, Jul 02 2008, Elias Oltmanns wrote:
> > > > > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
> > > > > itself looks like it might not entirely need the queue lock ... I need
> > > > > to investigate more closely.
> > > > 
> > > > Well, I rather think it does. We have to serialise access to the
> > > > unplug_timer and there is a call to __set_bit() which, as I understand,
> > > > requires the calling function to ensure atomicity.
> > > 
> > > Yep, blk_plug_device() needs to be called with the queue lock held.
> > 
> > That's what the comment says ... but if you replaced the test_bit with
> > an atomic operation then the rest of it does look to be in no need of
> > serialisation ... unless there's something I missed?
> 
> Indeed, but then you would have to use atomic bitops everywhere and that
> is the bit we moved away from.

Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
this one place and then the one in blk_remove_plug would have to become
test_and_clear_bit.  All the other places barring loop_unplug() are only
tests (which don't affect the atomicity).

It's just for SCSI the double spin lock followed by double spin unlock
to get the locking right is kind of nasty ... I'm just wondering what
the universe would look like if it were rendered unnecessary.

James




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 16:23           ` Matthew Wilcox
@ 2008-07-03  7:12             ` Elias Oltmanns
  2008-07-03 15:22               ` James Bottomley
  2008-07-03 15:47               ` Matthew Wilcox
  0 siblings, 2 replies; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03  7:12 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: James Bottomley, linux-scsi

Matthew Wilcox <matthew@wil.cx> wrote:
> On Wed, Jul 02, 2008 at 05:59:31PM +0200, Elias Oltmanns wrote:
>> > The reason no locks are necessary is that there's no race to mediate.
>
>> > The checks are only is it set or not ...
>> 
>> I'm not sure whether that is of any consequence. Don't get me wrong, I
>> really don't know and you may well be right. But how exactly does
>> decrementing from 2 to 1 work? Do we know for sure that there will
>> always be at least one bit set so reading that address will reliably
>> return a non zero value?
>
> The assumption we make (and it is believed to be true on all SMP systems)
> is that a write to a naturally aligned memory location that is sized <=
> sizeof(long) is atomic.  That is, a reader will get either the previous
> value or the subsequent value, not a mixture.  The RCU code relies
> heavily on this assumption.

Does that mean that where ever I have

spin_lock_irqsave(some_lock, flags);
var = some_val;
spin_unlock_irqrestore(some_lock, flags);

I could just as well discard the locking provided that
sizeof(var) <= sizeof(long)
because the assignment of some_val to var will be atomic anyway?

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 16:32           ` James Bottomley
@ 2008-07-03  7:25             ` Elias Oltmanns
  0 siblings, 0 replies; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03  7:25 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Wed, 2008-07-02 at 17:59 +0200, Elias Oltmanns wrote:
>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
[...]
>> > unless we get down to zero depth in which case the decrements are done
>> > under lock.
>> 
>> Sorry, but this simply doesn't resolve the matter at hand.
>> scsi_finish_command() can change (host|device)_blocked values to zero at
>> any time currently *not* protected by any lock. In much the same way
>> scsi_queue_insert() can change these values from zero to something else
>> at any time.
>
> Look more closely at the requirements for the decrements:  There have to
> be no outstanding commands: nothing can be in scsi_finish_command for
> the device (or the host for host_blocked).

Yes, I agree as far as the decrements are concerned. There still is the
check

if (sdev->device_blocked)

which can happen while ->device_blocked is changed either by
scsi_finish_command() or scsi_queue_insert(). If I understand Matthew and
you correctly, this doesn't pose any problem because assigning an int is
an atomic operation anyway.

Thanks for explaining,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-02 20:18             ` James Bottomley
@ 2008-07-03  7:53               ` Elias Oltmanns
  2008-07-03 10:38                 ` Elias Oltmanns
  0 siblings, 1 reply; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03  7:53 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jens Axboe, linux-scsi

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
>> On Wed, Jul 02 2008, James Bottomley wrote:
>
>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
>> > > On Wed, Jul 02 2008, Elias Oltmanns wrote:
>> > > > > The blk_plug_queue change looks reasonable ... however, blk_plug_queue
>> > > > > itself looks like it might not entirely need the queue lock ... I need
>> > > > > to investigate more closely.
>> > > > 
>> > > > Well, I rather think it does. We have to serialise access to the
>> > > > unplug_timer and there is a call to __set_bit() which, as I understand,
>> > > > requires the calling function to ensure atomicity.
>> > > 
>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
>> > 
>> > That's what the comment says ... but if you replaced the test_bit with
>> > an atomic operation then the rest of it does look to be in no need of
>> > serialisation ... unless there's something I missed?
>> 
>> Indeed, but then you would have to use atomic bitops everywhere and that
>> is the bit we moved away from.
>
> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
> this one place and then the one in blk_remove_plug would have to become
> test_and_clear_bit.  All the other places barring loop_unplug() are only
> tests (which don't affect the atomicity).
>
> It's just for SCSI the double spin lock followed by double spin unlock
> to get the locking right is kind of nasty ... I'm just wondering what
> the universe would look like if it were rendered unnecessary.

We have to consider one more thing: Without the locking in
blk_plug_device(), the following sequence of events may occur:

1. Some calls blk_plug_device().
2. Someone else calls blk_stop_queue() right after the check in
   blk_plug_device() has been performed but before the
   test_and_set_bit() has been called.
3. The unplug_timer expires, __generic_unplug_device() discovers that
   the queue has been stopped in the meantime and returns without
   removing the plug.
4. Someone calls blk_start_queue() later on which will execute the
   ->request_fn() even though blk_queue_plugged() is still true.

In order to resolve this, we'd have to switch the calls to
blk_remove_plug() and blk_queue_stopped() in __generic_unplug_device().
I'm not quite sure at the moment whether this would have any further
implications.

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03  7:53               ` Elias Oltmanns
@ 2008-07-03 10:38                 ` Elias Oltmanns
  2008-07-03 11:24                   ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03 10:38 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jens Axboe, linux-scsi

Elias Oltmanns <eo@nebensachen.de> wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
>
>>> On Wed, Jul 02 2008, James Bottomley wrote:
>>
>>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
>>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
>>> > 
>>> > That's what the comment says ... but if you replaced the test_bit with
>>> > an atomic operation then the rest of it does look to be in no need of
>>> > serialisation ... unless there's something I missed?
>>> 
>>> Indeed, but then you would have to use atomic bitops everywhere and that
>>> is the bit we moved away from.
>>
>> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
>> this one place and then the one in blk_remove_plug would have to become
>> test_and_clear_bit.  All the other places barring loop_unplug() are only
>> tests (which don't affect the atomicity).
>>
>> It's just for SCSI the double spin lock followed by double spin unlock
>> to get the locking right is kind of nasty ... I'm just wondering what
>> the universe would look like if it were rendered unnecessary.
>
> We have to consider one more thing: Without the locking in
> blk_plug_device(), the following sequence of events may occur:

Actually, it's worse than that. Locking is required in order to make
absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
complete before the call to mod_timer() in blk_plug_device() even though
it has started only *after* a call to test_and_set_bit(). However, if
such a thing would ever happen, it could have dire consequences.

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 10:38                 ` Elias Oltmanns
@ 2008-07-03 11:24                   ` Jens Axboe
  2008-07-03 16:31                     ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2008-07-03 11:24 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: James Bottomley, linux-scsi

On Thu, Jul 03 2008, Elias Oltmanns wrote:
> Elias Oltmanns <eo@nebensachen.de> wrote:
> > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> >> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
> >
> >>> On Wed, Jul 02 2008, James Bottomley wrote:
> >>
> >>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> >>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
> >>> > 
> >>> > That's what the comment says ... but if you replaced the test_bit with
> >>> > an atomic operation then the rest of it does look to be in no need of
> >>> > serialisation ... unless there's something I missed?
> >>> 
> >>> Indeed, but then you would have to use atomic bitops everywhere and that
> >>> is the bit we moved away from.
> >>
> >> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
> >> this one place and then the one in blk_remove_plug would have to become
> >> test_and_clear_bit.  All the other places barring loop_unplug() are only
> >> tests (which don't affect the atomicity).
> >>
> >> It's just for SCSI the double spin lock followed by double spin unlock
> >> to get the locking right is kind of nasty ... I'm just wondering what
> >> the universe would look like if it were rendered unnecessary.
> >
> > We have to consider one more thing: Without the locking in
> > blk_plug_device(), the following sequence of events may occur:
> 
> Actually, it's worse than that. Locking is required in order to make
> absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
> is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
> complete before the call to mod_timer() in blk_plug_device() even though
> it has started only *after* a call to test_and_set_bit(). However, if
> such a thing would ever happen, it could have dire consequences.

Both are races possible without either atomic bitops or the queue lock
being held. We can't properly mix eg set_bit() and __set_bit(). The
plugged bit is the most hammered, so it's staying non-atomic and SCSI
will need to provide proper locking there.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03  7:12             ` Elias Oltmanns
@ 2008-07-03 15:22               ` James Bottomley
  2008-07-03 19:39                 ` Elias Oltmanns
  2008-07-03 15:47               ` Matthew Wilcox
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2008-07-03 15:22 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: Matthew Wilcox, linux-scsi

On Thu, 2008-07-03 at 09:12 +0200, Elias Oltmanns wrote:
> Matthew Wilcox <matthew@wil.cx> wrote:
> > On Wed, Jul 02, 2008 at 05:59:31PM +0200, Elias Oltmanns wrote:
> >> > The reason no locks are necessary is that there's no race to mediate.
> >
> >> > The checks are only is it set or not ...
> >> 
> >> I'm not sure whether that is of any consequence. Don't get me wrong, I
> >> really don't know and you may well be right. But how exactly does
> >> decrementing from 2 to 1 work? Do we know for sure that there will
> >> always be at least one bit set so reading that address will reliably
> >> return a non zero value?
> >
> > The assumption we make (and it is believed to be true on all SMP systems)
> > is that a write to a naturally aligned memory location that is sized <=
> > sizeof(long) is atomic.  That is, a reader will get either the previous
> > value or the subsequent value, not a mixture.  The RCU code relies
> > heavily on this assumption.
> 
> Does that mean that where ever I have
> 
> spin_lock_irqsave(some_lock, flags);
> var = some_val;
> spin_unlock_irqrestore(some_lock, flags);
> 
> I could just as well discard the locking provided that
> sizeof(var) <= sizeof(long)
> because the assignment of some_val to var will be atomic anyway?

I think you're still confused about the difference between integral
observation of values and serialisation.

Assignment is always integral.  In your example above anything observing
the variable always sees either the previous value or the one you set it
to, nothing else.  So, if all you're doing is setting a flag and
clearing it (as in setting it to an integral value, not setting it via a
bit operation) and checking it, no locking is needed.  It's when you
start doing operations on variables that require serialisation, or the
variables themselves are absolute indicators of events that need
serialisation that you start having to introduce locking.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03  7:12             ` Elias Oltmanns
  2008-07-03 15:22               ` James Bottomley
@ 2008-07-03 15:47               ` Matthew Wilcox
  1 sibling, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2008-07-03 15:47 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: James Bottomley, linux-scsi

On Thu, Jul 03, 2008 at 09:12:26AM +0200, Elias Oltmanns wrote:
> Matthew Wilcox <matthew@wil.cx> wrote:
> > The assumption we make (and it is believed to be true on all SMP systems)
> > is that a write to a naturally aligned memory location that is sized <=
> > sizeof(long) is atomic.  That is, a reader will get either the previous
> > value or the subsequent value, not a mixture.  The RCU code relies
> > heavily on this assumption.
> 
> Does that mean that where ever I have
> 
> spin_lock_irqsave(some_lock, flags);
> var = some_val;
> spin_unlock_irqrestore(some_lock, flags);
> 
> I could just as well discard the locking provided that
> sizeof(var) <= sizeof(long)
> because the assignment of some_val to var will be atomic anyway?

You might need a barrier.

Also, I realised I mis-spoke above.  Early Alphas do not have instructions
that store byte or 16-bit sized quantities.  So the compiler implements
a byte store by loading a 32-bit quantity, replacing the appropriate 8
bits with the new value and storing the word back again.  The same is
true for most processors with, say, bit-sized quantities.  So while it's
atomic with respect to other _readers_, it's unsafe with respect to other
_writers_ in that if you have two modification attempts simultaneously,
you can end up with only one modification having taken place, not both.

eg:

CPU 0		CPU 1
load value
		load value
modify
		modify
store value
		store value

CPU 1 got the old value, and stored it over CPU 0's value.

CPUs which have instructions like cmpxchg can avoid these kinds of
problems, but not all do.  It's certainly not something that you can
make a blanket statement on; you really need to analyse it carefully.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 11:24                   ` Jens Axboe
@ 2008-07-03 16:31                     ` James Bottomley
  2008-07-03 17:54                       ` Jens Axboe
  2008-07-03 19:47                       ` Elias Oltmanns
  0 siblings, 2 replies; 24+ messages in thread
From: James Bottomley @ 2008-07-03 16:31 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Elias Oltmanns, linux-scsi

On Thu, 2008-07-03 at 13:24 +0200, Jens Axboe wrote:
> On Thu, Jul 03 2008, Elias Oltmanns wrote:
> > Elias Oltmanns <eo@nebensachen.de> wrote:
> > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > >> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
> > >
> > >>> On Wed, Jul 02 2008, James Bottomley wrote:
> > >>
> > >>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> > >>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
> > >>> > 
> > >>> > That's what the comment says ... but if you replaced the test_bit with
> > >>> > an atomic operation then the rest of it does look to be in no need of
> > >>> > serialisation ... unless there's something I missed?
> > >>> 
> > >>> Indeed, but then you would have to use atomic bitops everywhere and that
> > >>> is the bit we moved away from.
> > >>
> > >> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
> > >> this one place and then the one in blk_remove_plug would have to become
> > >> test_and_clear_bit.  All the other places barring loop_unplug() are only
> > >> tests (which don't affect the atomicity).
> > >>
> > >> It's just for SCSI the double spin lock followed by double spin unlock
> > >> to get the locking right is kind of nasty ... I'm just wondering what
> > >> the universe would look like if it were rendered unnecessary.
> > >
> > > We have to consider one more thing: Without the locking in
> > > blk_plug_device(), the following sequence of events may occur:
> > 
> > Actually, it's worse than that. Locking is required in order to make
> > absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
> > is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
> > complete before the call to mod_timer() in blk_plug_device() even though
> > it has started only *after* a call to test_and_set_bit(). However, if
> > such a thing would ever happen, it could have dire consequences.
> 
> Both are races possible without either atomic bitops or the queue lock
> being held. We can't properly mix eg set_bit() and __set_bit(). The
> plugged bit is the most hammered, so it's staying non-atomic and SCSI
> will need to provide proper locking there.

You're the boss.

Actually, after all of this, it looks like the host queue plug is
superfluous.  If the host actually says not ready from
scsi_host_queue_ready() we go to the not ready processing clause in
scsi_prep_fn() which actually checks the outstanding on the current
device and plugs the queue if there aren't any commands.  This is
actually more correct behaviour than a blind plug regardless (and it's
also done under the queue lock), so I think this is the correct fix.

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 678556c..88d1b5f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1337,7 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 				printk("scsi%d unblocking host at zero depth\n",
 					shost->host_no));
 		} else {
-			blk_plug_device(q);
 			return 0;
 		}
 	}



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 16:31                     ` James Bottomley
@ 2008-07-03 17:54                       ` Jens Axboe
  2008-07-03 19:47                       ` Elias Oltmanns
  1 sibling, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2008-07-03 17:54 UTC (permalink / raw)
  To: James Bottomley; +Cc: Elias Oltmanns, linux-scsi

On Thu, Jul 03 2008, James Bottomley wrote:
> On Thu, 2008-07-03 at 13:24 +0200, Jens Axboe wrote:
> > On Thu, Jul 03 2008, Elias Oltmanns wrote:
> > > Elias Oltmanns <eo@nebensachen.de> wrote:
> > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > >> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
> > > >
> > > >>> On Wed, Jul 02 2008, James Bottomley wrote:
> > > >>
> > > >>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> > > >>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
> > > >>> > 
> > > >>> > That's what the comment says ... but if you replaced the test_bit with
> > > >>> > an atomic operation then the rest of it does look to be in no need of
> > > >>> > serialisation ... unless there's something I missed?
> > > >>> 
> > > >>> Indeed, but then you would have to use atomic bitops everywhere and that
> > > >>> is the bit we moved away from.
> > > >>
> > > >> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
> > > >> this one place and then the one in blk_remove_plug would have to become
> > > >> test_and_clear_bit.  All the other places barring loop_unplug() are only
> > > >> tests (which don't affect the atomicity).
> > > >>
> > > >> It's just for SCSI the double spin lock followed by double spin unlock
> > > >> to get the locking right is kind of nasty ... I'm just wondering what
> > > >> the universe would look like if it were rendered unnecessary.
> > > >
> > > > We have to consider one more thing: Without the locking in
> > > > blk_plug_device(), the following sequence of events may occur:
> > > 
> > > Actually, it's worse than that. Locking is required in order to make
> > > absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
> > > is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
> > > complete before the call to mod_timer() in blk_plug_device() even though
> > > it has started only *after* a call to test_and_set_bit(). However, if
> > > such a thing would ever happen, it could have dire consequences.
> > 
> > Both are races possible without either atomic bitops or the queue lock
> > being held. We can't properly mix eg set_bit() and __set_bit(). The
> > plugged bit is the most hammered, so it's staying non-atomic and SCSI
> > will need to provide proper locking there.
> 
> You're the boss.
> 
> Actually, after all of this, it looks like the host queue plug is
> superfluous.  If the host actually says not ready from
> scsi_host_queue_ready() we go to the not ready processing clause in
> scsi_prep_fn() which actually checks the outstanding on the current
> device and plugs the queue if there aren't any commands.  This is
> actually more correct behaviour than a blind plug regardless (and it's
> also done under the queue lock), so I think this is the correct fix.

That looks good, much better than juggling locks there.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 15:22               ` James Bottomley
@ 2008-07-03 19:39                 ` Elias Oltmanns
  0 siblings, 0 replies; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03 19:39 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, linux-scsi

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Thu, 2008-07-03 at 09:12 +0200, Elias Oltmanns wrote:
>> Matthew Wilcox <matthew@wil.cx> wrote:
[...]
>> > The assumption we make (and it is believed to be true on all SMP systems)
>> > is that a write to a naturally aligned memory location that is sized <=
>> > sizeof(long) is atomic.  That is, a reader will get either the previous
>> > value or the subsequent value, not a mixture.  The RCU code relies
>> > heavily on this assumption.
>> 
>> Does that mean that where ever I have
>> 
>> spin_lock_irqsave(some_lock, flags);
>> var = some_val;
>> spin_unlock_irqrestore(some_lock, flags);
>> 
>> I could just as well discard the locking provided that
>> sizeof(var) <= sizeof(long)
>> because the assignment of some_val to var will be atomic anyway?
>
> I think you're still confused about the difference between integral
> observation of values and serialisation.
>
> Assignment is always integral.  In your example above anything observing
> the variable always sees either the previous value or the one you set it
> to, nothing else.  So, if all you're doing is setting a flag and
> clearing it (as in setting it to an integral value, not setting it via a
> bit operation) and checking it, no locking is needed.  It's when you
> start doing operations on variables that require serialisation, or the
> variables themselves are absolute indicators of events that need
> serialisation that you start having to introduce locking.

Right. Thank you very much for the clarification.

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 16:31                     ` James Bottomley
  2008-07-03 17:54                       ` Jens Axboe
@ 2008-07-03 19:47                       ` Elias Oltmanns
  2008-07-03 21:33                         ` James Bottomley
  1 sibling, 1 reply; 24+ messages in thread
From: Elias Oltmanns @ 2008-07-03 19:47 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jens Axboe, linux-scsi

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Thu, 2008-07-03 at 13:24 +0200, Jens Axboe wrote:
>> On Thu, Jul 03 2008, Elias Oltmanns wrote:
>
>> > Elias Oltmanns <eo@nebensachen.de> wrote:
>> > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>> > >> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
>> > >
>> > >>> On Wed, Jul 02 2008, James Bottomley wrote:
>> > >>
>> > >>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
>> > >>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
>> > >>> > 
>> > >>> > That's what the comment says ... but if you replaced the test_bit with
>> > >>> > an atomic operation then the rest of it does look to be in no need of
>> > >>> > serialisation ... unless there's something I missed?
>> > >>> 
>> > >>> Indeed, but then you would have to use atomic bitops everywhere and that
>> > >>> is the bit we moved away from.
>> > >>
>> > >> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
>> > >> this one place and then the one in blk_remove_plug would have to become
>> > >> test_and_clear_bit.  All the other places barring loop_unplug() are only
>> > >> tests (which don't affect the atomicity).
>> > >>
>> > >> It's just for SCSI the double spin lock followed by double spin unlock
>> > >> to get the locking right is kind of nasty ... I'm just wondering what
>> > >> the universe would look like if it were rendered unnecessary.
>> > >
>> > > We have to consider one more thing: Without the locking in
>> > > blk_plug_device(), the following sequence of events may occur:
>> > 
>> > Actually, it's worse than that. Locking is required in order to make
>> > absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
>> > is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
>> > complete before the call to mod_timer() in blk_plug_device() even though
>> > it has started only *after* a call to test_and_set_bit(). However, if
>> > such a thing would ever happen, it could have dire consequences.
>> 
>> Both are races possible without either atomic bitops or the queue lock
>> being held. We can't properly mix eg set_bit() and __set_bit(). The
>> plugged bit is the most hammered, so it's staying non-atomic and SCSI
>> will need to provide proper locking there.
>
> You're the boss.
>
> Actually, after all of this, it looks like the host queue plug is
> superfluous.  If the host actually says not ready from
> scsi_host_queue_ready() we go to the not ready processing clause in
> scsi_prep_fn() which actually checks the outstanding on the current
> device and plugs the queue if there aren't any commands.  This is
> actually more correct behaviour than a blind plug regardless (and it's
> also done under the queue lock), so I think this is the correct fix.

Indeed, a very neat way out. Will this be queued up for stable too?

Regards,

Elias

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: Fix some locking issues
  2008-07-03 19:47                       ` Elias Oltmanns
@ 2008-07-03 21:33                         ` James Bottomley
  0 siblings, 0 replies; 24+ messages in thread
From: James Bottomley @ 2008-07-03 21:33 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: Jens Axboe, linux-scsi

On Thu, 2008-07-03 at 21:47 +0200, Elias Oltmanns wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Thu, 2008-07-03 at 13:24 +0200, Jens Axboe wrote:
> >> On Thu, Jul 03 2008, Elias Oltmanns wrote:
> >
> >> > Elias Oltmanns <eo@nebensachen.de> wrote:
> >> > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> >> > >> On Wed, 2008-07-02 at 20:45 +0200, Jens Axboe wrote:
> >> > >
> >> > >>> On Wed, Jul 02 2008, James Bottomley wrote:
> >> > >>
> >> > >>> > On Wed, 2008-07-02 at 13:50 +0200, Jens Axboe wrote:
> >> > >>> > > Yep, blk_plug_device() needs to be called with the queue lock held.
> >> > >>> > 
> >> > >>> > That's what the comment says ... but if you replaced the test_bit with
> >> > >>> > an atomic operation then the rest of it does look to be in no need of
> >> > >>> > serialisation ... unless there's something I missed?
> >> > >>> 
> >> > >>> Indeed, but then you would have to use atomic bitops everywhere and that
> >> > >>> is the bit we moved away from.
> >> > >>
> >> > >> Not necessarily ... only for QUEUE_FLAG_CLUSTER.  That's really only in
> >> > >> this one place and then the one in blk_remove_plug would have to become
> >> > >> test_and_clear_bit.  All the other places barring loop_unplug() are only
> >> > >> tests (which don't affect the atomicity).
> >> > >>
> >> > >> It's just for SCSI the double spin lock followed by double spin unlock
> >> > >> to get the locking right is kind of nasty ... I'm just wondering what
> >> > >> the universe would look like if it were rendered unnecessary.
> >> > >
> >> > > We have to consider one more thing: Without the locking in
> >> > > blk_plug_device(), the following sequence of events may occur:
> >> > 
> >> > Actually, it's worse than that. Locking is required in order to make
> >> > absolutely sure that the unplug_timer is active iff QUEUE_FLAG_PLUGGED
> >> > is set. Admittedly, it seems *very* unlikely that blk_remove_plug() will
> >> > complete before the call to mod_timer() in blk_plug_device() even though
> >> > it has started only *after* a call to test_and_set_bit(). However, if
> >> > such a thing would ever happen, it could have dire consequences.
> >> 
> >> Both are races possible without either atomic bitops or the queue lock
> >> being held. We can't properly mix eg set_bit() and __set_bit(). The
> >> plugged bit is the most hammered, so it's staying non-atomic and SCSI
> >> will need to provide proper locking there.
> >
> > You're the boss.
> >
> > Actually, after all of this, it looks like the host queue plug is
> > superfluous.  If the host actually says not ready from
> > scsi_host_queue_ready() we go to the not ready processing clause in
> > scsi_prep_fn() which actually checks the outstanding on the current
> > device and plugs the queue if there aren't any commands.  This is
> > actually more correct behaviour than a blind plug regardless (and it's
> > also done under the queue lock), so I think this is the correct fix.
> 
> Indeed, a very neat way out. Will this be queued up for stable too?

No ... the fact that there's another plug in the path means that the
consequences should be unobservable (borne out by the fact that we have
no bug reports).  Prudence dictates that there's greater risk from an
untested fix than there is from the original problem.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2008-07-03 21:33 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-29 11:38 [PATCH] SCSI: Fix some locking issues Elias Oltmanns
2008-07-01 21:37 ` Elias Oltmanns
2008-07-02  1:55   ` James Bottomley
2008-07-02  7:08     ` Elias Oltmanns
2008-07-02 11:50       ` Jens Axboe
2008-07-02 14:49         ` James Bottomley
2008-07-02 18:45           ` Jens Axboe
2008-07-02 20:18             ` James Bottomley
2008-07-03  7:53               ` Elias Oltmanns
2008-07-03 10:38                 ` Elias Oltmanns
2008-07-03 11:24                   ` Jens Axboe
2008-07-03 16:31                     ` James Bottomley
2008-07-03 17:54                       ` Jens Axboe
2008-07-03 19:47                       ` Elias Oltmanns
2008-07-03 21:33                         ` James Bottomley
2008-07-02 14:46       ` James Bottomley
2008-07-02 15:59         ` Elias Oltmanns
2008-07-02 16:23           ` Matthew Wilcox
2008-07-03  7:12             ` Elias Oltmanns
2008-07-03 15:22               ` James Bottomley
2008-07-03 19:39                 ` Elias Oltmanns
2008-07-03 15:47               ` Matthew Wilcox
2008-07-02 16:32           ` James Bottomley
2008-07-03  7:25             ` Elias Oltmanns

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).