public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex
@ 2010-02-12 17:14 Alan Stern
  2010-02-19 16:34 ` James Bottomley
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2010-02-12 17:14 UTC (permalink / raw)
  To: James Bottomley; +Cc: SCSI development list

This patch (as1337) fixes a bug in __scsi_add_device().  It calls
scsi_alloc_target() outside the protection of the host's scan_mutex,
meaning that it might find an incompletely-initialized target or it
might create a duplicate target.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---

Index: usb-2.6/drivers/scsi/scsi_scan.c
===================================================================
--- usb-2.6.orig/drivers/scsi/scsi_scan.c
+++ usb-2.6/drivers/scsi/scsi_scan.c
@@ -1506,25 +1506,29 @@ struct scsi_device *__scsi_add_device(st
 {
 	struct scsi_device *sdev = ERR_PTR(-ENODEV);
 	struct device *parent = &shost->shost_gendev;
-	struct scsi_target *starget;
+	struct scsi_target *starget = NULL;
 
 	if (strncmp(scsi_scan_type, "none", 4) == 0)
-		return ERR_PTR(-ENODEV);
-
-	starget = scsi_alloc_target(parent, channel, id);
-	if (!starget)
-		return ERR_PTR(-ENOMEM);
+		return sdev;
 
 	mutex_lock(&shost->scan_mutex);
 	if (!shost->async_scan)
 		scsi_complete_async_scans();
 
-	if (scsi_host_scan_allowed(shost))
-		scsi_probe_and_add_lun(starget, lun, NULL, &sdev, 1, hostdata);
+	if (scsi_host_scan_allowed(shost)) {
+		starget = scsi_alloc_target(parent, channel, id);
+		if (starget)
+			scsi_probe_and_add_lun(starget, lun, NULL, &sdev,
+					1, hostdata);
+		else
+			sdev = ERR_PTR(-ENOMEM);
+	}
 	mutex_unlock(&shost->scan_mutex);
-	scsi_target_reap(starget);
-	put_device(&starget->dev);
 
+	if (starget) {
+		scsi_target_reap(starget);
+		put_device(&starget->dev);
+	}
 	return sdev;
 }
 EXPORT_SYMBOL(__scsi_add_device);


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex
  2010-02-12 17:14 [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex Alan Stern
@ 2010-02-19 16:34 ` James Bottomley
  2010-02-19 20:21   ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: James Bottomley @ 2010-02-19 16:34 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

On Fri, 2010-02-12 at 12:14 -0500, Alan Stern wrote:
> This patch (as1337) fixes a bug in __scsi_add_device().  It calls
> scsi_alloc_target() outside the protection of the host's scan_mutex,
> meaning that it might find an incompletely-initialized target or it
> might create a duplicate target.

I don't think this is correct.  scsi_alloc_target should be completely
atomic on the host lock, if you look.  It allocates the proposed target,
takes the lock and searches the host target list ... if it finds
something it drops the lock, destroys the proposed target allocations
and returns what it found.  If it finds nothing, it adds the proposed
target to the list drops the lock and returns it.  There's some
complexity around finding dying targets, but nothing that I think needs
to be mediated by the scan mutex.

James



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex
  2010-02-19 16:34 ` James Bottomley
@ 2010-02-19 20:21   ` Alan Stern
  2010-03-12 21:45     ` James Bottomley
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2010-02-19 20:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: SCSI development list

On Fri, 19 Feb 2010, James Bottomley wrote:

> On Fri, 2010-02-12 at 12:14 -0500, Alan Stern wrote:
> > This patch (as1337) fixes a bug in __scsi_add_device().  It calls
> > scsi_alloc_target() outside the protection of the host's scan_mutex,
> > meaning that it might find an incompletely-initialized target or it
> > might create a duplicate target.
> 
> I don't think this is correct.  scsi_alloc_target should be completely
> atomic on the host lock, if you look.  It allocates the proposed target,
> takes the lock and searches the host target list ... if it finds
> something it drops the lock, destroys the proposed target allocations
> and returns what it found.  If it finds nothing, it adds the proposed
> target to the list drops the lock and returns it.  There's some
> complexity around finding dying targets, but nothing that I think needs
> to be mediated by the scan mutex.

You're right that a duplicate target wouldn't get created.  Still,
there really is a bug.  Suppose two different threads call
scsi_alloc_target() for the same target ID at about the same time.  The
first thread won't find an existing target on the list, so it will
allocate and initialize a new target structure.  However, it releases
the host lock before calling the host template's target_alloc method.

Meanwhile, the second thread will find the new target structure on the 
list and will therefore feel free to start using it.  If the timing 
works out, it could start using the new target before the first thread 
calls target_alloc.

Perhaps you would say a better solution to this race is to move the
target_alloc call into the spinlocked region.  Fine.  But there would
still be a problem.  The whole purpose of the scan_mutex is to prevent
threads from adding new devices below a host that's in the process of
removal.  Since __scsi_add_device() doesn't use the scan_mutex, there's
nothing to prevent it from doing exactly that.

In fact, there's yet another race.  This is between scsi_alloc_target()
and scsi_target_reap().  Because scsi_alloc_target() doesn't check
whether reap_ref is 0 before incrementing it, we could have something
like this:

	Thread 0			Thread 1
	--------			--------
	call scsi_target_reap()
	lock host_lock
	--starget->reap_ref goes to 0
	unlock host_lock
					call scsi_target_alloc()
					lock host_lock
					find found_target = starget
					found_target->reap_ref++
					unlock host_lock
					found_target->state != STARGET_DEL,
						so begin using it
	starget->state = STARGET_DEL
	remove starget

To prevent this while relying only on the host lock,
scsi_alloc_target() should check the value of reap_ref before
incrementing it.  It also wouldn't hurt to make scsi_alloc_target()  
check the target state and scsi_target_reap() set the state all under 
the protection of the host lock.

Alan Stern


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex
  2010-02-19 20:21   ` Alan Stern
@ 2010-03-12 21:45     ` James Bottomley
  2010-03-13 15:51       ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: James Bottomley @ 2010-03-12 21:45 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

On Fri, 2010-02-19 at 15:21 -0500, Alan Stern wrote:
> On Fri, 19 Feb 2010, James Bottomley wrote:
> 
> > On Fri, 2010-02-12 at 12:14 -0500, Alan Stern wrote:
> > > This patch (as1337) fixes a bug in __scsi_add_device().  It calls
> > > scsi_alloc_target() outside the protection of the host's scan_mutex,
> > > meaning that it might find an incompletely-initialized target or it
> > > might create a duplicate target.
> > 
> > I don't think this is correct.  scsi_alloc_target should be completely
> > atomic on the host lock, if you look.  It allocates the proposed target,
> > takes the lock and searches the host target list ... if it finds
> > something it drops the lock, destroys the proposed target allocations
> > and returns what it found.  If it finds nothing, it adds the proposed
> > target to the list drops the lock and returns it.  There's some
> > complexity around finding dying targets, but nothing that I think needs
> > to be mediated by the scan mutex.
> 
> You're right that a duplicate target wouldn't get created.  Still,
> there really is a bug.  Suppose two different threads call
> scsi_alloc_target() for the same target ID at about the same time.  The
> first thread won't find an existing target on the list, so it will
> allocate and initialize a new target structure.  However, it releases
> the host lock before calling the host template's target_alloc method.
> 
> Meanwhile, the second thread will find the new target structure on the 
> list and will therefore feel free to start using it.  If the timing 
> works out, it could start using the new target before the first thread 
> calls target_alloc.

So currently, that can't happen because of the scan mutex.

> Perhaps you would say a better solution to this race is to move the
> target_alloc call into the spinlocked region.  Fine.  But there would
> still be a problem.  The whole purpose of the scan_mutex is to prevent
> threads from adding new devices below a host that's in the process of
> removal.  Since __scsi_add_device() doesn't use the scan_mutex, there's
> nothing to prevent it from doing exactly that.
> 
> In fact, there's yet another race.  This is between scsi_alloc_target()
> and scsi_target_reap().  Because scsi_alloc_target() doesn't check
> whether reap_ref is 0 before incrementing it, we could have something
> like this:
> 
> 	Thread 0			Thread 1
> 	--------			--------
> 	call scsi_target_reap()
> 	lock host_lock
> 	--starget->reap_ref goes to 0
> 	unlock host_lock
> 					call scsi_target_alloc()
> 					lock host_lock
> 					find found_target = starget
> 					found_target->reap_ref++
> 					unlock host_lock
> 					found_target->state != STARGET_DEL,
> 						so begin using it
> 	starget->state = STARGET_DEL
> 	remove starget

Well, not really ... your race depends on us managing to execute the
first five statements in thread 1 before we set starget->state in thread
0 ... given the massive number of operations in thread 1 and the fact
that setting the state is a single op in thread 0, that's not really
feasible.

> To prevent this while relying only on the host lock,
> scsi_alloc_target() should check the value of reap_ref before
> incrementing it.  It also wouldn't hurt to make scsi_alloc_target()  
> check the target state and scsi_target_reap() set the state all under 
> the protection of the host lock.

That would be a complex way of fixing it ... a much simpler one would be
to move the setting of STARGET_DEL in thread 0 under the host lock.

James



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex
  2010-03-12 21:45     ` James Bottomley
@ 2010-03-13 15:51       ` Alan Stern
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Stern @ 2010-03-13 15:51 UTC (permalink / raw)
  To: James Bottomley; +Cc: SCSI development list

On Fri, 12 Mar 2010, James Bottomley wrote:

> > You're right that a duplicate target wouldn't get created.  Still,
> > there really is a bug.  Suppose two different threads call
> > scsi_alloc_target() for the same target ID at about the same time.  The
> > first thread won't find an existing target on the list, so it will
> > allocate and initialize a new target structure.  However, it releases
> > the host lock before calling the host template's target_alloc method.
> > 
> > Meanwhile, the second thread will find the new target structure on the 
> > list and will therefore feel free to start using it.  If the timing 
> > works out, it could start using the new target before the first thread 
> > calls target_alloc.
> 
> So currently, that can't happen because of the scan mutex.

True.  Okay, that's not a bug.  However I still have the feeling that
the concurrency issues associated with targets haven't been thought out
sufficiently.

Here's another example.  Suppose that two threads probe two different 
LUNs under the same target at the same time, and suppose that 
scsi_target_add() fails:

	Thread 0			Thread 1
	--------			--------
	scsi_alloc_target() creates a new target
	Lock scan_mutex
	Scan LUN1
					scsi_alloc_target() finds the
						existing target and
						increments its reap_ref
					Blocks waiting for scan_mutex
	scsi_target_add() fails
	Destroy the sdev for LUN1 (the target
	  isn't destroyed because thread 1
	  has incremented reap_ref)
	Unlock scan_mutex
					Scan LUN2
					Call scsi_target_add() again

The code ends up calling device_add() twice for the same target
structure.  Even though the first call fails, the driver core does not
guarantee this will work properly.  AFAIK, nobody has ever audited that
code to see if there are any fields assumed to be 0 which may end up
being nonzero after a failed registration.

Also, notice that in its failure path, scsi_alloc_target() calls 
scsi_target_reap().  This doesn't seem right; it's unbalanced.  Here's 
what can happen:

	__scsi_add_device() calls scsi_alloc_target(), creating a new
	target with reap_ref = 1.

	It then calls scsi_probe_and_add_lun(), which calls 
	scsi_alloc_sdev(), increasing reap_ref to 2.

	Next we call scsi_add_lun(), but scsi_target_add() fails
	decreasing reap_ref back to 1.

	scsi_add_lun() returns SCSI_SCAN_NO_RESPONSE because of
	the failure, so scsi_probe_and_add_lun() goes on to call
	__scsi_remove_device().

	This causes scsi_device_dev_release_usercontext() to call
	scsi_target_reap(), decreasing reap_ref to 0 and deallocating
	the target structure.

	Finally, __scsi_add_device() calls scsi_target_reap(), passing
	a pointer to a structure that no longer exists.

The last two steps may occur in the opposite order because of the use
of a workqueue, but the end result is the same: scsi_target_reap() gets
called with a stale pointer.  It certainly looks as though
scsi_target_add() should not include the

		get_device(&starget->dev);
		scsi_target_reap(starget);
		put_device(&starget->dev);

sequence.  (Not to mention that the get_device and put_device calls are 
totally gratuitous anyway.)

Alan Stern


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-03-13 15:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-12 17:14 [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex Alan Stern
2010-02-19 16:34 ` James Bottomley
2010-02-19 20:21   ` Alan Stern
2010-03-12 21:45     ` James Bottomley
2010-03-13 15:51       ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox