Linux SCSI subsystem development
* Locking scheme of /proc/scsi/scsi
@ 2011-11-21 17:32 Petr Tesarik
  2011-11-22  8:59 ` Hannes Reinecke
  0 siblings, 1 reply; 4+ messages in thread
From: Petr Tesarik @ 2011-11-21 17:32 UTC
  To: linux-scsi

Hi folks,

I've been working on a kernel crash dump of an ancient kernel recently, and I 
have come to the conclusion that walking the scsi devices via 
bus_find_device() is completely flawed. While looking for an upstream fix, I 
didn't find any, so the same flaw is probably still there. However, let me ask 
here to check how this is supposed to work.

First, this is how I understand the issue. The "/proc/scsi/scsi" file is 
handled as a pretty standard seqfile, iterating over the devices with the 
following function:

static inline struct device *next_scsi_device(struct device *start)
{
	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
					      always_match);
	put_device(start);
	return next;
}

The returned value is used for the next iteration. Now, bus_find_device() 
assumes that the device is still attached to the knode_bus klist, because 
that's how it initializes the klist iterator. When it finds the next device, 
it increments the reference count on the device with get_device(), but it 
doesn't do anything about the knode_bus field. So, when somebody calls 
scsi_remove_device() on the current device between two calls to 
next_scsi_device, then it does:

	if (sdev->is_visible) {
[...]
		device_del(dev);

which in turn calls:

	bus_remove_device(dev);

which does:

		if (klist_node_attached(&dev->p->knode_bus))
			klist_del(&dev->p->knode_bus);

So, even though the struct device has a non-zero refcount, the code in 
next_scsi_device cannot continue, because it only has a stale pointer to an 
already detached klist, right?

At least that's what I saw in 2.6.16, and I can still see the same thing 
possible in 3.1.
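
To make the suspected sequence concrete, here is a minimal user-space sketch.
It is not kernel code: the model_* names, the array-backed "bus" and the two
plain integer counters are assumptions made for illustration. They stand in
for the struct device refcount and the knode_bus reference count described
above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX 4

/* Simplified user-space model; every name here is hypothetical. */
struct model_dev {
	int dev_ref;	/* stands in for the struct device refcount */
	int node_ref;	/* stands in for the knode_bus reference count */
	int slot;	/* position on the bus list, -1 once unlinked */
};

struct model_bus {
	struct model_dev *slots[MODEL_MAX];
};

static void model_add(struct model_bus *bus, struct model_dev *d, int slot)
{
	d->dev_ref = 1;		/* owner's reference */
	d->node_ref = 1;	/* reference held for list membership */
	d->slot = slot;
	bus->slots[slot] = d;
}

/* Drop one node reference; the last one unlinks the node. */
static void model_node_put(struct model_bus *bus, struct model_dev *d)
{
	assert(d->node_ref > 0);
	if (--d->node_ref == 0) {
		bus->slots[d->slot] = NULL;
		d->slot = -1;
	}
}

/* Models removal: list membership and the owner's device ref both go. */
static void model_remove(struct model_bus *bus, struct model_dev *d)
{
	model_node_put(bus, d);
	d->dev_ref--;
}

/* Models a find-next lookup with a match-all callback. */
static struct model_dev *model_find_next(struct model_bus *bus,
					 struct model_dev *start)
{
	struct model_dev *found = NULL;
	int i = 0;

	if (start) {
		assert(start->slot >= 0);	/* model precondition */
		start->node_ref++;		/* iterator pins the start node */
		i = start->slot + 1;
	}
	for (; i < MODEL_MAX; i++) {
		if (bus->slots[i]) {
			found = bus->slots[i];
			found->dev_ref++;	/* device reference only */
			break;
		}
	}
	if (start)
		model_node_put(bus, start);	/* pin dropped on exit */
	return found;
}

/* Returns 1 if the cursor is still referenced but no longer on the list. */
int model_race_leaves_stale_cursor(void)
{
	struct model_bus bus = { { NULL } };
	struct model_dev a, b;
	struct model_dev *cur;

	model_add(&bus, &a, 0);
	model_add(&bus, &b, 1);

	cur = model_find_next(&bus, NULL);	/* first read returns &a */
	model_remove(&bus, cur);		/* removal between two reads */

	return cur->dev_ref > 0 && cur->slot < 0;
}
```

In this model the cursor keeps a positive device count after the removal, yet
its list position is gone. A further model_find_next(&bus, cur) would trip the
model's precondition assertion, which is the point where a real iterator
would have to trust linkage that is no longer valid.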

Please include my address in your replies, because I'm not subscribed to
linux-scsi.

Petr Tesarik
SUSE LINUX

* Re: Locking scheme of /proc/scsi/scsi
  2011-11-21 17:32 Locking scheme of /proc/scsi/scsi Petr Tesarik
@ 2011-11-22  8:59 ` Hannes Reinecke
  2011-11-22  9:16   ` Hannes Reinecke
  0 siblings, 1 reply; 4+ messages in thread
From: Hannes Reinecke @ 2011-11-22  8:59 UTC
  To: Petr Tesarik; +Cc: linux-scsi

On 11/21/2011 06:32 PM, Petr Tesarik wrote:
> Hi folks,
> 
> I've been working on a kernel crash dump of an ancient kernel recently, and I 
> have come to the conclusion that walking the scsi devices via 
> bus_find_device() is completely flawed. While looking for an upstream fix, I 
> didn't find any, so the same flaw is probably still there. However, let me ask 
> here to check how this is supposed to work.
> 
> First, this is how I understand the issue. The "/proc/scsi/scsi" file is 
> handled as a pretty standard seqfile, iterating over the devices with the 
> following function:
> 
> static inline struct device *next_scsi_device(struct device *start)
> {
> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
> 					      always_match);
> 	put_device(start);
> 	return next;
> }
> 
> The returned value is used for the next iteration. Now, bus_find_device() 
> assumes that the device is still attached to the knode_bus klist, because 
> that's how it initializes the klist iterator. When it finds the next device, 
> it increments the reference count on the device with get_device(), but it 
> doesn't do anything about the knode_bus field. So, when somebody calls 
> scsi_remove_device() on the current device between two calls to 
> next_scsi_device, then it does:
> 
> 	if (sdev->is_visible) {
> [...]
> 		device_del(dev);
> 
> which in turn calls:
> 
> 	bus_remove_device(dev);
> 
> which does:
> 
> 		if (klist_node_attached(&dev->p->knode_bus))
> 			klist_del(&dev->p->knode_bus);
> 
> So, even though the struct device has a non-zero refcount, the code in 
> next_scsi_device cannot continue, because it only has a stale pointer to an 
> already detached klist, right?
> 
> At least that's what I saw in 2.6.16, and I can still see the same thing 
> possible in 3.1.
> 
Hmm. Looks like we need to increase the refcount to knode_bus when
we iterate devices.
Let me check ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Locking scheme of /proc/scsi/scsi
  2011-11-22  8:59 ` Hannes Reinecke
@ 2011-11-22  9:16   ` Hannes Reinecke
  2011-11-22 10:57     ` Petr Tesarik
  0 siblings, 1 reply; 4+ messages in thread
From: Hannes Reinecke @ 2011-11-22  9:16 UTC
  To: Petr Tesarik; +Cc: linux-scsi

On 11/22/2011 09:59 AM, Hannes Reinecke wrote:
> On 11/21/2011 06:32 PM, Petr Tesarik wrote:
>> Hi folks,
>>
>> I've been working on a kernel crash dump of an ancient kernel recently, and I 
>> have come to the conclusion that walking the scsi devices via 
>> bus_find_device() is completely flawed. While looking for an upstream fix, I 
>> didn't find any, so the same flaw is probably still there. However, let me ask 
>> here to check how this is supposed to work.
>>
>> First, this is how I understand the issue. The "/proc/scsi/scsi" file is 
>> handled as a pretty standard seqfile, iterating over the devices with the 
>> following function:
>>
>> static inline struct device *next_scsi_device(struct device *start)
>> {
>> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
>> 					      always_match);
>> 	put_device(start);
>> 	return next;
>> }
>>
>> The returned value is used for the next iteration. Now, bus_find_device() 
>> assumes that the device is still attached to the knode_bus klist, because 
>> that's how it initializes the klist iterator. When it finds the next device, 
>> it increments the reference count on the device with get_device(), but it 
>> doesn't do anything about the knode_bus field. So, when somebody calls 
>> scsi_remove_device() on the current device between two calls to 
>> next_scsi_device, then it does:
>>
>> 	if (sdev->is_visible) {
>> [...]
>> 		device_del(dev);
>>
>> which in turn calls:
>>
>> 	bus_remove_device(dev);
>>
>> which does:
>>
>> 		if (klist_node_attached(&dev->p->knode_bus))
>> 			klist_del(&dev->p->knode_bus);
>>
>> So, even though the struct device has a non-zero refcount, the code in 
>> next_scsi_device cannot continue, because it only has a stale pointer to an 
>> already detached klist, right?
>>
>> At least that's what I saw in 2.6.16, and I can still see the same thing 
>> possible in 3.1.
>>
> Hmm. Looks like we need to increase the refcount to knode_bus when
> we iterate devices.
> Let me check ...
> 
No, this seems to be okay. klists are protected by their own
refcounting in ->n_ref (via klist_dec_and_del()), so the list
processing can continue.
However, seeing that you're working with 2.6.16 there has been a
rather serious issue with scsi device scanning, which has been fixed
by 32aeef605aa01e1fee45e052eceffb00e72ba2b0.
Please check whether that patch is included.
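
As a rough illustration of what the node refcount does and does not cover,
here is a small user-space sketch. It is a simplified model rather than the
klist implementation: the model_* names and the plain integer counter are
assumptions for illustration. It contrasts a deletion that happens while an
iterator is parked on the node with a deletion that happens after the
iterator has already left.

```c
#include <assert.h>

/* Simplified model of a refcounted list node; names are hypothetical. */
struct model_node {
	int ref;	/* stands in for ->n_ref */
	int linked;	/* 1 while the node is still on the list */
};

static void model_node_init(struct model_node *n)
{
	n->ref = 1;	/* reference held for list membership */
	n->linked = 1;
}

/* Drop one reference; only the last put unlinks the node. */
static void model_put(struct model_node *n)
{
	assert(n->ref > 0);
	if (--n->ref == 0)
		n->linked = 0;
}

/* An iterator positioned on the node holds one reference. */
static void model_iter_enter(struct model_node *n)
{
	n->ref++;
}

static void model_iter_leave(struct model_node *n)
{
	model_put(n);
}

/* Deletion drops the list-membership reference. */
static void model_del(struct model_node *n)
{
	model_put(n);
}

/* Deletion under a parked iterator: the unlink is deferred. */
int model_del_under_iterator_is_deferred(void)
{
	struct model_node n;
	int linked_during;

	model_node_init(&n);
	model_iter_enter(&n);
	model_del(&n);
	linked_during = n.linked;	/* iterator still pins the node */
	model_iter_leave(&n);

	return linked_during && !n.linked;
}

/* Deletion after the iterator left: the node is unlinked at once. */
int model_linked_after_del_without_iterator(void)
{
	struct model_node n;

	model_node_init(&n);
	model_iter_enter(&n);
	model_iter_leave(&n);		/* nothing pins the node any more */
	model_del(&n);

	return n.linked;
}
```

In the model, the counter keeps the node linked only for as long as some
iterator is actually positioned on it. Whether a caller that merely remembers
a device pointer between two lookups is covered depends on whether anything
holds such a reference in between.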

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

* Re: Locking scheme of /proc/scsi/scsi
  2011-11-22  9:16   ` Hannes Reinecke
@ 2011-11-22 10:57     ` Petr Tesarik
  0 siblings, 0 replies; 4+ messages in thread
From: Petr Tesarik @ 2011-11-22 10:57 UTC
  To: Hannes Reinecke; +Cc: linux-scsi

On Tuesday, 22 November 2011 10:16:23, Hannes Reinecke wrote:
> On 11/22/2011 09:59 AM, Hannes Reinecke wrote:
> > On 11/21/2011 06:32 PM, Petr Tesarik wrote:
> >> Hi folks,
> >> 
> >> I've been working on a kernel crash dump of an ancient kernel recently,
> >> and I have come to the conclusion that walking the scsi devices via
> >> bus_find_device() is completely flawed. While looking for an upstream
> >> fix, I didn't find any, so the same flaw is probably still there.
> >> However, let me ask here to check how this is supposed to work.
> >> 
> >> First, this is how I understand the issue. The "/proc/scsi/scsi" file is
> >> handled as a pretty standard seqfile, iterating over the devices with
> >> the following function:
> >> 
> >> static inline struct device *next_scsi_device(struct device *start)
> >> {
> >> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
> >> 					      always_match);
> >> 	put_device(start);
> >> 	return next;
> >> }
> >> 
> >> The returned value is used for the next iteration. Now,
> >> bus_find_device() assumes that the device is still attached to the
> >> knode_bus klist, because that's how it initializes the klist iterator.
> >> When it finds the next device, it increments the reference count on the
> >> device with get_device(), but it doesn't do anything about the
> >> knode_bus field. So, when somebody calls scsi_remove_device() on the
> >> current device between two calls to next_scsi_device, then it does:
> >> 
> >> 	if (sdev->is_visible) {
> >> [...]
> >> 		device_del(dev);
> >> 
> >> which in turn calls:
> >> 
> >> 	bus_remove_device(dev);
> >> 
> >> which does:
> >> 
> >> 		if (klist_node_attached(&dev->p->knode_bus))
> >> 			klist_del(&dev->p->knode_bus);
> >> 
> >> So, even though the struct device has a non-zero refcount, the code in
> >> next_scsi_device cannot continue, because it only has a stale pointer to
> >> an already detached klist, right?
> >> 
> >> At least that's what I saw in 2.6.16, and I can still see the same thing
> >> possible in 3.1.
> > 
> > Hmm. Looks like we need to increase the refcount to knode_bus when
> > we iterate devices.
> > Let me check ...
> 
> No, this seems to be okay. klists are protected by their own
> refcounting in ->n_ref (via klist_dec_and_del()), so the list
> processing can continue.

Of course it has its own refcounting, and that's exactly my point! The last 
reference can be dropped with klist_del() in bus_remove_device(). The seqfile 
doesn't hold any reference to the knode_bus klist node. The obvious fix is to 
add the extra reference somehow before it is dropped in bus_find_device() with

	klist_iter_exit(&it);
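
A minimal user-space sketch of that idea follows. It is only a model under
stated assumptions, not a kernel patch: the model_* names, the array-backed
"bus" and the "dead" flag (used so that new lookups skip a removed but still
pinned entry) are all hypothetical. The lookup hands back the next device
with its node still pinned, and releases the previous cursor's pin only after
the new position has been found.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX 4

/* Simplified user-space model; every name here is hypothetical. */
struct model_dev {
	int dev_ref;	/* stands in for the struct device refcount */
	int node_ref;	/* stands in for the knode_bus reference count */
	int dead;	/* assumption: removed entries are skipped by lookups */
	int slot;	/* position on the bus list, -1 once unlinked */
};

struct model_bus {
	struct model_dev *slots[MODEL_MAX];
};

static void model_add(struct model_bus *bus, struct model_dev *d, int slot)
{
	d->dev_ref = 1;		/* owner's reference */
	d->node_ref = 1;	/* reference held for list membership */
	d->dead = 0;
	d->slot = slot;
	bus->slots[slot] = d;
}

/* Drop one node reference; the last one unlinks the node. */
static void model_node_put(struct model_bus *bus, struct model_dev *d)
{
	assert(d->node_ref > 0);
	if (--d->node_ref == 0) {
		bus->slots[d->slot] = NULL;
		d->slot = -1;
	}
}

/* Models removal: list membership and the owner's device ref both go. */
static void model_remove(struct model_bus *bus, struct model_dev *d)
{
	d->dead = 1;
	model_node_put(bus, d);
	d->dev_ref--;
}

/* Lookup that keeps the returned entry's node pinned for the caller. */
static struct model_dev *model_next_pinned(struct model_bus *bus,
					   struct model_dev *start)
{
	struct model_dev *found = NULL;
	int i;

	assert(!start || start->slot >= 0);	/* guaranteed by the pin */
	for (i = start ? start->slot + 1 : 0; i < MODEL_MAX; i++) {
		struct model_dev *d = bus->slots[i];

		if (d && !d->dead) {
			d->dev_ref++;	/* device reference */
			d->node_ref++;	/* extra node reference (the pin) */
			found = d;
			break;
		}
	}
	if (start) {
		model_node_put(bus, start);	/* release the old pin last */
		start->dev_ref--;
	}
	return found;
}

/* Returns 1 if the walk continues past an entry removed mid-iteration. */
int model_walk_survives_removal(void)
{
	struct model_bus bus = { { NULL } };
	struct model_dev a, b;
	struct model_dev *cur, *next;

	model_add(&bus, &a, 0);
	model_add(&bus, &b, 1);

	cur = model_next_pinned(&bus, NULL);	/* &a, node pinned */
	model_remove(&bus, cur);		/* removal between two reads */
	next = model_next_pinned(&bus, cur);	/* position is still valid */

	return cur == &a && next == &b && a.slot < 0 && a.dev_ref == 0;
}
```

In this model the removed entry stays linked until the cursor moves past it,
and it is unlinked and fully released only when the next lookup drops the pin.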

> However, seeing that you're working with 2.6.16 there has been a
> rather serious issue with scsi device scanning, which has been fixed
> by 32aeef605aa01e1fee45e052eceffb00e72ba2b0.

Thank you for the pointer. This is patches.fixes/scsi-restart-lookup-by-target 
in the SLES kernel, and you added it yourself. ;-)

Cheers,
Petr Tesarik
SUSE LINUX
