* Locking scheme of /proc/scsi/scsi
From: Petr Tesarik @ 2011-11-21 17:32 UTC
To: linux-scsi
Hi folks,
I've been working on a kernel crash dump of an ancient kernel recently, and I
have come to the conclusion that walking the scsi devices via
bus_find_device() is completely flawed. While looking for an upstream fix, I
didn't find any, so the same flaw is probably still there. However, let me ask
here to check how this is supposed to work.
First, this is how I understand the issue. The "/proc/scsi/scsi" file is
handled as a pretty standard seqfile, iterating over the devices with the
following function:
static inline struct device *next_scsi_device(struct device *start)
{
	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
					      always_match);
	put_device(start);
	return next;
}
The returned value is used for the next iteration. Now, bus_find_device()
assumes that the device is still attached to the knode_bus klist, because
that's how it initializes the klist iterator. When it finds the next device,
it increments the reference count on the device with get_device(), but it
doesn't do anything about the knode_bus field. So, when somebody calls
scsi_remove_device() on the current device between two calls to
next_scsi_device, then it does:
	if (sdev->is_visible) {
		[...]
		device_del(dev);

which in turn calls:

	bus_remove_device(dev);

which does:

	if (klist_node_attached(&dev->p->knode_bus))
		klist_del(&dev->p->knode_bus);
So, even though the struct device has a non-zero refcount, the code in
next_scsi_device() cannot continue, because it only has a stale pointer to an
already detached klist node, right?
At least that's what I saw in 2.6.16, and I can still see the same thing
possible in 3.1.
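To make the sequence concrete, here is a minimal userspace sketch of the scenario described above. It is a toy model, not kernel code: every toy_* name is invented for illustration, one struct stands in for both the device and its knode_bus, and the "stale" flag is a model-only check so that the problem shows up as a return value instead of a bad pointer dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one struct stands in for a device and its knode_bus. */
struct toy_dev {
	struct toy_dev *next;	/* list linkage */
	int dev_ref;		/* models the struct device refcount */
	int node_ref;		/* models the list node's own refcount */
	bool attached;		/* models klist_node_attached() */
};

struct toy_bus {
	struct toy_dev *head;
};

/* Drop a node reference; the last put unlinks the node from the bus list. */
static void toy_node_put(struct toy_bus *bus, struct toy_dev *dev)
{
	struct toy_dev **pp = &bus->head;

	if (--dev->node_ref > 0)
		return;
	while (*pp && *pp != dev)
		pp = &(*pp)->next;
	if (*pp)
		*pp = dev->next;
	dev->next = NULL;
	dev->attached = false;
}

/* Models bus_remove_device(): delete the node if it is still attached. */
static void toy_remove(struct toy_bus *bus, struct toy_dev *dev)
{
	if (dev->attached)
		toy_node_put(bus, dev);
}

/*
 * Models next_scsi_device(): find the device after 'start', take a device
 * reference on it, drop the device reference on 'start'. The node of
 * 'start' is pinned only for the duration of this call.
 */
static struct toy_dev *toy_next_device(struct toy_bus *bus,
				       struct toy_dev *start, bool *stale)
{
	struct toy_dev *next = NULL;

	/* Model-only check, so the problem shows up as a flag, not a crash. */
	*stale = start && !start->attached;
	if (!*stale) {
		if (start)
			start->node_ref++;	/* iterator init */
		next = start ? start->next : bus->head;
		if (next)
			next->dev_ref++;	/* get_device() on the match */
		if (start)
			toy_node_put(bus, start);	/* iterator exit */
	}
	if (start)
		start->dev_ref--;		/* put_device(start) */
	return next;
}

/* Walk a -> b, remove b between two steps, then try to continue from b. */
bool toy_demo_race(void)
{
	struct toy_dev c = { NULL, 1, 1, true };
	struct toy_dev b = { &c, 1, 1, true };
	struct toy_dev a = { &b, 1, 1, true };
	struct toy_bus bus = { &a };
	bool stale;
	struct toy_dev *cur;

	cur = toy_next_device(&bus, NULL, &stale);	/* yields a */
	cur = toy_next_device(&bus, cur, &stale);	/* yields b */
	toy_remove(&bus, cur);		/* b unplugged while the reader is idle */
	assert(b.dev_ref > 0 && !b.attached);	/* device pinned, node gone */
	cur = toy_next_device(&bus, cur, &stale);	/* cannot continue */
	return stale && cur == NULL;
}
```

In this model toy_demo_race() returns true: the device reference keeps b alive, but nothing keeps its list node attached between two steps of the walk, so c is never reached.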
Please include my address in your replies, because I'm not subscribed to
linux-scsi.
Petr Tesarik
SUSE LINUX
* Re: Locking scheme of /proc/scsi/scsi
From: Hannes Reinecke @ 2011-11-22 8:59 UTC
To: Petr Tesarik; +Cc: linux-scsi
On 11/21/2011 06:32 PM, Petr Tesarik wrote:
> Hi folks,
>
> I've been working on a kernel crash dump of an ancient kernel recently, and I
> have come to the conclusion that walking the scsi devices via
> bus_find_device() is completely flawed. While looking for an upstream fix, I
> didn't find any, so the same flaw is probably still there. However, let me ask
> here to check how this is supposed to work.
>
> First, this is how I understand the issue. The "/proc/scsi/scsi" file is
> handled as a pretty standard seqfile, iterating over the devices with the
> following function:
>
> static inline struct device *next_scsi_device(struct device *start)
> {
> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
> 					      always_match);
> 	put_device(start);
> 	return next;
> }
>
> The returned value is used for the next iteration. Now, bus_find_device()
> assumes that the device is still attached to the knode_bus klist, because
> that's how it initializes the klist iterator. When it finds the next device,
> it increments the reference count on the device with get_device(), but it
> doesn't do anything about the knode_bus field. So, when somebody calls
> scsi_remove_device() on the current device between two calls to
> next_scsi_device, then it does:
>
> 	if (sdev->is_visible) {
> 		[...]
> 		device_del(dev);
>
> which in turn calls:
>
> 	bus_remove_device(dev);
>
> which does:
>
> 	if (klist_node_attached(&dev->p->knode_bus))
> 		klist_del(&dev->p->knode_bus);
>
> So, even though the struct device has a non-zero refcount, the code in
> next_scsi_device cannot continue, because it only has a stale pointer to an
> already detached klist, right?
>
> At least that's what I saw in 2.6.16, and I can still see the same thing
> possible in 3.1.
>
Hmm. Looks like we need to increase the refcount to knode_bus when
we iterate devices.
Let me check ...
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Locking scheme of /proc/scsi/scsi
From: Hannes Reinecke @ 2011-11-22 9:16 UTC
To: Petr Tesarik; +Cc: linux-scsi
On 11/22/2011 09:59 AM, Hannes Reinecke wrote:
> On 11/21/2011 06:32 PM, Petr Tesarik wrote:
>> Hi folks,
>>
>> I've been working on a kernel crash dump of an ancient kernel recently, and I
>> have come to the conclusion that walking the scsi devices via
>> bus_find_device() is completely flawed. While looking for an upstream fix, I
>> didn't find any, so the same flaw is probably still there. However, let me ask
>> here to check how this is supposed to work.
>>
>> First, this is how I understand the issue. The "/proc/scsi/scsi" file is
>> handled as a pretty standard seqfile, iterating over the devices with the
>> following function:
>>
>> static inline struct device *next_scsi_device(struct device *start)
>> {
>> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
>> 					      always_match);
>> 	put_device(start);
>> 	return next;
>> }
>>
>> The returned value is used for the next iteration. Now, bus_find_device()
>> assumes that the device is still attached to the knode_bus klist, because
>> that's how it initializes the klist iterator. When it finds the next device,
>> it increments the reference count on the device with get_device(), but it
>> doesn't do anything about the knode_bus field. So, when somebody calls
>> scsi_remove_device() on the current device between two calls to
>> next_scsi_device, then it does:
>>
>> 	if (sdev->is_visible) {
>> 		[...]
>> 		device_del(dev);
>>
>> which in turn calls:
>>
>> 	bus_remove_device(dev);
>>
>> which does:
>>
>> 	if (klist_node_attached(&dev->p->knode_bus))
>> 		klist_del(&dev->p->knode_bus);
>>
>> So, even though the struct device has a non-zero refcount, the code in
>> next_scsi_device cannot continue, because it only has a stale pointer to an
>> already detached klist, right?
>>
>> At least that's what I saw in 2.6.16, and I can still see the same thing
>> possible in 3.1.
>>
> Hmm. Looks like we need to increase the refcount to knode_bus when
> we iterate devices.
> Let me check ...
>
No, this seems to be okay. klists are protected by their own
refcounting in ->n_ref (via klist_dec_and_del()), so the list
processing can continue.
However, since you're working with 2.6.16: there was a rather serious
issue with SCSI device scanning, which was fixed by commit
32aeef605aa01e1fee45e052eceffb00e72ba2b0.
Please check whether that patch is included.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
* Re: Locking scheme of /proc/scsi/scsi
From: Petr Tesarik @ 2011-11-22 10:57 UTC
To: Hannes Reinecke; +Cc: linux-scsi
On Tuesday, 22 November 2011 10:16:23, Hannes Reinecke wrote:
> On 11/22/2011 09:59 AM, Hannes Reinecke wrote:
> > On 11/21/2011 06:32 PM, Petr Tesarik wrote:
> >> Hi folks,
> >>
> >> I've been working on a kernel crash dump of an ancient kernel recently,
> >> and I have come to the conclusion that walking the scsi devices via
> >> bus_find_device() is completely flawed. While looking for an upstream
> >> fix, I didn't find any, so the same flaw is probably still there.
> >> However, let me ask here to check how this is supposed to work.
> >>
> >> First, this is how I understand the issue. The "/proc/scsi/scsi" file is
> >> handled as a pretty standard seqfile, iterating over the devices with
> >> the following function:
> >>
> >> static inline struct device *next_scsi_device(struct device *start)
> >> {
> >> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
> >> 					      always_match);
> >> 	put_device(start);
> >> 	return next;
> >> }
> >>
> >> The returned value is used for the next iteration. Now,
> >> bus_find_device() assumes that the device is still attached to the
> >> knode_bus klist, because that's how it initializes the klist iterator.
> >> When it finds the next device, it increments the reference count on the
> >> device with get_device(), but it doesn't do anything about the
> >> knode_bus field. So, when somebody calls scsi_remove_device() on the
> >> current device between two calls to next_scsi_device, then it does:
> >>
> >> 	if (sdev->is_visible) {
> >> 		[...]
> >> 		device_del(dev);
> >>
> >> which in turn calls:
> >>
> >> 	bus_remove_device(dev);
> >>
> >> which does:
> >>
> >> 	if (klist_node_attached(&dev->p->knode_bus))
> >> 		klist_del(&dev->p->knode_bus);
> >>
> >> So, even though the struct device has a non-zero refcount, the code in
> >> next_scsi_device cannot continue, because it only has a stale pointer to
> >> an already detached klist, right?
> >>
> >> At least that's what I saw in 2.6.16, and I can still see the same thing
> >> possible in 3.1.
> >
> > Hmm. Looks like we need to increase the refcount to knode_bus when
> > we iterate devices.
> > Let me check ...
>
> No, this seems to be okay. klists are protected by their own
> refcounting in ->n_ref (via klist_dec_and_del()), so the list
> processing can continue.
Of course it has its own refcounting, and that's exactly my point! The last
reference can be dropped with klist_del() in bus_remove_device(). The seqfile
doesn't hold any reference to the knode_bus klist node. The obvious fix is to
add the extra reference somehow before it is dropped in bus_find_device() with
klist_iter_exit(&it);
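A rough userspace sketch of that idea follows, under the same caveat as any toy model: the toy_* names are hypothetical, node_ref merely stands in for the knode_bus reference count, and nothing here is an existing kernel interface. The only change from a plain walk is that the node returned by one step stays pinned until the caller hands it back as the starting point of the next step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: 'node_ref' stands in for the knode_bus reference count. */
struct toy_dev {
	struct toy_dev *next;
	int node_ref;
	bool attached;
};

struct toy_bus {
	struct toy_dev *head;
};

/* Drop a node reference; the last put unlinks the node from the bus list. */
static void toy_node_put(struct toy_bus *bus, struct toy_dev *dev)
{
	struct toy_dev **pp = &bus->head;

	if (--dev->node_ref > 0)
		return;
	while (*pp && *pp != dev)
		pp = &(*pp)->next;
	if (*pp)
		*pp = dev->next;
	dev->next = NULL;
	dev->attached = false;
}

/*
 * Hypothetical walk step: the returned node stays pinned until the caller
 * passes it back in as 'start', so a removal in between cannot unlink it.
 */
static struct toy_dev *toy_next_pinned(struct toy_bus *bus,
				       struct toy_dev *start)
{
	struct toy_dev *next = start ? start->next : bus->head;

	if (next)
		next->node_ref++;		/* reference held across the gap */
	if (start)
		toy_node_put(bus, start);	/* release the previous position */
	return next;
}

/* Walk a -> b, remove b between two steps, and continue on to c. */
bool toy_demo_pinned(void)
{
	struct toy_dev c = { NULL, 1, true };
	struct toy_dev b = { &c, 1, true };
	struct toy_dev a = { &b, 1, true };
	struct toy_bus bus = { &a };
	struct toy_dev *cur;

	cur = toy_next_pinned(&bus, NULL);	/* yields a */
	cur = toy_next_pinned(&bus, cur);	/* yields b */
	toy_node_put(&bus, cur);	/* removal drops the list's own reference */
	assert(b.attached);		/* still linked: the walker's pin remains */
	cur = toy_next_pinned(&bus, cur);	/* yields c, then b is unlinked */
	return cur == &c && !b.attached && a.next == &c;
}
```

In this model toy_demo_pinned() returns true: the walk reaches c, and b leaves the list only once the walker has moved past it. A real implementation would also have to drop the pin when the walk stops early, and where such a reference should be taken (driver core, klist iterator, or the seqfile code) is left open here.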
> However, seeing that you're working with 2.6.16 there has been a
> rather serious issue with scsi device scanning, which has been fixed
> by 32aeef605aa01e1fee45e052eceffb00e72ba2b0.
Thank you for the pointer. This is patches.fixes/scsi-restart-lookup-by-target
in the SLES kernel, and you added it yourself. ;-)
Cheers,
Petr Tesarik
SUSE LINUX