From: Petr Tesarik <ptesarik@suse.cz>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Locking scheme of /proc/scsi/scsi
Date: Tue, 22 Nov 2011 11:57:19 +0100
Message-ID: <201111221157.20099.ptesarik@suse.cz>
In-Reply-To: <4ECB6867.8020509@suse.de>

On Tuesday, 22 November 2011 10:16:23 Hannes Reinecke wrote:
> On 11/22/2011 09:59 AM, Hannes Reinecke wrote:
> > On 11/21/2011 06:32 PM, Petr Tesarik wrote:
> >> Hi folks,
> >>
> >> I've been working on a kernel crash dump of an ancient kernel recently,
> >> and I have come to the conclusion that walking the scsi devices via
> >> bus_find_device() is completely flawed. While looking for an upstream
> >> fix, I didn't find any, so the same flaw is probably still there.
> >> However, let me ask here to check how this is supposed to work.
> >>
> >> First, this is how I understand the issue. The "/proc/scsi/scsi" file is
> >> handled as a pretty standard seqfile, iterating over the devices with
> >> the following function:
> >>
> >> static inline struct device *next_scsi_device(struct device *start)
> >> {
> >> 	struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
> >> 					      always_match);
> >> 	put_device(start);
> >> 	return next;
> >> }
> >>
> >> The returned value is used for the next iteration. Now,
> >> bus_find_device() assumes that the device is still attached to the
> >> knode_bus klist, because that's how it initializes the klist iterator.
> >> When it finds the next device, it increments the reference count on the
> >> device with get_device(), but it doesn't do anything about the
> >> knode_bus field. So, when somebody calls scsi_remove_device() on the
> >> current device between two calls to next_scsi_device, then it does:
> >>
> >> 	if (sdev->is_visible) {
> >> 		[...]
> >> 		device_del(dev);
> >>
> >> which in turn calls:
> >>
> >> 	bus_remove_device(dev);
> >>
> >> which does:
> >>
> >> 	if (klist_node_attached(&dev->p->knode_bus))
> >> 		klist_del(&dev->p->knode_bus);
> >>
> >>
> >> So, even though the struct device has a non-zero refcount, the code in
> >> next_scsi_device cannot continue, because it only has a stale pointer to
> >> an already detached klist, right?
> >>
> >> At least that's what I saw in 2.6.16, and I can still see the same thing
> >> possible in 3.1.
> >
> > Hmm. Looks like we need to increase the refcount to knode_bus when
> > we iterate devices.
> > Let me check ...
>
> No, this seems to be okay. klists are protected by their own
> refcounting in ->n_ref (via klist_dec_and_del()), so the list
> processing can continue.

Of course it has its own refcounting, and that's exactly my point! The last
reference can be dropped by klist_del() in bus_remove_device(), because the
seqfile doesn't hold any reference to the knode_bus klist node between calls.
The obvious fix is to take an extra reference somehow before bus_find_device()
drops the iterator's reference with:

	klist_iter_exit(&it);
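
To make the window concrete, here is a minimal userspace sketch of the
sequence described above. It is only a model, not kernel code: the model_*
names and cursor_survives_removal() are hypothetical, locking is left out,
and the node is reduced to a plain counter plus list links. It assumes that
the list itself holds one reference per node, that the iterator holds one
while it runs, and that dropping the last reference unlinks the node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a klist node; NOT the kernel's struct klist_node. */
struct model_node {
	struct model_node *prev, *next;	/* list links, NULL once unlinked */
	int n_ref;			/* plays the role of ->n_ref */
	bool attached;			/* plays the role of klist_node_attached() */
};

struct model_list {
	struct model_node head;		/* sentinel node */
};

static void model_list_init(struct model_list *l)
{
	l->head.prev = l->head.next = &l->head;
	l->head.n_ref = 0;
	l->head.attached = false;
}

static void model_add_tail(struct model_list *l, struct model_node *n)
{
	n->prev = l->head.prev;
	n->next = &l->head;
	l->head.prev->next = n;
	l->head.prev = n;
	n->n_ref = 1;			/* the list's own reference */
	n->attached = true;
}

static void model_get(struct model_node *n)
{
	n->n_ref++;
}

/* Drop one reference; the last put unlinks the node from the list. */
static void model_put(struct model_node *n)
{
	if (--n->n_ref > 0)
		return;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
	n->attached = false;
}

/*
 * Model the gap between two next_scsi_device() calls.
 * pin == false: the walker keeps no node reference across the gap.
 * pin == true:  the walker keeps one extra reference (the suggested fix).
 * Returns true if the saved cursor is still linked after the removal.
 */
bool cursor_survives_removal(bool pin)
{
	struct model_list list;
	struct model_node a, b;
	bool ok;

	model_list_init(&list);
	model_add_tail(&list, &a);
	model_add_tail(&list, &b);

	model_get(&a);		/* iterator reference while the lookup runs */
	if (pin)
		model_get(&a);	/* hypothetical extra reference for the walker */
	model_put(&a);		/* models klist_iter_exit() */

	model_put(&a);		/* models klist_del() in bus_remove_device() */

	ok = a.attached && a.next == &b;
	if (pin)
		model_put(&a);	/* walker releases its reference when done */
	return ok;
}
```

In this model cursor_survives_removal(false) leaves the cursor unlinked with
NULL links, which corresponds to the stale pointer I saw in the dump, while
cursor_survives_removal(true) can still step from the cursor to the next node.
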
> However, seeing that you're working with 2.6.16 there has been a
> rather serious issue with scsi device scanning, which has been fixed
> by 32aeef605aa01e1fee45e052eceffb00e72ba2b0.
Thank you for the pointer. This is patches.fixes/scsi-restart-lookup-by-target
in the SLES kernel, and you added it yourself. ;-)
Cheers,
Petr Tesarik
SUSE LINUX