* [PATCH] driver model/scsi: synchronize pm calls with probe/remove
@ 2005-03-21 9:18 Tejun Heo
2005-03-21 14:40 ` Dmitry Torokhov
0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2005-03-21 9:18 UTC (permalink / raw)
To: dtor_core, mochel, James.Bottomley; +Cc: linux-kernel, linux-scsi
Hello, Dmitry, Mochel and James.
I've been looking at sd code and found seemingly bogus 'if (!sdkp)'
tests with /* this can happen */ comment. I've digged changelog and
found out that this was to prevent oops which occurs if some driver
gets stuck inside ->probe and the machine goes down and calls back
->remove. IMHO, we should avoid this problem by fixing driver ->probe
or ->remove callbacks instead of detecting and bypassing
half-initialized/destroyed devices in pm callbacks.
This patch read-locks a device's bus using device_pm_down_read_bus()
before invoking any pm callback. The function also periodically
prints out warning messages while waiting for the bus subsys rwsem.
This patch makes the machine wait indefinitely if any driver is stuck
inside ->probe or ->remove.
In device_shutdown(), devices_subsys.kset.list is walked while
holding devices_subsys.rwsem. This patch nests each bus's subsys
rwsem inside.
Signed-off-by: Tejun Heo <htejun@gmail.com>
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2005/03/21 17:22:41+09:00 tj@htj.dyndns.org
# device_pm_down_read_bus() implemented.
#
# drivers/scsi/sd.c
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +0 -6
# device_pm_down_read_bus() implemented.
#
# drivers/base/power/suspend.c
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +4 -1
# device_pm_down_read_bus() implemented.
#
# drivers/base/power/shutdown.c
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +13 -5
# device_pm_down_read_bus() implemented.
#
# drivers/base/power/resume.c
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +7 -2
# device_pm_down_read_bus() implemented.
#
# drivers/base/power/power.h
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +2 -0
# device_pm_down_read_bus() implemented.
#
# drivers/base/power/main.c
# 2005/03/21 17:22:33+09:00 tj@htj.dyndns.org +23 -1
# device_pm_down_read_bus() implemented.
#
diff -Nru a/drivers/base/power/main.c b/drivers/base/power/main.c
--- a/drivers/base/power/main.c 2005-03-21 17:25:50 +09:00
+++ b/drivers/base/power/main.c 2005-03-21 17:25:50 +09:00
@@ -21,6 +21,7 @@
#include <linux/config.h>
#include <linux/device.h>
+#include <linux/delay.h>
#include "power.h"
LIST_HEAD(dpm_active);
@@ -96,4 +97,25 @@
up(&dpm_list_sem);
}
-
+/**
+ * device_pm_down_read_bus - Read-lock dev->bus's rwsem.
+ * @dev: The bus of this dev is read locked on return.
+ * @opstr: Error message prefix.
+ *
+ * Read locks the subsys rwsem of the device's bus. Generally pm
+ * calls are made with processes frozen, so there shouldn't be
+ * any contention; however, the shutdown path is invoked when
+ * halting the machine and it's possible to have some drivers
+ * stuck inside ->probe or ->remove method. In such cases, we
+ * retry while printing a warning message every 10s.
+ */
+void device_pm_down_read_bus(struct device * dev, const char * opstr)
+{
+ int cnt = 0;
+ while (!down_read_trylock(&dev->bus->subsys.rwsem)) {
+ if (cnt++ % 100 == 0)
+ printk(KERN_WARNING "%s %s%s: waiting on bus subsys "
+ "rwsem\n", opstr, dev->bus->name, dev->bus_id);
+ msleep(100);
+ }
+}
diff -Nru a/drivers/base/power/power.h b/drivers/base/power/power.h
--- a/drivers/base/power/power.h 2005-03-21 17:25:50 +09:00
+++ b/drivers/base/power/power.h 2005-03-21 17:25:50 +09:00
@@ -53,6 +53,8 @@
extern int device_pm_add(struct device *);
extern void device_pm_remove(struct device *);
+extern void device_pm_down_read_bus(struct device * dev, const char *opstr);
+
/*
* sysfs.c
*/
diff -Nru a/drivers/base/power/resume.c b/drivers/base/power/resume.c
--- a/drivers/base/power/resume.c 2005-03-21 17:25:50 +09:00
+++ b/drivers/base/power/resume.c 2005-03-21 17:25:50 +09:00
@@ -22,8 +22,13 @@
int resume_device(struct device * dev)
{
- if (dev->bus && dev->bus->resume)
- return dev->bus->resume(dev);
+ if (dev->bus && dev->bus->resume) {
+ int ret;
+ device_pm_down_read_bus(dev, "Resuming");
+ ret = dev->bus->resume(dev);
+ up_read(&dev->bus->subsys.rwsem);
+ return ret;
+ }
return 0;
}
diff -Nru a/drivers/base/power/shutdown.c b/drivers/base/power/shutdown.c
--- a/drivers/base/power/shutdown.c 2005-03-21 17:25:50 +09:00
+++ b/drivers/base/power/shutdown.c 2005-03-21 17:25:50 +09:00
@@ -25,6 +25,8 @@
return 0;
if (dev->detach_state == DEVICE_PM_OFF) {
+ /* We have bus rwsem write-locked on entry to this
+ * function. No need to mess with the bus rwsem. */
if (dev->driver && dev->driver->shutdown)
dev->driver->shutdown(dev);
return 0;
@@ -54,11 +56,17 @@
down_write(&devices_subsys.rwsem);
list_for_each_entry_reverse(dev, &devices_subsys.kset.list, kobj.entry) {
pr_debug("shutting down %s: ", dev->bus_id);
- if (dev->driver && dev->driver->shutdown) {
- pr_debug("Ok\n");
- dev->driver->shutdown(dev);
- } else
- pr_debug("Ignored.\n");
+ if (dev->bus) {
+ device_pm_down_read_bus(dev, "Shutting down");
+ if (dev->driver && dev->driver->shutdown) {
+ pr_debug("Ok\n");
+ dev->driver->shutdown(dev);
+ up_read(&dev->bus->subsys.rwsem);
+ continue;
+ }
+ up_read(&dev->bus->subsys.rwsem);
+ }
+ pr_debug("Ignored.\n");
}
up_write(&devices_subsys.rwsem);
diff -Nru a/drivers/base/power/suspend.c b/drivers/base/power/suspend.c
--- a/drivers/base/power/suspend.c 2005-03-21 17:25:50 +09:00
+++ b/drivers/base/power/suspend.c 2005-03-21 17:25:50 +09:00
@@ -43,8 +43,11 @@
dev->power.prev_state = dev->power.power_state;
- if (dev->bus && dev->bus->suspend && !dev->power.power_state)
+ if (dev->bus && dev->bus->suspend && !dev->power.power_state) {
+ device_pm_down_read_bus(dev, "Suspending");
error = dev->bus->suspend(dev, state);
+ up_read(&dev->bus->subsys.rwsem);
+ }
return error;
}
diff -Nru a/drivers/scsi/sd.c b/drivers/scsi/sd.c
--- a/drivers/scsi/sd.c 2005-03-21 17:25:50 +09:00
+++ b/drivers/scsi/sd.c 2005-03-21 17:25:50 +09:00
@@ -730,9 +730,6 @@
struct scsi_device *sdp = to_scsi_device(dev);
struct scsi_disk *sdkp = dev_get_drvdata(dev);
- if (!sdkp)
- return -ENODEV;
-
if (!sdkp->WCE)
return 0;
@@ -1682,9 +1679,6 @@
{
struct scsi_device *sdp = to_scsi_device(dev);
struct scsi_disk *sdkp = dev_get_drvdata(dev);
-
- if (!sdkp)
- return; /* this can happen */
if (!sdkp->WCE)
return;
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] driver model/scsi: synchronize pm calls with probe/remove
2005-03-21 9:18 [PATCH] driver model/scsi: synchronize pm calls with probe/remove Tejun Heo
@ 2005-03-21 14:40 ` Dmitry Torokhov
2005-03-22 4:57 ` Tejun Heo
0 siblings, 1 reply; 3+ messages in thread
From: Dmitry Torokhov @ 2005-03-21 14:40 UTC (permalink / raw)
To: Tejun Heo; +Cc: mochel, James.Bottomley, linux-kernel, linux-scsi
On Mon, 21 Mar 2005 18:18:46 +0900, Tejun Heo <htejun@gmail.com> wrote:
> Hello, Dmitry, Mochel and James.
>
> I've been looking at sd code and found seemingly bogus 'if (!sdkp)'
> tests with /* this can happen */ comment. I've digged changelog and
> found out that this was to prevent oops which occurs if some driver
> gets stuck inside ->probe and the machine goes down and calls back
> ->remove. IMHO, we should avoid this problem by fixing driver ->probe
> or ->remove callbacks instead of detecting and bypassing
> half-initialized/destroyed devices in pm callbacks.
>
> This patch read-locks a device's bus using device_pm_down_read_bus()
> before invoking any pm callback.
Hi Tejun,
There are talks about getting rid of bus's rwsem and replacing it with
a per-device semaphore to serialize probe, remove, suspend and resume.
This should resolve entire host of problems including this one, if I
unrerstand it correctly.
Please take a look here:
http://seclists.org/lists/linux-kernel/2005/Mar/5847.html
--
Dmitry
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] driver model/scsi: synchronize pm calls with probe/remove
2005-03-21 14:40 ` Dmitry Torokhov
@ 2005-03-22 4:57 ` Tejun Heo
0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2005-03-22 4:57 UTC (permalink / raw)
To: dtor_core; +Cc: mochel, James.Bottomley, linux-kernel, linux-scsi
Hi, Dmitry.
Dmitry Torokhov wrote:
> On Mon, 21 Mar 2005 18:18:46 +0900, Tejun Heo <htejun@gmail.com> wrote:
>
>>Hello, Dmitry, Mochel and James.
>>
>>I've been looking at sd code and found seemingly bogus 'if (!sdkp)'
>>tests with /* this can happen */ comment. I've digged changelog and
>>found out that this was to prevent oops which occurs if some driver
>>gets stuck inside ->probe and the machine goes down and calls back
>>->remove. IMHO, we should avoid this problem by fixing driver ->probe
>>or ->remove callbacks instead of detecting and bypassing
>>half-initialized/destroyed devices in pm callbacks.
>>
>>This patch read-locks a device's bus using device_pm_down_read_bus()
>>before invoking any pm callback.
>
>
> Hi Tejun,
>
> There are talks about getting rid of bus's rwsem and replacing it with
> a per-device semaphore to serialize probe, remove, suspend and resume.
> This should resolve entire host of problems including this one, if I
> unrerstand it correctly.
>
> Please take a look here:
> http://seclists.org/lists/linux-kernel/2005/Mar/5847.html
>
Yeap, sounds great. Hmmm.. as the final result will (and should) be
the same for inidividual drivers (no overlapping callback invocations),
how about incorporating my patch before implementing the proposed fix
such that we can get rid of the awkward semantic first? The proposed
change should change the same part of code anyway, so I don't think this
would be a hassle.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2005-03-22 4:58 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-21 9:18 [PATCH] driver model/scsi: synchronize pm calls with probe/remove Tejun Heo
2005-03-21 14:40 ` Dmitry Torokhov
2005-03-22 4:57 ` Tejun Heo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox