public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* lockdep problem conversion semaphore->mutex (dev->sem)
@ 2007-12-07 23:02 Remy Bohmer
  2007-12-08 12:16 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Remy Bohmer @ 2007-12-07 23:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner

[-- Attachment #1: Type: text/plain, Size: 2936 bytes --]

Hello Peter,

> > What specifically is wrong with dev->sem ?
>
> Nothing really, other than that they use semaphores to avoid lockdep :-/
>
> I think I know how to annotate this, after Alan Stern explained all the
> use cases, but I haven't come around to implementing it. Hope to do that
> soonish.

I was looking for an easy semaphore I could convert to a mutex, and I
ran into one that was widely spread and interesting, and which seemed
quite doable at first sight.
So, I started working on it, but was forgotten this discussion, (until
Daniel made me remember it this afternoon). So, I (stupid me ;-) )
tried to convert dev->sem...

After doing the monkey part of the conversion I can boot the kernel
completely on X86 and ARM, and everything works fine, except after
enabling lockdep, lockdep starts complaining...

Is this the problem you were pointing at?
=======================================================
Setting up standard PCI resources
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing

=============================================
[ INFO: possible recursive locking detected ]
2.6.24-rc4 #5
---------------------------------------------
swapper/1 is trying to acquire lock:
 (&dev->mut){--..}, at: [<c056760c>] __driver_attach+0x2c/0x5f

but task is already holding lock:
 (&dev->mut){--..}, at: [<c05675fd>] __driver_attach+0x1d/0x5f

other info that might help us debug this:
1 lock held by swapper/1:
 #0:  (&dev->mut){--..}, at: [<c05675fd>] __driver_attach+0x1d/0x5f

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.24-rc4 #5
 [<c0448a25>] __lock_acquire+0x17b/0xb83
 [<c0448491>] trace_hardirqs_on+0x11a/0x13d
 [<c04497f9>] lock_acquire+0x5f/0x77
 [<c056760c>] __driver_attach+0x2c/0x5f
 [<c06210de>] mutex_lock_nested+0x105/0x26b
 [<c056760c>] __driver_attach+0x2c/0x5f
 [<c056760c>] __driver_attach+0x2c/0x5f
 [<c05675e0>] __driver_attach+0x0/0x5f
 [<c056760c>] __driver_attach+0x2c/0x5f
 [<c0566ba9>] bus_for_each_dev+0x3a/0x5c
 [<c05673ba>] driver_attach+0x16/0x18
 [<c05675e0>] __driver_attach+0x0/0x5f
 [<c0566ea0>] bus_add_driver+0x6d/0x19a
 [<c0762613>] acpi_ec_init+0x33/0x51
 [<c0749491>] kernel_init+0x148/0x2af
 [<c0749349>] kernel_init+0x0/0x2af
 [<c0749349>] kernel_init+0x0/0x2af
 [<c0405bd7>] kernel_thread_helper+0x7/0x10
 =======================
ACPI: PCI Root Bridge [PCI0] (0000:00)
Force enabled HPET at base address 0xfed00000
=======================================================

I tried debugging it, and I have not found a recursive mutex locking
so far, only locking of 2 different mutexes in a row prior to this
warning, which IMO should be valid.

What is your opinion?

BTW: I attached my patch for dev->sem as I have it now, that generates
this lockdep warning ( for if you want to look at it yourself also, so
you do not have to do the monkey part yourself anymore ;-)


Kind Regards,

Remy

[-- Attachment #2: patch_replace_sem_by_mutex_in_device_h --]
[-- Type: application/octet-stream, Size: 29835 bytes --]

Subject: Semaphore to Mutex in struct device

The semaphore in include/linux/device.h is used as a mutex
So, replacing it by a mutex everywhere in the tree.
Also renamed sem->mut, to make sure all out-of-tree modules
that use this semaphore will also be converted.

Signed-off-by: Remy Bohmer <linux@bohmer.net>

---
 drivers/base/bus.c                |   20 +++++++-------
 drivers/base/class.c              |   22 +++++++--------
 drivers/base/core.c               |   20 ++++++--------
 drivers/base/dd.c                 |   38 +++++++++++++--------------
 drivers/base/power/main.c         |    8 ++---
 drivers/firewire/fw-device.c      |    4 +-
 drivers/ieee1394/nodemgr.c        |   53 ++++++++++++++++++--------------------
 drivers/pci/bus.c                 |    4 +-
 drivers/pcmcia/ds.c               |    8 ++---
 drivers/power/apm_power.c         |    6 ++--
 drivers/power/power_supply_core.c |    8 ++---
 drivers/rtc/interface.c           |    4 +-
 drivers/scsi/hosts.c              |    4 +-
 drivers/spi/spi.c                 |    4 +-
 drivers/usb/core/hub.c            |    6 ++--
 include/linux/device.h            |    6 ++--
 include/linux/usb.h               |    6 ++--
 17 files changed, 110 insertions(+), 111 deletions(-)

Index: linux-2.6.24-rc4/include/linux/device.h
===================================================================
--- linux-2.6.24-rc4.orig/include/linux/device.h	2007-12-07 00:11:58.000000000 +0100
+++ linux-2.6.24-rc4/include/linux/device.h	2007-12-07 10:35:09.000000000 +0100
@@ -180,7 +180,9 @@ struct class {
 	struct list_head	devices;
 	struct list_head	interfaces;
 	struct kset		class_dirs;
-	struct semaphore	sem;	/* locks both the children and interfaces lists */
+
+	/* locks both the children and interfaces lists */
+	struct mutex		mut;
 
 	struct class_attribute		* class_attrs;
 	struct class_device_attribute	* class_dev_attrs;
@@ -410,7 +412,7 @@ struct device {
 	unsigned		is_registered:1;
 	unsigned		uevent_suppress:1;
 
-	struct semaphore	sem;	/* semaphore to synchronize calls to
+	struct mutex            mut;	/* mutex to synchronize calls to
 					 * its driver.
 					 */
 
Index: linux-2.6.24-rc4/drivers/base/class.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/base/class.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/base/class.c	2007-12-07 00:14:39.000000000 +0100
@@ -144,7 +144,7 @@ int class_register(struct class * cls)
 	INIT_LIST_HEAD(&cls->devices);
 	INIT_LIST_HEAD(&cls->interfaces);
 	kset_init(&cls->class_dirs);
-	init_MUTEX(&cls->sem);
+	mutex_init(&cls->mut);
 	error = kobject_set_name(&cls->subsys.kobj, "%s", cls->name);
 	if (error)
 		return error;
@@ -617,13 +617,13 @@ int class_device_add(struct class_device
 	kobject_uevent(&class_dev->kobj, KOBJ_ADD);
 
 	/* notify any interfaces this device is now here */
-	down(&parent_class->sem);
+	mutex_lock(&parent_class->mut);
 	list_add_tail(&class_dev->node, &parent_class->children);
 	list_for_each_entry(class_intf, &parent_class->interfaces, node) {
 		if (class_intf->add)
 			class_intf->add(class_dev, class_intf);
 	}
-	up(&parent_class->sem);
+	mutex_unlock(&parent_class->mut);
 
 	goto out1;
 
@@ -725,12 +725,12 @@ void class_device_del(struct class_devic
 	struct class_interface *class_intf;
 
 	if (parent_class) {
-		down(&parent_class->sem);
+		mutex_lock(&parent_class->mut);
 		list_del_init(&class_dev->node);
 		list_for_each_entry(class_intf, &parent_class->interfaces, node)
 			if (class_intf->remove)
 				class_intf->remove(class_dev, class_intf);
-		up(&parent_class->sem);
+		mutex_unlock(&parent_class->mut);
 	}
 
 	if (class_dev->dev) {
@@ -772,14 +772,14 @@ void class_device_destroy(struct class *
 	struct class_device *class_dev = NULL;
 	struct class_device *class_dev_tmp;
 
-	down(&cls->sem);
+	mutex_lock(&cls->mut);
 	list_for_each_entry(class_dev_tmp, &cls->children, node) {
 		if (class_dev_tmp->devt == devt) {
 			class_dev = class_dev_tmp;
 			break;
 		}
 	}
-	up(&cls->sem);
+	mutex_unlock(&cls->mut);
 
 	if (class_dev)
 		class_device_unregister(class_dev);
@@ -812,7 +812,7 @@ int class_interface_register(struct clas
 	if (!parent)
 		return -EINVAL;
 
-	down(&parent->sem);
+	mutex_lock(&parent->mut);
 	list_add_tail(&class_intf->node, &parent->interfaces);
 	if (class_intf->add) {
 		list_for_each_entry(class_dev, &parent->children, node)
@@ -822,7 +822,7 @@ int class_interface_register(struct clas
 		list_for_each_entry(dev, &parent->devices, node)
 			class_intf->add_dev(dev, class_intf);
 	}
-	up(&parent->sem);
+	mutex_unlock(&parent->mut);
 
 	return 0;
 }
@@ -836,7 +836,7 @@ void class_interface_unregister(struct c
 	if (!parent)
 		return;
 
-	down(&parent->sem);
+	mutex_lock(&parent->mut);
 	list_del_init(&class_intf->node);
 	if (class_intf->remove) {
 		list_for_each_entry(class_dev, &parent->children, node)
@@ -846,7 +846,7 @@ void class_interface_unregister(struct c
 		list_for_each_entry(dev, &parent->devices, node)
 			class_intf->remove_dev(dev, class_intf);
 	}
-	up(&parent->sem);
+	mutex_unlock(&parent->mut);
 
 	class_put(parent);
 }
Index: linux-2.6.24-rc4/drivers/base/core.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/base/core.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/base/core.c	2007-12-07 00:14:39.000000000 +0100
@@ -19,8 +19,6 @@
 #include <linux/kdev_t.h>
 #include <linux/notifier.h>
 
-#include <asm/semaphore.h>
-
 #include "base.h"
 #include "power/power.h"
 
@@ -531,7 +529,7 @@ void device_initialize(struct device *de
 		   klist_children_put);
 	INIT_LIST_HEAD(&dev->dma_pools);
 	INIT_LIST_HEAD(&dev->node);
-	init_MUTEX(&dev->sem);
+	mutex_init(&dev->mut);
 	spin_lock_init(&dev->devres_lock);
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_init_wakeup(dev, 0);
@@ -783,7 +781,7 @@ int device_add(struct device *dev)
 		klist_add_tail(&dev->knode_parent, &parent->klist_children);
 
 	if (dev->class) {
-		down(&dev->class->sem);
+		mutex_lock(&dev->class->mut);
 		/* tie the class to the device */
 		list_add_tail(&dev->node, &dev->class->devices);
 
@@ -791,7 +789,7 @@ int device_add(struct device *dev)
 		list_for_each_entry(class_intf, &dev->class->interfaces, node)
 			if (class_intf->add_dev)
 				class_intf->add_dev(dev, class_intf);
-		up(&dev->class->sem);
+		mutex_unlock(&dev->class->mut);
 	}
  Done:
 	put_device(dev);
@@ -928,14 +926,14 @@ void device_del(struct device * dev)
 			sysfs_remove_link(&dev->kobj, "device");
 		}
 
-		down(&dev->class->sem);
+		mutex_lock(&dev->class->mut);
 		/* notify any interfaces that the device is now gone */
 		list_for_each_entry(class_intf, &dev->class->interfaces, node)
 			if (class_intf->remove_dev)
 				class_intf->remove_dev(dev, class_intf);
 		/* remove the device from the class list */
 		list_del_init(&dev->node);
-		up(&dev->class->sem);
+		mutex_unlock(&dev->class->mut);
 
 		/* If we live in a parent class-directory, unreference it */
 		if (dev->kobj.parent->kset == &dev->class->class_dirs) {
@@ -946,7 +944,7 @@ void device_del(struct device * dev)
 			 * if we are the last child of our class, delete
 			 * our class-directory at this parent
 			 */
-			down(&dev->class->sem);
+			mutex_lock(&dev->class->mut);
 			list_for_each_entry(d, &dev->class->devices, node) {
 				if (d == dev)
 					continue;
@@ -959,7 +957,7 @@ void device_del(struct device * dev)
 				kobject_del(dev->kobj.parent);
 
 			kobject_put(dev->kobj.parent);
-			up(&dev->class->sem);
+			mutex_unlock(&dev->class->mut);
 		}
 	}
 	device_remove_file(dev, &uevent_attr);
@@ -1168,14 +1166,14 @@ void device_destroy(struct class *class,
 	struct device *dev = NULL;
 	struct device *dev_tmp;
 
-	down(&class->sem);
+	mutex_lock(&class->mut);
 	list_for_each_entry(dev_tmp, &class->devices, node) {
 		if (dev_tmp->devt == devt) {
 			dev = dev_tmp;
 			break;
 		}
 	}
-	up(&class->sem);
+	mutex_unlock(&class->mut);
 
 	if (dev)
 		device_unregister(dev);
Index: linux-2.6.24-rc4/drivers/base/bus.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/base/bus.c	2007-12-07 00:11:58.000000000 +0100
+++ linux-2.6.24-rc4/drivers/base/bus.c	2007-12-07 00:14:39.000000000 +0100
@@ -190,10 +190,10 @@ static ssize_t driver_unbind(struct devi
 	dev = bus_find_device(bus, NULL, (void *)buf, driver_helper);
 	if (dev && dev->driver == drv) {
 		if (dev->parent)	/* Needed for USB */
-			down(&dev->parent->sem);
+			mutex_lock(&dev->parent->mut);
 		device_release_driver(dev);
 		if (dev->parent)
-			up(&dev->parent->sem);
+			mutex_unlock(&dev->parent->mut);
 		err = count;
 	}
 	put_device(dev);
@@ -217,12 +217,12 @@ static ssize_t driver_bind(struct device
 	dev = bus_find_device(bus, NULL, (void *)buf, driver_helper);
 	if (dev && dev->driver == NULL) {
 		if (dev->parent)	/* Needed for USB */
-			down(&dev->parent->sem);
-		down(&dev->sem);
+			mutex_lock(&dev->parent->mut);
+		mutex_lock(&dev->mut);
 		err = driver_probe_device(drv, dev);
-		up(&dev->sem);
+		mutex_unlock(&dev->mut);
 		if (dev->parent)
-			up(&dev->parent->sem);
+			mutex_unlock(&dev->parent->mut);
 
 		if (err > 0) 		/* success */
 			err = count;
@@ -711,10 +711,10 @@ static int __must_check bus_rescan_devic
 
 	if (!dev->driver) {
 		if (dev->parent)	/* Needed for USB */
-			down(&dev->parent->sem);
+			mutex_lock(&dev->parent->mut);
 		ret = device_attach(dev);
 		if (dev->parent)
-			up(&dev->parent->sem);
+			mutex_unlock(&dev->parent->mut);
 	}
 	return ret < 0 ? ret : 0;
 }
@@ -745,10 +745,10 @@ int device_reprobe(struct device *dev)
 {
 	if (dev->driver) {
 		if (dev->parent)        /* Needed for USB */
-			down(&dev->parent->sem);
+			mutex_lock(&dev->parent->mut);
 		device_release_driver(dev);
 		if (dev->parent)
-			up(&dev->parent->sem);
+			mutex_unlock(&dev->parent->mut);
 	}
 	return bus_rescan_devices_helper(dev, NULL);
 }
Index: linux-2.6.24-rc4/drivers/base/dd.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/base/dd.c	2007-12-07 00:11:58.000000000 +0100
+++ linux-2.6.24-rc4/drivers/base/dd.c	2007-12-07 00:14:39.000000000 +0100
@@ -82,7 +82,7 @@ static void driver_sysfs_remove(struct d
  *	for before calling this. (It is ok to call with no other effort
  *	from a driver's probe() method.)
  *
- *	This function must be called with @dev->sem held.
+ *	This function must be called with @dev->mut held.
  */
 int device_bind_driver(struct device *dev)
 {
@@ -180,8 +180,8 @@ int driver_probe_done(void)
  * This function returns 1 if a match is found, -ENODEV if the device is
  * not registered, and 0 otherwise.
  *
- * This function must be called with @dev->sem held.  When called for a
- * USB interface, @dev->parent->sem must be held as well.
+ * This function must be called with @dev->mut held.  When called for a
+ * USB interface, @dev->parent->mut must be held as well.
  */
 int driver_probe_device(struct device_driver * drv, struct device * dev)
 {
@@ -219,13 +219,13 @@ static int __device_attach(struct device
  *	0 if no matching device was found;
  *	-ENODEV if the device is not registered.
  *
- *	When called for a USB interface, @dev->parent->sem must be held.
+ *	When called for a USB interface, @dev->parent->mut must be held.
  */
 int device_attach(struct device * dev)
 {
 	int ret = 0;
 
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 	if (dev->driver) {
 		ret = device_bind_driver(dev);
 		if (ret == 0)
@@ -237,7 +237,7 @@ int device_attach(struct device * dev)
 	} else {
 		ret = bus_for_each_drv(dev->bus, NULL, dev, __device_attach);
 	}
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 	return ret;
 }
 
@@ -256,13 +256,13 @@ static int __driver_attach(struct device
 	 */
 
 	if (dev->parent)	/* Needed for USB */
-		down(&dev->parent->sem);
-	down(&dev->sem);
+		mutex_lock(&dev->parent->mut);
+	mutex_lock(&dev->mut);
 	if (!dev->driver)
 		driver_probe_device(drv, dev);
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 	if (dev->parent)
-		up(&dev->parent->sem);
+		mutex_unlock(&dev->parent->mut);
 
 	return 0;
 }
@@ -282,8 +282,8 @@ int driver_attach(struct device_driver *
 }
 
 /*
- *	__device_release_driver() must be called with @dev->sem held.
- *	When called for a USB interface, @dev->parent->sem must be held as well.
+ *	__device_release_driver() must be called with @dev->mut held.
+ *	When called for a USB interface, @dev->parent->mut must be held as well.
  */
 static void __device_release_driver(struct device * dev)
 {
@@ -315,7 +315,7 @@ static void __device_release_driver(stru
  *	@dev:	device.
  *
  *	Manually detach device from driver.
- *	When called for a USB interface, @dev->parent->sem must be held.
+ *	When called for a USB interface, @dev->parent->mut must be held.
  */
 void device_release_driver(struct device * dev)
 {
@@ -324,9 +324,9 @@ void device_release_driver(struct device
 	 * within their ->remove callback for the same device, they
 	 * will deadlock right here.
 	 */
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 	__device_release_driver(dev);
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 }
 
 
@@ -350,13 +350,13 @@ void driver_detach(struct device_driver 
 		spin_unlock(&drv->klist_devices.k_lock);
 
 		if (dev->parent)	/* Needed for USB */
-			down(&dev->parent->sem);
-		down(&dev->sem);
+			mutex_lock(&dev->parent->mut);
+		mutex_lock(&dev->mut);
 		if (dev->driver == drv)
 			__device_release_driver(dev);
-		up(&dev->sem);
+		mutex_unlock(&dev->mut);
 		if (dev->parent)
-			up(&dev->parent->sem);
+			mutex_unlock(&dev->parent->mut);
 		put_device(dev);
 	}
 }
Index: linux-2.6.24-rc4/drivers/base/power/main.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/base/power/main.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/base/power/main.c	2007-12-07 00:14:39.000000000 +0100
@@ -75,7 +75,7 @@ static int resume_device(struct device *
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 
 	if (dev->bus && dev->bus->resume) {
 		dev_dbg(dev,"resuming\n");
@@ -92,7 +92,7 @@ static int resume_device(struct device *
 		error = dev->class->resume(dev);
 	}
 
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 
 	TRACE_RESUME(error);
 	return error;
@@ -241,7 +241,7 @@ static int suspend_device(struct device 
 {
 	int error = 0;
 
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 	if (dev->power.power_state.event) {
 		dev_dbg(dev, "PM: suspend %d-->%d\n",
 			dev->power.power_state.event, state.event);
@@ -264,7 +264,7 @@ static int suspend_device(struct device 
 		error = dev->bus->suspend(dev, state);
 		suspend_report_result(dev->bus->suspend, error);
 	}
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 	return error;
 }
 
Index: linux-2.6.24-rc4/drivers/firewire/fw-device.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/firewire/fw-device.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/firewire/fw-device.c	2007-12-07 00:14:39.000000000 +0100
@@ -731,9 +731,9 @@ static int update_unit(struct device *de
 	struct fw_driver *driver = (struct fw_driver *)dev->driver;
 
 	if (is_fw_unit(dev) && driver != NULL && driver->update != NULL) {
-		down(&dev->sem);
+		mutex_lock(&dev->mut);
 		driver->update(unit);
-		up(&dev->sem);
+		mutex_unlock(&dev->mut);
 	}
 
 	return 0;
Index: linux-2.6.24-rc4/include/linux/usb.h
===================================================================
--- linux-2.6.24-rc4.orig/include/linux/usb.h	2007-12-07 00:11:58.000000000 +0100
+++ linux-2.6.24-rc4/include/linux/usb.h	2007-12-07 00:14:39.000000000 +0100
@@ -440,9 +440,9 @@ extern struct usb_device *usb_get_dev(st
 extern void usb_put_dev(struct usb_device *dev);
 
 /* USB device locking */
-#define usb_lock_device(udev)		down(&(udev)->dev.sem)
-#define usb_unlock_device(udev)		up(&(udev)->dev.sem)
-#define usb_trylock_device(udev)	down_trylock(&(udev)->dev.sem)
+#define usb_lock_device(udev)		mutex_lock(&(udev)->dev.mut)
+#define usb_unlock_device(udev)		mutex_unlock(&(udev)->dev.mut)
+#define usb_trylock_device(udev)	mutex_trylock(&(udev)->dev.mut)
 extern int usb_lock_device_for_reset(struct usb_device *udev,
 				     const struct usb_interface *iface);
 
Index: linux-2.6.24-rc4/drivers/pci/bus.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/pci/bus.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/pci/bus.c	2007-12-07 00:14:39.000000000 +0100
@@ -198,9 +198,9 @@ void pci_walk_bus(struct pci_bus *top, v
 			next = dev->bus_list.next;
 
 		/* Run device routines with the device locked */
-		down(&dev->dev.sem);
+		mutex_lock(&dev->dev.mut);
 		cb(dev, userdata);
-		up(&dev->dev.sem);
+		mutex_unlock(&dev->dev.mut);
 	}
 	up_read(&pci_bus_sem);
 }
Index: linux-2.6.24-rc4/drivers/pcmcia/ds.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/pcmcia/ds.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/pcmcia/ds.c	2007-12-07 00:14:39.000000000 +0100
@@ -1126,9 +1126,9 @@ static int runtime_suspend(struct device
 {
 	int rc;
 
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 	rc = pcmcia_dev_suspend(dev, PMSG_SUSPEND);
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 	if (!rc)
 		dev->power.power_state.event = PM_EVENT_SUSPEND;
 	return rc;
@@ -1138,9 +1138,9 @@ static void runtime_resume(struct device
 {
 	int rc;
 
-	down(&dev->sem);
+	mutex_lock(&dev->mut);
 	rc = pcmcia_dev_resume(dev);
-	up(&dev->sem);
+	mutex_unlock(&dev->mut);
 	if (!rc)
 		dev->power.power_state.event = PM_EVENT_ON;
 }
Index: linux-2.6.24-rc4/drivers/power/power_supply_core.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/power/power_supply_core.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/power/power_supply_core.c	2007-12-07 00:14:39.000000000 +0100
@@ -31,7 +31,7 @@ static void power_supply_changed_work(st
 	for (i = 0; i < psy->num_supplicants; i++) {
 		struct device *dev;
 
-		down(&power_supply_class->sem);
+		mutex_lock(&power_supply_class->mut);
 		list_for_each_entry(dev, &power_supply_class->devices, node) {
 			struct power_supply *pst = dev_get_drvdata(dev);
 
@@ -40,7 +40,7 @@ static void power_supply_changed_work(st
 					pst->external_power_changed(pst);
 			}
 		}
-		up(&power_supply_class->sem);
+		mutex_unlock(&power_supply_class->mut);
 	}
 
 	power_supply_update_leds(psy);
@@ -60,7 +60,7 @@ int power_supply_am_i_supplied(struct po
 	union power_supply_propval ret = {0,};
 	struct device *dev;
 
-	down(&power_supply_class->sem);
+	mutex_lock(&power_supply_class->mut);
 	list_for_each_entry(dev, &power_supply_class->devices, node) {
 		struct power_supply *epsy = dev_get_drvdata(dev);
 		int i;
@@ -76,7 +76,7 @@ int power_supply_am_i_supplied(struct po
 		}
 	}
 out:
-	up(&power_supply_class->sem);
+	mutex_unlock(&power_supply_class->mut);
 
 	dev_dbg(psy->dev, "%s %d\n", __FUNCTION__, ret.intval);
 
Index: linux-2.6.24-rc4/drivers/scsi/hosts.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/scsi/hosts.c	2007-12-07 00:11:58.000000000 +0100
+++ linux-2.6.24-rc4/drivers/scsi/hosts.c	2007-12-07 00:14:39.000000000 +0100
@@ -443,7 +443,7 @@ struct Scsi_Host *scsi_host_lookup(unsig
 	struct class_device *cdev;
 	struct Scsi_Host *shost = ERR_PTR(-ENXIO), *p;
 
-	down(&class->sem);
+	mutex_lock(&class->mut);
 	list_for_each_entry(cdev, &class->children, node) {
 		p = class_to_shost(cdev);
 		if (p->host_no == hostnum) {
@@ -451,7 +451,7 @@ struct Scsi_Host *scsi_host_lookup(unsig
 			break;
 		}
 	}
-	up(&class->sem);
+	mutex_unlock(&class->mut);
 
 	return shost;
 }
Index: linux-2.6.24-rc4/drivers/usb/core/hub.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/usb/core/hub.c	2007-12-07 00:11:57.000000000 +0100
+++ linux-2.6.24-rc4/drivers/usb/core/hub.c	2007-12-07 00:14:39.000000000 +0100
@@ -82,7 +82,7 @@ struct usb_hub {
 
 
 /* Protect struct usb_device->state and ->children members
- * Note: Both are also protected by ->dev.sem, except that ->state can
+ * Note: Both are also protected by ->dev.mut, except that ->state can
  * change to USB_STATE_NOTATTACHED even when the semaphore isn't held. */
 static DEFINE_SPINLOCK(device_state_lock);
 
@@ -3143,7 +3143,7 @@ int usb_reset_composite_device(struct us
 		for (i = 0; i < config->desc.bNumInterfaces; ++i) {
 			cintf = config->interface[i];
 			if (cintf != iface)
-				down(&cintf->dev.sem);
+				mutex_lock(&cintf->dev.mut);
 			if (device_is_registered(&cintf->dev) &&
 					cintf->dev.driver) {
 				drv = to_usb_driver(cintf->dev.driver);
@@ -3171,7 +3171,7 @@ int usb_reset_composite_device(struct us
 	/* FIXME: Unbind if post_reset returns an error or isn't defined */
 			}
 			if (cintf != iface)
-				up(&cintf->dev.sem);
+				mutex_unlock(&cintf->dev.mut);
 		}
 	}
 
Index: linux-2.6.24-rc4/drivers/ieee1394/nodemgr.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/ieee1394/nodemgr.c	2007-12-06 20:52:15.000000000 +0100
+++ linux-2.6.24-rc4/drivers/ieee1394/nodemgr.c	2007-12-07 10:09:28.000000000 +0100
@@ -19,7 +19,6 @@
 #include <linux/mutex.h>
 #include <linux/freezer.h>
 #include <asm/atomic.h>
-#include <asm/semaphore.h>
 
 #include "csr.h"
 #include "highlevel.h"
@@ -733,16 +732,16 @@ static void nodemgr_remove_uds(struct no
 	struct unit_directory *tmp, *ud;
 
 	/* Iteration over nodemgr_ud_class.devices has to be protected by
-	 * nodemgr_ud_class.sem, but device_unregister() will eventually
-	 * take nodemgr_ud_class.sem too. Therefore pick out one ud at a time,
-	 * release the semaphore, and then unregister the ud. Since this code
+	 * nodemgr_ud_class.mut, but device_unregister() will eventually
+	 * take nodemgr_ud_class.mut too. Therefore pick out one ud at a time,
+	 * release the mutex, and then unregister the ud. Since this code
 	 * may be called from other contexts besides the knodemgrds, protect the
-	 * gap after release of the semaphore by nodemgr_serialize_remove_uds.
+	 * gap after release of the mutex by nodemgr_serialize_remove_uds.
 	 */
 	mutex_lock(&nodemgr_serialize_remove_uds);
 	for (;;) {
 		ud = NULL;
-		down(&nodemgr_ud_class.sem);
+		mutex_lock(&nodemgr_ud_class.mut);
 		list_for_each_entry(dev, &nodemgr_ud_class.devices, node) {
 			tmp = container_of(dev, struct unit_directory,
 					   unit_dev);
@@ -751,7 +750,7 @@ static void nodemgr_remove_uds(struct no
 				break;
 			}
 		}
-		up(&nodemgr_ud_class.sem);
+		mutex_unlock(&nodemgr_ud_class.mut);
 		if (ud == NULL)
 			break;
 		device_unregister(&ud->unit_dev);
@@ -888,7 +887,7 @@ static struct node_entry *find_entry_by_
 	struct device *dev;
 	struct node_entry *ne, *ret_ne = NULL;
 
-	down(&nodemgr_ne_class.sem);
+	mutex_lock(&nodemgr_ne_class.mut);
 	list_for_each_entry(dev, &nodemgr_ne_class.devices, node) {
 		ne = container_of(dev, struct node_entry, node_dev);
 
@@ -897,7 +896,7 @@ static struct node_entry *find_entry_by_
 			break;
 		}
 	}
-	up(&nodemgr_ne_class.sem);
+	mutex_unlock(&nodemgr_ne_class.mut);
 
 	return ret_ne;
 }
@@ -909,7 +908,7 @@ static struct node_entry *find_entry_by_
 	struct device *dev;
 	struct node_entry *ne, *ret_ne = NULL;
 
-	down(&nodemgr_ne_class.sem);
+	mutex_lock(&nodemgr_ne_class.mut);
 	list_for_each_entry(dev, &nodemgr_ne_class.devices, node) {
 		ne = container_of(dev, struct node_entry, node_dev);
 
@@ -918,7 +917,7 @@ static struct node_entry *find_entry_by_
 			break;
 		}
 	}
-	up(&nodemgr_ne_class.sem);
+	mutex_unlock(&nodemgr_ne_class.mut);
 
 	return ret_ne;
 }
@@ -1384,7 +1383,7 @@ static void nodemgr_suspend_ne(struct no
 	ne->in_limbo = 1;
 	WARN_ON(device_create_file(&ne->device, &dev_attr_ne_in_limbo));
 
-	down(&nodemgr_ud_class.sem);
+	mutex_lock(&nodemgr_ud_class.mut);
 	list_for_each_entry(dev, &nodemgr_ud_class.devices, node) {
 		ud = container_of(dev, struct unit_directory, unit_dev);
 		if (ud->ne != ne)
@@ -1396,15 +1395,15 @@ static void nodemgr_suspend_ne(struct no
 
 		error = 1; /* release if suspend is not implemented */
 		if (drv->suspend) {
-			down(&ud->device.sem);
+			mutex_lock(&ud->device.mut);
 			error = drv->suspend(&ud->device, PMSG_SUSPEND);
-			up(&ud->device.sem);
+			mutex_unlock(&ud->device.mut);
 		}
 		if (error)
 			device_release_driver(&ud->device);
 		put_driver(drv);
 	}
-	up(&nodemgr_ud_class.sem);
+	mutex_unlock(&nodemgr_ud_class.mut);
 }
 
 
@@ -1417,7 +1416,7 @@ static void nodemgr_resume_ne(struct nod
 	ne->in_limbo = 0;
 	device_remove_file(&ne->device, &dev_attr_ne_in_limbo);
 
-	down(&nodemgr_ud_class.sem);
+	mutex_lock(&nodemgr_ud_class.mut);
 	list_for_each_entry(dev, &nodemgr_ud_class.devices, node) {
 		ud = container_of(dev, struct unit_directory, unit_dev);
 		if (ud->ne != ne)
@@ -1428,13 +1427,13 @@ static void nodemgr_resume_ne(struct nod
 			continue;
 
 		if (drv->resume) {
-			down(&ud->device.sem);
+			mutex_lock(&ud->device.mut);
 			drv->resume(&ud->device);
-			up(&ud->device.sem);
+			mutex_unlock(&ud->device.mut);
 		}
 		put_driver(drv);
 	}
-	up(&nodemgr_ud_class.sem);
+	mutex_unlock(&nodemgr_ud_class.mut);
 
 	HPSB_DEBUG("Node resumed: ID:BUS[" NODE_BUS_FMT "]  GUID[%016Lx]",
 		   NODE_BUS_ARGS(ne->host, ne->nodeid), (unsigned long long)ne->guid);
@@ -1449,7 +1448,7 @@ static void nodemgr_update_pdrv(struct n
 	struct hpsb_protocol_driver *pdrv;
 	int error;
 
-	down(&nodemgr_ud_class.sem);
+	mutex_lock(&nodemgr_ud_class.mut);
 	list_for_each_entry(dev, &nodemgr_ud_class.devices, node) {
 		ud = container_of(dev, struct unit_directory, unit_dev);
 		if (ud->ne != ne)
@@ -1462,15 +1461,15 @@ static void nodemgr_update_pdrv(struct n
 		error = 0;
 		pdrv = container_of(drv, struct hpsb_protocol_driver, driver);
 		if (pdrv->update) {
-			down(&ud->device.sem);
+			mutex_lock(&ud->device.mut);
 			error = pdrv->update(ud);
-			up(&ud->device.sem);
+			mutex_unlock(&ud->device.mut);
 		}
 		if (error)
 			device_release_driver(&ud->device);
 		put_driver(drv);
 	}
-	up(&nodemgr_ud_class.sem);
+	mutex_unlock(&nodemgr_ud_class.mut);
 }
 
 
@@ -1545,7 +1544,7 @@ static void nodemgr_node_probe(struct ho
 	 * while probes are time-consuming. (Well, those probes need some
 	 * improvement...) */
 
-	down(&nodemgr_ne_class.sem);
+	mutex_lock(&nodemgr_ne_class.mut);
 	list_for_each_entry(dev, &nodemgr_ne_class.devices, node) {
 		ne = container_of(dev, struct node_entry, node_dev);
 		if (!ne->needs_probe)
@@ -1556,7 +1555,7 @@ static void nodemgr_node_probe(struct ho
 		if (ne->needs_probe)
 			nodemgr_probe_ne(hi, ne, generation);
 	}
-	up(&nodemgr_ne_class.sem);
+	mutex_unlock(&nodemgr_ne_class.mut);
 
 
 	/* If we had a bus reset while we were scanning the bus, it is
@@ -1775,14 +1774,14 @@ int nodemgr_for_each_host(void *data, in
 	struct hpsb_host *host;
 	int error = 0;
 
-	down(&hpsb_host_class.sem);
+	mutex_lock(&hpsb_host_class.mut);
 	list_for_each_entry(dev, &hpsb_host_class.devices, node) {
 		host = container_of(dev, struct hpsb_host, host_dev);
 
 		if ((error = cb(host, data)))
 			break;
 	}
-	up(&hpsb_host_class.sem);
+	mutex_unlock(&hpsb_host_class.mut);
 
 	return error;
 }
Index: linux-2.6.24-rc4/drivers/power/apm_power.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/power/apm_power.c	2007-12-06 20:52:18.000000000 +0100
+++ linux-2.6.24-rc4/drivers/power/apm_power.c	2007-12-07 10:10:48.000000000 +0100
@@ -207,10 +207,10 @@ static void apm_battery_apm_get_power_st
 	union power_supply_propval status;
 	union power_supply_propval capacity, time_to_full, time_to_empty;
 
-	down(&power_supply_class->sem);
+	mutex_lock(&power_supply_class->mut);
 	find_main_battery();
 	if (!main_battery) {
-		up(&power_supply_class->sem);
+		mutex_unlock(&power_supply_class->mut);
 		return;
 	}
 
@@ -278,7 +278,7 @@ static void apm_battery_apm_get_power_st
 		}
 	}
 
-	up(&power_supply_class->sem);
+	mutex_unlock(&power_supply_class->mut);
 }
 
 static int __init apm_battery_init(void)
Index: linux-2.6.24-rc4/drivers/rtc/interface.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/rtc/interface.c	2007-12-06 20:52:18.000000000 +0100
+++ linux-2.6.24-rc4/drivers/rtc/interface.c	2007-12-07 10:14:39.000000000 +0100
@@ -256,7 +256,7 @@ struct rtc_device *rtc_class_open(char *
 	struct device *dev;
 	struct rtc_device *rtc = NULL;
 
-	down(&rtc_class->sem);
+	mutex_lock(&rtc_class->mut);
 	list_for_each_entry(dev, &rtc_class->devices, node) {
 		if (strncmp(dev->bus_id, name, BUS_ID_SIZE) == 0) {
 			dev = get_device(dev);
@@ -272,7 +272,7 @@ struct rtc_device *rtc_class_open(char *
 			rtc = NULL;
 		}
 	}
-	up(&rtc_class->sem);
+	mutex_unlock(&rtc_class->mut);
 
 	return rtc;
 }
Index: linux-2.6.24-rc4/drivers/spi/spi.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/spi/spi.c	2007-12-06 20:52:18.000000000 +0100
+++ linux-2.6.24-rc4/drivers/spi/spi.c	2007-12-07 09:59:37.000000000 +0100
@@ -501,7 +501,7 @@ struct spi_master *spi_busnum_to_master(
 	struct spi_master	*master = NULL;
 	struct spi_master	*m;
 
-	down(&spi_master_class.sem);
+	mutex_lock(&spi_master_class.mut);
 	list_for_each_entry(dev, &spi_master_class.children, node) {
 		m = container_of(dev, struct spi_master, dev);
 		if (m->bus_num == bus_num) {
@@ -509,7 +509,7 @@ struct spi_master *spi_busnum_to_master(
 			break;
 		}
 	}
-	up(&spi_master_class.sem);
+	mutex_unlock(&spi_master_class.mut);
 	return master;
 }
 EXPORT_SYMBOL_GPL(spi_busnum_to_master);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-07 23:02 lockdep problem conversion semaphore->mutex (dev->sem) Remy Bohmer
@ 2007-12-08 12:16 ` Peter Zijlstra
  2007-12-08 16:53   ` Daniel Walker
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2007-12-08 12:16 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner


On Sat, 2007-12-08 at 00:02 +0100, Remy Bohmer wrote:
> Hello Peter,
> 
> > > What specifically is wrong with dev->sem ?
> >
> > Nothing really, other than that they use semaphores to avoid lockdep :-/
> >
> > I think I know how to annotate this, after Alan Stern explained all the
> > use cases, but I haven't come around to implementing it. Hope to do that
> > soonish.
> 
> I was looking for an easy semaphore I could convert to a mutex, and I
> ran into one that was widely spread and interesting, and which seemed
> quite doable at first sight.
> So, I started working on it, but was forgotten this discussion, (until
> Daniel made me remember it this afternoon). So, I (stupid me ;-) )
> tried to convert dev->sem...
> 
> After doing the monkey part of the conversion I can boot the kernel
> completely on X86 and ARM, and everything works fine, except after
> enabling lockdep, lockdep starts complaining...
> 
> Is this the problem you were pointing at?

Yeah, one of the interesting nestings :-)

> I tried debugging it, and I have not found a recursive mutex locking
> so far, only locking of 2 different mutexes in a row prior to this
> warning, which IMO should be valid.
> 
> What is your opinion?

Yeah, the locking is all valid afaics, its just that it needs some
interesting annotations to make lockdep see it that way.

> BTW: I attached my patch for dev->sem as I have it now, that generates
> this lockdep warning ( for if you want to look at it yourself also, so
> you do not have to do the monkey part yourself anymore ;-)

I have a similar patch floating around, but thanks anyway :-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 12:16 ` Peter Zijlstra
@ 2007-12-08 16:53   ` Daniel Walker
  2007-12-08 17:11     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Walker @ 2007-12-08 16:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Remy Bohmer, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner

On Sat, 2007-12-08 at 13:16 +0100, Peter Zijlstra wrote:
> On Sat, 2007-12-08 at 00:02 +0100, Remy Bohmer wrote:
> > Hello Peter,
> > 
> > > > What specifically is wrong with dev->sem ?
> > >
> > > Nothing really, other than that they use semaphores to avoid lockdep :-/
> > >
> > > I think I know how to annotate this, after Alan Stern explained all the
> > > use cases, but I haven't come around to implementing it. Hope to do that
> > > soonish.
> > 
> > I was looking for an easy semaphore I could convert to a mutex, and I
> > ran into one that was widely spread and interesting, and which seemed
> > quite doable at first sight.
> > So, I started working on it, but was forgotten this discussion, (until
> > Daniel made me remember it this afternoon). So, I (stupid me ;-) )
> > tried to convert dev->sem...
> > 
> > After doing the monkey part of the conversion I can boot the kernel
> > completely on X86 and ARM, and everything works fine, except after
> > enabling lockdep, lockdep starts complaining...
> > 
> > Is this the problem you were pointing at?
> 
> Yeah, one of the interesting nestings :-)

It must be the locking in __driver_attach(), taking dev->parent->sem
then taking dev->sem .. Assuming those are different structures, why
does lockdep trigger?

Daniel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 17:11     ` Peter Zijlstra
@ 2007-12-08 17:06       ` Daniel Walker
  2007-12-08 17:36         ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Walker @ 2007-12-08 17:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Remy Bohmer, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner

On Sat, 2007-12-08 at 18:11 +0100, Peter Zijlstra wrote:

> > It must be the locking in __driver_attach(), taking dev->parent->sem
> > then taking dev->sem .. Assuming those are different structures, why
> > does lockdep trigger?
> 
> They aren't different, parent is a struct device again.

It's different memory tho .. I wasn't sure how to term that .. The locks
are in two different memory location so it couldn't be recursive .. I'm
only asking for my own understanding .. I don't mind inspecting
potentially bad locking ..

Daniel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 16:53   ` Daniel Walker
@ 2007-12-08 17:11     ` Peter Zijlstra
  2007-12-08 17:06       ` Daniel Walker
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2007-12-08 17:11 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Remy Bohmer, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner


On Sat, 2007-12-08 at 08:53 -0800, Daniel Walker wrote:
> On Sat, 2007-12-08 at 13:16 +0100, Peter Zijlstra wrote:
> > On Sat, 2007-12-08 at 00:02 +0100, Remy Bohmer wrote:
> > > Hello Peter,
> > > 
> > > > > What specifically is wrong with dev->sem ?
> > > >
> > > > Nothing really, other than that they use semaphores to avoid lockdep :-/
> > > >
> > > > I think I know how to annotate this, after Alan Stern explained all the
> > > > use cases, but I haven't come around to implementing it. Hope to do that
> > > > soonish.
> > > 
> > > I was looking for an easy semaphore I could convert to a mutex, and I
> > > ran into one that was widely spread and interesting, and which seemed
> > > quite doable at first sight.
> > > So, I started working on it, but was forgotten this discussion, (until
> > > Daniel made me remember it this afternoon). So, I (stupid me ;-) )
> > > tried to convert dev->sem...
> > > 
> > > After doing the monkey part of the conversion I can boot the kernel
> > > completely on X86 and ARM, and everything works fine, except after
> > > enabling lockdep, lockdep starts complaining...
> > > 
> > > Is this the problem you were pointing at?
> > 
> > Yeah, one of the interesting nestings :-)
> 
> It must be the locking in __driver_attach(), taking dev->parent->sem
> then taking dev->sem .. Assuming those are different structures, why
> does lockdep trigger?

They aren't different, parent is a struct device again.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 17:06       ` Daniel Walker
@ 2007-12-08 17:36         ` Peter Zijlstra
  2007-12-08 19:52           ` Remy Bohmer
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2007-12-08 17:36 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Remy Bohmer, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner


On Sat, 2007-12-08 at 09:06 -0800, Daniel Walker wrote:
> On Sat, 2007-12-08 at 18:11 +0100, Peter Zijlstra wrote:
> 
> > > It must be the locking in __driver_attach(), taking dev->parent->sem
> > > then taking dev->sem .. Assuming those are different structures, why
> > > does lockdep trigger?
> > 
> > They aren't different, parent is a struct device again.
> 
> It's different memory tho .. I wasn't sure how to term that .. The locks
> are in two different memory location so it couldn't be recursive .. I'm
> only asking for my own understanding .. I don't mind inspecting
> potentially bad locking ..

Yeah, it are different lock instances, however by virtue of sharing the
same lock init site, they belong to the same lock class. Lockdep works
by tracking class dependancies, not instance dependancies.

By generalizing to classes it can detect locking errors before they
actually occur.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 17:36         ` Peter Zijlstra
@ 2007-12-08 19:52           ` Remy Bohmer
  2007-12-08 20:04             ` Peter Zijlstra
  2007-12-08 20:08             ` Ingo Molnar
  0 siblings, 2 replies; 14+ messages in thread
From: Remy Bohmer @ 2007-12-08 19:52 UTC (permalink / raw)
  To: Peter Zijlstra, Daniel Walker
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Dave Chinner

Hello Peter and Daniel,

> Yeah, it are different lock instances, however by virtue of sharing the
> same lock init site, they belong to the same lock class. Lockdep works
> by tracking class dependancies, not instance dependancies.

The device and its parent both indeed have different
pointers/instances. I saw that during debugging yesterday, so I
already expected this was not really a bug, but a false positive of
lockdep.

> By generalizing to classes it can detect locking errors before they
> actually occur.

Basically that is a good thing...
But... now we do not transfer the dev->sem to a mutex, because lockdep
will start generating false positives, and thus we mask the lockdep
error, by not converting the dev->sem to what it really is...

This does give me a bad feeling about this...  In short, we are
implementing workarounds in good code code to hide bugs in
debug-tooling... ?!

So, any suggestions on how to fix lockdep? Anyone some good hints
where I can start?
Is it that fundamental to lockdep that it cannot be fixed without
breaking it, or do we have to instrument the code that tells lockdep
to ignore this particular mutex?


Kind Regards,

Remy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 19:52           ` Remy Bohmer
@ 2007-12-08 20:04             ` Peter Zijlstra
  2007-12-08 20:33               ` Remy Bohmer
  2007-12-08 20:08             ` Ingo Molnar
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2007-12-08 20:04 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner


On Sat, 2007-12-08 at 20:52 +0100, Remy Bohmer wrote:
> Hello Peter and Daniel,
> 
> > Yeah, it are different lock instances, however by virtue of sharing the
> > same lock init site, they belong to the same lock class. Lockdep works
> > by tracking class dependancies, not instance dependancies.
> 
> The device and its parent both indeed have different
> pointers/instances. I saw that during debugging yesterday, so I
> already expected this was not really a bug, but a false positive of
> lockdep.
> 
> > By generalizing to classes it can detect locking errors before they
> > actually occur.
> 
> Basically that is a good thing...
> But... now we do not transfer the dev->sem to a mutex, because lockdep
> will start generating false positives, and thus we mask the lockdep
> error, by not converting the dev->sem to what it really is...
> 
> This does give me a bad feeling about this...  In short, we are
> implementing workarounds in good code code to hide bugs in
> debug-tooling... ?!

Its not a bug, its a feature! :-)

> So, any suggestions on how to fix lockdep? Anyone some good hints
> where I can start?
> Is it that fundamental to lockdep that it cannot be fixed without
> breaking it, or do we have to instrument the code that tells lockdep
> to ignore this particular mutex?

I have a good idea on how to annotate this, but not yet the time to do
so. Aside from the nesting problems, the dev->sem has other problems as
well. Converting this is rather non-trivial.

I'd not put it as harshly as you put it though, lockdep makes some
assumptions which can lead to false positives - otoh these assumptions
often end up pointing out 'curious' locking coupled to 'curious' data
structures. And fixing up these things often leads to better and simpler
code.

The emphasis is on often, this is one of the cases where this is not so.
So while it does restain the creativity of locking it often ends up
being for the better.

And while you might not see it in-tree anymore, lockdep does help out
tremendously while developing new code. I'm sure that without it the
locking would be in a much worse state than it is today.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 19:52           ` Remy Bohmer
  2007-12-08 20:04             ` Peter Zijlstra
@ 2007-12-08 20:08             ` Ingo Molnar
  2007-12-08 20:48               ` Remy Bohmer
  1 sibling, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2007-12-08 20:08 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Peter Zijlstra, Daniel Walker, Steven Rostedt, linux-kernel,
	Dave Chinner


* Remy Bohmer <linux@bohmer.net> wrote:

> But... now we do not transfer the dev->sem to a mutex, because lockdep 
> will start generating false positives, and thus we mask the lockdep 
> error, by not converting the dev->sem to what it really is...

no, you are wrong. If you want to do complex locking, you can still do 
it: take a look at the dev->sem conversion patches from Peter which 
correctly do this. Lockdep has all the facilities for that. (you just 
dont know about them) Currently there are 4459 critical sections in the 
kernel that use mutexes and which work fine with lockdep.

the general policy message here is: do not implement complex locking. It 
hurts. It's hard to maintain. It's easy to mess up and leads to bugs. 
Lockdep just makes that plain obvious.

Your mail and your frustration shows this general concept in happy 
action: judging from your comments you have little clue about dev->sem 
locking and its implications and you'd happily go along and pollute the 
kernel with complex and hard to maintain nested locking constructs.

Lockdep prevents you from doing it mindlessly, it _forces_ you to first 
understand the data structures, their locking and their relationship 
with each other. Then you can implement complexity, if you still want 
it.

That, Sir, is a Good Thing (tm).

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 20:04             ` Peter Zijlstra
@ 2007-12-08 20:33               ` Remy Bohmer
  2007-12-08 20:50                 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Remy Bohmer @ 2007-12-08 20:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner

Hello Peter,

> And while you might not see it in-tree anymore, lockdep does help out
> tremendously while developing new code. I'm sure that without it the
> locking would be in a much worse state than it is today.

I am not arguing that, I am also convinced it has done a good job.

> I have a good idea on how to annotate this, but not yet the time to do
> so.

Sounds good...

> Aside from the nesting problems, the dev->sem has other problems as
> well. Converting this is rather non-trivial.

Which problems? I did not see any special things, it looked rather
straight forward. What have I overlooked?

> I'd not put it as harshly as you put it though, lockdep makes some
> assumptions which can lead to false positives -

By putting it this black and white, it usually helps to get all the
opinions clear ;-)
(By staying in the middle, everybody usually tend to agree ;-)

> otoh these assumptions
> often end up pointing out 'curious' locking coupled to 'curious' data
> structures. And fixing up these things often leads to better and simpler
> code.
> The emphasis is on often, this is one of the cases where this is not so.
> So while it does restain the creativity of locking it often ends up
> being for the better.

Ack.

Kind Regards,

Remy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 20:08             ` Ingo Molnar
@ 2007-12-08 20:48               ` Remy Bohmer
  0 siblings, 0 replies; 14+ messages in thread
From: Remy Bohmer @ 2007-12-08 20:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Daniel Walker, Steven Rostedt, linux-kernel,
	Dave Chinner

Hello Ingo,

> no, you are wrong. If you want to do complex locking, you can still do
> it: take a look at the dev->sem conversion patches from Peter which
> correctly do this. Lockdep has all the facilities for that.
> (you just dont know about them)

Ok.

> the general policy message here is: do not implement complex locking. It
> hurts. It's hard to maintain. It's easy to mess up and leads to bugs.

Agree

> Lockdep just makes that plain obvious.
> Your mail and your frustration shows this general concept in happy

Please, do not see it as frustration, because it isn't...
I just want to understand it better.

> action: judging from your comments you have little clue about dev->sem
> locking and its implications and you'd happily go along and pollute the
> kernel with complex and hard to maintain nested locking constructs.

Now, you assume that i would _like_ complex locking. This is not true.

I just want to understand what was so wrong with dev->sem, I even read
in the discussions before about dev->sem, that it still was on the old
semaphore interface to get around lockdep issues, and _that_ is wrong.
That is bug-hiding from either the code or the tool. I just wanted to
understand if this was a lockdep bug, or a real code bug.

> Lockdep prevents you from doing it mindlessly, it _forces_ you to first
> understand the data structures, their locking and their relationship
> with each other. Then you can implement complexity, if you still want
> it.
>
> That, Sir, is a Good Thing (tm).

Completely agree.


Remy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 20:33               ` Remy Bohmer
@ 2007-12-08 20:50                 ` Peter Zijlstra
  2007-12-08 20:55                   ` Remy Bohmer
  2007-12-08 22:57                   ` Jarek Poplawski
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2007-12-08 20:50 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner


On Sat, 2007-12-08 at 21:33 +0100, Remy Bohmer wrote:

> Which problems? I did not see any special things, it looked rather
> straight forward. What have I overlooked?

On suspend it locks the whole device tree, this means it has 'unbounded'
nesting and holds an 'unbounded' number of locks. Neither things are
easy to annotate (remember that mutex_lock_nested can handle up to 8
nestings and current->held_locks has a max of 30).

In fact, converting this will be the hardest part, it would require
reworking the locking and introduction of a hard limit on the device
tree depth - this might upset some people, but I suspect that 16 or 24
should be deep enough for pretty much anything. Of course, if people
prove me wrong, I'll have to reconsider. The up-side of the locking
scheme I'm thinking of will be that locking the whole tree will only
take 'depth' number of opterations vs the total number of tree elements.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 20:50                 ` Peter Zijlstra
@ 2007-12-08 20:55                   ` Remy Bohmer
  2007-12-08 22:57                   ` Jarek Poplawski
  1 sibling, 0 replies; 14+ messages in thread
From: Remy Bohmer @ 2007-12-08 20:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Walker, Ingo Molnar, Steven Rostedt, linux-kernel,
	Dave Chinner

Peter,

Thanks for this clear answer.

Remy

2007/12/8, Peter Zijlstra <peterz@infradead.org>:
>
> On Sat, 2007-12-08 at 21:33 +0100, Remy Bohmer wrote:
>
> > Which problems? I did not see any special things, it looked rather
> > straight forward. What have I overlooked?
>
> On suspend it locks the whole device tree, this means it has 'unbounded'
> nesting and holds an 'unbounded' number of locks. Neither things are
> easy to annotate (remember that mutex_lock_nested can handle up to 8
> nestings and current->held_locks has a max of 30).
>
> In fact, converting this will be the hardest part, it would require
> reworking the locking and introduction of a hard limit on the device
> tree depth - this might upset some people, but I suspect that 16 or 24
> should be deep enough for pretty much anything. Of course, if people
> prove me wrong, I'll have to reconsider. The up-side of the locking
> scheme I'm thinking of will be that locking the whole tree will only
> take 'depth' number of opterations vs the total number of tree elements.
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: lockdep problem conversion semaphore->mutex (dev->sem)
  2007-12-08 20:50                 ` Peter Zijlstra
  2007-12-08 20:55                   ` Remy Bohmer
@ 2007-12-08 22:57                   ` Jarek Poplawski
  1 sibling, 0 replies; 14+ messages in thread
From: Jarek Poplawski @ 2007-12-08 22:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Remy Bohmer, Daniel Walker, Ingo Molnar, Steven Rostedt,
	linux-kernel, Dave Chinner

Peter Zijlstra wrote, On 12/08/2007 09:50 PM:

> On Sat, 2007-12-08 at 21:33 +0100, Remy Bohmer wrote:
> 
>> Which problems? I did not see any special things, it looked rather
>> straight forward. What have I overlooked?
> 
> On suspend it locks the whole device tree, this means it has 'unbounded'
> nesting and holds an 'unbounded' number of locks. Neither things are
> easy to annotate (remember that mutex_lock_nested can handle up to 8
> nestings and current->held_locks has a max of 30).
> 
> In fact, converting this will be the hardest part, it would require
> reworking the locking and introduction of a hard limit on the device
> tree depth - this might upset some people, but I suspect that 16 or 24
> should be deep enough for pretty much anything. Of course, if people
> prove me wrong, I'll have to reconsider. The up-side of the locking
> scheme I'm thinking of will be that locking the whole tree will only
> take 'depth' number of opterations vs the total number of tree elements.

Of course, I don't know the problem enough, and would be glad if somebody
give me a hint, but I wonder why so deep nesting with lockdep's modification
is necessary here? Does buses have parent buses and so on x8? Why isn't
here considered creating of different lockdep classes according to types
of buses and devices "the usual way"? This way seems to be quite easy in
later debugging.

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2007-12-08 22:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-07 23:02 lockdep problem conversion semaphore->mutex (dev->sem) Remy Bohmer
2007-12-08 12:16 ` Peter Zijlstra
2007-12-08 16:53   ` Daniel Walker
2007-12-08 17:11     ` Peter Zijlstra
2007-12-08 17:06       ` Daniel Walker
2007-12-08 17:36         ` Peter Zijlstra
2007-12-08 19:52           ` Remy Bohmer
2007-12-08 20:04             ` Peter Zijlstra
2007-12-08 20:33               ` Remy Bohmer
2007-12-08 20:50                 ` Peter Zijlstra
2007-12-08 20:55                   ` Remy Bohmer
2007-12-08 22:57                   ` Jarek Poplawski
2007-12-08 20:08             ` Ingo Molnar
2007-12-08 20:48               ` Remy Bohmer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox