public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Jiang <dave.jiang@intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	vishal.l.verma@intel.com, alison.schofield@intel.com,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev
Subject: Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
Date: Wed, 13 Apr 2022 10:43:09 +0200	[thread overview]
Message-ID: <20220413084309.GV2731@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <164982969858.684294.17819743973041389492.stgit@dwillia2-desk3.amr.corp.intel.com>

On Tue, Apr 12, 2022 at 11:01:38PM -0700, Dan Williams wrote:
> The device_lock() is hidden from lockdep by default because, for
> example, a device subsystem may do something like:
> 
> ---
> device_add(dev1);
> ...in driver core...
> device_lock(dev1);
> bus->probe(dev1); /* where bus->probe() calls driver1_probe() */
> 
> driver1_probe(struct device *dev)
> {
> 	...do some enumeration...
> 	dev2->parent = dev;
> 	/* this triggers probe under device_lock(dev2); */
> 	device_add(dev2);
> }
> ---
> 
> To lockdep, that device_lock(dev2) looks like a deadlock because lockdep

Recursion, you're meaning to say it looks like same lock recursion.

> only sees lock classes, not individual lock instances. All device_lock()
> instances across the entire kernel are the same class. However, this is
> not a deadlock in practice because the locking is strictly hierarchical.
> I.e. device_lock(dev1) is held over device_lock(dev2), but never the
> reverse.

I have some very vague memories from a conversation with Alan Stern,
some maybe 10 years ago, where I think he was explaining to me this was
not in fact a simple hierarchy.

> In order for lockdep to be satisfied and see that it is
> hierarchical in practice the mutex_lock() call in device_lock() needs to
> be moved to mutex_lock_nested() where the @subclass argument to
> mutex_lock_nested() represents the nesting level, i.e.:

That's not an obvious conclusion; lockdep has lots of funny annotations,
subclasses is just one.

I think the big new development in lockdep since that time with Alan
Stern is that lockdep now has support for dynamic keys; that is lock
keys in heap memory (as opposed to static storage).

> s/device_lock(dev1)/mutex_lock_nested(&dev1->mutex, 1)/
> 
> s/device_lock(dev2)/mutex_lock_nested(&dev2->mutex, 2)/
> 
> Now, what if the internals of the device_lock() could be annotated with
> the right @subclass argument to call mutex_lock_nested()?
> 
> With device_set_lock_class() a subsystem can optionally add that
> metadata. The device_lock() still takes dev->mutex, but when
> dev->lock_class is >= 0 it additionally takes dev->lockdep_mutex with
> the proper nesting. Unlike dev->mutex, dev->lockdep_mutex is not marked
> lockdep_set_novalidate_class() and lockdep will become useful... at
> least for one subsystem at a time.
> 
> It is still the case that only one subsystem can be using lockdep with
> lockdep_mutex at a time because different subsystems will collide class
> numbers. You might say "well, how about subsystem1 gets class ids 0 to 9
> and subsystem2 gets class ids 10 to 20?". MAX_LOCKDEP_SUBCLASSES is 8,
> and 8 is just enough class ids for one subsystem of moderate complexity.

Again, that doesn't seem like an obvious suggestion at all. Why not give
each subsystem a different lock key?


> diff --git a/include/linux/device.h b/include/linux/device.h
> index af2576ace130..6083e757e804 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -402,6 +402,7 @@ struct dev_msi_info {
>   * @mutex:	Mutex to synchronize calls to its driver.
>   * @lockdep_mutex: An optional debug lock that a subsystem can use as a
>   * 		peer lock to gain localized lockdep coverage of the device_lock.
> + * @lock_class: per-subsystem annotated device lock class
>   * @bus:	Type of bus device is on.
>   * @driver:	Which driver has allocated this
>   * @platform_data: Platform data specific to the device.
> @@ -501,6 +502,7 @@ struct device {
>  					   dev_set_drvdata/dev_get_drvdata */
>  #ifdef CONFIG_PROVE_LOCKING
>  	struct mutex		lockdep_mutex;
> +	int			lock_class;
>  #endif
>  	struct mutex		mutex;	/* mutex to synchronize calls to
>  					 * its driver.
> @@ -762,18 +764,100 @@ static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
>  	return !!(dev->power.driver_flags & flags);
>  }
>  
> +static inline void device_lock_assert(struct device *dev)
> +{
> +	lockdep_assert_held(&dev->mutex);
> +}
> +
>  #ifdef CONFIG_PROVE_LOCKING
>  static inline void device_lockdep_init(struct device *dev)
>  {
>  	mutex_init(&dev->lockdep_mutex);
> +	dev->lock_class = -1;
>  	lockdep_set_novalidate_class(&dev->mutex);
>  }
> -#else
> +
> +static inline void device_lock(struct device *dev)
> +{
> +	/*
> +	 * For double-lock programming errors the kernel will hang
> +	 * trying to acquire @dev->mutex before lockdep can report the
> +	 * problem acquiring @dev->lockdep_mutex, so manually assert
> +	 * before that hang.
> +	 */
> +	lockdep_assert_not_held(&dev->lockdep_mutex);
> +
> +	mutex_lock(&dev->mutex);
> +	if (dev->lock_class >= 0)
> +		mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
> +}
> +
> +static inline int device_lock_interruptible(struct device *dev)
> +{
> +	int rc;
> +
> +	lockdep_assert_not_held(&dev->lockdep_mutex);
> +
> +	rc = mutex_lock_interruptible(&dev->mutex);
> +	if (rc || dev->lock_class < 0)
> +		return rc;
> +
> +	return mutex_lock_interruptible_nested(&dev->lockdep_mutex,
> +					       dev->lock_class);
> +}
> +
> +static inline int device_trylock(struct device *dev)
> +{
> +	if (mutex_trylock(&dev->mutex)) {
> +		if (dev->lock_class >= 0)
> +			mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);

This must be the weirdest stuff I've seen in a while.

> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static inline void device_unlock(struct device *dev)
> +{
> +	if (dev->lock_class >= 0)
> +		mutex_unlock(&dev->lockdep_mutex);
> +	mutex_unlock(&dev->mutex);
> +}
> +
> +/*
> + * Note: this routine expects that the state of @dev->mutex is stable
> + * from entry to exit. There is no support for changing lockdep
> + * validation classes, only enabling and disabling validation.
> + */
> +static inline void device_set_lock_class(struct device *dev, int lock_class)
> +{
> +	/*
> +	 * Allow for setting or clearing the lock class while the
> +	 * device_lock() is held, in which case the paired nested lock
> +	 * might need to be acquired or released now to accommodate the
> +	 * next device_unlock().
> +	 */
> +	if (dev->lock_class < 0 && lock_class >= 0) {
> +		/* Enabling lockdep validation... */
> +		if (mutex_is_locked(&dev->mutex))
> +			mutex_lock_nested(&dev->lockdep_mutex, lock_class);
> +	} else if (dev->lock_class >= 0 && lock_class < 0) {
> +		/* Disabling lockdep validation... */
> +		if (mutex_is_locked(&dev->mutex))
> +			mutex_unlock(&dev->lockdep_mutex);
> +	} else {
> +		dev_warn(dev,
> +			 "%s: failed to change lock_class from: %d to %d\n",
> +			 __func__, dev->lock_class, lock_class);
> +		return;
> +	}
> +	dev->lock_class = lock_class;
> +}
> +#else /* !CONFIG_PROVE_LOCKING */

This all reads like something utterly surreal... *WHAT*!?!?

If you want lockdep validation for one (or more) dev->mutex instances,
why not pull them out of the no_validate class and use the normal
locking?

This is all quite insane.

  reply	other threads:[~2022-04-13  8:43 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
2022-04-13  8:43   ` Peter Zijlstra [this message]
2022-04-13 22:01     ` Dan Williams
2022-04-14 10:16       ` Peter Zijlstra
2022-04-14 17:17         ` Dan Williams
2022-04-14 19:33           ` Peter Zijlstra
2022-04-14 19:43             ` Dan Williams
2022-04-15  7:53               ` Peter Zijlstra
2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
2022-04-13  9:39   ` Peter Zijlstra
2022-04-13 22:05     ` Dan Williams
2022-04-13  6:01 ` [PATCH v2 04/12] cxl/core: Remove cxl_device_lock() Dan Williams
2022-04-13  6:01 ` [PATCH v2 05/12] cxl/core: Clamp max lock_class Dan Williams
2022-04-13  6:02 ` [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device Dan Williams
2022-04-13  6:02 ` [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper Dan Williams
2022-04-13  6:02 ` [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 10/12] libnvdimm: Drop nd_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 11/12] libnvdimm: Enable lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() " Dan Williams
2022-04-13 14:02 ` [PATCH v2 00/12] device-core: Enable " Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220413084309.GV2731@worktop.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=rafael@kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox