All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Ramalingam C <ramalingam.c@intel.com>,
	Arend van Spriel <aspriel@gmail.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Bartosz Golaszewski <brgl@bgdev.pl>,
	Heikki Krogerus <heikki.krogerus@linux.intel.com>,
	Vivek Gautam <vivek.gautam@codeaurora.org>,
	Joe Perches <joe@perches.com>,
	Daniel Vetter <daniel.vetter@intel.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 3.18 50/52] sysfs: Disable lockdep for driver bind/unbind files
Date: Thu, 24 Jan 2019 20:20:14 +0100	[thread overview]
Message-ID: <20190124190151.012666115@linuxfoundation.org> (raw)
In-Reply-To: <20190124190140.879495253@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 4f4b374332ec0ae9c738ff8ec9bed5cd97ff9adc ]

This is the much more correct fix for my earlier attempt at:

https://lkml.org/lkml/2018/12/10/118

Short recap:

- There's not actually a locking issue, it's just lockdep being a bit
  too eager to complain about a possible deadlock.

- Contrary to what I claimed the real problem is recursion on
  kn->count. Greg pointed me at sysfs_break_active_protection(), used
  by the scsi subsystem to allow a sysfs file to unbind itself. That
  would be a real deadlock, which isn't what's happening here. Also,
  breaking the active protection means we'd need to manually handle
  all the lifetime fun.

- With Rafael we discussed the task_work approach, which kinda works,
  but has two downsides: It's a functional change for a lockdep
  annotation issue, and it won't work for the bind file (which needs
  to get the errno from the driver load function back to userspace).

- Greg also asked why this never showed up: To hit this you need to
  unregister a 2nd driver from the unload code of your first driver. I
  guess only gpus do that. The bug has always been there, but only
  with a recent patch series did we add more locks so that lockdep
  built a chain from unbinding the snd-hda driver to the
  acpi_video_unregister call.

Full lockdep splat:

[12301.898799] ============================================
[12301.898805] WARNING: possible recursive locking detected
[12301.898811] 4.20.0-rc7+ #84 Not tainted
[12301.898815] --------------------------------------------
[12301.898821] bash/5297 is trying to acquire lock:
[12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80
[12301.898841] but task is already holding lock:
[12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898856] other info that might help us debug this:
[12301.898862]  Possible unsafe locking scenario:
[12301.898867]        CPU0
[12301.898870]        ----
[12301.898874]   lock(kn->count#39);
[12301.898879]   lock(kn->count#39);
[12301.898883] *** DEADLOCK ***
[12301.898891]  May be due to missing lock nesting notation
[12301.898899] 5 locks held by bash/5297:
[12301.898903]  #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0
[12301.898915]  #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190
[12301.898925]  #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898936]  #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240
[12301.898950]  #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40
[12301.898960] stack backtrace:
[12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ #84
[12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[12301.898982] Call Trace:
[12301.898989]  dump_stack+0x67/0x9b
[12301.898997]  __lock_acquire+0x6ad/0x1410
[12301.899003]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899010]  ? find_held_lock+0x2d/0x90
[12301.899017]  ? mutex_spin_on_owner+0xe4/0x150
[12301.899023]  ? find_held_lock+0x2d/0x90
[12301.899030]  ? lock_acquire+0x90/0x180
[12301.899036]  lock_acquire+0x90/0x180
[12301.899042]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899049]  __kernfs_remove+0x296/0x310
[12301.899055]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899060]  ? kernfs_name_hash+0xd/0x80
[12301.899066]  ? kernfs_find_ns+0x6c/0x100
[12301.899073]  kernfs_remove_by_name_ns+0x3b/0x80
[12301.899080]  bus_remove_driver+0x92/0xa0
[12301.899085]  acpi_video_unregister+0x24/0x40
[12301.899127]  i915_driver_unload+0x42/0x130 [i915]
[12301.899160]  i915_pci_remove+0x19/0x30 [i915]
[12301.899169]  pci_device_remove+0x36/0xb0
[12301.899176]  device_release_driver_internal+0x185/0x240
[12301.899183]  unbind_store+0xaf/0x180
[12301.899189]  kernfs_fop_write+0x104/0x190
[12301.899195]  __vfs_write+0x31/0x180
[12301.899203]  ? rcu_read_lock_sched_held+0x6f/0x80
[12301.899209]  ? rcu_sync_lockdep_assert+0x29/0x50
[12301.899216]  ? __sb_start_write+0x13c/0x1a0
[12301.899221]  ? vfs_write+0x17f/0x1b0
[12301.899227]  vfs_write+0xb9/0x1b0
[12301.899233]  ksys_write+0x50/0xc0
[12301.899239]  do_syscall_64+0x4b/0x180
[12301.899247]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[12301.899253] RIP: 0033:0x7f452ac7f7a4
[12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
[12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4
[12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001
[12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730
[12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d
[12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d

Looking around I've noticed that usb and i2c already handle similar
recursion problems, where a sysfs file can unbind the same type of
sysfs somewhere else in the hierarchy. Relevant commits are:

commit 356c05d58af05d582e634b54b40050c73609617b
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Mon May 14 13:30:03 2012 -0400

    sysfs: get rid of some lockdep false positives

commit e9b526fe704812364bca07edd15eadeba163ebfb
Author: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Date:   Fri May 17 14:56:35 2013 +0200

    i2c: suppress lockdep warning on delete_device

Implement the same trick for driver bind/unbind.

v2: Put the macro into bus.c (Greg).

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Vivek Gautam <vivek.gautam@codeaurora.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/base/bus.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 07ea8608fb0b..c05561e26d85 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -32,6 +32,9 @@ static struct kset *system_kset;
 
 #define to_drv_attr(_attr) container_of(_attr, struct driver_attribute, attr)
 
+#define DRIVER_ATTR_IGNORE_LOCKDEP(_name, _mode, _show, _store) \
+	struct driver_attribute driver_attr_##_name =		\
+		__ATTR_IGNORE_LOCKDEP(_name, _mode, _show, _store)
 
 static int __must_check bus_rescan_devices_helper(struct device *dev,
 						void *data);
@@ -197,7 +200,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf,
 	bus_put(bus);
 	return err;
 }
-static DRIVER_ATTR_WO(unbind);
+static DRIVER_ATTR_IGNORE_LOCKDEP(unbind, S_IWUSR, NULL, unbind_store);
 
 /*
  * Manually attach a device to a driver.
@@ -233,7 +236,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
 	bus_put(bus);
 	return err;
 }
-static DRIVER_ATTR_WO(bind);
+static DRIVER_ATTR_IGNORE_LOCKDEP(bind, S_IWUSR, NULL, bind_store);
 
 static ssize_t show_drivers_autoprobe(struct bus_type *bus, char *buf)
 {
-- 
2.19.1




  parent reply	other threads:[~2019-01-24 19:23 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-24 19:19 [PATCH 3.18 00/52] 3.18.133-stable review Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 01/52] sparc32: Fix inverted invalid_frame_pointer checks on sigreturns Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 02/52] CIFS: Do not hide EINTR after sending network packets Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 03/52] cifs: Fix potential OOB access of lock element array Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 04/52] usb: cdc-acm: send ZLP for Telit 3G Intel based modems Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 05/52] USB: storage: dont insert sane sense for SPC3+ when bad sense specified Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 06/52] USB: storage: add quirk for SMI SM3350 Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 07/52] slab: alien caches must not be initialized if the allocation of the alien cache failed Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 08/52] ACPI: power: Skip duplicate power resource references in _PRx Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 09/52] i2c: dev: prevent adapter retries and timeout being set as minus value Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 10/52] crypto: cts - fix crash on short inputs Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 11/52] sunrpc: use-after-free in svc_process_common() Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 12/52] tty/ldsem: Wake up readers after timed out down_write() Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 13/52] can: gw: ensure DLC boundaries after CAN frame modification Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 14/52] media: em28xx: Fix misplaced reset of dev->v4l::field_count Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 15/52] ipv6: fix kernel-infoleak in ipv6_local_error() Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 16/52] packet: Do not leak dev refcounts on error exit Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 17/52] net: bridge: fix a bug on using a neighbour cache entry without checking its state Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 18/52] crypto: authenc - fix parsing key with misaligned rta_len Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 19/52] btrfs: wait on ordered extents on abort cleanup Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 20/52] Yama: Check for pid death before checking ancestry Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 21/52] scsi: sd: Fix cache_type_store() Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 22/52] mfd: tps6586x: Handle interrupts on suspend Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 23/52] Disable MSI also when pcie-octeon.pcie_disable on Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 24/52] omap2fb: Fix stack memory disclosure Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 25/52] media: vivid: fix error handling of kthread_run Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 26/52] media: vivid: set min width/height to a value > 0 Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 27/52] media: vb2: vb2_mmap: move lock up Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 28/52] sunrpc: handle ENOMEM in rpcb_getport_async Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 29/52] selinux: fix GPF on invalid policy Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 30/52] sctp: allocate sctp_sockaddr_entry with kzalloc Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 31/52] block/loop: Use global lock for ioctl() operation Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 32/52] drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 33/52] media: vb2: be sure to unlock mutex on errors Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 34/52] r8169: Add support for new Realtek Ethernet Greg Kroah-Hartman
2019-01-24 19:19 ` [PATCH 3.18 35/52] MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 36/52] jffs2: Fix use of uninitialized delayed_work, lockdep breakage Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 37/52] pstore/ram: Do not treat empty buffers as valid Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 38/52] powerpc/pseries/cpuidle: Fix preempt warning Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 39/52] media: firewire: Fix app_info parameter type in avc_ca{,_app}_info Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 40/52] net: call sk_dst_reset when set SO_DONTROUTE Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 41/52] scsi: target: use consistent left-aligned ASCII INQUIRY data Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 42/52] clk: imx6q: reset exclusive gates on init Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 43/52] kconfig: fix memory leak when EOF is encountered in quotation Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 44/52] mmc: atmel-mci: do not assume idle after atmci_request_end Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 45/52] perf svghelper: Fix unchecked usage of strncpy() Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 46/52] perf parse-events: " Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 47/52] dm kcopyd: Fix bug causing workqueue stalls Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 48/52] dm snapshot: Fix excessive memory usage and " Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 49/52] ALSA: bebob: fix model-id of unit for Apogee Ensemble Greg Kroah-Hartman
2019-01-24 19:20 ` Greg Kroah-Hartman [this message]
2019-01-24 19:20 ` [PATCH 3.18 51/52] ocfs2: fix panic due to unrecovered local alloc Greg Kroah-Hartman
2019-01-24 19:20 ` [PATCH 3.18 52/52] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps Greg Kroah-Hartman
2019-01-25 14:39 ` [PATCH 3.18 00/52] 3.18.133-stable review shuah
2019-01-25 23:16 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190124190151.012666115@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=aspriel@gmail.com \
    --cc=brgl@bgdev.pl \
    --cc=daniel.vetter@intel.com \
    --cc=geert+renesas@glider.be \
    --cc=heikki.krogerus@linux.intel.com \
    --cc=joe@perches.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=ramalingam.c@intel.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vivek.gautam@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.