From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
alan@lxorguk.ukuu.org.uk, Roland Dreier <roland@purestorage.com>,
David Woodhouse <David.Woodhouse@intel.com>,
Steven Rostedt <rostedt@goodmis.org>
Subject: [ 38/38] intel-iommu: Fix AB-BA lockdep report
Date: Wed, 14 Nov 2012 20:10:35 -0800 [thread overview]
Message-ID: <20121115040935.248769659@linuxfoundation.org> (raw)
In-Reply-To: <20121115040932.918082372@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roland Dreier <roland@purestorage.com>
commit 3e7abe2556b583e87dabda3e0e6178a67b20d06f upstream.
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below. It looks like a legitimate lock
ordering problem:
- domain_context_mapping_one() takes iommu->lock and calls
iommu_support_dev_iotlb(), which takes device_domain_lock (inside
iommu->lock).
- domain_remove_one_dev_info() starts by taking device_domain_lock
then takes iommu->lock inside it (near the end of the function).
So this is the classic AB-BA deadlock. It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.
BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:
iommu_support_dev_iotlb()
domain_remove_dev_info()
lock device_domain_lock
find info
unlock device_domain_lock
lock device_domain_lock
find same info
unlock device_domain_lock
free_devinfo_mem(info)
do stuff with info after it's free
However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.
Anyway here's the full lockdep output that prompted all of this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.39.1+ #1
-------------------------------------------------------
bash/13954 is trying to acquire lock:
(&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
but task is already holding lock:
(device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (device_domain_lock){-.-...}:
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
[<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
[<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
[<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
[<ffffffff81cab204>] pci_iommu_init+0x16/0x41
[<ffffffff81002165>] do_one_initcall+0x45/0x190
[<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
[<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10
-> #0 (&(&iommu->lock)->rlock){......}:
[<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
[<ffffffff812f8b42>] device_notifier+0x72/0x90
[<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
[<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
[<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
[<ffffffff81373ccf>] device_release_driver+0x2f/0x50
[<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
[<ffffffff813724ac>] drv_attr_store+0x2c/0x30
[<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
[<ffffffff8117569e>] vfs_write+0xce/0x190
[<ffffffff811759e4>] sys_write+0x54/0xa0
[<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
6 locks held by bash/13954:
#0: (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
#1: (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
#2: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
#3: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
#4: (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
#5: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
stack backtrace:
Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
Call Trace:
[<ffffffff810993a7>] print_circular_bug+0xf7/0x100
[<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
[<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
[<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
[<ffffffff812f8b42>] device_notifier+0x72/0x90
[<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
[<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
[<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
[<ffffffff81373ccf>] device_release_driver+0x2f/0x50
[<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
[<ffffffff813724ac>] drv_attr_store+0x2c/0x30
[<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
[<ffffffff8117569e>] vfs_write+0xce/0x190
[<ffffffff811759e4>] sys_write+0x54/0xa0
[<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/pci/intel-iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -3573,6 +3573,8 @@ static void domain_remove_one_dev_info(s
found = 1;
}
+ spin_unlock_irqrestore(&device_domain_lock, flags);
+
if (found == 0) {
unsigned long tmp_flags;
spin_lock_irqsave(&domain->iommu_lock, tmp_flags);
@@ -3589,8 +3591,6 @@ static void domain_remove_one_dev_info(s
spin_unlock_irqrestore(&iommu->lock, tmp_flags);
}
}
-
- spin_unlock_irqrestore(&device_domain_lock, flags);
}
static void vm_domain_remove_all_dev_info(struct dmar_domain *domain)
prev parent reply other threads:[~2012-11-15 4:43 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-15 4:09 [ 00/38] 3.0.52-stable review Greg Kroah-Hartman
2012-11-15 4:09 ` [ 01/38] ath9k: fix stale pointers potentially causing access to freed skbs Greg Kroah-Hartman
2012-11-15 4:09 ` [ 02/38] rt2800: validate step value for temperature compensation Greg Kroah-Hartman
2012-11-15 4:10 ` [ 03/38] target: Dont return success from module_init() if setup fails Greg Kroah-Hartman
2012-11-15 4:10 ` [ 04/38] cfg80211: fix antenna gain handling Greg Kroah-Hartman
2012-11-15 4:10 ` [ 05/38] wireless: drop invalid mesh address extension frames Greg Kroah-Hartman
2012-11-15 4:10 ` [ 06/38] mac80211: dont inspect Sequence Control field on control frames Greg Kroah-Hartman
2012-11-15 4:10 ` [ 07/38] DRM/Radeon: Fix Load Detection on legacy primary DAC Greg Kroah-Hartman
2012-11-15 4:10 ` [ 08/38] mac80211: check management frame header length Greg Kroah-Hartman
2012-11-15 4:10 ` [ 09/38] mac80211: fix SSID copy on IBSS JOIN Greg Kroah-Hartman
2012-11-15 4:10 ` [ 10/38] nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts Greg Kroah-Hartman
2012-11-15 4:10 ` [ 11/38] nfs: Show original device name verbatim in /proc/*/mount{s,info} Greg Kroah-Hartman
2012-11-15 4:10 ` [ 12/38] NFSv4: nfs4_locku_done must release the sequence id Greg Kroah-Hartman
2012-11-15 4:10 ` [ 13/38] nfsd: add get_uint for u32s Greg Kroah-Hartman
2012-11-15 4:10 ` [ 14/38] NFS: fix bug in legacy DNS resolver Greg Kroah-Hartman
2012-11-15 4:10 ` [ 15/38] NFS: Fix Oopses in nfs_lookup_revalidate and nfs4_lookup_revalidate Greg Kroah-Hartman
2012-11-15 4:10 ` [ 16/38] drm: restore open_count if drm_setup fails Greg Kroah-Hartman
2012-11-15 4:10 ` [ 17/38] hwmon: (w83627ehf) Force initial bank selection Greg Kroah-Hartman
2012-11-15 4:10 ` [ 18/38] ALSA: PCM: Fix some races at disconnection Greg Kroah-Hartman
2012-11-15 4:10 ` [ 19/38] ALSA: usb-audio: Fix " Greg Kroah-Hartman
2012-11-15 4:10 ` [ 20/38] ALSA: usb-audio: Use rwsem for disconnect protection Greg Kroah-Hartman
2012-11-15 4:10 ` [ 21/38] ALSA: usb-audio: Fix races at disconnection in mixer_quirks.c Greg Kroah-Hartman
2012-11-15 4:10 ` [ 22/38] ALSA: Add a reference counter to card instance Greg Kroah-Hartman
2012-11-15 4:10 ` [ 23/38] ALSA: Avoid endless sleep after disconnect Greg Kroah-Hartman
2012-11-15 4:10 ` [ 24/38] sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter() Greg Kroah-Hartman
2012-11-15 4:10 ` [ 25/38] netlink: use kfree_rcu() in netlink_release() Greg Kroah-Hartman
2012-11-15 4:10 ` [ 26/38] tcp: fix FIONREAD/SIOCINQ Greg Kroah-Hartman
2012-11-15 4:10 ` [ 27/38] ipv6: Set default hoplimit as zero Greg Kroah-Hartman
2012-11-15 4:10 ` [ 28/38] net: usb: Fix memory leak on Tx data path Greg Kroah-Hartman
2012-11-15 4:10 ` [ 29/38] net: fix divide by zero in tcp algorithm illinois Greg Kroah-Hartman
2012-11-15 4:10 ` [ 30/38] l2tp: fix oops in l2tp_eth_create() error path Greg Kroah-Hartman
2012-11-15 4:10 ` [ 31/38] ipv6: send unsolicited neighbour advertisements to all-nodes Greg Kroah-Hartman
2012-11-15 4:10 ` [ 32/38] futex: Handle futex_pi OWNER_DIED take over correctly Greg Kroah-Hartman
2012-11-15 4:10 ` [ 33/38] drm/vmwgfx: Fix hibernation device reset Greg Kroah-Hartman
2012-11-15 4:10 ` [ 34/38] drm/i915: fixup infoframe support for sdvo Greg Kroah-Hartman
2012-11-15 4:10 ` [ 35/38] drm/i915: clear the entire sdvo infoframe buffer Greg Kroah-Hartman
2012-11-15 4:10 ` [ 36/38] USB: mos7840: remove unused variable Greg Kroah-Hartman
2012-11-15 4:10 ` [ 37/38] xfs: fix reading of wrapped log data Greg Kroah-Hartman
2012-11-15 4:10 ` Greg Kroah-Hartman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121115040935.248769659@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=David.Woodhouse@intel.com \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=roland@purestorage.com \
--cc=rostedt@goodmis.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).