From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Mike Galbraith <umgwanakikbuti@gmail.com>
Subject: [PATCH 3.10 29/44] rtmutex: Fix deadlock detector for real
Date: Tue, 15 Jul 2014 16:17:27 -0700 [thread overview]
Message-ID: <20140715231647.559614715@linuxfoundation.org> (raw)
In-Reply-To: <20140715231646.690728348@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner <tglx@linutronix.de>
commit 397335f004f41e5fcf7a795e94eb3ab83411a17c upstream.
The current deadlock detection logic does not work reliably due to the
following early exit path:
/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
* mode!
*/
if (top_waiter && (!task_has_pi_waiters(task) ||
top_waiter != task_top_pi_waiter(task)))
goto out_unlock_pi;
So this not only exits when the task has no waiters, it also exits
unconditionally when the current waiter is not the top priority waiter
of the task.
So in a nested locking scenario, it might abort the lock chain walk
and therefor miss a potential deadlock.
Simple fix: Continue the chain walk, when deadlock detection is
enabled.
We also avoid the whole enqueue, if we detect the deadlock right away
(A-A). It's an optimization, but also prevents that another waiter who
comes in after the detection and before the task has undone the damage
observes the situation and detects the deadlock and returns
-EDEADLOCK, which is wrong as the other task is not in a deadlock
situation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/rtmutex.c | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -212,9 +212,16 @@ static int rt_mutex_adjust_prio_chain(st
* top_waiter can be NULL, when we are in the deboosting
* mode!
*/
- if (top_waiter && (!task_has_pi_waiters(task) ||
- top_waiter != task_top_pi_waiter(task)))
- goto out_unlock_pi;
+ if (top_waiter) {
+ if (!task_has_pi_waiters(task))
+ goto out_unlock_pi;
+ /*
+ * If deadlock detection is off, we stop here if we
+ * are not the top pi waiter of the task.
+ */
+ if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
+ goto out_unlock_pi;
+ }
/*
* When deadlock detection is off then we check, if further
@@ -230,7 +237,12 @@ static int rt_mutex_adjust_prio_chain(st
goto retry;
}
- /* Deadlock detection */
+ /*
+ * Deadlock detection. If the lock is the same as the original
+ * lock which caused us to walk the lock chain or if the
+ * current lock is owned by the task which initiated the chain
+ * walk, we detected a deadlock.
+ */
if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
raw_spin_unlock(&lock->wait_lock);
@@ -399,6 +411,18 @@ static int task_blocks_on_rt_mutex(struc
unsigned long flags;
int chain_walk = 0, res;
+ /*
+ * Early deadlock detection. We really don't want the task to
+ * enqueue on itself just to untangle the mess later. It's not
+ * only an optimization. We drop the locks, so another waiter
+ * can come in before the chain walk detects the deadlock. So
+ * the other will detect the deadlock and return -EDEADLOCK,
+ * which is wrong, as the other waiter is not in a deadlock
+ * situation.
+ */
+ if (detect_deadlock && owner == task)
+ return -EDEADLK;
+
raw_spin_lock_irqsave(&task->pi_lock, flags);
__rt_mutex_adjust_prio(task);
waiter->task = task;
next prev parent reply other threads:[~2014-07-16 0:12 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-15 23:16 [PATCH 3.10 00/44] 3.10.49-stable review Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.10 01/44] usb: option: Add ID for Telewell TW-LTE 4G v2 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 02/44] USB: cp210x: add support for Corsair usb dongle Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 03/44] USB: ftdi_sio: Add extra PID Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 04/44] USB: serial: ftdi_sio: Add Infineon Triboard Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 05/44] parisc: add serial ports of C8000/1GHz machine to hardware database Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 06/44] workqueue: fix dev_set_uevent_suppress() imbalance Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 07/44] cpuset,mempolicy: fix sleeping function called from invalid context Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 08/44] workqueue: zero cpumask of wq_numa_possible_cpumask on init Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 09/44] hwmon: (amc6821) Fix permissions for temp2_input Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 10/44] hwmon: (adm1031) Fix writes to limit registers Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 11/44] hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 12/44] hwmon: (adm1021) Fix cache problem when writing temperature limits Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 13/44] ACPI / resources: only reject zero length resources based at address zero Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 14/44] powerpc/perf: Never program book3s PMCs with values >= 0x80000000 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 15/44] powerpc/perf: Add PPMU_ARCH_207S define Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 16/44] powerpc/perf: Clear MMCR2 when enabling PMU Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 17/44] cpufreq: Makefile: fix compilation for davinci platform Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 18/44] crypto: sha512_ssse3 - fix byte count to bit count conversion Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 19/44] arm64: implement TASK_SIZE_OF Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 20/44] clk: spear3xx: Use proper control register offset Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 21/44] Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 22/44] dm io: fix a race condition in the wake up code for sync_io Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 23/44] ext4: fix unjournalled bg descriptor while initializing inode bitmap Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 24/44] ext4: clarify error count warning messages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 25/44] ext4: disable synchronous transaction batching if max_batch_time==0 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 26/44] drm/radeon: fix typo in golden register setup on evergreen Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 28/44] ring-buffer: Check if buffer exists before polling Greg Kroah-Hartman
2014-07-15 23:17 ` Greg Kroah-Hartman [this message]
2014-07-15 23:17 ` [PATCH 3.10 30/44] rtmutex: Detect changes in the pi lock chain Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 31/44] rtmutex: Handle deadlock detection smarter Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 32/44] rtmutex: Plug slow unlock race Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 33/44] score: normalize global variables exported by vmlinux.lds Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 34/44] Score: Implement the function csum_ipv6_magic Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 35/44] Score: The commit is for compiling successfully Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 36/44] Score: Modify the Makefile of Score, remove -mlong-calls for compiling Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 37/44] Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 38/44] x86-64, espfix: Dont leak bits 31:16 of %esp returning to 16-bit stack Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 39/44] x86, espfix: Move espfix definitions into a separate header file Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 40/44] x86, espfix: Fix broken header guard Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 41/44] x86, espfix: Make espfix64 a Kconfig option, fix UML Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 42/44] x86, espfix: Make it possible to disable 16-bit support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 43/44] x86, ioremap: Speed up check for RAM pages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.10 44/44] ACPI / battery: Retry to get battery information if failed during probing Greg Kroah-Hartman
2014-07-16 4:25 ` [PATCH 3.10 00/44] 3.10.49-stable review Guenter Roeck
2014-07-16 23:10 ` Greg Kroah-Hartman
2014-07-17 13:24 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140715231647.559614715@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=umgwanakikbuti@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox