From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	juri.lelli@arm.com, bigeasy@linutronix.de, xlpang@redhat.com,
	rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
	jdesfossez@efficios.com, dvhart@infradead.org,
	bristot@redhat.com, Thomas Gleixner <tglx@linutronix.de>,
	Ben Hutchings <ben@decadent.org.uk>
Subject: [PATCH 4.9 39/53] futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
Date: Mon, 29 Mar 2021 09:58:14 +0200
Message-ID: <20210329075608.797462410@linuxfoundation.org>
In-Reply-To: <20210329075607.561619583@linuxfoundation.org>

From: Peter Zijlstra <peterz@infradead.org>

commit cfafcd117da0216520568c195cb2f6cd1980c4bb upstream.

By changing futex_lock_pi() to use rt_mutex_*_proxy_lock() all wait_list
modifications are done under both hb->lock and wait_lock.

This closes the obvious interleave pattern between futex_lock_pi() and
futex_unlock_pi(), but not entirely so. See below:

Before:

futex_lock_pi()                 futex_unlock_pi()
  unlock hb->lock

                                  lock hb->lock
                                  unlock hb->lock

                                  lock rt_mutex->wait_lock
                                  unlock rt_mutex->wait_lock
                                    -EAGAIN

  lock rt_mutex->wait_lock
  list_add
  unlock rt_mutex->wait_lock

  schedule()

  lock rt_mutex->wait_lock
  list_del
  unlock rt_mutex->wait_lock

                                  <idem>
                                    -EAGAIN

  lock hb->lock


After:

futex_lock_pi()                 futex_unlock_pi()

  lock hb->lock
  lock rt_mutex->wait_lock
  list_add
  unlock rt_mutex->wait_lock
  unlock hb->lock

  schedule()
                                  lock hb->lock
                                  unlock hb->lock
  lock hb->lock
  lock rt_mutex->wait_lock
  list_del
  unlock rt_mutex->wait_lock

                                  lock rt_mutex->wait_lock
                                  unlock rt_mutex->wait_lock
                                    -EAGAIN

  unlock hb->lock

It does however solve the earlier starvation/live-lock scenario that was
introduced with the -EAGAIN. In the before scenario, the -EAGAIN happens
while futex_unlock_pi() doesn't hold any locks; in the after scenario, it
happens while futex_unlock_pi() actually holds a lock, so it is
serialized on that lock.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.062785528@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
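Reader note, not part of the upstream commit: the "After" ordering described
in the changelog can be modelled in userspace. This is only a minimal sketch,
assuming pthread mutexes as stand-ins for hb->lock and rt_mutex->wait_lock.
struct pi_model, PI_MODEL_INIT and the model_*() helpers are hypothetical
names made up for illustration; none of them exist in the kernel, and the
counters merely stand in for the two wait lists.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct pi_model {
	pthread_mutex_t hb_lock;	/* stands in for hb->lock */
	pthread_mutex_t wait_lock;	/* stands in for rt_mutex->wait_lock */
	int hb_waiters;			/* entries queued on the hash bucket */
	int rt_waiters;			/* entries on the rt_mutex wait list */
};

#define PI_MODEL_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, 0, 0 }

/* Waiter side: both lists grow before hb_lock is dropped. */
static void model_enqueue(struct pi_model *m)
{
	pthread_mutex_lock(&m->hb_lock);
	m->hb_waiters++;			/* plays the role of __queue_me() */
	pthread_mutex_lock(&m->wait_lock);
	m->rt_waiters++;			/* plays the role of list_add */
	pthread_mutex_unlock(&m->wait_lock);
	pthread_mutex_unlock(&m->hb_lock);
}

/* Waiter side after schedule(): retake hb_lock first, then remove. */
static void model_dequeue(struct pi_model *m)
{
	pthread_mutex_lock(&m->hb_lock);
	pthread_mutex_lock(&m->wait_lock);
	m->rt_waiters--;			/* plays the role of list_del */
	pthread_mutex_unlock(&m->wait_lock);
	m->hb_waiters--;
	pthread_mutex_unlock(&m->hb_lock);
}

/*
 * Unlock side: every modification above holds hb_lock, so holding
 * hb_lock alone is enough to observe the two lists in agreement.
 */
static bool model_lists_match(struct pi_model *m)
{
	bool match;

	pthread_mutex_lock(&m->hb_lock);
	match = (m->hb_waiters == m->rt_waiters);
	pthread_mutex_unlock(&m->hb_lock);
	return match;
}
```

In this model, any number of threads may call model_enqueue() and
model_dequeue() concurrently while another thread polls model_lists_match();
because both counters only change with hb_lock held, the observer should
never see them disagree.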
 kernel/futex.c                  |   77 ++++++++++++++++++++++++++++------------
 kernel/locking/rtmutex.c        |   26 +++----------
 kernel/locking/rtmutex_common.h |    1
 3 files changed, 62 insertions(+), 42 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2333,20 +2333,7 @@ queue_unlock(struct futex_hash_bucket *h
 	hb_waiters_dec(hb);
 }
 
-/**
- * queue_me() - Enqueue the futex_q on the futex_hash_bucket
- * @q: The futex_q to enqueue
- * @hb: The destination hash bucket
- *
- * The hb->lock must be held by the caller, and is released here. A call to
- * queue_me() is typically paired with exactly one call to unqueue_me(). The
- * exceptions involve the PI related operations, which may use unqueue_me_pi()
- * or nothing if the unqueue is done as part of the wake process and the unqueue
- * state is implicit in the state of woken task (see futex_wait_requeue_pi() for
- * an example).
- */
-static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
-	__releases(&hb->lock)
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
 {
 	int prio;
 
@@ -2363,6 +2350,24 @@ static inline void queue_me(struct futex
 	plist_node_init(&q->list, prio);
 	plist_add(&q->list, &hb->chain);
 	q->task = current;
+}
+
+/**
+ * queue_me() - Enqueue the futex_q on the futex_hash_bucket
+ * @q: The futex_q to enqueue
+ * @hb: The destination hash bucket
+ *
+ * The hb->lock must be held by the caller, and is released here. A call to
+ * queue_me() is typically paired with exactly one call to unqueue_me(). The
+ * exceptions involve the PI related operations, which may use unqueue_me_pi()
+ * or nothing if the unqueue is done as part of the wake process and the unqueue
+ * state is implicit in the state of woken task (see futex_wait_requeue_pi() for
+ * an example).
+ */
+static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
+	__releases(&hb->lock)
+{
+	__queue_me(q, hb);
 	spin_unlock(&hb->lock);
 }
 
@@ -2868,6 +2873,7 @@ static int futex_lock_pi(u32 __user *uad
 {
 	struct hrtimer_sleeper timeout, *to = NULL;
 	struct task_struct *exiting = NULL;
+	struct rt_mutex_waiter rt_waiter;
 	struct futex_hash_bucket *hb;
 	struct futex_q q = futex_q_init;
 	int res, ret;
@@ -2928,25 +2934,52 @@ retry_private:
 		}
 	}
 
+	WARN_ON(!q.pi_state);
+
 	/*
 	 * Only actually queue now that the atomic ops are done:
 	 */
-	queue_me(&q, hb);
+	__queue_me(&q, hb);
 
-	WARN_ON(!q.pi_state);
-	/*
-	 * Block on the PI mutex:
-	 */
-	if (!trylock) {
-		ret = rt_mutex_timed_futex_lock(&q.pi_state->pi_mutex, to);
-	} else {
+	if (trylock) {
 		ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
 		/* Fixup the trylock return value: */
 		ret = ret ? 0 : -EWOULDBLOCK;
+		goto no_block;
 	}
 
+	/*
+	 * We must add ourselves to the rt_mutex waitlist while holding hb->lock
+	 * such that the hb and rt_mutex wait lists match.
+	 */
+	rt_mutex_init_waiter(&rt_waiter);
+	ret = rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+	if (ret) {
+		if (ret == 1)
+			ret = 0;
+
+		goto no_block;
+	}
+
+	spin_unlock(q.lock_ptr);
+
+	if (unlikely(to))
+		hrtimer_start_expires(&to->timer, HRTIMER_MODE_ABS);
+
+	ret = rt_mutex_wait_proxy_lock(&q.pi_state->pi_mutex, to, &rt_waiter);
+
 	spin_lock(q.lock_ptr);
 	/*
+	 * If we failed to acquire the lock (signal/timeout), we must
+	 * first acquire the hb->lock before removing the lock from the
+	 * rt_mutex waitqueue, such that we can keep the hb and rt_mutex
+	 * wait lists consistent.
+	 */
+	if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
+		ret = 0;
+
+no_block:
+	/*
 	 * Fixup the pi_state owner and possibly acquire the lock if we
 	 * haven't already.
 	 */
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1523,19 +1523,6 @@ int __sched rt_mutex_lock_interruptible(
 EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
 
 /*
- * Futex variant with full deadlock detection.
- * Futex variants must not use the fast-path, see __rt_mutex_futex_unlock().
- */
-int __sched rt_mutex_timed_futex_lock(struct rt_mutex *lock,
-				      struct hrtimer_sleeper *timeout)
-{
-	might_sleep();
-
-	return rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE,
-				 timeout, RT_MUTEX_FULL_CHAINWALK);
-}
-
-/*
  * Futex variant, must not use fastpath.
  */
 int __sched rt_mutex_futex_trylock(struct rt_mutex *lock)
@@ -1808,12 +1795,6 @@ int rt_mutex_wait_proxy_lock(struct rt_m
 	/* sleep on the mutex */
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
 
-	/*
-	 * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
-	 * have to fix that up.
-	 */
-	fixup_rt_mutex_waiters(lock);
-
 	raw_spin_unlock_irq(&lock->wait_lock);
 
 	return ret;
@@ -1853,6 +1834,13 @@ bool rt_mutex_cleanup_proxy_lock(struct
 		fixup_rt_mutex_waiters(lock);
 		cleanup = true;
 	}
+
+	/*
+	 * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+	 * have to fix that up.
+	 */
+	fixup_rt_mutex_waiters(lock);
+
 	raw_spin_unlock_irq(&lock->wait_lock);
 
 	return cleanup;
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -112,7 +112,6 @@ extern int rt_mutex_wait_proxy_lock(stru
 			       struct rt_mutex_waiter *waiter);
 extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
 				 struct rt_mutex_waiter *waiter);
-extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
 extern int rt_mutex_futex_trylock(struct rt_mutex *l);
 extern int __rt_mutex_futex_trylock(struct rt_mutex *l);
 
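
Reader note, not part of the patch: the return-value handling in the
futex_lock_pi() hunk above can be summarised as a small pure function. This
is a sketch under stated assumptions. model_lock_pi_ret() is a hypothetical
helper that does not exist in the kernel, and the meaning given to each
argument is read off the diff and its comments (start returning 1 treated as
"lock acquired without blocking", a false result from cleanup treated as
"lock acquired after all").

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel function: folds the three proxy-lock
 * results into the value futex_lock_pi() carries to its no_block label.
 *
 * start_ret:  result of the start step; 1 = lock acquired immediately,
 *             0 = queued as a waiter, negative = error
 * wait_ret:   result of the wait step; 0 = lock acquired,
 *             negative = signal or timeout
 * cleaned_up: result of the cleanup step, only consulted when wait_ret
 *             is non-zero; false = the lock was acquired after all
 */
static int model_lock_pi_ret(int start_ret, int wait_ret, bool cleaned_up)
{
	int ret = start_ret;

	if (ret) {
		/* Acquired without blocking counts as success. */
		if (ret == 1)
			ret = 0;
		return ret;
	}

	ret = wait_ret;
	/* A failed wait is overridden if the lock was acquired anyway. */
	if (ret && !cleaned_up)
		ret = 0;
	return ret;
}
```

For example, model_lock_pi_ret(0, -EINTR, false) yields 0: the wait was
interrupted, but the cleanup step reports that the lock was acquired in the
meantime, so the error is discarded.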