From: gregkh@linuxfoundation.org
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Stefan Liebler <stli@linux.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Darren Hart <dvhart@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Sasha Levin <sashal@kernel.org>,
Sudip Mukherjee <sudipm.mukherjee@gmail.com>,
Lee Jones <lee.jones@linaro.org>,
Zheng Yejian <zhengyejian1@huawei.com>
Subject: [PATCH 4.4 12/75] futex: Cure exit race
Date: Mon, 15 Mar 2021 14:51:26 +0100 [thread overview]
Message-ID: <20210315135208.663967134@linuxfoundation.org> (raw)
In-Reply-To: <20210315135208.252034256@linuxfoundation.org>
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Thomas Gleixner <tglx@linutronix.de>
commit da791a667536bf8322042e38ca85d55a78d3c273 upstream.
This patch comes directly from an origin patch (commit
9c3f3986036760c48a92f04b36774aa9f63673f80) in v4.9.
Stefan reported, that the glibc tst-robustpi4 test case fails
occasionally. That case creates the following race between
sys_exit() and sys_futex_lock_pi():
CPU0 CPU1
sys_exit() sys_futex()
do_exit() futex_lock_pi()
exit_signals(tsk) No waiters:
tsk->flags |= PF_EXITING; *uaddr == 0x00000PID
mm_release(tsk) Set waiter bit
exit_robust_list(tsk) { *uaddr = 0x80000PID;
Set owner died attach_to_pi_owner() {
*uaddr = 0xC0000000; tsk = get_task(PID);
} if (!tsk->flags & PF_EXITING) {
... attach();
tsk->flags |= PF_EXITPIDONE; } else {
if (!(tsk->flags & PF_EXITPIDONE))
return -EAGAIN;
return -ESRCH; <--- FAIL
}
ESRCH is returned all the way to user space, which triggers the glibc test
case assert. Returning ESRCH unconditionally is wrong here because the user
space value has been changed by the exiting task to 0xC0000000, i.e. the
FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
is a valid state and the kernel has to handle it, i.e. taking the futex.
Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
is set in the task which 'owns' the futex. If the value has changed, let
the kernel retry the operation, which includes all regular sanity checks
and correctly handles the FUTEX_OWNER_DIED case.
If it hasn't changed, then return ESRCH as there is no way to distinguish
this case from malfunctioning user space. This happens when the exiting
task did not have a robust list, the robust list was corrupted or the user
space value in the futex was simply bogus.
Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Lee: Required to satisfy functional dependency from futex back-port.
Re-add the missing handle_exit_race() parts from:
3d4775df0a89 ("futex: Replace PF_EXITPIDONE with a state")]
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/futex.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 65 insertions(+), 6 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1198,11 +1198,67 @@ static void wait_for_owner_exiting(int r
put_task_struct(exiting);
}
+static int handle_exit_race(u32 __user *uaddr, u32 uval,
+ struct task_struct *tsk)
+{
+ u32 uval2;
+
+ /*
+ * If the futex exit state is not yet FUTEX_STATE_DEAD, wait
+ * for it to finish.
+ */
+ if (tsk && tsk->futex_state != FUTEX_STATE_DEAD)
+ return -EAGAIN;
+
+ /*
+ * Reread the user space value to handle the following situation:
+ *
+ * CPU0 CPU1
+ *
+ * sys_exit() sys_futex()
+ * do_exit() futex_lock_pi()
+ * futex_lock_pi_atomic()
+ * exit_signals(tsk) No waiters:
+ * tsk->flags |= PF_EXITING; *uaddr == 0x00000PID
+ * mm_release(tsk) Set waiter bit
+ * exit_robust_list(tsk) { *uaddr = 0x80000PID;
+ * Set owner died attach_to_pi_owner() {
+ * *uaddr = 0xC0000000; tsk = get_task(PID);
+ * } if (!tsk->flags & PF_EXITING) {
+ * ... attach();
+ * tsk->futex_state = } else {
+ * FUTEX_STATE_DEAD; if (tsk->futex_state !=
+ * FUTEX_STATE_DEAD)
+ * return -EAGAIN;
+ * return -ESRCH; <--- FAIL
+ * }
+ *
+ * Returning ESRCH unconditionally is wrong here because the
+ * user space value has been changed by the exiting task.
+ *
+ * The same logic applies to the case where the exiting task is
+ * already gone.
+ */
+ if (get_futex_value_locked(&uval2, uaddr))
+ return -EFAULT;
+
+ /* If the user space value has changed, try again. */
+ if (uval2 != uval)
+ return -EAGAIN;
+
+ /*
+ * The exiting task did not have a robust list, the robust list was
+ * corrupted or the user space value in *uaddr is simply bogus.
+ * Give up and tell user space.
+ */
+ return -ESRCH;
+}
+
/*
* Lookup the task for the TID provided from user space and attach to
* it after doing proper sanity checks.
*/
-static int attach_to_pi_owner(u32 uval, union futex_key *key,
+static int attach_to_pi_owner(u32 __user *uaddr, u32 uval, union futex_key *key,
struct futex_pi_state **ps,
struct task_struct **exiting)
{
@@ -1213,12 +1269,15 @@ static int attach_to_pi_owner(u32 uval,
/*
* We are the first waiter - try to look up the real owner and attach
* the new pi_state to it, but bail out when TID = 0 [1]
+ *
+ * The !pid check is paranoid. None of the call sites should end up
+ * with pid == 0, but better safe than sorry. Let the caller retry
*/
if (!pid)
- return -ESRCH;
+ return -EAGAIN;
p = futex_find_get_task(pid);
if (!p)
- return -ESRCH;
+ return handle_exit_race(uaddr, uval, NULL);
if (unlikely(p->flags & PF_KTHREAD)) {
put_task_struct(p);
@@ -1237,7 +1296,7 @@ static int attach_to_pi_owner(u32 uval,
* FUTEX_STATE_DEAD, we know that the task has finished
* the cleanup:
*/
- int ret = (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
+ int ret = handle_exit_race(uaddr, uval, p);
raw_spin_unlock_irq(&p->pi_lock);
/*
@@ -1303,7 +1362,7 @@ static int lookup_pi_state(u32 __user *u
* We are the first waiter - try to look up the owner based on
* @uval and attach to it.
*/
- return attach_to_pi_owner(uval, key, ps, exiting);
+ return attach_to_pi_owner(uaddr, uval, key, ps, exiting);
}
static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
@@ -1419,7 +1478,7 @@ static int futex_lock_pi_atomic(u32 __us
* attach to the owner. If that fails, no harm done, we only
* set the FUTEX_WAITERS bit in the user space variable.
*/
- return attach_to_pi_owner(uval, key, ps, exiting);
+ return attach_to_pi_owner(uaddr, newval, key, ps, exiting);
}
/**
next prev parent reply other threads:[~2021-03-15 13:54 UTC|newest]
Thread overview: 80+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-15 13:51 [PATCH 4.4 00/75] 4.4.262-rc1 review gregkh
2021-03-15 13:51 ` [PATCH 4.4 01/75] uapi: nfnetlink_cthelper.h: fix userspace compilation error gregkh
2021-03-15 13:51 ` [PATCH 4.4 02/75] ath9k: fix transmitting to stations in dynamic SMPS mode gregkh
2021-03-15 13:51 ` [PATCH 4.4 03/75] net: Fix gro aggregation for udp encaps with zero csum gregkh
2021-03-15 13:51 ` [PATCH 4.4 04/75] can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership gregkh
2021-03-15 13:51 ` [PATCH 4.4 05/75] can: flexcan: assert FRZ bit in flexcan_chip_freeze() gregkh
2021-03-15 13:51 ` [PATCH 4.4 06/75] can: flexcan: enable RX FIFO after FRZ/HALT valid gregkh
2021-03-15 13:51 ` [PATCH 4.4 07/75] netfilter: x_tables: gpf inside xt_find_revision() gregkh
2021-03-15 13:51 ` [PATCH 4.4 08/75] cifs: return proper error code in statfs(2) gregkh
2021-03-15 13:51 ` [PATCH 4.4 09/75] floppy: fix lock_fdc() signal handling gregkh
2021-03-15 13:51 ` [PATCH 4.4 10/75] Revert "mm, slub: consider rest of partial list if acquire_slab() fails" gregkh
2021-03-15 13:51 ` [PATCH 4.4 11/75] futex: Change locking rules gregkh
2021-03-15 13:51 ` gregkh [this message]
2021-03-15 13:51 ` [PATCH 4.4 13/75] futex: fix dead code in attach_to_pi_owner() gregkh
2021-03-15 13:51 ` [PATCH 4.4 14/75] net/mlx4_en: update moderation when config reset gregkh
2021-03-15 13:51 ` [PATCH 4.4 15/75] net: lapbether: Remove netif_start_queue / netif_stop_queue gregkh
2021-03-15 13:51 ` [PATCH 4.4 16/75] net: davicom: Fix regulator not turned off on failed probe gregkh
2021-03-15 13:51 ` [PATCH 4.4 17/75] net: davicom: Fix regulator not turned off on driver removal gregkh
2021-03-15 13:51 ` [PATCH 4.4 18/75] media: usbtv: Fix deadlock on suspend gregkh
2021-03-15 13:51 ` [PATCH 4.4 19/75] mmc: mxs-mmc: Fix a resource leak in an error handling path in mxs_mmc_probe() gregkh
2021-03-15 13:51 ` [PATCH 4.4 20/75] mmc: mediatek: fix race condition between msdc_request_timeout and irq gregkh
2021-03-15 13:51 ` [PATCH 4.4 21/75] powerpc/perf: Record counter overflow always if SAMPLE_IP is unset gregkh
2021-03-15 13:51 ` [PATCH 4.4 22/75] PCI: xgene-msi: Fix race in installing chained irq handler gregkh
2021-03-15 13:51 ` [PATCH 4.4 23/75] s390/smp: __smp_rescan_cpus() - move cpumask away from stack gregkh
2021-03-15 13:51 ` [PATCH 4.4 24/75] scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling gregkh
2021-03-15 13:51 ` [PATCH 4.4 25/75] ALSA: hda/hdmi: Cancel pending works before suspend gregkh
2021-03-15 13:51 ` [PATCH 4.4 26/75] ALSA: hda: Avoid spurious unsol event handling during S3/S4 gregkh
2021-03-15 13:51 ` [PATCH 4.4 27/75] ALSA: usb-audio: Fix "cannot get freq eq" errors on Dell AE515 sound bar gregkh
2021-03-15 13:51 ` [PATCH 4.4 28/75] s390/dasd: fix hanging DASD driver unbind gregkh
2021-03-15 13:51 ` [PATCH 4.4 29/75] mmc: core: Fix partition switch time for eMMC gregkh
2021-03-15 13:51 ` [PATCH 4.4 30/75] scripts/recordmcount.{c,pl}: support -ffunction-sections .text.* section names gregkh
2021-03-15 13:51 ` [PATCH 4.4 31/75] libertas: fix a potential NULL pointer dereference gregkh
2021-03-15 13:51 ` [PATCH 4.4 32/75] Goodix Fingerprint device is not a modem gregkh
2021-03-15 13:51 ` [PATCH 4.4 33/75] usb: gadget: f_uac2: always increase endpoint max_packet_size by one audio slot gregkh
2021-03-15 13:51 ` [PATCH 4.4 34/75] usb: renesas_usbhs: Clear PIPECFG for re-enabling pipe with other EPNUM gregkh
2021-03-15 13:51 ` [PATCH 4.4 35/75] xhci: Improve detection of device initiated wake signal gregkh
2021-03-15 13:51 ` [PATCH 4.4 36/75] USB: serial: io_edgeport: fix memory leak in edge_startup gregkh
2021-03-15 13:51 ` [PATCH 4.4 37/75] USB: serial: ch341: add new Product ID gregkh
2021-03-15 13:51 ` [PATCH 4.4 38/75] USB: serial: cp210x: add ID for Acuity Brands nLight Air Adapter gregkh
2021-03-15 13:51 ` [PATCH 4.4 39/75] USB: serial: cp210x: add some more GE USB IDs gregkh
2021-03-15 13:51 ` [PATCH 4.4 40/75] usbip: fix stub_dev to check for stream socket gregkh
2021-03-15 13:51 ` [PATCH 4.4 41/75] usbip: fix vhci_hcd " gregkh
2021-03-15 13:51 ` [PATCH 4.4 42/75] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf gregkh
2021-03-15 13:51 ` [PATCH 4.4 43/75] staging: rtl8192u: fix ->ssid overflow in r8192_wx_set_scan() gregkh
2021-03-15 13:51 ` [PATCH 4.4 44/75] staging: rtl8188eu: prevent ->ssid overflow in rtw_wx_set_scan() gregkh
2021-03-15 13:51 ` [PATCH 4.4 45/75] staging: rtl8712: unterminated string leads to read overflow gregkh
2021-03-15 13:52 ` [PATCH 4.4 46/75] staging: rtl8188eu: fix potential memory corruption in rtw_check_beacon_data() gregkh
2021-03-15 13:52 ` [PATCH 4.4 47/75] staging: rtl8712: Fix possible buffer overflow in r8712_sitesurvey_cmd gregkh
2021-03-15 13:52 ` [PATCH 4.4 48/75] staging: rtl8192e: Fix possible buffer overflow in _rtl92e_wx_set_scan gregkh
2021-03-15 13:52 ` [PATCH 4.4 49/75] staging: comedi: addi_apci_1032: Fix endian problem for COS sample gregkh
2021-03-15 13:52 ` [PATCH 4.4 50/75] staging: comedi: addi_apci_1500: Fix endian problem for command sample gregkh
2021-03-15 13:52 ` [PATCH 4.4 51/75] staging: comedi: adv_pci1710: Fix endian problem for AI command data gregkh
2021-03-15 13:52 ` [PATCH 4.4 52/75] staging: comedi: das6402: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 53/75] staging: comedi: das800: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 54/75] staging: comedi: dmm32at: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 55/75] staging: comedi: me4000: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 56/75] staging: comedi: pcl711: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 57/75] staging: comedi: pcl818: " gregkh
2021-03-15 13:52 ` [PATCH 4.4 58/75] NFSv4.2: fix return value of _nfs4_get_security_label() gregkh
2021-03-15 13:52 ` [PATCH 4.4 59/75] block: rsxx: fix error return code of rsxx_pci_probe() gregkh
2021-03-15 13:52 ` [PATCH 4.4 60/75] prctl: fix PR_SET_MM_AUXV kernel stack leak gregkh
2021-03-15 13:52 ` [PATCH 4.4 61/75] alpha: add $(src)/ rather than $(obj)/ to make source file path gregkh
2021-03-15 13:52 ` [PATCH 4.4 62/75] alpha: merge build rules of division routines gregkh
2021-03-15 13:52 ` [PATCH 4.4 63/75] alpha: make short build log available for " gregkh
2021-03-15 13:52 ` [PATCH 4.4 64/75] alpha: Package string routines together gregkh
2021-03-15 13:52 ` [PATCH 4.4 65/75] alpha: move exports to actual definitions gregkh
2021-03-15 13:52 ` [PATCH 4.4 66/75] alpha: get rid of tail-zeroing in __copy_user() gregkh
2021-03-15 13:52 ` [PATCH 4.4 67/75] alpha: switch __copy_user() and __do_clean_user() to normal calling conventions gregkh
2021-03-15 13:52 ` [PATCH 4.4 68/75] powerpc/64s: Fix instruction encoding for lis in ppc_function_entry() gregkh
2021-03-15 13:52 ` [PATCH 4.4 69/75] media: hdpvr: Fix an error handling path in hdpvr_probe() gregkh
2021-03-15 13:52 ` [PATCH 4.4 70/75] KVM: arm64: Fix exclusive limit for IPA size gregkh
2021-03-15 13:52 ` [PATCH 4.4 71/75] iio: imu: adis16400: release allocated memory on failure gregkh
2021-03-15 13:52 ` [PATCH 4.4 72/75] iio: imu: adis16400: fix memory leak gregkh
2021-03-15 13:52 ` [PATCH 4.4 73/75] xen/events: reset affinity of 2-level event when tearing it down gregkh
2021-03-15 13:52 ` [PATCH 4.4 74/75] xen/events: dont unmask an event channel when an eoi is pending gregkh
2021-03-15 13:52 ` [PATCH 4.4 75/75] xen/events: avoid handling the same event on two cpus at the same time gregkh
2021-03-15 21:05 ` [PATCH 4.4 00/75] 4.4.262-rc1 review Pavel Machek
2021-03-15 21:29 ` Guenter Roeck
2021-03-15 22:57 ` Jason Self
2021-03-16 12:07 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210315135208.663967134@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dvhart@infradead.org \
--cc=heiko.carstens@de.ibm.com \
--cc=lee.jones@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
--cc=stli@linux.ibm.com \
--cc=sudipm.mukherjee@gmail.com \
--cc=tglx@linutronix.de \
--cc=zhengyejian1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox