stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Stefan Liebler <stli@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Darren Hart <dvhart@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 25/46] futex: Cure exit race
Date: Fri, 28 Dec 2018 12:52:19 +0100	[thread overview]
Message-ID: <20181228113126.253595860@linuxfoundation.org> (raw)
In-Reply-To: <20181228113124.971620049@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit da791a667536bf8322042e38ca85d55a78d3c273 upstream.

Stefan reported, that the glibc tst-robustpi4 test case fails
occasionally. That case creates the following race between
sys_exit() and sys_futex_lock_pi():

 CPU0				CPU1

 sys_exit()			sys_futex()
  do_exit()			 futex_lock_pi()
   exit_signals(tsk)		  No waiters:
    tsk->flags |= PF_EXITING;	  *uaddr == 0x00000PID
  mm_release(tsk)		  Set waiter bit
   exit_robust_list(tsk) {	  *uaddr = 0x80000PID;
      Set owner died		  attach_to_pi_owner() {
    *uaddr = 0xC0000000;	   tsk = get_task(PID);
   }				   if (!tsk->flags & PF_EXITING) {
  ...				     attach();
  tsk->flags |= PF_EXITPIDONE;	   } else {
				     if (!(tsk->flags & PF_EXITPIDONE))
				       return -EAGAIN;
				     return -ESRCH; <--- FAIL
				   }

ESRCH is returned all the way to user space, which triggers the glibc test
case assert. Returning ESRCH unconditionally is wrong here because the user
space value has been changed by the exiting task to 0xC0000000, i.e. the
FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
is a valid state and the kernel has to handle it, i.e. taking the futex.

Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
is set in the task which 'owns' the futex. If the value has changed, let
the kernel retry the operation, which includes all regular sanity checks
and correctly handles the FUTEX_OWNER_DIED case.

If it hasn't changed, then return ESRCH as there is no way to distinguish
this case from malfunctioning user space. This happens when the exiting
task did not have a robust list, the robust list was corrupted or the user
space value in the futex was simply bogus.

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/futex.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 63 insertions(+), 6 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1148,11 +1148,65 @@ out_error:
 	return ret;
 }
 
+static int handle_exit_race(u32 __user *uaddr, u32 uval,
+			    struct task_struct *tsk)
+{
+	u32 uval2;
+
+	/*
+	 * If PF_EXITPIDONE is not yet set, then try again.
+	 */
+	if (tsk && !(tsk->flags & PF_EXITPIDONE))
+		return -EAGAIN;
+
+	/*
+	 * Reread the user space value to handle the following situation:
+	 *
+	 * CPU0				CPU1
+	 *
+	 * sys_exit()			sys_futex()
+	 *  do_exit()			 futex_lock_pi()
+	 *                                futex_lock_pi_atomic()
+	 *   exit_signals(tsk)		    No waiters:
+	 *    tsk->flags |= PF_EXITING;	    *uaddr == 0x00000PID
+	 *  mm_release(tsk)		    Set waiter bit
+	 *   exit_robust_list(tsk) {	    *uaddr = 0x80000PID;
+	 *      Set owner died		    attach_to_pi_owner() {
+	 *    *uaddr = 0xC0000000;	     tsk = get_task(PID);
+	 *   }				     if (!tsk->flags & PF_EXITING) {
+	 *  ...				       attach();
+	 *  tsk->flags |= PF_EXITPIDONE;     } else {
+	 *				       if (!(tsk->flags & PF_EXITPIDONE))
+	 *				         return -EAGAIN;
+	 *				       return -ESRCH; <--- FAIL
+	 *				     }
+	 *
+	 * Returning ESRCH unconditionally is wrong here because the
+	 * user space value has been changed by the exiting task.
+	 *
+	 * The same logic applies to the case where the exiting task is
+	 * already gone.
+	 */
+	if (get_futex_value_locked(&uval2, uaddr))
+		return -EFAULT;
+
+	/* If the user space value has changed, try again. */
+	if (uval2 != uval)
+		return -EAGAIN;
+
+	/*
+	 * The exiting task did not have a robust list, the robust list was
+	 * corrupted or the user space value in *uaddr is simply bogus.
+	 * Give up and tell user space.
+	 */
+	return -ESRCH;
+}
+
 /*
  * Lookup the task for the TID provided from user space and attach to
  * it after doing proper sanity checks.
  */
-static int attach_to_pi_owner(u32 uval, union futex_key *key,
+static int attach_to_pi_owner(u32 __user *uaddr, u32 uval, union futex_key *key,
 			      struct futex_pi_state **ps)
 {
 	pid_t pid = uval & FUTEX_TID_MASK;
@@ -1162,12 +1216,15 @@ static int attach_to_pi_owner(u32 uval,
 	/*
 	 * We are the first waiter - try to look up the real owner and attach
 	 * the new pi_state to it, but bail out when TID = 0 [1]
+	 *
+	 * The !pid check is paranoid. None of the call sites should end up
+	 * with pid == 0, but better safe than sorry. Let the caller retry
 	 */
 	if (!pid)
-		return -ESRCH;
+		return -EAGAIN;
 	p = find_get_task_by_vpid(pid);
 	if (!p)
-		return -ESRCH;
+		return handle_exit_race(uaddr, uval, NULL);
 
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		put_task_struct(p);
@@ -1187,7 +1244,7 @@ static int attach_to_pi_owner(u32 uval,
 		 * set, we know that the task has finished the
 		 * cleanup:
 		 */
-		int ret = (p->flags & PF_EXITPIDONE) ? -ESRCH : -EAGAIN;
+		int ret = handle_exit_race(uaddr, uval, p);
 
 		raw_spin_unlock_irq(&p->pi_lock);
 		put_task_struct(p);
@@ -1244,7 +1301,7 @@ static int lookup_pi_state(u32 __user *u
 	 * We are the first waiter - try to look up the owner based on
 	 * @uval and attach to it.
 	 */
-	return attach_to_pi_owner(uval, key, ps);
+	return attach_to_pi_owner(uaddr, uval, key, ps);
 }
 
 static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
@@ -1352,7 +1409,7 @@ static int futex_lock_pi_atomic(u32 __us
 	 * attach to the owner. If that fails, no harm done, we only
 	 * set the FUTEX_WAITERS bit in the user space variable.
 	 */
-	return attach_to_pi_owner(uval, key, ps);
+	return attach_to_pi_owner(uaddr, newval, key, ps);
 }
 
 /**



  parent reply	other threads:[~2018-12-28 12:14 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-28 11:51 [PATCH 4.19 00/46] 4.19.13-stable review Greg Kroah-Hartman
2018-12-28 11:51 ` [PATCH 4.19 01/46] iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()" Greg Kroah-Hartman
2018-12-28 11:51 ` [PATCH 4.19 02/46] Revert "vfs: Allow userns root to call mknod on owned filesystems." Greg Kroah-Hartman
2018-12-28 11:51 ` [PATCH 4.19 03/46] USB: hso: Fix OOB memory access in hso_probe/hso_get_config_data Greg Kroah-Hartman
2018-12-28 11:51 ` [PATCH 4.19 04/46] xhci: Dont prevent USB2 bus suspend in state check intended for USB3 only Greg Kroah-Hartman
2018-12-28 11:51 ` [PATCH 4.19 05/46] USB: xhci: fix broken_suspend placement in struct xchi_hcd Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 06/46] USB: serial: option: add GosunCn ZTE WeLink ME3630 Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 07/46] USB: serial: option: add HP lt4132 Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 08/46] USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode) Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 09/46] USB: serial: option: add Fibocom NL668 series Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 10/46] USB: serial: option: add Telit LN940 series Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 11/46] ubifs: Handle re-linking of inodes correctly while recovery Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 12/46] scsi: t10-pi: Return correct ref tag when queue has no integrity profile Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 13/46] scsi: sd: use mempool for discard special page Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 14/46] mmc: core: Reset HPI enabled state during re-init and in case of errors Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 15/46] mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 16/46] mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 17/46] mmc: omap_hsmmc: fix DMA API warning Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 18/46] gpio: max7301: fix driver for use with CONFIG_VMAP_STACK Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 19/46] gpiolib-acpi: Only defer request_irq for GpioInt ACPI event handlers Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 20/46] posix-timers: Fix division by zero bug Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 21/46] KVM: X86: Fix NULL deref in vcpu_scan_ioapic Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 22/46] kvm: x86: Add AMDs EX_CFG to the list of ignored MSRs Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 23/46] KVM: Fix UAF in nested posted interrupt processing Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 24/46] Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels Greg Kroah-Hartman
2018-12-28 11:52 ` Greg Kroah-Hartman [this message]
2018-12-28 11:52 ` [PATCH 4.19 26/46] x86/mtrr: Dont copy uninitialized gentry fields back to userspace Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 27/46] x86/mm: Fix decoy address handling vs 32-bit builds Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 28/46] x86/vdso: Pass --eh-frame-hdr to the linker Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 29/46] x86/intel_rdt: Ensure a CPU remains online for the regions pseudo-locking sequence Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 30/46] panic: avoid deadlocks in re-entrant console drivers Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 31/46] mm: add mm_pxd_folded checks to pgtable_bytes accounting functions Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 32/46] mm: make the __PAGETABLE_PxD_FOLDED defines non-empty Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 33/46] mm: introduce mm_[p4d|pud|pmd]_folded Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 34/46] xfrm_user: fix freeing of xfrm states on acquire Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 35/46] rtlwifi: Fix leak of skb when processing C2H_BT_INFO Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 36/46] iwlwifi: mvm: dont send GEO_TX_POWER_LIMIT to old firmwares Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 37/46] Revert "mwifiex: restructure rx_reorder_tbl_lock usage" Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 38/46] iwlwifi: add new cards for 9560, 9462, 9461 and killer series Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 39/46] media: ov5640: Fix set format regression Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 40/46] mm, memory_hotplug: initialize struct pages for the full memory section Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 41/46] mm: thp: fix flags for pmd migration when split Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 42/46] mm, page_alloc: fix has_unmovable_pages for HugePages Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 43/46] mm: dont miss the last page because of round-off error Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 44/46] Input: elantech - disable elan-i2c for P52 and P72 Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 45/46] proc/sysctl: dont return ENOMEM on lookup when a table is unregistering Greg Kroah-Hartman
2018-12-28 11:52 ` [PATCH 4.19 46/46] drm/ioctl: Fix Spectre v1 vulnerabilities Greg Kroah-Hartman
2018-12-28 17:54 ` [PATCH 4.19 00/46] 4.19.13-stable review Dan Rue
2018-12-29 12:20   ` Greg Kroah-Hartman
2018-12-28 20:08 ` shuah
2018-12-29  9:55   ` Greg Kroah-Hartman
2018-12-28 21:29 ` Guenter Roeck
2018-12-29 12:20   ` Greg Kroah-Hartman
2018-12-29  9:01 ` Harsh Shandilya
2018-12-29 12:20   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181228113126.253595860@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dvhart@infradead.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=stli@linux.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).