public inbox for linux-kernel@vger.kernel.org
From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Steven Rostedt <srostedt@redhat.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>
Subject: [patch 12/36] futex: fix for futex_wait signal stack corruption
Date: Wed, 12 Dec 2007 22:34:34 -0800
Message-ID: <20071213063434.GM25301@kroah.com>
In-Reply-To: <20071213063308.GA25301@kroah.com>

[-- Attachment #1: futex-fix-for-futex_wait-signal-stack-corruption.patch --]
[-- Type: text/plain, Size: 7326 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------

From: Steven Rostedt <srostedt@redhat.com>

patch ce6bd420f43b28038a2c6e8fbb86ad24014727b6 in mainline.

David Holmes found a bug in the -rt tree with respect to
pthread_cond_timedwait. After trying his test program on the latest git
from mainline, I found the bug was there too.  His test program showed
that if one did a "Ctrl-Z" on a process that was in
pthread_cond_timedwait, and then did a "bg" on that process, the call
would return "-ETIMEDOUT", but early. That is, the timer would go off
early.

Looking into this, I found the source of the problem. And it is a rather
nasty bug at that.

Here's the relevant code from kernel/futex.c: (not in order in the file)

[...]
asmlinkage long sys_futex(u32 __user *uaddr, int op, u32 val,
                          struct timespec __user *utime, u32 __user *uaddr2,
                          u32 val3)
{
        struct timespec ts;
        ktime_t t, *tp = NULL;
        u32 val2 = 0;
        int cmd = op & FUTEX_CMD_MASK;

        if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI)) {
                if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
                        return -EFAULT;
                if (!timespec_valid(&ts))
                        return -EINVAL;

                t = timespec_to_ktime(ts);
                if (cmd == FUTEX_WAIT)
                        t = ktime_add(ktime_get(), t);
                tp = &t;
        }
[...]
        return do_futex(uaddr, op, val, tp, uaddr2, val2, val3);
}

[...]

long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
                u32 __user *uaddr2, u32 val2, u32 val3)
{
        int ret;
        int cmd = op & FUTEX_CMD_MASK;
        struct rw_semaphore *fshared = NULL;

        if (!(op & FUTEX_PRIVATE_FLAG))
                fshared = &current->mm->mmap_sem;

        switch (cmd) {
        case FUTEX_WAIT:
                ret = futex_wait(uaddr, fshared, val, timeout);

[...]

static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
                      u32 val, ktime_t *abs_time)
{
[...]
                struct restart_block *restart;
                restart = &current_thread_info()->restart_block;
                restart->fn = futex_wait_restart;
                restart->arg0 = (unsigned long)uaddr;
                restart->arg1 = (unsigned long)val;
                restart->arg2 = (unsigned long)abs_time;
                restart->arg3 = 0;
                if (fshared)
                        restart->arg3 |= ARG3_SHARED;
                return -ERESTART_RESTARTBLOCK;
[...]

static long futex_wait_restart(struct restart_block *restart)
{
        u32 __user *uaddr = (u32 __user *)restart->arg0;
        u32 val = (u32)restart->arg1;
        ktime_t *abs_time = (ktime_t *)restart->arg2;
        struct rw_semaphore *fshared = NULL;

        restart->fn = do_no_restart_syscall;
        if (restart->arg3 & ARG3_SHARED)
                fshared = &current->mm->mmap_sem;
        return (long)futex_wait(uaddr, fshared, val, abs_time);
}

So when futex_wait is interrupted by a signal, we break out of the
hrtimer code and set up our return from the signal. This code does not
return back to userspace, so we set up a RESTARTBLOCK.  The bug here is
that we save "abs_time", which is a pointer to the stack variable
"ktime_t t" from sys_futex.

sys_futex then returns and unwinds the stack before we get to call our
signal handler. On return from the signal we go to futex_wait_restart,
where we restore all the parameters for futex_wait and call it. But here
we have a problem: abs_time still points into the unwound sys_futex
stack frame, so it is no longer valid.

I verified this with print statements, and sure enough, what abs_time
was set to ends up being garbage when we get to futex_wait_restart.
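
To make the failure mode concrete, here is a minimal userspace sketch of
the same pattern. It is not kernel code: every demo_* name is
hypothetical, and a static variable stands in for the stack slot of
"ktime_t t" so that reusing the slot stays well-defined C. It only shows
why a saved pointer goes stale while a saved value does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the restart data; not the kernel layout. */
struct demo_restart {
	const int64_t *time_ptr;	/* buggy: pointer to the caller's slot */
	int64_t time_val;		/* fixed: the value itself */
};

/* Plays the role of the stack slot that held "ktime_t t" in sys_futex. */
static int64_t demo_frame_slot;

/* Plays sys_futex -> futex_wait being interrupted by a signal. */
static void demo_wait_interrupted(struct demo_restart *r, int64_t deadline)
{
	assert(r);
	demo_frame_slot = deadline;
	r->time_ptr = &demo_frame_slot;	/* like restart->arg2 = abs_time */
	r->time_val = demo_frame_slot;	/* like saving the 64-bit value */
}

/* Plays the signal handler reusing the stack memory after the unwind. */
static void demo_signal_handler_runs(void)
{
	demo_frame_slot = 0x5a5a5a5a;	/* arbitrary garbage */
}

/* Restart path that trusts the saved pointer (the bug). */
static int64_t demo_restart_buggy(const struct demo_restart *r)
{
	return *r->time_ptr;
}

/* Restart path that uses the saved value. */
static int64_t demo_restart_fixed(const struct demo_restart *r)
{
	return r->time_val;
}
```

Calling demo_wait_interrupted(), then demo_signal_handler_runs(), and
then both restart helpers shows the pointer-based path reading the
garbage while the value-based path still returns the original deadline.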

The solution (with input from Linus Torvalds) was to add a union to
the restart_block so that system calls can use the restart with their
own specific parameters.  This way the futex code now saves the time as
a 64-bit value in the restart block instead of saving a pointer to a
stack variable.
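
As a rough illustration of that layout, the sketch below mirrors the
shape of the union in plain userspace C11. The demo_* names are
hypothetical, uint32_t/uint64_t stand in for u32/u64, and the helper
functions are not part of the patch. One thing it makes visible is that
the time is stored by value in a 64-bit field, which an "unsigned long"
argument could not hold on a 32-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FLAGS_SHARED 1

/* Userspace mirror of the restart block shape; not the kernel struct. */
struct demo_restart_block {
	long (*fn)(struct demo_restart_block *);
	union {
		/* generic view, as before the change */
		struct {
			unsigned long arg0, arg1, arg2, arg3;
		};
		/* futex_wait view: typed fields sharing the same storage */
		struct {
			uint32_t *uaddr;
			uint32_t val;
			uint32_t flags;
			uint64_t time;
		} futex;
	};
};

/* Save the wait parameters by value, as the fixed futex_wait does. */
static void demo_save(struct demo_restart_block *rb, uint32_t *uaddr,
		      uint32_t val, uint64_t abs_time, int shared)
{
	assert(rb);
	rb->fn = NULL;
	rb->futex.uaddr = uaddr;
	rb->futex.val = val;
	rb->futex.time = abs_time;
	rb->futex.flags = 0;
	if (shared)
		rb->futex.flags |= DEMO_FLAGS_SHARED;
}

/* Read the saved time back into a local, as the restart handler does. */
static uint64_t demo_saved_time(const struct demo_restart_block *rb)
{
	return rb->futex.time;
}
```

The anonymous struct inside the union is what lets existing users keep
writing restart->arg0 unchanged while futex code uses restart->futex.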

Note: I'm a bit nervous about adding "linux/types.h" and using u32 and
u64 in thread_info.h, when there's an #ifdef __KERNEL__ just below that.
I'm not sure what that is there for.  If this turns out to be a problem,
I've tested this using "unsigned int" for u32 and "unsigned long long"
for u64 and it worked just the same. I'm using u32 and u64 just to be
consistent with what the futex code uses.
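
For anyone who wants to check the fallback types on their own build, a
small C11 sketch like the following would do it at compile time. The
demo_u32/demo_u64 typedefs are hypothetical names for this example only;
the assumption being checked is that "unsigned int" is 32 bits wide and
"unsigned long long" is 64 bits wide on the target.

```c
#include <assert.h>	/* static_assert (C11) */
#include <limits.h>	/* CHAR_BIT */

/* Hypothetical fallback typedefs, standing in for u32 and u64. */
typedef unsigned int demo_u32;
typedef unsigned long long demo_u64;

/* Fail the build if the fallback types do not have the expected width. */
static_assert(sizeof(demo_u32) * CHAR_BIT == 32,
	      "unsigned int is not 32 bits on this target");
static_assert(sizeof(demo_u64) * CHAR_BIT == 64,
	      "unsigned long long is not 64 bits on this target");

/* Run-time form of the same check, handy in a test harness. */
static int demo_widths_match(void)
{
	return sizeof(demo_u32) * CHAR_BIT == 32 &&
	       sizeof(demo_u64) * CHAR_BIT == 64;
}
```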

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/thread_info.h |   17 +++++++++++++++--
 kernel/futex.c              |   25 +++++++++++++------------
 2 files changed, 28 insertions(+), 14 deletions(-)

--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -7,12 +7,25 @@
 #ifndef _LINUX_THREAD_INFO_H
 #define _LINUX_THREAD_INFO_H
 
+#include <linux/types.h>
+
 /*
- * System call restart block. 
+ * System call restart block.
  */
 struct restart_block {
 	long (*fn)(struct restart_block *);
-	unsigned long arg0, arg1, arg2, arg3;
+	union {
+		struct {
+			unsigned long arg0, arg1, arg2, arg3;
+		};
+		/* For futex_wait */
+		struct {
+			u32 *uaddr;
+			u32 val;
+			u32 flags;
+			u64 time;
+		} futex;
+	};
 };
 
 extern long do_no_restart_syscall(struct restart_block *parm);
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1129,9 +1129,9 @@ static int fixup_pi_state_owner(u32 __us
 
 /*
  * In case we must use restart_block to restart a futex_wait,
- * we encode in the 'arg3' shared capability
+ * we encode in the 'flags' shared capability
  */
-#define ARG3_SHARED  1
+#define FLAGS_SHARED  1
 
 static long futex_wait_restart(struct restart_block *restart);
 static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
@@ -1272,12 +1272,13 @@ static int futex_wait(u32 __user *uaddr,
 		struct restart_block *restart;
 		restart = &current_thread_info()->restart_block;
 		restart->fn = futex_wait_restart;
-		restart->arg0 = (unsigned long)uaddr;
-		restart->arg1 = (unsigned long)val;
-		restart->arg2 = (unsigned long)abs_time;
-		restart->arg3 = 0;
+		restart->futex.uaddr = (u32 *)uaddr;
+		restart->futex.val = val;
+		restart->futex.time = abs_time->tv64;
+		restart->futex.flags = 0;
+
 		if (fshared)
-			restart->arg3 |= ARG3_SHARED;
+			restart->futex.flags |= FLAGS_SHARED;
 		return -ERESTART_RESTARTBLOCK;
 	}
 
@@ -1293,15 +1294,15 @@ static int futex_wait(u32 __user *uaddr,
 
 static long futex_wait_restart(struct restart_block *restart)
 {
-	u32 __user *uaddr = (u32 __user *)restart->arg0;
-	u32 val = (u32)restart->arg1;
-	ktime_t *abs_time = (ktime_t *)restart->arg2;
+	u32 __user *uaddr = (u32 __user *)restart->futex.uaddr;
 	struct rw_semaphore *fshared = NULL;
+	ktime_t t;
 
+	t.tv64 = restart->futex.time;
 	restart->fn = do_no_restart_syscall;
-	if (restart->arg3 & ARG3_SHARED)
+	if (restart->futex.flags & FLAGS_SHARED)
 		fshared = &current->mm->mmap_sem;
-	return (long)futex_wait(uaddr, fshared, val, abs_time);
+	return (long)futex_wait(uaddr, fshared, restart->futex.val, &t);
 }
 
 

-- 
