public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [2.6.2] Badness in futex_wait revisited
@ 2004-02-13 21:13 Dirk Morris
  2004-02-16 11:42 ` Rusty Russell
  0 siblings, 1 reply; 14+ messages in thread
From: Dirk Morris @ 2004-02-13 21:13 UTC (permalink / raw)
  To: linux-kernel

In reference to this previous post:
http://www.ussg.iu.edu/hypermail/linux/kernel/0401.1/1966.html

I also get:
Feb 13 11:43:52 timmy kernel: Call Trace:
Feb 13 11:43:52 timmy kernel:  [<c01387e1>] futex_wait+0x191/0x1a0
Feb 13 11:43:52 timmy kernel:  [<c011e9e0>] default_wake_function+0x0/0x20
Feb 13 11:43:52 timmy kernel:  [<c011e9e0>] default_wake_function+0x0/0x20
Feb 13 11:43:52 timmy kernel:  [<c0138abb>] do_futex+0x6b/0x80
Feb 13 11:43:52 timmy kernel:  [<c0138be6>] sys_futex+0x116/0x130
Feb 13 11:43:52 timmy kernel:  [<c010959f>] syscall_call+0x7/0xb

I get this in 2.6.1 and 2.6.2.
In userland a call to sem_wait returns with -1, and errno = -EINTR

I have been unable to write a simple test case that shows this behavior, 
but I
can replicate it consistently in a fairly complicated manner.

Interestingly, I can *not* get this to happen in debian stable:
~ # dpkg -l | grep ' libc6 | gcc '
ii  gcc            2.95.4-14      The GNU C compiler.
ii  libc6          2.2.5-11.5     GNU C Library: Shared libraries and 
Timezone

But it happens everytime in debian testing (same kernel, same machine):
~ # dpkg -l | grep ' libc6 | gcc '
ii  gcc            3.3.2-2        The GNU C compiler
ii  libc6          2.3.2.ds1-11   GNU C Library: Shared libraries and 
Timezone

I have tried this on two machines (both athlons), both show the same 
behavior.

I also tried to apply the patch previously posted by Andrew to 2.6.2,
and it gives me:

Feb 13 10:00:29 timmy kernel: Badness in try_to_wake_up at 
kernel/sched.c:662
Feb 13 10:00:29 timmy kernel: Call Trace:
Feb 13 10:00:29 timmy kernel:  [<c011d3d8>] try_to_wake_up+0x298/0x2a0
Feb 13 10:00:29 timmy kernel:  [<c011ed4a>] __wake_up_common+0x3a/0x60
Feb 13 10:00:29 timmy kernel:  [<c011edaf>] __wake_up+0x3f/0x70
Feb 13 10:00:29 timmy kernel:  [<c0138fb0>] wake_futex+0x30/0x60
Feb 13 10:00:29 timmy kernel:  [<c01390d3>] futex_wake+0xf3/0x110
Feb 13 10:00:29 timmy kernel:  [<c012d040>] do_timer+0xc0/0xd0
Feb 13 10:00:29 timmy kernel:  [<c013995e>] do_futex+0x7e/0x80
Feb 13 10:00:29 timmy kernel:  [<c0139a84>] sys_futex+0x124/0x140
Feb 13 10:00:29 timmy kernel:  [<c0127767>] sys_gettimeofday+0x67/0xe0
Feb 13 10:00:29 timmy kernel:  [<c010962b>] syscall_call+0x7/0xb
Feb 13 10:00:29 timmy kernel:

The patch didnt cleanly fit 2.6.2, so I might have applied it incorrectly.

Please let me know if I can give anymore information, or try another patch.
Thanks,

-Dirk Morris



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-02-13 21:13 [2.6.2] Badness in futex_wait revisited Dirk Morris
@ 2004-02-16 11:42 ` Rusty Russell
  0 siblings, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2004-02-16 11:42 UTC (permalink / raw)
  To: Dirk Morris; +Cc: linux-kernel

On Fri, 13 Feb 2004 13:13:35 -0800
Dirk Morris <dmorris@metavize.com> wrote:

> In reference to this previous post:
> http://www.ussg.iu.edu/hypermail/linux/kernel/0401.1/1966.html
> 
> I also get:
> Feb 13 11:43:52 timmy kernel: Call Trace:
> Feb 13 11:43:52 timmy kernel:  [<c01387e1>] futex_wait+0x191/0x1a0
> Feb 13 11:43:52 timmy kernel:  [<c011e9e0>] default_wake_function+0x0/0x20
> Feb 13 11:43:52 timmy kernel:  [<c011e9e0>] default_wake_function+0x0/0x20
> Feb 13 11:43:52 timmy kernel:  [<c0138abb>] do_futex+0x6b/0x80
> Feb 13 11:43:52 timmy kernel:  [<c0138be6>] sys_futex+0x116/0x130
> Feb 13 11:43:52 timmy kernel:  [<c010959f>] syscall_call+0x7/0xb
> 
> I get this in 2.6.1 and 2.6.2.
> In userland a call to sem_wait returns with -1, and errno = -EINTR

Yes...

	Please send your config.  What's happening at the time?

(Andrew's patch was buggy, I fixed it and can send you an update).

Rusty.
-- 
   there are those who do and those who hang on and you don't see too
   many doers quoting their contemporaries.  -- Larry McVoy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
       [not found] <40311703.8070309@metavize.com>
@ 2004-02-17  4:39 ` Rusty Russell
  2004-02-17  5:27   ` Andrew Morton
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Rusty Russell @ 2004-02-17  4:39 UTC (permalink / raw)
  To: Dirk Morris; +Cc: akpm, linux-kernel

In message <40311703.8070309@metavize.com> you write:
> Please send me the patch, and I'll give you some updated information.

Here it is: Andrew's patch updated and fixed.

Thanks!
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: Who's Spuriously Waking Futexes?
Author: Andrew Morton, Rusty Russell
Status: Tested on 2.6.3-bk1

Someone is triggering the WARN_ON() in futex.c.  We know that software
suspend could do it, in theory.  But noone else should be.

This code adds a PF_FUTEX_DEBUG flag, which is set in the futex code
when we sleep, and also when we wake up.  If a task with
PF_FUTEX_DEBUG is woken by a task without PF_FUTEX_DEBUG, we have
found our culprit.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/include/linux/sched.h .6375-linux-2.6.3-rc3-bk1.updated/include/linux/sched.h
--- .6375-linux-2.6.3-rc3-bk1/include/linux/sched.h	2004-02-15 18:17:21.000000000 +1100
+++ .6375-linux-2.6.3-rc3-bk1.updated/include/linux/sched.h	2004-02-17 12:01:47.000000000 +1100
@@ -500,6 +500,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
 #define PF_SWAPOFF	0x00080000	/* I am in swapoff */
 #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
 #define PF_SYNCWRITE	0x00200000	/* I am doing a sync write */
+#define PF_FUTEX_DEBUG	0x00400000
 
 #ifdef CONFIG_SMP
 extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/kernel/futex.c .6375-linux-2.6.3-rc3-bk1.updated/kernel/futex.c
--- .6375-linux-2.6.3-rc3-bk1/kernel/futex.c	2004-02-15 18:17:21.000000000 +1100
+++ .6375-linux-2.6.3-rc3-bk1.updated/kernel/futex.c	2004-02-17 12:01:47.000000000 +1100
@@ -269,7 +269,11 @@ static void wake_futex(struct futex_q *q
 	 * The lock in wake_up_all() is a crucial memory barrier after the
 	 * list_del_init() and also before assigning to q->lock_ptr.
 	 */
+	
+	current->flags |= PF_FUTEX_DEBUG;
 	wake_up_all(&q->waiters);
+	current->flags &= ~PF_FUTEX_DEBUG;
+
 	/*
 	 * The waiting task can free the futex_q as soon as this is written,
 	 * without taking any locks.  This must come last.
@@ -490,8 +494,11 @@ static int futex_wait(unsigned long uadd
 	 * !list_empty() is safe here without any lock.
 	 * q.lock_ptr != 0 is not safe, because of ordering against wakeup.
 	 */
-	if (likely(!list_empty(&q.list)))
+	if (likely(!list_empty(&q.list))) {
+		current->flags |= PF_FUTEX_DEBUG;
 		time = schedule_timeout(time);
+		current->flags &= ~PF_FUTEX_DEBUG;
+	}
 	__set_current_state(TASK_RUNNING);
 
 	/*
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/kernel/sched.c .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c
--- .6375-linux-2.6.3-rc3-bk1/kernel/sched.c	2004-02-15 18:17:22.000000000 +1100
+++ .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c	2004-02-17 12:02:24.000000000 +1100
@@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
 	long old_state;
 	runqueue_t *rq;
 
+	if ((p->flags & PF_FUTEX_DEBUG)
+	    && !(current->flags & PF_FUTEX_DEBUG)) {
+		printk("%s %i waking %s: %i %i\n",
+		       current->comm, (int)in_interrupt(),
+		       p->comm, p->tgid, p->pid);
+		WARN_ON(1);
+	}
+
 repeat_lock_task:
 	rq = task_rq_lock(p, &flags);
 	old_state = p->state;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-02-17  4:39 ` Rusty Russell
@ 2004-02-17  5:27   ` Andrew Morton
  2004-02-18  4:14     ` Rusty Russell
  2004-02-17 19:55   ` Dirk Morris
  2004-03-31 16:56   ` Jamie Lokier
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2004-02-17  5:27 UTC (permalink / raw)
  To: Rusty Russell; +Cc: dmorris, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> wrote:
>
> -	if (likely(!list_empty(&q.list)))
>  +	if (likely(!list_empty(&q.list))) {
>  +		current->flags |= PF_FUTEX_DEBUG;
>   		time = schedule_timeout(time);
>  +		current->flags &= ~PF_FUTEX_DEBUG;
>  +	}
>   	__set_current_state(TASK_RUNNING);
>   
>   	/*
>  diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/kernel/sched.c .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c
>  --- .6375-linux-2.6.3-rc3-bk1/kernel/sched.c	2004-02-15 18:17:22.000000000 +1100
>  +++ .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c	2004-02-17 12:02:24.000000000 +1100
>  @@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
>   	long old_state;
>   	runqueue_t *rq;
>   
>  +	if ((p->flags & PF_FUTEX_DEBUG)
>  +	    && !(current->flags & PF_FUTEX_DEBUG)) {
>  +		printk("%s %i waking %s: %i %i\n",
>  +		       current->comm, (int)in_interrupt(),
>  +		       p->comm, p->tgid, p->pid);
>  +		WARN_ON(1);
>  +	}
>  +

If the schedule_timeout() expires then this wakeup will occur from within
the timer interrupt - current could point at any old thing.  We will get
bogus warnings.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-02-17  4:39 ` Rusty Russell
  2004-02-17  5:27   ` Andrew Morton
@ 2004-02-17 19:55   ` Dirk Morris
  2004-03-31 16:56   ` Jamie Lokier
  2 siblings, 0 replies; 14+ messages in thread
From: Dirk Morris @ 2004-02-17 19:55 UTC (permalink / raw)
  To: Rusty Russell; +Cc: akpm, linux-kernel

I installed this patch. (which gives be lots of Badness in sched.c errors)

Here is the dump of what I believe is the offending awakening:

Feb 17 10:49:09 timmy kernel: iptables 0 waking foobar: 898 898
Feb 17 10:49:09 timmy kernel: Badness in try_to_wake_up at 
kernel/sched.c:666
Feb 17 10:49:09 timmy kernel: Call Trace:
Feb 17 10:49:09 timmy kernel:  [<c0117e79>] try_to_wake_up+0x1b9/0x1c0
Feb 17 10:49:09 timmy kernel:  [<c0117eba>] wake_up_state+0x1a/0x20
Feb 17 10:49:09 timmy kernel:  [<c0126358>] do_notify_parent+0x3f8/0x530
Feb 17 10:49:09 timmy kernel:  [<c013fb03>] unmap_page_range+0x43/0x70
Feb 17 10:49:09 timmy kernel:  [<c014ede9>] __fput+0x99/0xf0
Feb 17 10:49:09 timmy kernel:  [<c011d831>] exit_notify+0x211/0x750
Feb 17 10:49:09 timmy kernel:  [<c011d31e>] put_files_struct+0x8e/0xc0
Feb 17 10:49:09 timmy kernel:  [<c011df0c>] do_exit+0x19c/0x370
Feb 17 10:49:09 timmy kernel:  [<c011e113>] sys_exit+0x13/0x20
Feb 17 10:49:09 timmy kernel:  [<c010922b>] syscall_call+0x7/0xb

foobar makes a system("iptables -t nat -L") like call from one pthread.
Around this time... Some other thread (or possibly the same one?) gets 
an EINTR from sem_wait.

I am still unable to replicate this in a simple test program.
It also seems that the Badness in futex_wait message is only printed 
some of the time.

This is with SMP and preempt off. (again, doesnt happen in old 2.2.x libc)

I also have pages worth of other taking debug messages if anyone would 
like them.
Let me know if I can help further.

-Dirk

Rusty Russell wrote:

>In message <40311703.8070309@metavize.com> you write:
>  
>
>>Please send me the patch, and I'll give you some updated information.
>>    
>>
>
>Here it is: Andrew's patch updated and fixed.
>
>Thanks!
>Rusty.
>--
>  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
>
>Name: Who's Spuriously Waking Futexes?
>Author: Andrew Morton, Rusty Russell
>Status: Tested on 2.6.3-bk1
>
>Someone is triggering the WARN_ON() in futex.c.  We know that software
>suspend could do it, in theory.  But noone else should be.
>
>This code adds a PF_FUTEX_DEBUG flag, which is set in the futex code
>when we sleep, and also when we wake up.  If a task with
>PF_FUTEX_DEBUG is woken by a task without PF_FUTEX_DEBUG, we have
>found our culprit.
>
>diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/include/linux/sched.h .6375-linux-2.6.3-rc3-bk1.updated/include/linux/sched.h
>--- .6375-linux-2.6.3-rc3-bk1/include/linux/sched.h	2004-02-15 18:17:21.000000000 +1100
>+++ .6375-linux-2.6.3-rc3-bk1.updated/include/linux/sched.h	2004-02-17 12:01:47.000000000 +1100
>@@ -500,6 +500,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
> #define PF_SWAPOFF	0x00080000	/* I am in swapoff */
> #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
> #define PF_SYNCWRITE	0x00200000	/* I am doing a sync write */
>+#define PF_FUTEX_DEBUG	0x00400000
> 
> #ifdef CONFIG_SMP
> extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
>diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/kernel/futex.c .6375-linux-2.6.3-rc3-bk1.updated/kernel/futex.c
>--- .6375-linux-2.6.3-rc3-bk1/kernel/futex.c	2004-02-15 18:17:21.000000000 +1100
>+++ .6375-linux-2.6.3-rc3-bk1.updated/kernel/futex.c	2004-02-17 12:01:47.000000000 +1100
>@@ -269,7 +269,11 @@ static void wake_futex(struct futex_q *q
> 	 * The lock in wake_up_all() is a crucial memory barrier after the
> 	 * list_del_init() and also before assigning to q->lock_ptr.
> 	 */
>+	
>+	current->flags |= PF_FUTEX_DEBUG;
> 	wake_up_all(&q->waiters);
>+	current->flags &= ~PF_FUTEX_DEBUG;
>+
> 	/*
> 	 * The waiting task can free the futex_q as soon as this is written,
> 	 * without taking any locks.  This must come last.
>@@ -490,8 +494,11 @@ static int futex_wait(unsigned long uadd
> 	 * !list_empty() is safe here without any lock.
> 	 * q.lock_ptr != 0 is not safe, because of ordering against wakeup.
> 	 */
>-	if (likely(!list_empty(&q.list)))
>+	if (likely(!list_empty(&q.list))) {
>+		current->flags |= PF_FUTEX_DEBUG;
> 		time = schedule_timeout(time);
>+		current->flags &= ~PF_FUTEX_DEBUG;
>+	}
> 	__set_current_state(TASK_RUNNING);
> 
> 	/*
>diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6375-linux-2.6.3-rc3-bk1/kernel/sched.c .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c
>--- .6375-linux-2.6.3-rc3-bk1/kernel/sched.c	2004-02-15 18:17:22.000000000 +1100
>+++ .6375-linux-2.6.3-rc3-bk1.updated/kernel/sched.c	2004-02-17 12:02:24.000000000 +1100
>@@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
> 	long old_state;
> 	runqueue_t *rq;
> 
>+	if ((p->flags & PF_FUTEX_DEBUG)
>+	    && !(current->flags & PF_FUTEX_DEBUG)) {
>+		printk("%s %i waking %s: %i %i\n",
>+		       current->comm, (int)in_interrupt(),
>+		       p->comm, p->tgid, p->pid);
>+		WARN_ON(1);
>+	}
>+
> repeat_lock_task:
> 	rq = task_rq_lock(p, &flags);
> 	old_state = p->state;
>  
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-02-17  5:27   ` Andrew Morton
@ 2004-02-18  4:14     ` Rusty Russell
  0 siblings, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2004-02-18  4:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: dmorris, linux-kernel

In message <20040216212758.437af444.akpm@osdl.org> you write:
> If the schedule_timeout() expires then this wakeup will occur from within
> the timer interrupt - current could point at any old thing.  We will get
> bogus warnings.

Yes.  How about this:

Name: Who's Spuriously Waking Futexes?
Author: Andrew Morton, Rusty Russell
Status: Experimental

Someone is triggering the WARN_ON() in futex.c.  We know that software
suspend could do it, in theory.  But noone else should be.

This code adds a PF_FUTEX_DEBUG flag, which is set in the futex code
when we sleep, and also when we wake up.  If a task with
PF_FUTEX_DEBUG is woken by a task without PF_FUTEX_DEBUG, we have
found our culprit.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/include/linux/sched.h .31103-linux-2.6.3-rc4.updated/include/linux/sched.h
--- .31103-linux-2.6.3-rc4/include/linux/sched.h	2004-02-17 17:54:03.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/include/linux/sched.h	2004-02-18 14:55:05.000000000 +1100
@@ -500,6 +500,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
 #define PF_SWAPOFF	0x00080000	/* I am in swapoff */
 #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
 #define PF_SYNCWRITE	0x00200000	/* I am doing a sync write */
+#define PF_FUTEX_DEBUG	0x00400000
 
 #ifdef CONFIG_SMP
 extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/futex.c .31103-linux-2.6.3-rc4.updated/kernel/futex.c
--- .31103-linux-2.6.3-rc4/kernel/futex.c	2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/futex.c	2004-02-18 15:09:15.000000000 +1100
@@ -269,7 +269,11 @@ static void wake_futex(struct futex_q *q
 	 * The lock in wake_up_all() is a crucial memory barrier after the
 	 * list_del_init() and also before assigning to q->lock_ptr.
 	 */
+	
+	current->flags |= PF_FUTEX_DEBUG;
 	wake_up_all(&q->waiters);
+	current->flags &= ~PF_FUTEX_DEBUG;
+
 	/*
 	 * The waiting task can free the futex_q as soon as this is written,
 	 * without taking any locks.  This must come last.
@@ -490,8 +494,11 @@ static int futex_wait(unsigned long uadd
 	 * !list_empty() is safe here without any lock.
 	 * q.lock_ptr != 0 is not safe, because of ordering against wakeup.
 	 */
-	if (likely(!list_empty(&q.list)))
+	if (likely(!list_empty(&q.list))) {
+		current->flags |= PF_FUTEX_DEBUG;
 		time = schedule_timeout(time);
+		current->flags &= ~PF_FUTEX_DEBUG;
+	}
 	__set_current_state(TASK_RUNNING);
 
 	/*
@@ -505,7 +512,11 @@ static int futex_wait(unsigned long uadd
 	if (time == 0)
 		return -ETIMEDOUT;
 	/* A spurious wakeup should never happen. */
-	WARN_ON(!signal_pending(current));
+	if (!signal_pending(current)) {
+		printk("futex_wait woken: %lu %i %lu\n",
+		       uaddr, val, time);
+		WARN_ON(1);
+	}
 	return -EINTR;
 
  out_unqueue:
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/sched.c .31103-linux-2.6.3-rc4.updated/kernel/sched.c
--- .31103-linux-2.6.3-rc4/kernel/sched.c	2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/sched.c	2004-02-18 14:55:05.000000000 +1100
@@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
 	long old_state;
 	runqueue_t *rq;
 
+	if ((p->flags & PF_FUTEX_DEBUG)
+	    && !(current->flags & PF_FUTEX_DEBUG)) {
+		printk("%s %i waking %s: %i %i\n",
+		       current->comm, (int)in_interrupt(),
+		       p->comm, p->tgid, p->pid);
+		WARN_ON(1);
+	}
+
 repeat_lock_task:
 	rq = task_rq_lock(p, &flags);
 	old_state = p->state;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/timer.c .31103-linux-2.6.3-rc4.updated/kernel/timer.c
--- .31103-linux-2.6.3-rc4/kernel/timer.c	2004-02-04 15:39:15.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/timer.c	2004-02-18 14:55:05.000000000 +1100
@@ -971,6 +971,13 @@ static void process_timeout(unsigned lon
 	wake_up_process((task_t *)__data);
 }
 
+static void futex_timeout(unsigned long __data)
+{
+	current->flags |= PF_FUTEX_DEBUG;
+	wake_up_process((task_t *)__data);
+	current->flags &= ~PF_FUTEX_DEBUG;
+}
+
 /**
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
@@ -1037,7 +1044,10 @@ signed long schedule_timeout(signed long
 	init_timer(&timer);
 	timer.expires = expire;
 	timer.data = (unsigned long) current;
-	timer.function = process_timeout;
+	if (current->flags & PF_FUTEX_DEBUG)
+		timer.function = futex_timeout;
+	else
+		timer.function = process_timeout;
 
 	add_timer(&timer);
 	schedule();


--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-02-17  4:39 ` Rusty Russell
  2004-02-17  5:27   ` Andrew Morton
  2004-02-17 19:55   ` Dirk Morris
@ 2004-03-31 16:56   ` Jamie Lokier
  2004-03-31 17:38     ` Dirk Morris
  2004-04-01  1:57     ` Rusty Russell
  2 siblings, 2 replies; 14+ messages in thread
From: Jamie Lokier @ 2004-03-31 16:56 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Dirk Morris, akpm, linux-kernel

Was the badness in futex_wait problem ever resolved?

-- Jamie

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-03-31 16:56   ` Jamie Lokier
@ 2004-03-31 17:38     ` Dirk Morris
  2004-03-31 18:32       ` Jamie Lokier
  2004-04-01  1:57     ` Rusty Russell
  1 sibling, 1 reply; 14+ messages in thread
From: Dirk Morris @ 2004-03-31 17:38 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Rusty Russell, akpm, linux-kernel

No, I still get them in 2.6.4

To my knowledge, they only happen when one thread is blocked in a system 
call
and another thread makes a system("foo bar") call.
The blocked thread will return with EINTR.
I see this usually in sem_wait, but also in epoll_wait sometimes.

Is it possible system() is causing the wrong process to get woken up?

Let me know if you need any further information, I can reproduce it 
consistently.

Jamie Lokier wrote:

>Was the badness in futex_wait problem ever resolved?
>
>-- Jamie
>  
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-03-31 17:38     ` Dirk Morris
@ 2004-03-31 18:32       ` Jamie Lokier
  2004-03-31 18:59         ` Dirk Morris
  0 siblings, 1 reply; 14+ messages in thread
From: Jamie Lokier @ 2004-03-31 18:32 UTC (permalink / raw)
  To: Dirk Morris; +Cc: Rusty Russell, akpm, linux-kernel

Dirk Morris wrote:
> Let me know if you need any further information, I can reproduce it 
> consistently.

If you have a small test program (or pair of programs) that
consistently triggers this message on any machine running 2.6.4, that
would be very helpful indeed.

-- Jamie

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-03-31 18:32       ` Jamie Lokier
@ 2004-03-31 18:59         ` Dirk Morris
  2004-04-01  2:16           ` Rusty Russell
  0 siblings, 1 reply; 14+ messages in thread
From: Dirk Morris @ 2004-03-31 18:59 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Rusty Russell, akpm, linux-kernel

Jamie Lokier wrote:

>If you have a small test program (or pair of programs) that
>consistently triggers this message on any machine running 2.6.4, that
>would be very helpful indeed.
>  
>
Here ya go. :)

* $Id: foo.c,v 1.00 2004/03/31 10:51:19 dmorris Exp $ */
/* gcc foo.c -o foo -pthread */
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <semaphore.h>

sem_t sem;

void* make_system_call (void* arg)
{
    while(1) {
        sleep(1);
        system("/bin/true");
    }
    return NULL;
}

int main(void)
{
    pthread_t id;
    sem_init(&sem,0,0);
    pthread_create(&id,NULL,make_system_call,NULL);
    while (sem_wait(&sem)<0)
        perror("sem_wait");
    return 0;
}



Output for me is :
~ # uname 
-a                                                                                        
[dmorris @ gobbles]
Linux gobbles 2.6.4 #2 SMP Mon Mar 29 17:15:08 PST 2004 i686 GNU/Linux
~ # 
./foo                                                                                           
[dmorris @ gobbles]
sem_wait: Interrupted system call
sem_wait: Interrupted system call
sem_wait: Interrupted system call
sem_wait: Interrupted system call
...


meanwhile kern.log spits out:
Mar 30 16:40:23 gobbles kernel: Badness in futex_wait at kernel/futex.c:508
Mar 30 16:40:23 gobbles kernel: Call Trace:
Mar 30 16:40:23 gobbles kernel:  [<c0139a28>] futex_wait+0x1a8/0x1c0
Mar 30 16:40:23 gobbles kernel:  [<c011efc0>] default_wake_function+0x0/0x20
Mar 30 16:40:23 gobbles kernel:  [<c011efc0>] default_wake_function+0x0/0x20
Mar 30 16:40:23 gobbles kernel:  [<c012d020>] do_timer+0xc0/0xd0
Mar 30 16:40:23 gobbles kernel:  [<c0139d30>] do_futex+0x70/0x80
Mar 30 16:40:23 gobbles kernel:  [<c0139e64>] sys_futex+0x124/0x140
Mar 30 16:40:23 gobbles kernel:  [<c015ef11>] sys_write+0x61/0x70
Mar 30 16:40:23 gobbles kernel:  [<c01095db>] syscall_call+0x7/0xb
Mar 30 16:40:23 gobbles kernel:


I can do this on multiple machines, so I won't include all the box info.
If you need it let me know.

Thanks! :)
-Dirk




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-03-31 16:56   ` Jamie Lokier
  2004-03-31 17:38     ` Dirk Morris
@ 2004-04-01  1:57     ` Rusty Russell
  1 sibling, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2004-04-01  1:57 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Dirk Morris, Andrew Morton, lkml - Kernel Mailing List

On Thu, 2004-04-01 at 02:56, Jamie Lokier wrote:
> Was the badness in futex_wait problem ever resolved?

No: Andrew misapplied the patches for ages and ended up ditching it.  We
tried a different version but there was a bug which meant real wakups
triggered it...

Here is the current version...

Name: Who's Spuriously Waking Futexes?
Author: Andrew Morton, Rusty Russell
Status: Tested on 2.6.3-bk1

Someone is triggering the WARN_ON() in futex.c.  We know that software
suspend could do it, in theory.  But noone else should be.

This code adds a PF_FUTEX_DEBUG flag, which is set in the futex code
when we sleep, and also when we wake up.  If a task with
PF_FUTEX_DEBUG is woken by a task without PF_FUTEX_DEBUG, we have
found our culprit.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/include/linux/sched.h .31103-linux-2.6.3-rc4.updated/include/linux/sched.h
--- .31103-linux-2.6.3-rc4/include/linux/sched.h	2004-02-17 17:54:03.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/include/linux/sched.h	2004-02-18 14:55:05.000000000 +1100
@@ -500,6 +500,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
 #define PF_SWAPOFF	0x00080000	/* I am in swapoff */
 #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
 #define PF_SYNCWRITE	0x00200000	/* I am doing a sync write */
+#define PF_FUTEX_DEBUG	0x00400000
 
 #ifdef CONFIG_SMP
 extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/futex.c .31103-linux-2.6.3-rc4.updated/kernel/futex.c
--- .31103-linux-2.6.3-rc4/kernel/futex.c	2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/futex.c	2004-02-18 15:09:15.000000000 +1100
@@ -269,7 +269,11 @@ static void wake_futex(struct futex_q *q
 	 * The lock in wake_up_all() is a crucial memory barrier after the
 	 * list_del_init() and also before assigning to q->lock_ptr.
 	 */
+	
+	current->flags |= PF_FUTEX_DEBUG;
 	wake_up_all(&q->waiters);
+	current->flags &= ~PF_FUTEX_DEBUG;
+
 	/*
 	 * The waiting task can free the futex_q as soon as this is written,
 	 * without taking any locks.  This must come last.
@@ -490,8 +494,11 @@ static int futex_wait(unsigned long uadd
 	 * !list_empty() is safe here without any lock.
 	 * q.lock_ptr != 0 is not safe, because of ordering against wakeup.
 	 */
-	if (likely(!list_empty(&q.list)))
+	if (likely(!list_empty(&q.list))) {
+		current->flags |= PF_FUTEX_DEBUG;
 		time = schedule_timeout(time);
+		current->flags &= ~PF_FUTEX_DEBUG;
+	}
 	__set_current_state(TASK_RUNNING);
 
 	/*
@@ -505,7 +512,11 @@ static int futex_wait(unsigned long uadd
 	if (time == 0)
 		return -ETIMEDOUT;
 	/* A spurious wakeup should never happen. */
-	WARN_ON(!signal_pending(current));
+	if (!signal_pending(current)) {
+		printk("futex_wait woken: %lu %i %lu\n",
+		       uaddr, val, time);
+		WARN_ON(1);
+	}
 	return -EINTR;
 
  out_unqueue:
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/sched.c .31103-linux-2.6.3-rc4.updated/kernel/sched.c
--- .31103-linux-2.6.3-rc4/kernel/sched.c	2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/sched.c	2004-02-18 14:55:05.000000000 +1100
@@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
 	long old_state;
 	runqueue_t *rq;
 
+	if ((p->flags & PF_FUTEX_DEBUG)
+	    && !(current->flags & PF_FUTEX_DEBUG)) {
+		printk("%s %i waking %s: %i %i\n",
+		       current->comm, (int)in_interrupt(),
+		       p->comm, p->tgid, p->pid);
+		WARN_ON(1);
+	}
+
 repeat_lock_task:
 	rq = task_rq_lock(p, &flags);
 	old_state = p->state;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/timer.c .31103-linux-2.6.3-rc4.updated/kernel/timer.c
--- .31103-linux-2.6.3-rc4/kernel/timer.c	2004-02-04 15:39:15.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/timer.c	2004-02-18 14:55:05.000000000 +1100
@@ -971,6 +971,13 @@ static void process_timeout(unsigned lon
 	wake_up_process((task_t *)__data);
 }
 
+static void futex_timeout(unsigned long __data)
+{
+	current->flags |= PF_FUTEX_DEBUG;
+	wake_up_process((task_t *)__data);
+	current->flags &= ~PF_FUTEX_DEBUG;
+}
+
 /**
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
@@ -1037,7 +1044,10 @@ signed long schedule_timeout(signed long
 	init_timer(&timer);
 	timer.expires = expire;
 	timer.data = (unsigned long) current;
-	timer.function = process_timeout;
+	if (current->flags & PF_FUTEX_DEBUG)
+		timer.function = futex_timeout;
+	else
+		timer.function = process_timeout;
 
 	add_timer(&timer);
 	schedule();

-- 
Anyone who quotes me in their signature is an idiot -- Rusty Russell


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-03-31 18:59         ` Dirk Morris
@ 2004-04-01  2:16           ` Rusty Russell
  2004-04-01  8:34             ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Rusty Russell @ 2004-04-01  2:16 UTC (permalink / raw)
  To: Dirk Morris; +Cc: Jamie Lokier, Andrew Morton, lkml - Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On Thu, 2004-04-01 at 04:59, Dirk Morris wrote:
> Jamie Lokier wrote:
> 
> >If you have a small test program (or pair of programs) that
> >consistently triggers this message on any machine running 2.6.4, that
> >would be very helpful indeed.
> >  
> >
> Here ya go. :)

Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).

strace -f output attached below.

rusty@mingo:/tmp$ strace -f -o /tmp/foo.out ./foo
Process 3993 attached
Process 3994 attached
Process 3993 suspended
Process 3993 resumed
Process 3994 detached
sem_wait: Interrupted system call
Process 3995 attached
Process 3993 suspended
Process 3993 resumed
Process 3995 detached
sem_wait: Interrupted system call
Process 3996 attached
Process 3993 suspended
Process 3993 resumed
Process 3996 detached
sem_wait: Interrupted system call
PANIC: handle_group_exit: 3993 leader 3992

Rusty.
-- 
Anyone who quotes me in their signature is an idiot -- Rusty Russell

[-- Attachment #2: foo.out.bz2 --]
[-- Type: application/x-bzip, Size: 3509 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-04-01  2:16           ` Rusty Russell
@ 2004-04-01  8:34             ` Andrew Morton
  2004-04-01  9:24               ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2004-04-01  8:34 UTC (permalink / raw)
  To: Rusty Russell; +Cc: dmorris, jamie, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> wrote:
>
>  On Thu, 2004-04-01 at 04:59, Dirk Morris wrote:
>  > Jamie Lokier wrote:
>  > 
>  > >If you have a small test program (or pair of programs) that
>  > >consistently triggers this message on any machine running 2.6.4, that
>  > >would be very helpful indeed.
>  > >  
>  > >
>  > Here ya go. :)

Dirk, thanks for that.  It's worth its weight.

> 
>  Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).

It works here.  You're probably using some steam-driven debian userspace.

There seem to be two call traces according to dmesg:

true 0 waking futex-prob: 1732 1732
Badness in task_rq_lock at kernel/sched.c:277
Call Trace:
 [<c011c678>] task_rq_lock+0x78/0xf4
 [<c011ccbd>] try_to_wake_up+0x1d/0x2c0
 [<c012d35b>] send_signal+0xdb/0x128
 [<c011cf80>] wake_up_state+0xc/0x10
 [<c012ea0d>] do_notify_parent+0x499/0x54c
 [<c0178454>] dput+0x1c/0x274
 [<c016226c>] __fput+0xfc/0x124
 [<c0160b6e>] filp_close+0x62/0x70
 [<c0125c1e>] exit_notify+0x61e/0x660
 [<c0126102>] do_exit+0x4a2/0x4b4
 [<c012625c>] sys_exit_group+0x0/0x14
 [<c012626c>] sys_exit_group+0x10/0x14
 [<c038d3f2>] sysenter_past_esp+0x43/0x65

sshd 0 waking futex-prob: 1732 1732
Badness in task_rq_lock at kernel/sched.c:277
Call Trace:
 [<c011c678>] task_rq_lock+0x78/0xf4
 [<c011ccbd>] try_to_wake_up+0x1d/0x2c0
 [<c014a86c>] kmem_cache_alloc+0x20/0x68
 [<c012d2be>] send_signal+0x3e/0x128
 [<c011cf80>] wake_up_state+0xc/0x10
 [<c012d97c>] group_send_sig_info+0x2cc/0x3f0
 [<c012dafe>] __kill_pg_info+0x5e/0x90
 [<c012db62>] kill_pg_info+0x32/0x5c
 [<c012df03>] kill_pg+0x1b/0x20
 [<c02536e0>] n_tty_receive_buf+0x530/0x11bc
 [<c0334a85>] release_sock+0x75/0x7f
 [<c0355fee>] tcp_recvmsg+0x656/0x694
 [<c0202141>] avc_has_perm+0x41/0x4d
 [<c02561f3>] pty_write+0x12b/0x198
 [<c02550dd>] write_chan+0x195/0x1f4
 [<c011f088>] default_wake_function+0x0/0x1c
 [<c011f088>] default_wake_function+0x0/0x1c
 [<c024f982>] tty_write+0x22a/0x2e4
 [<c0254f48>] write_chan+0x0/0x1f4
 [<c0161517>] vfs_write+0xb7/0xf0
 [<c01615d0>] sys_write+0x30/0x50
 [<c038d3f2>] sysenter_past_esp+0x43/0x65


But setting a breakpoint in task_rq_lock(), kgdb only ever seems to find
one:

Breakpoint 1, task_rq_lock (p=0xcf3a8330, flags=0xcf105e88) at kernel/sched.c:274
274                     printk("%s %i waking %s: %i %i\n",
(gdb) bt
#0  task_rq_lock (p=0xcf3a8330, flags=0xcf105e88) at kernel/sched.c:274
#1  0xc011ce7d in try_to_wake_up (p=0xcf3a8330, state=1, sync=0) at kernel/sched.c:792
#2  0xc011cf80 in wake_up_state (p=0xcf29d420, state=3473952768) at kernel/sched.c:849
#3  0xc012ea0d in do_notify_parent (tsk=0xcf29d420, sig=17) at kernel/signal.c:552
#4  0xc0125c1e in exit_notify (tsk=0xcf29d420) at kernel/exit.c:716
#5  0xc0126102 in do_exit (code=0) at kernel/exit.c:790
#6  0xc012625c in do_group_exit (exit_code=0) at kernel/exit.c:861
#7  0xc012626c in sys_exit_group (error_code=0) at kernel/exit.c:872
#8  0xc038d3f2 in sysenter_past_esp () at net/key/af_key.c:2834

Looks like breakage in the signal code.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [2.6.2] Badness in futex_wait revisited
  2004-04-01  8:34             ` Andrew Morton
@ 2004-04-01  9:24               ` Andrew Morton
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2004-04-01  9:24 UTC (permalink / raw)
  To: rusty, dmorris, jamie, linux-kernel

Andrew Morton <akpm@osdl.org> wrote:
>
>  >  > Here ya go. :)
> 
>  Dirk, thanks for that.  It's worth its weight.
> 
>  > 
>  >  Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).
> 
>  It works here.

Then again, I only saw the debug messages on the wakeup path.  I'm unable
to trigger the warning in futex_wait().

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2004-04-01  9:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-02-13 21:13 [2.6.2] Badness in futex_wait revisited Dirk Morris
2004-02-16 11:42 ` Rusty Russell
     [not found] <40311703.8070309@metavize.com>
2004-02-17  4:39 ` Rusty Russell
2004-02-17  5:27   ` Andrew Morton
2004-02-18  4:14     ` Rusty Russell
2004-02-17 19:55   ` Dirk Morris
2004-03-31 16:56   ` Jamie Lokier
2004-03-31 17:38     ` Dirk Morris
2004-03-31 18:32       ` Jamie Lokier
2004-03-31 18:59         ` Dirk Morris
2004-04-01  2:16           ` Rusty Russell
2004-04-01  8:34             ` Andrew Morton
2004-04-01  9:24               ` Andrew Morton
2004-04-01  1:57     ` Rusty Russell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox