public inbox for linux-kernel@vger.kernel.org
* FUTEX_CMP_REQUEUE_PI is not quite there
@ 2007-05-12  6:10 Ulrich Drepper
  2007-05-12  6:19 ` Andrew Morton
  2007-06-05 16:58 ` Thomas Gleixner
  0 siblings, 2 replies; 5+ messages in thread
From: Ulrich Drepper @ 2007-05-12  6:10 UTC (permalink / raw)
  To: Pierre Peiffer; +Cc: Linux Kernel, Andrew Morton, Dave Jones

I hooked up FUTEX_CMP_REQUEUE_PI here and got a kernel crash.  No serial 
console so this is the output of the screen after the machine stopped.

This is of course on x86-64.  Compiled from a rawhide-ified upstream 
kernel from two days ago.

The situation is that we requeue from a non-PI futex to a PI futex.  We
might now actually want to change the condvar implementation to use a PI
futex internally if the mutex in use is PI, too, but this kind of
mismatch can still happen.  I can provide binaries if necessary.


There is quite a lot of output from the kernel:

BUG: at kernel/futex.c:1665 set_pi_futex_owner()

Call Trace:
  [<ffffffff80249eee>] futex_lock_pi+0x351/0x685
  [<ffffffff8043b3cb>] _spin_lock_irqsave+0x9/0xe
  [<ffffffff803089ac>] __up_read+0x19/0x7f
  [<ffffffff8022ca81>] default_wake_function+0x0/0xe
  [<ffffffff8024b475>] do_futex+0xa68/0x10e8
  [<ffffffff8024bbe3>] sys_futex+0xee/0x10c
  [<ffffffff8043b399>] _spin_unlock_irq+0x9/0xc
  [<ffffffff80209b9e>] system_call+0x7e/0x83

BUG: at lib/plist.c:78 plist_add()

Call Trace:
  [<ffffffff8030c812>] plist_add+0x3a/0x90
  [<ffffffff80249f24>] futex_lock_pi+0x387/0x685
  [<ffffffff8043b3cb>] _spin_lock_irqsave+0x9/0xe
  [<ffffffff803089ac>] __up_read+0x19/0x7f
  [<ffffffff8022ca81>] default_wake_function+0x0/0xe
  [<ffffffff8024b475>] do_futex+0xa68/0x10e8
  [<ffffffff8024bbe3>] sys_futex+0xee/0x10c
  [<ffffffff8043b399>] _spin_unlock_irq+0x9/0xc
  [<ffffffff80209b9e>] system_call+0x7e/0x83

BUG: at kernel/futex.c:483 exit_pi_state_list()

Call Trace:
  [<ffffffff8024be47>] exit_pi_state_list+0xbe/0x11e
  [<ffffffff80235aad>] do_exit+0x801/0x84e
  [<ffffffff80235b97>] complete_and_exit+0x0/0x16
  [<ffffffff80209b9e>] system_call+0x7e/0x83

list_add corruption. prev->next should be next (ffff81001dda1cb8), but
was ffff81006c6e06c8. (prev=ffff81006c6e06c8).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:33!
invalid opcode: 0000 [1] SMP
CPU 0
Pid: 15097, comm: ld-linux-x86-64 Not tainted 2.6.21-1.3145.fc7 #1
RIP: 0010:[<ffffffff8030c90a>]  [<ffffffff8030c90a>] __list_add+0x47/0x5b
RSP: 0018:ffff81003cc01e78  EFLAGS: 00010092
RAX: 0000000000000079 RBX: ffff81001dda1cb8 RCX: fffffffffffffca9
RDX: 00000000ffffffff RSI: 0000000000000282 RDI: ffffffff80559a50
RBP: ffff81001dda1cb0 R08: 00000000000000a0 R09: 0000000000000010
R10: ffff81000305dd00 R11: 0000000000000000 R12: ffff81001dda1c88
R13: 0000000000000282 R14: ffff81006c6e0080 R15: ffff810075edac78
FS:  0000000000000000(0000) GS:ffffffff8059e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000040400eb8 CR3: 000000001c40f000 CR4: 00000000000026e0
Process ld-linux-x86-64 (pid: 15097, threadinfo ffff81003cc00000, task 
ffff81006c6e00

Stack:  ffff81006c6e06b0 ffffffff8030c7a2 ffff81006c6e07b0 ffff810075edac50
  ffff81006c6e06b0 ffffffff8043ac19 ffff81006c6e06b0 ffff810075edac40
  ffff81006c6e06b0 ffffffff8070f9f0 ffff81006c6e07b0 ffff81006c6e0080
Call Trace:
  [<ffffffff8030c7a2>] plist_del+0x3a/0x70
  [<ffffffff8043ac19>] rt_mutex_slowunlock+0x8c/0x1cd
  [<ffffffff8024be75>] exit_pi_state_list+0xec/0x11e
  [<ffffffff80235aad>] do_exit+0x801/0x84e
  [<ffffffff80235b97>] complete_and_exit+0x0/0x16
  [<ffffffff80209b9e>] system_call+0x7e/0x83


Code: 0f 0b eb fe 48 89 7e 08 48 89 37 48 89 57 08 48 89 3a 5a c3
RIP  [<ffffffff8030c90a>] __list_add+0x47/0x5b
  RSP <ffff81003cc01e78>

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖


* Re: FUTEX_CMP_REQUEUE_PI is not quite there
  2007-05-12  6:10 FUTEX_CMP_REQUEUE_PI is not quite there Ulrich Drepper
@ 2007-05-12  6:19 ` Andrew Morton
  2007-05-12  6:29   ` Ulrich Drepper
  2007-06-05 16:58 ` Thomas Gleixner
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2007-05-12  6:19 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: Pierre Peiffer, Linux Kernel, Dave Jones, Ingo Molnar

On Fri, 11 May 2007 23:10:47 -0700 Ulrich Drepper <drepper@redhat.com> wrote:

> I hooked up FUTEX_CMP_REQUEUE_PI here and got a kernel crash.

Well yup.  We're kind of waiting for someone to reply
to http://lkml.org/lkml/2007/5/7/129


* Re: FUTEX_CMP_REQUEUE_PI is not quite there
  2007-05-12  6:19 ` Andrew Morton
@ 2007-05-12  6:29   ` Ulrich Drepper
  0 siblings, 0 replies; 5+ messages in thread
From: Ulrich Drepper @ 2007-05-12  6:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pierre Peiffer, Linux Kernel, Dave Jones, Ingo Molnar

Andrew Morton wrote:
> Well yup.  We're kind of waiting for someone to reply
> to http://lkml.org/lkml/2007/5/7/129

Seems to be the same or at least related.

One comment about my first mail: this is the correct code for condvars,
despite what I wrote before.  I wasn't thinking clearly.  The internal
futex is a normal futex.  It is the job of the CMP_REQUEUE_PI call to
figure this out, select the waiter with the highest priority, and boost
the priority if necessary based on the target futex, which is always a PI
futex.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖


* Re: FUTEX_CMP_REQUEUE_PI is not quite there
  2007-05-12  6:10 FUTEX_CMP_REQUEUE_PI is not quite there Ulrich Drepper
  2007-05-12  6:19 ` Andrew Morton
@ 2007-06-05 16:58 ` Thomas Gleixner
  2007-06-09 18:01   ` Ulrich Drepper
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2007-06-05 16:58 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: Pierre Peiffer, Linux Kernel, Andrew Morton, Dave Jones

On Fri, 2007-05-11 at 23:10 -0700, Ulrich Drepper wrote:
> I hooked up FUTEX_CMP_REQUEUE_PI here and got a kernel crash.  No serial 
> console so this is the output of the screen after the machine stopped.
> 
> This is of course on x86-64.  Compiled from a rawhide-ified upstream 
> kernel from two days ago.
> 
> The situation is that we requeue from a non-PI futex to a PI futex.  We
> might now actually want to change the condvar implementation to use a PI
> futex internally if the mutex in use is PI, too, but this kind of
> mismatch can still happen.  I can provide binaries if necessary.

Can you put the binaries somewhere for download please ? 

I'm looking at the problems, which were reported by Alexey, so I can
look at this as well.

	tglx




* Re: FUTEX_CMP_REQUEUE_PI is not quite there
  2007-06-05 16:58 ` Thomas Gleixner
@ 2007-06-09 18:01   ` Ulrich Drepper
  0 siblings, 0 replies; 5+ messages in thread
From: Ulrich Drepper @ 2007-06-09 18:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Pierre Peiffer, Linux Kernel, Andrew Morton, "Dave Jones",
	Ingo Molnar

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Gleixner wrote:
> Can you put the binaries somewhere for download please ? 

I was waiting until I could test the current patches.  DaveJ built a
kernel based on -rc4-git3 which I tested.  The results are the same.

I've put a statically linked x86-64 binary at

  http://people.redhat.com/drepper/tst-robustpi7.bz2

Running it produces a backtrace with the 2.6.21-1.3218.fc8 kernel and
the machine dies.

The test case creates a robust, PI mutex.  Five threads then wait on a
condvar for this mutex.  The main thread wakes them all with a
broadcast, which causes the new requeue_pi code to be used.  The woken
threads then (in pthread_cond_wait) lock the mutex and then kill
themselves while holding the mutex.  This is supposed to have the result
that all but the first thread get EOWNERDEAD errors.

Note: the condvar futex is not a PI futex.  It cannot be; the semantics
are different.  It is not a locking futex.  The kernel will have to deal
with this.  The requeue operation is meaningless if this is not done,
since its only use is for this situation.
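
For readers without the binary, the scenario described above can be
sketched roughly as follows.  This is not the source of tst-robustpi7,
only a minimal reconstruction from the description.  The names
count_owner_dead, waiter, struct shared and MAX_WAITERS are made up for
illustration, and the sketch assumes the POSIX 2008 spellings
pthread_mutexattr_setrobust and PTHREAD_MUTEX_ROBUST, which may differ
from what the original test used.  Whether a broadcast actually goes
through a requeue operation depends on the C library in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

#define MAX_WAITERS 64          /* arbitrary bound for this sketch */

/* Hypothetical shared state; not taken from tst-robustpi7. */
struct shared {
    pthread_mutex_t mutex;      /* robust, priority-inheriting */
    pthread_cond_t cond;        /* ordinary condvar */
    int waiting;                /* threads that reached the wait loop */
    int go;                     /* predicate set before the broadcast */
};

static void *waiter(void *arg)
{
    struct shared *s = arg;
    int err = pthread_mutex_lock(&s->mutex);

    if (err != 0 && err != EOWNERDEAD)
        return (void *)(intptr_t)err;   /* mutex was not acquired */
    s->waiting++;
    /* EOWNERDEAD still means the mutex was acquired, so keep going. */
    while (!s->go && (err == 0 || err == EOWNERDEAD))
        err = pthread_cond_wait(&s->cond, &s->mutex);
    /* Return, and so exit, while still holding the mutex. */
    return (void *)(intptr_t)err;
}

/* Returns how many waiters saw EOWNERDEAD, or -1 on a setup failure. */
int count_owner_dead(int nthreads)
{
    struct shared s = { .waiting = 0, .go = 0 };
    pthread_mutexattr_t attr;
    pthread_t tid[MAX_WAITERS];
    int started = 0, owner_dead = 0;

    assert(nthreads > 0 && nthreads <= MAX_WAITERS);

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0 ||
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(&s.mutex, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    if (pthread_cond_init(&s.cond, NULL) != 0)
        return -1;

    while (started < nthreads &&
           pthread_create(&tid[started], NULL, waiter, &s) == 0)
        started++;

    /* Wait until every started thread is blocked, then wake them all. */
    for (;;) {
        int ready;

        pthread_mutex_lock(&s.mutex);
        ready = (s.waiting == started);
        if (ready) {
            s.go = 1;
            pthread_cond_broadcast(&s.cond);
        }
        pthread_mutex_unlock(&s.mutex);
        if (ready)
            break;
        sched_yield();
    }

    for (int i = 0; i < started; i++) {
        void *res;

        if (pthread_join(tid[i], &res) == 0 && (intptr_t)res == EOWNERDEAD)
            owner_dead++;
    }
    /* The mutex is left owned by a dead thread, so it is not destroyed. */
    pthread_cond_destroy(&s.cond);
    return started == nthreads ? owner_dead : -1;
}
```

With five waiters, as in the test case, count_owner_dead(5) should
report that all but the first thread to reacquire the mutex saw
EOWNERDEAD, provided the kernel and C library handle the robust PI
mutex correctly.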

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFGauro2ijCOnn/RHQRAluEAKC2rG4DSGjwNkYILzQsMtp7jgcN0QCgh4od
bN8HUrwjg1keVo8DjKTQL6o=
=twSF
-----END PGP SIGNATURE-----
