* futex_wake_op deadlock?
From: Kaz Kylheku @ 2007-11-16 23:52 UTC
To: linux-mips

Hey everyone,

From time to time, on 2.6.17.7, I see a deadlock situation go off. The soft lockup tick occurs in the middle of do_futex, which is heavily inlined. The system is actually hosed; it's not one of those recoverable CPU busy situations that can sometimes trigger the lockup detector.

The instruction that is interrupted by the soft lockup tick appears to be in the assembly code (__futex_atomic_op) used by the futex_wake_op function; the case is FUTEX_OP_SET. It's the instruction just before the load-linked; i.e. the interrupt is outside of the ll/sc loop.

I can't figure out how the code would get into a loop here. The ll/sc logic should eventually succeed. There is a large loop in the overall futex operation, but that is bounded by an iteration variable (attempt++).

(I checked the 2.6.17 head, but there doesn't appear to be any futex-related work).

This lockup has reproduced more than once for us. Once at bootup, and several times on shutdown. The call stack always includes several do_futex frames, and a compat_sys_futex/handle_sysn32 at the top of the chain.

This is from syslog (the unusual format is due to running metalog rather than syslog in our distribution, and the human-readable time in the square-bracketed printk timestamps is a locally developed patch):

Jan 3 02:47:02 [kernel] [02:47:02.953075] [<ffffffff8016de8c>] softlockup_tick+0x1bc/0x208
Jan 3 02:47:02 [kernel] [02:47:02.953121] [<ffffffff8014cc54>] update_process_times+0x9c/0xe8
Jan 3 02:47:02 [kernel] [02:47:02.953158] [<ffffffff801098bc>] ll_local_timer_interrupt+0x94/0xa8
Jan 3 02:47:02 [kernel] [02:47:02.953194] [<ffffffff801026a0>] plat_irq_dispatch+0x120/0x1a0
Jan 3 02:47:02 [kernel] [02:47:02.953221] [<ffffffff80163758>] do_futex+0x870/0xb58
Jan 3 02:47:02 [kernel] [02:47:02.953251] [<ffffffff801637e0>] do_futex+0x8f8/0xb58
Jan 3 02:47:02 [kernel] [02:47:02.953275] [<ffffffff8047b16c>] __lock_text_end+0x1b3c/0x474c
Jan 3 02:47:02 [kernel] [02:47:02.953312] [<ffffffff8036fc40>] sys_sendto+0xe8/0x140
Jan 3 02:47:02 [kernel] [02:47:02.953345] [<ffffffff80163fac>] compat_sys_futex+0x84/0x188
Jan 3 02:47:02 [kernel] [02:47:02.953372] [<ffffffff80116314>] handle_sysn32+0x54/0xb0

The sys_sendto is a red herring, since the backtrace function dumps every single word on the stack as an address, not having any frame pointers to go by.

The code surrounding ffffffff80163758:

ffffffff8016374c: 00023000 sll a2,v0,0x0
ffffffff80163750: 08058c77 j ffffffff801631dc <do_futex+0x2f4>
ffffffff80163754: 00034000 sll a4,v1,0x0
ffffffff80163758: 0000102d move v0,zero   <----<<
ffffffff8016375c: c2030000 ll v1,0(s0)
ffffffff80163760: 00a0082d move at,a1
ffffffff80163764: e2010000 sc at,0(s0)
ffffffff80163768: 1020fffc beqz at,ffffffff8016375c <do_futex+0x874>
ffffffff8016376c: 00000000 nop
ffffffff80163770: 0000000f sync
ffffffff80163774: 8f870024 lw a3,36(gp)
ffffffff80163778: 00023000 sll a2,v0,0x0
ffffffff8016377c: 08058c77 j ffffffff801631dc <do_futex+0x2f4>

You can tell from the "move at, a1" that it's the FUTEX_OP_SET case.
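
For readers less familiar with ll/sc, the retry loop in the disassembly above can be approximated in portable C11. This is only an analogue, not the kernel's code: atomic_set_return_old is a made-up name, and a compare-exchange loop stands in for the ll/sc pair on the assumption that nothing faults.

```c
#include <stdatomic.h>

/*
 * Rough user-space analogue of the FUTEX_OP_SET ll/sc loop:
 * store newval at *uaddr and return the value that was there before.
 * The function name is hypothetical; the kernel uses inline assembly.
 */
static int atomic_set_return_old(atomic_int *uaddr, int newval)
{
    int oldval = atomic_load(uaddr);   /* plays the role of ll */

    /*
     * Plays the role of sc + beqz: if another writer got in between,
     * the exchange fails, oldval is refreshed, and we try again.
     */
    while (!atomic_compare_exchange_weak(uaddr, &oldval, newval))
        ;

    return oldval;
}
```

In the absence of faults such a loop ends as soon as one store goes through uncontended, which is why a lockup at this spot looked so surprising.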
* Re: futex_wake_op deadlock?
From: Ralf Baechle @ 2007-11-19 18:48 UTC
To: Kaz Kylheku; +Cc: linux-mips

On Fri, Nov 16, 2007 at 03:52:47PM -0800, Kaz Kylheku wrote:

> From time to time, on 2.6.17.7, I see a deadlock situation go off. The
> soft lockup tick occurs in the middle of do_futex, which is heavily
> inlined. The system is actually hosed; it's not one of those
> recoverable CPU busy situations that can sometimes trigger the lockup
> detector.

Can you reproduce this hang also if you're not running in a binary compat mode, that is either running o32 binaries on a 32-bit kernel or 64-bit binaries on a 64-bit kernel?

  Ralf
* RE: futex_wake_op deadlock?
From: Kaz Kylheku @ 2007-11-19 21:27 UTC
To: Ralf Baechle; +Cc: linux-mips

Ralf Baechle wrote:
> On Fri, Nov 16, 2007 at 03:52:47PM -0800, Kaz Kylheku wrote:
>
>> From time to time, on 2.6.17.7, I see a deadlock situation go off.
>> The soft lockup tick occurs in the middle of do_futex, which is
>> heavily inlined. The system is actually hosed; it's not one of those
>> recoverable CPU busy situations that can sometimes trigger the lockup
>> detector.
>
> Can you reproduce this hang also if you're not running in a
> binary compat
> mode, that is either running o32 binaries on a 32-bit kernel or
> 64-bit binaries on a 64-bit kernel?

I have hacked up a little test program which hosed my board within seconds. The system is not completely hung. However:

- I can't kill the test program with Ctrl-C.
- I can log into the box with telnet.
- If I run "ps aux" to see all processes, the ps command hangs partway through the table, and cannot be killed with Ctrl-C.
- System hangs on soft reboot attempt; requires hard reset.

The program basically uses several threads to beat up the FUTEX_WAKE_OP. The key trick is that there is an interfering thread which does a mmap/munmap on the futexes in parallel with the threads which are using them.

If I just stick the futexes into a permanently good memory location, nothing bad happens; the program just churns away taking up 400% of the CPU time across the four cores of the 1480. If you call the function with permanently bad addresses, nothing bad happens either; the syscalls bail nicely with EFAULT.

The idea is to tickle some race condition or other bug in the interaction between futexes and mmap. I put a little delay into the interfering thread so that the memory is held in a good state most of the time, with a quick unmap/remap. We want the memory to be good most of the time, but an unmap to happen from time to time at an inopportune time, while the kernel is executing the futex code on one or more cores.

This needs to be compiled -pthread, obviously, and you need -lrt to link in the library for clock_nanosleep.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <time.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/mman.h>

#define FUTEX_WAIT 0
#define FUTEX_WAKE 1
#define FUTEX_FD 2
#define FUTEX_REQUEUE 3
#define FUTEX_CMP_REQUEUE 4
#define FUTEX_WAKE_OP 5

#define FUTEX_OP_SET 0 /* *(int *)UADDR2 = OPARG; */
#define FUTEX_OP_ADD 1 /* *(int *)UADDR2 += OPARG; */
#define FUTEX_OP_OR 2 /* *(int *)UADDR2 |= OPARG; */
#define FUTEX_OP_ANDN 3 /* *(int *)UADDR2 &= ~OPARG; */
#define FUTEX_OP_XOR 4 /* *(int *)UADDR2 ^= OPARG; */

#define FUTEX_OP_OPARG_SHIFT 8 /* Use (1 << OPARG) instead of OPARG. */

#define FUTEX_OP_CMP_EQ 0 /* if (oldval == CMPARG) wake */
#define FUTEX_OP_CMP_NE 1 /* if (oldval != CMPARG) wake */
#define FUTEX_OP_CMP_LT 2 /* if (oldval < CMPARG) wake */
#define FUTEX_OP_CMP_LE 3 /* if (oldval <= CMPARG) wake */
#define FUTEX_OP_CMP_GT 4 /* if (oldval > CMPARG) wake */
#define FUTEX_OP_CMP_GE 5 /* if (oldval >= CMPARG) wake */

#define NUM_THREADS 8

int futex_wake_op(int *addr1, int *addr2, int nr_wake_1, int nr_wake_2,
                  int encoded_op)
{
  /* hand the syscall result back so the caller can check for errors */
  return syscall(SYS_futex, addr1, FUTEX_WAKE_OP, nr_wake_1, nr_wake_2,
                 addr2, encoded_op);
}

int futex1 = 0, futex2 = 0;

struct {
  int futex1;
  int futex2;
} *shared;

/* interfering thread: keeps mapping and unmapping the futex page */
void *mapper(void *arg)
{
  for (;;) {
    struct timespec delay;
    void *mem;

    delay.tv_sec = 0;
    delay.tv_nsec = 100000000;

    mem = mmap(0, 16384, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (mem == MAP_FAILED) {
      perror("mmap");
      exit(EXIT_FAILURE);
    }

    shared = mem;

    /* relative sleep (flags 0): keep the page mapped for 100 ms */
    clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, 0);

    if (munmap(mem, 16384) < 0) {
      perror("munmap");
      exit(EXIT_FAILURE);
    }
  }
}

/* hammers FUTEX_WAKE_OP on whatever "shared" currently points at */
void *waker(void *arg)
{
  unsigned int rand_state = 1;

  for (;;) {
    int val = rand_r(&rand_state) & 0xFFFF;
    const int op = (FUTEX_OP_SET << 28) | (FUTEX_OP_CMP_GT << 24) | val;
    int result = futex_wake_op(&shared->futex1, &shared->futex2, 1, 1, op);

    /* EFAULT is expected whenever the page happens to be unmapped */
    if (result < 0 && errno != EFAULT) {
      perror("futex_wake_op");
      exit(EXIT_FAILURE);
    }
  }

  /* notreached */
  return 0;
}

int main(void)
{
  int i;

  srand(1);

  /* thread 0 is the mapper, all others are wakers */
  for (i = 0; i < NUM_THREADS; i++) {
    pthread_t thr;
    void *(*func)(void *) = (i == 0) ? mapper : waker;
    int result = errno = pthread_create(&thr, 0, func, 0);

    if (result != 0) {
      perror("pthread_create");
      return EXIT_FAILURE;
    }
  }

  pthread_exit(0);
}
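
As a reading aid for the encoded_op argument built in waker(), here is a small decoder sketch. It assumes the usual FUTEX_WAKE_OP layout (operation in bits 28-31, comparison in bits 24-27, then a 12-bit OPARG and a 12-bit CMPARG, both sign-extended); the struct and function names are invented for this illustration and are not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical helper type: the four fields packed into encoded_op. */
struct futex_op_fields {
    int op;      /* FUTEX_OP_SET, FUTEX_OP_ADD, ... (without the shift flag) */
    int cmp;     /* FUTEX_OP_CMP_EQ, ... */
    int oparg;   /* signed 12-bit operand */
    int cmparg;  /* signed 12-bit comparison argument */
};

/* Sign-extend the low 12 bits of v without shifting signed values. */
static int sign_extend12(uint32_t v)
{
    v &= 0xfff;
    return (v & 0x800) ? (int)v - 0x1000 : (int)v;
}

/* Split encoded_op into its fields (assumed layout, see above). */
static struct futex_op_fields decode_futex_op(uint32_t encoded_op)
{
    struct futex_op_fields f;

    f.op     = (encoded_op >> 28) & 0x7;  /* bit 31 is the OPARG_SHIFT flag */
    f.cmp    = (encoded_op >> 24) & 0xf;
    f.oparg  = sign_extend12(encoded_op >> 12);
    f.cmparg = sign_extend12(encoded_op);
    return f;
}
```

Under that assumed layout, the 16-bit random val in the test program covers all of CMPARG and spills its top four bits into OPARG, so both the stored value and the comparison argument vary from call to call.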
* RE: futex_wake_op deadlock?
From: Kaz Kylheku @ 2007-11-19 21:42 UTC
To: Ralf Baechle; +Cc: linux-mips

Earlier, I wrote:
> I have hacked up a little test program which hosed my board within
> seconds. The system is not completely hung. However:
>
> - I can't kill the test program with Ctrl-C.
> - I can log into the box with telnet.
> - If I run "ps aux" to see all processes, the ps command hangs partway
>   through the table, and cannot be killed with Ctrl-C.
> - System hangs on soft reboot attempt; requires hard reset.

Furthermore: my console loglevel was too high to see the crash on the serial console, but, sure enough, the syslog has this:

Nov 19 14:19:57 [kernel] [14:19:57.846017] BUG: soft lockup detected on CPU#1!
Nov 19 14:19:57 [kernel] [14:19:57.846051] Call Trace:
Nov 19 14:19:58 [kernel] [14:19:57.846069] [<ffffffff8016de8c>] softlockup_tick+0x1bc/0x208
Nov 19 14:19:58 [kernel] [14:19:57.846112] [<ffffffff8014cc54>] update_process_times+0x9c/0xe8
Nov 19 14:19:58 [kernel] [14:19:57.846147] [<ffffffff801098bc>] ll_local_timer_interrupt+0x94/0xa8
Nov 19 14:19:58 [kernel] [14:19:57.846180] [<ffffffff801098bc>] ll_local_timer_interrupt+0x94/0xa8
Nov 19 14:19:58 [kernel] [14:19:57.846205] [<ffffffff801026a0>] plat_irq_dispatch+0x120/0x1a0
Nov 19 14:19:58 [kernel] [14:19:57.846232] [<ffffffff80163f28>] compat_sys_futex+0x0/0x188
Nov 19 14:19:58 [kernel] [14:19:57.846258] [<ffffffff801637e0>] do_futex+0x8f8/0xb58
Nov 19 14:19:58 [kernel] [14:19:57.846281] [<ffffffff8011db28>] tlb_do_page_fault_1+0x110/0x128
Nov 19 14:19:58 [kernel] [14:19:57.846317] [<ffffffff80163758>] do_futex+0x870/0xb58
Nov 19 14:19:58 [kernel] [14:19:57.846339] [<ffffffff80163f28>] compat_sys_futex+0x0/0x188
Nov 19 14:19:58 [kernel] [14:19:57.846364] [<ffffffff80163170>] do_futex+0x288/0xb58
Nov 19 14:19:58 [kernel] [14:19:57.846385] [<ffffffff801637e0>] do_futex+0x8f8/0xb58
Nov 19 14:19:58 [kernel] [14:19:57.846407] [<ffffffff80163764>] do_futex+0x87c/0xb58
Nov 19 14:19:58 [kernel] [14:19:57.846430] [<ffffffff80177500>] __alloc_pages+0x70/0x398
Nov 19 14:19:58 [kernel] [14:19:57.846456] [<ffffffff80130d1c>] try_to_wake_up+0x3c4/0x4f8
Nov 19 14:19:58 [kernel] [14:19:57.846489] [<ffffffff802f3c28>] __up_read+0xe8/0x130
Nov 19 14:19:58 [kernel] [14:19:57.846528] [<ffffffff80163fac>] compat_sys_futex+0x84/0x188
Nov 19 14:19:58 [kernel] [14:19:57.846552] [<ffffffff80116314>] handle_sysn32+0x54/0xb0
Nov 19 14:19:58 [kernel] [14:19:57.846578] [<ffffffff80163f28>] compat_sys_futex+0x0/0x188
* Re: futex_wake_op deadlock?
From: Ralf Baechle @ 2007-11-20 11:21 UTC
To: Kaz Kylheku; +Cc: linux-mips

On Mon, Nov 19, 2007 at 01:27:37PM -0800, Kaz Kylheku wrote:

> >> From time to time, on 2.6.17.7, I see a deadlock situation go off.
> >> The soft lockup tick occurs in the middle of do_futex, which is
> >> heavily inlined. The system is actually hosed; it's not one of those
> >> recoverable CPU busy situations that can sometimes trigger the lockup
> >> detector.
> >
> > Can you reproduce this hang also if you're not running in a
> > binary compat
> > mode, that is either running o32 binaries on a 32-bit kernel or
> > 64-bit binaries on a 64-bit kernel?
>
> I have hacked up a little test program which hosed my board within
> seconds.
> The system is not completely hung. However:

Cute.

So looking again at the futex code this morning it was quite obvious what happened. The ll/sc loops in __futex_atomic_op() had the usual fixups necessary for memory accesses to userspace from kernel space installed:

    __asm__ __volatile__(
    "       .set    push                            \n"
    "       .set    noat                            \n"
    "       .set    mips3                           \n"
    "1:     ll      %1, %4  # __futex_atomic_op     \n"
    "       .set    mips0                           \n"
    "       " insn  "                               \n"
    "       .set    mips3                           \n"
    "2:     sc      $1, %2                          \n"
    "       beqz    $1, 1b                          \n"
    __WEAK_LLSC_MB
    "3:                                             \n"
    "       .set    pop                             \n"
    "       .set    mips0                           \n"
    "       .section .fixup,\"ax\"                  \n"
    "4:     li      %0, %6                          \n"
    "       j       2b                              \n"   <-----
    "       .previous                               \n"
    "       .section __ex_table,\"a\"               \n"
    "       "__UA_ADDR "\t1b, 4b                    \n"
    "       "__UA_ADDR "\t2b, 4b                    \n"
    "       .previous                               \n"
    : "=r" (ret), "=&r" (oldval), "=R" (*uaddr)
    : "0" (0), "R" (*uaddr), "Jr" (oparg), "i" (-EFAULT)
    : "memory");

Notice the branch at the end of the fixup code: it goes back to the SC instruction. The SC instruction took an exception so it will not have changed $1, so the loop will continue endlessly unless by coincidence the value to be stored from $1 happened to be zero.

Obviously this one was MIPS specific and may hit all supported ABIs. So my initial suspicion that this might be the issue David Miller recently discovered in the binary compat code isn't true. And it's a local DoS, probably for all of 2.6.16 and up.

Patch below. It fixes your test case on a 32-bit kernel for me.

  Ralf

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/include/asm-mips/futex.h b/include/asm-mips/futex.h
index 3e7e30d..17f082c 100644
--- a/include/asm-mips/futex.h
+++ b/include/asm-mips/futex.h
@@ -35,7 +35,7 @@
 	"	.set	mips0				\n"	\
 	"	.section .fixup,\"ax\"			\n"	\
 	"4:	li	%0, %6				\n"	\
-	"	j	2b				\n"	\
+	"	j	3b				\n"	\
 	"	.previous				\n"	\
 	"	.section __ex_table,\"a\"		\n"	\
 	"	"__UA_ADDR "\t1b, 4b			\n"	\
@@ -61,7 +61,7 @@
 	"	.set	mips0				\n"	\
 	"	.section .fixup,\"ax\"			\n"	\
 	"4:	li	%0, %6				\n"	\
-	"	j	2b				\n"	\
+	"	j	3b				\n"	\
 	"	.previous				\n"	\
 	"	.section __ex_table,\"a\"		\n"	\
 	"	"__UA_ADDR "\t1b, 4b			\n"	\
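
The effect of the one-line change can be sketched with a toy model in plain C. This is a deliberately simplified illustration, not kernel code: it assumes the user address stays unmapped for the whole call, ignores the value in $1, and uses invented names (simulate_fixup, RETRY_SC, SKIP_LOOP) plus an artificial fault cap that the real kernel does not have.

```c
#include <errno.h>

/* Where the fixup code at label 4 jumps to (names are hypothetical). */
enum fixup_target {
    RETRY_SC,   /* old code: "j 2b", back to the sc instruction */
    SKIP_LOOP   /* patched:  "j 3b", past the end of the loop   */
};

/*
 * Toy model of __futex_atomic_op on an address that stays unmapped.
 * Every access at label 1 or 2 faults and lands in the fixup code.
 * Returns the final value of ret, or 1 if max_faults was reached,
 * which stands in for "still spinning" (the soft lockup).
 */
static int simulate_fixup(enum fixup_target target, int max_faults)
{
    int ret = 0;
    int faults = 0;

    for (;;) {
        /* the access faults; the exception table sends us to label 4 */
        if (++faults > max_faults)
            return 1;          /* cap hit: the real kernel would keep going */

        ret = -EFAULT;         /* 4: li %0, -EFAULT */

        if (target == SKIP_LOOP)
            return ret;        /* j 3b: leave the loop, report the error */

        /* j 2b: re-execute sc, which faults again on the next pass */
    }
}
```

With RETRY_SC the model only stops because of the cap, while SKIP_LOOP returns -EFAULT after a single fault, which matches the behaviour the test program expects from a bad address.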
* RE: futex_wake_op deadlock?
From: Kaz Kylheku @ 2007-11-20 18:06 UTC
To: Ralf Baechle; +Cc: linux-mips

Ralf Baechle wrote:
>     __asm__ __volatile__(
>     "       .set    push                            \n"
>     "       .set    noat                            \n"
>     "       .set    mips3                           \n"
>     "1:     ll      %1, %4  # __futex_atomic_op     \n"
>     "       .set    mips0                           \n"
>     "       " insn  "                               \n"
>     "       .set    mips3                           \n"
>     "2:     sc      $1, %2                          \n"
>     "       beqz    $1, 1b                          \n"
>     __WEAK_LLSC_MB
>     "3:                                             \n"
>     "       .set    pop                             \n"
>     "       .set    mips0                           \n"
>     "       .section .fixup,\"ax\"                  \n"
>     "4:     li      %0, %6                          \n"
>     "       j       2b                              \n"   <-----
>     "       .previous                               \n"
>     "       .section __ex_table,\"a\"               \n"
>     "       "__UA_ADDR "\t1b, 4b                    \n"
>     "       "__UA_ADDR "\t2b, 4b                    \n"
>     "       .previous                               \n"
>     : "=r" (ret), "=&r" (oldval), "=R" (*uaddr)
>     : "0" (0), "R" (*uaddr), "Jr" (oparg), "i" (-EFAULT)
>     : "memory");
>
> Notice the branch at the end of the fixup code, it goes back to the
> SC instruction.

Hi Ralf,

I had gone through all that code, but didn't see it!

The problem is I didn't pay enough attention because I didn't suspect it enough.

I was misled by the backtrace address in the soft lockup dump, which points to one instruction /before/ the ll instruction. So I thought that the lockup is somewhere outside of that loop, right?

Does the backward branch on MIPS set up the instruction pointer in such a way that if an interrupt goes off, it can be pointing to the previous instruction? I thought about that possibility.
* Re: futex_wake_op deadlock?
From: Ralf Baechle @ 2007-11-20 18:16 UTC
To: Kaz Kylheku; +Cc: linux-mips

On Tue, Nov 20, 2007 at 10:06:44AM -0800, Kaz Kylheku wrote:

> The problem is I didn't pay enough attention because I didn't suspect it
> enough.
>
> I was misled by the backtrace address in the soft lockup dump, which
> points to one instruction /before/ the ll instruction. So I thought that
> the lockup is somewhere outside of that loop, right?
>
> Does the backward branch on MIPS set up the instruction pointer in such
> a way that if an interrupt goes off, it can be pointing to the previous
> instruction? I thought about that possibility.

The EPC will always point to the instruction which caused the exception, with the one special case where an instruction in a branch delay slot was causing the exception. If that's the case the EPC will point at the branch and the BD bit in the cause register (bit 31) will be set to indicate this special case.

  Ralf
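
The EPC rule described above can be written down as a few lines of C. This is a sketch for illustration only: faulting_pc is an invented helper, the macro is defined locally from the bit number given in the message, and 4-byte instructions are assumed.

```c
#include <stdint.h>

/* BD flag: bit 31 of the cause register, as stated in the message above. */
#define CAUSEF_BD (UINT32_C(1) << 31)

/*
 * Address of the instruction that actually took the exception.
 * Normally that is EPC itself; if BD is set, EPC points at the branch
 * and the culprit sits in the delay slot, one 4-byte instruction later.
 * (Hypothetical helper, not a kernel function.)
 */
static uint64_t faulting_pc(uint64_t epc, uint32_t cause)
{
    return (cause & CAUSEF_BD) ? epc + 4 : epc;
}
```

Applied to the disassembly from the first message: an exception in the nop at ffffffff8016376c would be reported with EPC at the beqz at ffffffff80163768 and BD set, so the reported address can trail the real one only by a branch, never point ahead of the loop.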
* RE: futex_wake_op deadlock?
From: Kaz Kylheku @ 2007-11-20 18:24 UTC
To: Ralf Baechle; +Cc: linux-mips

Ralf Baechle wrote:
> Patch below. It fixes your test case on a 32-bit kernel for me.

I'm running it now on 64 bit. The test case isn't causing any ill effects.

Thanks a lot, Ralf!
* Re: futex_wake_op deadlock?
From: David Daney @ 2007-11-20 18:29 UTC
To: Ralf Baechle; +Cc: Kaz Kylheku, linux-mips

Ralf Baechle wrote:
>
> Notice the branch at the end of the fixup code, it goes back to the
> SC instruction. The SC instruction took an exception so it will not have
> changed $1 so the loop will continue endless unless by coincidence the
> value to be stored from $1 happened to be zero.
>
> Obviously this one was MIPS specific and may hit all supported ABIs. So
> my initial suspicion this might be the issue David Miller recently
> discovered in the binary compat code isn't true. And it's a local DoS
> probably for all of 2.6.16 and up.
>

Mostly similar code is in 2.6.15, so I think it is affected as well. 2.6.12 on the other hand doesn't seem to have futex.h.

David Daney
* Re: futex_wake_op deadlock?
From: Ralf Baechle @ 2007-11-20 19:00 UTC
To: David Daney; +Cc: Kaz Kylheku, linux-mips

On Tue, Nov 20, 2007 at 10:29:47AM -0800, David Daney wrote:

>> Notice the branch at the end of the fixup code, it goes back to the
>> SC instruction. The SC instruction took an exception so it will not have
>> changed $1 so the loop will continue endless unless by coincidence the
>> value to be stored from $1 happened to be zero.
>>
>> Obviously this one was MIPS specific and may hit all supported ABIs. So
>> my initial suspicion this might be the issue David Miller recently
>> discovered in the binary compat code isn't true. And it's a local DoS
>> probably for all of 2.6.16 and up.
>>
>
> Mostly similar code is in 2.6.15, so I think it is affected as well.
> 2.6.12 on the other hand doesn't seem to have futex.h.

It originally appeared in the lmo kernel for 2.6.14-rc1 and a little after the 2.6.14 release in kernel.org. If I say 2.6.16 then it's simply because I don't ever look at anything that doesn't have a -stable branch.

  Ralf