* 3.8.4-rt2 panic in migrate_task_rq_fair
From: Darren Hart @ 2013-04-05 16:47 UTC
To: linux-rt-users

Running on a UEFI 32bit Atom E6xx system I see the following panic after
several minutes running the following cyclictest command.

root@sys940x:~# cyclictest -p 50 -d 10m -t -q
# /dev/cpu_dma_latency set to 0us

BUG: unable to handle kernel paging request at fffffff4
IP: [<c106a41c>] migrate_task_rq_fair+0x4c/0x100
*pde = 0198f067 *pte = 00000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
Pid: 649, comm: cyclictest Not tainted 3.8.4-rt2-yocto-preempt-rt #1
EIP: 0060:[<c106a41c>] EFLAGS: 00010046 CPU: 0
EIP is at migrate_task_rq_fair+0x4c/0x100
EAX: 00000000 EBX: deec43f0 ECX: 00000000 EDX: 00000000
ESI: dde8f948 EDI: c1983900 EBP: dee9fe58 ESP: dee9fe40
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: fffffff4 CR3: 1ef64000 CR4: 000007d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process cyclictest (pid: 649, ti=dee9e000 task=def74170 task.ti=dee9e000)
Stack:
 c197f060 00000000 00000000 00000000 deec43f0 00000001 dee9fec0 c1064ec5
 c10380ce 00000010 dee9fe7c c1065894 de699ef0 619b7c30 00000000 dee9feac
 00000010 00000000 dee9fea0 c1037f76 00000000 def74170 00000001 dee9ff28
Call Trace:
 [<c1064ec5>] set_task_cpu+0x55/0x1b0
 [<c10380ce>] ? unpin_current_cpu+0xe/0x70
 [<c1065894>] ? migrate_enable+0xc4/0x1c0
 [<c1037f76>] ? pin_current_cpu+0x76/0x1c0
 [<c106713c>] try_to_wake_up+0x18c/0x300
 [<c10672ef>] wake_up_process+0x1f/0x40
 [<c10595ed>] hrtimer_wakeup+0x1d/0x30
 [<c10599cb>] __run_hrtimer+0x9b/0x260
 [<c10595d0>] ? update_rmtp+0x90/0x90
 [<c105ad62>] hrtimer_interrupt+0x272/0x320
 [<c1645ed5>] smp_apic_timer_interrupt+0x55/0x87
 [<c163f75d>] apic_timer_interrupt+0x2d/0x34
 [<c163f48c>] ? resume_kernel+0x44/0x44
Code: 83 74 01 00 00 74 48 8d 4e 58 e8 94 2e 2c 00 89 45 f0 89 55 f4 8b
8b 78 01 00 00 8b 93 74 01 00 00 29 55 f0 19 4d f4 31 c0 31 d2 <8b> 49
f4 0b 4d f0 75 2c 89 83 74 01 00 00 89 93 78 01 00 00 8b
EIP: [<c106a41c>] migrate_task_rq_fair+0x4c/0x100 SS:ESP 0068:dee9fe40
CR2: 00000000fffffff4
---[ end trace 0000000000000002 ]---
Kernel panic - not syncing: Fatal exception in interrupt

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel
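For readers decoding the report: the `Oops: 0000` field is the x86 page-fault error code. The following is a minimal Python sketch, assuming the standard bit layout of that error code (bit 0 present, bit 1 write, bit 2 user mode, bit 3 reserved bit, bit 4 instruction fetch). The name `decode_page_fault_error` is made up for this illustration and is not a kernel function.

```python
# Illustrative only: decode the x86 page-fault error code printed after "Oops:".
# Each entry is (bit mask, meaning when set, meaning when clear or None).
PF_FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in a paging entry", None),
    (0x10, "instruction fetch", None),
]


def decode_page_fault_error(code):
    """Return a list of human-readable facts encoded in the error code."""
    facts = []
    for mask, when_set, when_clear in PF_FLAGS:
        text = when_set if code & mask else when_clear
        if text is not None:
            facts.append(text)
    return facts


if __name__ == "__main__":
    # Error code taken from the "Oops: 0000" line of the report.
    print(decode_page_fault_error(0x0000))
```

Applied to this report, the code 0000 describes a read, in kernel mode, of a page that is not present, which agrees with `*pte = 00000000` and the faulting address fffffff4 in CR2.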
* Re: 3.8.4-rt2 panic in migrate_task_rq_fair
From: Darren Hart @ 2013-04-05 17:17 UTC
To: linux-rt-users

On 04/05/2013 09:47 AM, Darren Hart wrote:
> Running on a UEFI 32bit Atom E6xx system I see the following panic after
> several minutes running the following cyclictest command.
>
> root@sys940x:~# cyclictest -p 50 -d 10m -t -q

Whoops, I should have used "-D 10m", but the following is of course
still a problem.

--
Darren

> # /dev/cpu_dma_latency set to 0us
>
> BUG: unable to handle kernel paging request at fffffff4
> IP: [<c106a41c>] migrate_task_rq_fair+0x4c/0x100
> [...]
> Kernel panic - not syncing: Fatal exception in interrupt

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel
* Re: 3.8.4-rt2 panic in migrate_task_rq_fair
From: Sebastian Andrzej Siewior @ 2013-04-26 13:21 UTC
To: Darren Hart; +Cc: linux-rt-users

* Darren Hart | 2013-04-05 09:47:09 [-0700]:

>Running on a UEFI 32bit Atom E6xx system I see the following panic after
>several minutes running the following cyclictest command.

Can you reproduce this?

>root@sys940x:~# cyclictest -p 50 -d 10m -t -q
># /dev/cpu_dma_latency set to 0us
>
>BUG: unable to handle kernel paging request at fffffff4
>IP: [<c106a41c>] migrate_task_rq_fair+0x4c/0x100
>EIP is at migrate_task_rq_fair+0x4c/0x100
>EAX: 00000000 EBX: deec43f0 ECX: 00000000 EDX: 00000000
>ESI: dde8f948 EDI: c1983900 EBP: dee9fe58 ESP: dee9fe40
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

This is the disassembly of your code:

|   0:   83 74 01 00 00          xorl   $0x0,0x0(%rcx,%rax,1)
|   5:   74 48                   je     4f <crash+0x24>
|   7:   8d 4e 58                lea    0x58(%rsi),%ecx
|   a:   e8 94 2e 2c 00          callq  2c2ea3 <crash+0x2c2e78>
|   f:   89 45 f0                mov    %eax,-0x10(%rbp)
|  12:   89 55 f4                mov    %edx,-0xc(%rbp)
|  15:   8b 8b 78 01 00 00       mov    0x178(%rbx),%ecx
|  1b:   8b 93 74 01 00 00       mov    0x174(%rbx),%edx
|  21:   29 55 f0                sub    %edx,-0x10(%rbp)
|  24:   19 4d f4                sbb    %ecx,-0xc(%rbp)
|  27:   31 c0                   xor    %eax,%eax
|  29:   31 d2                   xor    %edx,%edx
|
|000000000000002b <crash>:
|  2b:   8b 49 f4                mov    -0xc(%rcx),%ecx

So ecx is zero, -0xc gives 0xfffffff4. Okay, bad pointer crash.

|  2e:   0b 4d f0                or     -0x10(%rbp),%ecx
|  31:   75 2c                   jne    5f <crash+0x34>
|  33:   89 83 74 01 00 00       mov    %eax,0x174(%rbx)
|  39:   89 93 78 01 00 00       mov    %edx,0x178(%rbx)

A few lines up (offset 0x21) rcx is used for u64 subtraction in
__synchronize_entity_decay(), the C code:
|        decays -= se->avg.decay_count;
|        if (!decays)
|                return 0;

The result is saved at -0x10(%rbp) and -0xc(%rbp). Later it is loaded
again from the stack because atomic64 is not inlined and it needs to do
the zero check.

So *I* think that the assembly here is wrong because line 0x2b should
use rbp as the pointer, as is done in 0x2e. These two lines are the
zero check.
My gcc creates here:

|c105c835:       e8 da 3a 1d 00          call   c1230314 <atomic64_read_cx8>
|c105c83a:       89 55 f4                mov    %edx,-0xc(%ebp)
|c105c83d:       8b 93 9c 00 00 00       mov    0x9c(%ebx),%edx
|c105c843:       89 45 f0                mov    %eax,-0x10(%ebp)
|c105c846:       8b 8b a0 00 00 00       mov    0xa0(%ebx),%ecx
|c105c84c:       29 55 f0                sub    %edx,-0x10(%ebp)
|c105c84f:       19 4d f4                sbb    %ecx,-0xc(%ebp)
|c105c852:       31 c0                   xor    %eax,%eax
|c105c854:       31 d2                   xor    %edx,%edx
crash:
|c105c856:       8b 4d f4                mov    -0xc(%ebp),%ecx

As you see, it uses ebp instead of rcx for the 0 check.

|c105c859:       0b 4d f0                or     -0x10(%ebp),%ecx
|c105c85c:       75 2a                   jne    c105c888 <migrate_task_rq_fair+0x78>

The assembly code looks wrong to me. So it is either a gcc bug or the
attributes for the inline assembly in atomic64_read() /
alternative_atomic64() are wrong.

Sebastian
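The reading of the faulting instruction above can be checked by hand. What follows is a small Python sketch, not kernel or gcc code: it decodes the ModRM byte of the two `mov` encodings quoted in the mail (`8b 49 f4` from the crashing kernel, `8b 4d f4` from the second build), computes the 32-bit effective address, and models the `sub`/`sbb` pair that performs the u64 subtraction in two halves. The helper names are invented for illustration, only the `mod=01` (base register plus 8-bit displacement, no SIB) form is handled, and the register names are the 32-bit ones, since the 64-bit names in the first listing come from decoding the bytes in 64-bit mode.

```python
MASK32 = 0xFFFFFFFF
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm_disp8(modrm, disp8):
    """Decode a ModRM byte of the form mod=01: base register + signed disp8.

    Returns (destination register, base register, signed displacement)
    for an opcode such as 8b (mov r32, r/m32).
    """
    mod = modrm >> 6
    reg = (modrm >> 3) & 7
    rm = modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the mod=01, no-SIB form is handled here")
    disp = disp8 - 0x100 if disp8 & 0x80 else disp8
    return REG32[reg], REG32[rm], disp


def effective_address(base_value, disp):
    """32-bit effective address: base + displacement, wrapping modulo 2**32."""
    return (base_value + disp) & MASK32


def sub64_via_halves(lo, hi, sub_lo, sub_hi):
    """Model a sub/sbb pair: 64-bit subtraction done on two 32-bit halves."""
    borrow = 1 if lo < sub_lo else 0          # carry flag left by `sub`
    new_lo = (lo - sub_lo) & MASK32
    new_hi = (hi - sub_hi - borrow) & MASK32  # `sbb` consumes the borrow
    return new_lo, new_hi


if __name__ == "__main__":
    # Crashing build: 8b 49 f4, with ECX = 00000000 from the register dump.
    dst, base, disp = decode_modrm_disp8(0x49, 0xF4)
    print(dst, base, disp, hex(effective_address(0x00000000, disp)))

    # Second build: 8b 4d f4, with EBP = dee9fe58 from the register dump.
    dst, base, disp = decode_modrm_disp8(0x4D, 0xF4)
    print(dst, base, disp, hex(effective_address(0xDEE9FE58, disp)))
```

With ECX = 0 the first encoding yields 0xfffffff4, the address reported in CR2, while the second encoding reads the high half of the subtraction result from the stack slot at EBP - 0xc. The two encodings differ only in the `rm` field of the ModRM byte (`001` for ecx, `101` for ebp).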
* Re: 3.8.4-rt2 panic in migrate_task_rq_fair
From: Darren Hart @ 2013-04-29 23:00 UTC
To: Sebastian Andrzej Siewior; +Cc: linux-rt-users

On 04/26/2013 06:21 AM, Sebastian Andrzej Siewior wrote:
> * Darren Hart | 2013-04-05 09:47:09 [-0700]:
>
>> Running on a UEFI 32bit Atom E6xx system I see the following panic after
>> several minutes running the following cyclictest command.
>
> Can you reproduce this?

Yes, it was perfectly repeatable.

>> root@sys940x:~# cyclictest -p 50 -d 10m -t -q
>> # /dev/cpu_dma_latency set to 0us
>>
>> BUG: unable to handle kernel paging request at fffffff4
>> IP: [<c106a41c>] migrate_task_rq_fair+0x4c/0x100
>
> [...]
>
> The assembly code looks wrong to me. So it is either a gcc bug or the
> attributes for the inline assembly in atomic64_read() /
> alternative_atomic64() are wrong.

Something to look into, I will try to get back to this and compare a
couple of different compiler versions.

Thanks for looking into it!

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel