* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
From: Rafael J. Wysocki @ 2013-11-17 22:34 UTC
To: Borislav Petkov, Francis Moreau; +Cc: LKML, Thomas Gleixner, Linux PM list

On Sunday, November 17, 2013 11:06:12 PM Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
> > On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
> > >> Sorry I haven't taken the original picture large enough, and getting
> > >> this kernel panic is pretty hard since the kernel usually displays the
> > >> black screen.
> > >
> > > Ok, just try to make a readable picture of the whole line, next time you
> > > trigger it.
> > >
> > >> I can't find any traces of this function in the dump...
> > >
> > > Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
> > > official archlinux kernel? If so, where can I get it from?
> >
> > Yes, you can download the bin package from:
> > https://www.archlinux.org/packages/core/x86_64/linux/
> >
> > The bin package is a tar archive, so it is pretty straightforward to
> > unpack the vmlinux file (the actual filename is vmlinuz-linux).
>
> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
> which is this:
>
> ffffffff8106f590 <call_timer_fn>:
> ffffffff8106f590:  e8 2b b2 48 00        callq  ffffffff814fa7c0 <__fentry__>
> ffffffff8106f595:  55                    push   %rbp
> ffffffff8106f596:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
> ffffffff8106f59d:  00 00
> ffffffff8106f59f:  48 89 e5              mov    %rsp,%rbp
> ffffffff8106f5a2:  41 57                 push   %r15
> ffffffff8106f5a4:  49 89 d7              mov    %rdx,%r15
> ffffffff8106f5a7:  41 56                 push   %r14
> ffffffff8106f5a9:  49 89 f6              mov    %rsi,%r14
> ffffffff8106f5ac:  41 55                 push   %r13
> ffffffff8106f5ae:  41 54                 push   %r12
> ffffffff8106f5b0:  49 89 fc              mov    %rdi,%r12
> ffffffff8106f5b3:  53                    push   %rbx
> ffffffff8106f5b4:  44 8b a8 44 e0 ff ff  mov    -0x1fbc(%rax),%r13d
> ffffffff8106f5bb:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
> ffffffff8106f5c0:  4c 89 ff              mov    %r15,%rdi
> ffffffff8106f5c3:  41 ff d6              callq  *%r14            <--- faulting insn
> ffffffff8106f5c6:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
> ffffffff8106f5cb:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
> ffffffff8106f5d2:  00 00
> ffffffff8106f5d4:  44 39 a8 44 e0 ff ff  cmp    %r13d,-0x1fbc(%rax)
>
> and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
> as in the photo. Thus, the CALL instruction tries to call the timer
> function 'fn' which we pass as an argument to call_timer_fn.
>
> However, the address we're trying to call in %r14 is garbage:
> 0x455300323d504544 and not in canonical form, causing the #GP.
>
> So basically what happens is suspend to RAM corrupts something
> containing one or more timer functions and we end up calling crap after
> resume.
>
> If you want to debug this further, you could try playing through
> Documentation/power/basic-pm-debugging.txt and see whether suspend to
> disk works. There's also a section 2 which talks about testing suspend
> to RAM which could be of help.
>
> But let me add Rafael and Thomas - they should have much better ideas
> than me.
>
> Guys, thread starts here:
> http://marc.info/?l=linux-kernel&m=138468134321335

This looks like a softirq bug to me (and related to cpuidle).

I'm wondering if that happens with any of the older kernels or just 3.12?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

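The two checks in the quoted analysis (that call_timer_fn+0x33 is the faulting CALL, and that the value in %r14 is not a canonical address) can be reproduced by hand. The following is a minimal sketch, assuming 48-bit virtual addresses (4-level paging); that width is an assumption about the reporter's machine and is not stated in the thread. The helper names are illustrative only. It also lays the %r14 value out in little-endian memory order to show which bytes ended up where the function pointer should have been.

```python
import struct

# Values taken from the disassembly and the register dump quoted in the thread.
CALL_TIMER_FN = 0xFFFFFFFF8106F590
FAULT_OFFSET = 0x33
R14 = 0x455300323D504544


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all equal.

    va_bits=48 is an assumption (4-level paging), not a fact from the thread.
    """
    top = addr >> (va_bits - 1)                 # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


def bytes_in_memory(value: int) -> bytes:
    """Lay out a 64-bit value the way x86_64 stores it (little-endian)."""
    return struct.pack("<Q", value)


def printable(raw: bytes) -> str:
    """Render bytes as ASCII, replacing non-printable bytes with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    fault_addr = CALL_TIMER_FN + FAULT_OFFSET
    print(hex(fault_addr))                  # address of the callq *%r14
    print(is_canonical(fault_addr))         # kernel text address
    print(is_canonical(R14))                # the bogus call target
    print(printable(bytes_in_memory(R14)))  # what the "pointer" really holds
```

Under these assumptions the call target is non-canonical, and its bytes in memory order read as the ASCII text "DEP=2", a NUL byte, then "SE". That fits the suggestion above that something overwrote the timer structure with unrelated data, although it does not by itself identify what did the overwriting.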
* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
From: Borislav Petkov @ 2013-11-17 22:46 UTC
To: Rafael J. Wysocki; +Cc: Francis Moreau, LKML, Thomas Gleixner, Linux PM list

On Sun, Nov 17, 2013 at 11:34:20PM +0100, Rafael J. Wysocki wrote:
> This looks like a softirq bug to me (and related to cpuidle).

Reportedly, it happens right after resume from RAM. Francis, is that
correct?

> I'm wondering if that happens with any of the older kernels or just
> 3.12?

That could be helpful, yeah.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
From: Francis Moreau @ 2013-11-18 12:21 UTC
To: Borislav Petkov, Rafael J. Wysocki; +Cc: LKML, Thomas Gleixner, Linux PM list

On 17/11/2013 23:46, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 11:34:20PM +0100, Rafael J. Wysocki wrote:
>> This looks like a softirq bug to me (and related to cpuidle).
>
> Reportedly, it happens right after resume from RAM. Francis, is that
> correct?

Yes, that's correct. I haven't hit this issue in any other situation.

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
From: Francis Moreau @ 2013-11-18 12:20 UTC
To: Rafael J. Wysocki, Borislav Petkov; +Cc: LKML, Thomas Gleixner, Linux PM list

On 17/11/2013 23:34, Rafael J. Wysocki wrote:
> On Sunday, November 17, 2013 11:06:12 PM Borislav Petkov wrote:

[...]

>> Guys, thread starts here:
>> http://marc.info/?l=linux-kernel&m=138468134321335
>
> This looks like a softirq bug to me (and related to cpuidle).
>
> I'm wondering if that happens with any of the older kernels or just 3.12?
>

I can try tonight to find the old kernel package and see whether the
panic happens with it too.

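The Documentation/power/basic-pm-debugging.txt procedure suggested earlier in this thread is built around the /sys/power/pm_test file, which is present when the kernel is configured with CONFIG_PM_DEBUG. Below is a rough sketch of how one test step could be scripted. The function names and the `sysfs_root` parameter are hypothetical and exist only so the code can be tried against a scratch directory; on a real system it has to run as root, and whether the Arch kernel in question exposes pm_test is not confirmed in the thread.

```python
from pathlib import Path

# Test levels described in Documentation/power/basic-pm-debugging.txt,
# ordered from the shallowest test to the deepest one.
PM_TEST_LEVELS = ("freezer", "devices", "platform", "processors", "core")


def run_pm_test(level: str, state: str = "mem",
                sysfs_root: Path = Path("/sys/power")) -> None:
    """Arm one pm_test level, then request the given sleep state.

    sysfs_root is a hypothetical parameter for dry runs; the default
    points at the real sysfs directory and needs root privileges.
    """
    if level not in PM_TEST_LEVELS:
        raise ValueError(f"unknown pm_test level: {level!r}")
    # Select how far into the suspend sequence the kernel should go.
    (sysfs_root / "pm_test").write_text(level)
    # In test mode the kernel stops at the selected step, waits a few
    # seconds, and resumes instead of actually entering the sleep state.
    (sysfs_root / "state").write_text(state)


def reset_pm_test(sysfs_root: Path = Path("/sys/power")) -> None:
    """Switch test mode off so the next suspend is a real one."""
    (sysfs_root / "pm_test").write_text("none")
```

One way to use it would be to call `run_pm_test` for each level in order, starting with "freezer", and note the first level at which the panic shows up after resume, then call `reset_pm_test()` when done.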