From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Tej <bewith.tej@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>, linux-kernel@vger.kernel.org
Subject: Re: Kernel PANIC with 2.6.27-rc6
Date: Thu, 25 Sep 2008 18:56:35 +0400
Message-ID: <20080925145635.GA7277@localhost>
In-Reply-To: <f1c9d250809250735i4b98201ft753d6ec2959097e6@mail.gmail.com>
[Tej - Thu, Sep 25, 2008 at 08:05:23PM +0530]
| On 9/11/08, Jason Wessel <jason.wessel@windriver.com> wrote:
| > Tej wrote:
| >> observed a panic on 2.6.27-rc6 caused by enabling
| >> "CONFIG_KGDB_TESTS_ON_BOOT" option
| >>
| >> panic logged using serial console and .config are attached.
| >>
| >> Linux version 2.6.27-rc6 (root@luser-desktop) (gcc version 4.2.3 (Ubuntu 4.2.3-8
| > [clip]
| >> kgdb: Registered I/O driver kgdbts.
| >> kgdbts:RUN plant and detach test
| >> kgdbts:RUN sw breakpoint test
| >> kgdbts:RUN bad memory access test
| >> kgdbts:RUN singlestep test 1000 iterations
| >> kgdbts:RUN singlestep [0/1000]
| >> kgdbts:RUN singlestep [100/1000]
| >> kgdbts: BP mismatch c0109c02 expected c02ad250
| >
| >
| >
| > So if we break this down into what it means, the kgdb test got an
| > exception and stopped somewhere other than where it placed the
| > breakpoint. In this case 0xc0109c02 which you can see right after the
| > text "BP mismatch".
|
| I have observed this bug from 2.6.27-rc1 through rc7;
| however, 2.6.26 was fine.
|
|
| >
| > This address is in your stack trace below (read on)
|
| >
| >
| >> ------------[ cut here ]------------
| >> WARNING: at drivers/misc/kgdbts.c:302 check_and_rewind_pc+0xb2/0xe0()
| >> Modules linked in:
| >> Pid: 0, comm: swapper Not tainted 2.6.27-rc6 #16
| >> [<c0125d04>] warn_on_slowpath+0x54/0x70
| >> [<c01260f0>] ? __call_console_drivers+0x60/0x70
| >> [<c013c15b>] ? up+0x2b/0x40
| >> [<c0126634>] ? release_console_sem+0x1a4/0x1c0
| >> [<c01548f7>] ? __rmqueue_smallest+0xb7/0x130
| >> [<c02445e7>] ? __copy_to_user_ll+0x57/0x60
| >> [<c0153dc0>] ? probe_kernel_write+0x20/0x40
| >> [<c02ad250>] ? kgdbts_break_test+0x0/0x30
| >> [<c0126bab>] ? printk+0x1b/0x20
| >> [<c02ad250>] ? kgdbts_break_test+0x0/0x30
| >> [<c02adfb2>] check_and_rewind_pc+0xb2/0xe0
| >> [<c0109c02>] ? mwait_idle+0x32/0x40
| >
| >
| > So if you were to do:
| > gdb ./vmlinux
| > i line *0xc0109c02
| >
| > It will print the line of source that caused the problem. It is
| > probably the case that this is a victim however, as this looks like
| > some kind of nmi race condition.
| >
| > Perhaps you could confirm this?
|
| OK, so I ran a git bisect, which pointed to:
|
| commit 3ed3f06295e69700fa808396f7b350bff2b69de0
| Author: Cyrill Gorcunov <gorcunov@gmail.com>
| Date: Wed Jun 4 01:00:47 2008 +0400
|
| x86: nmi - consolidate nmi_watchdog_default for 32bit mode
|
| 64bit mode bootstrap code does set nmi_watchdog to NMI_NONE
| by default and doing the same on 32bit mode is safe too.
| Such an action saves us from several #ifdef.
|
|
|
| git bisect log:
|
| git-bisect start
| # bad: [6e86841d05f371b5b9b86ce76c02aaee83352298] Linux 2.6.27-rc1
| git-bisect bad 6e86841d05f371b5b9b86ce76c02aaee83352298
| # good: [bce7f793daec3e65ec5c5705d2457b81fe7b5725] Linux 2.6.26
| git-bisect good bce7f793daec3e65ec5c5705d2457b81fe7b5725
| # bad: [d20b27478d6ccf7c4c8de4f09db2bdbaec82a6c0] V4L/DVB (8415): gspca: Infinite loop in i2c_w() of etoms.
| git-bisect bad d20b27478d6ccf7c4c8de4f09db2bdbaec82a6c0
| # bad: [666484f0250db2e016948d63b3ef33e202e3b8d0] Merge branch 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
| git-bisect bad 666484f0250db2e016948d63b3ef33e202e3b8d0
| # bad: [d59fdcf2ac501de99c3dfb452af5e254d4342886] Merge commit 'v2.6.26' into x86/core
| git-bisect bad d59fdcf2ac501de99c3dfb452af5e254d4342886
| # good: [3de352bbd86f890dd0c5e1c09a6a1b0b29e0f8ce] Merge branch 'x86/mpparse' into x86/devel
| git-bisect good 3de352bbd86f890dd0c5e1c09a6a1b0b29e0f8ce
| # bad: [b4df32f4aeef8794d0135fc8dc250acb44cfee60] x86: fix warning in e820_reserve_resources with 32bit
| git-bisect bad b4df32f4aeef8794d0135fc8dc250acb44cfee60
| # bad: [7f0be02c5ed1deb04c54c6a17f412e04f417df11] x86: move boot_params declaring to setup.c
| git-bisect bad 7f0be02c5ed1deb04c54c6a17f412e04f417df11
| # bad: [4b62ac9a2b859f932afd5625362c927111b7dd9b] Merge branch 'x86/nmi' into x86/devel
| git-bisect bad 4b62ac9a2b859f932afd5625362c927111b7dd9b
| # bad: [f781b03c4b1c713ac000877c8bbc31fc4164a29b] x86: touch_nmi_watchdog(): reset alert counters for supported nmi_watchdog modes only
| git-bisect bad f781b03c4b1c713ac000877c8bbc31fc4164a29b
| # good: [96f9dcb10755e96eae706b9e869c0dc25adb3d74] x86: nmi_64.c - use for_each_possible_cpu helper
| git-bisect good 96f9dcb10755e96eae706b9e869c0dc25adb3d74
| # good: [ba3a5974239293d921235e6fa82b09b670e674ef] - fix typo in include/asm-x86/nmi.h
| git-bisect good ba3a5974239293d921235e6fa82b09b670e674ef
| # bad: [3ed3f06295e69700fa808396f7b350bff2b69de0] x86: nmi - consolidate nmi_watchdog_default for 32bit mode
| git-bisect bad 3ed3f06295e69700fa808396f7b350bff2b69de0
| # good: [3d1ba1da2b4ff4ace7801e99fb9a3095b182d847] x86: fix nmi.c build bug
| git-bisect good 3d1ba1da2b4ff4ace7801e99fb9a3095b182d847
|
|
| One more point: I have observed this bug only on an Intel VT machine; on a
| non-VT machine I was not able to trigger it.
|
| cat /proc/cpuinfo | grep vmx
|
| flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
| pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm
| constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr
| lahf_lm
| flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
| pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm
| constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr
| lahf_lm
|
| HTH
|
| >
| > My guess would roughly be arch/x86/kernel/process.c
| >
| > 170 __monitor((void *)&current_thread_info()->flags, 0, 0);
| >
| > And perhaps this is a dereference error, or it will be an even less
| > obvious problem where the return from the nmi kgdb uses to sync the
| > processors has created some kind of invalid context. This may or may
| > not be a direct problem with kgdb, but nothing in this area had really
| > changed since the 2.6.26 kernel.
| >
| > Unfortunately I could not reproduce the problem. If you consistently
| > see this with each boot, but you do not see it on the 2.6.27 rc1 or
| > 2.6.26 released kernel for instance, it is definitely worth bisecting,
| > as the problem was caused elsewhere in the kernel.
| >
| >> [<c02ad250>] ? kgdbts_break_test+0x0/0x30
| >> [<c02acdd1>] validate_simple_test+0x21/0xb0
| >> [<c02ad5a7>] run_simple_test+0x107/0x260
| >> [<c0153e00>] ? probe_kernel_read+0x20/0x40
| >> [<c02acec4>] kgdbts_put_char+0x14/0x20
| >> [<c014aa36>] put_packet+0x86/0xe0
| >> [<c014b650>] kgdb_handle_exception+0x2f0/0xd90
| >> [<c0114cc8>] kgdb_notify+0x58/0x180
| >> [<c013c33d>] notifier_call_chain+0x2d/0x60
| >> [<c013c38c>] __atomic_notifier_call_chain+0x1c/0x30
| >> [<c013c3ba>] atomic_notifier_call_chain+0x1a/0x20
| >> [<c013c44d>] notify_die+0x2d/0x30
| >> [<c01056ca>] die_nmi+0x3a/0x100
| >> [<c0112697>] nmi_watchdog_tick+0x187/0x190
| >> [<c0106067>] do_nmi+0x87/0x2a0
| >> [<c04d5a73>] nmi_stack_correct+0x26/0x2b
| >> [<c011007b>] ? msr_seek+0xb/0x80
| >> [<c0109c02>] ? mwait_idle+0x32/0x40
| >> [<c01026c5>] cpu_idle+0x55/0xf0
| >> [<c04ae02e>] rest_init+0x4e/0x60
| >> [<c069794f>] start_kernel+0x23f/0x2c0
| >> [<c0697270>] ? unknown_bootoption+0x0/0x210
| >> [<c0697077>] __init_begin+0x77/0xb0
| >
| >
| > Jason.
| >
Thanks for the feedback, Tej! I will check this commit.
- Cyrill -
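
A side note on reading the quoted trace: each entry has the form
[<address>] symbol+offset/size, so the start of the symbol is the address
minus the offset. The sketch below shows that arithmetic for the mwait_idle
entry, assuming only the entry format seen in the quoted oops. The helper
name decode_trace_entry is hypothetical and not part of any kernel tool.

```python
import re

# Matches entries such as "[<c0109c02>] ? mwait_idle+0x32/0x40".
# A leading "?" marks an address that was found on the stack but that the
# unwinder could not confirm as a real return address.
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def decode_trace_entry(line):
    """Hypothetical helper: split one stack-trace entry into its parts."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    offset = int(m.group("offset"), 16)
    return {
        "symbol": m.group("symbol"),
        "address": addr,
        "start": addr - offset,  # address of the first byte of the symbol
        "size": int(m.group("size"), 16),
        "reliable": m.group("unreliable") is None,
    }


entry = decode_trace_entry("| >> [<c0109c02>] ? mwait_idle+0x32/0x40")
print(entry["symbol"], hex(entry["start"]))
```

The address reported after "BP mismatch" (c0109c02) thus falls 0x32 bytes
into mwait_idle, which is the same location the quoted gdb lookup would
resolve to a source line.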