* 2.6.26.3-rt3 bug report
From: John Kacur @ 2008-08-24 22:51 UTC (permalink / raw)
To: Steven Rostedt, LKML, RT; +Cc: Thomas Gleixner, Ingo Molnar
I haven't seen this one before.
Bad page state in process 'firefox-bin'
page:ffffe20001011f00 flags:0x0100000000000000
mapping:ffffe20001011f18 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 4180, comm: firefox-bin Tainted: G W 2.6.26.3-rt3 #3
Call Trace:
[<ffffffff80287a72>] bad_page+0x6f/0x9d
[<ffffffff8029184c>] ? inc_zone_page_state+0x5f/0x6b
[<ffffffff80289046>] free_hot_cold_page+0x88/0x1e6
[<ffffffff8028920b>] free_hot_page+0x10/0x12
[<ffffffff8028922a>] __free_pages+0x1d/0x26
[<ffffffff80294b95>] __pte_alloc+0x8e/0x99
[<ffffffff80294cff>] handle_mm_fault+0x15f/0x766
[<ffffffff8045f2d6>] ? rt_mutex_down_read_trylock+0x1ee/0x1f9
[<ffffffff804629d9>] do_page_fault+0x51d/0x92c
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff8027c930>] ? ftrace_now+0x9/0xb
[<ffffffff80281548>] ? tracing_hist_preempt_start+0xf1/0x10d
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff8027c930>] ? ftrace_now+0x9/0xb
[<ffffffff804602f9>] error_exit+0x0/0x56
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
* Re: 2.6.26.3-rt3 bug report
From: John Kacur @ 2008-08-25 23:15 UTC (permalink / raw)
To: Steven Rostedt, LKML, RT; +Cc: Thomas Gleixner, Ingo Molnar
On Mon, Aug 25, 2008 at 12:51 AM, John Kacur <jkacur@gmail.com> wrote:
> I haven't seen this one before.
>
> Bad page state in process 'firefox-bin'
> page:ffffe20001011f00 flags:0x0100000000000000
> mapping:ffffe20001011f18 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 4180, comm: firefox-bin Tainted: G W 2.6.26.3-rt3 #3
>
> Call Trace:
> [<ffffffff80287a72>] bad_page+0x6f/0x9d
> [<ffffffff8029184c>] ? inc_zone_page_state+0x5f/0x6b
> [<ffffffff80289046>] free_hot_cold_page+0x88/0x1e6
> [<ffffffff8028920b>] free_hot_page+0x10/0x12
> [<ffffffff8028922a>] __free_pages+0x1d/0x26
> [<ffffffff80294b95>] __pte_alloc+0x8e/0x99
> [<ffffffff80294cff>] handle_mm_fault+0x15f/0x766
> [<ffffffff8045f2d6>] ? rt_mutex_down_read_trylock+0x1ee/0x1f9
> [<ffffffff804629d9>] do_page_fault+0x51d/0x92c
> [<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
> [<ffffffff8027c930>] ? ftrace_now+0x9/0xb
> [<ffffffff80281548>] ? tracing_hist_preempt_start+0xf1/0x10d
> [<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
> [<ffffffff8027c930>] ? ftrace_now+0x9/0xb
> [<ffffffff804602f9>] error_exit+0x0/0x56
>
> ---------------------------
> | preempt count: 00000000 ]
> | 0-level deep critical section nesting:
> ----------------------------------------
>
Slightly different form today.
Bad page state in process 'firefox-bin'
page:ffffe20000daa360 flags:0x0100000000000000
mapping:ffffe20000daa378 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 16955, comm: firefox-bin Not tainted 2.6.26.3-rt3 #3
Call Trace:
[<ffffffff80287a72>] bad_page+0x6f/0x9d
[<ffffffff8029184c>] ? inc_zone_page_state+0x5f/0x6b
[<ffffffff80289046>] free_hot_cold_page+0x88/0x1e6
[<ffffffff8028920b>] free_hot_page+0x10/0x12
[<ffffffff8028922a>] __free_pages+0x1d/0x26
[<ffffffff80294b95>] __pte_alloc+0x8e/0x99
[<ffffffff80294cff>] handle_mm_fault+0x15f/0x766
[<ffffffff8045f2d6>] ? rt_mutex_down_read_trylock+0x1ee/0x1f9
[<ffffffff804629d9>] do_page_fault+0x51d/0x92c
[<ffffffff8027f57a>] ? trace_hardirqs_off+0x11d/0x136
[<ffffffff8045feeb>] ? __spin_unlock_irqrestore+0x29/0x4d
[<ffffffff8045feeb>] ? __spin_unlock_irqrestore+0x29/0x4d
[<ffffffff8023168b>] ? hrtick_set+0x8f/0x100
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff8027c930>] ? ftrace_now+0x9/0xb
[<ffffffff80281a97>] ? tracing_hist_preempt_stop+0x2cb/0x2f5
[<ffffffff8020c94d>] ? retint_swapgs+0xe/0x13
[<ffffffff8045f52c>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff804602f9>] error_exit+0x0/0x56
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
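The two traces above can be compared mechanically. What follows is a minimal Python sketch, not part of the original reports: it keeps only the frames that are not prefixed with `?` (entries the unwinder found on the stack but could not confirm as part of the call chain). The name `reliable_frames` and the `SAMPLE_TRACE` constant are illustrative; the trace lines themselves are copied from the first report.

```python
import re

# Matches one frame of a kernel call trace, for example:
#   [<ffffffff80287a72>] bad_page+0x6f/0x9d
#   [<ffffffff8029184c>] ? inc_zone_page_state+0x5f/0x6b
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def reliable_frames(trace):
    """Return the symbol names of frames that are not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            frames.append(m.group("sym"))
    return frames

# First nine frames of the first report, copied verbatim.
SAMPLE_TRACE = """\
[<ffffffff80287a72>] bad_page+0x6f/0x9d
[<ffffffff8029184c>] ? inc_zone_page_state+0x5f/0x6b
[<ffffffff80289046>] free_hot_cold_page+0x88/0x1e6
[<ffffffff8028920b>] free_hot_page+0x10/0x12
[<ffffffff8028922a>] __free_pages+0x1d/0x26
[<ffffffff80294b95>] __pte_alloc+0x8e/0x99
[<ffffffff80294cff>] handle_mm_fault+0x15f/0x766
[<ffffffff8045f2d6>] ? rt_mutex_down_read_trylock+0x1ee/0x1f9
[<ffffffff804629d9>] do_page_fault+0x51d/0x92c
"""

print(" <- ".join(reliable_frames(SAMPLE_TRACE)))
```

Run on either report, the reliable frames give the same path: a page fault reaches `__pte_alloc`, which frees a page through `__free_pages`, and `bad_page` fires during that free. Only the `?` entries differ between the two days.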
* Re: 2.6.26.3-rt3 bug report
From: Jon Masters @ 2008-08-26 4:14 UTC (permalink / raw)
To: John Kacur; +Cc: Steven Rostedt, LKML, RT, Thomas Gleixner, Ingo Molnar
On Tue, 2008-08-26 at 01:15 +0200, John Kacur wrote:
> Slightly different form today.
>
> Bad page state in process 'firefox-bin'
> page:ffffe20000daa360 flags:0x0100000000000000
> mapping:ffffe20000daa378 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 16955, comm: firefox-bin Not tainted 2.6.26.3-rt3 #3
Er, does firefox run reliably on a non-RT 2.6.26.3 kernel? This many
random calls to bad_page suggest more of a RAM problem.
Jon.
* Re: 2.6.26.3-rt3 bug report
From: Carsten Emde @ 2008-08-26 5:49 UTC (permalink / raw)
To: Jon Masters
Cc: John Kacur, Steven Rostedt, LKML, RT, Thomas Gleixner,
Ingo Molnar
Jon Masters wrote:
> On Tue, 2008-08-26 at 01:15 +0200, John Kacur wrote:
>> Slightly different form today.
>> Bad page state in process 'firefox-bin'
>> page:ffffe20000daa360 flags:0x0100000000000000
>> mapping:ffffe20000daa378 mapcount:0 count:0
>> Trying to fix it up, but a reboot is needed
>> Backtrace:
>> Pid: 16955, comm: firefox-bin Not tainted 2.6.26.3-rt3 #3
> Er, does firefox run reliably on a non-RT 2.6.26.3 kernel? This many
> random calls to bad_page suggest more of a RAM problem.
Don't think so. Same problem here:
Bad page state in process 'bonobo-activati'
page:c18f66fc flags:0x40000000 mapping:c18f670c mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 4169, comm: bonobo-activati Not tainted 2.6.26.3-rt3 #2
[<c0438e52>] ? printk+0x14/0x1a
[<c026e8ef>] bad_page+0x4e/0x79
[<c026f3d3>] free_hot_cold_page+0x5b/0x1dc
[<c026f5a1>] free_hot_page+0xf/0x11
[<c026f5c3>] __free_pages+0x20/0x2b
[<c02789a0>] __pte_alloc+0x6f/0x77
[<c0278a4d>] handle_mm_fault+0xa5/0x56c
[<c0243a1c>] ? rt_mutex_down_read+0x15c/0x164
[<c0218759>] do_page_fault+0x2c1/0x674
[<c021b21d>] ? enqueue_task+0x5a/0x66
[<c043bb66>] ? __spin_unlock_irqrestore+0x24/0x42
[<c0241e4f>] ? rt_mutex_adjust_prio+0x1a/0x31
[<c043bc92>] ? __spin_lock_irqsave+0x1e/0x38
[<c021f37d>] ? try_to_wake_up+0x176/0x180
[<c043bb5a>] ? __spin_unlock_irqrestore+0x18/0x42
[<c0241e4f>] ? rt_mutex_adjust_prio+0x1a/0x31
[<c0241e61>] ? rt_mutex_adjust_prio+0x2c/0x31
[<c0241e61>] ? rt_mutex_adjust_prio+0x2c/0x31
[<c043a8f7>] ? rt_write_slowunlock+0x1cd/0x1d5
[<c027ccb0>] ? mprotect_fixup+0x238/0x282
[<c0259384>] ? audit_syscall_exit+0x2b6/0x2d1
[<c0204a4e>] ? resume_userspace+0x6/0x1c
[<c0218498>] ? do_page_fault+0x0/0x674
[<c043bf02>] error_code+0x72/0x78
--cbe
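One detail is shared by all three reports in this thread: the `mapping` value lies a few bytes past the `page` address itself. The Python sketch below only does that subtraction. Reading a constant offset as a hint that the stale value points back into the page descriptor, rather than being random corruption from bad RAM, is an assumption and not something the thread establishes. The dictionary keys and the helper name `mapping_offset` are illustrative; the addresses are copied from the reports.

```python
# (page, mapping) pairs copied from the three "Bad page state" reports.
REPORTS = {
    "x86_64, pid 4180": (0xffffe20001011f00, 0xffffe20001011f18),
    "x86_64, pid 16955": (0xffffe20000daa360, 0xffffe20000daa378),
    "i386, pid 4169": (0xc18f66fc, 0xc18f670c),
}

def mapping_offset(page, mapping):
    """Distance in bytes from the page descriptor to the reported mapping."""
    return mapping - page

for name, (page, mapping) in REPORTS.items():
    print(f"{name}: mapping - page = {mapping_offset(page, mapping):#x}")
# x86_64, pid 4180: mapping - page = 0x18
# x86_64, pid 16955: mapping - page = 0x18
# i386, pid 4169: mapping - page = 0x10
```

The offset is the same on both x86_64 reports and differs only with the word size on i386, which is harder to explain with failing memory on two unrelated machines.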
* Re: 2.6.26.3-rt3 bug report
From: John Kacur @ 2008-08-29 14:29 UTC (permalink / raw)
To: Carsten Emde
Cc: Jon Masters, Steven Rostedt, LKML, RT, Thomas Gleixner,
Ingo Molnar
On Tue, Aug 26, 2008 at 7:49 AM, Carsten Emde <Carsten.Emde@osadl.org> wrote:
> Jon Masters wrote:
>> On Tue, 2008-08-26 at 01:15 +0200, John Kacur wrote:
>>> Slightly different form today.
>>> Bad page state in process 'firefox-bin'
>>> page:ffffe20000daa360 flags:0x0100000000000000
>>> mapping:ffffe20000daa378 mapcount:0 count:0
>>> Trying to fix it up, but a reboot is needed
>>> Backtrace:
>>> Pid: 16955, comm: firefox-bin Not tainted 2.6.26.3-rt3 #3
>> Er, does firefox run reliably on a non-RT 2.6.26.3 kernel? This many
>> random calls to bad_page suggest more of a RAM problem.
> Don't think so. Same problem here:
> Bad page state in process 'bonobo-activati'
> page:c18f66fc flags:0x40000000 mapping:c18f670c mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 4169, comm: bonobo-activati Not tainted 2.6.26.3-rt3 #2
> [<c0438e52>] ? printk+0x14/0x1a
> [<c026e8ef>] bad_page+0x4e/0x79
> [<c026f3d3>] free_hot_cold_page+0x5b/0x1dc
> [<c026f5a1>] free_hot_page+0xf/0x11
> [<c026f5c3>] __free_pages+0x20/0x2b
> [<c02789a0>] __pte_alloc+0x6f/0x77
> [<c0278a4d>] handle_mm_fault+0xa5/0x56c
> [<c0243a1c>] ? rt_mutex_down_read+0x15c/0x164
> [<c0218759>] do_page_fault+0x2c1/0x674
> [<c021b21d>] ? enqueue_task+0x5a/0x66
> [<c043bb66>] ? __spin_unlock_irqrestore+0x24/0x42
> [<c0241e4f>] ? rt_mutex_adjust_prio+0x1a/0x31
> [<c043bc92>] ? __spin_lock_irqsave+0x1e/0x38
> [<c021f37d>] ? try_to_wake_up+0x176/0x180
> [<c043bb5a>] ? __spin_unlock_irqrestore+0x18/0x42
> [<c0241e4f>] ? rt_mutex_adjust_prio+0x1a/0x31
> [<c0241e61>] ? rt_mutex_adjust_prio+0x2c/0x31
> [<c0241e61>] ? rt_mutex_adjust_prio+0x2c/0x31
> [<c043a8f7>] ? rt_write_slowunlock+0x1cd/0x1d5
> [<c027ccb0>] ? mprotect_fixup+0x238/0x282
> [<c0259384>] ? audit_syscall_exit+0x2b6/0x2d1
> [<c0204a4e>] ? resume_userspace+0x6/0x1c
> [<c0218498>] ? do_page_fault+0x0/0x674
> [<c043bf02>] error_code+0x72/0x78
>
Hi Carsten
Any progress or ideas here?
Thank you
John
* Re: 2.6.26.3-rt3 bug report
From: Carsten Emde @ 2008-08-30 14:29 UTC (permalink / raw)
To: John Kacur
Cc: Jon Masters, Steven Rostedt, LKML, RT, Thomas Gleixner,
Ingo Molnar
Hi John,
> Carsten Emde wrote:
>> Jon Masters wrote:
>>> John Kacur wrote:
>>>> Slightly different form today.
>>>> Bad page state in process 'firefox-bin'
>>>> page:ffffe20000daa360 flags:0x0100000000000000
>>>> mapping:ffffe20000daa378 mapcount:0 count:0
>>>> Trying to fix it up, but a reboot is needed
>>>> Backtrace:
>>>> Pid: 16955, comm: firefox-bin Not tainted 2.6.26.3-rt3 #3
>>> Er, does firefox run reliably on a non-RT 2.6.26.3 kernel? This many
>>> random calls to bad_page suggest more of a RAM problem.
>> Don't think so. Same problem here:
>> Bad page state in process 'bonobo-activati'
>> page:c18f66fc flags:0x40000000 mapping:c18f670c mapcount:0 count:0
>> [..]
>
> Hi Carsten
> Any progress or ideas here?
No. Admittedly, we have not yet started working on the 2.6.26 RT
tree. Finishing the 2.6.24 RT tree was quite difficult and took somewhat
longer than earlier trees, so we now really need to use it and
integrate it into the various industrial projects that have waited long
enough for it. This will certainly keep us busy for the next few weeks.
Unfortunately, the "Bad page state" is not the only regression.
Currently, we know of the following problems in 2.6.26.3-rt3:
- Bad page state
- Tasks blocked for more than 120 seconds
- Various kernel OOPSes
- Increased worst-case latencies as compared to 2.6.24.7-rt17
The problem with these regressions is that they occur only very rarely
and under specific high system load conditions. So we will probably
first need to work on test conditions that will make them happen more
frequently. The "Bad page state" thing appears especially difficult to
fix, since it is probably due to memory corruption.
We will keep checking this and future releases of the 2.6.26 RT tree
and work on them as time permits, but we do not expect to have
2.6.26.X-rtY ready for production before October. Meanwhile, if you
find a way to reproduce the "Bad page state" message more reliably, we
would certainly appreciate it and use it here.
Thanks,
Carsten.
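Since these regressions are rare and load-dependent, counting their signatures in captured kernel logs is one way to judge whether a given test load makes them more frequent. The following Python sketch is a rough illustration under stated assumptions: the "Bad page state" pattern comes from the reports in this thread, while the hung-task and Oops patterns and the second line of `SAMPLE_LOG` are hypothetical and should be checked against real log output.

```python
import re
from collections import Counter

# Signature patterns. Only "bad_page" is taken from the reports in this
# thread; "hung_task" and "oops" are assumed wordings.
SIGNATURES = {
    "bad_page": re.compile(r"Bad page state in process '(?P<comm>[^']+)'"),
    "hung_task": re.compile(r"blocked for more than \d+ seconds"),
    "oops": re.compile(r"\bOops\b"),
}

def count_signatures(log_text):
    """Count how many log lines match each regression signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

# Lines 1 and 3 are copied from the thread; line 2 is a made-up example.
SAMPLE_LOG = """\
Bad page state in process 'firefox-bin'
INFO: task foo:123 blocked for more than 120 seconds.
Bad page state in process 'bonobo-activati'
"""

counts = count_signatures(SAMPLE_LOG)
for name in SIGNATURES:
    print(f"{name}: {counts[name]}")
```

Comparing these counts per hour of runtime across different load generators would show which load, if any, provokes the "Bad page state" message more often.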