* Frustrating Random Reboots, seeking suggestions
From: Hui Zhou @ 2006-06-09 14:57 UTC
To: linux-kernel
Hi Lists,

I understand this kind of request for help may be slightly off topic here,
but hoping for some clue in my desperation, here goes:

I am running a Linux machine with a self-programmed PVR on it. All was
well until I reinstalled the Linux system a few weeks ago. Now I am
suffering from random reboots. The reboots do not leave any debug
messages or clues. After some isolation, I finally narrowed it down to
a blank-scene marking program, bkmark. Running bkmark against any
recording randomly reboots the computer. By random, I mean it may
complete successfully once, but if I repeat it a few times, a reboot
will happen. On average, it reboots every 2 - 3 runs.

I am comfortable with seg faults, which, given time, I can debug. But
these random reboots are new to me and I have no clue at all. How and
why would a userland program reboot the system?

I am running Debian stable with a self-compiled, unpatched 2.6.16.15
PREEMPT kernel on a single Pentium 2.8GHz on an Intel 865P motherboard.
bkmark uses the libmpeg2 shared library. The source code is 471 lines,
available on request. The same program ran without problems on the
previous system (Debian unstable), on the one before that (Debian
stable, but that was 5 months ago), and even with the same kernel
(2.6.14.6; I updated the kernel after this problem occurred).

More info is available, but I felt it might be inappropriate to post
here, and honestly I have no clue which info is relevant. Any
suggestions, clues, or advice on how to debug or narrow down the cause
would be much appreciated.

Thank you.
--
Hui Zhou
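
Because the reboots leave no messages behind, one way to narrow things down is to wrap the runs in a small harness that forces a log line to disk before each attempt; after a reboot, a "start" line without a matching "exit" line shows which run was in flight. A minimal sketch, assuming Python is available on the machine. The names `log_synced` and `run_logged`, the log file name, and the command line in the usage comment are hypothetical and not taken from the original setup:

```python
import os
import subprocess
import time


def log_synced(log_path, message):
    """Append one timestamped line and fsync it so it is likely to
    survive a sudden reboot."""
    with open(log_path, "a") as log:
        log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}\n")
        log.flush()
        os.fsync(log.fileno())


def run_logged(command, runs, log_path="repro.log"):
    """Run `command` `runs` times, logging before and after each attempt.

    Returns the list of exit codes. A 'start' line with no matching
    'exit' line in the log marks the run that was interrupted.
    """
    exit_codes = []
    for attempt in range(1, runs + 1):
        log_synced(log_path, f"run {attempt} start: {' '.join(command)}")
        result = subprocess.run(command)
        log_synced(log_path, f"run {attempt} exit: {result.returncode}")
        exit_codes.append(result.returncode)
    return exit_codes


# Hypothetical usage (substitute the real program and recording):
#   run_logged(["./bkmark", "recording.mpg"], runs=10)
```

The harness only records which run was interrupted; it does not explain why. Its value is in making the "every 2 - 3 runs" estimate reproducible across library or kernel changes.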

* Re: Frustrating Random Reboots, seeking suggestions
From: Chase Venters @ 2006-06-09 16:01 UTC
To: Hui Zhou; +Cc: linux-kernel
On Fri, 9 Jun 2006, Hui Zhou wrote:
> Hi Lists,
>
> I understand this kind of request for help may be slightly off topic here,
> but hoping for some clue in my desperation, here goes:
>
> I am running a Linux machine with a self-programmed PVR on it. All was
> well until I reinstalled the Linux system a few weeks ago. Now I am
> suffering from random reboots. The reboots do not leave any debug
> messages or clues. After some isolation, I finally narrowed it down to
> a blank-scene marking program, bkmark. Running bkmark against any
> recording randomly reboots the computer. By random, I mean it may
> complete successfully once, but if I repeat it a few times, a reboot
> will happen. On average, it reboots every 2 - 3 runs.
>
> I am comfortable with seg faults, which, given time, I can debug. But
> these random reboots are new to me and I have no clue at all. How and
> why would a userland program reboot the system?
>
> I am running Debian stable with a self-compiled, unpatched 2.6.16.15
> PREEMPT kernel on a single Pentium 2.8GHz on an Intel 865P motherboard.
> bkmark uses the libmpeg2 shared library. The source code is 471 lines,
> available on request. The same program ran without problems on the
> previous system (Debian unstable), on the one before that (Debian
> stable, but that was 5 months ago), and even with the same kernel
> (2.6.14.6; I updated the kernel after this problem occurred).
>
> More info is available, but I felt it might be inappropriate to post
> here, and honestly I have no clue which info is relevant. Any
> suggestions, clues, or advice on how to debug or narrow down the cause
> would be much appreciated.
>
> Thank you.
>
Try to rule out any hardware problems first (run memtest86, check for
high heat / crappy power).

If you're still having trouble, get a serial cable and plug it into
another computer running a terminal program. Enable serial console support
in your kernel (and on your kernel command line). Once the kernel has
booted, use SysRq on the serial console to turn the console logging level
up to maximum. If you're lucky, you'll catch some sort of diagnostic
message on the serial console before the reboot happens.
Cheers,
Chase
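
The serial-console setup can be sanity-checked from the running system. The sketch below, in Python, extracts the `console=` parameters from a kernel command line string and reads the current console log level from the contents of `/proc/sys/kernel/printk`. The function names are hypothetical, and the sample strings (device `ttyS0`, 115200 baud, root device) are assumptions for illustration rather than values from this thread:

```python
def console_params(cmdline):
    """Return the values of every console= parameter on a kernel command line."""
    return [arg.split("=", 1)[1]
            for arg in cmdline.split()
            if arg.startswith("console=")]


def has_serial_console(cmdline):
    """True if any console= parameter names a ttyS* serial port."""
    return any(value.startswith("ttyS") for value in console_params(cmdline))


def console_loglevel(printk_text):
    """Return the current console log level, which is the first of the
    four fields in /proc/sys/kernel/printk."""
    return int(printk_text.split()[0])


# Sample inputs (assumed values, for illustration only).
sample_cmdline = "root=/dev/hda1 ro console=tty0 console=ttyS0,115200n8"
print(console_params(sample_cmdline))
print(has_serial_console(sample_cmdline))
print(console_loglevel("7\t4\t1\t7\n"))
```

On a live system the two inputs would come from reading `/proc/cmdline` and `/proc/sys/kernel/printk`. Only messages with a priority value below the console log level reach the console, which is why raising it to the maximum matters here.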

* Re: Frustrating Random Reboots, seeking suggestions
From: Hui Zhou @ 2006-06-10 2:37 UTC
To: linux-kernel
On Fri, Jun 09, 2006 at 11:01:22AM -0500, Chase Venters wrote:
>On Fri, 9 Jun 2006, Hui Zhou wrote:
>>I am running a Linux machine with a self-programmed PVR on it. All was
>>well until I reinstalled the Linux system a few weeks ago. Now I am
>>suffering from random reboots. The reboots do not leave any debug
>>messages or clues. After some isolation, I finally narrowed it down to a
>>blank-scene marking program, bkmark. Running bkmark against any recording
>>randomly reboots the computer. By random, I mean it may complete
>>successfully once, but if I repeat it a few times, a reboot will
>>happen. On average, it reboots every 2 - 3 runs.
>Try to rule out any hardware problems first (run memtest86, check for
>high heat / crappy power).
>
>If you're still having trouble, get a serial cable and plug it into
>another computer running a terminal program. Enable serial console support
>in your kernel (and on your kernel command line). Once the kernel has
>booted, use SysRq on the serial console to turn the console logging level
>up to maximum. If you're lucky, you'll catch some sort of diagnostic
>message on the serial console before the reboot happens.
>
Thanks. memtest86 passes 6 times without errors. The serial console
didn't show anything (it just reboots).

Anyway, I now suspect the Debian libmpeg binary is at fault. I built
it manually from source and statically linked it into the `bkmark'
program. That seems to have cured the random reboot problem: it ran
successfully 4 times. However, the fifth time it ended up in a `D'
state. The only system calls it makes are through libc file I/O and
some signal passing. Any comment on the cause?
--
Hui Zhou
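
For reference, `D' in ps output denotes uninterruptible sleep, typically a task blocked inside the kernel waiting on I/O. A minimal Python sketch for spotting such a process by parsing the text of `/proc/<pid>/stat`; the function names and the sample line (including the PID) are illustrative assumptions, not data from this system:

```python
def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on
    whitespace.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def is_uninterruptible(stat_text):
    """True if the process is in state 'D' (uninterruptible sleep)."""
    return parse_stat(stat_text)[2] == "D"


# Made-up sample line, truncated after the first few fields.
sample = "4321 (bkmark) D 1 4321 4321 0 -1 4194304"
print(parse_stat(sample))
print(is_uninterruptible(sample))
```

Where the kernel exposes it, `/proc/<pid>/wchan` names the kernel function the task is sleeping in, and SysRq-t dumps the state of every task to the console; either may help show what a stuck process is waiting on.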

* Re: Frustrating Random Reboots, seeking suggestions
From: Ingo Oeser @ 2006-06-10 8:52 UTC
To: Hui Zhou; +Cc: linux-kernel
Hi Hui Zhou,
On Saturday, 10. June 2006 04:37, Hui Zhou wrote:
> Thanks. memtest86 passes 6 times without errors. The serial console
> didn't show anything (it just reboots).
>
> Anyway, I now suspect the Debian libmpeg binary is at fault. I built
> it manually from source and statically linked it into the `bkmark'
> program. That seems to have cured the random reboot problem: it ran
> successfully 4 times. However, the fifth time it ended up in a `D'
> state. The only system calls it makes are through libc file I/O and
> some signal passing. Any comment on the cause?
Do you also see the problem if you decode from file to memory only,
without any display?
NO: You have some problem with your peripherals.
YES: Check for heat and power problems.
If you are brave you could try some cpuburn variant to put the heat
to the maximum.
WARNING: This could kill your CPU and might void your warranty,
since this is not "normal use" of your CPU :-)
Good luck!
Regards
Ingo Oeser

* Re: Frustrating Random Reboots, seeking suggestions
From: Hui Zhou @ 2006-06-10 11:37 UTC
To: Ingo Oeser; +Cc: linux-kernel
On Sat, Jun 10, 2006 at 10:52:03AM +0200, Ingo Oeser wrote:
>Hi Hui Zhou,
>
>On Saturday, 10. June 2006 04:37, Hui Zhou wrote:
>> Thanks. memtest86 passes 6 times without errors. The serial console
>> didn't show anything (it just reboots).
>>
>> Anyway, I now suspect the Debian libmpeg binary is at fault. I built
>> it manually from source and statically linked it into the `bkmark'
>> program. That seems to have cured the random reboot problem: it ran
>> successfully 4 times. However, the fifth time it ended up in a `D'
>> state. The only system calls it makes are through libc file I/O and
>> some signal passing. Any comment on the cause?
>
>Do you also see the problem if you decode from file to memory only,
>without any display?
>
>NO: You have some problem with your peripherals.
There is no display. The program just marks blank scenes or scene
changes and dumps the results to a text file for another program to
analyze.
>
>YES: Check for heat and power problems.
>
> If you are brave you could try some cpuburn variant to put the heat
> to the maximum.
>
> WARNING: This could kill your CPU and might void your warranty,
> since this is not "normal use" of your CPU :-)
No, I am not that brave. :) However, I am now fairly certain it is not
a heat problem. After relinking with a new libmpeg binary, it hasn't
rebooted yet (8+ hours). Is there any possibility that some binary code
can randomly trigger reboots on certain CPUs? (It sounds absurd, but
you kernel guys would know better.)

Now I am concerned about the `D' state. Is that a problem in the
kernel?
Thanks.
--
Hui Zhou

* Re: Frustrating Random Reboots, seeking suggestions
From: Pavel Machek @ 2006-06-10 15:27 UTC
To: Hui Zhou; +Cc: Ingo Oeser, linux-kernel
Hi!
> >YES: Check for heat and power problems.
> >
> > If you are brave you could try some cpuburn variant to put the heat
> > to the maximum.
> >
> > WARNING: This could kill your CPU and might void your warranty,
> > since this is not "normal use" of your CPU :-)
>
> No, I am not that brave. :) However, I am now fairly certain it is not
> a heat problem. After relinking with a new libmpeg binary, it hasn't
> rebooted yet (8+ hours). Is there any possibility that some binary code
> can
Wrong answer. If you are fairly sure it is not a heat problem, try some
cpuburn. See http://www.livejournal.com/~pavelmachek -- you are
unlikely to physically damage anything.
Pavel
--
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html