From: ebiederm@xmission.com (Eric W. Biederman)
To: Keith Chew <keith.chew@gmail.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Hang on "echo b > /proc/sysrq-trigger"
Date: Wed, 29 Feb 2012 15:34:30 -0800
Message-ID: <87zkc1w6wp.fsf@xmission.com>
In-Reply-To: <CAKaWdxa326eP47=nENh8_1NmRCadTB=o8jyWO2dAzZcn9yCwiA@mail.gmail.com> (Keith Chew's message of "Thu, 1 Mar 2012 11:06:53 +1300")

Keith Chew <keith.chew@gmail.com> writes:

> Hi Eric
>
>>
>> Historically a lot of issues have had to do with which cpu you are
>> entering the bios from. So you might try pinning your process
>> to different cpus and see if you can make the failure more deterministic.
>>
>
> We are using a Celeron 575 uniprocessor, so we do not have the option
> to pin on another cpu. I have tried compiling the kernel in both UP
> and SMP configurations, but sadly both cause the hang.

Ok. That rules out a bunch of things, and emergency_restart may not
be much different in practice.

>> Ugh. The other possibility is that there is an intermittent failure in
>> the hardware, that prevents the boot/reboot. Wrong values on pull-up
>> resistors have been known to cause that kind of thing.
>>
>
> Thank you very much for this pointer, will feed that back to the
> manufacturer and see if it will give them some clues. The original
> purpose for this reboot exercise was to ensure the software will
> handle a power failure without any OS/data corruptions. With this new
> discovery of unreliable reboot, the next worry is "If reboot is not
> reliable, is the boot process also susceptible to the same issue?". I
> have not rigged up any hardware to simulate a periodic full shutdown
> and boot up process, but will be planning to set this up next.
>
> Thanks again, if you have any other suggestions for us to try, I am
> all ears!

I would check with your BIOS folks and perhaps play with the kernel
option. The most reliable way to perform a reset is to trigger a board
reset by writing to 0xcf9 or a similar register. I expect your BIOS
does that, and you can probably get the kernel to do the same. I would
definitely test whether you can write to the mostly standard 0xcf9
register directly from the kernel and trigger a reset that way.
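
As a rough illustration of what a direct write to 0xcf9 could look
like, here is a user-space sketch in Python that goes through /dev/port
rather than through the kernel's own reboot path. It is not code from
this thread. The bit meanings (bit 1 selects a hard reset, a 0->1
transition on bit 2 triggers it) follow the usual Intel ICH-style
layout and are an assumption for this particular board; the function
names and the --really-reset flag are made up for the example.

```python
import os
import sys
import time

CF9_PORT = 0xCF9     # reset control register (assumed ICH-style layout)
CF9_SYS_RST = 0x02   # bit 1: request a hard reset rather than a soft one
CF9_RST_CPU = 0x04   # bit 2: a 0 -> 1 transition triggers the reset


def cf9_reset_sequence(current):
    """Return the two byte values to write, in order, for a hard reset.

    The other bits of the current register value are preserved.
    """
    base = current & ~(CF9_SYS_RST | CF9_RST_CPU) & 0xFF
    return [base | CF9_SYS_RST, base | CF9_SYS_RST | CF9_RST_CPU]


def hard_reset_via_cf9(port_dev="/dev/port"):
    """Write the reset sequence to 0xcf9 through port_dev.

    Needs root on a real machine. The reset is immediate: nothing is
    synced or unmounted first.
    """
    fd = os.open(port_dev, os.O_RDWR)
    try:
        os.lseek(fd, CF9_PORT, os.SEEK_SET)
        current = os.read(fd, 1)[0]
        for value in cf9_reset_sequence(current):
            # Seek before every write: each write advances the offset.
            os.lseek(fd, CF9_PORT, os.SEEK_SET)
            os.write(fd, bytes([value]))
            time.sleep(0.001)  # short pause between the two writes
    finally:
        os.close(fd)


if __name__ == "__main__":
    if "--really-reset" in sys.argv:
        hard_reset_via_cf9()
    else:
        # Dry run: only show the values that would be written.
        print([hex(v) for v in cf9_reset_sequence(0x00)])
```

Without --really-reset the script only prints the two values it would
write, so it is safe to run as a dry check. If it resets the board
reliably where the normal reboot path hangs, that points at the reset
method rather than at the boot path. On x86 the kernel's own 0xcf9
method is, as far as I know, selected with reboot=pci on the kernel
command line.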

Once you are past a reset, and with a single cpu, any remaining
failures will be happening in the boot path. So the only possible
points of failure are devices that behave differently between a soft
reset and a power-on reset.

I would check to see if your board perhaps supports POST codes or any
other debugging that will let you see where you are hanging.

It sounds like there is some very rare failure that is going to be
a challenge to track down. I would definitely test more than one
motherboard to ensure that you can reproduce the problem on more
than one piece of hardware. Sometimes hardware is just broken.

Eric