From: linas@austin.ibm.com
To: Paul Mackerras <paulus@samba.org>
Cc: linuxppc64-dev@lists.linuxppc.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [2.6] PPC64: log firmware errors during boot.
Date: Thu, 1 Jul 2004 16:06:14 -0500
Message-ID: <20040701160614.I21634@forte.austin.ibm.com>
In-Reply-To: <16610.39955.554139.858593@cargo.ozlabs.ibm.com>; from paulus@samba.org on Wed, Jun 30, 2004 at 08:55:15PM +1000
On Wed, Jun 30, 2004 at 08:55:15PM +1000, Paul Mackerras wrote:
> Linas,
>
> > Firmware can report errors at any time, and not atypically during boot.
> > However, these reports were being discarded until rtasd comes up,
> > which occurs fairly late in the boot cycle. As a result, firmware
> > errors during boot were being silently ignored.
>
> As far as I can see the main change is in log_rtas_len, which is
> called from pSeries_log_error, which is called from do_event_scan and
> rtasd(), and do_event_scan is only called from rtasd(). And
> get_eventscan_parms() is already called at the beginning of rtasd().
Yes, but rtasd starts up late in the boot process. Most of the
"interesting" manipulations with firmware are old history by then,
and thus, any firmware errors encountered during the boot were never
logged.
> So I don't see the point of the get_eventscan_parms call in
> log_rtas_len.
If the parms aren't set up, then the rtas_error_log_max is zero,
and, as a result, the message is never logged. By initializing
rtas_error_log_max to the correct non-zero value, the errors can
get logged.
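
A minimal user-space sketch of that lazy initialization, assuming the
maximum log size starts at zero and is fetched on first use. The names
(loggable_len, fetch_eventscan_parms, ERROR_LOG_MAX_DEFAULT) and the
1024-byte value are hypothetical stand-ins, not the code in the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the actual kernel code. */
#define ERROR_LOG_MAX_DEFAULT 1024  /* assumed size that firmware would report */

size_t error_log_max;               /* stays zero until the parameters are read */
int parms_reads;                    /* counts how often "firmware" was queried */

/* Stand-in for get_eventscan_parms(): fetch the maximum log size. */
static void fetch_eventscan_parms(void)
{
    parms_reads++;
    error_log_max = ERROR_LOG_MAX_DEFAULT;
}

/*
 * Return how many bytes of a len-byte report may be logged.
 * Without the lazy fetch, an early caller would see a limit of zero
 * and the report would be dropped.
 */
size_t loggable_len(size_t len)
{
    if (error_log_max == 0)
        fetch_eventscan_parms();
    assert(error_log_max != 0);
    return len < error_log_max ? len : error_log_max;
}
```

With this shape, a caller that runs before the daemon has started still
gets a non-zero limit, and the parameters are only fetched once.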
> > This patch at least gets them printk'ed so that at least they show
> > up in boot.msg/syslog. There are two other logging mechanisms,
> > nvram and rtas, that I didn't touch because I don't understand
> > the repercussions. In particular, nvram logging isn't enabled
> > until late in the boot ... but what's the point of nvram logging
> > if not to catch messages that occurred very early in boot ??
>
> Indeed.
>
> As for printk'ing the errors, it is annoying and it seems of somewhat
> dubious benefit to me, given that it is just incomprehensible hex
> numbers that can go on and on. There has to be a better way.
Yes, well, you'll be hard-pressed to find a lover of the hex format
anywhere. Let's review the history of the design decisions that
got us to this point. I think a better solution might then become
evident.
-- Originally, these binary messages from firmware were decoded
in the kernel, and printed out in 'plain English'. However, there
were problems: 1) the format of the binary kept evolving; I think
we are now up to version 6. 2) the need for supporting version 6
and *all* of the earlier versions led to dreaded kernel bloat.
For the current user-space decoder:
# wc *.c *.h
2207 7056 67959 total
-- So the decision was wisely made to move this all to user-space.
But what shall the communications link between user-space and kernel be?
Somebody, somewhere, I know not who or why, decided that they should
go into syslog. And so here we are.
How else could we do this? I have never had to architect a kernel-to-user
data communications interface, so I don't know what the alternatives
are. We could queue them up to some file in /proc, which user-space
reads. Or maybe /sys instead ?? Maybe a stunt with sockets? Some
new device in /dev/ that can be opened, read, closed? How should
the user space daemon indicate that it's picked up the message and
doesn't need it any more? Write a msg number to a /proc file?
Maybe each individual message should go in its own file, and user
space just rm's that file after it's fetched/saved the message.
I dunno, I think any one of these could be whipped up in a jiffy.
Convincing the user-space to use the interface might be harder.
Pick one. If it can be coded in under a day, I can volunteer to
do that.
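
To make the "write a msg number back" idea concrete, here is a rough
user-space model of a bounded message queue in which the reader
acknowledges by sequence number. It is only a sketch under assumptions:
the names (err_queue_put, err_queue_peek, err_queue_ack) and the sizes
are invented for illustration, and it says nothing about which of
/proc, /sys or /dev would carry the data:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_SLOTS 8   /* assumed capacity */
#define MSG_BYTES   64  /* assumed maximum message size */

struct err_msg {
    unsigned long seq;              /* sequence number handed to the reader */
    size_t len;                     /* number of valid bytes in data[] */
    unsigned char data[MSG_BYTES];  /* raw, undecoded firmware report */
};

struct err_queue {
    struct err_msg slot[QUEUE_SLOTS];
    unsigned long next_seq;  /* sequence number the next message will get */
    unsigned long acked;     /* every seq below this has been acknowledged */
};

/* Producer: queue a message. Returns its seq, or -1 if full or too long. */
long err_queue_put(struct err_queue *q, const void *buf, size_t len)
{
    assert(q != NULL && buf != NULL);
    if (len > MSG_BYTES || q->next_seq - q->acked >= QUEUE_SLOTS)
        return -1;
    struct err_msg *m = &q->slot[q->next_seq % QUEUE_SLOTS];
    m->seq = q->next_seq;
    m->len = len;
    memcpy(m->data, buf, len);
    return (long)q->next_seq++;
}

/* Consumer: look at the oldest unacknowledged message, or NULL if none. */
const struct err_msg *err_queue_peek(const struct err_queue *q)
{
    if (q->acked == q->next_seq)
        return NULL;
    return &q->slot[q->acked % QUEUE_SLOTS];
}

/* Consumer: acknowledge everything up to and including seq. */
int err_queue_ack(struct err_queue *q, unsigned long seq)
{
    if (seq < q->acked || seq >= q->next_seq)
        return -1;
    q->acked = seq + 1;
    return 0;
}
```

A slot is reused only after its message has been acknowledged, so a
report queued early in boot stays available until the daemon has
fetched and saved it.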
> Putting
> it in nvram seems like a better option to me. I don't know of any
> reason why we can't use nvram quite early on.
Me neither; Jake knows. I thought the whole point of nvram was to not
lose the messages during a crash; the messages are promptly copied out
of nvram once the system is up and stable; nvram is a staging area,
not a permanent repository.
--linas