From: Keith Owens <kaos@sgi.com>
To: linux-ia64@vger.kernel.org
Subject: [ANNOUNCE] salinfo 1.0 is available
Date: Thu, 15 Dec 2005 07:17:32 +0000
Message-ID: <13035.1134631052@kao2.melbourne.sgi.com>
There is a new and (so far) unofficial version of salinfo on
ftp://ftp.ocs.com.au, in /pub/salinfo-1.0.tar.bz2 and
salinfo-1.0-1.src.rpm. I hope that they will move to the official HP
location soon.
The base functionality of salinfo has not changed: it still reads from
/proc/sal/{cmc,cpe,init,mca}/* and writes to /var/log/salinfo. The
changes are above this layer and are aimed at making the salinfo code
more resilient, making it less of a potential denial of service and
making it easier to post process the SAL records.
Note: you need this kernel patch to let salinfo_decode 1.0 see alarm
signals. Without this patch it will still work, but it will not log the
dropped records correctly.
diff-tree 05f70395c642bed0300bc1955bfa8c0f93de2bc2 (from 885da19e8044051a92cfd70099398c373245c431)
Author: Keith Owens <kaos@sgi.com>
Date: Fri Dec 2 13:40:15 2005 +1100
[IA64] Allow salinfo_decode to detect signals on read
Return -EINTR instead of -ERESTARTSYS when signals are delivered during
a blocked read of /proc/sal/*/event. This allows salinfo_decode to
detect signals when it is blocked on a read of those files.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
index ca68e6e..1461dc6 100644
--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -293,7 +293,7 @@ retry:
 		if (file->f_flags & O_NONBLOCK)
 			return -EAGAIN;
 		if (down_interruptible(&data->sem))
-			return -ERESTARTSYS;
+			return -EINTR;
 	}
 
 	n = data->cpu_check;
Changelog extract for salinfo 1.0.
2005-12-14 Keith Owens <kaos@sgi.com>
* Released as 1.0.
* salinfo_decode_all is now a C program instead of a shell script. It
monitors the health of the salinfo_decode tasks.
* Add salinfo_decode option -i pct, do not write records if the -D
filesystem inode percentage used is pct or greater.
* Add salinfo_decode option -s pct, do not write records if the -D
filesystem space used percentage is pct or greater.
* Add salinfo_decode option -l limit, limit the number of events per
minute.
* Add salinfo_decode option -T filename, write a trigger record to
filename for each SAL record.
* Site specific options can be set in /etc/sysconfig/salinfo_decode_all.
* Count and log the number of dropped records.
* Build allows separate source and object directories.
* Fix use after free bug in read_salinfo_decode_oem().
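As an illustration of the -i and -s checks, a filesystem usage test
could be sketched with statvfs(3) as below. This is an assumption about
one reasonable approach, not the code in salinfo_decode: the helper
names pct_used and over_limit are hypothetical, the percentage is
rounded up, and a limit of 0 is treated as "not set".

```c
#include <sys/statvfs.h>

/* Percentage of total that is in use, rounded up. Returns 0 if total is 0. */
static unsigned pct_used(unsigned long long total, unsigned long long free_cnt)
{
	unsigned long long used;

	if (total == 0 || free_cnt >= total)
		return 0;
	used = total - free_cnt;
	return (unsigned)((used * 100 + total - 1) / total);
}

/*
 * Check the filesystem that holds dir against the two limits.
 * Returns 1 if either limit is reached, 0 if not, -1 if statvfs() fails.
 * A limit of 0 disables that check (assumption made for this sketch).
 */
static int over_limit(const char *dir, unsigned inode_pct, unsigned space_pct)
{
	struct statvfs st;

	if (statvfs(dir, &st) != 0)
		return -1;
	if (inode_pct != 0 &&
	    pct_used((unsigned long long)st.f_files,
		     (unsigned long long)st.f_ffree) >= inode_pct)
		return 1;
	if (space_pct != 0 &&
	    pct_used((unsigned long long)st.f_blocks,
		     (unsigned long long)st.f_bfree) >= space_pct)
		return 1;
	return 0;
}
```

With this sketch, over_limit("/var/log/salinfo", 90, 90) returning 1
would mean that a record should be dropped instead of written.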
Default /etc/sysconfig/salinfo_decode_all.
# Define custom options in /etc/sysconfig/salinfo_decode_all
#
# All variables come in two forms, global (applies to all record types) and
# per record (only applies to that record type). The per record variables
# have a prefix of 'CMC_', 'CPE_', 'INIT_' or 'MCA_', global settings have no
# prefix. The global value is used if there is no record specific variable in
# the environment.
#
# Required variables are :-
#
# DIRECTORY The value passed as parameter -D to salinfo_decode.
#
# RETRIES How many times a version of salinfo_decode is restarted
# before we give up and log the failure.
#
# Optional variables are :-
#
# INODE_PCT Passed as -i <value> to salinfo_decode.
#
# SPACE_PCT Passed as -s <value> to salinfo_decode.
#
# RATE_LIMIT Passed as -l <value> to salinfo_decode.
#
# TRIGGER Passed as -T <value> to salinfo_decode.
# Required variables
export DIRECTORY=/var/log/salinfo
export RETRIES=3
# Optional variables, these are rule of thumb limits
export INODE_PCT=90	# drop records if inodes used is >= 90%
export SPACE_PCT=90	# drop records if space used is >= 90%
export RATE_LIMIT=10	# drop records if more than 10/minute
# TRIGGER= is not set, it only makes sense if you install a post processing program
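The lookup rule described in the comments above (per record variable
first, then the global one) could be sketched in C roughly as follows.
The function name lookup_setting and the 64 byte key buffer are
assumptions for illustration; this is not taken from the
salinfo_decode_all source.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Look up a setting for one record type, where type is "CMC", "CPE",
 * "INIT" or "MCA". Returns the per record value (e.g. CMC_RATE_LIMIT)
 * if it is in the environment, else the global value (RATE_LIMIT),
 * else NULL.
 */
static const char *lookup_setting(const char *type, const char *name)
{
	char key[64];
	const char *val;
	int n;

	n = snprintf(key, sizeof(key), "%s_%s", type, name);
	if (n > 0 && (size_t)n < sizeof(key)) {
		val = getenv(key);
		if (val != NULL)
			return val;
	}
	return getenv(name);
}
```

For example, with RATE_LIMIT=10 and MCA_RATE_LIMIT=2 in the
environment, lookup_setting("MCA", "RATE_LIMIT") gives "2" while
lookup_setting("CPE", "RATE_LIMIT") gives "10".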
Typical syslog entries from salinfo_decode_all when any of the
salinfo_decode children fail.
Dec 15 06:27:50 salinfo_decode_all[2637]: Retry 1 for type INIT, previous status was 15
Dec 15 06:28:05 salinfo_decode_all[2637]: Type INIT died very quickly, no respawn, last status was 15
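The kind of decision behind these two messages can be sketched as a
small function. This is only a guess at the shape of the logic: the
order of the checks, the min_seconds threshold for "died very quickly"
and all of the names below are hypothetical and are not taken from
salinfo_decode_all.

```c
enum respawn_action {
	RESPAWN,		/* start the child again and count a retry */
	GIVE_UP_TOO_FAST,	/* child died too soon after it was started */
	GIVE_UP_RETRIES		/* retry limit (RETRIES) has been reached */
};

/*
 * Decide what to do after a salinfo_decode child exits.
 * retries_used: restarts already done for this record type
 * max_retries:  value of RETRIES
 * ran_seconds:  how long the child ran before it died
 * min_seconds:  hypothetical threshold below which a respawn is pointless
 */
static enum respawn_action decide_respawn(int retries_used, int max_retries,
					  long ran_seconds, long min_seconds)
{
	if (ran_seconds < min_seconds)
		return GIVE_UP_TOO_FAST;
	if (retries_used >= max_retries)
		return GIVE_UP_RETRIES;
	return RESPAWN;
}
```

Checking the run time before the retry count means that a child which
fails immediately is not restarted in a tight loop, whatever value
RETRIES has.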
Typical syslog entry when salinfo_decode drops records because of the
limits. This one says that 6 records were dropped because the
filesystem was filling up and 5 records were dropped because they
exceeded the rate limit.
Dec 13 16:31:56 salinfo_decode[21460]: 11 cpe records dropped since Tue Dec 13 16:31:39 2005, 6 -s pct, 5 -l limit
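A per minute limit with a count of dropped records can be sketched as a
fixed window counter, as below. This is one simple way to implement
such a limit and is not necessarily what salinfo_decode does: the
struct, the function names and the fixed 60 second window are
assumptions, and a limit of 0 is treated as "no limit".

```c
#include <time.h>

struct rate_limiter {
	unsigned limit;		/* max events per window, 0 = no limit */
	time_t window_start;	/* start of the current 60 second window */
	unsigned in_window;	/* events accepted in the current window */
	unsigned long dropped;	/* events dropped since the last reset */
};

static void rate_limiter_init(struct rate_limiter *rl, unsigned limit,
			      time_t now)
{
	rl->limit = limit;
	rl->window_start = now;
	rl->in_window = 0;
	rl->dropped = 0;
}

/* Returns 1 if the event may be written, 0 if it must be dropped. */
static int rate_limiter_allow(struct rate_limiter *rl, time_t now)
{
	if (now - rl->window_start >= 60) {
		/* a new window starts, forget the old count */
		rl->window_start = now;
		rl->in_window = 0;
	}
	if (rl->limit != 0 && rl->in_window >= rl->limit) {
		rl->dropped++;
		return 0;
	}
	rl->in_window++;
	return 1;
}
```

The dropped counter is what a periodic log message would report, after
which the caller would set it back to zero.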
Typical syslog entry when salinfo_decode drops trigger records because
the post processing program is not working. The actual cpe records
were still processed and saved; the only things lost in this case were
the post processing triggers.
Dec 13 20:37:27 salinfo_decode[30292]: 4 cpe trigger records dropped since Tue Dec 13 20:37:18 2005
We hope that we do not see this one :). If all the children die and
they have reached their retry limit or they are dying too quickly, then
there is nothing that salinfo_decode_all can do.
Dec 15 06:28:05 salinfo_decode_all[2637]: All children have died, giving up