From: Joshua Roys <joshua.roys@gtri.gatech.edu>
To: "linux-audit@redhat.com" <linux-audit@redhat.com>
Subject: Backwards-compatible string encoding
Date: Fri, 27 Mar 2009 11:18:48 -0400
Message-ID: <49CCEE58.5010800@gtri.gatech.edu>
Hello all,
I have just run into the problem that many of you have: trying to parse
the audit logs.
Yesterday I read through the linux-audit mail archive. Here are the
related topics I have found:
https://www.redhat.com/archives/linux-audit/2006-March/msg00093.html
https://www.redhat.com/archives/linux-audit/2006-March/msg00158.html
https://www.redhat.com/archives/linux-audit/2007-November/msg00036.html
https://www.redhat.com/archives/linux-audit/2008-January/msg00082.html
https://www.redhat.com/archives/linux-audit/2008-March/msg00024.html
https://www.redhat.com/archives/linux-audit/2008-May/msg00029.html
https://www.redhat.com/archives/linux-audit/2008-June/msg00005.html
https://www.redhat.com/archives/linux-audit/2008-August/msg00078.html
https://www.redhat.com/archives/linux-audit/2009-March/msg00018.html
From these I see these requirements (correct me if I am wrong):
- must be backwards-compatible (doesn't break user-space on FC2, etc)
- kernel does not verify incoming user-space strings
- kernel must output strings in a "simple" format (e.g. no XML :-)
- able to write a parser that guarantees all (relevant) input ends up in
output
- use disk space efficiently
- handle UTF-8
Based on things other people have proposed, how does this sound:
- radix prefixes for any non-base10 number (I think audit mostly does
this already?)
- hex-encode strings (and do not quote) if:
  -- contains non-ASCII or non-printable characters
- quote strings if:
  -- contains whitespace or '=' or '"' (in which case you have to output
     something like '\"')
  -- consists entirely of {hex,octal,base10} digits (otherwise it could be
     mistaken for a number or a hex-encoded string)
Or we could save ourselves a little more headache, at the cost of
space/readability, and hex-encode on '=' and '"' too. Looking at
auparse, we may have to hex-encode strings with an embedded '"' anyway.
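On the radix-prefix point, here is a minimal userspace sketch of why the prefixes help a parser. The names format_prefixed() and parse_prefixed() are made up for illustration and are not existing audit functions; the sketch only relies on the standard printf "#" flag and on strtoul() with base 0.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format v with an explicit radix prefix; base is 8, 10 or 16.
 * Hypothetical helper, not part of the audit code. */
static int format_prefixed(char *buf, size_t len, unsigned long v, int base)
{
    switch (base) {
    case 16:
        return snprintf(buf, len, "%#lx", v); /* "0x" prefix */
    case 8:
        return snprintf(buf, len, "%#lo", v); /* leading "0" */
    default:
        return snprintf(buf, len, "%lu", v);  /* plain base 10 */
    }
}

/* Parse the value back. Base 0 tells strtoul() to pick the radix
 * from the prefix: "0x" means hex, a leading "0" means octal. */
static unsigned long parse_prefixed(const char *s)
{
    return strtoul(s, NULL, 0);
}
```

With the prefix in place the reader never has to know which field it is looking at in order to pick a base, which is what makes a generic parser possible.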
Check if you need to encode first, then check for quoting. Something
like...
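// somewhere in kernel/audit.c ?
// hexencode() and quote() are placeholders for helpers yet to be written
char *audit_log_sane_string(const char *str, size_t slen) {
    int quoteme = 0;
    size_t i, numhex = 0;
    for (i = 0; i < slen; i++) {
        unsigned char c = str[i]; // never pass a negative char to isprint()
        // non-ASCII or non-printable: hex-encode the whole string
        if (c > 127 || !isprint(c)) return hexencode(str, slen);
        if (isspace(c) || c == '=' || c == '"') quoteme = 1;
        if (isxdigit(c)) numhex++; // xdigit covers base8,10,16
    }
    // quote if it has special characters or looks like a number
    if (quoteme || numhex == slen) return quote(str, slen);
    return strdup(str); // kstrdup...?
}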
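To make the idea easier to try out, here is a self-contained userspace sketch of the same logic together with a matching decoder. Everything in it is an assumption for illustration: sane_string(), hexencode(), quote(), hexval() and decode_value() are hypothetical names, the uppercase hex output and the choice to backslash-escape '\' as well as '"' are mine, and none of it is existing kernel or auparse code.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hex-encode slen bytes as uppercase hex; caller frees the result. */
static char *hexencode(const char *str, size_t slen)
{
    static const char digits[] = "0123456789ABCDEF";
    char *out = malloc(2 * slen + 1);
    size_t i;

    if (!out)
        return NULL;
    for (i = 0; i < slen; i++) {
        unsigned char c = (unsigned char)str[i];
        out[2 * i] = digits[c >> 4];
        out[2 * i + 1] = digits[c & 0x0f];
    }
    out[2 * slen] = '\0';
    return out;
}

/* Wrap in double quotes, backslash-escaping '"' and '\'; caller frees. */
static char *quote(const char *str, size_t slen)
{
    char *out = malloc(2 * slen + 3); /* worst case: every byte escaped */
    size_t i, o = 0;

    if (!out)
        return NULL;
    out[o++] = '"';
    for (i = 0; i < slen; i++) {
        if (str[i] == '"' || str[i] == '\\')
            out[o++] = '\\';
        out[o++] = str[i];
    }
    out[o++] = '"';
    out[o] = '\0';
    return out;
}

/* Encoder: check for hex-encoding first, then for quoting. */
static char *sane_string(const char *str, size_t slen)
{
    int quoteme = 0;
    size_t i, numhex = 0;
    char *out;

    for (i = 0; i < slen; i++) {
        unsigned char c = (unsigned char)str[i];
        if (c > 127 || !isprint(c))
            return hexencode(str, slen);
        if (isspace(c) || c == '=' || c == '"')
            quoteme = 1;
        if (isxdigit(c))
            numhex++;
    }
    if (quoteme || numhex == slen)
        return quote(str, slen);
    out = malloc(slen + 1);
    if (out) {
        memcpy(out, str, slen);
        out[slen] = '\0';
    }
    return out;
}

/* Value of one hex digit; only called on validated input. */
static int hexval(int c)
{
    return isdigit(c) ? c - '0' : toupper(c) - 'A' + 10;
}

/* Decode a string field: quoted, bare hex, or bare literal; caller frees. */
static char *decode_value(const char *val)
{
    size_t n = strlen(val), i, o = 0;
    char *out = malloc(n + 1);

    if (!out)
        return NULL;
    if (n >= 2 && val[0] == '"' && val[n - 1] == '"') {
        for (i = 1; i < n - 1; i++) {
            if (val[i] == '\\' && i + 1 < n - 1)
                i++; /* drop the escape character, keep the next byte */
            out[o++] = val[i];
        }
    } else if (n % 2 == 0 && strspn(val, "0123456789ABCDEFabcdef") == n) {
        for (i = 0; i < n; i += 2)
            out[o++] = (char)(hexval(val[i]) * 16 + hexval(val[i + 1]));
    } else {
        memcpy(out, val, n);
        o = n;
    }
    out[o] = '\0';
    return out;
}
```

The decoder only works because all-hex strings get quoted on the way out: an unquoted value made up purely of hex digits can then only be a hex-encoded string. It applies to string-valued fields only (a bare decimal number would be misread), and since it returns a C string, a hex-encoded value with an embedded NUL byte would appear truncated to the caller.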
Oh, and if anyone has ideas for making shadow-utils play nicer with
audit, I possibly have that kind of time on my hands. Also, getting rid
of the extra punctuation [:(,)] would be great.
What do you all think?
Joshua Roys