Audit for live supervision

public inbox for linux-audit@redhat.com

* Audit for live supervision
@ 2008-08-14  7:14 Kay Hayen
  2008-08-14 14:04 ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-14  7:14 UTC (permalink / raw)
  To: linux-audit

Hello,

I would like to present our plan for using audit briefly. We have made a 
prototype implementation, and discovered some things along the way.

We are making a middleware for ATC systems. We are writing it in Ada and 
partially in Python. In Python we do mostly the prototypes, so the prototype 
code is in Python.

For that, we have one problem: to uniquely identify a process that 
communicated with the outside world. We have settled on the process start 
date. That date can be determined in a way so that it is stable 
(using /proc/stat btime field, elf note for Hertz value, and then translate 
ticks from /proc/pid/stat into a date) and reproducible outside of the 
process. Given the pid and start_date, we can check if a process is still 
alive, reliably. The method is notably different from what ps does, which may 
(or so I propose after looking at the source) output different start times in 
different runs.
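
A minimal sketch of the start date computation described above, in Python like the prototype. The function names are hypothetical, and the Hertz value is taken from os.sysconf("SC_CLK_TCK") as a stand-in for the ELF note lookup mentioned above. Integer arithmetic is used throughout so the result is identical in every run.

```python
import os


def read_btime(proc_stat_text):
    """Return the boot time (seconds since 1970) from /proc/stat content."""
    for line in proc_stat_text.splitlines():
        if line.startswith("btime "):
            return int(line.split()[1])
    raise ValueError("no btime line found")


def read_start_ticks(pid_stat_text):
    """Return starttime (field 22) from /proc/<pid>/stat content."""
    # Field 2 (comm) is in parentheses and may contain spaces, so split
    # after the last ')'. The first remaining item is field 3 (state),
    # which puts field 22 at index 19.
    rest = pid_stat_text.rsplit(")", 1)[1].split()
    return int(rest[19])


def start_date(btime, start_ticks, hertz):
    """Return (seconds since 1970, remaining ticks) of the process start."""
    seconds, ticks = divmod(start_ticks, hertz)
    return btime + seconds, ticks


def life_of(hostname, pid):
    """Build the hostname/pid/start_date identifier for a live process."""
    hertz = os.sysconf("SC_CLK_TCK")  # assumption: replaces the ELF note
    with open("/proc/stat") as f:
        btime = read_btime(f.read())
    with open("/proc/%d/stat" % pid) as f:
        ticks = read_start_ticks(f.read())
    return hostname, pid, start_date(btime, ticks, hertz)
```

Comparing life_of() results taken at two different times tells whether the pid still belongs to the same process.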

We have a daemon running that may or may not fork processes that it monitors, 
for the communicating ones, we want to be able to tell everybody in the 
system (spanning several nodes) that a communication partner is no more, for 
non-communicating ones we simply want to observe and report that e.g. ntpd or 
some monitoring/working shell script is running or not.

The identifier hostname/pid/start_date is therefore what we call a "life" 
of a process. It may restart, but the pid won't wrap around within one tick, 
that is at least the limiting restriction.

Now, one issue I see is that the times that we get from auditd through the 
socket from its child daemon may not match the start_date exactly. I think 
they could. Actually we would prefer to receive the tick at which a process 
started, instead of an absolute time dated fork event, because then we could 
apply our code to calculate the stable time. Alternatively it would be nice 
to know how the time value from auditd comes into existence. In principle, 
for every event we should actually get the tick rather than a date, or 
at least both. Ticks are the real kernel time, aren't they?

Currently we feel we should apply a delta around the times to match them, and 
that's somehow unstable methinks. We would prefer delta to be 0. Otherwise we 
may e.g. run into pid number overruns much easier.

The other thing is sequence numbers. We see in the output sequence numbers for 
each audit event. Very nice. But can you confirm where these sequence numbers 
are created? Are they done in the kernel, in auditd or in its child daemon?

The underlying question is, how safe can we be that we didn't miss anything 
when sequence numbers don't suggest so. We would like to use the lossless 
mode of auditd. Does that simply mean that auditd may get behind in worst 
case?

Then, we have first looked at auditd 1.2 (RHEL3), auditd 1.6 (RHEL5/Ubuntu) 
and auditd 1.7 (Debian and self-compiled for RHEL 5.2). The format did 
undergo important changes and it seems that 1.7 is much more friendly to 
parse. Can you confirm that a type=EOE delimits every event (is that even the 
correct term to use, audit trace, how is it called).

We can't build the rpm due to dependency problems, so I was using the hard 
way, ./configure --prefix=/opt/auditd-1.7 and that works fine on our RHEL 5.2 
it seems. What's not so clear to (me) is which kernel dependency there really 
is. Were there interface changes at all? The changelog didn't suggest so.

BTW: Release-wise, will RHEL 5.3 include the latest auditd? That is our target 
platform for a release next year, and it sure would be nice not to have to 
fix up the audit installation.

One thing I observed with 1.7.4-1 from Debian Testing amd64 is that we never 
see any clone events on the socket (and no forks, but we only know of cron 
doing these anyway), but all execs and exit_groups.

The rules we use are:

# First rule - delete all
-D

# Increase the buffers to survive stress events.
# Make this bigger for busy systems
-b 320

# Feel free to add below this line. See auditctl man page

-a entry,always -S clone -S fork -S vfork
-a entry,always -S execve
-a entry,always -S exit_group -S exit

Very strange. Works fine with self-compile RHEL 5.2, I understand that you are 
not Debian guys, I just wanted to ask you briefly if you were aware of 
anything that could cause that. I am going to report that as a bug (to them) 
otherwise.

With our rules file, we have grouped only similar purpose syscalls that we 
care about. The goal we have is to track all newly created processes, their 
exits and the code they run. If you are aware of anything we miss, please 
point it out.

Also, is it true (I read that yesterday) that every syscall is slowed down for 
every new rule? That means, we are making a mistake by not having only one 
line? And is open() performance really affected by this? Does audit not 
(yet?) use other tracing interface like SystemTap, etc. where people try to 
have 0 cost for inactive traces.

Also on a general basis. Do you recommend using the sub-daemon for the job or 
should we rather use libaudit for the task instead? Any insight is welcome 
here.

What we would like to achieve is:

1. Monitor every created process if it (was) relevant to something. We don't 
want to miss a process however briefly it ran.
2. We don't want to poll periodically, but rather only wake up (and then with 
minimal latency) when something interesting happened. We would want to poll a 
periodic check that forks are still reported, so we would detect a loss of 
service from audit.
3. We don't want to possibly lose or miss anything, even if load gets higher, 
although we don't require to survive a fork bomb.

Sorry for the overlong email. We just hope you can help us identify how to 
make best use of audit for our project.

Best regards,
Kay Hayen


* Re: Audit for live supervision
  2008-08-14  7:14 Audit for live supervision Kay Hayen
@ 2008-08-14 14:04 ` Steve Grubb
  2008-08-15  6:43   ` Kay Hayen
  0 siblings, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-14 14:04 UTC (permalink / raw)
  To: linux-audit

On Thursday 14 August 2008 03:14:07 Kay Hayen wrote:
> I would like to present our plan for using audit briefly. We have made a
> prototype implementation, and discovered some things along the way.

Nice. I'll skip straight to the parts that I think I can comment on.

> Now one issue, I see is that the times that we get from auditd through the
> socket from its child daemon may not match the start_date exactly. 

All time hacks in the audit logs come from the kernel at the instant the 
record is created. They all start by calling audit_log_start, and right here 
is where time is written:

http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1194

The source that is used is current_kernel_time();

> I think they could. Actually we would prefer to receive the tick at which a
> process started,

The audit system has millisecond resolution. This was considered adequate due 
to system ticks being < 1000 Hz. The current_kernel_time() result is a broken 
down time struct similar to pselect's. This is how it's used:

 audit_log_format(ab, "audit(%lu.%03lu:%u): ", 
			t.tv_sec, t.tv_nsec/1000000, serial);
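
On the consumer side, the header written by that format string can be taken apart again. A minimal sketch in Python, assuming the record text contains "msg=audit(sec.msec:serial)" as in the audit.log lines shown in this thread. The name parse_header is hypothetical.

```python
import re

# Matches the "audit(%lu.%03lu:%u)" header as it appears after "msg=".
_HEADER = re.compile(r"msg=audit\((\d+)\.(\d{3}):(\d+)\)")


def parse_header(record):
    """Return (seconds, milliseconds, serial), or None if no header found."""
    match = _HEADER.search(record)
    if match is None:
        return None
    seconds, millis, serial = match.groups()
    return int(seconds), int(millis), int(serial)
```

Keeping seconds and milliseconds as separate integers avoids float rounding when comparing against a tick-derived time.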

> Currently we feel we should apply a delta around the times to match them,
> and that's somehow unstable methinks. We would prefer delta to be 0.
> Otherwise we may e.g. run into pid number overruns much easier.

I'm thinking the audit resolution is higher than the scheduler's ticks. If you 
take the absolute ticks and turn them into <sys/time.h>, 

           struct timespec {
               long    tv_sec;         /* seconds */
               long    tv_nsec;        /* nanoseconds */
           };

Would they match?

> The other thing is sequence numbers. We see in the output sequence numbers
> for each audit event. Very nice. But can you confirm where these sequence
> numbers are created? Are they done in the kernel, in auditd or in its child
> daemon?

They are done in the kernel and are incremented for each audit_log_start so 
that no 2 audit events within the same millisecond have the same serial 
number. Their source is here:

http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1085
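
Given kernel-assigned serials, a consumer can at least notice holes in what it received. A minimal sketch, assuming the serials have already been parsed out of the record headers. The name missing_serials is hypothetical, and a hole is only a hint: a serial may belong to an event that was never routed to this consumer, and wrap-around of the counter is not handled here.

```python
def missing_serials(serials):
    """Return inclusive (first, last) ranges of serials that were not seen."""
    seen = sorted(set(serials))
    gaps = []
    for prev, cur in zip(seen, seen[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps
```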

> The underlying question is, how safe can we be that we didn't miss anything
> when sequence numbers don't suggest so. We would like to use the lossless
> mode of auditd. Does that simply mean that auditd may get behind in worst
> case?

Yes. You would want to do a couple things. Increase the kernel backlog, 
increase auditd priority, & increase audispd's internal queue.

> Can you confirm that a type=EOE delimits every event (is that even
> the correct term to use, audit trace, how is it called).

It delimits every multipart event. You can use something like this to 
determine if you have an event:

if ( r->type == AUDIT_EOE || r->type < AUDIT_FIRST_EVENT ||
                                r->type >= AUDIT_FIRST_ANOM_MSG) {
  have full event...
}
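
For a Python prototype, the same test might look like the sketch below. The numeric values are my reading of the constants in libaudit.h (AUDIT_FIRST_EVENT 1300, AUDIT_EOE 1320, AUDIT_FIRST_ANOM_MSG 2100) and should be checked against the installed header. The function name is hypothetical.

```python
# Assumed values, as defined in libaudit.h / linux/audit.h.
AUDIT_FIRST_EVENT = 1300
AUDIT_EOE = 1320
AUDIT_FIRST_ANOM_MSG = 2100


def completes_event(record_type):
    """True if a record of this numeric type ends the current event.

    Types outside the kernel's multi-record range are single-record events,
    and AUDIT_EOE closes a multipart one.
    """
    return (record_type == AUDIT_EOE
            or record_type < AUDIT_FIRST_EVENT
            or record_type >= AUDIT_FIRST_ANOM_MSG)
```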

> We can't build the rpm due to dependency problems?

If you are on RHEL 5, just edit the spec file to remove --with-prelude. And 
delete any packaging of egginfo files.

> , so I was using the hard way, ./configure --prefix=/opt/auditd-1.7 and that
> works fine on our RHEL 5.2 it seems. What's not so clear to (me) is which
> kernel dependency there really is. Were there interface changes at all?

The best bet is to take the last RHEL5 audit srpm and install it. Modify that 
to have the new tar file. Then remove some of the patches. I have not built 
the current version for RHEL5, so I can't say much except to remove one, run 
rpmbuild -bp, and see if that is ok. Then delete another if so. You do not 
need to do an rpmbuild -ba.

> The changelog didn't suggest so.

There are likely dependency issues for the selinux policy used for the 
zos-remote plugin.

> BTW: Release-wise, will RHEL 5.3 include the latest auditd?

That is the plan. But there will be a point where audit development continues 
and bugfixes are backported rather than new version. At a minimum, 
audit-1.7.5 will be in RHEL5.3. Maybe 1.7.6 if we have another quick release.

> One thing I observed with 1.7.4-1 from Debian Testing amd64 that we won't
> ever see any clone events on the socket (and no forks, but we only know of
> cron doing these anyway), but all execs and exit_groups.

That may be distro dependent. And you should use strace to confirm what you 
are looking for. On x86_64, note there are 2 clone syscall and you should 
have -F arch=b64 and -F arch=b32 for each rule.

>
> The rules we use are:
>
> # First rule - delete all
> -D
>
> # Increase the buffers to survive stress events.
> # Make this bigger for busy systems
> -b 320

bump this up. maybe 8192. That's what we use for CAPP.

> # Feel free to add below this line. See auditctl man page
>
> -a entry,always -S clone -S fork -S vfork

If you are on amd64, I would suggest:

-a entry,always -F arch=b32 -S clone -S fork -S vfork
-a entry,always -F arch=b64 -S clone -S fork -S vfork

and similar for other syscall rules.
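
If the rules file is generated, the per-arch duplication can be produced mechanically. A minimal sketch in Python, using only the auditctl options already shown above. The name biarch_rules is hypothetical, and it assumes every listed syscall exists under that name in both tables.

```python
def biarch_rules(syscalls, arches=("b32", "b64")):
    """Return one 'entry,always' rule line per arch for a syscall group."""
    selectors = " ".join("-S " + name for name in syscalls)
    return ["-a entry,always -F arch=%s %s" % (arch, selectors)
            for arch in arches]


def rules_file(groups):
    """Join the rule lines of several syscall groups into file content."""
    lines = []
    for group in groups:
        lines.extend(biarch_rules(group))
    return "\n".join(lines) + "\n"
```

Calling rules_file([["clone", "fork", "vfork"], ["execve"], ["exit_group", "exit"]]) yields six rule lines, two per group.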

> -a entry,always -S execve
> -a entry,always -S exit_group -S exit

> Very strange. Works fine with self-compile RHEL 5.2, I understand that you
> are not Debian guys, I just wanted to ask you briefly if you were aware of
> anything that could cause that. I am going to report that as a bug (to
> them) otherwise.

There might be tunables that different distros can use with glibc. strace is 
your friend...and having both 32/64 bit rules if amd64 is the target 
platform.

> With our rules file, we have grouped only similar purpose syscalls that we
> care about. The goal we have is to track all newly created processes, their
> exits and the code they run. If you are aware of anything we miss, please
> point it out.

This is a really tricky area. They could mmap a file and execute it. They can 
pass file descriptors between processes and execve /proc/<pid>/fd/4. Or maybe 
take advantage of a hole in a program and overlay memory with another program 
so that /proc shows one thing but it's really another. It's really hard to make 
airtight. SE Linux is your best bet to make sure people stay within the 
bounds that you intend - which means that the real processes are auditable.

> Also, it is true (I read that yesterday) that every syscall is slowed down
> for every new rule? 

Yes, if they are syscall rules. Its best to group as many together as 
possible.

> That means, we are making a mistake by not having only 
> one line?

I wouldn't say a mistake. Its that there will be a performance difference and 
it may not be enough to worry about. You would have to benchmark it.

> And is open() performance really affected by this?

Yes.

> Does audit not  (yet?) use other tracing interface like SystemTap, etc.
> where people try to have 0 cost for inactive traces.

They have a cost. :)  Also, systemtap, while good for some things, is not good 
for auditing. For one, systemtap recompiles the kernel to make new modules. 
You may not want that in your environment. It also has not been tested for 
CAPP/LSPP compliance.

> Also on a general basis. Do you recommend using the sub-daemon for the job
> or should we rather use libaudit for the task instead? Any insight is
> welcome here.

It really depends on what your environment allows. Do you need an audit trail? 
With search tools? And reporting tools? Do you need the system to halt if 
auditing problems occur? Do you need any certifications?

> What we would like to achieve is:
>
> 1. Monitor every created process if it (was) relevant to something. We
> don't want to miss a process however briefly it ran.

This is hard, but can be achieved with help from SE Linux.

> 2. We don't want to poll periodically, but rather only wake up (and then
> with minimal latency) when something interesting happened. We would want to
> poll a periodic check that forks are still reported, so we would detect a
> loss of service from audit.

You might write a audispd plugin for this.

-Steve


* Re: Audit for live supervision
  2008-08-14 14:04 ` Steve Grubb
@ 2008-08-15  6:43   ` Kay Hayen
  2008-08-15 12:54     ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-15  6:43 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit

Hello Steve,

thanks for your reply, very helpful. :-)

> > Now one issue, I see is that the times that we get from auditd through
> > the socket from its child daemon may not match the start_date exactly.
>
> All time hacks in the audit logs come from the kernel at the instant the
> record is created. They all start by calling audit_log_start, and right
> here is where time is written:
>
> http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1194
>
> The source that is used is current_kernel_time();

[...]

> The audit system has millisecond resolution. This was considered adequate
> due to system ticks being < 1000 Hz. The current_kernel_time() result is a
> broken down time struct similar to pselect's. This is how it's used:
>
>  audit_log_format(ab, "audit(%lu.%03lu:%u): ",
> 			t.tv_sec, t.tv_nsec/1000000, serial);

From what I checked, it seems that current_kernel_time() is indeed fed exactly 
by the jiffy/system ticks since boot (at least did I find comments that 
suggest so). I still have to verify how it is translated. There still is the 
issue of translating ticks into seconds since 1970. So far I have only 
achieved to get hands on system boot time in a granularity of one second.

I have no clue if that is the same time used in the kernel to offset the ticks 
value. I will make a delta test once I can. But my suspicion is that the 
real time clock has better resolution than one second, which is all that I 
get from the /proc/stat btime field.

More importantly, and somewhat blocking my tests: With the improved rules I 
get this quite reproducibly when compiling:

type=SYSCALL msg=audit(1218773075.500:118620): arch=c000003e syscall=59 
success=yes exit=0 a0=7fff6f78cf90 a1=7fff6f78cf40 a2=7fff6f78f068 a3=0 
items=2 ppid=11412 pid=11421 auid=4294967295 uid=1000 gid=1000 euid=1000 
suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295 
comm="gcc-4.3" exe="/usr/bin/gcc-4.3" key=(null)

[...]
type=SYSCALL msg=audit(1218773075.496:118624): arch=c000003e syscall=56 
success=yes exit=11421 a0=1200011 a1=0 a2=0 a3=7fc067776770 items=0 
ppid=11407 pid=11412 auid=4294967295 uid=1000 gid=1000 euid=1000 suid=1000 
fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295 
comm="gnatchop" exe="/usr/bin/gnatchop" key=(null)

Please note the _ascending_ sequence number but _descending_ time. This is 
pasted from my audit.log and quite surprising. It triggers an assertion in 
our code, because we also seem to receive things in the wrong order from the 
socket. There was no FORK before the EXECVE that refers to the pid. We 
tolerate that (obviously there are going to be processes that we didn't see 
fork once we start, we don't do an initial scan yet), but we don't tolerate 
that fork returns an existing pid. 
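
A minimal sketch of how such inversions can be listed from parsed headers, assuming each event is a (seconds, milliseconds, serial) tuple. The name time_inversions is hypothetical and serial wrap-around is ignored.

```python
def time_inversions(events):
    """Return (serial, next_serial) pairs where time runs backwards.

    events: iterable of (seconds, milliseconds, serial) tuples.
    """
    ordered = sorted(events, key=lambda event: event[2])
    inversions = []
    for prev, cur in zip(ordered, ordered[1:]):
        # Compare (seconds, milliseconds) pairs of neighbouring serials.
        if cur[:2] < prev[:2]:
            inversions.append((prev[2], cur[2]))
    return inversions
```

Applied to the two records above, it reports the pair (118620, 118624).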

Seems like a bug? Can you have a look at it? I was using for that:

-a entry,always -F arch=b32 -S clone -S fork -S vfork
-a entry,always -F arch=b64 -S clone -S fork -S vfork
-a entry,always -S execve
-a entry,always -S exit_group -S exit

I didn't apply the arch to exit, etc. yet, but there wasn't a pid rollover 
yet, so I don't think that missing an exit is really the issue here. 

Plus I still didn't fully grasp why that arch filter was necessary in the first 
place. I mean, after all, I was simply expecting that by default no filter 
should give all arches. Is that filter actually a selector? Does it have to 
do with the fact that syscall numbers are arch dependent?

> > The other thing is sequence numbers. We see in the output sequence
> > numbers for each audit event. Very nice. But can you confirm where these
> > sequence numbers are created? Are they done in the kernel, in auditd or
> > in its child daemon?
>
> They are done in the kernel and are incremented for each audit_log_start so
> that no 2 audit events within the same millisecond have the same serial
> number. Their source is here:
>
> http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1085

Thanks for the pointer. That looks indeed like nothing can go wrong on the 
reliability side of time and serial. The more astounding my report from above 
is.

> > The underlying question is, how safe can we be that we didn't miss
> > anything when sequence numbers don't suggest so. We would like to use the
> > lossless mode of auditd. Does that simply mean that auditd may get behind
> > in worst case?
>
> Yes. You would want to do a couple things. Increase the kernel backlog,
> increase auditd priority, & increase audispd's internal queue.

That's fine with us. We just wouldn't want to be in an inconsistent state after 
a load peak. I think with a dedicated core for our supervision processes, 
nothing except benchmarks of clone performance (aka fork bombs) will be 
able to trigger any actual issue.

> > Can you confirm that a type=EOE delimits every event (is that even
> > the correct term to use, audit trace, how is it called).
>
> It delimits every multipart event. you can use something like this to
> determine if you have an event:
>
> if ( r->type == AUDIT_EOE || r->type < AUDIT_FIRST_EVENT ||
>                                 r->type >= AUDIT_FIRST_ANOM_MSG) {
>   have full event...
> }

I will have to check if this affects our intended process tracing. The parsing 
is certainly not simplified by it, for a possibly unrelated reason.

We read from a socket, and in data chunks, which are then processed. For that, 
we want to identify the end of a message before processing it. Right now, we 
have switched to waiting for type=EOE and its new line. Without that, the 
problem is that a non-EOE type= record may be a multi-line thing. I think 
parameters are on new lines and that makes it a bit hard to detect the end of 
a complete type= trace. I don't see that in the audit.log, so probably it's a 
problem of the sub-daemon:

 type=EXECVE msg=audit(1218774568.173:129440): argc=4 a0="/bin/sh"
a1="..."
a2="..."
a3="..."

Without a very stateful message parser, one that e.g. knows how many lines are 
to follow an EXECVE, we don't know when to forward it to the part that should 
process it.

What we do first, once we get a message, is the following code:

        # 1. Some records are split across multiple lines. The good thing is
        #    that continuation lines never start with whitespace, so we can
        #    join them back into single lines. This makes the next part easier.

        lines = []

        for line in message.split( "\n" ):
            if line.strip() == "":
                pass
            elif line.startswith( " type=" ):
                lines.append( line )
            else:
                assert line[0] != ' '

                lines[-1] = lines[-1] + ' ' + line

This is in hope that indeed continued lines always start with a non-space and 
type lines always start with a space. Would you consider this format worthy 
and possible to change? 

I have no idea how much it represents an existing external interface, but I 
can imagine you can't change it (easily). Probably the end of type= must be 
detected by terminating empty line in case of those that can be continued. 
But it would be very ugly to have to know the event types that have this so 
early in the decoding process.
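
The chunk handling described above, waiting for the type=EOE line before handing anything on, could be sketched like this. EventBuffer is a hypothetical name, and the sketch assumes the stream starts at a line boundary. Because EOE only closes multipart events, a single-record event stays buffered until the next EOE line arrives.

```python
import re

# A complete "type=EOE" line, with or without the leading space that the
# sub-daemon output shows, including its terminating newline.
_EOE_LINE = re.compile(r"^[ \t]*type=EOE[^\n]*\n", re.MULTILINE)


class EventBuffer:
    """Collect socket chunks and hand out text up to each complete EOE line."""

    def __init__(self):
        self._pending = ""

    def feed(self, chunk):
        """Add a chunk and return the list of messages it completed."""
        self._pending += chunk
        messages = []
        while True:
            match = _EOE_LINE.search(self._pending)
            if match is None:
                break  # no complete EOE line yet, keep buffering
            messages.append(self._pending[:match.end()])
            self._pending = self._pending[match.end():]
        return messages
```

Each returned message can then go through the line-joining step without knowing which record types continue over several lines.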

> > BTW: Release-wise, will RHEL 5.3 include the latest auditd?
>
> That is the plan. But there will be a point where audit development
> continues and bugfixes are backported rather than new version. At a
> minimum, audit-1.7.5 will be in RHEL5.3. Maybe 1.7.6 if we have another
> quick release.

That's OK for us. And until then self-compiled will do.

> If you are on amd64, I would suggest:
>
> -a entry,always -F arch=b32 -S clone -S fork -S vfork
> -a entry,always -F arch=b64 -S clone -S fork -S vfork

Actually that solved the trouble of not seeing anything on "Debian". The fact 
is that we are using RHEL x86 for production and Debian (or Ubuntu) amd64 on 
our development machines. So we never checked RHEL amd64, which likely would 
show the same thing.

But see above, is this possibly a bug?

> There might be tunables that different distros can used with glibc. strace
> is your friend...and having both 32/64 bit rules if amd64 is the target
> platform.

We did that of course. And what was confusing us was that the audit.log did 
actually seem to show the calls. Can that even be?

> > Does audit not  (yet?) use other tracing interface like SystemTap, etc.
> > where people try to have 0 cost for inactive traces.
>
> They have a cost. :)  Also, systemtap while good for some things not good
> for auditing. For one, systemtap recompiles the kernel to make new modules.
> You may not want that in your environment. It also has not been tested for
> CAPP/LSPP compilance.
>
> > Also on a general basis. Do you recommend using the sub-daemon for the
> > job or should we rather use libaudit for the task instead? Any insight is
> > welcome here.
>
> It really depends on what your environment allows. Do you need an audit
> trail? With search tools? And reporting tools? Do you need the system to
> halt if auditing problems occur? Do you need any certifications?

I see. Luckily we are not into security, but only "safety". I can't find 
anything on Wikipedia about it, so I will try to explain it briefly, please 
forgive my limited understanding of it. :-)

Increasingly we in the ATC need to provide a "Software Assurance Level" (SWAL) 
to our customers and those to their regulators. That's one of many standards 
to define "safety". Depending on the level, you may not only need to document 
the requirements of your software but also have to relate each line of code 
to them and ultimately prove the correctness of the compiler output. I believe 
they do this for fighter planes.

We currently target level 3, which means that we will have to prove that we 
have concerned ourselves with all possible hazards and their combinations and 
if that would lead to "safety" relevant things. For the running system, we 
don't have to prove anything. There are "legal recordings" of the input and 
output data, but no complete logs of the operation are required per se, 
although of course, as we support the analysis of trouble reports, that is 
very welcome. That level is acceptable for non-airborne systems.

So if we are running and monitoring a system, and if e.g. certain process must 
run to prevent planes from colliding, we must provide explanations (and 
tests) of why we are at any time sure _if_ (not "that") the process is 
running or how the problem will be noted by the software and ultimately the 
operators through other means, if that fails.

On a general level, redundancy is normally the solution to hazards. Everything 
is doubled, and you normally have 2 trackers, etc. so you use not only one 
software, but multiple implementations. 

When something fails, we generally stop the software unless we are sure it's 
an isolated event. That is because data corruption is worse than no data at 
all. An unseen flight or a falsely reported flight is a lot worse than a 
crashed system that only needs a restart and will be back in operation within 
time limits.

Our concern is not at all to prevent or detect malicious use of the system. If 
somebody wants to run a binary in a hidden way and takes effort to achieve 
that, other processes have failed already. Our systems always run in an 
environment where security is already solved through other means. The systems 
e.g. do allow rsh login, without a password, as root. I hope you can continue 
breathing now. :-)

It certainly will be very helpful to have the audit log and have it searchable, 
and I understand we get that automatically by leaving audit enabled, but configured 
correctly. In the past we have disabled it, because it caused a full disk and 
boot failure on RHEL 3 after only a month or so. I think it complained about 
the UDP echo packets that we use to check our internal LAN operations, but it 
could have been SELinux too.

So, what we will do with audit is to look at source code, identify design 
principles that make errors impossible, write that down, define our 
requirements of it, test these. And ultimately we may have to provide an 
alternative implementation without audit at all, with probably worse latency, 
and the ability to detect and compensate (simulated) audit bugs. And test 
that as well.

And in some figure, we will say: If the tracker process exits, we will detect 
that on average after 1 ms, and at latest after 50ms. And that will play a 
role in the time it takes to switch to the standby tracker process on another 
hardware. And that will e.g. have the requirement to be able to take over in 
1 or 3 seconds, and that will be possible, because the process that hands 
over started early enough and takes itself a limited time.

The ultimate benefits of auditd would be to us:

a) Achieve very low values in latency, being able to observe an "exit" as it 
happens, not just when a periodic timer makes a test. We can only achieve 
these kinds of performance through child process SIGCHLD so far and signals 
are not an ideal interface. Being able to monitor non-childs is very good for 
us, actually what made us interested in audit in the first place.

b) Robust operation. In theory the audit approach should be able to give 
assurances that certain hazards just cannot happen. That's very nice and 
certainly increases the assurances we can give for our process supervision, a 
keystone building block for every system.

c) Our middleware may suddenly offer to monitor processes like ntpd, cron 
jobs, etc. very easily and without system changes. There are systems that 
work with "ps", "grep" and "kill" to achieve this, but as you can imagine, 
that only goes so far.

Security, in the sense of intrusion detection, authorization, etc. doesn't 
play a role to us. So we don't need audit trails for our live supervision, 
we "only want to know" what was running, what is running.

> > 2. We don't want to poll periodically, but rather only wake up (and then
> > with minimal latency) when something interesting happened. We would want
> > to poll a periodic check that forks are still reported, so we would
> > detect a loss of service from audit.
>
> You might write a audispd plugin for this.

Did you mean for the periodic check, or for the whole job, that means our 
supervision process? We certainly would prefer the plugin approach for such a 
test, esp. if there is hope that you accept it into your code. 

The closing of the socket in case of loss of service would be sufficient 
signal to us.

Regarding performance I would like to say, you are likely right in that it's a 
non-issue. It has something of a bike-shed to me though. :-) I think I still 
have http://lwn.net/Articles/290428/ on my mind, where I had the impression 
that kernel markers would only require a few noop instructions as place 
holders for a jumps that would cause audit code to run. I was wondering why 
audit wouldn't use that. Is that historic (didn't exist, nobody made a patch 
for it) or conscious decision (too difficult, not worth it). Just curious 
here and of course the comment could be read as a bit scary, because it 
actually means we will have to benchmark the impact...

Best regards,
Kay Hayen


* Re: Audit for live supervision
  2008-08-15  6:43   ` Kay Hayen
@ 2008-08-15 12:54     ` Steve Grubb
  2008-08-16 11:19       ` Kay Hayen
  0 siblings, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-15 12:54 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit

On Friday 15 August 2008 02:43:49 Kay Hayen wrote:
> More importantly, and somewhat blocking my tests: With the improved rules I
> get this when compiling quite well reproducible:
>
> type=SYSCALL msg=audit(1218773075.500:118620): arch=c000003e syscall=59
> success=yes exit=0 a0=7fff6f78cf90 a1=7fff6f78cf40 a2=7fff6f78f068 a3=0
> items=2 ppid=11412 pid=11421 auid=4294967295 uid=1000 gid=1000 euid=1000
> suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295
> comm="gcc-4.3" exe="/usr/bin/gcc-4.3" key=(null)
>
> [...]
> type=SYSCALL msg=audit(1218773075.496:118624): arch=c000003e syscall=56
> success=yes exit=11421 a0=1200011 a1=0 a2=0 a3=7fc067776770 items=0
> ppid=11407 pid=11412 auid=4294967295 uid=1000 gid=1000 euid=1000 suid=1000
> fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295
> comm="gnatchop" exe="/usr/bin/gnatchop" key=(null)
>
> Please note the _ascending_ sequence number but _descending_ time.

What this indicates is that there was some recursion before the syscall 
triggered an event. The syscall context exists from syscall entry to exit. If 
during the middle a signal is delivered, the syscall is not finished. Instead 
it runs the signal handler associated with the signal. The signal handler 
might make syscalls which are then handled using the existing syscall context 
via linked list. When that occurs, the timestamp is not being updated. Not 
sure that is appropriate or why the original time really mattered. But that 
is what you are observing. My guess is SIGTERM is being delivered during 
another syscall.

> Seems like a bug? Can you have a look at it?

I'll check on why we don't update the time stamp during syscall recursion.

> -a entry,always -F arch=b32 -S clone -S fork -S vfork
> -a entry,always -F arch=b64 -S clone -S fork -S vfork
>
> Plus I still did't fully grasp why that arch filter was necessary in the
> first place. I mean, after all, I was simply expecting that per default no
> filter should give all arches. Is that filter actually a selector? 

The -F arch is a selector for the syscall table. The kernel works off of
numbers, not strings. So, clone doesn't mean anything to the kernel, but 56
has meaning. 56 doesn't mean much to people. So, auditctl does you the favor
of converting text to numbers. It needs to know which table to choose from,
the 32 bit or 64 bit table, as both or one could be valid. It's possible to
compile the kernel to use only the 64 bit table. There is no way to detect
this from user space except by failure...in which case all you know is failure
but not why.

There is also not a direct mapping between x86_64 and i386. There are syscalls
that exist on one arch but not the other. There are syscalls that change
names between arches. The problem is that I could maintain a table of all
these cross references for x86_64 and i386, but I don't have a good idea
about ppc and s390, which are also biarch. Then the table would be a snapshot
in time. A syscall could get added in a later kernel but you won't get the
right results because you were trusting the tool and not suspicious enough to
do your own review.

Then there is a problem of correlation. If I have 1 rule that expands to 2, 
then how can I do a compare of what's in memory vs what rules are on disk? 
IOW, how do I tell that someone typed:

 -a entry,always -F arch=b32 -S clone -S fork -S vfork
 -a entry,always -F arch=b64 -S clone -S fork -S vfork

or just

-a entry,always -S clone -S fork -S vfork

because auditctl would make 2 from 1. This is a really tricky issue and if we 
didn't care about correlation...or about outdated tools we trust too 
much...we could do this.

> Does it have to do with the fact that syscall numbers are arch dependent?

Yes.

ausyscall x86_64 clone
56

ausyscall i386 clone
120
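
On the consuming side, this means a syscall number can only be decoded together
with the arch field of the same record. A small sketch, assuming the arch
tokens c000003e (x86_64) and 40000003 (i386) and only the numbers that appear
in this thread; the table and function names are hypothetical:

```python
# Arch tokens as they appear in the "arch=" field of SYSCALL records.
ARCH_X86_64 = "c000003e"
ARCH_I386 = "40000003"  # assumption: not shown in this thread's logs

# Only numbers seen in this thread. A real table would be generated,
# for example from ausyscall output, rather than typed in by hand.
SYSCALL_NAMES = {
    (ARCH_X86_64, 56): "clone",
    (ARCH_X86_64, 59): "execve",
    (ARCH_I386, 120): "clone",
}

def syscall_name(arch, number):
    """Map (arch, syscall number) to a name, or a readable placeholder."""
    return SYSCALL_NAMES.get((arch, number), "unknown(%s:%d)" % (arch, number))
```

Keying on the pair keeps an i386 number from being silently decoded with the
x86_64 table: syscall_name("40000003", 56) does not come back as "clone".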

> > > Can you confirm that a type=EOE delimits every event (is that even
> > > the correct term to use, audit trace, how is it called).
> >
> > It delimits every multipart event. You can use something like this to
> > determine if you have an event:
> >
> > if ( r->type == AUDIT_EOE || r->type < AUDIT_FIRST_EVENT ||
> >                                 r->type >= AUDIT_FIRST_ANOM_MSG) {
> >   have full event...
> > }
>
> I will have to check if this affects our intended process tracing. The
> parsing is certainly not simplified by it, for a possibly unrelated reason.

We have an audit parsing library. It takes this into account. The one and only
bug that I know of in it is when event records are interlaced. This is a
problem you'll find at some point. Audit events and their records are not
serialized in the kernel. So, you could have:

syscall a
path a
syscall b
user msg c
cwd a
avc b
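
One way to cope with interlaced records in a hand-written consumer is to key
every record on its timestamp:serial pair instead of relying on arrival order.
This is a sketch under that assumption and not how auparse does it; event 678
in the sample data is made up for illustration:

```python
import re
from collections import OrderedDict

# The "<seconds>.<millis>:<serial>" pair identifies the event a record
# belongs to.
EVENT_ID_RE = re.compile(r"msg=audit\((\d+\.\d+:\d+)\)")

def group_by_event(records):
    """Group records by event id, keeping first-seen order of the events."""
    events = OrderedDict()
    for record in records:
        match = EVENT_ID_RE.search(record)
        if match is None:
            continue  # not an audit record, skip it
        events.setdefault(match.group(1), []).append(record)
    return events

# Records of two events arriving interlaced (second event is hypothetical).
records = [
    "type=SYSCALL msg=audit(1218716494.667:677): syscall=87",
    "type=PATH msg=audit(1218716494.667:677): item=0",
    "type=SYSCALL msg=audit(1218716494.668:678): syscall=59",
    'type=CWD msg=audit(1218716494.667:677): cwd="/home/sgrubb"',
    "type=PATH msg=audit(1218716494.668:678): item=0",
]
events = group_by_event(records)
```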

> Without a very stateful message parser, one that e.g. knows how many lines
> are to follow an EXECVE, we don't know when to forward it the part that
> should process it.

time->Thu Aug 14 08:21:34 2008
node=127.0.0.1 type=PATH msg=audit(1218716494.667:677): item=1 
name="/home/sgrubb/.kde/share/config/kmailrc.lock3U3ZZa.tmp" inode=11304982 
dev=08:03 mode=0100644 ouid=4325 ogid=4325 rdev=00:00 
obj=unconfined_u:object_r:user_home_t:s0 

node=127.0.0.1 type=PATH msg=audit(1218716494.667:677): item=0 
name="/home/sgrubb/.kde/share/config/" inode=12550361 dev=08:03 mode=040700 
ouid=4325 ogid=4325 rdev=00:00 obj=unconfined_u:object_r:user_home_t:s0 
node=127.0.0.1 type=CWD msg=audit(1218716494.667:677):  cwd="/home/sgrubb" 

node=127.0.0.1 type=SYSCALL msg=audit(1218716494.667:677): arch=c000003e 
syscall=87 success=yes exit=0 a0=15f06b0 a1=39609389d0 a2=1340ac0 
a3=3960b67a70 items=2 ppid=1 pid=3432 auid=4325 uid=4325 gid=4325 euid=4325 
suid=4325 fsuid=4325 egid=4325 sgid=4325 fsgid=4325 tty=(none) ses=1 
comm="kontact" exe="/usr/bin/kontact" 
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="delete" 

Look at the syscall record. It is always emitted with multi-line records. It
has an items count. Each auxiliary (path in this case) record has an item
number. You can tell when you have everything. Single line entries do not
have an items field. Also note that the records comprising an event come out
of the kernel in backwards order.
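
As a sketch of that rule, the check below compares the items count of the
SYSCALL record against the number of PATH records collected so far, in any
arrival order. It assumes only PATH records count against items, which matches
the example above; the function name is hypothetical:

```python
import re

ITEMS_RE = re.compile(r"\bitems=(\d+)")  # does not match "item=1"
TYPE_RE = re.compile(r"\btype=(\w+)")

def is_complete(records):
    """True once the SYSCALL record and all of its PATH records are present."""
    expected = None
    paths = 0
    for record in records:
        type_match = TYPE_RE.search(record)
        record_type = type_match.group(1) if type_match else None
        if record_type == "SYSCALL":
            items_match = ITEMS_RE.search(record)
            if items_match:
                expected = int(items_match.group(1))
        elif record_type == "PATH":
            paths += 1
    return expected is not None and paths == expected

# The event shown above, shortened, in the order it was printed.
event = [
    "node=127.0.0.1 type=PATH msg=audit(1218716494.667:677): item=1",
    "node=127.0.0.1 type=PATH msg=audit(1218716494.667:677): item=0",
    'node=127.0.0.1 type=CWD msg=audit(1218716494.667:677):  cwd="/home/sgrubb"',
    "node=127.0.0.1 type=SYSCALL msg=audit(1218716494.667:677): syscall=87 items=2",
]
```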

> What we first, once we got a message is the following code:
>         # 1. Some lines are split across multiple lines. The good thing is
> that these never start
>         #    with whitespace and so we can make them back into single
> lines. This makes the next
>         #    part easier.
>
>         lines = []
>
>         for line in message.split( "\n" ):
>             if line.strip() == "":
>                 pass
>             elif line.startswith( " type=" ):
>                 lines.append( line )
>             else:
>                 assert line[0] != ' '
>
>                 lines[-1] = lines[-1] + ' ' + line

Did you know about the audit parsing library?

> This is in hope that indeed continued lines always start with a non-space
> and type lines always start with a space. Would you consider this format
> worthy and possible to change?

Don't like changing formats as that affects test suites.

> I have no idea how much it represents and existing external interface, but
> I can imagine you can't change it (easily). Probably the end of type= must
> be detected by terminating empty line in case of those that can be
> continued. But it would be very ugly to have to know the event types that
> have this so early in the decoding process.

We have a parsing library, auparse, that handles the rules of audit parsing. 
Look for auparse.h for the API.

> > There might be tunables that different distros can used with glibc.
> > strace is your friend...and having both 32/64 bit rules if amd64 is the
> > target platform.
>
> We did that of course. And what was confusing us was that the audit.log did
> actually seem to show the calls. Can that even be?

Yes, as explained above.

> > > Does audit not  (yet?) use other tracing interface like SystemTap, etc.
> > > where people try to have 0 cost for inactive traces.
> >
> > They have a cost. :)  Also, systemtap while good for some things not good
> > for auditing. For one, systemtap recompiles the kernel to make new
> > modules. You may not want that in your environment. It also has not been
> > tested for CAPP/LSPP compilance.
> >
> > > Also on a general basis. Do you recommend using the sub-daemon for the
> > > job or should we rather use libaudit for the task instead? Any insight
> > > is welcome here.
> >
> > It really depends on what your environment allows. Do you need an audit
> > trail? With search tools? And reporting tools? Do you need the system to
> > halt if auditing problems occur? Do you need any certifications?
>
> I see. Luckily we are not into security, but only "safety". I can't find
> anything on Wikipedia about it, so I will try to explain it briefly, please
> forgive my limited understanding of it. :-)

At one point, I worked on Space Shuttle software. I know a little on how they 
think about this.

> It certainly will be very helpful to have the audit log and it searchable
> and I understand we get that automatic by leaving audit enabled, but
> configured correctly. In the past we have disabled it, because it caused a
> full disk and boot failure on RHEL 3 after only a month or so. I think it
> complained about the UDP echo packets that we use to check our internal LAN
> operations, but it could have been SELinux too.

RHEL3's audit system is completely different than RHEL5's.

> > > 2. We don't want to poll periodically, but rather only wake up (and
> > > then with minimal latency) when something interesting happened. We
> > > would want to poll a periodic check that forks are still reported, so
> > > we would detect a loss of service from audit.
> >
> > You might write a audispd plugin for this.
>
> Did you mean for the periodic check,

There is a realtime interface for the audit stream. You can write either a new 
event dispatcher or a plugin to the existing one. Seeing as you are more 
concerned with assurance, I'd just replace the current dispatcher with your 
own. I have a description of this here:

http://people.redhat.com/sgrubb/audit/audit-rt-events.txt

> or for the whole job, that means our supervision process?

The supervision process. Then again, maybe you want to replace the audit 
daemon and handle events your own way. libaudit has all the primitives for 
that. So, I guess that brings up the question of how you are accessing the 
audit event stream. Are you reading straight from netlink or the disk?

> Regarding performance I would like to say, you are likely right in that
> it's a non-issue. It has something of a bike-shed to me though. :-) I think
> I still have http://lwn.net/Articles/290428/ on my mind, where I had the
> impression that kernel markers would only require a few noop instructions
> as place holders for a jumps that would cause audit code to run. 

You can go that way if you want. But I don't know of anyone else that has.

> I was wondering why audit wouldn't use that. Is that historic (didn't exist,
> nobody made a patch for it) or conscious decision (too difficult, not worth
> it). Just curious here and of course the comment could be read as a bit
> scary, because it actually means we will have to benchmark the impact...

systemtap came after audit. They have 2 different purposes. One is
debugging/profiling, the other is regulatory compliance and security. The
systemtap people make no guarantees about what kinds of data are contained in
the stream or about the reliability of delivery. There was some talk about
combining hooks, and in the end it was decided that we should leave them
disconnected as they serve entirely different purposes.

-Steve


* Re: Audit for live supervision
  2008-08-15 12:54     ` Steve Grubb
@ 2008-08-16 11:19       ` Kay Hayen
  2008-08-18 15:10         ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-16 11:19 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit, alex

Hello Steve,

[ time descending, sequence number ascending problem ]

> What this indicates is that there was some recursion before the syscall
> triggered an event. The syscall context exists from sycall entry to exit.
> If during the middle a signal is delivered, the syscall is not finished.
> Instead it runs the signal handler associated with the signal. The signal
> handler might make syscalls which are then handled using the existing
> syscall context via linked list. When that occurs, the timestamp is not
> being updated. Not sure that is appropriate or why the original time really
> mattered. But that is what you are observing. My guess is SIGTERM is being
> delivered during another syscall.

That raised the following question to me: We have "entry" rules defined. When 
we saw that we get exit codes, the conclusion was that "entry" and "exit" are 
not different for every syscall. Can you confirm? 

And assuming that, indeed the fork may complete only after it has completed
its signal handlers. The expectation would be that this is not an issue,
though, because the new process is inactive. Wrong assumption. The new
process seemingly can already EXECVE with its fresh life if the call of FORK
is interrupted after the process exists.

Now that poses an interesting problem. I guess, what we are missing is that 
FORK should actually enter once and return up to _two_ times, and we need to 
handle and see traces of these. That way we would simply see FORK return with 
0 for the child before it does EXECVE, and so we would know who created it 
(ppid) and know that it exists.
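
Until such a second return is traced, a consumer can be written so that it does
not care which record arrives first. The sketch below learns a parent from
either the clone record (exit is the child pid) or from the ppid field of any
later syscall record of the child. It assumes x86_64 numbers (56 for clone) and
uses the pids from the gcc/gnatchop report; the class is hypothetical:

```python
import re

FIELD_RE = re.compile(r"(\w+)=(\S+)")

def fields(record):
    """Split a record into a dict of its key=value fields."""
    return dict(FIELD_RE.findall(record))

class ProcessTable:
    """Tracks child pid -> parent pid, whatever order the records come in."""

    def __init__(self):
        self.parent_of = {}

    def feed(self, record):
        f = fields(record)
        if f.get("type") != "SYSCALL" or f.get("success") != "yes":
            return
        pid = int(f["pid"])
        # Any successful syscall tells us who the caller's parent is.
        self.parent_of.setdefault(pid, int(f["ppid"]))
        if f["syscall"] == "56":  # clone on x86_64: exit is the child pid
            self.parent_of.setdefault(int(f["exit"]), pid)

table = ProcessTable()
# The execve of the child is seen before the clone that created it.
table.feed("type=SYSCALL msg=audit(1218773075.500:118620): arch=c000003e "
           "syscall=59 success=yes exit=0 ppid=11412 pid=11421")
table.feed("type=SYSCALL msg=audit(1218773075.496:118624): arch=c000003e "
           "syscall=56 success=yes exit=11421 ppid=11407 pid=11412")
```

This does not close the gap described above: a child that never makes an
audited syscall is still only known through the clone record of its parent.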

I tried to change our rules to "exit,always" from "entry,always", but it 
didn't make a difference. Can you confirm that only one exit is traced and do 
you think audit can be enhanced to trace these extra exits of syscalls like 
FORK.

With any workaround, a SIGKILL sent to the process making the FORK would
probably leave us without a trace, possibly forever.

As for the signal at hand here, I think we have ant (gcj based Java) doing
multi-threading (parallel building). That would be a series of FORKs with
probably a SIGALRM or whatever they use to switch target execution without
resorting to threading.

> > Seems like a bug? Can you have a look at it?
>
> I'll check on why we don't update the time stamp during syscall recursion.

Thanks a lot. I guess, the expectation could be that "exit" traces have the 
datation of the "exit" and not "entry".

> Then there is a problem of correlation. If I have 1 rule that expands to 2,
> then how can I do a compare of what's in memory vs what rules are on disk?
> IOW, how do I tell that someone typed:
>
>  -a entry,always -F arch=b32 -S clone -S fork -S vfork
>  -a entry,always -F arch=b64 -S clone -S fork -S vfork
>
> or just
>
> -a entry,always -S clone -S fork -S vfork
>
> because auditctl would make 2 from 1. This is a really tricky issue and if
> we didn't care about correlation...or about outdated tools we trust too
> much...we could do this.

That's understood. And a typical danger of user friendly abstraction is that 
it causes confusion. As I said, -F was bound to "filter" in my mind. 
For "arch" it's suddenly a selector. I would find something like this more 
logical:

-a entry,always,any -S clone -S fork -S vfork

and if I really only wanted a certain arch, make me say so:

-a entry,always,b64 -S clone -S fork -S vfork

> ausyscall x86_64 clone
> 56
>
> ausyscall i386 clone
> 120

Very good. We initially defined a hash in Python manually with what we
encountered, but we can use ausyscall to generate it instead. We specifically
have the problem of visiting an s390 site, where it will be handy to have
these already in place. There is no such function in libaudit, is there?

> We have an audit parsing library. It takes this into account. 

I have looked at it, and auparse_init doesn't seem to support reading from the
socket itself, does it? It could be AUSOURCE_FILE. And then there is the
issue that the logs on disk seem to be different from the format we get on
the socket. The node= is not on disk, newlines, empty lines, etc. See below
about that.

In an ideal world, we would like to note that the audit socket is readable,
hand it (or an arbitrarily truncated chunk of data from it) to libaudit, and
ask it for events until it says there are no more. That would leave the
truncated line/event issue to libaudit. Is that part of the code?
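
For illustration, the truncated-line part of that wish can be handled with a
small buffer in front of whatever parser is used. This is a hand-rolled sketch,
not libaudit or auparse functionality, and the class name is hypothetical:

```python
class LineBuffer:
    """Collects arbitrary chunks and hands back only complete lines."""

    def __init__(self):
        self._pending = b""

    def feed(self, chunk):
        """Add a chunk of bytes and return the lines completed by it."""
        data = self._pending + chunk
        # Everything after the last newline is kept for the next chunk.
        *lines, self._pending = data.split(b"\n")
        return [line.decode("utf-8", "replace") for line in lines]

buf = LineBuffer()
first = buf.feed(b"type=EOE msg=au")
second = buf.feed(b"dit(1218880198.814:42205):\ntype=")
```

Each result of socket.recv() would be passed to feed(); here first is empty and
second holds the one completed EOE line, while the trailing "type=" waits for
the next chunk.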

> > Without a very stateful message parser, one that e.g. knows how many
> > lines are to follow an EXECVE, we don't know when to forward it the part
> > that should process it.

[Example deleted]

> Look at the syscall record. It is always emitted with multi-line records.
> It has an items count. Each auxiliary (path in this case) record has an
> item number. You can tell when you have everything. Single line entries do
> not have an items field. Also note that the record comprising an event
> comes out of the kernel in a backwards order.

Ah, we simply ignored the type=PATH etc. elements. But what I mean is that of 
the syscall itself, the arguments seemed to be on new lines:

This is from Python code:

    data = _audit_socket.recv( 32*1024 )
    print data

node=Annuitka type=SYSCALL msg=audit(1218880198.814:42205): arch=c000003e 
syscall=59 success=yes exit=0 a0=16cc168 a1=1464c08 a2=1588008 a3=0 items=2 
ppid=3864 pid=19928 auid=4294967295 uid=1000 gid=1000 euid=1000 suid=1000 
fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295 comm="ls" 
exe="/bin/ls" key=(null)
node=Annuitka type=EXECVE msg=audit(1218880198.814:42205): argc=2 a0="ls"
a1="--color=auto"

node=Annuitka type=CWD msg=audit(1218880198.814:42205):  
cwd="/data/home/anna/comsoft/v7a1-ps2-acs/src/acs"
node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=0 name="/bin/ls" 
inode=1651626 dev=08:12 mode=0100755 ouid=0 ogid=0 rdev=00:00
node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=1 name=(null) 
inode=779612 dev=08:12 mode=0100755 ouid=0 ogid=0 rdev=00:00
node=Annuitka type=EOE msg=audit(1218880198.814:42205):

Note that we get a SYSCALL with 2 items, and then in order the items - from
the socket. But in between we get type=EXECVE, which doesn't have an item
number, and worse, the new line before 'a1="--color=auto"' is real and so is
the empty line after it. I have another example of a "gnash" call from
Konqueror with no less than 29 arguments.

That means, in order to parse the socket, we should check argc, right? I think 
we would prefer very long lines like they are in /var/log/audit instead, 
making these kinds of steps optional.
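
A sketch of the joining step, assuming every record starts with "node=" or
"type=" and that anything else is a continuation of the previous record. An
argument that itself begins with "type=" on a fresh line would defeat this
heuristic, so checking argc remains the stricter test:

```python
def join_continuations(lines):
    """Fold continuation lines and drop empty lines, one string per record."""
    records = []
    for line in lines:
        if not line.strip():
            continue  # the empty line after the EXECVE arguments
        if line.startswith(("node=", "type=")) or not records:
            records.append(line.strip())
        else:
            records[-1] += " " + line.strip()
    return records

# Part of the socket output shown above.
lines = [
    'node=Annuitka type=EXECVE msg=audit(1218880198.814:42205): argc=2 a0="ls"',
    'a1="--color=auto"',
    '',
    'node=Annuitka type=CWD msg=audit(1218880198.814:42205):  ',
    'cwd="/data/home/anna/comsoft/v7a1-ps2-acs/src/acs"',
]
joined = join_continuations(lines)
```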

Actually I don't understand the differences in format. I assume they serve the 
purpose of making things readable?

> Did you know about the audit parsing library?

Our assumption was also that it should be easy enough to parse the text. Well 
you know assumptions. Rarely ever true. :-)

> > This is in hope that indeed continued lines always start with a non-space
> > and type lines always start with a space. Would you consider this format
> > worthy and possible to change?
>
> Don't like changing formats as that affects test suites.

That " type=" start is a self-confusion of ours. Starting with 1.6 the node= 
part was added, and some hack was still in place that removes "node=hostname" 
and leaves the space there. Sorry about that.

> > I have no idea how much it represents and existing external interface,
> > but I can imagine you can't change it (easily). Probably the end of type=
> > must be detected by terminating empty line in case of those that can be
> > continued. But it would be very ugly to have to know the event types that
> > have this so early in the decoding process.
>
> We have a parsing library, auparse, that handles the rules of audit
> parsing. Look for auparse.h for the API.

If you confirm that it can handle parsing from the socket, as suggested
above, we may pursue that path and can ignore the strangeness of the format
once it's handled by the library.

> > > There might be tunables that different distros can used with glibc.
> > > strace is your friend...and having both 32/64 bit rules if amd64 is the
> > > target platform.
> >
> > We did that of course. And what was confusing us was that the audit.log
> > did actually seem to show the calls. Can that even be?
>
> Yes, as explained above.

Sorry, I am still confused. Can you explain why the socket and the audit.log
can have different contents? I was blaming my (usually bad) memory.

> > I see. Luckily we are not into security, but only "safety". I can't find
> > anything on Wikipedia about it, so I will try to explain it briefly,
> > please forgive my limited understanding of it. :-)
>
> At one point, I worked on Space Shuttle software. I know a little on how
> they think about this.

Well, that's perfect. :-)

> > > > 2. We don't want to poll periodically, but rather only wake up (and
> > > > then with minimal latency) when something interesting happened. We
> > > > would want to poll a periodic check that forks are still reported, so
> > > > we would detect a loss of service from audit.
> > >
> > > You might write a audispd plugin for this.
> >
> > Did you mean for the periodic check,
>
> There is a realtime interface for the audit stream. You can write either a
> new event dispatcher or a plugin to the existing one. Seeing as you are
> more concerned with assurance, I'd just replace the current dispatcher with
> your own. I have a description of this here:
>
> http://people.redhat.com/sgrubb/audit/audit-rt-events.txt

I saw that too, but I thought it would be better to build upon the existing 
effort. I think that's a viable alternative and potentially could be more 
robust to us. At least it could be that audisp seems to try and solve 
problems we don't have or want.

Looking at the source, I saw that the node name is something that audisp
indeed adds and that auditd doesn't log EOEs, explaining some of the
differences. I didn't find out why audisp has extra new lines, or if auditd
removed these.

I think we will make a prototype for the RT interface and see what it gives 
us.

> > or for the whole job, that means our supervision process?
>
> The supervision process. Then again, maybe you want to replace the audit
> daemon and handle events your own way. libaudit has all the primitives for
> that. So, I guess that brings up the question of how you are accessing the
> audit event stream. Are you reading straight from netlink or the disk?

From the files is out of the question. We thought of the audit sub daemon as
something that simply allows access to the audit stream live, but that it is
otherwise the same as the file. Like auditd would accept things from the
kernel, write them to a file and hand them to audisp as well, which would then
provide them to others. Isn't that the design?

> > I was wondering why audit wouldn't use that. Is that historic (didn't
> > exist, nobody made a patch for it) or conscious decision (too difficult,
> > not worth it). Just curious here and of course the comment could be read
> > as a bit scary, because it actually means we will have to benchmark the
> > impact...
>
> systemtap came after audit. They have 2 different purposes. One is
> debugging/profiling, the other is regulatory compliance and security. The
> system tap people have no gurantees about what kinds of data is contained
> in the stream or the reliability of delivery. There was some talk about
> combining hooks and in the end it was decided that we should leave them
> disconnected as they serve entirely different purposes.

Ah I see. Thanks for the explanation. :-)

Best regards,
Kay Hayen


* Re: Audit for live supervision
  2008-08-16 11:19       ` Kay Hayen
@ 2008-08-18 15:10         ` Steve Grubb
  2008-08-19  6:45           ` Kay Hayen
  0 siblings, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-18 15:10 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Saturday 16 August 2008 07:19:27 Kay Hayen wrote:
> [ time descending, sequence number ascending problem ]
>
> > What this indicates is that there was some recursion before the syscall
> > triggered an event. The syscall context exists from sycall entry to exit.
> > If during the middle a signal is delivered, the syscall is not finished.
> > Instead it runs the signal handler associated with the signal. The signal
> > handler might make syscalls which are then handled using the existing
> > syscall context via linked list. When that occurs, the timestamp is not
> > being updated. Not sure that is appropriate or why the original time
> > really mattered. But that is what you are observing. My guess is SIGTERM
> > is being delivered during another syscall.
>
> That raised the following question to me: We have "entry" rules defined.
> When we saw that we get exit codes, the conclusion was that "entry" and
> "exit" are not different for every syscall. Can you confirm?

They are different in that some things are not defined at entry while all
things are defined at exit. I believe you can write all audit rules as exit
rules without any noticeable differences. If you have an entry rule that
evaluates to never, then it does speed things up since it no longer needs to
collect aux records. With respect to the time, it's set when audit_log_start
is called, which is always on exit for any rules that involve syscalls (that
is when the exit code is valid).


> I tried to change our rules to "exit,always" from "entry,always", but it
> didn't make a difference. Can you confirm that only one exit is traced and
> do you think audit can be enhanced to trace these extra exits of syscalls
> like FORK.

Yes, I think the kernel could be updated to return twice. This would need to 
be sent upstream and I think 2.6.28 is the next chance.


> > > Seems like a bug? Can you have a look at it?
> >
> > I'll check on why we don't update the time stamp during syscall
> > recursion.
>
> Thanks a lot. I guess, the expectation could be that "exit" traces have the
> datation of the "exit" and not "entry".

See above about timestamp generation.


> > Then there is a problem of correlation. If I have 1 rule that expands to
> > 2, then how can I do a compare of what's in memory vs what rules are on
> > disk? IOW, how do I tell that someone typed:
> >
> >  -a entry,always -F arch=b32 -S clone -S fork -S vfork
> >  -a entry,always -F arch=b64 -S clone -S fork -S vfork
> >
> > or just
> >
> > -a entry,always -S clone -S fork -S vfork
> >
> > because auditctl would make 2 from 1. This is a really tricky issue and
> > if we didn't care about correlation...or about outdated tools we trust
> > too much...we could do this.
>
> That's understood. And a typical danger of user friendly abstraction is
> that it causes confusion. As I said, -F was bound to "filter" in my mind.

-F means field. In this case, the filter does use the arch field to select 
which syscalls become events. But we put it before the syscall so that 
auditctl looks it up in the right table. It might possibly be more correct to 
introduce a selector for -S, but then people won't like giving it twice.


> > ausyscall x86_64 clone
> > 56
> >
> > ausyscall i386 clone
> > 120
>
> Very good. We have initially defined a hash in Python manually with what we
> encounter, but we can rather use that to create them. We specifically have
> the problem of visiting a s390 site, where it will handy to have these
> already in place. There is no such function in libaudit, is there?

For what? 


> > We have an audit parsing library. It takes this into account.
>
> I have looked at it, and auparse_init doesn't seem to support reading from
> the socket itself, does it?

You mean the netlink socket?


> It could be AUSOURCE_FILE. And there there is the issue that the logs on
> disk seem to be different from what format we get on the socket. 

Yes, it is.

> In an ideal world, we would like to note that the audit socket is readable,
> hand it (or an arbitrarily truncated chunk of data) from it to libaudit,
> ask it for events until it says there are not more. That would leave the
> truncated line/event issue to libaudit. Is that part of the code?

libaudit should pull complete events from the kernel unless an execve has an 
excessive number of arguments or large sized arguments.


> > Look at the syscall record. It is always emitted with multi-line records.
> > It has an items count. Each auxiliary (path in this case) record has an
> > item number. You can tell when you have everything. Single line entries
> > do not have an items field. Also note that the record comprising an event
> > comes out of the kernel in a backwards order.
>
> Ah, we simply ignored the type=PATH etc. elements. But what I mean is that
> of the syscall itself, the arguments seemed to be on new lines:
>
> This is from Python code:
>
>     data = _audit_socket.recv( 32*1024 )
>     print data
>
> node=Annuitka type=SYSCALL msg=audit(1218880198.814:42205): arch=c000003e
> syscall=59 success=yes exit=0 a0=16cc168 a1=1464c08 a2=1588008 a3=0 items=2
> ppid=3864 pid=19928 auid=4294967295 uid=1000 gid=1000 euid=1000 suid=1000
> fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295 comm="ls"
> exe="/bin/ls" key=(null)
> node=Annuitka type=EXECVE msg=audit(1218880198.814:42205): argc=2 a0="ls"
> a1="--color=auto"
>
> node=Annuitka type=CWD msg=audit(1218880198.814:42205):
> cwd="/data/home/anna/comsoft/v7a1-ps2-acs/src/acs"
> node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=0
> name="/bin/ls" inode=1651626 dev=08:12 mode=0100755 ouid=0 ogid=0
> rdev=00:00
> node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=1 name=(null)
> inode=779612 dev=08:12 mode=0100755 ouid=0 ogid=0 rdev=00:00
> node=Annuitka type=EOE msg=audit(1218880198.814:42205):
>
> Note that we get a SYSCALL with 2 items, and then in order the items - from
> the socket. But inbetween we get type=EXECVE it doesn't have an item
> number,

I suppose that could be fixed.

> and worse the new line before 'a1=--color-auto' is real and so is 
> the empty line after it. I have another example of a "gnash" call from
> Konqueror with no less than 29 arguments.

That is coming from here, and I think a patch was submitted fixing it.

http://lxr.linux.no/linux+v2.6.26.2/kernel/auditsc.c#L1114



> That means, in order to parse the socket, we should check argc, right? I
> think we would prefer very long lines like they are in /var/log/audit
> instead, making these kinds of steps optional.
>
> Actually I don't understand the differences in format. I assume they serve
> the purpose of making things readable?

Yes. What comes from the kernel is a structure with some text. For logging we
want it all text.


> > > This is in hope that indeed continued lines always start with a
> > > non-space and type lines always start with a space. Would you consider
> > > this format worthy and possible to change?
> >
> > Don't like changing formats as that affects test suites.
>
> That " type=" start is a self-confusion of ours. Starting with 1.6 the
> node= part was added, and some hack was still in place that removes
> "node=hostname" and leaves the space there. Sorry about that.

You can remove or add the node field. It's controlled by the name_format
config item.


> > > I have no idea how much it represents and existing external interface,
> > > but I can imagine you can't change it (easily). Probably the end of
> > > type= must be detected by terminating empty line in case of those that
> > > can be continued. But it would be very ugly to have to know the event
> > > types that have this so early in the decoding process.
> >
> > We have a parsing library, auparse, that handles the rules of audit
> > parsing. Look for auparse.h for the API.
>
> If you confirm that can handles the parsing from the socket, as suggested
> above, we may persue that path and can ignore strangeness of the format
> once its handled by the library.

The audit parsing library wants to read text strings as you would find them on
disk. The kernel keeps type separate as an integer so that decisions can be
made about what the record means without having to do a text to int
conversion. So, the audit daemon does the reformatting after it decides that
it is a record type that we are interested in.
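
As an illustration of deciding on the integer type alone, here is the
end-of-event test quoted earlier in this thread, written out in Python. The
numeric values are my understanding of the constants in the audit headers and
should be checked against the libaudit.h in use:

```python
# Assumed values; verify against libaudit.h / linux/audit.h on the target.
AUDIT_FIRST_EVENT = 1300     # first kernel event type (SYSCALL)
AUDIT_EOE = 1320             # end of a multi-record event
AUDIT_FIRST_ANOM_MSG = 2100  # start of the anomaly message range

def ends_event(record_type):
    """True if a record of this integer type completes an event."""
    return (record_type == AUDIT_EOE
            or record_type < AUDIT_FIRST_EVENT
            or record_type >= AUDIT_FIRST_ANOM_MSG)
```

A SYSCALL or PATH record leaves the event open, while an EOE record or any
single-record message outside the kernel event range closes it.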


> > > > > 2. We don't want to poll periodically, but rather only wake up (and
> > > > > then with minimal latency) when something interesting happened. We
> > > > > would want to poll a periodic check that forks are still reported,
> > > > > so we would detect a loss of service from audit.
> > > >
> > > > You might write a audispd plugin for this.
> > >
> > > Did you mean for the periodic check,
> >
> > There is a realtime interface for the audit stream. You can write either
> > a new event dispatcher or a plugin to the existing one. Seeing as you are
> > more concerned with assurance, I'd just replace the current dispatcher
> > with your own. I have a description of this here:
> >
> > http://people.redhat.com/sgrubb/audit/audit-rt-events.txt
>
> I saw that too, but I thought it would be better to build upon the existing
> effort. I think that's a viable alternative and potentially could be more
> robust to us. At least it could be that audisp seems to try and solve
> problems we don't have or want.

You have 4 points to get the audit stream, in order of distance from the event
generation: the audit netlink socket, the auditd realtime interface, the
audisp plugin interface, and the af_unix socket created by the af_unix plugin
from audispd. For higher reliability, where you don't want or need any other
audit processing interfering, I would say use either of the first 2.


> Looking at the source I saw that node name is something that audisp indeed
> adds the node name and that auditd doesn't log EOEs, explaining some of the
> differences. I didn't find why audisp has extra new lines, or if auditd
> removed these.

Auditd strips these before writing to disk. The realtime interface sends the 
event just as it was received from the kernel.


> > > or for the whole job, that means our supervision process?
> >
> > The supervision process. Then again, maybe you want to replace the audit
> > daemon and handle events your own way. libaudit has all the primitives
> > for that. So, I guess that brings up the question of how you are
> > accessing the audit event stream. Are you reading straight from netlink
> > or the disk?
>
> From the files is out of question. We thought of the audit sub deamon as
> something that simply allows to access the audit stream live, but that it
> is otherwise the same as the file.

The data can be either binary or string. Binary means that it's exactly the 
same format that comes from the kernel, unchanged. String means that it's been 
formatted just like you would see on disk. But it appears that it's not 
stripping the 0x0A out of the text. It probably should.

> Like auditd would accept things from the kernel, write it to a file and hand
> it to audisp as well which would then provide it to others. Isn't that the
> design? 

Yes.

-Steve

* Re: Audit for live supervision
  2008-08-18 15:10         ` Steve Grubb
@ 2008-08-19  6:45           ` Kay Hayen
  2008-08-19 14:14             ` John Dennis
  2008-08-19 14:47             ` Steve Grubb
  0 siblings, 2 replies; 17+ messages in thread
From: Kay Hayen @ 2008-08-19  6:45 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit, alex

Hello Steve,

> > I tried to change our rules to "exit,always" from "entry,always", but it
> > didn't make a difference. Can you confirm that only one exit is traced
> > and do you think audit can be enhanced to trace these extra exits of
> > syscalls like FORK.
>
> Yes, I think the kernel could be updated to return twice. This would need
> to be sent upstream and I think 2.6.28 is the next chance.

The missing FORK return is incidentally the one we care about. We have no 
concern if a process forks or not, we only want to know and see the new 
process. It doesn't matter if the parent ever got to notice it.

Is there any hope such a patch could be part of RHEL 5.3, given that Red Hat 
has its own kernel release process? I am not that much into security, but I 
could imagine that it's possible to carefully craft a process that escapes 
the audit trail by sending SIGKILL to a forker.

All you have to do is fork a process that will fork another, and SIGKILL the 
first one after increasingly longer delays, until its child secretly 
survives.

> > > ausyscall x86_64 clone
> > > 56
> > >
> > > ausyscall i386 clone
> > > 120
> >
> > Very good. We have initially defined a hash in Python manually with what
> > we encounter, but we can rather use that to create them. We specifically
> > have the problem of visiting a s390 site, where it will handy to have
> > these already in place. There is no such function in libaudit, is there?
>
> For what?

Well, for the functionality of ausyscall. If we could query the current arch, 
or rather its b32/b64 arches, then we could build that table at run time, 
couldn't we?

That would be a whole lot nicer than hardcoded values, even if they are 
generated using ausyscall. 

> > > We have an audit parsing library. It takes this into account.
> >
> > I have looked at it, and auparse_init doesn't seem to support reading
> > from the socket itself, does it?
>
> You mean the netlink socket?

No, when opening the socket to the sub daemon audisp. I couldn't convince 
myself how the API would work with a socket. Does it?

> > In an ideal world, we would like to note that the audit socket is
> > readable, hand it (or an arbitrarily truncated chunk of data) from it to
> > libaudit, ask it for events until it says there are not more. That would
> > leave the truncated line/event issue to libaudit. Is that part of the
> > code?
>
> libaudit should pull complete events from the kernel unless an execve has
> an excessive number of arguments or large sized arguments.

I read that as that we can use the netlink socket with the libaudit directly, 
which sort of could be exactly what we want. That would mean we wouldn't use 
audit user space (processes) at all, right?

> > Note that we get a SYSCALL with 2 items, and then in order the items -
> > from the socket. But inbetween we get type=EXECVE it doesn't have an item
> > number,
>
> I suppose that could be fixed.
>
> > and worse the new line before 'a1=--color-auto' is real and so is
> > the empty line after it. I have another example of a "gnash" call from
> > Konqueror with no less than 29 arguments.
>
> That is coming from here, and I think a patch was submitted fixing it.
>
> http://lxr.linux.no/linux+v2.6.26.2/kernel/auditsc.c#L1114

I see. Strange to see line formatting like that in the kernel in the first 
place. But libaudit doesn't care about them anyway I suppose.

> > > > I have no idea how much it represents and existing external
> > > > interface, but I can imagine you can't change it (easily). Probably
> > > > the end of type= must be detected by terminating empty line in case
> > > > of those that can be continued. But it would be very ugly to have to
> > > > know the event types that have this so early in the decoding process.
> > >
> > > We have a parsing library, auparse, that handles the rules of audit
> > > parsing. Look for auparse.h for the API.
> >
> > If you confirm that can handles the parsing from the socket, as suggested
> > above, we may persue that path and can ignore strangeness of the format
> > once its handled by the library.
>
> The audit parsing library wants to read text strings as you would find them
> on disk. The kernel keeps type separate as an integer so that decisions can
> be made about what the record means without having to do a text to int
> conversion. So, the audit daemon does the reformatting after it decides
> that it a record type that we are interested in.

And I read that as the libaudit library being unable to use the netlink socket 
directly.

[ Options for listening]

> You have 4 points to get the audit stream, in order of distance from the
> event generation: the audit netlink socket, auditd realtime interface,
> audisp plugin interface, and the af_unix socket created by the af_unix
> plugin from audispd. For higher reliability where you don't want of need
> any other audit processing interfering, I would say use either of the first
> 2.

The latency gets higher with each step. For optimal performance we would 
listen to the netlink socket and duplicate only the code essential to process 
what we are interested in. 

For extra points and hurt, we would do it in Ada and inside the target 
process, really achieving the low latency. It may be the only realistic 
option, but it also feels like duplication of effort. We have done netlink 
interfaces in Ada before, but we also have in mind that the netlink 
interface was said (not by you) to be still in flux. Is that still true?

It certainly would be nice if the audisp had some form of output that can be 
fed directly into libaudit parsing as it comes in. But that may be an 
unrealistic expectation, is it?

> > > > or for the whole job, that means our supervision process?
> > >
> > > The supervision process. Then again, maybe you want to replace the
> > > audit daemon and handle events your own way. libaudit has all the
> > > primitives for that. So, I guess that brings up the question of how you
> > > are accessing the audit event stream. Are you reading straight from
> > > netlink or the disk?
> >
> > From the files is out of question. We thought of the audit sub deamon as
> > something that simply allows to access the audit stream live, but that it
> > is otherwise the same as the file.
>
> The data can be either binary or string. Binary means that its exactly the
> same format that comes from thekernel unchanged. String means that its been
> formatted just like you would see on disk. But it appears that its not
> stripping the 0X0A out of the text. It probably should.

Well, as it is, the parsing must be done in lines, and you must be stateful to 
know how many of them to expect, because of two bugs only:

a) "Random" newlines for some type=xxx values.
b) No item= for type=EXECVE and similar.

If it were not for that, the parsing of socket output would be much easier. I 
understand both issues will be addressed in future releases, which is a good 
outcome.
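
Until then, issue a) could probably be compensated for on our side. A minimal 
sketch in C, assuming (this is my assumption, not documented behaviour) that 
every real record starts with "type=" at the beginning of a line, so that any 
other line is a continuation and empty lines can be dropped. The name 
normalize_records is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: rewrite `buf` in place so that every record sits on
 * exactly one line. ASSUMPTION: a new record always begins with "type=" at
 * the start of a line; any other non-empty line is treated as a continuation
 * of the previous record, and empty lines are dropped.
 * Records are separated by '\n'; no trailing newline is written.
 * Returns the new string length. */
size_t normalize_records(char *buf)
{
    assert(buf != NULL);
    size_t out = 0;              /* write position, never passes `line` */
    const char *line = buf;      /* start of the next unread line */

    while (*line != '\0') {
        size_t len = strcspn(line, "\n");   /* line length without '\n' */
        const char *next = line + len + (line[len] == '\n');

        if (len > 0) {
            if (out > 0)         /* choose the separator before this line */
                buf[out++] = (strncmp(line, "type=", 5) == 0) ? '\n' : ' ';
            memmove(buf + out, line, len);
            out += len;
        }
        line = next;
    }
    buf[out] = '\0';
    return out;
}
```

The output is never longer than the input, so the rewrite can happen in the 
receive buffer itself. Issue b) is not touched by this at all.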

> > Like auditd would accept things from the kernel, write it to a file and
> > hand it to audisp as well which would then provide it to others. Isn't
> > that the design?
>
> Yes.

The difference between disk and audisp socket was also confusing to us. Some 
corrections (removal of newlines) were applied to only one part.

Overall I would like to thank you for clarifying the supposed workings, 
addressing found bugs, and advising us very well. 

If we choose to use the netlink socket for audit messages, we are of course 
interested in the kernel changes, the missing fork exits more so than the 
EXECVE extra newlines, which should be easy enough to compensate for.

Best regards,
Kay Hayen

* Re: Audit for live supervision
  2008-08-19  6:45           ` Kay Hayen
@ 2008-08-19 14:14             ` John Dennis
  2008-08-19 17:46               ` Kay Hayen
  2008-08-19 14:47             ` Steve Grubb
  1 sibling, 1 reply; 17+ messages in thread
From: John Dennis @ 2008-08-19 14:14 UTC (permalink / raw)
  To: Kay Hayen; +Cc: alex, linux-audit


Kay Hayen wrote:
> No, when opening the socket the to the sub deamon audisp. I couldn't convice 
> myself how the API would work with a socket. Does it?
>   

The auparse library can read a stream: open the parser with 
AUSOURCE_FEED, set a callback, then feed an arbitrary number of bytes 
into the parser by calling auparse_feed(). You'll be called back when a 
complete event is found; at that point just use the normal auparse 
functions.
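
To illustrate the calling pattern only, here is a library-free sketch in C. 
It is not auparse itself, and every name in it (feeder, feeder_feed, 
remember_line) is hypothetical. It accepts arbitrary chunks, buffers the 
incomplete tail, and calls back once per complete line; as far as I know the 
real equivalents are auparse_init() with AUSOURCE_FEED, 
auparse_add_callback() and auparse_feed(), which call back per complete 
event rather than per line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FEED_BUF 8192

typedef void (*line_cb)(const char *line, void *user);

/* Hypothetical feeder state: holds the bytes of a line that is not
 * complete yet, plus the callback to invoke once it is. */
struct feeder {
    char    buf[FEED_BUF];
    size_t  used;
    line_cb cb;
    void   *user;
};

/* Example user data and callback: count lines, remember the last one. */
struct seen { int count; char last[128]; };

void remember_line(const char *line, void *user)
{
    struct seen *s = user;
    s->count++;
    strncpy(s->last, line, sizeof s->last - 1);
    s->last[sizeof s->last - 1] = '\0';
}

void feeder_init(struct feeder *f, line_cb cb, void *user)
{
    assert(f != NULL && cb != NULL);
    f->used = 0;
    f->cb = cb;
    f->user = user;
}

/* Accept an arbitrary chunk; call back once per complete line. A chunk may
 * split a line anywhere. Over-long lines are truncated (sketch only). */
void feeder_feed(struct feeder *f, const char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] == '\n') {
            f->buf[f->used] = '\0';
            f->cb(f->buf, f->user);
            f->used = 0;
        } else if (f->used < FEED_BUF - 1) {
            f->buf[f->used++] = data[i];
        }
    }
}
```

The caller just passes on whatever read() returned, without caring where the 
chunk boundaries fall.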

You can read off of this unix socket (/var/run/audispd_events) but this 
is deprecated. It is now preferred to use an audispd plugin and 
read from stdin. See the audit src package and look in audisp/plugins 
for examples. FWIW I noticed that code was calling fgets to get data to 
feed to auparse_feed(), but it seems inefficient to buffer lines twice; 
auparse_feed will do the line protocol.
> I read that as that we can use the netlink socket with the libaudit directly, 
> which sort of could be exactly what we want. That would mean we wouldn't use 
> audit user space (processes) at all, right?
>
>   
No, you really want to use the user space interface (see above).
>   
>> You have 4 points to get the audit stream, in order of distance from the
>> event generation: the audit netlink socket, auditd realtime interface,
>> audisp plugin interface, and the af_unix socket created by the af_unix
>> plugin from audispd. For higher reliability where you don't want of need
>> any other audit processing interfering, I would say use either of the first
>> 2.
>>     
>
> The latency is getting higher with each step. For optimal performance we would 
> listen to the netlink socket and duplicate only the code essential to process 
> what we are interested it. 
>
> For extra points and hurt, we would do it in Ada and inside the target 
> process, really achieving the low latency. It may be the only realistic 
> option, but it also feels like duplication of effort. We have done netlink 
> interfaces in Ada before, but also have on our mind that it was said that the 
> netlink interface was said (not by you) to be still in flux. Is that still 
> true?
>
> It certainly would be nice if the audisp had some form of output that can be 
> fed directly into libaudit parsing as it comes in. But that may be an 
> unrealistic expectation, is it?
>
>   
It does, see above comment.


-- 
John Dennis <jdennis@redhat.com>


* Re: Audit for live supervision
  2008-08-19  6:45           ` Kay Hayen
  2008-08-19 14:14             ` John Dennis
@ 2008-08-19 14:47             ` Steve Grubb
  2008-08-19 18:23               ` Kay Hayen
  1 sibling, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-19 14:47 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Tuesday 19 August 2008 02:45:00 Kay Hayen wrote:
> Hello Steve,
>
> > > I tried to change our rules to "exit,always" from "entry,always", but
> > > it didn't make a difference. Can you confirm that only one exit is
> > > traced and do you think audit can be enhanced to trace these extra
> > > exits of syscalls like FORK.
> >
> > Yes, I think the kernel could be updated to return twice. This would need
> > to be sent upstream and I think 2.6.28 is the next chance.

<snip>

> Is there any hope such a patch could be part of RHEL 5.3, given that Redhat
> has its own kernel release process? I am not that much into security, but I
> could imagine that it's possible to carefully craft a process that escapes
> the audit trail with SIGKILL to a forker.

Just because a process exists is not a security concern. The process actually 
has to do something related to security - e.g. access resources. At that 
point we will pick it up. I can see that we should probably have 2 records on 
the clone syscall if that is being audited.


> All you got to do is to fork a process that will fork another and with
> increasingly bigger times, you SIGKILL the process until its child will
> secretely survive.

Sure, but as soon as it touches something or makes any syscall, it's 
potentially auditable.


> > > > ausyscall x86_64 clone
> > > > 56
> > > >
> > > > ausyscall i386 clone
> > > > 120
> > >
> > > Very good. We have initially defined a hash in Python manually with
> > > what we encounter, but we can rather use that to create them. We
> > > specifically have the problem of visiting a s390 site, where it will
> > > handy to have these already in place. There is no such function in
> > > libaudit, is there?
> >
> > For what?
>
> Well for the functionality of ausyscall. If we could query the current
> arch, well it's b32/b64 arches, then we could build that table at run time,
> couldn't we?

Sure. If you look at the code for ausyscall, it simply calls 
audit_syscall_to_name() in libaudit. For number to string it's a straight 
lookup; for string to number, we have to brute force search for it. 
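
A small self-contained C sketch of those two directions, for illustration 
only. The tables and the names nr_to_name/name_to_nr are hypothetical and 
hold just the clone numbers quoted earlier in this thread; real code would 
call audit_syscall_to_name() and, as far as I recall, its counterpart 
audit_name_to_syscall() in libaudit instead of carrying its own tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative tables indexed by syscall number. Only the two clone
 * numbers mentioned in this thread are filled in; all other slots are NULL. */
const char *const x86_64_names[57] = { [56]  = "clone" };
const char *const i386_names[121]  = { [120] = "clone" };

/* Number to name: a straight array lookup. */
const char *nr_to_name(const char *const *tab, size_t n, int nr)
{
    assert(tab != NULL);
    if (nr < 0 || (size_t)nr >= n)
        return NULL;
    return tab[nr];
}

/* Name to number: brute-force scan over the whole table. */
int name_to_nr(const char *const *tab, size_t n, const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i] != NULL && strcmp(tab[i], name) == 0)
            return (int)i;
    return -1;
}
```

The asymmetry is only a matter of cost: the number indexes the table 
directly, while a name has to be compared against every entry.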


> That would be a whole lot nicer than hardcoded values, even if they are
> generated using ausyscall.

Sure. Occasionally syscalls get added to the upstream kernel and very rarely 
to a RHEL kernel. So, using libaudit would future-proof the code.


> > > > We have an audit parsing library. It takes this into account.
> > >
> > > I have looked at it, and auparse_init doesn't seem to support reading
> > > from the socket itself, does it?
> >
> > You mean the netlink socket?
>
> No, when opening the socket the to the sub deamon audisp. I couldn't
> convice myself how the API would work with a socket. Does it?

Not directly, because the audit internal API has the type as an integer 
separate from the text of the event. It's really simple to create a string 
that auparse can use and then use the feed interface. A working example of 
the feed interface can be found in audisp/plugins/prelude/audisp-prelude.c.


> > > In an ideal world, we would like to note that the audit socket is
> > > readable, hand it (or an arbitrarily truncated chunk of data) from it
> > > to libaudit, ask it for events until it says there are not more. That
> > > would leave the truncated line/event issue to libaudit. Is that part of
> > > the code?
> >
> > libaudit should pull complete events from the kernel unless an execve has
> > an excessive number of arguments or large sized arguments.
>
> I read that as that we can use the netlink socket with the libaudit
> directly, which sort of could be exactly what we want. That would mean we
> wouldn't use audit user space (processes) at all, right?

True. You would have to load your own rules since that is done by the audit 
user space.


> > > Note that we get a SYSCALL with 2 items, and then in order the items -
> > > from the socket. But inbetween we get type=EXECVE it doesn't have an
> > > item number,
> >
> > I suppose that could be fixed.
> >
> > > and worse the new line before 'a1=--color-auto' is real and so is
> > > the empty line after it. I have another example of a "gnash" call from
> > > Konqueror with no less than 29 arguments.
> >
> > That is coming from here, and I think a patch was submitted fixing it.
> >
> > http://lxr.linux.no/linux+v2.6.26.2/kernel/auditsc.c#L1114
>
> I see. Strange to see line formatting like that in the kernel in the first
> place. But libaudit doesn't care about them anyway I suppose.

No, it doesn't. Things downstream from it might, but it's stripped going to 
disk.


> > > > > I have no idea how much it represents and existing external
> > > > > interface, but I can imagine you can't change it (easily). Probably
> > > > > the end of type= must be detected by terminating empty line in case
> > > > > of those that can be continued. But it would be very ugly to have
> > > > > to know the event types that have this so early in the decoding
> > > > > process.
> > > >
> > > > We have a parsing library, auparse, that handles the rules of audit
> > > > parsing. Look for auparse.h for the API.
> > >
> > > If you confirm that can handles the parsing from the socket, as
> > > suggested above, we may persue that path and can ignore strangeness of
> > > the format once its handled by the library.
> >
> > The audit parsing library wants to read text strings as you would find
> > them on disk. The kernel keeps type separate as an integer so that
> > decisions can be made about what the record means without having to do a
> > text to int conversion. So, the audit daemon does the reformatting after
> > it decides that it a record type that we are interested in.
>
> And I read that as the libaudit library being unable to use the netlink
> socket directly.

No, libaudit is the I/O interface with the kernel. The audit daemon and 
auditctl make extensive use of it when talking to the kernel about audit 
events. wrt auparse (if that's what you meant) you just run the data through:

asprintf(&v, "type=%s msg=%.*s\n", type, e->hdr.size, e->data);

and "v" has the string ready for auparse use. asprintf() allocates memory, so 
watch that it doesn't create a memory leak.
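
A self-contained sketch of that one-liner with the cleanup spelled out. The 
struct below is a hypothetical stand-in that models only the two fields the 
asprintf() line uses (hdr.size and data); it is not the daemon's real event 
type, and event_to_string is a made-up name.

```c
#define _GNU_SOURCE          /* asprintf() is a GNU extension on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the event structure: only hdr.size and data
 * are modelled, because those are all the format string needs. */
struct ev_hdr { unsigned int size; };
struct event  { struct ev_hdr hdr; char data[256]; };

/* Build "type=<name> msg=<payload>\n" for the parser.
 * Returns a malloc'd string that the caller must free(), or NULL. */
char *event_to_string(const char *type, const struct event *e)
{
    char *v = NULL;

    assert(type != NULL && e != NULL);
    /* %.*s takes an int precision, so the payload need not be
     * NUL-terminated as long as hdr.size is correct. */
    if (asprintf(&v, "type=%s msg=%.*s\n",
                 type, (int)e->hdr.size, e->data) < 0)
        return NULL;
    return v;
}
```

Every string returned this way needs a matching free() once it has been 
handed to the parser; forgetting that is the leak mentioned above.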


> [ Options for listening]
>
> > You have 4 points to get the audit stream, in order of distance from the
> > event generation: the audit netlink socket, auditd realtime interface,
> > audisp plugin interface, and the af_unix socket created by the af_unix
> > plugin from audispd. For higher reliability where you don't want of need
> > any other audit processing interfering, I would say use either of the
> > first 2.
>
> The latency is getting higher with each step. For optimal performance we
> would listen to the netlink socket and duplicate only the code essential to
> process what we are interested it.

Sure


> For extra points and hurt, we would do it in Ada and inside the target
> process, really achieving the low latency. It may be the only realistic
> option, but it also feels like duplication of effort. We have done netlink
> interfaces in Ada before, but also have on our mind that it was said that
> the netlink interface was said (not by you) to be still in flux. Is that
> still true?

We are in the process of migrating from the old rules API to the new one. 
Kernels from 2.6.6 to around 2.6.16 had one API (audit_add_rule), which was 
replaced with a new and improved API (audit_add_rule_data) for kernels after 
that. The deprecated functions should be removed from libaudit.h so that there 
is binary compatibility for prebuilt apps and newly built apps won't be able 
to use the old functions.


> It certainly would be nice if the audisp had some form of output that can
> be fed directly into libaudit parsing as it comes in. But that may be an
> unrealistic expectation, is it?

One note: libaudit is an I/O library, libauparse is the library that parses 
audit events. Assuming you meant the latter, they are built for one another. 
The audispd feeds data to siblings. In the configuration file, you just 
specify that the child app wants string data and it takes care of the 
conversion. The prelude plugin is a good example. However, audispd plugins 
are probably too high in latency for you. Converting the kernel's data into a 
string is simple, as the code snippet above shows.

-Steve

* Re: Audit for live supervision
  2008-08-19 14:14             ` John Dennis
@ 2008-08-19 17:46               ` Kay Hayen
  2008-08-19 18:18                 ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-19 17:46 UTC (permalink / raw)
  To: John Dennis; +Cc: alex, linux-audit

Hello John,

On Tuesday, 19 August 2008 16:14:54, John Dennis wrote:
> Kay Hayen wrote:
> > No, when opening the socket the to the sub deamon audisp. I couldn't
> > convice myself how the API would work with a socket. Does it?
>
> The auparse library can read a stream by opening the parser with
> AUSOURCE_FEED, you set a callback, then feed arbitrary number of bytes
> into the parser by calling auparse_feed(), you'll be called back when a
> complete event is found, at that point just use the normal auparse
> functions.
>
> You can read off of this unix socket (/var/run/audispd_events) but this
> is deprecated. It is now preferred is now to use a audispd plugin and
> read from stdin. See the audit src package and look in audisp/plugins
> for examples. FWIW I noticed that code was calling fgets to get data to
> feed to auparse_feed() but it seems inefficient to buffer lines twice,
> auparse_feed will do the line protocol.

That's great. We can use the first approach initially (unix socket), a plugin 
is not so good for us, because our supervision process would need to receive 
from it anyway. 

The next best step would be to use the netlink socket directly. From what 
Steve wrote, that doesn't seem entirely difficult either:

Steve wrote:
> wrt auparse (if that's what you meant) you just run the data through:

> asprintf(&v, "type=%s msg=%.*s\n", type, e->hdr.size, e->data);

> and "v" has the string ready for auparse use. asprinf() allocates memory, so 
> watch that it doesn't create a memory leak.

That seems pretty direct too. If it's that easy to use auparse with either 
audisp socket or netlink socket, all the blame on us for not discovering it 
on our own. :-)

The extra handling of the netlink socket may not be that much code then, 
although I suspect it requires root rights (or simply a capability, I would 
hope).

> > I read that as that we can use the netlink socket with the libaudit
> > directly, which sort of could be exactly what we want. That would mean we
> > wouldn't use audit user space (processes) at all, right?
>
> No, you really want to use the user space interface (see above).

Well, for lowest latency possible (note the "live" in subject), it would be 
ideal to avoid context switches auditd -> audisp -> our supervisor and 
instead simply run an additional netlink socket in addition to auditd (if 
that is allowed). That way we would have a lot less latency, at least in 
theory.

But that could well be a final step only and future work and be judged by 
measurements of what we get from the relatively easy solution. I am going to 
try out that approach of AUSOURCE_FEED and report.

Best regards,
Kay Hayen

* Re: Audit for live supervision
  2008-08-19 17:46               ` Kay Hayen
@ 2008-08-19 18:18                 ` Steve Grubb
  0 siblings, 0 replies; 17+ messages in thread
From: Steve Grubb @ 2008-08-19 18:18 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Tuesday 19 August 2008 13:46:14 Kay Hayen wrote:
> > No, you really want to use the user space interface (see above).
>
> Well, for lowest latency possible (note the "live" in subject), it would be
> ideal to avoid context switches auditd -> audisp -> our supervisor and
> instead simply run an additional netlink socket in addition to auditd (if
> that is allowed). That way we would have a lot less latency, at least in
> theory.

Only 1 netlink socket connection is allowed. The code you want to write for 
low latency would either need to take the place of the audit daemon, meaning 
you need to make your own trail if you need it, or be written as an audispd 
that is run from auditd. There is some sample code in contrib/skeleton.c for 
starting your own audispd.

-Steve

* Re: Audit for live supervision
  2008-08-19 14:47             ` Steve Grubb
@ 2008-08-19 18:23               ` Kay Hayen
  2008-08-19 18:39                 ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-19 18:23 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit, alex

Hello Steve,

> > Well for the functionality of ausyscall. If we could query the current
> > arch, well it's b32/b64 arches, then we could build that table at run
> > time, couldn't we?
>
> Sure. If you look at the code for ausyscall, it simply calls
> audit_syscall_to_name() in libaudit. On number to string its a straight
> lookup, for string to number, we have to brute force search for it.

Perfect, we are going to use that instead then.

> > > > > We have an audit parsing library. It takes this into account.
> > > >
> > > > I have looked at it, and auparse_init doesn't seem to support reading
> > > > from the socket itself, does it?
> > >
> > > You mean the netlink socket?
> >
> > No, when opening the socket the to the sub deamon audisp. I couldn't
> > convice myself how the API would work with a socket. Does it?
>
> Not directly because the audit internal API has the type as an integer
> separate from the text of the event. Its really simple to create a string
> that auparse can use and then use the feed interface. A working example of
> the feed interface can be found in audisp/plugins/prelude/audisp-prelude.c.

Nice. As I wrote in my other email a short while ago, this could well be 
the way forward for now, and later we can switch to the pure netlink socket.

> > > libaudit should pull complete events from the kernel unless an execve
> > > has an excessive number of arguments or large sized arguments.
> >
> > I read that as that we can use the netlink socket with the libaudit
> > directly, which sort of could be exactly what we want. That would mean we
> > wouldn't use audit user space (processes) at all, right?
>
> True. You would have to load your own rules since that is done by the audit
> user space.

Can you confirm that two processes opening netlink sockets for audit 
information get the same messages? I am under the impression that the kernel 
doesn't maintain per socket configuration, does it?

If that were the case, we would simply co-exist with auditd and let it do its 
logging, etc. and benefit from it, and its ability to load the rules (which 

> events. wrt auparse (if that's what you meant) you just run the data
> through:
>
> asprintf(&v, "type=%s msg=%.*s\n", type, e->hdr.size, e->data);
>
> and "v" has the string ready for auparse use. asprinf() allocates memory,
> so watch that it doesn't create a memory leak.

That's very sweet. Where would you expect the pitfalls? I mean, it can't be so 
easy. :-)

> The audispd feeds data to siblings. In the configuration file, you 
> just specify that the child app wants string data and it takes care of  the
> conversion. The prelude plugin is a good example. However, audispd plugins
> are probably too high of latency for you. Converting the kernel's data into
> a string is simple as code snippet above shows.

Given that all we would have to do is to open the socket and listen, feed 
what's received on it to asprintf and get our callbacks called for events, it 
sounds very simple indeed.

We will have a look at this too. In terms of code complexity there would 
barely be a difference, so taking the full jump immediately might be an option 
as well. I will report on that too.

Best regards,
Kay Hayen

* Re: Audit for live supervision
  2008-08-19 18:23               ` Kay Hayen
@ 2008-08-19 18:39                 ` Steve Grubb
  2008-08-19 20:33                   ` Kay Hayen
  0 siblings, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-19 18:39 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Tuesday 19 August 2008 14:23:21 Kay Hayen wrote:
> > > > libaudit should pull complete events from the kernel unless an execve
> > > > has an excessive number of arguments or large sized arguments.
> > >
> > > I read that as that we can use the netlink socket with the libaudit
> > > directly, which sort of could be exactly what we want. That would mean
> > > we wouldn't use audit user space (processes) at all, right?
> >
> > True. You would have to load your own rules since that is done by the
> > audit user space.
>
> Can you confirm that two processes opening netlink sockets for audit
> information get the same messages?

Only one audit pid is allowed for security purposes.


> I am under the impression that the kernel doesn't maintain per socket
> configuration, does it?

Nope, it only allows one.


> If that were the case, we would simply co-exist with auditd and let it do
> its logging, etc. and benefit from it, and its ability to load the rules

If you want to co-exist with auditd, then you want to write your own audispd. 
I pointed you to the skeleton.c code in the other email.


> > events. wrt auparse (if that's what you meant) you just run the data
> > through:
> >
> > asprintf(&v, "type=%s msg=%.*s\n", type, e->hdr.size, e->data);
> >
> > and "v" has the string ready for auparse use. asprinf() allocates memory,
> > so watch that it doesn't create a memory leak.
>
> That's very sweet. Where would you expect the pitfalls? I mean, it can't be
> so easy. :-)

No pitfalls except watching for memory leaks. Audispd used the same code.

-Steve

* Re: Audit for live supervision
  2008-08-19 18:39                 ` Steve Grubb
@ 2008-08-19 20:33                   ` Kay Hayen
  2008-08-19 20:47                     ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-19 20:33 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit, alex

Hello Steve,

> > Can you confirm that two processes opening netlink sockets for audit
> > information get the same messages?
>
> Only one audit pid is allowed for security purposes.

Damn security. I saw that patch while googling, and hoped it wasn't merged, 
but seems it was. 

I don't really understand why it is helping security, if I need to kill auditd 
before I can open the netlink socket. For both I need root rights. 

There isn't any SELinux in the play, is there? 

Because if that were the case, the policy could e.g. allow only the auditd 
binary to open the netlink socket. That would be effective, and it would be 
configuration that we could then change.

But it is probably pointless to waste your time on this, given how little I 
understand security. I just can't resist: it feels like a bike-shed and a 
really annoying limitation for our system, which has no interest in security. :-)

Best regards,
Kay Hayen

* Re: Audit for live supervision
  2008-08-19 20:33                   ` Kay Hayen
@ 2008-08-19 20:47                     ` Steve Grubb
  2008-08-19 21:35                       ` Kay Hayen
  0 siblings, 1 reply; 17+ messages in thread
From: Steve Grubb @ 2008-08-19 20:47 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Tuesday 19 August 2008 16:33:58 Kay Hayen wrote:
> > Only one audit pid is allowed for security purposes.
>
> Damn security. I saw that patch while googling, and hoped it wasn't merged,
> but seems it was.

It's been there since the 2.6.6 kernel. IOW, day 1.

> I don't really understand why it is helping security, if I need to kill
> auditd before I can open the netlink socket. For both I need root rights.

The queueing is complicated, and if you have a group of processes it gets 
really messy. The audit queue tries hard to guarantee delivery, or takes the 
system down if the flow is not working right. It's not like syslog or iptables 
logging.

> There isn't any SELinux in play, is there?

SELinux helps for MLS systems, but for CAPP it doesn't come into play. The 
data flowing through the audit system could be very sensitive. Anyone needing 
access to the stream either needs to replace auditd, write their own 
dispatcher, or write a plugin to the shipped dispatcher. This way the admin 
knows exactly what processes have access to the data.

-Steve

* Re: Audit for live supervision
  2008-08-19 20:47                     ` Steve Grubb
@ 2008-08-19 21:35                       ` Kay Hayen
  2008-08-19 21:47                         ` Steve Grubb
  0 siblings, 1 reply; 17+ messages in thread
From: Kay Hayen @ 2008-08-19 21:35 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit, alex

Hello Steve,

you wrote:

> > I don't really understand why it is helping security, if I need to kill
> > auditd before I can open the netlink socket. For both I need root rights.
>
> The queueing is complicated and if you have a group of processes it gets
> real messy. The audit queue tries hard for guaranteed delivery or take the
> system down if the flow is not working right. Its not like syslog or
> iptables logging.

Ah I see! So I misread "security" to mean "prevent access" where it's 
actually "security" as in "not possibly corrupted data", and that's very 
welcome. Sorry about the confusion.

BTW: I looked at the auditctl source and did some tests, and it seems the rules can 
be set by using auditctl even without auditd running. So that means we don't 
have to do that ourselves.

Best regards,
Kay Hayen

* Re: Audit for live supervision
  2008-08-19 21:35                       ` Kay Hayen
@ 2008-08-19 21:47                         ` Steve Grubb
  0 siblings, 0 replies; 17+ messages in thread
From: Steve Grubb @ 2008-08-19 21:47 UTC (permalink / raw)
  To: Kay Hayen; +Cc: linux-audit, alex

On Tuesday 19 August 2008 17:35:14 Kay Hayen wrote:
> BTW: I looked at auditctl source and did some tests, and it seems the rules
> can be set by using auditctl even without auditd running. So that means we
> don't have to do that ourselves.

Sort of. The initscripts of auditd load the rules using 
auditctl -R /etc/audit/audit.rules. So, you'd want to do that in your 
initscript if you decide to replace auditd.
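
If the replacement daemon is a C program rather than a shell initscript, it could run the same command itself at startup. A minimal sketch, assuming a hypothetical helper `load_audit_rules()` (not part of the audit package) that forks, executes the loader with -R and the rules file, and returns its exit status:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "<loader> -R <rules_path>" and wait for it to finish.
 * Returns the loader's exit status (127 if it could not be executed),
 * or -1 if fork()/waitpid() failed or the child was killed by a signal.
 * load_audit_rules() is a made-up helper name used for illustration. */
int load_audit_rules(const char *loader, const char *rules_path)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace ourselves with the loader, searched in PATH. */
        execlp(loader, loader, "-R", rules_path, (char *)NULL);
        _exit(127);            /* only reached if the exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

It would be called as `load_audit_rules("auditctl", "/etc/audit/audit.rules")` before the daemon starts reading events, and it needs the same root rights as running auditctl by hand.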

-Steve

Thread overview: 17+ messages
2008-08-14  7:14 Audit for live supervision Kay Hayen
2008-08-14 14:04 ` Steve Grubb
2008-08-15  6:43   ` Kay Hayen
2008-08-15 12:54     ` Steve Grubb
2008-08-16 11:19       ` Kay Hayen
2008-08-18 15:10         ` Steve Grubb
2008-08-19  6:45           ` Kay Hayen
2008-08-19 14:14             ` John Dennis
2008-08-19 17:46               ` Kay Hayen
2008-08-19 18:18                 ` Steve Grubb
2008-08-19 14:47             ` Steve Grubb
2008-08-19 18:23               ` Kay Hayen
2008-08-19 18:39                 ` Steve Grubb
2008-08-19 20:33                   ` Kay Hayen
2008-08-19 20:47                     ` Steve Grubb
2008-08-19 21:35                       ` Kay Hayen
2008-08-19 21:47                         ` Steve Grubb
