From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kay Hayen
Subject: Re: Audit for live supervision
Date: Sat, 16 Aug 2008 13:19:27 +0200
Message-ID: <200808161319.28048.kayhayen@gmx.de>
References: <200808140914.07779.kayhayen@gmx.de> <200808150843.49680.kayhayen@gmx.de> <200808150854.35585.sgrubb@redhat.com>
In-Reply-To: <200808150854.35585.sgrubb@redhat.com>
To: Steve Grubb
Cc: linux-audit@redhat.com, alex@segv.de
List-Id: linux-audit@redhat.com

Hello Steve,

[ time descending, sequence number ascending problem ]

> What this indicates is that there was some recursion before the syscall
> triggered an event. The syscall context exists from syscall entry to exit.
> If during the middle a signal is delivered, the syscall is not finished.
> Instead it runs the signal handler associated with the signal. The signal
> handler might make syscalls which are then handled using the existing
> syscall context via linked list. When that occurs, the timestamp is not
> being updated. Not sure that is appropriate or why the original time really
> mattered. But that is what you are observing. My guess is SIGTERM is being
> delivered during another syscall.

That raised the following question for me: We have "entry" rules defined.
When we saw that we get exit codes, the conclusion was that "entry" and
"exit" are not different for every syscall.
Can you confirm?

And assuming that, indeed the fork may complete only after it has completed
its signal handlers. The expectation would be that this is not an issue
though, because the new process is inactive. Wrong assumption. The new
process seemingly can already EXECVE with its fresh life, if the call of
FORK is interrupted after the process exists.

Now that poses an interesting problem. I guess what we are missing is that
FORK should actually enter once and return up to _two_ times, and we need to
handle and see traces of these. That way we would simply see FORK return
with 0 for the child before it does EXECVE, and so we would know who created
it (ppid) and know that it exists.

I tried to change our rules to "exit,always" from "entry,always", but it
didn't make a difference. Can you confirm that only one exit is traced, and
do you think audit can be enhanced to trace these extra exits of syscalls
like FORK?

For every workaround, a SIGKILL signal towards the FORK making process would
probably leave us even without a trace, forever possibly.

As for the signal at hand here, I think we have ant (gcj based Java) doing
multi-threading (parallel building). That would be a series of FORKs with
probably a SIGALRM or whatever they use to switch target execution without
resorting to threading.

> > Seems like a bug? Can you have a look at it?
>
> I'll check on why we don't update the time stamp during syscall recursion.

Thanks a lot. I guess the expectation could be that "exit" traces carry the
timestamp of the "exit" and not of the "entry".

> Then there is a problem of correlation. If I have 1 rule that expands to 2,
> then how can I do a compare of what's in memory vs what rules are on disk?
> IOW, how do I tell that someone typed:
>
> -a entry,always -F arch=b32 -S clone -S fork -S vfork
> -a entry,always -F arch=b64 -S clone -S fork -S vfork
>
> or just
>
> -a entry,always -S clone -S fork -S vfork
>
> because auditctl would make 2 from 1. This is a really tricky issue and if
> we didn't care about correlation...or about outdated tools we trust too
> much...we could do this.

That's understood. And a typical danger of user friendly abstraction is that
it causes confusion. As I said, -F was bound to "filter" in my mind. For
"arch" it's suddenly a selector. I would find something like this more
logical:

-a entry,always,any -S clone -S fork -S vfork

and if I really only wanted a certain arch, make me say so:

-a entry,always,b64 -S clone -S fork -S vfork

> ausyscall x86_64 clone
> 56
>
> ausyscall i386 clone
> 120

Very good. We initially defined a hash in Python manually with what we
encountered, but we can rather use that to create it. We specifically have
the problem of visiting an s390 site, where it will be handy to have these
already in place. There is no such function in libaudit, is there?

> We have an audit parsing library. It takes this into account.

I have looked at it, and auparse_init doesn't seem to support reading from
the socket itself, does it? It could be AUSOURCE_FILE. And then there is the
issue that the logs on disk seem to be in a different format from what we
get on the socket. The node= is not on disk, there are newlines, empty
lines, etc.; see below about that.

In an ideal world, we would like to note that the audit socket is readable,
hand it (or an arbitrarily truncated chunk of data from it) to libaudit, and
ask it for events until it says there are no more. That would leave the
truncated line/event issue to libaudit. Is that part of the code?

> > Without a very stateful message parser, one that e.g.
> > knows how many lines are to follow an EXECVE, we don't know when to
> > forward it to the part that should process it. [Example deleted]

> Look at the syscall record. It is always emitted with multi-line records.
> It has an items count. Each auxiliary (path in this case) record has an
> item number. You can tell when you have everything. Single line entries do
> not have an items field. Also note that the record comprising an event
> comes out of the kernel in a backwards order.

Ah, we simply ignored the type=PATH etc. elements. But what I mean is that
for the syscall itself, the arguments seemed to be on new lines:

This is from Python code:

data = _audit_socket.recv( 32*1024 )
print data

node=Annuitka type=SYSCALL msg=audit(1218880198.814:42205): arch=c000003e syscall=59 success=yes exit=0 a0=16cc168 a1=1464c08 a2=1588008 a3=0 items=2 ppid=3864 pid=19928 auid=4294967295 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts3 ses=4294967295 comm="ls" exe="/bin/ls" key=(null)
node=Annuitka type=EXECVE msg=audit(1218880198.814:42205): argc=2 a0="ls"
a1="--color=auto"

node=Annuitka type=CWD msg=audit(1218880198.814:42205):  cwd="/data/home/anna/comsoft/v7a1-ps2-acs/src/acs"
node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=0 name="/bin/ls" inode=1651626 dev=08:12 mode=0100755 ouid=0 ogid=0 rdev=00:00
node=Annuitka type=PATH msg=audit(1218880198.814:42205): item=1 name=(null) inode=779612 dev=08:12 mode=0100755 ouid=0 ogid=0 rdev=00:00
node=Annuitka type=EOE msg=audit(1218880198.814:42205):

Note that we get a SYSCALL with 2 items, and then the items in order, from
the socket. But in between we get type=EXECVE; it doesn't have an item
number, and worse, the newline before 'a1="--color=auto"' is real and so is
the empty line after it.
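To make the parsing question concrete, here is a rough sketch of the kind of
stateful assembler we have in mind. It is only an illustration under
assumptions taken from the sample above: every record carries
msg=audit(timestamp:serial), a line without that header continues the
previous record, and a type=EOE record closes the event. The class and
method names (EventAssembler, feed) are made up for this mail and are not
part of libaudit or auparse.

```python
import re

# Matches the record header seen in the sample above, e.g.
# "node=Annuitka type=EXECVE msg=audit(1218880198.814:42205): argc=2 ..."
# The "node=" prefix is optional because it is absent in the on-disk log.
RECORD_RE = re.compile(
    r"^(?:node=\S+ )?type=(?P<type>\S+) "
    r"msg=audit\((?P<event_id>\d+\.\d+:\d+)\):\s*(?P<body>.*)$"
)


class EventAssembler:
    """Hypothetical helper: turns raw socket chunks into complete events."""

    def __init__(self):
        self._buffer = ""    # text received but not yet split into lines
        self._pending = {}   # event id -> list of [record type, body]
        self._last = None    # most recent record, target for continuations

    def feed(self, chunk):
        """Accept an arbitrarily truncated chunk; return finished events."""
        self._buffer += chunk
        # Keep the trailing partial line (possibly empty) for the next call.
        *lines, self._buffer = self._buffer.split("\n")
        finished = []
        for line in lines:
            line = line.strip()
            if not line:
                continue  # the empty lines seen on the socket carry no data
            match = RECORD_RE.match(line)
            if match is None:
                # No header: assume it continues the previous record,
                # like the a1=... line of the EXECVE record above.
                if self._last is not None:
                    self._last[1] = (self._last[1] + " " + line).strip()
                continue
            event_id = match.group("event_id")
            if match.group("type") == "EOE":
                # End of event: hand out everything collected for this id.
                finished.append((event_id, self._pending.pop(event_id, [])))
                self._last = None
                continue
            self._last = [match.group("type"), match.group("body")]
            self._pending.setdefault(event_id, []).append(self._last)
        return finished
```

The caller would hand over whatever recv() returned (decoded to text first
on Python 3) and process the list of (event id, records) pairs that comes
back. Relying on EOE means this only works on the socket stream, since the
on-disk log does not seem to contain EOE records.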
I have another example of a "gnash" call from Konqueror with no less than 29
arguments.

That means, in order to parse the socket, we should check argc, right? I
think we would prefer very long lines like they are in /var/log/audit
instead, making these kinds of steps optional.

Actually I don't understand the differences in format. I assume they serve
the purpose of making things readable?

> Did you know about the audit parsing library?

Our assumption was also that it should be easy enough to parse the text.
Well, you know assumptions. Rarely ever true. :-)

> > This is in hope that indeed continued lines always start with a non-space
> > and type lines always start with a space. Would you consider this format
> > worthy and possible to change?
>
> Don't like changing formats as that affects test suites.

That " type=" start is a self-confusion of ours. Starting with 1.6 the node=
part was added, and some hack was still in place that removes
"node=hostname" and leaves the space there. Sorry about that.

> > I have no idea how much it represents an existing external interface,
> > but I can imagine you can't change it (easily). Probably the end of type=
> > must be detected by a terminating empty line in case of those that can be
> > continued. But it would be very ugly to have to know the event types that
> > have this so early in the decoding process.
>
> We have a parsing library, auparse, that handles the rules of audit
> parsing. Look for auparse.h for the API.

If you confirm that it can handle the parsing from the socket, as suggested
above, we may pursue that path and can ignore the strangeness of the format
once it's handled by the library.

> > > There might be tunables that different distros can use with glibc.
> > > strace is your friend...and having both 32/64 bit rules if amd64 is the
> > > target platform.
> >
> > We did that of course.
> > And what was confusing us was that the audit.log did actually seem to
> > show the calls. Can that even be?
>
> Yes, as explained above.

Sorry, I am still confused. Can you explain why the socket and the audit.log
can have different contents? I was blaming my (usually bad) memory.

> > I see. Luckily we are not into security, but only "safety". I can't find
> > anything on Wikipedia about it, so I will try to explain it briefly,
> > please forgive my limited understanding of it. :-)
>
> At one point, I worked on Space Shuttle software. I know a little on how
> they think about this.

Well, that's perfect. :-)

> > > > 2. We don't want to poll periodically, but rather only wake up (and
> > > > then with minimal latency) when something interesting happened. We
> > > > would want to poll a periodic check that forks are still reported, so
> > > > we would detect a loss of service from audit.
> > >
> > > You might write a audispd plugin for this.
> >
> > Did you mean for the periodic check,
>
> There is a realtime interface for the audit stream. You can write either a
> new event dispatcher or a plugin to the existing one. Seeing as you are
> more concerned with assurance, I'd just replace the current dispatcher with
> your own. I have a description of this here:
>
> http://people.redhat.com/sgrubb/audit/audit-rt-events.txt

I saw that too, but I thought it would be better to build upon the existing
effort. I think that's a viable alternative and potentially could be more
robust for us. At least it could be that audisp tries to solve problems we
don't have or want.

Looking at the source I saw that the node name is something that audisp
indeed adds, and that auditd doesn't log EOEs, explaining some of the
differences. I didn't find why audisp has extra new lines, or if auditd
removed these.

I think we will make a prototype for the RT interface and see what it gives
us.
> > or for the whole job, that means our supervision process?
>
> The supervision process. Then again, maybe you want to replace the audit
> daemon and handle events your own way. libaudit has all the primitives for
> that. So, I guess that brings up the question of how you are accessing the
> audit event stream. Are you reading straight from netlink or the disk?

Reading from the files is out of the question. We thought of the audit sub
daemon as something that simply allows accessing the audit stream live, but
that is otherwise the same as the file. Like auditd would accept things from
the kernel, write them to a file and hand them to audisp as well, which
would then provide them to others. Isn't that the design?

> > I was wondering why audit wouldn't use that. Is that historic (didn't
> > exist, nobody made a patch for it) or conscious decision (too difficult,
> > not worth it). Just curious here and of course the comment could be read
> > as a bit scary, because it actually means we will have to benchmark the
> > impact...
>
> systemtap came after audit. They have 2 different purposes. One is
> debugging/profiling, the other is regulatory compliance and security. The
> system tap people have no guarantees about what kinds of data is contained
> in the stream or the reliability of delivery. There was some talk about
> combining hooks and in the end it was decided that we should leave them
> disconnected as they serve entirely different purposes.

Ah I see. Thanks for the explanation. :-)

Best regards,
Kay Hayen