audit-viewer performance

public inbox for linux-audit@redhat.com

* audit-viewer performance
From: LC Bruzenak @ 2009-12-19  1:42 UTC (permalink / raw)
  To: Linux Audit

I don't know how much the audit-viewer tool is used by folks with
substantial amounts of data, but my experience is that it is nearly
unusable for our system. I appreciate that it does a lot really well,
but it takes minutes to load our data on startup. Adding more filter
tabs seems to hurt performance as well.

The only purpose of this particular machine is to collect/process
audit/prelude data.
Only prelude and audit-related processes are being run on this one.
It is an HP DL380 with 2 quad-core processors, 12GB RAM and an
internal RAID, running Fedora 10 (F10).

As it is now, I move the audit data out of the /var/log/audit
directory daily; otherwise there is no chance that audit-viewer will
complete its load. Sometimes it never recovers and we have to kill and
restart it. The directory we are currently loading holds around 1.5GB
of data, and it takes about 5 minutes (give or take) for the viewer to
load it and become usable.

Once the data is loaded, a large filter operation takes about a minute
to yield results.
While the data is loading there is no feedback, so after a minute or
two we start to wonder whether it will return with any data at all.

What is the plan for this tool? As I said, I think it is very nice
feature-wise in general but in practice it isn't living up to
expectations.
I can try to help but will take a while to get python-proficient. Or
is the trouble in the parse library?

I do not have scientific data yet, but recently I loaded one 100MB
audit file from the store. It took around 3 minutes to load. Then I
changed the source and that one took longer. When it had finally
loaded, the process size was over 2GB.

I can run some better tests and try to get some data if it is helpful.
Are there ways I can try to exercise the parse library outside the GUI
on these same files which might help me know what to look for? Or any
other ideas I can try?

Thanks,
LCB.
-- 
LC (Lenny) Bruzenak


* Re: audit-viewer performance
From: Steve Grubb @ 2009-12-19 13:34 UTC (permalink / raw)
  To: linux-audit

On Friday 18 December 2009 08:42:51 pm LC Bruzenak wrote:
> What is the plan for this tool? As I said, I think it is very nice
> feature-wise in general but in practice it isn't living up to
> expectations.
> I can try to help but will take a while to get python-proficient. Or
> is the trouble in the parse library?

The audit parsing library has not been optimized for handling large data sets.
I don't think it's the entire problem you are seeing, but I'm sure it's a
contributor to the problem. I was planning to look at performance issues in a
future release.

The immediate plans are:

1) Get store and forward working for remote logging
2) Get intermingled records fixed in auparse
3) Get sighup corrected for network options
4) Look at performance issues in auparse

The biggest obstacle is #2 above. What makes fixing this in auparse so much fun 
is that there is a state machine right in the middle of auparse (due to the 
feed input option) that was not in aureport or ausearch, so I couldn't fix it 
at the same time.

> I do not have scientific data yet, but recently I loaded one 100MB
> audit file from the store. It took around 3 minutes to load. Then I
> changed the source and that one took longer. When it had finally
> loaded, the process size was over 2GB.

Sure. The audit viewer could be changed to hold only the records that might be 
displayed and not all of them. It would then need to track what's displayed 
and start a background thread to gather more info for display as you scroll 
around.
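
The approach described above could be sketched roughly as follows. This
is only an illustration, not audit-viewer's actual design: the
RecordWindow class, its parameters, and the fetch(start, count) callback
that returns a slice of records are all hypothetical names introduced
here. Only a few pages of records stay in memory, and the next page is
fetched on a background thread while the current one is displayed.

```python
import threading


class RecordWindow:
    """Keep a few pages of records in memory and prefetch the next page."""

    def __init__(self, fetch, total, page_size=100, max_pages=3):
        self._fetch = fetch            # hypothetical: fetch(start, count) -> list
        self._total = total            # total number of records available
        self._page_size = page_size
        self._max_pages = max_pages    # upper bound on pages held in memory
        self._pages = {}               # page index -> list of records
        self._order = []               # page indexes, least recently used first
        self._lock = threading.Lock()

    def _load(self, page):
        with self._lock:
            if page in self._pages:
                # Already cached: mark as most recently used.
                self._order.remove(page)
                self._order.append(page)
                return self._pages[page]
        start = page * self._page_size
        rows = self._fetch(start, min(self._page_size, self._total - start))
        with self._lock:
            if page not in self._pages:
                self._pages[page] = rows
                self._order.append(page)
                # Evict the least recently used pages beyond the limit.
                while len(self._order) > self._max_pages:
                    del self._pages[self._order.pop(0)]
        return rows

    def get(self, index):
        """Return record `index`, loading its page if necessary."""
        page, offset = divmod(index, self._page_size)
        rows = self._load(page)
        nxt = page + 1
        with self._lock:
            need_prefetch = (nxt * self._page_size < self._total
                             and nxt not in self._pages)
        if need_prefetch:
            # Gather the next page in the background so scrolling stays smooth.
            threading.Thread(target=self._load, args=(nxt,), daemon=True).start()
        return rows[offset]

    def cached_pages(self):
        with self._lock:
            return sorted(self._pages)
```

With max_pages=3 and page_size=100, at most 300 records are held at any
time regardless of how large the log is; the trade-off is that jumping
far away in the list costs one synchronous fetch.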

> I can run some better tests and try to get some data if it is helpful.
> Are there ways I can try to exercise the parse library outside the GUI
> on these same files which might help me know what to look for?

The parse library has some test programs in 
https://fedorahosted.org/audit/browser/trunk/auparse/test
that could be adapted for performance testing. But I don't want to change the 
code in auparse until after we make it handle interlaced records correctly.

But you could test the native C library against the python version to see if 
python itself is adding delay.
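
A minimal sketch of how the Python side of such a comparison might be
timed. It assumes the auparse Python bindings are installed and that
AuParser(auparse.AUSOURCE_FILE, path) and parse_next_event() behave as
in the bindings' own test script; the helper names count_events_auparse,
count_lines and timed are made up for this example. The plain line count
is a baseline that separates file I/O cost from parsing cost.

```python
import time


def count_events_auparse(path):
    """Count events via the auparse Python bindings (assumed installed)."""
    import auparse  # imported lazily so the baseline works without it
    parser = auparse.AuParser(auparse.AUSOURCE_FILE, path)
    events = 0
    while parser.parse_next_event():
        events += 1
    return events


def count_lines(path):
    """Baseline: read the file without parsing it at all."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def timed(func, path):
    """Run func(path) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start
```

Running timed(count_lines, path) and timed(count_events_auparse, path)
on the same file, and then timing an equivalent C loop adapted from the
auparse test programs, would show how much of the delay comes from I/O,
from the library, and from the Python layer.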

-Steve

PS - I keep a TODO file up to date that will always let you know what the 
immediate plans are: https://fedorahosted.org/audit/browser/trunk/TODO


* Re: audit-viewer performance
From: LC Bruzenak @ 2009-12-19 18:20 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit

On Sat, Dec 19, 2009 at 7:34 AM, Steve Grubb <sgrubb@redhat.com> wrote:
> On Friday 18 December 2009 08:42:51 pm LC Bruzenak wrote:
>> What is the plan for this tool? As I said, I think it is very nice
>> feature-wise in general but in practice it isn't living up to
>> expectations.
>> I can try to help but will take a while to get python-proficient. Or
>> is the trouble in the parse library?
>
> The audit parsing library has not been optimized for handling large data sets.
> I don't think it's the entire problem you are seeing, but I'm sure it's a
> contributor to the problem. I was planning to look at performance issues in a
> future release.

I should be able to help out testing, debugging, etc. since we really
use the aggregation capability on high-volume systems and therefore
have a big data store to use in testing.

> But you could test the native C library against the python version to see if
> python itself is adding delay.

I'll try to take a look at this.

It seems to me that a relational DB would help here. Rather than
parsing the entire log structure every time, audit-viewer could query
for just the desired data and leverage the DB's optimization.
If you made a change that big, though, you might also consider making
it network-capable, similar in form and function to the prewikka
viewer, which handles large amounts of data pretty well.
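
To make the relational DB idea concrete, here is a rough sketch using
Python's built-in sqlite3 module. It is an assumption-laden
illustration, not a proposal for audit-viewer's schema: the table
layout and the helper names parse_records, build_index and query_type
are hypothetical. It relies only on the standard record prefix
"type=NAME msg=audit(SECONDS.MILLIS:SERIAL):" and keeps the rest of
each record as unparsed text.

```python
import re
import sqlite3

# Standard audit record prefix; search() also tolerates a leading "node=..." field.
HEADER = re.compile(r"type=(\S+) msg=audit\((\d+)\.(\d+):(\d+)\):\s*(.*)")


def parse_records(lines):
    """Yield (timestamp, serial, type, body) for each recognizable line."""
    for line in lines:
        m = HEADER.search(line)
        if m:
            rtype, sec, frac, serial, body = m.groups()
            yield float(sec + "." + frac), int(serial), rtype, body


def build_index(lines, db_path=":memory:"):
    """Load records once into an indexed table instead of reparsing each time."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS records "
               "(ts REAL, serial INTEGER, type TEXT, body TEXT)")
    db.execute("CREATE INDEX IF NOT EXISTS records_type_ts "
               "ON records (type, ts)")
    with db:  # commit all inserts as a single transaction
        db.executemany("INSERT INTO records VALUES (?, ?, ?, ?)",
                       parse_records(lines))
    return db


def query_type(db, rtype, since=0.0):
    """Return (ts, serial, body) rows of one record type at or after `since`."""
    cur = db.execute(
        "SELECT ts, serial, body FROM records "
        "WHERE type = ? AND ts >= ? ORDER BY ts, serial",
        (rtype, since))
    return cur.fetchall()
```

A filter such as "all USER_LOGIN records since yesterday" then becomes
one indexed query rather than a pass over every record held in memory.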

>
> -Steve
>
> PS - I keep a TODO file up to date that will always let you know what the
> immediate plans are: https://fedorahosted.org/audit/browser/trunk/TODO
>

Very good. Thanks Steve, and Happy Holidays!

LCB.

-- 
LC (Lenny) Bruzenak
