From: Steve Grubb
Subject: Re: Performance of libauparse
Date: Tue, 30 Sep 2008 14:34:06 -0400
Message-ID: <200809301434.07079.sgrubb@redhat.com>
In-Reply-To: <48E22AC8.8010906@redhat.com>
References: <48E22AC8.8010906@redhat.com>
To: linux-audit@redhat.com

On Tuesday 30 September 2008 09:34:00 Matthew Booth wrote:
> To measure the overhead of libauparse on austream, I initialised auparse
> as AUSOURCE_FEED, fed each received record into it, and spat them out
> unmodified on receiving the AUPARSE_CB_EVENT_READY event.

This is not an apples-to-apples comparison. libauparse fully parses each
field, so it is doing significantly more work. You could add a strtok_r loop
to your code to come closer to a direct comparison.

> This added more than an order of magnitude to the time austream spends in
> userspace. A brief look at this overhead shows that about 40% is spent
> in malloc()/free(),

Yep. The main idea was really just to get it working and then optimize in a
future release. The memory management is the low-hanging fruit. I've been
thinking of fixing it so that each field is not malloc'ed individually, but
rather string pointers and lengths are stored and used internally.

> and 25% is spent in strlen, strdup, memcpy, memmove and friends. I suspect
> that very substantial gains could be made in the performance of libauparse
> by reworking the way it uses memory, and passing the length of strings
> around with the strings. Unfortunately, I suspect this would amount to a
> substantial rewrite.

Possibly.
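[For illustration, the pointer-plus-length field representation described above could be sketched roughly like this. The struct and function names are hypothetical, not the actual libauparse internals:]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-plus-length field view: instead of strdup'ing
 * every field out of the record buffer, keep a pointer into the buffer
 * plus a length. No per-field malloc, and no strlen on every access. */
struct field_view {
    const char *ptr;   /* points into the original record buffer */
    size_t len;        /* field length, carried along with the pointer */
};

/* Compare a view against a NUL-terminated string without copying. */
static int field_view_eq(const struct field_view *f, const char *s)
{
    size_t n = strlen(s);
    return f->len == n && memcmp(f->ptr, s, n) == 0;
}
```

Because the views only borrow from the record buffer, freeing an event would become a single free of the buffer rather than one free per field.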
But I wasn't planning to do this until after solving the interlaced record
problem. I'd rather it be the current speed and correct than faster and
still wrong.

> Is this something anybody else is interested in? I guess performance
> isn't so important if you're just scanning log files in non-real time.

Yes, after the next release. In the meantime, it might not hurt to add some
tests to the auparse_test programs so that any rewrite-induced regression
has a chance of being found.

> [1] What I'd really like is a well-defined binary format from the kernel.

Not likely to ever happen.

-Steve