* libaudit large stack requirement in audit_send()
@ 2013-04-29 16:15 Luciano Chavez
2013-04-29 16:48 ` Steve Grubb
0 siblings, 1 reply; 4+ messages in thread
From: Luciano Chavez @ 2013-04-29 16:15 UTC (permalink / raw)
To: linux-audit
Hello,
I am working with a development team developing a J2EE application. They
reported a problem with a crash in audit_send(). The crash occurred in a
ppc64 architecture environment early on in the invocation to audit send.
The crash occurs in this instruction which is establishing the size of
the local stack:
=> 0xfff73237994 <audit_send+52>: stdu r1,-27232(r1)
I found one large struct defined to a local variable
(gdb) print sizeof(struct audit_message)
$4 = 8988
but you will note that it asks for much more than that and after looking
at the audit_send() routine, it calls a function called check_ack()
which appears to be inlined and it contains two even larger definitions
on the stack for the following structure:
struct audit_reply
(gdb) print sizeof(struct audit_reply)
$3 = 9016
So, the combination of the three is what requires almost 26.5K of local
stack usage in this frame alone.
Is there a requirement for libaudit to have the structs on the stack
versus allocated from heap? Is so, is this requirement documented
somewhere?
To be fair, the Java application has some heavy stack usage as it is
since it is deployed in a web application server and there is a JNI
function that is somewhere in the call stack as well. However, the stack
usage in the audit_send() function seems ... excessive.
Originally the thread stacksize size was set to 256K and that did not
help but once we raised it to 1MB it did but I think that is probably
more than we really need.
I have looked at the source for the audit 2.2.3 release from March and
don't see a difference in how the structs are allocated. So once again,
if there is not a requirement that the structs be on the stack, should
they not be allocated off the heap?
regards,
--
Luciano Chavez <lnx1138@linux.vnet.ibm.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: libaudit large stack requirement in audit_send()
2013-04-29 16:15 libaudit large stack requirement in audit_send() Luciano Chavez
@ 2013-04-29 16:48 ` Steve Grubb
2013-04-29 18:35 ` Steve Grubb
2013-04-29 21:02 ` Luciano Chavez
0 siblings, 2 replies; 4+ messages in thread
From: Steve Grubb @ 2013-04-29 16:48 UTC (permalink / raw)
To: linux-audit
On Monday, April 29, 2013 11:15:16 AM Luciano Chavez wrote:
> Hello,
>
> I am working with a development team developing a J2EE application. They
> reported a problem with a crash in audit_send(). The crash occurred in a
> ppc64 architecture environment early on in the invocation to audit send.
>
> The crash occurs in this instruction which is establishing the size of
> the local stack:
>
> => 0xfff73237994 <audit_send+52>: stdu r1,-27232(r1)
>
> I found one large struct defined to a local variable
>
> (gdb) print sizeof(struct audit_message)
> $4 = 8988
>
> but you will note that it asks for much more than that and after looking
> at the audit_send() routine, it calls a function called check_ack()
At the time it calls check_ack, its done with that variable and the compiler
could have retired that area of the stack for reuse by something else.
> which appears to be inlined
It doesn't have the inline keyword.
> and it contains two even larger definitions
> on the stack for the following structure:
>
> struct audit_reply
>
> (gdb) print sizeof(struct audit_reply)
> $3 = 9016
They should be the same size. One is in an if statement that only executes on
a netlink message error. That should not be a normal case.
> So, the combination of the three is what requires almost 26.5K of local
> stack usage in this frame alone.
>
> Is there a requirement for libaudit to have the structs on the stack
> versus allocated from heap? Is so, is this requirement documented
> somewhere?
The audit daemon has to be reliable, It was designed to use as few malloc's as
possible so that nothing unexpected can happen, like memory fragmentation or
stopping because an allocation failed.
> To be fair, the Java application has some heavy stack usage as it is
> since it is deployed in a web application server and there is a JNI
> function that is somewhere in the call stack as well. However, the stack
> usage in the audit_send() function seems ... excessive.
Well, if you are within 26k of crashing due to being out of stack, then you
might just be shifting the problem to something else cause you're about out of
stack. :-) To me, it sounds a bit like the compiler could be more aggressive
on reclaiming the unused stack area once it sees stack variables expire.
> Originally the thread stacksize size was set to 256K and that did not
> help but once we raised it to 1MB it did but I think that is probably
> more than we really need.
>
> I have looked at the source for the audit 2.2.3 release from March and
> don't see a difference in how the structs are allocated. So once again,
> if there is not a requirement that the structs be on the stack, should
> they not be allocated off the heap?
Well, except that I am trying real hard to avoid malloc in all functions that
are executed by auditd. You'll find it in other places, like library functions
that are used by other apps or things that parse and analyze. They aren't as
critical as auditd staying running.
I'm open to ideas. I don't know if putting req.nlh.nlmsg_len into a tmp
variable would help the compiler realise it can reclaim the area occupied by
req or not. I might be able to get rid of rep2 by reusing the area of rep
after coping out the important part of it. But this part of the code hasn't
significantly changed in 7-8 years.
-Steve
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: libaudit large stack requirement in audit_send()
2013-04-29 16:48 ` Steve Grubb
@ 2013-04-29 18:35 ` Steve Grubb
2013-04-29 21:02 ` Luciano Chavez
1 sibling, 0 replies; 4+ messages in thread
From: Steve Grubb @ 2013-04-29 18:35 UTC (permalink / raw)
To: linux-audit
On Monday, April 29, 2013 12:48:48 PM Steve Grubb wrote:
> I'm open to ideas. I don't know if putting req.nlh.nlmsg_len into a tmp
> variable would help the compiler realise it can reclaim the area occupied
> by req or not. I might be able to get rid of rep2 by reusing the area of
> rep after coping out the important part of it. But this part of the code
> hasn't significantly changed in 7-8 years.
I checked in a change (792) that reduces the stack use by 9008 bytes. Not sure
I can get much more.
-Steve
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: libaudit large stack requirement in audit_send()
2013-04-29 16:48 ` Steve Grubb
2013-04-29 18:35 ` Steve Grubb
@ 2013-04-29 21:02 ` Luciano Chavez
1 sibling, 0 replies; 4+ messages in thread
From: Luciano Chavez @ 2013-04-29 21:02 UTC (permalink / raw)
To: Steve Grubb; +Cc: linux-audit
On Mon, 2013-04-29 at 12:48 -0400, Steve Grubb wrote:
> On Monday, April 29, 2013 11:15:16 AM Luciano Chavez wrote:
> > Hello,
> >
> > I am working with a development team developing a J2EE application. They
> > reported a problem with a crash in audit_send(). The crash occurred in a
> > ppc64 architecture environment early on in the invocation to audit send.
> >
> > The crash occurs in this instruction which is establishing the size of
> > the local stack:
> >
> > => 0xfff73237994 <audit_send+52>: stdu r1,-27232(r1)
> >
> > I found one large struct defined to a local variable
> >
> > (gdb) print sizeof(struct audit_message)
> > $4 = 8988
> >
> > but you will note that it asks for much more than that and after looking
> > at the audit_send() routine, it calls a function called check_ack()
>
> At the time it calls check_ack, its done with that variable and the compiler
> could have retired that area of the stack for reuse by something else.
>
> > which appears to be inlined
>
> It doesn't have the inline keyword.
>
Hi Steve,
Thanks for the quick reply. More than likely the compiler inlined the
check_ack() function as there is only one call to it from audit_send()
but even so, I am not sure that mattered.
>
> > and it contains two even larger definitions
> > on the stack for the following structure:
> >
> > struct audit_reply
> >
> > (gdb) print sizeof(struct audit_reply)
> > $3 = 9016
>
> They should be the same size. One is in an if statement that only executes on
> a netlink message error. That should not be a normal case.
>
>
> > So, the combination of the three is what requires almost 26.5K of local
> > stack usage in this frame alone.
> >
> > Is there a requirement for libaudit to have the structs on the stack
> > versus allocated from heap? Is so, is this requirement documented
> > somewhere?
>
> The audit daemon has to be reliable, It was designed to use as few malloc's as
> possible so that nothing unexpected can happen, like memory fragmentation or
> stopping because an allocation failed.
>
>
> > To be fair, the Java application has some heavy stack usage as it is
> > since it is deployed in a web application server and there is a JNI
> > function that is somewhere in the call stack as well. However, the stack
> > usage in the audit_send() function seems ... excessive.
>
> Well, if you are within 26k of crashing due to being out of stack, then you
> might just be shifting the problem to something else cause you're about out of
> stack. :-) To me, it sounds a bit like the compiler could be more aggressive
> on reclaiming the unused stack area once it sees stack variables expire.
>
>
> > Originally the thread stacksize size was set to 256K and that did not
> > help but once we raised it to 1MB it did but I think that is probably
> > more than we really need.
> >
> > I have looked at the source for the audit 2.2.3 release from March and
> > don't see a difference in how the structs are allocated. So once again,
> > if there is not a requirement that the structs be on the stack, should
> > they not be allocated off the heap?
>
> Well, except that I am trying real hard to avoid malloc in all functions that
> are executed by auditd. You'll find it in other places, like library functions
> that are used by other apps or things that parse and analyze. They aren't as
> critical as auditd staying running.
>
In our case the libaudit functions were invoked as a result of audit
logging after a pam_authenticate() function was called by the JNI code
in the context of java native thread. It wasn't within auditd but you
did confirm that placing these structs was intentional as part of the
design process and not something else. Thanks.
> I'm open to ideas. I don't know if putting req.nlh.nlmsg_len into a tmp
> variable would help the compiler realise it can reclaim the area occupied by
> req or not. I might be able to get rid of rep2 by reusing the area of rep
> after coping out the important part of it. But this part of the code hasn't
> significantly changed in 7-8 years.
>
> -Steve
>
I saw your followup and thanks for freeing up some stack space. I saw
the diff in trunk. I'll see if the development team wants to try another
libaudit with this change to reduce there stacksize requirements a bit.
regards,
--
Luciano Chavez <lnx1138@linux.vnet.ibm.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-04-29 21:02 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-29 16:15 libaudit large stack requirement in audit_send() Luciano Chavez
2013-04-29 16:48 ` Steve Grubb
2013-04-29 18:35 ` Steve Grubb
2013-04-29 21:02 ` Luciano Chavez
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox