public inbox for linux-audit@redhat.com
* question on audit_backlog settings and how to prevent the system from hanging due to audit overload
@ 2011-10-06 20:27 larry.erdahl
  2011-10-07 12:28 ` Steve Grubb
  0 siblings, 1 reply; 2+ messages in thread
From: larry.erdahl @ 2011-10-06 20:27 UTC (permalink / raw)
  To: linux-audit

I have a Red Hat 5.4 server on which I'm using Snare to control the audit 
rules. Recently this server hung, and the evidence pointed to the 
SnareDispatcher as the cause. You can see from the samples below that the 
dispatcher was running at 99 - 100%.
The morning of the hang, auditd peaked at ~200,000 events/hour, up from 
~50,000 events/hour. Is there a way to protect the server from hanging 
during unexpected loads like this? 

I'm assuming from what I've read that I'll need to increase the 
audit_backlog limit to something higher. Before increasing the number of 
buffers, I'd like to get a clearer understanding of their size and how 
increasing these buffers may impact my overall system performance. Are 
there any recommendations on what the settings should be, or a formula 
that I could use to determine the proper setting? 

I am looking into what may have caused the spike, but I'd like to know 
what my options are to keep from having another system hang.


Any help would be appreciated 

Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 > audit_backlog_limit=320
Sep 30 01:29:16 <servername> kernel: audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
Sep 30 01:29:16 <servername> kernel: audit: backlog limit exceeded
Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 > audit_backlog_limit=320
Sep 30 01:29:16 <servername> auditmanager: Received wakeup signal before sleep finished
And this is from the process monitoring:
1:16:06  4545  99.8  0  99.8  140848  3292  12  0  484  0  0  SnareDispatchHe  4.16  12
1:21:07  4545  99.9  0  99.9  140848  3292  12  0  484  0  0  SnareDispatchHe  4.16  12
1:26:07  4545  100   0  100   140848  3292  12  0  484  0  0  SnareDispatchHe  4.17  12
1:31:07  4545  99.7  0  99.7  140848  3292  12  0  484  0  0  SnareDispatchHe  4.15  12
1:36:07  4545  99.9  0  99.9  140848  3292  12  0  484  0  0  SnareDispatchHe  4.16  12
1:41:07  4545  99.9  0  99.9  140848  3292  12  0  484  0  0  SnareDispatchHe  4.16  12
1:46:08  4545  99.9  0  99.9  140848  3292  12  0  484  0  0  SnareDispatchHe  4.16  12
1:51:08  4545  82.8  0  82.8  140848  3292  12  0  484  0  0  SnareDispatchHe  3.45  12

Thanks....

Larry E. Erdahl
Information Security Services
Computer Security Incident Response Team (CSIRT)
1 Meridian Crossing 
Richfield, MN 55423
Mail Code: EP-MN-MS6I
Office Phone: (612)973-7153


* Re: question on audit_backlog settings and how to prevent the system from hanging due to audit overload
  2011-10-06 20:27 question on audit_backlog settings and how to prevent the system from hanging due to audit overload larry.erdahl
@ 2011-10-07 12:28 ` Steve Grubb
  0 siblings, 0 replies; 2+ messages in thread
From: Steve Grubb @ 2011-10-07 12:28 UTC (permalink / raw)
  To: linux-audit

On Thursday, October 06, 2011 04:27:03 PM larry.erdahl@usbank.com wrote:
> I have a Red Hat 5.4 server on which I'm using Snare to control the audit
> rules. Recently this server hung, and the evidence pointed to the
> SnareDispatcher as the cause. You can see from the samples below that the
> dispatcher was running at 99 - 100%.
> The morning of the hang, auditd peaked at ~200,000 events/hour, up from
> ~50,000 events/hour. Is there a way to protect the server from hanging
> during unexpected loads like this?
> 
> I'm assuming from what I've read that I'll need to increase the
> audit_backlog limit to something higher. Before increasing the number of
> buffers, I'd like to get a clearer understanding of their size and how
> increasing these buffers may impact my overall system performance. Are
> there any recommendations on what the settings should be, or a formula
> that I could use to determine the proper setting?

What the kernel sends to user space is a data structure like this:

#define MAX_AUDIT_MESSAGE_LENGTH    8970 // PATH_MAX*2+CONTEXT_SIZE*2+11+256+1
struct audit_message {
        struct nlmsghdr nlh;
        char   data[MAX_AUDIT_MESSAGE_LENGTH];
};

This is in an skb, so there is probably some more memory used for skb bookkeeping. You 
might just round that off to 9000 bytes and be close enough for practical purposes. 
Increasing the backlog limit means that the kernel allocates this memory and it's no 
longer available for user space. With the size of memory in current hardware, I don't 
think you have to worry too much as long as the setting is sane. A backlog limit of 
8192 means it occupies a little over 70 MB of memory (8192 x 9000 bytes). But if you 
need to do this, you need to do this.
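
To put numbers on that estimate, here is a rough sketch in C. The 9000-byte 
figure is the rounded per-buffer size from above, an approximation rather than 
an exact kernel value, and the function names are made up for illustration. It 
assumes the worst case, where every backlog slot holds a full-size message.

```c
#include <assert.h>
#include <stdint.h>

/* Payload size of one audit message, from the struct above. */
#define MAX_AUDIT_MESSAGE_LENGTH 8970u

/* Assumption: about 9000 bytes per queued message once the netlink header
 * and skb bookkeeping are rounded in. This is an estimate, not a kernel
 * constant. */
#define APPROX_BYTES_PER_BUFFER 9000u

/* Worst-case bytes held by the backlog if every slot is a full-size message. */
uint64_t backlog_bytes(uint32_t backlog_limit)
{
    /* The rounded figure must at least cover the raw payload. */
    assert(APPROX_BYTES_PER_BUFFER >= MAX_AUDIT_MESSAGE_LENGTH);
    return (uint64_t)backlog_limit * APPROX_BYTES_PER_BUFFER;
}

/* Same figure in MiB (1 MiB = 1048576 bytes). */
double backlog_mib(uint32_t backlog_limit)
{
    return (double)backlog_bytes(backlog_limit) / (1024.0 * 1024.0);
}
```

With these assumptions, backlog_mib(8192) comes out a little over 70, 
backlog_mib(4096) is about 35, and the default of 320 stays under 3.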


> I am looking into what may have caused the spike, but I'd like to know
> what my options are to keep from having another system hang.

Do you use keys for your audit rules? If so, run the key report to get an idea of what 
was happening. From that you can zero in on what it was. You may also have a rule that 
is too aggressive in logging. For example, perhaps you record file deletions in /usr/* 
and then a yum update comes along, overwriting and deleting thousands of files in a 
few seconds.
 
 
> Any help would be appreciated

Another possibility is increasing the audit daemon's priority a little and making sure 
its disk performance is tuned.

 
> Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 >
> audit_backlog_limit=320

This is the default setting. It's a bit low for production use. I'd bump that up a lot. 
Make it at least 4096 if not 8192.
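
For a feel of how much headroom each limit buys, here is a back-of-the-envelope 
sketch: if the consumer stalls completely, how long until the backlog is full? 
It assumes, as a simplification, that each event takes one backlog slot and that 
events arrive at a steady rate. seconds_to_fill is a hypothetical helper, not 
part of any audit library.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until the backlog is full if nothing drains it.
 * Assumptions: one backlog slot per event, steady arrival rate.
 * Real events can span several records, so this is an upper bound. */
double seconds_to_fill(uint32_t backlog_limit, double events_per_hour)
{
    assert(events_per_hour > 0.0);
    double events_per_second = events_per_hour / 3600.0;
    return (double)backlog_limit / events_per_second;
}
```

At the ~200,000 events/hour peak you reported, a limit of 320 would fill in 
under 6 seconds, 4096 in about 74 seconds, and 8192 in roughly 147 seconds. 
The limit itself is set with auditctl -b, though since Snare manages your 
rules it may need to be set through Snare's own configuration.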

-Steve
