* question on audit_backlog settings and how to prevent the system from hanging due to audit overload
From: larry.erdahl @ 2011-10-06 20:27 UTC (permalink / raw)
To: linux-audit
I have a Red Hat 5.4 server on which I use Snare to control the audit rules.
Recently this server hung on me, and the evidence pointed to the
SnareDispatcher as the cause. You can see from the samples below that the
dispatcher was running at 99-100%.

The morning of the hang, auditd peaked at ~200,000 events/hour, up from
~50,000 events/hour. Is there a way to protect the server from hanging
during unexpected loads like this?

From what I've read, I assume I'll need to increase the audit_backlog
limit to something higher. Before increasing the number of buffers, I'd
like to get a clearer understanding of their size and how increasing
these buffers may impact my overall system performance. Are there any
recommendations on what the settings should be, or a formula that I could
use to determine the proper setting?

I am looking into what may have caused the spike, but I'd like to know what
my options are to keep from having another system hang.

Any help would be appreciated.

Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 > audit_backlog_limit=320
Sep 30 01:29:16 <servername> kernel: audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
Sep 30 01:29:16 <servername> kernel: audit: backlog limit exceeded
Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 > audit_backlog_limit=320
Sep 30 01:29:16 <servername> auditmanager: Received wakeup signal before sleep finished

And this is in the process monitoring:

1:16:06 4545 99.8 0 99.8 140848 3292 12 0 484 0 0 SnareDispatchHe 4.16 12
1:21:07 4545 99.9 0 99.9 140848 3292 12 0 484 0 0 SnareDispatchHe 4.16 12
1:26:07 4545 100 0 100 140848 3292 12 0 484 0 0 SnareDispatchHe 4.17 12
1:31:07 4545 99.7 0 99.7 140848 3292 12 0 484 0 0 SnareDispatchHe 4.15 12
1:36:07 4545 99.9 0 99.9 140848 3292 12 0 484 0 0 SnareDispatchHe 4.16 12
1:41:07 4545 99.9 0 99.9 140848 3292 12 0 484 0 0 SnareDispatchHe 4.16 12
1:46:08 4545 99.9 0 99.9 140848 3292 12 0 484 0 0 SnareDispatchHe 4.16 12
1:51:08 4545 82.8 0 82.8 140848 3292 12 0 484 0 0 SnareDispatchHe 3.45 12

Thanks....
Larry E. Erdahl
Information Security Services
Computer Security Incident Response Team (CSIRT)
1 Meridian Crossing
Richfield, MN 55423
Mail Code: EP-MN-MS6I
Office Phone: (612)973-7153
* Re: question on audit_backlog settings and how to prevent the system from hanging due to audit overload
From: Steve Grubb @ 2011-10-07 12:28 UTC (permalink / raw)
To: linux-audit
On Thursday, October 06, 2011 04:27:03 PM larry.erdahl@usbank.com wrote:
> I have a Red Hat 5.4 server on which I use Snare to control the audit rules.
> Recently this server hung on me, and the evidence pointed to the
> SnareDispatcher as the cause. You can see from the samples below that the
> dispatcher was running at 99-100%.
>
> The morning of the hang, auditd peaked at ~200,000 events/hour, up from
> ~50,000 events/hour. Is there a way to protect the server from hanging
> during unexpected loads like this?
>
> From what I've read, I assume I'll need to increase the audit_backlog
> limit to something higher. Before increasing the number of buffers, I'd
> like to get a clearer understanding of their size and how increasing
> these buffers may impact my overall system performance. Are there any
> recommendations on what the settings should be, or a formula that I could
> use to determine the proper setting?

What the kernel sends to user space is a data structure like this:

#define MAX_AUDIT_MESSAGE_LENGTH 8970 // PATH_MAX*2+CONTEXT_SIZE*2+11+256+1
struct audit_message {
        struct nlmsghdr nlh;
        char data[MAX_AUDIT_MESSAGE_LENGTH];
};

This is in an skb, so there is probably some more memory used for skb bookkeeping. You
might just round that off to 9000 bytes and be close enough for practical purposes.

Increasing the backlog limit means that the kernel allocates this memory and it's no
longer available for user space. With the size of memory in current hardware, I don't
think you have to worry too much as long as the setting is sane. A backlog length of
8192 buffers at roughly 9000 bytes each occupies a little over 70 MB of memory. But if
you need to do this, you need to do this.

> I am looking into what may have caused the spike, but I'd like to know what
> my options are to keep from having another system hang

Do you use keys for your audit rules? If so, run the key report to get an idea of what
was happening. From that you can zero in on what it was. You may also have a rule that
is too aggressive in logging. For example, perhaps you record file deletions in /usr/*
and then a yum update comes along, overwriting and deleting thousands of files in a
few seconds.

> Any help would be appreciated

Another possibility is to increase the audit daemon's priority a little and make sure
its disk performance is tuned.

> Sep 30 01:29:16 <servername> kernel: audit: audit_backlog=321 > audit_backlog_limit=320

This is the default setting. It's a bit low for production use. I'd bump that up a lot.
Make it at least 4096 if not 8192.
-Steve