Log rotation and client disconnects

* Log rotation and client disconnects
From: rshaw1 @ 2010-08-09 16:59 UTC
  To: linux-audit

I've been having a few issues lately with auditd.  I'm running the version
packaged with RHEL5 (1.7.17), with one machine collecting logs for a few
hundred others using audisp.

I had been using logrotate to rotate the logs (in order to get them named
with a date extension, bzipped a day after being rotated, etc.)  I thought
that restarting the daemons each night might be causing issues with many
clients trying to reconnect at once, so I tried using copytruncate in
order to avoid restarting.  This appears to make auditd crash, so I'm
looking at using its built-in rotation.  However, "service auditd rotate"
does not do anything.  The man page says this "will consult the
max_log_size_action to see if it should keep the logs or not", but I'm not
sure what that means; there is "max_log_file_action", which I have set to
"ignore" as the FAQ specifies.

I'm also having separate issues with some clients disconnecting from the
server, retrying twice in about a 40 second interval, and then giving up. 
The server isn't going down, and this isn't even happening at the same
time I was restarting auditd.  I would really like the clients to make
more of an effort at reconnecting.  I have the configuration options set
like so on the clients, but maybe I'm misunderstanding what they do:

network_retry_time = 30
max_tries_per_record = 60
max_time_per_record = 5
...
remote_ending_action = reconnect

Finally, if anyone has any recommendations for setting tcp_listen_queue on
the server (I'm not sure if this is supposed to indicate a number of audit
messages or clients) and queue_depth on the clients when using a few
hundred clients, that would be great.

Thanks for any assistance,

--Ray

* Re: Log rotation and client disconnects
From: Steve Grubb @ 2010-08-09 17:53 UTC
  To: linux-audit

On Monday, August 09, 2010 12:59:50 pm rshaw1@umbc.edu wrote:
> I had been using logrotate to rotate the logs (in order to get them named
> with a date extension, bzipped a day after being rotated, etc.)  I thought
> that restarting the daemons each night might be causing issues with many
> clients trying to reconnect at once, so I tried using copytruncate in
> order to avoid restarting.  This appears to make auditd crash, so I'm
> looking at using its built-in rotation.

Yes, this is the preferred way.


> However, "service auditd rotate" does not do anything.

It should. I just double-checked the code, and I can't see how it could fail
without writing something to syslog.


> The man page says this "will consult the max_log_size_action to see if it
> should keep the logs or not", but I'm not sure what that means;

It means that if you set the action to rotate, then it will delete any log 
that results in a number higher than the num_logs.


> there is "max_log_file_action", which I have set to "ignore" as the FAQ
> specifies.

That means do nothing when the size of the log file exceeds max_log_file in
megabytes. But this has no effect on rotation by the "service auditd rotate"
technique. It's working like it's supposed to on my system.

 
> I'm also having separate issues with some clients disconnecting from the
> server, retrying twice in about a 40 second interval, and then giving up.
> The server isn't going down, and this isn't even happening at the same
> time I was restarting auditd. 

Anything written to syslog on either end?


> I would really like the clients to make more of an effort at reconnecting.  I
> have the configuration options set like so on the clients, but maybe I'm
> misunderstanding what they do:
> 
> network_retry_time = 30

^^ time to delay in seconds between retries

> max_tries_per_record = 60

How many times to retry.

> max_time_per_record = 5

Maximum time before doing the network failure action.

> remote_ending_action = reconnect
> 
> Finally, if anyone has any recommendations for setting tcp_listen_queue on
> the server (I'm not sure if this is supposed to indicate a number of audit
> messages or clients) and queue_depth on the clients when using a few
> hundred clients, that would be great.

If you have a few hundred clients, you will want to set tcp_listen_queue
higher. This is the queue size in the kernel for pending connections, so it
counts clients, not audit messages. How high? Experiment. But 25 would be a
good start; go higher from there.

-Steve
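
One way to reason about the three settings described above is a small model
in Python. This is only a sketch and not the audisp-remote source: it assumes
that the client checks both limits after every failed attempt, gives up at
whichever limit is reached first, and otherwise sleeps network_retry_time
seconds before trying again. The function name, the return shape, and the
reason labels are made up for illustration.

```python
def simulate_retries(network_retry_time, max_tries_per_record, max_time_per_record):
    """Model a server that never answers.

    Returns (attempts, elapsed_seconds, limit_that_ended_the_retries).
    Assumption: both limits are checked after each failed attempt.
    """
    if max_tries_per_record < 1:
        raise ValueError("this model needs max_tries_per_record >= 1")
    attempts = 0
    elapsed = 0
    while True:
        attempts += 1  # one failed attempt to deliver the record
        if attempts >= max_tries_per_record:
            return attempts, elapsed, "max_tries_per_record"
        if elapsed >= max_time_per_record:
            return attempts, elapsed, "max_time_per_record"
        elapsed += network_retry_time  # delay before the next attempt


# The values quoted above: 30 s between retries, 60 tries, 5 s per record.
print(simulate_retries(30, 60, 5))  # (2, 30, 'max_time_per_record')
```

Under these assumptions the 5-second time limit is already exceeded after the
first 30-second delay, so the try count never gets near 60. If the real client
behaves like the model, riding out a longer outage would need
max_time_per_record to be at least as long as the outage, with
max_tries_per_record raised to match.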

* Re: Log rotation and client disconnects
From: rshaw1 @ 2010-08-12 14:02 UTC
  To: linux-audit

I just realized that my last reply went only to Steve Grubb, and not the
list.  Sorry about that.  This Webmail client is pretty awful, but at the
moment, I have to use it.

I've discovered the issue since I sent it, anyway.  If num_logs is set to
0, auditd will ignore explicit requests to rotate the logs.  I guess this
may be intentional, but it's unfortunate as num_logs caps at 99 and I need
to keep 365 of them.  I suppose that since I'll have to rename and bzip
them anyway, I may as well just move them to another location (maybe
/var/log/audit/archive) so that auditd doesn't "see" them, unless there's
a better way to do this.

I'm still not sure what to do about the disconnection issues (although
hopefully those will be very infrequent once I'm no longer restarting any
of the daemons).  If a client does lose the connection to the server for a
while though (say, an hour-long network outage for networking upgrades),
I'd like to be able to tell them to try reconnecting periodically, and the
combination of network_retry_time and max_tries_per_record doesn't seem to
be the way to do that.

Other than checking the logs, is there a way to determine whether or not a
running audispd is connected to the remote server?

>> I'm also having separate issues with some clients disconnecting from the
>> server, retrying twice in about a 40 second interval, and then giving
>> up.
>> The server isn't going down, and this isn't even happening at the same
>> time I was restarting auditd.
>
> Anything written to syslog on either end?

Nothing is on the server, but this is (everything) on the client:

Aug  4 23:12:07 host1 audisp-remote: connection to host2 closed unexpectedly
Aug  4 23:12:07 host1 audisp-remote: Connected to host2
Aug  4 23:12:12 host1 audisp-remote: connection to host2 closed unexpectedly
Aug  4 23:12:42 host1 audisp-remote: network failure, max retry time exhausted

Thanks,

--Ray

* Re: Log rotation and client disconnects
From: Steve Grubb @ 2010-08-12 14:25 UTC
  To: linux-audit

On Thursday, August 12, 2010 10:02:29 am rshaw1@umbc.edu wrote:
> I've discovered the issue since I sent it, anyway.  If num_logs is set to
> 0, auditd will ignore explicit requests to rotate the logs.  I guess this
> may be intentional, but it's unfortunate as num_logs caps at 99 and I need
> to keep 365 of them.

Have you looked at the keep_logs option for max_log_file_action?


> I suppose that since I'll have to rename and bzip
> them anyway, I may as well just move them to another location (maybe
> /var/log/audit/archive) so that auditd doesn't "see" them, unless there's
> a better way to do this.

Yes, you should archive them away since by being in /var/log/audit, they are 
used in calculating the log space left. 
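
A minimal sketch of that archiving step in Python, meant to run right after
"service auditd rotate". It assumes the rotated files are named audit.log.1,
audit.log.2, and so on, that the archive directory lives outside
/var/log/audit, and that Python 3 is available (RHEL5 does not ship it by
default). The function name and the archive naming scheme are hypothetical.

```python
import bz2
import datetime
import pathlib
import shutil


def archive_rotated_logs(log_dir, archive_dir, day):
    """Compress each rotated audit.log.N into archive_dir, then delete the original.

    Returns the list of archive files written. An existing archive file with
    the same name is overwritten.
    """
    log_dir = pathlib.Path(log_dir)
    archive_dir = pathlib.Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # audit.log itself is still being written by auditd, so only files with a
    # numeric suffix are touched.
    rotated = sorted(p for p in log_dir.glob("audit.log.*") if p.suffix[1:].isdigit())
    for src in rotated:
        dest = archive_dir / f"audit.log-{day:%Y%m%d}{src.suffix}.bz2"
        with src.open("rb") as fin, bz2.open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        src.unlink()
        written.append(dest)
    return written


# Example call (paths are placeholders, adjust to the local layout):
# archive_rotated_logs("/var/log/audit", "/var/log/audit-archive",
#                      datetime.date.today())
```

Because the numbered files are gone after each run, auditd never sees more
than one rotated log, so the num_logs cap of 99 stops being a constraint on
how many days are kept.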

 
> I'm still not sure what to do about the disconnection issues (although
> hopefully those will be very infrequent once I'm no longer restarting any
> of the daemons).  If a client does lose the connection to the server for a
> while though (say, an hour-long network outage for networking upgrades),
> I'd like to be able to tell them to try reconnecting periodically, and the
> combination of network_retry_time and max_tries_per_record doesn't seem to
> be the way to do that.
> 
> Other than checking the logs, is there a way to determine whether or not a
> running audispd is connected to the remote server?

It logs this. Also, I suppose you could peek into its open descriptors with
lsof or just check in /proc.

-Steve
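
As a rough illustration of the /proc approach, the Python sketch below parses
the text of /proc/net/tcp and reports whether any IPv4 socket is in the
ESTABLISHED state (code 01) toward a given remote port. It is host-wide, so
it does not prove that the socket belongs to audisp-remote, and the port
number in the usage comment (60) is an assumption about the local
audisp-remote.conf. The function name is made up for illustration.

```python
def has_established_connection(proc_net_tcp_text, remote_port):
    """Return True if any IPv4 TCP socket is ESTABLISHED to remote_port."""
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        remote_address, state = fields[2], fields[3]
        port_hex = remote_address.rsplit(":", 1)[1]  # address is HEXIP:HEXPORT
        if state == "01" and int(port_hex, 16) == remote_port:  # 01 = ESTABLISHED
            return True
    return False


# Usage on a client (port 60 is an assumption, use the configured port):
# with open("/proc/net/tcp") as fh:
#     print(has_established_connection(fh.read(), 60))
```

Run from cron, a check like this could raise an alert when the connection has
been missing for longer than expected.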

* Re: Log rotation and client disconnects
From: LC Bruzenak @ 2010-08-12 14:31 UTC
  To: rshaw1; +Cc: linux-audit

On Thu, 2010-08-12 at 10:02 -0400, rshaw1@umbc.edu wrote:
> I've discovered the issue since I sent it, anyway.  If num_logs is set to
> 0, auditd will ignore explicit requests to rotate the logs.  I guess this
> may be intentional, but it's unfortunate as num_logs caps at 99 and I need
> to keep 365 of them.  I suppose that since I'll have to rename and bzip
> them anyway, I may as well just move them to another location (maybe
> /var/log/audit/archive) so that auditd doesn't "see" them, unless there's
> a better way to do this.

How big are your logfiles? Mine are 100MB each.
Each day I have to move mine out of the way for the same reasons.
However, the search tools are then impacted, since you'll need to know
where to find them. 
Also, since it appears you have a lot of data, I assume you are finding
performance issues on the audit-viewer?

> 
> I'm still not sure what to do about the disconnection issues (although
> hopefully those will be very infrequent once I'm no longer restarting any
> of the daemons).  If a client does lose the connection to the server for a
> while though (say, an hour-long network outage for networking upgrades),
> I'd like to be able to tell them to try reconnecting periodically, and the
> combination of network_retry_time and max_tries_per_record doesn't seem to
> be the way to do that.
> 
> Other than checking the logs, is there a way to determine whether or not a
> running audispd is connected to the remote server?

I do a combination of things to detect this on the sending side.
The network_failure_action of the audisp-remote.conf file allows for a
custom action using the "exec" option.

The remote_ending_action = reconnect helps if the server restarts its
auditd. Maybe your version is different from mine but I get the
reconnects...

Also - I have a big ugly system involving timestamps and reconnect
logic.

> 
> >> I'm also having separate issues with some clients disconnecting from the
> >> server, retrying twice in about a 40 second interval, and then giving
> >> up.
> >> The server isn't going down, and this isn't even happening at the same
> >> time I was restarting auditd.
> >
> > Anything written to syslog on either end?
> 
> Nothing is on the server, but this is (everything) on the client:
> 
> Aug  4 23:12:07 host1 audisp-remote: connection to host2 closed unexpectedly
> Aug  4 23:12:07 host1 audisp-remote: Connected to host2
> Aug  4 23:12:12 host1 audisp-remote: connection to host2 closed unexpectedly
> Aug  4 23:12:42 host1 audisp-remote: network failure, max retry time exhausted

I will go back and read your previous posts; maybe something will click.

LCB.

-- 
LC (Lenny) Bruzenak
lenny@magitekltd.com

* Re: Log rotation and client disconnects
From: rshaw1 @ 2010-08-12 15:16 UTC
  To: linux-audit

> On Thursday, August 12, 2010 10:02:29 am rshaw1@umbc.edu wrote:
>> I've discovered the issue since I sent it, anyway.  If num_logs is set to
>> 0, auditd will ignore explicit requests to rotate the logs.  I guess this
>> may be intentional, but it's unfortunate as num_logs caps at 99 and I need
>> to keep 365 of them.
>
> Have you looked at the keep_logs option for max_log_file_action?

I did, but the man page states that keep_logs is similar to rotate, so it
sounds like if I used this option, it would still rotate the log file if
it went above the max_log_file size, which I don't want to happen.  I
suppose I could just set max_log_file to 99999 or something (if that's
supported).  Typically, uncompressed log files for ~400 clients on the
central server end up being around 3-4GB.

Thanks for all the help so far; I think I'm almost there.

--Ray

* Re: Log rotation and client disconnects
From: LC Bruzenak @ 2010-08-12 15:57 UTC
  To: rshaw1; +Cc: linux-audit

On Thu, 2010-08-12 at 11:16 -0400, rshaw1@umbc.edu wrote:
> > On Thursday, August 12, 2010 10:02:29 am rshaw1@umbc.edu wrote:
> >> I've discovered the issue since I sent it, anyway.  If num_logs is set to
> >> 0, auditd will ignore explicit requests to rotate the logs.  I guess this
> >> may be intentional, but it's unfortunate as num_logs caps at 99 and I need
> >> to keep 365 of them.
> >
> > Have you looked at the keep_logs option for max_log_file_action?
> 
> I did, but the man page states that keep_logs is similar to rotate, so it
> sounds like if I used this option, it would still rotate the log file if
> it went above the max_log_file size, which I don't want to happen.  I
> suppose I could just set max_log_file to 99999 or something (if that's
> supported).  Typically, uncompressed log files for ~400 clients on the
> central server end up being around 3-4Gb.
> 
> Thanks for all the help so far; I think I'm almost there.
> 
> --Ray

Do you not want to rotate because of the time it takes?
Yep, the keep_logs does a rotate without a limit.

The max_log_file value is an unsigned long so it should take a very
large number. However, in case there is a lot of auditing you are not
prepared for, I'd suggest limiting the file size to 2GB. The rotate time
should be similar regardless of the file size.

BTW, over what time period are you getting the 3-4GB amounts? Are you
happy with the data you are getting - or maybe you could pare it down
some with audit.rules tweaks on the senders?

LCB.

-- 
LC (Lenny) Bruzenak
lenny@magitekltd.com

* Re: Log rotation and client disconnects
From: rshaw1 @ 2010-08-13 15:06 UTC
  To: linux-audit

LC Bruzenak wrote:
> On Thu, 2010-08-12 at 11:16 -0400, rshaw1@umbc.edu wrote:
>> > On Thursday, August 12, 2010 10:02:29 am rshaw1@umbc.edu wrote:
>> >> I've discovered the issue since I sent it, anyway.  If num_logs is set
>> >> to 0, auditd will ignore explicit requests to rotate the logs.  I guess
>> >> this may be intentional, but it's unfortunate as num_logs caps at 99 and
>> >> I need to keep 365 of them.
>> >
>> > Have you looked at the keep_logs option for max_log_file_action?
>>
>> I did, but the man page states that keep_logs is similar to rotate, so it
>> sounds like if I used this option, it would still rotate the log file if
>> it went above the max_log_file size, which I don't want to happen.  I
>> suppose I could just set max_log_file to 99999 or something (if that's
>> supported).  Typically, uncompressed log files for ~400 clients on the
>> central server end up being around 3-4Gb.
>
> Do you not want to rotate because of the time it takes?
> Yep, the keep_logs does a rotate without a limit.

I am required to rotate the logs once per day, and I would like to make it
exactly once per day. This is to make it easier to keep track (with
date-named logs), easier to keep 1 year's worth of logs (also required),
and easier to run reports on a particular workday's worth of events.

> The max_log_file value is an unsigned long so it should take a very
> large number. However, in case there is a lot of auditing you are not
> prepared for, I'd suggest limiting the file size to 2GB. The rotate time
> should be similar regardless of the file size.

I've made /var a little over 200G on the current audit collection machine
(and on its final destination, /var is much bigger than that).  I guess I
could set a very large "just in case" value that stops short of ludicrous,
but I'd really prefer that size-based rotation never happen.

> BTW, in what a time period are you getting the 3-4GB amounts? Are you
> happy with the data you are getting - or maybe you could pare it down
> some with audit.rules tweaks on the senders?

That amount of data is in one day, for all clients.  Whether I am happy is
somewhat less relevant than whether I am STIG-compliant :p  However, I do
have the data I want for running a few everyday, useful reports.  I'm not
sure whether I could reduce it much and still be auditing everything I'm
required to (I started with the example rules file, and added quite a bit;
each machine auto-generates a list of SUID/SGID binaries and adds rules
for them, etc.)  The size is manageable for us.

Given the nature of the systems, there are often lots of files being
created and destroyed.  This will get even worse once the RHEL4 ones are
brought up (probably to 6), as I'm not auditing them since they'd need
different rules and don't have audisp.

(Technology preview or no, I'm very happy to have audisp; certain other
systems aren't so lucky.)

> Each day I have to move mine out of the way for the same reasons.
> However, the search tools are then impacted, since you'll need to know
> where to find them.
> Also, since it appears you have a lot of data, I assume you are finding
> performance issues on the audit-viewer?

Well, I can't run aureport --summary; it pegs the CPU for hours and hours.
 That's not really a big deal for me, though.  I have a script that runs
shortly after the logs are rotated, generating a report based on the
previous day's data.  It's using 3 aureports and one ausearch (piped
through a bunch of stuff).  Usually takes less than 15 minutes to run.  At
the moment, this is the main way we're using the data, though I'm hoping
to do more in the future.  I've glanced at the audit+Prelude HOWTO, since
Prelude can do a few other things that appeal to me.

(The ausearch used to be an aureport, but aureport --anomaly -i doesn't
seem to get the node/host names from the logs, which is why I ended up
writing my own thing.  Interestingly, --anomaly isn't even in the man page
for aureport; I've no idea where I found it.  I don't know if any of this
is different in more recent versions.)

>> I'm still not sure what to do about the disconnection issues (although
>> hopefully those will be very infrequent once I'm no longer restarting any
>> of the daemons).  If a client does lose the connection to the server for a
>> while though (say, an hour-long network outage for networking upgrades),
>> I'd like to be able to tell them to try reconnecting periodically, and the
>> combination of network_retry_time and max_tries_per_record doesn't seem to
>> be the way to do that.
>>
>> Other than checking the logs, is there a way to determine whether or not a
>> running audispd is connected to the remote server?
>
> I do a combination of things to detect this on the sending side.
> The network_failure_action of the audisp-remote.conf file allows for a
> custom action using the "exec" option.
>
> The remote_ending_action = reconnect helps if the  (server) restarts its
> auditd. Maybe your version is different from mine but I get the
> reconnects...

Hrm.  This is what I have:

network_retry_time = 30
max_tries_per_record = 60
max_time_per_record = 5
network_failure_action = syslog (looks like I'll be changing that)
...
remote_ending_action = reconnect

Are you using the heartbeat_timeout stuff?  I haven't been.

> Also - I have a big ugly system involving timestamps and reconnect
> logic.

Yeah, I think I might come up with something like that, and use the "exec"
option for network_failure_action combined with cron stuff to keep
retrying.

Thanks,

--Ray

* Re: Log rotation and client disconnects
From: LC Bruzenak @ 2010-08-13 15:38 UTC
  To: rshaw1; +Cc: Linux Audit

On Fri, 2010-08-13 at 11:06 -0400, rshaw1@umbc.edu wrote:

> 
> (Technology preview or no, I'm very happy to have audisp; certain other
> systems aren't so lucky.)

I agree.

> 
> Well, I can't run aureport --summary; it pegs the CPU for hours and hours.
>  That's not really a big deal for me, though.  I have a script that runs
> shortly after the logs are rotated, generating a report based on the
> previous day's data.  It's using 3 aureports and one ausearch (piped
> through a bunch of stuff).  Usually takes less than 15 minutes to run.  At
> the moment, this is the main way we're using the data, though I'm hoping
> to do more in the future.  I've glanced at the audit+Prelude HOWTO, since
> Prelude can do a few other things that appeal to me.

I use this. Works pretty well.

> 
> (The ausearch used to be an aureport, but aureport --anomaly -i doesn't
> seem to get the node/host names from the logs, which is why I ended up
> writing my own thing.  Interestingly, --anomaly isn't even in the man page
> for aureport; I've no idea where I found it.  I don't know if any of this
> is different in more recent versions.)

That's a doc bug I guess. I have never heard of it.

> 
> Hrm.  This is what I have:
> 
> network_retry_time = 30
> max_tries_per_record = 60
> max_time_per_record = 5
> network_failure_action = syslog (looks like I'll be changing that)
> ...
> remote_ending_action = reconnect
> 
> Are you using the heartbeat_timeout stuff?  I haven't been.
Me:
network_retry_time = 1
max_tries_per_record = 10
max_time_per_record = 10
heartbeat_timeout = 30
...
remote_ending_action = reconnect

> 
> > Also - I have a big ugly system involving timestamps and reconnect
> > logic.
> 
> Yeah, I think I might come up with something like that, and use the "exec"
> option for network_failure_action combined with cron stuff to keep
> retrying.

That is what I do. It gets a little tricky, but it works.
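
For readers wondering what the timestamp half of such a setup might look
like, here is a hedged Python sketch. It is not the system described in this
thread. It assumes the script named by network_failure_action = exec records
when the outage began in a marker file, and that a cron job reads the marker
to decide when to attempt recovery. How the cron job actually makes the
client reconnect is left out because the thread does not say. All names and
the marker format are hypothetical.

```python
import os
import time


def record_failure(marker_path, now=None):
    """Run by the exec'd failure script: note when the outage began.

    Returns False if a marker already exists, so the earliest failure wins.
    """
    now = time.time() if now is None else now
    try:
        fd = os.open(marker_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as marker:
        marker.write(f"{now:.0f}\n")
    return True


def outage_seconds(marker_path, now=None):
    """Run from cron: seconds since the recorded failure, or None if no marker."""
    now = time.time() if now is None else now
    try:
        with open(marker_path) as marker:
            started = float(marker.read().strip())
    except FileNotFoundError:
        return None
    return now - started


def clear_marker(marker_path):
    """Run once the client is known to be connected again."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass
```

The cron side would call outage_seconds(), attempt whatever recovery step
works locally when it returns a number, and call clear_marker() after the
connection is confirmed.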

LCB.

-- 
LC (Lenny) Bruzenak
lenny@magitekltd.com
