From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steve Grubb <sgrubb@redhat.com>
Subject: Re: Lost events during boot
Date: Mon, 20 Mar 2017 11:08:56 -0400
Message-ID: <2742334.zvR4i4OIcv@x2>
References: <3997070.g5Zg3o8xPs@x2>
	<CAHC9VhR-SNN=P9GRCtcjUns8KaWEAffa67OkYEo=UKr5xg+mCQ@mail.gmail.com>
	<CAHC9VhQeRahXPGoDf5w6=mr175Ge3CMEKyDRxVNV4BA7pcW1Cg@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <linux-audit-bounces@redhat.com>
In-Reply-To: <CAHC9VhQeRahXPGoDf5w6=mr175Ge3CMEKyDRxVNV4BA7pcW1Cg@mail.gmail.com>
List-Unsubscribe: <https://www.redhat.com/mailman/options/linux-audit>,
	<mailto:linux-audit-request@redhat.com?subject=unsubscribe>
List-Archive: <https://www.redhat.com/archives/linux-audit>
List-Post: <mailto:linux-audit@redhat.com>
List-Help: <mailto:linux-audit-request@redhat.com?subject=help>
List-Subscribe: <https://www.redhat.com/mailman/listinfo/linux-audit>,
	<mailto:linux-audit-request@redhat.com?subject=subscribe>
Sender: linux-audit-bounces@redhat.com
Errors-To: linux-audit-bounces@redhat.com
To: Paul Moore <paul@paul-moore.com>
Cc: Richard Briggs <rgb@redhat.com>, linux-audit@redhat.com
List-Id: linux-audit@redhat.com

On Monday, March 20, 2017 10:55:43 AM EDT Paul Moore wrote:
> On Mon, Mar 20, 2017 at 10:44 AM, Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Mar 20, 2017 at 8:08 AM, Paul Moore <paul@paul-moore.com> wrote:
> >> On Sun, Mar 19, 2017 at 9:46 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> >>> Hello Richard and Paul,
> >>> 
> >>> I was going to do a blog write up about booting the system with
> >>> audit_backlog_limit=8192 for STIG users and have stumbled on to a
> >>> mystery. The kernel initializes the variable to 64 at power on. During
> >>> boot, if audit == 1, then it holds events in the hopes that an audit
> >>> daemon will show up later and drain all the events. Anything over 64
> >>> events should fall off the end and increment the lost counter and put a
> >>> notice in syslog.
> >>> 
> >>> However, when booting with audit_backlog_limit=8192, as soon as I log in
> >>> and run "auditctl -s" I can see I've lost 73 events. Then I run "aureport
> >>> --start boot" and I see 644 total events. This is nowhere near the 8192
> >>> limit that I asked for. So, why am I losing events?
> >>> 
> >>> Additionally, I checked the logs and there is absolutely no message in
> >>> syslog showing that I've lost events. This is with failure mode set to
> >>> 1 - which is default at power on. And this is in spite of the fact
> >>> that the source code seems to show that it should have printk'ed
> >>> something.
> >>> 
> >>> Any ideas? Can you replicate this finding?
> >> 
> >> It's funny, I just noticed this for the first time on Friday (the
> >> exact same lost count too), although it was a development kernel build
> >> with a *heavily* modified audit subsystem so I just assumed I had
> >> broken something with the queuing, the lost counter, or both.  It's
> >> possible I still may have broken something in the v4.10 queue rework,
> >> or something broke a long time ago and we are just noticing it now.
> >> 
> >> First off, can you create a GitHub issue for this and include your
> >> kernel build (e.g. 'uname -r')?  Second, if you are seeing this on a
> >> +v4.10 kernel, do you see the same results with a +v4.9 kernel?
> > 
> > Quick follow-up, and completely untested, but it would appear that the
> > problem lies in kauditd_hold_skb()/kauditd_print_skb();
> > kauditd_print_skb() registers a false lost record when the printk
> > ratelimit is tripped.  The fix is rather simple, and I'll include that
> > in an upcoming patchset.
> 
> ... and a quick question, if the kernel is booted without "audit=1" do
> we want to count lost records in the case where the backlog overflows?

If audit == 0, then we should not care because auditing may never be enabled. 
If for some reason audit == 2, then I suppose we should care.
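
To make that concrete, here is a rough userspace sketch of the policy, not
kernel code. Every name in it (the enum labels, struct boot_backlog,
backlog_hold(), simulate_boot()) is made up for illustration; only the values
0, 1 and 2 and the idea of a bounded backlog with a lost counter come from
this thread.

```c
#include <stdbool.h>

/* Illustrative labels for the audit == 0 / 1 / 2 values discussed above.
 * The names are assumptions for this sketch, not taken from the kernel. */
enum audit_state {
    AUDIT_STATE_OFF = 0,
    AUDIT_STATE_ON = 1,
    AUDIT_STATE_LOCKED = 2
};

/* Hypothetical stand-in for the boot-time backlog and its lost counter. */
struct boot_backlog {
    unsigned int limit;   /* e.g. 64 by default, or audit_backlog_limit= */
    unsigned int queued;  /* records currently held for a future daemon */
    unsigned int lost;    /* records counted as lost */
};

/* Try to hold one record. Returns true if it was queued.
 * On overflow the record is dropped, and it is only counted as lost
 * when auditing is not off (audit != 0). */
static bool backlog_hold(struct boot_backlog *b, enum audit_state state)
{
    if (b->queued < b->limit) {
        b->queued++;
        return true;
    }
    if (state != AUDIT_STATE_OFF)
        b->lost++;
    return false;
}

/* Feed 'events' records into an empty backlog and report the lost count. */
static unsigned int simulate_boot(unsigned int limit, unsigned int events,
                                  enum audit_state state)
{
    struct boot_backlog b = { limit, 0, 0 };
    unsigned int i;

    for (i = 0; i < events; i++)
        backlog_hold(&b, state);
    return b.lost;
}
```

With a limit of 64 and 100 events, simulate_boot() reports 36 lost records for
audit == 1 or audit == 2, and 0 for audit == 0, since nothing is counted when
auditing may never be enabled.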

-Steve