Linux-audit Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Steve Grubb @ 2016-12-10 20:40 UTC (permalink / raw)
  To: linux-audit; +Cc: Richard Guy Briggs
In-Reply-To: <CAHC9VhS-roP2arsepAFPzRdbfrADTp6i=mp0FAOGvT33GNCg7A@mail.gmail.com>

On Friday, December 9, 2016 6:46:43 PM EST Paul Moore wrote:
> > I would suggest that the return value (presuming it was reset when
> > non-zero) or the audit record generated reporting the lost value
> > reset would be sufficient confirmation that the feature exists on the
> > running kernel and the addition to the feature bitmap is not strictly
> > necessary, but you only find this out upon attempting that lost reset.
> > 
> > Well, we haven't used much of that bitmap space and if it isn't to be
> > used when needed, why is it there?  If there is a relatively simple
> > alternate non-destructive way to discover the presence of a feature use
> > of the bitmap isn't necessary.
> 
> My concern isn't the absolute consumption of the bitmap, but rather
> the rate of the consumption.

I'm not concerned much about it. There are very few more RFE's that are either 
in the pipeline or something I can think of that we need.

-Steve

^ permalink raw reply

* [PATCH v2] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-10 11:52 UTC (permalink / raw)
  To: linux-audit; +Cc: Richard Guy Briggs

Add a method to reset the audit_lost value.

An AUDIT_SET message with the AUDIT_STATUS_LOST flag set by itself
will return a positive value repesenting the current audit_lost value
and reset the counter to zero.  If AUDIT_STATUS_LOST is not the
only flag set, the reset command will be ignored.  The value sent with
the command is ignored.

An AUDIT_LOST_RESET message will be sent to the listening audit daemon.
The data field will contain a u32 with the positive value of the
audit_lost value when it was reset.

See: https://github.com/linux-audit/audit-kernel/issues/3

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
---
 include/uapi/linux/audit.h |    2 ++
 kernel/audit.c             |    8 +++++++-
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 208df7b..6d38bff 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -70,6 +70,7 @@
 #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
 #define AUDIT_SET_FEATURE	1018	/* Turn an audit feature on or off */
 #define AUDIT_GET_FEATURE	1019	/* Get which features are enabled */
+#define AUDIT_LOST_RESET	1020	/* Reset the audit_lost value */
 
 #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
 #define AUDIT_USER_AVC		1107	/* We filter this differently */
@@ -325,6 +326,7 @@ enum {
 #define AUDIT_STATUS_RATE_LIMIT		0x0008
 #define AUDIT_STATUS_BACKLOG_LIMIT	0x0010
 #define AUDIT_STATUS_BACKLOG_WAIT_TIME	0x0020
+#define AUDIT_STATUS_LOST		0x0040
 
 #define AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT	0x00000001
 #define AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME	0x00000002
diff --git a/kernel/audit.c b/kernel/audit.c
index f1ca116..19cfee0 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -122,7 +122,7 @@
    3) suppressed due to audit_rate_limit
    4) suppressed due to audit_backlog_limit
 */
-static atomic_t    audit_lost = ATOMIC_INIT(0);
+static atomic_t	audit_lost = ATOMIC_INIT(0);
 
 /* The netlink socket. */
 static struct sock *audit_sock;
@@ -920,6 +920,12 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			if (err < 0)
 				return err;
 		}
+		if (s.mask == AUDIT_STATUS_LOST) {
+			u32 lost = atomic_xchg(&audit_lost, 0);
+
+			audit_send_reply(skb, seq, AUDIT_LOST_RESET, 0, 0, &lost, sizeof(lost));
+			return lost;
+		}
 		break;
 	}
 	case AUDIT_GET_FEATURE:
-- 
1.7.1

^ permalink raw reply related

* Re: netlink: GPF in sock_sndtimeo
From: Cong Wang @ 2016-12-10  7:40 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller,
	Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev,
	LKML, syzkaller
In-Reply-To: <CAM_iQpV2GuKhR_1tD5jjACeD+pajJLws08CLmeYAo+rsjxvB0A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1664 bytes --]

On Fri, Dec 9, 2016 at 8:13 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> On 2016-12-08 22:57, Cong Wang wrote:
>>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
>>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
>>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
>>> > Eliminating the lock since the sock is dead anways eliminates the error.
>>> >
>>> > Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
>>> > get the test case to compile.
>>>
>>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
>>> are updated as a whole and race between audit_receive_msg() and
>>> NETLINK_URELEASE.
>>
>> This is what I expected and why I originally added the mutex lock in the
>> callback...  The dumps I got were bare with no wrapper identifying the
>> process context or specific error, so I'm at a bit of a loss how to
>> solve this (without thinking more about it) other than instinctively
>> removing the mutex.
>
> Netlink notifier can safely be converted to blocking one, I will send
> a patch.
>
> But I seriously doubt you really need NETLINK_URELEASE here,
> it adds nothing but overhead, b/c the netlink notifier is called on
> every netlink socket in the system, but for net exit path, that is
> relatively a slow path.
>
> Also, kauditd_send_skb() needs audit_cmd_mutex too.

Please let me know what you think about the attached patch?

Thanks!

[-- Attachment #2: audit_sock.diff --]
[-- Type: text/plain, Size: 1370 bytes --]

commit a12b43ee814625933ff155c20dc863c59cfcf240
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Fri Dec 9 17:56:42 2016 -0800

    audit: close a race condition on audit_sock
    
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

diff --git a/kernel/audit.c b/kernel/audit.c
index f1ca116..ab947d8 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -423,6 +423,8 @@ static void kauditd_send_skb(struct sk_buff *skb)
 				snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid);
 				audit_log_lost(s);
 				audit_pid = 0;
+				audit_nlk_portid = 0;
+				sock_put(audit_sock);
 				audit_sock = NULL;
 			} else {
 				pr_warn("re-scheduling(#%d) write to audit_pid=%d\n",
@@ -899,6 +901,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				audit_log_config_change("audit_pid", new_pid, audit_pid, 1);
 			audit_pid = new_pid;
 			audit_nlk_portid = NETLINK_CB(skb).portid;
+			sock_hold(skb->sk);
+			if (audit_sock)
+				sock_put(audit_sock);
 			audit_sock = skb->sk;
 		}
 		if (s.mask & AUDIT_STATUS_RATE_LIMIT) {
@@ -1167,10 +1172,6 @@ static void __net_exit audit_net_exit(struct net *net)
 {
 	struct audit_net *aunet = net_generic(net, audit_net_id);
 	struct sock *sock = aunet->nlsk;
-	if (sock == audit_sock) {
-		audit_pid = 0;
-		audit_sock = NULL;
-	}
 
 	RCU_INIT_POINTER(aunet->nlsk, NULL);
 	synchronize_net();

^ permalink raw reply related

* Re: netlink: GPF in sock_sndtimeo
From: Cong Wang @ 2016-12-10  4:13 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller,
	Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev,
	LKML, syzkaller
In-Reply-To: <20161209110155.GW22655@madcap2.tricolour.ca>

On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-08 22:57, Cong Wang wrote:
>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
>> > Eliminating the lock since the sock is dead anways eliminates the error.
>> >
>> > Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
>> > get the test case to compile.
>>
>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
>> are updated as a whole and race between audit_receive_msg() and
>> NETLINK_URELEASE.
>
> This is what I expected and why I originally added the mutex lock in the
> callback...  The dumps I got were bare with no wrapper identifying the
> process context or specific error, so I'm at a bit of a loss how to
> solve this (without thinking more about it) other than instinctively
> removing the mutex.

Netlink notifier can safely be converted to blocking one, I will send
a patch.

But I seriously doubt you really need NETLINK_URELEASE here,
it adds nothing but overhead, b/c the netlink notifier is called on
every netlink socket in the system, but for net exit path, that is
relatively a slow path.

Also, kauditd_send_skb() needs audit_cmd_mutex too.

I will send a formal patch.

Thanks.

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-09 23:46 UTC (permalink / raw)
  To: Richard Guy Briggs; +Cc: linux-audit
In-Reply-To: <20161209070014.GE22660@madcap2.tricolour.ca>

On Fri, Dec 9, 2016 at 2:00 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-08 09:05, Paul Moore wrote:
>> On Wed, Dec 7, 2016 at 10:53 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > On 2016-12-07 18:45, Paul Moore wrote:
>> >> On Wed, Dec 7, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
>> >> > On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
>> >> >> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> >> > On 2016-12-07 10:53, Steve Grubb wrote:
>> >> >> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
>> >> >> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com>
>> >> > wrote:
>> >> >> >> > > On 2016-12-06 19:17, Paul Moore wrote:
>> >> >> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
>> >> >> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
>> >> >> >> > >> unrelated to my original suggestion, let's focus on my original
>> >> >> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
>> >> >> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
>> >> >> >> > >
>> >> >> >> > > I understood that.  It sounds like a nice simple and straightforward
>> >> >> >> > > method to do it but for the question of accuracy.  Please rewind to
>> >> >> >> > > my
>> >> >> >> > > fundamental point: How do we get an accurate reading of the last
>> >> >> >> > > value
>> >> >> >> > > of audit_lost before resetting it?
>> >> >> >> >
>> >> >> >> > Okay, I thought you were worried about a different race, which is why
>> >> >> >> > this discussion wasn't making much sense to me.  I understand your
>> >> >> >> > point, but I really dislike the API; although that's not your fault,
>> >> >> >> > it's really the only way to do it via AUDIT_GET.
>> >> >> >> >
>> >> >> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
>> >> >> >> > worry about the small race window.  It would only be an issue if you
>> >> >> >> > reset the count under heavy audit load, and why would you reset the
>> >> >> >> > lost value if you were under a heavy audit load?  That just doesn't
>> >> >> >> > make sense.
>> >> >> >> >
>> >> >> >> > I suppose we should hear from Steve on this since he was the one who
>> >> >> >> > has been asking for this feature, although I'm pretty sure I know what
>> >> >> >> > he is going to say.
>> >> >> >>
>> >> >> >> To start with, this request comes from users of the audit system. I just
>> >> >> >> passed along the request. The issue is that when you do auditctl -s, you
>> >> >> >> get the number of records lost. If you do it the next day, you have to
>> >> >> >> do math to see what the one day delta is. So, to make reporting easy,
>> >> >> >> they want it to be reset whenever they do audictl -s.
>> >> >> >>
>> >> >> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
>> >> >> >> number atomically. Then I can add another commandline option to auditctl
>> >> >> >> that allows an admin to say also reset the counters. If that command
>> >> >> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
>> >> >> >> AUDIT_GET. Thought?>
>> >> >> > This would be slightly simpler in kernel implementation than the method
>> >> >> > I proposed and would work fine, off the top of my head.
>> >> >>
>> >> >> I'd prefer not to introduce another command message type for something
>> >> >> small like this.
>> >> >>
>> >> >> Steve, do you have any objection to the AUDIT_SET based approach?
>> >> >
>> >> > Either way, we'd need a feature flag so that I can tell if the kernel supports
>> >> > this or not.
>> >>
>> >> I think we are okay without a specific feature flag as sending a
>> >> AUDIT_SET/reset on an old kernel will be harmless; it won't do
>> >> anything, but it shouldn't return an error either.
>> >
>> > Ok, so userspace is still left wondering if it worked until the next
>> > time it reads that value, and even then it can't be certain if that
>> > value is the same or higher than it was when userspace thought it reset
>> > that value.  At this point, an old kernel will not return an error and
>> > simply ignore any new AUDIT_STATUS_* flag since each flag is treated
>> > independently and extra flags are ignored and not blocked.  This seems
>> > sloppy since we have two ways of fixing this uncertainty pretty easily.
>>
>> Comments like the above aren't helpful, they are just annoying.  The
>> drawbacks to the AUDIT_SET/reset approach have already been discussed
>> on this thread, if you want to do something constructive think about
>> how you can resolve these limitations within the context of others
>> comments/feedback.
>
> I'm sorry to offend.  That wasn't my intent.  I was trying to bring new
> information and observations into the discussion.  Is there anything
> factually wrong in my observations other than the opinion of it being
> sloppy?

Your email wasn't offensive, maybe a little snarky at times, but this
is an open mailing list and developers are nothing if not occasionally
snarky (at least I know I am); that wasn't my objection.

My objection to that first paragraph was that you weren't bringing
anything new to this thread (in my opinion), you were simply
regurgitating the same arguments without any thought of how to solve
these concerns within the context of our ongoing discussion.  I tried
to point you in a more constructive direction with my last email, but
perhaps I did a poor job, let me try again ... Opposing viewpoints,
especially when technical matters are concerned, is a good thing, but
ultimately a decision needs to be made if we are to solve our problems
and move forward; it is nice if that decision is the result of
unanimous vote, but it is foolish to expect that, or delay a fix,
waiting for consensus.  When you present a possible solution, and the
solution is rejected, in whole or in part, look at the other proposed
solutions: do they resolve all your concerns?  If yes, wonderful, work
to embrace that solution.  If not, try to resolve your concerns within
the context of this other proposal; don't continue to push something
which has already been rejected.  The ultimate goal should be to solve
the problem, not to merge a specific solution; the "best" solution for
the problem is always the one that gets merged.

Hopefully this helps, because I don't know how else to explain this,
and to be quite frank, I'm growing tired of talking about this.

>> I've already mentioned that I didn't like the AUDIT_GET/reset approach
>> because I thought the interface was bad.  As I'm sure you know, the
>> audit kernel/userspace interface is a bit of a hot-button topic with
>> me; I think it has a lot of problems and I'm very intent on not making
>> it worse (in my opinion, I will admit that API design is not entirely
>> objective).  Continuing to argue for a interface design that I've
>> already expressed a dislike for is not likely to win me over to your
>> side; regardless of the outcome you will end up frustrating both
>> yourself and the maintainer, neither are good things.
>
> I've re-read the thread from the beginning.  I guess I must have missed
> what was the fundamental problem with the AUDIT_GET/reset method other
> than taste.  What don't you like about the API that precludes
> using/abusing it this way?

A few things come to mind immediately: abusing GET this way
effectively makes it a "write" operation which can make access control
policy difficult, having to bracket the GET with a FEATURE operation
is ugly and potentially racy in its own way, complexity compared to
other solutions.

>> You feel very strongly that
>> the window is of grave concern, Steve and myself much less so.  If you
>> still feel strongly about this, think about some different ways in
>> which you can avoid losing a lost message counter bump.  Off the top
>> of my head, there are really only two ways for the kernel's audit
>> subsystem to send information back to userspace in this case, via a
>> netlink return/error message or an audit record.  We could possibly do
>> something with the netlink error message by returning the lost counter
>> as a positive integer (negative integer is a failure code, zero is
>> success), but that might get tricky in the future, although we could
>> mitigate that risk by forcing the AUDIT_SET/reset to happen by itself
>> (in other words, don't simply check to see if the bit is set in the
>> bitmask, e.g. (s.mask & AUDIT_STATUS_LOST), check to see it is equal,
>> e.g. (s.mask == AUDIT_STATUS_LOST)).
>
> This could work.  What risk do you see in doing it with other flags?
> That another set failure could usurp the return code?  If so, yes, I
> agree with requiring it to be a lone flag.

Possible conflicts with other SET operations failing as well as
potential future use.  We can always relax the restriction in the
future, we can't make it more restrictive.

>> There you go, two possible solutions for eliminating/mitigating the
>> potential race while sticking with the simpler AUDIT_SET/reset
>> interface.  I suppose you could even implement both of the solutions
>> above, they aren't mutually exclusive; that would depend on what
>> Steve/userspace would prefer.  Finally, as for the feature bitmap to
>> signal to userspace that we support this new feature: if you can't
>> live without it, go ahead and add it in.  As I said before, I'm a
>> little concerned at the rate we are consuming this bitmap, but I'll
>> admit we still have plenty of room before we have to start worrying
>> about alternatives.
>
> I would suggest that the return value (presuming it was reset when
> non-zero) or the audit record generated reporting the lost value
> reset would be sufficient confirmation that the feature exists on the
> running kernel and the addition to the feature bitmap is not strictly
> necessary, but you only find this out upon attempting that lost reset.
>
> Well, we haven't used much of that bitmap space and if it isn't to be
> used when needed, why is it there?  If there is a relatively simple
> alternate non-destructive way to discover the presence of a feature use
> of the bitmap isn't necessary.

My concern isn't the absolute consumption of the bitmap, but rather
the rate of the consumption.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Richard Guy Briggs @ 2016-12-09 12:12 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: netdev, LKML, linux-audit
In-Reply-To: <CACT4Y+ZsOXoQqVE4vhenb9fUJkwAbGL6wUZxGyaT2h7Cncbfog@mail.gmail.com>

On 2016-12-09 12:53, Dmitry Vyukov wrote:
> On Fri, Dec 9, 2016 at 12:48 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-09 11:49, Dmitry Vyukov wrote:
> >> On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> > On 2016-11-29 23:52, Richard Guy Briggs wrote:
> >> > I tried a quick compile attempt on the test case (I assume it is a
> >> > socket fuzzer) and get the following compile error:
> >> > cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c
> >> > socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined
> >> > <command-line>: warning: this is the location of the previous definition
> >> > socket_fuzz.c: In function ‘segv_handler’:
> >> > socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’
> >> > socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function)
> >> > socket_fuzz.c:89: error: (Each undeclared identifier is reported only once
> >> > socket_fuzz.c:89: error: for each function it appears in.)
> >> > socket_fuzz.c: In function ‘loop’:
> >> > socket_fuzz.c:280: warning: unused variable ‘errno0’
> >> > socket_fuzz.c: In function ‘test’:
> >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’
> >> > socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function)
> >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’
> >>
> >> -std=gnu99 should help
> >> ignore warnings
> >
> > I got a little further, left with "__ATOMIC_RELAXED undeclared", "__ATOMIC_SEQ_CST
> > undeclared" under gcc 4.4.7-16.
> >
> > gcc 4.8.2-15 leaves me with "undefined reference to `clock_gettime'"
> 
> add -lrt

Ok, that helped.  Thanks!

> > What compiler version do you recommend?
> 
> 6.x sounds reasonable
> 4.4 branch is 7.5 years old, surprised that it does not disintegrate
> into dust yet :)

  These are under RHEL6...  so there are updates to them, but yeah, they are old.

> >> >> - RGB
> >> >
> >> > - RGB
> >
> > - RGB
> >
> > --
> > Richard Guy Briggs <rgb@redhat.com>
> > Kernel Security Engineering, Base Operating Systems, Red Hat
> > Remote, Ottawa, Canada
> > Voice: +1.647.777.2635, Internal: (81) 32635

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Richard Guy Briggs @ 2016-12-09 11:01 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller,
	Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev,
	LKML, syzkaller
In-Reply-To: <CAM_iQpWRe0tDk=wvXONG7hamH3EtE0nJuAOF1kVKJgdpMvz2DA@mail.gmail.com>

On 2016-12-08 22:57, Cong Wang wrote:
> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
> > Eliminating the lock since the sock is dead anways eliminates the error.
> >
> > Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
> > get the test case to compile.
> 
> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
> are updated as a whole and race between audit_receive_msg() and
> NETLINK_URELEASE.

This is what I expected and why I originally added the mutex lock in the
callback...  The dumps I got were bare with no wrapper identifying the
process context or specific error, so I'm at a bit of a loss how to
solve this (without thinking more about it) other than instinctively
removing the mutex.

Another approach might be to look at consolidating the three into one
identifier or derive the other two from one, or serialize their access.

> > @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net)
> >  {
> >         struct audit_net *aunet = net_generic(net, audit_net_id);
> >         struct sock *sock = aunet->nlsk;
> > +
> > +       mutex_lock(&audit_cmd_mutex);
> >         if (sock == audit_sock) {
> >                 audit_pid = 0;
> > +               audit_nlk_portid = 0;
> >                 audit_sock = NULL;
> >         }
> > +       mutex_unlock(&audit_cmd_mutex);
> 
> If you decide to use NETLINK_URELEASE notifier, the above piece is no
> longer needed, the net_exit path simply releases a refcnt.

Good point.  It would have already killed it off.  So this piece is
arguably too late anyways.

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Dmitry Vyukov @ 2016-12-09 10:49 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Cong Wang, linux-audit, pmoore, David Miller, Johannes Berg,
	Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML,
	syzkaller
In-Reply-To: <20161209060248.GT22655@madcap2.tricolour.ca>

On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-11-29 23:52, Richard Guy Briggs wrote:
>> On 2016-11-29 15:13, Cong Wang wrote:
>> > On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > > On 2016-11-26 17:11, Cong Wang wrote:
>> > >> It is racy on audit_sock, especially on the netns exit path.
>> > >
>> > > I think that is the only place it is racy.  The other places audit_sock
>> > > is set is when the socket failure has just triggered a reset.
>> > >
>> > > Is there a notifier callback for failed or reaped sockets?
>> >
>> > Is NETLINK_URELEASE event what you are looking for?
>>
>> Possibly, yes.  Thanks, I'll have a look.
>
> I tried a quick compile attempt on the test case (I assume it is a
> socket fuzzer) and get the following compile error:
> cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c
> socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined
> <command-line>: warning: this is the location of the previous definition
> socket_fuzz.c: In function ‘segv_handler’:
> socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’
> socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function)
> socket_fuzz.c:89: error: (Each undeclared identifier is reported only once
> socket_fuzz.c:89: error: for each function it appears in.)
> socket_fuzz.c: In function ‘loop’:
> socket_fuzz.c:280: warning: unused variable ‘errno0’
> socket_fuzz.c: In function ‘test’:
> socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’
> socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function)
> socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’

-std=gnu99 should help
ignore warnings



> I also tried to extend Cong Wang's idea to attempt to proactively respond to a
> NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
> stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
> Eliminating the lock since the sock is dead anways eliminates the error.
>
> Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
> get the test case to compile.
>
> This is being tracked as https://github.com/linux-audit/audit-kernel/issues/30
>
> Subject: [PATCH] audit: proactively reset audit_sock on matching NETLINK_URELEASE
>
> diff --git a/kernel/audit.c b/kernel/audit.c
> index f1ca116..91d222d 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -423,6 +423,7 @@ static void kauditd_send_skb(struct sk_buff *skb)
>                                 snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid);
>                                 audit_log_lost(s);
>                                 audit_pid = 0;
> +                               audit_nlk_portid = 0;
>                                 audit_sock = NULL;
>                         } else {
>                                 pr_warn("re-scheduling(#%d) write to audit_pid=%d\n",
> @@ -1143,6 +1144,28 @@ static int audit_bind(struct net *net, int group)
>         return 0;
>  }
>
> +static int audit_sock_netlink_notify(struct notifier_block *nb,
> +                                    unsigned long event,
> +                                    void *_notify)
> +{
> +       struct netlink_notify *notify = _notify;
> +       struct audit_net *aunet = net_generic(notify->net, audit_net_id);
> +
> +       if (event == NETLINK_URELEASE && notify->protocol == NETLINK_AUDIT) {
> +               if (audit_nlk_portid == notify->portid &&
> +                   audit_sock == aunet->nlsk) {
> +                       audit_pid = 0;
> +                       audit_nlk_portid = 0;
> +                       audit_sock = NULL;
> +               }
> +       }
> +       return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block audit_netlink_notifier = {
> +       .notifier_call = audit_sock_netlink_notify,
> +};
> +
>  static int __net_init audit_net_init(struct net *net)
>  {
>         struct netlink_kernel_cfg cfg = {
> @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net)
>  {
>         struct audit_net *aunet = net_generic(net, audit_net_id);
>         struct sock *sock = aunet->nlsk;
> +
> +       mutex_lock(&audit_cmd_mutex);
>         if (sock == audit_sock) {
>                 audit_pid = 0;
> +               audit_nlk_portid = 0;
>                 audit_sock = NULL;
>         }
> +       mutex_unlock(&audit_cmd_mutex);
>
>         RCU_INIT_POINTER(aunet->nlsk, NULL);
>         synchronize_net();
> @@ -1202,6 +1229,7 @@ static int __init audit_init(void)
>         audit_enabled = audit_default;
>         audit_ever_enabled |= !!audit_default;
>
> +       netlink_register_notifier(&audit_netlink_notifier);
>         audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL, "initialized");
>
>         for (i = 0; i < AUDIT_INODE_BUCKETS; i++)
> --
> 1.7.1
>
>
>> - RGB
>
> - RGB
>
> --
> Richard Guy Briggs <rgb@redhat.com>
> Kernel Security Engineering, Base Operating Systems, Red Hat
> Remote, Ottawa, Canada
> Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-09  7:00 UTC (permalink / raw)
  To: Paul Moore; +Cc: linux-audit
In-Reply-To: <CAHC9VhQJFikpzQT6XBPBf1LdVK14auT4iEzcgOkR1eN850R=4Q@mail.gmail.com>

On 2016-12-08 09:05, Paul Moore wrote:
> On Wed, Dec 7, 2016 at 10:53 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-07 18:45, Paul Moore wrote:
> >> On Wed, Dec 7, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> >> > On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
> >> >> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > On 2016-12-07 10:53, Steve Grubb wrote:
> >> >> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
> >> >> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com>
> >> > wrote:
> >> >> >> > > On 2016-12-06 19:17, Paul Moore wrote:
> >> >> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
> >> >> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
> >> >> >> > >> unrelated to my original suggestion, let's focus on my original
> >> >> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> >> >> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> >> >> >> > >
> >> >> >> > > I understood that.  It sounds like a nice simple and straightforward
> >> >> >> > > method to do it but for the question of accuracy.  Please rewind to
> >> >> >> > > my
> >> >> >> > > fundamental point: How do we get an accurate reading of the last
> >> >> >> > > value
> >> >> >> > > of audit_lost before resetting it?
> >> >> >> >
> >> >> >> > Okay, I thought you were worried about a different race, which is why
> >> >> >> > this discussion wasn't making much sense to me.  I understand your
> >> >> >> > point, but I really dislike the API; although that's not your fault,
> >> >> >> > it's really the only way to do it via AUDIT_GET.
> >> >> >> >
> >> >> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> >> >> >> > worry about the small race window.  It would only be an issue if you
> >> >> >> > reset the count under heavy audit load, and why would you reset the
> >> >> >> > lost value if you were under a heavy audit load?  That just doesn't
> >> >> >> > make sense.
> >> >> >> >
> >> >> >> > I suppose we should hear from Steve on this since he was the one who
> >> >> >> > has been asking for this feature, although I'm pretty sure I know what
> >> >> >> > he is going to say.
> >> >> >>
> >> >> >> To start with, this request comes from users of the audit system. I just
> >> >> >> passed along the request. The issue is that when you do auditctl -s, you
> >> >> >> get the number of records lost. If you do it the next day, you have to
> >> >> >> do math to see what the one day delta is. So, to make reporting easy,
> >> >> >> they want it to be reset whenever they do audictl -s.
> >> >> >>
> >> >> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
> >> >> >> number atomically. Then I can add another commandline option to auditctl
> >> >> >> that allows an admin to say also reset the counters. If that command
> >> >> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
> >> >> >> AUDIT_GET. Thought?>
> >> >> > This would be slightly simpler in kernel implementation than the method
> >> >> > I proposed and would work fine, off the top of my head.
> >> >>
> >> >> I'd prefer not to introduce another command message type for something
> >> >> small like this.
> >> >>
> >> >> Steve, do you have any objection to the AUDIT_SET based approach?
> >> >
> >> > Either way, we'd need a feature flag so that I can tell if the kernel supports
> >> > this or not.
> >>
> >> I think we are okay without a specific feature flag as sending a
> >> AUDIT_SET/reset on an old kernel will be harmless; it won't do
> >> anything, but it shouldn't return an error either.
> >
> > Ok, so userspace is still left wondering if it worked until the next
> > time it reads that value, and even then it can't be certain if that
> > value is the same or higher than it was when userspace thought it reset
> > that value.  At this point, an old kernel will not return an error and
> > simply ignore any new AUDIT_STATUS_* flag since each flag is treated
> > independently and extra flags are ignored and not blocked.  This seems
> > sloppy since we have two ways of fixing this uncertainty pretty easily.
> 
> Comments like the above aren't helpful, they are just annoying.  The
> drawbacks to the AUDIT_SET/reset approach have already been discussed
> on this thread, if you want to do something constructive think about
> how you can resolve these limitations within the context of others
> comments/feedback.

I'm sorry to offend.  That wasn't my intent.  I was trying to bring new
information and observations into the discussion.  Is there anything
factually wrong in my observations other than the opinion of it being
sloppy?

> I've already mentioned that I didn't like the AUDIT_GET/reset approach
> because I thought the interface was bad.  As I'm sure you know, the
> audit kernel/userspace interface is a bit of a hot-button topic with
> me; I think it has a lot of problems and I'm very intent on not making
> it worse (in my opinion, I will admit that API design is not entirely
> objective).  Continuing to argue for a interface design that I've
> already expressed a dislike for is not likely to win me over to your
> side; regardless of the outcome you will end up frustrating both
> yourself and the maintainer, neither are good things.

I've re-read the thread from the beginning.  I guess I must have missed
what was the fundamental problem with the AUDIT_GET/reset method other
than taste.  What don't you like about the API that precludes
using/abusing it this way?  Is it the issue that a GET would
surprisingly change a value rather than just reading it?  I didn't like
it either, but it was the next obvious way to tackle the issue without
losing information.

> We all agree that there is a potential race window with respect to
> reading/resetting the lost counter, where we disagree is the
> likelihood of that happening in practice.

We've had other races that looked pretty unlikely in practice that were
deemed to be unacceptable.  Granted the risks were higher if they were
exploited, but they were mitigated to the best of our ability.

None of this should matter now since there are ideas below that should
work.

> You feel very strongly that
> the window is of grave concern, Steve and myself much less so.  If you
> still feel strongly about this, think about some different ways in
> which you can avoid losing a lost message counter bump.  Off the top
> of my head, there are really only two ways for the kernel's audit
> subsystem to send information back to userspace in this case, via a
> netlink return/error message or an audit record.  We could possibly do
> something with the netlink error message by returning the lost counter
> as a positive integer (negative integer is a failure code, zero is
> success), but that might get tricky in the future, although we could
> mitigate that risk by forcing the AUDIT_SET/reset to happen by itself
> (in other words, don't simply check to see if the bit is set in the
> bitmask, e.g. (s.mask & AUDIT_STATUS_LOST), check to see it is equal,
> e.g. (s.mask == AUDIT_STATUS_LOST)).

This could work.  What risk do you see in doing it with other flags?
That another set failure could usurp the return code?  If so, yes, I
agree with requiring it to be a lone flag.

> We could also mitigate the race
> via an audit record by emitting a record indicating that the lost
> counter was reset and record the lost counter (before the reset) in
> that record; honestly, now that I'm writing this, it seems like
> something we should be doing regardless, as tampering with the lost
> counter seems like a security relevant event.

Agreed.  This should be added regardless of reset method.

> There you go, two possible solutions for eliminating/mitigating the
> potential race while sticking with the simpler AUDIT_SET/reset
> interface.  I suppose you could even implement both of the solutions
> above, they aren't mutually exclusive; that would depend on what
> Steve/userspace would prefer.  Finally, as for the feature bitmap to
> signal to userspace that we support this new feature: if you can't
> live without it, go ahead and add it in.  As I said before, I'm a
> little concerned at the rate we are consuming this bitmap, but I'll
> admit we still have plenty of room before we have to start worrying
> about alternatives.

I would suggest that the return value (presuming it was reset when
non-zero) or the audit record generated reporting the lost value
reset would be sufficient confirmation that the feature exists on the
running kernel and the addition to the feature bitmap is not strictly
necessary, but you only find this out upon attempting that lost reset.

Well, we haven't used much of that bitmap space and if it isn't to be
used when needed, why is it there?  If there is a relatively simple
alternate non-destructive way to discover the presence of a feature use
of the bitmap isn't necessary.

> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Cong Wang @ 2016-12-09  6:57 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller,
	Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev,
	LKML, syzkaller
In-Reply-To: <20161209060248.GT22655@madcap2.tricolour.ca>

On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> I also tried to extend Cong Wang's idea to attempt to proactively respond to a
> NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
> stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
> Eliminating the lock since the sock is dead anways eliminates the error.
>
> Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
> get the test case to compile.

It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
are updated as a whole and race between audit_receive_msg() and
NETLINK_URELEASE.


> @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net)
>  {
>         struct audit_net *aunet = net_generic(net, audit_net_id);
>         struct sock *sock = aunet->nlsk;
> +
> +       mutex_lock(&audit_cmd_mutex);
>         if (sock == audit_sock) {
>                 audit_pid = 0;
> +               audit_nlk_portid = 0;
>                 audit_sock = NULL;
>         }
> +       mutex_unlock(&audit_cmd_mutex);
>

If you decide to use NETLINK_URELEASE notifier, the above piece is no
longer needed, the net_exit path simply releases a refcnt.

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Richard Guy Briggs @ 2016-12-09  6:02 UTC (permalink / raw)
  To: Cong Wang, linux-audit, pmoore
  Cc: Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal,
	Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller
In-Reply-To: <20161130045207.GE26673@madcap2.tricolour.ca>

On 2016-11-29 23:52, Richard Guy Briggs wrote:
> On 2016-11-29 15:13, Cong Wang wrote:
> > On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > > On 2016-11-26 17:11, Cong Wang wrote:
> > >> It is racy on audit_sock, especially on the netns exit path.
> > >
> > > I think that is the only place it is racy.  The other places audit_sock
> > > is set is when the socket failure has just triggered a reset.
> > >
> > > Is there a notifier callback for failed or reaped sockets?
> > 
> > Is NETLINK_URELEASE event what you are looking for?
> 
> Possibly, yes.  Thanks, I'll have a look.

I tried a quick compile attempt on the test case (I assume it is a
socket fuzzer) and get the following compile error:
cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c
socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined
<command-line>: warning: this is the location of the previous definition
socket_fuzz.c: In function ‘segv_handler’:
socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’
socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function)
socket_fuzz.c:89: error: (Each undeclared identifier is reported only once
socket_fuzz.c:89: error: for each function it appears in.)
socket_fuzz.c: In function ‘loop’:
socket_fuzz.c:280: warning: unused variable ‘errno0’
socket_fuzz.c: In function ‘test’:
socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’
socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function)
socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’

I also tried to extend Cong Wang's idea to attempt to proactively respond to a
NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
Eliminating the lock since the sock is dead anways eliminates the error.

Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
get the test case to compile.

This is being tracked as https://github.com/linux-audit/audit-kernel/issues/30

Subject: [PATCH] audit: proactively reset audit_sock on matching NETLINK_URELEASE

diff --git a/kernel/audit.c b/kernel/audit.c
index f1ca116..91d222d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -423,6 +423,7 @@ static void kauditd_send_skb(struct sk_buff *skb)
 				snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid);
 				audit_log_lost(s);
 				audit_pid = 0;
+				audit_nlk_portid = 0;
 				audit_sock = NULL;
 			} else {
 				pr_warn("re-scheduling(#%d) write to audit_pid=%d\n",
@@ -1143,6 +1144,28 @@ static int audit_bind(struct net *net, int group)
 	return 0;
 }
 
+static int audit_sock_netlink_notify(struct notifier_block *nb,
+				     unsigned long event,
+				     void *_notify)
+{
+	struct netlink_notify *notify = _notify;
+	struct audit_net *aunet = net_generic(notify->net, audit_net_id);
+
+	if (event == NETLINK_URELEASE && notify->protocol == NETLINK_AUDIT) {
+		if (audit_nlk_portid == notify->portid &&
+		    audit_sock == aunet->nlsk) {
+			audit_pid = 0;
+			audit_nlk_portid = 0;
+			audit_sock = NULL;
+		}
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block audit_netlink_notifier = {
+	.notifier_call = audit_sock_netlink_notify,
+};
+
 static int __net_init audit_net_init(struct net *net)
 {
 	struct netlink_kernel_cfg cfg = {
@@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net)
 {
 	struct audit_net *aunet = net_generic(net, audit_net_id);
 	struct sock *sock = aunet->nlsk;
+
+	mutex_lock(&audit_cmd_mutex);
 	if (sock == audit_sock) {
 		audit_pid = 0;
+		audit_nlk_portid = 0;
 		audit_sock = NULL;
 	}
+	mutex_unlock(&audit_cmd_mutex);
 
 	RCU_INIT_POINTER(aunet->nlsk, NULL);
 	synchronize_net();
@@ -1202,6 +1229,7 @@ static int __init audit_init(void)
 	audit_enabled = audit_default;
 	audit_ever_enabled |= !!audit_default;
 
+	netlink_register_notifier(&audit_netlink_notifier);
 	audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL, "initialized");
 
 	for (i = 0; i < AUDIT_INODE_BUCKETS; i++)
-- 
1.7.1


> - RGB

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply related

* Re: [PATCH 1/1] audit: Make AUDIT_ANOM_ABEND event normalized
From: Paul Moore @ 2016-12-09  1:10 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit
In-Reply-To: <1547916.CxkrPjAzLT@x2>

On Thu, Dec 8, 2016 at 1:55 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> The audit event specification asks for certain fields to exist in
> all events. Running 'ausearch -m anom_abend -sv yes' returns no
> events. This patch adds the result field so that the
> AUDIT_ANOM_ABEND event conforms to the rules.
>
> Signed-off-by: Steve Grubb <sgrubb@redhat.com>
> ---
>  kernel/auditsc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Looks good to me, added to the queue of patches to merge after the
upcoming merge window.

> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 5abf1dc..4317f5b 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -2403,7 +2403,7 @@ void audit_core_dumps(long signr)
>         if (unlikely(!ab))
>                 return;
>         audit_log_task(ab);
> -       audit_log_format(ab, " sig=%ld", signr);
> +       audit_log_format(ab, " sig=%ld res=1", signr);
>         audit_log_end(ab);
>  }
>
> --
> 2.7.4
>
>
> --
> Linux-audit mailing list
> Linux-audit@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-audit

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* [PATCH 1/1] audit: Make AUDIT_ANOM_ABEND event normalized
From: Steve Grubb @ 2016-12-08 18:55 UTC (permalink / raw)
  To: linux-audit

The audit event specification asks for certain fields to exist in
all events. Running 'ausearch -m anom_abend -sv yes' returns no
events. This patch adds the result field so that the
AUDIT_ANOM_ABEND event conforms to the rules.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
---
 kernel/auditsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 5abf1dc..4317f5b 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2403,7 +2403,7 @@ void audit_core_dumps(long signr)
 	if (unlikely(!ab))
 		return;
 	audit_log_task(ab);
-	audit_log_format(ab, " sig=%ld", signr);
+	audit_log_format(ab, " sig=%ld res=1", signr);
 	audit_log_end(ab);
 }
 
-- 
2.7.4

^ permalink raw reply related

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-08 14:05 UTC (permalink / raw)
  To: Richard Guy Briggs; +Cc: linux-audit
In-Reply-To: <20161208035311.GP22655@madcap2.tricolour.ca>

On Wed, Dec 7, 2016 at 10:53 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-07 18:45, Paul Moore wrote:
>> On Wed, Dec 7, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
>> > On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
>> >> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> > On 2016-12-07 10:53, Steve Grubb wrote:
>> >> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
>> >> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com>
>> > wrote:
>> >> >> > > On 2016-12-06 19:17, Paul Moore wrote:
>> >> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
>> >> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
>> >> >> > >> unrelated to my original suggestion, let's focus on my original
>> >> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
>> >> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
>> >> >> > >
>> >> >> > > I understood that.  It sounds like a nice simple and straightforward
>> >> >> > > method to do it but for the question of accuracy.  Please rewind to
>> >> >> > > my
>> >> >> > > fundamental point: How do we get an accurate reading of the last
>> >> >> > > value
>> >> >> > > of audit_lost before resetting it?
>> >> >> >
>> >> >> > Okay, I thought you were worried about a different race, which is why
>> >> >> > this discussion wasn't making much sense to me.  I understand your
>> >> >> > point, but I really dislike the API; although that's not your fault,
>> >> >> > it's really the only way to do it via AUDIT_GET.
>> >> >> >
>> >> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
>> >> >> > worry about the small race window.  It would only be an issue if you
>> >> >> > reset the count under heavy audit load, and why would you reset the
>> >> >> > lost value if you were under a heavy audit load?  That just doesn't
>> >> >> > make sense.
>> >> >> >
>> >> >> > I suppose we should hear from Steve on this since he was the one who
>> >> >> > has been asking for this feature, although I'm pretty sure I know what
>> >> >> > he is going to say.
>> >> >>
>> >> >> To start with, this request comes from users of the audit system. I just
>> >> >> passed along the request. The issue is that when you do auditctl -s, you
>> >> >> get the number of records lost. If you do it the next day, you have to
>> >> >> do math to see what the one day delta is. So, to make reporting easy,
>> >> >> they want it to be reset whenever they do audictl -s.
>> >> >>
>> >> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
>> >> >> number atomically. Then I can add another commandline option to auditctl
>> >> >> that allows an admin to say also reset the counters. If that command
>> >> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
>> >> >> AUDIT_GET. Thought?>
>> >> > This would be slightly simpler in kernel implementation than the method
>> >> > I proposed and would work fine, off the top of my head.
>> >>
>> >> I'd prefer not to introduce another command message type for something
>> >> small like this.
>> >>
>> >> Steve, do you have any objection to the AUDIT_SET based approach?
>> >
>> > Either way, we'd need a feature flag so that I can tell if the kernel supports
>> > this or not.
>>
>> I think we are okay without a specific feature flag as sending a
>> AUDIT_SET/reset on an old kernel will be harmless; it won't do
>> anything, but it shouldn't return an error either.
>
> Ok, so userspace is still left wondering if it worked until the next
> time it reads that value, and even then it can't be certain if that
> value is the same or higher than it was when userspace thought it reset
> that value.  At this point, an old kernel will not return an error and
> simply ignore any new AUDIT_STATUS_* flag since each flag is treated
> independently and extra flags are ignored and not blocked.  This seems
> sloppy since we have two ways of fixing this uncertainty pretty easily.

Comments like the above aren't helpful, they are just annoying.  The
drawbacks to the AUDIT_SET/reset approach have already been discussed
on this thread, if you want to do something constructive think about
how you can resolve these limitations within the context of others
comments/feedback.

I've already mentioned that I didn't like the AUDIT_GET/reset approach
because I thought the interface was bad.  As I'm sure you know, the
audit kernel/userspace interface is a bit of a hot-button topic with
me; I think it has a lot of problems and I'm very intent on not making
it worse (in my opinion, I will admit that API design is not entirely
objective).  Continuing to argue for a interface design that I've
already expressed a dislike for is not likely to win me over to your
side; regardless of the outcome you will end up frustrating both
yourself and the maintainer, neither are good things.

We all agree that there is a potential race window with respect to
reading/resetting the lost counter, where we disagree is the
likelihood of that happening in practice.  You feel very strongly that
the window is of grave concern, Steve and myself much less so.  If you
still feel strongly about this, think about some different ways in
which you can avoid losing a lost message counter bump.  Off the top
of my head, there are really only two ways for the kernel's audit
subsystem to send information back to userspace in this case, via a
netlink return/error message or an audit record.  We could possibly do
something with the netlink error message by returning the lost counter
as a positive integer (negative integer is a failure code, zero is
success), but that might get tricky in the future, although we could
mitigate that risk by forcing the AUDIT_SET/reset to happen by itself
(in other words, don't simply check to see if the bit is set in the
bitmask, e.g. (s.mask & AUDIT_STATUS_LOST), check to see it is equal,
e.g. (s.mask == AUDIT_STATUS_LOST)).  We could also mitigate the race
via an audit record by emitting a record indicating that the lost
counter was reset and record the lost counter (before the reset) in
that record; honestly, now that I'm writing this, it seems like
something we should be doing regardless, as tampering with the lost
counter seems like a security relevant event.

There you go, two possible solutions for eliminating/mitigating the
potential race while sticking with the simpler AUDIT_SET/reset
interface.  I suppose you could even implement both of the solutions
above, they aren't mutually exclusive; that would depend on what
Steve/userspace would prefer.  Finally, as for the feature bitmap to
signal to userspace that we support this new feature: if you can't
live without it, go ahead and add it in.  As I said before, I'm a
little concerned at the rate we are consuming this bitmap, but I'll
admit we still have plenty of room before we have to start worrying
about alternatives.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-08  3:53 UTC (permalink / raw)
  To: Paul Moore; +Cc: linux-audit
In-Reply-To: <CAHC9VhQoU1DW_PDGym06ravAV8-NMxchwutwcokJzgt6bd=hWA@mail.gmail.com>

On 2016-12-07 18:45, Paul Moore wrote:
> On Wed, Dec 7, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> > On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
> >> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> > On 2016-12-07 10:53, Steve Grubb wrote:
> >> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
> >> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com>
> > wrote:
> >> >> > > On 2016-12-06 19:17, Paul Moore wrote:
> >> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
> >> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
> >> >> > >> unrelated to my original suggestion, let's focus on my original
> >> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> >> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> >> >> > >
> >> >> > > I understood that.  It sounds like a nice simple and straightforward
> >> >> > > method to do it but for the question of accuracy.  Please rewind to
> >> >> > > my
> >> >> > > fundamental point: How do we get an accurate reading of the last
> >> >> > > value
> >> >> > > of audit_lost before resetting it?
> >> >> >
> >> >> > Okay, I thought you were worried about a different race, which is why
> >> >> > this discussion wasn't making much sense to me.  I understand your
> >> >> > point, but I really dislike the API; although that's not your fault,
> >> >> > it's really the only way to do it via AUDIT_GET.
> >> >> >
> >> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> >> >> > worry about the small race window.  It would only be an issue if you
> >> >> > reset the count under heavy audit load, and why would you reset the
> >> >> > lost value if you were under a heavy audit load?  That just doesn't
> >> >> > make sense.
> >> >> >
> >> >> > I suppose we should hear from Steve on this since he was the one who
> >> >> > has been asking for this feature, although I'm pretty sure I know what
> >> >> > he is going to say.
> >> >>
> >> >> To start with, this request comes from users of the audit system. I just
> >> >> passed along the request. The issue is that when you do auditctl -s, you
> >> >> get the number of records lost. If you do it the next day, you have to
> >> >> do math to see what the one day delta is. So, to make reporting easy,
> >> >> they want it to be reset whenever they do audictl -s.
> >> >>
> >> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
> >> >> number atomically. Then I can add another commandline option to auditctl
> >> >> that allows an admin to say also reset the counters. If that command
> >> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
> >> >> AUDIT_GET. Thought?>
> >> > This would be slightly simpler in kernel implementation than the method
> >> > I proposed and would work fine, off the top of my head.
> >>
> >> I'd prefer not to introduce another command message type for something
> >> small like this.
> >>
> >> Steve, do you have any objection to the AUDIT_SET based approach?
> >
> > Either way, we'd need a feature flag so that I can tell if the kernel supports
> > this or not.
> 
> I think we are okay without a specific feature flag as sending a
> AUDIT_SET/reset on an old kernel will be harmless; it won't do
> anything, but it shouldn't return an error either.

Ok, so userspace is still left wondering if it worked until the next
time it reads that value, and even then it can't be certain if that
value is the same or higher than it was when userspace thought it reset
that value.  At this point, an old kernel will not return an error and
simply ignore any new AUDIT_STATUS_* flag since each flag is treated
independently and extra flags are ignored and not blocked.  This seems
sloppy since we have two ways of fixing this uncertainty pretty easily.

> > Also, it should accept "0" as the only valid value.
> 
> Of course.

No problem.

> > We can do this and I can make auditctl do the two back to back before displaying the
> > results to minimize the window of risk.

Having it depend on userspace implementation to minimize the risk of a 
race strikes me as unsound design.

> >> Based on what you've said above, it would seem like the potential race
> >> condition with AUDIT_SET wouldn't be a significant issue.
> >
> > All a matter of perspective. What I think is a reasonable risk someone else
> > may disagree. Does anyone else on the list object? If not I'd say go with it.
> 
> Judgement calls are always a matter of perspective, since you are the
> only one who has asked for this (even if it is by proxy) I was asking
> for your perspective.  Anyway, it looks like we're probably okay here;
> if we don't hear anything in the next day or two lets go ahead with
> the AUDIT_SET approach.  If it proves to be a problem we can always
> introduce one of the other approaches later.

If it isn't dependable and accurate, what's the point?  Why not do it
right the first time?

> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-07 23:45 UTC (permalink / raw)
  To: Steve Grubb; +Cc: Richard Guy Briggs, linux-audit
In-Reply-To: <1664851.YNMxxsjdG9@x2>

On Wed, Dec 7, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
>> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > On 2016-12-07 10:53, Steve Grubb wrote:
>> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
>> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com>
> wrote:
>> >> > > On 2016-12-06 19:17, Paul Moore wrote:
>> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
>> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
>> >> > >> unrelated to my original suggestion, let's focus on my original
>> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
>> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
>> >> > >
>> >> > > I understood that.  It sounds like a nice simple and straightforward
>> >> > > method to do it but for the question of accuracy.  Please rewind to
>> >> > > my
>> >> > > fundamental point: How do we get an accurate reading of the last
>> >> > > value
>> >> > > of audit_lost before resetting it?
>> >> >
>> >> > Okay, I thought you were worried about a different race, which is why
>> >> > this discussion wasn't making much sense to me.  I understand your
>> >> > point, but I really dislike the API; although that's not your fault,
>> >> > it's really the only way to do it via AUDIT_GET.
>> >> >
>> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
>> >> > worry about the small race window.  It would only be an issue if you
>> >> > reset the count under heavy audit load, and why would you reset the
>> >> > lost value if you were under a heavy audit load?  That just doesn't
>> >> > make sense.
>> >> >
>> >> > I suppose we should hear from Steve on this since he was the one who
>> >> > has been asking for this feature, although I'm pretty sure I know what
>> >> > he is going to say.
>> >>
>> >> To start with, this request comes from users of the audit system. I just
>> >> passed along the request. The issue is that when you do auditctl -s, you
>> >> get the number of records lost. If you do it the next day, you have to
>> >> do math to see what the one day delta is. So, to make reporting easy,
>> >> they want it to be reset whenever they do audictl -s.
>> >>
>> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
>> >> number atomically. Then I can add another commandline option to auditctl
>> >> that allows an admin to say also reset the counters. If that command
>> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
>> >> AUDIT_GET. Thought?>
>> > This would be slightly simpler in kernel implementation than the method
>> > I proposed and would work fine, off the top of my head.
>>
>> I'd prefer not to introduce another command message type for something
>> small like this.
>>
>> Steve, do you have any objection to the AUDIT_SET based approach?
>
> Either way, we'd need a feature flag so that I can tell if the kernel supports
> this or not.

I think we are okay without a specific feature flag as sending a
AUDIT_SET/reset on an old kernel will be harmless; it won't do
anything, but it shouldn't return an error either.

> Also, it should accept "0" as the only valid value.

Of course.

> We can do this and I can make auditctl do the two back to back before displaying the
> results to minimize the window of risk.
>
>> Based on what you've said above, it would seem like the potential race
>> condition with AUDIT_SET wouldn't be a significant issue.
>
> All a matter of perspective. What I think is a reasonable risk someone else
> may disagree. Does anyone else on the list object? If not I'd say go with it.

Judgement calls are always a matter of perspective, since you are the
only one who has asked for this (even if it is by proxy) I was asking
for your perspective.  Anyway, it looks like we're probably okay here;
if we don't hear anything in the next day or two lets go ahead with
the AUDIT_SET approach.  If it proves to be a problem we can always
introduce one of the other approaches later.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Steve Grubb @ 2016-12-07 23:30 UTC (permalink / raw)
  To: Paul Moore; +Cc: Richard Guy Briggs, linux-audit
In-Reply-To: <CAHC9VhTRvMwH2441eEfLxbxAVft+02LV=23_PWXHAOnPBEM70Q@mail.gmail.com>

On Wednesday, December 7, 2016 6:10:49 PM EST Paul Moore wrote:
> On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-07 10:53, Steve Grubb wrote:
> >> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
> >> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> 
wrote:
> >> > > On 2016-12-06 19:17, Paul Moore wrote:
> >> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
> >> > >> Okay, back up ... this whole mess about atomic_xchg() was always
> >> > >> unrelated to my original suggestion, let's focus on my original
> >> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> >> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> >> > > 
> >> > > I understood that.  It sounds like a nice simple and straightforward
> >> > > method to do it but for the question of accuracy.  Please rewind to
> >> > > my
> >> > > fundamental point: How do we get an accurate reading of the last
> >> > > value
> >> > > of audit_lost before resetting it?
> >> > 
> >> > Okay, I thought you were worried about a different race, which is why
> >> > this discussion wasn't making much sense to me.  I understand your
> >> > point, but I really dislike the API; although that's not your fault,
> >> > it's really the only way to do it via AUDIT_GET.
> >> > 
> >> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> >> > worry about the small race window.  It would only be an issue if you
> >> > reset the count under heavy audit load, and why would you reset the
> >> > lost value if you were under a heavy audit load?  That just doesn't
> >> > make sense.
> >> > 
> >> > I suppose we should hear from Steve on this since he was the one who
> >> > has been asking for this feature, although I'm pretty sure I know what
> >> > he is going to say.
> >> 
> >> To start with, this request comes from users of the audit system. I just
> >> passed along the request. The issue is that when you do auditctl -s, you
> >> get the number of records lost. If you do it the next day, you have to
> >> do math to see what the one day delta is. So, to make reporting easy,
> >> they want it to be reset whenever they do audictl -s.
> >> 
> >> You could also make a AUDIT_GET_RESET that gets the status and resets the
> >> number atomically. Then I can add another commandline option to auditctl
> >> that allows an admin to say also reset the counters. If that command
> >> line option is passed, I call AUDIT_GET_RESET otherwise I call
> >> AUDIT_GET. Thought?> 
> > This would be slightly simpler in kernel implementation than the method
> > I proposed and would work fine, off the top of my head.
> 
> I'd prefer not to introduce another command message type for something
> small like this.
> 
> Steve, do you have any objection to the AUDIT_SET based approach?

Either way, we'd need a feature flag so that I can tell if the kernel supports 
this or not. Also, it should accept "0" as the only valid value. We can do 
this and I can make auditctl do the two back to back before displaying the 
results to minimize the window of risk.

> Based on what you've said above, it would seem like the potential race
> condition with AUDIT_SET wouldn't be a significant issue.

All a matter of perspective. What I think is a reasonable risk someone else 
may disagree. Does anyone else on the list object? If not I'd say go with it.

-Steve

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-07 23:10 UTC (permalink / raw)
  To: Richard Guy Briggs, Steve Grubb; +Cc: linux-audit
In-Reply-To: <20161207155849.GM22655@madcap2.tricolour.ca>

On Wed, Dec 7, 2016 at 10:58 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-07 10:53, Steve Grubb wrote:
>> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
>> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > > On 2016-12-06 19:17, Paul Moore wrote:
>> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com>
>> > >> Okay, back up ... this whole mess about atomic_xchg() was always
>> > >> unrelated to my original suggestion, let's focus on my original
>> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
>> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
>> > >
>> > > I understood that.  It sounds like a nice simple and straightforward
>> > > method to do it but for the question of accuracy.  Please rewind to my
>> > > fundamental point: How do we get an accurate reading of the last value
>> > > of audit_lost before resetting it?
>> >
>> > Okay, I thought you were worried about a different race, which is why
>> > this discussion wasn't making much sense to me.  I understand your
>> > point, but I really dislike the API; although that's not your fault,
>> > it's really the only way to do it via AUDIT_GET.
>> >
>> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
>> > worry about the small race window.  It would only be an issue if you
>> > reset the count under heavy audit load, and why would you reset the
>> > lost value if you were under a heavy audit load?  That just doesn't
>> > make sense.
>> >
>> > I suppose we should hear from Steve on this since he was the one who
>> > has been asking for this feature, although I'm pretty sure I know what
>> > he is going to say.
>>
>> To start with, this request comes from users of the audit system. I just
>> passed along the request. The issue is that when you do auditctl -s, you get
>> the number of records lost. If you do it the next day, you have to do math to
>> see what the one day delta is. So, to make reporting easy, they want it to be
>> reset whenever they do audictl -s.
>>
>> You could also make a AUDIT_GET_RESET that gets the status and resets the
>> number atomically. Then I can add another commandline option to auditctl that
>> allows an admin to say also reset the counters. If that command line option is
>> passed, I call AUDIT_GET_RESET otherwise I call AUDIT_GET. Thought?
>
> This would be slightly simpler in kernel implementation than the method
> I proposed and would work fine, off the top of my head.

I'd prefer not to introduce another command message type for something
small like this.

Steve, do you have any objection to the AUDIT_SET based approach?
Based on what you've said above, it would seem like the potential race
condition with AUDIT_SET wouldn't be a significant issue.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-07 15:58 UTC (permalink / raw)
  To: Steve Grubb; +Cc: linux-audit
In-Reply-To: <25564022.n3eyru8ApV@x2>

On 2016-12-07 10:53, Steve Grubb wrote:
> On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
> > On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > > On 2016-12-06 19:17, Paul Moore wrote:
> > >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> 
> > >> Okay, back up ... this whole mess about atomic_xchg() was always
> > >> unrelated to my original suggestion, let's focus on my original
> > >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> > >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> > > 
> > > I understood that.  It sounds like a nice simple and straightforward
> > > method to do it but for the question of accuracy.  Please rewind to my
> > > fundamental point: How do we get an accurate reading of the last value
> > > of audit_lost before resetting it?
> > 
> > Okay, I thought you were worried about a different race, which is why
> > this discussion wasn't making much sense to me.  I understand your
> > point, but I really dislike the API; although that's not your fault,
> > it's really the only way to do it via AUDIT_GET.
> > 
> > I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> > worry about the small race window.  It would only be an issue if you
> > reset the count under heavy audit load, and why would you reset the
> > lost value if you were under a heavy audit load?  That just doesn't
> > make sense.
> > 
> > I suppose we should hear from Steve on this since he was the one who
> > has been asking for this feature, although I'm pretty sure I know what
> > he is going to say.
> 
> To start with, this request comes from users of the audit system. I just 
> passed along the request. The issue is that when you do auditctl -s, you get 
> the number of records lost. If you do it the next day, you have to do math to 
> see what the one day delta is. So, to make reporting easy, they want it to be 
> reset whenever they do audictl -s.
> 
> You could also make a AUDIT_GET_RESET that gets the status and resets the 
> number atomically. Then I can add another commandline option to auditctl that 
> allows an admin to say also reset the counters. If that command line option is 
> passed, I call AUDIT_GET_RESET otherwise I call AUDIT_GET. Thought?

This would be slightly simpler in kernel implementation than the method
I proposed and would work fine, off the top of my head.

> -Steve

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-07 15:55 UTC (permalink / raw)
  To: Paul Moore; +Cc: linux-audit
In-Reply-To: <CAHC9VhQCaJwoMw+dvHF5S9x9Jy1jXAMx_kGGXPHthfQXDUW06g@mail.gmail.com>

On 2016-12-07 10:05, Paul Moore wrote:
> On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-06 19:17, Paul Moore wrote:
> >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> > On 2016-12-05 12:48, Paul Moore wrote:
> >> >> On Mon, Dec 5, 2016 at 11:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > On 2016-12-05 11:02, Paul Moore wrote:
> >> >> >> On Mon, Dec 5, 2016 at 3:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> >> > Add a method to reset the audit_lost value.
> >> >> >> >
> >> >> >> > An AUDIT_GET message will get the current audit_lost value and reset the
> >> >> >> > counter to zero iff (if and only if) the AUDIT_FEATURE_LOST_RESET
> >> >> >> > feature is set.
> >> >> >> >
> >> >> >> > If the flag AUDIT_FEATURE_BITMAP_LOST_RESET is present in the audit
> >> >> >> > feature bitmap, the feature is settable by setting the
> >> >> >> > AUDIT_FEATURE_LOST_RESET flag in the audit feature list with an
> >> >> >> > AUDIT_SET_FEATURE call.  This setting is lockable.
> >> >> >> >
> >> >> >> > See: https://github.com/linux-audit/audit-kernel/issues/3
> >> >> >> >
> >> >> >> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> >> >> >> > ---
> >> >> >> > Note: The AUDIT_FEATURE_BITMAP_LOST_RESET check may not be necessary if
> >> >> >> > it is possible to read all the entries from audit_feature_names from
> >> >> >> > userspace.
> >> >> >> > ---
> >> >> >> >  include/uapi/linux/audit.h |    7 +++++--
> >> >> >> >  kernel/audit.c             |    9 ++++++---
> >> >> >> >  2 files changed, 11 insertions(+), 5 deletions(-)
> >> >> >>
> >> >> >> Instead of resetting the lost counter on an AUDIT_GET if the reset
> >> >> >> feature is set, how about preserving the AUDIT_GET behavior, skipping
> >> >> >> the AUDIT_FEATURE_* addition, and simply reset the lost value by
> >> >> >> sending a AUDIT_SET message with AUDIT_STATUS_LOST (you obviously have
> >> >> >> to add this to the uapi header).
> >> >> >
> >> >> > I realized as I was coding it up that we would potentially lose an
> >> >> > accurate count if the read and reset were not atomic.  This was the
> >> >> > reason for using atomic_xchg().
> >> >>
> >> >> Well, okay, but that isn't what I was talking about ... ?  The
> >> >> audit_cmd_mutex is held for both AUDIT_GET and AUDIT_SET so we should
> >> >> never process these requests at the same time.
> >> >
> >> > I guess I still don't follow what you are talking about...  I hadn't
> >> > thought of including both an AUDIT_GET and an AUDIT_SET in the same skb,
> >> > but that would ensure that no other command reads or resets the lost
> >> > value.  However, the lost value could still be incrementing on another
> >> > CPU between these two commands, losing an accurate lost count.
> >>
> >> Okay, back up ... this whole mess about atomic_xchg() was always
> >> unrelated to my original suggestion, let's focus on my original
> >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> >
> > I understood that.  It sounds like a nice simple and straightforward
> > method to do it but for the question of accuracy.  Please rewind to my
> > fundamental point: How do we get an accurate reading of the last value
> > of audit_lost before resetting it?
> 
> Okay, I thought you were worried about a different race, which is why
> this discussion wasn't making much sense to me.  I understand your
> point, but I really dislike the API; although that's not your fault,
> it's really the only way to do it via AUDIT_GET.
> 
> I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> worry about the small race window.  It would only be an issue if you
> reset the count under heavy audit load, and why would you reset the
> lost value if you were under a heavy audit load?  That just doesn't
> make sense.

I agree the AUDIT_SET approach is much more elegant, which is the
original way I had envisioned doing it, but it is lossy, and I'd lean
towards accuracy and reliability than uncertainty and doubt.

> I suppose we should hear from Steve on this since he was the one who
> has been asking for this feature, although I'm pretty sure I know what
> he is going to say.
> 
> -- 
> paul moore
> www.paul-moore.com

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Steve Grubb @ 2016-12-07 15:53 UTC (permalink / raw)
  To: linux-audit; +Cc: Richard Guy Briggs
In-Reply-To: <CAHC9VhQCaJwoMw+dvHF5S9x9Jy1jXAMx_kGGXPHthfQXDUW06g@mail.gmail.com>

On Wednesday, December 7, 2016 10:05:30 AM EST Paul Moore wrote:
> On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-06 19:17, Paul Moore wrote:
> >> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> 
> >> Okay, back up ... this whole mess about atomic_xchg() was always
> >> unrelated to my original suggestion, let's focus on my original
> >> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> >> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
> > 
> > I understood that.  It sounds like a nice simple and straightforward
> > method to do it but for the question of accuracy.  Please rewind to my
> > fundamental point: How do we get an accurate reading of the last value
> > of audit_lost before resetting it?
> 
> Okay, I thought you were worried about a different race, which is why
> this discussion wasn't making much sense to me.  I understand your
> point, but I really dislike the API; although that's not your fault,
> it's really the only way to do it via AUDIT_GET.
> 
> I'd much prefer we go with the cleaner AUDIT_SET approach and just not
> worry about the small race window.  It would only be an issue if you
> reset the count under heavy audit load, and why would you reset the
> lost value if you were under a heavy audit load?  That just doesn't
> make sense.
> 
> I suppose we should hear from Steve on this since he was the one who
> has been asking for this feature, although I'm pretty sure I know what
> he is going to say.

To start with, this request comes from users of the audit system. I just 
passed along the request. The issue is that when you do auditctl -s, you get 
the number of records lost. If you do it the next day, you have to do math to 
see what the one day delta is. So, to make reporting easy, they want it to be 
reset whenever they do audictl -s.

You could also make a AUDIT_GET_RESET that gets the status and resets the 
number atomically. Then I can add another commandline option to auditctl that 
allows an admin to say also reset the counters. If that command line option is 
passed, I call AUDIT_GET_RESET otherwise I call AUDIT_GET. Thought?

-Steve

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-07 15:05 UTC (permalink / raw)
  To: Richard Guy Briggs; +Cc: linux-audit
In-Reply-To: <20161207033220.GD22660@madcap2.tricolour.ca>

On Tue, Dec 6, 2016 at 10:32 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-06 19:17, Paul Moore wrote:
>> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > On 2016-12-05 12:48, Paul Moore wrote:
>> >> On Mon, Dec 5, 2016 at 11:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> > On 2016-12-05 11:02, Paul Moore wrote:
>> >> >> On Mon, Dec 5, 2016 at 3:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> >> > Add a method to reset the audit_lost value.
>> >> >> >
>> >> >> > An AUDIT_GET message will get the current audit_lost value and reset the
>> >> >> > counter to zero iff (if and only if) the AUDIT_FEATURE_LOST_RESET
>> >> >> > feature is set.
>> >> >> >
>> >> >> > If the flag AUDIT_FEATURE_BITMAP_LOST_RESET is present in the audit
>> >> >> > feature bitmap, the feature is settable by setting the
>> >> >> > AUDIT_FEATURE_LOST_RESET flag in the audit feature list with an
>> >> >> > AUDIT_SET_FEATURE call.  This setting is lockable.
>> >> >> >
>> >> >> > See: https://github.com/linux-audit/audit-kernel/issues/3
>> >> >> >
>> >> >> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
>> >> >> > ---
>> >> >> > Note: The AUDIT_FEATURE_BITMAP_LOST_RESET check may not be necessary if
>> >> >> > it is possible to read all the entries from audit_feature_names from
>> >> >> > userspace.
>> >> >> > ---
>> >> >> >  include/uapi/linux/audit.h |    7 +++++--
>> >> >> >  kernel/audit.c             |    9 ++++++---
>> >> >> >  2 files changed, 11 insertions(+), 5 deletions(-)
>> >> >>
>> >> >> Instead of resetting the lost counter on an AUDIT_GET if the reset
>> >> >> feature is set, how about preserving the AUDIT_GET behavior, skipping
>> >> >> the AUDIT_FEATURE_* addition, and simply reset the lost value by
>> >> >> sending a AUDIT_SET message with AUDIT_STATUS_LOST (you obviously have
>> >> >> to add this to the uapi header).
>> >> >
>> >> > I realized as I was coding it up that we would potentially lose an
>> >> > accurate count if the read and reset were not atomic.  This was the
>> >> > reason for using atomic_xchg().
>> >>
>> >> Well, okay, but that isn't what I was talking about ... ?  The
>> >> audit_cmd_mutex is held for both AUDIT_GET and AUDIT_SET so we should
>> >> never process these requests at the same time.
>> >
>> > I guess I still don't follow what you are talking about...  I hadn't
>> > thought of including both an AUDIT_GET and an AUDIT_SET in the same skb,
>> > but that would ensure that no other command reads or resets the lost
>> > value.  However, the lost value could still be incrementing on another
>> > CPU between these two commands, losing an accurate lost count.
>>
>> Okay, back up ... this whole mess about atomic_xchg() was always
>> unrelated to my original suggestion, let's focus on my original
>> comment ... don't reset the counter on a AUDIT_GET, reset it on a
>> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?
>
> I understood that.  It sounds like a nice simple and straightforward
> method to do it but for the question of accuracy.  Please rewind to my
> fundamental point: How do we get an accurate reading of the last value
> of audit_lost before resetting it?

Okay, I thought you were worried about a different race, which is why
this discussion wasn't making much sense to me.  I understand your
point, but I really dislike the API; although that's not your fault,
it's really the only way to do it via AUDIT_GET.

I'd much prefer we go with the cleaner AUDIT_SET approach and just not
worry about the small race window.  It would only be an issue if you
reset the count under heavy audit load, and why would you reset the
lost value if you were under a heavy audit load?  That just doesn't
make sense.

I suppose we should hear from Steve on this since he was the one who
has been asking for this feature, although I'm pretty sure I know what
he is going to say.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [userspace PATCH] Prevent free() of stack buffer with NOLOG format
From: Steve Grubb @ 2016-12-07 14:39 UTC (permalink / raw)
  To: linux-audit
In-Reply-To: <1613616.fOQnUE7urM@x2>

On Tuesday, December 6, 2016 10:55:05 AM EST Steve Grubb wrote:
> On Tuesday, December 6, 2016 7:57:33 AM EST George McCollister wrote:
> > On Mon, Dec 5, 2016 at 6:30 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> > > On Monday, December 5, 2016 6:01:02 PM EST George McCollister wrote:
> > >> When the NOLOG format is used replace_event_msg() doesn't change
> > >> e->reply.message so the message located on the stack is left and later
> > >> is
> > > 
> > >> free()'d in cleanup_event() resulting in the following:
> > > Hmm...thanks for reporting this. Which version of audit are you using?
> > 
> > I'm using 2.6.6 but I reproduced the problem and made the change
> > against the HEAD of the master branch (using this mirror
> > https://github.com/linux-audit/audit-userspace).
> 
> OK. Got it. The patch isn't exactly the right fix. While it may hide the
> problem, the intent is that people may want to use the enriched format and
> send logs to a remote collector. By any chance do you know which buffer on
> the stack is getting freed? I'm trying to reproduce this but I thought I'd
> ask if you where it is since you have already looked into it.

I committed the following patch to fix this:

https://fedorahosted.org/audit/changeset/1421

Thanks for reporting the problem!

-Steve

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Richard Guy Briggs @ 2016-12-07  3:32 UTC (permalink / raw)
  To: Paul Moore; +Cc: linux-audit
In-Reply-To: <CAHC9VhS5PM6x=nwO31mjjeE-nQmxP1c4BLZVeOVM8b9Eh_RmDA@mail.gmail.com>

On 2016-12-06 19:17, Paul Moore wrote:
> On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2016-12-05 12:48, Paul Moore wrote:
> >> On Mon, Dec 5, 2016 at 11:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> > On 2016-12-05 11:02, Paul Moore wrote:
> >> >> On Mon, Dec 5, 2016 at 3:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > Add a method to reset the audit_lost value.
> >> >> >
> >> >> > An AUDIT_GET message will get the current audit_lost value and reset the
> >> >> > counter to zero iff (if and only if) the AUDIT_FEATURE_LOST_RESET
> >> >> > feature is set.
> >> >> >
> >> >> > If the flag AUDIT_FEATURE_BITMAP_LOST_RESET is present in the audit
> >> >> > feature bitmap, the feature is settable by setting the
> >> >> > AUDIT_FEATURE_LOST_RESET flag in the audit feature list with an
> >> >> > AUDIT_SET_FEATURE call.  This setting is lockable.
> >> >> >
> >> >> > See: https://github.com/linux-audit/audit-kernel/issues/3
> >> >> >
> >> >> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> >> >> > ---
> >> >> > Note: The AUDIT_FEATURE_BITMAP_LOST_RESET check may not be necessary if
> >> >> > it is possible to read all the entries from audit_feature_names from
> >> >> > userspace.
> >> >> > ---
> >> >> >  include/uapi/linux/audit.h |    7 +++++--
> >> >> >  kernel/audit.c             |    9 ++++++---
> >> >> >  2 files changed, 11 insertions(+), 5 deletions(-)
> >> >>
> >> >> Instead of resetting the lost counter on an AUDIT_GET if the reset
> >> >> feature is set, how about preserving the AUDIT_GET behavior, skipping
> >> >> the AUDIT_FEATURE_* addition, and simply reset the lost value by
> >> >> sending a AUDIT_SET message with AUDIT_STATUS_LOST (you obviously have
> >> >> to add this to the uapi header).
> >> >
> >> > I realized as I was coding it up that we would potentially lose an
> >> > accurate count if the read and reset were not atomic.  This was the
> >> > reason for using atomic_xchg().
> >>
> >> Well, okay, but that isn't what I was talking about ... ?  The
> >> audit_cmd_mutex is held for both AUDIT_GET and AUDIT_SET so we should
> >> never process these requests at the same time.
> >
> > I guess I still don't follow what you are talking about...  I hadn't
> > thought of including both an AUDIT_GET and an AUDIT_SET in the same skb,
> > but that would ensure that no other command reads or resets the lost
> > value.  However, the lost value could still be incrementing on another
> > CPU between these two commands, losing an accurate lost count.
> 
> Okay, back up ... this whole mess about atomic_xchg() was always
> unrelated to my original suggestion, let's focus on my original
> comment ... don't reset the counter on a AUDIT_GET, reset it on a
> AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?

I understood that.  It sounds like a nice simple and straightforward
method to do it but for the question of accuracy.  Please rewind to my
fundamental point: How do we get an accurate reading of the last value
of audit_lost before resetting it?

> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: [RFC][PATCH] audit: add feature audit_lost reset
From: Paul Moore @ 2016-12-07  0:17 UTC (permalink / raw)
  To: Richard Guy Briggs; +Cc: linux-audit
In-Reply-To: <20161206051337.GG22655@madcap2.tricolour.ca>

On Tue, Dec 6, 2016 at 12:13 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-05 12:48, Paul Moore wrote:
>> On Mon, Dec 5, 2016 at 11:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > On 2016-12-05 11:02, Paul Moore wrote:
>> >> On Mon, Dec 5, 2016 at 3:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> > Add a method to reset the audit_lost value.
>> >> >
>> >> > An AUDIT_GET message will get the current audit_lost value and reset the
>> >> > counter to zero iff (if and only if) the AUDIT_FEATURE_LOST_RESET
>> >> > feature is set.
>> >> >
>> >> > If the flag AUDIT_FEATURE_BITMAP_LOST_RESET is present in the audit
>> >> > feature bitmap, the feature is settable by setting the
>> >> > AUDIT_FEATURE_LOST_RESET flag in the audit feature list with an
>> >> > AUDIT_SET_FEATURE call.  This setting is lockable.
>> >> >
>> >> > See: https://github.com/linux-audit/audit-kernel/issues/3
>> >> >
>> >> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
>> >> > ---
>> >> > Note: The AUDIT_FEATURE_BITMAP_LOST_RESET check may not be necessary if
>> >> > it is possible to read all the entries from audit_feature_names from
>> >> > userspace.
>> >> > ---
>> >> >  include/uapi/linux/audit.h |    7 +++++--
>> >> >  kernel/audit.c             |    9 ++++++---
>> >> >  2 files changed, 11 insertions(+), 5 deletions(-)
>> >>
>> >> Instead of resetting the lost counter on an AUDIT_GET if the reset
>> >> feature is set, how about preserving the AUDIT_GET behavior, skipping
>> >> the AUDIT_FEATURE_* addition, and simply reset the lost value by
>> >> sending a AUDIT_SET message with AUDIT_STATUS_LOST (you obviously have
>> >> to add this to the uapi header).
>> >
>> > I realized as I was coding it up that we would potentially lose an
>> > accurate count if the read and reset were not atomic.  This was the
>> > reason for using atomic_xchg().
>>
>> Well, okay, but that isn't what I was talking about ... ?  The
>> audit_cmd_mutex is held for both AUDIT_GET and AUDIT_SET so we should
>> never process these requests at the same time.
>
> I guess I still don't follow what you are talking about...  I hadn't
> thought of including both an AUDIT_GET and an AUDIT_SET in the same skb,
> but that would ensure that no other command reads or resets the lost
> value.  However, the lost value could still be incrementing on another
> CPU between these two commands, losing an accurate lost count.

Okay, back up ... this whole mess about atomic_xchg() was always
unrelated to my original suggestion, let's focus on my original
comment ... don't reset the counter on a AUDIT_GET, reset it on a
AUDIT_SET with an AUDIT_STATUS_LOST, does that make sense?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox