From: Max Englander <max.englander@gmail.com>
To: Paul Moore <paul@paul-moore.com>
Cc: linux-audit@redhat.com
Subject: Re: [PATCH v2] audit: report audit wait metric in audit status reply
Date: Fri, 3 Jul 2020 22:31:22 +0000 [thread overview]
Message-ID: <20200703223121.GA10378@linux-kernel-dev> (raw)
In-Reply-To: <CAHC9VhTctHCCrm4Q1cPdFX-6NXEtmjEPmw6rvUoxOq8UUmycxA@mail.gmail.com>
On Thu, Jul 02, 2020 at 04:42:13PM -0400, Paul Moore wrote:
> On Wed, Jul 1, 2020 at 5:32 PM Max Englander <max.englander@gmail.com> wrote:
> >
> > In environments where the preservation of audit events and predictable
> > usage of system memory are prioritized, admins may use a combination of
> > --backlog_wait_time and -b options at the risk of degraded performance
> > resulting from backlog waiting. In some cases, this risk may be
> > preferred to lost events or unbounded memory usage. Ideally, this risk
> > can be mitigated by making adjustments when backlog waiting is detected.
> >
> > However, detection can be diffult using the currently available metrics.
> > For example, an admin attempting to debug degraded performance may
> > falsely believe a full backlog indicates backlog waiting. It may turn
> > out the backlog frequently fills up but drains quickly.
> >
> > To make it easier to reliably track degraded performance to backlog
> > waiting, this patch makes the following changes:
> >
> > Add a new field backlog_wait_sum to the audit status reply. Initialize
> > this field to zero. Add to this field the total time spent by the
> > current task on scheduled timeouts while the backlog limit is exceeded.
> >
> > Tested on Ubuntu 18.04 using complementary changes to the audit
> > userspace: https://github.com/linux-audit/audit-userspace/pull/134.
> >
> > Signed-off-by: Max Englander <max.englander@gmail.com>
> > ---
> > Patch changelogs between v1 and v2:
> > - Instead of printing a warning when backlog waiting occurs, add
> > duration of backlog waiting to cumulative sum, and report this
> > sum in audit status reply.
> >
> > include/uapi/linux/audit.h | 7 ++++++-
> > kernel/audit.c | 9 +++++++++
> > 2 files changed, 15 insertions(+), 1 deletion(-)
>
> Hi Max,
>
> In general this looks better than the previous approach, but I do have
> a few specific comments (inline). It also important that in addition
Thanks for your feedback and comments, Paul. I've prepared a v3 patch
addressing all of your comments, with corresponding changes to the
audit-userspace, which I'll submit once I get a working change to
the test suite. I'll use this thread to respond to some things and ask
a question.
> to the requisite userspace patch, we also need a test added to the
> audit-testsuite project so we can verify this functionality in the
> future.
>
> * https://github.com/linux-audit/audit-testsuite
>
I downloaded this test suite and attempted to run it on Ubuntu 18.04,
with the latest audit kernel tree and latest audit-userspace. Many tests
were failing, I assume because I have some issues with my environment.
May I ask what environment (OS, tree, commit) you recommend for running
the test suite? Happy to move this question over to GitHub if that's a
better venue.
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index a534d71e689a..ea0cc364beca 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -340,6 +340,7 @@ enum {
> > #define AUDIT_STATUS_BACKLOG_LIMIT 0x0010
> > #define AUDIT_STATUS_BACKLOG_WAIT_TIME 0x0020
> > #define AUDIT_STATUS_LOST 0x0040
> > +#define AUDIT_STATUS_BACKLOG_WAIT_SUM 0x0080
>
> Sooo ... you've defined this, but I don't see any of the corresponding
> AUDIT_SET code that I would expect, was that an oversight? If not, it
> is something we should support in the kernel as I'm sure admins will
> want to reset this value at some point.
To be honest I had based this patch off v1 which included a flag for
setting the backlog warn time threshold, but didn't remove it from the
v2 patch (an oversight). I wasn't thinking about admins' need to reset
the value, but since you suggested it I've included support for that in
the v3 patch.
>
> > #define AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT 0x00000001
> > #define AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME 0x00000002
> > @@ -348,6 +349,7 @@ enum {
> > #define AUDIT_FEATURE_BITMAP_SESSIONID_FILTER 0x00000010
> > #define AUDIT_FEATURE_BITMAP_LOST_RESET 0x00000020
> > #define AUDIT_FEATURE_BITMAP_FILTER_FS 0x00000040
> > +#define AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_SUM 0x00000080
>
> In an effort not to exhaust the feature bitmap too quickly, I've been
> restricting it to only those features that would cause breakage with
> userspace. I haven't looked closely at Steve's userspace in quite a
> while, but I'm guessing it can key off the structure size and doesn't
> need this entry in the bitmap, right? Let me rephrase, if userspace
> needs to key off anything, it *should* key off the structure size and
> not a new flag in the bitmask ;)
I've removed this flag from the v3 patch, but if it's alright with
you I'll key off the absence/presence of
AUDIT_STATUS_BACKLOG_WAIT_TIME_ACTUAL, which follows the pattern
currently used throughout the audit-userspace, and which Richard suggests
in separate message.
>
> Also, I'm assuming that older userspace doesn't blow-up if it sees the
> larger structure size? That's even more important.
>
> > #define AUDIT_FEATURE_BITMAP_ALL (AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT | \
> > AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME | \
> > @@ -355,12 +357,14 @@ enum {
> > AUDIT_FEATURE_BITMAP_EXCLUDE_EXTEND | \
> > AUDIT_FEATURE_BITMAP_SESSIONID_FILTER | \
> > AUDIT_FEATURE_BITMAP_LOST_RESET | \
> > - AUDIT_FEATURE_BITMAP_FILTER_FS)
> > + AUDIT_FEATURE_BITMAP_FILTER_FS | \
> > + AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_SUM)
> >
> > /* deprecated: AUDIT_VERSION_* */
> > #define AUDIT_VERSION_LATEST AUDIT_FEATURE_BITMAP_ALL
> > #define AUDIT_VERSION_BACKLOG_LIMIT AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT
> > #define AUDIT_VERSION_BACKLOG_WAIT_TIME AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME
> > +#define AUDIT_VERSION_BACKLOG_WAIT_SUM AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_SUM
> >
> > /* Failure-to-log actions */
> > #define AUDIT_FAIL_SILENT 0
> > @@ -466,6 +470,7 @@ struct audit_status {
> > __u32 feature_bitmap; /* bitmap of kernel audit features */
> > };
> > __u32 backlog_wait_time;/* message queue wait timeout */
> > + __u32 backlog_wait_sum;/* time spent waiting while message limit exceeded */
>
> This is very nitpicky, but how about a rename to 'backlog_wait_time_actual'?
Sure, will do.
>
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index 87f31bf1f0a0..301ea4f3d750 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -136,6 +136,11 @@ u32 audit_sig_sid = 0;
> > */
> > static atomic_t audit_lost = ATOMIC_INIT(0);
> >
> > +/* Monotonically increasing sum of time the kernel has spent
> > + * waiting while the backlog limit is exceeded.
> > + */
> > +static atomic_t audit_backlog_wait_sum = ATOMIC_INIT(0);
>
> Needless to say, this should be renamed too so we don't go crazy.
Of course :)
>
> > /* Hash for inode-based rules */
> > struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
> >
> > @@ -1204,6 +1209,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > s.backlog = skb_queue_len(&audit_queue);
> > s.feature_bitmap = AUDIT_FEATURE_BITMAP_ALL;
> > s.backlog_wait_time = audit_backlog_wait_time;
> > + s.backlog_wait_sum = atomic_read(&audit_backlog_wait_sum);
> > audit_send_reply(skb, seq, AUDIT_GET, 0, 0, &s, sizeof(s));
> > break;
> > }
> > @@ -1794,6 +1800,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
> > return NULL;
> > }
> > }
> > +
> > + if (stime != audit_backlog_wait_time)
> > + atomic_add(audit_backlog_wait_time - stime, &audit_backlog_wait_sum);
>
> Since stime can only be different in one place in the code above
> (after the schedule_timeout() call), why not move the atomic_add() up
> there and drop the "if"? Yes there is the potential of calling
> atomic_add() multiple times in this case, but the thread is waiting
> anyway and this way we don't impact other code paths.
My concern was calling atomic_add() more than necessary, but happy to
move it up as you suggest.
>
> > }
> >
> > ab = audit_buffer_alloc(ctx, gfp_mask, type);
> > --
> > 2.17.1
>
> --
> paul moore
> www.paul-moore.com
--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
next prev parent reply other threads:[~2020-07-03 22:34 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-01 21:32 [PATCH v2] audit: report audit wait metric in audit status reply Max Englander
2020-07-02 20:42 ` Paul Moore
2020-07-03 21:29 ` Richard Guy Briggs
2020-07-03 22:36 ` Max Englander
2020-07-03 22:31 ` Max Englander [this message]
2020-12-03 3:52 ` Steve Grubb
2020-12-03 4:12 ` Paul Moore
2020-12-03 12:37 ` Richard Guy Briggs
2020-12-03 15:37 ` Paul Moore
2020-12-03 23:10 ` Richard Guy Briggs
2020-12-03 23:43 ` Paul Moore
2020-12-03 23:55 ` Steve Grubb
2020-12-04 2:16 ` Paul Moore
2020-12-04 2:47 ` Steve Grubb
2020-12-04 20:41 ` Paul Moore
2020-12-07 21:13 ` Max Englander
2020-12-07 21:17 ` Paul Moore
2020-12-07 21:21 ` Richard Guy Briggs
2020-12-07 21:28 ` Max Englander
2020-12-07 23:28 ` Steve Grubb
2020-12-08 1:34 ` Richard Guy Briggs
2020-12-08 3:34 ` Steve Grubb
2020-12-08 13:20 ` Richard Guy Briggs
2020-12-08 13:44 ` Steve Grubb
2020-12-08 23:08 ` Paul Moore
2020-12-03 13:31 ` Steve Grubb
2020-12-07 19:43 ` Lenny Bruzenak
2020-12-07 21:14 ` Paul Moore
2020-12-03 4:33 ` Joe Wulf
2020-12-07 21:48 ` Max Englander
2020-12-08 16:57 ` Lenny Bruzenak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200703223121.GA10378@linux-kernel-dev \
--to=max.englander@gmail.com \
--cc=linux-audit@redhat.com \
--cc=paul@paul-moore.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).