From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 616F9C433F5 for ; Mon, 13 Dec 2021 18:31:57 +0000 (UTC) Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-445-Tg-LVqetM5-HRLiqOqZS-w-1; Mon, 13 Dec 2021 13:31:52 -0500 X-MC-Unique: Tg-LVqetM5-HRLiqOqZS-w-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C071E1853024; Mon, 13 Dec 2021 18:31:48 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 227CE5BE1B; Mon, 13 Dec 2021 18:31:46 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 9CC591809CB8; Mon, 13 Dec 2021 18:31:43 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id 1BDIVfNK003193 for ; Mon, 13 Dec 2021 13:31:41 -0500 Received: by smtp.corp.redhat.com (Postfix) id 1932B1121314; Mon, 13 Dec 2021 18:31:41 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast08.extmail.prod.ext.rdu2.redhat.com [10.11.55.24]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 145D01121318 for ; Mon, 13 Dec 2021 18:31:38 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-2.mimecast.com [205.139.110.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1D2893879966 for ; Mon, 13 Dec 2021 18:31:38 +0000 (UTC) Received: from mail-qt1-f182.google.com (mail-qt1-f182.google.com [209.85.160.182]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-210-V1JFadMPP9OBLBOoHesXKg-1; Mon, 13 Dec 2021 13:31:36 -0500 X-MC-Unique: V1JFadMPP9OBLBOoHesXKg-1 Received: by mail-qt1-f182.google.com with SMTP id 8so16120086qtx.5 for ; Mon, 13 Dec 2021 10:31:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:from:to:cc:date:message-id:user-agent :mime-version:content-transfer-encoding; bh=GPXNsyp1z/qRZY4SSy0s5AZY0Jea0ALh3VoCdAshotE=; b=aZ32ptymApeBRVhUdKxlsPwvE9stmDasHUBdQ4eVhH3hU25qAMlMFsR0qA+NCH5QeP TaBfOMlw8yMW+qsADhmDvu5SNCLDWIqBKIT67hNTybOlyq/9EiKdlz0GkYmVU8ekKdhN zfFNKKAQbYTP7iGHJAf7XbpEPNY2psjIaiElWpgC0qSXZRq0yCifxx9hYhlzMGLyu2I3 dAdyeNoZ20VrzECu9ocplCKdtvz+r+rjTvSpoTQOP1nbAKqQFgebDxaKNonaOU9PUH9E gZpt7x7RKo30zN4+DKPSy27F/g78nbheL1BrwiGjCLN9pgPyVO/zj2LdbzwxyRRFLnV6 e12w== X-Gm-Message-State: AOAM531GC3PEXMhIqEPznvzfk1c/iKM5fj4cxxb6/4qu0OrYN7/akb1J eEliwmhv4X+28gxQ6rXy2od8X2AvrM6H X-Google-Smtp-Source: ABdhPJyMAxK3nAw6e6NifQb8a2R+0vjJ22fB0Cj24WN5pPsfVTObj7kWscS8vR7at28uIagyVaC3XA== X-Received: by 2002:a05:622a:64c:: with SMTP id a12mr22157qtb.312.1639420294887; Mon, 13 Dec 2021 10:31:34 -0800 (PST) Received: from localhost (pool-96-237-52-46.bstnma.fios.verizon.net. [96.237.52.46]) by smtp.gmail.com with ESMTPSA id y18sm10799629qtx.19.2021.12.13.10.31.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Dec 2021 10:31:34 -0800 (PST) Subject: [PATCH] audit: improve robustness of the audit queue handling From: Paul Moore To: linux-audit@redhat.com Date: Mon, 13 Dec 2021 13:31:33 -0500 Message-ID: <163942029335.62691.7176083220772477830.stgit@olly> User-Agent: StGit/1.4 MIME-Version: 1.0 X-Mimecast-Impersonation-Protect: Policy=CLT - Impersonation Protection Definition; Similar Internal Domain=false; Similar Monitored External Domain=false; Custom External Domain=false; Mimecast External Domain=false; Newly Observed Domain=false; Internal User Name=false; Custom Display Name List=false; Reply-to Address Mismatch=false; Targeted Threat Dictionary=false; Mimecast Threat Dictionary=false; Custom Threat Dictionary=false X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-loop: linux-audit@redhat.com Cc: Gaosheng Cui X-BeenThere: linux-audit@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: Linux Audit Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-audit-bounces@redhat.com Errors-To: linux-audit-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=linux-audit-bounces@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit If the audit daemon were ever to get stuck in a stopped state the kernel's kauditd_thread() could get blocked attempting to send audit records to the userspace audit daemon. With the kernel thread blocked it is possible that the audit queue could grow unbounded as certain audit record generating events must be exempt from the queue limits else the system enter a deadlock state. This patch resolves this problem by lowering the kernel thread's socket sending timeout from MAX_SCHEDULE_TIMEOUT to HZ/10 and tweaks the kauditd_send_queue() function to better manage the various audit queues when connection problems occur between the kernel and the audit daemon. With this patch, the backlog may temporarily grow beyond the defined limits when the audit daemon is stopped and the system is under heavy audit pressure, but kauditd_thread() will continue to make progress and drain the queues as it would for other connection problems. For example, with the audit daemon put into a stopped state and the system configured to audit every syscall it was still possible to shutdown the system without a kernel panic, deadlock, etc.; granted, the system was slow to shutdown but that is to be expected given the extreme pressure of recording every syscall. The timeout value of HZ/10 was chosen primarily through experimentation and this developer's "gut feeling". There is likely no one perfect value, but as this scenario is limited in scope (root privileges would be needed to send SIGSTOP to the audit daemon), it is likely not worth exposing this as a tunable at present. This can always be done at a later date if it proves necessary. Cc: stable@vger.kernel.org Fixes: 5b52330bbfe63 ("audit: fix auditd/kernel connection state tracking") Reported-by: Gaosheng Cui Signed-off-by: Paul Moore --- kernel/audit.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 121d37e700a6..4cebadb5f30d 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -718,7 +718,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid, { int rc = 0; struct sk_buff *skb; - static unsigned int failed = 0; + unsigned int failed = 0; /* NOTE: kauditd_thread takes care of all our locking, we just use * the netlink info passed to us (e.g. sk and portid) */ @@ -735,32 +735,30 @@ static int kauditd_send_queue(struct sock *sk, u32 portid, continue; } +retry: /* grab an extra skb reference in case of error */ skb_get(skb); rc = netlink_unicast(sk, skb, portid, 0); if (rc < 0) { - /* fatal failure for our queue flush attempt? */ + /* send failed - try a few times unless fatal error */ if (++failed >= retry_limit || rc == -ECONNREFUSED || rc == -EPERM) { - /* yes - error processing for the queue */ sk = NULL; if (err_hook) (*err_hook)(skb); - if (!skb_hook) - goto out; - /* keep processing with the skb_hook */ + if (rc == -EAGAIN) + rc = 0; + /* continue to drain the queue */ continue; } else - /* no - requeue to preserve ordering */ - skb_queue_head(queue, skb); + goto retry; } else { - /* it worked - drop the extra reference and continue */ + /* skb sent - drop the extra reference and continue */ consume_skb(skb); failed = 0; } } -out: return (rc >= 0 ? 0 : rc); } @@ -1609,7 +1607,8 @@ static int __net_init audit_net_init(struct net *net) audit_panic("cannot initialize netlink socket in namespace"); return -ENOMEM; } - aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT; + /* limit the timeout in case auditd is blocked/stopped */ + aunet->sk->sk_sndtimeo = HZ / 10; return 0; } -- Linux-audit mailing list Linux-audit@redhat.com https://listman.redhat.com/mailman/listinfo/linux-audit