* Re: Flush the hold queue fall into an infinite loop.
From: Paul Moore @ 2022-01-13 15:22 UTC
To: cuigaosheng
Cc: linux-audit, Xiujianfeng, wangweiyang, linux-security-module, linux-kernel

On Thu, Jan 13, 2022 at 6:57 AM cuigaosheng <cuigaosheng1@huawei.com> wrote:
>
> When we add "audit=1" to the cmdline, kauditd takes up 100% of a CPU.
> To reproduce:
>
> configurations:
>   auditctl -b 64
>   auditctl --backlog_wait_time 60000
>   auditctl -r 0
>   auditctl -w /root/aaa -p wrx
> shell scripts:
>   #!/bin/bash
>   i=0
>   while [ $i -le 66 ]
>   do
>     touch /root/aaa
>     let i++
>   done
> mandatory conditions:
>
>   add "audit=1" to the cmdline, and kill -19 pid_number (for /sbin/auditd).
>
> As long as audit_hold_queue stays non-empty, flushing the hold queue falls
> into an infinite loop.
>
> 713 static int kauditd_send_queue(struct sock *sk, u32 portid,
> 714                               struct sk_buff_head *queue,
> 715                               unsigned int retry_limit,
> 716                               void (*skb_hook)(struct sk_buff *skb),
> 717                               void (*err_hook)(struct sk_buff *skb))
> 718 {
> 719         int rc = 0;
> 720         struct sk_buff *skb;
> 721         unsigned int failed = 0;
> 722
> 723         /* NOTE: kauditd_thread takes care of all our locking, we just use
> 724          *       the netlink info passed to us (e.g. sk and portid) */
> 725
> 726         while ((skb = skb_dequeue(queue))) {
> 727                 /* call the skb_hook for each skb we touch */
> 728                 if (skb_hook)
> 729                         (*skb_hook)(skb);
> 730
> 731                 /* can we send to anyone via unicast? */
> 732                 if (!sk) {
> 733                         if (err_hook)
> 734                                 (*err_hook)(skb);
> 735                         continue;
> 736                 }
> 737
> 738 retry:
> 739                 /* grab an extra skb reference in case of error */
> 740                 skb_get(skb);
> 741                 rc = netlink_unicast(sk, skb, portid, 0);
> 742                 if (rc < 0) {
> 743                         /* send failed - try a few times unless fatal error */
> 744                         if (++failed >= retry_limit ||
> 745                             rc == -ECONNREFUSED || rc == -EPERM) {
> 746                                 sk = NULL;
> 747                                 if (err_hook)
> 748                                         (*err_hook)(skb);
> 749                                 if (rc == -EAGAIN)
> 750                                         rc = 0;
> 751                                 /* continue to drain the queue */
> 752                                 continue;
> 753                         } else
> 754                                 goto retry;
> 755                 } else {
> 756                         /* skb sent - drop the extra reference and continue */
> 757                         consume_skb(skb);
> 758                         failed = 0;
> 759                 }
> 760         }
> 761
> 762         return (rc >= 0 ? 0 : rc);
> 763 }
>
> When kauditd attempts to flush the hold queue, the queue parameter is
> &audit_hold_queue. If netlink_unicast() (line 741) returns -EAGAIN, sk is
> set to NULL (line 746), so err_hook (kauditd_rehold_skb) is called. After
> the continue, skb_dequeue() (line 726) and err_hook (kauditd_rehold_skb,
> line 733) fall into an infinite loop.
>
> I don't really understand the value of audit_hold_queue. Can we remove it,
> or stop dropping the logs into kauditd_rehold_skb when auditd is in an
> abnormal state?

Thanks Gaosheng for the bug report. I'm able to reproduce this and I'm
looking into it now. I'll report back when I have a better idea of
the problem and a potential fix.

-- 
paul moore
www.paul-moore.com
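The loop described in the report can be reproduced in user space. What follows is only a minimal sketch, not kernel code: `struct rec`, `struct queue`, `queue_push_head()`, `queue_pop_head()`, `drain_with_requeue()` and `drain_bounded()` are hypothetical names made up for illustration. It assumes, as the report implies, that the error hook puts each record back on the same queue that is being drained. The bounded variant is just one generic way to make such a loop terminate and is not presented as the fix discussed in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and struct sk_buff_head. */
struct rec {
	struct rec *next;
	int id;
};

struct queue {
	struct rec *head;
	size_t len;
};

/* Put a record at the head of the queue. */
static void queue_push_head(struct queue *q, struct rec *r)
{
	r->next = q->head;
	q->head = r;
	q->len++;
}

/* Take the record at the head of the queue, or NULL if the queue is empty. */
static struct rec *queue_pop_head(struct queue *q)
{
	struct rec *r = q->head;

	if (r) {
		q->head = r->next;
		r->next = NULL;
		q->len--;
	}
	return r;
}

/*
 * Same shape as the reported loop once there is nobody to send to: every
 * dequeued record goes to an error hook that puts it straight back on the
 * queue being drained. The cap exists only so this demo terminates; the
 * loop in the report has no such limit.
 */
static size_t drain_with_requeue(struct queue *q, size_t cap)
{
	size_t iterations = 0;
	struct rec *r;

	while (iterations < cap && (r = queue_pop_head(q)) != NULL) {
		queue_push_head(q, r);	/* the "error hook" re-holds the record */
		iterations++;
	}
	return iterations;
}

/*
 * One generic way to bound such a loop: visit at most as many records as
 * were queued when the drain started.
 */
static size_t drain_bounded(struct queue *q)
{
	size_t budget = q->len;
	size_t iterations = 0;
	struct rec *r;

	while (budget > 0 && (r = queue_pop_head(q)) != NULL) {
		queue_push_head(q, r);
		budget--;
		iterations++;
	}
	return iterations;
}
```

With three records queued, `drain_with_requeue(&q, 1000)` uses up the whole cap and leaves all three records on the queue, which is the user-space analogue of kauditd spinning at 100% CPU. `drain_bounded(&q)` stops after three iterations with the queue contents unchanged.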
* Re: Flush the hold queue fall into an infinite loop.
From: cuigaosheng @ 2022-01-14 1:22 UTC
To: Paul Moore
Cc: linux-audit, Xiujianfeng, wangweiyang, linux-security-module, linux-kernel

I want to stop dropping the logs into audit_hold_queue when auditd is in an
abnormal state. It seems that this modification goes against the design
intent of audit_hold_queue; its effect is similar to removing
audit_hold_queue.

diff --git a/kernel/audit.c b/kernel/audit.c
index 2a38cbaf3ddb..a8091b1a6587 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -748,6 +748,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
 					(*err_hook)(skb);
 				if (rc == -EAGAIN)
 					rc = 0;
+				audit_default = AUDIT_OFF;
 				/* continue to drain the queue */
 				continue;
 			} else
@@ -755,6 +756,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
 		} else {
 			/* skb sent - drop the extra reference and continue */
 			consume_skb(skb);
+			audit_default = audit_enabled;
 			failed = 0;
 		}
 	}
* Re: Flush the hold queue fall into an infinite loop.
From: Paul Moore @ 2022-01-14 22:35 UTC
To: cuigaosheng
Cc: linux-audit, Xiujianfeng, wangweiyang, linux-security-module, linux-kernel

On Thu, Jan 13, 2022 at 8:22 PM cuigaosheng <cuigaosheng1@huawei.com> wrote:
>
> I want to stop dropping the logs into audit_hold_queue when auditd is in an
> abnormal state. It seems that this modification goes against the design
> intent of audit_hold_queue; its effect is similar to removing
> audit_hold_queue.
>
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 2a38cbaf3ddb..a8091b1a6587 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -748,6 +748,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
>  					(*err_hook)(skb);
>  				if (rc == -EAGAIN)
>  					rc = 0;
> +				audit_default = AUDIT_OFF;
>  				/* continue to drain the queue */
>  				continue;
>  			} else
> @@ -755,6 +756,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
>  		} else {
>  			/* skb sent - drop the extra reference and continue */
>  			consume_skb(skb);
> +			audit_default = audit_enabled;
>  			failed = 0;
>  		}
>  	}

We can't toggle the audit_default setting like this; that isn't
acceptable upstream.

I believe I have a fix, but I need to finish the testing before I can
post it for further review.

-- 
paul moore
www.paul-moore.com
Thread overview: 3+ messages
[not found] <96f4f1cb-0e7d-6682-ce33-f7f1314cba83@huawei.com>
[not found] ` <8b487a19-d121-5fee-eda5-0aee9340f453@huawei.com>
2022-01-13 15:22 ` Flush the hold queue fall into an infinite loop Paul Moore
2022-01-14 1:22 ` cuigaosheng
2022-01-14 22:35 ` Paul Moore