From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: linux-kernel@vger.kernel.org, oleg@redhat.com, eparis@redhat.com,
akpm@linux-foundation.org, rgb@redhat.com
Subject: Re: [RFC] audit: avoid soft lockup in audit_log_start()
Date: Mon, 09 Sep 2013 19:42:09 +0400 [thread overview]
Message-ID: <522DEC51.2090608@openvz.org> (raw)
In-Reply-To: <20130909112953.10ac2e28@redhat.com>
Luiz Capitulino wrote:
> On Mon, 09 Sep 2013 19:19:14 +0400
> Konstantin Khlebnikov<khlebnikov@openvz.org> wrote:
>
>> Luiz Capitulino wrote:
>>> On Mon, 09 Sep 2013 18:32:13 +0400
>>> Konstantin Khlebnikov<khlebnikov@openvz.org> wrote:
>>>
>>>> Luiz Capitulino wrote:
>>>>> I'm getting the following soft lockup:
>>>>>
>>>>> CPU: 6 PID: 2278 Comm: killall5 Tainted: GF 3.11.0-rc7+ #1
>>>>> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>>>> 0000000000000099 ffff88011fd83de8 ffffffff815324df 0000000000002800
>>>>> ffffffff817d48f9 ffff88011fd83e68 ffffffff8152e669 ffff88011fd83e68
>>>>> ffffffff00000008 ffff88011fd83e78 ffff88011fd83e18 0000004081dac040
>>>>> Call Trace:
>>>>> <IRQ> [<ffffffff815324df>] dump_stack+0x46/0x58
>>>>> [<ffffffff8152e669>] panic+0xbb/0x1c4
>>>>> [<ffffffff810d03c3>] watchdog_timer_fn+0x163/0x170
>>>>> [<ffffffff8106c691>] __run_hrtimer+0x81/0x1c0
>>>>> [<ffffffff810d0260>] ? watchdog+0x30/0x30
>>>>> [<ffffffff8106cea7>] hrtimer_interrupt+0x107/0x240
>>>>> [<ffffffff8102f61b>] local_apic_timer_interrupt+0x3b/0x60
>>>>> [<ffffffff81542465>] smp_apic_timer_interrupt+0x45/0x60
>>>>> [<ffffffff8154124a>] apic_timer_interrupt+0x6a/0x70
>>>>> <EOI> [<ffffffff810c2f5f>] ? audit_log_start+0xbf/0x430
>>>>> [<ffffffff810c2fe7>] ? audit_log_start+0x147/0x430
>>>>> [<ffffffff81079030>] ? try_to_wake_up+0x2a0/0x2a0
>>>>> [<ffffffff810c86be>] audit_log_exit+0x6ae/0xc30
>>>>> [<ffffffff81188662>] ? __alloc_fd+0x42/0x100
>>>>> [<ffffffff810c98e7>] __audit_syscall_exit+0x257/0x2b0
>>>>> [<ffffffff81540794>] sysret_audit+0x17/0x21
>>>>>
>>>>> The reproducer is somewhat unusual:
>>>>>
>>>>> 1. Install RHEL6.5 (maybe a similar older user-space will do)
>>>>> 2. Boot the just installed system
>>>>> 3. In this first boot you'll meet the firstboot script, which
>>>>> will do some setup and (depending on your answers) it will
>>>>> reboot the machine
>>>>> 4. During that first reboot the system hangs while terminating
>>>>> all processes:
>>>>>
>>>>> Sending all processes the TERM signal...
>>>>>
>>>>> It's when the soft lockup above happens. And yes, I managed
>>>>> to get this with latest upstream kernel (HEAD fa8218def1b1)
>>>>>
>>>>> I'm reproducing it on a VM, but the first report was on bare-metal.
>>>>>
>>>>> This is what is happening:
>>>>>
>>>>> 1. audit_log_start() is called
>>>>> 2. As we have SKBs waiting in audit_skb_queue and all conditions
>>>>> evaluate to true, we sleep in wait_for_auditd()
>>>>> 3. Go to 2, until sleep_time gets negative and audit_log_start()
>>>>> just busy-waits
>>>>>
>>>>> Now, *why* this is happening is a mistery to me. I tried debugging
>>>>> it, but all I could find is that at some point the kauditd thread
>>>>> never wakes up after having called schedule(). I even tried waking
>>>>> it up before calling wait_for_auditd(), but it didn't.
>>>>
>>>> We run into the same problem in rhel6 kernel.
>>>>
>>>> "readahead-collector" uses audit interface and sometimes stuck in 'stopped' state.
>>>
>>> Yes, please also see:
>>>
>>> http://marc.info/?l=linux-kernel&m=137818375024600&w=2
>>>
>>>> After commit 829199197a430dade2519d54f5545c4a094393b8
>>>> (which was backported by RH into their kernel)
>>>> audit emiters will block forever if userspace daemon cannot handle backlog.
>>>> That commit just breaks timeout condition, after timeout waiting loop turns
>>>> into busy loop until deamon dies or returns back to work.
>>>>
>>>> this trivial patch should fix that problem
>>>>
>>>> --- a/kernel/audit.c
>>>> +++ b/kernel/audit.c
>>>> @@ -1215,9 +1215,10 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
>>>>
>>>> sleep_time = timeout_start + audit_backlog_wait_time -
>>>> jiffies;
>>>> - if ((long)sleep_time> 0)
>>>> + if ((long)sleep_time> 0) {
>>>> wait_for_auditd(sleep_time);
>>>> - continue;
>>>> + continue;
>>>> + }
>>>> }
>>>
>>>
>>> Chuck Anderson posted a similar fix:
>>>
>>> http://marc.info/?l=linux-kernel&m=137817994623832&w=2
>>>
>>> I still get a hang for around a minute with Chuck's fix, I believe I'll
>>> get the same thing with yours.
>>
>> Yep, this is normal behaviour -- default audit_backlog_wait_time is 60 seconds.
>
> Yes, I know that's the cause but I don't call it normal behavior to
> be unable to use my system during 60 seconds.
This stuff was designed for auditing, so it shouldn't loose messages too easily.
Meanwhile 'readahead' reuses it for a different purpose without proper tuning.
I think kernel should provide interface for tuning that timeout and let 'readahead'
set it to zero or some small value.
>
>> But this logic is really strange, I don't see any interface for tuning that timeout
>> and seems like kernel set it to zero after first disaster and never recovers it back.
next prev parent reply other threads:[~2013-09-09 15:42 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-28 22:21 [RFC] audit: avoid soft lockup in audit_log_start() Luiz Capitulino
2013-08-28 22:33 ` Andrew Morton
2013-08-28 22:54 ` Luiz Capitulino
2013-08-28 23:08 ` Andrew Morton
2013-08-29 0:49 ` Luiz Capitulino
2013-08-30 18:23 ` Luiz Capitulino
2013-09-09 14:32 ` Konstantin Khlebnikov
2013-09-09 14:54 ` Luiz Capitulino
2013-09-09 15:19 ` Konstantin Khlebnikov
2013-09-09 15:29 ` Luiz Capitulino
2013-09-09 15:42 ` Konstantin Khlebnikov [this message]
2013-09-10 16:03 ` Eric Paris
2013-09-10 17:45 ` Luiz Capitulino
2013-09-17 22:28 ` Andrew Morton
2013-09-17 22:54 ` Luiz Capitulino
2013-09-18 1:57 ` Richard Guy Briggs
2013-09-18 9:48 ` [PATCH] audit: fix endless wait " Konstantin Khlebnikov
2013-09-18 13:31 ` Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 0/8] Audit backlog queue fixes related to soft lockup Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 1/8] audit: avoid soft lockup due to audit_log_start() incorrect loop termination Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 2/8] audit: reset audit backlog wait time after error recovery Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 3/8] audit: make use of remaining sleep time from wait_for_auditd Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 4/8] audit: efficiency fix 1: only wake up if queue shorter than backlog limit Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 5/8] audit: efficiency fix 2: request exclusive wait since all need same resource Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 6/8] audit: add boot option to override default backlog limit Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 7/8] audit: clean up AUDIT_GET/SET local variables and future-proof API Richard Guy Briggs
2013-09-19 21:18 ` Steve Grubb
2013-09-20 14:47 ` Eric Paris
2013-09-23 16:38 ` Richard Guy Briggs
2013-09-18 19:06 ` [PATCH 8/8] audit: add audit_backlog_wait_time configuration option Richard Guy Briggs
2013-09-18 20:33 ` Eric Paris
2013-09-18 20:49 ` Richard Guy Briggs
2013-09-18 20:54 ` Eric Paris
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=522DEC51.2090608@openvz.org \
--to=khlebnikov@openvz.org \
--cc=akpm@linux-foundation.org \
--cc=eparis@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=rgb@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox