From: Andy Lutomirski <luto@amacapital.net>
To: paulmck@linux.vnet.ibm.com, Dave Jones <davej@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
rgb@redhat.com, eparis@redhat.com
Cc: fweisbec@gmail.com,
Linus Torvalds <torvalds@linux-foundation.org>,
Oleg Nesterov <oleg@redhat.com>
Subject: Re: audit: rcu_read_lock() used illegally while idle
Date: Wed, 03 Dec 2014 12:06:56 -0800 [thread overview]
Message-ID: <547F6D60.5050007@amacapital.net> (raw)
In-Reply-To: <20141203192919.GP25340@linux.vnet.ibm.com>
On 12/03/2014 11:29 AM, Paul E. McKenney wrote:
> On Wed, Dec 03, 2014 at 01:19:22PM -0500, Dave Jones wrote:
>> I'm not sure why this only just started complaining, because this code
>> hasn't changed in years, but I don't recall seeing this before.
>> This gets spewed during bootup since I put 3.18-rc7 on my firewall.
>> Previously it was running rc4 where I didn't see this.
>
> That is quite strange. I wonder if NO_HZ_FULL has told RCU that the
> CPU is idle before the sysret_audit hook is called.
>
> Adding Frederic for his thoughts on this.
>
>> Did something in RCU change recently ?
>
> Not since -rc1, as far as I know, anyway.
There was a cute little bug that probably prevented sysret_audit from
being called if TIF_NOHZ was set:
b5e212a3051b x86, syscall: Fix _TIF_NOHZ handling in
syscall_trace_enter_phase1
IOW, between 3.18-rc1 and 3.18-rc6 (approximately), sysret_audit would
never happen in NO_HZ_FULL mode due to that bug.
Looking at the current code, I'm not sure I understand why we don't just
infinite-loop. Hmm, maybe we do -- could this be the lockup that no one
understands yet? (I doubt it, but maybe.)
The loop would be:
sysret_check:
Load ti.flags into edx (assume that _TIF_SYSCALL_AUDIT and some other
_TIF_ALLWORK_MASK but is set).
Jump to sysret_careful. Eventually get to sysret_signal. Notice that
TIF_SYSCALL_AUDIT is set in edx. Jump to sysret_audit.
sysret_audit then clears the _TIF_SYSCALL_AUDIT bit from edi. edi is
not edx. Hmm.
Jump to sysret_check, where edx is the same as before. Repeat.
This doesn't explain how we end up screwing up context tracking. But I
don't understand why we don't execute the audit exit hook *twice* if
TIF_NOHZ is set.
I have patches to delete this whole fscking sysret fast but not really
fast path. I'll resend them for 3.19. In the mean time, can you test
this patch by itself:
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/entry&id=1072a16a8d4ad1b11b8062f76e3236b9771b0fb6
Applying just that patch will result in a fairly large performance hit
if auditing is on, but given the shear number of bugs that the syscall
audit hooks seem to cause, I'm think it may be a reasonable tradeoff. I
don't really like it as a last-minute fix, though. (To get the
performance back, you need the two patches before it, but those are
*definitely* not last-minute material.)
On another note, all those TIF_FOO_BAR_BAZ masks are incomprehensible
and probably wrong in various interesting ways.
--Andy
>
> Thanx, Paul
>
>> ===============================
>> [ INFO: suspicious RCU usage. ]
>> 3.18.0-rc7+ #93 Not tainted
>> -------------------------------
>> include/linux/rcupdate.h:883 rcu_read_lock() used illegally while idle!
>>
>> other info that might help us debug this:
>>
>>
>> RCU used illegally from idle CPU!
>> rcu_scheduler_active = 1, debug_locks = 0
>> RCU used illegally from extended quiescent state!
>> 1 lock held by systemd-sysctl/557:
>> #0: (rcu_read_lock){......}, at: [<ffffffff890f1320>] audit_filter_type+0x0/0x240
>>
>> stack backtrace:
>> CPU: 0 PID: 557 Comm: systemd-sysctl Not tainted 3.18.0-rc7+ #93
>> 0000000000000000 0000000063c50efc ffff88021f51fd28 ffffffff895abec3
>> 0000000000000000 ffff880234758000 ffff88021f51fd58 ffffffff890a5280
>> ffff880221d9c548 00007fd651f320d0 0000000000000514 00000000ffff9012
>> Call Trace:
>> [<ffffffff895abec3>] dump_stack+0x4e/0x68
>> [<ffffffff890a5280>] lockdep_rcu_suspicious+0xf0/0x110
>> [<ffffffff890f150e>] audit_filter_type+0x1ee/0x240
>> [<ffffffff890f1320>] ? audit_filter_user+0x340/0x340
>> [<ffffffff890ecd39>] audit_log_start+0x49/0x4a0
>> [<ffffffff8908f4f5>] ? local_clock+0x25/0x30
>> [<ffffffff890a397f>] ? lock_release_holdtime.part.30+0xf/0x190
>> [<ffffffff890f1973>] audit_log_exit+0x53/0xcf0
>> [<ffffffff8908f4f5>] ? local_clock+0x25/0x30
>> [<ffffffff890a397f>] ? lock_release_holdtime.part.30+0xf/0x190
>> [<ffffffff895b4f0c>] ? sysret_signal+0x5/0x43
>> [<ffffffff890f4a65>] __audit_syscall_exit+0x245/0x2a0
>> [<ffffffff895b4f61>] sysret_audit+0x17/0x21
>>
>
next prev parent reply other threads:[~2014-12-03 20:07 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-03 18:19 audit: rcu_read_lock() used illegally while idle Dave Jones
2014-12-03 19:29 ` Paul E. McKenney
2014-12-03 20:06 ` Andy Lutomirski [this message]
2014-12-03 20:19 ` Dave Jones
2014-12-03 20:38 ` Andy Lutomirski
2014-12-03 22:08 ` Frederic Weisbecker
2014-12-03 22:12 ` Andy Lutomirski
2014-12-03 23:49 ` Frederic Weisbecker
2014-12-03 23:18 ` [PATCH] context_tracking: Restore previous state in schedule_user Andy Lutomirski
2014-12-03 23:26 ` Andy Lutomirski
2014-12-03 23:31 ` Dave Jones
2014-12-03 23:58 ` Frederic Weisbecker
2014-12-04 0:04 ` Andy Lutomirski
2014-12-04 0:30 ` Dave Jones
2014-12-04 0:38 ` Andy Lutomirski
2014-12-04 1:13 ` Frederic Weisbecker
2014-12-03 23:37 ` [PATCH v2] " Andy Lutomirski
2014-12-03 23:50 ` Paul E. McKenney
2014-12-04 0:01 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=547F6D60.5050007@amacapital.net \
--to=luto@amacapital.net \
--cc=davej@redhat.com \
--cc=eparis@redhat.com \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=rgb@redhat.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.