From: Frederic Weisbecker <fweisbec@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
"Paul E. McKenney" <paulmck@us.ibm.com>,
Dave Jones <davej@redhat.com>, Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v2][RFC] tracing/context-tracking: Add preempt_schedule_context() for tracing
Date: Tue, 4 Jun 2013 14:09:51 +0200
Message-ID: <20130604120949.GD14973@somewhere>
In-Reply-To: <1370050218.26799.87.camel@gandalf.local.home>

On Fri, May 31, 2013 at 09:30:18PM -0400, Steven Rostedt wrote:
> Dave Jones hit the following bug report:
>
> ===============================
> [ INFO: suspicious RCU usage. ]
> 3.10.0-rc2+ #1 Not tainted
> -------------------------------
> include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
> other info that might help us debug this:
> RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
> RCU used illegally from extended quiescent state!
> 2 locks held by cc1/63645:
> #0: (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
> #1: (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0
>
> CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
> Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
> 0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
> ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
> 0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
> Call Trace:
> [<ffffffff816ae383>] dump_stack+0x19/0x1b
> [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
> [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
> [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
> [<ffffffff8108dffc>] update_curr+0xec/0x240
> [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
> [<ffffffff816b3a71>] __schedule+0x161/0x9b0
> [<ffffffff816b4721>] preempt_schedule+0x51/0x80
> [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
> [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
> [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
> [<ffffffff816be280>] ftrace_call+0x5/0x2f
> [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
> [<ffffffff816b4805>] ? schedule_user+0x5/0x70
> [<ffffffff816b4805>] ? schedule_user+0x5/0x70
> [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
> ------------[ cut here ]------------
>
> What happened was that the function tracer traced the schedule_user() code
> that tells RCU that the system is coming back from userspace, and to
> add the CPU back to the RCU monitoring.
>
> Because the function tracer makes preempt_disable/enable_notrace() calls,
> preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
> then preempt_schedule() is called. But this is called before the user_exit()
> function can inform the kernel that the CPU is no longer in user mode and
> needs to be accounted for by RCU.
>
> The fix is to create a new preempt_schedule_context() that checks if
> the kernel is still in user mode and if so to switch it to kernel mode
> before calling schedule. It also switches back to user mode coming back
> from schedule if need be.
>
> The only user of this currently is the preempt_enable_notrace(), which is
> only used by the tracing subsystem.
>
> Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> include/linux/preempt.h | 18 +++++++++++++++++-
> kernel/context_tracking.c | 38 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 55 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 87a03c7..f5d4723 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -33,9 +33,25 @@ do { \
> preempt_schedule(); \
> } while (0)
>
> +#ifdef CONFIG_CONTEXT_TRACKING
> +
> +void preempt_schedule_context(void);
> +
> +#define preempt_check_resched_context() \
> +do { \
> + if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
> + preempt_schedule_context(); \
> +} while (0)
> +#else
> +
> +#define preempt_check_resched_context() preempt_check_resched()
> +
> +#endif /* CONFIG_CONTEXT_TRACKING */
> +
> #else /* !CONFIG_PREEMPT */
>
> #define preempt_check_resched() do { } while (0)
> +#define preempt_check_resched_context() do { } while (0)
>
> #endif /* CONFIG_PREEMPT */
>
> @@ -88,7 +104,7 @@ do { \
> do { \
> preempt_enable_no_resched_notrace(); \
> barrier(); \
> - preempt_check_resched(); \
> + preempt_check_resched_context(); \
> } while (0)
>
> #else /* !CONFIG_PREEMPT_COUNT */
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 65349f0..15c9f2e 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -71,6 +71,44 @@ void user_enter(void)
> local_irq_restore(flags);
> }
>
> +/**
> + * preempt_schedule_context - preempt_schedule called by tracing
> + *
> + * The tracing infrastructure uses preempt_enable_notrace to prevent
> + * recursion and tracing preempt enabling caused by the tracing
> + * infrastructure itself. But as tracing can happen in areas coming
> + * from userspace or just about to enter userspace, a preempt enable
> + * can occur before user_exit() is called. This will cause the scheduler
> + * to be called when the system is still in usermode.
> + *
> + * To prevent this, the preempt_enable_notrace will use this function
> + * instead of preempt_schedule() to exit user context if needed before
> + * calling the scheduler.
> + */
> +void __sched notrace preempt_schedule_context(void)
> +{
> + struct thread_info *ti = current_thread_info();
> + enum ctx_state prev_ctx;
> +
> + if (likely(ti->preempt_count || irqs_disabled()))
> + return;

or:

	if (!preemptible())
		return;

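
For reference, a small userspace sketch of why the two forms should be interchangeable. It assumes preemptible() expands to (preempt_count() == 0 && !irqs_disabled()), as in include/linux/preempt.h with CONFIG_PREEMPT_COUNT. The model_* names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-thread preempt count and IRQ state. */
static int model_preempt_count;
static bool model_irqs_off;

/* Assumed shape of preemptible(): count is zero and IRQs are enabled. */
static bool model_preemptible(void)
{
	return model_preempt_count == 0 && !model_irqs_off;
}

/* The open-coded bail-out condition from the patch. */
static bool model_open_coded_bail_out(void)
{
	return model_preempt_count || model_irqs_off;
}

/* Walk a few count/IRQ combinations and count any disagreement. */
static int model_count_mismatches(void)
{
	int mismatches = 0;

	for (int count = 0; count < 3; count++) {
		for (int irqs = 0; irqs < 2; irqs++) {
			model_preempt_count = count;
			model_irqs_off = (irqs != 0);
			if (model_open_coded_bail_out() != !model_preemptible())
				mismatches++;
		}
	}
	return mismatches;
}
```

This is just De Morgan on the assumed definition, so the only difference would be readability.
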
> +
> + /*
> + * Need to disable preemption in case user_exit() is traced
> + * and the tracer calls preempt_enable_notrace() causing
> + * an infinite recursion.
> + */
> + preempt_disable_notrace();
> + prev_ctx = exception_enter();
> + preempt_enable_no_resched_notrace();
> +
> + preempt_schedule();
> +
> + preempt_disable_notrace();
> + exception_exit(prev_ctx);
> + preempt_enable_notrace();
> +}
> +EXPORT_SYMBOL(preempt_schedule_context);

That's quite a tricky change but I can't think of anything better.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>

> /**
> * user_exit - Inform the context tracking that the CPU is
> --
> 1.7.3.4
>
>
>
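
A rough userspace model of the sequence the changelog describes, as a sketch only. It assumes exception_enter() returns the previous context and switches to kernel context, and that exception_exit() restores the saved context. Every model_* name is hypothetical, and the "violation" counter merely stands in for the lockdep RCU splat above.

```c
#include <assert.h>
#include <stdbool.h>

enum model_ctx { MODEL_IN_KERNEL, MODEL_IN_USER };

static enum model_ctx model_state;
static bool model_need_resched;
static int model_rcu_violations;

/* Scheduling while still marked "in user" is what RCU complains about. */
static void model_schedule(void)
{
	if (model_state == MODEL_IN_USER)
		model_rcu_violations++;
	model_need_resched = false;
}

/* Assumed behaviour: remember the old context, switch to kernel. */
static enum model_ctx model_exception_enter(void)
{
	enum model_ctx prev = model_state;

	model_state = MODEL_IN_KERNEL;
	return prev;
}

/* Assumed behaviour: put back whatever context was saved. */
static void model_exception_exit(enum model_ctx prev)
{
	model_state = prev;
}

/* Old path: preempt_check_resched() goes straight to the scheduler. */
static void model_check_resched(void)
{
	if (model_need_resched)
		model_schedule();
}

/* Patched path: leave user context around the schedule call. */
static void model_check_resched_context(void)
{
	enum model_ctx prev;

	if (!model_need_resched)
		return;
	prev = model_exception_enter();
	model_schedule();
	model_exception_exit(prev);
}

/* Tracer fires before user_exit() with NEED_RESCHED set. */
static int model_run(bool patched)
{
	model_state = MODEL_IN_USER;
	model_need_resched = true;
	model_rcu_violations = 0;
	if (patched)
		model_check_resched_context();
	else
		model_check_resched();
	return model_rcu_violations;
}
```

In this model, model_run(false) records one violation and model_run(true) records none, with the context left as MODEL_IN_USER afterwards so the real user_exit() still has work to do. The model ignores the preempt_disable_notrace() guards, which only matter when the context switch itself is traced.
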