public inbox for linux-kernel@vger.kernel.org
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Paul E. McKenney" <paulmck@us.ibm.com>,
	Dave Jones <davej@redhat.com>, Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v2][RFC] tracing/context-tracking: Add preempt_schedule_context() for tracing
Date: Tue, 4 Jun 2013 14:09:51 +0200
Message-ID: <20130604120949.GD14973@somewhere>
In-Reply-To: <1370050218.26799.87.camel@gandalf.local.home>

On Fri, May 31, 2013 at 09:30:18PM -0400, Steven Rostedt wrote:
> Dave Jones hit the following bug report:
> 
>  ===============================
>  [ INFO: suspicious RCU usage. ]
>  3.10.0-rc2+ #1 Not tainted
>  -------------------------------
>  include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
>  other info that might help us debug this:
>  RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
>  RCU used illegally from extended quiescent state!
>  2 locks held by cc1/63645:
>   #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
>   #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0
> 
>  CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
>  Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
>   0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
>   ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
>   0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
>  Call Trace:
>   [<ffffffff816ae383>] dump_stack+0x19/0x1b
>   [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
>   [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
>   [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
>   [<ffffffff8108dffc>] update_curr+0xec/0x240
>   [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
>   [<ffffffff816b3a71>] __schedule+0x161/0x9b0
>   [<ffffffff816b4721>] preempt_schedule+0x51/0x80
>   [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
>   [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
>   [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
>   [<ffffffff816be280>] ftrace_call+0x5/0x2f
>   [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
>   [<ffffffff816b4805>] ? schedule_user+0x5/0x70
>   [<ffffffff816b4805>] ? schedule_user+0x5/0x70
>   [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
>  ------------[ cut here ]------------
> 
> What happened was that the function tracer traced the schedule_user() code
> that tells RCU that the system is coming back from userspace, and to
> add the CPU back to the RCU monitoring.
> 
> Because the function tracer calls preempt_disable_notrace() and
> preempt_enable_notrace(), and preempt_enable_notrace() checks the
> NEED_RESCHED flag, preempt_schedule() is called if the flag is set. But
> this happens before the user_exit() function can inform the kernel that
> the CPU is no longer in user mode and needs to be accounted for by RCU.
> 
> The fix is to create a new preempt_schedule_context() that checks if
> the kernel is still in user mode and, if so, switches it to kernel mode
> before calling schedule. It also switches back to user mode on return
> from schedule if need be.
> 
> The only user of this currently is the preempt_enable_notrace(), which is
> only used by the tracing subsystem.
> 
> Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/preempt.h   |   18 +++++++++++++++++-
>  kernel/context_tracking.c |   38 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 55 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 87a03c7..f5d4723 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -33,9 +33,25 @@ do { \
>  		preempt_schedule(); \
>  } while (0)
>  
> +#ifdef CONFIG_CONTEXT_TRACKING
> +
> +void preempt_schedule_context(void);
> +
> +#define preempt_check_resched_context() \
> +do { \
> +	if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
> +		preempt_schedule_context(); \
> +} while (0)
> +#else
> +
> +#define preempt_check_resched_context() preempt_check_resched()
> +
> +#endif /* CONFIG_CONTEXT_TRACKING */
> +
>  #else /* !CONFIG_PREEMPT */
>  
>  #define preempt_check_resched()		do { } while (0)
> +#define preempt_check_resched_context()	do { } while (0)
>  
>  #endif /* CONFIG_PREEMPT */
>  
> @@ -88,7 +104,7 @@ do { \
>  do { \
>  	preempt_enable_no_resched_notrace(); \
>  	barrier(); \
> -	preempt_check_resched(); \
> +	preempt_check_resched_context(); \
>  } while (0)
>  
>  #else /* !CONFIG_PREEMPT_COUNT */
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 65349f0..15c9f2e 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -71,6 +71,44 @@ void user_enter(void)
>  	local_irq_restore(flags);
>  }
>  
> +/**
> + * preempt_schedule_context - preempt_schedule called by tracing
> + *
> + * The tracing infrastructure uses preempt_enable_notrace to prevent
> + * recursion and tracing preempt enabling caused by the tracing
> + * infrastructure itself. But as tracing can happen in areas coming
> + * from userspace or just about to enter userspace, a preempt enable
> + * can occur before user_exit() is called. This will cause the scheduler
> + * to be called when the system is still in usermode.
> + *
> + * To prevent this, the preempt_enable_notrace will use this function
> + * instead of preempt_schedule() to exit user context if needed before
> + * calling the scheduler.
> + */
> +void __sched notrace preempt_schedule_context(void)
> +{
> +	struct thread_info *ti = current_thread_info();
> +	enum ctx_state prev_ctx;
> +
> +	if (likely(ti->preempt_count || irqs_disabled()))
> +		return;

or:
        if (!preemptible())
                return;
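
For reference, a small user-space sketch of why the two checks should
agree. It assumes preemptible() expands to
(preempt_count() == 0 && !irqs_disabled()), as it does with
CONFIG_PREEMPT_COUNT. The model_* names and the struct are made up for
illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the kernel consults. */
struct model_cpu {
	int preempt_count;   /* models ti->preempt_count */
	bool irqs_disabled;  /* models irqs_disabled() */
};

/* Models preemptible(): preempt_count() == 0 && !irqs_disabled(). */
static bool model_preemptible(const struct model_cpu *c)
{
	return c->preempt_count == 0 && !c->irqs_disabled;
}

/* Models the open-coded bail-out test from the patch. */
static bool model_patch_bails_out(const struct model_cpu *c)
{
	return c->preempt_count || c->irqs_disabled;
}

/* Returns true if both forms agree for a few counts and both IRQ states. */
static bool model_checks_agree(void)
{
	for (int count = 0; count < 3; count++) {
		for (int irq = 0; irq < 2; irq++) {
			struct model_cpu c = { count, irq != 0 };

			if (model_patch_bails_out(&c) != !model_preemptible(&c))
				return false;
		}
	}
	return true;
}
```

The only thing the shorter form drops in this model is the likely()
branch hint.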

> +
> +	/*
> +	 * Need to disable preemption in case user_exit() is traced
> +	 * and the tracer calls preempt_enable_notrace() causing
> +	 * an infinite recursion.
> +	 */
> +	preempt_disable_notrace();
> +	prev_ctx = exception_enter();
> +	preempt_enable_no_resched_notrace();
> +
> +	preempt_schedule();
> +
> +	preempt_disable_notrace();
> +	exception_exit(prev_ctx);
> +	preempt_enable_notrace();
> +}
> +EXPORT_SYMBOL(preempt_schedule_context);

That's quite a tricky change but I can't think of anything better.
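
To make the ordering concrete, here is a toy user-space model of the
sequence in the patch. Everything in it (the model_* names, the two-state
enum, the flag standing in for the RCU splat) is invented for
illustration. It only mirrors the order of calls, not the real context
tracking code.

```c
#include <assert.h>
#include <stdbool.h>

enum model_ctx { MODEL_IN_KERNEL, MODEL_IN_USER };

struct model_state {
	enum model_ctx ctx;   /* what context tracking believes */
	int preempt_count;
	bool rcu_splat;       /* set if we "schedule" while marked as user */
	int schedule_calls;
};

/* Stand-in for preempt_schedule(): flags the bug from the report. */
static void model_schedule(struct model_state *s)
{
	if (s->ctx == MODEL_IN_USER)
		s->rcu_splat = true;
	s->schedule_calls++;
}

/* Stand-in for exception_enter(): leave user context, remember the old one. */
static enum model_ctx model_exception_enter(struct model_state *s)
{
	enum model_ctx prev = s->ctx;

	s->ctx = MODEL_IN_KERNEL;
	return prev;
}

/* Stand-in for exception_exit(): restore the remembered context. */
static void model_exception_exit(struct model_state *s, enum model_ctx prev)
{
	s->ctx = prev;
}

static void model_preempt_schedule_context(struct model_state *s)
{
	enum model_ctx prev;

	if (s->preempt_count)   /* not preemptible: bail out */
		return;

	s->preempt_count++;     /* preempt_disable_notrace() */
	prev = model_exception_enter(s);
	s->preempt_count--;     /* preempt_enable_no_resched_notrace() */

	model_schedule(s);

	s->preempt_count++;     /* preempt_disable_notrace() */
	model_exception_exit(s, prev);
	s->preempt_count--;     /* preempt_enable_notrace(), resched check omitted */
}
```

Calling model_schedule() directly on a state marked MODEL_IN_USER sets
rcu_splat, while going through model_preempt_schedule_context() does not
and leaves the state marked as user again afterwards.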

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>


>  /**
>   * user_exit - Inform the context tracking that the CPU is
> -- 
> 1.7.3.4
> 
> 
> 

Thread overview: 17+ messages
2013-05-30 19:59 [PATCH][RFC] tracing/context-tracking: Add preempt_schedule_context() for tracing Steven Rostedt
2013-05-31 13:43 ` Frederic Weisbecker
2013-05-31 15:18   ` Steven Rostedt
2013-05-31 15:56     ` Frederic Weisbecker
2013-05-31 16:01     ` Peter Zijlstra
2013-05-31 16:11       ` Steven Rostedt
2013-06-01  1:30       ` [PATCH v2][RFC] " Steven Rostedt
2013-06-04 12:09         ` Frederic Weisbecker [this message]
2013-06-04 12:16           ` Steven Rostedt
2013-06-04 12:29             ` Frederic Weisbecker
2013-06-04 12:27         ` Frederic Weisbecker
2013-06-04 14:16           ` Steven Rostedt
2013-06-05 11:45             ` Peter Zijlstra
2013-06-05 13:41               ` Steven Rostedt
2013-06-06  2:49                 ` Steven Rostedt
2013-06-06 10:07                   ` Peter Zijlstra
2013-06-06 13:50                     ` Steven Rostedt
