Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
       [not found]   ` <1343663105.3847.7.camel@fedora>
@ 2012-07-31 12:17     ` Fengguang Wu
  2012-07-31 12:37       ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2012-07-31 12:17 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, LKML, Steven Rostedt, Avi Kivity,
	kvm@vger.kernel.org

[CC kvm developers]

On Mon, Jul 30, 2012 at 11:45:05AM -0400, Steven Rostedt wrote:
> On Tue, 2012-07-24 at 17:07 +0800, Fengguang Wu wrote:
> > On Tue, Jul 24, 2012 at 05:03:30PM +0800, Fengguang Wu wrote:
> 
> > And this warning shows up in one of the dozens of boots, for the same
> > kconfig.
> > 
> > [    2.320434] Testing tracer wakeup: PASSED
> > [    2.840288] Testing tracer wakeup_rt: .. no entries found ..FAILED!
> > [    3.280861] ------------[ cut here ]------------
> > [    3.281967] WARNING: at /c/kernel-tests/src/linux/kernel/trace/trace.c:834 register_tracer+0x1b0/0x270()
> > [    3.284162] Hardware name: Bochs
> > [    3.284933] Modules linked in:
> > [    3.285695] Pid: 1, comm: swapper/0 Not tainted 3.5.0+ #1371
> > [    3.287032] Call Trace:
> > [    3.287626]  [<41035c32>] warn_slowpath_common+0x72/0xa0
> > [    3.288938]  [<410e7dd0>] ? register_tracer+0x1b0/0x270
> > [    3.290280]  [<410e7dd0>] ? register_tracer+0x1b0/0x270
> > [    3.291516]  [<41035c82>] warn_slowpath_null+0x22/0x30
> > [    3.292723]  [<410e7dd0>] register_tracer+0x1b0/0x270
> > [    3.293921]  [<41434c7a>] ? init_irqsoff_tracer+0x11/0x11
> > [    3.295269]  [<41434c95>] init_wakeup_tracer+0x1b/0x1d
> > [    3.296464]  [<41001112>] do_one_initcall+0x112/0x160
> > [    3.297639]  [<4141fadd>] kernel_init+0xf7/0x18e
> > [    3.298724]  [<4141f455>] ? do_early_param+0x7a/0x7a
> > [    3.299879]  [<4141f9e6>] ? start_kernel+0x375/0x375
> > [    3.301093]  [<412b15c2>] kernel_thread_helper+0x6/0x10
> > [    3.302352] ---[ end trace 57f7151f6a5def05 ]---
> > 
> 
> The comment above this test shows:
> 
> 	 * Yes this is slightly racy. It is possible that for some
> 	 * strange reason that the RT thread we created, did not
> 	 * call schedule for 100ms after doing the completion,
> 	 * and we do a wakeup on a task that already is awake.
> 	 * But that is extremely unlikely, and the worst thing that
> 	 * happens in such a case, is that we disable tracing.
> 	 * Honestly, if this race does happen something is horrible
> 	 * wrong with the system.
> 
> I guess the question now is, why didn't the RT test wake up?
> 
> Oh wait! You did this on a virt machine. This test isn't designed for
> virt machines because the thread could have woken on another vcpu, but
> due to scheduling of the host system, it didn't get to run for 100ms,
> thus the test will fail because it never recorded the wakeup of the RT
> task.
> 
> In other words, the test is bogus on virt boxes :-/

It's good to quickly get to the root cause :) Can we possibly detect
whether we are in a virtual machine and hence skip this particular
test case?

Thanks,
Fengguang

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 12:17     ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
@ 2012-07-31 12:37       ` Avi Kivity
  2012-07-31 12:43         ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2012-07-31 12:37 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Steven Rostedt, Paul E. McKenney, LKML, Steven Rostedt,
	kvm@vger.kernel.org

On 07/31/2012 03:17 PM, Fengguang Wu wrote:
> 
> It's good to quickly get to the root cause :) Can we possibly detect
> whether we are in a virtual machine and hence skip this particular
> test case?

cpu_has(&boot_cpu_data, X86_FEATURE_HYPERVISOR)
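
For readers outside the kernel: X86_FEATURE_HYPERVISOR corresponds to
CPUID leaf 1, ECX bit 31, which hypervisors set for their guests. Below is
a minimal userspace sketch of the same check, assuming GCC or Clang and
their <cpuid.h> header. The helper names hypervisor_bit_set() and
running_under_hypervisor() are made up for illustration and are not kernel
functions.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang wrapper around the CPUID instruction */
#endif

/* CPUID.1:ECX bit 31 is the "hypervisor present" bit. */
#define HYPERVISOR_ECX_BIT 31u

/* Hypothetical helper: test the hypervisor bit in a raw ECX value. */
bool hypervisor_bit_set(uint32_t ecx)
{
	return (ecx >> HYPERVISOR_ECX_BIT) & 1u;
}

/*
 * Hypothetical helper: query CPUID leaf 1 on x86.
 * On any other architecture there is nothing to query, so this
 * simply reports false.
 */
bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;	/* leaf 1 not supported */
	return hypervisor_bit_set(ecx);
#else
	return false;
#endif
}
```

The non-x86 branch can only answer "no", so a check of this kind is
inherently architecture-specific.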

-- 
error compiling committee.c: too many arguments to function

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 12:37       ` Avi Kivity
@ 2012-07-31 12:43         ` Steven Rostedt
  2012-07-31 12:50           ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2012-07-31 12:43 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Fengguang Wu, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Tue, 2012-07-31 at 15:37 +0300, Avi Kivity wrote:
> On 07/31/2012 03:17 PM, Fengguang Wu wrote:
> > 
> > It's good to quickly get to the root cause :) Can we possibly detect
> > whether we are in a virtual machine and hence skip this particular
> > test case?
> 
> cpu_has(&boot_cpu_data, X86_FEATURE_HYPERVISOR)
> 

Yeah, but then it is still broken on non-x86 code (the test lives in
core kernel).

As it is just testing the events for wakeup, I could probably just add a
completion and force the other thread to just wait for it. I'll write up
a patch. But it won't make it in until 3.7.
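
A userspace analogy of that idea, sketched with POSIX threads rather than
kernel primitives: the waker blocks on a completion that the woken thread
signals, so no fixed sleep has to guess how long the wakeup takes. The
struct and function names mirror the kernel's completion API but are
reimplemented here from scratch; struct handshake, sleeper() and
run_handshake() are hypothetical names, and this is not the code from the
eventual patch.

```c
#include <pthread.h>

/* Userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;	/* number of unconsumed complete() calls */
};

void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;		/* consume one completion */
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical test fixture: one completion per direction. */
struct handshake {
	struct completion go;	/* waker -> sleeper */
	struct completion woke;	/* sleeper -> waker */
	int wakeups_seen;
};

static void *sleeper(void *data)
{
	struct handshake *h = data;

	wait_for_completion(&h->go);	/* stands in for sleeping until woken */
	h->wakeups_seen++;
	complete(&h->woke);		/* tell the waker we really ran */
	return NULL;
}

/* Returns the number of wakeups the sleeper observed, or -1 on error. */
int run_handshake(void)
{
	struct handshake h = { .wakeups_seen = 0 };
	pthread_t tid;

	init_completion(&h.go);
	init_completion(&h.woke);
	if (pthread_create(&tid, NULL, sleeper, &h) != 0)
		return -1;

	complete(&h.go);		/* stands in for wake_up_process() */
	wait_for_completion(&h.woke);	/* blocks until the sleeper has run */
	pthread_join(tid, NULL);
	return h.wakeups_seen;
}
```

However long the host delays the sleeper thread, run_handshake() does not
return until the wakeup has actually been observed. A fixed sleep cannot
guarantee that on a virtual machine.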

-- Steve

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 12:43         ` Steven Rostedt
@ 2012-07-31 12:50           ` Avi Kivity
  2012-07-31 13:13             ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2012-07-31 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Fengguang Wu, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> On Tue, 2012-07-31 at 15:37 +0300, Avi Kivity wrote:
>> On 07/31/2012 03:17 PM, Fengguang Wu wrote:
>> > 
>> > It's good to quickly get to the root cause :) Can we possibly detect
>> > whether we are in a virtual machine and hence skip this particular
>> > test case?
>> 
>> cpu_has(&boot_cpu_data, X86_FEATURE_HYPERVISOR)
>> 
> 
> Yeah, but then it is still broken on non-x86 code (the test lives in
> core kernel).
> 
> As it is just testing the events for wakeup, I could probably just add a
> completion and force the other thread to just wait for it. I'll write up
> a patch. But it won't make it in until 3.7.

That would be better.  A hypervisor might be real-time capable (with
some effort kvm can do this), so we don't want to turn off real time
features just based on that.


-- 
error compiling committee.c: too many arguments to function

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 12:50           ` Avi Kivity
@ 2012-07-31 13:13             ` Steven Rostedt
  2012-07-31 23:43               ` Fengguang Wu
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2012-07-31 13:13 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Fengguang Wu, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> On 07/31/2012 03:43 PM, Steven Rostedt wrote:

> That would be better.  A hypervisor might be real-time capable (with
> some effort kvm can do this), so we don't want to turn off real time
> features just based on that.

It would only turn off if you enable selftests and the timing failed. If
the kvm had real time features, this most likely would fail anyway. But
that said, here's a patch that should solve this:

-- Steve


diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 1003a4d..2c00a69 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -1041,6 +1041,8 @@ static int trace_wakeup_test_thread(void *data)
 	set_current_state(TASK_INTERRUPTIBLE);
 	schedule();
 
+	complete(x);
+
 	/* we are awake, now wait to disappear */
 	while (!kthread_should_stop()) {
 		/*
@@ -1084,24 +1086,21 @@ trace_selftest_startup_wakeup(struct tracer *trace, struct trace_array *tr)
 	/* reset the max latency */
 	tracing_max_latency = 0;
 
-	/* sleep to let the RT thread sleep too */
-	msleep(100);
+	while (p->on_rq) {
+		/*
+		 * Sleep to make sure the RT thread is asleep too.
+		 * On virtual machines we can't rely on timings,
+		 * but we want to make sure this test still works.
+		 */
+		msleep(100);
+	}
 
-	/*
-	 * Yes this is slightly racy. It is possible that for some
-	 * strange reason that the RT thread we created, did not
-	 * call schedule for 100ms after doing the completion,
-	 * and we do a wakeup on a task that already is awake.
-	 * But that is extremely unlikely, and the worst thing that
-	 * happens in such a case, is that we disable tracing.
-	 * Honestly, if this race does happen something is horrible
-	 * wrong with the system.
-	 */
+	init_completion(&isrt);
 
 	wake_up_process(p);
 
-	/* give a little time to let the thread wake up */
-	msleep(100);
+	/* Wait for the task to wake up */
+	wait_for_completion(&isrt);
 
 	/* stop the tracing. */
 	tracing_stop();

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 13:13             ` Steven Rostedt
@ 2012-07-31 23:43               ` Fengguang Wu
  2012-07-31 23:51                 ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2012-07-31 23:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Avi Kivity, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> 
> > That would be better.  A hypervisor might be real-time capable (with
> > some effort kvm can do this), so we don't want to turn off real time
> > features just based on that.
> 
> It would only turn off if you enable selftests and the timing failed. If
> the kvm had real time features, this most likely would fail anyway. But
> that said, here's a patch that should solve this:

No luck.. it still fails:

[    2.360068] Testing tracer irqsoff: [    2.854529] 
[    2.854828] ===============================
[    2.855560] [ INFO: suspicious RCU usage. ]
[    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
[    2.857182] -------------------------------
[    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
[    2.859450] 
[    2.859450] other info that might help us debug this:
[    2.859450] 
[    2.860874] 
[    2.860874] RCU used illegally from idle CPU!
[    2.860874] rcu_scheduler_active = 1, debug_locks = 0
[    2.862754] RCU used illegally from extended quiescent state!
[    2.863741] 2 locks held by swapper/0/0:

[    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
[    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430

[    2.864423] stack backtrace:
[    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
[    2.864423] Call Trace:
[    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
[    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
[    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
[    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
[    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
[    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
[    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
[    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
[    2.864423]  [<81013313>] default_idle+0x593/0xc30
[    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
[    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
[    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
[    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a

Thanks,
Fengguang

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 23:43               ` Fengguang Wu
@ 2012-07-31 23:51                 ` Steven Rostedt
  2012-07-31 23:57                   ` Paul E. McKenney
  2012-07-31 23:57                   ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
  0 siblings, 2 replies; 29+ messages in thread
From: Steven Rostedt @ 2012-07-31 23:51 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Avi Kivity, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > 
> > > That would be better.  A hypervisor might be real-time capable (with
> > > some effort kvm can do this), so we don't want to turn off real time
> > > features just based on that.
> > 
> > It would only turn off if you enable selftests and the timing failed. If
> > the kvm had real time features, this most likely would fail anyway. But
> > that said, here's a patch that should solve this:
> 
> No luck.. it still fails:

I bet you it didn't ;-)

> 
> [    2.360068] Testing tracer irqsoff: [    2.854529] 
> [    2.854828] ===============================
> [    2.855560] [ INFO: suspicious RCU usage. ]
> [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> [    2.857182] -------------------------------
> [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> [    2.859450] 
> [    2.859450] other info that might help us debug this:
> [    2.859450] 
> [    2.860874] 
> [    2.860874] RCU used illegally from idle CPU!
> [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> [    2.862754] RCU used illegally from extended quiescent state!
> [    2.863741] 2 locks held by swapper/0/0:
> 
> [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> 
> [    2.864423] stack backtrace:
> [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> [    2.864423] Call Trace:
> [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a

What were the next lines? I bet you it was "PASSED". Which means it did
not fail. This is the second bug you found that has to do with RCU being
called in 'idle'. The one that Paul posted a patch for.

-- Steve

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 23:51                 ` Steven Rostedt
@ 2012-07-31 23:57                   ` Paul E. McKenney
  2012-08-01  0:09                     ` Steven Rostedt
  2012-07-31 23:57                   ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
  1 sibling, 1 reply; 29+ messages in thread
From: Paul E. McKenney @ 2012-07-31 23:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Fengguang Wu, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org

On Tue, Jul 31, 2012 at 07:51:39PM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> > On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > > 
> > > > That would be better.  A hypervisor might be real-time capable (with
> > > > some effort kvm can do this), so we don't want to turn off real time
> > > > features just based on that.
> > > 
> > > It would only turn off if you enable selftests and the timing failed. If
> > > the kvm had real time features, this most likely would fail anyway. But
> > > that said, here's a patch that should solve this:
> > 
> > No luck.. it still fails:
> 
> I bet you it didn't ;-)
> 
> > 
> > [    2.360068] Testing tracer irqsoff: [    2.854529] 
> > [    2.854828] ===============================
> > [    2.855560] [ INFO: suspicious RCU usage. ]
> > [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> > [    2.857182] -------------------------------
> > [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> > [    2.859450] 
> > [    2.859450] other info that might help us debug this:
> > [    2.859450] 
> > [    2.860874] 
> > [    2.860874] RCU used illegally from idle CPU!
> > [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> > [    2.862754] RCU used illegally from extended quiescent state!
> > [    2.863741] 2 locks held by swapper/0/0:
> > 
> > [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> > [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> > 
> > [    2.864423] stack backtrace:
> > [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> > [    2.864423] Call Trace:
> > [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> > [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> > [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> > [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> > [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> > [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> > [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> > [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> > [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a
> 
> What were the next lines? I bet you it was "PASSED". Which means it did
> not fail. This is the second bug you found that has to do with RCU being
> called in 'idle'. The one that Paul posted a patch for.

Though it needs another patch to actually use it in the right place...

							Thanx, Paul


* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 23:51                 ` Steven Rostedt
  2012-07-31 23:57                   ` Paul E. McKenney
@ 2012-07-31 23:57                   ` Fengguang Wu
  2012-08-07 13:29                     ` Steven Rostedt
  1 sibling, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2012-07-31 23:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Avi Kivity, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Tue, Jul 31, 2012 at 07:51:39PM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> > On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > > 
> > > > That would be better.  A hypervisor might be real-time capable (with
> > > > some effort kvm can do this), so we don't want to turn off real time
> > > > features just based on that.
> > > 
> > > It would only turn off if you enable selftests and the timing failed. If
> > > the kvm had real time features, this most likely would fail anyway. But
> > > that said, here's a patch that should solve this:
> > 
> > No luck.. it still fails:
> 
> I bet you it didn't ;-)
> 
> > 
> > [    2.360068] Testing tracer irqsoff: [    2.854529] 
> > [    2.854828] ===============================
> > [    2.855560] [ INFO: suspicious RCU usage. ]
> > [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> > [    2.857182] -------------------------------
> > [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> > [    2.859450] 
> > [    2.859450] other info that might help us debug this:
> > [    2.859450] 
> > [    2.860874] 
> > [    2.860874] RCU used illegally from idle CPU!
> > [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> > [    2.862754] RCU used illegally from extended quiescent state!
> > [    2.863741] 2 locks held by swapper/0/0:
> > 
> > [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> > [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> > 
> > [    2.864423] stack backtrace:
> > [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> > [    2.864423] Call Trace:
> > [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> > [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> > [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> > [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> > [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> > [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> > [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> > [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> > [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a
> 
> What were the next lines? I bet you it was "PASSED". Which means it did
> not fail. This is the second bug you found that has to do with RCU being
> called in 'idle'. The one that Paul posted a patch for.

Yeah, PASSED!

[    2.898070]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
[    2.898070]  [<81013313>] ? default_idle+0x593/0xc30
[    2.898070]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
[    2.898070]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
[    2.898070]  [<81013313>] default_idle+0x593/0xc30
[    2.898070]  [<8101692d>] cpu_idle+0x2dd/0x390
[    2.898070]  [<817fbe97>] start_secondary+0x44b/0x460
[    3.150115] PASSED
[    3.390079] Testing tracer function_graph: PASSED

I'll test Paul's patch on top of yours right away.

Thanks,
Fengguang

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 23:57                   ` Paul E. McKenney
@ 2012-08-01  0:09                     ` Steven Rostedt
  2012-08-01  0:18                       ` Paul E. McKenney
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2012-08-01  0:09 UTC (permalink / raw)
  To: paulmck; +Cc: Fengguang Wu, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org

On Tue, 2012-07-31 at 16:57 -0700, Paul E. McKenney wrote:

> > What were the next lines? I bet you it was "PASSED". Which means it did
> > not fail. This is the second bug you found that has to do with RCU being
> > called in 'idle'. The one that Paul posted a patch for.
> 
> Though it needs another patch to actually use it in the right place...

Right. Something like this:

-- Steve

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 5638104..d915638 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -631,7 +631,12 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
 
 	memcpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
 	max_data->pid = tsk->pid;
-	max_data->uid = task_uid(tsk);
+	/*
+	 * task_uid() calls rcu_read_lock, but this can be called
+	 * outside of RCU state monitoring (irq going back to idle).
+	 */ 
+	RCU_NONIDLE(max_data->uid = task_uid(tsk));
+
 	max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
 	max_data->policy = tsk->policy;
 	max_data->rt_priority = tsk->rt_priority;



* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-08-01  0:09                     ` Steven Rostedt
@ 2012-08-01  0:18                       ` Paul E. McKenney
  2012-08-01  0:43                         ` pci_get_subsys: GFP_KERNEL allocations with IRQs disabled Fengguang Wu
  0 siblings, 1 reply; 29+ messages in thread
From: Paul E. McKenney @ 2012-08-01  0:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Fengguang Wu, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org

On Tue, Jul 31, 2012 at 08:09:38PM -0400, Steven Rostedt wrote:
> On Tue, 2012-07-31 at 16:57 -0700, Paul E. McKenney wrote:
> 
> > > What were the next lines? I bet you it was "PASSED". Which means it did
> > > not fail. This is the second bug you found that has to do with RCU being
> > > called in 'idle'. The one that Paul posted a patch for.
> > 
> > Though it needs another patch to actually use it in the right place...
> 
> Right. Something like this:

Looks good to me!

							Thanx, Paul

* pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-01  0:18                       ` Paul E. McKenney
@ 2012-08-01  0:43                         ` Fengguang Wu
  2012-08-22  2:50                           ` Fengguang Wu
  0 siblings, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2012-08-01  0:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org, Kenji Kaneshige, Yinghai Lu, Bjorn Helgaas,
	linux-pci

On Tue, Jul 31, 2012 at 05:18:11PM -0700, Paul E. McKenney wrote:
> On Tue, Jul 31, 2012 at 08:09:38PM -0400, Steven Rostedt wrote:
> > On Tue, 2012-07-31 at 16:57 -0700, Paul E. McKenney wrote:
> > 
> > > > What were the next lines? I bet you it was "PASSED". Which means it did
> > > > not fail. This is the second bug you found that has to do with RCU being
> > > > called in 'idle'. The one that Paul posted a patch for.
> > > 
> > > Though it needs another patch to actually use it in the right place...
> > 
> > Right. Something like this:
> 
> Looks good to me!
 
With all 3 patches applied, the warning on __update_max_tr finally
goes away. Thanks!

However, this unrelated warning still reliably remains (the same config).
I think it's pci_get_subsys() that triggered this assert:

        /*
         * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
         */
        if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
                return;

[   91.282131] machine restart
[   91.283895] ------------[ cut here ]------------
[   91.284731] WARNING: at /c/wfg/linux/kernel/lockdep.c:2739 lockdep_trace_alloc+0x1fb/0x210()
[   91.286132] Modules linked in:
[   91.286703] Pid: 697, comm: reboot Not tainted 3.5.0-00024-g01ff5db-dirty #4
[   91.287859] Call Trace:
[   91.288289]  [<81050148>] warn_slowpath_common+0xb8/0x100
[   91.289338]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
[   91.290264]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
[   91.291161]  [<810501ce>] warn_slowpath_null+0x3e/0x50
[   91.292042]  [<8110acdb>] lockdep_trace_alloc+0x1fb/0x210
[   91.292934]  [<81228e25>] kmem_cache_alloc_trace+0x55/0x600
[   91.292934]  [<813025ca>] ? kobject_put+0x9a/0x160
[   91.292934]  [<814e95e0>] ? klist_iter_exit+0x30/0x50
[   91.292934]  [<81405881>] ? bus_find_device+0xf1/0x120
[   91.292934]  [<81361a3c>] ? pci_get_subsys+0x11c/0x1b0
[   91.292934]  [<81361a3c>] pci_get_subsys+0x11c/0x1b0
[   91.292934]  [<81361afe>] pci_get_device+0x2e/0x40
[   91.292934]  [<81033e25>] mach_reboot_fixups+0xa5/0xd0
[   91.292934]  [<81027611>] native_machine_emergency_restart+0x1f1/0x590
[   91.292934]  [<814f2e00>] ? printk+0x4b/0x5b
[   91.292934]  [<810269ef>] native_machine_restart+0x6f/0x80
[   91.292934]  [<810271cc>] machine_restart+0x1c/0x30
[   91.292934]  [<810886e0>] kernel_restart+0x70/0xc0
[   91.292934]  [<81088a85>] sys_reboot+0x325/0x380
[   91.292934]  [<811f796c>] ? handle_pte_fault+0xdc/0x1740
[   91.292934]  [<811f93e7>] ? handle_mm_fault+0x417/0x4a0
[   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
[   91.292934]  [<810b33e7>] ? up_read+0x37/0x70
[   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
[   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
[   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
[   91.292934]  [<810b0270>] ? update_rmtp+0xe0/0xe0
[   91.292934]  [<8150376e>] ? restore_all+0xf/0xf
[   91.292934]  [<8103d880>] ? vmalloc_sync_all+0x320/0x320
[   91.292934]  [<81109fca>] ? trace_hardirqs_on_caller+0x28a/0x380
[   91.292934]  [<81311594>] ? trace_hardirqs_on_thunk+0xc/0x10
[   91.292934]  [<81503735>] syscall_call+0x7/0xb

Thanks,
Fengguang

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-07-31 23:57                   ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
@ 2012-08-07 13:29                     ` Steven Rostedt
  2012-08-07 13:32                       ` Fengguang Wu
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2012-08-07 13:29 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Avi Kivity, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Wed, 2012-08-01 at 07:57 +0800, Fengguang Wu wrote:
> > 
> > What were the next lines? I bet you it was "PASSED". Which means it did
> > not fail. This is the second bug you found that has to do with RCU being
> > called in 'idle'. The one that Paul posted a patch for.
> 
> Yeah, PASSED!

I have this patch queued for 3.7. Can I add your 'Tested-by' for it?

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Testing tracer wakeup_rt: .. no entries found ..FAILED!
  2012-08-07 13:29                     ` Steven Rostedt
@ 2012-08-07 13:32                       ` Fengguang Wu
  0 siblings, 0 replies; 29+ messages in thread
From: Fengguang Wu @ 2012-08-07 13:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Avi Kivity, Steven Rostedt, Paul E. McKenney, LKML,
	kvm@vger.kernel.org

On Tue, Aug 07, 2012 at 09:29:33AM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:57 +0800, Fengguang Wu wrote:
> > > 
> > > What were the next lines? I bet you it was "PASSED". Which means it did
> > > not fail. This is the second bug you found that has to do with RCU being
> > > called in 'idle'. The one that Paul posted a patch for.
> > 
> > Yeah, PASSED!
> 
> I have this patch queued for 3.7. Can I add your 'Tested-by' for it?

Yes, please. Thanks!

Thanks,
Fengguang

> > [    2.898070]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.898070]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.898070]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.898070]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.898070]  [<81013313>] default_idle+0x593/0xc30
> > [    2.898070]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.898070]  [<817fbe97>] start_secondary+0x44b/0x460
> > [    3.150115] PASSED
> > [    3.390079] Testing tracer function_graph: PASSED
> > 
> > I'll test Paul's patch on top of yours right away.
> > 
> > Thanks,
> > Fengguang
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-01  0:43                         ` pci_get_subsys: GFP_KERNEL allocations with IRQs disabled Fengguang Wu
@ 2012-08-22  2:50                           ` Fengguang Wu
  2012-08-22  7:49                             ` Feng Tang
  0 siblings, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2012-08-22  2:50 UTC (permalink / raw)
  To: Tang, Feng
  Cc: Paul E. McKenney, Steven Rostedt, Avi Kivity, Steven Rostedt,
	LKML, kvm@vger.kernel.org, Kenji Kaneshige, Yinghai Lu,
	Bjorn Helgaas, linux-pci

Feng,

> I think it's pci_get_subsys() triggered this assert:
> 
>         /*
>          * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
>          */
>         if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
>                 return;

It's bisected down to this commit:

commit 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
Author:     Feng Tang <feng.tang@intel.com>
AuthorDate: Wed May 30 23:15:41 2012 +0800
Commit:     Ingo Molnar <mingo@kernel.org>
CommitDate: Wed Jun 6 12:03:23 2012 +0200

    x86/reboot: Fix a warning message triggered by stop_other_cpus()

Thanks,
Fengguang

> [   91.282131] machine restart
> [   91.283895] ------------[ cut here ]------------
> [   91.284731] WARNING: at /c/wfg/linux/kernel/lockdep.c:2739 lockdep_trace_alloc+0x1fb/0x210()
> [   91.286132] Modules linked in:
> [   91.286703] Pid: 697, comm: reboot Not tainted 3.5.0-00024-g01ff5db-dirty #4
> [   91.287859] Call Trace:
> [   91.288289]  [<81050148>] warn_slowpath_common+0xb8/0x100
> [   91.289338]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> [   91.290264]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> [   91.291161]  [<810501ce>] warn_slowpath_null+0x3e/0x50
> [   91.292042]  [<8110acdb>] lockdep_trace_alloc+0x1fb/0x210
> [   91.292934]  [<81228e25>] kmem_cache_alloc_trace+0x55/0x600
> [   91.292934]  [<813025ca>] ? kobject_put+0x9a/0x160
> [   91.292934]  [<814e95e0>] ? klist_iter_exit+0x30/0x50
> [   91.292934]  [<81405881>] ? bus_find_device+0xf1/0x120
> [   91.292934]  [<81361a3c>] ? pci_get_subsys+0x11c/0x1b0
> [   91.292934]  [<81361a3c>] pci_get_subsys+0x11c/0x1b0
> [   91.292934]  [<81361afe>] pci_get_device+0x2e/0x40
> [   91.292934]  [<81033e25>] mach_reboot_fixups+0xa5/0xd0
> [   91.292934]  [<81027611>] native_machine_emergency_restart+0x1f1/0x590
> [   91.292934]  [<814f2e00>] ? printk+0x4b/0x5b
> [   91.292934]  [<810269ef>] native_machine_restart+0x6f/0x80
> [   91.292934]  [<810271cc>] machine_restart+0x1c/0x30
> [   91.292934]  [<810886e0>] kernel_restart+0x70/0xc0
> [   91.292934]  [<81088a85>] sys_reboot+0x325/0x380
> [   91.292934]  [<811f796c>] ? handle_pte_fault+0xdc/0x1740
> [   91.292934]  [<811f93e7>] ? handle_mm_fault+0x417/0x4a0
> [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> [   91.292934]  [<810b33e7>] ? up_read+0x37/0x70
> [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> [   91.292934]  [<810b0270>] ? update_rmtp+0xe0/0xe0
> [   91.292934]  [<8150376e>] ? restore_all+0xf/0xf
> [   91.292934]  [<8103d880>] ? vmalloc_sync_all+0x320/0x320
> [   91.292934]  [<81109fca>] ? trace_hardirqs_on_caller+0x28a/0x380
> [   91.292934]  [<81311594>] ? trace_hardirqs_on_thunk+0xc/0x10
> [   91.292934]  [<81503735>] syscall_call+0x7/0xb
> 
> Thanks,
> Fengguang

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-22  2:50                           ` Fengguang Wu
@ 2012-08-22  7:49                             ` Feng Tang
  2012-08-22 13:02                               ` Fengguang Wu
  2012-08-22 18:02                               ` Bjorn Helgaas
  0 siblings, 2 replies; 29+ messages in thread
From: Feng Tang @ 2012-08-22  7:49 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Paul E. McKenney, Steven Rostedt, Avi Kivity, Steven Rostedt,
	LKML, kvm@vger.kernel.org, Kenji Kaneshige, Yinghai Lu,
	Bjorn Helgaas, linux-pci

Hi Fengguang,


On Wed, 22 Aug 2012 10:50:08 +0800
Fengguang Wu <fengguang.wu@intel.com> wrote:

> Feng,
> 
> > I think it's pci_get_subsys() triggered this assert:
> > 
> >         /*
> >          * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
> >          */
> >         if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
> >                 return;
> 
> It's bisected down to this commit:
> 
> commit 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
> Author:     Feng Tang <feng.tang@intel.com>
> AuthorDate: Wed May 30 23:15:41 2012 +0800
> Commit:     Ingo Molnar <mingo@kernel.org>
> CommitDate: Wed Jun 6 12:03:23 2012 +0200
> 
>     x86/reboot: Fix a warning message triggered by stop_other_cpus()
> 
> Thanks,
> Fengguang

Thanks for the bisection.

Reverting my commit would be one solution, but can we simply make the
pci_device_id a local on-stack variable instead of using a sleepable kmalloc
for it? The allocation sounds fragile when pci_get_subsys() gets called in a
late system reboot stage.

Thanks,
Feng

------------
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 993d4a0..e5ccede 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -246,7 +246,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 			       struct pci_dev *from)
 {
 	struct pci_dev *pdev;
-	struct pci_device_id *id;
+	struct pci_device_id id;
 
 	/*
 	 * pci_find_subsys() can be called on the ide_setup() path,
@@ -257,17 +257,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 	if (unlikely(no_pci_devices()))
 		return NULL;
 
-	id = kzalloc(sizeof(*id), GFP_KERNEL);
-	if (!id)
-		return NULL;
-	id->vendor = vendor;
-	id->device = device;
-	id->subvendor = ss_vendor;
-	id->subdevice = ss_device;
-
-	pdev = pci_get_dev_by_id(id, from);
-	kfree(id);
+	id.vendor = vendor;
+	id.device = device;
+	id.subvendor = ss_vendor;
+	id.subdevice = ss_device;
 
+	pdev = pci_get_dev_by_id(&id, from);
 	return pdev;
 }
 


> 
> > [   91.282131] machine restart
> > [   91.283895] ------------[ cut here ]------------
> > [   91.284731] WARNING: at /c/wfg/linux/kernel/lockdep.c:2739 lockdep_trace_alloc+0x1fb/0x210()
> > [   91.286132] Modules linked in:
> > [   91.286703] Pid: 697, comm: reboot Not tainted 3.5.0-00024-g01ff5db-dirty #4
> > [   91.287859] Call Trace:
> > [   91.288289]  [<81050148>] warn_slowpath_common+0xb8/0x100
> > [   91.289338]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> > [   91.290264]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> > [   91.291161]  [<810501ce>] warn_slowpath_null+0x3e/0x50
> > [   91.292042]  [<8110acdb>] lockdep_trace_alloc+0x1fb/0x210
> > [   91.292934]  [<81228e25>] kmem_cache_alloc_trace+0x55/0x600
> > [   91.292934]  [<813025ca>] ? kobject_put+0x9a/0x160
> > [   91.292934]  [<814e95e0>] ? klist_iter_exit+0x30/0x50
> > [   91.292934]  [<81405881>] ? bus_find_device+0xf1/0x120
> > [   91.292934]  [<81361a3c>] ? pci_get_subsys+0x11c/0x1b0
> > [   91.292934]  [<81361a3c>] pci_get_subsys+0x11c/0x1b0
> > [   91.292934]  [<81361afe>] pci_get_device+0x2e/0x40
> > [   91.292934]  [<81033e25>] mach_reboot_fixups+0xa5/0xd0
> > [   91.292934]  [<81027611>] native_machine_emergency_restart+0x1f1/0x590
> > [   91.292934]  [<814f2e00>] ? printk+0x4b/0x5b
> > [   91.292934]  [<810269ef>] native_machine_restart+0x6f/0x80
> > [   91.292934]  [<810271cc>] machine_restart+0x1c/0x30
> > [   91.292934]  [<810886e0>] kernel_restart+0x70/0xc0
> > [   91.292934]  [<81088a85>] sys_reboot+0x325/0x380
> > [   91.292934]  [<811f796c>] ? handle_pte_fault+0xdc/0x1740
> > [   91.292934]  [<811f93e7>] ? handle_mm_fault+0x417/0x4a0
> > [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> > [   91.292934]  [<810b33e7>] ? up_read+0x37/0x70
> > [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> > [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> > [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> > [   91.292934]  [<810b0270>] ? update_rmtp+0xe0/0xe0
> > [   91.292934]  [<8150376e>] ? restore_all+0xf/0xf
> > [   91.292934]  [<8103d880>] ? vmalloc_sync_all+0x320/0x320
> > [   91.292934]  [<81109fca>] ? trace_hardirqs_on_caller+0x28a/0x380
> > [   91.292934]  [<81311594>] ? trace_hardirqs_on_thunk+0xc/0x10
> > [   91.292934]  [<81503735>] syscall_call+0x7/0xb
> > 
> > Thanks,
> > Fengguang

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-22  7:49                             ` Feng Tang
@ 2012-08-22 13:02                               ` Fengguang Wu
  2012-08-22 18:02                               ` Bjorn Helgaas
  1 sibling, 0 replies; 29+ messages in thread
From: Fengguang Wu @ 2012-08-22 13:02 UTC (permalink / raw)
  To: Feng Tang
  Cc: Paul E. McKenney, Steven Rostedt, Avi Kivity, Steven Rostedt,
	LKML, kvm@vger.kernel.org, Kenji Kaneshige, Yinghai Lu,
	Bjorn Helgaas, linux-pci

On Wed, Aug 22, 2012 at 03:49:08PM +0800, Tang, Feng wrote:
> Hi Fengguang,
> 
> 
> On Wed, 22 Aug 2012 10:50:08 +0800
> Fengguang Wu <fengguang.wu@intel.com> wrote:
> 
> > Feng,
> > 
> > > I think it's pci_get_subsys() triggered this assert:
> > > 
> > >         /*
> > >          * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
> > >          */
> > >         if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
> > >                 return;
> > 
> > It's bisected down to this commit:
> > 
> > commit 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
> > Author:     Feng Tang <feng.tang@intel.com>
> > AuthorDate: Wed May 30 23:15:41 2012 +0800
> > Commit:     Ingo Molnar <mingo@kernel.org>
> > CommitDate: Wed Jun 6 12:03:23 2012 +0200
> > 
> >     x86/reboot: Fix a warning message triggered by stop_other_cpus()
> > 
> > Thanks,
> > Fengguang
> 
> Thanks for the bisection.
> 
> Revert my commit should be a solution, but can we simply make the pci_device_id
> a local on stack one instead of using sleepable kmalloc for it, as this
> sounds fragile when pci_get_subsys get called in a late system reboot stage?

Good idea! I like this simple solution. It will surely fix the warning.

Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>

Thanks,
Fengguang

> ------------
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 993d4a0..e5ccede 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -246,7 +246,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>  			       struct pci_dev *from)
>  {
>  	struct pci_dev *pdev;
> -	struct pci_device_id *id;
> +	struct pci_device_id id;
>  
>  	/*
>  	 * pci_find_subsys() can be called on the ide_setup() path,
> @@ -257,17 +257,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>  	if (unlikely(no_pci_devices()))
>  		return NULL;
>  
> -	id = kzalloc(sizeof(*id), GFP_KERNEL);
> -	if (!id)
> -		return NULL;
> -	id->vendor = vendor;
> -	id->device = device;
> -	id->subvendor = ss_vendor;
> -	id->subdevice = ss_device;
> -
> -	pdev = pci_get_dev_by_id(id, from);
> -	kfree(id);
> +	id.vendor = vendor;
> +	id.device = device;
> +	id.subvendor = ss_vendor;
> +	id.subdevice = ss_device;
>  
> +	pdev = pci_get_dev_by_id(&id, from);
>  	return pdev;
>  }
>  
> 
> 
> > 
> > > [   91.282131] machine restart
> > > [   91.283895] ------------[ cut here ]------------
> > > [   91.284731] WARNING: at /c/wfg/linux/kernel/lockdep.c:2739 lockdep_trace_alloc+0x1fb/0x210()
> > > [   91.286132] Modules linked in:
> > > [   91.286703] Pid: 697, comm: reboot Not tainted 3.5.0-00024-g01ff5db-dirty #4
> > > [   91.287859] Call Trace:
> > > [   91.288289]  [<81050148>] warn_slowpath_common+0xb8/0x100
> > > [   91.289338]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> > > [   91.290264]  [<8110acdb>] ? lockdep_trace_alloc+0x1fb/0x210
> > > [   91.291161]  [<810501ce>] warn_slowpath_null+0x3e/0x50
> > > [   91.292042]  [<8110acdb>] lockdep_trace_alloc+0x1fb/0x210
> > > [   91.292934]  [<81228e25>] kmem_cache_alloc_trace+0x55/0x600
> > > [   91.292934]  [<813025ca>] ? kobject_put+0x9a/0x160
> > > [   91.292934]  [<814e95e0>] ? klist_iter_exit+0x30/0x50
> > > [   91.292934]  [<81405881>] ? bus_find_device+0xf1/0x120
> > > [   91.292934]  [<81361a3c>] ? pci_get_subsys+0x11c/0x1b0
> > > [   91.292934]  [<81361a3c>] pci_get_subsys+0x11c/0x1b0
> > > [   91.292934]  [<81361afe>] pci_get_device+0x2e/0x40
> > > [   91.292934]  [<81033e25>] mach_reboot_fixups+0xa5/0xd0
> > > [   91.292934]  [<81027611>] native_machine_emergency_restart+0x1f1/0x590
> > > [   91.292934]  [<814f2e00>] ? printk+0x4b/0x5b
> > > [   91.292934]  [<810269ef>] native_machine_restart+0x6f/0x80
> > > [   91.292934]  [<810271cc>] machine_restart+0x1c/0x30
> > > [   91.292934]  [<810886e0>] kernel_restart+0x70/0xc0
> > > [   91.292934]  [<81088a85>] sys_reboot+0x325/0x380
> > > [   91.292934]  [<811f796c>] ? handle_pte_fault+0xdc/0x1740
> > > [   91.292934]  [<811f93e7>] ? handle_mm_fault+0x417/0x4a0
> > > [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> > > [   91.292934]  [<810b33e7>] ? up_read+0x37/0x70
> > > [   91.292934]  [<8103e07b>] ? do_page_fault+0x7fb/0xb30
> > > [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> > > [   91.292934]  [<8123c063>] ? do_sys_open+0x3a3/0x3f0
> > > [   91.292934]  [<810b0270>] ? update_rmtp+0xe0/0xe0
> > > [   91.292934]  [<8150376e>] ? restore_all+0xf/0xf
> > > [   91.292934]  [<8103d880>] ? vmalloc_sync_all+0x320/0x320
> > > [   91.292934]  [<81109fca>] ? trace_hardirqs_on_caller+0x28a/0x380
> > > [   91.292934]  [<81311594>] ? trace_hardirqs_on_thunk+0xc/0x10
> > > [   91.292934]  [<81503735>] syscall_call+0x7/0xb
> > > 
> > > Thanks,
> > > Fengguang

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-22  7:49                             ` Feng Tang
  2012-08-22 13:02                               ` Fengguang Wu
@ 2012-08-22 18:02                               ` Bjorn Helgaas
  2012-08-23  5:45                                 ` Feng Tang
                                                   ` (2 more replies)
  1 sibling, 3 replies; 29+ messages in thread
From: Bjorn Helgaas @ 2012-08-22 18:02 UTC (permalink / raw)
  To: Feng Tang
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	Yinghai Lu, linux-pci

On Wed, Aug 22, 2012 at 12:49 AM, Feng Tang <feng.tang@intel.com> wrote:
> Hi Fengguang,
>
>
> On Wed, 22 Aug 2012 10:50:08 +0800
> Fengguang Wu <fengguang.wu@intel.com> wrote:
>
>> Feng,
>>
>> > I think it's pci_get_subsys() triggered this assert:
>> >
>> >         /*
>> >          * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
>> >          */
>> >         if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
>> >                 return;
>>
>> It's bisected down to this commit:
>>
>> commit 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
>> Author:     Feng Tang <feng.tang@intel.com>
>> AuthorDate: Wed May 30 23:15:41 2012 +0800
>> Commit:     Ingo Molnar <mingo@kernel.org>
>> CommitDate: Wed Jun 6 12:03:23 2012 +0200
>>
>>     x86/reboot: Fix a warning message triggered by stop_other_cpus()
>>
>> Thanks,
>> Fengguang
>
> Thanks for the bisection.
>
> Revert my commit should be a solution, but can we simply make the pci_device_id
> a local on stack one instead of using sleepable kmalloc for it, as this
> sounds fragile when pci_get_subsys get called in a late system reboot stage?

I think this is a great idea.  Can you make this a real patch, with a
changelog and Signed-off-by?

We should also remove the obsolete comment about early boot.  I'm not
sure the no_pci_devices() check is needed, either.  And we can make
the same simplification in pci_get_class().
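
The simplification discussed here can be sketched in plain userspace C. This
is only an illustration under stated assumptions: struct id_key and
find_by_key() are hypothetical stand-ins for struct pci_device_id and
pci_get_dev_by_id(), and calloc()/free() stand in for kzalloc()/kfree(). It
shows the shape of the change, not the kernel code itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_device_id (illustration only). */
struct id_key {
	unsigned int vendor;
	unsigned int device;
};

/*
 * Hypothetical stand-in for pci_get_dev_by_id(): returns 1 if the key
 * names the single device this toy "bus" knows about, 0 otherwise.
 */
static int find_by_key(const struct id_key *key)
{
	assert(key != NULL);
	return key->vendor == 0x8086 && key->device == 0x1237;
}

/*
 * Old shape: the temporary key lives on the heap. The allocation can
 * fail, and in the kernel a GFP_KERNEL allocation may sleep.
 */
static int lookup_heap(unsigned int vendor, unsigned int device)
{
	struct id_key *key = calloc(1, sizeof(*key));
	int found;

	if (!key)
		return 0;
	key->vendor = vendor;
	key->device = device;
	found = find_by_key(key);
	free(key);
	return found;
}

/*
 * New shape: the key lives on the caller's stack. There is no
 * allocation, no failure path, and nothing to free, so the result can
 * be returned directly.
 */
static int lookup_stack(unsigned int vendor, unsigned int device)
{
	struct id_key key = { .vendor = vendor, .device = device };

	return find_by_key(&key);
}
```

Both functions give the same answer for the same arguments. The stack version
simply drops the allocation, the NULL check and the temporary result variable.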

> ------------
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 993d4a0..e5ccede 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -246,7 +246,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>                                struct pci_dev *from)
>  {
>         struct pci_dev *pdev;
> -       struct pci_device_id *id;
> +       struct pci_device_id id;
>
>         /*
>          * pci_find_subsys() can be called on the ide_setup() path,
> @@ -257,17 +257,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>         if (unlikely(no_pci_devices()))
>                 return NULL;
>
> -       id = kzalloc(sizeof(*id), GFP_KERNEL);
> -       if (!id)
> -               return NULL;
> -       id->vendor = vendor;
> -       id->device = device;
> -       id->subvendor = ss_vendor;
> -       id->subdevice = ss_device;
> -
> -       pdev = pci_get_dev_by_id(id, from);
> -       kfree(id);
> +       id.vendor = vendor;
> +       id.device = device;
> +       id.subvendor = ss_vendor;
> +       id.subdevice = ss_device;
>
> +       pdev = pci_get_dev_by_id(&id, from);

No need for "pdev" here, since we don't have to free anything.

>         return pdev;
>  }

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: pci_get_subsys: GFP_KERNEL allocations with IRQs disabled
  2012-08-22 18:02                               ` Bjorn Helgaas
@ 2012-08-23  5:45                                 ` Feng Tang
  2012-08-23  7:45                                 ` [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class() Feng Tang
  2012-08-23  7:45                                 ` [PATCH 2/2] PCI: Remove the obsolete no_pci_devices() check Feng Tang
  2 siblings, 0 replies; 29+ messages in thread
From: Feng Tang @ 2012-08-23  5:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	Yinghai Lu, linux-pci

Hi Bjorn,

On Wed, 22 Aug 2012 11:02:52 -0700
Bjorn Helgaas <bhelgaas@google.com> wrote:

> On Wed, Aug 22, 2012 at 12:49 AM, Feng Tang <feng.tang@intel.com> wrote:
> > Hi Fengguang,
> >
> >
> > On Wed, 22 Aug 2012 10:50:08 +0800
> > Fengguang Wu <fengguang.wu@intel.com> wrote:
> >
> >> Feng,
> >>
> >> > I think it's pci_get_subsys() triggered this assert:
> >> >
> >> >         /*
> >> >          * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
> >> >          */
> >> >         if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
> >> >                 return;
> >>
> >> It's bisected down to this commit:
> >>
> >> commit 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
> >> Author:     Feng Tang <feng.tang@intel.com>
> >> AuthorDate: Wed May 30 23:15:41 2012 +0800
> >> Commit:     Ingo Molnar <mingo@kernel.org>
> >> CommitDate: Wed Jun 6 12:03:23 2012 +0200
> >>
> >>     x86/reboot: Fix a warning message triggered by stop_other_cpus()
> >>
> >> Thanks,
> >> Fengguang
> >
> > Thanks for the bisection.
> >
> > Revert my commit should be a solution, but can we simply make the pci_device_id
> > a local on stack one instead of using sleepable kmalloc for it, as this
> > sounds fragile when pci_get_subsys get called in a late system reboot stage?
> 
> I think this is a great idea.  Can you make this a real patch, with a
> changelog and Signed-off-by?

Thanks and will do.

> 
> We should also remove the obsolete comment about early boot.  I'm not
> sure the no_pci_devices() check is needed, either.  And we can make
> the same simplification in pci_get_class().

Will check the no_pci_devices() part, and try to make it a separate
patch for easy reverting in case of error.

> 
> > ------------
> > diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> > index 993d4a0..e5ccede 100644
> > --- a/drivers/pci/search.c
> > +++ b/drivers/pci/search.c
> > @@ -246,7 +246,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
> >                                struct pci_dev *from)
> >  {
> > +       id.vendor = vendor;
> > +       id.device = device;
> > +       id.subvendor = ss_vendor;
> > +       id.subdevice = ss_device;
> >
> > +       pdev = pci_get_dev_by_id(&id, from);
> 
> No need for "pdev" here, since we don't have to free anything.

ok, will directly return it.

Thanks,
Feng

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-08-22 18:02                               ` Bjorn Helgaas
  2012-08-23  5:45                                 ` Feng Tang
@ 2012-08-23  7:45                                 ` Feng Tang
  2012-09-08  1:00                                   ` Yinghai Lu
  2012-08-23  7:45                                 ` [PATCH 2/2] PCI: Remove the obsolete no_pci_devices() check Feng Tang
  2 siblings, 1 reply; 29+ messages in thread
From: Feng Tang @ 2012-08-23  7:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	Yinghai Lu, linux-pci

From 57a28ee5e7662ca28ba4c793aa037d64bd082dee Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@intel.com>
Date: Wed, 22 Aug 2012 15:41:51 +0800
Subject: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()

This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682

pci_get_subsys() may get called in a late system reboot stage. Using
a sleepable kmalloc() there sounds fragile and will cause a kernel warning
with my recent commit 55c844a "x86/reboot: Fix a warning message
triggered by stop_other_cpus()", which disables local interrupts in the
late system shutdown/reboot phase. Using a local variable on the stack
instead will fix it and make it eligible for calling from atomic context.

Do the same change for pci_get_class(), as suggested by Bjorn Helgaas.

Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
---
 drivers/pci/search.c |   35 +++++++++++------------------------
 1 files changed, 11 insertions(+), 24 deletions(-)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 993d4a0..78a08b1 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -245,8 +245,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 			       unsigned int ss_vendor, unsigned int ss_device,
 			       struct pci_dev *from)
 {
-	struct pci_dev *pdev;
-	struct pci_device_id *id;
+	struct pci_device_id id;
 
 	/*
 	 * pci_find_subsys() can be called on the ide_setup() path,
@@ -257,18 +256,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 	if (unlikely(no_pci_devices()))
 		return NULL;
 
-	id = kzalloc(sizeof(*id), GFP_KERNEL);
-	if (!id)
-		return NULL;
-	id->vendor = vendor;
-	id->device = device;
-	id->subvendor = ss_vendor;
-	id->subdevice = ss_device;
-
-	pdev = pci_get_dev_by_id(id, from);
-	kfree(id);
+	id.vendor = vendor;
+	id.device = device;
+	id.subvendor = ss_vendor;
+	id.subdevice = ss_device;
 
-	return pdev;
+	return pci_get_dev_by_id(&id, from);
 }
 
 /**
@@ -307,19 +300,13 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
  */
 struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
 {
-	struct pci_dev *dev;
-	struct pci_device_id *id;
+	struct pci_device_id id;
 
-	id = kzalloc(sizeof(*id), GFP_KERNEL);
-	if (!id)
-		return NULL;
-	id->vendor = id->device = id->subvendor = id->subdevice = PCI_ANY_ID;
-	id->class_mask = PCI_ANY_ID;
-	id->class = class;
+	id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
+	id.class_mask = PCI_ANY_ID;
+	id.class = class;
 
-	dev = pci_get_dev_by_id(id, from);
-	kfree(id);
-	return dev;
+	return pci_get_dev_by_id(&id, from);
 }
 
 /**
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] PCI: Remove the obsolete no_pci_devices() check
  2012-08-22 18:02                               ` Bjorn Helgaas
  2012-08-23  5:45                                 ` Feng Tang
  2012-08-23  7:45                                 ` [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class() Feng Tang
@ 2012-08-23  7:45                                 ` Feng Tang
  2 siblings, 0 replies; 29+ messages in thread
From: Feng Tang @ 2012-08-23  7:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	Yinghai Lu, linux-pci, akpm, gregkh

From 9f2f3bbdf65f669e091c72b9648a4a0394ce28f5 Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@intel.com>
Date: Thu, 23 Aug 2012 14:55:48 +0800
Subject: [PATCH 2/2] PCI: Remove the obsolete no_pci_devices() check

In function pci_get_subsys() there is a check:

	/*
	 * pci_find_subsys() can be called on the ide_setup() path,
	 * super-early in boot.  But the down_read() will enable local
	 * interrupts, which can cause some machines to crash.  So here we
	 * detect and flag that situation and bail out early.
	 */
	if (unlikely(no_pci_devices()))
		return NULL;

But there is no ide_setup() now, and no down_read() either, which
makes the check obsolete. So remove it.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/search.c |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 78a08b1..e6e604f 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -247,15 +247,6 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 {
 	struct pci_device_id id;
 
-	/*
-	 * pci_find_subsys() can be called on the ide_setup() path,
-	 * super-early in boot.  But the down_read() will enable local
-	 * interrupts, which can cause some machines to crash.  So here we
-	 * detect and flag that situation and bail out early.
-	 */
-	if (unlikely(no_pci_devices()))
-		return NULL;
-
 	id.vendor = vendor;
 	id.device = device;
 	id.subvendor = ss_vendor;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-08-23  7:45                                 ` [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class() Feng Tang
@ 2012-09-08  1:00                                   ` Yinghai Lu
  2012-09-08  1:32                                     ` Yinghai Lu
  0 siblings, 1 reply; 29+ messages in thread
From: Yinghai Lu @ 2012-09-08  1:00 UTC (permalink / raw)
  To: Feng Tang, Bjorn Helgaas
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	linux-pci

On Thu, Aug 23, 2012 at 12:45 AM, Feng Tang <feng.tang@intel.com> wrote:
> From 57a28ee5e7662ca28ba4c793aa037d64bd082dee Mon Sep 17 00:00:00 2001
> From: Feng Tang <feng.tang@intel.com>
> Date: Wed, 22 Aug 2012 15:41:51 +0800
> Subject: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
>
> This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
>
> pci_get_subsys() may get called in a late system reboot stage. Using
> a sleepable kmalloc() there sounds fragile and will cause a kernel warning
> with my recent commit 55c844a "x86/reboot: Fix a warning message
> triggered by stop_other_cpus()", which disables local interrupts in the
> late system shutdown/reboot phase. Using a local variable on the stack
> instead will fix it and make it eligible for calling from atomic context.
>
> Do the same change for pci_get_class(), as suggested by Bjorn Helgaas.
>
> Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Feng Tang <feng.tang@intel.com>
> Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
>  drivers/pci/search.c |   35 +++++++++++------------------------
>  1 files changed, 11 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 993d4a0..78a08b1 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -245,8 +245,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>                                unsigned int ss_vendor, unsigned int ss_device,
>                                struct pci_dev *from)
>  {
> -       struct pci_dev *pdev;
> -       struct pci_device_id *id;
> +       struct pci_device_id id;
>
>         /*
>          * pci_find_subsys() can be called on the ide_setup() path,
> @@ -257,18 +256,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>         if (unlikely(no_pci_devices()))
>                 return NULL;
>
> -       id = kzalloc(sizeof(*id), GFP_KERNEL);
> -       if (!id)
> -               return NULL;
> -       id->vendor = vendor;
> -       id->device = device;
> -       id->subvendor = ss_vendor;
> -       id->subdevice = ss_device;
> -
> -       pdev = pci_get_dev_by_id(id, from);
> -       kfree(id);
> +       id.vendor = vendor;
> +       id.device = device;
> +       id.subvendor = ss_vendor;
> +       id.subdevice = ss_device;
>
> -       return pdev;
> +       return pci_get_dev_by_id(&id, from);
>  }
>
>  /**
> @@ -307,19 +300,13 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
>   */
>  struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
>  {
> -       struct pci_dev *dev;
> -       struct pci_device_id *id;
> +       struct pci_device_id id;
>
> -       id = kzalloc(sizeof(*id), GFP_KERNEL);
> -       if (!id)
> -               return NULL;
> -       id->vendor = id->device = id->subvendor = id->subdevice = PCI_ANY_ID;
> -       id->class_mask = PCI_ANY_ID;
> -       id->class = class;
> +       id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
> +       id.class_mask = PCI_ANY_ID;
> +       id.class = class;
>
> -       dev = pci_get_dev_by_id(id, from);
> -       kfree(id);
> -       return dev;
> +       return pci_get_dev_by_id(&id, from);
>  }
>
>  /**

With this one in pci/next, the PCI config files in /sys are not created.

10:~ # lspci -tv
pcilib: Cannot open /sys/bus/pci/devices/0000:00:03.0/config
lspci: Unable to read the standard configuration space header of
device 0000:00:03.0
pcilib: Cannot open /sys/bus/pci/devices/0000:00:02.0/config
lspci: Unable to read the standard configuration space header of
device 0000:00:02.0
pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.3/config
lspci: Unable to read the standard configuration space header of
device 0000:00:01.3
pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.1/config
lspci: Unable to read the standard configuration space header of
device 0000:00:01.1
pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.0/config
lspci: Unable to read the standard configuration space header of
device 0000:00:01.0
pcilib: Cannot open /sys/bus/pci/devices/0000:00:00.0/config
lspci: Unable to read the standard configuration space header of
device 0000:00:00.0
-[0000:00]-

Bisected to this commit:

ccee7d23102f5e5765ec24779c5b77472af8f79e is the first bad commit
commit ccee7d23102f5e5765ec24779c5b77472af8f79e
Author: Feng Tang <feng.tang@intel.com>
Date:   Thu Aug 23 15:45:03 2012 +0800

    PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc

    This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682

    pci_get_subsys() may get called in late system reboot stage, using
    a sleepable kmalloc() sounds fragile and will cause a kernel warning
    with my recent commit 55c844a "x86/reboot: Fix a warning message
    triggered by stop_other_cpus()" which disables local interrupts in
    late system shutdown/reboot phase. Using a local parameter instead
    will fix it and make it eligible for calling from atomic context.

    Do the same change for the pci_get_class() as suggested by Bjorn Helgaas

    [bhelgaas: changelog]
    Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>

:040000 040000 dee62a035816b73abc68e40de8f21c7349efc4cb
70b2a6258bffa1ab963bd650d8f5d02da774fbce M	drivers

So is the stack getting overrun?

Bjorn, I think this is the commit that broke lspci, which I mentioned
during the meeting in San Diego.

Thanks

Yinghai


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08  1:00                                   ` Yinghai Lu
@ 2012-09-08  1:32                                     ` Yinghai Lu
  2012-09-08  1:59                                       ` Greg Kroah-Hartman
  2012-09-08 13:42                                       ` Fengguang Wu
  0 siblings, 2 replies; 29+ messages in thread
From: Yinghai Lu @ 2012-09-08  1:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	linux-pci, Feng Tang, Bjorn Helgaas

On Fri, Sep 7, 2012 at 6:00 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, Aug 23, 2012 at 12:45 AM, Feng Tang <feng.tang@intel.com> wrote:
>> From 57a28ee5e7662ca28ba4c793aa037d64bd082dee Mon Sep 17 00:00:00 2001
>> From: Feng Tang <feng.tang@intel.com>
>> Date: Wed, 22 Aug 2012 15:41:51 +0800
>> Subject: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
>>
>> This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
>>
>> pci_get_subsys() may get called in late system reboot stage, using
>> a sleepable kmalloc() sounds fragile and will casue a kernel warning
>> with my recent commmit 55c844a "x86/reboot: Fix a warning message
>> triggered by stop_other_cpus()" which disable local interrupt in
>> late system shutdown/reboot phase. Using a local parameter instead
>> will fix it and make it eligible for calling forom atomic context.
>>
>> Do the same change for the pci_get_class() as suggeted by Bjorn Helgaas
>>
>> Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
>> Signed-off-by: Feng Tang <feng.tang@intel.com>
>> Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
>> ---
>>  drivers/pci/search.c |   35 +++++++++++------------------------
>>  1 files changed, 11 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
>> index 993d4a0..78a08b1 100644
>> --- a/drivers/pci/search.c
>> +++ b/drivers/pci/search.c
>> @@ -245,8 +245,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>>                                unsigned int ss_vendor, unsigned int ss_device,
>>                                struct pci_dev *from)
>>  {
>> -       struct pci_dev *pdev;
>> -       struct pci_device_id *id;
>> +       struct pci_device_id id;
>>
>>         /*
>>          * pci_find_subsys() can be called on the ide_setup() path,
>> @@ -257,18 +256,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>>         if (unlikely(no_pci_devices()))
>>                 return NULL;
>>
>> -       id = kzalloc(sizeof(*id), GFP_KERNEL);
>> -       if (!id)
>> -               return NULL;
>> -       id->vendor = vendor;
>> -       id->device = device;
>> -       id->subvendor = ss_vendor;
>> -       id->subdevice = ss_device;
>> -
>> -       pdev = pci_get_dev_by_id(id, from);
>> -       kfree(id);
>> +       id.vendor = vendor;
>> +       id.device = device;
>> +       id.subvendor = ss_vendor;
>> +       id.subdevice = ss_device;
>>
>> -       return pdev;
>> +       return pci_get_dev_by_id(&id, from);
>>  }
>>
>>  /**
>> @@ -307,19 +300,13 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
>>   */
>>  struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
>>  {
>> -       struct pci_dev *dev;
>> -       struct pci_device_id *id;
>> +       struct pci_device_id id;
>>
>> -       id = kzalloc(sizeof(*id), GFP_KERNEL);
>> -       if (!id)
>> -               return NULL;
>> -       id->vendor = id->device = id->subvendor = id->subdevice = PCI_ANY_ID;
>> -       id->class_mask = PCI_ANY_ID;
>> -       id->class = class;
>> +       id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
>> +       id.class_mask = PCI_ANY_ID;
>> +       id.class = class;
>>
>> -       dev = pci_get_dev_by_id(id, from);
>> -       kfree(id);
>> -       return dev;
>> +       return pci_get_dev_by_id(&id, from);
>>  }
>>
>>  /**
>
> with this one in pci/next pci config in /sys are not created.
>
> 10:~ # lspci -tv
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:03.0/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:03.0
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:02.0/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:02.0
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.3/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:01.3
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.1/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:01.1
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.0/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:01.0
> pcilib: Cannot open /sys/bus/pci/devices/0000:00:00.0/config
> lspci: Unable to read the standard configuration space header of
> device 0000:00:00.0
> -[0000:00]-
>
> bisected to this commit
>
> ccee7d23102f5e5765ec24779c5b77472af8f79e is the first bad commit
> commit ccee7d23102f5e5765ec24779c5b77472af8f79e
> Author: Feng Tang <feng.tang@intel.com>
> Date:   Thu Aug 23 15:45:03 2012 +0800
>
>     PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc
>
>     This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
>
>     pci_get_subsys() may get called in late system reboot stage, using
>     a sleepable kmalloc() sounds fragile and will cause a kernel warning
>     with my recent commmit 55c844a "x86/reboot: Fix a warning message
>     triggered by stop_other_cpus()" which disable local interrupt in
>     late system shutdown/reboot phase. Using a local parameter instead
>     will fix it and make it eligible for calling forom atomic context.
>
>     Do the same change for the pci_get_class() as suggested by Bjorn Helgaas
>
>     [bhelgaas: changelog]
>     Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
>     Signed-off-by: Feng Tang <feng.tang@intel.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
>
> :040000 040000 dee62a035816b73abc68e40de8f21c7349efc4cb
> 70b2a6258bffa1ab963bd650d8f5d02da774fbce M      drivers
>
> so the stack get overrun ?
>
> Bjorn, I think it is this one that cause lspci broken that I mentioned
> during meeting at San Diego.
>

Greg,

Any reason for using kmalloc instead of a local variable when you
rewrote the PCI search code?

commit 95247b57ed844511a212265b45cf9a919753aea1
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date:   Wed Feb 13 11:03:58 2008 -0800

    PCI: clean up search.c a lot

    This cleans up the search.c file, now using the pci list of devices that
    are created for the driver core, instead of relying on our separate list
    of devices.  It's better to use the functions already created for this
    kind of thing, instead of rolling our own all the time.

    This work is done in anticipation of getting rid of that second list of
    pci devices all together.

    And it ends up saving code, always a nice benefit.

    This also removes one compiler warning for when CONFIG_PCI_LEGACY is
    enabled as we no longer internally use the deprecated functions anymore.

    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08  1:32                                     ` Yinghai Lu
@ 2012-09-08  1:59                                       ` Greg Kroah-Hartman
  2012-09-08 13:42                                       ` Fengguang Wu
  1 sibling, 0 replies; 29+ messages in thread
From: Greg Kroah-Hartman @ 2012-09-08  1:59 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Fengguang Wu, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	linux-pci, Feng Tang, Bjorn Helgaas

On Fri, Sep 07, 2012 at 06:32:48PM -0700, Yinghai Lu wrote:
> Greg,
> 
> Any reason for using kmalloc instead of local variable during your
> rewriting pci search code?
> 
> commit 95247b57ed844511a212265b45cf9a919753aea1
> Author: Greg Kroah-Hartman <gregkh@suse.de>
> Date:   Wed Feb 13 11:03:58 2008 -0800

Seriously?  Something I wrote 4 years ago?  I really have no idea,
sorry.

greg k-h


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08  1:32                                     ` Yinghai Lu
  2012-09-08  1:59                                       ` Greg Kroah-Hartman
@ 2012-09-08 13:42                                       ` Fengguang Wu
  2012-09-08 15:30                                         ` Yinghai Lu
  2012-09-08 15:34                                         ` Feng Tang
  1 sibling, 2 replies; 29+ messages in thread
From: Fengguang Wu @ 2012-09-08 13:42 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Greg Kroah-Hartman, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	linux-pci, Feng Tang, Bjorn Helgaas

On Fri, Sep 07, 2012 at 06:32:48PM -0700, Yinghai Lu wrote:

> > with this one in pci/next pci config in /sys are not created.
> >
> > 10:~ # lspci -tv
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:03.0/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:03.0
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:02.0/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:02.0
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.3/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:01.3
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.1/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:01.1
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.0/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:01.0
> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:00.0/config
> > lspci: Unable to read the standard configuration space header of
> > device 0000:00:00.0
> > -[0000:00]-
> >
> > bisected to this commit
> >
> > ccee7d23102f5e5765ec24779c5b77472af8f79e is the first bad commit
> > commit ccee7d23102f5e5765ec24779c5b77472af8f79e
> > Author: Feng Tang <feng.tang@intel.com>
> > Date:   Thu Aug 23 15:45:03 2012 +0800
> >
> >     PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc
> >
> >     This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
> >
> >     pci_get_subsys() may get called in late system reboot stage, using
> >     a sleepable kmalloc() sounds fragile and will cause a kernel warning
> >     with my recent commmit 55c844a "x86/reboot: Fix a warning message
> >     triggered by stop_other_cpus()" which disable local interrupt in
> >     late system shutdown/reboot phase. Using a local parameter instead
> >     will fix it and make it eligible for calling forom atomic context.
> >
> >     Do the same change for the pci_get_class() as suggested by Bjorn Helgaas
> >
> >     [bhelgaas: changelog]
> >     Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
> >     Signed-off-by: Feng Tang <feng.tang@intel.com>
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> >     Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
> >
> > :040000 040000 dee62a035816b73abc68e40de8f21c7349efc4cb
> > 70b2a6258bffa1ab963bd650d8f5d02da774fbce M      drivers
> >
> > so the stack get overrun ?
> >
> > Bjorn, I think it is this one that cause lspci broken that I mentioned
> > during meeting at San Diego.

This makes lspci work again on my side. The caveat is, kzalloc() will
zero out all data while the new local variable leaves some data
uninitialized.

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 78a08b1..9148b6e 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -245,7 +245,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 			       unsigned int ss_vendor, unsigned int ss_device,
 			       struct pci_dev *from)
 {
-	struct pci_device_id id;
+	struct pci_device_id id = {
+		.vendor = vendor,
+		.device = device,
+		.subvendor = ss_vendor,
+		.subdevice = ss_device,
+	};
 
 	/*
 	 * pci_find_subsys() can be called on the ide_setup() path,
@@ -256,11 +261,6 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 	if (unlikely(no_pci_devices()))
 		return NULL;
 
-	id.vendor = vendor;
-	id.device = device;
-	id.subvendor = ss_vendor;
-	id.subdevice = ss_device;
-
 	return pci_get_dev_by_id(&id, from);
 }
 
@@ -300,11 +300,14 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
  */
 struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
 {
-	struct pci_device_id id;
-
-	id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
-	id.class_mask = PCI_ANY_ID;
-	id.class = class;
+	struct pci_device_id id = {
+		.vendor = PCI_ANY_ID,
+		.device = PCI_ANY_ID,
+		.subvendor = PCI_ANY_ID,
+		.subdevice = PCI_ANY_ID,
+		.class_mask = PCI_ANY_ID,
+		.class = class,
+	};
 
 	return pci_get_dev_by_id(&id, from);
 }
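
For reference, a minimal user-space sketch of why the designated
initializers in the diff above are sufficient: in C, members that an
initializer list does not name are zero-initialized, so .class and
.class_mask end up 0 just as they did with kzalloc(). The struct below is
a hypothetical stand-in for struct pci_device_id (its layout is assumed,
not copied from the kernel headers), and make_subsys_key() and
demo_subsys_key() are illustrative helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_ANY_ID (~0u)

/* Hypothetical user-space model of the kernel's struct pci_device_id. */
struct pci_device_id_model {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class, class_mask;
	unsigned long driver_data;
};

/* Build a lookup key the way the patched pci_get_subsys() does. */
struct pci_device_id_model make_subsys_key(uint32_t vendor, uint32_t device,
					   uint32_t ss_vendor,
					   uint32_t ss_device)
{
	/* Members not named here (class, class_mask, driver_data) are
	 * initialized to zero by the language rules. */
	struct pci_device_id_model id = {
		.vendor = vendor,
		.device = device,
		.subvendor = ss_vendor,
		.subdevice = ss_device,
	};

	return id;
}

/* Usage example: the unnamed members come back as zero. */
void demo_subsys_key(void)
{
	struct pci_device_id_model id =
		make_subsys_key(0x8086, 0x1237, PCI_ANY_ID, PCI_ANY_ID);

	assert(id.vendor == 0x8086);
	assert(id.subvendor == PCI_ANY_ID);
	assert(id.class == 0);
	assert(id.class_mask == 0);
	assert(id.driver_data == 0);
}
```

Declaring the variable without an initializer and assigning only four
members, as the first version of the patch did, gives no such guarantee
for the remaining members.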


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08 13:42                                       ` Fengguang Wu
@ 2012-09-08 15:30                                         ` Yinghai Lu
  2012-09-08 15:34                                         ` Feng Tang
  1 sibling, 0 replies; 29+ messages in thread
From: Yinghai Lu @ 2012-09-08 15:30 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Greg Kroah-Hartman, Paul E. McKenney, Steven Rostedt, Avi Kivity,
	Steven Rostedt, LKML, kvm@vger.kernel.org, Kenji Kaneshige,
	linux-pci, Feng Tang, Bjorn Helgaas

On Sat, Sep 8, 2012 at 6:42 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Fri, Sep 07, 2012 at 06:32:48PM -0700, Yinghai Lu wrote:
>
>> > with this one in pci/next pci config in /sys are not created.
>> >
>> > 10:~ # lspci -tv
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:03.0/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:03.0
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:02.0/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:02.0
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.3/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:01.3
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.1/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:01.1
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.0/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:01.0
>> > pcilib: Cannot open /sys/bus/pci/devices/0000:00:00.0/config
>> > lspci: Unable to read the standard configuration space header of
>> > device 0000:00:00.0
>> > -[0000:00]-
>> >
>> > bisected to this commit
>> >
>> > ccee7d23102f5e5765ec24779c5b77472af8f79e is the first bad commit
>> > commit ccee7d23102f5e5765ec24779c5b77472af8f79e
>> > Author: Feng Tang <feng.tang@intel.com>
>> > Date:   Thu Aug 23 15:45:03 2012 +0800
>> >
>> >     PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc
>> >
>> >     This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
>> >
>> >     pci_get_subsys() may get called in late system reboot stage, using
>> >     a sleepable kmalloc() sounds fragile and will cause a kernel warning
>> >     with my recent commmit 55c844a "x86/reboot: Fix a warning message
>> >     triggered by stop_other_cpus()" which disable local interrupt in
>> >     late system shutdown/reboot phase. Using a local parameter instead
>> >     will fix it and make it eligible for calling forom atomic context.
>> >
>> >     Do the same change for the pci_get_class() as suggested by Bjorn Helgaas
>> >
>> >     [bhelgaas: changelog]
>> >     Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
>> >     Signed-off-by: Feng Tang <feng.tang@intel.com>
>> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> >     Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
>> >
>> > :040000 040000 dee62a035816b73abc68e40de8f21c7349efc4cb
>> > 70b2a6258bffa1ab963bd650d8f5d02da774fbce M      drivers
>> >
>> > so the stack get overrun ?
>> >
>> > Bjorn, I think it is this one that cause lspci broken that I mentioned
>> > during meeting at San Diego.
>
> This makes lspci work again on my side. The caveat is, kzalloc() will
> zero out all data while the new local variable leaves some data
> uninitialized.
>
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 78a08b1..9148b6e 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -245,7 +245,12 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>                                unsigned int ss_vendor, unsigned int ss_device,
>                                struct pci_dev *from)
>  {
> -       struct pci_device_id id;
> +       struct pci_device_id id = {
> +               .vendor = vendor,
> +               .device = device,
> +               .subvendor = ss_vendor,
> +               .subdevice = ss_device,
> +       };
>
>         /*
>          * pci_find_subsys() can be called on the ide_setup() path,
> @@ -256,11 +261,6 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>         if (unlikely(no_pci_devices()))
>                 return NULL;
>
> -       id.vendor = vendor;
> -       id.device = device;
> -       id.subvendor = ss_vendor;
> -       id.subdevice = ss_device;
> -

Yes, the original patch forgot to clear .class and .class_mask here.

>         return pci_get_dev_by_id(&id, from);
>  }
>
> @@ -300,11 +300,14 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
>   */
>  struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
>  {
> -       struct pci_device_id id;
> -
> -       id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
> -       id.class_mask = PCI_ANY_ID;
> -       id.class = class;
> +       struct pci_device_id id = {
> +               .vendor = PCI_ANY_ID,
> +               .device = PCI_ANY_ID,
> +               .subvendor = PCI_ANY_ID,
> +               .subdevice = PCI_ANY_ID,
> +               .class_mask = PCI_ANY_ID,
> +               .class = class,
> +       };
>
>         return pci_get_dev_by_id(&id, from);
>  }
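
To illustrate why stale .class/.class_mask values break the lookup, here
is a simplified user-space model of the comparison. The match expression
is written from memory of pci_match_one_device() in drivers/pci/pci.h and
should be read as an assumption rather than a copy of the kernel code; the
struct names, the function names, and the device values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_ANY_ID (~0u)

/* Hypothetical models of the lookup key and of a device on the bus. */
struct id_model {
	uint32_t vendor, device, subvendor, subdevice, class, class_mask;
};

struct dev_model {
	uint32_t vendor, device, subsystem_vendor, subsystem_device, class;
};

/* Assumed shape of the kernel's per-device match test. */
bool match_one_device(const struct id_model *id, const struct dev_model *dev)
{
	return (id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
	       (id->device == PCI_ANY_ID || id->device == dev->device) &&
	       (id->subvendor == PCI_ANY_ID ||
		id->subvendor == dev->subsystem_vendor) &&
	       (id->subdevice == PCI_ANY_ID ||
		id->subdevice == dev->subsystem_device) &&
	       !((id->class ^ dev->class) & id->class_mask);
}

void demo_stale_class_mask(void)
{
	struct dev_model dev = {
		.vendor = 0x8086, .device = 0x1237,
		.subsystem_vendor = 0x1af4, .subsystem_device = 0x1100,
		.class = 0x060000,
	};

	/* class and class_mask are zero: the class test always passes. */
	struct id_model good = {
		.vendor = 0x8086, .device = 0x1237,
		.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID,
	};
	assert(match_one_device(&good, &dev));

	/* Simulate leftover stack contents in the two unset members. */
	struct id_model stale = good;
	stale.class = 0xdeadbeef;
	stale.class_mask = 0xffffffff;
	assert(!match_one_device(&stale, &dev));
}
```

With class_mask == 0 the class comparison is effectively disabled. With
leftover stack contents in it, most devices would fail the match, which
would be consistent with the config files in /sys never being created.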


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08 13:42                                       ` Fengguang Wu
  2012-09-08 15:30                                         ` Yinghai Lu
@ 2012-09-08 15:34                                         ` Feng Tang
  2012-09-08 18:40                                           ` Yinghai Lu
  1 sibling, 1 reply; 29+ messages in thread
From: Feng Tang @ 2012-09-08 15:34 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Yinghai Lu, Greg Kroah-Hartman, Paul E. McKenney, Steven Rostedt,
	Avi Kivity, Steven Rostedt, LKML, kvm@vger.kernel.org,
	Kenji Kaneshige, linux-pci, Bjorn Helgaas

On Sat, 8 Sep 2012 21:42:20 +0800
Fengguang Wu <fengguang.wu@intel.com> wrote:

> On Fri, Sep 07, 2012 at 06:32:48PM -0700, Yinghai Lu wrote:
> 
> > > with this one in pci/next pci config in /sys are not created.
> > >
> > > 10:~ # lspci -tv
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:03.0/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:03.0
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:02.0/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:02.0
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.3/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:01.3
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.1/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:01.1
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:01.0/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:01.0
> > > pcilib: Cannot open /sys/bus/pci/devices/0000:00:00.0/config
> > > lspci: Unable to read the standard configuration space header of
> > > device 0000:00:00.0
> > > -[0000:00]-
> > >
> > > bisected to this commit
> > >
> > > ccee7d23102f5e5765ec24779c5b77472af8f79e is the first bad commit
> > > commit ccee7d23102f5e5765ec24779c5b77472af8f79e
> > > Author: Feng Tang <feng.tang@intel.com>
> > > Date:   Thu Aug 23 15:45:03 2012 +0800
> > >
> > >     PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc
> > >
> > >     This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
> > >
> > >     pci_get_subsys() may get called in late system reboot stage, using
> > >     a sleepable kmalloc() sounds fragile and will cause a kernel warning
> > >     with my recent commmit 55c844a "x86/reboot: Fix a warning message
> > >     triggered by stop_other_cpus()" which disable local interrupt in
> > >     late system shutdown/reboot phase. Using a local parameter instead
> > >     will fix it and make it eligible for calling forom atomic context.
> > >
> > >     Do the same change for the pci_get_class() as suggested by Bjorn Helgaas
> > >
> > >     [bhelgaas: changelog]
> > >     Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
> > >     Signed-off-by: Feng Tang <feng.tang@intel.com>
> > >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > >     Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
> > >
> > > :040000 040000 dee62a035816b73abc68e40de8f21c7349efc4cb
> > > 70b2a6258bffa1ab963bd650d8f5d02da774fbce M      drivers
> > >
> > > so the stack get overrun ?
> > >
> > > Bjorn, I think it is this one that cause lspci broken that I mentioned
> > > during meeting at San Diego.
> 
> This makes lspci work again on my side. The caveat is, kzalloc() will
> zero out all data while the new local variable leaves some data
> uninitialized.

Yes, thanks for the quick root cause and fix to the bug in my code.

- Feng


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08 15:34                                         ` Feng Tang
@ 2012-09-08 18:40                                           ` Yinghai Lu
  2012-09-08 21:06                                             ` Bjorn Helgaas
  0 siblings, 1 reply; 29+ messages in thread
From: Yinghai Lu @ 2012-09-08 18:40 UTC (permalink / raw)
  To: Feng Tang
  Cc: Fengguang Wu, Greg Kroah-Hartman, Paul E. McKenney,
	Steven Rostedt, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org, Kenji Kaneshige, linux-pci, Bjorn Helgaas

On Sat, Sep 8, 2012 at 8:34 AM, Feng Tang <feng.tang@intel.com> wrote:
>> This makes lspci work again on my side. The caveat is, kzalloc() will
>> zero out all data while the new local variable leaves some data
>> uninitialized.
>
> Yes, thanks for the quick root cause and fix to the bug in my code.

Can you resubmit your patch with two extra "memset" lines?

Thanks

Yinghai


* Re: [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class()
  2012-09-08 18:40                                           ` Yinghai Lu
@ 2012-09-08 21:06                                             ` Bjorn Helgaas
  0 siblings, 0 replies; 29+ messages in thread
From: Bjorn Helgaas @ 2012-09-08 21:06 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Feng Tang, Fengguang Wu, Greg Kroah-Hartman, Paul E. McKenney,
	Steven Rostedt, Avi Kivity, Steven Rostedt, LKML,
	kvm@vger.kernel.org, Kenji Kaneshige, linux-pci

On Sat, Sep 08, 2012 at 11:40:52AM -0700, Yinghai Lu wrote:
> On Sat, Sep 8, 2012 at 8:34 AM, Feng Tang <feng.tang@intel.com> wrote:
> >> This makes lspci work again on my side. The caveat is, kzalloc() will
> >> zero out all data while the new local variable leaves some data
> >> uninitialized.
> >
> > Yes, thanks for the quick root cause and fix to the bug in my code.
> 
> Can you resubmit your patch with two extra "memset" line?

I updated the patch as follows and rebased my "next" branch to include it:

commit e664f5bd55247bba3a6ebd61f83d6c9cd87ce0de
Author: Feng Tang <feng.tang@intel.com>
Date:   Thu Aug 23 15:45:03 2012 +0800

    PCI: Use pci_device_id on stack for pci_get_subsys/class() to avoid kmalloc
    
    This fixes a kernel warning https://lkml.org/lkml/2012/7/31/682
    
    pci_get_subsys() may get called in late system reboot stage, using
    a sleepable kmalloc() sounds fragile and will cause a kernel warning
    with my recent commit 55c844a "x86/reboot: Fix a warning message
    triggered by stop_other_cpus()" which disables local interrupts in
    late system shutdown/reboot phase. Using a local parameter instead
    will fix it and make it eligible for calling from atomic context.
    
    Do the same change for the pci_get_class() as suggested by Bjorn Helgaas
    
    [bhelgaas: changelog, clear pci_device_id on stack with memset()]
    Bisected-by: Fengguang Wu <fengguang.wu@intel.com>
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 993d4a0..e0a0310 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -245,8 +245,7 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 			       unsigned int ss_vendor, unsigned int ss_device,
 			       struct pci_dev *from)
 {
-	struct pci_dev *pdev;
-	struct pci_device_id *id;
+	struct pci_device_id id;
 
 	/*
 	 * pci_find_subsys() can be called on the ide_setup() path,
@@ -257,18 +256,13 @@ struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 	if (unlikely(no_pci_devices()))
 		return NULL;
 
-	id = kzalloc(sizeof(*id), GFP_KERNEL);
-	if (!id)
-		return NULL;
-	id->vendor = vendor;
-	id->device = device;
-	id->subvendor = ss_vendor;
-	id->subdevice = ss_device;
-
-	pdev = pci_get_dev_by_id(id, from);
-	kfree(id);
+	memset(&id, 0, sizeof(id));
+	id.vendor = vendor;
+	id.device = device;
+	id.subvendor = ss_vendor;
+	id.subdevice = ss_device;
 
-	return pdev;
+	return pci_get_dev_by_id(&id, from);
 }
 
 /**
@@ -307,19 +301,14 @@ pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from)
  */
 struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
 {
-	struct pci_dev *dev;
-	struct pci_device_id *id;
+	struct pci_device_id id;
 
-	id = kzalloc(sizeof(*id), GFP_KERNEL);
-	if (!id)
-		return NULL;
-	id->vendor = id->device = id->subvendor = id->subdevice = PCI_ANY_ID;
-	id->class_mask = PCI_ANY_ID;
-	id->class = class;
+	memset(&id, 0, sizeof(id));
+	id.vendor = id.device = id.subvendor = id.subdevice = PCI_ANY_ID;
+	id.class_mask = PCI_ANY_ID;
+	id.class = class;
 
-	dev = pci_get_dev_by_id(id, from);
-	kfree(id);
-	return dev;
+	return pci_get_dev_by_id(&id, from);
 }
 
 /**
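
The memset()-then-assign pattern used in the patch above can be exercised
in user space as a quick sanity check. This is only a sketch: struct
id_model is a hypothetical stand-in for struct pci_device_id, and
fill_class_key() and demo_fill_class_key() are illustrative names that
mirror what the updated pci_get_class() does, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCI_ANY_ID (~0u)

/* Hypothetical user-space model of struct pci_device_id. */
struct id_model {
	uint32_t vendor, device, subvendor, subdevice, class, class_mask;
	unsigned long driver_data;
};

/* Clear the whole key first, then set only the members that matter. */
void fill_class_key(struct id_model *id, uint32_t class)
{
	memset(id, 0, sizeof(*id));
	id->vendor = id->device = id->subvendor = id->subdevice = PCI_ANY_ID;
	id->class_mask = PCI_ANY_ID;
	id->class = class;
}

void demo_fill_class_key(void)
{
	struct id_model id;

	/* Simulate stale stack contents before the key is built. */
	memset(&id, 0xa5, sizeof(id));
	fill_class_key(&id, 0x030000);

	assert(id.vendor == PCI_ANY_ID);
	assert(id.class == 0x030000);
	assert(id.class_mask == PCI_ANY_ID);
	assert(id.driver_data == 0); /* cleared by memset(), never assigned */
}
```

Compared with a designated initializer, memset() also clears any padding
bytes in the structure, and it keeps the diff against the original
kzalloc() version small.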


Thread overview: 29+ messages
     [not found] <20120724090330.GA9830@localhost>
     [not found] ` <20120724090720.GA10434@localhost>
     [not found]   ` <1343663105.3847.7.camel@fedora>
2012-07-31 12:17     ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
2012-07-31 12:37       ` Avi Kivity
2012-07-31 12:43         ` Steven Rostedt
2012-07-31 12:50           ` Avi Kivity
2012-07-31 13:13             ` Steven Rostedt
2012-07-31 23:43               ` Fengguang Wu
2012-07-31 23:51                 ` Steven Rostedt
2012-07-31 23:57                   ` Paul E. McKenney
2012-08-01  0:09                     ` Steven Rostedt
2012-08-01  0:18                       ` Paul E. McKenney
2012-08-01  0:43                         ` pci_get_subsys: GFP_KERNEL allocations with IRQs disabled Fengguang Wu
2012-08-22  2:50                           ` Fengguang Wu
2012-08-22  7:49                             ` Feng Tang
2012-08-22 13:02                               ` Fengguang Wu
2012-08-22 18:02                               ` Bjorn Helgaas
2012-08-23  5:45                                 ` Feng Tang
2012-08-23  7:45                                 ` [PATCH 1/2] PCI: Use local parameter pci_device_id for pci_get_subsys/class() Feng Tang
2012-09-08  1:00                                   ` Yinghai Lu
2012-09-08  1:32                                     ` Yinghai Lu
2012-09-08  1:59                                       ` Greg Kroah-Hartman
2012-09-08 13:42                                       ` Fengguang Wu
2012-09-08 15:30                                         ` Yinghai Lu
2012-09-08 15:34                                         ` Feng Tang
2012-09-08 18:40                                           ` Yinghai Lu
2012-09-08 21:06                                             ` Bjorn Helgaas
2012-08-23  7:45                                 ` [PATCH 2/2] PCI: Remove the obsolete no_pci_devices() check Feng Tang
2012-07-31 23:57                   ` Testing tracer wakeup_rt: .. no entries found ..FAILED! Fengguang Wu
2012-08-07 13:29                     ` Steven Rostedt
2012-08-07 13:32                       ` Fengguang Wu
