public inbox for linux-kernel@vger.kernel.org
* [BUG]INFO: suspicious RCU usage for 3.5-rc1+
@ 2012-06-12  8:26 Feng Tang
  2012-07-02 21:41 ` Alexander Holler
  2012-07-03  6:35 ` [PATCH] leds: heartbeat: fix bug on panic Alexander Holler
  0 siblings, 2 replies; 8+ messages in thread
From: Feng Tang @ 2012-06-12  8:26 UTC (permalink / raw)
  To: Linux Kernel Mail List; +Cc: Wu, Fengguang, Alexander Holler, Richard Purdie

During Fengguang's build test, we found a problem here:

(x86_64-allyesdebian config)
               
[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]  
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]  
[ 1526.520230]  
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]  
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3



This seems to be caused by a sleepable operation in led_trigger_unregister() being performed in the atomic panic context.

The original commit is:
commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4
Author: Alexander Holler <holler@ahsoftware.de>
Date:   Tue May 29 15:07:29 2012 -0700

    leds: heartbeat: stop on shutdown

    A halted kernel should not show a heartbeat.


Thanks,
Feng
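
As an illustration only (this is not kernel code), the call chain in the
trace above can be modelled in userspace C. Every model_* name below is
hypothetical and merely stands in for the kernel function named in its
comment. The sketch assumes that the atomic notifier chain runs its
callbacks inside an RCU read-side section, as lock #2 in the report suggests.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: none of these are the kernel's real functions. */
static int model_rcu_depth;   /* nesting depth of the modelled rcu_read_lock() */
static int model_violations;  /* number of sleep-in-atomic events observed */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Stand-in for the might_sleep() debug check run by sleeping locks. */
static void model_might_sleep(const char *who)
{
	if (model_rcu_depth > 0) {
		model_violations++;
		printf("illegal context switch in RCU read-side section: %s\n", who);
	}
}

/* Stand-in for down_write(): it may sleep, so it runs the check first. */
static void model_down_write(void)
{
	model_might_sleep("down_write");
}

/* Stand-in for led_trigger_unregister(), which takes a rw semaphore. */
static void model_led_trigger_unregister(void)
{
	model_down_write();
}

typedef void (*model_notifier_fn)(void);

/* Stand-in for an atomic notifier chain: the callback runs under RCU. */
static void model_atomic_notifier_call(model_notifier_fn fn)
{
	model_rcu_read_lock();
	fn();
	model_rcu_read_unlock();
}

/* Replays the panic path and returns the number of violations seen. */
static int model_replay_panic_path(void)
{
	model_violations = 0;
	model_atomic_notifier_call(model_led_trigger_unregister);
	return model_violations;
}
```

Calling model_replay_panic_path() flags the modelled down_write() once,
because it is reached while the modelled RCU read lock is still held.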


* Re: [BUG]INFO: suspicious RCU usage for 3.5-rc1+
  2012-06-12  8:26 [BUG]INFO: suspicious RCU usage for 3.5-rc1+ Feng Tang
@ 2012-07-02 21:41 ` Alexander Holler
  2012-07-03  6:35 ` [PATCH] leds: heartbeat: fix bug on panic Alexander Holler
  1 sibling, 0 replies; 8+ messages in thread
From: Alexander Holler @ 2012-07-02 21:41 UTC (permalink / raw)
  To: Feng Tang; +Cc: Linux Kernel Mail List, Wu, Fengguang, Richard Purdie

Hello Feng,

sorry for the late answer.

On 12.06.2012 10:26, Feng Tang wrote:
> During Fengguang's build test, we found a problem here:
>
> (x86_64-allyesdebian config)
>
> [ 1526.520230] ===============================
> [ 1526.520230] [ INFO: suspicious RCU usage. ]
> [ 1526.520230] 3.5.0-rc1+ #12 Not tainted
> [ 1526.520230] -------------------------------
> [ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
> [ 1526.520230]
> [ 1526.520230] other info that might help us debug this:
> [ 1526.520230]
> [ 1526.520230]
> [ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
> [ 1526.520230] 3 locks held by net.agent/3279:
> [ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
> [ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
> [ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
> [ 1526.520230]
> [ 1526.520230] stack backtrace:
> [ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
> [ 1526.520230] Call Trace:
> [ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
> [ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
> [ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
> [ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
> [ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
> [ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
> [ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
> [ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
> [ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
> [ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
> [ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
> [ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3
>
>
>
> This seems to be caused by a sleepable operation in led_trigger_unregister() being performed in the atomic panic context.
>
> The original commit is:
> commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4
> Author: Alexander Holler <holler@ahsoftware.de>
> Date:   Tue May 29 15:07:29 2012 -0700
>
>      leds: heartbeat: stop on shutdown
>
>      A halted kernel should not show a heartbeat.
>

Hmm, I'm not very familiar with what happens when a panic occurs, and my
tests with panics didn't reveal this (or I didn't notice it).

I will try to reproduce the error and take a closer look at what
happens in led_trigger_unregister() (and whether and how it can be
avoided) in the next few days. Thanks.

Regards,

Alexander


* [PATCH] leds: heartbeat: fix bug on panic
  2012-06-12  8:26 [BUG]INFO: suspicious RCU usage for 3.5-rc1+ Feng Tang
  2012-07-02 21:41 ` Alexander Holler
@ 2012-07-03  6:35 ` Alexander Holler
  2012-07-04  7:05   ` Bryan Wu
  1 sibling, 1 reply; 8+ messages in thread
From: Alexander Holler @ 2012-07-03  6:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: Shuah Khan, Richard Purdie, Bryan Wu, Feng Tang, Alexander Holler

With commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
a bug (visible if CONFIG_PROVE_RCU is enabled) which occurs when the
kernel panics:

[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]
[ 1526.520230]
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3

So in case of a panic, just turn off the LED. Other approaches, like
scheduling work to unregister the trigger, don't work because not much
still runs after a panic has occurred (except timers).

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 drivers/leds/ledtrig-heartbeat.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/drivers/leds/ledtrig-heartbeat.c b/drivers/leds/ledtrig-heartbeat.c
index 41dc76d..a019fbb 100644
--- a/drivers/leds/ledtrig-heartbeat.c
+++ b/drivers/leds/ledtrig-heartbeat.c
@@ -21,6 +21,8 @@
 #include <linux/reboot.h>
 #include "leds.h"
 
+static int panic_heartbeats;
+
 struct heartbeat_trig_data {
 	unsigned int phase;
 	unsigned int period;
@@ -34,6 +36,11 @@ static void led_heartbeat_function(unsigned long data)
 	unsigned long brightness = LED_OFF;
 	unsigned long delay = 0;
 
+	if (unlikely(panic_heartbeats)) {
+		led_set_brightness(led_cdev, LED_OFF);
+		return;
+	}
+
 	/* acts like an actual heart beat -- ie thump-thump-pause... */
 	switch (heartbeat_data->phase) {
 	case 0:
@@ -111,12 +118,19 @@ static int heartbeat_reboot_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static int heartbeat_panic_notifier(struct notifier_block *nb,
+				     unsigned long code, void *unused)
+{
+	panic_heartbeats = 1;
+	return NOTIFY_DONE;
+}
+
 static struct notifier_block heartbeat_reboot_nb = {
 	.notifier_call = heartbeat_reboot_notifier,
 };
 
 static struct notifier_block heartbeat_panic_nb = {
-	.notifier_call = heartbeat_reboot_notifier,
+	.notifier_call = heartbeat_panic_notifier,
 };
 
 static int __init heartbeat_trig_init(void)
-- 
1.7.6.5
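
For readers who want to see the idea of the patch in isolation, here is a
rough userspace sketch of it. It is not the driver code: the model_* names
and the two-phase blink are simplifications made up for this example, and
the timer is reduced to a timer_armed flag. What it keeps from the patch is
that the panic notifier only sets a flag, and the timer callback then
switches the LED off and returns without re-arming itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical brightness values standing in for LED_OFF / LED_FULL. */
enum { MODEL_LED_OFF = 0, MODEL_LED_FULL = 255 };

/* Hypothetical per-LED state; the real driver keeps more than this. */
struct model_led {
	int brightness;
	bool timer_armed;    /* true while the heartbeat timer would fire again */
	unsigned int phase;
};

/* Mirrors the role of panic_heartbeats in the patch: set once, never cleared. */
static int model_panic_heartbeats;

/* Stand-in for the panic notifier: it does nothing that could sleep. */
static int model_panic_notifier(void)
{
	model_panic_heartbeats = 1;
	return 0;
}

/* Stand-in for the timer callback, simplified to a plain on/off blink. */
static void model_heartbeat_tick(struct model_led *led)
{
	assert(led);

	if (model_panic_heartbeats) {
		/* After a panic: LED off, and the timer is not re-armed. */
		led->brightness = MODEL_LED_OFF;
		led->timer_armed = false;
		return;
	}

	led->brightness = (led->phase % 2 == 0) ? MODEL_LED_FULL : MODEL_LED_OFF;
	led->phase++;
	led->timer_armed = true;
}
```

The point of the design is that all real work happens in the timer
callback, which already has the per-LED data at hand, so the notifier
itself needs no locks and no access to that data.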



* Re: [PATCH] leds: heartbeat: fix bug on panic
  2012-07-03  6:35 ` [PATCH] leds: heartbeat: fix bug on panic Alexander Holler
@ 2012-07-04  7:05   ` Bryan Wu
  2012-07-04  7:11     ` Alexander Holler
  0 siblings, 1 reply; 8+ messages in thread
From: Bryan Wu @ 2012-07-04  7:05 UTC (permalink / raw)
  To: Alexander Holler; +Cc: linux-kernel, Shuah Khan, Richard Purdie, Feng Tang

On Tue, Jul 3, 2012 at 2:35 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> With commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
> a bug (visible if CONFIG_PROVE_RCU is enabled) which occurs when the
> kernel panics:
>
> [ 1526.520230] ===============================
> [ 1526.520230] [ INFO: suspicious RCU usage. ]
> [ 1526.520230] 3.5.0-rc1+ #12 Not tainted
> [ 1526.520230] -------------------------------
> [ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
> [ 1526.520230]
> [ 1526.520230] other info that might help us debug this:
> [ 1526.520230]
> [ 1526.520230]
> [ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
> [ 1526.520230] 3 locks held by net.agent/3279:
> [ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
> [ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
> [ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
> [ 1526.520230]
> [ 1526.520230] stack backtrace:
> [ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
> [ 1526.520230] Call Trace:
> [ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
> [ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
> [ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
> [ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
> [ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
> [ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
> [ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
> [ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
> [ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
> [ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
> [ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
> [ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3
>
> So in case of a panic, just turn off the LED. Other approaches, like
> scheduling work to unregister the trigger, don't work because not much
> still runs after a panic has occurred (except timers).
>
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> ---
>  drivers/leds/ledtrig-heartbeat.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/leds/ledtrig-heartbeat.c b/drivers/leds/ledtrig-heartbeat.c
> index 41dc76d..a019fbb 100644
> --- a/drivers/leds/ledtrig-heartbeat.c
> +++ b/drivers/leds/ledtrig-heartbeat.c
> @@ -21,6 +21,8 @@
>  #include <linux/reboot.h>
>  #include "leds.h"
>
> +static int panic_heartbeats;
> +
>  struct heartbeat_trig_data {
>         unsigned int phase;
>         unsigned int period;
> @@ -34,6 +36,11 @@ static void led_heartbeat_function(unsigned long data)
>         unsigned long brightness = LED_OFF;
>         unsigned long delay = 0;
>
> +       if (unlikely(panic_heartbeats)) {
> +               led_set_brightness(led_cdev, LED_OFF);
> +               return;
> +       }
> +
>         /* acts like an actual heart beat -- ie thump-thump-pause... */
>         switch (heartbeat_data->phase) {
>         case 0:
> @@ -111,12 +118,19 @@ static int heartbeat_reboot_notifier(struct notifier_block *nb,
>         return NOTIFY_DONE;
>  }
>
> +static int heartbeat_panic_notifier(struct notifier_block *nb,
> +                                    unsigned long code, void *unused)
> +{
> +       panic_heartbeats = 1;

Can we just set the LED to OFF and delete the timer here? The timer is
also useless after a kernel panic, so we wouldn't need this global
static variable.

-Bryan

> +       return NOTIFY_DONE;
> +}
> +
>  static struct notifier_block heartbeat_reboot_nb = {
>         .notifier_call = heartbeat_reboot_notifier,
>  };
>
>  static struct notifier_block heartbeat_panic_nb = {
> -       .notifier_call = heartbeat_reboot_notifier,
> +       .notifier_call = heartbeat_panic_notifier,
>  };
>
>  static int __init heartbeat_trig_init(void)
> --
> 1.7.6.5
>


* Re: [PATCH] leds: heartbeat: fix bug on panic
  2012-07-04  7:05   ` Bryan Wu
@ 2012-07-04  7:11     ` Alexander Holler
  2012-07-04  7:29       ` Bryan Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Holler @ 2012-07-04  7:11 UTC (permalink / raw)
  To: Bryan Wu; +Cc: linux-kernel, Shuah Khan, Richard Purdie, Feng Tang

On 04.07.2012 09:05, Bryan Wu wrote:
> On Tue, Jul 3, 2012 at 2:35 PM, Alexander Holler <holler@ahsoftware.de> wrote:
>> With commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
>> a bug (visible if CONFIG_PROVE_RCU is enabled) which occurs when the
>> kernel panics:
>>
>> [ 1526.520230] ===============================
>> [ 1526.520230] [ INFO: suspicious RCU usage. ]
>> [ 1526.520230] 3.5.0-rc1+ #12 Not tainted
>> [ 1526.520230] -------------------------------
>> [ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
>> [ 1526.520230]
>> [ 1526.520230] other info that might help us debug this:
>> [ 1526.520230]
>> [ 1526.520230]
>> [ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
>> [ 1526.520230] 3 locks held by net.agent/3279:
>> [ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
>> [ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
>> [ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
>> [ 1526.520230]
>> [ 1526.520230] stack backtrace:
>> [ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
>> [ 1526.520230] Call Trace:
>> [ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
>> [ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
>> [ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
>> [ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
>> [ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
>> [ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
>> [ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
>> [ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
>> [ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
>> [ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
>> [ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
>> [ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3
>>
>> So in case of a panic, just turn off the LED. Other approaches, like
>> scheduling work to unregister the trigger, don't work because not much
>> still runs after a panic has occurred (except timers).
>>
>> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
>> ---
>>   drivers/leds/ledtrig-heartbeat.c |   16 +++++++++++++++-
>>   1 files changed, 15 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/leds/ledtrig-heartbeat.c b/drivers/leds/ledtrig-heartbeat.c
>> index 41dc76d..a019fbb 100644
>> --- a/drivers/leds/ledtrig-heartbeat.c
>> +++ b/drivers/leds/ledtrig-heartbeat.c
>> @@ -21,6 +21,8 @@
>>   #include <linux/reboot.h>
>>   #include "leds.h"
>>
>> +static int panic_heartbeats;
>> +
>>   struct heartbeat_trig_data {
>>          unsigned int phase;
>>          unsigned int period;
>> @@ -34,6 +36,11 @@ static void led_heartbeat_function(unsigned long data)
>>          unsigned long brightness = LED_OFF;
>>          unsigned long delay = 0;
>>
>> +       if (unlikely(panic_heartbeats)) {
>> +               led_set_brightness(led_cdev, LED_OFF);
>> +               return;
>> +       }
>> +
>>          /* acts like an actual heart beat -- ie thump-thump-pause... */
>>          switch (heartbeat_data->phase) {
>>          case 0:
>> @@ -111,12 +118,19 @@ static int heartbeat_reboot_notifier(struct notifier_block *nb,
>>          return NOTIFY_DONE;
>>   }
>>
>> +static int heartbeat_panic_notifier(struct notifier_block *nb,
>> +                                    unsigned long code, void *unused)
>> +{
>> +       panic_heartbeats = 1;
>
> Can we just set the LED to OFF and delete the timer here? The timer is
> also useless after a kernel panic, so we wouldn't need this global
> static variable.

No, the necessary information (heartbeat_trig_data) isn't available here.

Regards,

Alexander



* Re: [PATCH] leds: heartbeat: fix bug on panic
  2012-07-04  7:11     ` Alexander Holler
@ 2012-07-04  7:29       ` Bryan Wu
  2012-07-04  7:51         ` Alexander Holler
  0 siblings, 1 reply; 8+ messages in thread
From: Bryan Wu @ 2012-07-04  7:29 UTC (permalink / raw)
  To: Alexander Holler; +Cc: linux-kernel, Shuah Khan, Richard Purdie, Feng Tang

On Wed, Jul 4, 2012 at 3:11 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> On 04.07.2012 09:05, Bryan Wu wrote:
>
>> On Tue, Jul 3, 2012 at 2:35 PM, Alexander Holler <holler@ahsoftware.de> wrote:
>>>
>>> With commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
>>> a bug (visible if CONFIG_PROVE_RCU is enabled) which occurs when the
>>> kernel panics:
>>>
>>> [ 1526.520230] ===============================
>>> [ 1526.520230] [ INFO: suspicious RCU usage. ]
>>> [ 1526.520230] 3.5.0-rc1+ #12 Not tainted
>>> [ 1526.520230] -------------------------------
>>> [ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
>>> [ 1526.520230]
>>> [ 1526.520230] other info that might help us debug this:
>>> [ 1526.520230]
>>> [ 1526.520230]
>>> [ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
>>> [ 1526.520230] 3 locks held by net.agent/3279:
>>> [ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
>>> [ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
>>> [ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
>>> [ 1526.520230]
>>> [ 1526.520230] stack backtrace:
>>> [ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
>>> [ 1526.520230] Call Trace:
>>> [ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
>>> [ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
>>> [ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
>>> [ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
>>> [ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
>>> [ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
>>> [ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
>>> [ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
>>> [ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
>>> [ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
>>> [ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
>>> [ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3
>>>
>>> So in case of a panic, just turn off the LED. Other approaches, like
>>> scheduling work to unregister the trigger, don't work because not much
>>> still runs after a panic has occurred (except timers).
>>>
>>> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
>>> ---
>>>   drivers/leds/ledtrig-heartbeat.c |   16 +++++++++++++++-
>>>   1 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/leds/ledtrig-heartbeat.c b/drivers/leds/ledtrig-heartbeat.c
>>> index 41dc76d..a019fbb 100644
>>> --- a/drivers/leds/ledtrig-heartbeat.c
>>> +++ b/drivers/leds/ledtrig-heartbeat.c
>>> @@ -21,6 +21,8 @@
>>>   #include <linux/reboot.h>
>>>   #include "leds.h"
>>>
>>> +static int panic_heartbeats;
>>> +
>>>   struct heartbeat_trig_data {
>>>          unsigned int phase;
>>>          unsigned int period;
>>> @@ -34,6 +36,11 @@ static void led_heartbeat_function(unsigned long data)
>>>          unsigned long brightness = LED_OFF;
>>>          unsigned long delay = 0;
>>>
>>> +       if (unlikely(panic_heartbeats)) {
>>> +               led_set_brightness(led_cdev, LED_OFF);
>>> +               return;
>>> +       }
>>> +
>>>          /* acts like an actual heart beat -- ie thump-thump-pause... */
>>>          switch (heartbeat_data->phase) {
>>>          case 0:
>>> @@ -111,12 +118,19 @@ static int heartbeat_reboot_notifier(struct notifier_block *nb,
>>>          return NOTIFY_DONE;
>>>   }
>>>
>>> +static int heartbeat_panic_notifier(struct notifier_block *nb,
>>> +                                    unsigned long code, void *unused)
>>> +{
>>> +       panic_heartbeats = 1;
>>
>>
>> Can we just set the LED to OFF and delete the timer here? The timer is
>> also useless after a kernel panic, so we wouldn't need this global
>> static variable.
>
>
> No, the necessary information (heartbeat_trig_data) isn't available here.
>

Yeah, looks like there is no way to pass heartbeat_trig_data
information to the notifier call function.
Anyway, I will apply this patch to my for-next branch.

Thanks,
-- 
Bryan Wu <bryan.wu@canonical.com>
Kernel Developer    +86.186-168-78255 Mobile
Canonical Ltd.      www.canonical.com
Ubuntu - Linux for human beings | www.ubuntu.com


* Re: [PATCH] leds: heartbeat: fix bug on panic
  2012-07-04  7:29       ` Bryan Wu
@ 2012-07-04  7:51         ` Alexander Holler
  2012-07-04  7:54           ` Bryan Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Holler @ 2012-07-04  7:51 UTC (permalink / raw)
  To: Bryan Wu; +Cc: linux-kernel, Shuah Khan, Richard Purdie, Feng Tang

On 04.07.2012 09:29, Bryan Wu wrote:

> Anyway, I will apply this patch to my for-next branch.

Is it already too late for 3.5 (where I introduced the bug)? I know I
should have reacted faster, but the bug report got buried under a ton of
other mails. ;)

Regards,

Alexander


* Re: [PATCH] leds: heartbeat: fix bug on panic
  2012-07-04  7:51         ` Alexander Holler
@ 2012-07-04  7:54           ` Bryan Wu
  0 siblings, 0 replies; 8+ messages in thread
From: Bryan Wu @ 2012-07-04  7:54 UTC (permalink / raw)
  To: Alexander Holler; +Cc: linux-kernel, Shuah Khan, Richard Purdie, Feng Tang

On Wed, Jul 4, 2012 at 3:51 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> On 04.07.2012 09:29, Bryan Wu wrote:
>
>
>> Anyway, I will apply this patch to my for-next branch.
>
>
> Is it already too late for 3.5 (where I introduced the bug)? I know I should
> have reacted faster, but the bug report got buried under a ton of other
> mails. ;)
>

OK, I will put it in my fixes-for-3.5 branch, since this is a real bug fix.

Thanks,
-- 
Bryan Wu <bryan.wu@canonical.com>
Kernel Developer    +86.186-168-78255 Mobile
Canonical Ltd.      www.canonical.com
Ubuntu - Linux for human beings | www.ubuntu.com

