* Do we need to fix below dump during cpu hot plug operation? @ 2012-10-19 1:56 Peter Chen 2012-10-19 4:58 ` Shawn Guo 0 siblings, 1 reply; 5+ messages in thread From: Peter Chen @ 2012-10-19 1:56 UTC (permalink / raw) To: linux-arm-kernel Hi all, When doing system suspend/resume or CPU hot plug at SMP platform (like Freescale i.mx6Q sabrelite board), if local time is enabled, there is a clk_get_rate at twd_timer_setup during the atomic context. So, if mutex_lock debug is enabled, there will be below dump message. Although it will not affect future system use, it is boring message. BUG: sleeping function called from invalid context at /home/b29397/work/code/git/linus/linux-2.6/kernel/mutex.c:269 no locks held by swapper/1/0. irq event stamp: 279320 hardirqs last enabled at (279319): [<80052bd0>] idle_task_exit+0x16c/0x1a0 hardirqs last disabled at (279320): [<8051a470>] cpu_die+0x28/0x98 softirqs last enabled at (279304): [<8002ae3c>] _local_bh_enable+0x14/0x18 softirqs last disabled at (279303): [<8002aea8>] irq_enter+0x68/0x78 Backtrace: [<8001284c>] (dump_backtrace+0x0/0x10c) from [<8052b614>] (dump_stack+0x18/0x1c) r7:8062b070 r6:0000010d r5:00000000 r4:bf88c000 [<8052b5fc>] (dump_stack+0x0/0x1c) from [<80050bb4>] (__might_sleep+0x1ec/0x238) [<800509c8>] (__might_sleep+0x0/0x238) from [<8052cf60>] (mutex_lock_nested+0x38/0x3bc) r7:00000000 r6:bf88c000 r5:808b19e0 r4:808a5360 [<8052cf28>] (mutex_lock_nested+0x0/0x3bc) from [<803e2108>] (clk_get_rate+0x1c/0x84) [<803e20ec>] (clk_get_rate+0x0/0x84) from [<8052900c>] (twd_timer_setup+0x28c/0x2e0) r5:808b19e0 r4:817c0200 [<80528d80>] (twd_timer_setup+0x0/0x2e0) from [<805288b8>] (percpu_timer_setup+0x6c/0xdc) r9:412fc09a r8:1000406a r7:808b19d8 r6:bf88c000 r5:80879c14 r4:817c0200 [<8052884c>] (percpu_timer_setup+0x0/0xdc) from [<80528a68>] (secondary_start_kernel+0x140/0x19c) r5:80879c14 r4:00000001 [<80528928>] (secondary_start_kernel+0x0/0x19c) from [<10528248>] (0x10528248) r7:808b19d8 r6:10c03c7d r5:0000001f r4:4f87806a BUG: scheduling while atomic: swapper/1/0/0x40000002 no locks held by swapper/1/0. Modules linked in: irq event stamp: 279320 hardirqs last enabled at (279319): [<80052bd0>] idle_task_exit+0x16c/0x1a0 hardirqs last disabled at (279320): [<8051a470>] cpu_die+0x28/0x98 softirqs last enabled at (279304): [<8002ae3c>] _local_bh_enable+0x14/0x18 softirqs last disabled at (279303): [<8002aea8>] irq_enter+0x68/0x78 Backtrace: [<8001284c>] (dump_backtrace+0x0/0x10c) from [<8052b614>] (dump_stack+0x18/0x1c) r7:00000001 r6:bf88c000 r5:bf890000 r4:bf890000 [<8052b5fc>] (dump_stack+0x0/0x1c) from [<800539b4>] (__schedule_bug+0x5c/0x7c) [<80053958>] (__schedule_bug+0x0/0x7c) from [<8052e1a8>] (__schedule+0x620/0x6e0) r5:bf890000 r4:819215c0 [<8052db88>] (__schedule+0x0/0x6e0) from [<8052e320>] (_cond_resched+0x44/0x58) [<8052e2dc>] (_cond_resched+0x0/0x58) from [<8052cf64>] (mutex_lock_nested+0x3c/0x3bc) r5:808b19e0 r4:808a5360 [<8052cf28>] (mutex_lock_nested+0x0/0x3bc) from [<803e2108>] (clk_get_rate+0x1c/0x84) [<803e20ec>] (clk_get_rate+0x0/0x84) from [<8052900c>] (twd_timer_setup+0x28c/0x2e0) r5:808b19e0 r4:817c0200 [<80528d80>] (twd_timer_setup+0x0/0x2e0) from [<805288b8>] (percpu_timer_setup+0x6c/0xdc) r9:412fc09a r8:1000406a r7:808b19d8 r6:bf88c000 r5:80879c14 r4:817c0200 [<8052884c>] (percpu_timer_setup+0x0/0xdc) from [<80528a68>] (secondary_start_kernel+0x140/0x19c) r5:80879c14 r4:00000001 [<80528928>] (secondary_start_kernel+0x0/0x19c) from [<10528248>] (0x10528248) r7:808b19d8 r6:10c03c7d r5:0000001f r4:4f87806a -- Best Regards, Peter Chen ^ permalink raw reply [flat|nested] 5+ messages in thread
* Do we need to fix below dump during cpu hot plug operation? 2012-10-19 1:56 Do we need to fix below dump during cpu hot plug operation? Peter Chen @ 2012-10-19 4:58 ` Shawn Guo 2012-10-19 9:37 ` Linus Walleij 0 siblings, 1 reply; 5+ messages in thread From: Shawn Guo @ 2012-10-19 4:58 UTC (permalink / raw) To: linux-arm-kernel On Fri, Oct 19, 2012 at 09:56:22AM +0800, Peter Chen wrote: > Hi all, > > When doing system suspend/resume or CPU hot plug at SMP platform (like > Freescale i.mx6Q sabrelite board), if local time is enabled, > there is a clk_get_rate at twd_timer_setup during the atomic context. > So, if mutex_lock debug is enabled, there will be below dump message. > Although it will not affect future system use, it is boring message. > Thanks for reporting, Peter. We should fix the warning. Having not turned on those warning options in imx defconfig might be the reason that we had not noticed it. Since the driver is able to update the twd_timer_rate whenever clock rate changes after system boots up, the clk_get_rate call in twd_timer_setup should be only needed when we get a valid twd_clk at the first time. So let's move clk_get_rate call into if (!twd_clk) block just after twd_get_clock(). Linus, Russell, Does it seem to be a valid fix? Shawn 8<--- diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c index e1f9069..c1f54a1 100644 --- a/arch/arm/kernel/smp_twd.c +++ b/arch/arm/kernel/smp_twd.c @@ -273,13 +273,13 @@ static int __cpuinit twd_timer_setup(struct clock_event_device *clk) { struct clock_event_device **this_cpu_clk; - if (!twd_clk) + if (!twd_clk) { twd_clk = twd_get_clock(); - - if (!IS_ERR_OR_NULL(twd_clk)) - twd_timer_rate = clk_get_rate(twd_clk); - else + if (!IS_ERR_OR_NULL(twd_clk)) + twd_timer_rate = clk_get_rate(twd_clk); + } else { twd_calibrate_rate(); + } __raw_writel(0, twd_base + TWD_TIMER_CONTROL); > BUG: sleeping function called from invalid context at /home/b29397/work/code/git/linus/linux-2.6/kernel/mutex.c:269 > no locks held by swapper/1/0. > irq event stamp: 279320 > hardirqs last enabled at (279319): [<80052bd0>] idle_task_exit+0x16c/0x1a0 > hardirqs last disabled at (279320): [<8051a470>] cpu_die+0x28/0x98 > softirqs last enabled at (279304): [<8002ae3c>] _local_bh_enable+0x14/0x18 > softirqs last disabled at (279303): [<8002aea8>] irq_enter+0x68/0x78 > Backtrace: > [<8001284c>] (dump_backtrace+0x0/0x10c) from [<8052b614>] (dump_stack+0x18/0x1c) > r7:8062b070 r6:0000010d r5:00000000 r4:bf88c000 > [<8052b5fc>] (dump_stack+0x0/0x1c) from [<80050bb4>] (__might_sleep+0x1ec/0x238) > [<800509c8>] (__might_sleep+0x0/0x238) from [<8052cf60>] (mutex_lock_nested+0x38/0x3bc) > r7:00000000 r6:bf88c000 r5:808b19e0 r4:808a5360 > [<8052cf28>] (mutex_lock_nested+0x0/0x3bc) from [<803e2108>] (clk_get_rate+0x1c/0x84) > [<803e20ec>] (clk_get_rate+0x0/0x84) from [<8052900c>] (twd_timer_setup+0x28c/0x2e0) > r5:808b19e0 r4:817c0200 > [<80528d80>] (twd_timer_setup+0x0/0x2e0) from [<805288b8>] (percpu_timer_setup+0x6c/0xdc) > r9:412fc09a r8:1000406a r7:808b19d8 r6:bf88c000 r5:80879c14 > r4:817c0200 > [<8052884c>] (percpu_timer_setup+0x0/0xdc) from [<80528a68>] (secondary_start_kernel+0x140/0x19c) > r5:80879c14 r4:00000001 > [<80528928>] (secondary_start_kernel+0x0/0x19c) from [<10528248>] (0x10528248) > r7:808b19d8 r6:10c03c7d r5:0000001f r4:4f87806a > BUG: scheduling while atomic: swapper/1/0/0x40000002 > no locks held by swapper/1/0. > Modules linked in: > irq event stamp: 279320 > hardirqs last enabled at (279319): [<80052bd0>] idle_task_exit+0x16c/0x1a0 > hardirqs last disabled at (279320): [<8051a470>] cpu_die+0x28/0x98 > softirqs last enabled at (279304): [<8002ae3c>] _local_bh_enable+0x14/0x18 > softirqs last disabled at (279303): [<8002aea8>] irq_enter+0x68/0x78 > Backtrace: > [<8001284c>] (dump_backtrace+0x0/0x10c) from [<8052b614>] (dump_stack+0x18/0x1c) > r7:00000001 r6:bf88c000 r5:bf890000 r4:bf890000 > [<8052b5fc>] (dump_stack+0x0/0x1c) from [<800539b4>] (__schedule_bug+0x5c/0x7c) > [<80053958>] (__schedule_bug+0x0/0x7c) from [<8052e1a8>] (__schedule+0x620/0x6e0) > r5:bf890000 r4:819215c0 > [<8052db88>] (__schedule+0x0/0x6e0) from [<8052e320>] (_cond_resched+0x44/0x58) > [<8052e2dc>] (_cond_resched+0x0/0x58) from [<8052cf64>] (mutex_lock_nested+0x3c/0x3bc) > r5:808b19e0 r4:808a5360 > [<8052cf28>] (mutex_lock_nested+0x0/0x3bc) from [<803e2108>] (clk_get_rate+0x1c/0x84) > [<803e20ec>] (clk_get_rate+0x0/0x84) from [<8052900c>] (twd_timer_setup+0x28c/0x2e0) > r5:808b19e0 r4:817c0200 > [<80528d80>] (twd_timer_setup+0x0/0x2e0) from [<805288b8>] (percpu_timer_setup+0x6c/0xdc) > r9:412fc09a r8:1000406a r7:808b19d8 r6:bf88c000 r5:80879c14 > r4:817c0200 > [<8052884c>] (percpu_timer_setup+0x0/0xdc) from [<80528a68>] (secondary_start_kernel+0x140/0x19c) > r5:80879c14 r4:00000001 > [<80528928>] (secondary_start_kernel+0x0/0x19c) from [<10528248>] (0x10528248) > r7:808b19d8 r6:10c03c7d r5:0000001f r4:4f87806a > > -- > > Best Regards, > Peter Chen > ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Do we need to fix below dump during cpu hot plug operation? 2012-10-19 4:58 ` Shawn Guo @ 2012-10-19 9:37 ` Linus Walleij 2012-10-19 9:40 ` Russell King - ARM Linux 0 siblings, 1 reply; 5+ messages in thread From: Linus Walleij @ 2012-10-19 9:37 UTC (permalink / raw) To: linux-arm-kernel On Fri, Oct 19, 2012 at 6:58 AM, Shawn Guo <shawn.guo@linaro.org> wrote: > Linus, Russell, > Does it seem to be a valid fix? Yes, but this patch is very incomplete. It only accounts for handling the fact that the *twd_clk has already been fetched. If the clock .setup() and .stop() is going to be called repeatedly, look over the entire .setup() and .stop() function pair and make it reentrant, right now it's pure luck if it works. There are obviously more things that need fixing here. You need to move this_cpu_clk = __this_cpu_ptr(twd_evt); and check that before trying to register the clockevent right, because if setup() is called repeatedly, you will add a new clockevent every time, which is a massive memory leak. Isn't the easiest and most straigt-forward fix something like this: >From 202c411f3212f74a5d2525ca291b249e1599b64e Mon Sep 17 00:00:00 2001 From: Linus Walleij <linus.walleij@linaro.org> Date: Fri, 19 Oct 2012 11:30:47 +0200 Subject: [PATCH] ARM: SMP_TWD: make setup()/stop() reentrant This makes the SMP_TWD clock .setup()/.stop() pair reentrant by not re-fetching the clk and re-registering the clock event every time .setup() is called. We also make sure to call the clk_enable()/clk_disable() pair on subsequent calls. As it has been brought to my knowledge that this pair is going to be called from atomic contexts for CPU clusters coming and going, the clk_prepare()/clk_unprepare() calls cannot be called on subsequent .setup()/.stop() iterations. The patch assumes that the code will make sure that twd_set_mode() is called through .set_mode() on the clock event *after* the .setup() call, so that the timer registers are fully re-programmed after a new .setup() cycle. Reported-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- arch/arm/kernel/smp_twd.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c index e1f9069..1ac637b 100644 --- a/arch/arm/kernel/smp_twd.c +++ b/arch/arm/kernel/smp_twd.c @@ -31,6 +31,7 @@ static void __iomem *twd_base; static struct clk *twd_clk; static unsigned long twd_timer_rate; +static bool initial_setup_called; static struct clock_event_device __percpu **twd_evt; static int twd_ppi; @@ -93,6 +94,8 @@ static void twd_timer_stop(struct clock_event_device *clk) { twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk); disable_percpu_irq(clk->irq); + if (twd_clk) + clk_disable(twd_clk); } #ifdef CONFIG_COMMON_CLK @@ -273,8 +276,21 @@ static int __cpuinit twd_timer_setup(struct clock_event_device *clk) { struct clock_event_device **this_cpu_clk; - if (!twd_clk) - twd_clk = twd_get_clock(); + /* + * If the basic setup has been done before, don't bother + * with yet again looking up the clock and register the clock + * source. + */ + if (initial_setup_called) { + if (twd_clk) + clk_enable(twd_clk); + __raw_writel(0, twd_base + TWD_TIMER_CONTROL); + enable_percpu_irq(clk->irq, 0); + return 0; + } + initial_setup_called = true; + + twd_clk = twd_get_clock(); if (!IS_ERR_OR_NULL(twd_clk)) twd_timer_rate = clk_get_rate(twd_clk); -- 1.7.11.7 Can you please see if this solves your problem? Yours, Linus Walleij ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Do we need to fix below dump during cpu hot plug operation? 2012-10-19 9:37 ` Linus Walleij @ 2012-10-19 9:40 ` Russell King - ARM Linux 2012-10-19 9:50 ` Linus Walleij 0 siblings, 1 reply; 5+ messages in thread From: Russell King - ARM Linux @ 2012-10-19 9:40 UTC (permalink / raw) To: linux-arm-kernel On Fri, Oct 19, 2012 at 11:37:48AM +0200, Linus Walleij wrote: > @@ -93,6 +94,8 @@ static void twd_timer_stop(struct clock_event_device *clk) > { > twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk); > disable_percpu_irq(clk->irq); > + if (twd_clk) > + clk_disable(twd_clk); While we're here, can we please ensure that the CLK API is properly respected, and use IS_ERR(twd_clk) to determine of the clock is valid or not (and not use IS_ERR_OR_NULL()). ^ permalink raw reply [flat|nested] 5+ messages in thread
* Do we need to fix below dump during cpu hot plug operation? 2012-10-19 9:40 ` Russell King - ARM Linux @ 2012-10-19 9:50 ` Linus Walleij 0 siblings, 0 replies; 5+ messages in thread From: Linus Walleij @ 2012-10-19 9:50 UTC (permalink / raw) To: linux-arm-kernel On Fri, Oct 19, 2012 at 11:40 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Oct 19, 2012 at 11:37:48AM +0200, Linus Walleij wrote: >> @@ -93,6 +94,8 @@ static void twd_timer_stop(struct clock_event_device *clk) >> { >> twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk); >> disable_percpu_irq(clk->irq); >> + if (twd_clk) >> + clk_disable(twd_clk); > > While we're here, can we please ensure that the CLK API is properly > respected, and use IS_ERR(twd_clk) to determine of the clock is valid > or not (and not use IS_ERR_OR_NULL()). OK I'll refine this thing if it works for Freescale. I'm under the impression that the use of IS_ERR_OR_NULL() in the setup() section below the changed code is correct however, since it'll catch the situation where the clock API is disabled (and returns rate 0) and then proceed to calibrate the rate properly? Yours, Linus Walleij ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-10-19 9:50 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-19 1:56 Do we need to fix below dump during cpu hot plug operation? Peter Chen 2012-10-19 4:58 ` Shawn Guo 2012-10-19 9:37 ` Linus Walleij 2012-10-19 9:40 ` Russell King - ARM Linux 2012-10-19 9:50 ` Linus Walleij
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).