* arm64: WARN_ON_ONCE issue when resuming from hibernation
@ 2018-12-07  1:40 Kunihiko Hayashi
  2018-12-07 10:47 ` Will Deacon
  0 siblings, 1 reply; 4+ messages in thread
From: Kunihiko Hayashi @ 2018-12-07  1:40 UTC (permalink / raw)
  To: arm, linux-arm-kernel

Hello all,

I found that a WARN_ON_ONCE dump occurred during the resume sequence from
hibernation on an arm64 SoC (I am using a UniPhier LD20 environment).

    ...
    Disabling non-boot CPUs ...
    CPU1: shutdown
    psci: CPU1 killed.
    CPU2: shutdown
    psci: CPU2 killed.
    CPU3: shutdown
    psci: CPU3 killed.
    WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
    ...

Below is what I found from reading the code; however,
I'm not sure whether this issue occurs on other arm64 SoCs.

In the resume sequence, all non-boot CPUs are stopped first, and then
local IRQs are disabled [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/hibernate.c?h=v4.20-rc4#n450

On arm64, flush_icache_range() is called after that.
It calls kick_all_cpus_sync() to synchronize all CPUs with an IPI, and
since local IRQs are disabled, the WARN_ON_ONCE() in
smp_call_function_many() fires [2].

[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/smp.c?h=v4.20-rc4#n415
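
The check referenced in [2] can be modelled in user space. The following is a
minimal sketch, assuming the v4.20-era condition combines "this CPU is online",
"IRQs are disabled", "no oops in progress" and "not in early boot". The struct
and function names are hypothetical and exist only for this illustration; they
are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the check looks at; not a kernel type. */
struct cpu_state_model {
	bool this_cpu_online;          /* stands in for cpu_online(this_cpu) */
	bool irqs_disabled;            /* stands in for irqs_disabled() */
	bool oops_in_progress;         /* stands in for oops_in_progress */
	bool early_boot_irqs_disabled; /* stands in for early_boot_irqs_disabled */
};

/* Returns true when the modelled condition would make the warning fire. */
static bool smp_call_would_warn(const struct cpu_state_model *s)
{
	assert(s != NULL);
	return s->this_cpu_online && s->irqs_disabled &&
	       !s->oops_in_progress && !s->early_boot_irqs_disabled;
}
```

In the resume path described here, CPU0 is online and IRQs are disabled, so the
modelled predicate is true; with IRQs enabled it is false.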

The following tree shows part of the call graph.

    resume_target_kernel()
    +- local_irq_disable()
    +- swsusp_arch_resume()				/* for arm64 */
       +- create_safe_exec_page()			/* for arm64 */
          +- flush_icache_range()			/* for arm64 */
             +- kick_all_cpus_sync()
                +- smp_call_function()
                   +- smp_call_function_many()
                      +- WARN_ON_ONCE(irqs_disabled())
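
The same chain can be walked through in a toy user-space model. This is only a
sketch: every model_* name is hypothetical, the cache maintenance itself is
omitted, and the second flush call exists purely to show the "once" behaviour
of the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool model_irqs_disabled;   /* toy stand-in for the local IRQ state */
static int model_warnings_printed; /* how many times the warning was reported */

/* Toy WARN_ON_ONCE: reports at most once for this single call site. */
static void model_warn_on_once(bool cond)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warnings_printed++;
		fprintf(stderr, "WARNING: IPI requested with IRQs disabled\n");
	}
}

static void model_smp_call_function_many(void)
{
	model_warn_on_once(model_irqs_disabled);
}

static void model_kick_all_cpus_sync(void)
{
	model_smp_call_function_many();
}

static void model_flush_icache_range(void)
{
	/* Cache maintenance would happen here, followed by the IPI-based sync. */
	model_kick_all_cpus_sync();
}

/* Mirrors the order in the call graph: IRQs off first, then the flush. */
static int model_resume_target_kernel(void)
{
	model_irqs_disabled = true;
	model_flush_icache_range();
	model_flush_icache_range(); /* already warned, so this one stays quiet */
	assert(model_warnings_printed <= 1);
	return model_warnings_printed;
}
```

Calling model_resume_target_kernel() reports the warning a single time, no
matter how often the flush runs afterwards.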

What would be a possible way to solve this issue?

Thank you,

---
Best Regards,
Kunihiko Hayashi



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: arm64: WARN_ON_ONCE issue when resuming from hibernation
  2018-12-07  1:40 arm64: WARN_ON_ONCE issue when resuming from hibernation Kunihiko Hayashi
@ 2018-12-07 10:47 ` Will Deacon
  2018-12-07 12:02   ` James Morse
  0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2018-12-07 10:47 UTC (permalink / raw)
  To: Kunihiko Hayashi; +Cc: arm, james.morse, linux-arm-kernel

[+ James]

On Fri, Dec 07, 2018 at 10:40:50AM +0900, Kunihiko Hayashi wrote:
> I found that a WARN_ON_ONCE dump occurred in the resuming sequence from
> hibernation on arm64 SoC (I use UniPhier LD20 environment).
> 
>     ...
>     Disabling non-boot CPUs ...
>     CPU1: shutdown
>     psci: CPU1 killed.
>     CPU2: shutdown
>     psci: CPU2 killed.
>     CPU3: shutdown
>     psci: CPU3 killed.
>     WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
>     Modules linked in:
>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
>     ...
> 
> I show the result of reading the code, however,
> I'm not sure that this issue occurs in other arm64 SoC.
> 
> In the resuming sequence, once all CPUs are stopped and local IRQs
> are disabled [1].
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/hibernate.c?h=v4.20-rc4#n450
> 
> In case of arm64, flush_icache_range() will be called after that.
> This calls kick_all_cpus_sync() to sync all CPUs with IPI, and
> since local IRQs are disabled, WARN_ON_ONCE() will be called in
> smp_call_function_many() [2].
> 
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/smp.c?h=v4.20-rc4#n415
> 
> The following tree shows a part of the callgraph.
> 
>     resume_target_kernel()
>     +- local_irq_disable()
>     +- swsusp_arch_resume()				/* for arm64 */
>        +- create_safe_exec_page()			/* for arm64 */
>           +- flush_icache_range()			/* for arm64 */
>              +- kick_all_cpus_sync()
>                 +- smp_call_function()
>                    +- smp_call_function_many()
>                       +- WARN_ON_ONCE(irqs_disabled())
> 
> What is the possible way to solve this issue?

Given that all secondary CPUs are hotplugged out at this point, we can
just use the non-IPI version of flush_icache_range(). Completely untested
diff below.

Will

--->8

diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 6b2686d54411..29cdc99688f3 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -214,7 +214,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
 	}
 
 	memcpy((void *)dst, src_start, length);
-	flush_icache_range(dst, dst + length);
+	__flush_icache_range(dst, dst + length);
 
 	pgdp = pgd_offset_raw(allocator(mask), dst_addr);
 	if (pgd_none(READ_ONCE(*pgdp))) {
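
Why the non-IPI variant is enough here can be illustrated with a small
user-space model. This is a sketch under stated assumptions: a four-CPU system
(CPU0 to CPU3, as in the log), a plain boolean array in place of the kernel's
CPU mask, and hypothetical model_* helper names.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* assumption: CPU0 to CPU3, as in the log */

/* Counts the CPUs other than this_cpu that an IPI broadcast would target. */
static int model_count_ipi_targets(const bool online[MODEL_NR_CPUS],
				   int this_cpu)
{
	int targets = 0;

	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu != this_cpu && online[cpu])
			targets++;
	}
	return targets;
}

/* The local-only flush loses nothing exactly when nobody else is online. */
static bool model_local_flush_is_enough(const bool online[MODEL_NR_CPUS],
					int this_cpu)
{
	return model_count_ipi_targets(online, this_cpu) == 0;
}
```

With only CPU0 online, as during resume, the broadcast would have no targets,
so skipping it changes nothing. With all four CPUs online there would be three
targets and the IPI-based variant would still be needed.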


* Re: arm64: WARN_ON_ONCE issue when resuming from hibernation
  2018-12-07 10:47 ` Will Deacon
@ 2018-12-07 12:02   ` James Morse
  2018-12-07 12:24     ` Kunihiko Hayashi
  0 siblings, 1 reply; 4+ messages in thread
From: James Morse @ 2018-12-07 12:02 UTC (permalink / raw)
  To: Will Deacon, Kunihiko Hayashi; +Cc: arm, linux-arm-kernel

Hi Kunihiko, Will,

On 07/12/2018 10:47, Will Deacon wrote:
> On Fri, Dec 07, 2018 at 10:40:50AM +0900, Kunihiko Hayashi wrote:
>> I found that a WARN_ON_ONCE dump occurred in the resuming sequence from
>> hibernation on arm64 SoC (I use UniPhier LD20 environment).
>>
>>     ...
>>     Disabling non-boot CPUs ...
>>     CPU1: shutdown
>>     psci: CPU1 killed.
>>     CPU2: shutdown
>>     psci: CPU2 killed.
>>     CPU3: shutdown
>>     psci: CPU3 killed.
>>     WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
>>     Modules linked in:
>>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
>>     ...
>>
>> I show the result of reading the code, however,
>> I'm not sure that this issue occurs in other arm64 SoC.

It does, but you can only see it if you also have 'no_console_suspend' on the
command-line, otherwise the console driver is frozen with the rest of the system
when this happens.

Thanks for the report!


>> In the resuming sequence, once all CPUs are stopped and local IRQs
>> are disabled [1].

>> What is the possible way to solve this issue?
> 
> Given that all secondary CPUs are hotplugged out at this point, we can
> just use the non-IPI version of flush_icache_range().

Sounds good,

> --->8
> 
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 6b2686d54411..29cdc99688f3 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -214,7 +214,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  	}
>  
>  	memcpy((void *)dst, src_start, length);
> -	flush_icache_range(dst, dst + length);
> +	__flush_icache_range(dst, dst + length);
>  
>  	pgdp = pgd_offset_raw(allocator(mask), dst_addr);
>  	if (pgd_none(READ_ONCE(*pgdp))) {
> 

Tested-by: James Morse <james.morse@arm.com>

This change came from commit 3b8c9f1cdfc50 ("arm64: IPI each CPU after
invalidating the I-cache for kernel mappings"), which came in with v4.19, if you
want to send it to stable. (Let me know if it's easier for you if I re-post your
patch.)


Thanks,

James


* Re: arm64: WARN_ON_ONCE issue when resuming from hibernation
  2018-12-07 12:02   ` James Morse
@ 2018-12-07 12:24     ` Kunihiko Hayashi
  0 siblings, 0 replies; 4+ messages in thread
From: Kunihiko Hayashi @ 2018-12-07 12:24 UTC (permalink / raw)
  To: Will Deacon, James Morse; +Cc: arm, linux-arm-kernel

Hi Will, James,

On Fri, 7 Dec 2018 12:02:57 +0000 <james.morse@arm.com> wrote:

> Hi Kunihiko, Will,
> 
> On 07/12/2018 10:47, Will Deacon wrote:
> > On Fri, Dec 07, 2018 at 10:40:50AM +0900, Kunihiko Hayashi wrote:
> >> I found that a WARN_ON_ONCE dump occurred in the resuming sequence from
> >> hibernation on arm64 SoC (I use UniPhier LD20 environment).
> >>
> >>     ...
> >>     Disabling non-boot CPUs ...
> >>     CPU1: shutdown
> >>     psci: CPU1 killed.
> >>     CPU2: shutdown
> >>     psci: CPU2 killed.
> >>     CPU3: shutdown
> >>     psci: CPU3 killed.
> >>     WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
> >>     Modules linked in:
> >>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
> >>     ...
> >>
> >> I show the result of reading the code, however,
> >> I'm not sure that this issue occurs in other arm64 SoC.
> 
> It does, but you can only see it if you also have 'no_console_suspend' on the
> command-line, otherwise the console driver is frozen with the rest of the system
> when this happens.

Yes, I added 'no_console_suspend' to the command line.
Indeed, the console was frozen without it.

> 
> Thanks for the report!
> 
> 
> >> In the resuming sequence, once all CPUs are stopped and local IRQs
> >> are disabled [1].
> 
> >> What is the possible way to solve this issue?
> > 
> > Given that all secondary CPUs are hotplugged out at this point, we can
> > just use the non-IPI version of flush_icache_range().
> 
> Sounds good,

Thanks for your solution.
Indeed, we can call the non-IPI version in create_safe_exec_page().

> > --->8
> > 
> > diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> > index 6b2686d54411..29cdc99688f3 100644
> > --- a/arch/arm64/kernel/hibernate.c
> > +++ b/arch/arm64/kernel/hibernate.c
> > @@ -214,7 +214,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >  	}
> >  
> >  	memcpy((void *)dst, src_start, length);
> > -	flush_icache_range(dst, dst + length);
> > +	__flush_icache_range(dst, dst + length);
> >  
> >  	pgdp = pgd_offset_raw(allocator(mask), dst_addr);
> >  	if (pgd_none(READ_ONCE(*pgdp))) {
> > 
> 
> Tested-by: James Morse <james.morse@arm.com>

I also tried your diff, and the resume sequence from hibernation finished
successfully without triggering WARN_ON_ONCE().

Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>

> This change came from commit 3b8c9f1cdfc50 ("arm64: IPI each CPU after
> invalidating the I-cache for kernel mappings"). Which came in with v4.19 if you
> want to send it to stable. (let me know if its easier for you if I re-post your
> patch)

Agreed.

Thank you,

> 
> Thanks,
> 
> James

---
Best Regards,
Kunihiko Hayashi



