* [PATCH] cpuidle: coupled: fix dead loop corner case
@ 2013-08-20 5:17 Neil Zhang
2013-08-20 12:36 ` Rafael J. Wysocki
2013-08-22 21:08 ` Colin Cross
0 siblings, 2 replies; 5+ messages in thread
From: Neil Zhang @ 2013-08-20 5:17 UTC (permalink / raw)
To: rjw, daniel.lezcano; +Cc: linux-pm, linux-kernel, Neil Zhang
There is a corner case when no peripheral IRQs are routed to the
secondary cores.

Taking a dual-core system as an example, the sequence is as follows:

    Core 0                              Core 1
                                        1. set waiting bit and enter
                                           waiting loop
    2. set waiting bit and poke core 1
                                        3. clear poke in irq and enter
                                           safe state
    4. set ready bit and enter ready
       loop

Since no peripheral IRQ is routed to core 1, it stays in the safe state
forever, and core 0 spins forever in the following loop:

	while (!cpuidle_coupled_cpus_ready(coupled)) {
		/* Check if any other cpus bailed out of idle. */
		if (!cpuidle_coupled_cpus_waiting(coupled))
			...
	}

The solution is to not let a secondary core enter the safe state once it
has already handled the poke interrupt.

Signed-off-by: Neil Zhang <zhangwm@marvell.com>
Reviewed-by: Fangsuo Wu <fswu@marvell.com>
---
drivers/cpuidle/coupled.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
index 2a297f8..a37c718 100644
--- a/drivers/cpuidle/coupled.c
+++ b/drivers/cpuidle/coupled.c
@@ -119,6 +119,7 @@ struct cpuidle_coupled {
 #define CPUIDLE_COUPLED_NOT_IDLE	(-1)
 
 static DEFINE_MUTEX(cpuidle_coupled_lock);
+static DEFINE_PER_CPU(bool, poke_sync);
 static DEFINE_PER_CPU(struct call_single_data, cpuidle_coupled_poke_cb);
 
 /*
@@ -295,6 +296,7 @@ static void cpuidle_coupled_poked(void *info)
 {
 	int cpu = (unsigned long)info;
 	cpumask_clear_cpu(cpu, &cpuidle_coupled_poked_mask);
+	__this_cpu_write(poke_sync, true);
 }
 
 /**
@@ -473,6 +475,7 @@ retry:
 	 * allowed for a single cpu.
 	 */
 	while (!cpuidle_coupled_cpus_waiting(coupled)) {
+		__this_cpu_write(poke_sync, false);
 		if (cpuidle_coupled_clear_pokes(dev->cpu)) {
 			cpuidle_coupled_set_not_waiting(dev->cpu, coupled);
 			goto out;
@@ -483,6 +486,10 @@ retry:
 			goto out;
 		}
 
+		if (cpuidle_coupled_cpus_waiting(coupled)
+				&& __this_cpu_read(poke_sync))
+			break;
+
 		entered_state = cpuidle_enter_state(dev, drv,
 			dev->safe_state_index);
 	}
--
1.7.4.1
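
The four-step sequence in the commit message can be replayed in user space. The block below is only a simplified model, not kernel code: `struct sim`, `all_waiting()` and `simulate_core1()` are hypothetical names invented for this illustration, and the interleaving is hard-coded to the order given in the commit message. It assumes, as the report does, that core 1 has no wake-up source other than the poke.

```c
#include <assert.h>
#include <stdbool.h>

/* Where core 1 ends up after the sequence from the commit message. */
enum core1_outcome {
	CORE1_STUCK_IN_SAFE_STATE,	/* asleep, nothing left to wake it */
	CORE1_LEFT_WAITING_LOOP,	/* broke out and can go on to the ready loop */
};

/* Hypothetical, heavily simplified model of the coupled state. */
struct sim {
	int waiting_count;	/* cores that have set their waiting bit */
	int online_count;	/* cores in the coupled set */
	bool poke_pending;	/* poke sent to core 1 and not yet handled */
	bool poke_sync;		/* models the per-cpu flag added by the patch */
};

static bool all_waiting(const struct sim *s)
{
	return s->waiting_count == s->online_count;
}

enum core1_outcome simulate_core1(bool with_fix)
{
	struct sim s = { .waiting_count = 0, .online_count = 2 };

	/* Step 1: core 1 sets its waiting bit and enters the waiting loop. */
	s.waiting_count++;
	assert(!all_waiting(&s));	/* core 0 is not waiting yet */
	s.poke_sync = false;		/* the patch resets the flag in the loop body */

	/* Step 2: core 0 sets its waiting bit and pokes core 1. */
	s.waiting_count++;
	s.poke_pending = true;

	/* Step 3: core 1 handles the poke in the IRQ handler. */
	s.poke_pending = false;
	s.poke_sync = true;

	/* Check added by the patch: do not sleep after a handled poke. */
	if (with_fix && all_waiting(&s) && s.poke_sync)
		return CORE1_LEFT_WAITING_LOOP;

	/*
	 * Without the check, core 1 enters the safe state with no poke
	 * pending and (by assumption) no other interrupt routed to it.
	 */
	return CORE1_STUCK_IN_SAFE_STATE;
}
```

Calling `simulate_core1(false)` models the reported hang, and `simulate_core1(true)` models the behaviour the patch is aiming for. The model says nothing about other interleavings or about memory ordering on real hardware.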
* Re: [PATCH] cpuidle: coupled: fix dead loop corner case
From: Rafael J. Wysocki @ 2013-08-20 12:36 UTC (permalink / raw)
To: Neil Zhang, Daniel Lezcano; +Cc: linux-pm, linux-kernel
On Tuesday, August 20, 2013 01:17:44 PM Neil Zhang wrote:
> There is a corner case when no peripheral IRQs are routed to the
> secondary cores.
>
> Taking a dual-core system as an example, the sequence is as follows:
>
>     Core 0                              Core 1
>                                         1. set waiting bit and enter
>                                            waiting loop
>     2. set waiting bit and poke core 1
>                                         3. clear poke in irq and enter
>                                            safe state
>     4. set ready bit and enter ready
>        loop
>
> Since no peripheral IRQ is routed to core 1, it stays in the safe state
> forever, and core 0 spins forever in the following loop:
>
> 	while (!cpuidle_coupled_cpus_ready(coupled)) {
> 		/* Check if any other cpus bailed out of idle. */
> 		if (!cpuidle_coupled_cpus_waiting(coupled))
> 			...
> 	}
>
> The solution is to not let a secondary core enter the safe state once it
> has already handled the poke interrupt.
>
> Signed-off-by: Neil Zhang <zhangwm@marvell.com>
> Reviewed-by: Fangsuo Wu <fswu@marvell.com>

Daniel, can you please have a look at this?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
* RE: [PATCH] cpuidle: coupled: fix dead loop corner case
From: Neil Zhang @ 2013-08-22 10:11 UTC (permalink / raw)
To: Rafael J. Wysocki, Daniel Lezcano, ccross@android.com
Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Daniel & Colin,
> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> Sent: August 20, 2013 20:37
> To: Neil Zhang; Daniel Lezcano
> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] cpuidle: coupled: fix dead loop corner case
>
> Daniel, can you please have a look at this?
>
> Rafael
>
What's your opinion?
Thanks.
Best Regards,
Neil Zhang
* Re: [PATCH] cpuidle: coupled: fix dead loop corner case
From: Colin Cross @ 2013-08-22 21:08 UTC (permalink / raw)
To: Neil Zhang; +Cc: Rafael J. Wysocki, Daniel Lezcano, Linux PM list, lkml
On Mon, Aug 19, 2013 at 10:17 PM, Neil Zhang <zhangwm@marvell.com> wrote:
> There is a corner case when no peripheral IRQs are routed to the
> secondary cores.
>
> Taking a dual-core system as an example, the sequence is as follows:
>
>     Core 0                              Core 1
>                                         1. set waiting bit and enter
>                                            waiting loop
>     2. set waiting bit and poke core 1
>                                         3. clear poke in irq and enter
>                                            safe state
>     4. set ready bit and enter ready
>        loop
>
> Since no peripheral IRQ is routed to core 1, it stays in the safe state
> forever, and core 0 spins forever in the following loop:
>
> 	while (!cpuidle_coupled_cpus_ready(coupled)) {
> 		/* Check if any other cpus bailed out of idle. */
> 		if (!cpuidle_coupled_cpus_waiting(coupled))
> 			...
> 	}
>
> The solution is to not let a secondary core enter the safe state once it
> has already handled the poke interrupt.
>
> Signed-off-by: Neil Zhang <zhangwm@marvell.com>
> Reviewed-by: Fangsuo Wu <fswu@marvell.com>

I have a similar patch that avoids adding another check for
cpuidle_coupled_cpus_waiting, and uses the return value from
cpuidle_coupled_clear_pokes instead of adding a percpu bool. I will
post it shortly.
Do you have a test case that can reproduce this easily?
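
The alternative described above, acting on the return value of the poke-clearing step rather than on a separate per-cpu flag, could look roughly like the user-space sketch below. This is not the follow-up patch, which does not appear in this thread. `struct core_model`, `clear_pokes()` and `waiting_loop_safe_entries()` are hypothetical names, and `clear_pokes()` here reports "a poke was handled", which is an assumption about the intended meaning and not the behaviour of the existing `cpuidle_coupled_clear_pokes()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one secondary core in the waiting loop. */
struct core_model {
	bool poke_pending;	/* a poke was sent and is not yet handled */
	bool all_waiting;	/* every coupled core has set its waiting bit */
};

/* Handle a pending poke, if any, and report whether one was handled. */
static bool clear_pokes(struct core_model *c)
{
	if (!c->poke_pending)
		return false;
	c->poke_pending = false;
	return true;
}

/*
 * Returns how many times the core entered the safe state before leaving
 * the waiting loop, or -1 if it went to sleep with nothing left to wake it.
 */
int waiting_loop_safe_entries(bool use_return_value)
{
	struct core_model c = { .poke_pending = false, .all_waiting = false };
	bool other_core_arrived = false;
	int safe_entries = 0;

	while (!c.all_waiting) {
		/*
		 * Model the race: the other core sets its waiting bit and
		 * pokes us just after the loop condition was evaluated.
		 */
		if (!other_core_arrived) {
			other_core_arrived = true;
			c.all_waiting = true;
			c.poke_pending = true;
		}

		bool handled = clear_pokes(&c);

		/* A handled poke means the state changed: re-check it. */
		if (use_return_value && handled)
			continue;

		/* Enter the safe state; only a pending poke can wake us. */
		safe_entries++;
		if (!c.poke_pending)
			return -1;
	}

	return safe_entries;
}
```

With `use_return_value` set to false the model reproduces the hang (returns -1). With it set to true the loop condition is re-evaluated after the poke and the core leaves the loop without sleeping (returns 0). The design point is that no extra state is needed, because the poke-clearing step already knows whether it consumed a poke.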
* RE: [PATCH] cpuidle: coupled: fix dead loop corner case
From: Neil Zhang @ 2013-08-23 3:17 UTC (permalink / raw)
To: Colin Cross; +Cc: Rafael J. Wysocki, Daniel Lezcano, Linux PM list, lkml
> -----Original Message-----
> From: Colin Cross [mailto:ccross@google.com]
> Sent: August 23, 2013 5:08
> To: Neil Zhang
> Cc: Rafael J. Wysocki; Daniel Lezcano; Linux PM list; lkml
> Subject: Re: [PATCH] cpuidle: coupled: fix dead loop corner case
>
> I have a similar patch that avoids adding another check for
> cpuidle_coupled_cpus_waiting, and uses the return value from
> cpuidle_coupled_clear_pokes instead of adding a percpu bool. I will post it
> shortly.
>
> Do you have a test case that can reproduce this easily?
It's not easy to reproduce.
We have only hit it once so far.
Best Regards,
Neil Zhang