* [PATCH]acpi c-states: Fix multiply C-states name disturbance
@ 2009-12-14 13:02 Youquan,Song
2009-12-14 19:09 ` Pallipadi, Venkatesh
2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
0 siblings, 2 replies; 8+ messages in thread
From: Youquan,Song @ 2009-12-14 13:02 UTC (permalink / raw)
To: lenb
Cc: venkatesh.pallipadi, linux, kent.liu, chaohong.guo, youquan.song,
linux-acpi
There are possible multiply C-states for special type of ACPI C-state, such as
ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.
In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number> will increase
with number of validate C-states. But its C-state name under stateXX will also
increase with number of validate C-state.
Actually, C-state name under stateXX should be type of ACPI C-state which we
can refer to /proc/acpi/processor/CPU0/power type[C-state].
This patch unify that all the multiply C-states to one special ACPI C-state.
Signed-off-by: Youquan, Song <youquan.song@intel.com>
---
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 302d656..a94b4d7 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1086,7 +1086,6 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
#endif
cpuidle_set_statedata(state, cx);
- snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
state->exit_latency = cx->latency;
state->target_residency = cx->latency * latency_factor;
@@ -1099,6 +1098,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
if (cx->entry_method == ACPI_CSTATE_FFH)
state->flags |= CPUIDLE_FLAG_TIME_VALID;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+ ACPI_STATE_C1);
state->enter = acpi_idle_enter_c1;
dev->safe_state = state;
break;
@@ -1106,6 +1107,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
case ACPI_STATE_C2:
state->flags |= CPUIDLE_FLAG_BALANCED;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+ ACPI_STATE_C2);
state->enter = acpi_idle_enter_simple;
dev->safe_state = state;
break;
@@ -1114,6 +1117,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
state->flags |= CPUIDLE_FLAG_DEEP;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
state->flags |= CPUIDLE_FLAG_CHECK_BM;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+ ACPI_STATE_C3);
state->enter = pr->flags.bm_check ?
acpi_idle_enter_bm :
acpi_idle_enter_simple;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* RE: [PATCH]acpi c-states: Fix multiply C-states name disturbance
2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
@ 2009-12-14 19:09 ` Pallipadi, Venkatesh
2009-12-16 16:06 ` [Resend PATCH]acpi " Youquan,Song
2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
1 sibling, 1 reply; 8+ messages in thread
From: Pallipadi, Venkatesh @ 2009-12-14 19:09 UTC (permalink / raw)
To: Youquan,Song, lenb@kernel.org
Cc: linux@dominikbrodowski.net, Liu, Kent, Guo, Chaohong,
Song, Youquan, linux-acpi@vger.kernel.org
Ack the idea of this patch.
I don't quite agree with the comment
"Actually, C-state name under stateXX should be type of ACPI"
cpuidle is more generic than ACPI and C-state name under stateXX
can be anything that the driver wants to be.
I agree with your point though that today there is no mapping for
ACPI state in cpuidle sysfs output and it will be nice to add
It.
However, I think it is better to have
>+ snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
>+ ACPI_STATE_C1);
To clearly distinguish it from OS C-state index or Hardware C-state.
One other thing of concern with this change is any breakage of
existing apps with this change.
Thanks,
Venki
>-----Original Message-----
>From: Youquan,Song [mailto:youquan.song@linux.intel.com]
>Sent: Monday, December 14, 2009 5:03 AM
>To: lenb@kernel.org
>Cc: Pallipadi, Venkatesh; linux@dominikbrodowski.net; Liu,
>Kent; Guo, Chaohong; Song, Youquan; linux-acpi@vger.kernel.org
>Subject: [PATCH]acpi c-states: Fix multiply C-states name disturbance
>
>There are possible multiply C-states for special type of ACPI
>C-state, such as
> ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.
>
>In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number>
>will increase
>with number of validate C-states. But its C-state name under
>stateXX will also
>increase with number of validate C-state.
>
>Actually, C-state name under stateXX should be type of ACPI
>C-state which we
>can refer to /proc/acpi/processor/CPU0/power type[C-state].
>
>This patch unify that all the multiply C-states to one special
>ACPI C-state.
>
>Signed-off-by: Youquan, Song <youquan.song@intel.com>
>---
>
>
>diff --git a/drivers/acpi/processor_idle.c
>b/drivers/acpi/processor_idle.c
>index 302d656..a94b4d7 100644
>--- a/drivers/acpi/processor_idle.c
>+++ b/drivers/acpi/processor_idle.c
>@@ -1086,7 +1086,6 @@ static int
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> #endif
> cpuidle_set_statedata(state, cx);
>
>- snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
> strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
> state->exit_latency = cx->latency;
> state->target_residency = cx->latency * latency_factor;
>@@ -1099,6 +1098,8 @@ static int
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> if (cx->entry_method == ACPI_CSTATE_FFH)
> state->flags |= CPUIDLE_FLAG_TIME_VALID;
>
>+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+ ACPI_STATE_C1);
> state->enter = acpi_idle_enter_c1;
> dev->safe_state = state;
> break;
>@@ -1106,6 +1107,8 @@ static int
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> case ACPI_STATE_C2:
> state->flags |= CPUIDLE_FLAG_BALANCED;
> state->flags |= CPUIDLE_FLAG_TIME_VALID;
>+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+ ACPI_STATE_C2);
> state->enter = acpi_idle_enter_simple;
> dev->safe_state = state;
> break;
>@@ -1114,6 +1117,8 @@ static int
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> state->flags |= CPUIDLE_FLAG_DEEP;
> state->flags |= CPUIDLE_FLAG_TIME_VALID;
> state->flags |= CPUIDLE_FLAG_CHECK_BM;
>+ snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+ ACPI_STATE_C3);
> state->enter = pr->flags.bm_check ?
> acpi_idle_enter_bm :
> acpi_idle_enter_simple;
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Resend PATCH]acpi c-states: Fix multiply C-states name disturbance
2009-12-14 19:09 ` Pallipadi, Venkatesh
@ 2009-12-16 16:06 ` Youquan,Song
0 siblings, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2009-12-16 16:06 UTC (permalink / raw)
To: lenb
Cc: venkatesh.pallipadi, linux@dominikbrodowski.net, Liu, Kent,
Guo, Chaohong, Song, Youquan, linux-acpi@vger.kernel.org
There are possible multiply C-states for special type of ACPI C-state, such as
ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.
In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number> will increase
with number of validate C-states. But its C-state name under stateXX will also
increase with number of validate C-state.
This patch unify that all the multiply C-states to one special ACPI C-state.
Signed-off-by: Youquan, Song <youquan.song@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 302d656..f14785d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1086,7 +1086,6 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
#endif
cpuidle_set_statedata(state, cx);
- snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
state->exit_latency = cx->latency;
state->target_residency = cx->latency * latency_factor;
@@ -1099,6 +1098,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
if (cx->entry_method == ACPI_CSTATE_FFH)
state->flags |= CPUIDLE_FLAG_TIME_VALID;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+ ACPI_STATE_C1);
state->enter = acpi_idle_enter_c1;
dev->safe_state = state;
break;
@@ -1106,6 +1107,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
case ACPI_STATE_C2:
state->flags |= CPUIDLE_FLAG_BALANCED;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+ ACPI_STATE_C2);
state->enter = acpi_idle_enter_simple;
dev->safe_state = state;
break;
@@ -1114,6 +1117,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
state->flags |= CPUIDLE_FLAG_DEEP;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
state->flags |= CPUIDLE_FLAG_CHECK_BM;
+ snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+ ACPI_STATE_C3);
state->enter = pr->flags.bm_check ?
acpi_idle_enter_bm :
acpi_idle_enter_simple;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
2009-12-14 19:09 ` Pallipadi, Venkatesh
@ 2010-01-04 10:49 ` Youquan,Song
2010-01-04 12:36 ` Arjan van de Ven
1 sibling, 1 reply; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 10:49 UTC (permalink / raw)
To: hpa, tglx, akpm
Cc: venkatesh.pallipadi, suresh.b.siddha, kent.liu, chaohong.guo,
youquan.song, linux-acpi, linux-kernel
Tickless is disabled by nohz=off, which is used in OSVs. But in current kernel,
if tickless is disabled, the timer irq0 will not increase. Because the timer
event handler should be tick_handle_periodic, but actually event handler keep
as tick_handle_oneshot_broadcast which is used in tickless. The root cause is
that it is default to enable high resolution timer which will force to oneshot
broadcast mode.
This patch add tickless enable check before enable high resolution timer
On Nehalem-EX:
Before the patch:
linux-a25n:~ # cat /proc/interrupts | grep timer
0: 334 0 0 0 0 0 ....
LOC: 192248 193931 193851 184441 193803 193625 ....
After the patch:
cat /proc/interrupts | grep timer
0: 223788 0 0 0 0 0 ....
LOC: 13081 238407 238452 229405 238298 235688 ....
Signed-off-by: Youquan, Song <youquan.song@intel.com>
---
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f992762..a515bed 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -815,7 +815,7 @@ int tick_check_oneshot_change(int allow_nohz)
if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available())
return 0;
- if (!allow_nohz)
+ if (!allow_nohz && tick_nohz_enabled)
return 1;
tick_nohz_switch_to_nohz();
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
@ 2010-01-04 12:36 ` Arjan van de Ven
2010-01-04 14:00 ` Peter Zijlstra
2010-01-04 21:40 ` Youquan,Song
0 siblings, 2 replies; 8+ messages in thread
From: Arjan van de Ven @ 2010-01-04 12:36 UTC (permalink / raw)
To: Youquan,Song
Cc: hpa, tglx, akpm, venkatesh.pallipadi, suresh.b.siddha, kent.liu,
chaohong.guo, youquan.song, linux-acpi, linux-kernel
On Mon, 4 Jan 2010 05:49:05 -0500
"Youquan,Song" <youquan.song@linux.intel.com> wrote:
> Tickless is disabled by nohz=off, which is used in OSVs.
it is? I doubt anyone wants to normally disable tickless, the power
cost is just too high....
> But in
> current kernel, if tickless is disabled, the timer irq0 will not
> increase.
why is this a problem?
> Because the timer event handler should be
> tick_handle_periodic, but actually event handler keep as
> tick_handle_oneshot_broadcast which is used in tickless. The root
> cause is that it is default to enable high resolution timer which
> will force to oneshot broadcast mode.
using local apic in one shot mode is not a problem, I'd in fact call it
a feature...
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
2010-01-04 12:36 ` Arjan van de Ven
@ 2010-01-04 14:00 ` Peter Zijlstra
2010-01-04 22:02 ` Youquan,Song
2010-01-04 21:40 ` Youquan,Song
1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2010-01-04 14:00 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Youquan,Song, hpa, tglx, akpm, venkatesh.pallipadi,
suresh.b.siddha, kent.liu, chaohong.guo, youquan.song, linux-acpi,
linux-kernel, jens.axboe
On Mon, 2010-01-04 at 04:36 -0800, Arjan van de Ven wrote:
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
>
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature...
lapic yes, broadcast not so much.
He mentioned he was using Nehelem-EX which is known broken in that it
stops the lapic in deeper C states and thus reverts to HPET broadcast
since there's not enough HPET timers to go around.
Something like processor.max_cstate=2 should fix that IIRC.
Jens ran into this on his shiny new box.
Disabling nohz is funny indeed, one wonders why he does that.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
2010-01-04 12:36 ` Arjan van de Ven
2010-01-04 14:00 ` Peter Zijlstra
@ 2010-01-04 21:40 ` Youquan,Song
1 sibling, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 21:40 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Youquan,Song, hpa, tglx, akpm, venkatesh.pallipadi,
suresh.b.siddha, kent.liu, chaohong.guo, youquan.song, linux-acpi,
linux-kernel
> > Tickless is disabled by nohz=off, which is used in OSVs.
>
> it is? I doubt anyone wants to normally disable tickless, the power
> cost is just too high....
tickless is normally used but it is possible to use for some special users,
tickless is disabled by nohz=off kernel option, which is address in OSV's
release notes.
> > But in
> > current kernel, if tickless is disabled, the timer irq0 will not
> > increase.
>
> why is this a problem?
Two reasons:
R1:
In my mind, tick is period timer with frequent of HZ. but timer irq0 is
actually be an oneshot mode timer.
But in real watch, it is keep not increase for long time. but it
suddenly increase 5000~10000 times in a seconds.
R2:
Tickless disable, change cpuidle driver to ladder governor(menu does not
work at all).
Run powertop, if apply my patch, the c-state residency has > 10% improvement
than current kernel.
>
> > Because the timer event handler should be
> > tick_handle_periodic, but actually event handler keep as
> > tick_handle_oneshot_broadcast which is used in tickless. The root
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
>
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature...
But in current kernel, local apic timer is in periodic mode with
frequencey of HZ when tickless is disabled.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
2010-01-04 14:00 ` Peter Zijlstra
@ 2010-01-04 22:02 ` Youquan,Song
0 siblings, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 22:02 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Arjan van de Ven, Youquan,Song, hpa, tglx, akpm,
venkatesh.pallipadi, suresh.b.siddha, kent.liu, chaohong.guo,
youquan.song, linux-acpi, linux-kernel, jens.axboe
> Disabling nohz is funny indeed, one wonders why he does that.
I do not have real case for it. I look at nohz=off because
1. OSV has such release notes to explain how to disable tickless.
2. with some OSV config + latest kernel, it will result in kernel
oops and it config disable tickless. I have reported it to acpi maillist
before.
Thanks.
-Youquan
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-01-04 14:27 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
2009-12-14 19:09 ` Pallipadi, Venkatesh
2009-12-16 16:06 ` [Resend PATCH]acpi " Youquan,Song
2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
2010-01-04 12:36 ` Arjan van de Ven
2010-01-04 14:00 ` Peter Zijlstra
2010-01-04 22:02 ` Youquan,Song
2010-01-04 21:40 ` Youquan,Song
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).