linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH]acpi c-states: Fix multiply C-states name disturbance
@ 2009-12-14 13:02 Youquan,Song
  2009-12-14 19:09 ` Pallipadi, Venkatesh
  2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
  0 siblings, 2 replies; 8+ messages in thread
From: Youquan,Song @ 2009-12-14 13:02 UTC (permalink / raw)
  To: lenb
  Cc: venkatesh.pallipadi, linux, kent.liu, chaohong.guo, youquan.song,
	linux-acpi

There are possible multiply C-states for special type of ACPI C-state, such as
 ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.

In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number> will increase 
with number of validate C-states. But its C-state name under stateXX will also
increase with number of validate C-state.

Actually, C-state name under stateXX should be type of ACPI C-state which we 
can refer to /proc/acpi/processor/CPU0/power type[C-state].   
 
This patch unify that all the multiply C-states to one special ACPI C-state. 

Signed-off-by: Youquan, Song <youquan.song@intel.com>
---


diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 302d656..a94b4d7 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1086,7 +1086,6 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 #endif
 		cpuidle_set_statedata(state, cx);
 
-		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
 		strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
 		state->exit_latency = cx->latency;
 		state->target_residency = cx->latency * latency_factor;
@@ -1099,6 +1098,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			if (cx->entry_method == ACPI_CSTATE_FFH)
 				state->flags |= CPUIDLE_FLAG_TIME_VALID;
 
+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+					ACPI_STATE_C1);
 			state->enter = acpi_idle_enter_c1;
 			dev->safe_state = state;
 			break;
@@ -1106,6 +1107,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			case ACPI_STATE_C2:
 			state->flags |= CPUIDLE_FLAG_BALANCED;
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+				ACPI_STATE_C2);
 			state->enter = acpi_idle_enter_simple;
 			dev->safe_state = state;
 			break;
@@ -1114,6 +1117,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			state->flags |= CPUIDLE_FLAG_DEEP;
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
 			state->flags |= CPUIDLE_FLAG_CHECK_BM;
+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
+				ACPI_STATE_C3);
 			state->enter = pr->flags.bm_check ?
 					acpi_idle_enter_bm :
 					acpi_idle_enter_simple;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* RE: [PATCH]acpi c-states: Fix multiply C-states name disturbance
  2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
@ 2009-12-14 19:09 ` Pallipadi, Venkatesh
  2009-12-16 16:06   ` [Resend PATCH]acpi " Youquan,Song
  2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
  1 sibling, 1 reply; 8+ messages in thread
From: Pallipadi, Venkatesh @ 2009-12-14 19:09 UTC (permalink / raw)
  To: Youquan,Song, lenb@kernel.org
  Cc: linux@dominikbrodowski.net, Liu, Kent, Guo, Chaohong,
	Song, Youquan, linux-acpi@vger.kernel.org


Ack the idea of this patch.

I don't quite agree with the comment
"Actually, C-state name under stateXX should be type of ACPI"

cpuidle is more generic than ACPI and C-state name under stateXX
can be anything that the driver wants to be.

I agree with your point though that today there is no mapping for
ACPI state in cpuidle sysfs output and it will be nice to add
It.

However, I think it is better to have
>+			snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
>+					ACPI_STATE_C1);

To clearly distinguish it from OS C-state index or Hardware C-state.

One other thing of concern with this change is any breakage of
existing apps with this change.

Thanks,
Venki

>-----Original Message-----
>From: Youquan,Song [mailto:youquan.song@linux.intel.com] 
>Sent: Monday, December 14, 2009 5:03 AM
>To: lenb@kernel.org
>Cc: Pallipadi, Venkatesh; linux@dominikbrodowski.net; Liu, 
>Kent; Guo, Chaohong; Song, Youquan; linux-acpi@vger.kernel.org
>Subject: [PATCH]acpi c-states: Fix multiply C-states name disturbance
>
>There are possible multiply C-states for special type of ACPI 
>C-state, such as
> ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.
>
>In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number> 
>will increase 
>with number of validate C-states. But its C-state name under 
>stateXX will also
>increase with number of validate C-state.
>
>Actually, C-state name under stateXX should be type of ACPI 
>C-state which we 
>can refer to /proc/acpi/processor/CPU0/power type[C-state].   
> 
>This patch unify that all the multiply C-states to one special 
>ACPI C-state. 
>
>Signed-off-by: Youquan, Song <youquan.song@intel.com>
>---
>
>
>diff --git a/drivers/acpi/processor_idle.c 
>b/drivers/acpi/processor_idle.c
>index 302d656..a94b4d7 100644
>--- a/drivers/acpi/processor_idle.c
>+++ b/drivers/acpi/processor_idle.c
>@@ -1086,7 +1086,6 @@ static int 
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> #endif
> 		cpuidle_set_statedata(state, cx);
> 
>-		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
> 		strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
> 		state->exit_latency = cx->latency;
> 		state->target_residency = cx->latency * latency_factor;
>@@ -1099,6 +1098,8 @@ static int 
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> 			if (cx->entry_method == ACPI_CSTATE_FFH)
> 				state->flags |= CPUIDLE_FLAG_TIME_VALID;
> 
>+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+					ACPI_STATE_C1);
> 			state->enter = acpi_idle_enter_c1;
> 			dev->safe_state = state;
> 			break;
>@@ -1106,6 +1107,8 @@ static int 
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> 			case ACPI_STATE_C2:
> 			state->flags |= CPUIDLE_FLAG_BALANCED;
> 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
>+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+				ACPI_STATE_C2);
> 			state->enter = acpi_idle_enter_simple;
> 			dev->safe_state = state;
> 			break;
>@@ -1114,6 +1117,8 @@ static int 
>acpi_processor_setup_cpuidle(struct acpi_processor *pr)
> 			state->flags |= CPUIDLE_FLAG_DEEP;
> 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
> 			state->flags |= CPUIDLE_FLAG_CHECK_BM;
>+			snprintf(state->name, CPUIDLE_NAME_LEN, "C%d",
>+				ACPI_STATE_C3);
> 			state->enter = pr->flags.bm_check ?
> 					acpi_idle_enter_bm :
> 					acpi_idle_enter_simple;
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Resend PATCH]acpi c-states: Fix multiply C-states name disturbance
  2009-12-14 19:09 ` Pallipadi, Venkatesh
@ 2009-12-16 16:06   ` Youquan,Song
  0 siblings, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2009-12-16 16:06 UTC (permalink / raw)
  To: lenb
  Cc: venkatesh.pallipadi, linux@dominikbrodowski.net, Liu, Kent,
	Guo, Chaohong, Song, Youquan, linux-acpi@vger.kernel.org

There are possible multiply C-states for special type of ACPI C-state, such as
 ACPI C3, processor C3,C6,C7 are possible all mapped to ACPI C3.

In /sys/devices/system/cpu/cpu0/cpuidle/stateXX, state<number> will increase 
with number of validate C-states. But its C-state name under stateXX will also
increase with number of validate C-state.
 
This patch unify that all the multiply C-states to one special ACPI C-state. 

Signed-off-by: Youquan, Song <youquan.song@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>  
---


diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 302d656..f14785d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1086,7 +1086,6 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 #endif
 		cpuidle_set_statedata(state, cx);
 
-		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
 		strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
 		state->exit_latency = cx->latency;
 		state->target_residency = cx->latency * latency_factor;
@@ -1099,6 +1098,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			if (cx->entry_method == ACPI_CSTATE_FFH)
 				state->flags |= CPUIDLE_FLAG_TIME_VALID;
 
+			snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+					ACPI_STATE_C1);
 			state->enter = acpi_idle_enter_c1;
 			dev->safe_state = state;
 			break;
@@ -1106,6 +1107,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			case ACPI_STATE_C2:
 			state->flags |= CPUIDLE_FLAG_BALANCED;
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
+			snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+				ACPI_STATE_C2);
 			state->enter = acpi_idle_enter_simple;
 			dev->safe_state = state;
 			break;
@@ -1114,6 +1117,8 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			state->flags |= CPUIDLE_FLAG_DEEP;
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
 			state->flags |= CPUIDLE_FLAG_CHECK_BM;
+			snprintf(state->name, CPUIDLE_NAME_LEN, "ACPI C%d",
+				ACPI_STATE_C3);
 			state->enter = pr->flags.bm_check ?
 					acpi_idle_enter_bm :
 					acpi_idle_enter_simple;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
  2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
  2009-12-14 19:09 ` Pallipadi, Venkatesh
@ 2010-01-04 10:49 ` Youquan,Song
  2010-01-04 12:36   ` Arjan van de Ven
  1 sibling, 1 reply; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 10:49 UTC (permalink / raw)
  To: hpa, tglx, akpm
  Cc: venkatesh.pallipadi, suresh.b.siddha, kent.liu, chaohong.guo,
	youquan.song, linux-acpi, linux-kernel

Tickless is disabled by nohz=off, which is used in OSVs. But in current kernel,
if tickless is disabled, the timer irq0 will not increase. Because the timer 
event handler should be tick_handle_periodic, but actually event handler keep 
as tick_handle_oneshot_broadcast which is used in tickless. The root cause is
that it is default to enable high resolution timer which will force to oneshot
 broadcast mode.

This patch add tickless enable check before enable high resolution timer

On Nehalem-EX:

Before the patch:
linux-a25n:~ # cat /proc/interrupts | grep timer
   0:        334          0          0          0          0          0 ....
 LOC:     192248     193931     193851     184441     193803     193625 ....

After the patch:
cat /proc/interrupts | grep timer
   0:     223788          0          0          0          0          0 ....
 LOC:      13081     238407     238452     229405     238298     235688 ....

Signed-off-by: Youquan, Song <youquan.song@intel.com>
---


diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f992762..a515bed 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -815,7 +815,7 @@ int tick_check_oneshot_change(int allow_nohz)
 	if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available())
 		return 0;
 
-	if (!allow_nohz)
+	if (!allow_nohz && tick_nohz_enabled)
 		return 1;
 
 	tick_nohz_switch_to_nohz();

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
  2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
@ 2010-01-04 12:36   ` Arjan van de Ven
  2010-01-04 14:00     ` Peter Zijlstra
  2010-01-04 21:40     ` Youquan,Song
  0 siblings, 2 replies; 8+ messages in thread
From: Arjan van de Ven @ 2010-01-04 12:36 UTC (permalink / raw)
  To: Youquan,Song
  Cc: hpa, tglx, akpm, venkatesh.pallipadi, suresh.b.siddha, kent.liu,
	chaohong.guo, youquan.song, linux-acpi, linux-kernel

On Mon, 4 Jan 2010 05:49:05 -0500
"Youquan,Song" <youquan.song@linux.intel.com> wrote:

> Tickless is disabled by nohz=off, which is used in OSVs.

it is? I doubt anyone wants to normally disable tickless, the power
cost is just too high....

> But in
> current kernel, if tickless is disabled, the timer irq0 will not
> increase.

why is this a problem? 

> Because the timer event handler should be
> tick_handle_periodic, but actually event handler keep as
> tick_handle_oneshot_broadcast which is used in tickless. The root
> cause is that it is default to enable high resolution timer which
> will force to oneshot broadcast mode.

using local apic in one shot mode is not a problem, I'd in fact call it
a feature...


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
  2010-01-04 12:36   ` Arjan van de Ven
@ 2010-01-04 14:00     ` Peter Zijlstra
  2010-01-04 22:02       ` Youquan,Song
  2010-01-04 21:40     ` Youquan,Song
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2010-01-04 14:00 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Youquan,Song, hpa, tglx, akpm, venkatesh.pallipadi,
	suresh.b.siddha, kent.liu, chaohong.guo, youquan.song, linux-acpi,
	linux-kernel, jens.axboe

On Mon, 2010-01-04 at 04:36 -0800, Arjan van de Ven wrote:
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
> 
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature... 

lapic yes, broadcast not so much.

He mentioned he was using Nehelem-EX which is known broken in that it
stops the lapic in deeper C states and thus reverts to HPET broadcast
since there's not enough HPET timers to go around.

Something like processor.max_cstate=2 should fix that IIRC.

Jens ran into this on his shiny new box.

Disabling nohz is funny indeed, one wonders why he does that.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
  2010-01-04 12:36   ` Arjan van de Ven
  2010-01-04 14:00     ` Peter Zijlstra
@ 2010-01-04 21:40     ` Youquan,Song
  1 sibling, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 21:40 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Youquan,Song, hpa, tglx, akpm, venkatesh.pallipadi,
	suresh.b.siddha, kent.liu, chaohong.guo, youquan.song, linux-acpi,
	linux-kernel

> > Tickless is disabled by nohz=off, which is used in OSVs.
> 
> it is? I doubt anyone wants to normally disable tickless, the power
> cost is just too high....
tickless is normally used but it is possible to use for some special users, 
tickless is disabled by nohz=off kernel option, which is address in OSV's 
release notes.

> > But in
> > current kernel, if tickless is disabled, the timer irq0 will not
> > increase.
> 
> why is this a problem? 

Two reasons: 
R1:
In my mind, tick is period timer with frequent of HZ. but timer irq0 is
actually be an oneshot mode timer.
But in real watch, it is keep not increase for long time. but it
suddenly increase 5000~10000 times in a seconds.

R2:
Tickless disable, change cpuidle driver to ladder governor(menu does not
work at all). 

Run powertop, if apply my patch, the c-state residency has > 10% improvement
than current kernel. 

> 
> > Because the timer event handler should be
> > tick_handle_periodic, but actually event handler keep as
> > tick_handle_oneshot_broadcast which is used in tickless. The root
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
> 
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature...

But in current kernel, local apic timer is in periodic mode with
frequencey of HZ when tickless is disabled.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase
  2010-01-04 14:00     ` Peter Zijlstra
@ 2010-01-04 22:02       ` Youquan,Song
  0 siblings, 0 replies; 8+ messages in thread
From: Youquan,Song @ 2010-01-04 22:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arjan van de Ven, Youquan,Song, hpa, tglx, akpm,
	venkatesh.pallipadi, suresh.b.siddha, kent.liu, chaohong.guo,
	youquan.song, linux-acpi, linux-kernel, jens.axboe

> Disabling nohz is funny indeed, one wonders why he does that.
I do not have real case for it. I look at nohz=off because 
1. OSV has such release notes to explain how to disable tickless.  
2. with some OSV config + latest kernel, it will result in kernel 
oops and it config disable tickless. I have reported it to acpi maillist
before.

Thanks.

-Youquan


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-01-04 14:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-14 13:02 [PATCH]acpi c-states: Fix multiply C-states name disturbance Youquan,Song
2009-12-14 19:09 ` Pallipadi, Venkatesh
2009-12-16 16:06   ` [Resend PATCH]acpi " Youquan,Song
2010-01-04 10:49 ` [PATCH]tickless: Fix tick nohz timer irq0 fail to increaase Youquan,Song
2010-01-04 12:36   ` Arjan van de Ven
2010-01-04 14:00     ` Peter Zijlstra
2010-01-04 22:02       ` Youquan,Song
2010-01-04 21:40     ` Youquan,Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).