From: Jeff Garrett <jeff@jgarrett.org>
To: Len Brown <lenb@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>,
linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: acpi_idle: Very idle Core i7 machine never enters C3
Date: Fri, 5 Feb 2010 14:53:12 -0600 [thread overview]
Message-ID: <20100205205312.GA4532@jgarrett.org> (raw)
In-Reply-To: <alpine.LFD.2.00.1002051238280.4050@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 1710 bytes --]
On Fri, Feb 05, 2010 at 12:45:21PM -0500, Len Brown wrote:
> Jeff,
> What do you see if you apply just the patch below?
>
> Also, in addition to "powertop -d" to show what the kernel requests,
> please run turbostat to show what the hardware actually did:
>
> http://userweb.kernel.org/~lenb/acpi/utils/pmtools-latest/turbostat/turbostat.c
>
> eg.
> # turbostat -d -v sleep 5
>
> thanks,
> -Len Brown, Intel Open Source Technology Center
With the patch, powertop reports good C3 residency and wakeups
remain very low. Seems to work. :)
I attached the powertop & turbostat output with this patch.
However, this confuses me. In a previous experiment in the
acpi_processor_setup_cpuidle() function, I replaced the pointer
to acpi_idle_enter_bm() with a pointer to
acpi_idle_enter_simple() even when bm_check is nonzero. With
that, I was able to get into C3, but the wakeups ballooned. But
the difference between what I did, and what you did, is the
difference between acpi_idle_enter_bm() with acpi_idle_bm_check()
returning zero and acpi_idle_enter_simple(). Those code paths
look almost identical. The bm path calls acpi_unlazy_tlb(), and
doesn't appear to call the ACPI_FLUSH_CPU_CACHE(), and they call
sched_clock_idle_sleep_event() in different places. I don't
understand why any of these differences would have had any
significant effect on wakeups.
I'm left wondering if it's a problem on my part. I should repeat
that previous experiment and see if there really is something
significantly different there.
BTW, getting a bit off topic, but since the two code paths are
almost identical, is there any reason not to unite them?
Something like the attached patch might work?
Thanks,
Jeff
[-- Attachment #2: powertop.out --]
[-- Type: text/plain, Size: 1829 bytes --]
PowerTOP 1.12 (C) 2007, 2008 Intel Corporation
Collecting data for 15 seconds
Cn Avg residency
C0 (cpu running) ( 2.7%)
polling 0.0ms ( 0.0%)
C1 mwait 0.0ms ( 0.0%)
C2 mwait 0.0ms ( 0.0%)
C3 mwait 151.8ms (97.3%)
P-states (frequencies)
Turbo Mode 0.0%
2.67 Ghz 0.0%
2.54 Ghz 0.0%
2.40 Ghz 0.0%
1.60 Ghz 100.0%
Wakeups-from-idle per second : 6.4 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
38.4% ( 11.2) [extra timer interrupt]
29.7% ( 8.7) [kernel scheduler] Load balancing tick
11.2% ( 3.3) [kernel core] hrtimer_start (tick_sched_timer)
7.1% ( 2.1) [Rescheduling interrupts] <kernel IPI>
6.9% ( 2.0) [kernel core] add_timer_on (clocksource_watchdog)
1.8% ( 0.5) events/3
1.6% ( 0.5) [eth0] <interrupt>
0.7% ( 0.2) [kernel core] __mod_timer (dev_watchdog)
0.5% ( 0.1) [ata_piix, ata_piix] <interrupt>
0.2% ( 0.1) [kernel core] __mod_timer (tcp_delack_timer)
0.2% ( 0.1) sshd
0.2% ( 0.1) [kernel core] enqueue_task_rt (sched_rt_period_timer)
0.2% ( 0.1) expr
0.2% ( 0.1) [kernel core] __mod_timer (cfq_idle_slice_timer)
0.2% ( 0.1) [kernel core] queue_delayed_work (delayed_work_timer_fn)
0.2% ( 0.1) [kernel core] __mod_timer (peer_check_expire)
0.2% ( 0.1) [kernel core] __mod_timer (neigh_timer_handler)
0.2% ( 0.1) [kernel core] __mod_timer (sync_supers_timer_fn)
Suggestion: Enable the CONFIG_USB_SUSPEND kernel configuration option.
This option will automatically disable UHCI USB when not in use, and may
save approximately 1 Watt of power.
Recent USB suspend statistics
Active Device name
Recent audio activity statistics
Active Device name
Recent SATA AHCI link activity statistics
Active Partial Slumber Device name
[-- Attachment #3: turbostat.out --]
[-- Type: text/plain, Size: 1089 bytes --]
turbostat Jan-27, 2010 - Len Brown <lenb@kernel.org>
num_cpus 8
CPUID GenuineIntel 11 levels family:model:stepping 6:26:4
Nehalem multiplier 20, TSC frequency 2667 MHz
MSR_NEHALEM_PLATFORM_INFO: 0xc0000001400
Nehalem 4 cores active: 21 mult, max turbo frequency = 2800 MHz
Nehalem 3 cores active: 21 mult, max turbo frequency = 2800 MHz
Nehalem 2 cores active: 21 mult, max turbo frequency = 2800 MHz
Nehalem 1 core active: 22 mult, max turbo frequency = 2933 MHz
5.002349 sec
CPU GHz TSC %c0 %c1 %c3 %c6 %pc3 %pc6 %pc7
0 1.60 2.66 0.02 0.07 0.00 99.92 0.00 0.00 0.00
1 1.60 2.66 0.01 0.02 0.00 99.97 0.00 0.00 0.00
2 1.60 2.66 0.02 0.04 0.00 99.94 0.00 0.00 0.00
3 1.60 2.66 0.09 0.13 0.00 99.78 0.00 0.00 0.00
4 1.60 2.66 0.05 0.04 0.00 99.91 0.00 0.00 0.00
5 1.60 2.66 0.01 0.02 0.00 99.97 0.00 0.00 0.00
6 1.60 2.66 0.01 0.05 0.00 99.94 0.00 0.00 0.00
7 1.59 2.66 0.05 0.16 0.00 99.79 0.00 0.00 0.00
[-- Attachment #4: patch --]
[-- Type: text/plain, Size: 4741 bytes --]
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 7c0441f..8c636de 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -849,73 +851,6 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev,
return idle_time;
}
-/**
- * acpi_idle_enter_simple - enters an ACPI state without BM handling
- * @dev: the target CPU
- * @state: the state data
- */
-static int acpi_idle_enter_simple(struct cpuidle_device *dev,
- struct cpuidle_state *state)
-{
- struct acpi_processor *pr;
- struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
- ktime_t kt1, kt2;
- s64 idle_time;
- s64 sleep_ticks = 0;
-
- pr = __get_cpu_var(processors);
-
- if (unlikely(!pr))
- return 0;
-
- if (acpi_idle_suspend)
- return(acpi_idle_enter_c1(dev, state));
-
- local_irq_disable();
- current_thread_info()->status &= ~TS_POLLING;
- /*
- * TS_POLLING-cleared state must be visible before we test
- * NEED_RESCHED:
- */
- smp_mb();
-
- if (unlikely(need_resched())) {
- current_thread_info()->status |= TS_POLLING;
- local_irq_enable();
- return 0;
- }
-
- /*
- * Must be done before busmaster disable as we might need to
- * access HPET !
- */
- lapic_timer_state_broadcast(pr, cx, 1);
-
- if (cx->type == ACPI_STATE_C3)
- ACPI_FLUSH_CPU_CACHE();
-
- kt1 = ktime_get_real();
- /* Tell the scheduler that we are going deep-idle: */
- sched_clock_idle_sleep_event();
- acpi_idle_do_entry(cx);
- kt2 = ktime_get_real();
- idle_time = ktime_to_us(ktime_sub(kt2, kt1));
-
- sleep_ticks = us_to_pm_timer_ticks(idle_time);
-
- /* Tell the scheduler how much we idled: */
- sched_clock_idle_wakeup_event(sleep_ticks*PM_TIMER_TICK_NS);
-
- local_irq_enable();
- current_thread_info()->status |= TS_POLLING;
-
- cx->usage++;
-
- lapic_timer_state_broadcast(pr, cx, 0);
- cx->time += sleep_ticks;
- return idle_time;
-}
-
static int c3_cpu_count;
static DEFINE_SPINLOCK(c3_lock);
@@ -944,7 +879,7 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
if (acpi_idle_suspend)
return(acpi_idle_enter_c1(dev, state));
- if (acpi_idle_bm_check()) {
+ if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check && acpi_idle_bm_check()) {
if (dev->safe_state) {
dev->last_state = dev->safe_state;
return dev->safe_state->enter(dev, dev->safe_state);
@@ -970,17 +905,24 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
return 0;
}
- acpi_unlazy_tlb(smp_processor_id());
+ if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check)
+ acpi_unlazy_tlb(smp_processor_id());
/* Tell the scheduler that we are going deep-idle: */
- sched_clock_idle_sleep_event();
+ if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check)
+ sched_clock_idle_sleep_event();
/*
* Must be done before busmaster disable as we might need to
* access HPET !
*/
lapic_timer_state_broadcast(pr, cx, 1);
+ if (cx->type == ACPI_STATE_C3 && !pr->flags.bm_check)
+ ACPI_FLUSH_CPU_CACHE();
+
kt1 = ktime_get_real();
+ if (cx->type != ACPI_STATE_C3 || !pr->flags.bm_check)
+ sched_clock_idle_sleep_event();
/*
* disable bus master
* bm_check implies we need ARB_DIS
@@ -991,21 +933,19 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
* not set. In that case we cannot do much, we enter C3
* without doing anything.
*/
- if (pr->flags.bm_check && pr->flags.bm_control) {
+ if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check && pr->flags.bm_control) {
spin_lock(&c3_lock);
c3_cpu_count++;
/* Disable bus master arbitration when all CPUs are in C3 */
if (c3_cpu_count == num_online_cpus())
acpi_write_bit_register(ACPI_BITREG_ARB_DISABLE, 1);
spin_unlock(&c3_lock);
- } else if (!pr->flags.bm_check) {
- ACPI_FLUSH_CPU_CACHE();
}
acpi_idle_do_entry(cx);
/* Re-enable bus master arbitration */
- if (pr->flags.bm_check && pr->flags.bm_control) {
+ if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check && pr->flags.bm_control) {
spin_lock(&c3_lock);
acpi_write_bit_register(ACPI_BITREG_ARB_DISABLE, 0);
c3_cpu_count--;
@@ -1095,7 +1035,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
case ACPI_STATE_C2:
state->flags |= CPUIDLE_FLAG_BALANCED;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
- state->enter = acpi_idle_enter_simple;
+ state->enter = acpi_idle_enter_bm;
dev->safe_state = state;
break;
@@ -1103,9 +1043,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
state->flags |= CPUIDLE_FLAG_DEEP;
state->flags |= CPUIDLE_FLAG_TIME_VALID;
state->flags |= CPUIDLE_FLAG_CHECK_BM;
- state->enter = pr->flags.bm_check ?
- acpi_idle_enter_bm :
- acpi_idle_enter_simple;
+ state->enter = acpi_idle_enter_bm;
break;
}
next prev parent reply other threads:[~2010-02-05 20:54 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-26 8:47 acpi_idle: Very idle Core i7 machine never enters C3 Jeff Garrett
2010-01-26 12:41 ` peng huang
2010-01-26 12:41 ` peng huang
2010-01-26 14:59 ` Jeff Garrett
2010-01-26 14:59 ` Jeff Garrett
2010-01-27 13:27 ` peng huang
2010-01-27 13:27 ` peng huang
2010-02-05 16:22 ` Jeff Garrett
2010-02-05 16:22 ` Jeff Garrett
2010-01-26 21:45 ` Andi Kleen
2010-01-26 21:45 ` Andi Kleen
2010-02-05 16:09 ` Jeff Garrett
2010-02-05 17:45 ` Len Brown
2010-02-05 20:53 ` Jeff Garrett [this message]
2010-04-27 2:40 ` Philip Langdale
2010-04-27 7:26 ` Len Brown
2010-04-27 15:41 ` Philip Langdale
2010-04-27 12:47 ` Jeff Garrett
2010-04-30 14:57 ` Philip Langdale
2010-04-30 14:57 ` Philip Langdale
2010-04-30 16:25 ` Len Brown
2010-04-30 17:44 ` Matthew Garrett
2010-04-30 18:35 ` Philip Langdale
2010-04-30 18:35 ` Philip Langdale
2010-05-25 5:43 ` Len Brown
2010-05-25 5:59 ` Yu, Luming
2010-05-25 12:39 ` Matthew Garrett
2010-05-25 12:43 ` Matthew Garrett
2010-05-25 12:43 ` Matthew Garrett
2010-05-25 15:33 ` Len Brown
2010-05-25 18:55 ` Matthew Garrett
2010-07-21 21:31 ` [PATCH] ACPI: make acpi_idle Nehalem-aware Len Brown
2010-07-22 0:53 ` Venkatesh Pallipadi
2010-07-22 0:53 ` Venkatesh Pallipadi
2010-07-22 7:47 ` Andi Kleen
2010-07-22 15:57 ` Len Brown
2010-07-22 21:21 ` [PATCH] ACPI: skip checking BM_STS if the BIOS doesn't ask for it Len Brown
2010-07-22 21:40 ` [PATCH] ACPI: create "processor.bm_check_disable" boot param Len Brown
2010-07-26 7:24 ` Andi Kleen
2010-07-26 7:24 ` Andi Kleen
2010-07-27 0:19 ` Len Brown
2010-07-27 11:28 ` Andi Kleen
2010-07-28 18:58 ` Len Brown
2010-07-22 21:25 ` [PATCH] ACPI: make acpi_idle Nehalem-aware Iain
2010-07-22 21:53 ` Iain
2010-07-22 22:01 ` Len Brown
2010-07-23 12:40 ` Iain
2010-08-03 6:55 ` Pavel Machek
2010-08-03 7:05 ` Andi Kleen
2010-05-25 12:37 ` acpi_idle: Very idle Core i7 machine never enters C3 Matthew Garrett
2010-05-25 15:40 ` Len Brown
2010-07-22 5:34 ` Len Brown
2010-02-01 14:10 ` Pavel Machek
2010-02-05 16:30 ` Jeff Garrett
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100205205312.GA4532@jgarrett.org \
--to=jeff@jgarrett.org \
--cc=andi@firstfloor.org \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.