From mboxrd@z Thu Jan 1 00:00:00 1970 From: ShuoX Liu Subject: [PATCH V3] cpuidle: Add a sysfs entry to disable specific C state for debug purpose. Date: Mon, 05 Mar 2012 15:09:56 +0800 Message-ID: <4F5466C4.2090808@intel.com> References: <1330647453.1916.7.camel@ymzhang> <20120302142303.648285a7.akpm@linux-foundation.org> <4F545A4E.8010801@linux.vnet.ibm.com> Reply-To: shuox.liu@intel.com Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: In-Reply-To: <4F545A4E.8010801@linux.vnet.ibm.com> Sender: linux-kernel-owner@vger.kernel.org To: Deepthi Dharwar Cc: Andrew Morton , "yanmin_zhang@linux.intel.com" , "linux-kernel@vger.kernel.org" , "Brown, Len" , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , "linux-pm@lists.linux-foundation.org" List-Id: linux-pm@vger.kernel.org On 2012=E5=B9=B403=E6=9C=8805=E6=97=A5 14:16, Deepthi Dharwar wrote: > Hi, > > On 03/05/2012 08:52 AM, Liu, ShuoX wrote: > >>>> It's useful to help developers debug some issues. >>> >>> It would help developers more if it was documented a bit. As it >>> stands, they'd be lucky if they even knew it existed. >>> >>> Looky: >>> >>> akpm:/usr/src/linux-3.3-rc5> grep -ril cpuidle Documentation >>> Documentation/trace/events-power.txt >>> Documentation/ABI/testing/sysfs-devices-system-cpu >>> Documentation/kernel-parameters.txt >>> Documentation/00-INDEX >>> Documentation/cpuidle/core.txt >>> Documentation/cpuidle/sysfs.txt >>> Documentation/cpuidle/driver.txt >>> Documentation/cpuidle/governor.txt >> >> Thanks Andrew's advice and Yanmin's review. Here is the new patch. >> >> From: ShuoX Liu >> >> Some C states of new CPU might be not good. One reason is BIOS might= configure >> them incorrectly. To help developers root cause it quickly, the patc= h adds a >> new sysfs entry, so developers could disable specific C state manual= ly. >> >> In addition, C state might have much impact on performance tuning, a= s it takes >> much time to enter/exit C states, which might delay interrupt proces= sing. With >> the new debug option, developers could check if a deep C state could= impact >> performance and how much impact it could cause. >> >> Also add this option in Documentation/cpuidle/sysfs.txt. > > > > Enabling/Disabling particular C-states from sysfs entry is really > helpful for cpuidle developers for general debugging and performance > tuning for not just x86 but for all platforms that support cpuidle, > like arm, powerpc etc . > >> > >> Signed-off-by: ShuoX Liu >> Reviewed-by: Yanmin Zhang > > Please update the documentation, rw fields are valid > for disable flag > /sys/devices/system/cpu/cpu2/cpuidle/state1 # ls -l > total 0 I will do so. >> static void cpuidle_state_sysfs_release(struct kobject *kobj) >> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h >> index 712abcc..eb150a5 100644 >> --- a/include/linux/cpuidle.h >> +++ b/include/linux/cpuidle.h >> @@ -45,6 +45,7 @@ struct cpuidle_state { >> unsigned int exit_latency; /* in US */ >> unsigned int power_usage; /* in mW */ >> unsigned int target_residency; /* in US */ >> + unsigned int disable; >> >> int (*enter) (struct cpuidle_device *dev, >> struct cpuidle_driver *drv, >> Thanks Deepthi. below is the new patch. =46rom: ShuoX Liu Some C states of new CPU might be not good. One reason is BIOS might=20 configure them incorrectly. To help developers root cause it quickly, the patch a= dds a new sysfs entry, so developers could disable specific C state manually. In addition, C state might have much impact on performance tuning, as i= t=20 takes much time to enter/exit C states, which might delay interrupt=20 processing. With the new debug option, developers could check if a deep C state could i= mpact performance and how much impact it could cause. Also add this option in Documentation/cpuidle/sysfs.txt. Signed-off-by: ShuoX Liu Reviewed-by: Yanmin Zhang Reviewed-and-Tested-by: Deepthi Dharwar --- Documentation/cpuidle/sysfs.txt | 5 +++++ drivers/cpuidle/cpuidle.c | 1 + drivers/cpuidle/governors/menu.c | 5 ++++- drivers/cpuidle/sysfs.c | 34 +++++++++++++++++++++++++++++= +++++ include/linux/cpuidle.h | 1 + 5 files changed, 45 insertions(+), 1 deletions(-) diff --git a/Documentation/cpuidle/sysfs.txt=20 b/Documentation/cpuidle/sysfs.txt index 50d7b16..9d28a34 100644 --- a/Documentation/cpuidle/sysfs.txt +++ b/Documentation/cpuidle/sysfs.txt @@ -36,6 +36,7 @@ drwxr-xr-x 2 root root 0 Feb 8 10:42 state3 /sys/devices/system/cpu/cpu0/cpuidle/state0: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -45,6 +46,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state1: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -54,6 +56,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state2: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -63,6 +66,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state3: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -72,6 +76,7 @@ total 0 * desc : Small description about the idle state (string) +* disable : Option to disable this idle state (bool) * latency : Latency to exit out of this idle state (in microseconds) * name : Name of the idle state (string) * power : Power consumed while in this idle state (in milliwatts) diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 59f4261..7d66d3e 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -197,6 +197,7 @@ static void poll_idle_init(struct cpuidle_driver *d= rv) state->power_usage =3D -1; state->flags =3D 0; state->enter =3D poll_idle; + state->disable =3D 0; } #else static void poll_idle_init(struct cpuidle_driver *drv) {} diff --git a/drivers/cpuidle/governors/menu.c=20 b/drivers/cpuidle/governors/menu.c index ad09526..5c17ca1 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -280,7 +280,8 @@ static int menu_select(struct cpuidle_driver *drv,=20 struct cpuidle_device *dev) * We want to default to C1 (hlt), not to busy polling * unless the timer is happening really really soon. */ - if (data->expected_us > 5) + if (data->expected_us > 5 && + drv->states[CPUIDLE_DRIVER_STATE_START].disable =3D=3D 0) data->last_state_idx =3D CPUIDLE_DRIVER_STATE_START; /* @@ -290,6 +291,8 @@ static int menu_select(struct cpuidle_driver *drv,=20 struct cpuidle_device *dev) for (i =3D CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { struct cpuidle_state *s =3D &drv->states[i]; + if (s->disable) + continue; if (s->target_residency > data->predicted_us) continue; if (s->exit_latency > latency_req) diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index 3fe41fe..1eae29a 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -222,6 +222,9 @@ struct cpuidle_state_attr { #define define_one_state_ro(_name, show) \ static struct cpuidle_state_attr attr_##_name =3D __ATTR(_name, 0444,= =20 show, NULL) +#define define_one_state_rw(_name, show, store) \ +static struct cpuidle_state_attr attr_##_name =3D __ATTR(_name, 0644,=20 show, store) + #define define_show_state_function(_name) \ static ssize_t show_state_##_name(struct cpuidle_state *state, \ struct cpuidle_state_usage *state_usage, char *buf) \ @@ -229,6 +232,19 @@ static ssize_t show_state_##_name(struct=20 cpuidle_state *state, \ return sprintf(buf, "%u\n", state->_name);\ } +#define define_store_state_function(_name) \ +static ssize_t store_state_##_name(struct cpuidle_state *state, \ + const char *buf, size_t size) \ +{ \ + int value; \ + sscanf(buf, "%d", &value); \ + if (value) \ + state->disable =3D 1; \ + else \ + state->disable =3D 0; \ + return size; \ +} + #define define_show_state_ull_function(_name) \ static ssize_t show_state_##_name(struct cpuidle_state *state, \ struct cpuidle_state_usage *state_usage, char *buf) \ @@ -251,6 +267,8 @@ define_show_state_ull_function(usage) define_show_state_ull_function(time) define_show_state_str_function(name) define_show_state_str_function(desc) +define_show_state_function(disable) +define_store_state_function(disable) define_one_state_ro(name, show_state_name); define_one_state_ro(desc, show_state_desc); @@ -258,6 +276,7 @@ define_one_state_ro(latency, show_state_exit_latenc= y); define_one_state_ro(power, show_state_power_usage); define_one_state_ro(usage, show_state_usage); define_one_state_ro(time, show_state_time); +define_one_state_rw(disable, show_state_disable, store_state_disable); static struct attribute *cpuidle_state_default_attrs[] =3D { &attr_name.attr, @@ -266,6 +285,7 @@ static struct attribute=20 *cpuidle_state_default_attrs[] =3D { &attr_power.attr, &attr_usage.attr, &attr_time.attr, + &attr_disable.attr, NULL }; @@ -287,8 +307,22 @@ static ssize_t cpuidle_state_show(struct kobject *= =20 kobj, return ret; } +static ssize_t cpuidle_state_store(struct kobject *kobj, + struct attribute *attr, const char *buf, size_t size) +{ + int ret =3D -EIO; + struct cpuidle_state *state =3D kobj_to_state(kobj); + struct cpuidle_state_attr *cattr =3D attr_to_stateattr(attr); + + if (cattr->store) + ret =3D cattr->store(state, buf, size); + + return ret; +} + static const struct sysfs_ops cpuidle_state_sysfs_ops =3D { .show =3D cpuidle_state_show, + .store =3D cpuidle_state_store, }; static void cpuidle_state_sysfs_release(struct kobject *kobj) diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 712abcc..eb150a5 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -45,6 +45,7 @@ struct cpuidle_state { unsigned int exit_latency; /* in US */ unsigned int power_usage; /* in mW */ unsigned int target_residency; /* in US */ + unsigned int disable; int (*enter) (struct cpuidle_device *dev, struct cpuidle_driver *drv, --=20