* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: pomac @ 2011-11-28 22:28 UTC
To: linux-kernel; +Cc: mhocko, rjw, tino.keitel, t.artem

Hi,

All this time i have been thinking i'm the only one - and i've been too
loaded with work during working hours and tired when home =P

Anyways, I've been seeing this since -rc1 on:
Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
AMD Phenom(tm) II X6 1090T - x86-64
AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686

Configs available on demand - I've been running the same config for
quite some time though.

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-29 7:52 UTC
To: pomac; +Cc: linux-kernel, rjw, tino.keitel, t.artem

On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> Hi,
>
> All this time i have been thinking i'm the only one - and i've been too
> loaded with work during working hours and tired when home =P
>
> Anyways, I've been seeing this since -rc1 on:
> Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> AMD Phenom(tm) II X6 1090T - x86-64
> AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
>
> Configs available on demand - I've been running the same config for
> quite some time though.

As I have written in other email could you post your config and collect
the following data?

for i in `seq 30`;
do
	cat /proc/stat > `date +'%s'`
	sleep 1
done

export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;

# for all your available CPUs
grep cpu0 * | while read cpu user nice sys idle iowait rest;
do
	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
done

Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Artem S. Tashkinov @ 2011-11-29 11:38 UTC
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel

[-- Attachment #1: Type: text/plain, Size: 2189 bytes --]

On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:

> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> 	cat /proc/stat > `date +'%s'`
> 	sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done

1322566208:cpu0 5199 0 2931 357890604 2541
1322566209:cpu0 0 0 1 0 0
1322566210:cpu0 0 0 0 0 0
1322566211:cpu0 0 0 0 0 0
1322566212:cpu0 0 0 0 0 0
1322566213:cpu0 0 0 0 0 0
1322566214:cpu0 1 0 0 0 0
1322566215:cpu0 2 0 0 0 0
1322566216:cpu0 3 0 0 0 0
1322566217:cpu0 2 0 0 0 0
1322566218:cpu0 4 0 0 0 0
1322566219:cpu0 1 0 0 0 0
1322566220:cpu0 2 0 0 0 0
1322566221:cpu0 2 0 1 0 0
1322566222:cpu0 1 0 0 0 0
1322566223:cpu0 2 0 0 0 0
1322566224:cpu0 1 0 1 0 0
1322566225:cpu0 1 0 0 0 0
1322566226:cpu0 2 0 0 0 0
1322566227:cpu0 1 0 1 0 0
1322566228:cpu0 2 0 0 0 0
1322566229:cpu0 2 0 0 0 0
1322566230:cpu0 6 0 3 0 0
1322566231:cpu0 1 0 0 0 0
1322566232:cpu0 2 0 0 0 0
1322566233:cpu0 3 0 0 0 0
1322566234:cpu0 2 0 0 0 0
1322566235:cpu0 2 0 2 0 0
1322566236:cpu0 0 0 1 0 0
1322566237:cpu0 1 0 0 0 0

$ grep . -r /sys/devices/system/cpu/cpuidle/
/sys/devices/system/cpu/cpuidle/current_driver:intel_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu

$ grep . -r /sys/devices/system/cpu/cpufreq/
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:10000
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold:95
/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
/sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:1

One thing I have to note, it takes some time (from 30 seconds to 10 minutes)
before this bug starts manifesting itself.

[-- Attachment #2: out.tar.xz --]
[-- Type: application/octet-stream, Size: 1816 bytes --]

[-- Attachment #3: config.bz2 --]
[-- Type: application/octet-stream, Size: 13899 bytes --]

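The delta listing in the message above can also be checked mechanically. What follows is a minimal sketch and not part of the original report: `count_stalled` is a hypothetical helper that assumes the `file:cpuN user nice sys idle iowait` delta layout printed by the script quoted in the thread. It skips the first row (which holds absolute counters because the old_* values start at 0) and counts the samples in which neither idle nor iowait advanced.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count delta samples in which
# the idle and iowait columns both stayed at 0.
# Input on stdin: "<file>:<cpu> <user> <nice> <sys> <idle> <iowait>" per line.
count_stalled() {
    # NR > 1 skips the first row, which carries absolute counter values.
    awk 'NR > 1 && $5 == 0 && $6 == 0 { n++ } END { print n + 0 }'
}

# Example using three rows copied from the listing in the report.
count_stalled <<'EOF'
1322566208:cpu0 5199 0 2931 357890604 2541
1322566209:cpu0 0 0 1 0 0
1322566214:cpu0 1 0 0 0 0
EOF
```

On a mostly idle CPU the idle column would normally be expected to grow every second, so a count equal to the number of samples over a quiet interval points at the accounting problem discussed in this thread rather than at real load.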
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-29 12:31 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
>
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > 	cat /proc/stat > `date +'%s'`
> > 	sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> 1322566208:cpu0 5199 0 2931 357890604 2541
> 1322566209:cpu0 0 0 1 0 0
> 1322566210:cpu0 0 0 0 0 0
> 1322566211:cpu0 0 0 0 0 0
> 1322566212:cpu0 0 0 0 0 0
> 1322566213:cpu0 0 0 0 0 0
> 1322566214:cpu0 1 0 0 0 0
> 1322566215:cpu0 2 0 0 0 0
> 1322566216:cpu0 3 0 0 0 0
> 1322566217:cpu0 2 0 0 0 0
> 1322566218:cpu0 4 0 0 0 0
> 1322566219:cpu0 1 0 0 0 0
> 1322566220:cpu0 2 0 0 0 0
> 1322566221:cpu0 2 0 1 0 0
> 1322566222:cpu0 1 0 0 0 0
> 1322566223:cpu0 2 0 0 0 0
> 1322566224:cpu0 1 0 1 0 0
> 1322566225:cpu0 1 0 0 0 0
> 1322566226:cpu0 2 0 0 0 0
> 1322566227:cpu0 1 0 1 0 0
> 1322566228:cpu0 2 0 0 0 0
> 1322566229:cpu0 2 0 0 0 0
> 1322566230:cpu0 6 0 3 0 0
> 1322566231:cpu0 1 0 0 0 0
> 1322566232:cpu0 2 0 0 0 0
> 1322566233:cpu0 3 0 0 0 0
> 1322566234:cpu0 2 0 0 0 0
> 1322566235:cpu0 2 0 2 0 0
> 1322566236:cpu0 0 0 1 0 0
> 1322566237:cpu0 1 0 0 0 0

Hmm, really strange. It looks like idle/iowait is not accounted at all,
which would explain why the numbers you are seeing are so weird.

>
> $ grep . -r /sys/devices/system/cpu/cpuidle/
> /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> /sys/devices/system/cpu/cpuidle/current_governor_ro:menu

I will check whether I have a machine with intel_idle somewhere around.

>
> $ grep . -r /sys/devices/system/cpu/cpufreq/
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:10000
> /sys/devices/system/cpu/cpufreq/ondemand/up_threshold:95
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
> /sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
> /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
> /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:1
>
> One thing I have to note, it takes some time (from 30 seconds to 10
> minutes) before this bug starts manifesting itself.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-29 12:44 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
[...]
> > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu

Could you try with acpi_idle driver?
echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Artem S. Tashkinov @ 2011-11-29 12:54 UTC
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel

> On Nov 29, 2011, Michal Hocko wrote:
>
> On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> [...]
> > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
>
> Could you try with acpi_idle driver?
> echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver

[root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
-bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied

* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-29 13:10 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > On Nov 29, 2011, Michal Hocko wrote:
> >
> > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > [...]
> > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> >
> > Could you try with acpi_idle driver?
> > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
>
> [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied

It seems that this cannot be set at runtime:
SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);

Or maybe you need to boot with cpuidle_sysfs_switch. According to the
documentation you might be able to change the governor. I have no idea
whether this can help somehow but let's try that.

I haven't found any intel_idle machine in my lab so far and all other
acpi_idle machines seem to work (or at least randomly selected ones) so
this smells like a major difference in the setup.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Artem S. Tashkinov @ 2011-11-29 13:51 UTC
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Nov 29, 2011, Michal Hocko wrote:

> It seems that this cannot be set at runtime:
> SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
>
> Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> documentation you might be able to change the governor. I have no idea
> whether this can help somehow but let's try that.
>
> I haven't found any intel_idle machine in my lab so far and all other
> acpi_idle machines seem to work (or at least randomly selected ones) so
> this smells like a major difference in the setup.

# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied

BTW

# cat /sys/devices/system/cpu/cpuidle/available_governors
ladder menu

dmesg | grep -i idle
[    0.000000] Kernel command line: root=/dev/sda1 ro cpuidle_sysfs_switch
[    0.000126] using mwait in idle threads.
[    1.083872] intel_idle: MWAIT substates: 0x1120
[    1.083873] intel_idle: v0.4 model 0x2A
[    1.083874] intel_idle: lapic_timer_reliable_states 0xffffffff
[    1.159043] cpuidle: using governor ladder
[    1.168842] cpuidle: using governor menu

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Rafael J. Wysocki @ 2011-11-29 22:51 UTC
To: Michal Hocko; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel

On Tuesday, November 29, 2011, Michal Hocko wrote:
> On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > On Nov 29, 2011, Michal Hocko wrote:
> > >
> > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > [...]
> > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > >
> > > Could you try with acpi_idle driver?
> > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> >
> > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
>
> It seems that this cannot be set at runtime:
> SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
>
> Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> documentation you might be able to change the governor. I have no idea
> whether this can help somehow but let's try that.
>
> I haven't found any intel_idle machine in my lab so far and all other
> acpi_idle machines seem to work (or at least randomly selected ones) so
> this smells like a major difference in the setup.

I'm able to reproduce that with acpi_driver on one box, but not on demand.

Thanks,
Rafael

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-30 10:12 UTC
To: Rafael J. Wysocki; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel

On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> On Tuesday, November 29, 2011, Michal Hocko wrote:
> > On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > > On Nov 29, 2011, Michal Hocko wrote:
> > > >
> > > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > > [...]
> > > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > > >
> > > > Could you try with acpi_idle driver?
> > > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > >
> > > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
> >
> > It seems that this cannot be set at runtime:
> > SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
> >
> > Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> > documentation you might be able to change the governor. I have no idea
> > whether this can help somehow but let's try that.
> >
> > I haven't found any intel_idle machine in my lab so far and all other
> > acpi_idle machines seem to work (or at least randomly selected ones) so
> > this smells like a major difference in the setup.
>
> I'm able to reproduce that with acpi_driver on one box, but not on demand.

And do you see the same thing (no idle/io_wait updates)?

>
> Thanks,
> Rafael

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Rafael J. Wysocki @ 2011-11-30 19:56 UTC
To: Michal Hocko; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel

On Wednesday, November 30, 2011, Michal Hocko wrote:
> On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> > On Tuesday, November 29, 2011, Michal Hocko wrote:
> > > On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > > > On Nov 29, 2011, Michal Hocko wrote:
> > > > >
> > > > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > > > [...]
> > > > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > > > >
> > > > > Could you try with acpi_idle driver?
> > > > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > >
> > > > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
> > >
> > > It seems that this cannot be set at runtime:
> > > SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
> > >
> > > Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> > > documentation you might be able to change the governor. I have no idea
> > > whether this can help somehow but let's try that.
> > >
> > > I haven't found any intel_idle machine in my lab so far and all other
> > > acpi_idle machines seem to work (or at least randomly selected ones) so
> > > this smells like a major difference in the setup.
> >
> > I'm able to reproduce that with acpi_driver on one box, but not on demand.
>
> And do you see the same thing (no idle/io_wait updates)?

Actually, I was wrong. The box I'm seeing the issue on also has "none"
in /sys/devices/system/cpu/cpuidle/current_driver. Sorry for the confusion.

Thanks,
Rafael

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-12-01 14:07 UTC
To: Rafael J. Wysocki
Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel, Len Brown

[Let's add Len to the CC for idle driver]

On Wed 30-11-11 20:56:54, Rafael J. Wysocki wrote:
> On Wednesday, November 30, 2011, Michal Hocko wrote:
> > On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> > > On Tuesday, November 29, 2011, Michal Hocko wrote:
[...]
> > > > I haven't found any intel_idle machine in my lab so far and all other
> > > > acpi_idle machines seem to work (or at least randomly selected ones) so
> > > > this smells like a major difference in the setup.
> > >
> > > I'm able to reproduce that with acpi_driver on one box, but not on demand.
> >
> > And do you see the same thing (no idle/io_wait updates)?
>
> Actually, I was wrong. The box I'm seeing the issue on also has "none"
> in /sys/devices/system/cpu/cpuidle/current_driver. Sorry for the confusion.

OK. So we have seen the issue only with the intel_idle and none drivers so
far. acpi_idle, which is what my machines use, works just fine. I think we
should focus on those drivers.

To summarize this issue: users are seeing weird values reported by [h]top.
CPUs seem to be at 100% even though there is nothing hogging them.
/proc/stat data collected on the affected system showed that idle/io_wait
are not accounted properly.

It has been identified that the problem disappears if a25cac51 [proc: Consider
NO_HZ when printing idle and iowait times] is reverted.
This patch fixes a bug where idle/io_wait times are not reported properly
when a CPU is tickless. It relies on get_cpu_idle_time_us, which either
reports sched_time idle_sleeptime or (idle_sleeptime + now-idle_entrytime)
if we are idle at the moment.

The implementation is not race free (we better not use locks in that
path...), so we might race:

E.g.

CPU1                                    CPU2
now = ktime_get
                                        tick_nohz_start_idle
                                        ts->idle_entrytime = now;
if (ts->idle_active)
                                        ts->idle_active = 1
[...]
return idle_sleeptime

But this is OK because sleeptime will be more or less accurate. We just
skip a few ticks.

It would be worse if we had a race like:
CPU1                                    CPU2
now = ktime_get
                                        tick_nohz_start_idle
                                        now = ktime_get
                                        update_ts_time_stats()
                                        ts->idle_entrytime = now;
                                        ts->idle_active = 1
if (ts->idle_active)
delta = ktime_sub(now, idle_entrytime)
ktime_add(idle_sleeptime, delta)

In this case we might get an overflow from ktime_sub but, AFAIU the
ktime_* magic, the overflow should result in a smaller idle_sleeptime
in the end after ktime_add (we do not add a small number but rather
subtract it), right?
So it shouldn't be a big deal either.

So the question is: what is the role of the idle driver here?

Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-12-02 10:39 UTC
To: Rafael J. Wysocki
Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel, Len Brown

On Thu 01-12-11 15:07:49, Michal Hocko wrote:
[...]
> While implementation is not race free (we better not use locks in that
> path...) so we might race:
>
> E.g.
>
> CPU1                                    CPU2
> now = ktime_get
>                                         tick_nohz_start_idle
>                                         ts->idle_entrytime = now;
> if (ts->idle_active)
>                                         ts->idle_active = 1
> [...]
> return idle_sleeptime
>
> But this is OK because sleeptime will be more or less accurate. We just
> skip few ticks.
>
> It would be worse if we had a race like:
> CPU1                                    CPU2
> now = ktime_get
>                                         tick_nohz_start_idle
>                                         now = ktime_get
>                                         update_ts_time_stats()
>                                         ts->idle_entrytime = now;
>                                         ts->idle_active = 1
> if (ts->idle_active)
> delta = ktime_sub(now, idle_entrytime)
> ktime_add(idle_sleeptime, delta)
>
> In this case we might get an overflow from ktime_sub but AFAIU the
> ktime_* magic the overflow should cause to get smaller idle_sleeptime
> in the end after ktime_add (we do not add a small number but rather
> subtract it), right?

Scratch that. Dunno why but I thought that ktime_t has unsigned values
but it is apparently not true (tv64 is s64). Anyway the above races
should be safe.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

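The accounting rule and the two races discussed in this sub-thread can be played through with plain numbers. This is only a simplified model under stated assumptions, not kernel code: `idle_time_us` is a hypothetical function, all values are made-up microsecond counts, and shell arithmetic is signed, which matches the observation above that ktime_t holds an s64.

```shell
#!/bin/sh
# Simplified model (not kernel code) of the rule described in the thread:
# report idle_sleeptime, plus (now - idle_entrytime) while the CPU is idle.
# Arguments: idle_sleeptime idle_active idle_entrytime now
# (hypothetical microsecond values; the arithmetic is signed).
idle_time_us() {
    sleeptime=$1
    active=$2
    entrytime=$3
    now=$4
    if [ "$active" -eq 1 ]; then
        echo $(( $sleeptime + ($now - $entrytime) ))
    else
        echo "$sleeptime"
    fi
}

# Normal case: 1000us accumulated, currently idle for another 500us.
idle_time_us 1000 1 2000 2500
# Second race: the reader sampled "now" (2000) before the idle CPU stored a
# later idle_entrytime (2003); the signed delta is -3, not a huge value.
idle_time_us 1000 1 2003 2000
# First race: idle_active is not visible yet, so the idle period that has
# just started is simply not counted.
idle_time_us 1000 0 2003 2000
```

The three calls print 1500, 997 and 1000. In both racy cases the reported value is off by a small amount only, which is consistent with the conclusion that the races are harmless.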
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-12-02 13:35 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
>
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > 	cat /proc/stat > `date +'%s'`
> > 	sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> 1322566208:cpu0 5199 0 2931 357890604 2541
> 1322566209:cpu0 0 0 1 0 0
> 1322566210:cpu0 0 0 0 0 0
> 1322566211:cpu0 0 0 0 0 0
> 1322566212:cpu0 0 0 0 0 0
> 1322566213:cpu0 0 0 0 0 0
> 1322566214:cpu0 1 0 0 0 0
> 1322566215:cpu0 2 0 0 0 0
> 1322566216:cpu0 3 0 0 0 0
> 1322566217:cpu0 2 0 0 0 0
> 1322566218:cpu0 4 0 0 0 0
> 1322566219:cpu0 1 0 0 0 0
> 1322566220:cpu0 2 0 0 0 0
> 1322566221:cpu0 2 0 1 0 0
> 1322566222:cpu0 1 0 0 0 0
> 1322566223:cpu0 2 0 0 0 0
> 1322566224:cpu0 1 0 1 0 0
> 1322566225:cpu0 1 0 0 0 0
> 1322566226:cpu0 2 0 0 0 0
> 1322566227:cpu0 1 0 1 0 0
> 1322566228:cpu0 2 0 0 0 0
> 1322566229:cpu0 2 0 0 0 0
> 1322566230:cpu0 6 0 3 0 0
> 1322566231:cpu0 1 0 0 0 0
> 1322566232:cpu0 2 0 0 0 0
> 1322566233:cpu0 3 0 0 0 0
> 1322566234:cpu0 2 0 0 0 0
> 1322566235:cpu0 2 0 2 0 0
> 1322566236:cpu0 0 0 1 0 0
> 1322566237:cpu0 1 0 0 0 0

Could you post raw data as well? Ideally starting right after boot and
collected for more than 30s (longer better...)

Thanks!
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
From: Michal Hocko @ 2011-12-02 16:49 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Fri 02-12-11 14:35:15, Michal Hocko wrote:
> On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> >
> > > As I have written in other email could you post your config and collect
> > > the following data?
> > > for i in `seq 30`;
> > > do
> > > 	cat /proc/stat > `date +'%s'`
> > > 	sleep 1
> > > done
> > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > >
> > > # for all your available CPUs
> > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > do
> > > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > done
> >
> > 1322566208:cpu0 5199 0 2931 357890604 2541
> > 1322566209:cpu0 0 0 1 0 0
> > 1322566210:cpu0 0 0 0 0 0
> > 1322566211:cpu0 0 0 0 0 0
[...]
>
> Could you post raw data as well? Ideally starting right after boot and
> collected for more than 30s (longer better...)

Ahh, missed that you attached data. And also noticed that you are using
CONFIG_HZ_300, which explains the problem and why I cannot reproduce
it.

get_{idle,iowait}_time translates usecs to cputime64_t and it uses
usecs_to_cputime which is just an alias for usecs_to_jiffies and it does
	if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
		return MAX_JIFFY_OFFSET;
which in your case (HZ=300) means that we overflow much more often than
for HZ==100. The patch below should fix this:
---
From 23882e2aabe27934df4d23b0ed52749fd4f61ab4 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 2 Dec 2011 16:17:03 +0100
Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz

get_{idle,iowait}_time use usecs_to_cputime to translate micro seconds
time to cputime64_t. This is just an alias to usecs_to_jiffies which
reduces the data type from u64 to unsigned int and also checks whether
the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET) and
returns MAX_JIFFY_OFFSET in that case.

How much we overflow depends on CONFIG_HZ and especially for
CONFIG_HZ_300 it is quite low (1431649781). This results in a bug when
people saw [h]top going mad reporting 100% CPU usage even though there
was basically no CPU load at all. The reason was simply that /proc/stat
stopped reporting idle/io_wait changes (and reported MAX_JIFFY_OFFSET)
and so the only change happening was for user/system time.

Let's use nsecs_to_jiffies64 instead as it doesn't overflow.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 fs/proc/stat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 42b274d..2a30d67 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
 		idle = kstat_cpu(cpu).cpustat.idle;
 		idle = cputime64_add(idle, arch_idle_time(cpu));
 	} else
-		idle = usecs_to_cputime(idle_time);
+		idle = nsecs_to_jiffies64(1000 * idle_time);
 
 	return idle;
 }
@@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
 		/* !NO_HZ so we can rely on cpustat.iowait */
 		iowait = kstat_cpu(cpu).cpustat.iowait;
 	else
-		iowait = usecs_to_cputime(iowait_time);
+		iowait = nsecs_to_jiffies64(1000 * iowait_time);
 
 	return iowait;
 }
--
1.7.7.3

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
From: Michal Hocko @ 2011-12-02 17:59 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Fri 02-12-11 17:49:17, Michal Hocko wrote:
> On Fri 02-12-11 14:35:15, Michal Hocko wrote:
> > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> > >
> > > > As I have written in other email could you post your config and collect
> > > > the following data?
> > > > for i in `seq 30`;
> > > > do
> > > > 	cat /proc/stat > `date +'%s'`
> > > > 	sleep 1
> > > > done
> > > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > > >
> > > > # for all your available CPUs
> > > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > > do
> > > > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > > done
> > >
> > > 1322566208:cpu0 5199 0 2931 357890604 2541
> > > 1322566209:cpu0 0 0 1 0 0
> > > 1322566210:cpu0 0 0 0 0 0
> > > 1322566211:cpu0 0 0 0 0 0
> [...]
> >
> > Could you post raw data as well? Ideally starting right after boot and
> > collected for more than 30s (longer better...)
>
> Ahh, missed that you attached data. And also noticed that you are using
> CONFIG_HZ_300 which explains the problem and why I cannot reproduce
> it.
>
> get_{idle,iowait}_time translates us to cputime64_t and it uses
> usecs_to_cputime which is just an alias for usecs_to_jiffies and it does
> 	if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
> 		return MAX_JIFFY_OFFSET;
> which in your case (HZ=300) means that we overflow much more often than
> for HZ==100. The patch below should fix this:

And the one with a more cleaned up changelog. No functional changes
---
From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 2 Dec 2011 16:17:03 +0100
Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz

Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
we are reporting idle/io_wait time also while a CPU is tickless. We rely
on get_{idle,iowait}_time functions to retrieve proper data.

These functions, however, use usecs_to_cputime to translate microsecond
time to cputime64_t. This is just an alias to usecs_to_jiffies
which reduces the data type from u64 to unsigned int and also checks
whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
and returns MAX_JIFFY_OFFSET in that case.

When we overflow depends on CONFIG_HZ but especially for
CONFIG_HZ_300 it is quite low (1431649781) so we are getting
MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int.
Just for reference CONFIG_HZ_100 has an overflow window around 20s,
CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s.

This results in a bug when people saw [h]top going mad reporting 100%
CPU usage even though there was basically no CPU load. The reason was
simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET) and so the only change happening was for
user and system time.

Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
to 32b type and it is much more appropriate for cumulative time values
(unlike usecs_to_jiffies which is intended for timeout calculations).

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 fs/proc/stat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 42b274d..2a30d67 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
 		idle = kstat_cpu(cpu).cpustat.idle;
 		idle = cputime64_add(idle, arch_idle_time(cpu));
 	} else
-		idle = usecs_to_cputime(idle_time);
+		idle = nsecs_to_jiffies64(1000 * idle_time);
 
 	return idle;
 }
@@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
 		/* !NO_HZ so we can rely on cpustat.iowait */
 		iowait = kstat_cpu(cpu).cpustat.iowait;
 	else
-		iowait = usecs_to_cputime(iowait_time);
+		iowait = nsecs_to_jiffies64(1000 * iowait_time);
 
 	return iowait;
 }
--
1.7.7.3

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
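The effect described in the changelog can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's code: `old_ticks` and `new_ticks` are hypothetical names, `HZ=300` and the limit 1431649781 are taken from the changelog, and the tick conversion is a plain integer division rather than the kernel's real rounding.

```shell
#!/bin/sh
# Simplified model, NOT the kernel's code. HZ and LIMIT_USEC are assumptions
# taken from the changelog (CONFIG_HZ_300 and the limit it quotes).
HZ=300
LIMIT_USEC=1431649781

# Old path (modelled): the input is cut to 32 bits, then saturated.
old_ticks() {
    u=$(( $1 % 4294967296 ))
    if [ "$u" -gt "$LIMIT_USEC" ]; then
        echo MAX_JIFFY_OFFSET
    else
        echo $(( u * HZ / 1000000 ))
    fi
}

# New path (modelled): plain 64-bit arithmetic, no 32-bit cut, no saturation.
new_ticks() {
    echo $(( $1 * HZ / 1000000 ))
}

# Idle time sampled one second apart, past the limit (illustrative values).
for usec in 1500000000 1501000000; do
    echo "$usec old=$(old_ticks "$usec") new=$(new_ticks "$usec")"
done
```

In this model both samples map to the same constant on the old path, so a reader of /proc/stat sees no idle progress, while the new path advances by 300 ticks per second.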
* Re: Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
From: Artem S. Tashkinov @ 2011-12-02 20:12 UTC
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Dec 2, 2011, Michal Hocko wrote:

> And the one with a more cleaned up changelog. No functional changes
> ---
> From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
> From: Michal Hocko <>
> Date: Fri, 2 Dec 2011 16:17:03 +0100
> Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
>
> Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
> we are reporting idle/io_wait time also while a CPU is tickless. We rely
> on get_{idle,iowait}_time functions to retrieve proper data.
>
> These functions, however, use usecs_to_cputime to translate microsecond
> time to cputime64_t. This is just an alias to usecs_to_jiffies
> which reduces the data type from u64 to unsigned int and also checks
> whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
> and returns MAX_JIFFY_OFFSET in that case.
>
> When we overflow depends on CONFIG_HZ but especially for
> CONFIG_HZ_300 it is quite low (1431649781) so we are getting
> MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int.
> Just for reference CONFIG_HZ_100 has an overflow window around 20s,
> CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s.
>
> This results in a bug when people saw [h]top going mad reporting 100%
> CPU usage even though there was basically no CPU load. The reason was
> simply that /proc/stat stopped reporting idle/io_wait changes (and
> reported MAX_JIFFY_OFFSET) and so the only change happening was for
> user and system time.
>
> Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
> to 32b type and it is much more appropriate for cumulative time values
> (unlike usecs_to_jiffies which is intended for timeout calculations).
>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  fs/proc/stat.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/stat.c b/fs/proc/stat.c
> index 42b274d..2a30d67 100644
> --- a/fs/proc/stat.c
> +++ b/fs/proc/stat.c
> @@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
>  		idle = kstat_cpu(cpu).cpustat.idle;
>  		idle = cputime64_add(idle, arch_idle_time(cpu));
>  	} else
> -		idle = usecs_to_cputime(idle_time);
> +		idle = nsecs_to_jiffies64(1000 * idle_time);
>
>  	return idle;
>  }
> @@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
>  		/* !NO_HZ so we can rely on cpustat.iowait */
>  		iowait = kstat_cpu(cpu).cpustat.iowait;
>  	else
> -		iowait = usecs_to_cputime(iowait_time);
> +		iowait = nsecs_to_jiffies64(1000 * iowait_time);
>
>  	return iowait;
>  }
> --
> 1.7.7.3

Thank you, this patch has fixed the issue for me.

Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
* Re: Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
From: Michal Hocko @ 2011-12-05 8:56 UTC
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Fri 02-12-11 20:12:14, Artem S. Tashkinov wrote:
> On Dec 2, 2011, Michal Hocko wrote:
>
> > And the one with a more cleaned up changelog. No functional changes
> > ---
> > From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <>
> > Date: Fri, 2 Dec 2011 16:17:03 +0100
> > Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
> >
> > Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
> > we are reporting idle/io_wait time also while a CPU is tickless. We rely
> > on get_{idle,iowait}_time functions to retrieve proper data.
> >
> > These functions, however, use usecs_to_cputime to translate microsecond
> > time to cputime64_t. This is just an alias to usecs_to_jiffies
> > which reduces the data type from u64 to unsigned int and also checks
> > whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
> > and returns MAX_JIFFY_OFFSET in that case.
> >
> > When we overflow depends on CONFIG_HZ but especially for
> > CONFIG_HZ_300 it is quite low (1431649781) so we are getting
> > MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int.
> > Just for reference CONFIG_HZ_100 has an overflow window around 20s,
> > CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s.
> >
> > This results in a bug when people saw [h]top going mad reporting 100%
> > CPU usage even though there was basically no CPU load. The reason was
> > simply that /proc/stat stopped reporting idle/io_wait changes (and
> > reported MAX_JIFFY_OFFSET) and so the only change happening was for
> > user and system time.
> >
> > Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
> > to 32b type and it is much more appropriate for cumulative time values
> > (unlike usecs_to_jiffies which is intended for timeout calculations).
> >
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > ---
> >  fs/proc/stat.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/proc/stat.c b/fs/proc/stat.c
> > index 42b274d..2a30d67 100644
> > --- a/fs/proc/stat.c
> > +++ b/fs/proc/stat.c
> > @@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
> >  		idle = kstat_cpu(cpu).cpustat.idle;
> >  		idle = cputime64_add(idle, arch_idle_time(cpu));
> >  	} else
> > -		idle = usecs_to_cputime(idle_time);
> > +		idle = nsecs_to_jiffies64(1000 * idle_time);
> >
> >  	return idle;
> >  }
> > @@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
> >  		/* !NO_HZ so we can rely on cpustat.iowait */
> >  		iowait = kstat_cpu(cpu).cpustat.iowait;
> >  	else
> > -		iowait = usecs_to_cputime(iowait_time);
> > +		iowait = nsecs_to_jiffies64(1000 * iowait_time);
> >
> >  	return iowait;
> >  }
> > --
> > 1.7.7.3
>
> Thank you, this patch has fixed the issue for me.
>
> Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>

Thanks for retesting!

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Artem S. Tashkinov @ 2011-12-02 17:43 UTC
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel

On Dec 2, 2011, Michal Hocko <mhocko@suse.cz> wrote:

> Could you post raw data as well? Ideally starting right after boot and
> collected for more than 30s (longer better...)

Already posted that under the "out.tar.xz" filename - just grep your mail agent history.

If it's not there, here's a copy:

http://www.gossamer-threads.com/lists/linux/kernel/1460716#1460716

Best wishes,

Artem
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Ian Kumlien @ 2011-11-29 17:23 UTC
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem

On Tue, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > Hi,
> >
> > All this time i have been thinking i'm the only one - and i've been too
> > loaded with work during working hours and tired when home =P
> >
> > Anyways, I've been seeing this since -rc1 on:
> > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > AMD Phenom(tm) II X6 1090T - x86-64
> > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> >
> > Configs available on demand - I've been running the same config for
> > quite some time though.
>
> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> 	cat /proc/stat > `date +'%s'`
> 	sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done

1322587011:cpu0 5838553 164 1252921 1844674407370 78234
1322587012:cpu0 4 0 1 0 0
1322587013:cpu0 4 0 2 0 0
1322587014:cpu0 2 0 1 0 0
1322587015:cpu0 3 0 2 0 0
1322587016:cpu0 7 0 2 0 4
1322587017:cpu0 2 0 2 0 0
1322587018:cpu0 1 0 1 0 0
1322587019:cpu0 4 0 2 0 1
1322587020:cpu0 5 0 3 0 5
1322587021:cpu0 7 0 1 0 3
1322587022:cpu0 4 0 2 0 0
1322587023:cpu0 4 0 2 0 0
1322587024:cpu0 3 0 3 0 0
1322587025:cpu0 3 0 1 0 0
1322587026:cpu0 4 0 2 0 4
1322587027:cpu0 2 0 2 0 0
1322587028:cpu0 2 0 1 0 0
1322587029:cpu0 4 0 1 0 0
1322587030:cpu0 5 0 1 0 0
1322587032:cpu0 5 0 1 0 4
1322587033:cpu0 6 0 1 0 0
1322587034:cpu0 1 0 2 0 0
1322587035:cpu0 3 0 2 0 0
1322587036:cpu0 3 0 1 0 0
1322587037:cpu0 4 0 2 0 2
1322587038:cpu0 4 0 1 0 0
1322587039:cpu0 2 0 1 0 0
1322587040:cpu0 3 0 1 0 0
1322587041:cpu0 4 0 3 0 0

1322587011:cpu1 5944677 172 1287816 1844674407370 871
1322587012:cpu1 4 0 2 0 0
1322587013:cpu1 2 0 2 0 0
1322587014:cpu1 3 0 0 0 0
1322587015:cpu1 3 0 1 0 9
1322587016:cpu1 22 0 3 0 18
1322587017:cpu1 10 0 3 0 13
1322587018:cpu1 3 0 1 0 0
1322587019:cpu1 4 0 3 0 14
1322587020:cpu1 14 0 2 0 7
1322587021:cpu1 16 0 1 0 2
1322587022:cpu1 5 0 0 0 0
1322587023:cpu1 7 0 1 0 0
1322587024:cpu1 8 0 1 0 0
1322587025:cpu1 2 0 2 0 0
1322587026:cpu1 4 0 2 0 5
1322587027:cpu1 7 0 2 0 0
1322587028:cpu1 5 0 0 0 0
1322587029:cpu1 4 0 2 0 0
1322587030:cpu1 6 0 0 0 0
1322587032:cpu1 3 0 2 0 0
1322587033:cpu1 6 0 1 0 0
1322587034:cpu1 2 0 1 0 0
1322587035:cpu1 4 0 2 0 0
1322587036:cpu1 3 0 1 0 0
1322587037:cpu1 4 0 0 0 0
1322587038:cpu1 8 0 1 0 0
1322587039:cpu1 4 0 2 0 0
1322587040:cpu1 4 0 0 0 0
1322587041:cpu1 8 0 1 0 0

For me the CPU usage is sporadic and only shows up as user or kernel,
nothing else.

--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
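The quoted collection loop starts its old_* variables at zero, so the first row of each data set above carries cumulative counters (for example 1844674407370 in the idle column) rather than a one-second delta. A minimal sketch of a helper that diffs two saved /proc/stat snapshots directly follows; `stat_delta` is a hypothetical name, and the sketch assumes the first five columns after the CPU label are user, nice, system, idle and iowait, as in the quoted script.

```shell
#!/bin/sh
# stat_delta CPU OLD_FILE NEW_FILE
# Prints the user/nice/system/idle/iowait deltas for CPU between two
# saved copies of /proc/stat. stat_delta is a hypothetical helper name.
stat_delta() {
    cpu=$1
    old=$(grep "^$cpu " "$2") || return 1
    new=$(grep "^$cpu " "$3") || return 1
    # Word-split the old line into positional parameters, drop the label.
    set -- $old; shift
    ou=$1 on=$2 os=$3 oi=$4 ow=$5
    # Same for the new line, then subtract column by column.
    set -- $new; shift
    echo "$cpu $(( $1 - ou )) $(( $2 - on )) $(( $3 - os )) $(( $4 - oi )) $(( $5 - ow ))"
}
```

A possible use is to save `/proc/stat` to two files one second apart and call `stat_delta cpu0` on the pair; every printed row is then a true delta, including the first.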
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Ian Kumlien @ 2011-11-29 17:31 UTC
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem

On Tue, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > Hi,
> >
> > All this time i have been thinking i'm the only one - and i've been too
> > loaded with work during working hours and tired when home =P
> >
> > Anyways, I've been seeing this since -rc1 on:
> > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > AMD Phenom(tm) II X6 1090T - x86-64
> > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> >
> > Configs available on demand - I've been running the same config for
> > quite some time though.
>
> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> 	cat /proc/stat > `date +'%s'`
> 	sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done

Sorry, the previous one was AMD X2 4400+ and now the T7200:

1322587696:cpu0 1410110 148 368843 357890604 357890604
1322587697:cpu0 0 0 1 0 0
1322587698:cpu0 1 0 1 0 0
1322587699:cpu0 2 0 1 0 0
1322587700:cpu0 0 0 1 0 0
1322587701:cpu0 0 0 1 0 0
1322587702:cpu0 0 0 1 0 0
1322587703:cpu0 1 0 0 0 0
1322587704:cpu0 0 0 0 0 0
1322587705:cpu0 0 0 1 0 0
1322587706:cpu0 1 0 1 0 0
1322587707:cpu0 0 0 1 0 0
1322587708:cpu0 1 0 0 0 0
1322587709:cpu0 1 0 0 0 0
1322587711:cpu0 0 0 2 0 0
1322587712:cpu0 0 0 0 0 0
1322587713:cpu0 1 0 1 0 0
1322587714:cpu0 1 0 0 0 0
1322587715:cpu0 1 0 1 0 0
1322587716:cpu0 1 0 0 0 0
1322587717:cpu0 0 0 1 0 0
1322587718:cpu0 1 0 1 0 0
1322587719:cpu0 0 0 1 0 0
1322587720:cpu0 0 0 0 0 0
1322587721:cpu0 2 0 1 0 0
1322587722:cpu0 1 0 0 0 0
1322587723:cpu0 0 0 3 0 0
1322587724:cpu0 1 0 0 0 0
1322587725:cpu0 0 0 0 0 0
1322587726:cpu0 0 0 1 0 0

1322587696:cpu1 1395249 509 284978 357890604 95640
1322587697:cpu1 0 0 0 0 0
1322587698:cpu1 0 0 0 0 0
1322587699:cpu1 0 0 0 0 0
1322587700:cpu1 1 0 0 0 0
1322587701:cpu1 1 0 0 0 0
1322587702:cpu1 0 0 0 0 0
1322587703:cpu1 0 0 1 0 0
1322587704:cpu1 1 0 0 0 0
1322587705:cpu1 0 0 1 0 0
1322587706:cpu1 1 0 0 0 0
1322587707:cpu1 0 0 0 0 0
1322587708:cpu1 1 0 0 0 0
1322587709:cpu1 1 0 1 0 0
1322587711:cpu1 0 0 0 0 0
1322587712:cpu1 0 0 0 0 0
1322587713:cpu1 0 0 0 0 0
1322587714:cpu1 0 0 0 0 0
1322587715:cpu1 0 0 0 0 0
1322587716:cpu1 0 0 1 0 0
1322587717:cpu1 0 0 0 0 0
1322587718:cpu1 0 0 0 0 0
1322587719:cpu1 0 0 1 0 0
1322587720:cpu1 1 0 0 0 0
1322587721:cpu1 2 0 0 0 0
1322587722:cpu1 0 0 0 0 0
1322587723:cpu1 0 0 1 0 0
1322587724:cpu1 0 0 0 0 0
1322587725:cpu1 1 0 0 0 0
1322587726:cpu1 0 0 0 0 0

Btw, perf doesn't seem to yield anything.

--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
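Every interval in the T7200 data has an idle delta and an iowait delta of 0, so any tool that derives a percentage from these columns has only user and system ticks to divide by. The following sketch shows that arithmetic; `busy_pct` is a hypothetical helper, and the formula is a simplification (real monitors also take the irq, softirq and steal columns into account).

```shell
#!/bin/sh
# busy_pct USER NICE SYS IDLE IOWAIT
# Integer percentage of non-idle time for one sampling interval.
# Simplified: real monitors also count irq, softirq and steal columns.
busy_pct() {
    busy=$(( $1 + $2 + $3 ))
    total=$(( busy + $4 + $5 ))
    if [ "$total" -eq 0 ]; then
        echo 0
    else
        echo $(( 100 * busy / total ))
    fi
}

busy_pct 0 0 1 0 0    # row 1322587697:cpu0 from the T7200 data
busy_pct 1 0 1 98 0   # hypothetical interval with idle time accounted
```

The first call prints 100 because a single system tick is the whole interval, and the second prints 2. That matches the symptom in the subject line: an almost idle machine is displayed as fully loaded as soon as the idle counter stops moving.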
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Michal Hocko @ 2011-11-29 17:56 UTC
To: Ian Kumlien; +Cc: linux-kernel, rjw, tino.keitel, t.artem

On Tue 29-11-11 18:31:00, Ian Kumlien wrote:
> On Tue, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> > On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > > Hi,
> > >
> > > All this time i have been thinking i'm the only one - and i've been too
> > > loaded with work during working hours and tired when home =P
> > >
> > > Anyways, I've been seeing this since -rc1 on:
> > > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > > AMD Phenom(tm) II X6 1090T - x86-64
> > > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> > >
> > > Configs available on demand - I've been running the same config for
> > > quite some time though.
> >
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > 	cat /proc/stat > `date +'%s'`
> > 	sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> Sorry, the previous one was AMD X2 4400+ and now the T7200:
>
> 1322587696:cpu0 1410110 148 368843 357890604 357890604
> 1322587697:cpu0 0 0 1 0 0
> 1322587698:cpu0 1 0 1 0 0
> 1322587699:cpu0 2 0 1 0 0
> 1322587700:cpu0 0 0 1 0 0
> 1322587701:cpu0 0 0 1 0 0
> 1322587702:cpu0 0 0 1 0 0
> 1322587703:cpu0 1 0 0 0 0
> 1322587704:cpu0 0 0 0 0 0
> 1322587705:cpu0 0 0 1 0 0
> 1322587706:cpu0 1 0 1 0 0
> 1322587707:cpu0 0 0 1 0 0
> 1322587708:cpu0 1 0 0 0 0
> 1322587709:cpu0 1 0 0 0 0
> 1322587711:cpu0 0 0 2 0 0
> 1322587712:cpu0 0 0 0 0 0
> 1322587713:cpu0 1 0 1 0 0
> 1322587714:cpu0 1 0 0 0 0
> 1322587715:cpu0 1 0 1 0 0
> 1322587716:cpu0 1 0 0 0 0
> 1322587717:cpu0 0 0 1 0 0
> 1322587718:cpu0 1 0 1 0 0
> 1322587719:cpu0 0 0 1 0 0
> 1322587720:cpu0 0 0 0 0 0
> 1322587721:cpu0 2 0 1 0 0
> 1322587722:cpu0 1 0 0 0 0
> 1322587723:cpu0 0 0 3 0 0
> 1322587724:cpu0 1 0 0 0 0
> 1322587725:cpu0 0 0 0 0 0
> 1322587726:cpu0 0 0 1 0 0

OK, so the same thing as in another email in the thread (no idle/io_wait
accounting).
Could you double check what kind of idle driver you are using?
cat /sys/devices/system/cpu/cpuidle/current_driver

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
From: Ian Kumlien @ 2011-11-29 18:37 UTC
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem

On Tue, 2011-11-29 at 18:56 +0100, Michal Hocko wrote:
> On Tue 29-11-11 18:31:00, Ian Kumlien wrote:
> > On Tue, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> > > On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > > > Hi,
> > > >
> > > > All this time i have been thinking i'm the only one - and i've been too
> > > > loaded with work during working hours and tired when home =P
> > > >
> > > > Anyways, I've been seeing this since -rc1 on:
> > > > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > > > AMD Phenom(tm) II X6 1090T - x86-64
> > > > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > > > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> > > >
> > > > Configs available on demand - I've been running the same config for
> > > > quite some time though.
> > >
> > > As I have written in other email could you post your config and collect
> > > the following data?
> > > for i in `seq 30`;
> > > do
> > > 	cat /proc/stat > `date +'%s'`
> > > 	sleep 1
> > > done
> > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > >
> > > # for all your available CPUs
> > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > do
> > > 	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > 	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > done
> >
> > Sorry, the previous one was AMD X2 4400+ and now the T7200:

--8<-- [data] --8<--

> OK, so the same thing as in another email in the thread (no idle/io_wait
> accounting).
> Could you double check what kind of idle driver you are using?
> cat /sys/devices/system/cpu/cpuidle/current_driver

"none", on both machines

--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
Thread overview: 23+ messages
2011-11-28 22:28 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage pomac
2011-11-29 7:52 ` Michal Hocko
2011-11-29 11:38 ` Artem S. Tashkinov
2011-11-29 12:31 ` Michal Hocko
2011-11-29 12:44 ` Michal Hocko
2011-11-29 12:54 ` Artem S. Tashkinov
2011-11-29 13:10 ` Michal Hocko
2011-11-29 13:51 ` Artem S. Tashkinov
2011-11-29 22:51 ` Rafael J. Wysocki
2011-11-30 10:12 ` Michal Hocko
2011-11-30 19:56 ` Rafael J. Wysocki
2011-12-01 14:07 ` Michal Hocko
2011-12-02 10:39 ` Michal Hocko
2011-12-02 13:35 ` Michal Hocko
2011-12-02 16:49 ` [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage) Michal Hocko
2011-12-02 17:59 ` Michal Hocko
2011-12-02 20:12 ` Artem S. Tashkinov
2011-12-05 8:56 ` Michal Hocko
2011-12-02 17:43 ` Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage Artem S. Tashkinov
2011-11-29 17:23 ` Ian Kumlien
2011-11-29 17:31 ` Ian Kumlien
2011-11-29 17:56 ` Michal Hocko
2011-11-29 18:37 ` Ian Kumlien