* [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
@ 2011-11-24 10:30 Artem S. Tashkinov
2011-11-24 20:05 ` Tino Keitel
2011-11-29 21:16 ` Maciej Rutecki
0 siblings, 2 replies; 15+ messages in thread
From: Artem S. Tashkinov @ 2011-11-24 10:30 UTC (permalink / raw)
To: linux-kernel
Hello,
I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
under this kernel:
Here are two text snapshots of htop:
1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Tasks: 135 total, 1 running
2 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load average: 0.00 0.01 0.05
3 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Uptime: 00:27:49
4 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load: 0.00
Mem[|||||||||||| 544/8093MB] Avg[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
IORR IOWR IO PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
0 0 0 3529 root 20 0 70052 52244 19324 S 300. 0.6 0:25.93 /usr/bin/X -br
0 0 0 3678 user 20 0 33952 10220 8448 S 100. 0.1 0:03.58 gkrellm
0 0 0 3772 user 20 0 32144 13960 9532 S 100. 0.2 0:07.46 konsole [kdeinit] --noxft
0 0 0 6061 user 20 0 2780 1276 960 R 100. 0.0 0:00.01 htop
0 0 0 1 root 20 0 2884 1376 1168 S 0.0 0.0 0:01.00 /sbin/init
0 0 0 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Tasks: 135 total, 1 running
2 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load average: 0.00 0.01 0.05
3 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Uptime: 00:28:43
4 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load: 0.00
Mem[|||||||||||| 545/8093MB] Avg[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
IORR IOWR IO PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
0 0 0 6061 user 20 0 2780 1288 972 R 57.0 0.0 0:00.11 htop
0 0 0 5243 user 20 0 559M 318M 30928 S 57.0 3.9 0:26.07 /opt/firefox/firefox
0 0 0 3687 user 20 0 108M 39124 23564 S 57.0 0.5 0:06.50 kedit
0 0 0 3637 user 20 0 26960 5572 4288 S 57.0 0.1 0:00.02 klauncher [kdeinit] --new-startup
0 0 0 3529 root 20 0 70052 52244 19324 S 0.0 0.6 0:26.71 /usr/bin/X -br
Interestingly with this madness going on, the internal kernel average load counter works properly:
[root@localhost ~]# uptime
16:20:38 up 44 min, 2 users, load average: 0.02, 0.04, 0.05
Right at this moment all process viewers report 400% CPU load (I have 4 CPU cores), either 400% loaded by user processes
or 200% by system and 200% by user processes.
Graphically it looks this way: http://img717.imageshack.us/img717/6495/top2b.png
I'm not running any CPU intensive applications at the moment at all.
My .config file can be downloaded here: http://ompldr.org/iYmZneg
My distro is Fedora 14 i686.
Best wishes,
Artem
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-24 10:30 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers Artem S. Tashkinov
@ 2011-11-24 20:05 ` Tino Keitel
  2011-11-27 11:04   ` Tino Keitel
  2011-11-29 21:16 ` Maciej Rutecki
  1 sibling, 1 reply; 15+ messages in thread
From: Tino Keitel @ 2011-11-24 20:05 UTC (permalink / raw)
To: linux-kernel

On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> Hello,
>
> I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> under this kernel:

I get the same using top, htop and the gnome system monitor with kernel
3.2 on a Sandy Bridge quad core box, running Debian unstable.

Regards,
Tino
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-24 20:05 ` Tino Keitel
@ 2011-11-27 11:04   ` Tino Keitel
  2011-11-27 11:45     ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Tino Keitel @ 2011-11-27 11:04 UTC (permalink / raw)
To: linux-kernel

On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > Hello,
> >
> > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > under this kernel:
>
> I get the same using top, htop and the gnome system monitor with kernel
> 3.2 on a Sandy Bridge quad core box, running Debian unstable.

I just tested 3.2-rc2, and see the same bug.

Regards,
Tino
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-27 11:04   ` Tino Keitel
@ 2011-11-27 11:45     ` Rafael J. Wysocki
  2011-11-27 11:45       ` Tino Keitel
  ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2011-11-27 11:45 UTC (permalink / raw)
To: Tino Keitel; +Cc: linux-kernel, Artem S. Tashkinov

On Sunday, November 27, 2011, Tino Keitel wrote:
> On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > Hello,
> > >
> > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > under this kernel:
> >
> > I get the same using top, htop and the gnome system monitor with kernel
> > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
>
> I just tested 3.2-rc2, and see the same bug.

I'm seeing that too on one of my test boxes, but not all the time
(i.e. there are periods in which the readings are correct). The other boxes
I've tested with 3.2-rc are fine in that respect.

Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
like there's an overflow somewhere in the CPU load measuring code, at least
on some CPUs.

What's your CPU, BTW?

Thanks,
Rafael
^ permalink raw reply [flat|nested] 15+ messages in thread
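For readers following the arithmetic: top-style tools generally derive CPU usage from how much the per-CPU counters in /proc/stat change between two samples. The following is a minimal sketch of that calculation, assuming the usual field order (user nice system idle iowait) and counting idle plus iowait as "not busy". The function name cpu_busy_pct and the sample numbers are made up for illustration; this is not code from top, htop or the kernel.

```shell
# cpu_busy_pct PREV CUR
# PREV and CUR are strings "user nice system idle iowait" (cumulative ticks).
# Prints the integer percentage of ticks in the interval that were not idle.
cpu_busy_pct() {
    set -- $1 $2    # split both samples into ten positional parameters
    busy=$(( ($6 - $1) + ($7 - $2) + ($8 - $3) ))
    idle=$(( ($9 - $4) + (${10} - $5) ))
    total=$(( busy + idle ))
    if [ "$total" -le 0 ]; then
        echo 0      # no ticks elapsed, avoid dividing by zero
        return
    fi
    echo $(( 100 * busy / total ))
}

# A mostly idle second: 4 user + 3 system ticks against 92 idle ticks.
cpu_busy_pct "1000 0 500 9000 100" "1004 0 503 9092 100"   # prints 7
# The same second if the idle counter had failed to advance.
cpu_busy_pct "1000 0 500 9000 100" "1004 0 503 9000 100"   # prints 100
```

The second call shows only that a stalled idle counter makes every elapsed tick look busy, which is one way an idle machine could be reported as fully loaded. Whether that is what happens on the affected boxes is not established at this point in the thread.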
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-27 11:45     ` Rafael J. Wysocki
@ 2011-11-27 11:45       ` Tino Keitel
  2011-11-27 11:56         ` Rafael J. Wysocki
  2011-11-27 11:57       ` Artem S. Tashkinov
  2011-11-28 19:55       ` Tino Keitel
  2 siblings, 1 reply; 15+ messages in thread
From: Tino Keitel @ 2011-11-27 11:45 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-kernel, Artem S. Tashkinov

On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> On Sunday, November 27, 2011, Tino Keitel wrote:
> > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > Hello,
> > > >
> > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > under this kernel:
> > >
> > > I get the same using top, htop and the gnome system monitor with kernel
> > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> >
> > I just tested 3.2-rc2, and see the same bug.
>
> I'm seeing that too on one of my test boxes, but not all the time
> (i.e. there are periods in which the readings are correct). The other boxes
> I've tested with 3.2-rc are fine in that respect.
>
> Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> like there's an overflow somewhere in the CPU load measuring code, at least
> on some CPUs.
>
> What's your CPU, BTW?

Intel Core i5-2400

Regards,
Tino
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-27 11:45       ` Tino Keitel
@ 2011-11-27 11:56         ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2011-11-27 11:56 UTC (permalink / raw)
To: Tino Keitel; +Cc: linux-kernel, Artem S. Tashkinov

On Sunday, November 27, 2011, Tino Keitel wrote:
> On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > Hello,
> > > > >
> > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > under this kernel:
> > > >
> > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > >
> > > I just tested 3.2-rc2, and see the same bug.
> >
> > I'm seeing that too on one of my test boxes, but not all the time
> > (i.e. there are periods in which the readings are correct). The other boxes
> > I've tested with 3.2-rc are fine in that respect.
> >
> > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > like there's an overflow somewhere in the CPU load measuring code, at least
> > on some CPUs.
> >
> > What's your CPU, BTW?
>
> Intel Core i5-2400

The CPU I'm seeing the problem with is AMD Athlon X2 3800+.

Thanks,
Rafael
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-27 11:45     ` Rafael J. Wysocki
  2011-11-27 11:45       ` Tino Keitel
@ 2011-11-27 11:57       ` Artem S. Tashkinov
  2011-11-28 19:55       ` Tino Keitel
  2 siblings, 0 replies; 15+ messages in thread
From: Artem S. Tashkinov @ 2011-11-27 11:57 UTC (permalink / raw)
To: rjw; +Cc: tino.keitel, linux-kernel

> On Nov 27, 2011, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> On Sunday, November 27, 2011, Tino Keitel wrote:
> > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > Hello,
> > > >
> > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > under this kernel:
> > >
> > > I get the same using top, htop and the gnome system monitor with kernel
> > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> >
> > I just tested 3.2-rc2, and see the same bug.
>
> I'm seeing that too on one of my test boxes, but not all the time
> (i.e. there are periods in which the readings are correct). The other boxes
> I've tested with 3.2-rc are fine in that respect.
>
> Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> like there's an overflow somewhere in the CPU load measuring code, at least
> on some CPUs.
>
> What's your CPU, BTW?

Intel Core i5 2500, i686

Best wishes,
Artem
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-27 11:45     ` Rafael J. Wysocki
  2011-11-27 11:45       ` Tino Keitel
  2011-11-27 11:57       ` Artem S. Tashkinov
@ 2011-11-28 19:55       ` Tino Keitel
  2011-11-28 20:19         ` Rafael J. Wysocki
  2011-11-29 21:25         ` Tino Keitel
  2 siblings, 2 replies; 15+ messages in thread
From: Tino Keitel @ 2011-11-28 19:55 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-kernel, Artem S. Tashkinov

On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> On Sunday, November 27, 2011, Tino Keitel wrote:
> > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > Hello,
> > > >
> > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > under this kernel:
> > >
> > > I get the same using top, htop and the gnome system monitor with kernel
> > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> >
> > I just tested 3.2-rc2, and see the same bug.
>
> I'm seeing that too on one of my test boxes, but not all the time
> (i.e. there are periods in which the readings are correct). The other boxes
> I've tested with 3.2-rc are fine in that respect.
>
> Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> like there's an overflow somewhere in the CPU load measuring code, at least
> on some CPUs.

Hi,

I reverted this commit and so far it looks good:

commit a25cac5198d4ff2842ccca63b423962848ad24b2
Author: Michal Hocko <mhocko@suse.cz>
Date: Wed Aug 24 09:40:25 2011 +0200

proc: Consider NO_HZ when printing idle and iowait times

I'll report back tomorrow how the kernel behaves.

Regards,
Tino
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 19:55       ` Tino Keitel
@ 2011-11-28 20:19         ` Rafael J. Wysocki
  2011-11-28 21:41           ` Michal Hocko
  2011-11-29 21:25         ` Tino Keitel
  1 sibling, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2011-11-28 20:19 UTC (permalink / raw)
To: Tino Keitel, Michal Hocko; +Cc: linux-kernel, Artem S. Tashkinov

On Monday, November 28, 2011, Tino Keitel wrote:
> On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > Hello,
> > > > >
> > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > under this kernel:
> > > >
> > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > >
> > > I just tested 3.2-rc2, and see the same bug.
> >
> > I'm seeing that too on one of my test boxes, but not all the time
> > (i.e. there are periods in which the readings are correct). The other boxes
> > I've tested with 3.2-rc are fine in that respect.
> >
> > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > like there's an overflow somewhere in the CPU load measuring code, at least
> > on some CPUs.
>
> Hi,
>
> I reverted this commit and so far it looks good:
>
> commit a25cac5198d4ff2842ccca63b423962848ad24b2
> Author: Michal Hocko <mhocko@suse.cz>
> Date: Wed Aug 24 09:40:25 2011 +0200
>
> proc: Consider NO_HZ when printing idle and iowait times
>
> I'll report back tomorrow how the kernel behaves.

Hmm. Michal, can you have a look at that, please?

Rafael
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 20:19         ` Rafael J. Wysocki
@ 2011-11-28 21:41           ` Michal Hocko
  2011-11-28 21:43             ` Michal Hocko
  ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Michal Hocko @ 2011-11-28 21:41 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Tino Keitel, linux-kernel, Artem S. Tashkinov

Hi,

On Mon 28-11-11 21:19:26, Rafael J. Wysocki wrote:
> On Monday, November 28, 2011, Tino Keitel wrote:
> > On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > > under this kernel:
> > > > >
> > > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > > >
> > > > I just tested 3.2-rc2, and see the same bug.
> > >
> > > I'm seeing that too on one of my test boxes, but not all the time
> > > (i.e. there are periods in which the readings are correct). The other boxes
> > > I've tested with 3.2-rc are fine in that respect.
> > >
> > > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > > like there's an overflow somewhere in the CPU load measuring code, at least
> > > on some CPUs.
> >
> > Hi,
> >
> > I reverted this commit and so far it looks good:
> >
> > commit a25cac5198d4ff2842ccca63b423962848ad24b2
> > Author: Michal Hocko <mhocko@suse.cz>
> > Date: Wed Aug 24 09:40:25 2011 +0200
> >
> > proc: Consider NO_HZ when printing idle and iowait times
> >
> > I'll report back tomorrow how the kernel behaves.
>
> Hmm. Michal, can you have a look at that, please?

Hmm, my testing didn't show anything like that. Could you post
cat /proc/stat collected every second during 30s or so?

Here is the output of my run with 3.2.0-rc3-00004-gdd38d29 and the attached config:
for i in `seq 30`;
do
    cat /proc/stat > `date +'%s'`
    sleep 1
done
export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
grep cpu0 * | while read cpu user nice sys idle iowait rest;
do
    echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
    old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
done

Mostly no workload (idle desktop) - few seconds of busy loop:
1322516060:cpu0 621150 1978 148367 299773 196163
1322516061:cpu0 4 0 3 92 0
1322516062:cpu0 16 0 9 79 0
1322516063:cpu0 0 0 0 97 0
1322516064:cpu0 70 0 2 28 0 << Busy loop started
1322516065:cpu0 100 0 0 0 0
1322516066:cpu0 100 0 0 0 0
1322516067:cpu0 41 0 1 58 0 << Busy loop finished
1322516068:cpu0 0 0 2 96 0
1322516069:cpu0 1 0 2 97 0
1322516070:cpu0 100 0 0 0 0
1322516071:cpu0 42 0 1 58 0
1322516072:cpu0 0 0 2 97 0
1322516073:cpu0 1 0 2 97 0
1322516074:cpu0 1 0 1 98 0
1322516075:cpu0 2 0 1 97 0
1322516076:cpu0 1 0 1 91 7
1322516077:cpu0 1 0 0 97 0
1322516078:cpu0 0 0 0 97 0
1322516079:cpu0 2 0 1 97 0
1322516080:cpu0 0 0 1 97 1
1322516081:cpu0 1 0 4 90 4
1322516082:cpu0 2 0 0 97 0
1322516083:cpu0 1 0 2 98 0
1322516084:cpu0 2 0 1 96 0
1322516085:cpu0 0 0 2 98 0
1322516086:cpu0 1 0 1 91 7
1322516087:cpu0 0 0 0 97 0
1322516088:cpu0 1 0 0 97 0
1322516089:cpu0 1 0 1 100 0

Which looks correct (matches USER_HZ 100) to me.
Governors are updating those values and maybe idle driver might be
relevant. Here is my setting:
$ grep . -r /sys/devices/system/cpu/cpuidle/
/sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
$ grep . -r /sys/devices/system/cpu/cpufreq/
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:10000
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold:95
/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
/sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:1

>
> Rafael

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
^ permalink raw reply [flat|nested] 15+ messages in thread
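The delta computation in the script above can also be written as a small filter that works on any stream of samples. This is only a sketch of the same idea, not Michal's script: the function name stat_deltas is hypothetical, it assumes each input line has the form "label user nice sys idle iowait ...", and unlike the original it prints nothing for the first sample instead of diffing it against zero. The t1 and t2 counter values below are reconstructed by adding the first two delta rows of the cpu0 listing to its initial row.

```shell
# stat_deltas: read "label user nice sys idle iowait ..." lines on stdin,
# one per sample, and print the per-sample change in each counter.
# The first sample only seeds the baseline and prints nothing.
stat_deltas() {
    have_prev=0
    while read -r label user nice sys idle iowait rest; do
        if [ "$have_prev" -eq 1 ]; then
            echo "$label $((user - p_user)) $((nice - p_nice)) $((sys - p_sys)) $((idle - p_idle)) $((iowait - p_iowait))"
        fi
        p_user=$user p_nice=$nice p_sys=$sys p_idle=$idle p_iowait=$iowait
        have_prev=1
    done
}

# Three canned samples; the labels t0..t2 stand in for the timestamps.
printf '%s\n' \
    't0:cpu0 621150 1978 148367 299773 196163' \
    't1:cpu0 621154 1978 148370 299865 196163' \
    't2:cpu0 621170 1978 148379 299944 196163' | stat_deltas
# prints:
# t1:cpu0 4 0 3 92 0
# t2:cpu0 16 0 9 79 0
```

On a live system the input could come from the per-second sample files collected by the loop above, for example by piping the grep cpu0 output into the function.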
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 21:41           ` Michal Hocko
@ 2011-11-28 21:43             ` Michal Hocko
  2011-11-28 21:48             ` Michal Hocko
  2011-11-29 8:14             ` Michal Hocko
  2 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2011-11-28 21:43 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Tino Keitel, linux-kernel, Artem S. Tashkinov

On Mon 28-11-11 22:41:25, Michal Hocko wrote:
> Hi,
>
> On Mon 28-11-11 21:19:26, Rafael J. Wysocki wrote:
> > On Monday, November 28, 2011, Tino Keitel wrote:
> > > On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > > > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > > > under this kernel:
> > > > > >
> > > > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > > > >
> > > > > I just tested 3.2-rc2, and see the same bug.
> > > >
> > > > I'm seeing that too on one of my test boxes, but not all the time
> > > > (i.e. there are periods in which the readings are correct). The other boxes
> > > > I've tested with 3.2-rc are fine in that respect.
> > > >
> > > > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > > > like there's an overflow somewhere in the CPU load measuring code, at least
> > > > on some CPUs.
> > >
> > > Hi,
> > >
> > > I reverted this commit and so far it looks good:
> > >
> > > commit a25cac5198d4ff2842ccca63b423962848ad24b2
> > > Author: Michal Hocko <mhocko@suse.cz>
> > > Date: Wed Aug 24 09:40:25 2011 +0200
> > >
> > > proc: Consider NO_HZ when printing idle and iowait times
> > >
> > > I'll report back tomorrow how the kernel behaves.
> >
> > Hmm. Michal, can you have a look at that, please?
>
> Hmm, my testing didn't show anything like that. Could you post
> cat /proc/stat collected every second during 30s or so?
>
> Here is the output of my run with 3.2.0-rc3-00004-gdd38d29 and the attached config:
> for i in `seq 30`;
> do
>     cat /proc/stat > `date +'%s'`
>     sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
>     echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
>     old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done
>
> Mostly no workload (idle desktop) - few seconds of busy loop:
> 1322516060:cpu0 621150 1978 148367 299773 196163
> 1322516061:cpu0 4 0 3 92 0
> 1322516062:cpu0 16 0 9 79 0
> 1322516063:cpu0 0 0 0 97 0
[...]

Forgot to add, but cpu1 looks similar
1322516060:cpu1 641344 832 137307 132871 44144
1322516061:cpu1 4 0 4 92 0
1322516062:cpu1 19 0 11 74 0
1322516063:cpu1 2 0 2 96 0
1322516064:cpu1 7 0 4 89 0
1322516065:cpu1 0 0 0 97 0
1322516066:cpu1 2 0 3 88 6
1322516067:cpu1 59 0 1 40 0
1322516068:cpu1 101 0 0 0 0
1322516069:cpu1 100 0 0 0 0
1322516070:cpu1 1 0 1 96 0
1322516071:cpu1 1 0 3 90 7
1322516072:cpu1 2 0 0 97 0
1322516073:cpu1 1 0 1 98 0
1322516074:cpu1 1 0 3 97 0
1322516075:cpu1 0 0 0 98 0
1322516076:cpu1 2 0 1 98 0
1322516077:cpu1 1 0 2 98 0
1322516078:cpu1 0 0 1 99 0
1322516079:cpu1 1 0 2 99 0
1322516080:cpu1 0 0 1 98 0
1322516081:cpu1 1 0 1 98 0
1322516082:cpu1 1 0 2 98 0
1322516083:cpu1 0 0 1 99 0
1322516084:cpu1 2 0 1 98 0
1322516085:cpu1 1 0 2 97 0
1322516086:cpu1 1 0 0 99 0
1322516087:cpu1 0 0 2 97 0
1322516088:cpu1 2 0 2 98 0
1322516089:cpu1 1 0 1 97 0

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
^ permalink raw reply [flat|nested] 15+ messages in thread
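The "matches USER_HZ 100" remark earlier in the thread can be checked mechanically: with one sample per second and USER_HZ at 100, the five delta columns for a CPU should add up to roughly 100 ticks. A minimal sketch of that check, assuming input rows in the same format as the listings above; the function name row_ticks is made up for illustration, and the two rows fed to it are taken from the cpu1 listing.

```shell
# row_ticks: for each line such as "1322516061:cpu1 4 0 4 92 0",
# print "<label> <sum of the five delta columns>".
row_ticks() {
    while read -r label user nice sys idle iowait rest; do
        echo "$label $((user + nice + sys + idle + iowait))"
    done
}

printf '%s\n' \
    '1322516061:cpu1 4 0 4 92 0' \
    '1322516068:cpu1 101 0 0 0 0' | row_ticks
# prints:
# 1322516061:cpu1 100
# 1322516068:cpu1 101
```

Small deviations from 100 are expected because the samples are not taken exactly one second apart. Rows that sum to far less or far more than USER_HZ (which getconf CLK_TCK reports on most systems) would be the kind of data worth posting from an affected machine.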
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 21:41           ` Michal Hocko
  2011-11-28 21:43             ` Michal Hocko
@ 2011-11-28 21:48             ` Michal Hocko
  2011-11-29 8:14             ` Michal Hocko
  2 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2011-11-28 21:48 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Tino Keitel, linux-kernel, Artem S. Tashkinov

[-- Attachment #1: Type: text/plain, Size: 133 bytes --]

And the forgotten config. Sorry...

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

[-- Attachment #2: config.gz --]
[-- Type: application/octet-stream, Size: 17157 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 21:41           ` Michal Hocko
  2011-11-28 21:43             ` Michal Hocko
  2011-11-28 21:48             ` Michal Hocko
@ 2011-11-29 8:14             ` Michal Hocko
  2 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2011-11-29 8:14 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Tino Keitel, linux-kernel, Artem S. Tashkinov

[-- Attachment #1: Type: text/plain, Size: 3182 bytes --]

On Mon 28-11-11 22:41:25, Michal Hocko wrote:
> Hi,
>
> On Mon 28-11-11 21:19:26, Rafael J. Wysocki wrote:
> > On Monday, November 28, 2011, Tino Keitel wrote:
> > > On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > > > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > > > under this kernel:
> > > > > >
> > > > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > > > >
> > > > > I just tested 3.2-rc2, and see the same bug.
> > > >
> > > > I'm seeing that too on one of my test boxes, but not all the time
> > > > (i.e. there are periods in which the readings are correct). The other boxes
> > > > I've tested with 3.2-rc are fine in that respect.
> > > >
> > > > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > > > like there's an overflow somewhere in the CPU load measuring code, at least
> > > > on some CPUs.
> > >
> > > Hi,
> > >
> > > I reverted this commit and so far it looks good:
> > >
> > > commit a25cac5198d4ff2842ccca63b423962848ad24b2
> > > Author: Michal Hocko <mhocko@suse.cz>
> > > Date: Wed Aug 24 09:40:25 2011 +0200
> > >
> > > proc: Consider NO_HZ when printing idle and iowait times
> > >
> > > I'll report back tomorrow how the kernel behaves.
> >
> > Hmm. Michal, can you have a look at that, please?
>
> Hmm, my testing didn't show anything like that. Could you post
> cat /proc/stat collected every second during 30s or so?
>
> Here is the output of my run with 3.2.0-rc3-00004-gdd38d29 and the attached config:
> for i in `seq 30`;
> do
>     cat /proc/stat > `date +'%s'`
>     sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
>     echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
>     old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done
>

Same results (attached) on x86_64 with AMD 16 CPUs in my lab, with a different
cpuidle driver:
$ grep . -r /sys/devices/system/cpu/cpuidle/
/sys/devices/system/cpu/cpuidle/current_driver:none
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
$ grep . -r /sys/devices/system/cpu/cpufreq/
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:38000
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold:40
/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
/sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:0

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

[-- Attachment #2: config.gz --]
[-- Type: application/octet-stream, Size: 33394 bytes --]
[-- Attachment #3: amd_16cpus.tar.bz2 --]
[-- Type: application/octet-stream, Size: 3958 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-28 19:55       ` Tino Keitel
  2011-11-28 20:19         ` Rafael J. Wysocki
@ 2011-11-29 21:25         ` Tino Keitel
  1 sibling, 0 replies; 15+ messages in thread
From: Tino Keitel @ 2011-11-29 21:25 UTC (permalink / raw)
To: linux-kernel; +Cc: Rafael J. Wysocki, Artem S. Tashkinov

On Mon, Nov 28, 2011 at 20:55:34 +0100, Tino Keitel wrote:
> On Sun, Nov 27, 2011 at 12:45:57 +0100, Rafael J. Wysocki wrote:
> > On Sunday, November 27, 2011, Tino Keitel wrote:
> > > On Thu, Nov 24, 2011 at 21:05:53 +0100, Tino Keitel wrote:
> > > > On Thu, Nov 24, 2011 at 10:30:15 +0000, Artem S. Tashkinov wrote:
> > > > > Hello,
> > > > >
> > > > > I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all CPU metering applications have gone terribly mad
> > > > > under this kernel:
> > > >
> > > > I get the same using top, htop and the gnome system monitor with kernel
> > > > 3.2 on a Sandy Bridge quad core box, running Debian unstable.
> > >
> > > I just tested 3.2-rc2, and see the same bug.
> >
> > I'm seeing that too on one of my test boxes, but not all the time
> > (i.e. there are periods in which the readings are correct). The other boxes
> > I've tested with 3.2-rc are fine in that respect.
> >
> > Also, it seems that it shows 100%-(real load) when it is wrong. So, it looks
> > like there's an overflow somewhere in the CPU load measuring code, at least
> > on some CPUs.
>
> Hi,
>
> I reverted this commit and so far it looks good:
>
> commit a25cac5198d4ff2842ccca63b423962848ad24b2
> Author: Michal Hocko <mhocko@suse.cz>
> Date: Wed Aug 24 09:40:25 2011 +0200
>
> proc: Consider NO_HZ when printing idle and iowait times
>
> I'll report back tomorrow how the kernel behaves.

Hi,

looks fine so far with the git tree from commit
401d0069cb344f401bc9d264c31db55876ff78c0 and
a25cac5198d4ff2842ccca63b423962848ad24b2 reverted.

Regards,
Tino
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers
  2011-11-24 10:30 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers Artem S. Tashkinov
  2011-11-24 20:05 ` Tino Keitel
@ 2011-11-29 21:16 ` Maciej Rutecki
  1 sibling, 0 replies; 15+ messages in thread
From: Maciej Rutecki @ 2011-11-29 21:16 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: linux-kernel

On Thursday, 24 November 2011 at 11:30:15 Artem S. Tashkinov wrote:
> Hello,
>
> I'd like to report a weird regression in Linux 3.2 (running rc3 now) - all
> CPU metering applications have gone terribly mad under this kernel:
>
> Here are two text snapshots of htop:
>
> 1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Tasks: 135 total, 1 running
> 2 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load average: 0.00 0.01 0.05
> 3 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Uptime: 00:27:49
> 4 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load: 0.00
> Mem[|||||||||||| 544/8093MB] Avg[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
>
> IORR IOWR IO PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
> 0 0 0 3529 root 20 0 70052 52244 19324 S 300. 0.6 0:25.93 /usr/bin/X -br
> 0 0 0 3678 user 20 0 33952 10220 8448 S 100. 0.1 0:03.58 gkrellm
> 0 0 0 3772 user 20 0 32144 13960 9532 S 100. 0.2 0:07.46 konsole [kdeinit] --noxft
> 0 0 0 6061 user 20 0 2780 1276 960 R 100. 0.0 0:00.01 htop
> 0 0 0 1 root 20 0 2884 1376 1168 S 0.0 0.0 0:01.00 /sbin/init
> 0 0 0 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
>
>
> 1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Tasks: 135 total, 1 running
> 2 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load average: 0.00 0.01 0.05
> 3 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Uptime: 00:28:43
> 4 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load: 0.00
> Mem[|||||||||||| 545/8093MB] Avg[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
>
> IORR IOWR IO PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
> 0 0 0 6061 user 20 0 2780 1288 972 R 57.0 0.0 0:00.11 htop
> 0 0 0 5243 user 20 0 559M 318M 30928 S 57.0 3.9 0:26.07 /opt/firefox/firefox
> 0 0 0 3687 user 20 0 108M 39124 23564 S 57.0 0.5 0:06.50 kedit
> 0 0 0 3637 user 20 0 26960 5572 4288 S 57.0 0.1 0:00.02 klauncher [kdeinit] --new-startup
> 0 0 0 3529 root 20 0 70052 52244 19324 S 0.0 0.6 0:26.71 /usr/bin/X -br
>
> Interestingly with this madness going on, the internal kernel average load
> counter works properly:
>
> [root@localhost ~]# uptime
> 16:20:38 up 44 min, 2 users, load average: 0.02, 0.04, 0.05
>
> Right at this moment all process viewers report 400% CPU load (I have 4 CPU
> cores), either 400% loaded by user processes or 200% by system and 200% by
> user processes.
>
> Graphically it looks this way:
> http://img717.imageshack.us/img717/6495/top2b.png
>
> I'm not running any CPU intensive applications at the moment at all.
>
> My .config file can be downloaded here: http://ompldr.org/iYmZneg
>
> My distro is Fedora 14 i686.
>
> Best wishes,
>
> Artem
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

Seems to be similar to:
http://marc.info/?l=linux-kernel&m=132156832124314&w=2
http://marc.info/?l=linux-kernel&m=132164909313594&w=2

Regards
--
Maciej Rutecki
http://www.mrutecki.pl
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2011-11-29 21:25 UTC | newest]
Thread overview: 15+ messages
2011-11-24 10:30 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage metering applications has gone crackers Artem S. Tashkinov
2011-11-24 20:05 ` Tino Keitel
2011-11-27 11:04   ` Tino Keitel
2011-11-27 11:45     ` Rafael J. Wysocki
2011-11-27 11:45       ` Tino Keitel
2011-11-27 11:56         ` Rafael J. Wysocki
2011-11-27 11:57       ` Artem S. Tashkinov
2011-11-28 19:55       ` Tino Keitel
2011-11-28 20:19         ` Rafael J. Wysocki
2011-11-28 21:41           ` Michal Hocko
2011-11-28 21:43             ` Michal Hocko
2011-11-28 21:48             ` Michal Hocko
2011-11-29 8:14             ` Michal Hocko
2011-11-29 21:25         ` Tino Keitel
2011-11-29 21:16 ` Maciej Rutecki