* Runnable threads on run queue
@ 2006-07-08 20:18 Ask List
2006-07-08 21:18 ` Chase Venters
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Ask List @ 2006-07-08 20:18 UTC (permalink / raw)
To: linux-kernel
I have an issue that maybe someone on this list can help with.
At times of very high load, the number of processes on the run queue drops to
0, then jumps really high, then drops to 0 again, back and forth. It seems to
last 10 seconds or so. If you look at this vmstat output you can see an example
of what I mean. I'm not a Linux kernel expert, but I am thinking it has
something to do with the scheduling algorithm and locking of the run queue.
For this particular application I need all available threads to be processed as
fast as possible. Is there a way for me to eliminate this behavior, or at least
minimize the window in which there are no threads on the run queue? Is there a
sysctl parameter I can use?
Please help.
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
83 0 1328 301684 37868 1520632 0 0 0 264 400 1332 98 2 0 0
17 0 1328 293936 37868 1520688 0 0 0 0 537 979 97 3 0 0
73 0 1328 293688 37868 1520712 0 0 0 0 268 2643 98 2 0 0
80 0 1328 277220 37868 1520756 0 0 0 0 351 824 98 2 0 0
49 0 1328 262452 37868 1520800 0 0 0 0 393 1882 97 3 0 0
45 0 1328 246796 37868 1520828 0 0 0 304 302 1631 96 4 0 0
55 0 1328 243852 37868 1520872 0 0 0 0 356 1101 99 1 0 0
17 0 1328 228672 37868 1520916 0 0 0 0 336 748 97 3 0 0
0 0 1328 299948 37868 1520956 0 0 0 0 299 821 78 3 19 0
0 0 1328 299184 37868 1520960 0 0 0 0 168 78 8 0 92 0
0 0 1328 299184 37868 1520960 0 0 0 248 173 38 0 1 99 0
0 0 1328 299184 37868 1520960 0 0 0 0 160 20 0 0 100 0
0 0 1328 299184 37868 1520960 0 0 0 0 151 6 0 0 100 0
0 0 1328 299184 37868 1520960 0 0 0 0 162 42 0 1 99 0
1 0 1328 299188 37868 1520960 0 0 0 0 161 24 0 0 100 0
0 0 1328 298808 37868 1520988 0 0 0 100 303 1119 57 0 42 0
0 0 1328 298808 37868 1520988 0 0 0 0 162 22 0 1 99 0
3 0 1328 298808 37868 1520992 0 0 0 0 195 233 16 0 84 0
14 0 1328 298788 37868 1521032 0 0 0 0 400 1158 87 3 10 0
54 0 1328 298860 37868 1521064 0 0 0 0 438 940 97 3 0 0
80 0 1328 298296 37868 1521092 0 0 0 180 476 556 97 3 0 0
29 0 1328 294632 37868 1521148 0 0 0 0 824 1178 99 1 0 0
68 0 1328 292936 37868 1521172 0 0 0 0 404 2283 96 4 0 0
73 0 1328 292740 37868 1521216 0 0 0 0 521 828 98 2 0 0
38 0 1328 260340 37868 1521260 0 0 0 0 405 1069 96 4 0 0
46 0 1328 253072 37868 1521292 0 0 0 300 371 1692 95 5 0 0
71 0 1328 244084 37868 1521328 0 0 0 0 357 1478 98 2 0 0
71 0 1328 233916 37868 1521384 0 0 0 0 528 1121 97 3 0 0
32 0 1328 222784 37868 1521416 0 0 0 0 347 1191 96 4 0 0
76 0 1328 212396 37868 1521448 0 0 0 0 337 2526 97 3 0 0
71 0 1328 198684 37868 1521488 0 0 0 284 497 942 98 2 0 0
40 0 1328 189964 37868 1521532 0 0 0 0 420 1525 96 4 0 0
53 0 1328 179656 37868 1521576 0 0 0 0 391 1983 98 2 0 0
91 0 1328 169164 37868 1521608 0 0 0 0 415 2018 98 2 0 0
70 0 1328 151300 37868 1521648 0 0 0 0 411 1769 98 2 0 0
43 0 1328 145980 37868 1521684 0 0 0 308 420 1713 96 4 0 0
48 0 1328 142708 37868 1521724 0 0 0 0 290 1490 97 3 0 0
76 0 1328 126080 37868 1521752 0 0 0 0 389 1568 97 3 0 0
85 0 1328 120544 37864 1518164 0 0 0 0 365 1261 96 4 0 0
51 0 1328 121312 37864 1506908 0 0 0 0 306 1217 98 2 0 0
55 0 1328 121488 37864 1495128 0 0 0 292 364 1976 98 2 0 0
79 0 1328 120408 37864 1486072 0 0 0 0 328 2106 97 3 0 0
29 0 1328 216660 37864 1482744 0 0 0 0 387 866 97 3 0 0
0 0 1328 321932 37864 1482788 0 0 0 0 289 750 67 3 31 0
0 0 1328 321932 37864 1482788 0 0 0 0 158 10 0 0 100 0
2 0 1328 321912 37864 1482792 0 0 0 268 201 156 4 1 94 0
0 0 1328 321892 37864 1482796 0 0 0 0 180 270 7 0 93 0
0 0 1328 321892 37864 1482796 0 0 0 0 152 4 0 0 100 0
0 0 1328 321880 37864 1482796 0 0 0 0 158 26 0 1 99 0
0 0 1328 321844 37864 1482820 0 0 0 0 330 454 41 1 58 0
0 0 1328 321844 37864 1482820 0 0 0 120 167 30 0 0 100 0
0 0 1328 321844 37864 1482820 0 0 0 0 166 35 1 0 99 0
35 0 1328 321476 37864 1482836 0 0 0 0 530 1026 67 2 31 0
76 0 1328 321528 37868 1482864 0 0 0 0 406 1744 96 4 0 0
41 0 1328 321172 37868 1482920 0 0 0 192 409 690 97 3 0 0
34 0 1328 314788 37868 1482956 0 0 0 0 356 1616 97 3 0 0
63 0 1328 314368 37868 1482996 0 0 0 0 437 1277 98 2 0 0
1 0 1328 331744 37868 1483044 0 0 0 0 331 709 90 3 7 0
0 0 1328 331724 37868 1483048 0 0 0 0 174 395 4 0 96 0
0 0 1328 331724 37868 1483048 0 0 0 224 168 16 0 0 100 0
0 0 1328 331724 37868 1483048 0 0 0 0 167 54 0 1 99 0
7 0 1328 331744 37868 1483048 0 0 0 0 238 167 10 0 90 0
46 0 1328 330788 37868 1483076 0 0 0 0 878 1677 98 2 0 0
84 0 1328 330444 37868 1483100 0 0 0 0 425 1449 97 3 0 0
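The idle stretches in output like the above can be picked out mechanically. The following is a minimal Python sketch, not a definitive tool: the function names and the thresholds (run queue at most 1, idle at least 90%) are assumptions chosen for illustration, and the sample rows are five consecutive rows copied from the vmstat output above.

```python
# Sketch: find runs of consecutive vmstat samples where the run queue is
# (nearly) empty and the CPU is mostly idle. Thresholds are assumptions.

def parse_vmstat(text):
    """Return a list of dicts, one per data row of `vmstat 1` output."""
    cols = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":  # the column-name header line
            cols = fields
            continue
        # Keep only all-numeric rows that match the header width.
        if cols and len(fields) == len(cols) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(cols, map(int, fields))))
    return rows


def idle_windows(rows, max_r=1, min_idle=90):
    """Return (start_index, length) for each run of idle samples."""
    windows = []
    start = None
    for i, row in enumerate(rows):
        idle = row["r"] <= max_r and row["id"] >= min_idle
        if idle and start is None:
            start = i
        elif not idle and start is not None:
            windows.append((start, i - start))
            start = None
    if start is not None:  # an idle run that lasts until the end of the data
        windows.append((start, len(rows) - start))
    return windows


# Five consecutive rows from the vmstat output in this message.
SAMPLE = """\
 r  b swpd   free  buff   cache si so bi  bo  in  cs us sy  id wa
17  0 1328 228672 37868 1520916  0  0  0   0 336 748 97  3   0  0
 0  0 1328 299948 37868 1520956  0  0  0   0 299 821 78  3  19  0
 0  0 1328 299184 37868 1520960  0  0  0   0 168  78  8  0  92  0
 0  0 1328 299184 37868 1520960  0  0  0 248 173  38  0  1  99  0
 0  0 1328 299184 37868 1520960  0  0  0   0 160  20  0  0 100  0
"""

if __name__ == "__main__":
    print(idle_windows(parse_vmstat(SAMPLE)))
```

With one sample per second, the length of each window is roughly its duration in seconds, which makes it easy to line the windows up against application logs.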
* Re: Runnable threads on run queue
2006-07-08 20:18 Ask List
@ 2006-07-08 21:18 ` Chase Venters
2006-07-08 22:54 ` Ask List
2006-07-08 22:19 ` Dr. David Alan Gilbert
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Chase Venters @ 2006-07-08 21:18 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
On Saturday 08 July 2006 15:18, Ask List wrote:
> Have an issue maybe someone on this list can help with.
>
> At times of very high load the number of processes on the run queue drops
> to 0 then jumps really high and then drops to 0 and back and forth. It
> seems to last 10 seconds or so. If you look at this vmstat you can see an
> example of what I mean. Now im not a linux kernel expert but i am thinking
> it has something to do with the scheduling algorithm and locking of the run
> queue. For this particular application I need all available threads to be
> processed as fast as possible. Is there a way for me to elimnate this
> behavior or at least minimize the window in which there are no threads on
> the run queue? Is there a sysctl parameter I can use?
If there's a runnable task on the system, the run queue should never empty
except inside schedule(). The scheduler should then swap expired and active.
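As a rough illustration of the swap mentioned above, here is a toy Python model of a run queue with an active and an expired array. It is only a sketch: the class and method names are made up, and it ignores priorities, timeslice accounting, locking and SMP, all of which a real scheduler has to handle.

```python
from collections import deque


class ToyRunQueue:
    """Toy two-array run queue (hypothetical names, no priorities or locking)."""

    def __init__(self, tasks):
        self.active = deque(tasks)   # tasks that still have timeslice left
        self.expired = deque()       # tasks that used up their timeslice

    def pick_next(self):
        """Return the next task to run, or None if nothing is runnable."""
        if not self.active:
            # The active array ran dry: swap it with the expired array.
            self.active, self.expired = self.expired, self.active
        if not self.active:
            return None
        return self.active.popleft()

    def timeslice_used(self, task):
        """A task that used its whole timeslice waits in the expired array."""
        self.expired.append(task)


if __name__ == "__main__":
    rq = ToyRunQueue(["a", "b"])
    order = []
    for _ in range(5):
        task = rq.pick_next()
        order.append(task)
        rq.timeslice_used(task)
    print(order)
```

In this model the queue is only observed empty inside `pick_next()`, just before the swap, which is the point being made: as long as some task is runnable, the next pick always finds one.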
First question - what kernel are you running? Is it stock?
Second question - what's the application? Are you sure your threads just
aren't falling into interruptible sleep due to an app bug of some sort? Are
you observing misbehavior in the application (long pauses) or just in the
reporting?
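One way to check the sleep question is to sample the state of the worker processes while the stall is happening. The sketch below reads /proc/<pid>/stat and counts state letters (R for running, S for interruptible sleep, D for uninterruptible sleep). The process name "spamd" is an assumption for illustration; the name the kernel reports for a given daemon may differ.

```python
import os
from collections import Counter


def parse_stat(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST closing parenthesis.
    left, right = stat_line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state


def count_states(name, proc_root="/proc"):
    """Count process states for all processes whose comm equals `name`."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the file was unreadable
        if comm == name:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    # "spamd" is a hypothetical process name; adjust to the actual daemon.
    print(dict(count_states("spamd")))
```

Running this once a second during a stall would show whether the workers are mostly in S (waiting for something) rather than R (runnable but not being scheduled).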
Thanks,
Chase
* Re: Runnable threads on run queue
2006-07-08 20:18 Ask List
2006-07-08 21:18 ` Chase Venters
@ 2006-07-08 22:19 ` Dr. David Alan Gilbert
2006-07-08 23:08 ` Ask List
2006-07-09 7:20 ` Mike Galbraith
2006-07-09 8:33 ` Rik van Riel
3 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2006-07-08 22:19 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
* Ask List (askthelist@gmail.com) wrote:
> Have an issue maybe someone on this list can help with.
<snip>
> Please help.
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 83 0 1328 301684 37868 1520632 0 0 0 264 400 1332 98 2 0 0
> 17 0 1328 293936 37868 1520688 0 0 0 0 537 979 97 3 0 0
> 73 0 1328 293688 37868 1520712 0 0 0 0 268 2643 98 2 0 0
> 80 0 1328 277220 37868 1520756 0 0 0 0 351 824 98 2 0 0
> 49 0 1328 262452 37868 1520800 0 0 0 0 393 1882 97 3 0 0
> 45 0 1328 246796 37868 1520828 0 0 0 304 302 1631 96 4 0 0
> 55 0 1328 243852 37868 1520872 0 0 0 0 356 1101 99 1 0 0
> 17 0 1328 228672 37868 1520916 0 0 0 0 336 748 97 3 0 0
> 0 0 1328 299948 37868 1520956 0 0 0 0 299 821 78 3 19 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 168 78 8 0 92 0
Could you also post the output of iostat -x 1 covering the same period?
(You might need to restrict the set of devices if you have a lot)
The pattern of bursts of output is something I've seen on apps
just trying to do continuous large writes and I'm wondering
what you are seeing there.
Dave
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
* Re: Runnable threads on run queue
2006-07-08 21:18 ` Chase Venters
@ 2006-07-08 22:54 ` Ask List
0 siblings, 0 replies; 13+ messages in thread
From: Ask List @ 2006-07-08 22:54 UTC (permalink / raw)
To: linux-kernel
Chase Venters <chase.venters <at> clientec.com> writes:
>
> On Saturday 08 July 2006 15:18, Ask List wrote:
> > Have an issue maybe someone on this list can help with.
> >
> > At times of very high load the number of processes on the run queue drops
> > to 0 then jumps really high and then drops to 0 and back and forth. It
> > seems to last 10 seconds or so. If you look at this vmstat you can see an
> > example of what I mean. Now im not a linux kernel expert but i am thinking
> > it has something to do with the scheduling algorithm and locking of the run
> > queue. For this particular application I need all available threads to be
> > processed as fast as possible. Is there a way for me to elimnate this
> > behavior or at least minimize the window in which there are no threads on
> > the run queue? Is there a sysctl parameter I can use?
>
> If there's a runnable task on the system, the run queue should never empty
> except inside schedule(). The scheduler should then swap expired and active.
>
> First question - what kernel are you running? Is it stock?
>
> Second question - what's the application? Are you sure your threads just
> aren't falling into interruptible sleep due to an app bug of some sort? Are
> you observing misbehavior in the application (long pauses) or just in the
> reporting?
>
> Thanks,
> Chase
>
The kernel is built from the Debian kernel source, version 2.4.27-3, recompiled
to support SMP, high memory, etc. The application is SpamAssassin version 3.1.1.
It is possible there is an app bug, but I do not know this for certain. We have
changed the configuration of the daemon to try to alleviate the symptoms, to no
avail. We experience the issue whether or not we use a MySQL backend for the
Bayes DB. We do see misbehavior in the application, in the time it takes for
messages to be processed. It normally takes tenths of a second to process
incoming mail, but we notice the processing time jump to over 10 seconds each
time the run queue drops to 0, and then drop back down to tenths of a second
when the queue fills back up.
* Re: Runnable threads on run queue
2006-07-08 22:19 ` Dr. David Alan Gilbert
@ 2006-07-08 23:08 ` Ask List
0 siblings, 0 replies; 13+ messages in thread
From: Ask List @ 2006-07-08 23:08 UTC (permalink / raw)
To: linux-kernel
I don't have iostat -x 1 output from exactly the same time frame, but I do have
the collected sar data from before and during our period of high load. Here is
a snippet from before ...
tps rtps wtps bread/s bwrtn/s
11:46:00 PM 5.00 0.00 5.00 0.00 160.00
11:46:01 PM 8.91 0.00 8.91 0.00 182.18
11:46:02 PM 10.00 0.00 10.00 0.00 368.00
11:46:03 PM 7.92 0.00 7.92 0.00 142.57
11:46:04 PM 13.00 0.00 13.00 0.00 336.00
11:46:05 PM 9.90 0.00 9.90 0.00 261.39
11:46:06 PM 9.00 0.00 9.00 0.00 264.00
11:46:07 PM 6.93 0.00 6.93 0.00 198.02
11:46:08 PM 8.00 0.00 8.00 0.00 288.00
11:46:09 PM 12.87 0.00 12.87 0.00 324.75
11:46:10 PM 9.00 0.00 9.00 0.00 280.00
11:46:11 PM 6.93 0.00 6.93 0.00 134.65
11:46:12 PM 12.00 0.00 12.00 0.00 336.00
11:46:13 PM 10.89 0.00 10.89 0.00 253.47
11:46:14 PM 18.00 0.00 18.00 0.00 464.00
11:46:15 PM 4.81 0.00 4.81 0.00 84.62
11:46:16 PM 10.00 0.00 10.00 0.00 328.00
11:46:17 PM 10.89 0.00 10.89 0.00 269.31
11:46:18 PM 11.00 0.00 11.00 0.00 304.00
11:46:19 PM 30.69 0.00 30.69 0.00 451.49
11:46:20 PM 9.00 0.00 9.00 0.00 272.00
11:46:21 PM 5.94 0.00 5.94 0.00 95.05
11:46:22 PM 10.00 0.00 10.00 0.00 304.00
11:46:23 PM 5.94 0.00 5.94 0.00 150.50
11:46:24 PM 17.00 0.00 17.00 0.00 432.00
11:46:25 PM 6.93 0.00 6.93 0.00 190.10
11:46:26 PM 10.00 0.00 10.00 0.00 344.00
11:46:27 PM 8.91 0.00 8.91 0.00 166.34
11:46:28 PM 7.00 0.00 7.00 0.00 192.00
11:46:29 PM 15.84 0.00 15.84 0.00 427.72
11:46:30 PM 7.00 0.00 7.00 0.00 168.00
11:46:31 PM 9.90 0.00 9.90 0.00 221.78
11:46:32 PM 12.00 0.00 12.00 0.00 360.00
11:46:33 PM 10.89 0.00 10.89 0.00 245.54
11:46:34 PM 10.00 0.00 10.00 0.00 280.00
11:46:35 PM 6.93 0.00 6.93 0.00 134.65
11:46:36 PM 11.00 0.00 11.00 0.00 296.00
11:46:37 PM 8.91 0.00 8.91 0.00 205.94
11:46:38 PM 12.00 0.00 12.00 0.00 376.00
11:46:39 PM 14.85 0.00 14.85 0.00 435.64
11:46:40 PM 9.00 0.00 9.00 0.00 248.00
11:46:41 PM 7.92 0.00 7.92 0.00 237.62
11:46:42 PM 10.00 0.00 10.00 0.00 320.00
11:46:43 PM 5.94 0.00 5.94 0.00 55.45
11:46:44 PM 15.00 0.00 15.00 0.00 408.00
11:46:45 PM 9.90 0.00 9.90 0.00 229.70
11:46:46 PM 10.00 0.00 10.00 0.00 272.00
11:46:47 PM 10.89 0.00 10.89 0.00 269.31
11:46:48 PM 10.00 0.00 10.00 0.00 272.00
11:46:49 PM 36.63 0.00 36.63 0.00 514.85
11:46:50 PM 11.00 0.00 11.00 0.00 296.00
11:46:51 PM 8.91 0.00 8.91 0.00 205.94
11:46:52 PM 11.00 0.00 11.00 0.00 312.00
11:46:53 PM 8.91 0.00 8.91 0.00 190.10
11:46:54 PM 15.00 0.00 15.00 0.00 368.00
11:46:55 PM 9.90 0.00 9.90 0.00 253.47
11:46:56 PM 11.00 0.00 11.00 0.00 352.00
11:46:57 PM 8.91 0.00 8.91 0.00 245.54
11:46:58 PM 9.00 0.00 9.00 0.00 256.00
11:46:59 PM 11.88 0.00 11.88 0.00 308.91
11:47:00 PM 7.00 0.00 7.00 0.00 168.00
and here is a snippet during high load (same columns) ...
12:13:00 AM 6.00 0.00 6.00 0.00 224.00
12:13:01 AM 8.06 0.00 8.06 0.00 180.65
12:13:02 AM 18.00 0.00 18.00 0.00 544.00
12:13:03 AM 8.00 0.00 8.00 0.00 192.00
12:13:04 AM 47.00 0.00 47.00 0.00 856.00
12:13:05 AM 8.91 0.00 8.91 0.00 229.70
12:13:06 AM 15.00 0.00 15.00 0.00 392.00
12:13:07 AM 9.90 0.00 9.90 0.00 229.70
12:13:08 AM 8.00 0.00 8.00 0.00 232.00
12:13:09 AM 15.52 0.00 15.52 0.00 379.31
12:13:10 AM 6.98 0.00 6.98 0.00 198.45
12:13:12 AM 12.96 0.00 12.96 0.00 348.15
12:13:13 AM 17.00 0.00 17.00 0.00 424.00
12:13:14 AM 28.74 0.00 28.74 0.00 526.95
12:13:15 AM 13.46 0.00 13.46 0.00 361.54
12:13:16 AM 9.40 0.00 9.40 0.00 225.64
12:13:17 AM 15.00 0.00 15.00 0.00 488.00
12:13:19 AM 14.91 0.00 14.91 0.00 377.64
12:13:20 AM 9.00 0.00 9.00 0.00 296.00
12:13:21 AM 12.15 0.00 12.15 0.00 366.36
12:13:22 AM 26.00 0.00 26.00 0.00 784.00
12:13:24 AM 11.06 0.00 11.06 0.00 324.42
12:13:25 AM 14.81 0.00 14.81 0.00 333.33
12:13:26 AM 25.47 0.00 25.47 0.00 777.36
12:13:27 AM 19.00 0.00 19.00 0.00 480.00
12:13:28 AM 20.79 0.00 20.79 0.00 538.61
12:13:29 AM 5.00 0.00 5.00 0.00 136.00
12:13:31 AM 12.73 0.00 12.73 0.00 298.18
12:13:32 AM 23.00 0.00 23.00 0.00 632.00
12:13:33 AM 36.79 0.00 36.79 0.00 1011.32
12:13:34 AM 37.50 0.00 37.50 0.00 950.00
12:13:35 AM 7.76 0.00 7.76 0.00 186.21
12:13:36 AM 12.93 0.00 12.93 0.00 324.14
12:13:38 AM 8.57 0.00 8.57 0.00 210.29
12:13:39 AM 30.00 0.00 30.00 0.00 696.00
12:13:40 AM 9.90 0.00 9.90 0.00 245.54
12:13:41 AM 12.00 0.00 12.00 0.00 328.00
12:13:42 AM 5.94 0.00 5.94 0.00 63.37
12:13:43 AM 7.00 0.00 7.00 0.00 256.00
12:13:44 AM 44.54 0.00 44.54 0.00 746.22
12:13:45 AM 9.71 0.00 9.71 0.00 248.54
12:13:46 AM 13.89 0.00 13.89 0.00 370.37
12:13:47 AM 13.00 0.00 13.00 0.00 336.00
12:13:48 AM 13.86 0.00 13.86 0.00 324.75
12:13:49 AM 15.00 0.00 15.00 0.00 344.00
12:13:50 AM 3.96 0.00 3.96 0.00 39.60
12:13:51 AM 11.00 0.00 11.00 0.00 368.00
12:13:52 AM 7.92 0.00 7.92 0.00 174.26
12:13:54 AM 10.17 0.00 10.17 0.00 266.67
12:13:55 AM 7.41 0.00 7.41 0.00 133.33
12:13:56 AM 15.00 0.00 15.00 0.00 328.00
12:13:57 AM 5.71 0.00 5.71 0.00 91.43
12:13:58 AM 9.68 0.00 9.68 0.00 316.13
12:13:59 AM 24.27 0.00 24.27 0.00 520.39
12:14:00 AM 10.89 0.00 10.89 0.00 324.75
... I hope this helps.
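The two snippets can be compared numerically rather than by eye. The following Python sketch assumes rows in the layout shown above (time, AM/PM, tps, rtps, wtps, bread/s, bwrtn/s); the function names are made up for illustration. Besides the means, it counts consecutive samples that are more than one second apart, since the high-load snippet skips some seconds (for example 12:13:11). The sample rows are copied from the two snippets above.

```python
from datetime import datetime
from statistics import mean


def parse_sar(text):
    """Return a list of (timestamp, tps, bwrtn_per_s) tuples."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        # Expect: HH:MM:SS AM|PM tps rtps wtps bread/s bwrtn/s
        if len(f) != 7 or f[1] not in ("AM", "PM"):
            continue
        ts = datetime.strptime(f[0] + " " + f[1], "%I:%M:%S %p")
        rows.append((ts, float(f[2]), float(f[6])))
    return rows


def summarize(rows):
    """Mean tps, mean bwrtn/s, and number of gaps longer than one second."""
    gaps = [
        (b[0] - a[0]).seconds
        for a, b in zip(rows, rows[1:])
        if (b[0] - a[0]).seconds > 1
    ]
    return {
        "mean_tps": mean(r[1] for r in rows),
        "mean_bwrtn": mean(r[2] for r in rows),
        "gaps": len(gaps),
    }


BEFORE = """\
11:46:00 PM 5.00 0.00 5.00 0.00 160.00
11:46:01 PM 8.91 0.00 8.91 0.00 182.18
11:46:02 PM 10.00 0.00 10.00 0.00 368.00
11:46:03 PM 7.92 0.00 7.92 0.00 142.57
11:46:04 PM 13.00 0.00 13.00 0.00 336.00
"""

DURING = """\
12:13:10 AM 6.98 0.00 6.98 0.00 198.45
12:13:12 AM 12.96 0.00 12.96 0.00 348.15
12:13:13 AM 17.00 0.00 17.00 0.00 424.00
12:13:14 AM 28.74 0.00 28.74 0.00 526.95
12:13:15 AM 13.46 0.00 13.46 0.00 361.54
"""

if __name__ == "__main__":
    print("before:", summarize(parse_sar(BEFORE)))
    print("during:", summarize(parse_sar(DURING)))
```

Five rows per snippet are far too few to draw conclusions from; the same functions can be fed the full sar output.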
* Re: Runnable threads on run queue
2006-07-08 20:18 Ask List
2006-07-08 21:18 ` Chase Venters
2006-07-08 22:19 ` Dr. David Alan Gilbert
@ 2006-07-09 7:20 ` Mike Galbraith
2006-07-09 23:38 ` Horst von Brand
2006-07-12 4:14 ` Ask List
2006-07-09 8:33 ` Rik van Riel
3 siblings, 2 replies; 13+ messages in thread
From: Mike Galbraith @ 2006-07-09 7:20 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
On Sat, 2006-07-08 at 20:18 +0000, Ask List wrote:
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 83 0 1328 301684 37868 1520632 0 0 0 264 400 1332 98 2 0 0
> 17 0 1328 293936 37868 1520688 0 0 0 0 537 979 97 3 0 0
> 73 0 1328 293688 37868 1520712 0 0 0 0 268 2643 98 2 0 0
> 80 0 1328 277220 37868 1520756 0 0 0 0 351 824 98 2 0 0
> 49 0 1328 262452 37868 1520800 0 0 0 0 393 1882 97 3 0 0
> 45 0 1328 246796 37868 1520828 0 0 0 304 302 1631 96 4 0 0
> 55 0 1328 243852 37868 1520872 0 0 0 0 356 1101 99 1 0 0
> 17 0 1328 228672 37868 1520916 0 0 0 0 336 748 97 3 0 0
> 0 0 1328 299948 37868 1520956 0 0 0 0 299 821 78 3 19 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 168 78 8 0 92 0
> 0 0 1328 299184 37868 1520960 0 0 0 248 173 38 0 1 99 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 160 20 0 0 100 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 151 6 0 0 100 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 162 42 0 1 99 0
> 1 0 1328 299188 37868 1520960 0 0 0 0 161 24 0 0 100 0
> 0 0 1328 298808 37868 1520988 0 0 0 100 303 1119 57 0 42 0
> 0 0 1328 298808 37868 1520988 0 0 0 0 162 22 0 1 99 0
Looking at the interrupts column, I suspect you have a network problem,
not a scheduler problem. Looks to me like your SpamAssassin processes are
simply running out of work to do because your network traffic comes in bursts.
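The interrupt-rate observation can be checked by splitting the vmstat samples into busy and idle rows and averaging the "in" column for each group. This is a minimal Python sketch under stated assumptions: the function name and the 90% idle threshold are invented for illustration, and the six sample rows are copied from the original vmstat output (three busy rows, three idle rows, not all consecutive).

```python
from statistics import mean

# Column order of the vmstat output quoted in this thread.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def mean_by_phase(lines, column="in", idle_threshold=90):
    """Return (busy_mean, idle_mean) of `column`, split on the id column."""
    busy, idle = [], []
    for line in lines:
        fields = line.split()
        # Skip headers and anything that is not a full numeric row.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        row = dict(zip(COLUMNS, map(int, fields)))
        (idle if row["id"] >= idle_threshold else busy).append(row[column])
    return (mean(busy) if busy else None, mean(idle) if idle else None)


SAMPLE_ROWS = [
    "83 0 1328 301684 37868 1520632 0 0 0 264 400 1332 98 2 0 0",
    "17 0 1328 293936 37868 1520688 0 0 0 0 537 979 97 3 0 0",
    "73 0 1328 293688 37868 1520712 0 0 0 0 268 2643 98 2 0 0",
    "0 0 1328 299184 37868 1520960 0 0 0 0 168 78 8 0 92 0",
    "0 0 1328 299184 37868 1520960 0 0 0 248 173 38 0 1 99 0",
    "0 0 1328 299184 37868 1520960 0 0 0 0 160 20 0 0 100 0",
]

if __name__ == "__main__":
    busy_in, idle_in = mean_by_phase(SAMPLE_ROWS)
    print("mean interrupts/s while busy:", busy_in)
    print("mean interrupts/s while idle:", idle_in)
```

A lower interrupt rate during the idle rows is consistent with less traffic arriving, though on its own it does not show which side stopped sending or receiving.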
-Mike
* Re: Runnable threads on run queue
2006-07-08 20:18 Ask List
` (2 preceding siblings ...)
2006-07-09 7:20 ` Mike Galbraith
@ 2006-07-09 8:33 ` Rik van Riel
2006-07-12 3:55 ` Ask List
3 siblings, 1 reply; 13+ messages in thread
From: Rik van Riel @ 2006-07-09 8:33 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
Ask List wrote:
> Have an issue maybe someone on this list can help with.
>
> At times of very high load the number of processes on the run queue drops to
> 0 then jumps really high and then drops to 0 and back and forth. It seems to
> last 10 seconds or so.
Are you using sendmail by any chance? :)
We start out with a low load average, so sendmail forks as many
spamassassins as it can...
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 83 0 1328 301684 37868 1520632 0 0 0 264 400 1332 98 2 0 0
> 17 0 1328 293936 37868 1520688 0 0 0 0 537 979 97 3 0 0
> 73 0 1328 293688 37868 1520712 0 0 0 0 268 2643 98 2 0 0
> 80 0 1328 277220 37868 1520756 0 0 0 0 351 824 98 2 0 0
> 49 0 1328 262452 37868 1520800 0 0 0 0 393 1882 97 3 0 0
> 45 0 1328 246796 37868 1520828 0 0 0 304 302 1631 96 4 0 0
> 55 0 1328 243852 37868 1520872 0 0 0 0 356 1101 99 1 0 0
> 17 0 1328 228672 37868 1520916 0 0 0 0 336 748 97 3 0 0
> 0 0 1328 299948 37868 1520956 0 0 0 0 299 821 78 3 19 0
> 0 0 1328 299184 37868 1520960 0 0 0 0 168 78 8 0 92 0
... and guess what?
The load average went through the roof, so sendmail stops forking
spamassassins. Now nothing is running, and sendmail will not start
forking new spamassassins again until after the load average has
decayed to an acceptable level.
After that, it will fork way too many at once again, and the load
average will go through the roof. Lather, rinse, repeat.
You'd probably be better off limiting the number of simultaneous
local mail deliveries to something reasonable, so the load average
always stays at an acceptable level - and more importantly, all of
the CPU capacity could be used if needed...
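The feedback loop described above can be reproduced with a toy simulation. The Python sketch below is not sendmail's actual logic: the function name, the one-second tick, the one-minute exponentially decayed load average and every number in the usage example are assumptions for illustration. It also ignores CPU contention, so each job takes a fixed number of ticks no matter how many are running.

```python
import math


def simulate(ticks, arrivals_per_tick, job_ticks, load_limit=None, max_workers=None):
    """Return the number of running workers at each tick.

    load_limit:  start new workers only while the load average is below this.
    max_workers: never run more than this many workers at once.
    """
    decay = math.exp(-1.0 / 60.0)  # one-minute load average, one-second ticks
    load = 0.0
    backlog = 0                    # messages waiting for a worker
    running = []                   # remaining ticks for each running worker
    history = []
    for _ in range(ticks):
        backlog += arrivals_per_tick
        can_start = backlog
        if load_limit is not None and load >= load_limit:
            can_start = 0          # load too high: refuse to start anything
        if max_workers is not None:
            can_start = min(can_start, max_workers - len(running))
        running.extend([job_ticks] * can_start)
        backlog -= can_start
        history.append(len(running))
        # The load average follows the number of running workers with a lag.
        load = load * decay + len(running) * (1.0 - decay)
        running = [t - 1 for t in running if t > 1]
    return history


if __name__ == "__main__":
    gated = simulate(300, arrivals_per_tick=10, job_ticks=5, load_limit=8.0)
    capped = simulate(300, arrivals_per_tick=10, job_ticks=5, max_workers=60)
    print("load-gated: min", min(gated), "max", max(gated))
    print("capped:     min", min(capped[10:]), "max", max(capped))
```

In the load-gated run the worker count alternates between large bursts and stretches of zero, because the load average lags behind what is actually running. With a fixed cap on concurrent workers and no load-average gate, the count settles at a steady level.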
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
* Re: Runnable threads on run queue
[not found] <fa.CQngdtRN/1xSBi2RLvhjLxBm1bE@ifi.uio.no>
@ 2006-07-09 16:11 ` Robert Hancock
0 siblings, 0 replies; 13+ messages in thread
From: Robert Hancock @ 2006-07-09 16:11 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
Ask List wrote:
> Have an issue maybe someone on this list can help with.
>
> At times of very high load the number of processes on the run queue drops to
> 0 then jumps really high and then drops to 0 and back and forth. It seems to
> last 10 seconds or so. If you look at this vmstat you can see an example of
> what I mean. Now im not a linux kernel expert but i am thinking it has
> something to do with the scheduling algorithm and locking of the run queue.
> For this particular application I need all available threads to be processed as
> fast as possible. Is there a way for me to elimnate this behavior or at least
> minimize the window in which there are no threads on the run queue? Is there a
> sysctl parameter I can use?
>
> Please help.
This seems like a userspace issue to me. There is no way the scheduler
would let the system sit idle for 10 seconds with runnable processes. I
think Rik van Riel's comment about sendmail reacting to increased load
average may be related to what's going on here.
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
* Re: Runnable threads on run queue
2006-07-09 7:20 ` Mike Galbraith
@ 2006-07-09 23:38 ` Horst von Brand
2006-07-12 4:14 ` Ask List
1 sibling, 0 replies; 13+ messages in thread
From: Horst von Brand @ 2006-07-09 23:38 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Ask List, linux-kernel
Mike Galbraith <efault@gmx.de> wrote:
> On Sat, 2006-07-08 at 20:18 +0000, Ask List wrote:
> > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> > r b swpd free buff cache si so bi bo in cs us sy id wa
[...]
> Looking at the interrupts column, I suspect you have a network problem,
> not a scheduler problem. Looks to me like your SpamAssasins are simply
> running out of work to do because your network traffic comes in bursts.
spamassassin acted up here some time ago. With personal training and some
messages it went into a loop and the load went through the roof. Couldn't
find a cure, plus some hundred users with large personalized rule files
were causing problems anyway, so we axed that.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
* Re: Runnable threads on run queue
2006-07-09 8:33 ` Rik van Riel
@ 2006-07-12 3:55 ` Ask List
0 siblings, 0 replies; 13+ messages in thread
From: Ask List @ 2006-07-12 3:55 UTC (permalink / raw)
To: linux-kernel
We are not running sendmail. We developed our own mail server in-house. We have
a cluster of these mail servers sending spam traffic to a cluster of SA servers.
We use the round-robin parameter when starting the spamd process, and we start
the daemon with a ton of min/spare/max children. So we don't see the forking
issue you mention.
* Re: Runnable threads on run queue
2006-07-09 7:20 ` Mike Galbraith
2006-07-09 23:38 ` Horst von Brand
@ 2006-07-12 4:14 ` Ask List
2006-07-12 5:40 ` Mike Galbraith
1 sibling, 1 reply; 13+ messages in thread
From: Ask List @ 2006-07-12 4:14 UTC (permalink / raw)
To: linux-kernel
Mike Galbraith <efault <at> gmx.de> writes:
...
> Looking at the interrupts column, I suspect you have a network problem,
> not a scheduler problem. Looks to me like your SpamAssasins are simply
> running out of work to do because your network traffic comes in bursts.
>
> -Mike
>
>
Network problem? So you're saying our mail servers are not sending spam traffic
fast enough if the SpamAssassin processes are running out of work to do? So when
our mail servers are not sending spam traffic, we see our CPU, cs, interrupts,
and runnable threads drop ...?
I'd really like to believe this is true, but in the SA logs there are still
plenty of B (busy) threads...
* Re: Runnable threads on run queue
2006-07-12 4:14 ` Ask List
@ 2006-07-12 5:40 ` Mike Galbraith
2006-07-13 19:05 ` Ask List
0 siblings, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2006-07-12 5:40 UTC (permalink / raw)
To: Ask List; +Cc: linux-kernel
On Wed, 2006-07-12 at 04:14 +0000, Ask List wrote:
> Network Problem? So your saying our mail servers are not sending spam traffic
> fast enough if spam assassin processes are running out of work to do? So when
> our mail servers are not sending spam traffic we see our cpu,cs,interrupts, &
> runnable threads drop ...?
More or less, yes. I think somebody is dropping the communication ball.
-Mike
* Re: Runnable threads on run queue
2006-07-12 5:40 ` Mike Galbraith
@ 2006-07-13 19:05 ` Ask List
0 siblings, 0 replies; 13+ messages in thread
From: Ask List @ 2006-07-13 19:05 UTC (permalink / raw)
To: linux-kernel
I'll look into it. Thanks for the input.