* scheduler starvation running irman with 2.5.64bk2
@ 2003-03-09 2:50 rwhron
2003-03-09 3:07 ` Zwane Mwaikambo
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: rwhron @ 2003-03-09 2:50 UTC (permalink / raw)
To: linux-kernel
irman triggers some odd behavior with 2.5.64bk2 on uniprocessor
K6/2 475. "ps aux" hasn't returned for a couple hours, though
irman appears to be doing it's thing. I haven't tried irman on smp.
Time to run irman 3x.
2.5.63 4066 seconds
2.5.63-mjb1 2993 seconds
2.5.63-mm2-dline 2856 seconds
2.5.64 3473 seconds
2.5.64bk2 ??
Time to complete irman is variable because the amount of work
done in 3 runs is also variable.
vmstat looks fairly typical, and it prints every 60 seconds.
vmstat 60
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
4 0 0 2564 376368 108 1916 0 0 0 0 1092 25978 2 98 0
4 0 0 2564 376368 108 1920 0 0 0 0 1092 26102 2 98 0
This is vmstat 60 from 2.5.64
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
4 0 0 2552 377024 116 1520 0 0 0 0 1092 32649 4 96 0
3 0 0 2552 377024 116 1520 0 0 0 0 1092 32865 5 95 0
There are probably less than 50 processes running.
"ps aux" doesn't hear <ctrl z> or <ctrl c>.
Machine doesn't accept new ssh connections.
It's as if after starting irman, new processes don't get any CPU time.
--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-09 2:50 rwhron
@ 2003-03-09 3:07 ` Zwane Mwaikambo
2003-03-09 18:59 ` Mike Galbraith
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Zwane Mwaikambo @ 2003-03-09 3:07 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
On Sat, 8 Mar 2003 rwhron@earthlink.net wrote:
> There are probably less than 50 processes running.
> "ps aux" doesn't hear <ctrl z> or <ctrl c>.
>
> Machine doesn't accept new ssh connections.
>
> It's as if after starting irman, new processes don't get any CPU time.
Try doing a sysrq-T to get task state for all the processes and even
sysrq-P at any random point will give you an idea of where the cpu is.
Zwane
--
function.linuxpower.ca
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-09 2:50 rwhron
2003-03-09 3:07 ` Zwane Mwaikambo
@ 2003-03-09 18:59 ` Mike Galbraith
2003-03-09 19:13 ` Mike Galbraith
2003-03-10 4:42 ` Mike Galbraith
2003-03-10 4:46 ` Mike Galbraith
3 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2003-03-09 18:59 UTC (permalink / raw)
To: rwhron, linux-kernel
At 09:50 PM 3/8/2003 -0500, rwhron@earthlink.net wrote:
>irman triggers some odd behavior with 2.5.64bk2 on uniprocessor
>K6/2 475. "ps aux" hasn't returned for a couple hours, though
>irman appears to be doing it's thing. I haven't tried irman on smp.
>
>Time to run irman 3x.
>
>2.5.63 4066 seconds
>2.5.63-mjb1 2993 seconds
>2.5.63-mm2-dline 2856 seconds
>2.5.64 3473 seconds
>2.5.64bk2 ??
It _appears_ to be a valid test-case. I can reproduce this with
2.5.64+combo and B2. I can also influence it to behave. (that doesn't
mean that the scheduler changes are bust... could be that the test-case is
doing something bad)
Q/indicator: do you ever see ksoftirqd _wanting_ to run? (set top SCHED_RR)
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-09 18:59 ` Mike Galbraith
@ 2003-03-09 19:13 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2003-03-09 19:13 UTC (permalink / raw)
To: rwhron, linux-kernel
At 07:59 PM 3/9/2003 +0100, Mike Galbraith wrote:
>At 09:50 PM 3/8/2003 -0500, rwhron@earthlink.net wrote:
>>irman triggers some odd behavior with 2.5.64bk2 on uniprocessor
>>K6/2 475. "ps aux" hasn't returned for a couple hours, though
>>irman appears to be doing it's thing. I haven't tried irman on smp.
>>
>>Time to run irman 3x.
>>
>>2.5.63 4066 seconds
>>2.5.63-mjb1 2993 seconds
>>2.5.63-mm2-dline 2856 seconds
>>2.5.64 3473 seconds
>>2.5.64bk2 ??
(btw, it's only the process load that hangs)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-09 2:50 rwhron
2003-03-09 3:07 ` Zwane Mwaikambo
2003-03-09 18:59 ` Mike Galbraith
@ 2003-03-10 4:42 ` Mike Galbraith
2003-03-10 4:46 ` Mike Galbraith
3 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2003-03-10 4:42 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel, Ingo Molnar
[-- Attachment #1: Type: text/plain, Size: 925 bytes --]
At 09:50 PM 3/8/2003 -0500, rwhron@earthlink.net wrote:
>irman triggers some odd behavior with 2.5.64bk2 on uniprocessor
>K6/2 475. "ps aux" hasn't returned for a couple hours, though
>irman appears to be doing it's thing. I haven't tried irman on smp.
OK, can you do the following to determine whether we're both seeing the
_same_ problem?
1.) build the attached rtnice utility (don't remember who wrote/posted this)
2.) login on vt1 and set the shell SCHED_RR via rtnice -n 1 -p <pid_of_sh>
-d RR
3.) login on vt2 and renice that shell to -10
4.) login on another vt as a normal user, and start irman
5.) try login/out on another vt, or ps or _whatever_ (doesn't matter)
until box is starving
6.) on vt2, try to do ps (it should hang despite -10 priority)
7.) on vt1, try to do ps (it should work just fine)
cc to Ingo since stock .64 doesn't show this problem, and .64-combo and
.64-B2 do.
-Mike
[-- Attachment #2: rtnice.c --]
[-- Type: text/plain, Size: 3933 bytes --]
#include <stdio.h>
#include <unistd.h>
#include <getopt.h>
#include <string.h>
#include <sys/types.h>
#include <sched.h>
#include <sys/time.h>
/*
* Gross hack, just in case glibc isn't up to date yet.
* (it wasn't when SCHED_IDLE was added to the kernel)
*/
#ifndef SCHED_IDLE
#define SCHED_IDLE 3
#endif
static char *usage = "Usage: %s -n <priority> -p <pid> -d FF|TS|RR|ID [command...]\n";
static char progname[128];
main(int argc, char *argv[])
{
int minpri, maxpri;
int ch;
char *p;
struct timeval interval;
static struct sched_param param;
pid_t pid = 0;
int do_priority = 0;
int do_policy = 0;
int policy = 0;
int priority = 0;
int args = 0;
extern char *optarg;
extern int optind;
(void) strcpy(progname, argv[0]);
while ((ch = getopt(argc, argv, "p:d:n:")) != EOF) {
switch (ch) {
case 'p':
if (sscanf(optarg, "%d", &pid) != 1) {
fprintf(stderr,"%s: Bad pid.\n", progname);
fprintf(stderr, usage, progname);
exit(3);
}
args++;
#ifdef DEBUG
fprintf(stderr, "pid = %d.\n", pid);
#endif
break;
case 'd':
if (!strcmp(optarg, "TS")) {
policy = SCHED_OTHER;
}
else
if (!strcmp(optarg, "RR")) {
policy = SCHED_RR;
}
else
if (!strcmp(optarg, "ID"))
policy = SCHED_IDLE;
else
if (!strcmp(optarg, "FF") || !strcmp(optarg, "FIFO")) {
policy = SCHED_FIFO;
}
else {
fprintf(stderr,"%s: Bad scheduling domain '%s'.\n", progname, optarg);
fprintf(stderr, usage, progname);
exit(4);
}
args++;
do_policy++;
#ifdef DEBUG
fprintf(stderr,"policy = %d.\n", policy);
#endif
break;
case 'n':
/* we have one digit, let's get more */
if (sscanf(optarg, "%d", &priority) != 1) {
fprintf(stderr,"%s: Bad priority value.\n", progname);
fprintf(stderr, usage, progname);
exit(4);
}
do_priority++;
args++;
#ifdef DEBUG
fprintf(stderr, "priority = %d.\n", priority);
#endif
break;
default:
fprintf(stderr,"%s: Unrecognized option '%c'.\n", progname, ch);
fprintf(stderr, usage, progname);
exit(6);
break;
}
}
argc -= optind;
argv += optind;
if (args == 0) {
fprintf(stderr, usage, progname);
exit(7);
}
if ((maxpri = sched_get_priority_max(SCHED_FIFO)) == -1) {
perror("sched_get_priority_max fails");
exit(8);
}
if ((minpri = sched_get_priority_min(SCHED_FIFO)) == -1) {
perror("sched_get_priority_min fails");
exit(10);
}
if (do_priority) {
if (priority > maxpri) {
fprintf(stderr,"%s: maximum priority allowed is %d.\n", progname, maxpri);
exit(9);
}
if (priority < minpri) {
fprintf(stderr,"%s: minimum priority allowed is %d.\n", progname, minpri);
exit(11);
}
}
if (!do_policy) {
if ((policy = sched_getscheduler(pid)) == -1) {
policy = SCHED_OTHER;
}
}
if (!do_priority) {
if (sched_getparam(pid, ¶m) == 0) { /* will fail for TS! */
priority = param.sched_priority;
}
else {
priority = minpri;
}
if (!do_policy && pid) { /* print and exit */
static char *pol; /* default */
if (policy == SCHED_OTHER) pol = "TS";
if (policy == SCHED_FIFO) pol = "FIFO";
if (policy == SCHED_RR) pol = "RR";
fprintf(stdout,"%s: scheduler %s priority %d.\n", progname, pol, priority);
exit(0);
}
}
if (policy == SCHED_OTHER || policy == SCHED_IDLE)
priority = 0;
param.sched_priority = priority;
if (sched_setscheduler(pid, policy, ¶m) == -1) {
perror("sched_setscheduler fails");
exit(13);
}
if (pid != 0) { /* non-local target => exit */
exit(0);
}
if (!argv[0]) {
fprintf(stderr,"%s: no valid command args.\n", progname);
fprintf(stderr, usage, progname);
exit(14); /* don't exec if target was not us or no args */
}
/* the big leap! */
/* hal: dave, will I dream? */
if (!execvp(argv[0], &argv[0])) {
perror("rtrun exec fails");
exit(-1);
}
}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-09 2:50 rwhron
` (2 preceding siblings ...)
2003-03-10 4:42 ` Mike Galbraith
@ 2003-03-10 4:46 ` Mike Galbraith
2003-03-10 5:54 ` Mike Galbraith
3 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2003-03-10 4:46 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
P.S. You can save a little time by only running the process load. Edit
irman.c:412 and set load_num to 3... no need to wait for the other two
loads to complete first, it's the process load that starves the box to
death here.
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-10 4:46 ` Mike Galbraith
@ 2003-03-10 5:54 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2003-03-10 5:54 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
At 05:46 AM 3/10/2003 +0100, Mike Galbraith wrote:
>P.S. You can save a little time by only running the process load. Edit
>irman.c:412 and set load_num to 3... no need to wait for the other two
>loads to complete first, it's the process load that starves the box to
>death here.
P.P.S. (last one;)
To make it cease and desist starving your box, renice the shell you're
running irman from. 12 is the magic (hmm) number here at which starvation
ceases.
It's interesting to watch the response times irman reports. The higher the
renice level, the better response times it reports both in 2.5.64-virgin,
and 2.5.64-combo (which looks like what is in bk2). /me has no clue
whether that means anything or not ;-)
ciao,
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
@ 2003-03-11 0:09 rwhron
2003-03-11 2:37 ` Mike Galbraith
0 siblings, 1 reply; 11+ messages in thread
From: rwhron @ 2003-03-11 0:09 UTC (permalink / raw)
To: efault, ingo; +Cc: linux-kernel
> OK, can you do the following to determine whether we're both seeing the
> _same_ problem?
> 1.) build the attached rtnice utility (don't remember who wrote/posted this)
> 2.) login on vt1 and set the shell SCHED_RR via rtnice -n 1 -p <pid_of_sh>
> -d RR
> 3.) login on vt2 and renice that shell to -10
> 4.) login on another vt as a normal user, and start irman
> 5.) try login/out on another vt, or ps or _whatever_ (doesn't matter)
> until box is starving
> 6.) on vt2, try to do ps (it should hang despite -10 priority)
> 7.) on vt1, try to do ps (it should work just fine)
Here is the test as I understand it, splitted by tty. Quick summary,
only new process seems starved. RR and nice -10 process behave okay.
BTW, this is now with 2.5.64bk5.
Below shows irman started, and it gets into the PROCESS
load. Only thing odd here is the PROCESS load never completes.
(or I am not patient enough)
root@mountain:/usr/src/sources/i/irman# tty
/dev/pts/0
root@mountain:/usr/src/sources/i/irman# ./irman
Response time measurements (milliseconds) for: 2.5.64bk5
Load Max Min Avg Std. Dev.
NULL 0.177 0.022 0.026 0.003
MEMORY 1074.825 0.020 0.044 1.424
FILE_IO 408.013 0.016 0.039 2.469
This is the RR shell. It functions okay throughout the
irman run.
root@mountain:~# tty
/dev/pts/1
root@mountain:~# rtnice -n 1 -p $$ -d RR
root@mountain:~# uptime
18:46:33 up 8 min, 3 users, load average: 3.21, 2.06, 0.89
This is the nice -10 shell. It does okay too.
root@mountain:~# tty
/dev/pts/2
root@mountain:~# renice -10 $$
787: old priority 0, new priority -10
root@mountain:~# uptime
18:45:18 up 7 min, 3 users, load average: 3.39, 1.76, 0.70
root@mountain:~# uptime
18:50:19 up 12 min, 3 users, load average: 3.88, 2.98, 1.50
root@mountain:~# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.5 0.0 1312 100 ? S 18:37 0:04 init [3]
root 2 0.0 0.0 0 0 ? SWN 18:37 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? SW< 18:37 0:00 [events/0]
root 4 0.0 0.0 0 0 ? SW 18:37 0:00 [pdflush]
root 5 0.0 0.0 0 0 ? SW 18:37 0:00 [pdflush]
root 6 0.2 0.0 0 0 ? SW 18:37 0:02 [kswapd0]
root 7 0.0 0.0 0 0 ? SW< 18:37 0:00 [aio/0]
root 326 0.0 0.0 0 0 ? SW 18:37 0:00 [eth0]
root 396 0.0 0.0 1368 92 ? S 18:37 0:00 syslogd -m 0
root 401 0.0 0.0 1300 48 ? S 18:37 0:00 klogd -x
rpc 421 0.0 0.0 1444 40 ? S 18:37 0:00 portmap
root 534 0.0 0.0 2568 224 ? S 18:37 0:00 /usr/sbin/sshd
root 544 0.0 0.0 1284 44 tty1 S 18:37 0:00 /sbin/mingetty tty1
root 545 0.0 0.0 1300 44 ttyS1 S 18:37 0:00 /sbin/agetty -h ttyS1 38400 linux
root 548 0.0 0.0 3352 256 ? S 18:38 0:00 /usr/sbin/sshd
rwhron 549 0.0 0.0 2404 32 pts/0 S 18:38 0:00 -bash
root 593 0.0 0.0 2268 48 pts/0 S 18:38 0:00 su -
root 594 0.0 0.0 2400 32 pts/0 S 18:38 0:00 -bash
root 645 0.0 0.0 3352 300 ? S 18:38 0:00 /usr/sbin/sshd
rwhron 646 0.0 0.0 2404 32 pts/1 S 18:38 0:00 -bash
root 690 0.0 0.0 2268 48 pts/1 S 18:38 0:00 su -
root 691 0.0 0.1 2400 512 pts/1 S 18:38 0:00 -bash
root 741 0.0 0.0 3352 300 ? S 18:38 0:00 /usr/sbin/sshd
rwhron 742 0.0 0.0 2404 32 pts/2 S 18:38 0:00 -bash
root 786 0.0 0.0 2268 48 pts/2 S 18:38 0:00 su -
root 787 0.0 0.1 2400 520 pts/2 S< 18:38 0:00 -bash
root 844 23.0 0.0 1460 224 pts/0 R 18:41 2:10 ./irman
root 845 0.0 0.0 0 0 pts/0 Z 18:41 0:00 [irman <defunct>]
root 846 2.1 0.0 0 0 pts/0 Z 18:41 0:12 [irman <defunct>]
root 848 5.9 0.0 0 0 pts/0 Z 18:41 0:32 [mem_load <defunct>]
root 849 2.3 0.0 0 0 pts/0 Z 18:41 0:12 [irman <defunct>]
root 850 0.0 0.0 0 0 pts/0 Z 18:43 0:00 [io_load <defunct>]
root 851 1.7 0.0 0 0 pts/0 Z 18:43 0:07 [irman <defunct>]
root 855 8.9 0.0 1300 324 pts/0 S 18:44 0:32 process_load --processes 10 --recordsize 10000 --injections 2
root 857 7.4 0.0 1288 276 pts/0 S 18:44 0:27 child_process 10000 0 10
root 858 7.9 0.0 1288 276 pts/0 S 18:44 0:28 child_process 10000 3 12
root 859 13.0 0.0 1288 276 pts/0 R 18:44 0:47 child_process 10000 11 14
root 860 13.4 0.0 1288 276 pts/0 S 18:44 0:49 child_process 10000 13 16
root 861 8.0 0.0 1288 276 pts/0 R 18:44 0:29 child_process 10000 15 18
root 862 10.5 0.0 1288 276 pts/0 S 18:44 0:38 child_process 10000 17 20
root 863 11.2 0.0 1288 276 pts/0 S 18:44 0:41 child_process 10000 19 22
root 864 8.7 0.0 1288 276 pts/0 S 18:44 0:31 child_process 10000 21 24
root 865 7.9 0.0 1288 276 pts/0 S 18:44 0:28 child_process 10000 23 26
root 868 0.0 0.0 2568 108 ? R 18:47 0:00 /usr/sbin/sshd
root 874 0.0 0.1 2544 668 pts/2 R< 18:50 0:00 ps aux
However if I start a new connection, it never gives a prompt.
rwhron@rushmore:~$ ssh mt
.. wait several minutes ... still no prompt.
>From the RR shell, one can see the new PID 876 for the sshd above.
root@mountain:~# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.4 0.0 1312 100 ? S 18:37 0:04 init [3]
root 2 0.0 0.0 0 0 ? SWN 18:37 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? SW< 18:37 0:00 [events/0]
root 4 0.0 0.0 0 0 ? SW 18:37 0:00 [pdflush]
root 5 0.0 0.0 0 0 ? SW 18:37 0:00 [pdflush]
root 6 0.2 0.0 0 0 ? SW 18:37 0:02 [kswapd0]
root 7 0.0 0.0 0 0 ? SW< 18:37 0:00 [aio/0]
root 326 0.0 0.0 0 0 ? SW 18:37 0:00 [eth0]
root 396 0.0 0.0 1368 92 ? S 18:37 0:00 syslogd -m 0
root 401 0.0 0.0 1300 52 ? S 18:37 0:00 klogd -x
rpc 421 0.0 0.0 1444 44 ? S 18:37 0:00 portmap
root 534 0.0 0.0 2568 224 ? S 18:37 0:00 /usr/sbin/sshd
root 544 0.0 0.0 1284 48 tty1 S 18:37 0:00 /sbin/mingetty tty1
root 545 0.0 0.0 1300 48 ttyS1 S 18:37 0:00 /sbin/agetty -h ttyS1 38400 linux
root 548 0.0 0.0 3352 256 ? S 18:38 0:00 /usr/sbin/sshd
rwhron 549 0.0 0.0 2404 36 pts/0 S 18:38 0:00 -bash
root 593 0.0 0.0 2268 52 pts/0 S 18:38 0:00 su -
root 594 0.0 0.0 2400 36 pts/0 S 18:38 0:00 -bash
root 645 0.0 0.0 3352 304 ? R 18:38 0:00 /usr/sbin/sshd
rwhron 646 0.0 0.0 2404 36 pts/1 S 18:38 0:00 -bash
root 690 0.0 0.0 2268 52 pts/1 S 18:38 0:00 su -
root 691 0.0 0.1 2400 520 pts/1 S 18:38 0:00 -bash
root 741 0.0 0.0 3352 304 ? S 18:38 0:00 /usr/sbin/sshd
rwhron 742 0.0 0.0 2404 36 pts/2 S 18:38 0:00 -bash
root 786 0.0 0.0 2268 52 pts/2 S 18:38 0:00 su -
root 787 0.0 0.1 2400 520 pts/2 S< 18:38 0:00 -bash
root 844 18.6 0.0 1460 224 pts/0 R 18:41 2:10 ./irman
root 845 0.0 0.0 0 0 pts/0 Z 18:41 0:00 [irman <defunct>]
root 846 1.7 0.0 0 0 pts/0 Z 18:41 0:12 [irman <defunct>]
root 848 4.8 0.0 0 0 pts/0 Z 18:41 0:32 [mem_load <defunct>]
root 849 1.8 0.0 0 0 pts/0 Z 18:41 0:12 [irman <defunct>]
root 850 0.0 0.0 0 0 pts/0 Z 18:43 0:00 [io_load <defunct>]
root 851 1.3 0.0 0 0 pts/0 Z 18:43 0:07 [irman <defunct>]
root 855 9.0 0.0 1300 324 pts/0 S 18:44 0:45 process_load --processes 10 --recordsize 10000 --injections 2
root 857 7.5 0.0 1288 276 pts/0 S 18:44 0:37 child_process 10000 0 10
root 858 7.9 0.0 1288 276 pts/0 S 18:44 0:39 child_process 10000 3 12
root 859 13.1 0.0 1288 276 pts/0 S 18:44 1:05 child_process 10000 11 14
root 860 13.5 0.0 1288 276 pts/0 R 18:44 1:07 child_process 10000 13 16
root 861 8.0 0.0 1288 276 pts/0 R 18:44 0:40 child_process 10000 15 18
root 862 10.6 0.0 1288 276 pts/0 S 18:44 0:53 child_process 10000 17 20
root 863 11.3 0.0 1288 276 pts/0 S 18:44 0:56 child_process 10000 19 22
root 864 8.8 0.0 1288 276 pts/0 S 18:44 0:43 child_process 10000 21 24
root 865 7.9 0.0 1288 276 pts/0 S 18:44 0:39 child_process 10000 23 26
root 868 0.0 0.0 2568 108 ? R 18:47 0:00 /usr/sbin/sshd
root 876 0.0 0.0 2568 252 ? R 18:52 0:00 /usr/sbin/sshd
root 877 0.0 0.1 2544 668 pts/1 R 18:53 0:00 ps aux
This is some info on PID 876 (sshd child):
root@mountain:/proc/876# cat cmdline
/usr/sbin/sshd
root@mountain:/proc/876# cat status
Name: sshd
State: R (running)
Tgid: 876
Pid: 876
PPid: 534
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 32
Groups:
VmSize: 2568 kB
VmLck: 0 kB
VmRSS: 252 kB
VmData: 100 kB
VmStk: 12 kB
VmExe: 248 kB
VmLib: 2108 kB
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 8000000000001000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 00000000fffffeff
CapEff: 00000000fffffeff
root@mountain:/proc/876# cat wchan
0
--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-11 0:09 scheduler starvation running irman with 2.5.64bk2 rwhron
@ 2003-03-11 2:37 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2003-03-11 2:37 UTC (permalink / raw)
To: rwhron, ingo; +Cc: linux-kernel
At 07:09 PM 3/10/2003 -0500, rwhron@earthlink.net wrote:
> > OK, can you do the following to determine whether we're both seeing the
> > _same_ problem?
>
> > 1.) build the attached rtnice utility (don't remember who wrote/posted
> this)
> > 2.) login on vt1 and set the shell SCHED_RR via rtnice -n 1 -p <pid_of_sh>
> > -d RR
> > 3.) login on vt2 and renice that shell to -10
> > 4.) login on another vt as a normal user, and start irman
> > 5.) try login/out on another vt, or ps or _whatever_ (doesn't matter)
> > until box is starving
> > 6.) on vt2, try to do ps (it should hang despite -10 priority)
> > 7.) on vt1, try to do ps (it should work just fine)
>
>Here is the test as I understand it, splitted by tty. Quick summary,
>only new process seems starved. RR and nice -10 process behave okay.
>
>BTW, this is now with 2.5.64bk5.
The only difference I see is that reniced task wasn't starved with this
kernel... wonder what the difference is. Thanks for running the test.
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
@ 2003-03-11 2:46 rwhron
2003-03-11 3:43 ` Mike Galbraith
0 siblings, 1 reply; 11+ messages in thread
From: rwhron @ 2003-03-11 2:46 UTC (permalink / raw)
To: efault; +Cc: linux-kernel
> The only difference I see is that reniced task wasn't starved with this
> kernel... wonder what the difference is.
One more thing. After <ctrl c> interrupting irman process load,
ssh into the box works.
--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler starvation running irman with 2.5.64bk2
2003-03-11 2:46 rwhron
@ 2003-03-11 3:43 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2003-03-11 3:43 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
At 09:46 PM 3/10/2003 -0500, rwhron@earthlink.net wrote:
> > The only difference I see is that reniced task wasn't starved with this
> > kernel... wonder what the difference is.
>
>One more thing. After <ctrl c> interrupting irman process load,
>ssh into the box works.
Yup. It should also work fine if you renice the shell you run irman from
to a low enough priority. Anyhoo, we definitely were experiencing the same
exact problem, as was Con with contest. Mission accomplished. I'm off to
fiddle with Jim Houston's patch until Ingo does his magic :)
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2003-03-11 3:28 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-03-11 0:09 scheduler starvation running irman with 2.5.64bk2 rwhron
2003-03-11 2:37 ` Mike Galbraith
-- strict thread matches above, loose matches on Subject: below --
2003-03-11 2:46 rwhron
2003-03-11 3:43 ` Mike Galbraith
2003-03-09 2:50 rwhron
2003-03-09 3:07 ` Zwane Mwaikambo
2003-03-09 18:59 ` Mike Galbraith
2003-03-09 19:13 ` Mike Galbraith
2003-03-10 4:42 ` Mike Galbraith
2003-03-10 4:46 ` Mike Galbraith
2003-03-10 5:54 ` Mike Galbraith
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox