* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
       [not found] <OMEGLKPBDPDHAGCIBHHJMEIDFCAA.aathan-linux-kernel-1542@cloakmail.com>
@ 2004-10-29 14:26 ` Andrew
  2004-10-29 15:07   ` Alan Cox
                      ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Andrew @ 2004-10-29 14:26 UTC (permalink / raw)
  To: Andrew, linux-kernel; +Cc: roland, Andrew Morton

I have reproduced this hang on 2.6.10-rc1-bk7, and have also installed the sysrq-n patch. Even after "SysRq : Nice All RT Tasks", the system is completely unresponsive as far as user mode is concerned, and will only react to SysRq. It -does- respond to ICMP pings. Sysrq-e, -k, -i do not stop the offending tt1 process.

I do not have netdump available in 2.6.10-rc1-bk7, and so cannot provide a full sysrq-t output, but the visible section shows two tt1 threads with identical stacks:

  schedule_timeout+0xd0/0xd2
  futex_wait+0x140/0x1a9
  do_futex+0x33/0x78
  sys_futex+0xcd/0xd9
  sysenter_past_esp+0x52/0x71

I then tried running this task as a non-root user, which should prevent SCHED_RR and priority changes of the threads/tasks. Under these conditions, the system does *not* hang.

I noticed that the app periodically ends up in a high-speed loop involving the ACE_Semaphore class in ACE; having checked the compilation flags, it seems ACE is simulating semaphores using the calls below. It is *not* using POSIX 1003.1b semaphores (sem_wait, etc.):

  pthread_mutex_lock()
  pthread_cond_wait()
  pthread_cond_signal()

Although it appears I need to fix an application bug, is it normal/desirable for an application calling system mutex facilities to starve the system so completely, and/or become "unkillable"?

A.
-----Original Message-----
From: Andrew [mailto:aathan-linux-kernel-1542@cloakmail.com]
Sent: Thursday, October 28, 2004 5:10 PM
To: linux-kernel@vger.kernel.org
Cc: roland@topspin.com; Andrew Morton
Subject: Consistent lock up 2.6.8-1.521 (and 2.6.8.1 w/ high-res-timers/skas/sysemu)

Caveat: This may be an infinite loop in a SCHED_RR process. See very bottom of email for sysrq-t sysrq-p output.

[LARGE EMAIL DELETED]

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 14:26 ` Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?) Andrew
@ 2004-10-29 15:07 ` Alan Cox
  2004-10-29 16:43   ` Andrew A.
  2004-10-29 15:36 ` Andrew A.
  2004-10-29 15:52 ` Andrew A.
  2 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2004-10-29 15:07 UTC (permalink / raw)
  To: Andrew; +Cc: Linux Kernel Mailing List, roland, Andrew Morton

On Gwe, 2004-10-29 at 15:26, Andrew wrote:
> Although it appears I need to fix an application bug, is it normal/desirable
> for an application calling system mutex facilities to starve the system so
> completely, and/or become "unkillable"?

If it is SCHED_RR then it may get to hog the processor, but it should not be doing worse than that, and it should be killable by something of higher priority. You are right to suspect that futexes don't deal with hard real time, but the failure you see isn't the intended failure case.

[Inaky has posted some drafts of a near-futex-efficient lock system that ought to work for real-time use, btw.]

Alan

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 15:07 ` Alan Cox
@ 2004-10-29 16:43 ` Andrew A.
  2004-10-29 17:06   ` Chris Wright
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew A. @ 2004-10-29 16:43 UTC (permalink / raw)
  To: Alan Cox, Andrew; +Cc: Linux Kernel Mailing List, roland, Andrew Morton

Alan: Thanks for your note.

The application in question is not "hard RT", and I am using SCHED_RR to improve latency, rather than to guarantee a particular latency number. Also, since I am using the ACE framework, and don't have the time to detangle its portability preprocessor macros to add support for a different futex/mutex mechanism, I'm inclined to use stock code. I did dig up Inaky's work, which is a fusyn mapping to existing futex calls; I might try that.

However, would any of that really solve this problem? That is, do lower-priority non-RR tasks and/or kernel signal delivery benefit from additional scheduled time under those patches?

I suspect what is happening here is that my process is essentially in a

  while(1)
  {
      lock();
      unlock();
  }

loop from two or more SCHED_RR threads running at nice -15. They seem to be unkillable. However, should we really dismiss the possibility that the problem could be that these threads are in some kind of deadlock that involves the scheduler?

A.

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Alan Cox
Sent: Friday, October 29, 2004 11:07 AM
To: Andrew
Cc: Linux Kernel Mailing List; roland@topspin.com; Andrew Morton
Subject: RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)

On Gwe, 2004-10-29 at 15:26, Andrew wrote:
> Although it appears I need to fix an application bug, is it normal/desirable
> for an application calling system mutex facilities to starve the system so
> completely, and/or become "unkillable"?
If it is SCHED_RR then it may get to hog the processor, but it should not be doing worse than that, and it should be killable by something of higher priority. You are right to suspect that futexes don't deal with hard real time, but the failure you see isn't the intended failure case.

[Inaky has posted some drafts of a near-futex-efficient lock system that ought to work for real-time use, btw.]

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 16:43 ` Andrew A.
@ 2004-10-29 17:06 ` Chris Wright
  2004-10-29 17:44   ` Andrew A.
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wright @ 2004-10-29 17:06 UTC (permalink / raw)
  To: Andrew A.; +Cc: Alan Cox, Linux Kernel Mailing List, roland, Andrew Morton

* Andrew A. (aathan-linux-kernel-1542@cloakmail.com) wrote:
> I suspect what is happening here is that my process is essentially in a
>
>   while(1)
>   {
>       lock();
>       unlock();
>   }
>
> loop from two or more SCHED_RR threads running at nice -15. They seem to
> be unkillable.

Give yourself a shell that's SCHED_RR with a higher priority. I've used the small hack below to debug userspace SCHED_RR problems (newer distros have a chrt utility to do this).

thanks,
-chris
--

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sched.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[])
{
	pid_t pid = 0;
	int priority = 99;
	int policy = SCHED_RR;
	struct sched_param sched;

	if (argc > 1) {
		pid = atoi(argv[1]);
		if (argc > 2) {
			priority = atoi(argv[2]);
			if (argc > 3)
				policy = atoi(argv[3]);
		}
	}
	memset(&sched, 0, sizeof(sched));
	sched.sched_priority = priority;
	if (sched_setscheduler(pid, policy, &sched) < 0) {
		printf("setscheduler: %s\n", strerror(errno));
		exit(1);
	}
	if (!pid) {
		/* turn this into a shell */
		argv[0] = "/bin/bash";
		argv[1] = NULL;
		execv(argv[0], argv);
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 17:06 ` Chris Wright
@ 2004-10-29 17:44 ` Andrew A.
  2004-10-29 20:32   ` Chris Wright
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew A. @ 2004-10-29 17:44 UTC (permalink / raw)
  To: Chris Wright; +Cc: Alan Cox, Linux Kernel Mailing List, roland, Andrew Morton

chrt 25 bash

Shell remains as badly hung as everything else. The code sets the SCHED_RR priority of the task and threads in tt1 to 10. I'm left thinking: Shouldn't the system be scheduling the shell? Is this a problem with priority inversion due to 2+ threads doing the lock()/unlock() dance and never giving bash a chance to run? Is the system able to schedule signal and/or select wakeups (for bash) in this condition?

Thanks, I wasn't aware of the chrt command and had only been using nice on my shell. The man pages on all this stuff are rather confusing: which priority numbers are valid, how priorities interact, negative vs. positive priorities, process vs. thread priority, what a "dynamic" vs. "static" priority is, etc. My impression after re-re-reading the man pages was that it would be sufficient to have a non-SCHED_RR shell with a high enough nice value.

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Chris Wright
Sent: Friday, October 29, 2004 1:07 PM
To: Andrew A.
Cc: Alan Cox; Linux Kernel Mailing List; roland@topspin.com; Andrew Morton
Subject: Re: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)

* Andrew A. (aathan-linux-kernel-1542@cloakmail.com) wrote:
> I suspect what is happening here is that my process is essentially in a
>
>   while(1)
>   {
>       lock();
>       unlock();
>   }
>
> loop from two or more SCHED_RR threads running at nice -15. They seem to
> be unkillable.

Give yourself a shell that's SCHED_RR with a higher priority. I've used the small hack below to debug userspace SCHED_RR problems (newer distros have a chrt utility to do this).
thanks,
-chris
--

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sched.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[])
{
	pid_t pid = 0;
	int priority = 99;
	int policy = SCHED_RR;
	struct sched_param sched;

	if (argc > 1) {
		pid = atoi(argv[1]);
		if (argc > 2) {
			priority = atoi(argv[2]);
			if (argc > 3)
				policy = atoi(argv[3]);
		}
	}
	memset(&sched, 0, sizeof(sched));
	sched.sched_priority = priority;
	if (sched_setscheduler(pid, policy, &sched) < 0) {
		printf("setscheduler: %s\n", strerror(errno));
		exit(1);
	}
	if (!pid) {
		/* turn this into a shell */
		argv[0] = "/bin/bash";
		argv[1] = NULL;
		execv(argv[0], argv);
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 17:44 ` Andrew A.
@ 2004-10-29 20:32 ` Chris Wright
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wright @ 2004-10-29 20:32 UTC (permalink / raw)
  To: Andrew A.
  Cc: Chris Wright, Alan Cox, Linux Kernel Mailing List, roland, Andrew Morton

* Andrew A. (aathan-linux-kernel-1542@cloakmail.com) wrote:
>
> chrt 25 bash

Try 99.

> Shell remains as badly hung as everything else. The code sets the SCHED_RR
> priority of the task and threads in tt1 to 10. I'm left thinking: Shouldn't
> the system be scheduling the shell? Is this a problem with priority
> inversion due to 2+ threads doing the lock()/unlock() dance and never
> giving bash a chance to run? Is the system able to schedule signal and/or
> select wakeups (for bash) in this condition?

Not knowing what tt1 is doing, it's hard to say. Ah, I missed the priority you used, so 99 above shouldn't be needed.

> Thanks, I wasn't aware of the chrt command and had only been using nice on
> my shell. The man pages on all this stuff are rather confusing: which
> priority numbers are valid, how priorities interact, negative vs. positive
> priorities, process vs. thread priority, what a "dynamic" vs. "static"
> priority is, etc.

Dynamic priority is adjusted by the task's behaviour (using up timeslice, blocking, waiting to run) or by nice. Static priority is the base value used when figuring out what the dynamic priority should be (it can be changed via nice or setpriority). IIRC, realtime priorities effectively stay static (unless changed via sched_setscheduler). The dynamic priority is what's used in scheduling decisions.

The userspace interfaces are a bit confusing. The kernel keeps track of it a bit more simply. Internally, the priority ranges between 0 and 139 (0 is the highest priority). 0-99 are for realtime tasks, and 100-139 are for normal tasks (note how those 40 normal-task priorities map to nice values, where -20 == 100 and 19 == 139).
The nice(2) (and setpriority(2)) interface lets you adjust the static priority in that upper range (and the dynamic priority changes accordingly). The sched_setscheduler(2) realtime range [1, 99] maps exactly inverted onto the kernel's priorities (so while the syscall has 99 as the highest priority, that becomes 0 internally).

> My impression after re-re-reading the man pages was that it would be
> sufficient to have a non-SCHED_RR shell with a high enough nice value.

A high enough priority set via sched_setscheduler(2), not a nice value. nice [1, 19] actually lowers your priority, while [-20, -1] raises it.

thanks,
-chris
--
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 14:26 ` Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?) Andrew
  2004-10-29 15:07   ` Alan Cox
@ 2004-10-29 15:36 ` Andrew A.
  2004-10-29 15:52 ` Andrew A.
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew A. @ 2004-10-29 15:36 UTC (permalink / raw)
  To: linux-kernel

For whatever reason, I have been unable to send the original message below through vger. I am therefore enclosing some of the important text here:

==========

I have in the past posted emails with the subject "Consistent kernel hang during heavy TCP connection handling load". I then recently saw a linux-kernel thread, "PROBLEM: Consistent lock up on >=2.6.8", that seemed to be related to the problem I am experiencing, but I am not sure of it (thus the cc:). I have, however, now managed to formulate a series of steps which reproduce my particular lock up consistently.

When I trigger the hang, all virtual consoles become unresponsive, and the application cannot be signaled from the keyboard. Sysrq seems to work.

The application in question is called "tt1". It runs several threads in SCHED_RR and uses select(), sleep() and/or nanosleep() extensively. I suspect there's a good chance the application calls select() with nfds=0 at some point.

Due to the SCHED_RR usage in tt1, before executing the tt1 hang, I have tried to log into a virtual console on the host and run "nice -20 bash" as root. The nice'd shell is hung just like everything else. Did I do it right? I was trying to make sure this hang is not simply an infinite loop in a SCHED_RR high-priority process (tt1).

I initially had a lot of trouble trying to capture sysrq output, but then I checked my netlog host and found (lo and behold) that it had captured it! Of course, that was before I went through the trouble of taking pictures of my monitor! I've included the netlog sysrq output from two runs below.
They are at the very bottom of this email, separated by lines of '*'s. These runs are probably DIFFERENT from the runs from which I produced the below screenshots. So, here are those screenshots; I still welcome any comments you might have about easier ways to capture sysrq output than using netdump! I modified /etc/syslog.conf to say "kern.* /var/log/kernel"; however, output of sysrq-t and sysrq-p while in the locked-up state never ends up in the file (though it does when not locked up).

The sysrq output and screenshots can be found at triple w dot memeplex dot com slash

  sysrq[1-2].txt.gz
  lock[1-3].gif mapping to System.map-2.6.8.1.gz
  lock[4-5].gif mapping to System.map-2.6.8-1.521.gz

=======

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)
  2004-10-29 14:26 ` Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?) Andrew
  2004-10-29 15:07   ` Alan Cox
  2004-10-29 15:36 ` Andrew A.
@ 2004-10-29 15:52 ` Andrew A.
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew A. @ 2004-10-29 15:52 UTC (permalink / raw)
  To: linux-kernel

For whatever reason, I have been unable to send the original message below through vger. I am therefore enclosing some of the important text here:

==========

I have in the past posted emails with the subject "Consistent kernel hang during heavy TCP connection handling load". I then recently saw a linux-kernel thread, "PROBLEM: Consistent lock up on >=2.6.8", that seemed to be related to the problem I am experiencing, but I am not sure of it (thus the cc:). I have, however, now managed to formulate a series of steps which reproduce my particular lock up consistently.

When I trigger the hang, all virtual consoles become unresponsive, and the application cannot be signaled from the keyboard. Sysrq seems to work.

The application in question is called "tt1". It runs several threads in SCHED_RR and uses select(), sleep() and/or nanosleep() extensively. I suspect there's a good chance the application calls select() with nfds=0 at some point.

Due to the SCHED_RR usage in tt1, before executing the tt1 hang, I have tried to log into a virtual console on the host and run "nice -20 bash" as root. The nice'd shell is hung just like everything else. Did I do it right? I was trying to make sure this hang is not simply an infinite loop in a SCHED_RR high-priority process (tt1).

I initially had a lot of trouble trying to capture sysrq output, but then I checked my netlog host and found (lo and behold) that it had captured it! Of course, that was before I went through the trouble of taking pictures of my monitor! I've included the netlog sysrq output from two runs below. They are at the very bottom of this email.
These runs are probably DIFFERENT from the runs from which I produced the below screenshots. So, here are those screenshots; I still welcome any comments you might have about easier ways to capture sysrq output than using netdump! I modified /etc/syslog.conf to say "kern.* /var/log/kernel"; however, output of sysrq-t and sysrq-p while in the locked-up state never ends up in the file (though it does when not locked up).

The sysrq output and screenshots can be found at triple w dot memeplex dot com slash

  sysrq[1-2].txt.gz
  lock[1-3].gif mapping to System.map-2.6.8.1.gz
  lock[4-5].gif mapping to System.map-2.6.8-1.521.gz

=======

^ permalink raw reply	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2004-10-29 20:43 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <OMEGLKPBDPDHAGCIBHHJMEIDFCAA.aathan-linux-kernel-1542@cloakmail.com>
2004-10-29 14:26 ` Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?) Andrew
2004-10-29 15:07   ` Alan Cox
2004-10-29 16:43     ` Andrew A.
2004-10-29 17:06       ` Chris Wright
2004-10-29 17:44         ` Andrew A.
2004-10-29 20:32           ` Chris Wright
2004-10-29 15:36   ` Andrew A.
2004-10-29 15:52   ` Andrew A.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox