* Scheduling issue with signal handling @ 2008-07-10 8:14 Elias Oltmanns 2008-07-10 8:55 ` Ingo Molnar 0 siblings, 1 reply; 3+ messages in thread From: Elias Oltmanns @ 2008-07-10 8:14 UTC (permalink / raw) To: Ingo Molnar; +Cc: linux-kernel Hi there, in another thread [1] we've discussed something which, after some testing, I now believe to be a problem related to scheduling / preemption. Here is the problem: If there is heavy disk I/O in the background, signals sent to an I/O related process interactively by pressing Ctrl+Z or Ctrl+C are sometimes processed quite sluggishly. I can pretty reliably reproduce this using a test case provided by the original reporter [1] (on a uniprocessor if that's of any consequence). The problem has been verified to exist at least since 2.6.19 but it's become much easier to observe and thus attracted attention since 2.6.25 because pressing Ctrl+Z or Ctrl+C is now acknowledged by printing ^Z or ^C to the terminal, respectively. As a result, the user gets slightly irritated and possibly impatient when pressing Ctrl+Z repeatedly or just holding down these keys generates a string like ^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z but the shell prompt doesn't appear for up to a few seconds. By sprinkling some printk()s all over the place, I've managed to establish the following sequence of events taking place in the event of delayed signal handling as described above: The first Ctrl+Z event enqueues a SIGTSTP signal which eventually results in a call to kick_process(). For some reason though, the signal isn't handled straight away but remains on the queue for some time. Consequently, subsequent Ctrl+Z events result in echoing another ^Z to the terminal but everything related to sending a signal is skipped (and rightly so) because the kernel detects that a SIGTSTP is still pending. Eventually, get_signal_to_deliver() dequeues the SIGTSTP signal and the shell propt appears. My question is this: Even under high disk I/O pressure, the threads dealing with I/O to the terminal evidently still get their turn as indicated by the sequence of ^Z appearing on screen. Why is it then, that the threads which are meant to process the SIGTSTP or SIGINT signals aren't scheduled for some seconds and is there a way to change this? Please let me know if there is anything I can try to investigate this any further or if you need further information. Thanks in advance, Elias [1] http://lkml.org/lkml/2008/6/28/50 ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Scheduling issue with signal handling 2008-07-10 8:14 Scheduling issue with signal handling Elias Oltmanns @ 2008-07-10 8:55 ` Ingo Molnar 2008-07-10 14:30 ` Elias Oltmanns 0 siblings, 1 reply; 3+ messages in thread From: Ingo Molnar @ 2008-07-10 8:55 UTC (permalink / raw) To: Elias Oltmanns Cc: linux-kernel, Roland McGrath, Peter Zijlstra, Dmitry Adamushko * Elias Oltmanns <eo@nebensachen.de> wrote: > By sprinkling some printk()s all over the place, I've managed to > establish the following sequence of events taking place in the event > of delayed signal handling as described above: The first Ctrl+Z event > enqueues a SIGTSTP signal which eventually results in a call to > kick_process(). For some reason though, the signal isn't handled > straight away but remains on the queue for some time. Consequently, > subsequent Ctrl+Z events result in echoing another ^Z to the terminal > but everything related to sending a signal is skipped (and rightly so) > because the kernel detects that a SIGTSTP is still pending. > Eventually, get_signal_to_deliver() dequeues the SIGTSTP signal and > the shell propt appears. > > My question is this: Even under high disk I/O pressure, the threads > dealing with I/O to the terminal evidently still get their turn as > indicated by the sequence of ^Z appearing on screen. Why is it then, > that the threads which are meant to process the SIGTSTP or SIGINT > signals aren't scheduled for some seconds and is there a way to change > this? > > Please let me know if there is anything I can try to investigate this > any further or if you need further information. > > Thanks in advance, > > Elias > > [1] http://lkml.org/lkml/2008/6/28/50 hm, kick_process() is a no-op on !SMP. Basically, when a new signal is queued and a task is already running, it will run in due course and process the signal the moment it's scheduled again. (unless the signal is blocked) If a task is not already running, then the signal code will wake up the task and it will then process the signal the moment it's executed. The maximum latency of a runnable task hitting the CPU is controlled via /proc/sys/kernel/sched_latency [available if CONFIG_SCHED_DEBUG=y in the .config] - 20 milliseconds on uniprocessors. Several seconds of lag is almost out of question and would indicate a serious scheduler bug, or - which is far more likely - either an application signal processing hickup or a kernel signal processing hickup. If the lag happens with the task you can observe its worst-case scheduling delay by looking at /proc/<PID>/sched, if you also have CONFIG_SCHEDSTAT=y in your .config. For example, a random shell's delays on a testbox: phoenix:~> grep se.wait_max /proc/$$/sched se.wait_max : 3.338588 That's 3.3 msecs _worst case_, on a system that has otherwise quite insane load: 10:53:57 up 2:48, 2 users, load average: 77.56, 94.33, 102.75 So several seconds of delay, if it came from the scheduler, would be really anomalous. As a final measure, instead of printk's, you could try the scheduler tracer in linux-next (CONFIG_CONTEXT_SWITCH_TRACER=y), to have an exact idea about what is going on and when. (see /debug/tracing/README) [ You might also want to try CONFIG_FTRACE=y and CONFIG_DYNAMIC_FTRACE=y for extremely finegrained kernel tracing - available in linux-next too. ] Ingo ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Scheduling issue with signal handling 2008-07-10 8:55 ` Ingo Molnar @ 2008-07-10 14:30 ` Elias Oltmanns 0 siblings, 0 replies; 3+ messages in thread From: Elias Oltmanns @ 2008-07-10 14:30 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Roland McGrath, Peter Zijlstra, Dmitry Adamushko Ingo Molnar <mingo@elte.hu> wrote: > * Elias Oltmanns <eo@nebensachen.de> wrote: > >> By sprinkling some printk()s all over the place, I've managed to >> establish the following sequence of events taking place in the event >> of delayed signal handling as described above: The first Ctrl+Z event >> enqueues a SIGTSTP signal which eventually results in a call to >> kick_process(). For some reason though, the signal isn't handled >> straight away but remains on the queue for some time. Consequently, >> subsequent Ctrl+Z events result in echoing another ^Z to the terminal >> but everything related to sending a signal is skipped (and rightly so) >> because the kernel detects that a SIGTSTP is still pending. >> Eventually, get_signal_to_deliver() dequeues the SIGTSTP signal and >> the shell propt appears. >> >> My question is this: Even under high disk I/O pressure, the threads >> dealing with I/O to the terminal evidently still get their turn as >> indicated by the sequence of ^Z appearing on screen. Why is it then, >> that the threads which are meant to process the SIGTSTP or SIGINT >> signals aren't scheduled for some seconds and is there a way to change >> this? >> >> Please let me know if there is anything I can try to investigate this >> any further or if you need further information. >> >> Thanks in advance, >> >> Elias >> >> [1] http://lkml.org/lkml/2008/6/28/50 > > hm, kick_process() is a no-op on !SMP. > > Basically, when a new signal is queued and a task is already running, it > will run in due course and process the signal the moment it's scheduled > again. (unless the signal is blocked) > > If a task is not already running, then the signal code will wake up the > task and it will then process the signal the moment it's executed. > > The maximum latency of a runnable task hitting the CPU is controlled via > /proc/sys/kernel/sched_latency [available if CONFIG_SCHED_DEBUG=y in the > .config] - 20 milliseconds on uniprocessors. > > Several seconds of lag is almost out of question and would indicate a > serious scheduler bug, or - which is far more likely - either an > application signal processing hickup or a kernel signal processing > hickup. I feel quite certain that the kernel is the part to blame, one way or another. > > If the lag happens with the task you can observe its worst-case > scheduling delay by looking at /proc/<PID>/sched, if you also have > CONFIG_SCHEDSTAT=y in your .config. Nothing suspicious regarding the applications involved here. The only threads with se.max_wait values between, say, 50 msecs and 190 msecs are things like ksoftirqd/0, udevd and the likes. [...] > That's 3.3 msecs _worst case_, on a system that has otherwise quite > insane load: > > 10:53:57 up 2:48, 2 users, load average: 77.56, 94.33, 102.75 Very impressive. I don't think I have ever observed the tenth part of that load on my system. > > So several seconds of delay, if it came from the scheduler, would be > really anomalous. With the patch below applied to next-20080708, I can quite easily produce the following: denkblock:/mnt# fg find / > /dev/null ^Z[ 421.968364] complete_signal(): sig = 20. ^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z[ 423.224898] get_signal_to_deliver(): signr = 20. [4]+ Stopped find / > /dev/null denkblock:/mnt# That's about 1.26 msecs, not quite the worst case yet, but still absurdly long by my reckoning given that the system does still respond to keyboard events. > > As a final measure, instead of printk's, you could try the scheduler > tracer in linux-next (CONFIG_CONTEXT_SWITCH_TRACER=y), to have an exact > idea about what is going on and when. (see /debug/tracing/README) > > [ You might also want to try CONFIG_FTRACE=y and CONFIG_DYNAMIC_FTRACE=y > for extremely finegrained kernel tracing - available in linux-next > too. ] Well, I've generated a trace using sched_switch. Right now, I can't make head or tail of it and it's a ridiculous amount of data. If you'd like to have a look at it, I can put it online somewhere. Alternatively, you could give me a hint as to how I can read it or what I can do to generate / filter / process the relevant data. Thanks, Elias -------- diff --git a/kernel/signal.c b/kernel/signal.c index 6c0958e..8ae57f8 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -721,11 +721,18 @@ static inline int wants_signal(int sig, struct task_struct *p) return task_curr(p) || !signal_pending(p); } +static inline void print_sig(const char *msg, int sig) +{ + if (sig == SIGTSTP || sig == SIGINT) + printk(msg, sig); +} + static void complete_signal(int sig, struct task_struct *p, int group) { struct signal_struct *signal = p->signal; struct task_struct *t; + print_sig("complete_signal(): sig = %d.\n", sig); /* * Now find a thread we can wake up to take the signal off the queue. * @@ -1778,6 +1785,7 @@ relock: break; /* will return 0 */ if (signr != SIGKILL) { + print_sig("get_signal_to_deliver(): signr = %d.\n", signr); signr = ptrace_signal(signr, info, regs, cookie); if (!signr) continue; ^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2008-07-10 14:33 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-07-10 8:14 Scheduling issue with signal handling Elias Oltmanns 2008-07-10 8:55 ` Ingo Molnar 2008-07-10 14:30 ` Elias Oltmanns
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox