From: Elias Oltmanns <eo@nebensachen.de>
To: Ingo Molnar <mingo@elte.hu>
Cc: linux-kernel@vger.kernel.org, Roland McGrath <roland@redhat.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: Scheduling issue with signal handling
Date: Thu, 10 Jul 2008 16:30:25 +0200 [thread overview]
Message-ID: <87bq1595ha.fsf@denkblock.local> (raw)
In-Reply-To: <20080710085500.GA19918@elte.hu> (Ingo Molnar's message of "Thu, 10 Jul 2008 10:55:00 +0200")
Ingo Molnar <mingo@elte.hu> wrote:
> * Elias Oltmanns <eo@nebensachen.de> wrote:
>
>> By sprinkling some printk()s all over the place, I've managed to
>> establish the following sequence of events taking place in the event
>> of delayed signal handling as described above: The first Ctrl+Z event
>> enqueues a SIGTSTP signal which eventually results in a call to
>> kick_process(). For some reason though, the signal isn't handled
>> straight away but remains on the queue for some time. Consequently,
>> subsequent Ctrl+Z events result in echoing another ^Z to the terminal
>> but everything related to sending a signal is skipped (and rightly so)
>> because the kernel detects that a SIGTSTP is still pending.
>> Eventually, get_signal_to_deliver() dequeues the SIGTSTP signal and
>> the shell propt appears.
>>
>> My question is this: Even under high disk I/O pressure, the threads
>> dealing with I/O to the terminal evidently still get their turn as
>> indicated by the sequence of ^Z appearing on screen. Why is it then,
>> that the threads which are meant to process the SIGTSTP or SIGINT
>> signals aren't scheduled for some seconds and is there a way to change
>> this?
>>
>> Please let me know if there is anything I can try to investigate this
>> any further or if you need further information.
>>
>> Thanks in advance,
>>
>> Elias
>>
>> [1] http://lkml.org/lkml/2008/6/28/50
>
> hm, kick_process() is a no-op on !SMP.
>
> Basically, when a new signal is queued and a task is already running, it
> will run in due course and process the signal the moment it's scheduled
> again. (unless the signal is blocked)
>
> If a task is not already running, then the signal code will wake up the
> task and it will then process the signal the moment it's executed.
>
> The maximum latency of a runnable task hitting the CPU is controlled via
> /proc/sys/kernel/sched_latency [available if CONFIG_SCHED_DEBUG=y in the
> .config] - 20 milliseconds on uniprocessors.
>
> Several seconds of lag is almost out of question and would indicate a
> serious scheduler bug, or - which is far more likely - either an
> application signal processing hickup or a kernel signal processing
> hickup.
I feel quite certain that the kernel is the part to blame, one way or
another.
>
> If the lag happens with the task you can observe its worst-case
> scheduling delay by looking at /proc/<PID>/sched, if you also have
> CONFIG_SCHEDSTAT=y in your .config.
Nothing suspicious regarding the applications involved here. The only
threads with se.max_wait values between, say, 50 msecs and 190 msecs are
things like ksoftirqd/0, udevd and the likes.
[...]
> That's 3.3 msecs _worst case_, on a system that has otherwise quite
> insane load:
>
> 10:53:57 up 2:48, 2 users, load average: 77.56, 94.33, 102.75
Very impressive. I don't think I have ever observed the tenth part of
that load on my system.
>
> So several seconds of delay, if it came from the scheduler, would be
> really anomalous.
With the patch below applied to next-20080708, I can quite easily
produce the following:
denkblock:/mnt# fg
find / > /dev/null
^Z[ 421.968364] complete_signal(): sig = 20.
^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z^Z[ 423.224898] get_signal_to_deliver(): signr = 20.
[4]+ Stopped find / > /dev/null
denkblock:/mnt#
That's about 1.26 msecs, not quite the worst case yet, but still
absurdly long by my reckoning given that the system does still respond
to keyboard events.
>
> As a final measure, instead of printk's, you could try the scheduler
> tracer in linux-next (CONFIG_CONTEXT_SWITCH_TRACER=y), to have an exact
> idea about what is going on and when. (see /debug/tracing/README)
>
> [ You might also want to try CONFIG_FTRACE=y and CONFIG_DYNAMIC_FTRACE=y
> for extremely finegrained kernel tracing - available in linux-next
> too. ]
Well, I've generated a trace using sched_switch. Right now, I can't make
head or tail of it and it's a ridiculous amount of data. If you'd like
to have a look at it, I can put it online somewhere. Alternatively, you
could give me a hint as to how I can read it or what I can do to
generate / filter / process the relevant data.
Thanks,
Elias
--------
diff --git a/kernel/signal.c b/kernel/signal.c
index 6c0958e..8ae57f8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -721,11 +721,18 @@ static inline int wants_signal(int sig, struct task_struct *p)
return task_curr(p) || !signal_pending(p);
}
+static inline void print_sig(const char *msg, int sig)
+{
+ if (sig == SIGTSTP || sig == SIGINT)
+ printk(msg, sig);
+}
+
static void complete_signal(int sig, struct task_struct *p, int group)
{
struct signal_struct *signal = p->signal;
struct task_struct *t;
+ print_sig("complete_signal(): sig = %d.\n", sig);
/*
* Now find a thread we can wake up to take the signal off the queue.
*
@@ -1778,6 +1785,7 @@ relock:
break; /* will return 0 */
if (signr != SIGKILL) {
+ print_sig("get_signal_to_deliver(): signr = %d.\n", signr);
signr = ptrace_signal(signr, info, regs, cookie);
if (!signr)
continue;
prev parent reply other threads:[~2008-07-10 14:33 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-10 8:14 Scheduling issue with signal handling Elias Oltmanns
2008-07-10 8:55 ` Ingo Molnar
2008-07-10 14:30 ` Elias Oltmanns [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87bq1595ha.fsf@denkblock.local \
--to=eo@nebensachen.de \
--cc=a.p.zijlstra@chello.nl \
--cc=dmitry.adamushko@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=roland@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.