From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: raistlin@linux.it, juri.lelli@gmail.com,
Ingo Molnar <mingo@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] [ tip/sched/core ] System unresponsive after booting
Date: Thu, 16 Jan 2014 14:48:51 +0100
Message-ID: <52D7E343.40909@linaro.org>
In-Reply-To: <20140115120418.GD31570@twins.programming.kicks-ass.net>
On 01/15/2014 01:04 PM, Peter Zijlstra wrote:
> On Wed, Jan 15, 2014 at 09:27:34AM +0100, Daniel Lezcano wrote:
>>
>> Hi all,
>>
>> I use the tip/sched/core branch.
>>
>> After git pulling yesterday, my host is unresponsive after booting the OS.
>>
>> * It boots normally
>> * It sends info to the console
>> * The graphics does not work
>> * The terminals show the prompt, I can enter the username but after
>> pressing enter, it does not give the password prompt
>> * sysrq works more or less, I can't get the process stack but it receives
>> the command
>>
>> It is like no new process can be created.
>>
>> I have a dual Xeon processor E5325 (2 x 4 cores).
>>
>> After git bisecting, the following patch seems to introduce the bug.
>>
>> commit d50dde5a10f305253cbc3855307f608f8a3c5f73
>
> OK, so my headless WSM-EP boots just fine. Obviously it cannot confirm
> if graphics works, but I can ssh in and work on it without bother.
>
> I can even log in on the serial console without problems.
>
> I tried both tip/master and tip/sched/core.
>
> Would you happen to have a .config for me to try?

I was able to reduce the scope and reproduce the issue.

AFAICT, it happens with rsyslogd. When logging in on a tty, the login
command sends a message through /dev/log, but rsyslogd is never woken up
and stays blocked in poll_schedule_timeout. The login process is blocked
in unix_wait_for_peer.

I can strace rsyslogd at startup. The last two sched_setscheduler calls
fail:

> grep sched trace.out
3570 sched_getparam(3570, { 0 }) = 0
3570 sched_getscheduler(3570) = 0 (SCHED_OTHER)
3570 sched_get_priority_min(SCHED_OTHER) = 0
3570 sched_get_priority_max(SCHED_OTHER) = 0
3571 sched_get_priority_min(SCHED_OTHER) = 0
3571 sched_get_priority_max(SCHED_OTHER) = 0
3571 sched_get_priority_min(SCHED_OTHER) = 0
3571 sched_get_priority_max(SCHED_OTHER) = 0
3571 sched_setscheduler(3572, SCHED_OTHER, { 0 } <unfinished ...>
3571 <... sched_setscheduler resumed> ) = 0
3571 sched_get_priority_min(SCHED_OTHER <unfinished ...>
3571 <... sched_get_priority_min resumed> ) = 0
3571 sched_get_priority_max(SCHED_OTHER <unfinished ...>
3571 <... sched_get_priority_max resumed> ) = 0
3571 sched_setscheduler(3573, SCHED_OTHER, { 0 } <unfinished ...>
3571 <... sched_setscheduler resumed> ) = -1 EPERM (Operation not permitted)
3571 sched_get_priority_min(SCHED_OTHER <unfinished ...>
3571 <... sched_get_priority_min resumed> ) = 0
3571 sched_get_priority_max(SCHED_OTHER <unfinished ...>
3571 <... sched_get_priority_max resumed> ) = 0
3571 sched_setscheduler(3574, SCHED_OTHER, { 0 } <unfinished ...>
3571 <... sched_setscheduler resumed> ) = -1 EPERM (Operation not permitted)
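
The failing call can be issued outside rsyslogd with a small user-space check. This is only a minimal sketch, assuming glibc's sched_setscheduler() wrapper; the helper name try_set_other() is hypothetical and is not taken from rsyslogd:

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: request SCHED_OTHER with priority 0 for `pid`
 * (0 means the calling thread), as seen in the trace above.
 * Returns 0 on success, or the errno value on failure.
 */
static int try_set_other(pid_t pid)
{
	struct sched_param param = { .sched_priority = 0 };

	if (sched_setscheduler(pid, SCHED_OTHER, &param) == -1)
		return errno;
	return 0;
}
```

Calling try_set_other(0) from an unprivileged process and comparing the result on both kernels should show whether the EPERM is tied to rsyslogd itself or to the call.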

Here is the same strace on a kernel that does not hang. The calls to
sched_setscheduler do not fail:

3292 sched_getparam(3292, { 0 }) = 0
3292 sched_getscheduler(3292) = 0 (SCHED_OTHER)
3292 sched_get_priority_min(SCHED_OTHER) = 0
3292 sched_get_priority_max(SCHED_OTHER) = 0
3293 sched_get_priority_min(SCHED_OTHER) = 0
3293 sched_get_priority_max(SCHED_OTHER) = 0
3293 sched_get_priority_min(SCHED_OTHER) = 0
3293 sched_get_priority_max(SCHED_OTHER) = 0
3293 sched_setscheduler(3294, SCHED_OTHER, { 0 } <unfinished ...>
3293 <... sched_setscheduler resumed> ) = 0
3293 sched_get_priority_min(SCHED_OTHER <unfinished ...>
3293 <... sched_get_priority_min resumed> ) = 0
3293 sched_get_priority_max(SCHED_OTHER <unfinished ...>
3293 <... sched_get_priority_max resumed> ) = 0
3293 sched_setscheduler(3295, SCHED_OTHER, { 0 } <unfinished ...>
3293 <... sched_setscheduler resumed> ) = 0
3293 sched_get_priority_min(SCHED_OTHER <unfinished ...>
3293 <... sched_get_priority_min resumed> ) = 0
3293 sched_get_priority_max(SCHED_OTHER <unfinished ...>
3293 <... sched_get_priority_max resumed> ) = 0
3293 sched_setscheduler(3296, SCHED_OTHER, { 0 } <unfinished ...>
3293 <... sched_setscheduler resumed> ) = 0

The EPERM error comes from kernel/sched/core.c:3303:

	...
	if (fair_policy(policy)) {
		if (!can_nice(p, attr->sched_nice))
			return -EPERM;
	}
	...

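
For reference, here is a user-space sketch of what that test appears to amount to. It assumes can_nice() converts the requested nice value to the RLIMIT_NICE scale (20 - nice), compares it against the task's RLIMIT_NICE, and otherwise requires CAP_SYS_NICE. The struct and function names are hypothetical stand-ins, not kernel code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task state the check looks at. */
struct task_model {
	long rlimit_nice;	/* RLIMIT_NICE soft limit, 0..40 */
	bool cap_sys_nice;	/* task holds CAP_SYS_NICE */
};

/* Assumed can_nice() behaviour: nice [19..-20] maps to rlimit [1..40]. */
static bool model_can_nice(const struct task_model *t, int nice)
{
	int nice_rlim = 20 - nice;

	return nice_rlim <= t->rlimit_nice || t->cap_sys_nice;
}

/* Model of the quoted test for a fair policy such as SCHED_OTHER. */
static int model_fair_policy_check(const struct task_model *t, int sched_nice)
{
	if (!model_can_nice(t, sched_nice))
		return -EPERM;
	return 0;
}
```

Under these assumptions, a task with rlimit_nice = 0 and no CAP_SYS_NICE that asks for nice 0 hits 20 <= 0, which is false, so the model returns -EPERM even though the nice level is not being lowered.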
But I don't know why this leads to a blocked process, or why rsyslogd
is not woken up by a packet arriving on the AF_UNIX socket.

I hope that helps.

-- Daniel
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog