* preempt-related hangs
@ 2002-03-25 1:59 Andrew Morton
2002-03-25 2:11 ` Andrew Morton
2002-03-25 8:04 ` Zwane Mwaikambo
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2002-03-25 1:59 UTC (permalink / raw)
To: Robert Love; +Cc: lkml
I sent this email to Ingo last week; seems that he's
having some downtime. It was happening on my dual PIII
and I now discover that the quad PIII does the same
thing. Any ideas?
Kernel is 2.5.7, dual PIII. When I enable preempt it
locks during boot.
I applied the kgdb patch and had a poke.
(gdb) info threads
* 6 Thread 6 preempt_schedule () at sched.c:848
5 Thread 5 preempt_schedule () at sched.c:848
4 Thread 4 context_thread (startup=0xc0395f90) at context.c:101
3 Thread 3 migration_thread (unused=0x0) at sched.c:1646
2 Thread 2 migration_thread (unused=0x0) at sched.c:1646
1 Thread 1 spawn_ksoftirqd () at softirq.c:407
Note that init is stuck in spawn_ksoftirqd. It's spinning in
that function, yielding, waiting for the softirqd threads to
come alive. They're threads 5 and 6.
(gdb) thread 6
[Switching to thread 6 (Thread 6)]#0 preempt_schedule () at sched.c:848
848 }
(gdb) bt
#0 preempt_schedule () at sched.c:848
#1 0xc0117f77 in try_to_wake_up (p=0xefef4dc0) at sched.c:179
#2 0xc0117f8d in wake_up_process (p=0xefef68c0) at sched.c:347
#3 0xc011ad70 in set_cpus_allowed (p=0xefef68c0, new_mask=2) at sched.c:1583
#4 0xc01247fe in ksoftirqd (__bind_cpu=0x1) at softirq.c:371
#5 0xc010586b in kernel_thread (fn=0xc038aa97 <spawn_ksoftirqd+71>, arg=0x10, flags=582)
at process.c:501
#6 0xffffffff in ?? ()
So ksoftirqd has entered set_cpus_allowed(), and has tried to wake
migration_thread.
try_to_wake_up() has called task_rq_unlock(rq, &flags); and
task_rq_unlock() has done preempt_enable(). Game over at that
point. Looks like ksoftirqd has scheduled away on that preempt_enable
and is never coming back.
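A rough way to picture the sequence described above is a toy userspace model of the preempt count. This is only a sketch, not kernel code: every `toy_` name is made up for illustration, and it assumes (as a simplification) that any wakeup sets a reschedule flag and that dropping the count to zero with that flag set is the point where the task gets scheduled away.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code: every name here is hypothetical. */
struct toy_cpu {
    int  preempt_count;     /* > 0 means preemption is disabled */
    bool need_resched;      /* set when a wakeup wants a reschedule */
    int  preempt_switches;  /* how many times we "scheduled away" */
};

static void toy_preempt_disable(struct toy_cpu *c)
{
    c->preempt_count++;
}

/* Dropping the last reference is the preemption point. */
static void toy_preempt_enable(struct toy_cpu *c)
{
    assert(c->preempt_count > 0);
    if (--c->preempt_count == 0 && c->need_resched) {
        c->need_resched = false;
        c->preempt_switches++;  /* stands in for preempt_schedule() */
    }
}

/* Stands in for try_to_wake_up(): lock, request a reschedule, unlock. */
static void toy_wake_up(struct toy_cpu *c)
{
    toy_preempt_disable(c);     /* the task_rq_lock() side */
    c->need_resched = true;     /* assumption: the woken task should run */
    toy_preempt_enable(c);      /* the task_rq_unlock() side */
}
```

In this model, calling `toy_wake_up()` with the count at zero records one switch at the unlock. Calling it while an outer `toy_preempt_disable()` is held defers the switch until the matching outer `toy_preempt_enable()`.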
(gdb) info registers
eax 0xefef68c0 -269522752
ecx 0xc03d1540 -1069738688
edx 0x0 0
ebx 0xefed8000 -269647872
esp 0xefed9f18 0xefed9f18
...
(gdb) p *((struct thread_info *)0xefed8000)->task
$8 = {state = 0, thread_info = 0xefed8000, usage = {counter = 1}, flags = 64,
ptrace = 0, lock_depth = -1, prio = 139, static_prio = 139, run_list = {
next = 0xc03d239c, prev = 0xc03d239c}, array = 0xc03d1f2c, sleep_avg = 0,
sleep_timestamp = 117, policy = 0, cpus_allowed = 2, time_slice = 8, tasks = {
next = 0xc0360580, prev = 0xefef6240}, mm = 0x0, active_mm = 0x0, local_pages = {
next = 0xefef6910, prev = 0xefef6910}, allocation_order = 0, nr_local_pages = 0,
binfmt = 0x0, exit_code = 0, exit_signal = 0, pdeath_signal = 0, personality = 0,
did_exec = 0, pid = 6, pgrp = 1, tty_old_pgrp = 0, session = 1, tgid = 0, leader = 0,
real_parent = 0xefef4700, parent = 0xefef4700, children = {next = 0xefef6958,
prev = 0xefef6958}, sibling = {next = 0xefef4798, prev = 0xefef62a0},
thread_group = {next = 0xefef62a8, prev = 0xefef47a8}, pidhash_next = 0x0,
pidhash_pprev = 0xc03e4d98, wait_chldexit = {lock = {lock = 1, magic = 3735899821},
task_list = {next = 0xefef6980, prev = 0xefef6980}}, vfork_done = 0x0,
rt_priority = 0, it_real_value = 0, it_prof_value = 0, it_virt_value = 0,
it_real_incr = 0, it_prof_incr = 0, it_virt_incr = 0, real_timer = {list = {
next = 0x0, prev = 0x0}, expires = 0, data = 4025444544,
function = 0xc0123548 <it_real_fn>}, times = {tms_utime = 0, tms_stime = 0,
tms_cutime = 0, tms_cstime = 0}, start_time = 116, per_cpu_utime = {
0 <repeats 32 times>}, per_cpu_stime = {0 <repeats 32 times>}, min_flt = 0,
maj_flt = 0, nswap = 0, cmin_flt = 0, cmaj_flt = 0, cnswap = 0, swappable = -1,
uid = 0, euid = 0, suid = 0, fsuid = 0, gid = 0, egid = 0, sgid = 0, fsgid = 0,
ngroups = 0, groups = {0 <repeats 32 times>}, cap_effective = 4294967039,
cap_inheritable = 0, cap_permitted = 4294967295, keep_capabilities = 0,
user = 0xc0363a8c, rlim = {{rlim_cur = 4294967295, rlim_max = 4294967295}, {
...
It's in state TASK_RUNNING.
It's 100% reproducible. Happens with gcc-2.91.66 and gcc-3.0.3.
Can diagnose further if needed.
* Re: preempt-related hangs
2002-03-25 1:59 preempt-related hangs Andrew Morton
@ 2002-03-25 2:11 ` Andrew Morton
2002-03-25 2:30 ` Robert Love
2002-03-25 2:33 ` Anton Altaparmakov
2002-03-25 8:04 ` Zwane Mwaikambo
1 sibling, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2002-03-25 2:11 UTC (permalink / raw)
To: Robert Love, lkml, Ingo Molnar
Andrew Morton wrote:
>
> ..
> Kernel is 2.5.7, dual PIII. When I enable preempt it
> locks during boot.
OK, this patch fixed it. I don't know why.
--- linux-2.5.7/kernel/sched.c Mon Mar 18 13:04:41 2002
+++ 25/kernel/sched.c Sun Mar 24 18:09:09 2002
@@ -1545,6 +1545,8 @@ void set_cpus_allowed(task_t *p, unsigne
migration_req_t req;
runqueue_t *rq;
+ preempt_disable();
+
new_mask &= cpu_online_map;
if (!new_mask)
BUG();
@@ -1557,7 +1559,7 @@ void set_cpus_allowed(task_t *p, unsigne
*/
if (new_mask & (1UL << p->thread_info->cpu)) {
task_rq_unlock(rq, &flags);
- return;
+ goto out;
}
init_MUTEX_LOCKED(&req.sem);
@@ -1567,6 +1569,8 @@ void set_cpus_allowed(task_t *p, unsigne
wake_up_process(rq->migration_thread);
down(&req.sem);
+out:
+ preempt_disable();
}
static volatile unsigned long migration_mask;
* Re: preempt-related hangs
2002-03-25 2:11 ` Andrew Morton
@ 2002-03-25 2:30 ` Robert Love
2002-03-25 2:33 ` Anton Altaparmakov
1 sibling, 0 replies; 7+ messages in thread
From: Robert Love @ 2002-03-25 2:30 UTC (permalink / raw)
To: Andrew Morton; +Cc: lkml, Ingo Molnar
On Sun, 2002-03-24 at 21:11, Andrew Morton wrote:
> OK, this patch fixed it. I don't know why.
Eh, odd. That effectively disables kernel preemption around
set_cpus_allowed, but preemption is already disabled by the task_rq_lock
call. Note, however, preemption is enabled by the task_rq_unlock and
thus wake_up_process is called with preemption enabled.
With your patch, preemption is now disabled across the call, and
subsequently the task_rq_unlock in try_to_wake_up will never call
preempt_schedule and your lock does not happen.
The actual problem may be elsewhere, and this just hides it. This is
pretty clear, since we would get a similar effect just wrapping
wake_up_process in preempt_disable. But, oh, try_to_wake_up disables
preempt, too ... hrm.
Hm, what if try_to_wake_up wakes up a process and then preemptively
schedules into it and it wants to acquire the req.sem semaphore, but
cannot, as it is still taken by set_cpus_allowed? The semaphore seems
to just be used in the migration code.
So we have init spinning, waiting for the softirq threads to come up, and
then we have a deadlock on req.sem between set_cpus_allowed and the
migration thread?
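For reference, the req.sem handoff in question can be sketched as a single-threaded toy model. This is not the kernel's semaphore implementation: all `toy_` names are hypothetical, `toy_try_down()` is a non-blocking stand-in that returns false where the real down() would sleep, and the assumption that the migration thread finishes a request by doing an up() on the semaphore is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the handoff; every name here is hypothetical. */
struct toy_sem {
    int count;              /* 0 means a down() would have to sleep */
};

static void toy_init_locked(struct toy_sem *s) { s->count = 0; }
static void toy_up(struct toy_sem *s)          { s->count++; }

/* Non-blocking stand-in for down(): false where the real call would sleep. */
static bool toy_try_down(struct toy_sem *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

struct toy_req {
    struct toy_sem sem;
    int  dest_cpu;
    bool done;
};

/* Requester side: start with the semaphore locked, then queue the request. */
static void toy_submit(struct toy_req *r, int dest_cpu)
{
    toy_init_locked(&r->sem);   /* mirrors init_MUTEX_LOCKED(&req.sem) */
    r->dest_cpu = dest_cpu;
    r->done = false;
}

/* Worker side (assumed): handle the request, then release the requester. */
static void toy_worker_handle(struct toy_req *r)
{
    r->done = true;
    toy_up(&r->sem);
}
```

The point of the model is that the requester can only get past its down() after the worker has run. If the worker never gets to run, the requester waits forever.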
Bleh ... Ingo?
Robert Love
* Re: preempt-related hangs
2002-03-25 2:11 ` Andrew Morton
2002-03-25 2:30 ` Robert Love
@ 2002-03-25 2:33 ` Anton Altaparmakov
2002-03-25 2:40 ` Robert Love
2002-03-25 2:48 ` Andrew Morton
1 sibling, 2 replies; 7+ messages in thread
From: Anton Altaparmakov @ 2002-03-25 2:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Robert Love, lkml, Ingo Molnar
At 02:11 25/03/02, Andrew Morton wrote:
>Andrew Morton wrote:
> >
> > ..
> > Kernel is 2.5.7, dual PIII. When I enable preempt it
> > locks during boot.
>
>OK, this patch fixed it. I don't know why.
Er, because you disable preemption twice and it never gets enabled again? (-:
You probably meant that to be preempt_enable() at the bottom of the patch...
That might not solve your problem of course... But with the patch you
basically have completely disabled preemption, you might as well not
configure it into the kernel. (-;
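To illustrate the imbalance, here is a small counter model. It is only a sketch with made-up `pc_` names, not the real preempt macros: it assumes preemption is possible only while the count is zero, and it keeps just the shape of the two variants (disable on entry, then either disable or enable on exit).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task preempt count. */
static int pc_count;

static void pc_disable(void) { pc_count++; }

static void pc_enable(void)
{
    assert(pc_count > 0);   /* an enable must pair with an earlier disable */
    pc_count--;
}

static bool pc_preemptible(void) { return pc_count == 0; }

/* Shape of the posted patch: disable on entry, disable again on exit. */
static void pc_set_cpus_allowed_unbalanced(void)
{
    pc_disable();
    /* ... body elided ... */
    pc_disable();
}

/* Shape of the probably intended version: disable on entry, enable on exit. */
static void pc_set_cpus_allowed_balanced(void)
{
    pc_disable();
    /* ... body elided ... */
    pc_enable();
}
```

After one call to the unbalanced variant the count is left at 2, so `pc_preemptible()` stays false from then on. The balanced variant returns the count to where it started.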
Anton
>--- linux-2.5.7/kernel/sched.c Mon Mar 18 13:04:41 2002
>+++ 25/kernel/sched.c Sun Mar 24 18:09:09 2002
>@@ -1545,6 +1545,8 @@ void set_cpus_allowed(task_t *p, unsigne
> migration_req_t req;
> runqueue_t *rq;
>
>+ preempt_disable();
>+
> new_mask &= cpu_online_map;
> if (!new_mask)
> BUG();
>@@ -1557,7 +1559,7 @@ void set_cpus_allowed(task_t *p, unsigne
> */
> if (new_mask & (1UL << p->thread_info->cpu)) {
> task_rq_unlock(rq, &flags);
>- return;
>+ goto out;
> }
>
> init_MUTEX_LOCKED(&req.sem);
>@@ -1567,6 +1569,8 @@ void set_cpus_allowed(task_t *p, unsigne
> wake_up_process(rq->migration_thread);
>
> down(&req.sem);
>+out:
>+ preempt_disable();
> }
>
> static volatile unsigned long migration_mask;
>
>
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
* Re: preempt-related hangs
2002-03-25 2:33 ` Anton Altaparmakov
@ 2002-03-25 2:40 ` Robert Love
2002-03-25 2:48 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Robert Love @ 2002-03-25 2:40 UTC (permalink / raw)
To: Anton Altaparmakov; +Cc: Andrew Morton, lkml, Ingo Molnar
On Sun, 2002-03-24 at 21:33, Anton Altaparmakov wrote:
> Er, because you disable preemption twice and it never gets enabled again? (-:
>
> You probably meant that to be preempt_enable() at the bottom of the patch...
> That might not solve your problem of course... But with the patch you
> basically have completely disabled preemption, you might as well not
> configure it into the kernel. (-;
Crap - good eye Anton. What does it do now, Andrew?
Robert Love
* Re: preempt-related hangs
2002-03-25 2:33 ` Anton Altaparmakov
2002-03-25 2:40 ` Robert Love
@ 2002-03-25 2:48 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2002-03-25 2:48 UTC (permalink / raw)
To: Anton Altaparmakov; +Cc: Robert Love, lkml, Ingo Molnar
Anton Altaparmakov wrote:
>
> At 02:11 25/03/02, Andrew Morton wrote:
> >Andrew Morton wrote:
> > >
> > > ..
> > > Kernel is 2.5.7, dual PIII. When I enable preempt it
> > > locks during boot.
> >
> >OK, this patch fixed it. I don't know why.
>
> Er, because you disable preemption twice and it never gets enabled again? (-:
>
> You probably meant that to be preempt_enable() at the bottom of the patch...
> That might not solve your problem of course... But with the patch you
> basically have completely disabled preemption, you might as well not
> configure it into the kernel. (-;
Yeah I know. Sheesh. I don't even have time to test the fix
before you're on my act :)
Fixed-up workaround with a little debug check is below.
I think Robert's right - the problem is more likely to lie
with the migration thread handoff thingy.
--- 2.5.7/kernel/sched.c~preempt-lockup Sun Mar 24 18:10:49 2002
+++ 2.5.7-akpm/kernel/sched.c Sun Mar 24 18:25:29 2002
@@ -1561,6 +1561,8 @@ void set_cpus_allowed(task_t *p, unsigne
migration_req_t req;
runqueue_t *rq;
+ preempt_disable();
+
new_mask &= cpu_online_map;
if (!new_mask)
BUG();
@@ -1573,7 +1575,7 @@ void set_cpus_allowed(task_t *p, unsigne
*/
if (new_mask & (1UL << p->thread_info->cpu)) {
task_rq_unlock(rq, &flags);
- return;
+ goto out;
}
init_MUTEX_LOCKED(&req.sem);
@@ -1583,6 +1585,8 @@ void set_cpus_allowed(task_t *p, unsigne
wake_up_process(rq->migration_thread);
down(&req.sem);
+out:
+ preempt_enable();
}
static volatile unsigned long migration_mask;
--- 2.5.7/kernel/exit.c~preempt-lockup Sun Mar 24 18:31:39 2002
+++ 2.5.7-akpm/kernel/exit.c Sun Mar 24 18:37:19 2002
@@ -489,6 +489,14 @@ NORET_TYPE void do_exit(long code)
panic("Attempted to kill the idle task!");
if (tsk->pid == 1)
panic("Attempted to kill init!");
+#ifdef CONFIG_PREEMPT
+ if (preempt_get_count()) {
+ printk(KERN_ERR "task `%s' exits with non-zero "
+ "preempt count: %d\n",
+ current->comm,
+ preempt_get_count());
+ }
+#endif
tsk->flags |= PF_EXITING;
del_timer_sync(&tsk->real_timer);
* Re: preempt-related hangs
2002-03-25 1:59 preempt-related hangs Andrew Morton
2002-03-25 2:11 ` Andrew Morton
@ 2002-03-25 8:04 ` Zwane Mwaikambo
1 sibling, 0 replies; 7+ messages in thread
From: Zwane Mwaikambo @ 2002-03-25 8:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: Robert Love, lkml
On Sun, 24 Mar 2002, Andrew Morton wrote:
> I sent this email to Ingo last week; seems that he's
> having some downtime. It was happening on my dual PIII
> and I now discover that the quad PIII does the same
> thing. Any ideas?
>
>
> Kernel is 2.5.7, dual PIII. When I enable preempt it
> locks during boot.
Same here: 2.5.7 with quad PPro emulation, and I have preempt disabled.
> I applied the kgdb patch and had a poke.
>
> (gdb) info threads
> * 6 Thread 6 preempt_schedule () at sched.c:848
> 5 Thread 5 preempt_schedule () at sched.c:848
> 4 Thread 4 context_thread (startup=0xc0395f90) at context.c:101
> 3 Thread 3 migration_thread (unused=0x0) at sched.c:1646
> 2 Thread 2 migration_thread (unused=0x0) at sched.c:1646
> 1 Thread 1 spawn_ksoftirqd () at softirq.c:407
>
> Note that init is stuck in spawn_ksoftirqd. It's spinning in
> that function, yielding, waiting for the softirqd threads to
> come alive. They're threads 5 and 6.
I'm locking up in the same place: my last CPU is spinning, waiting for its
softirqd thread. Then I get an smp_migrate_task IPI from a live CPU, at
which point I'm stuck.
Zwane