* kernel_thread() & thread starting @ 2001-02-18 22:24 Kenn Humborg 2001-02-18 22:53 ` Russell King 2001-02-19 10:17 ` Philipp Rumpf 0 siblings, 2 replies; 9+ messages in thread From: Kenn Humborg @ 2001-02-18 22:24 UTC (permalink / raw) To: Linux-Kernel In init/main.c, do_basic_setup() we have: start_context_thread(); do_initcalls(); start_context_thread() calls kernel_thread() to start the keventd thread. Then do_initcalls() calls all the init functions and finishes by calling flush_scheduled_tasks(). This function ends up calling schedule_task() which checks if keventd is running. With a very stripped down kernel, it seems possible that do_initcalls() can complete without context_thread() having had a chance to run (and set the flag that keventd is running). Right now, in the Linux/VAX project, I'm working with a very stripped down kernel and I'm seeing this behaviour. Depending on what I enable in the .config, I can get schedule_task() to fail with: schedule_task(): keventd has not started When starting bdflush and kupdated, bdflush_init() uses a semaphore to make sure that the threads have run before continuing. Shouldn't start_context_thread() do something similar? Or am I missing something? Thanks, Kenn ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-18 22:24 kernel_thread() & thread starting Kenn Humborg @ 2001-02-18 22:53 ` Russell King 2001-02-21 0:51 ` Kenn Humborg 2001-02-19 10:17 ` Philipp Rumpf 1 sibling, 1 reply; 9+ messages in thread From: Russell King @ 2001-02-18 22:53 UTC (permalink / raw) To: Kenn Humborg; +Cc: Linux-Kernel Kenn Humborg writes: > When starting bdflush and kupdated, bdflush_init() uses a semaphore to > make sure that the threads have run before continuing. Shouldn't > start_context_thread() do something similar? I think this would be a good idea. Here is a patch to try. Please report back if it works so that it can be forwarded to Linus. Thanks. --- orig/kernel/context.c Tue Jan 30 13:31:11 2001 +++ linux/kernel/context.c Sun Feb 18 22:51:56 2001 @@ -63,7 +63,7 @@ return ret; } -static int context_thread(void *dummy) +static int context_thread(void *sem) { struct task_struct *curtask = current; DECLARE_WAITQUEUE(wait, curtask); @@ -79,6 +79,8 @@ recalc_sigpending(curtask); spin_unlock_irq(&curtask->sigmask_lock); + up((struct semaphore *)sem); + /* Install a handler so SIGCLD is delivered */ sa.sa.sa_handler = SIG_IGN; sa.sa.sa_flags = 0; @@ -148,7 +150,9 @@ int start_context_thread(void) { - kernel_thread(context_thread, NULL, CLONE_FS | CLONE_FILES); + DECLARE_MUTEX_LOCKED(sem); + kernel_thread(context_thread, &sem, CLONE_FS | CLONE_FILES); + down(&sem); return 0; } -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-18 22:53 ` Russell King @ 2001-02-21 0:51 ` Kenn Humborg 0 siblings, 0 replies; 9+ messages in thread From: Kenn Humborg @ 2001-02-21 0:51 UTC (permalink / raw) To: Russell King; +Cc: Linux-Kernel, Philipp Rumpf, David Woodhouse, Andrew Morton On Sun, Feb 18, 2001 at 10:53:16PM +0000, Russell King wrote: > Kenn Humborg writes: > > When starting bdflush and kupdated, bdflush_init() uses a semaphore to > > make sure that the threads have run before continuing. Shouldn't > > start_context_thread() do something similar? > > I think this would be a good idea. Here is a patch to try. Please report > back if it works so that it can be forwarded to Linus. Thanks. Works perfectly for me. I'll leave it up to you guys to decide what's the right way to deal with this and pass a patch to Linus/Alan. Meanwhile, I'll keep Russell's patch below in our CVS tree. Thanks, Kenn > --- orig/kernel/context.c Tue Jan 30 13:31:11 2001 > +++ linux/kernel/context.c Sun Feb 18 22:51:56 2001 > @@ -63,7 +63,7 @@ > return ret; > } > > -static int context_thread(void *dummy) > +static int context_thread(void *sem) > { > struct task_struct *curtask = current; > DECLARE_WAITQUEUE(wait, curtask); > @@ -79,6 +79,8 @@ > recalc_sigpending(curtask); > spin_unlock_irq(&curtask->sigmask_lock); > > + up((struct semaphore *)sem); > + > /* Install a handler so SIGCLD is delivered */ > sa.sa.sa_handler = SIG_IGN; > sa.sa.sa_flags = 0; > @@ -148,7 +150,9 @@ > > int start_context_thread(void) > { > - kernel_thread(context_thread, NULL, CLONE_FS | CLONE_FILES); > + DECLARE_MUTEX_LOCKED(sem); > + kernel_thread(context_thread, &sem, CLONE_FS | CLONE_FILES); > + down(&sem); > return 0; > } > > > > -- > Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux > http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-18 22:24 kernel_thread() & thread starting Kenn Humborg 2001-02-18 22:53 ` Russell King @ 2001-02-19 10:17 ` Philipp Rumpf 2001-02-19 10:32 ` David Woodhouse 1 sibling, 1 reply; 9+ messages in thread From: Philipp Rumpf @ 2001-02-19 10:17 UTC (permalink / raw) To: Kenn Humborg; +Cc: Linux-Kernel, dwmw2 On Sun, 18 Feb 2001, Kenn Humborg wrote: > in the .config, I can get schedule_task() to fail with: > > schedule_task(): keventd has not started This shouldn't be a failure case, just a (bogus) printk. > When starting bdflush and kupdated, bdflush_init() uses a semaphore to > make sure that the threads have run before continuing. Shouldn't > start_context_thread() do something similar? Why bother ? It looks like a leftover debugging message which doesn't make a lot of sense once the code is stable (what might make sense is checking keventd is still around, but that's not what the code is doing). Proposed patch: --- linux-2.4/kernel/context.c Fri Jan 12 18:52:41 2001 +++ linux-prumpf/kernel/context.c Mon Feb 19 11:11:37 2001 @@ -26,19 +26,9 @@ static int keventd_running; static struct task_struct *keventd_task; -static int need_keventd(const char *who) -{ - if (keventd_running == 0) - printk(KERN_ERR "%s(): keventd has not started\n", who); - return keventd_running; -} - int current_is_keventd(void) { - int ret = 0; - if (need_keventd(__FUNCTION__)) - ret = (current == keventd_task); - return ret; + return current == keventd_task; } /** @@ -57,7 +47,6 @@ int schedule_task(struct tq_struct *task) { int ret; - need_keventd(__FUNCTION__); ret = queue_task(task, &tq_context); wake_up(&context_task_wq); return ret; dwmw2 ? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-19 10:17 ` Philipp Rumpf @ 2001-02-19 10:32 ` David Woodhouse 2001-02-19 12:44 ` Andrew Morton 0 siblings, 1 reply; 9+ messages in thread From: David Woodhouse @ 2001-02-19 10:32 UTC (permalink / raw) To: Philipp Rumpf; +Cc: Kenn Humborg, Linux-Kernel, akpm prumpf@mandrakesoft.com said: > Why bother ? It looks like a leftover debugging message which > doesn't make a lot of sense once the code is stable (what might make > sense is checking keventd is still around, but that's not what the > code is doing). > Proposed patch: > dwmw2 ? Don't look at me. I didn't like the current_is_keventd stuff very much in the first place. akpm? -- dwmw2 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-19 10:32 ` David Woodhouse @ 2001-02-19 12:44 ` Andrew Morton 0 siblings, 0 replies; 9+ messages in thread From: Andrew Morton @ 2001-02-19 12:44 UTC (permalink / raw) To: David Woodhouse; +Cc: Philipp Rumpf, Kenn Humborg, Linux-Kernel David Woodhouse wrote: > > prumpf@mandrakesoft.com said: > > Why bother ? It looks like a leftover debugging message which > > doesn't make a lot of sense once the code is stable (what might make > > sense is checking keventd is still around, but that's not what the > > code is doing). keventd *must* still be around. And the code obviously isn't completely stable, and this debug message has found something rather unpleasant. I don't think we should run the init tasks when keventd may, or may not be running. Sure, the current code does, by happenstance, all work correctly when keventd hasn't yet started running, and when it's starting up. But it's safer, saner and surer just to crank the damn thing up before proceeding. > > Proposed patch: > > > dwmw2 ? > > Don't look at me. I didn't like the current_is_keventd stuff very much > in the first place. akpm? Heh. Tell that to wakeup_bdflush(). The all-singing fully-async + fully-sync + async with callback patch was dropped, and until we can demonstrate that it fixes a bug, it can stay dropped. I think it _will_ fix a bug, but the development of userspace hotplug infrastructure hasn't reached the stage where the need for a kernel fix has been proven. I believe the right thing to do here is the RMK approach. Here's a faintly tested patch. --- linux-2.4.2-pre4/kernel/context.c Sat Jan 13 04:52:41 2001 +++ lk/kernel/context.c Mon Feb 19 23:33:38 2001 @@ -19,6 +19,7 @@ #include <linux/init.h> #include <linux/unistd.h> #include <linux/signal.h> +#include <asm/semaphore.h> static DECLARE_TASK_QUEUE(tq_context); static DECLARE_WAIT_QUEUE_HEAD(context_task_wq); @@ -26,19 +27,9 @@ static int keventd_running; static struct task_struct *keventd_task; -static int need_keventd(const char *who) -{ - if (keventd_running == 0) - printk(KERN_ERR "%s(): keventd has not started\n", who); - return keventd_running; -} - int current_is_keventd(void) { - int ret = 0; - if (need_keventd(__FUNCTION__)) - ret = (current == keventd_task); - return ret; + return (current == keventd_task); } /** @@ -57,13 +48,12 @@ int schedule_task(struct tq_struct *task) { int ret; - need_keventd(__FUNCTION__); ret = queue_task(task, &tq_context); wake_up(&context_task_wq); return ret; } -static int context_thread(void *dummy) +static int context_thread(void *sem) { struct task_struct *curtask = current; DECLARE_WAITQUEUE(wait, curtask); @@ -85,6 +75,7 @@ siginitset(&sa.sa.sa_mask, sigmask(SIGCHLD)); do_sigaction(SIGCHLD, &sa, (struct k_sigaction *)0); + up((struct semaphore *)sem); /* * If one of the functions on a task queue re-adds itself * to the task queue we call schedule() in state TASK_RUNNING @@ -146,9 +137,11 @@ remove_wait_queue(&context_task_done, &wait); } -int start_context_thread(void) +int __init start_context_thread(void) { - kernel_thread(context_thread, NULL, CLONE_FS | CLONE_FILES); + DECLARE_MUTEX_LOCKED(sem); + kernel_thread(context_thread, &sem, CLONE_FS | CLONE_FILES); + down(&sem); return 0; } - ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <Pine.LNX.3.96.1010219073717.16489K-100000@mandrakesoft.mandrakesoft.com>]
* Re: kernel_thread() & thread starting [not found] <Pine.LNX.3.96.1010219073717.16489K-100000@mandrakesoft.mandrakesoft.com> @ 2001-02-19 21:17 ` Russell King 2001-02-19 21:40 ` Philipp Rumpf 0 siblings, 1 reply; 9+ messages in thread From: Russell King @ 2001-02-19 21:17 UTC (permalink / raw) To: Philipp Rumpf; +Cc: Andrew Morton, David Woodhouse, Kenn Humborg, Linux-Kernel Philipp Rumpf writes: > That still won't catch keventd oopsing though - which I think might happen > quite easily in real life. Maybe we should panic in that case? For example, what happens if kswapd oopses? kreclaimd? bdflush? kupdate? All these have the same problem, and should probably have the same "fix" applied, whatever that fix may be. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-19 21:17 ` Russell King @ 2001-02-19 21:40 ` Philipp Rumpf 2001-02-19 22:25 ` David Woodhouse 0 siblings, 1 reply; 9+ messages in thread From: Philipp Rumpf @ 2001-02-19 21:40 UTC (permalink / raw) To: Russell King; +Cc: Andrew Morton, David Woodhouse, Kenn Humborg, Linux-Kernel On Mon, 19 Feb 2001, Russell King wrote: > Philipp Rumpf writes: > > That still won't catch keventd oopsing though - which I think might happen > > quite easily in real life. > > Maybe we should panic in that case? For example, what happens if kswapd > oopses? kreclaimd? bdflush? kupdate? All these have the same problem, No. If kswapd oopses it's a bug in kswapd (or related code). If keventd oopses most likely the broken code is actually the task queue you scheduled, which belongs to your driver. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel_thread() & thread starting 2001-02-19 21:40 ` Philipp Rumpf @ 2001-02-19 22:25 ` David Woodhouse 0 siblings, 0 replies; 9+ messages in thread From: David Woodhouse @ 2001-02-19 22:25 UTC (permalink / raw) To: Philipp Rumpf; +Cc: Russell King, Andrew Morton, Kenn Humborg, Linux-Kernel On Mon, 19 Feb 2001, Philipp Rumpf wrote: > No. If kswapd oopses it's a bug in kswapd (or related code). If keventd > oopses most likely the broken code is actually the task queue you > scheduled, which belongs to your driver. If we're going to detect this case, we might as well just restart keventd. -- dwmw2 ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2001-02-21 0:52 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-02-18 22:24 kernel_thread() & thread starting Kenn Humborg
2001-02-18 22:53 ` Russell King
2001-02-21 0:51 ` Kenn Humborg
2001-02-19 10:17 ` Philipp Rumpf
2001-02-19 10:32 ` David Woodhouse
2001-02-19 12:44 ` Andrew Morton
[not found] <Pine.LNX.3.96.1010219073717.16489K-100000@mandrakesoft.mandrakesoft.com>
2001-02-19 21:17 ` Russell King
2001-02-19 21:40 ` Philipp Rumpf
2001-02-19 22:25 ` David Woodhouse
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox