From: Rusty Russell <rusty@rustcorp.com.au>
To: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Bjorn Helgaas <bjorn.helgaas@hp.com>,
	andreas.herrmann3@amd.com,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: 2.6.29 boot hang
Date: Wed, 1 Apr 2009 17:03:57 +1030
Message-ID: <200904011703.58179.rusty@rustcorp.com.au>
In-Reply-To: <49D2F2D4.9090008@oracle.com>

On Wednesday 01 April 2009 15:21:32 Randy Dunlap wrote:
> Rusty Russell wrote:
> > On Wednesday 01 April 2009 07:15:35 Randy Dunlap wrote:
> >> On a 4-proc x86_64 (HP BladeCenter, AMD CPUs) system, booting 2.6.29
> >> (or earlier, back to 2.6.28-6921-g873392c) hangs during boot.
> >>
> >> git bisect says:
> >> 873392ca514f87eae39f53b6944caf85b1a047cb is first bad commit
> >> commit 873392ca514f87eae39f53b6944caf85b1a047cb
> >> Author: Rusty Russell <rusty@rustcorp.com.au>
> >> Date:   Wed Dec 31 23:54:56 2008 +1030
> >>
> >>     PCI: work_on_cpu: use in drivers/pci/pci-driver.c
> > 
> > ...
> > 
> >> If I change CONFIG_MICROCODE_AMD=y to CONFIG_MICROCODE_AMD=n & rebuild,
> >> the kernel boots successfully.
> > 
> > How very very odd.  My first thought was a deadlock with keventd used
> > by work_on_cpu (changed in latest Linus tree), but the microcode code at
> > that version doesn't use work_on_cpu.
> 
> Yep, I thought it a bit odd also.
> 
> > So I don't think that's it, but this patch should canonically eliminate it:
> > 
> > Subject: work_on_cpu(): rewrite it to create a kernel thread on demand
> > From: Andrew Morton <akpm@linux-foundation.org>
> 
> This patch doesn't apply to 2.6.29-final, but it does apply to 2.6.29-git8,

Err, it has a 14-line offset.  But here's an adjusted one.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1f0c509..08bd795 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -971,20 +971,20 @@ undo:
 }
 
 #ifdef CONFIG_SMP
-static struct workqueue_struct *work_on_cpu_wq __read_mostly;
 
 struct work_for_cpu {
-	struct work_struct work;
+	struct completion completion;
 	long (*fn)(void *);
 	void *arg;
 	long ret;
 };
 
-static void do_work_for_cpu(struct work_struct *w)
+static int do_work_for_cpu(void *_wfc)
 {
-	struct work_for_cpu *wfc = container_of(w, struct work_for_cpu, work);
-
+	struct work_for_cpu *wfc = _wfc;
 	wfc->ret = wfc->fn(wfc->arg);
+	complete(&wfc->completion);
+	return 0;
 }
 
 /**
@@ -995,17 +995,23 @@ static void do_work_for_cpu(struct work_struct *w)
  *
  * This will return the value @fn returns.
  * It is up to the caller to ensure that the cpu doesn't go offline.
+ * The caller must not hold any locks which would prevent @fn from completing.
  */
 long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
 {
-	struct work_for_cpu wfc;
-
-	INIT_WORK(&wfc.work, do_work_for_cpu);
-	wfc.fn = fn;
-	wfc.arg = arg;
-	queue_work_on(cpu, work_on_cpu_wq, &wfc.work);
-	flush_work(&wfc.work);
-
+	struct task_struct *sub_thread;
+	struct work_for_cpu wfc = {
+		.completion = COMPLETION_INITIALIZER_ONSTACK(wfc.completion),
+		.fn = fn,
+		.arg = arg,
+	};
+
+	sub_thread = kthread_create(do_work_for_cpu, &wfc, "work_for_cpu");
+	if (IS_ERR(sub_thread))
+		return PTR_ERR(sub_thread);
+	kthread_bind(sub_thread, cpu);
+	wake_up_process(sub_thread);
+	wait_for_completion(&wfc.completion);
 	return wfc.ret;
 }
 EXPORT_SYMBOL_GPL(work_on_cpu);
@@ -1021,8 +1027,4 @@ void __init init_workqueues(void)
 	hotcpu_notifier(workqueue_cpu_callback, 0);
 	keventd_wq = create_workqueue("events");
 	BUG_ON(!keventd_wq);
-#ifdef CONFIG_SMP
-	work_on_cpu_wq = create_workqueue("work_on_cpu");
-	BUG_ON(!work_on_cpu_wq);
-#endif
 }
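
For readers without a kernel tree at hand, the shape of the rewritten
work_on_cpu() (spawn a thread on demand, run fn in it, signal a completion,
wait for it) can be approximated in user space. This is only a rough sketch
under stated assumptions, not the kernel implementation: work_on_thread(),
struct work_for_thread, add_one() and the pthread-based struct completion are
hypothetical names made up for illustration, and the sketch does not pin the
thread to a CPU the way kthread_bind() does in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void destroy_completion(struct completion *c)
{
	pthread_cond_destroy(&c->cond);
	pthread_mutex_destroy(&c->lock);
}

/* Mark the work as finished and wake the waiter. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until complete() has been called; the loop handles spurious wakeups. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Mirrors struct work_for_cpu from the patch (assumed name). */
struct work_for_thread {
	struct completion completion;
	long (*fn)(void *);
	void *arg;
	long ret;
};

/* Thread body: run the callback, store its result, signal completion. */
static void *do_work_for_thread(void *_wft)
{
	struct work_for_thread *wft = _wft;

	wft->ret = wft->fn(wft->arg);
	complete(&wft->completion);
	return NULL;
}

/* Run fn(arg) in a freshly created thread and return its result,
 * or a negated error number if the thread could not be created. */
long work_on_thread(long (*fn)(void *), void *arg)
{
	struct work_for_thread wft = { .fn = fn, .arg = arg };
	pthread_t tid;
	int err;

	assert(fn != NULL);
	init_completion(&wft.completion);
	err = pthread_create(&tid, NULL, do_work_for_thread, &wft);
	if (err) {
		destroy_completion(&wft.completion);
		return -err;
	}
	wait_for_completion(&wft.completion);
	/* Join before returning so the thread never touches a dead stack frame. */
	pthread_join(tid, NULL);
	destroy_completion(&wft.completion);
	return wft.ret;
}

/* Hypothetical callback used only to exercise the sketch. */
long add_one(void *arg)
{
	return *(long *)arg + 1;
}
```

In user space pthread_join() alone would be enough to wait for the result; the
completion is kept here only to mirror the structure of the patch. As with the
kernel version, the caller must not hold a lock that fn needs, or the wait
never returns.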
