Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

From: Michael Breuer <mbreuer@majjas.com>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.
Date: Sat, 23 Jan 2010 21:49:25 -0500
Message-ID: <4B5BB535.8040200@majjas.com>
In-Reply-To: <4B4E1461.4010806@majjas.com>

On 01/13/2010 01:43 PM, Michael Breuer wrote:
> [Originally posted as: "Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was
> Sky2 oops - Driver tries to sync DMA memory it has not allocated)"]
>
> On 1/11/2010 8:49 PM, Paul E. McKenney wrote:
>> On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
>>> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>>>> Hi,
>>>>
>>>> Attempting to move back to mainline after my recent 2.6.32 issues...
>>>> Config is make oldconfig from working 2.6.32 config. Patch for 
>>>> af_packet.c
>>>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>>>> backtraces.
>>>>
>>>> System becomes unusable after bringing up the network:
>>>>
>>>> ...
>> RCU stall warnings are usually due to an infinite loop somewhere in the
>> kernel.  If you are running !CONFIG_PREEMPT, then any infinite loop not
>> containing some call to schedule will get you a stall warning.  If you
>> are running CONFIG_PREEMPT, then the infinite loop is in some section of
>> code with preemption disabled (or irqs disabled).
>>
>> The stall-warning dump will normally finger one or more of the CPUs.
>> Since you are getting repeated warnings, look at the stacks and see
>> which of the most-recently-called functions stays the same in successive
>> stack traces.  This information should help you finger the infinite (or
>> longer than average) loop.
>> ...
> I can now recreate this simply by "service libvirtd start" on an F12 
> box. My earlier report that suggested this had something to do with 
> the sky2 driver was incorrect. Interestingly, it's always CPU1 
> whenever I start libvirtd.
> Attaching two of the traces (I've got about ten, but they're all
> pretty much the same). They look consistent: libvirtd on CPU1 is
> hung forking. Not sure why yet; perhaps someone who knows this better
> than I do can jump in.
> Summary: the hang appears to occur when libvirtd forks; two threads
> with the same pid show up deadlocked on a spin_lock.
>> Then if looking at the stack traces doesn't locate the offending loop,
>> bisection might help.
> It would; however, it's going to be really difficult, as I wasn't able
> to get this far with rc1 & rc2 :(
>>                             Thanx, Paul
>
I was finally able to bisect this to commit
3802290628348674985d14914f9bfee7b9084548 (see below).

libvirtd always triggers the hang; other programs that fork and use mmap
sometimes do as well (vsftpd, for example).

Author: Peter Zijlstra <a.p.zijlstra@chello.nl>  2009-12-16 12:04:37
Committer: Ingo Molnar <mingo@elte.hu>  2009-12-16 13:01:56
Parent: e2912009fb7b715728311b0d8fe327a1432b3f79 (sched: Ensure set_task_cpu() is never called on blocked tasks)
Branches: remotes/origin/master
Follows: v2.6.32
Precedes: v2.6.33-rc2

     sched: Fix sched_exec() balancing

     Since we access ->cpus_allowed without holding rq->lock we need
     a retry loop to validate the result, this comes for near free
     when we merge sched_migrate_task() into sched_exec() since that
     already does the needed check.

     Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
     Cc: Mike Galbraith <efault@gmx.de>
     LKML-Reference: <20091216170517.884743662@chello.nl>
     Signed-off-by: Ingo Molnar <mingo@elte.hu>

-------------------------------- kernel/sched.c --------------------------------
index 33d7965..63e55ac 100644
@@ -2322,7 +2322,7 @@ void task_oncpu_function_call(struct task_struct *p,
   *
   *  - fork, @p is stable because it isn't on the tasklist yet
   *
- *  - exec, @p is unstable XXX
+ *  - exec, @p is unstable, retry loop
   *
   *  - wake-up, we serialize ->cpus_allowed against TASK_WAKING so
   *             we should be good.
@@ -3132,21 +3132,36 @@ static void double_rq_unlock(struct rq *rq1, struct rq *rq2)
  }

  /*
- * If dest_cpu is allowed for this process, migrate the task to it.
- * This is accomplished by forcing the cpu_allowed mask to only
- * allow dest_cpu, which will force the cpu onto dest_cpu. Then
- * the cpu_allowed mask is restored.
+ * sched_exec - execve() is a valuable balancing opportunity, because at
+ * this point the task has the smallest effective memory and cache footprint.
   */
-static void sched_migrate_task(struct task_struct *p, int dest_cpu)
+void sched_exec(void)
  {
+    struct task_struct *p = current;
      struct migration_req req;
+    int dest_cpu, this_cpu;
      unsigned long flags;
      struct rq *rq;

+again:
+    this_cpu = get_cpu();
+    dest_cpu = select_task_rq(p, SD_BALANCE_EXEC, 0);
+    if (dest_cpu == this_cpu) {
+        put_cpu();
+        return;
+    }
+
      rq = task_rq_lock(p, &flags);
+    put_cpu();
+
+    /*
+     * select_task_rq() can race against ->cpus_allowed
+     */
      if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed)
-        || unlikely(!cpu_active(dest_cpu)))
-        goto out;
+        || unlikely(!cpu_active(dest_cpu))) {
+        task_rq_unlock(rq, &flags);
+        goto again;
+    }

      /* force the process onto the specified CPU */
      if (migrate_task(p, dest_cpu, &req)) {
@@ -3161,24 +3176,10 @@ static void sched_migrate_task(struct task_struct *p, int dest_cpu)

          return;
      }
-out:
      task_rq_unlock(rq, &flags);
  }

  /*
- * sched_exec - execve() is a valuable balancing opportunity, because at
- * this point the task has the smallest effective memory and cache footprint.
- */
-void sched_exec(void)
-{
-    int new_cpu, this_cpu = get_cpu();
-    new_cpu = select_task_rq(current, SD_BALANCE_EXEC, 0);
-    put_cpu();
-    if (new_cpu != this_cpu)
-        sched_migrate_task(current, new_cpu);
-}
-
-/*
   * pull_task - move a task from a remote runqueue to the local runqueue.
   * Both runqueues must be locked.
   */
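
For readers unfamiliar with the pattern, the retry loop that the commit
message describes (choose a CPU without the lock, then validate the choice
under the lock and retry on failure) can be sketched in user space roughly
as follows. This is only an illustration under stated assumptions, not the
kernel code: struct toy_task, toy_lock(), pick_cpu_validated() and the two
selector functions are hypothetical names, a C11 atomic_flag stands in for
rq->lock, and the max_tries bound is an addition of the sketch (the patch
above retries unconditionally via "goto again").

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task: an affinity bitmask guarded by a lock. */
struct toy_task {
    atomic_flag lock;       /* plays the role of rq->lock */
    unsigned long allowed;  /* plays the role of ->cpus_allowed; bit n = CPU n */
};

static void toy_task_init(struct toy_task *t, unsigned long allowed)
{
    atomic_flag_clear(&t->lock);
    t->allowed = allowed;
}

/* Minimal test-and-set spinlock, for illustration only. */
static void toy_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ;
}

static void toy_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

/* Hypothetical selector: proposes a CPU without taking the lock. */
typedef int (*select_fn)(const struct toy_task *t, int attempt);

/*
 * Choose a CPU optimistically, then validate the choice under the lock.
 * Returns the validated CPU, or -1 once max_tries attempts have failed.
 * The number of attempts made is stored in *tries_out.
 * Real code would act on the result while still holding the lock; the
 * sketch drops the lock first to stay short.
 */
static int pick_cpu_validated(struct toy_task *t, select_fn select,
                              int max_tries, int *tries_out)
{
    assert(max_tries > 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        int cpu = select(t, attempt);   /* unlocked read, may be stale */

        toy_lock(&t->lock);
        bool ok = cpu >= 0 && cpu < (int)(8 * sizeof t->allowed) &&
                  (t->allowed & (1UL << cpu)) != 0;
        toy_unlock(&t->lock);

        if (ok) {
            *tries_out = attempt + 1;
            return cpu;
        }
        /* Validation failed: go around again, like "goto again". */
    }

    *tries_out = max_tries;
    return -1;
}

/* Selector whose first answer is stale (CPU 3) and whose later answers are CPU 1. */
static int stale_then_good(const struct toy_task *t, int attempt)
{
    (void)t;
    return attempt == 0 ? 3 : 1;
}

/* Selector that never agrees with the mask; an unbounded loop would spin. */
static int always_stale(const struct toy_task *t, int attempt)
{
    (void)t;
    (void)attempt;
    return 3;
}
```

With a mask that allows only CPUs 0 and 1, stale_then_good succeeds on the
second attempt, while always_stale exhausts the bound. The second case only
shows that a select-then-validate loop terminates when the selector and the
check eventually agree; it is a property of the sketch, not a diagnosis of
the hang reported here.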

Thread overview: 11+ messages
2010-01-09 22:21 2.6.33RC3 Sky2 oops - Driver tries to sync DMA memory it has not allocated Michael Breuer
2010-01-10 20:10 ` 2.6.33RC3 libvirtd ->sky2 & rcu oops (was Sky2 oops - Driver tries to sync DMA memory it has not allocated) Michael Breuer
2010-01-12  1:49   ` Paul E. McKenney
2010-01-13 18:43     ` 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible Michael Breuer
2010-01-13 18:58       ` Paul E. McKenney
2010-01-24  2:49       ` Michael Breuer [this message]
2010-01-24  5:59         ` Bisected rcu hang (kernel/sched.c): was " Mike Galbraith
2010-01-24  6:32           ` Michael Breuer
2010-01-24  7:19             ` Mike Galbraith
2010-01-25 16:03         ` Peter Zijlstra
2010-01-25 16:14           ` Michael Breuer
