From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: [BUG] race of RCU vs NOHU
Date: Fri, 7 Aug 2009 15:15:29 +0200
Message-ID: <20090807151529.2806b8b4@skybase>

Hi Paul,
I analysed a dump of a hanging 2.6.30 system and found what I think is
a bug of RCU vs NOHZ. There are a number of patches on top of that
kernel, but they should be independent of the bug.

The system has 4 cpus and uses classic RCU. Cpus #0, #2 and #3 woke up
recently; cpu #1 has been sleeping for 5 minutes, but there is a pending
rcu batch. The timer wheel for cpu #1 is empty, so it will continue to
sleep for NEXT_TIMER_MAX_DELTA ticks.

Now if I look at the RCU data structures I find this:

rcu_ctrlblk
>> px *(struct rcu_ctrlblk *) 0x810000
struct rcu_ctrlblk {
        cur = 0xffffffffffffff99
        completed = 0xffffffffffffff98
        pending = 0xffffffffffffff99
        signaled = 0x0
        lock = spinlock_t {
                raw_lock = raw_spinlock_t {
                        owner_cpu = 0x0
                }
                break_lock = 0x0
                magic = 0xdead4ead
                owner_cpu = 0xffffffff
                owner = 0xffffffffffffffff
                dep_map = struct lockdep_map {
                        key = 0x810118
                        class_cache = 0xcbcff0
                        name = 0x63e944
                        cpu = 0x0
                        ip = 0x1a7f64
                }
        }
        cpumask = {
                [0] 0x2
        }
}

rcu_data cpu #0
>> px *(struct rcu_data *) 0x872f8430
struct rcu_data {
        quiescbatch = 0xffffffffffffff99
        passed_quiesc = 0x1
        qs_pending = 0x0
        batch = 0xffffffffffffff97
        nxtlist = (nil)
        nxttail = {
                [0]         0x872f8448
                [1]         0x872f8448
                [2]         0x872f8448
        }
        qlen = 0x0
        donelist = (nil)
        donetail = 0x872f8470
        blimit = 0xa
        cpu = 0x0
        barrier = struct rcu_head {
                next = (nil)
                func = 0x0
        }
}

rcu_data cpu #1
>> px *(struct rcu_data *) 0x874be430
struct rcu_data {
        quiescbatch = 0xffffffffffffff98
        passed_quiesc = 0x1
        qs_pending = 0x0
        batch = 0xffffffffffffff97
        nxtlist = (nil)
        nxttail = {
                [0]         0x874be448
                [1]         0x874be448
                [2]         0x874be448
        }
        qlen = 0x0
        donelist = (nil)
        donetail = 0x874be470
        blimit = 0xa
        cpu = 0x1
        barrier = struct rcu_head {
                next = (nil)
                func = 0x0
        }
}

rcu_data cpu #2
>> px *(struct rcu_data *) 0x87684430
struct rcu_data {
        quiescbatch = 0xffffffffffffff99
        passed_quiesc = 0x1
        qs_pending = 0x0
        batch = 0xffffffffffffff99
        nxtlist = 0xffc1fc18
        nxttail = {
                [0]         0x87684448
                [1]         0x87684448
                [2]         0xffc1fc18
        }
        qlen = 0x1
        donelist = (nil)
        donetail = 0x87684470
        blimit = 0xa
        cpu = 0x2
        barrier = struct rcu_head {
                next = (nil)
                func = 0x0
        }
}

rcu_data cpu #3
>> px *(struct rcu_data *) 0x8784a430
struct rcu_data {
        quiescbatch = 0xffffffffffffff99
        passed_quiesc = 0x1
        qs_pending = 0x0
        batch = 0xffffffffffffff63
        nxtlist = (nil)
        nxttail = {
                [0]         0x8784a448
                [1]         0x8784a448
                [2]         0x8784a448
        }
        qlen = 0x0
        donelist = (nil)
        donetail = 0x8784a470
        blimit = 0xa
        cpu = 0x3
        barrier = struct rcu_head {
                next = (nil)
                func = 0x0
        }
}
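
The dumps above can be cross-checked mechanically. The following is a
minimal sketch in plain userspace C, not kernel code: the helper names
(batch_in_flight, batch_waits_for, cpu_noticed_batch, lagging_cpu) are
made up for illustration, and it assumes that a batch is in flight
whenever rcu_ctrlblk.cur differs from rcu_ctrlblk.completed. Only the
constants are taken from the dumps.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

/* Values copied from the rcu_ctrlblk and rcu_data dumps above. */
static const uint64_t ctrl_cur       = UINT64_C(0xffffffffffffff99);
static const uint64_t ctrl_completed = UINT64_C(0xffffffffffffff98);
static const uint64_t ctrl_cpumask   = UINT64_C(0x2);
static const uint64_t quiescbatch[NR_CPUS] = {
	UINT64_C(0xffffffffffffff99),	/* cpu #0 */
	UINT64_C(0xffffffffffffff98),	/* cpu #1 */
	UINT64_C(0xffffffffffffff99),	/* cpu #2 */
	UINT64_C(0xffffffffffffff99),	/* cpu #3 */
};

/* Assumption: a batch is in flight while cur has moved past completed. */
int batch_in_flight(void)
{
	return ctrl_cur != ctrl_completed;
}

/* Bit n set in rcu_ctrlblk.cpumask: the batch still waits for cpu n. */
int batch_waits_for(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (int)((ctrl_cpumask >> cpu) & 1);
}

/* Has this cpu's quiescbatch caught up with rcu_ctrlblk.cur? */
int cpu_noticed_batch(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return quiescbatch[cpu] == ctrl_cur;
}

/*
 * Return the single cpu that holds up the batch and has not noticed it,
 * or -1 if there is not exactly one such cpu.
 */
int lagging_cpu(void)
{
	int cpu, found = -1;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (batch_waits_for(cpu) && !cpu_noticed_batch(cpu)) {
			if (found >= 0)
				return -1;
			found = cpu;
		}
	}
	return found;
}
```

With the dumped values, cpu #1 is the only cpu that is both still set in
rcu_ctrlblk.cpumask and still carries the old quiescbatch.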

At the time cpu #1 went to sleep, rcu_needs_cpu must have answered false;
otherwise a 1 tick delay would have been programmed. rcu_pending compares
rcu_ctrlblk.cur with rcu_data.quiescbatch for cpu #1, so these two must
have been equal, otherwise rcu_needs_cpu would have answered true.
That means that the rcu_needs_cpu check completed before
rcu_start_batch for batch 0xffffffffffffff99. The bit for cpu #1 is
still set in rcu_ctrlblk.cpumask, therefore the bit for cpu #1
in nohz_cpu_mask cannot have been set at the time rcu_start_batch
completed. That gives the following race (cpu 0 is starting the batch,
cpu 1 is going to sleep):

cpu 1: tick_nohz_stop_sched_tick: rcu_needs_cpu();
cpu 0: rcu_start_batch: rcp->cur++;
cpu 0: rcu_start_batch: cpumask_andnot(to_cpumask(rcp->cpumask),
					cpu_online_mask, nohz_cpu_mask);
cpu 1: tick_nohz_stop_sched_tick: cpumask_set_cpu(1, nohz_cpu_mask);

The order of i) setting the bit in nohz_cpu_mask and ii) the rcu_needs_cpu()
check in tick_nohz_stop_sched_tick is wrong, no? Or did I miss some subtle
check that comes afterwards?
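
The interleaving can be replayed in a small single-threaded model. This
is only a sketch under stated assumptions, not the kernel implementation:
struct model and all model_* helpers are hypothetical, rcu_needs_cpu is
reduced to the cur vs. quiescbatch comparison described above, and
locking and memory ordering are ignored entirely. The second function
replays the same steps with the bit set before the check; it merely
illustrates the question about the ordering and is not a verified fix.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct model {
	long cur;			/* stands in for rcu_ctrlblk.cur */
	unsigned long cpumask;		/* stands in for rcu_ctrlblk.cpumask */
	unsigned long online_mask;	/* stands in for cpu_online_mask */
	unsigned long nohz_mask;	/* stands in for nohz_cpu_mask */
	long quiescbatch[NR_CPUS];	/* per-cpu rcu_data.quiescbatch */
};

/* All four cpus online, none in nohz, every cpu caught up with cur. */
void model_init(struct model *m)
{
	int cpu;

	m->cur = 0x98;
	m->cpumask = 0;
	m->online_mask = 0xf;
	m->nohz_mask = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		m->quiescbatch[cpu] = m->cur;
}

/* Reduced rcu_needs_cpu(): true if the cpu has not seen the current batch. */
bool model_rcu_needs_cpu(const struct model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return m->quiescbatch[cpu] != m->cur;
}

/* Reduced rcu_start_batch(): bump cur, then snapshot online & ~nohz. */
void model_rcu_start_batch(struct model *m)
{
	m->cur++;
	m->cpumask = m->online_mask & ~m->nohz_mask;
}

void model_set_nohz(struct model *m, int cpu)
{
	m->nohz_mask |= 1UL << cpu;
}

void model_clear_nohz(struct model *m, int cpu)
{
	m->nohz_mask &= ~(1UL << cpu);
}

/* Stuck: the cpu is marked nohz although the batch still waits for it. */
bool model_stuck(const struct model *m, int cpu)
{
	return ((m->nohz_mask >> cpu) & 1) && ((m->cpumask >> cpu) & 1);
}

/* The order from the mail: check first, set the nohz bit afterwards. */
bool reported_interleaving(void)
{
	struct model m;
	bool needs;

	model_init(&m);
	needs = model_rcu_needs_cpu(&m, 1);	/* cpu 1: answers false */
	model_rcu_start_batch(&m);		/* cpu 0: cpu 1 not yet nohz */
	if (!needs)
		model_set_nohz(&m, 1);		/* cpu 1: goes to sleep */
	return model_stuck(&m, 1);
}

/* Swapped order: set the nohz bit first, check afterwards, back out. */
bool swapped_interleaving(void)
{
	struct model m;

	model_init(&m);
	model_set_nohz(&m, 1);			/* cpu 1 */
	model_rcu_start_batch(&m);		/* cpu 0: leaves cpu 1 out */
	if (model_rcu_needs_cpu(&m, 1))
		model_clear_nohz(&m, 1);	/* cpu 1: keeps its tick */
	return model_stuck(&m, 1);
}
```

In this model reported_interleaving() ends with cpu 1 marked nohz while
the batch still waits for it, which matches the dump; in
swapped_interleaving() cpu 1 sees the new batch and backs out of the
nohz state.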

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

