Re: [PATCH 1/7] rcu: Fix preempt-unsafe debug check of rcu extended quiescent state

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Pavel Ivanov <paivanof@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH 1/7] rcu: Fix preempt-unsafe debug check of rcu extended quiescent state
Date: Tue, 27 Sep 2011 09:01:00 -0700	[thread overview]
Message-ID: <20110927160100.GC2335@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAG1a4ruxzNo3TrRZTTLQWNfhVK7mB_0Mr4ff=tuK9k6CZbReDg@mail.gmail.com>

On Tue, Sep 27, 2011 at 11:16:02AM -0400, Pavel Ivanov wrote:
> On Tue, Sep 27, 2011 at 7:50 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Mon, Sep 26, 2011 at 06:04:06PM -0400, Pavel Ivanov wrote:
> >> On Mon, Sep 26, 2011 at 6:19 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> > In the rcu_check_extended_qs() function that is used to check
> >> > illegal uses of RCU under extended quiescent states, we look
> >> > at the local value of dynticks that is even if an extended
> >> > quiescent state or odd otherwise.
> >> >
> >> > We are looking at it without disabling the preemption though
> >> > and this opens a race window where we may read the state of
> >> > a remote CPU instead, like in the following scenario:
> >> >
> >> > CPU 1                                                        CPU 2
> >> >
> >> > bool rcu_check_extended_qs(void)
> >> > {
> >> >        struct rcu_dynticks *rdtp;
> >> >
> >> >        rdtp = &per_cpu(rcu_dynticks,
> >> >                        raw_smp_processor_id());
> >> >
> >> >        < ---- Task is migrated here ---- >
> >> >
> >> >        // CPU 1 goes idle and increase rdtp->dynticks
> >> >                                                        /* Here we are reading the value for
> >> >                                                        the remote CPU 1 instead of the local CPU */
> >> >                                                        if (atomic_read(&rdtp->dynticks) & 0x1)
> >> >                                                                return false;
> >> >
> >> >                                                        return true;
> >> >                                                        }
> >> >
> >> > The possible result of this is false positive return value of that
> >> > function, suggesting we are in an extended quiescent state in random
> >> > places.
> >>
> >> How is this different from what your patch allows?
> >>
> >> CPU 1                                               CPU 2
> >>
> >> bool rcu_check_extended_qs(void)
> >> {
> >>        struct rcu_dynticks *rdtp =
> >>                &get_cpu_var(rcu_dynticks);
> >>        bool ext_qs = true;
> >>
> >>        if (atomic_read(&rdtp->dynticks) & 0x1)
> >>                ext_qs = false;
> >>
> >>        put_cpu_var(rcu_dynticks);
> >>
> >>        < ---- Task is migrated here ---- >
> >>
> >>                                                   /* Here we return true/false
> >>                                                      based on the value for the
> >>                                                      remote CPU 1 instead of the
> >>                                                      local CPU */
> >>
> >>                                                   return ext_qs;
> >>                                                   }
> >>
> >>
> >> Looks like it can result in the same false positive result.
> >
> > Why?
> >
> > While calling rcu_read_lock(), the task can still be migrated
> > for example from CPU 0 to CPU 1, until we do our check
> > in rcu_check_extended_qs() with preemption disabled. But that's
> > not a problem, we are going to do our check either in CPU 0 or
> > CPU 1, that's doesn't matter much.
> 
> I don't have sources at hand now to check all call sites of
> rcu_check_extended_qs(). But are you saying that
> rcu_check_extended_qs() is always called after rcu_read_lock() ? If
> that's the case then preemption is already disabled, right? Otherwise
> I don't understand your reasoning here.

For kernels built with CONFIG_TREE_PREEMPT_RCU or CONFIG_TINY_PREEMPT_RCU,
rcu_read_lock() does not imply preempt_disable().  So you really can be
in an RCU read-side critical section without having preemption disabled.

> > What matters is that we do that check by ensuring we are really
> > checking the value of the cpu var in the CPU we are currently
> > running and not some other random one that can change its dynticks
> > value at the same time.
> 
> Define the "CPU we are currently running on" in this context. Is it
> CPU executing call to rcu_check_extended_qs() or is it CPU executing
> return from rcu_check_extended_qs() ? These CPUs can be different both
> before your patch and after that. And function can return extended_qs
> state from either of these CPUs again both before and after the patch.
> If the calling code wants these CPUs to be the same it has to disable
> preemption before making the call. And if it does so then additional
> preemption disabling inside the function is pointless.
> 
> All your patch does is changing possible scenarios of preemption...
> Wait a minute... Is that the whole point of the patch? If CPU where
> rcu_check_extended_qs() was called is in an extended qs then function
> cannot be preempted and so it will correctly return true from current
> CPU. If CPU where rcu_check_extended_qs() was called is not in
> extended qs then function can be preempted and migrated. But CPU where
> it migrates to won't be in extended qs either and thus function will
> correctly return false no matter which per_cpu variable is read. Is my
> understanding correct now? If so it's better to be explained in the
> commit log.
> 
> BTW, isn't it possible for CPU to wake up from extended qs after some
> interrupt coming in the middle of the function?

Indeed this can happen.  The calls to rcu_irq_enter() and
rcu_irq_exit() handle this transition from dyntick-idle extended
quiescent state to RCU being usable in the interrrupt handler.

							Thanx, Paul

next prev parent reply	other threads:[~2011-09-27 16:03 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-26 10:19 [PATCH 0/7 v4] rcu: Fix some rcu uses in extended quiescent state Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 1/7] rcu: Fix preempt-unsafe debug check of rcu " Frederic Weisbecker
2011-09-26 22:04   ` Pavel Ivanov
2011-09-27 11:50     ` Frederic Weisbecker
2011-09-27 15:16       ` Pavel Ivanov
2011-09-27 16:01         ` Paul E. McKenney [this message]
2011-09-27 21:44         ` Frederic Weisbecker
2011-09-28  3:17           ` Yong Zhang
2011-09-28 12:44             ` Frederic Weisbecker
2011-09-28  3:52           ` Pavel Ivanov
2011-09-28 12:46             ` Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 2/7] rcu: Fix early call to rcu_enter_nohz() on tick stopping Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 3/7] nohz: Separate out irq exit and idle loop dyntick logic Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 4/7] nohz: Allow rcu extended quiescent state handling seperately from tick stop Frederic Weisbecker
2011-09-26 10:44   ` Peter Zijlstra
2011-09-26 16:02     ` Paul E. McKenney
2011-09-26 16:06       ` Peter Zijlstra
2011-09-26 16:32         ` Paul E. McKenney
2011-09-26 17:06     ` Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 5/7] x86: Enter rcu extended qs after idle notifier call Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 6/7] x86: Call idle notifier after irq_enter() Frederic Weisbecker
2011-09-26 10:19 ` [PATCH 7/7] rcu: Fix early call to rcu_irq_exit() Frederic Weisbecker
2011-09-26 18:26 ` [PATCH 0/7 v4] rcu: Fix some rcu uses in extended quiescent state Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110927160100.GC2335@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=fweisbec@gmail.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paivanof@gmail.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.