From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752685AbdEHWgG (ORCPT <rfc822;w@1wt.eu>);
        Mon, 8 May 2017 18:36:06 -0400
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:34399 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S1750941AbdEHWgE (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 8 May 2017 18:36:04 -0400
Date: Mon, 8 May 2017 15:36:00 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>, Petr Mladek <pmladek@suse.com>,
        Jessica Yu <jeyu@redhat.com>, Jiri Kosina <jikos@kernel.org>,
        Miroslav Benes <mbenes@suse.cz>, live-patching@vger.kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] livepatch/rcu: Warn when system consistency is
 broken in RCU code
Reply-To: paulmck@linux.vnet.ibm.com
References: <1493895316-19165-1-git-send-email-pmladek@suse.com>
 <1493895316-19165-3-git-send-email-pmladek@suse.com>
 <20170508165108.d3vd4h6ffa25bfui@treble>
 <20170508151322.76e8e9db@gandalf.local.home>
 <20170508194729.jjq7qrc7gkiq2s5v@treble>
 <20170508201558.GD3956@linux.vnet.ibm.com>
 <20170508204333.xc3isvr4riv26his@treble>
 <20170508210754.GE3956@linux.vnet.ibm.com>
 <20170508221609.roaeaidj7mpfozcq@treble>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170508221609.roaeaidj7mpfozcq@treble>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-GCONF: 00
x-cbid: 17050822-0044-0000-0000-000003259C0C
X-IBM-SpamModules-Scores: 
X-IBM-SpamModules-Versions: BY=3.00007035; HX=3.00000240; KW=3.00000007;
 PH=3.00000004; SC=3.00000209; SDB=6.00857934; UDB=6.00425024; IPR=6.00637369;
 BA=6.00005334; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000;
 ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015363; XFM=3.00000015;
 UTC=2017-05-08 22:36:01
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 17050822-0045-0000-0000-00000753A6ED
Message-Id: <20170508223600.GH3956@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-05-08_16:,,
 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0
 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam
 adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000
 definitions=main-1705080119
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 08, 2017 at 05:16:09PM -0500, Josh Poimboeuf wrote:
> On Mon, May 08, 2017 at 02:07:54PM -0700, Paul E. McKenney wrote:
> > On Mon, May 08, 2017 at 03:43:33PM -0500, Josh Poimboeuf wrote:
> > > On Mon, May 08, 2017 at 01:15:58PM -0700, Paul E. McKenney wrote:
> > > > On Mon, May 08, 2017 at 02:47:29PM -0500, Josh Poimboeuf wrote:
> > > > > On Mon, May 08, 2017 at 03:13:22PM -0400, Steven Rostedt wrote:
> > > > 
> > > > [ . . . ]
> > > > 
> > > > > > If rcu is not watching, calling rcu_enter_irq() will have it watch
> > > > > > again. Even in NMI context I believe.
> > > > > 
> > > > > What if you get an NMI while running in rcu_dynticks_eqs_enter() before
> > > > > it increments rdtp->dynticks?  Will rcu_enter_irq() still work from the
> > > >                                       rcu_irq_enter()
> > > > > NMI?
> > > > 
> > > > The rcu_nmi_enter() function will notice that RCU is not watching, and
> > > > will therefore atomically increment RCU's dynticks-idle counter, which
> > > > will be atomically incremented again upon return.  Since the bottom bit
> > > > of this counter controls whether or not RCU is watching, RCU will be
> > > > watching during the NMI, will stop watching upon return from the NMI,
> > > > which restores state so as to allow rcu_irq_enter() to cause RCU to once
> > > > again watch.  (NMI algorithm due to Andy Lutomirski.)
> > > > 
> > > > > I'm just trying to understand what are the cases where rcu_enter_irq()
> > > > > *doesn't* work from an ftrace handler.
> > > > 
> > > > It doesn't work from an NMI handler.  Aside from possible architecture
> > > > specific special cases, it should work everywhere else.
> > > 
> > > Ok, so just to clarify.  Is there a bug in the ftrace stack tracer in
> > > the following situation?
> > > 
> > > 1. RCU isn't watching
> > > 2. An NMI hits
> > > 3. ist_enter() calls into the ftrace stack tracer, before
> > >    rcu_nmi_enter() is called, so RCU isn't watching yet
> > > 4. The ftrace stack tracer calls rcu_irq_enter(), which has no effect,
> > >    so RCU still isn't watching
> > > 5. Hilarity ensues in the ftrace stack tracer
> > 
> > This would be a problem if step 2's NMI hit rcu_irq_enter(),
> > rcu_irq_exit(), and friends in just the wrong place.
> > 
> > I would suggest that ftrace() do something like this...
> > 
> > 	if (in_nmi())
> > 		rcu_nmi_enter();
> > 	else
> > 		rcu_irq_enter();
> > 
> > Except that, as Steven will quickly point out, this won't work at the
> > very edges of the NMI, when NMI_MASK won't be set in preempt_count().
> > 
> > Other thoughts?
> 
> Ok.  So I think the livepatch ftrace handler would need the in_nmi()
> check, in case it's called early in the NMI.
> 
> But on x86, rcu_nmi_enter() is also called in some non-NMI exception
> cases, from ist_enter().  So it appears that the in_nmi() check wouldn't
> be sufficient.  We might instead need something like:
> 
> 	if (in_nmi() || in_some_other_exception())
> 		rcu_nmi_enter();
> 	else
> 		rcu_irq_enter();
> 
> But unfortunately the in_some_other_exception() function doesn't
> currently exist.
> 
> So, one more question.  Would it work if we just always called
> rcu_nmi_enter()?

I am a bit nervous about this.  It would -at- -least- be necessary to have
interrupts disabled throughout the entire time from the rcu_nmi_enter()
through the matching rcu_nmi_exit().  And there might be other failure
modes that I don't immediately see.

But do we really need this, given the in_nmi() check that Steven
pointed out?
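
For anyone following along, here is a rough userspace sketch of the
counter behavior described earlier in this thread.  It is a toy
single-CPU model, not the kernel implementation: every model_* name is
made up for illustration, and a plain bool stands in for NMI_MASK in
preempt_count().  It shows only that the bottom bit of the counter flips
on the outermost NMI-style entry and flips back on the matching exit, and
that a check along the lines of in_nmi() picks the irq path whenever that
flag has not been set yet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy single-CPU model.  Every model_* name is hypothetical, not a kernel API. */
static atomic_ulong model_dynticks;	/* bottom bit set => RCU is watching */
static long model_nmi_nesting;		/* 0 when outside any NMI-style entry */
static bool model_nmi_flag;		/* stands in for NMI_MASK in preempt_count() */

enum model_path { MODEL_PATH_IRQ, MODEL_PATH_NMI };

static bool model_rcu_is_watching(void)
{
	return atomic_load(&model_dynticks) & 1UL;
}

/* NMI-style entry: start watching only if RCU was not already watching. */
static void model_nmi_enter(void)
{
	long incby = 2;

	if (!model_rcu_is_watching()) {
		atomic_fetch_add(&model_dynticks, 1);	/* bottom bit becomes 1 */
		incby = 1;	/* marks "this entry turned watching on" */
	}
	model_nmi_nesting += incby;
}

/* NMI-style exit: stop watching only when undoing the entry that started it. */
static void model_nmi_exit(void)
{
	assert(model_nmi_nesting > 0);
	if (model_nmi_nesting != 1) {
		model_nmi_nesting -= 2;	/* nested, or RCU was already watching */
		return;
	}
	model_nmi_nesting = 0;
	atomic_fetch_add(&model_dynticks, 1);	/* bottom bit back to 0 */
}

/* Which entry function a check along the lines of in_nmi() would select. */
static enum model_path model_pick_path(void)
{
	return model_nmi_flag ? MODEL_PATH_NMI : MODEL_PATH_IRQ;
}
```

Starting from the not-watching state, one model_nmi_enter() makes
model_rcu_is_watching() return true, a nested enter/exit pair leaves it
true, and the final model_nmi_exit() returns it to false.  The model runs
in a single thread, so it says nothing about what happens if an interrupt
arrives between the enter and the exit, which is the part I am nervous
about.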

							Thanx, Paul