From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752159AbcBOSD4 (ORCPT <rfc822;w@1wt.eu>);
	Mon, 15 Feb 2016 13:03:56 -0500
Received: from e19.ny.us.ibm.com ([129.33.205.209]:56560 "EHLO
	e19.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751949AbcBOSDy (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 15 Feb 2016 13:03:54 -0500
X-IBM-Helo: d01dlp03.pok.ibm.com
X-IBM-MailFrom: paulmck@linux.vnet.ibm.com
X-IBM-RcptTo: linux-kernel@vger.kernel.org
Date: Mon, 15 Feb 2016 10:03:51 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
        Ingo Molnar <mingo@kernel.org>, Josh Triplett <josh@joshtriplett.org>,
        Steven Rostedt <rostedt@goodmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
        Lai Jiangshan <jiangshanlai@gmail.com>, sasha.levin@oracle.com
Subject: Re: [RFC 0/6] Track RCU dereferences in RCU read-side critical
 sections
Message-ID: <20160215180351.GJ6719@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <1454517912-10457-1-git-send-email-boqun.feng@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1454517912-10457-1-git-send-email-boqun.feng@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16021518-0057-0000-0000-0000036DE2C8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 04, 2016 at 12:45:06AM +0800, Boqun Feng wrote:
> As a characteristic of RCU, read-side critical sections have a very
> loose connection with rcu_dereference()s: you can only be sure that an
> rcu_dereference() is called in some read-side critical section, but if
> the code gets complex, you may not be sure exactly which one. This may
> also be a problem for some other locking mechanisms, in that the
> critical sections protecting data and the protected data accesses are
> not clearly correlated.

Seeing no objections, I am queuing this series for review and testing.
The information it provides would have been extremely helpful to me
several times in the past!

							Thanx, Paul

> In this series, we introduce the LOCKED_ACCESS framework and, based on
> it, implement the RCU_LOCKED_ACCESS functionality, which gives us a
> clear hint as to which rcu_dereference() happens in which RCU read-side
> critical section.
> 
> The basic idea of LOCKED_ACCESS is to maintain a chain of the locks we
> have already acquired and, when a data access happens, correlate the
> data access with that chain.
> 
> Lockdep already has lock chains, but we introduce a new, similar
> concept: the acqchain. An acqchain is like a lock chain, except that
> the key of an acqchain is the hash sum of the acquire (instruction)
> positions of the locks in the chain, whereas the key of a lock chain is
> the hash sum of the class keys of the locks in the chain.
> 
> Acqchains are introduced because we want to correlate data accesses with
> critical sections and critical sections are better represented by the
> acquire positions rather than lock classes.
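
To make the difference between the two keys concrete, here is a minimal
userspace sketch. It is not the code from the series: mix_key(), struct held,
lock_chain_key() and acqchain_key() are hypothetical names, and the mixing
step is a stand-in for whatever hash lockdep and the patches actually use.
The only point is that two critical sections of the same lock class share a
lock chain key but get distinct acqchain keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mixing step, for illustration only; lockdep uses its own. */
static uint64_t mix_key(uint64_t key, uint64_t id)
{
	key ^= id;
	key *= 0x100000001b3ULL;	/* odd multiplier, so distinct inputs stay distinct */
	return key;
}

/* One held lock: its class and the place where it was acquired. */
struct held {
	uint64_t class_key;	/* identifies the lock class */
	uint64_t acquire_ip;	/* instruction address of the acquire */
};

/* Lock-chain style key: depends only on the classes of the held locks. */
static uint64_t lock_chain_key(const struct held *h, size_t n)
{
	uint64_t key = 0;

	assert(h != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		key = mix_key(key, h[i].class_key);
	return key;
}

/* Acqchain style key: depends on where each held lock was acquired. */
static uint64_t acqchain_key(const struct held *h, size_t n)
{
	uint64_t key = 0;

	assert(h != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		key = mix_key(key, h[i].acquire_ip);
	return key;
}
```

With one rcu_read_lock() taken at two different call sites (same class_key,
different acquire_ip), lock_chain_key() returns the same value for both while
acqchain_key() does not, which is why the acquire position is the better
handle on "which critical section".
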
> 
> The acqchain key of a task is maintained in the same way as lock chain
> keys in lockdep.
> 
> As in lockdep, LOCKED_ACCESS also classifies locks and data accesses
> into groups; the locked access class is introduced for this purpose. A
> locked access class also contains the data for allocation and lookup of
> acqchains and accesses, and the address of a locked access class is used
> as its key. By tagging locks and data accesses with these keys, we can
> describe which locks and data accesses are related.
> 
> The entry point of LOCKED_ACCESS is locked_access_point(). Calling
> locked_access_point() indicates that a data access happens, and after
> the call the data access is correlated with the current acqchain.
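
As a rough illustration of what "correlated with the current acqchain" could
mean, the sketch below records each unique (acqchain key, access site) pair
in a small table. Everything here is an assumption made for the example:
access_point(), struct access_class, struct access_record and the fixed-size
array are hypothetical and do not reflect the real locked_access_point()
signature or the allocation scheme in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RECORDS 64

/* One recorded relationship: an access seen under a given acqchain. */
struct access_record {
	uint64_t acqchain_key;	/* key of the chain held at access time */
	const char *file;	/* source file of the access */
	int line;		/* source line of the access */
};

/* Userspace stand-in for a locked access class (hypothetical layout). */
struct access_class {
	struct access_record rec[MAX_RECORDS];
	size_t nr;
};

/*
 * Hypothetical analogue of locked_access_point(): correlate one data
 * access with the current acqchain key. Returns 1 if a new relationship
 * was recorded, 0 if it was already known, -1 if the table is full.
 */
static int access_point(struct access_class *cls, uint64_t cur_key,
			const char *file, int line)
{
	assert(cls != NULL && file != NULL);

	for (size_t i = 0; i < cls->nr; i++) {
		const struct access_record *r = &cls->rec[i];

		if (r->acqchain_key == cur_key && r->line == line &&
		    strcmp(r->file, file) == 0)
			return 0;
	}
	if (cls->nr == MAX_RECORDS)
		return -1;

	cls->rec[cls->nr].acqchain_key = cur_key;
	cls->rec[cls->nr].file = file;
	cls->rec[cls->nr].line = line;
	cls->nr++;
	return 1;
}
```

Recording each pair only once keeps the table bounded by the number of
distinct relationships rather than by the number of accesses executed.
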
> 
> We also provide a /proc filesystem interface to show the information
> we've collected: for each locked access class with the name <name>, there
> is a file at /proc/locked_access/<name> showing all the relationships
> collected so far for that locked access class.
> 
> Based on LOCKED_ACCESS, we implement RCU_LOCKED_ACCESS, which tracks
> rcu_dereference()s inside RCU read-side critical sections.
> 
> This patchset is based on v4.5-rc2 and consists of 6 patches (of which
> patches 2-5 are the implementation of LOCKED_ACCESS):
> 
> 1.	Introduce some functions of irq_context.
> 
> 2.	Introduce locked access class and acqchain.
> 
> 3.	Maintain the keys of acqchains.
> 
> 4.	Introduce the entry point of LOCKED_ACCESS.
> 
> 5.	Add proc interface for locked access class.
> 
> 6.	Enable LOCKED_ACCESS for RCU.
> 
> Tested by 0day, and I also did a simple test on x86: I built and booted
> a kernel with RCU_LOCKED_ACCESS=y and CONFIG_PROVE_LOCKING=y and ran
> several workloads (kernel building, git cloning, dbench). No problems
> were observed, and /proc/locked_access/rcu was able to collect the
> relationships between ~300 RCU read-side critical sections and ~500
> rcu_dereference*().
> 
> Snippets of /proc/locked_access/rcu are as follows:
> 
> ...(this rcu_dereference() happens after one rcu_read_lock())
> ...
> ACQCHAIN 0xfdbf0c6aeea, 1 locks, irq_context 0:
>   LOCK at [<ffffffff812b1115>] get_proc_task_net+0x5/0x140
>     ACCESS TYPE 1 at kernel/pid.c:441
> ...
> ...(this rcu_dereference() happens after three rcu_read_lock())
> ...
> ACQCHAIN 0xfe042af3bbfb2605, 3 locks, irq_context 0:
>   LOCK at [<ffffffff81094b47>] SyS_kill+0x97/0x2a0
>     LOCK at [<ffffffff8109286f>] kill_pid_info+0x1f/0x140
>       LOCK at [<ffffffff81092605>] group_send_sig_info+0x5/0x130
>         ACCESS TYPE 1 at kernel/signal.c:695
> ...
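
For anyone who wants to post-process this output, the layout of the snippets
above can be reproduced by a small formatter along the following lines. This
is only a sketch inferred from the two snippets: format_acqchain() and struct
lock_site are hypothetical, the symbol text is passed in already resolved,
and the actual format strings in the proc interface patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One LOCK line: acquire address plus pre-resolved "symbol+off/size" text. */
struct lock_site {
	uint64_t ip;
	const char *sym;
};

/*
 * Format one acqchain and one access under it, mirroring the snippet
 * layout: each nested LOCK is indented two more spaces, and the ACCESS
 * line sits one level deeper than the innermost LOCK.
 * Returns the number of characters written, or -1 on truncation.
 */
static int format_acqchain(char *buf, size_t len, uint64_t key,
			   int irq_context,
			   const struct lock_site *locks, int nr_locks,
			   int type, const char *file, int line)
{
	size_t off = 0;
	int n;

	assert(buf != NULL && nr_locks >= 0);

	n = snprintf(buf, len, "ACQCHAIN 0x%llx, %d locks, irq_context %d:\n",
		     (unsigned long long)key, nr_locks, irq_context);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off += (size_t)n;

	for (int i = 0; i < nr_locks; i++) {
		n = snprintf(buf + off, len - off,
			     "%*sLOCK at [<%016llx>] %s\n", 2 * (i + 1), "",
			     (unsigned long long)locks[i].ip, locks[i].sym);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "%*sACCESS TYPE %d at %s:%d\n",
		     2 * (nr_locks + 1), "", type, file, line);
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	off += (size_t)n;

	return (int)off;
}
```

The indentation depth doubles as a visual nesting level, so the innermost
LOCK line is the critical section that directly encloses the access.
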
> 
> Looking forward to any suggestions, comments, and questions ;-)
> 
> Regards,
> Boqun
>