From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754443AbaEHQxH (ORCPT ); Thu, 8 May 2014 12:53:07 -0400
Received: from fw-tnat.austin.arm.com ([217.140.110.23]:32506 "EHLO
	collaborate-mta1.arm.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
	with ESMTP id S1751885AbaEHQxF (ORCPT );
	Thu, 8 May 2014 12:53:05 -0400
Date: Thu, 8 May 2014 17:52:17 +0100
From: Catalin Marinas
To: "Paul E. McKenney"
Cc: Jaegeuk Kim , Johannes Weiner ,
	"Linux Kernel, Mailing List" , "linux-mm@kvack.org"
Subject: Re: [BUG] kmemleak on __radix_tree_preload
Message-ID: <20140508165217.GI17344@arm.com>
References: <20140501184112.GH23420@cmpxchg.org>
	<1399431488.13268.29.camel@kjgkr>
	<20140507113928.GB17253@arm.com>
	<1399540611.13268.45.camel@kjgkr>
	<20140508092646.GA17349@arm.com>
	<1399541860.13268.48.camel@kjgkr>
	<20140508102436.GC17344@arm.com>
	<20140508150026.GA8754@linux.vnet.ibm.com>
	<20140508152946.GA10470@localhost>
	<20140508155330.GE8754@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20140508155330.GE8754@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 08, 2014 at 04:53:30PM +0100, Paul E. McKenney wrote:
> On Thu, May 08, 2014 at 04:29:48PM +0100, Catalin Marinas wrote:
> > BTW, is it safe to have a union overlapping node->parent and
> > node->rcu_head.next? I'm still staring at the radix-tree code but a
> > scenario I have in mind is that call_rcu() has been raised for a few
> > nodes, other CPU may have some reference to one of them and set
> > node->parent to NULL (e.g. concurrent calls to radix_tree_shrink()),
> > breaking the RCU linking.
> > I can't confirm this theory yet ;)
>
> If this were reproducible, I would suggest retrying with non-overlapping
> node->parent and node->rcu_head.next, but you knew that already. ;-)

Reading the code, I'm less convinced about this scenario (though it's
worth checking without the union).

> But the usual practice would be to make node removal exclude shrinking.
> And the radix-tree code seems to delegate locking to the caller.
>
> So, is the correct locking present in the page cache? The radix-tree
> code seems to assume that all update operations for a given tree are
> protected by a lock global to that tree.

The calling code in mm/filemap.c holds mapping->tree_lock when deleting
radix-tree nodes, so no concurrent calls.

> Another diagnosis approach would be to build with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which would complain about double
> call_rcu() invocations. Rumor has it that it is necessary to turn off
> other kmem debugging for this to tell you anything -- I have seen cases
> where the kmem debugging obscures the debug-objects diagnostics.

Another test Jaegeuk could run (hopefully he has some time to look into
this).

Thanks for the suggestions.

-- 
Catalin
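
For anyone following the thread, the union scenario discussed above can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions, not the kernel code: toy_node, toy_rcu_head,
toy_call_rcu(), toy_pending() and toy_demo_overlap() are hypothetical
stand-ins, and the "callback list" is a plain singly linked list rather
than the real RCU machinery. It only shows that a late store to ->parent
on an already queued node would also overwrite ->rcu_head.next when the
two share storage.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rcu_head (hypothetical). */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Toy node: parent and rcu_head share storage, as in the union under discussion. */
struct toy_node {
	unsigned int count;
	union {
		struct toy_node *parent;      /* used while the node is in the tree */
		struct toy_rcu_head rcu_head; /* used once the node is queued for freeing */
	};
};

/* Stand-in for call_rcu(): push the head onto a pending-callback list. */
void toy_call_rcu(struct toy_rcu_head **list, struct toy_rcu_head *head)
{
	head->next = *list;
	*list = head;
}

/* Count the callbacks still reachable from the list head. */
size_t toy_pending(const struct toy_rcu_head *list)
{
	size_t n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}

/* Queue three nodes, then do a late ->parent store on the list head's node. */
size_t toy_demo_overlap(void)
{
	struct toy_node a = {0}, b = {0}, c = {0};
	struct toy_rcu_head *pending = NULL;

	toy_call_rcu(&pending, &a.rcu_head);
	toy_call_rcu(&pending, &b.rcu_head);
	toy_call_rcu(&pending, &c.rcu_head);	/* list is now c -> b -> a */
	assert(toy_pending(pending) == 3);

	c.parent = NULL;	/* same storage as c.rcu_head.next: b and a are cut off */

	return toy_pending(pending);
}
```

Calling toy_demo_overlap() from a small main() shows how many callbacks
remain reachable after the store. With the caller holding a tree-wide lock
across all updates, as mm/filemap.c does with mapping->tree_lock, no such
late store should happen in the first place.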