From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752087AbdBIJCh (ORCPT );
	Thu, 9 Feb 2017 04:02:37 -0500
Received: from zeniv.linux.org.uk ([195.92.253.2]:39504 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751777AbdBIJCd (ORCPT );
	Thu, 9 Feb 2017 04:02:33 -0500
Date: Thu, 9 Feb 2017 08:40:16 +0000
From: Al Viro
To: Konstantin Khlebnikov
Cc: Andrew Morton, Konstantin Khlebnikov, "Eric W. Biederman",
	Linux Kernel Mailing List
Subject: Re: [PATCH] proc/sysctl: drop unregistered stale dentries as soon as possible
Message-ID: <20170209084016.GL13195@ZenIV.linux.org.uk>
References: <148655090380.421415.14305642138058304882.stgit@buzz>
	<20170208134804.5662cddf3a269eb8acb0aa8a@linux-foundation.org>
	<20170209035307.GK13195@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 09, 2017 at 10:36:15AM +0300, Konstantin Khlebnikov wrote:

> Ok, Thank you. I've expected that this fix isn't sane,
>
> Maybe we could minimize changes for now. For example: keep these
> stale dentries in memory but silently unhash them in ->d_compare().
> Memory processure and reclaimer will kill them later.

->d_compare() is called by the code walking the hash chains.  What's worse,
in the most common case all we have is rcu_read_lock().  Modifying the
chain in rcu reader is no-go.  Turning __d_lookup_rcu() into a writer on
the off-chance that we'll walk onto a visibly stale sysctl dentry - even
more so.

If you want to deal with that, do it right, please.
Have sysctl inodes on a list of some kind anchored in struct ctl_table_header;
insert them there in proc_sys_make_inode(), remove - in proc_evict_inode()
(or have it pass the inode to sysctl_head_put() and do the removal there).
Use sysctl_lock for serialization.

In start_unregistering(), just before the erase_header() call, check if
the list is non-empty and if it is -
	grab sysctl_lock
	last = NULL
	walk the list
		igrab(inode we are looking at)
		if succeeded
			drop sysctl_lock
			iput(last)
			last = that inode
			d_prune_aliases(last)
			retake sysctl_lock
			// inode is still not evicted, so it's still on the list
	drop sysctl_lock
	iput(last)

list would pass through struct proc_inode, and I would probably use hlist
rather than the normal one; might be more convenient to initialize that way.
Getting from containing struct proc_inode to inode - &ei->vfs_inode.

It's not that much work; if you have time - go for it, or remind me
after -rc1...
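
The walk outlined above can be modelled in plain userspace C. What follows is
only a sketch under loud assumptions, not kernel code: every model_* name is a
hypothetical stand-in (model_lock()/model_unlock() for sysctl_lock,
model_igrab()/model_iput() for igrab()/iput(), model_prune_aliases() for
d_prune_aliases()), the dentry state is collapsed into a single has_alias
flag, and the real functions do far more than this. The point it tries to
show is why the lock is dropped before iput() and the prune: the final
reference drop evicts the inode, and eviction takes the same lock to unlink
it from the list.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; every model_* name is a hypothetical stand-in. */
struct model_inode {
	int count;                         /* references held, like i_count */
	int freeing;                       /* eviction started: igrab fails */
	int has_alias;                     /* one unused dentry pins the inode */
	int evicted;
	struct model_inode *next, **pprev; /* hlist-style linkage */
};

struct model_header {
	struct model_inode *inodes;        /* list anchored in the header */
};

static int model_locked;                   /* stands in for sysctl_lock */

static void model_lock(void)   { assert(!model_locked); model_locked = 1; }
static void model_unlock(void) { assert(model_locked);  model_locked = 0; }

static void model_list_add(struct model_header *h, struct model_inode *i)
{
	i->next = h->inodes;
	if (i->next)
		i->next->pprev = &i->next;
	h->inodes = i;
	i->pprev = &h->inodes;
}

static struct model_inode *model_igrab(struct model_inode *i)
{
	if (i->freeing)
		return NULL;
	i->count++;
	return i;
}

static void model_iput(struct model_inode *i)
{
	if (!i || --i->count > 0)
		return;
	/* Last reference: evict, unlinking from the list under the lock. */
	i->freeing = 1;
	model_lock();                      /* asserts if the caller holds it */
	*i->pprev = i->next;
	if (i->next)
		i->next->pprev = i->pprev;
	model_unlock();
	i->evicted = 1;
}

static void model_prune_aliases(struct model_inode *i)
{
	assert(!model_locked);             /* must run without the lock */
	if (i->has_alias) {
		i->has_alias = 0;
		model_iput(i);             /* the dentry drops its reference */
	}
}

/* The walk from the mail, step for step. */
static void model_prune_header(struct model_header *head)
{
	struct model_inode *last = NULL, *pos;

	model_lock();
	for (pos = head->inodes; pos; pos = pos->next) {
		if (!model_igrab(pos))
			continue;          /* already being evicted, skip */
		model_unlock();
		model_iput(last);          /* may evict the previous inode */
		last = pos;
		model_prune_aliases(last);
		model_lock();
		/* we hold a reference, so pos is still on the list */
	}
	model_unlock();
	model_iput(last);
}

/* Two live inodes plus one already being freed; returns what is left. */
int model_run_example(void)
{
	struct model_inode a = { .count = 1, .has_alias = 1 };
	struct model_inode b = { .freeing = 1 };
	struct model_inode c = { .count = 1, .has_alias = 1 };
	struct model_header head = { NULL };
	struct model_inode *pos;
	int left = 0;

	model_list_add(&head, &a);
	model_list_add(&head, &b);
	model_list_add(&head, &c);
	model_prune_header(&head);
	assert(a.evicted && c.evicted && !b.evicted);
	for (pos = head.inodes; pos; pos = pos->next)
		left++;
	return left;
}
```

In this model model_run_example() returns 1: both live inodes get pruned and
evicted, and only the one whose eviction had already started (which the walk
skips because the grab fails) remains on the list. Calling model_iput() with
the lock still held trips the assertion in model_lock(), which is the
model's version of the deadlock the drop/retake dance avoids.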