From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Fri, 2 Dec 2016 11:48:04 +0100
From: Jan Kara
To: Miklos Szeredi
Cc: Jan Kara, Amir Goldstein, Eric Paris, linux-fsdevel, linux-kernel
Subject: Re: fsnotify_mark_srcu wtf?
Message-ID: <20161202104804.GC26086@quack2.suse.cz>
References: <20161102220851.GA1839@veci.piliscsaba.szeredi.hu>
 <20161105213411.GA32353@quack2.suse.cz>
 <20161109111005.GA32353@quack2.suse.cz>
 <20161110194625.GG31098@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-ID:

On Fri 02-12-16 09:26:51, Miklos Szeredi wrote:
> On Thu, Nov 10, 2016 at 8:46 PM, Jan Kara wrote:
> > On Wed 09-11-16 20:26:16, Amir Goldstein wrote:
> >> On Wed, Nov 9, 2016 at 1:10 PM, Jan Kara wrote:
> >
> >> > And this does not work as well... Fanotify must notify groups by their
> >> > priority so you cannot arbitrarily reorder ordering in which groups get
> >> > notified. I'm currently pondering on using mark refcount to pin it when
> >> > processing permission event but there are still some details to check.
> >> >
> >>
> >> All right, mark refcount sound like the proper solution.
> >
> > Except it doesn't quite work. We can pin the current marks by a refcount
> > but they can still be removed from the list so after we regain srcu lock,
> > we are not sure their ->next pointers still point to still allocated marks
> > :-| Sadly I realized this only after implementing all this.
>
> Hmm, how about this: when removing mark from inode, drop refcount. If
> refcount is zero can remove from list. Otherwise mark the mark "dead"
> and leave it on the list.
>
> And fsnotify can just skip dead marks.

I had this idea as well and when trying to implement this, I've stumbled
over some problems.
I think the biggest problem was that destruction of a notification
mark is a relatively complex operation (doing iput() for example) and quite
a few places dropping mark references are in a context where this can cause
problems. Also I don't want to defer iput() to a workqueue as that will
have unexpected consequences such as an unlinked watched inode lingering in
the system (possibly colliding with umount etc.).

								Honza
-- 
Jan Kara
SUSE Labs, CR
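
For readers following the thread, the "dead mark" scheme proposed in the
quoted text can be sketched in userspace C roughly as below. This is only a
single-threaded toy under stated assumptions: struct mark, struct mark_list,
mark_add(), mark_get(), mark_put(), mark_detach() and notify_all() are
hypothetical names made up for illustration, not the kernel's fsnotify API,
and there is no SRCU or locking here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy mark; NOT the kernel's struct fsnotify_mark. */
struct mark {
	struct mark *next;
	int refcount;	/* one reference is held by the list itself */
	bool dead;	/* detached from the object while still pinned */
	int id;
};

struct mark_list {
	struct mark *head;
	int freed;	/* number of marks destroyed so far (illustration only) */
};

/* Allocate a mark and push it on the list; the list owns one reference. */
static struct mark *mark_add(struct mark_list *l, int id)
{
	struct mark *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->id = id;
	m->refcount = 1;
	m->next = l->head;
	l->head = m;
	return m;
}

/* Unlink the mark from the list and destroy it. */
static void mark_unlink_and_free(struct mark_list *l, struct mark *m)
{
	struct mark **pp = &l->head;

	while (*pp && *pp != m)
		pp = &(*pp)->next;
	assert(*pp == m);
	*pp = m->next;
	free(m);
	l->freed++;
}

/* Pin a mark, e.g. while a permission event is being processed. */
static void mark_get(struct mark *m)
{
	m->refcount++;
}

/* Drop a reference; whoever drops the last one does the destruction. */
static void mark_put(struct mark_list *l, struct mark *m)
{
	assert(m->refcount > 0);
	if (--m->refcount == 0)
		mark_unlink_and_free(l, m);
}

/*
 * "Removing the mark from the inode": flag it dead and drop the list's
 * reference. If nobody has it pinned it is unlinked and freed right away,
 * otherwise it stays on the list as a dead entry.
 */
static void mark_detach(struct mark_list *l, struct mark *m)
{
	assert(!m->dead);
	m->dead = true;
	mark_put(l, m);
}

/* Walk the list, skipping dead marks; returns how many would be notified. */
static int notify_all(const struct mark_list *l)
{
	int n = 0;

	for (const struct mark *m = l->head; m; m = m->next)
		if (!m->dead)
			n++;
	return n;
}
```

Note that in this toy the last mark_put() is where the mark is actually
destroyed, so the destruction runs in whatever context the final reference
happens to be dropped from. That is the spot the reply above is concerned
about.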