From mboxrd@z Thu Jan 1 00:00:00 1970 From: Anton Altaparmakov Subject: Re: fishy ->put_inode usage in ntfs Date: Thu, 14 Oct 2004 13:39:30 +0100 Sender: linux-fsdevel-owner@vger.kernel.org Message-ID: <1097757569.21275.40.camel@imp.csi.cam.ac.uk> References: <20041014112607.GA24508@lst.de> Mime-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Cc: aia21@cantab.net, ntfs-dev , linux-fsdevel@vger.kernel.org Return-path: Received: from ppsw-4.csi.cam.ac.uk ([131.111.8.134]:42956 "EHLO ppsw-4.csi.cam.ac.uk") by vger.kernel.org with ESMTP id S261451AbUJNMjk (ORCPT ); Thu, 14 Oct 2004 08:39:40 -0400 To: Christoph Hellwig In-Reply-To: <20041014112607.GA24508@lst.de> List-Id: linux-fsdevel.vger.kernel.org Hi Christoph, On Thu, 2004-10-14 at 12:26, Christoph Hellwig wrote: > the inode->i_count useage in ntfs_put_inode is possibly ract vecause we > could get anoher reference while you're looking at it. Hm. For the directory inode case I don't think this is a problem because all places that use NTFS_I(vfs inode)->itype.index.bmp_ino hold i_sem of the vfs inode which guarantees that at least someone is holding a reference on the vfs inode which in turn guarantees that the i_count cannot drop to levels that would affect ntfs_put_inode(). Explanation: The person who holds i_sem also has 1 i_count and if bmp_ino is set this also has 1 i_count on its parent inode thus we now have a minimum i_count of 2. To get into ntfs_put_inode() at this point in time someone would have to do iput() and they better have 1 i_count before they do the iput() so we now have a minimum i_count of 3 and no code in ntfs_put_inode() would trigger. Only when the last external user of the inode does iput(), ntfs_put_inode() is called with i_count of 2. ntfs_put_inode() then iput()s the bmp_ino attribute inode. This in turn does iput() on the parent inode which drops i_count once more and now clear_inode() will be called by the VFS. We cannot move this code to clear_inode() as it would never get called since the i_count would never drop to 0 due to the existence of the bmp_ino attribute inode which in turn will never have its i_count drop to zero due to it being attached to the parent inode. Catch 22. If someone does get a new reference while we are running the above this is also fine as all users of NTFS_I(vfs inode)->itype.index.bmp_ino check bmp_ino for being NULL and if it is they do an ntfs_attr_iget() and attach it on NTFS_I(vfs inode)->itype.index.bmp_ino again. The reason for doing this is to not to have to ntfs_attr_iget() the bmp_ino every time we need to use it (e.g. every ->readdir() invocation). So it may look odd but it is a speedup worth doing IMO. Hm, I can now see that there is a small race window here and this is simply fixed by doing the setting to NULL of bmp_ino before doing the iput() of bmp_ino. Thanks for pointing this problem area out to! (-: > For the index inode case moving this to clear_inode (which is called > after we released the last reference) should be fine, Indeed. I will move it. Thanks! (-: > but I don't understand the case for directories - why are you checking > for i_count beeing 2 there and not one? I hope I have explained this above. If I failed please do ask again... Best regards, Anton -- Anton Altaparmakov (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/, http://www-stu.christs.cam.ac.uk/~aia21/