From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jonathan E Brassow Date: Mon, 26 Feb 2007 10:11:49 -0600 Subject: [Cluster-devel] [GFS2 Patch] pass formal ino in do_filldir_main In-Reply-To: <1172504728.11001.614.camel@quoit.chygwyn.com> References: <1170823892.3414.3.camel@d-cthon07-w-63.Connectathon.ORG> <1170845370.11001.417.camel@quoit.chygwyn.com> <1170887711.7725.14.camel@d-cthon07-w-63.Connectathon.ORG> <1170927758.11001.467.camel@quoit.chygwyn.com> <45DFB779.9000100@redhat.com> <1172504728.11001.614.camel@quoit.chygwyn.com> Message-ID: List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Please note, I haven't been following current GFS2 development... On Feb 26, 2007, at 9:45 AM, Steven Whitehouse wrote: > Hi, > > On Fri, 2007-02-23 at 22:56 -0500, Wendy Cheng wrote: >> Steve, >> >> I gave this issue some more thoughts - would like to suggest we take >> this patch (at least for now) since it aligns with current code base. >> >> The no_formal_ino is apparently intented to get returned back to user >> space due to its unique-ness (and we have to trust pick_formal_ino() >> does its job right). With normal readdir system call, after the inode >> number is sent to user space, there is no route (I've checked) for it >> to >> come back to kernel. So the only user that would use these filldir ino >> inside the kernel is NFSD. NFSD calls gfs2_ilookup() that requires >> no_formal_ino. >> > no_formal_ino is not ever guaranteed to be unique which is why I want > to > keep it away from userspace. I don't really want to have to change that > in order to fix NFS. Why is it not unique? (from Ken's doc on GFS2: "If the filesystem allocated 1 billion inodes per second, it would take over 500 years for roll-over.") > As you say there is no route for the inode number returned via readdir > to come back to the kernel, but it should match the inode number > returned by stat and I don't see that we should change either of them > just to get NFS to work when userspace is perfectly ok as it is. There were alot of things done to make NFS work properly with GFS - many of them through great pain, because GFS is a clustered FS. >> If you look further... The current lookup code actually uses >> no_formal_ino, not no_addr. The two main "gate" routines that controls >> ino-to-inode conversion are: >> >> * gfs2_ilookup() (used by NFS route) >> * gfs2_inode_lookup() (used by VFS that calls gfs2_lookup()). >> > Neither of them need to use no_formal_ino at all. The lookup is using > no_addr as there is no other index which makes sense. I know that > no_formal_ino is used as part of the hash, but it shouldn't be really > and its trivial to change that. You can't swap to just using > no_formal_ino on its own as there is no mapping to find its on-disk > location from no_formal_ino. > >> Both use no_formal_ino - gfs2_inode_lookup() logic hides this behind >> the >> little wrapper "gfs2_iget()". Since current VFS side's lookup has been >> working fine, this no_formal_ino idea apparently is working. So let's >> make NFSD side work the *same* way. I'm convinced this patch does a >> right thing. >> >> I don't dispute using generation number may not be a bad idea and may >> perform better. However, if we really look into the details, it is not >> easy for current code structure. Since we have something already >> working, it is not wise to mess this code layout at this moment. >> >> Make sense ? >> >> -- Wendy >> > I'm afraid that I'm still not convinced on that. It seems that ext3 > uses > the method that I'm proposing in that it looks up the inode by inode > number and then only afterwards checks the generation number. It also > has the advantage of separating the NFS interface from the rest of the > filesystem. That seems to me to be a lot of the problem that we are > looking at here in that the VFS's lookup function wants to look up by > no_addr and (currently) NFS is asking for something slightly different, > How do you plan on handling inode migration when using NFS? I believe that was one of the central reasons for having no_formal_inode. There may be other reasons as well, as I am not familiar with the code. brassow