cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Jonathan E Brassow <jbrassow@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 Patch] pass formal ino in do_filldir_main
Date: Mon, 26 Feb 2007 10:11:49 -0600	[thread overview]
Message-ID: <fffab4af9d51d64e88b9f0ce0660406b@redhat.com> (raw)
In-Reply-To: <1172504728.11001.614.camel@quoit.chygwyn.com>

Please note, I haven't been following current GFS2 development...

On Feb 26, 2007, at 9:45 AM, Steven Whitehouse wrote:

> Hi,
>
> On Fri, 2007-02-23 at 22:56 -0500, Wendy Cheng wrote:
>> Steve,
>>
>> I gave this issue some more thoughts - would like to suggest we take
>> this patch (at least for now) since it aligns with current code base.
>>
>> The no_formal_ino is apparently intented to get returned back to user
>> space due to its unique-ness (and we have to trust pick_formal_ino()
>> does its job right). With normal readdir system call, after the inode
>> number is sent to user space, there is no route (I've checked) for it 
>> to
>> come back to kernel. So the only user that would use these filldir ino
>> inside the kernel is NFSD. NFSD calls gfs2_ilookup() that requires
>> no_formal_ino.
>>
> no_formal_ino is not ever guaranteed to be unique which is why I want 
> to
> keep it away from userspace. I don't really want to have to change that
> in order to fix NFS.

Why is it not unique?  (from Ken's doc on GFS2:  "If the filesystem 
allocated 1 billion inodes per second, it would take over 500 years for 
roll-over.")

> As you say there is no route for the inode number returned via readdir
> to come back to the kernel, but it should match the inode number
> returned by stat and I don't see that we should change either of them
> just to get NFS to work when userspace is perfectly ok as it is.

There were alot of things done to make NFS work properly with GFS - 
many of them through great pain, because GFS is a clustered FS.

>> If you look further... The current lookup code actually uses
>> no_formal_ino, not no_addr. The two main "gate" routines that controls
>> ino-to-inode conversion are:
>>
>> * gfs2_ilookup() (used by NFS route)
>> * gfs2_inode_lookup() (used by VFS that calls gfs2_lookup()).
>>
> Neither of them need to use no_formal_ino at all. The lookup is using
> no_addr as there is no other index which makes sense. I know that
> no_formal_ino is used as part of the hash, but it shouldn't be really
> and its trivial to change that. You can't swap to just using
> no_formal_ino on its own as there is no mapping to find its on-disk
> location from no_formal_ino.
>
>> Both use no_formal_ino - gfs2_inode_lookup() logic hides this behind 
>> the
>> little wrapper "gfs2_iget()". Since current VFS side's lookup has been
>> working fine, this no_formal_ino idea apparently is working. So let's
>> make NFSD side work the *same* way. I'm convinced this patch does a
>> right thing.
>>
>> I don't dispute using generation number may not be a bad idea and may
>> perform better. However, if we really look into the details, it is not
>> easy for current code structure. Since we have something already
>> working, it is not wise to mess this code layout at this moment.
>>
>> Make sense ?
>>
>> -- Wendy
>>
> I'm afraid that I'm still not convinced on that. It seems that ext3 
> uses
> the method that I'm proposing in that it looks up the inode by inode
> number and then only afterwards checks the generation number. It also
> has the advantage of separating the NFS interface from the rest of the
> filesystem. That seems to me to be a lot of the problem that we are
> looking at here in that the VFS's lookup function wants to look up by
> no_addr and (currently) NFS is asking for something slightly different,
>

How do you plan on handling inode migration when using NFS?  I believe 
that was one of the central reasons for having no_formal_inode.  There 
may be other reasons as well, as I am not familiar with the code.

  brassow



  reply	other threads:[~2007-02-26 16:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-07  4:51 [Cluster-devel] [GFS2 Patch] pass formal ino in do_filldir_main Wendy Cheng
2007-02-07 10:49 ` Steven Whitehouse
2007-02-07 22:35   ` Wendy Cheng
2007-02-07 23:37     ` Wendy Cheng
2007-02-08  9:42     ` Steven Whitehouse
2007-02-24  3:56       ` Wendy Cheng
2007-02-26 15:45         ` Steven Whitehouse
2007-02-26 16:11           ` Jonathan E Brassow [this message]
2007-02-26 17:16             ` Steven Whitehouse
2007-02-27 20:04               ` Wendy Cheng
2007-02-28  9:03                 ` Steven Whitehouse
2007-02-28 16:24                   ` Wendy Cheng
2007-02-28 16:26                     ` Wendy Cheng
2007-02-28 16:46                     ` Steven Whitehouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fffab4af9d51d64e88b9f0ce0660406b@redhat.com \
    --to=jbrassow@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).