linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <matthew@wil.cx>
To: Valerie Henson <val_henson@linux.intel.com>
Cc: Ric Wheeler <ric@emc.com>,
	linux-fsdevel@vger.kernel.org,
	Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: topics for the file system mini-summit
Date: Thu, 1 Jun 2006 06:45:17 -0600	[thread overview]
Message-ID: <20060601124517.GC32143@parisc-linux.org> (raw)
In-Reply-To: <20060601032418.GM10420@goober>

On Wed, May 31, 2006 at 08:24:18PM -0700, Valerie Henson wrote:
> Actually, the continuation inode is in B.  When we create a link in
> directory A to file C, a continuation inode for directory A is created
> in domain B, and a block containing the link to file C is allocated
> inside domain B as well.  So there is no continuation inode in domain
> A.
> 
> That being said, this idea is at the hand-waving stage and probably
> has many other (hopefully non-fatal) flaws.  Thanks for taking a look!

OK, so we really have two kinds of continuation inodes, and it might be
sensible to name them differently.  We have "here's some extra data for
that inode over there" and "here's a hardlink from another domain".  I
dub the first one a 'continuation inode' and the second a 'shadow inode'.

Continuation inodes and shadow inodes both suffer from the problem
that they might be unwittingly orphaned, unless they have some kind of
back-link to their referrer.  That seems more soluble though.  The domain
B minifsck can check to see if the backlinked inode or directory is
still there.  If the domain A minifsck prunes something which has a link
to domain B, it should be able to just remove the continuation/shadow
inode there, without fscking domain B.

Another advantage to this is that inodes never refer to blocks outside
their zone, so we can forget about all this '64-bit block number' crap.
We don't even need 64-bit inode numbers -- we can use special direntries
for shadow inodes, and inodes which refer to continuation inodes need
a new encoding scheme anyway.  Normal inodes would remain 32-bit and
refer to the local domain, and shadow/continuation inode numbers would
be 32-bits of domain, plus 32-bits of inode within that domain.

So I like this ;-)

> > Surely XFS must have a more elegant solution than this?
> 
> val@goober:/usr/src/linux-2.6.16.19$ wc -l `find fs/xfs/ -type f`
> [snip]
>  109083 total

Well, yes.  I think that inside the Linux XFS implementation there's a
small and neat filesystem struggling to get out.  Once SGI finally dies,
perhaps we can rip out all the CXFS stubs and IRIX combatability.  Then
we might be able to see it.

For fun, if you're a masochist, try to follow the code flow for
something easy like fsync().

const struct file_operations xfs_file_operations = {
        .fsync          = xfs_file_fsync,
}

xfs_file_fsync(struct file *filp, struct dentry *dentry, int datasync)
{
        struct inode    *inode = dentry->d_inode;
        vnode_t         *vp = vn_from_inode(inode);
        int             error;
        int             flags = FSYNC_WAIT;

        if (datasync)
                flags |= FSYNC_DATA;
        VOP_FSYNC(vp, flags, NULL, (xfs_off_t)0, (xfs_off_t)-1, error);
        return -error;
}

#define _VOP_(op, vp)   (*((vnodeops_t *)(vp)->v_fops)->op)

#define VOP_FSYNC(vp,f,cr,b,e,rv)                                       \
        rv = _VOP_(vop_fsync, vp)((vp)->v_fbhv,f,cr,b,e)

vnodeops_t xfs_vnodeops = {
        .vop_fsync              = xfs_fsync,
}

Finally, xfs_fsync actually does the work.  The best bit about all this
abstraction is that there's only one xfs_vnodeops defined!  So this could
all be done with an xfs_file_fsync() that munged its parameters and called
xfs_fsync() directly.  That wouldn't even affect IRIX combatability,
but it would make life difficult for CXFS, apparently.

http://oss.sgi.com/projects/xfs/mail_archive/200308/msg00214.html

  reply	other threads:[~2006-06-01 12:45 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-25 21:44 topics for the file system mini-summit Ric Wheeler
2006-05-26 16:48 ` Andreas Dilger
2006-05-27  0:49   ` Ric Wheeler
2006-05-27 14:18     ` Andreas Dilger
2006-05-28  1:44       ` Ric Wheeler
2006-05-29  0:11 ` Matthew Wilcox
2006-05-29  2:07   ` Ric Wheeler
2006-05-29 16:09     ` Andreas Dilger
2006-05-29 19:29       ` Ric Wheeler
2006-05-30  6:14         ` Andreas Dilger
2006-06-07 10:10       ` Stephen C. Tweedie
2006-06-07 14:03         ` Andi Kleen
2006-06-07 18:55         ` Andreas Dilger
2006-06-01  2:19 ` Valerie Henson
2006-06-01  2:42   ` Matthew Wilcox
2006-06-01  3:24     ` Valerie Henson
2006-06-01 12:45       ` Matthew Wilcox [this message]
2006-06-01 12:53         ` Arjan van de Ven
2006-06-01 20:06         ` Russell Cattelan
2006-06-02 11:27         ` Nathan Scott
2006-06-01  5:36   ` Andreas Dilger
2006-06-03 13:50   ` Ric Wheeler
2006-06-03 14:13     ` Arjan van de Ven
2006-06-03 15:07       ` Ric Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060601124517.GC32143@parisc-linux.org \
    --to=matthew@wil.cx \
    --cc=arjan@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=ric@emc.com \
    --cc=val_henson@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).