From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Sage Weil <sage-4GqslpFJ+cxBDgjK7y7TUQ@public.gmane.org>
Cc: Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
sandeen-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Subject: Re: why is i_ino unsigned long, anyway?
Date: Wed, 2 Oct 2013 15:00:34 -0400
Message-ID: <20131002190034.GH14808@fieldses.org>
In-Reply-To: <alpine.DEB.2.00.1310021130280.7765-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
On Wed, Oct 02, 2013 at 11:47:22AM -0700, Sage Weil wrote:
> On Wed, 2 Oct 2013, J. Bruce Fields wrote:
> > On Wed, Oct 02, 2013 at 09:05:27AM -0700, Christoph Hellwig wrote:
> > > On Wed, Oct 02, 2013 at 10:25:27AM -0400, J. Bruce Fields wrote:
> > > > If so then it's no huge code duplication to do it by hand:
> > > >
> > > > 	if (inode->i_op->getattr)
> > > > 		inode->i_op->getattr(path->mnt, path->dentry, &stat);
> > > > 	else
> > > > 		generic_fillattr(inode, &stat);
> > >
> > > Maybe make that a vfs_getattr_nosec and let vfs_getattr call it?
> > >
> > > Including a proper kerneldoc comment explaining when to use it, please.
> >
> > Something like this?
>
> I'm late to this thread, but: getattr() is a comparatively expensive
> operation for just getting an (immutable) ino value on a distributed or
> clustered fs. It would be nice if there were a separate call that didn't
> try to populate all the other kstat fields with valid data. Something
> like this was part of the xstat series from forever ago (a bit mask passed
> to getattr indicating which fields were needed), so the approach below
> might be okay if we think we'll get there sometime soon, but my preference
> would be for another fix...

Understood, and perhaps this code should eventually take advantage of
xstat, but:

	- this code isn't handling the common case, so we're not too
	  worried about the performance, and
	- you also have the option of defining your own
	  export_operations->get_name.  Consider that especially if you
	  have some better way to answer the question "what is the name
	  of inode X in directory Y" than reading Y looking for a
	  matching inode number.
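
As a rough illustration of the second point, here is a userland sketch
of that dispatch: use a filesystem-supplied get_name when there is one,
and otherwise fall back to scanning the directory for a matching inode
number.  Everything named toy_*, plus scan_get_name and
derived_get_name, is hypothetical and made up for this sketch; the real
export_operations->get_name takes dentries, not these types.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userland stand-ins; none of these are kernel types. */
struct toy_dirent {
	const char *name;
	uint64_t ino;
};

struct toy_dir {
	const struct toy_dirent *ents;
	size_t nents;
};

struct toy_export_ops {
	/* Optional: return 0 and fill name if the fs can answer directly. */
	int (*get_name)(const struct toy_dir *dir, uint64_t ino,
			char *name, size_t len);
};

/* Generic fallback: read the whole directory looking for ino. */
static int scan_get_name(const struct toy_dir *dir, uint64_t ino,
			 char *name, size_t len)
{
	for (size_t i = 0; i < dir->nents; i++) {
		if (dir->ents[i].ino == ino) {
			snprintf(name, len, "%s", dir->ents[i].name);
			return 0;
		}
	}
	return -1;	/* stands in for -ENOENT */
}

/* Hypothetical fs whose names can be derived from the inode number. */
static int derived_get_name(const struct toy_dir *dir, uint64_t ino,
			    char *name, size_t len)
{
	(void)dir;	/* no directory read needed */
	snprintf(name, len, "obj-%llx", (unsigned long long)ino);
	return 0;
}

/* Dispatch: prefer the filesystem's own method, else scan. */
static int toy_get_name(const struct toy_export_ops *ops,
			const struct toy_dir *dir, uint64_t ino,
			char *name, size_t len)
{
	if (ops && ops->get_name)
		return ops->get_name(dir, ino, name, len);
	return scan_get_name(dir, ino, name, len);
}
```

Calling toy_get_name() with ops set to NULL takes the scan path; passing
a struct toy_export_ops whose get_name is derived_get_name skips the
directory read entirely, which is the saving a distributed or clustered
fs would presumably be after.
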
--b.
> (Ceph also uses 64-bit inos.)
>
> Thanks!
> sage
>
>
> >
> > --b.
> >
> > commit 8418a41b7192cf2f372ae091207adb29a088f9a0
> > Author: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > Date: Tue Sep 10 11:41:12 2013 -0400
> >
> > exportfs: fix 32-bit nfsd handling of 64-bit inode numbers
> >
> > Symptoms were spurious -ENOENTs on stat of an NFS filesystem from a
> > 32-bit NFS server exporting a very large XFS filesystem, when the
> > server's cache is cold (so the inodes in question are not in cache).
> >
> > Reported-by: Trevor Cordes <trevor-CGgvEiIIHbIFyWsGDH9TEg@public.gmane.org>
> > Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >
> > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> > index 293bc2e..811831a 100644
> > --- a/fs/exportfs/expfs.c
> > +++ b/fs/exportfs/expfs.c
> > @@ -215,7 +215,7 @@ struct getdents_callback {
> > struct dir_context ctx;
> > char *name; /* name that was found. It already points to a
> > buffer NAME_MAX+1 is size */
> > - unsigned long ino; /* the inum we are looking for */
> > + u64 ino; /* the inum we are looking for */
> > int found; /* inode matched? */
> > int sequence; /* sequence counter */
> > };
> > @@ -255,10 +255,10 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> > struct inode *dir = path->dentry->d_inode;
> > int error;
> > struct file *file;
> > + struct kstat stat;
> > struct getdents_callback buffer = {
> > .ctx.actor = filldir_one,
> > .name = name,
> > - .ino = child->d_inode->i_ino
> > };
> >
> > error = -ENOTDIR;
> > @@ -268,6 +268,16 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> > if (!dir->i_fop)
> > goto out;
> > /*
> > + * inode->i_ino is unsigned long, kstat->ino is u64, so the
> > + * former would be insufficient on 32-bit hosts when the
> > + * filesystem supports 64-bit inode numbers. So we need to
> > + * actually call ->getattr, not just read i_ino:
> > + */
> > + error = vfs_getattr_nosec(path, &stat);
> > + if (error)
> > + return error;
> > + buffer.ino = stat.ino;
> > + /*
> > * Open the directory ...
> > */
> > file = dentry_open(path, O_RDONLY, cred);
> > diff --git a/fs/stat.c b/fs/stat.c
> > index 04ce1ac..71a39e8 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -37,14 +37,21 @@ void generic_fillattr(struct inode *inode, struct kstat *stat)
> >
> > EXPORT_SYMBOL(generic_fillattr);
> >
> > -int vfs_getattr(struct path *path, struct kstat *stat)
> > +/**
> > + * vfs_getattr_nosec - getattr without security checks
> > + * @path: file to get attributes from
> > + * @stat: structure to return attributes in
> > + *
> > + * Get attributes without calling security_inode_getattr.
> > + *
> > + * Currently the only caller other than vfs_getattr is internal to the
> > + * filehandle lookup code, which uses only the inode number and returns
> > + * no attributes to any user. Any other code probably wants
> > + * vfs_getattr.
> > + */
> > +int vfs_getattr_nosec(struct path *path, struct kstat *stat)
> > {
> > struct inode *inode = path->dentry->d_inode;
> > - int retval;
> > -
> > - retval = security_inode_getattr(path->mnt, path->dentry);
> > - if (retval)
> > - return retval;
> >
> > if (inode->i_op->getattr)
> > return inode->i_op->getattr(path->mnt, path->dentry, stat);
> > @@ -53,6 +60,18 @@ int vfs_getattr(struct path *path, struct kstat *stat)
> > return 0;
> > }
> >
> > +EXPORT_SYMBOL_GPL(vfs_getattr_nosec);
> > +
> > +int vfs_getattr(struct path *path, struct kstat *stat)
> > +{
> > + int retval;
> > +
> > + retval = security_inode_getattr(path->mnt, path->dentry);
> > + if (retval)
> > + return retval;
> > + return vfs_getattr_nosec(path, stat);
> > +}
> > +
> > EXPORT_SYMBOL(vfs_getattr);
> >
> > int vfs_fstat(unsigned int fd, struct kstat *stat)
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 9818747..5a51faa 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2500,6 +2500,7 @@ extern int page_symlink(struct inode *inode, const char *symname, int len);
> > extern const struct inode_operations page_symlink_inode_operations;
> > extern int generic_readlink(struct dentry *, char __user *, int);
> > extern void generic_fillattr(struct inode *, struct kstat *);
> > +int vfs_getattr_nosec(struct path *path, struct kstat *stat);
> > extern int vfs_getattr(struct path *, struct kstat *);
> > void __inode_add_bytes(struct inode *inode, loff_t bytes);
> > void inode_add_bytes(struct inode *inode, loff_t bytes);
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> >
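
The failure the quoted patch addresses can be sketched in userland C.
This assumes the directory scan compares each entry's 64-bit inode
number against the value stored in the callback struct; ino_as_ulong32,
match_truncated and match_full are hypothetical names that only model
what a 32-bit unsigned long does to the stored value.

```c
#include <stdint.h>

/* What a 32-bit unsigned long keeps of a 64-bit inode number. */
static uint32_t ino_as_ulong32(uint64_t ino)
{
	return (uint32_t)ino;	/* the high 32 bits are dropped */
}

/* Compare against a value that was stored in a 32-bit field first. */
static int match_truncated(uint64_t dirent_ino, uint64_t wanted)
{
	return dirent_ino == (uint64_t)ino_as_ulong32(wanted);
}

/* Compare against a value that was stored as a full 64-bit number. */
static int match_full(uint64_t dirent_ino, uint64_t wanted)
{
	return dirent_ino == wanted;
}
```

With a wanted inode number of 0x100000080, match_truncated() never
matches a directory entry carrying the full number, which is consistent
with the spurious -ENOENT in the commit message.  Numbers that fit in 32
bits are unaffected, so the problem only shows up on very large
filesystems.
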