From: Valerie Aurora <vaurora@redhat.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Miklos Szeredi <miklos@szeredi.hu>, Jan Blunck <jblunck@suse.de>,
	David Woodhouse <dwmw2@infradead.org>,
	Arnd Bergmann <arnd@arndb.de>, Andreas Dilger <adilger@sun.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2] d_ino considered harmful
Date: Thu, 17 Jun 2010 15:10:26 -0400
Message-ID: <20100617191026.GE14389@shell>
In-Reply-To: <20100617175429.GC4979@shareable.org>

On Thu, Jun 17, 2010 at 06:54:29PM +0100, Jamie Lokier wrote:
> Valerie Aurora wrote:
> >> Who needs d_ino anyway?  I am running a kernel with this patch -
> >> Gnome, a browser, IRC, kernel compile, etc. and everything works.
> > I'm running a kernel with the below patch and everything still works.
> > Apparently "ls -i" is still using the bogus d_ino performance
> > improvement mentioned here because it returns all 1's for inode
> > number.
> 
> I'm surprised at "ls -i", as a patch to change that has been submitted:
> 
>     http://marc.info/?l=linux-kernel&m=125181054102075
>     http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/17887

I was surprised too.  I guess people still want to optimize ls -i,
even at the cost of wrong results.

> >     Use of d_ino without the corresponding st_dev is always buggy in the
> >     presence of submounts, bind mounts, and union mounts.  E.g., the d_ino
> >     of a mountpoint will be the inode number of the directory under the
> >     mountpoint, not the mounted directory.
> 
> It's not surprising everything seems to work.
> 
> It can be useful as a performance hint, which you probably didn't test.

I'm afraid I wasn't entirely serious with that patch. :) But it was an
interesting exercise.

> I strongly disagree that correct code must call stat().  Correct code
> can check against the list of mountpoints in /proc/mounts, because it
> is strictly only mountpoints where the number doesn't agree with
> stat() -- prior to your patch :-)

If you are assuming that the application is parsing /proc/mounts (does
anyone actually do this?), then the application can also learn about
union mounts and not trust d_ino in any directory below the union
mount point. :)
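
For concreteness, here is a rough sketch of the kind of /proc/mounts
check being discussed.  It is only an illustration, not code from any
existing application: the helper name is_mount_point() is made up, it
assumes glibc's <mntent.h> interface, and it assumes the caller passes
an absolute, already-normalized path.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `path` is listed as a mount point in
 * /proc/mounts, 0 if it is not, and -1 if /proc/mounts cannot be opened.
 * `path` is assumed to be absolute and normalized (no trailing slash).
 */
int is_mount_point(const char *path)
{
	FILE *fp = setmntent("/proc/mounts", "r");
	struct mntent *ent;
	int found = 0;

	if (fp == NULL)
		return -1;

	/* Walk every mount table entry and compare its target directory. */
	while ((ent = getmntent(fp)) != NULL) {
		if (strcmp(ent->mnt_dir, path) == 0) {
			found = 1;
			break;
		}
	}
	endmntent(fp);
	return found;
}
```

An application taking this route would only trust d_ino for entries
where the check returns 0, and would fall back to stat() otherwise.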

> Anyway, maybe your patch is not allowed by POSIX :-) as follows
> (posted to linux-kernel some time ago):
> 
>     http://marc.info/?l=linux-kernel&m=125181054102075
>     http://www.gossamer-threads.com/lists/linux/kernel/1124140
> 
>     The POSIX readdir spec says this:
> 
> 	The structure dirent defined in the <dirent.h> header describes a
> 	directory entry. The value of the structure's d_ino member shall be set
> 	to the file serial number of the file named by the d_name member.
> 
>     The description for sys/stat.h makes the connection between
>     "file serial number" and the stat.st_ino member:
> 
> 	The <sys/stat.h> header shall define the stat structure, which shall
> 	include at least the following members:
> 	...
> 	    ino_t st_ino                File serial number.
> 
> Returning the covered inode's number at a mountpoint is apparently not
> POSIX compliant either, but is widespread.  (I.e. all unixes except
> Cygwin apparently.)
> 
> > Gosh, maybe it would help to patch the currently used readdir instead
> > of just old_readdir() (thanks, Arnd).  And return 1 instead of 0 so ls
> > doesn't think all files are deleted (thanks, Andreas).
> 
> It's not just ls.  Bash 3.0 ignores entries for completion if d_ino == 0.
> 
> > I'm running a kernel with the below patch and everything still works.
> > Apparently "ls -i" is still using the bogus d_ino performance
> > improvement mentioned here because it returns all 1's for inode
> > number.
> > 
> > http://www.mail-archive.com/bug-findutils@gnu.org/msg02531.html
> 
> I'm intrigued by the mention in that report that Linux bind mounts
> return the covering inode number in d_ino, not the covered inode number.
> 
> If true, that means mounts are already being checked when returning d_ino,
> and suggests that doing it for all mounts isn't expensive.

This surprises me too.  I will check into it further.

-VAL
