From: Frank van Maarseveen <frankvm@frankvm.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: pavel@ucw.cz, matthew@wil.cx, bhalevy@panasas.com,
arjan@infradead.org, mikulas@artax.karlin.mff.cuni.cz,
jaharkes@cs.cmu.edu, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, nfsv4@ietf.org
Subject: Re: Finding hardlinks
Date: Fri, 5 Jan 2007 18:30:04 +0100
Message-ID: <20070105173004.GA24513@janus>
In-Reply-To: <E1H2kfa-0007Jl-00@dorka.pomaz.szeredi.hu>
On Fri, Jan 05, 2007 at 09:43:22AM +0100, Miklos Szeredi wrote:
> > > > > High probability is all you have. Cosmic radiation hitting your
> > > > > computer will more likely cause problems than colliding 64bit inode
> > > > > numbers ;)
> > > >
> > > > Some of us have machines designed to cope with cosmic rays, and would be
> > > > unimpressed with a decrease in reliability.
> > >
> > > With the suggested samefile() interface you'd get a failure with just
> > > about 100% reliability for any application which needs to compare
> > > more than a few files. The fact is open files are _very_ expensive,
> > > no wonder they are limited in various ways.
> > >
> > > What should 'tar' do when it runs out of open files, while searching
> > > for hardlinks? Should it just give up? Then the samefile() interface
> > > would be _less_ reliable than the st_ino one by a significant margin.
> >
> > You need at most two simultaneously open files for examining any
> > number of hardlinks. So yes, you can make it reliable.
>
> Well, sort of. Samefile without keeping fds open doesn't have any
> protection against the tree changing underneath between first
> registering a file and later opening it. The inode number is more
> useful in this respect. In fact inode number + generation number will
> give you a unique identifier in time as well, which is a _lot_ more
> useful to determine if the file you are checking is actually the same
> as one that you've come across previously.
Samefile with keeping fds open doesn't buy you much anyway. What exactly
would be the value of a directory tree seen by operating only on fds
(even for directories) when some rogue process is renaming, moving,
updating stuff underneath? One ends up with a tree which misses a lot
of files and hardly bears any resemblance to the actual tree at any
point in time, and I'm not even talking about file data.

It is futile to try to get a consistent tree view on a live filesystem,
with or without using fds. It just doesn't work without fundamental
support for some kind of "freezing" or time-travel inside the
kernel. Snapshots at the block device level are problematic too.
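
What userspace can do without keeping fds open is detect that something
changed between first registering a file and opening it later. A minimal
sketch in Python, assuming the filesystem reports stable st_dev/st_ino and
updates st_mtime/st_ctime properly; register() and open_if_unchanged() are
names made up for this illustration, not an existing interface:

```python
import os

def _identity(st):
    # Identity as seen through stat: device, inode and change timestamps.
    return (st.st_dev, st.st_ino, st.st_mtime_ns, st.st_ctime_ns)

def register(path):
    """Record the identity of a file without keeping it open."""
    return _identity(os.lstat(path))

def open_if_unchanged(path, ident):
    """Open path read-only; return the fd only if it still matches ident.

    Returns None if the path vanished, became a symlink, or now refers to
    a different or modified file.
    """
    try:
        fd = os.open(path, os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0))
    except OSError:
        return None
    if _identity(os.fstat(fd)) != ident:
        os.close(fd)
        return None
    return fd
```

This only detects a change to one file after the fact. It does nothing to
make the view of the tree as a whole consistent.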
>
> So instead of samefile() I'd still suggest an extended attribute
> interface which exports the file's unique (in space and time)
> identifier as an opaque cookie.
But then you're just _shifting_ the problem instead of fixing it:
st_ino/st_mtime (st_ctime?) are designed for this purpose. If the
filesystem doesn't support them properly: live with the consequences,
which are mostly minor. Notable exceptions are of course backup tools,
but backups _must_ be verified anyway so you'll discover it soon.
(btw, that's what I noticed after restoring a system from a CD (iso9660
with RR): all hardlinks were gone)
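
For reference, the st_ino approach amounts to something like the sketch
below (Python; find_hardlinks() is a name made up for this illustration).
It assumes (st_dev, st_ino) is unique per file for the duration of the
walk, which is exactly the property under discussion, and it is not meant
to reproduce what tar actually does:

```python
import os
from collections import defaultdict

def find_hardlinks(root):
    """Return lists of paths under root that share (st_dev, st_ino)."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished between readdir and lstat
            if st.st_nlink > 1:
                # Only files with more than one link can have a twin.
                groups[(st.st_dev, st.st_ino)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

No files are kept open here, so the number of candidates is limited only
by memory. On a filesystem that does not report a stable st_ino or a
meaningful st_nlink, links are silently missed or falsely reported.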
--
Frank