public inbox for linux-kernel@vger.kernel.org
From: Bill Davidsen <davidsen@tmr.com>
To: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: linux-kernel@vger.kernel.org
Subject: Re: what is our answer to ZFS?
Date: Tue, 22 Nov 2005 16:06:26 -0500
Message-ID: <43838852.8000706@tmr.com>
In-Reply-To: <Pine.LNX.4.64.0511221650360.2763@hermes-1.csi.cam.ac.uk>

Anton Altaparmakov wrote:
> On Tue, 22 Nov 2005, Chris Adams wrote:
> 
>>Once upon a time, Jan Harkes <jaharkes@cs.cmu.edu> said:
>>
>>>The only thing that tends to break are userspace archiving tools like
>>>tar, which assume that 2 objects with the same 32-bit st_ino value are
>>>identical.
>>
>>That assumption is probably made because that's what POSIX and Single
>>Unix Specification define: "The st_ino and st_dev fields taken together
>>uniquely identify the file within the system."  Don't blame code that
>>follows standards for breaking.
> 
> 
> The standards are insufficient however.  For example dealing with named 
> streams or extended attributes if exposed as "normal files" would 
> naturally have the same st_ino (given they are the same inode as the 
> normal file data) and st_dev fields.
> 
> 
>>>I think that by now several actually double check that the inode
>>>link count is larger than 1.
>>
>>That is not a good check.  I could have two separate files that have
>>multiple links; if st_ino is the same, how can tar make sense of it?
> 
> 
> Now that is true.  In addition to checking the link count is larger than 
> 1, they should check the file size and if that matches compute the SHA-1 
> digest of the data (or the MD5 sum or whatever) and probably should also 
> check the various stat fields for equality before bothering with the 
> checksum of the file contents.
> 
> Or Linux just needs a backup api that programs like this can use to 
> save/restore files.  (Analogous to the MS Backup API but hopefully 
> less horrid...)
> 
In order to prevent the problems mentioned AND satisfy SUS, I would 
think that st_dev is the field which should be unique, which is not 
always the case currently. The st_ino is a file id on st_dev, and it 
would be less confusing if the inode on each st_dev were unique. Not to 
mention that some backup programs do look at st_dev and could be 
mightily confused if its meaning is not determinate.
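
As a rough sketch of the kind of check discussed in the quoted text, 
assuming Python and its standard os, stat and hashlib modules: two paths 
are treated as hard links to one file only if (st_dev, st_ino) match, 
both link counts exceed 1, the cheap stat fields agree, and finally the 
SHA-1 digests of the contents match. The names file_digest and 
probably_same_file are made up for illustration; this is not a 
description of how tar or any other archiver actually behaves.

```python
import hashlib
import os
import stat


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file's contents, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def probably_same_file(path_a, path_b):
    """Heuristic hard-link test that does not trust (st_dev, st_ino) alone.

    Hypothetical helper: checks run from cheapest to most expensive.
    """
    a = os.lstat(path_a)
    b = os.lstat(path_b)

    # Only regular files are compared in this sketch.
    if not (stat.S_ISREG(a.st_mode) and stat.S_ISREG(b.st_mode)):
        return False
    # The identity pair the standards say should be unique.
    if (a.st_dev, a.st_ino) != (b.st_dev, b.st_ino):
        return False
    # A hard-linked file must have more than one link.
    if a.st_nlink < 2 or b.st_nlink < 2:
        return False
    # Cheap stat fields that two names for one inode must share.
    fields_a = (a.st_size, a.st_mode, a.st_uid, a.st_gid, a.st_mtime_ns)
    fields_b = (b.st_size, b.st_mode, b.st_uid, b.st_gid, b.st_mtime_ns)
    if fields_a != fields_b:
        return False
    # Last resort: compare the contents themselves.
    return file_digest(path_a) == file_digest(path_b)
```

Ordering the tests this way means the digest is only computed for pairs 
that already look identical, so the expensive read happens rarely.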

Historical application usage assumes that st_dev is invariant; many 
applications were written before pluggable devices and network mounts. 
In a perfect world where nothing broke when things were changed, each 
filesystem would carry some UUID, so that it looks the same whether 
mounted over the network, by direct mount, by loopback mount, etc., and 
there would be no confusion.

A backup API would really be nice if it could somehow provide some 
unique ID, such that a network or direct backup of the same data would 
have the same IDs.
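
To make the idea concrete, here is a minimal sketch in Python of what 
such an ID might look like, assuming a per-filesystem UUID were 
available from somewhere. FileId and group_by_identity are hypothetical 
names, the UUID is simply passed in by the caller, and the sketch 
further assumes the inode number is reported identically over every 
access path, which is not guaranteed today.

```python
import uuid
from collections import defaultdict
from typing import NamedTuple


class FileId(NamedTuple):
    """Identity that does not depend on how the filesystem is mounted."""
    fs_uuid: uuid.UUID  # hypothetical per-filesystem UUID, supplied by caller
    ino: int            # st_ino as reported by stat()


def group_by_identity(entries):
    """Group paths by FileId.

    entries is an iterable of (path, fs_uuid, st_ino) tuples. Paths that
    reach the same object through different mounts land in one group.
    """
    groups = defaultdict(list)
    for path, fs_uuid, st_ino in entries:
        groups[FileId(fs_uuid, st_ino)].append(path)
    return dict(groups)
```

Because the key replaces st_dev with the filesystem UUID, a direct 
backup and a network backup of the same data would compute the same 
key, while equal inode numbers on different filesystems stay distinct.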
-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

