* ISO-9660 Rock Ridge gives different links different inums
@ 2003-01-10  3:08 Peter Chubb
  2003-01-10  3:23 ` Andrew McGregor
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Chubb @ 2003-01-10  3:08 UTC (permalink / raw)
  To: eric; +Cc: linux-kernel


In linux 2.5.54, multiple links to the same file on a rock-ridge CD
have different inode numbers.  This confuses cpio, tar and cp -ra
because the multiple links are each copied separately as a single file.

It'll probably also confuse NFS, but I haven't tried that.

Example from the knoppix CD:

$ ls -il gunzip gzip uncompress zcat
1896278 -rwxr-xr-x    4 root     root        49256 Oct 10 01:31 gunzip
1896564 -rwxr-xr-x    4 root     root        49256 Oct 10 01:31 gzip
1902292 -rwxr-xr-x    4 root     root        49256 Oct 10 01:31 uncompress
1902856 -rwxr-xr-x    4 root     root        49256 Oct 10 01:31 zcat

(For comparison, here's what I see on XFS:
100663485 -rwxr-xr-x    4 root     root        49288 Nov  7 11:37 gunzip
100663485 -rwxr-xr-x    4 root     root        49288 Nov  7 11:37 gzip
100663485 -rwxr-xr-x    4 root     root        49288 Nov  7 11:37 uncompress
100663485 -rwxr-xr-x    4 root     root        49288 Nov  7 11:37 zcat
)


Currently the inode number appears to be the offset in bytes from the start of
the file system to the iso directory entry.  Files with multiple
directory entries (i.e., links) therefore have different inums.
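
To make the failure mode concrete, here is a minimal sketch of how an
archiver typically recognises links: it groups names by the pair
(st_dev, st_ino).  The helper names group_hard_links and demo are
hypothetical, and this is not the actual code of cpio, tar or cp.

```python
import os
import tempfile
from collections import defaultdict


def group_hard_links(paths):
    """Group paths by (st_dev, st_ino); names in one group are links."""
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    return list(groups.values())


def demo():
    """Create two links to one file plus an unrelated file, then group them."""
    with tempfile.TemporaryDirectory() as d:
        gzip = os.path.join(d, "gzip")
        gunzip = os.path.join(d, "gunzip")
        other = os.path.join(d, "other")
        with open(gzip, "wb") as f:
            f.write(b"payload")
        os.link(gzip, gunzip)          # second name for the same file
        with open(other, "wb") as f:
            f.write(b"payload")        # same contents, different file
        groups = group_hard_links([gzip, gunzip, other])
        return sorted(sorted(os.path.basename(p) for p in g) for g in groups)
```

On a filesystem that reports one inode number per file, demo() puts
gzip and gunzip in the same group.  If every name reports its own inode
number, as in the isofs listing above, each name ends up in a group of
one and is copied as a separate file.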

I don't know enough about the ISO9660 standard to be sure what's best
to do about this.

--
Dr Peter Chubb				    peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories, all almost the same.


* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  3:08 ISO-9660 Rock Ridge gives different links different inums Peter Chubb
@ 2003-01-10  3:23 ` Andrew McGregor
  2003-01-10  3:34   ` Peter Chubb
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew McGregor @ 2003-01-10  3:23 UTC (permalink / raw)
  To: Peter Chubb, eric; +Cc: linux-kernel



--On Friday, January 10, 2003 14:08:59 +1100 Peter Chubb 
<peter@chubb.wattle.id.au> wrote:

>
> In linux 2.5.54, multiple links to the same file on a rock-ridge CD
> have different inode numbers.  This confuses cpio, tar and cp -ra
> because the multiple links are each copied separately as a single file.
>
> It'll probably also confuse NFS, but I haven't tried that.

Shouldn't do, but it will probably make the buffer cache on the server less 
effective.

> Currently the inode number appears to be the offset in bytes from the
> start of the file system to the iso directory entry.  Files with multiple
> directory entries (i.e., links) therefore have different inums.
>
> I don't know enough about the ISO9660 standard to be sure what's best
> to do about this.

Change it to be the offset to the data area, which should be the same for 
all of them?

>
> --
> Dr Peter Chubb				    peterc@gelato.unsw.edu.au
> You are lost in a maze of BitKeeper repositories, all almost the same.



* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  3:23 ` Andrew McGregor
@ 2003-01-10  3:34   ` Peter Chubb
  2003-01-10  6:34     ` Denis Vlasenko
  2003-01-13 22:16     ` Bill Davidsen
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Chubb @ 2003-01-10  3:34 UTC (permalink / raw)
  To: Andrew McGregor; +Cc: Peter Chubb, eric, linux-kernel

>>>>> "Andrew" == Andrew McGregor <andrew@indranet.co.nz> writes:

Andrew> --On Friday, January 10, 2003 14:08:59 +1100 Peter Chubb
Andrew> <peter@chubb.wattle.id.au> wrote:

>> In linux 2.5.54, multiple links to the same file on a rock-ridge CD
>> have different inode numbers.  This confuses cpio, tar and cp -ra
>> because the multiple links are each copied separately as a single
>> file.
>> 
>> It'll probably also confuse NFS, but I haven't tried that.

Andrew> Shouldn't do, but it will probably make the buffer cache on
Andrew> the server less effective.

>> Currently the inode number appears to be the offset in bytes from
>> the start of the file system to the iso directory entry.  Files
>> with multiple directory entries (i.e., links) therefore have
>> different inums.
>> 
>> I don't know enough about the ISO9660 standard to be sure what's
>> best to do about this.

Andrew> Change it to be the offset to the data area, which should be
Andrew> the same for all of them?

I thought about that, but I'm unsure if there's any way to get from
that offset to the directory information.  As far as I can tell,
there's no concept of an inode separate from directory entry on iso9660
--- the directory entry/entries all contain all the information that
describes a file.  Which means that the inumber has to point to some
directory node.

Preferably, all the inumbers for the same file would point to the same
directory entry; but I can see no easy way to do that.  Keeping an
in-memory table for files with multiple links might be the best way,
as there aren't that many on a typical filesystem.
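
A minimal sketch of such a table, under two assumptions that are not
established in this thread: that all links to a file share the same
starting extent, and that the link count is available from the Rock
Ridge attributes.  LinkTable and inum_for are hypothetical names, and
the extent number 5000 is made up.

```python
class LinkTable:
    """Give every directory entry of a multiply linked file one inum."""

    def __init__(self):
        # starting extent -> offset of the first directory entry seen for it
        self._canonical_by_extent = {}

    def inum_for(self, dirent_offset, extent_start, nlink):
        """Return the inode number to report for one directory entry."""
        if nlink <= 1:
            # Singly linked files need no table entry at all.
            return dirent_offset
        # The first entry seen for an extent becomes the canonical inum.
        return self._canonical_by_extent.setdefault(extent_start, dirent_offset)

    def __len__(self):
        return len(self._canonical_by_extent)


table = LinkTable()
# Directory-entry offsets taken from the ls -il listing; extent is invented.
offsets = [1896278, 1896564, 1902292, 1902856]
inums = [table.inum_for(off, extent_start=5000, nlink=4) for off in offsets]
```

The table holds one entry per linked file rather than one per link, so
the four names above cost a single entry.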

--
Dr Peter Chubb				    peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories, all almost the same.


* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  3:34   ` Peter Chubb
@ 2003-01-10  6:34     ` Denis Vlasenko
  2003-01-10  8:56       ` Peter Chubb
  2003-01-13 22:16     ` Bill Davidsen
  1 sibling, 1 reply; 11+ messages in thread
From: Denis Vlasenko @ 2003-01-10  6:34 UTC (permalink / raw)
  To: Peter Chubb, Andrew McGregor; +Cc: eric, linux-kernel

On 10 January 2003 05:34, Peter Chubb wrote:
> >> I don't know enough about the ISO9660 standard to be sure what's
> >> best to do about this.
>
> Andrew> Change it to be the offset to the data area, which should be
> Andrew> the same for all of them?
>
> I thought about that, but I'm unsure if there's any way to get from
> that offset to the directory information.  As far as I can tell,
> there's no concept of an inode separate from directory entry on
> iso9660 --- the directory entry/entries all contain all the
> information that describes a file.  Which means that the inumber has
> to point to some directory node.
>
> Preferably, all the inumbers for the same file would point to the
> same directory entry; but I can see no easy way to do that.  Keeping
> an in-memory table for files with multiple links might be the best
> way, as there aren't that many on a typical filesystem.

And what will happen on a non-typical filesystem with 1 million hardlinks?

The root of the problem is a fundamental layering violation in
traditional Unix filesystems: inode numbers should NOT be visible
to userspace. Userspace just needs a way to tell hardlinks from separate
files, that's all. Exposing inumbers does that, but creates tons
of problems for filesystems which do NOT have such a concept.

There is at least one way to redesign it:
* provide hash number instead of an inumber for each file
  with the following semantics:
  - hardlinks ALWAYS have equal hash numbers
  - different files MAY have equal hash numbers (but rarely)
* provide is_hardlink(file1,file2) system call

But this would mean a very long migration period (~10 years?)
and incompatibilities with other Unix variants...
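
The semantics above can be emulated in userspace as a rough sketch.
The names link_hash, is_hardlink and linked_pairs are hypothetical, not
existing system calls, and the emulation necessarily leans on the
(st_dev, st_ino) identity that the proposal wants to hide.

```python
import os
import tempfile

HASH_BITS = 16  # deliberately small, so collisions between files are allowed


def link_hash(path):
    """Hash that is always equal for hardlinks and may collide otherwise."""
    st = os.stat(path)
    return hash((st.st_dev, st.st_ino)) & ((1 << HASH_BITS) - 1)


def is_hardlink(path_a, path_b):
    """Authoritative check; stands in for the proposed system call."""
    return os.path.samefile(path_a, path_b)


def linked_pairs(paths):
    """Use the hash as a cheap filter, then confirm with is_hardlink."""
    buckets = {}
    for p in paths:
        buckets.setdefault(link_hash(p), []).append(p)
    pairs = []
    for bucket in buckets.values():
        for i, a in enumerate(bucket):
            for b in bucket[i + 1:]:
                if is_hardlink(a, b):
                    pairs.append((a, b))
    return pairs


def demo():
    with tempfile.TemporaryDirectory() as d:
        a, b, c = (os.path.join(d, n) for n in ("gzip", "gunzip", "other"))
        for path in (a, c):
            with open(path, "wb") as f:
                f.write(b"payload")
        os.link(a, b)
        pairs = [(os.path.basename(x), os.path.basename(y))
                 for x, y in linked_pairs([a, b, c])]
        return link_hash(a) == link_hash(b), is_hardlink(a, c), pairs
```

An archiver would only call is_hardlink for names that land in the same
hash bucket, so a collision costs an extra check but never a wrong
answer.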
--
vda


* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  8:56       ` Peter Chubb
@ 2003-01-10  8:54         ` Denis Vlasenko
  2003-01-10 15:58         ` Horst von Brand
  1 sibling, 0 replies; 11+ messages in thread
From: Denis Vlasenko @ 2003-01-10  8:54 UTC (permalink / raw)
  To: Peter Chubb; +Cc: Peter Chubb, Andrew McGregor, eric, linux-kernel

On 10 January 2003 10:56, Peter Chubb wrote:
> >>>>> "Denis" == Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua>
> >>>>> writes:
>
> Denis> On 10 January 2003 05:34, Peter Chubb wrote:
> >> Preferably, all the inumbers for the same file would point to the
> >> same directory entry; but I can see no easy way to do that.
> >> Keeping an in-memory table for files with multiple links might be
> >> the best way, as there aren't that many on a typical filesystem.
>
> Denis> And what will happen on a non-typical filesystem with 1
> Denis> million hardlinks?
>
> Denis> The root of the problem is a fundamental layering violation in
> Denis> traditional Unix filesystems: inode numbers should NOT be
> Denis> visible to userspace. Userspace just needs a way to tell
> Denis> hardlinks from separate files, that's all. Exposing inumbers
> Denis> does that, but creates tons of problems for filesystems which
> Denis> do NOT have such a concept.
>
> The problem is that in Unix the fundamental identity of a file is
> the tuple (blkdev, inum); names are merely indices (links) that
> resolve to that tuple.

You are right. It is designed this way. This design is wrong.

> Personally, I'd swap to a pair of system
> calls to map name to (blkdev, inum), and open(blkdev, inum).  Think
> of the inode number as a unique within-filesystem index.

This does not fix the design.
--
vda


* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  6:34     ` Denis Vlasenko
@ 2003-01-10  8:56       ` Peter Chubb
  2003-01-10  8:54         ` Denis Vlasenko
  2003-01-10 15:58         ` Horst von Brand
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Chubb @ 2003-01-10  8:56 UTC (permalink / raw)
  To: vda; +Cc: Peter Chubb, Andrew McGregor, eric, linux-kernel

>>>>> "Denis" == Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes:

Denis> On 10 January 2003 05:34, Peter Chubb wrote:
>> Preferably, all the inumbers for the same file would point to the
>> same directory entry; but I can see no easy way to do that.
>> Keeping an in-memory table for files with multiple links might be
>> the best way, as there aren't that many on a typical filesystem.

Denis> And what will happen on a non-typical filesystem with 1 million
Denis> hardlinks?

Denis> The root of the problem is a fundamental layering violation in
Denis> traditional Unix filesystems: inode numbers should NOT be
Denis> visible to userspace. Userspace just needs a way to tell
Denis> hardlinks from separate files, that's all. Exposing inumbers
Denis> does that, but creates tons of problems for filesystems which
Denis> do NOT have such a concept.

The problem is that in Unix the fundamental identity of a file is
the tuple (blkdev, inum); names are merely indices (links) that resolve to
that tuple.   Personally, I'd swap to a pair of system calls to map
name to (blkdev, inum), and open(blkdev, inum).  Think of the inode
number as a unique within-filesystem index.

--
Dr Peter Chubb				    peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories, all almost the same.



* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  8:56       ` Peter Chubb
  2003-01-10  8:54         ` Denis Vlasenko
@ 2003-01-10 15:58         ` Horst von Brand
  2003-01-15 21:07           ` Mark H. Wood
  1 sibling, 1 reply; 11+ messages in thread
From: Horst von Brand @ 2003-01-10 15:58 UTC (permalink / raw)
  To: Peter Chubb; +Cc: linux-kernel

Peter Chubb <peter@chubb.wattle.id.au> said:

[...]

> The problem is that in Unix the fundamental identity of a file is
> the tuple (blkdev, inum); names are merely indices (links) that resolve to
> that tuple.   Personally, I'd swap to a pair of system calls to map
> name to (blkdev, inum), and open(blkdev, inum).  Think of the inode
> number as a unique within-filesystem index.

That way any joker can go ahead and open any file, without any regard to
permission bits on the directories that lead there. Not nice.
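
The objection can be illustrated with a toy in-memory model.  ToyFS and
everything in it are hypothetical; only directory search permission is
modelled, and the file's own permission bits are left out.

```python
class PermissionDenied(Exception):
    pass


class ToyFS:
    ROOT = 1

    def __init__(self):
        # inum -> directory; "search" is the set of users allowed to traverse
        self.dirs = {self.ROOT: {"entries": {}, "search": {"alice", "bob"}}}
        self.files = {}  # inum -> file contents
        self._next = 2

    def _new_inum(self):
        inum, self._next = self._next, self._next + 1
        return inum

    def mkdir(self, parent, name, search):
        inum = self._new_inum()
        self.dirs[inum] = {"entries": {}, "search": set(search)}
        self.dirs[parent]["entries"][name] = inum
        return inum

    def create(self, parent, name, data):
        inum = self._new_inum()
        self.files[inum] = data
        self.dirs[parent]["entries"][name] = inum
        return inum

    def open_by_path(self, user, path):
        """Walk the path, checking search permission on every directory."""
        inum = self.ROOT
        for part in path.strip("/").split("/"):
            directory = self.dirs[inum]
            if user not in directory["search"]:
                raise PermissionDenied(path)
            inum = directory["entries"][part]
        return self.files[inum]

    def open_by_inum(self, user, inum):
        """No directory is consulted, so no directory permission applies."""
        return self.files[inum]


def demo_fs():
    fs = ToyFS()
    private = fs.mkdir(ToyFS.ROOT, "private", search={"alice"})
    secret = fs.create(private, "secret", b"top secret")
    return fs, secret
```

In this model bob cannot reach /private/secret by name, yet
open_by_inum hands him the contents once he knows or guesses the
number.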
--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10  3:34   ` Peter Chubb
  2003-01-10  6:34     ` Denis Vlasenko
@ 2003-01-13 22:16     ` Bill Davidsen
  2003-01-13 23:10       ` Peter Chubb
  1 sibling, 1 reply; 11+ messages in thread
From: Bill Davidsen @ 2003-01-13 22:16 UTC (permalink / raw)
  To: Peter Chubb; +Cc: Linux Kernel Mailing List

On Fri, 10 Jan 2003, Peter Chubb wrote:

> >>>>> "Andrew" == Andrew McGregor <andrew@indranet.co.nz> writes:

> Andrew> Change it to be the offset to the data area, which should be
> Andrew> the same for all of them?
> 
> I thought about that, but I'm unsure if there's any way to get from
> that offset to the directory information.  As far as I can tell,
> there's no concept of an inode separate from directory entry on iso9660
> --- the directory entry/entries all contain all the information that
> describes a file.  Which means that the inumber has to point to some
> directory node.

I can see that you would have to carry that information forward to the
"inode" if you used the data area address.  For stat that's probably not an
issue; for open, after you open the file you don't really need access
checking, and the times on a CD don't change.

What's the case where you are starting with an inode and trying to get to
a filename without having gone through a dir entry to the inode? No one is
running things like dump/restore on iso9660 I hope!

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-13 22:16     ` Bill Davidsen
@ 2003-01-13 23:10       ` Peter Chubb
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Chubb @ 2003-01-13 23:10 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Peter Chubb, Linux Kernel Mailing List

>>>>> "Bill" == Bill Davidsen <davidsen@tmr.com> writes:

Bill> On Fri, 10 Jan 2003, Peter Chubb wrote:
>> >>>>> "Andrew" == Andrew McGregor <andrew@indranet.co.nz> writes:

Andrew> Change it to be the offset to the data area, which should be
Andrew> the same for all of them?
>> I thought about that, but I'm unsure if there's any way to get from
>> that offset to the directory information.  As far as I can tell,
>> there's no concept of an inode separate from directory entry on
>> iso9660 --- the directory entry/entries all contain all the
>> information that describes a file.  Which means that the inumber
>> has to point to some directory node.

Bill> I can see that you would have to carry that information forward
Bill> to the "inode" if you used the data area address, for stat
Bill> that's probably not an issue, for open after you open the file
Bill> you don't really need access checking and the times on a CD
Bill> don't change.

In isofs, the on-disc `inode' is an iso_directory_record, which
contains the name as well as describing a single extent.
iso_directory_records are chained together for files that have more
than one extent on disc.  The code currently uses iget() to get the
chained iso_directory_records.
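
As a rough userspace sketch of what such a record holds, the code below
builds and parses records by hand.  The byte offsets follow the ISO 9660
(ECMA-119) directory record layout as I understand it and should be
checked against the standard; the function names and the extent number
5000 are hypothetical, and Rock Ridge system-use fields are left out.

```python
import struct

MULTI_EXTENT = 0x80  # file-flags bit: another record for this file follows


def build_record(name, extent, size, flags=0):
    """Build one directory record; numeric fields are stored both-endian."""
    length = 33 + len(name)
    length += length % 2               # records are padded to an even length
    rec = bytearray(length)
    rec[0] = length
    struct.pack_into("<I", rec, 2, extent)
    struct.pack_into(">I", rec, 6, extent)
    struct.pack_into("<I", rec, 10, size)
    struct.pack_into(">I", rec, 14, size)
    rec[25] = flags
    rec[32] = len(name)
    rec[33:33 + len(name)] = name
    return bytes(rec)


def parse_records(block):
    """Return the records in a directory block with their byte offsets."""
    records, offset = [], 0
    while offset < len(block) and block[offset] != 0:
        length = block[offset]
        (extent,) = struct.unpack_from("<I", block, offset + 2)
        (size,) = struct.unpack_from("<I", block, offset + 10)
        name_len = block[offset + 32]
        records.append({
            "offset": offset,          # what the inode number is based on today
            "extent": extent,          # start of the file data
            "size": size,
            "more_extents": bool(block[offset + 25] & MULTI_EXTENT),
            "name": bytes(block[offset + 33:offset + 33 + name_len]),
        })
        offset += length
    return records


block = (build_record(b"GZIP.;1", 5000, 49256)
         + build_record(b"GUNZIP.;1", 5000, 49256))
records = parse_records(block)
```

Both records describe the same data extent, but they sit at different
offsets in the directory, which is why an offset-based inode number
differs between the two names.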

Bill> What's the case where you are starting with an inode and trying
Bill> to get to a filename without having gone through a dir entry to
Bill> the inode? No one is running things like dump/restore on iso9660
Bill> I hope!

No, it's where you're starting with an inode number and want to get an
inode.  Having looked at the code now, I think that's confined
to autofs and internally to the isofs code, so it could be worked around.

Maybe we should deprecate iget() ???

--
Dr Peter Chubb				    peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories, all almost the same.




* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-10 15:58         ` Horst von Brand
@ 2003-01-15 21:07           ` Mark H. Wood
  2003-01-15 21:28             ` Jesse Pollard
  0 siblings, 1 reply; 11+ messages in thread
From: Mark H. Wood @ 2003-01-15 21:07 UTC (permalink / raw)
  To: Linux kernel list

On Fri, 10 Jan 2003, Horst von Brand wrote:
> Peter Chubb <peter@chubb.wattle.id.au> said:
> [...]
> > The problem is that in Unix the fundamental identity of a file is
> > the tuple (blkdev, inum); names are merely indices (links) that resolve to
> > that tuple.   Personally, I'd swap to a pair of system calls to map
> > name to (blkdev, inum), and open(blkdev, inum).  Think of the inode
> > number as a unique within-filesystem index.
>
> That way any joker can go ahead and open any file, without any regard to
> permission bits on the directories that lead there. Not nice.

Welcome to VMS, which can open files by INDEXF.SYS offset.  Some app.s
which create and delete files rapidly never bother to make directory
entries at all.  It may not be what you're used to, and it may be contrary
to expected Unix semantics, but it's not unthinkable.

-- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
MS Windows *is* user-friendly, but only for certain values of "user".



* Re: ISO-9660 Rock Ridge gives different links different inums
  2003-01-15 21:07           ` Mark H. Wood
@ 2003-01-15 21:28             ` Jesse Pollard
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Pollard @ 2003-01-15 21:28 UTC (permalink / raw)
  To: Mark H. Wood, Linux kernel list

On Wednesday 15 January 2003 03:07 pm, Mark H. Wood wrote:
> On Fri, 10 Jan 2003, Horst von Brand wrote:
> > Peter Chubb <peter@chubb.wattle.id.au> said:
> > [...]
> >
> > > The problem is that in Unix the fundamental identity of a file is
> > > the tuple (blkdev, inum); names are merely indices (links) that resolve
> > > to that tuple.   Personally, I'd swap to a pair of system calls to map
> > > name to (blkdev, inum), and open(blkdev, inum).  Think of the inode
> > > number as a unique within-filesystem index.
> >
> > That way any joker can go ahead and open any file, without any regard to
> > permission bits on the directories that lead there. Not nice.
>
> Welcome to VMS, which can open files by INDEXF.SYS offset.  Some app.s
> which create and delete files rapidly never bother to make directory
> entries at all.  It may not be what you're used to, and it may be contrary
> to expected Unix semantics, but it's not unthinkable.

Or UNICOS, which then restricts the system call to only privileged operation:

This is (I believe) used to optimize a user mode NFS daemon by eliminating 
multiple namei translations (plus locking). The process is secure by not 
permitting the user to have the same privilege mapping of the daemon (thus
the old "kill nfsd" denial of service attack fails). There are also hints that 
this is used to optimize checkpoint/restart capabilities too.

NAME
     openi - Opens a file by using the inode number

SYNOPSIS
     int openi (long dev, long ino, long gen, long uflag);

IMPLEMENTATION
     Cray PVP systems

DESCRIPTION
     The openi system call presents the user with a flat view of all native
     UNICOS file systems currently mounted.  Rather than use the directory
     tree structure to search through directories for a file, openi
     provides access by inode number.

     The openi system call accepts the following arguments:

     dev       Specifies the device number as built by the makedev macro
               that is defined outside of the kernel.

     ino       Specifies an inode number for the file as reported by the ls
               -i command.

     gen       Specifies the generation number of the inode.  This provides
               a unique identification for a specific file.  The generation
               number changes when an inode is reused.  To print the inode
               generation values, use the fck(1) command with the -i and -l
               options.

     uflag     Specifies the open flags.  These are bit values of the form
               O_name that are defined in the fcntl.h file.

     Character, block, and FIFO special files are not allowed.  Specifying
     a dev and ino pair that point to one of these will produce an EINVAL
     error code.

NOTES
     Only a process with appropriate privilege can use this system call.

     If the PRIV_SU configuration option is enabled, the super user is
     allowed to use this system call.

     A process with the PRIV_MAC_READ and PRIV_DAC_OVERRIDE effective
     privileges is allowed to use this system call.  See the effective
     privilege discussion in the NOTES section of the open(2) man page for
     additional privilege requirements.  The open(2) search access
     discussions do not apply to this system call.

RETURN VALUES
     If openi completes successfully, a nonnegative integer is returned
     which may be used in further I/O operations.  Otherwise, openi returns
     a negative value, and errno is set to indicate the error.

ERRORS
     The openi system call fails to open the specified file if the
     following error condition or one of those listed on the open(2) man
     page occurs.

     Error Code          Description

     EINVAL              A dev and ino pair point to a character, block, or
                         FIFO special file.  The openi(2) system call does
                         not work with these types of files.

-- 
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.
