public inbox for linux-kernel@vger.kernel.org
* 2.5.2-pre7 still missing bits of kdev_t
From: Alessandro Suardi @ 2002-01-04  0:05 UTC
  To: linux-kernel; +Cc: andries.brouwer, torvalds

Merging Andries' changes for these files gets me a full build:

./fs/reiserfs/inode.c
./fs/reiserfs/super.c
./fs/reiserfs/journal.c
./fs/ext3/super.c
./include/linux/reiserfs_fs.h

--alessandro

 "this machine will, will not communicate
   these thoughts and the strain I am under
  be a world child, form a circle before we all go under"
                         (Radiohead, "Street Spirit [fade out]")

* Re: 2.5.2-pre7 still missing bits of kdev_t
From: Andries.Brouwer @ 2002-01-04  1:47 UTC
  To: alessandro.suardi, jgarzik; +Cc: andries.brouwer, linux-kernel, torvalds

    From jgarzik@mandrakesoft.com Fri Jan  4 01:13:34 2002

    reiserfs is blindly storing the kernel's kdev_t value raw to disk.

    AFAICS this will need a policy decision not just cleanup, before it
    works in 2.5.2 properly.  If we switch the kernel to 12:20 major:minor
    numbers, suddenly the reiserfs disk format changes based on kernel
    version, and earlier kernels see corrupted major:minor numbers.

No, not really. For how to do this, see a fragment of example code
that Linus removed from kdev_t.h in pre6, it went something like
(adapted for 12+20 instead of 16+16):

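/* Major: the top 12 bits if any are set, else the bits above the 8-bit minor. */
int major(dev_t dev) {
	int ma;

	ma = (dev >> 20);
	if (!ma)
		ma = (dev >> 8);
	return ma;
}

/* Minor: 20 bits in the new layout, 8 bits in the old one. */
int minor(dev_t dev) {
	if (dev >> 20)
		return (dev & 0xfffff);
	else
		return (dev & 0xff);
}

/* Use the old layout whenever the minor fits in 8 bits. */
dev_t mkdev(int ma, int mi) {
	if (mi & ~0xff)
		return ((ma << 20) + mi);
	else
		return ((ma << 8) + mi);
}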

(with the correctness conditions that ma is 12-bit,
mi is 20-bit, and major 0 has only 8-bit minors).

You see that the representation of old values does not change.
No disk corruption.
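
The claim that old values keep their representation can be checked
mechanically. The following is a minimal userspace sketch, not kernel
code: it assumes the on-disk dev_t is a 32-bit unsigned value, and the
names disk_dev_t, dev_major, dev_minor, dev_make and
check_legacy_values_unchanged are made up here to avoid clashing with
the libc major()/minor() macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the on-disk device number is a 32-bit unsigned value. */
typedef uint32_t disk_dev_t;

/* Major: top 12 bits if any are set, else the bits above the 8-bit minor. */
static unsigned dev_major(disk_dev_t dev)
{
	unsigned ma = dev >> 20;

	if (!ma)
		ma = dev >> 8;
	return ma;
}

/* Minor: 20 bits in the new layout, 8 bits in the old one. */
static unsigned dev_minor(disk_dev_t dev)
{
	if (dev >> 20)
		return dev & 0xfffff;
	return dev & 0xff;
}

/* Pack; the old layout is used whenever the minor fits in 8 bits. */
static disk_dev_t dev_make(unsigned ma, unsigned mi)
{
	if (mi & ~0xffu)
		return ((disk_dev_t)ma << 20) + mi;
	return ((disk_dev_t)ma << 8) + mi;
}

/* Every legacy 8+8 pair must encode and decode exactly as before. */
static void check_legacy_values_unchanged(void)
{
	for (unsigned ma = 0; ma < 256; ma++) {
		for (unsigned mi = 0; mi < 256; mi++) {
			disk_dev_t old = (ma << 8) | mi;

			assert(dev_make(ma, mi) == old);
			assert(dev_major(old) == ma);
			assert(dev_minor(old) == mi);
		}
	}
}
```

Calling check_legacy_values_unchanged() walks all 65536 old pairs. A
pair with a minor above 255 and a nonzero major, such as (8, 0x100),
lands in the new layout instead and still round-trips.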

Andries


[I didn't spell it out, but you understand: the dev_t is the on-disk
format, the conversion finds the major and minor, and these are
combined again into a kdev_t for use by the kernel]

[Similar code occurs in isofs/rock.c, where a 64-bit dev
must be converted.]

[I don't know whether reiserfs is a Linux-only filesystem.
If it is not, and has a disk format that is OS-independent,
a third struct stat_data might be needed, since the 12+20
is not universal, so 32+32 would be the better choice.]

* Re: 2.5.2-pre7 still missing bits of kdev_t
From: Andries.Brouwer @ 2002-01-04 13:21 UTC
  To: alessandro.suardi, andries.brouwer, jgarzik, torvalds
  Cc: linux-fsdevel, linux-kernel

Jeff Garzik wrote [on reiserfs]:

> granted you can stick a kdev_to_nr in there but it's still an FS policy
> decision at that point, IMHO...

Yes, for today we stick a kdev_t_to_nr in there and preserve
old behaviour, that is, nothing changes and no policy decisions
have been made. It should have been there from the start.

For next week, when larger-than-16-bit device numbers are
introduced, the proper code everywhere (on all interfaces
with the outside world: stat, mknod with user space and
special device nodes on disk and network filesystems)
would unpack the kdev_t into major and minor, and pack
again to the dev_t required by this interface (and vice versa).
That is, in principle, there is no global, unique, kdev_t_to_nr.
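
The unpack-and-repack idea can be sketched in userspace as follows.
This is an illustration only: struct devnum and the pack_*/unpack_*
helpers are hypothetical names, and the two layouts shown are the
plain 8+8 and 12+20 ones, not the backward-compatible hybrid
discussed earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical internal view: major and minor are kept apart. */
struct devnum {
	uint32_t major;
	uint32_t minor;
};

/* Pack for an interface with a 16-bit 8+8 dev_t; -1 if it does not fit. */
static int pack_8_8(struct devnum d, uint16_t *out)
{
	if (d.major > 0xff || d.minor > 0xff)
		return -1;
	*out = (uint16_t)((d.major << 8) | d.minor);
	return 0;
}

/* Pack for an interface with a 32-bit 12+20 dev_t; -1 if it does not fit. */
static int pack_12_20(struct devnum d, uint32_t *out)
{
	if (d.major > 0xfff || d.minor > 0xfffff)
		return -1;
	*out = (d.major << 20) | d.minor;
	return 0;
}

/* The reverse direction: each interface has its own unpacker. */
static struct devnum unpack_8_8(uint16_t v)
{
	struct devnum d = { (uint32_t)v >> 8, v & 0xffu };
	return d;
}

static struct devnum unpack_12_20(uint32_t v)
{
	struct devnum d = { v >> 20, v & 0xfffffu };
	return d;
}
```

Each interface gets its own pair of functions, and a pair that cannot
be represented on a narrow interface is reported as an error rather
than silently truncated.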

This is done already in most places, but reiserfs is one
of the exceptions, and they'll need a policy decision
on how to pack. In fact ext2 needs precisely the same
policy decision.

The details are rather unimportant - device numbers are
nonportable, so if we transport an ext2 disk to some
other OS and it sees different major,minor pairs, there
is no big catastrophe. Still, I have heard many a complaint
from sysadmins who needed to do _mknod foo x ma mi_
on some NFS mounted filesystem and had to make some
computations to decide on the right ma' mi' to use.
Installation scripts fail over NFS.

That is, even though a device number must be regarded
as a cookie, the fact that the mknod command separates
that cookie into two parts means that the way the
on-disk dev_t is separated belongs to the definition
of the on-disk filesystem format.
Now that 8+24, 12+20, 14+18, 32+32 all occur, the easy
way to solve all problems for a filesystem is to use 32+32.
That is what NFSv3 does, and isofs, etc.
If it is possible, the right policy no doubt is to store 32+32.
If there is no room for that then one just has to live with
the fact that the filesystem image is somewhat less portable.
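
A 32+32 on-disk field might look like the sketch below. It is not
taken from any filesystem: the 8-byte layout (major first, then
minor), the little-endian byte order, and the helper names are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Write a 32-bit value as four bytes, least significant first. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read four little-endian bytes back into a 32-bit value. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical on-disk field: 32-bit major followed by 32-bit minor. */
static void dev_store_32_32(uint8_t out[8], uint32_t ma, uint32_t mi)
{
	put_le32(out, ma);
	put_le32(out + 4, mi);
}

static void dev_load_32_32(const uint8_t in[8], uint32_t *ma, uint32_t *mi)
{
	*ma = get_le32(in);
	*mi = get_le32(in + 4);
}
```

Because major and minor occupy separate full-width fields, no split
point is baked into the format, and any of the 8+24, 12+20 or 14+18
conventions can be mapped onto it without loss.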

Andries

* Re: 2.5.2-pre7 still missing bits of kdev_t
From: Andries.Brouwer @ 2002-01-04 19:24 UTC
  To: torvalds, viro
  Cc: Nikita, alessandro.suardi, andries.brouwer, jgarzik, linux-kernel

    From viro@math.psu.edu Fri Jan  4 19:11:10 2002

    On Fri, 4 Jan 2002, Linus Torvalds wrote:

    > On Fri, 4 Jan 2002, Jeff Garzik wrote:
    > >
    > > As mentioned to viro on IRC, I think init_special_inode should take
    > > major and minor arguments, to nudge the filesystem implementors into
    > > thinking that major and minor should be treated separately, and be
    > > given additional thought as to how they are encoded on-disk.
    > 
    > Yes. If somebody sends me a patch, I'll apply it in a jiffy.

    Guys, wait a minute with that.  There is a related issue (->i_rdev
    becoming dev_t) and I'd rather see it handled first.

Those are independent issues.

If init_special_inode() has major,minor arguments instead of
the present rdev, then the line

	inode->i_rdev = to_kdev_t(rdev);

just becomes

	inode->i_rdev = mk_kdev(major,minor);

I consider every occurrence of mk_kdev() and of to_kdev_t()
a flaw in the kernel, so this change does not make things
better or worse inside init_special_inode().
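
The shape of the proposed interface can be mocked up in userspace as
below. None of this is the real kernel code: the mock_ types and
functions are hypothetical stand-ins, and the representation chosen
for the mock kdev_t is an assumption made only so the example
compiles on its own.

```c
#include <assert.h>

/* Hypothetical stand-in for kdev_t; the real type is kernel-internal. */
typedef struct {
	unsigned major;
	unsigned minor;
} mock_kdev_t;

/* Hypothetical stand-in for the relevant part of struct inode. */
struct mock_inode {
	unsigned mode;
	mock_kdev_t i_rdev;
};

static mock_kdev_t mock_mk_kdev(unsigned major, unsigned minor)
{
	mock_kdev_t dev = { major, minor };
	return dev;
}

/*
 * The filesystem decodes its own on-disk format first and passes the
 * pair in, so no packed dev_t crosses this interface.
 */
static void mock_init_special_inode(struct mock_inode *inode, unsigned mode,
				    unsigned major, unsigned minor)
{
	inode->mode = mode;
	inode->i_rdev = mock_mk_kdev(major, minor);
}
```

With this signature the caller has to decide how its on-disk value
splits into major and minor before the call, which is the nudge the
proposal is after.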

Andries

* Re: 2.5.2-pre7 still missing bits of kdev_t
From: Andries.Brouwer @ 2002-01-04 19:32 UTC
  To: Nikita, jgarzik
  Cc: alessandro.suardi, andries.brouwer, linux-kernel, torvalds, viro

> (I suggested having init_special_inode taking a kdev_t argument as its
> third arg, but viro yelled at me :))

Yes. If you think that a kdev_t is a pointer to a struct with
device information, then having a kdev_t there is wrong,
because a special device node can have arbitrary major,minor
not necessarily belonging to any device, so rdev should just
have the numbers.

Andries


Thread overview: 16+ messages
2002-01-04  0:05 2.5.2-pre7 still missing bits of kdev_t Alessandro Suardi
2002-01-04  0:13 ` Jeff Garzik
2002-01-04  2:32   ` Linus Torvalds
2002-01-04  6:52     ` Jeff Garzik
2002-01-04 16:45       ` Linus Torvalds
2002-01-04 17:34         ` Nikita Danilov
2002-01-04 17:51           ` Jeff Garzik
2002-01-04 17:53             ` Linus Torvalds
2002-01-04 18:11               ` Alexander Viro
  -- strict thread matches above, loose matches on Subject: below --
2002-01-04  1:47 Andries.Brouwer
2002-01-04 13:21 Andries.Brouwer
2002-01-04 19:24 Andries.Brouwer
2002-01-04 21:10 ` Alexander Viro
2002-01-04 21:14   ` Linus Torvalds
2002-01-05  0:10     ` Alan Cox
2002-01-04 19:32 Andries.Brouwer
