* Larger dev_t and major/minor split
@ 2003-03-20 21:42 H. Peter Anvin
2003-03-20 22:09 ` Joel Becker
2003-03-20 22:47 ` Greg KH
0 siblings, 2 replies; 6+ messages in thread
From: H. Peter Anvin @ 2003-03-20 21:42 UTC (permalink / raw)
To: linux-kernel
Since Linus opened for this the other day I guess I would like to
suggest it "officially":
Since glibc already runs with a 64-bit dev_t on, as far as I know, all
Linux platforms, which means that userspace is already taking the
performance hit, *and* since, in case it isn't murderously obvious by
now, changing the kernel/userspace interface for this is painful as
hell, I would like to suggest that:
a) We use a 32+32 bit split for dev_t. Major zero, minor < 65536
would be reserved for compatibility with the old 16-bit dev_t; it
still leaves the zero value as the "no device" entry. We could still
use major 0, minor >= 65536 as anonymous devices, or we could
switch to using major 255, which has been reserved for expansion for
the past eight years.
b) In order to support NFSv2 and other filesystems which only support
a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
long as that is practical. Note, however, that this only affects
using those filesystems for /dev, and I personally think it's not
too huge of a loss to say "well, if you use NFS for root, either
use NFSv3 or make /dev a tmpfs and extract a tarball from your
initrd."
All cases where we have to deal with a 32-bit dev_t on the wire or
on disk should use a 12+20 split.
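
For illustration only, the two splits above could be sketched roughly as
follows. All names (dev64_t, dev64_make, and so on) are hypothetical and
are not kernel interfaces. The message gives only the field widths, so
placing the major in the upper bits of each encoding is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; none are kernel interfaces. */
typedef uint64_t dev64_t;

/* 32+32 split: major in the upper word, minor in the lower (assumed). */
static dev64_t dev64_make(uint32_t major, uint32_t minor)
{
    return ((dev64_t)major << 32) | minor;
}

static uint32_t dev64_major(dev64_t dev) { return (uint32_t)(dev >> 32); }
static uint32_t dev64_minor(dev64_t dev) { return (uint32_t)(dev & 0xffffffffu); }

/* Major 0 with minor < 65536 is the window reserved for old 16-bit values. */
static bool dev64_is_legacy(dev64_t dev)
{
    return dev64_major(dev) == 0 && dev64_minor(dev) < 65536;
}

/* True if the pair fits the (12,20)-bit range for 32-bit formats. */
static bool dev64_fits_12_20(dev64_t dev)
{
    return dev64_major(dev) < (1u << 12) && dev64_minor(dev) < (1u << 20);
}

/* Pack into 32 bits as 12+20; major in the top 12 bits is an assumption. */
static uint32_t dev64_pack_12_20(dev64_t dev)
{
    assert(dev64_fits_12_20(dev));
    return (dev64_major(dev) << 20) | dev64_minor(dev);
}

/* Inverse of dev64_pack_12_20(). */
static dev64_t dev64_unpack_12_20(uint32_t packed)
{
    return dev64_make(packed >> 20, packed & 0xfffffu);
}
```

Any device that stays inside the (12,20) range survives a round trip
through the 32-bit form unchanged; anything larger would have to be
rejected before it reaches a 32-bit wire or disk format.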
How does this sound?
-hpa
--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
* Re: Larger dev_t and major/minor split
2003-03-20 21:42 Larger dev_t and major/minor split H. Peter Anvin
@ 2003-03-20 22:09 ` Joel Becker
2003-03-20 23:06 ` H. Peter Anvin
2003-03-20 22:47 ` Greg KH
1 sibling, 1 reply; 6+ messages in thread
From: Joel Becker @ 2003-03-20 22:09 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
> b) In order to support NFSv2 and other filesystems which only support
> a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
Hmm, I guess that means dropping ext2/3 for / ;-(
Joel
--
"Nothing is wrong with California that a rise in the ocean level
wouldn't cure."
- Ross MacDonald
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
* Re: Larger dev_t and major/minor split
2003-03-20 22:09 ` Joel Becker
@ 2003-03-20 23:06 ` H. Peter Anvin
2003-03-20 23:49 ` Joel Becker
0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2003-03-20 23:06 UTC (permalink / raw)
To: Joel Becker; +Cc: linux-kernel
Joel Becker wrote:
> On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
>
>>b) In order to support NFSv2 and other filesystems which only support
>> a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
>
>
> Hmm, I guess that means dropping ext2/3 for / ;-(
>
Last I checked, all traditional (inode-based) Unix filesystems,
including ext2/3, used block pointers for dev_t. There are plenty of
block pointers; 60 bytes' worth.
-hpa
* Re: Larger dev_t and major/minor split
2003-03-20 23:06 ` H. Peter Anvin
@ 2003-03-20 23:49 ` Joel Becker
2003-03-21 0:02 ` H. Peter Anvin
0 siblings, 1 reply; 6+ messages in thread
From: Joel Becker @ 2003-03-20 23:49 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On Thu, Mar 20, 2003 at 03:06:31PM -0800, H. Peter Anvin wrote:
> Last I checked, all traditional (inode-based) Unix filesystems,
> including ext2/3 used block pointers for dev_t. There are plenty of
> block pointers; 60 bytes worth.
They do indeed. But ext2/3 touches that block pointer with
cpu_to_le32() and friends. It needs fixing at best, and compatibility
work for already existing partitions.
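
A rough sketch of the concern, assuming the device number occupies a
single 4-byte little-endian slot as described above. The le32_slot type
and the helper names are hypothetical; slot_store() is only a portable
stand-in for writing out a cpu_to_le32() value, not the ext2 code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 4-byte on-disk block pointer slot. */
typedef struct { uint8_t b[4]; } le32_slot;

/* Portable equivalent of writing cpu_to_le32(v) to disk. */
static le32_slot slot_store(uint32_t v)
{
    le32_slot s = { { (uint8_t)v, (uint8_t)(v >> 8),
                      (uint8_t)(v >> 16), (uint8_t)(v >> 24) } };
    return s;
}

/* Portable equivalent of le32_to_cpu() on the bytes read back. */
static uint32_t slot_load(le32_slot s)
{
    return (uint32_t)s.b[0] | ((uint32_t)s.b[1] << 8) |
           ((uint32_t)s.b[2] << 16) | ((uint32_t)s.b[3] << 24);
}

/* Push a 64-bit device number through one slot; return what survives. */
static uint64_t roundtrip_one_slot(uint64_t dev)
{
    return slot_load(slot_store((uint32_t)dev));
}
```

An old 16-bit value passes through untouched, but a 64-bit dev_t loses
its upper 32 bits, which is why both the code and the handling of
existing partitions would need attention.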
Joel
--
"There is shadow under this red rock.
(Come in under the shadow of this red rock)
And I will show you something different from either
Your shadow at morning striding behind you
Or your shadow at evening rising to meet you.
I will show you fear in a handful of dust."
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
* Re: Larger dev_t and major/minor split
2003-03-20 23:49 ` Joel Becker
@ 2003-03-21 0:02 ` H. Peter Anvin
0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2003-03-21 0:02 UTC (permalink / raw)
To: Joel Becker; +Cc: linux-kernel
Joel Becker wrote:
> On Thu, Mar 20, 2003 at 03:06:31PM -0800, H. Peter Anvin wrote:
>
>>Last I checked, all traditional (inode-based) Unix filesystems,
>>including ext2/3 used block pointers for dev_t. There are plenty of
>>block pointers; 60 bytes worth.
>
> They do indeed. But ext2/3 touches that block pointer with
> cpu_to_le32() and friends. It needs fixing at best, and compatibility
> work for already existing partitions.
>
A few options:
a) Use an inode flag indicating a large dev_t. This is probably the
best option.
b) Use a sentinel value, e.g. 0xffffffff, to indicate that the major and
minor are in block pointers 1 and 2.
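
Option (b) might look something like the sketch below. The struct and
function names are hypothetical, the slots are assumed to be already in
CPU byte order, and treating slot 0 as an old 8+8 value when it is not
the sentinel is an assumption about the existing layout.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCK_SLOTS      15           /* 60 bytes / 4 bytes per pointer */
#define LARGE_DEV_SENTINEL 0xffffffffu  /* example value from option (b) */

/* Hypothetical view of the block pointer area, in CPU byte order. */
struct dev_slots {
    uint32_t slot[N_BLOCK_SLOTS];
};

static void dev_slots_store(struct dev_slots *s, uint32_t major, uint32_t minor)
{
    if (major < 256 && minor < 256) {
        /* Still fits the old 16-bit form: keep the existing layout. */
        s->slot[0] = (major << 8) | minor;
    } else {
        /* Large dev_t: sentinel in slot 0, major and minor in 1 and 2. */
        s->slot[0] = LARGE_DEV_SENTINEL;
        s->slot[1] = major;
        s->slot[2] = minor;
    }
}

static void dev_slots_load(const struct dev_slots *s,
                           uint32_t *major, uint32_t *minor)
{
    if (s->slot[0] == LARGE_DEV_SENTINEL) {
        *major = s->slot[1];
        *minor = s->slot[2];
    } else {
        *major = (s->slot[0] >> 8) & 0xff;
        *minor = s->slot[0] & 0xff;
    }
}
```

Old device nodes keep their current encoding, so existing partitions
stay readable; only nodes that need the larger range use the extra
slots.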
-hpa
* Re: Larger dev_t and major/minor split
2003-03-20 21:42 Larger dev_t and major/minor split H. Peter Anvin
2003-03-20 22:09 ` Joel Becker
@ 2003-03-20 22:47 ` Greg KH
1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2003-03-20 22:47 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
>
> a) We use a 32+32 bit split for dev_t. Major zero, minor < 65536
> would be reserved for compatibility with the old 16-bit dev_t; it
> still leaves the zero value as the "no device" entry. We could still
> use major 0, minor >= 65536 as anonymous devices, or we could
> switch to using major 255, which has been reserved for expansion for
> the past eight years.
Well, it seems that this is the most reasonable split, able to handle
everyone for a long time. I can live with it, if only to keep people
from Oracle quiet :)
thanks,
greg k-h