* regarding major number of block extended devt
@ 2008-09-02 12:26 Tejun Heo
2008-09-02 12:35 ` Tejun Heo
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2008-09-02 12:26 UTC (permalink / raw)
To: device, Linux Kernel Mailing List, Jens Axboe
Hello,
Extended devt is scheduled for 2.6.28 merge and is currently using 259.
Can extended devt keep this major or should it use something else?
Thanks.
--
tejun
* Re: regarding major number of block extended devt
2008-09-02 12:26 regarding major number of block extended devt Tejun Heo
@ 2008-09-02 12:35 ` Tejun Heo
2008-09-02 20:16 ` H. Peter Anvin
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2008-09-02 12:35 UTC (permalink / raw)
To: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> Hello,
>
> Extended devt is scheduled for 2.6.28 merge and is currently using 259.
> Can extended devt keep this major or should it use something else?
Oops, I forgot to include the information about ext devt.
259 block Block extended device numbers
This is a pool of dynamically allocated block device
numbers. Currently, ide and sd overflow into this
region if there are more partitions than the existing
device number scheme can accommodate, but any block
device can use it and it's not restricted to
overflows.
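To make the description above concrete, here is a minimal sketch of how a userspace program might tell whether a device number comes from the extended pool. It assumes major 259 as proposed in this thread and major 8 for traditional sd numbering; the helper name uses_extended_devt is hypothetical, and the device numbers are built by hand rather than read from real device nodes.

```python
import os

# Major proposed in this thread for block extended device numbers.
BLOCK_EXT_MAJOR = 259


def uses_extended_devt(dev):
    """Return True if the major of this device number is the extended-devt major."""
    return os.major(dev) == BLOCK_EXT_MAJOR


# Sample device numbers, constructed instead of stat()ing real nodes.
sda1 = os.makedev(8, 1)        # traditional sd numbering under major 8
overflow = os.makedev(259, 3)  # a partition allocated from the extended pool

print(uses_extended_devt(sda1), uses_extended_devt(overflow))
```

On a live system the device number would normally come from os.stat(path).st_rdev on the device node.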
Thanks.
--
tejun
* Re: regarding major number of block extended devt
2008-09-02 12:35 ` Tejun Heo
@ 2008-09-02 20:16 ` H. Peter Anvin
2008-09-03 4:13 ` Tejun Heo
0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-02 20:16 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> Tejun Heo wrote:
>> Hello,
>>
>> Extended devt is scheduled for 2.6.28 merge and is currently using 259.
>> Can extended devt keep this major or should it use something else?
>
> Oops, I forgot to write information about ext devt.
>
> 259 block Block extended device numbers
>
> This is a pool of dynamically allocated block device
> numbers. Currently, ide and sd overflow into this
> region if there are more partitions than the existing
> device number scheme can accommodate, but any block
> device can use it and it's not restricted to
> overflows.
>
It would seem better to simply use the high minors on the
already-existing ide and scsi majors?
-hpa
* Re: regarding major number of block extended devt
2008-09-02 20:16 ` H. Peter Anvin
@ 2008-09-03 4:13 ` Tejun Heo
2008-09-03 16:12 ` H. Peter Anvin
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2008-09-03 4:13 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: device, Linux Kernel Mailing List, Jens Axboe
H. Peter Anvin wrote:
>> 259 block Block extended device numbers
>>
>> This is a pool of dynamically allocated block device
>> numbers. Currently, ide and sd overflow into this
>> region if there are more partitions than the existing
>> device number scheme can accommodate, but any block
>> device can use it and it's not restricted to
>> overflows.
>>
>
> It would seem better to simply use the high minors on the
> already-existing ide and scsi majors?
I thought it would be better to break from those majors to make it clear
that the traditional minor allocation scheme isn't followed anymore.
Programs which expect certain majors are likely to need updating in how
they deal with minors, so....
Thanks.
--
tejun
* Re: regarding major number of block extended devt
2008-09-03 4:13 ` Tejun Heo
@ 2008-09-03 16:12 ` H. Peter Anvin
2008-09-03 16:21 ` Tejun Heo
0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-03 16:12 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
>>>
>> It would seem better to simply use the high minors on the
>> already-existing ide and scsi majors?
>
> I thought it would be better to break from those majors to make it clear
> that the traditional minor allocation scheme isn't followed anymore.
> Programs which expect certain majors are likely to need updating in how
> they deal with minors, so....
>
But just allocating a big bucket of device numbers and throwing them all
into a pot semirandomly is likely to cause more damage, not less.
-hpa
* Re: regarding major number of block extended devt
2008-09-03 16:12 ` H. Peter Anvin
@ 2008-09-03 16:21 ` Tejun Heo
2008-09-03 16:27 ` H. Peter Anvin
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2008-09-03 16:21 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: device, Linux Kernel Mailing List, Jens Axboe
H. Peter Anvin wrote:
> But just allocating a big bucket of device numbers and throwing them all
> into a pot semirandomly is likely to cause more damage, not less.
To use ext devt, the system has to use udev for device numbers. As long
as udev is used, the major number doesn't matter. In addition, breaking
drastically (e.g. can't find the device) seems better than subtle
failure (e.g. weird partition number calculation based on the
traditional minor number scheme) and CONFIG_DEBUG_BLOCK_EXT_DEVT is
exactly aimed at making breakages obvious.
I don't really see that there's much to gain by sharing the original
major numbers.
Thanks.
--
tejun
* Re: regarding major number of block extended devt
2008-09-03 16:21 ` Tejun Heo
@ 2008-09-03 16:27 ` H. Peter Anvin
2008-09-03 16:45 ` Tejun Heo
0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-03 16:27 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> H. Peter Anvin wrote:
>> But just allocating a big bucket of device numbers and throwing them all
>> into a pot semirandomly is likely to cause more damage, not less.
>
> To use ext devt, the system has to use udev for device numbers. As long
> as udev is used, the major number doesn't matter.
I'm sorry, but that's simply false. There is a *lot* of code out there
that assumes you can determine what the device is by correlating the
major number with /proc/devices.
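The kind of lookup described above can be sketched as follows. This is a minimal illustration, not code from any particular program: the function name parse_block_majors and the sample text are assumptions, with the sample merely modelled on the usual /proc/devices layout (a "Character devices:" section followed by a "Block devices:" section).

```python
# Sample text modelled on /proc/devices; the entries are illustrative only.
SAMPLE = """Character devices:
  1 mem
  4 tty

Block devices:
  8 sd
259 blkext
"""


def parse_block_majors(text):
    """Map block major numbers to the driver names listed next to them."""
    majors = {}
    in_block_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith("devices:"):
            # Section header: track whether the block section is starting.
            in_block_section = stripped.startswith("Block")
            continue
        if in_block_section and stripped:
            number, name = stripped.split(None, 1)
            majors[int(number)] = name
    return majors


block_majors = parse_block_majors(SAMPLE)
print(block_majors.get(8), block_majors.get(259))
```

A program built this way identifies the driver purely from the major, which is why a partition that moves to a different major looks like a different kind of device to it.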
-hpa
* Re: regarding major number of block extended devt
2008-09-03 16:27 ` H. Peter Anvin
@ 2008-09-03 16:45 ` Tejun Heo
2008-09-03 16:57 ` H. Peter Anvin
2008-09-03 18:11 ` H. Peter Anvin
0 siblings, 2 replies; 15+ messages in thread
From: Tejun Heo @ 2008-09-03 16:45 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: device, Linux Kernel Mailing List, Jens Axboe
H. Peter Anvin wrote:
> Tejun Heo wrote:
>> To use ext devt, the system has to use udev for device numbers. As long
>> as udev is used, the major number doesn't matter.
>
> I'm sorry, but that's simply false. There is a *lot* of code out there
> that assumes you can determine what the device is by correlating the
> major number with /proc/devices.
Then we're between a rock and a hard place, as there is also a
lot of code which assumes a certain layout of sd or hd minor numbers.
Keeping only the major numbers doesn't really resolve any problem. It
may be able to mask a few but that can be more harmful than helpful.
So, if a program expects certain major numbers, it won't be able to
access the partitions which have overflowed to the extended area. If
a program uses udev or sys hierarchy to walk through devices, it will
be able to use them all. Isn't that much better than overflowing into
the same major and hoping that everything would work out okay?
Thanks.
--
tejun
* Re: regarding major number of block extended devt
2008-09-03 16:45 ` Tejun Heo
@ 2008-09-03 16:57 ` H. Peter Anvin
2008-09-03 18:11 ` H. Peter Anvin
1 sibling, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-03 16:57 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> Then we're between a rock and a hard place, as there is also a
> lot of code which assumes a certain layout of sd or hd minor numbers.
> Keeping only the major numbers doesn't really resolve any problem. It
> may be able to mask a few but that can be more harmful than helpful.
>
> So, if a program expects certain major numbers, it won't be able to
> access the partitions which have overflowed to the extended area. If
> a program uses udev or sys hierarchy to walk through devices, it will
> be able to use them all. Isn't that much better than overflowing into
> the same major and hoping that everything would work out okay?
Oh dear...
I just realized that you're talking about *partitions*, not *devices*.
There is a metric boatload of code out there that assumes you can take a
device number, mask off some number of bits, and reach the parent
device. They will generally do that without checking if they are right
or not.
As such, you're liable to suffer corruption of unrelated devices.
In that sense, yes, a separate major will help somewhat.
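The masking habit described above can be sketched like this. It assumes the traditional sd layout of 16 minors per disk; the helper name guess_parent is hypothetical, and the extended minor 21 is an arbitrary example value.

```python
import os

# Traditional sd layout: 16 minors per disk (whole disk plus 15 partitions).
SD_MINORS_PER_DISK = 16


def guess_parent(dev):
    """Legacy heuristic: clear the low minor bits and keep the major as-is."""
    parent_minor = os.minor(dev) & ~(SD_MINORS_PER_DISK - 1)
    return os.makedev(os.major(dev), parent_minor)


# Traditional numbering: 8:3 (sda3) masks down to 8:0 (sda), as intended.
legacy_parent = guess_parent(os.makedev(8, 3))

# Extended numbering: minors are handed out dynamically, so 259:21 masks
# down to 259:16, which need not be a whole disk or even the same disk.
ext_parent = guess_parent(os.makedev(259, 21))

print(os.major(legacy_parent), os.minor(legacy_parent))
print(os.major(ext_parent), os.minor(ext_parent))
```

The heuristic never checks its own result, which is the failure mode raised above: within the extended major it silently lands on whatever device happens to own the masked minor.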
-hpa
* Re: regarding major number of block extended devt
2008-09-03 16:45 ` Tejun Heo
2008-09-03 16:57 ` H. Peter Anvin
@ 2008-09-03 18:11 ` H. Peter Anvin
2008-09-04 0:25 ` Tejun Heo
1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-03 18:11 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> H. Peter Anvin wrote:
>> Tejun Heo wrote:
>>> To use ext devt, the system has to use udev for device numbers. As long
>>> as udev is used, the major number doesn't matter.
>> I'm sorry, but that's simply false. There is a *lot* of code out there
>> that assumes you can determine what the device is by correlating the
>> major number with /proc/devices.
>
> Then we're between a rock and a hard place, as there is also a
> lot of code which assumes a certain layout of sd or hd minor numbers.
> Keeping only the major numbers doesn't really resolve any problem. It
> may be able to mask a few but that can be more harmful than helpful.
>
Thinking about it some more, one invariant this is *guaranteed* to
violate is:
partition_number = partition_device - master_device
Code that needs a partition number (which is common enough) is using
this invariant, because (a) it has held for 17 years and (b) there is
still no alternative other than relying on fragile naming scheme
hacks.
(a) we can't do anything about, but (b) we can, by introducing a
partition number attribute in sysfs.
I would consider that attribute a precondition for this change.
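A minimal sketch of the two approaches, under stated assumptions: the attribute file name "partition" and both function names are hypothetical (the thread only proposes that such an attribute exist), and a temporary directory stands in for the partition's sysfs directory.

```python
import os
import tempfile


def partition_number_legacy(part_dev, disk_dev):
    """Legacy invariant: partition number = partition devt - whole-disk devt."""
    return part_dev - disk_dev


def partition_number_sysfs(part_dir, attr="partition"):
    """Read the partition number from an attribute file, or None if absent."""
    try:
        with open(os.path.join(part_dir, attr)) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return None


# The invariant holds for traditional numbering: 8:3 minus 8:0 is 3.
legacy_ok = partition_number_legacy(os.makedev(8, 3), os.makedev(8, 0))

# It breaks once the partition lives under another major: 259:5 minus 8:0
# does not give a usable partition number.
legacy_broken = partition_number_legacy(os.makedev(259, 5), os.makedev(8, 0))

# Stand-in for a sysfs partition directory carrying the proposed attribute.
with tempfile.TemporaryDirectory() as fake_part_dir:
    with open(os.path.join(fake_part_dir, "partition"), "w") as f:
        f.write("17\n")
    from_attr = partition_number_sysfs(fake_part_dir)
    missing = partition_number_sysfs(fake_part_dir, attr="no_such_attr")

print(legacy_ok, legacy_broken, from_attr, missing)
```

Returning None when the attribute is missing lets a caller fall back to the subtraction on kernels that do not provide it.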
-hpa
* Re: regarding major number of block extended devt
2008-09-03 18:11 ` H. Peter Anvin
@ 2008-09-04 0:25 ` Tejun Heo
2008-09-04 0:28 ` H. Peter Anvin
2008-09-04 11:50 ` Jens Axboe
0 siblings, 2 replies; 15+ messages in thread
From: Tejun Heo @ 2008-09-04 0:25 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Hello,
H. Peter Anvin wrote:
> Thinking about it some more, one invariant this is *guaranteed* to
> violate is:
>
> partition_number = partition_device - master_device
>
> Code that needs a partition number (which is common enough) is using
> this invariant, because (a) it has held for 17 years and (b) there is
> still no alternative other than relying on fragile naming scheme
> hacks.
>
> (a) we can't do anything about, but (b) we can, by introducing a
> partition number attribute in sysfs.
Yeah, that would certainly be a nice addition. Also, if partitions
are made proper classes, they'll be easily enumerable by
/sys/block/*/partitions/*.
Jens, what do you think?
--
tejun
* Re: regarding major number of block extended devt
2008-09-04 0:25 ` Tejun Heo
@ 2008-09-04 0:28 ` H. Peter Anvin
2008-09-04 0:35 ` Tejun Heo
2008-09-04 11:50 ` Jens Axboe
1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-04 0:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
>>
>> (a) we can't do anything about, but (b) we can, by introducing a
>> partition number attribute in sysfs.
>
> Yeah, that would certainly be a nice addition. Also, if partitions
> are made proper classes, they'll be easily enumerable by
> /sys/block/*/partitions/*.
>
> Jens, what do you think?
>
Note that adding /partitions/ is somewhat unlikely to be useful, since
existing code will have to search through random crap in sysfs to look
for the partition directories anyway.
-hpa
* Re: regarding major number of block extended devt
2008-09-04 0:28 ` H. Peter Anvin
@ 2008-09-04 0:35 ` Tejun Heo
2008-09-04 0:43 ` H. Peter Anvin
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2008-09-04 0:35 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: device, Linux Kernel Mailing List, Jens Axboe
H. Peter Anvin wrote:
> Note that adding /partitions/ is somewhat unlikely to be useful, since
> existing code will have to search through random crap in sysfs to look
> for the partition directories anyway.
Currently it has to list /sys/block/DEV/DEV[-]N/. With proper
classification, it can do /sys/block/*/partitions/*. We'll need to keep
around symlinks at the root level. It also plays well with how other
subsystems have been changing.
--
tejun
* Re: regarding major number of block extended devt
2008-09-04 0:35 ` Tejun Heo
@ 2008-09-04 0:43 ` H. Peter Anvin
0 siblings, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2008-09-04 0:43 UTC (permalink / raw)
To: Tejun Heo; +Cc: device, Linux Kernel Mailing List, Jens Axboe
Tejun Heo wrote:
> H. Peter Anvin wrote:
>> Note that adding /partitions/ is somewhat unlikely to be useful, since
>> existing code will have to search through random crap in sysfs to look
>> for the partition directories anyway.
>
> Currently it has to list /sys/block/DEV/DEV[-]N/. With proper
> classification, it can do /sys/block/*/partitions/*. We'll need to keep
> around symlinks at the root level. It also plays well with how other
> subsystems have been changing.
>
Yes, my point was mostly that in order to support older kernels, most
code is going to want to just access /sys/block/DEV/DEV*N/ anyway. What
I have done in my code is I do a readdir() on /sys/block/DEV and look
for subdirectories with a "dev" member.
Changing them to symlinks would actually break at least my code
(arguably bad programming on my part), since I optimize by looking for
DT_DIR.
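The scan described above can be sketched roughly as follows. This is an illustration under assumptions, not the code referred to in the message: the function name find_partitions is hypothetical, a temporary directory tree stands in for /sys/block/DEV, and entry.is_dir(follow_symlinks=False) plays the role of the DT_DIR check.

```python
import os
import tempfile


def find_partitions(disk_dir):
    """List real subdirectories of disk_dir that contain a "dev" member."""
    names = []
    with os.scandir(disk_dir) as entries:
        for entry in entries:
            # Not following symlinks mirrors a DT_DIR test: a symlink that
            # points at a partition directory is skipped.
            if not entry.is_dir(follow_symlinks=False):
                continue
            if os.path.exists(os.path.join(entry.path, "dev")):
                names.append(entry.name)
    return sorted(names)


# Stand-in for /sys/block/sda: two partitions, one unrelated directory,
# and one symlink to a partition directory.
with tempfile.TemporaryDirectory() as root:
    disk = os.path.join(root, "sda")
    for part in ("sda1", "sda2"):
        os.makedirs(os.path.join(disk, part))
        with open(os.path.join(disk, part, "dev"), "w") as f:
            f.write("8:%s\n" % part[-1])
    os.makedirs(os.path.join(disk, "queue"))  # no "dev" member, ignored
    os.symlink(os.path.join(disk, "sda1"), os.path.join(disk, "sda1-link"))
    found = find_partitions(disk)

print(found)
```

The symlink is ignored even though it resolves to a directory with a "dev" member, which shows how a move to symlinks would hide partitions from code written this way.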
-hpa
* Re: regarding major number of block extended devt
2008-09-04 0:25 ` Tejun Heo
2008-09-04 0:28 ` H. Peter Anvin
@ 2008-09-04 11:50 ` Jens Axboe
1 sibling, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2008-09-04 11:50 UTC (permalink / raw)
To: Tejun Heo; +Cc: H. Peter Anvin, device, Linux Kernel Mailing List
On Thu, Sep 04 2008, Tejun Heo wrote:
> Hello,
>
> H. Peter Anvin wrote:
> > Thinking about it some more, one invariant this is *guaranteed* to
> > violate is:
> >
> > partition_number = partition_device - master_device
> >
> > Code that needs a partition number (which is common enough) is using
> > this invariant, because (a) it has held for 17 years and (b) there is
> > still no alternative other than relying on fragile naming scheme
> > hacks.
> >
> > (a) we can't do anything about, but (b) we can, by introducing a
> > partition number attribute in sysfs.
>
> Yeah, that would certainly be a nice addition. Also, if partitions
> are made proper classes, they'll be easily enumerable by
> /sys/block/*/partitions/*.
>
> Jens, what do you think?
Agree, that addition definitely makes sense.
--
Jens Axboe
Thread overview: 15+ messages
2008-09-02 12:26 regarding major number of block extended devt Tejun Heo
2008-09-02 12:35 ` Tejun Heo
2008-09-02 20:16 ` H. Peter Anvin
2008-09-03 4:13 ` Tejun Heo
2008-09-03 16:12 ` H. Peter Anvin
2008-09-03 16:21 ` Tejun Heo
2008-09-03 16:27 ` H. Peter Anvin
2008-09-03 16:45 ` Tejun Heo
2008-09-03 16:57 ` H. Peter Anvin
2008-09-03 18:11 ` H. Peter Anvin
2008-09-04 0:25 ` Tejun Heo
2008-09-04 0:28 ` H. Peter Anvin
2008-09-04 0:35 ` Tejun Heo
2008-09-04 0:43 ` H. Peter Anvin
2008-09-04 11:50 ` Jens Axboe