public inbox for linux-kernel@vger.kernel.org
* 2.4.11 loses sda9
@ 2001-10-11  4:22 arvest
  0 siblings, 0 replies; 20+ messages in thread
From: arvest @ 2001-10-11  4:22 UTC (permalink / raw)
  To: linux-kernel


  I recompiled (I used the same .10 conf) and rebooted, but my reboot halted 
because /dev/sda9 didn't exist.  I checked this in fdisk, and it didn't see it.
 I rebooted to the 2.4.10 kernel, and sda9 was there.  What happened?


* Re: 2.4.11 loses sda9
@ 2001-10-11  5:07 Alexander Viro
  2001-10-11  5:20 ` arvest
  2001-10-11 18:22 ` Guest section DW
  0 siblings, 2 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11  5:07 UTC (permalink / raw)
  To: linux-kernel



>  I recompiled (I used the same .10 conf) and rebooted, but my reboot halted 
>because /dev/sda9 didn't exist.  I checked this in fdisk, and it didn't see it.
> I rebooted to the 2.4.10 kernel, and sda9 was there.  What happened?

Information from fdisk would help - from both versions (with 2.4.11 you'll
need to boot with init=/bin/sh, obviously).  It may be a bug in partition
code, it may be something fishy with guessing geometry (SCSI uses bread()
for that) and it may be something fishy in block devices in pagecache stuff.

If you have sfdisk, sfdisk /dev/sda -O /tmp/foo + mailing the result would
make debugging the thing much simpler (that one - from the 2.4.10).




* Re: 2.4.11 loses sda9
  2001-10-11  5:07 Alexander Viro
@ 2001-10-11  5:20 ` arvest
  2001-10-11  5:23   ` Alexander Viro
  2001-10-11 18:22 ` Guest section DW
  1 sibling, 1 reply; 20+ messages in thread
From: arvest @ 2001-10-11  5:20 UTC (permalink / raw)
  To: Alexander Viro, linux-kernel

On Thursday 11 October 2001 00:07, Alexander Viro wrote:
> >  I recompiled (I used the same .10 conf) and rebooted, but my reboot
> > halted because /dev/sda9 didn't exist.  I checked this in fdisk, and it
> > didn't see it. I rebooted to the 2.4.10 kernel, and sda9 was there.  What
> > happened?
>
> Information from fdisk would help - from both versions (with 2.4.11 you'll
> need to boot with init=/bin/sh, obviously).  It may be a bug in partition
> code, it may be something fishy with guessing geometry (SCSI uses bread()
> for that) and it may be something fishy in block devices in pagecache
> stuff.
>
> If you have sfdisk, sfdisk /dev/sda -O /tmp/foo + mailing the result would
> make debugging the thing much simpler (that one - from the 2.4.10).

  I can get the system booted enough to work on (and totally up) with this
partition failing.  I don't know what more information from fdisk I can give
you: sda9 is there with .10, and gone with .11.  It even allowed me to add a
new partition (I didn't save).  I tried sfdisk but it gave me these errors.

sfdisk /dev/sda -O /tmp/foo
Checking that no-one is using this disk right now ...
BLKRRPART: Device or resource busy
 
This disk is currently in use - repartitioning is probably a bad idea.
Umount all file systems, and swapoff all swap partitions on this disk.
Use the --no-reread flag to suppress this check.
Use the --force flag to overrule all checks.

  I didn't try the flags; I'm worried that it's going to overwrite my
filesystem.  Here's my /proc/scsi/sym53c8xx/0 in case it's needed.  My system
is entirely SCSI, except for an ATAPI burner.  All SCSI drivers are compiled in statically.

General information:
  Chip sym53c875, device id 0xf, revision id 0x26
  On PCI bus 0, device 16, function 0, IRQ 9
  Synchronous period factor 12, max commands per lun 32

  What's my next step?


* Re: 2.4.11 loses sda9
  2001-10-11  5:20 ` arvest
@ 2001-10-11  5:23   ` Alexander Viro
  2001-10-11  5:45     ` arvest
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Viro @ 2001-10-11  5:23 UTC (permalink / raw)
  To: arvest; +Cc: linux-kernel



On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:

>   I can get the system booted enough to work on (and totally up) with this
> partition failing.  I don't know what more information from fdisk I can give
> you: sda9 is there with .10, and gone with .11.  It even allowed me to add a
> new partition (I didn't save).  I tried sfdisk but it gave me these errors.

Sigh... OK, dmesg|grep sda on both kernels + fdisk -l /dev/sda (also on
both).



* Re: 2.4.11 loses sda9
  2001-10-11  5:23   ` Alexander Viro
@ 2001-10-11  5:45     ` arvest
  2001-10-11  6:08       ` Andreas Dilger
  0 siblings, 1 reply; 20+ messages in thread
From: arvest @ 2001-10-11  5:45 UTC (permalink / raw)
  To: Alexander Viro, linux-kernel

On Thursday 11 October 2001 00:23, Alexander Viro wrote:
> On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:
> >   I can get the system booted enough to work on (and totally up) with this
> > partition failing.  I don't know what more information from fdisk I can
> > give you: sda9 is there with .10, and gone with .11.  It even allowed me
> > to add a new partition (I didn't save).  I tried sfdisk but it gave me
> > these errors.
>
> Sigh... OK, dmesg|grep sda on both kernels + fdisk -l /dev/sda (also on
> both).

  OK, here's .10

Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 17783250 512-byte hdwr sectors (9105 MB)
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
omitting empty partition (9)
 
Disk /dev/sda: 64 heads, 32 sectors, 8683 cylinders
Units = cylinders of 2048 * 512 bytes
 
   Device Boot    Start       End    Blocks   Id  System
/dev/sda1   *         1       501    513008   83  Linux
/dev/sda2           502      3698   3273728   83  Linux
/dev/sda3          3699      4199    513024   83  Linux
/dev/sda4          4200      8683   4591616    5  Extended
/dev/sda5          4200      4700    513008   83  Linux
/dev/sda6          4701      5725   1049584   83  Linux
/dev/sda7          5726      5918    197616   82  Linux swap
/dev/sda8          5919      6419    513008   83  Linux

Here's .11

Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 17783250 512-byte hdwr sectors (9105 MB)
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
omitting empty partition (9)
 
Disk /dev/sda: 64 heads, 32 sectors, 8683 cylinders
Units = cylinders of 2048 * 512 bytes
 
   Device Boot    Start       End    Blocks   Id  System
/dev/sda1   *         1       501    513008   83  Linux
/dev/sda2           502      3698   3273728   83  Linux
/dev/sda3          3699      4199    513024   83  Linux
/dev/sda4          4200      8683   4591616    5  Extended
/dev/sda5          4200      4700    513008   83  Linux
/dev/sda6          4701      5725   1049584   83  Linux
/dev/sda7          5726      5918    197616   82  Linux swap
/dev/sda8          5919      6419    513008   83  Linux

  sda9 is mounted, and it does have its file system intact even though fdisk
says it's omitting empty partition 9.  I'll save you the eye strain: diff came
up empty.
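
The Blocks column in the listings above can be cross-checked against the
reported geometry.  The following C sketch is only an illustration: the helper
names are made up for this note, and the 16-block shortfall on sda1 and on the
logical partitions is assumed to come from one reserved 32-sector track, which
is an inference from the numbers rather than something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry as reported by fdisk: 64 heads, 32 sectors, 512-byte sectors. */
enum { HEADS = 64, SECTORS_PER_TRACK = 32, SECTOR_BYTES = 512 };

/* Sectors per cylinder: 64 * 32 = 2048 ("Units = cylinders of 2048 * 512 bytes"). */
static uint32_t sectors_per_cylinder(void)
{
    return HEADS * SECTORS_PER_TRACK;
}

/*
 * Number of 1 KiB blocks in the inclusive cylinder range [start, end].
 * skip_first_track is nonzero when the first track of the range is assumed
 * to be reserved for a partition table (an assumption of this sketch).
 */
static uint32_t blocks_in_range(uint32_t start, uint32_t end, int skip_first_track)
{
    uint32_t sectors;

    assert(end >= start);
    sectors = (end - start + 1) * sectors_per_cylinder();
    if (skip_first_track)
        sectors -= SECTORS_PER_TRACK;
    return sectors / 2;             /* two 512-byte sectors per 1 KiB block */
}

/* Capacity in decimal megabytes, the unit used in the dmesg line. */
static uint32_t capacity_mb(uint32_t nr_sectors)
{
    return (uint32_t)(((uint64_t)nr_sectors * SECTOR_BYTES) / 1000000u);
}
```

With these helpers, blocks_in_range(1, 501, 1) gives the 513008 blocks listed
for sda1, blocks_in_range(4200, 8683, 0) gives the 4591616 blocks of sda4, and
capacity_mb(17783250) gives the 9105 MB from the dmesg line.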



* Re: 2.4.11 loses sda9
  2001-10-11  5:45     ` arvest
@ 2001-10-11  6:08       ` Andreas Dilger
  2001-10-11  6:17         ` Alexander Viro
  2001-10-11 16:41         ` arvest
  0 siblings, 2 replies; 20+ messages in thread
From: Andreas Dilger @ 2001-10-11  6:08 UTC (permalink / raw)
  To: arvest; +Cc: Alexander Viro, linux-kernel

On Oct 11, 2001  00:45 -0500, arvest@orphansonfire.com wrote:
> > On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:
> > >   I can get the system booted enough to work on (and totally up) with this
> > > partition failing.  I don't know what more information from fdisk I can
> > > give you: sda9 is there with .10, and gone with .11.  It even allowed me
> > > to add a new partition (I didn't save).  I tried sfdisk but it gave me
> > > these errors.
>
> Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
> SCSI device sda: 17783250 512-byte hdwr sectors (9105 MB)
>  sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> omitting empty partition (9)
>  
> /dev/sda1   *         1       501    513008   83  Linux
> /dev/sda2           502      3698   3273728   83  Linux
> /dev/sda3          3699      4199    513024   83  Linux
> /dev/sda4          4200      8683   4591616    5  Extended
> /dev/sda5          4200      4700    513008   83  Linux
> /dev/sda6          4701      5725   1049584   83  Linux
> /dev/sda7          5726      5918    197616   82  Linux swap
> /dev/sda8          5919      6419    513008   83  Linux

You probably need to go into fdisk and change the partition type of
sda9 from "0" to "83" (or any other non-zero type).  There is a
reason that it is saying "omitting empty partition (9)" at boot,
and "fdisk -l" doesn't list it - because type "0" means "I don't exist".

In fdisk, use the "t" option to set the type of sda9.

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert



* Re: 2.4.11 loses sda9
  2001-10-11  6:08       ` Andreas Dilger
@ 2001-10-11  6:17         ` Alexander Viro
  2001-10-11 16:41         ` arvest
  1 sibling, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11  6:17 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: arvest, linux-kernel



On Thu, 11 Oct 2001, Andreas Dilger wrote:

> You probably need to go into fdisk and change the partition type of
> sda9 from "0" to "83" (or any other non-zero type).  There is a
> reason that it is saying "omitting empty partition (9)" at boot,
> and "fdisk -l" doesn't list it - because type "0" means "I don't exist".
> 
> In fdisk, use the "t" option to set the type of sda9.

... and after that try to boot into 2.4.11 again.  It might be a
corruption introduced by partition code changes.  What I don't
understand is how the hell does 2.4.10 manage to mount it if
it hadn't registered the sucker...



* Re: 2.4.11 loses sda9
  2001-10-11  6:08       ` Andreas Dilger
  2001-10-11  6:17         ` Alexander Viro
@ 2001-10-11 16:41         ` arvest
  2001-10-11 16:46           ` Ignacio Vazquez-Abrams
  1 sibling, 1 reply; 20+ messages in thread
From: arvest @ 2001-10-11 16:41 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Alexander Viro, linux-kernel

On Thursday 11 October 2001 01:08, Andreas Dilger wrote:
> On Oct 11, 2001  00:45 -0500, arvest@orphansonfire.com wrote:
> > > On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:
> > > >   I can get the system booted enough to work on (and totally up) with
> > > > this partition failing.  I don't know what more information from fdisk
> > > > I can give you: sda9 is there with .10, and gone with .11.  It even
> > > > allowed me to add a new partition (I didn't save).  I tried sfdisk but
> > > > it gave me these errors.
> >
> > Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
> > SCSI device sda: 17783250 512-byte hdwr sectors (9105 MB)
> >  sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> > omitting empty partition (9)
> >
> > /dev/sda1   *         1       501    513008   83  Linux
> > /dev/sda2           502      3698   3273728   83  Linux
> > /dev/sda3          3699      4199    513024   83  Linux
> > /dev/sda4          4200      8683   4591616    5  Extended
> > /dev/sda5          4200      4700    513008   83  Linux
> > /dev/sda6          4701      5725   1049584   83  Linux
> > /dev/sda7          5726      5918    197616   82  Linux swap
> > /dev/sda8          5919      6419    513008   83  Linux
>
> You probably need to go into fdisk and change the partition type of
> sda9 from "0" to "83" (or any other non-zero type).  There is a
> reason that it is saying "omitting empty partition (9)" at boot,
> and "fdisk -l" doesn't list it - because type "0" means "I don't exist".
>
> In fdisk, use the "t" option to set the type of sda9.

  sda9 doesn't show in fdisk.  Cylinders 6420-8683 are shown free.


* Re: 2.4.11 loses sda9
  2001-10-11 16:41         ` arvest
@ 2001-10-11 16:46           ` Ignacio Vazquez-Abrams
  2001-10-11 16:50             ` Alexander Viro
  0 siblings, 1 reply; 20+ messages in thread
From: Ignacio Vazquez-Abrams @ 2001-10-11 16:46 UTC (permalink / raw)
  To: linux-kernel

On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:

> On Thursday 11 October 2001 01:08, Andreas Dilger wrote:
> > On Oct 11, 2001  00:45 -0500, arvest@orphansonfire.com wrote:
> > > > On Thu, 11 Oct 2001 arvest@orphansonfire.com wrote:
> > > > >   I can get the system booted enough to work on (and totally up) with
> > > > > this partition failing.  I don't know what more information from fdisk
> > > > > I can give you: sda9 is there with .10, and gone with .11.  It even
> > > > > allowed me to add a new partition (I didn't save).  I tried sfdisk but
> > > > > it gave me these errors.
> > >
> > > Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
> > > SCSI device sda: 17783250 512-byte hdwr sectors (9105 MB)
> > >  sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> > > omitting empty partition (9)
> > >
> > > /dev/sda1   *         1       501    513008   83  Linux
> > > /dev/sda2           502      3698   3273728   83  Linux
> > > /dev/sda3          3699      4199    513024   83  Linux
> > > /dev/sda4          4200      8683   4591616    5  Extended
> > > /dev/sda5          4200      4700    513008   83  Linux
> > > /dev/sda6          4701      5725   1049584   83  Linux
> > > /dev/sda7          5726      5918    197616   82  Linux swap
> > > /dev/sda8          5919      6419    513008   83  Linux
> >
> > You probably need to go into fdisk and change the partition type of
> > sda9 from "0" to "83" (or any other non-zero type).  There is a
> > reason that it is saying "omitting empty partition (9)" at boot,
> > and "fdisk -l" doesn't list it - because type "0" means "I don't exist".
> >
> > In fdisk, use the "t" option to set the type of sda9.
>
>   sda9 doesn't show in fdisk.  Cylinders 6420-8683 are shown free.

Ouch. You may have to use partedit from PartitionMagic (or some other
low-level partition editor) to manually change the partition type.

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>



* Re: 2.4.11 loses sda9
  2001-10-11 16:46           ` Ignacio Vazquez-Abrams
@ 2001-10-11 16:50             ` Alexander Viro
  2001-10-11 17:23               ` Alexander Viro
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 16:50 UTC (permalink / raw)
  To: Ignacio Vazquez-Abrams; +Cc: linux-kernel



On Thu, 11 Oct 2001, Ignacio Vazquez-Abrams wrote:

> Ouch. You may have to use partedit from PartitionMagic (or some other
> low-level partition editor) to manually change the partition type.

Like, say, dd(1).  However, partitioning code doesn't give a damn for
entry type - "empty" means "zero number of sectors" for it.  Something
very screwy is going on.
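
The two notions of "empty" that collide in this subthread can be put side by
side.  The sketch below decodes one 16-byte entry of a DOS partition table
(type byte at offset 4, little-endian start sector at offset 8 and sector count
at offset 12) and tests it both ways.  The struct and function names are
hypothetical and chosen for this note; this is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of one DOS partition table entry (illustrative names). */
struct dos_entry {
    uint8_t  boot_ind;    /* 0x80 = bootable */
    uint8_t  sys_ind;     /* type byte: 0x83 Linux, 0x05 extended, ... */
    uint32_t start_sect;  /* first sector */
    uint32_t nr_sects;    /* number of sectors */
};

/* Read a little-endian 32-bit value from an unaligned byte buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 16 raw bytes of one table entry. */
static struct dos_entry decode_entry(const uint8_t raw[16])
{
    struct dos_entry e;

    assert(raw != NULL);
    e.boot_ind = raw[0];
    e.sys_ind = raw[4];
    e.start_sect = le32(raw + 8);
    e.nr_sects = le32(raw + 12);
    return e;
}

/* "Empty" in the sense described above: zero sectors, type byte ignored. */
static int empty_by_size(const struct dos_entry *e)
{
    return e->nr_sects == 0;
}

/* "Empty" in the type-0 sense suggested earlier in the thread. */
static int empty_by_type(const struct dos_entry *e)
{
    return e->sys_ind == 0;
}
```

An entry with type 0 but a nonzero sector count is empty by the second test and
not by the first, which is why changing the type byte alone would not be
expected to change what the partitioning code registers.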



* Re: 2.4.11 loses sda9
  2001-10-11 16:50             ` Alexander Viro
@ 2001-10-11 17:23               ` Alexander Viro
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 17:23 UTC (permalink / raw)
  To: linux-kernel



On Thu, 11 Oct 2001, Alexander Viro wrote:

> 
> 
> On Thu, 11 Oct 2001, Ignacio Vazquez-Abrams wrote:
> 
> > Ouch. You may have to use partedit from PartitionMagic (or some other
> > low-level partition editor) to manually change the partition type.
> 
> Like, say, dd(1).  However, partitioning code doesn't give a damn for
> entry type - "empty" means "zero number of sectors" for it.  Something
> very screwy is going on.


Owww...  I think I know what can be happening here.  Combination of very
weird (and apparently old) partitioning corruption with slightly broken
error handling in old extended_partition() code.

Setup that could explain everything we'd seen on that one looks like this:

	a) extended partitions' chain ends with empty partition table.
	b) extended_partition() sets a fake device on the tail of
extended partition before going into it.  Normally that fake device is
overwritten by _data_ partition referred to from the EPT in the beginning of
the tail.  In this case, though, the fake is left untouched.
	c) it can be opened.  fdisk screams bloody murder seeing the
extended partition with no partitions inside, but it can be opened.
And mkfs'ed.

	IOW, you've got ext2 living on partition with type 5.  Since its
(empty) EPT lives where the boot sector should be, ext2 leaves the thing
untouched.

	That's one very sick puppy - any fdisk-style program will have
a fit on it and it certainly shouldn't create anything like that.  And
no, I don't see a good solution for that one - it's going to be very hard
to turn into valid partitions' chain.

	We can restore the bug in question, but it's still going to be
hell on any fdisk and there's nothing kernel could do about that one.
Notice that even with the old kernel sda9 officially doesn't exist -
it can be opened only because of the lack of proper error-recovery in
old extended_partition().

	All that, unfortunately, doesn't explain another bug-report
on lost partitions, but there we have very different picture - 2.4.10
actually seeing the partition in question and fdisk being OK with it.
Ugh...
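
The mechanism described in a) through c) can be modelled in a few lines.  This
is a simplified userspace sketch, not the 2.4 extended_partition() code: the
struct layout, the function name walk_chain() and the keep_placeholder flag are
all hypothetical, and each extended partition table is reduced to one data
entry plus one link to the next table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified extended partition table: one data entry, one link entry. */
struct ebr {
    uint32_t data_start, data_size;  /* data partition; size 0 = none */
    uint32_t next_start, next_size;  /* tail holding the next table; size 0 = end */
};

/* A registered device slot. */
struct slot {
    uint32_t start, size;
};

/*
 * Walk the chain and fill out[] with the registered slots; returns how many.
 * Before descending into a tail, a placeholder covering that tail is written
 * into the next slot.  A data entry in the next table overwrites it.  If the
 * next table is empty, the placeholder survives when keep_placeholder is
 * nonzero (the behaviour attributed to the old code) and is dropped otherwise.
 */
static size_t walk_chain(const struct ebr *chain, size_t n,
                         struct slot *out, size_t max, int keep_placeholder)
{
    size_t count = 0;
    int placeholder = 0;    /* is out[count] currently a fake entry? */

    assert(chain != NULL && out != NULL);
    for (size_t i = 0; i < n && count < max; i++) {
        if (chain[i].data_size == 0) {
            /* Empty table: nothing overwrites the fake entry. */
            if (placeholder && keep_placeholder)
                count++;
            break;
        }
        /* Normal case: the data partition replaces the fake entry. */
        out[count].start = chain[i].data_start;
        out[count].size = chain[i].data_size;
        count++;
        placeholder = 0;

        if (chain[i].next_size == 0 || count >= max)
            break;
        /* Fake entry for the tail, set before going into it. */
        out[count].start = chain[i].next_start;
        out[count].size = chain[i].next_size;
        placeholder = 1;
    }
    return count;
}
```

For a chain whose last table is empty, the call with keep_placeholder set
reports one more slot than the call without it, and that extra slot spans the
whole tail.  That would match a device that opens and can hold a filesystem
while no partition table entry describes it.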



* Re: 2.4.11 loses sda9
  2001-10-11  5:07 Alexander Viro
  2001-10-11  5:20 ` arvest
@ 2001-10-11 18:22 ` Guest section DW
  2001-10-11 18:25   ` Alexander Viro
  1 sibling, 1 reply; 20+ messages in thread
From: Guest section DW @ 2001-10-11 18:22 UTC (permalink / raw)
  To: Alexander Viro, linux-kernel

On Thu, Oct 11, 2001 at 01:07:22AM -0400, Alexander Viro wrote:

> If you have sfdisk, sfdisk /dev/sda -O /tmp/foo + mailing the result would
> make debugging the thing much simpler (that one - from the 2.4.10).

Probably you mean sfdisk -d /dev/sda > /tmp/foo or so?
My favourite tends to be sfdisk -l -uS -x /dev/sda

The -O option saves the sectors that are changed by the sfdisk call
to some file, so that a later undo is possible.

Andries


* Re: 2.4.11 loses sda9
  2001-10-11 18:22 ` Guest section DW
@ 2001-10-11 18:25   ` Alexander Viro
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 18:25 UTC (permalink / raw)
  To: Guest section DW; +Cc: linux-kernel



On Thu, 11 Oct 2001, Guest section DW wrote:

> On Thu, Oct 11, 2001 at 01:07:22AM -0400, Alexander Viro wrote:
> 
> > If you have sfdisk, sfdisk /dev/sda -O /tmp/foo + mailing the result would
> > make debugging the thing much simpler (that one - from the 2.4.10).
> 
> Probably you mean sfdisk -d /dev/sda > /tmp/foo or so?
> My favourite tends to be sfdisk -l -uS -x /dev/sda
> 
> The -O option saves the sectors that are changed by the sfdisk call
> to some file, so that a later undo is possible.

... and the contents of these sectors is precisely what I would like to see.



* Re: 2.4.11 loses sda9
@ 2001-10-11 19:07 Andries.Brouwer
  2001-10-11 19:19 ` Alexander Viro
  0 siblings, 1 reply; 20+ messages in thread
From: Andries.Brouwer @ 2001-10-11 19:07 UTC (permalink / raw)
  To: adilger, arvest; +Cc: linux-kernel, viro

On Thu, Oct 11, 2001 at 12:08:14AM -0600, Andreas Dilger wrote:

> You probably need to go into fdisk and change the partition type of
> sda9 from "0" to "83" (or any other non-zero type).  There is a
> reason that it is saying "omitting empty partition (9)" at boot,
> and "fdisk -l" doesn't list it - because type "0" means "I don't exist".

If I am not mistaken, it is fdisk rather than the kernel that says
"omitting empty partition (9)". (And the latest fdisk no longer
deletes partitions of type 0 from its listings.)
The sys_type field never had any significance to the kernel.

Andries


[By the way, it is a sad sight to see patch-2.4.11.
Where my own sources use  dev->hardsect_size , and
intermediate sources use  get_hardsect_size(dev)
an inline function defined roughly either as
        dev->hardsect_size
or as
        hardsect_size[MAJOR(dev)][MINOR(dev)]
so as to make it easy to switch between compiles where
a kdev_t is a number and we use the infamous arrays,
and compiles where a kdev_t is a pointer to a device struct,
and no arrays exist, I now see that get_hardsect_size(dev)
is replaced by
        get_hardsect_size(to_kdev_t(bdev->bd_dev))
. Yecch.
Al, I never understood why you want to introduce a
struct block_device * to do precisely what kdev_t
was designed to do.]
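
The "intermediate" accessor described in the bracketed note can be sketched as
one inline function with two compile-time definitions.  Everything below is a
userspace model with hypothetical names (KDEV_IS_POINTER, kdev_model_t, the
MODEL_* macros, hardsect_size_table); the 512-byte fallback for a major without
a table is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#ifdef KDEV_IS_POINTER

/* Variant 1: the device handle is a pointer to a device structure. */
struct device_model {
    int major, minor;
    int hardsect_size;
};
typedef struct device_model *kdev_model_t;

static inline int model_hardsect_size(kdev_model_t dev)
{
    assert(dev != NULL);
    return dev->hardsect_size;
}

#else

/* Variant 2: the device handle is a number and sizes live in per-major arrays. */
typedef unsigned short kdev_model_t;

#define MODEL_MAJOR(dev)     ((unsigned)(dev) >> 8)
#define MODEL_MINOR(dev)     ((unsigned)(dev) & 0xffu)
#define MODEL_MKDEV(ma, mi)  ((kdev_model_t)(((ma) << 8) | (mi)))

/* One array of 256 per-minor sizes for each major; NULL means "not set". */
static int *hardsect_size_table[256];

static inline int model_hardsect_size(kdev_model_t dev)
{
    int *sizes = hardsect_size_table[MODEL_MAJOR(dev)];

    return sizes ? sizes[MODEL_MINOR(dev)] : 512;  /* assumed default */
}

#endif
```

Callers only ever write model_hardsect_size(dev), so switching the
representation of the handle means changing the definition and nothing else.
That is the property the note says is lost when the call site has to convert
from a struct block_device first.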


* Re: 2.4.11 loses sda9
  2001-10-11 19:07 Andries.Brouwer
@ 2001-10-11 19:19 ` Alexander Viro
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 19:19 UTC (permalink / raw)
  To: Andries.Brouwer; +Cc: adilger, arvest, linux-kernel



On Thu, 11 Oct 2001 Andries.Brouwer@cwi.nl wrote:

> so as to make it easy to switch between compiles where
> a kdev_t is a number and we use the infamous arrays,
> and compiles where a kdev_t is a pointer to a device struct,
> and no arrays exist, I now see that get_hardsect_size(dev)
> is replaced by
>         get_hardsect_size(to_kdev_t(bdev->bd_dev))
> . Yecch.
> Al, I never understood why you want to introduce a
> struct block_device * to do precisely what kdev_t
> was designed to do.]
 
We had been through that way too many times.  You know what problems
with unified device struct I've brought before.  You know what
problems I have with your 64bit dev_t.  And you know _very_ well that
any patches in that area should be done in small steps.

Hell, I'd prefer that one to be done _much_ slower - with decent
debugging between the steps instead of "we've got to close the
holes opened by bdev-in-pagecache _NOW_" kind of situation we'd got.

IMO eventually we should have per-disk structure and keep reference to
it from struct block_device.  Then get_hardsect_size() would turn into
access to field of that beast (and would take struct block_device *
as an argument).  But that's 2.5 stuff and I bloody refuse to participate
in attempts to do everything in one huge leap.  One we'd got is already
bad enough.
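
The per-disk arrangement proposed above might look roughly like the following.
This is a userspace sketch under stated assumptions: struct disk_model and
struct bdev_model are stand-ins invented for this note, not the kernel's
struct block_device, and the 512-byte fallback is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-disk object: state shared by all partitions of a disk. */
struct disk_model {
    int hardsect_size;
};

/* Hypothetical block device object holding a reference to its disk. */
struct bdev_model {
    struct disk_model *disk;
};

/* The accessor takes the block device and reads a field of the disk. */
static int bdev_hardsect_size(const struct bdev_model *bdev)
{
    assert(bdev != NULL);
    return bdev->disk ? bdev->disk->hardsect_size : 512;  /* assumed default */
}
```

Two block devices that are partitions of the same disk point at one
disk_model, so the sector size is stored once and no lookup by device number is
needed.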



* Re: 2.4.11 loses sda9
@ 2001-10-11 20:29 Andries.Brouwer
  2001-10-11 20:54 ` Alexander Viro
  0 siblings, 1 reply; 20+ messages in thread
From: Andries.Brouwer @ 2001-10-11 20:29 UTC (permalink / raw)
  To: Andries.Brouwer, viro; +Cc: adilger, arvest, linux-kernel

    From viro@math.psu.edu Thu Oct 11 21:19:27 2001

    On Thu, 11 Oct 2001 Andries.Brouwer@cwi.nl wrote:

    > so as to make it easy to switch between compiles where
    > a kdev_t is a number and we use the infamous arrays,
    > and compiles where a kdev_t is a pointer to a device struct,
    > and no arrays exist, I now see that get_hardsect_size(dev)
    > is replaced by
    >         get_hardsect_size(to_kdev_t(bdev->bd_dev))
    > . Yecch.
    > Al, I never understood why you want to introduce a
    > struct block_device * to do precisely what kdev_t
    > was designed to do.]
     
    We had been through that way too many times.  You know what problems
    with unified device struct I've brought before.  You know what
    problems I have with your 64bit dev_t.  And you know _very_ well that
    any patches in that area should be done in small steps.

    Hell, I'd prefer that one to be done _much_ slower - with decent
    debugging between the steps instead of "we've got to close the
    holes opened by bdev-in-pagecache _NOW_" kind of situation we'd got.

    IMO eventually we should have per-disk structure and keep reference to
    it from struct block_device.  Then get_hardsect_size() would turn into
    access to field of that beast (and would take struct block_device *
    as an argument).  But that's 2.5 stuff and I bloody refuse to participate
    in attempts to do everything in one huge leap.  One we'd got is already
    bad enough.

You invent two strawmen to divert attention from the question:

> You know what problems I have with your 64bit dev_t.

Well, in fact I don't know, except that you announced to fork the
source once it got one. But that is an entirely separate discussion
and has nothing to do with kdev_t.  A dev_t is what goes into the kernel
at the mknod system call, and comes out of the kernel at the stat
system call, and roughly speaking has no other significance.
It plays a role in NFS, but NFS already uses a 64bit dev_t.

Kernel patches for a 64bit dev_t are entirely orthogonal to kdev_t work.

> I refuse to participate in attempts to do everything in one huge leap.

Well, it is not me who invents the idea of big changes at once.
It was mostly Linus who did not like intermediate code.

I think I can go to my goal in many small steps avoiding intermediate code,
although life is a bit easier with intermediate stuff like get_hardsect_size(dev).

But the size of the leaps towards the goal has very little to do
with the design of the goal.

So those are your two strawmen. Yes, there is one more sentence:

> You know what problems with unified device struct I've brought before.

I don't mind splitting kdev_t into kbdev_t and kcdev_t.
Keeping the former requires a cast or a union somewhere.
Splitting requires some code duplication.
Altogether there is very little difference between the two setups.

Remains the question, let me repeat:
"Al, I never understood why you want to introduce a struct block_device *
 to do precisely what kdev_t was designed to do."

I see that you are making small steps away from my goal, so I hope
you know very precisely where you want to go and how to get there.

> But that's 2.5 stuff

Yes, precisely. But you do not wait for 2.5 but start walking already.

Andries


* Re: 2.4.11 loses sda9
  2001-10-11 20:29 Andries.Brouwer
@ 2001-10-11 20:54 ` Alexander Viro
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 20:54 UTC (permalink / raw)
  To: Andries.Brouwer; +Cc: adilger, arvest, linux-kernel



On Thu, 11 Oct 2001 Andries.Brouwer@cwi.nl wrote:

> > You know what problems with unified device struct I've brought before.
> 
> I don't mind splitting kdev_t into kbdev_t and kcdev_t.
> Keeping the former requires a cast or a union somewhere.
> Splitting requires some code duplication.
> Altogether there is very little difference between the two setups.

typedef struct block_device *kbdev_t;
 
Aside of 1:5 vowels to consonants ratio I've no problems with that.

> Remains the question, let me repeat:
> "Al, I never understood why you want to introduce a struct block_device *
>  to do precisely what kdev_t was designed to do."
> 
> I see that you are making small steps away from my goal, so I hope
> you know very precisely where you want to go and how to get there.

Right now we have a big and fairly nasty mix of the stuff that can be
turned into pointer to block device, pointer to character device _and_
stuff that is used as numbers.  Moreover, allocation policy for these
structures is a tricky beast - block ones are mostly sane now, but
character are most definitely not. Right now we have places where
we look up the blocksize, etc. with nothing more than a number.
Actually, recent changes, as much as I'd prefer to see them done only
in 2.5, help in that respect - they remove one of the major sources
of that.  Switching get_...size() to struct block_device * (or kbdev_t
if you like to spell it that way) is a good thing, but it deserves
a separate patch.  And no, changing definition of kdev_t is not a
good idea right now - too large patch and too many places to audit.

Resulting setup is going to be pretty close, indeed, but getting
there is going to take some work.

> > But that's 2.5 stuff
> 
> Yes, precisely. But you do not wait for 2.5 but start walking already.

We needed to switch partition-related code to pagecache; _that_ was the
result of changes that were, IMO, bad idea at that point.  But these
changes were done and we have to deal with the fallout.  Getting to the
address_space of device in question requires pointer to structure.  So
the choice is between merging your patch on top of Andrea's stuff +
fixes to said stuff and praying  and  propagating pointer to existing
object in places where we need it and trying to debug that.  Somehow
the latter variant seems less painful.



* Re: 2.4.11 loses sda9
@ 2001-10-11 22:11 Andries.Brouwer
  2001-10-11 22:26 ` Alexander Viro
  0 siblings, 1 reply; 20+ messages in thread
From: Andries.Brouwer @ 2001-10-11 22:11 UTC (permalink / raw)
  To: Andries.Brouwer, viro; +Cc: adilger, arvest, linux-kernel

> Right now we have a big and fairly nasty mix of the stuff that can be
> turned into pointer to block device, pointer to character device _and_
> stuff that is used as numbers.

Not really. I don't know whether you ever tried the experiment
and compiled kdev_t as a pointer to a struct with two members
namely major and minor, where the struct is allocated by MKDEV().
Very few places break, and these places are very easy to fix.
Stuff that is used as numbers can be forgotten quickly.
It is not difficult at all to get a kernel up and running that has
kdev_t a pointer type.

> Moreover, allocation policy for these structures is a tricky beast.

Yes. I entirely agree. All the rest is a mechanical action.
(Or, more precisely, removable modules require freeing, and
freeing requires refcounting. It is the refcounting that is
work, more than the allocation.)

Andries
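
To make the allocation and refcounting point concrete, here is a minimal
userspace sketch of a pointer-typed device handle that is looked up or
allocated on demand and freed when its last reference goes away.  It is not the
experiment described above: the names kdev_model, kdev_get() and kdev_put() are
hypothetical, malloc() stands in for whatever allocator a kernel would use, and
there is no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pointer-typed device handle with a reference count. */
struct kdev_model {
    unsigned major, minor;
    unsigned refs;
    struct kdev_model *next;
};

/* All live handles; one object per (major, minor) pair. */
static struct kdev_model *kdev_list;

/* Return the handle for (major, minor), allocating it on first use. */
static struct kdev_model *kdev_get(unsigned major, unsigned minor)
{
    struct kdev_model *p;

    for (p = kdev_list; p; p = p->next) {
        if (p->major == major && p->minor == minor) {
            p->refs++;
            return p;
        }
    }
    p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->major = major;
    p->minor = minor;
    p->refs = 1;
    p->next = kdev_list;
    kdev_list = p;
    return p;
}

/* Drop one reference; unlink and free the handle when none remain. */
static void kdev_put(struct kdev_model *dev)
{
    struct kdev_model **pp;

    assert(dev != NULL && dev->refs > 0);
    if (--dev->refs)
        return;
    for (pp = &kdev_list; *pp; pp = &(*pp)->next) {
        if (*pp == dev) {
            *pp = dev->next;
            break;
        }
    }
    free(dev);
}
```

Allocation is the short part.  The work is in making sure every path that
obtains a handle has a matching kdev_put(), which is the refcounting burden the
message refers to.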


* Re: 2.4.11 loses sda9
  2001-10-11 22:11 Andries.Brouwer
@ 2001-10-11 22:26 ` Alexander Viro
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Viro @ 2001-10-11 22:26 UTC (permalink / raw)
  To: Andries.Brouwer; +Cc: adilger, arvest, linux-kernel



On Thu, 11 Oct 2001 Andries.Brouwer@cwi.nl wrote:

> Not really. I don't know whether you ever tried the experiment
> and compiled kdev_t as a pointer to a struct with two members
> namely major and minor, where the struct is allocated by MKDEV().
> Very few places break, and these places are very easy to fix.
> Stuff that is used as numbers can be forgotten quickly.
> It is not difficult at all to get a kernel up and running that has
> kdev_t a pointer type.

Ugh... When do you free them?
 
> > Moreover, allocation policy for these structures is a tricky beast.
> 
> Yes. I entirely agree. All the rest is a mechanical action.
> (Or, more precisely, removable modules require freeing, and
> freeing requires refcounting. It is the refcounting that is
> work, more than the allocation.)

Precisely.  I think that on the block side we are fairly close to
reasonable one - at least I see how to get there.  Character devices
are nastier - especially with the lack of common point on ->release()
path (->f_op reassignment done by various subsystems).  Once we have
that, the rest will be pretty easy (there will be a separate issue
with per-disk objects, e.g. for serialization between open() and
BLKRRPART, but that's almost independent).

However, amount of mechanical work is going to be large - especially
if ->i_rdev becomes dev_t.  That means changing types of a lot of local
variables in drivers and I'd rather leave that to 2.5.  It _does_ break
source compatibility, and that makes it -CURRENT material.



* Re: 2.4.11 loses sda9
@ 2001-10-12  0:07 Andries.Brouwer
  0 siblings, 0 replies; 20+ messages in thread
From: Andries.Brouwer @ 2001-10-12  0:07 UTC (permalink / raw)
  To: Andries.Brouwer, viro; +Cc: adilger, arvest, linux-kernel

>> It is not difficult at all to get a kernel up and running that has
>> kdev_t a pointer type.

>  When do you free them?

That is not the purpose of the demo. It is just to convince
you that your fear (used as number) is almost entirely unfounded.

Andries

