* Re: [bug, 2.5.29, IDE] partition table corruption?
@ 2002-08-07 18:43 Andries.Brouwer
2002-08-07 21:12 ` [bug, 2.5.29, (not IDE)] partition table (not) corruption? Ingo Molnar
2002-08-08 7:46 ` [bug, 2.5.29, IDE] partition table corruption? Marcin Dalecki
0 siblings, 2 replies; 8+ messages in thread
From: Andries.Brouwer @ 2002-08-07 18:43 UTC
To: Andries.Brouwer, mingo; +Cc: alan, dalecki, linux-kernel
> The funny thing is, I removed some stuff here in 2.5.30,
> so I would understand things immediately if you reported this
> about 2.5.30. But for 2.5.29 I do not immediately see why
> you would see any changes.
2.5.30 breaks as well.
> Did you in the meantime find out what was wrong?
nope. I still keep working around it.
> Are things OK in 2.5.28 and wrong in vanilla 2.5.29
> with the same version of LILO? (which version?)
a fairly standard LILO from RH 7.3: linux-21.4.4-10.
> Do you use the linear or lba32 options? The fix-table option?
I use none of these options. I use a very simple setup, a proper /boot
partition, nothing complex or unexpected.
> What corruption do you see in the partition table?
nothing in the descriptors that i can tell from looking at fdisk output -
but it would be pretty hard to recover the system via a pure rescue CD
otherwise.
> Do you use LVM?
nope. Plain old IDE, ext3fs,
> What happens under 2.5.30?
the same 'LI' message.
I'll try Alan's suggestion of adding the 'linear' option.
...
this actually did the trick - lilo no longer messes up the bootup.
So Alan's suspicion is right, there's something wrong about geometries
in 2.5-current.
I always like to understand all the details - forgive me if I come
with further questions.
LILO without "linear" or "lba32" is inherently broken:
it will talk CHS at boot time to the BIOS and hence needs a geometry
at install time, and nobody knows the geometry required. So, if
LILO doesn't break, this is pure coincidence.
Since 2.5.30 many people will have a different geometry, so many
people will have to find grub or a recent LILO, or add "linear"
to their old LILO. This is all well understood - I just repeat it
a few times in the hope that that will reduce the amount of email.
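A minimal sketch of why the install-time geometry matters (illustrative names, not LILO's actual code; the two geometries are just common examples): the same linear sector number maps to different C/H/S triples depending on the heads and sectors-per-track values assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative types and names; nothing here is taken from LILO's source. */
struct geometry { uint32_t heads; uint32_t sectors; };
struct chs { uint32_t cylinder; uint32_t head; uint32_t sector; };

/* Convert a linear (0-based) sector number to a CHS triple for a given
 * geometry. Sectors are numbered from 1 within a track, by convention. */
struct chs lba_to_chs(uint32_t lba, struct geometry g)
{
    struct chs r;
    assert(g.heads > 0 && g.sectors > 0);
    r.cylinder = lba / (g.heads * g.sectors);
    r.head     = (lba / g.sectors) % g.heads;
    r.sector   = lba % g.sectors + 1;
    return r;
}
```

For linear sector 100000, a 255/63 geometry gives C/H/S 6/57/20 while 16/63 gives 99/3/20, so a map computed under one geometry points somewhere else under the other.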
But now you talk about vanilla 2.5.29, and I am surprised.
Could you send the kernel boot messages concerning that disk
(dmesg | grep hd) for 2.5.28 and 2.5.29 and 2.5.30?
And you talk about corruption, and I am surprised again.
Have you verified that there really was a difference?
Or do you only suspect corruption because LILO has a problem?
(In that case I can assure you that there was no corruption.)
Andries
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-07 18:43 [bug, 2.5.29, IDE] partition table corruption? Andries.Brouwer
@ 2002-08-07 21:12 ` Ingo Molnar
2002-08-08 7:30 ` Marcin Dalecki
2002-08-08 7:46 ` [bug, 2.5.29, IDE] partition table corruption? Marcin Dalecki
1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2002-08-07 21:12 UTC
To: Andries.Brouwer; +Cc: Alan Cox, Marcin Dalecki, linux-kernel
On Wed, 7 Aug 2002 Andries.Brouwer@cwi.nl wrote:
> LILO without "linear" or "lba32" is inherently broken: it will talk CHS
> at boot time to the BIOS and hence needs a geometry at install time,
> and nobody knows the geometry required. So, if LILO doesn't break, this
> is pure coincidence.
well, lilo without linear worked for like years on this box ...
> Since 2.5.30 many people will have a different geometry, so many people
> will have to find grub or a recent LILO, or add "linear" to their old
> LILO. This is all well understood - I just repeat it a few times in the
> hope that that will reduce the amount of email.
>
> But now you talk about vanilla 2.5.29, and I am surprised. Could you
> send the kernel boot messages concerning that disk (dmesg | grep hd) for
> 2.5.28 and 2.5.29 and 2.5.30?
will do - it might have started in 2.5.28. But since i use the BK tree, i
might have tested an 'almost 2.5.30' 2.5.29 BK tree.
> And you talk about corruption, and I am surprised again. Have you
> verified that there really was a difference? Or do you only suspect
> corruption because LILO has a problem? (In that case I can assure you that
> there was no corruption.)
you are right, there was no corruption most likely. And the IDE subsystem
is most definitely innocent.
Ingo
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-07 21:12 ` [bug, 2.5.29, (not IDE)] partition table (not) corruption? Ingo Molnar
@ 2002-08-08 7:30 ` Marcin Dalecki
2002-08-08 7:37 ` Ingo Molnar
0 siblings, 1 reply; 8+ messages in thread
From: Marcin Dalecki @ 2002-08-08 7:30 UTC
To: Ingo Molnar; +Cc: Andries.Brouwer, Alan Cox, linux-kernel
Ingo Molnar wrote:
> On Wed, 7 Aug 2002 Andries.Brouwer@cwi.nl wrote:
>
>
>>LILO without "linear" or "lba32" is inherently broken: it will talk CHS
>>at boot time to the BIOS and hence needs a geometry at install time,
>>and nobody knows the geometry required. So, if LILO doesn't break, this
>>is pure coincidence.
>
>
> well, lilo without linear worked for like years on this box ...
You have to take into account that when you create a new kernel image,
it may end up stored, sometimes after a long time, in another block
group far away. This is because ext2 may suddenly feel like doing
so... And surprisingly you then have to teach lilo about the new
far-away sectors, because basic C/H/S addressing can't reach them
any longer. Been there, seen that frequently enough.
It might be informative if you could actually provide the
first sector address used by the inode corresponding to vmlinuz.
At least this way one could resolve the issue definitively.
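As a rough sketch of the reachability check implied here (the helper name is made up for illustration; the only outside fact assumed is the 10-bit cylinder field of the classic INT 13h CHS interface, i.e. at most 1024 cylinders):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit of the classic INT 13h CHS interface: 10-bit cylinder number. */
#define BIOS_MAX_CYLINDERS 1024u

/* Hypothetical helper: can a 0-based linear sector be addressed via CHS
 * under the given heads / sectors-per-track geometry? */
bool chs_reachable(uint32_t lba, uint32_t heads, uint32_t sectors)
{
    assert(heads > 0 && heads <= 256 && sectors > 0 && sectors <= 63);
    return lba / (heads * sectors) < BIOS_MAX_CYLINDERS;
}
```

Under a 16/63 geometry the CHS reach ends at sector 1032192 (about 504 MiB); under 255/63 it ends at sector 16450560 (about 7.8 GiB). A kernel image whose blocks land beyond that boundary cannot be loaded through plain CHS calls.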
>>And you talk about corruption, and I am surprised again. Have you
>>verified that there really was a difference? Or do you only suspect
>>corruption because LILO has a problem? (In that case I can assure you that
>>there was no corruption.)
>
>
> you are right, there was no corruption most likely. And the IDE subsystem
> is most definitely innocent.
I told you :-).
BTW, please don't consider RH lilo "fairly standard": it *is* messing
with the geometry issues, especially in limbo.
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-08 7:30 ` Marcin Dalecki
@ 2002-08-08 7:37 ` Ingo Molnar
2002-08-08 8:42 ` Marcin Dalecki
0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2002-08-08 7:37 UTC
To: martin; +Cc: Andries.Brouwer, Alan Cox, linux-kernel
On Thu, 8 Aug 2002, Marcin Dalecki wrote:
> >>LILO without "linear" or "lba32" is inherently broken: it will talk CHS
> >>at boot time to the BIOS and hence needs a geometry at install time,
> >>and nobody knows the geometry required. So, if LILO doesn't break, this
> >>is pure coincidence.
> >
> >
> > well, lilo without linear worked for like years on this box ...
>
> You have to take into account that when you create a new kernel image,
> it may end up stored, sometimes after a long time, in another block
> group far away. This is because ext2 may suddenly feel like doing
> so... And surprisingly you then have to teach lilo about the new
> far-away sectors, because basic C/H/S addressing can't reach them
> any longer. Been there, seen that frequently enough.
this particular testbox has seen *thousands* of development kernels of all
sizes, and i often have filled up the complete /boot partition. It is very
unlikely that this harmless (and not too big) 2.5.29 kernel would have
been the first one to trigger a 'wrong' CHS combination. Especially since
2.4 kernels with exactly the *same* bzImage (and same lilo) work just
fine.
Ingo
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-08 7:37 ` Ingo Molnar
@ 2002-08-08 8:42 ` Marcin Dalecki
2002-08-08 9:03 ` Ingo Molnar
0 siblings, 1 reply; 8+ messages in thread
From: Marcin Dalecki @ 2002-08-08 8:42 UTC
To: Ingo Molnar; +Cc: martin, Andries.Brouwer, Alan Cox, linux-kernel
Ingo Molnar wrote:
> On Thu, 8 Aug 2002, Marcin Dalecki wrote:
>
>
>>>>LILO without "linear" or "lba32" is inherently broken: it will talk CHS
>>>>at boot time to the BIOS and hence needs a geometry at install time,
>>>>and nobody knows the geometry required. So, if LILO doesn't break, this
>>>>is pure coincidence.
>>>
>>>
>>>well, lilo without linear worked for like years on this box ...
>>
>>You have to take into account that when you create a new kernel image,
>>it may end up stored, sometimes after a long time, in another block
>>group far away. This is because ext2 may suddenly feel like doing
>>so... And surprisingly you then have to teach lilo about the new
>>far-away sectors, because basic C/H/S addressing can't reach them
>>any longer. Been there, seen that frequently enough.
>
>
> this particular testbox has seen *thousands* of development kernels of all
> sizes, and i often have filled up the complete /boot partition. It is very
> unlikely that this harmless (and not too big) 2.5.29 kernel would have
> been the first one to trigger a 'wrong' CHS combination. Especially since
> 2.4 kernels with exactly the *same* bzImage (and same lilo) work just
> fine.
Well, having a look at lilo's innards, I can see the following:
    if (ioctl(fd,HDIO_GETGEO,&hdprm) < 0)
        die("geo_query_dev HDIO_GETGEO (dev 0x%04x): %s",device,
          strerror(errno));
    geo->heads = hdprm.heads;
    geo->cylinders = hdprm.cylinders;
    geo->sectors = hdprm.sectors;
    geo->start = hdprm.start;
    if ((geo->device = bios_device(geo, device)) < 0)
        geo->device = 0x80 + (MINOR(device) >> 6) +
          (MAJOR(device) == MAJOR_HD ? 0 :
          last_dev(MAJOR_HD,64));
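For anyone who wants to see what that ioctl returns on their own box, here is a small standalone sketch. The helper name `query_geometry` is made up and this is not lilo code; it only uses the HDIO_GETGEO ioctl and `struct hd_geometry` from <linux/hdreg.h>.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/hdreg.h>   /* struct hd_geometry, HDIO_GETGEO */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper, not part of lilo: ask the kernel for the geometry
 * of the block device at `path`. Returns 0 on success, -1 on failure,
 * with errno describing the open() or ioctl() error. */
int query_geometry(const char *path, struct hd_geometry *out)
{
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = ioctl(fd, HDIO_GETGEO, out);
    int saved = errno;      /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return rc < 0 ? -1 : 0;
}
```

On success the `heads`, `sectors`, `cylinders` and `start` members hold the same values lilo copies into its `geo` structure above. Opening the device usually needs root.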
If you look at the boot messages from a kernel:
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 78140160 sectors, CHS=77520/16/63, UDMA(33)
hda: hda1 hda4
You can actually see the CHS info field.
Would you care to compare them between 2.4 and 2.5 on the
system in question?
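If you want to do the comparison mechanically, something along these lines would pull the field out of a saved boot log. This is a sketch; `parse_chs_field` is a made-up name and it assumes the "CHS=c/h/s" format shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the "CHS=c/h/s" field from an IDE boot
 * message line. Returns 1 if the field was found and parsed, else 0. */
int parse_chs_field(const char *line, unsigned *c, unsigned *h, unsigned *s)
{
    assert(line != NULL && c != NULL && h != NULL && s != NULL);
    const char *p = strstr(line, "CHS=");
    if (p == NULL)
        return 0;
    return sscanf(p, "CHS=%u/%u/%u", c, h, s) == 3;
}
```

Feeding it the hda line from each kernel's dmesg output and comparing the three numbers is enough to tell whether the reported geometry changed.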
If they are not different, well, taking a look at bios_device()
in lilo you can actually see that it doesn't know *anything* about
EZ disk or similar partition table tricks and the like - this can
definitely be considered a bug *there*.
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-08 8:42 ` Marcin Dalecki
@ 2002-08-08 9:03 ` Ingo Molnar
2002-08-08 9:18 ` Marcin Dalecki
0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2002-08-08 9:03 UTC
To: martin; +Cc: Andries.Brouwer, Alan Cox, linux-kernel
On Thu, 8 Aug 2002, Marcin Dalecki wrote:
> If you look at the boot messages from a kernel:
>
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> hda: 78140160 sectors, CHS=77520/16/63, UDMA(33)
> hda: hda1 hda4
>
> You can actually see the CHS info field.
okay, here are the 2.4.18(-ish) and 2.5.30 CHS fields:
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 8418816 sectors (4310 MB) w/80KiB Cache, CHS=524/255/63, UDMA(33)
hdc: 40132503 sectors (20548 MB) w/1900KiB Cache, CHS=39813/16/63, UDMA(33)
hdb: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache
hda: QUANTUM FIREBALL SE4.3A, DISK drive
hdb: RICOH CD-R/RW MP7083A, ATAPI CD/DVD-ROM drive
hdc: QUANTUM FIREBALLP LM20.5, DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 8418816 sectors w/80KiB Cache, CHS=14848/9/63, UDMA(33)
hda: hda1 hda2 < hda5 hda6 >
hdc: 40132503 sectors w/1900KiB Cache, CHS=39813/16/63, UDMA(33)
hdc: hdc1
hdb: Disabling (U)DMA for RICOH CD-R/RW MP7083A
hdb: DMA disabled
hdb: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache
i have the bootlogs of this system back to 2.5.25 only, which also shows
the wrong(?) CHS:
Linux version 2.5.25 (mingo@mars) (gcc version 2.96 20000731 (Red Hat Linux 7.2
2.96-101.9)) #3 SMP Tue Jul 9 21:12:18 CEST 2002
hda: QUANTUM FIREBALL SE4.3A, DISK drive
hdb: RICOH CD-R/RW MP7083A, ATAPI CD/DVD-ROM drive
hdc: QUANTUM FIREBALLP LM20.5, DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 8418816 sectors w/80KiB Cache, CHS=14848/9/63
hda: [PTBL] [524/255/63] hda1 hda2 < hda5 hda6 >
hdc: 40132503 sectors w/1900KiB Cache, CHS=39813/16/63
hdc: hdc1
Ingo
* Re: [bug, 2.5.29, (not IDE)] partition table (not) corruption?
2002-08-08 9:03 ` Ingo Molnar
@ 2002-08-08 9:18 ` Marcin Dalecki
0 siblings, 0 replies; 8+ messages in thread
From: Marcin Dalecki @ 2002-08-08 9:18 UTC
To: Ingo Molnar; +Cc: martin, Andries.Brouwer, Alan Cox, linux-kernel
Ingo Molnar wrote:
> On Thu, 8 Aug 2002, Marcin Dalecki wrote:
>
>
>>If you look at the boot messages from a kernel:
>>
>>ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
>> hda: 78140160 sectors, CHS=77520/16/63, UDMA(33)
>> hda: hda1 hda4
>>
>>You can actually see the CHS info field.
>
>
> okay, here are the 2.4.18(-ish) and 2.5.30 CHS fields:
>
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide1 at 0x170-0x177,0x376 on irq 15
> hda: 8418816 sectors (4310 MB) w/80KiB Cache, CHS=524/255/63, UDMA(33)
> hdc: 40132503 sectors (20548 MB) w/1900KiB Cache, CHS=39813/16/63, UDMA(33)
> hdb: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache
>
> hda: QUANTUM FIREBALL SE4.3A, DISK drive
> hdb: RICOH CD-R/RW MP7083A, ATAPI CD/DVD-ROM drive
> hdc: QUANTUM FIREBALLP LM20.5, DISK drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide1 at 0x170-0x177,0x376 on irq 15
> hda: 8418816 sectors w/80KiB Cache, CHS=14848/9/63, UDMA(33)
> hda: hda1 hda2 < hda5 hda6 >
> hdc: 40132503 sectors w/1900KiB Cache, CHS=39813/16/63, UDMA(33)
> hdc: hdc1
> hdb: Disabling (U)DMA for RICOH CD-R/RW MP7083A
> hdb: DMA disabled
> hdb: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache
>
> i have the bootlogs of this system back to 2.5.25 only, which also shows
> the wrong(?) CHS:
>
> Linux version 2.5.25 (mingo@mars) (gcc version 2.96 20000731 (Red Hat Linux 7.2
> 2.96-101.9)) #3 SMP Tue Jul 9 21:12:18 CEST 2002
>
> hda: QUANTUM FIREBALL SE4.3A, DISK drive
> hdb: RICOH CD-R/RW MP7083A, ATAPI CD/DVD-ROM drive
> hdc: QUANTUM FIREBALLP LM20.5, DISK drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide1 at 0x170-0x177,0x376 on irq 15
> hda: 8418816 sectors w/80KiB Cache, CHS=14848/9/63
> hda: [PTBL] [524/255/63] hda1 hda2 < hda5 hda6 >
^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is actually the interesting
part. It is printed by the partition table parsing code.
255/63 *does* indicate LBA (linear) access to the drive. It is
really lilo that didn't parse the partition table and relied on the
value returned by the GETGEO ioctl instead.
At least lilo should automatically fall back to LBA if C > 1024.
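To put numbers on it, here is a rough sketch. The function names are invented, and the fallback rule is only the suggestion above, not what lilo actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cylinders implied by a total sector count and a heads/sectors pair,
 * rounded down. */
uint32_t implied_cylinders(uint32_t total_sectors, uint32_t heads,
                           uint32_t sectors)
{
    assert(heads > 0 && sectors > 0);
    return total_sectors / (heads * sectors);
}

/* Hypothetical policy, not lilo's behaviour: use LBA when the reported
 * geometry has more cylinders than CHS addressing can express. */
bool should_use_lba(uint32_t cylinders)
{
    return cylinders > 1024;
}
```

For the 8418816-sector hda quoted above, 255/63 implies 524 cylinders (CHS is fine) while 9/63 implies 14848 cylinders, which such a rule would send down the LBA path.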
> hdc: 40132503 sectors w/1900KiB Cache, CHS=39813/16/63
> hdc: hdc1
>
> Ingo
>
>
* Re: [bug, 2.5.29, IDE] partition table corruption?
2002-08-07 18:43 [bug, 2.5.29, IDE] partition table corruption? Andries.Brouwer
2002-08-07 21:12 ` [bug, 2.5.29, (not IDE)] partition table (not) corruption? Ingo Molnar
@ 2002-08-08 7:46 ` Marcin Dalecki
1 sibling, 0 replies; 8+ messages in thread
From: Marcin Dalecki @ 2002-08-08 7:46 UTC
To: Andries.Brouwer; +Cc: mingo, alan, linux-kernel
> Since 2.5.30 many people will have a different geometry, so many
> people will have to find grub or a recent LILO, or add "linear"
> to their old LILO. This is all well understood - I just repeat it
> a few times in the hope that that will reduce the amount of email.
I think you are confusing two entirely unrelated issues a bit:
1. Remapping a single sector, which makes the behaviour of dd
if=/dev/hda of=/dev/hdb less than intuitive, namely: severely BROKEN. It
doesn't matter that this was broken for years. Now I remember that it did
bite me once when I tried to clone a system in precisely the dd way. (Of
course rerunning lilo on the clone wasn't impossible for me...) The only
thing which worries me here are the problems Petr was reporting...
2. The xlate trick, which was only supposed to be used by the MSDOS fs
driver, and only on i386, and only if this thing was resident and
ide-disk was not compiled as a module, and so on. This is actually the
*geometry* issue. If someone needs access to an MS-DOS partition, well,
he can always resort to mtools. FAT16, which is likely the affected
variant of the FAT filesystem, was broken before anyway, and I still have
to check whether the removal of the geometry "translation" hasn't perhaps
even made my CF PSION system disk readable.