* grub mishandles corrupt/missing primary GPT
From: Chris Murphy @ 2013-10-24 1:38 UTC
To: The development of GNU GRUB

https://bugzilla.redhat.com/show_bug.cgi?id=1022743

Gist is, starting with a disk with valid PMBR, primary GPT, and backup GPT, if I zero LBA 2, I can no longer boot from the disk. I get a grub rescue prompt.

Instead, if I merely corrupt a portion of the first partitiontypeguid to mimic corruption, I can still boot, whereas this primary GPT fails checksums with both gdisk and parted.

This tells me that GRUB isn't checking for the validity of the primary GPT. And GRUB doesn't ever use the backup GPT.

Expected behavior is GRUB should check if the MBR is a PMBR (1st and only entry is type 0xEE) and if not then consider the disk MBR. If it is PMBR, check validity of the primary GPT header+table, if valid use it. If invalid, check validity of backup GPT header+table, if valid use it. If invalid, fail.

Chris Murphy
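The expected behavior described in this message can be read as a small decision procedure. The Python sketch below is only an illustration of that reading, not GRUB code: the names `is_pmbr`, `choose_partition_table` and `make_lba0` are hypothetical, and the two boolean flags stand in for whatever checksum validation of the primary and backup header+table would be used.

```python
MBR_SIGNATURE = b"\x55\xaa"   # 0xAA55, stored little-endian at bytes 510..511
TABLE_OFFSET = 446            # first of the four 16-byte MBR partition entries
TYPE_OFFSET = 4               # partition type byte inside each entry


def is_pmbr(lba0: bytes) -> bool:
    """True if the first and only used MBR entry has type 0xEE (hypothetical helper)."""
    if len(lba0) < 512 or lba0[510:512] != MBR_SIGNATURE:
        return False
    types = [lba0[TABLE_OFFSET + 16 * i + TYPE_OFFSET] for i in range(4)]
    return types[0] == 0xEE and all(t == 0x00 for t in types[1:])


def choose_partition_table(lba0: bytes, primary_valid: bool, backup_valid: bool) -> str:
    """Follow the order proposed in the message: MBR, primary GPT, backup GPT, fail."""
    if not is_pmbr(lba0):
        return "mbr"            # not a PMBR: consider the disk MBR
    if primary_valid:
        return "primary-gpt"
    if backup_valid:
        return "backup-gpt"
    return "fail"


def make_lba0(types) -> bytes:
    """Build a 512-byte test sector with the given four partition type bytes."""
    sector = bytearray(512)
    for i, ptype in enumerate(types):
        sector[TABLE_OFFSET + 16 * i + TYPE_OFFSET] = ptype
    sector[510:512] = MBR_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    pmbr = make_lba0([0xEE, 0x00, 0x00, 0x00])
    print(choose_partition_table(pmbr, primary_valid=False, backup_valid=True))
```

Note that this strict reading sends any MBR with additional entries down the MBR path, which is exactly the hybrid case the replies go on to argue about.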
* Re: grub mishandles corrupt/missing primary GPT
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2013-10-24 1:49 UTC
To: grub-devel

On 24.10.2013 03:38, Chris Murphy wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1022743
>
> Gist is, starting with a disk with valid PMBR, primary GPT, and backup GPT, if I zero LBA 2, I can no longer boot from the disk. I get a grub rescue prompt.
>
> Instead, if I merely corrupt a portion of the first partitiontypeguid to mimic corruption, I can still boot, whereas this primary GPT fails checksums with both gdisk and parted.
>
> This tells me that GRUB isn't checking for the validity of the primary GPT. And GRUB doesn't ever use the backup GPT.
>
> Expected behavior is GRUB should check if the MBR is a PMBR (1st and only entry is type 0xEE)

There are so-called "hybrid" disks, which we have to treat as GPT.

> and if not then consider the disk MBR. If it is PMBR, check validity of the primary GPT header+table, if valid use it. If invalid, check validity of backup GPT header+table, if valid use it. If invalid, fail.

The partmap module is size-critical, and CRC32 verification is pretty big.

There are 3 problems with the backup header:
1) The backup header would be preserved even when the primary is deliberately reformatted, and if we use it then we'll use it even on disks where we should use the newly created MBR.
2) The disk size isn't always known (loopback over a network device, ieee1275 disks and CD-ROMs, possibly others).
3) There are some weird scenarios with USB enclosures "forgetting" the last disk sectors, which leads to the partition table having two different backup headers. Consider the following scenario: one formats the disk in the enclosure, then attaches the disk natively, moves the backup header to the real end of the disk, and later modifies the partition table. Then one puts the disk in the enclosure again, and the backup header found there has the older table.

Do you have ways to handle this?
Why would the primary be corrupted in the first place?

>
> Chris Murphy
>
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel
* Re: grub mishandles corrupt/missing primary GPT
From: Chris Murphy @ 2013-10-24 3:07 UTC
To: The development of GNU GRUB

Thanks for the response:

On Oct 23, 2013, at 7:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com> wrote:

> On 24.10.2013 03:38, Chris Murphy wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1022743
>>
>> Gist is, starting with a disk with valid PMBR, primary GPT, and backup GPT, if I zero LBA 2, I can no longer boot from the disk. I get a grub rescue prompt.
>>
>> Instead, if I merely corrupt a portion of the first partitiontypeguid to mimic corruption, I can still boot, whereas this primary GPT fails checksums with both gdisk and parted.
>>
>> This tells me that GRUB isn't checking for the validity of the primary GPT. And GRUB doesn't ever use the backup GPT.
>>
>> Expected behavior is GRUB should check if the MBR is a PMBR (1st and only entry is type 0xEE)
> There are so called "hybrid" disks which we have to treat as GPT

While technically a violation of the UEFI spec, I think this can be worked around by considering the disk GPT if the first entry in the MBR is type 0xEE. I don't know of a hybrid MBR implementation where an entry other than the first is 0xEE.

But if there is no 0xEE entry at all, this is identical to a formerly GPT disk repartitioned as MBR by a utility that doesn't know anything about GPT, and thus doesn't erase the stale GPT data - and therefore must be treated as MBR.

>> and if not then consider the disk MBR. If it is PMBR, check validity of the primary GPT header+table, if valid use it. If invalid, check validity of backup GPT header+table, if valid use it. If invalid, fail.
> partmap module is size-critical and CRC32 verification is pretty big.

So perhaps this test is difficult because it's GPT on BIOS, with a limited space BIOS boot partition. However, I think on UEFI computers this should still work with one valid GPT, rather than not boot at all. There's a lot more space for this there.

> There are 3 problems with backup header:
> 1) Backup header would be preserved even when primary is deliberately reformatted and if we use it then we'll use it even on disks where we should use newly-created MBR

Both primary and backup GPTs are preserved in this case since the primary is in LBA 1 and 2, and only LBA 0 is overwritten with the new MBR.

UEFI spec says if the MBR signature of 0xaa55 is intact, and there isn't an 0xEE entry, and the partition entries are rational (physically on disk and don't overlap), then the two GPTs are considered stale and the disk is MBR.

> 2) The disk size isn't always known (loopback over network device, ieee1275 disks and CD-ROMs, possibly others)

The primary header contains the location of the backup GPT. If the header is sufficiently corrupt, and the backup GPT can't be located, then that's the same as an invalid backup GPT, and in that case fail.

My point is we shouldn't fail when there is a valid locatable backup GPT. The whole point of having a second GPT is obviated with the current behavior.

> 3) There are some weird scenarios with USB enclosures "forgetting" last disk sectors which leads to partition having two different back-headers.
> Consider following scenario:
> One formats with enclosure, then puts disk natively and moves backup headers to real end of disk and later modifies partition table. Then puts disk in enclosure again and then backup has older table.

I don't think we can work around this kind of hardware vendor sabotage. If it looks like a valid GPT, but is actually stale, if it's used and contains incorrect information, then boot fails. Better to try than not try at all.

> Do you have ways to handle this?
> Why primary would be corrupted in first place?

It's certainly uncommon. A Google search: corrupt "primary gpt" only turns up 1900 results. But it is possible.

And this isn't the only mishandling I'm finding, so it's not like GRUB is unique. In fact just now by changing only a single byte in the primary GPT table (I changed the E to an F in the BIOS boot partition type UUID), the kernel suddenly has no idea what disklabel the disk is, and fails to mount rootfs. So I need to track that down too, but it seems like it knows the primary GPT table is corrupt, but then fails to use the backup GPT for some reason.

An argument against GRUB doing all of this work: maybe the bootloader should be able to blindly trust the primary GPT table with no validity checks? And instead rely on (presently non-existent) checks by the underlying OS to fix this problem? Something like an fsck_gpt, seeing as nothing else is in a good position to both check and fix such GPTs other than a partition tool.

The UEFI spec says "Software should ask a user for confirmation before restoring the primary GPT" and yet it also requires the unspecified software fix the primary GPT if corrupt. The spec actually uses the word "must". So per usual, the spec has rather lofty demands.

Chris Murphy
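The validity checks and the "primary header contains the location of the backup GPT" point above can be sketched in a few lines. This is a minimal Python illustration, not GRUB's partmap code: the function names are hypothetical, `build_header` exists only to make test data, and the field offsets (signature at 0, header size at 12, header CRC32 at 16, alternate LBA at 32, entry count, entry size and table CRC32 at 80, 84 and 88) are taken from the GPT header layout in the UEFI specification.

```python
import struct
import zlib

GPT_SIGNATURE = b"EFI PART"
MIN_HEADER_SIZE = 92


def header_is_valid(sector: bytes) -> bool:
    """Check signature and header CRC32 (computed with the CRC field zeroed)."""
    if len(sector) < MIN_HEADER_SIZE or sector[:8] != GPT_SIGNATURE:
        return False
    header_size, stored_crc = struct.unpack_from("<II", sector, 12)
    if header_size < MIN_HEADER_SIZE or header_size > len(sector):
        return False
    zeroed = sector[:16] + b"\x00" * 4 + sector[20:header_size]
    return (zlib.crc32(zeroed) & 0xFFFFFFFF) == stored_crc


def table_is_valid(sector: bytes, table: bytes) -> bool:
    """Check the partition entry array against the CRC32 stored in the header."""
    count, entry_size, stored_crc = struct.unpack_from("<III", sector, 80)
    length = count * entry_size
    if len(table) < length:
        return False
    return (zlib.crc32(table[:length]) & 0xFFFFFFFF) == stored_crc


def alternate_lba(sector: bytes) -> int:
    """Where this header says the other (backup or primary) header lives."""
    return struct.unpack_from("<Q", sector, 32)[0]


def build_header(my_lba: int, alt_lba: int, table: bytes,
                 count: int = 128, entry_size: int = 128) -> bytes:
    """Test helper (hypothetical): build a header sector with correct CRCs."""
    hdr = bytearray(MIN_HEADER_SIZE)
    hdr[:8] = GPT_SIGNATURE
    struct.pack_into("<II", hdr, 8, 0x00010000, MIN_HEADER_SIZE)
    struct.pack_into("<QQ", hdr, 24, my_lba, alt_lba)
    struct.pack_into("<III", hdr, 80, count, entry_size,
                     zlib.crc32(table) & 0xFFFFFFFF)
    # The CRC field at offset 16 is still zero here, as the spec requires.
    struct.pack_into("<I", hdr, 16, zlib.crc32(bytes(hdr)) & 0xFFFFFFFF)
    return bytes(hdr).ljust(512, b"\x00")


if __name__ == "__main__":
    table = bytes(128 * 128)
    primary = build_header(my_lba=1, alt_lba=1000, table=table)
    damaged = bytearray(table)
    damaged[0] ^= 0x01  # a one-byte change, like the E-to-F edit described above
    print(header_is_valid(primary), table_is_valid(primary, bytes(damaged)))
    print("backup header expected at LBA", alternate_lba(primary))
```

If the primary header itself fails the check, the alternate LBA field cannot be trusted either, which is where the unknown-disk-size objection comes back in.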
* Re: grub mishandles corrupt/missing primary GPT
From: Lennart Sorensen @ 2013-10-24 13:39 UTC
To: The development of GNU GRUB

On Wed, Oct 23, 2013 at 09:07:21PM -0600, Chris Murphy wrote:
> While technically a violation of the UEFI spec, I think this can be worked around by considering the disk GPT if the first entry in the MBR is type 0xEE. I don't know of a hybrid MBR implementation where an entry other than the first is 0xEE.

Well everyone other than Microsoft seems to understand how useful support for hybrid setups can be and hence support them.

> But if there is no 0xEE entry at all, this is identical to a formerly GPT disk repartitioned as MBR by a utility that doesn't know anything about GPT, and thus doesn't erase the stale GPT data - and therefore must be treated as MBR.

That is true. That does not mean there must ONLY be a 0xEE entry.

> So perhaps this test is difficult because it's GPT on BIOS, with a limited space BIOS boot partition. However, I think on UEFI computers this should still work with one valid GPT, rather than not boot at all. There's a lot more space for this there.

Certainly if using the BIOS boot partition, there really isn't much of a space excuse anymore, unless you run into limitations on how much RAM you can use in early boot.

> Both primary and backup GPTs are preserved in this case since the primary is in LBA 1 and 2, and only LBA 0 is overwritten with the new MBR.
>
> UEFI spec says if the MBR signature of 0xaa55 is intact, and there isn't an 0xEE entry, and the partition entries are rational (physically on disk and don't overlap), then the two GPTs are considered stale and the disk is MBR.
>
> The primary header contains the location of the backup GPT. If the header is sufficiently corrupt, and the backup GPT can't be located, then that's the same as an invalid backup GPT, and in that case fail.
>
> My point is we shouldn't fail when there is a valid locatable backup GPT. The whole point of having a second GPT is obviated with the current behavior.

Sometimes backups are designed in and never used. I don't recall ever seeing any indication Microsoft ever used the second copy of the FAT for anything other than filesystem repair tools.

> I don't think we can work around this kind of hardware vendor sabotage. If it looks like a valid GPT, but is actually stale, if it's used and contains incorrect information, then boot fails. Better to try than not try at all.
>
> It's certainly uncommon. A Google search: corrupt "primary gpt" only turns up 1900 results. But it is possible.
>
> And this isn't the only mishandling I'm finding, so it's not like GRUB is unique. In fact just now by changing only a single byte in the primary GPT table (I changed the E to an F in the BIOS boot partition type UUID), the kernel suddenly has no idea what disklabel the disk is, and fails to mount rootfs. So I need to track that down too, but it seems like it knows the primary GPT table is corrupt, but then fails to use the backup GPT for some reason.
>
> An argument against GRUB doing all of this work: maybe the bootloader should be able to blindly trust the primary GPT table with no validity checks? And instead rely on (presently non-existent) checks by the underlying OS to fix this problem? Something like an fsck_gpt, seeing as nothing else is in a good position to both check and fix such GPTs other than a partition tool.

Perhaps. Certainly simpler.

I do wonder how Windows handles booting with a corrupt primary GPT. Would you happen to know? (A quick google search didn't find an answer to the question unfortunately).

> The UEFI spec says "Software should ask a user for confirmation before restoring the primary GPT" and yet it also requires the unspecified software fix the primary GPT if corrupt. The spec actually uses the word "must". So per usual, the spec has rather lofty demands.

So it must fix it after asking the user for confirmation?

--
Len Sorensen
* Re: grub mishandles corrupt/missing primary GPT
From: Chris Murphy @ 2013-10-24 18:17 UTC
To: The development of GNU GRUB

On Oct 24, 2013, at 7:39 AM, "Lennart Sorensen" <lsorense@csclub.uwaterloo.ca> wrote:

> On Wed, Oct 23, 2013 at 09:07:21PM -0600, Chris Murphy wrote:
>> While technically a violation of the UEFI spec, I think this can be worked around by considering the disk GPT if the first entry in the MBR is type 0xEE. I don't know of a hybrid MBR implementation where an entry other than the first is 0xEE.
>
> Well everyone other than Microsoft seems to understand how useful support for hybrid setups can be and hence support them.

Support is a very strong word. They're basically a craptastic workaround for prior unfortunate choices. Apple uses them, it hardly supports them. Their tools routinely nuke hybrid MBRs in favor of PMBRs, rendering the secondary OS unbootable; if the MBR and GPT aren't sync'd, they will bone the correct MBR with wrong GPT information, rendering the secondary OS unbootable and data inaccessible. And it does this silently.

I think it's OK to tiptoe around hybrid MBRs, and do something sensible, if possible. Supporting them is out of scope because there's no standard or agreed upon way to interpret them.

>> But if there is no 0xEE entry at all, this is identical to a formerly GPT disk repartitioned as MBR by a utility that doesn't know anything about GPT, and thus doesn't erase the stale GPT data - and therefore must be treated as MBR.
>
> That is true. That does not mean there must ONLY be a 0xEE entry.

Well, there must be only an 0xEE entry to treat the disk as a pure GPT disk. Once there's 0xEE and 1-3 additional entries, it's hybrid logic, very few combinations of which are sane. The MBR and GPT not agreeing with each other is actually somewhat common on Macs once you've used Boot Camp Assistant, because users think it's OK to resize OS X volumes in OS X Disk Utility, and then use the free space to either create an additional OS X partition or grow an existing Windows partition from within Windows. Oops, now the MBR and GPT don't agree with each other, so which one is correct? Well, it's ambiguous. With a few exceptions, there's actually no way to know what's correct, which is why hybrid MBRs are ultimately shit. But again, I'm fine dodging piles of crap rather than cleaning up other people's messes.

> I do wonder how Windows handles booting with a corrupt primary GPT. Would you happen to know? (A quick google search didn't find an answer to the question unfortunately).

I haven't tested it because I don't have a UEFI machine here, only Apple EFI. So I'm stuck with CSM-BIOS mode booting of Windows, which means it will only use MBR. I haven't figured out UEFI within qemu/kvm, and whether that can boot Windows in UEFI mode.

>> The UEFI spec says "Software should ask a user for confirmation before restoring the primary GPT" and yet it also requires the unspecified software fix the primary GPT if corrupt. The spec actually uses the word "must". So per usual, the spec has rather lofty demands.
>
> So it must fix it after asking the user for confirmation?

Yes it's just being silly. But the takeaway is that (partitioning) tools are behaving wrongly if they understand GPT, yet ignore or can't fix problems with either GPT. The spec only says software, it doesn't specify what software, so I'm assuming partitioning tools. Obviously the kernel is software, but it's not in a position to ask the user anything.

Chris Murphy
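The "MBR and GPT don't agree" situation can at least be detected mechanically, even if it cannot be resolved. The Python sketch below is an assumption-laden illustration: `build_mbr`, `mbr_extents` and `find_disagreements` are hypothetical names, the GPT side is passed in as already-parsed (first LBA, last LBA) pairs, and only exact extent matches count as agreement.

```python
import struct

TABLE_OFFSET = 446  # start of the four 16-byte MBR partition entries


def build_mbr(entries) -> bytes:
    """Test helper (hypothetical): entries are (type, first_lba, sector_count)."""
    sector = bytearray(512)
    for i, (ptype, first, count) in enumerate(entries):
        base = TABLE_OFFSET + 16 * i
        sector[base + 4] = ptype                           # partition type byte
        struct.pack_into("<II", sector, base + 8, first, count)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


def mbr_extents(lba0: bytes):
    """Return (type, first_lba, last_lba) for each non-empty MBR entry."""
    found = []
    for i in range(4):
        base = TABLE_OFFSET + 16 * i
        ptype = lba0[base + 4]
        first, count = struct.unpack_from("<II", lba0, base + 8)
        if ptype != 0x00 and count > 0:
            found.append((ptype, first, first + count - 1))
    return found


def find_disagreements(lba0: bytes, gpt_extents):
    """List MBR entries (other than 0xEE) with no identical extent in the GPT."""
    gpt = set(gpt_extents)
    return [(ptype, first, last)
            for ptype, first, last in mbr_extents(lba0)
            if ptype != 0xEE and (first, last) not in gpt]


if __name__ == "__main__":
    hybrid = build_mbr([(0xEE, 1, 409639), (0x07, 409640, 1000)])
    # GPT agrees on where the second partition starts but not on where it ends.
    print(find_disagreements(hybrid, [(40, 409639), (409640, 500000)]))
```

A non-empty result says only that the two tables disagree; as the message above argues, nothing in either table says which one is right.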
* Re: grub mishandles corrupt/missing primary GPT
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2013-11-02 22:36 UTC
To: grub-devel

On 24.10.2013 20:17, Chris Murphy wrote:
> Yes it's just being silly. But the take away is that (partitioning) tools are behaving wrongly if they understand GPT, yet ignore or can't fix problems with either GPT. The spec only says software, it doesn't specify what software, so I'm assuming partitioning tools. Obviously the kernel is software, but it's not in a position to ask the user anything.

GRUB's logic is that it should be able to read a corrupted GPT as long as it's not too corrupted, and let the kernel or a partitioning tool do the permanent fix.
* Re: grub mishandles corrupt/missing primary GPT
From: Phillip Susi @ 2013-12-09 15:28 UTC
To: The development of GNU GRUB

On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> partmap module is size-critical and CRC32 verification is pretty big. There are 3 problems with backup header:

The grub core no longer fits in 63 sectors in all but the most trivial configurations as it is, and a 2048 sector embed area has been standard now for several years, so I don't think size is a problem.
* Re: grub mishandles corrupt/missing primary GPT
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2013-12-09 15:54 UTC
To: The development of GNU GRUB

On 09.12.2013 16:28, Phillip Susi wrote:
> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> partmap module is size-critical and CRC32 verification is pretty big. There are 3 problems with backup header:
>
> The grub core no longer fits in 63 sectors in all but the most trivial configurations as it is,

Not true. I've checked: all configs not involving a compressed fs or diskfilter fit in 31K.

> and a 2048 sector embed area has been standard now for several years, so I don't think size is a problem.

We're speaking about GPT; this has nothing to do with the MBR embed area.

My problem with that is that it increases complexity a lot in currently simple code.

Also, I have had experience with a backup header out of place due to disk reconfiguration and a primary header corrupted but still intact enough to have valid partitions. I could boot this system by using the "gpt" Linux option. With the proposed changes this system would become unbootable.
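The sizes traded in this exchange (63 sectors, 31K, a 2048-sector embed area) are easier to compare as bytes. The short Python calculation below is a sketch under stated assumptions: 512-byte sectors, LBA 0 occupied by the MBR, and the embed area taken to be everything between LBA 0 and a first partition starting at LBA 63 or LBA 2048. The function name is hypothetical, and this reading of the figures is an interpretation, not something the messages spell out.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def embed_area_bytes(first_partition_lba: int) -> int:
    """Bytes between the MBR at LBA 0 and the start of the first partition."""
    return (first_partition_lba - 1) * SECTOR_SIZE


if __name__ == "__main__":
    for lba in (63, 2048):
        size = embed_area_bytes(lba)
        print(f"first partition at LBA {lba}: {size} bytes = {size / 1024:.1f} KiB")
```

Under these assumptions the older layout leaves exactly 31 KiB, which would be consistent with the "fit in 31K" figure quoted above.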
* Re: grub mishandles corrupt/missing primary GPT
From: Chris Murphy @ 2013-12-10 0:11 UTC
To: The development of GNU GRUB

On Dec 9, 2013, at 8:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com> wrote:

> On 09.12.2013 16:28, Phillip Susi wrote:
>> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>> partmap module is size-critical and CRC32 verification is pretty big. There are 3 problems with backup header:
>>
>> The grub core no longer fits in 63 sectors in all but the most trivial configurations as it is,
> Not true. I've checked: all configs not involving compressed fs or diskfilter fit in 31K.
>> and a 2048 sector embed area has been standard now for several years, so I don't think size is a problem.
>>
> We're speaking about GPT, nothing to do with MBR embed area.
>
> My problem with that is that it increases complexity a lot in currently simple code.
> And also I had experience with backup header out of place due to disk reconfiguration and primary header corrupted but still well enough to have valid partitions. I could boot this system by using "gpt" linux option. With proposed changes this system would become unbootable.

Technically if the alternate is invalid by being in the wrong location (either end of disk or where the primary header says it should be located), and the header is also invalid because the header is corrupt, then the disk has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry means any found GPT is stale (or rather, simply doesn't go looking for the GPT), it seems possibly reasonable for GRUB to blindly use the primary partition table. If it fails, it fails, even if it's unfortunate there's no fallback to a valid alternate GPT.

Maybe someone could argue it's a security problem for an invalid GPT being used despite being invalid?

Also, I have some evidence that newer Apple EFI firmware is repairing these cases. I have one older computer that I can consistently corrupt, and it remains corrupted through boot, even to the degree the (linux) kernel face plants by default if the primary header or table is corrupt, unless the gpt kernel parameter is used. Yet a newer computer boots without the kernel complaining, and upon startup completion the GPT is fixed. The installations were performed identically in both cases.

So maybe it can be argued the firmware has a role to play in fixing up GPT? Or maybe this is a hideously bad idea for firmware, which as we know is slightly less than massively bug ridden, to have such write privileges to the disk.

Chris Murphy
* Re: grub mishandles corrupt/missing primary GPT
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2013-12-10 0:55 UTC
To: The development of GNU GRUB

On 10.12.2013 01:11, Chris Murphy wrote:
>
> On Dec 9, 2013, at 8:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com> wrote:
>
>> On 09.12.2013 16:28, Phillip Susi wrote:
>>> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>>> partmap module is size-critical and CRC32 verification is pretty big. There are 3 problems with backup header:
>>>
>>> The grub core no longer fits in 63 sectors in all but the most trivial configurations as it is,
>> Not true. I've checked: all configs not involving compressed fs or diskfilter fit in 31K.
>>> and a 2048 sector embed area has been standard now for several years, so I don't think size is a problem.
>>>
>> We're speaking about GPT, nothing to do with MBR embed area.
>>
>> My problem with that is that it increases complexity a lot in currently simple code.
>> And also I had experience with backup header out of place due to disk reconfiguration and primary header corrupted but still well enough to have valid partitions. I could boot this system by using "gpt" linux option. With proposed changes this system would become unbootable.
>
> Technically if the alternate is invalid by being in the wrong location (either end of disk or where the primary header says it should be located), and the header is also invalid because the header is corrupt, then the disk has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry means any found GPT is stale (or rather, simply doesn't go looking for the GPT), it seems possibly reasonable for GRUB to blindly use the primary partition table. If it fails, it fails, even if it's unfortunate there's no fallback to a valid alternate GPT.

That is already the case. The real remaining points are probably:
- Should we use backup headers under some conditions?
- Should msdos partitions be visible? Always? When it's not a PMBR? Or when the GPT is corrupt?

> Maybe someone could argue it's a security problem for an invalid GPT being used despite being invalid?

CRC32 isn't a MAC. Anyone who can modify the GPT can fix the CRC32 as well.

> Also, I have some evidence that newer Apple EFI firmware are repairing these cases. I have one older computer that I can consistently corrupt, and it remains corrupted through boot, even to the degree the (linux) kernel face plants by default if the primary header or table is corrupt, unless the gpt kernel parameter is used. Yet a newer computer boots without the kernel complaining, and upon startup completion the GPT is fixed. Identically performed installations were performed in those cases.
>
> So maybe it can be argued the firmware has a role to play in fixing up GPT? Or maybe this is a hideously bad idea for firmware, which as we know is slightly less than massively bug ridden, to have such write privileges to the disk.

Firmware that writes to disk without being explicitly asked to is bugware or spyware.
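The "CRC32 isn't a MAC" point can be demonstrated directly: a checksum has no secret, so whoever rewrites the data can also rewrite the checksum. The Python sketch below is illustrative only. GPT defines no keyed authentication, so the HMAC half is a hypothetical contrast rather than anything a partition table contains, and the payload strings and key are made-up stand-ins.

```python
import hashlib
import hmac
import zlib


def crc_tag(data: bytes) -> int:
    """Unkeyed checksum: anyone can compute it for any data."""
    return zlib.crc32(data) & 0xFFFFFFFF


def mac_tag(key: bytes, data: bytes) -> bytes:
    """Keyed tag: computing a valid one requires the secret key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def crc_accepts(data: bytes, tag: int) -> bool:
    return crc_tag(data) == tag


def mac_accepts(key: bytes, data: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac_tag(key, data), tag)


if __name__ == "__main__":
    secret = b"known only to the verifier"           # hypothetical key
    tampered = b"first_lba=4096;last_lba=411647"     # stand-in for modified entries

    # The attacker recomputes the checksum over the modified data.
    print("CRC check passes:", crc_accepts(tampered, crc_tag(tampered)))

    # Without the key the attacker can only guess, and the keyed check fails.
    forged = mac_tag(b"attacker guess", tampered)
    print("MAC check passes:", mac_accepts(secret, tampered, forged))
```

So validating the CRC protects against accidental damage such as a bad write, not against deliberate modification.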
* Re: grub mishandles corrupt/missing primary GPT
From: Chris Murphy @ 2013-12-10 1:56 UTC
To: The development of GNU GRUB

On Dec 9, 2013, at 5:55 PM, Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com> wrote:

> On 10.12.2013 01:11, Chris Murphy wrote:
>>
>> Technically if the alternate is invalid by being in the wrong location (either end of disk or where the primary header says it should be located), and the header is also invalid because the header is corrupt, then the disk has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry means any found GPT is stale (or rather, simply doesn't go looking for the GPT), it seems possibly reasonable for GRUB to blindly use the primary partition table. If it fails, it fails, even if it's unfortunate there's no fallback to a valid alternate GPT.
> It's already the case.
> Probably the real remaining points are:
> - Should we use backup headers under some conditions?

It would be nice. But if not by validating at least the table checksum, how? I don't know how big the CRC32 code is in comparison to the code needed to evaluate the table with some heuristic approach. Also it seems like a bit flip of the most important partition data, the needed partition's start sector value (is the end value needed?), is a really rare case. The more likely scenario is some software alters the GPT and has a bad write or crash at that moment, in which case the cause of boot failure isn't a complete mystery.

> - Should msdos partitions be visible? Always? When it's not a PMBR? Or when GPT is corrupt?

I suggest parsing LBA 0 first for a conventional MBR; if it is one, don't even parse LBA 1 looking for a GPT. If the MBR is either hybrid or PMBR, then parse the GPT. I don't think it's a good idea to get into a case where GRUB looks at both MBR and GPT and has to figure out which partitions to honor or ignore if they aren't in sync. Even in the constrained Apple OS X Boot Camp implementation there has been a lot of data loss due to missteps in interpreting hybrid MBRs.

>> So maybe it can be argued the firmware has a role to play in fixing up GPT? Or maybe this is a hideously bad idea for firmware, which as we know is slightly less than massively bug ridden, to have such write privileges to the disk.
>>
> Firmware writing to disk without being explicitly asked for it is a bugware or spyware.

Yes I definitely find this really interesting behavior. If the firmware does have the ability to write, I wonder if an arbitrary EFI application could have write permission? If so, it seems like a potentially huge attack vector. I don't see what else could be repairing the GPT: computer firmware, SSD firmware, GRUB, linux kernel. I think GRUB and linux are out, otherwise one of them would have fixed the GPT on other hardware that used an identical installation source.

Chris Murphy
* Re: grub mishandles corrupt/missing primary GPT
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2013-12-10 2:06 UTC
To: The development of GNU GRUB

On 10.12.2013 02:56, Chris Murphy wrote:
>
> On Dec 9, 2013, at 5:55 PM, Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com> wrote:
>
>> On 10.12.2013 01:11, Chris Murphy wrote:
>>>
>>> Technically if the alternate is invalid by being in the wrong location (either end of disk or where the primary header says it should be located), and the header is also invalid because the header is corrupt, then the disk has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry means any found GPT is stale (or rather, simply doesn't go looking for the GPT), it seems possibly reasonable for GRUB to blindly use the primary partition table. If it fails, it fails, even if it's unfortunate there's no fallback to a valid alternate GPT.
>> It's already the case.
>> Probably the real remaining points are:
>> - Should we use backup headers under some conditions?
>
> It would be nice. But if not by validating at least the table checksum, how? I don't know how big the CRC32 code is in comparison to code needed to evaluate the table with some heuristic approach. Also it seems like a bit flip of the most important partition data, the needed partition's start sector value (is the end value needed?) is a really rare case. The more likely scenario is some software alters the GPT and has a bad write or crash at that moment, in which case the cause of boot failure isn't a complete mystery.

We need the end value as well. The interesting part here is that the data you need is about 1% of the checksummed area, so in most cases the checksum check gets in the way more than it helps.

>> - Should msdos partitions be visible? Always? When it's not a PMBR? Or when GPT is corrupt?
>
> I suggest parsing LBA 0 first for a conventional MBR, if it is, don't even parse LBA1 looking for a GPT. If the MBR is either hybrid or PMBR, then parse the GPT. I don't think it's a good idea to get into a case where GRUB looks at both MBR and GPT and has to figure out which partitions to honor or ignore if they aren't in sync. Even in the constrained Apple OS X Boot Camp implementation there has been a lot of data loss due to missteps in interpreting hybrid MBRs.

GRUB handles scenarios with multiple partmaps, but it won't handle all cases of desync correctly. E.g. partitions with the same start but a different end would be recognized as having the same UUID with most filesystems, but the files may be unreadable if the partition ends prematurely.

>>> So maybe it can be argued the firmware has a role to play in fixing up GPT? Or maybe this is a hideously bad idea for firmware, which as we know is slightly less than massively bug ridden, to have such write privileges to the disk.
>>>
>> Firmware writing to disk without being explicitly asked for it is a bugware or spyware.
>
> Yes I definitely find this really interesting behavior. If the firmware does have the ability to write, I wonder if an arbitrary EFI application could have write permission? If so, it seems like a potentially huge attack vector. I don't see what else could be repairing the GPT: computer firmware, SSD firmware, GRUB, linux kernel. I think GRUB and linux are out, otherwise one of them would have fixed the GPT on other hardware that used an identical installation source.

Firmware has full write capability. BIOS, EFI, IEEE1275 and ARC(S) all have disk write functions usable by a bootloader. U-Boot has only read functions. The remaining GRUB platforms have no disk functions, and GRUB uses its own drivers.
> > Chris Murphy > _______________________________________________ > Grub-devel mailing list > Grub-devel@gnu.org > https://lists.gnu.org/mailman/listinfo/grub-devel > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 291 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
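The parse order suggested in the message above (look at LBA 0 first, and go on to LBA 1 only when an 0xEE entry is present) can be sketched roughly as follows. This is an illustration in Python, not GRUB code: the names `classify_mbr` and `make_mbr` and the three return labels are hypothetical. The byte offsets are the standard MBR layout (partition table at byte 446, four 16-byte entries, type byte at offset 4 of each entry, 0x55 0xAA signature at byte 510). "Protective" follows the definition given at the start of the thread: the first and only entry is type 0xEE.

```python
MBR_TABLE_OFFSET = 446      # start of the four 16-byte partition entries
MBR_ENTRY_SIZE = 16
MBR_TYPE_OFFSET = 4         # type byte within each entry
GPT_PROTECTIVE_TYPE = 0xEE


def classify_mbr(lba0: bytes) -> str:
    """Classify LBA 0 as 'mbr', 'protective' or 'hybrid' (hypothetical labels)."""
    if len(lba0) < 512 or lba0[510:512] != b"\x55\xaa":
        raise ValueError("LBA 0 has no MBR boot signature")
    types = [
        lba0[MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE + MBR_TYPE_OFFSET]
        for i in range(4)
    ]
    if GPT_PROTECTIVE_TYPE not in types:
        return "mbr"            # conventional MBR: do not go looking for a GPT
    if types[0] == GPT_PROTECTIVE_TYPE and not any(types[1:]):
        return "protective"     # first and only entry is 0xEE
    return "hybrid"             # 0xEE plus other entries: still treated as GPT


def make_mbr(types: list[int]) -> bytes:
    """Build a 512-byte test sector with the given partition type bytes."""
    sector = bytearray(512)
    for i, ptype in enumerate(types):
        sector[MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE + MBR_TYPE_OFFSET] = ptype
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)
```

Under this sketch only the "mbr" result skips LBA 1; both "protective" and "hybrid" lead to the GPT, which matches the earlier remark in the thread that hybrid disks have to be treated as GPT. It says nothing about what to do when MBR and GPT entries disagree.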
* Re: grub mishandles corrupt/missing primary GPT
  2013-12-09 15:54 ` Vladimir 'φ-coder/phcoder' Serbinenko
  2013-12-10 0:11 ` Chris Murphy
@ 2013-12-10 19:38 ` Phillip Susi
  1 sibling, 0 replies; 13+ messages in thread
From: Phillip Susi @ 2013-12-10 19:38 UTC (permalink / raw)
To: The development of GNU GRUB

On 12/9/2013 10:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Not true. I've checked: all configs not involving compressed fs or
> diskfilter fit in 31K.

As I said, "trivial" configurations ;)

ext2 with no raid or lvm fits... btrfs or any combination of raid or lvm doesn't.

> We're speaking about GPT, nothing to do with MBR embed area.

You seemed to be concerned that increasing the size to deal with GPT properly would be bad for MBR setups. MBR setups already have plenty of spare room in the vast majority of cases.

> My problem with that is that it increases complexity a lot in
> currently simple code. And also I had experience with backup header
> out of place due to disk reconfiguration and primary header
> corrupted but still well enough to have valid partitions. I could
> boot this system by using "gpt" linux option. With proposed changes
> this system would become unbootable.

One very damaged configuration becomes unbootable, many other less damaged configurations become bootable. Good trade in my book.

^ permalink raw reply [flat|nested] 13+ messages in thread
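For reference, the fallback order being argued over in this thread (use the primary GPT if it validates, otherwise the backup, otherwise fail) can be sketched as below. This is a Python illustration under stated assumptions, not GRUB's partmap code, and `header_is_valid`, `choose_gpt` and `make_header` are hypothetical names. The 92-byte header layout and the rule that the header CRC32 is computed with its own field zeroed come from the UEFI GPT definition; `zlib.crc32` is assumed to be the same CRC-32 variant. The sketch does not check the header's own LBA field, and it deliberately ignores the objections raised earlier in the thread (code size, a stale backup surviving a reformat, unknown disk size).

```python
import struct
import zlib

# signature, revision, header_size, header_crc32, reserved, my_lba,
# alternate_lba, first_usable, last_usable, disk_guid, entries_lba,
# num_entries, entry_size, entries_crc32  (92 bytes, little-endian)
HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def header_is_valid(header: bytes, entries: bytes) -> bool:
    """Check signature, header CRC32 and partition-array CRC32."""
    if len(header) < HEADER_SIZE:
        return False
    fields = struct.unpack_from(HEADER_FORMAT, header)
    signature, size, header_crc = fields[0], fields[2], fields[3]
    num_entries, entry_size, entries_crc = fields[11], fields[12], fields[13]
    if signature != b"EFI PART" or not HEADER_SIZE <= size <= len(header):
        return False
    # The header CRC is computed with its own 4-byte field set to zero.
    zeroed = header[:16] + b"\x00" * 4 + header[20:size]
    if zlib.crc32(zeroed) != header_crc:
        return False
    table = entries[: num_entries * entry_size]
    return len(table) == num_entries * entry_size and zlib.crc32(table) == entries_crc


def choose_gpt(primary, backup):
    """Each argument is a (header, entries) tuple, or None if unreadable."""
    for name, copy in (("primary", primary), ("backup", backup)):
        if copy is not None and header_is_valid(*copy):
            return name
    return None  # neither copy validates: fail


def make_header(entries: bytes, num_entries: int = 128, entry_size: int = 128) -> bytes:
    """Build a test header whose checksums match `entries` (illustrative values)."""
    fields = [b"EFI PART", 0x00010000, HEADER_SIZE, 0, 0, 1, 0, 34, 0,
              bytes(16), 2, num_entries, entry_size, zlib.crc32(entries)]
    fields[3] = zlib.crc32(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)
```

With this ordering, a disk whose primary table was damaged but whose backup is intact resolves to "backup"; the counter-example given above (primary corrupted but still usable, backup out of place) resolves to `None`, which is exactly the trade-off the two sides disagree on.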