public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: EFI partition code broken..
       [not found] ` <Pine.LNX.4.58.0411071128240.24286@ppc970.osdl.org>
@ 2004-11-07 20:03   ` Matt Domsch
  2004-11-07 21:13     ` Linus Torvalds
  2004-11-07 21:52   ` EFI partition code broken Matt Domsch
  1 sibling, 1 reply; 7+ messages in thread
From: Matt Domsch @ 2004-11-07 20:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: davej, Luck, Tony, linux-kernel, linux-ia64

On Sun, Nov 07, 2004 at 11:30:18AM -0800, Linus Torvalds wrote:
> There's a few reports of various USB storage devices locking up. The last 
> one was an iPod, but there's apparently others too.
> 
> The reason? They are unhappy if you access them past the end, and they 
> seem to have problems reporting their true size.
> 
> And the EFI partitioning code will happily just blindly try to access the 
> last sector, because that's where the EFI partition is. Boom. Immediately 
> dead iPod/whatever.
> 
> I'm going to make EFI depend on ia64. Does anybody else use them? 

It has been suggested on lkml that CONFIG_EFI_PARTITION (aka GUID
Partition Tables, or GPT), be used as a partitioning strategy
(assuming you want partitions instead of whole-disk LVM) for >2TB
non-boot block devices, on all architectures, not just on ia64.
Granted, you've got to have a RAID controller to get there today, but
SANs can.

Also, Intel is heavily pushing for Extensible Firmware Interface (EFI)
to be implemented in x86 and x86_64 environments in the next couple of
years, so it may become more widespread.
 
> Anyway, if anybody is interested in EFI tables on anything else, or if 
> ia64 people ever care about actually working in any interesting hw 
> situation, I'd strongly suggest EFI be fixed to not accept any random 
> crap. 
> 
> As far as I can tell from reading the source, the EFI driver already
> checks the first sector to see if it has a valid gpt, but apparently it
> happily ends up doing all the end-of-disk checks even if the GPT isn't
> there. Whatever the GPT is.

GPT = GUID Partition Table, that's the metadata as described in the
EFI Specification.  There's two copies, one at the front of the disk,
and one at the end of the disk, as a backup.  (why, I'm not too sure,
but that's the way it was specfied).  So the code needs to check both,
if it gets that far...
 
> Anyway, it is possible that the fix is to just exit EFI partition checking 
> early if the first sector isn't palatable. But I don't know what the EFI 
> rules are, and quite frankly, I think it's a hell of a lot more important 
> to not do random access patterns to a disk and confuse it than it is to 
> have EFI, so for now I'm just making it conditional on IA64.

The check for invalid primary master boot record (PMBR) needs to be
moved up ahead of the reads for the GPTs (primary at the start of the
disk, alternate/backup at the end of the disk).  When first written,
Intel didn't specify that the PMBR (a normal DOS-like MBR partition
table with a single entry of type 0xEE covering the whole disk) *had*
to exist, so we let the code try looking for GPTs first.  When the
PMBR test was added, it should have been added ahead of the GPT tests,
not after.  I'll work up a patch to do just that, and will then remove
the dependency on IA64.  Fair enough?
 
> Btw, USB devices are _not_ the only devices that don't necessarily know 
> their size very well. CD-roms tend to sometimes have the same issues. They 
> just don't have partitions on them, so people probably never cared. Also, 
> I think some things like nbd end up being "allocate on demand", and the 
> size is meaningless.
> 
> Basic rule of thumb: be _careful_ when accessing a disk. There are too 
> damn many buggy or strange setups out there.

Indeed.

Thanks,
Matt

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: EFI partition code broken..
  2004-11-07 20:03   ` EFI partition code broken Matt Domsch
@ 2004-11-07 21:13     ` Linus Torvalds
  2004-11-09 21:49       ` Matt Domsch
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2004-11-07 21:13 UTC (permalink / raw)
  To: Matt Domsch; +Cc: davej, Luck, Tony, linux-kernel, linux-ia64



On Sun, 7 Nov 2004, Matt Domsch wrote:
> 
> The check for invalid primary master boot record (PMBR) needs to be
> moved up ahead of the reads for the GPTs (primary at the start of the
> disk, alternate/backup at the end of the disk).  When first written,
> Intel didn't specify that the PMBR (a normal DOS-like MBR partition
> table with a single entry of type 0xEE covering the whole disk) *had*
> to exist, so we let the code try looking for GPTs first.  When the
> PMBR test was added, it should have been added ahead of the GPT tests,
> not after.  I'll work up a patch to do just that, and will then remove
> the dependency on IA64.  Fair enough?

Yes, sounds good. Also, if I understand you correctly, I would suggest
actually taking the size of the disk from the PMBR 0xEE entry itself,
rather than depend on what size the disk reports (perhaps double-check
that it is sane, of course - the more careful the better).

With people doing things like concatenating partitions with "md", the size
of the disk itself is less important than what the partition table was set
up with - even if the size reporting of the disk itself is reliable.

For example, let's say that you create a EFI table on a RAID (striped or
whatever), and that in turn then means that the first sub-disk used for
the RAID will contain (as part of the RAID array) the EFI signature in its 
PMBR, we don't want to look at the GPT at the end of _that_ disk. See what 
I'm saying?

		Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: EFI partition code broken..
       [not found] ` <Pine.LNX.4.58.0411071128240.24286@ppc970.osdl.org>
  2004-11-07 20:03   ` EFI partition code broken Matt Domsch
@ 2004-11-07 21:52   ` Matt Domsch
  2004-11-07 22:41     ` Linus Torvalds
  2004-11-07 23:46     ` Andries Brouwer
  1 sibling, 2 replies; 7+ messages in thread
From: Matt Domsch @ 2004-11-07 21:52 UTC (permalink / raw)
  To: Linus Torvalds, greg; +Cc: linux-kernel

On Sun, Nov 07, 2004 at 11:30:18AM -0800, Linus Torvalds wrote:
> There's a few reports of various USB storage devices locking up. The last 
> one was an iPod, but there's apparently others too.
> 
> The reason? They are unhappy if you access them past the end, and they 
> seem to have problems reporting their true size.
> 
> And the EFI partitioning code will happily just blindly try to access the 
> last sector, because that's where the EFI partition is. Boom. Immediately 
> dead iPod/whatever.

Another train of thought, and copying gregkh for inspiration.  Is there
any way to know which devices lie about their size, and fix that with
quirk code in the device discovery routines?  While I can fix
fs/partitions/efi.c to not to always do I/O to the end of the
purported size of the device, userspace and 'dd' can't.  If we could
quirk down the reported size for devices known to lie, then everything
which uses that value wouldn't have to have its own rules for such.

Thanks,
Matt

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: EFI partition code broken..
  2004-11-07 21:52   ` EFI partition code broken Matt Domsch
@ 2004-11-07 22:41     ` Linus Torvalds
  2004-11-07 23:46     ` Andries Brouwer
  1 sibling, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2004-11-07 22:41 UTC (permalink / raw)
  To: Matt Domsch; +Cc: greg, linux-kernel



On Sun, 7 Nov 2004, Matt Domsch wrote:
>
> Another train of thought, and copying gregkh for inspiration.  Is there
> any way to know which devices lie about their size, and fix that with
> quirk code in the device discovery routines?

The USB layer actually has some quirks like this, but I think that's a 
bug waiting to happen.

The thing is, if you start doing quirks, you _are_ screwed in the end.  
Don't do it. It's not just a maintenance nightmare, it's fundamentally
wrong. It fundamentally takes the approach of "you have to have a kernel
that is two years newer than the hardware you have", which is an approach 
that I just find incredibly broken.

Quirks work slightly better in practice for stuff that seldom changes,
and/or where we have fairly good vendor support. So CPU's, for example,
are largely ok with quirks (aka "errata"). But random regular devices?  
Please no.

Side note: the USB storage stuff has historically had tons of quirks,
largely because the SCSI layer used to do crap-all to try to be sane. The
SCSI layer historically only cared about high-end devices, and then the
USB storage model clashed pretty hard with the old SCSI layer belief that
standards are something that people follow etc.

Happily, most of those quirks are hopefully stale these days, because the
SCSI layer has been slowly converted to the idea that you don't use every
documented feature under the sun just because it exists.

So I'm trying to make for _fewer_ quirks rather than more of them.

		Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: EFI partition code broken..
  2004-11-07 21:52   ` EFI partition code broken Matt Domsch
  2004-11-07 22:41     ` Linus Torvalds
@ 2004-11-07 23:46     ` Andries Brouwer
  1 sibling, 0 replies; 7+ messages in thread
From: Andries Brouwer @ 2004-11-07 23:46 UTC (permalink / raw)
  To: Matt Domsch; +Cc: Linus Torvalds, greg, linux-kernel

On Sun, Nov 07, 2004 at 03:52:04PM -0600, Matt Domsch wrote:
> On Sun, Nov 07, 2004 at 11:30:18AM -0800, Linus Torvalds wrote:
> > There's a few reports of various USB storage devices locking up. The last 
> > one was an iPod, but there's apparently others too.
> > 
> > The reason? They are unhappy if you access them past the end, and they 
> > seem to have problems reporting their true size.
> > 
> > And the EFI partitioning code will happily just blindly try to access the 
> > last sector, because that's where the EFI partition is. Boom. Immediately 
> > dead iPod/whatever.
> 
> Another train of thought, and copying gregkh for inspiration.  Is there
> any way to know which devices lie about their size, and fix that with
> quirk code in the device discovery routines?  While I can fix
> fs/partitions/efi.c to not to always do I/O to the end of the
> purported size of the device, userspace and 'dd' can't.  If we could
> quirk down the reported size for devices known to lie, then everything
> which uses that value wouldn't have to have its own rules for such.

You see, Linux does automatic partition reading - nothing the user
can do about that - and goes Boom. One has to be more careful with
things that happen fully automatically than with things that happen
from user space. Perhaps the user learns not to do certain things.

The reason things go wrong is confusion about the result of asking
for the capacity. The SCSI way is to report the highest available address,
so that one has to add 1 to get the capacity. The ATA way is to give the
capacity. Some USB devices (that should use the SCSI way) get it wrong,
and thus give an off-by-1 answer, making the device one sector too large.
I introduced the US_FL_FIX_CAPACITY flag for that (in unusual_devs.h),
so once it is known that a device is bad it can be added to the list
with quirks. But it is better to be careful, and only do I/O to the last
sector when the user really asks for that.

Andries

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: EFI partition code broken..
  2004-11-07 21:13     ` Linus Torvalds
@ 2004-11-09 21:49       ` Matt Domsch
  2004-11-09 21:59         ` [PATCH 2.6] EFI GPT: reduce alternate header probing Matt Domsch
  0 siblings, 1 reply; 7+ messages in thread
From: Matt Domsch @ 2004-11-09 21:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: davej, Luck, Tony, linux-kernel, linux-ia64

On Sun, Nov 07, 2004 at 01:13:00PM -0800, Linus Torvalds wrote:
> On Sun, 7 Nov 2004, Matt Domsch wrote:
> > 
> > The check for invalid primary master boot record (PMBR) needs to be
> > moved up ahead of the reads for the GPTs (primary at the start of the
> > disk, alternate/backup at the end of the disk).  When first written,
> > Intel didn't specify that the PMBR (a normal DOS-like MBR partition
> > table with a single entry of type 0xEE covering the whole disk) *had*
> > to exist, so we let the code try looking for GPTs first.  When the
> > PMBR test was added, it should have been added ahead of the GPT tests,
> > not after.  I'll work up a patch to do just that, and will then remove
> > the dependency on IA64.  Fair enough?
> 
> Yes, sounds good. Also, if I understand you correctly, I would suggest
> actually taking the size of the disk from the PMBR 0xEE entry itself,
> rather than depend on what size the disk reports (perhaps double-check
> that it is sane, of course - the more careful the better).

While the PMBR 0xEE entry should cover the whole disk, I've got older
disks from Itanium1 Linux systems where it doesn't quite.  Likewise,
Windows systems appear to put 0xFFFFFFFF there, regardless of drive
size.  So that value can't be trusted.
 
> With people doing things like concatenating partitions with "md", the size
> of the disk itself is less important than what the partition table was set
> up with - even if the size reporting of the disk itself is reliable.
> 
> For example, let's say that you create a EFI table on a RAID (striped or
> whatever), and that in turn then means that the first sub-disk used for
> the RAID will contain (as part of the RAID array) the EFI signature in its 
> PMBR, we don't want to look at the GPT at the end of _that_ disk. See what 
> I'm saying?

Yes.

There are two ways to create software RAID volumes.

a) partition disks, put partitions in the array, then partition /dev/md_d0.

This is handled properly already.  The space projected to the OS as
the RAID volume does not include the underlying disks' partition table
sectors or last few sectors.  I tested this by making an md_d0 raid0
disk set, then partitioning that with GPT using parted, and all works
swimmingly, /dev/md_d0p[123456] all appeared as expected.  This is
the method that RAID autostart expects to find.

b) put whole disks in the array, then partition /dev/md_d0.

This isn't handled properly by existing code, because as you point
out, the MBR for the first disk and the MBR for the partitioned array
would occupy the same blocks.  In such a case, the GPT code would
blindly set up the disk with partitions (which were too big to have
fit on the disk), and then md would have done the same (with
partitions that fit in the md).  I've modified the GPT code to ignore
a partition table where the start or end blocks of the disk are
larger than the reported size of the disk.

There's a pathological case here, where you create a single-whole-disk
/dev/md_d0 device, then partition /dev/md_d0 into pieces.  I don't
know how to solve this, as the partition table data would occupy the
same space, and be valid for both configurations, but you would want
it to ignore the table on the disk and only care about the table on
the md.  I think this is just a case of "don't do that".

Patch to solve the immediate problem will follow in the next email.

Thanks,
Matt  

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2.6] EFI GPT: reduce alternate header probing
  2004-11-09 21:49       ` Matt Domsch
@ 2004-11-09 21:59         ` Matt Domsch
  0 siblings, 0 replies; 7+ messages in thread
From: Matt Domsch @ 2004-11-09 21:59 UTC (permalink / raw)
  To: Linus Torvalds, akpm; +Cc: davej, Luck, Tony, linux-kernel, linux-ia64

EFI partitioning scheme was reading the last reported sector of the
block device to look for the alternate GPT header, before it had
confirmed that it should.  This causes problems for devices with the
following problems:  a) those who misreport their size (typically
off-by-one), and b) those who fail when asked to read a block
outside their range.

This patch moves the test for the Protective Master Boot Record (PMBR)
ahead of the tests for the Primary and Alternate GPT headers.  If the
PMBR is not valid, the disk is assumed to not be a GPT disk.  This can
be overridden with the 'gpt' kernel command line option.  If the
Primary GPT header is not valid, the Alternate GPT header is not
probed automatically unless the 'gpt' kernel command line option is
passed.  If the both the PMBR and Primary GPT header are valid, then
the Alternate GPT header at the end of the disk is probed.

Also re-enables CONFIG_EFI_PARTITION for all architectures.


Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

===== fs/partitions/efi.c 1.8 vs edited =====
--- 1.8/fs/partitions/efi.c	2004-02-22 23:24:15 -06:00
+++ edited/fs/partitions/efi.c	2004-11-09 14:43:34 -06:00
@@ -3,7 +3,7 @@
  * Per Intel EFI Specification v1.02
  * http://developer.intel.com/technology/efi/efi.htm
  * efi.[ch] by Matt Domsch <Matt_Domsch@dell.com>
- *   Copyright 2000,2001,2002 Dell Inc.
+ *   Copyright 2000,2001,2002,2004 Dell Inc.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -23,6 +23,11 @@
  * TODO:
  *
  * Changelog:
+ * Mon Nov 09 2004 Matt Domsch <Matt_Domsch@dell.com>
+ * - test for valid PMBR and valid PGPT before ever reading
+ *   AGPT, allow override with 'gpt' kernel command line option.
+ * - check for first/last_usable_lba outside of size of disk
+ *
  * Tue  Mar 26 2002 Matt Domsch <Matt_Domsch@dell.com>
  * - Ported to 2.5.7-pre1 and 2.5.7-dj2
  * - Applied patch to avoid fault in alternate header handling
@@ -131,32 +136,6 @@
 }
 
 /**
- * is_pmbr_valid(): test Protective MBR for validity
- * @mbr: pointer to a legacy mbr structure
- *
- * Description: Returns 1 if PMBR is valid, 0 otherwise.
- * Validity depends on two things:
- *  1) MSDOS signature is in the last two bytes of the MBR
- *  2) One partition of type 0xEE is found
- */
-static int
-is_pmbr_valid(legacy_mbr *mbr)
-{
-	int i, found = 0, signature = 0;
-	if (!mbr)
-		return 0;
-	signature = (le16_to_cpu(mbr->signature) == MSDOS_MBR_SIGNATURE);
-	for (i = 0; signature && i < 4; i++) {
-		if (mbr->partition_record[i].sys_ind ==
-                    EFI_PMBR_OSTYPE_EFI_GPT) {
-			found = 1;
-			break;
-		}
-	}
-	return (signature && found);
-}
-
-/**
  * last_lba(): return number of last logical block of device
  * @bdev: block device
  * 
@@ -168,7 +147,40 @@
 static u64
 last_lba(struct block_device *bdev)
 {
-	return (bdev->bd_inode->i_size >> 9) - 1;
+	if (!bdev || !bdev->bd_inode)
+		return 0;
+	return (bdev->bd_inode->i_size >> 9) - 1ULL;
+}
+
+static inline int
+pmbr_part_valid(struct partition *part, u64 lastlba)
+{
+        if (part->sys_ind == EFI_PMBR_OSTYPE_EFI_GPT &&
+            le32_to_cpu(part->start_sect) == 1UL)
+                return 1;
+        return 0;
+}
+
+/**
+ * is_pmbr_valid(): test Protective MBR for validity
+ * @mbr: pointer to a legacy mbr structure
+ * @lastlba: last_lba for the whole device
+ *
+ * Description: Returns 1 if PMBR is valid, 0 otherwise.
+ * Validity depends on two things:
+ *  1) MSDOS signature is in the last two bytes of the MBR
+ *  2) One partition of type 0xEE is found
+ */
+static int
+is_pmbr_valid(legacy_mbr *mbr, u64 lastlba)
+{
+	int i;
+	if (!mbr || le16_to_cpu(mbr->signature) != MSDOS_MBR_SIGNATURE)
+                return 0;
+	for (i = 0; i < 4; i++)
+		if (pmbr_part_valid(&mbr->partition_record[i], lastlba))
+                        return 1;
+	return 0;
 }
 
 /**
@@ -186,8 +198,8 @@
 {
 	size_t totalreadcount = 0;
 
-	if (!bdev || !buffer)
-		return 0;
+	if (!bdev || !buffer || lba > last_lba(bdev))
+                return 0;
 
 	while (count) {
 		int copied = 512;
@@ -206,7 +218,6 @@
 	return totalreadcount;
 }
 
-
 /**
  * alloc_read_gpt_entries(): reads partition entries from disk
  * @bdev
@@ -289,6 +300,7 @@
 	     gpt_header **gpt, gpt_entry **ptes)
 {
 	u32 crc, origcrc;
+	u64 lastlba;
 
 	if (!bdev || !gpt || !ptes)
 		return 0;
@@ -301,9 +313,7 @@
 			"%lld != %lld\n",
 			(unsigned long long)le64_to_cpu((*gpt)->signature),
 			(unsigned long long)GPT_HEADER_SIGNATURE);
-		kfree(*gpt);
-		*gpt = NULL;
-		return 0;
+		goto fail;
 	}
 
 	/* Check the GUID Partition Table CRC */
@@ -315,9 +325,7 @@
 		Dprintk
 		    ("GUID Partition Table Header CRC is wrong: %x != %x\n",
 		     crc, origcrc);
-		kfree(*gpt);
-		*gpt = NULL;
-		return 0;
+		goto fail;
 	}
 	(*gpt)->header_crc32 = cpu_to_le32(origcrc);
 
@@ -327,16 +335,28 @@
 		Dprintk("GPT my_lba incorrect: %lld != %lld\n",
 			(unsigned long long)le64_to_cpu((*gpt)->my_lba),
 			(unsigned long long)lba);
-		kfree(*gpt);
-		*gpt = NULL;
-		return 0;
+		goto fail;
 	}
 
-	if (!(*ptes = alloc_read_gpt_entries(bdev, *gpt))) {
-		kfree(*gpt);
-		*gpt = NULL;
-		return 0;
+	/* Check the first_usable_lba and last_usable_lba are
+	 * within the disk.
+	 */
+	lastlba = last_lba(bdev);
+	if (le64_to_cpu((*gpt)->first_usable_lba) > lastlba) {
+		Dprintk("GPT: first_usable_lba incorrect: %lld > %lld\n",
+			(unsigned long long)le64_to_cpu((*gpt)->first_usable_lba),
+			(unsigned long long)lastlba);
+		goto fail;
 	}
+	if (le64_to_cpu((*gpt)->last_usable_lba) > lastlba) {
+		Dprintk("GPT: last_usable_lba incorrect: %lld > %lld\n",
+			(unsigned long long)le64_to_cpu((*gpt)->last_usable_lba),
+			(unsigned long long)lastlba);
+		goto fail;
+	}
+
+	if (!(*ptes = alloc_read_gpt_entries(bdev, *gpt)))
+		goto fail;
 
 	/* Check the GUID Partition Entry Array CRC */
 	crc = efi_crc32((const unsigned char *) (*ptes),
@@ -345,15 +365,36 @@
 
 	if (crc != le32_to_cpu((*gpt)->partition_entry_array_crc32)) {
 		Dprintk("GUID Partitition Entry Array CRC check failed.\n");
-		kfree(*gpt);
-		*gpt = NULL;
-		kfree(*ptes);
-		*ptes = NULL;
-		return 0;
+		goto fail_ptes;
 	}
 
 	/* We're done, all's well */
 	return 1;
+
+ fail_ptes:
+	kfree(*ptes);
+	*ptes = NULL;
+ fail:
+	kfree(*gpt);
+	*gpt = NULL;
+	return 0;
+}
+
+/**
+ * is_pte_valid() - tests one PTE for validity
+ * @pte is the pte to check
+ * @lastlba is last lba of the disk
+ *
+ * Description: returns 1 if valid,  0 on error.
+ */
+static inline int
+is_pte_valid(const gpt_entry *pte, const u64 lastlba)
+{
+	if ((!efi_guidcmp(pte->partition_type_guid, NULL_GUID)) ||
+	    le64_to_cpu(pte->starting_lba) > lastlba         ||
+	    le64_to_cpu(pte->ending_lba)   > lastlba)
+		return 0;
+	return 1;
 }
 
 /**
@@ -464,8 +505,13 @@
  * @ptes is a PTEs ptr, filled on return.
  * Description: Returns 1 if valid, 0 on error.
  * If valid, returns pointers to newly allocated GPT header and PTEs.
- * Validity depends on finding either the Primary GPT header and PTEs valid,
- * or the Alternate GPT header and PTEs valid, and the PMBR valid.
+ * Validity depends on PMBR being valid (or being overridden by the
+ * 'gpt' kernel command line option) and finding either the Primary
+ * GPT header and PTEs valid, or the Alternate GPT header and PTEs
+ * valid.  If the Primary GPT header is not valid, the Alternate GPT header
+ * is not checked unless the 'gpt' kernel command line option is passed.
+ * This protects against devices which misreport their size, and forces
+ * the user to decide to use the Alternate GPT.
  */
 static int
 find_valid_gpt(struct block_device *bdev, gpt_header **gpt, gpt_entry **ptes)
@@ -479,70 +525,43 @@
 		return 0;
 
 	lastlba = last_lba(bdev);
+        if (!force_gpt) {
+                /* This will be added to the EFI Spec. per Intel after v1.02. */
+                legacymbr = kmalloc(sizeof (*legacymbr), GFP_KERNEL);
+                if (legacymbr) {
+                        memset(legacymbr, 0, sizeof (*legacymbr));
+                        read_lba(bdev, 0, (u8 *) legacymbr,
+                                 sizeof (*legacymbr));
+                        good_pmbr = is_pmbr_valid(legacymbr, lastlba);
+                        kfree(legacymbr);
+                        legacymbr=NULL;
+                }
+                if (!good_pmbr)
+                        goto fail;
+        }
+
 	good_pgpt = is_gpt_valid(bdev, GPT_PRIMARY_PARTITION_TABLE_LBA,
 				 &pgpt, &pptes);
-        if (good_pgpt) {
+        if (good_pgpt)
 		good_agpt = is_gpt_valid(bdev,
-                                         le64_to_cpu(pgpt->alternate_lba),
+					 le64_to_cpu(pgpt->alternate_lba),
 					 &agpt, &aptes);
-                if (!good_agpt) {
-                        good_agpt = is_gpt_valid(bdev, lastlba,
-                                                 &agpt, &aptes);
-                }
-        }
-        else {
+        if (!good_agpt && force_gpt)
                 good_agpt = is_gpt_valid(bdev, lastlba,
                                          &agpt, &aptes);
-        }
 
         /* The obviously unsuccessful case */
-        if (!good_pgpt && !good_agpt) {
-                goto fail;
-        }
-
-	/* This will be added to the EFI Spec. per Intel after v1.02. */
-        legacymbr = kmalloc(sizeof (*legacymbr), GFP_KERNEL);
-        if (legacymbr) {
-                memset(legacymbr, 0, sizeof (*legacymbr));
-                read_lba(bdev, 0, (u8 *) legacymbr,
-                         sizeof (*legacymbr));
-                good_pmbr = is_pmbr_valid(legacymbr);
-                kfree(legacymbr);
-                legacymbr=NULL;
-        }
-
-        /* Failure due to bad PMBR */
-        if ((good_pgpt || good_agpt) && !good_pmbr && !force_gpt) {
-                printk(KERN_WARNING 
-                       "  Warning: Disk has a valid GPT signature "
-                       "but invalid PMBR.\n");
-                printk(KERN_WARNING
-                       "  Assuming this disk is *not* a GPT disk anymore.\n");
-                printk(KERN_WARNING
-                       "  Use gpt kernel option to override.  "
-                       "Use GNU Parted to correct disk.\n");
+        if (!good_pgpt && !good_agpt)
                 goto fail;
-        }
-
-        /* Would fail due to bad PMBR, but force GPT anyhow */
-        if ((good_pgpt || good_agpt) && !good_pmbr && force_gpt) {
-                printk(KERN_WARNING
-                       "  Warning: Disk has a valid GPT signature but "
-                       "invalid PMBR.\n");
-                printk(KERN_WARNING
-                       "  Use GNU Parted to correct disk.\n");
-                printk(KERN_WARNING
-                       "  gpt option taken, disk treated as GPT.\n");
-        }
 
         compare_gpts(pgpt, agpt, lastlba);
 
         /* The good cases */
-        if (good_pgpt && (good_pmbr || force_gpt)) {
+        if (good_pgpt) {
                 *gpt  = pgpt;
                 *ptes = pptes;
-                if (agpt)  { kfree(agpt);   agpt = NULL; }
-                if (aptes) { kfree(aptes); aptes = NULL; }
+                kfree(agpt);
+                kfree(aptes);
                 if (!good_agpt) {
                         printk(KERN_WARNING 
 			       "Alternate GPT is invalid, "
@@ -550,21 +569,21 @@
                 }
                 return 1;
         }
-        else if (good_agpt && (good_pmbr || force_gpt)) {
+        else if (good_agpt) {
                 *gpt  = agpt;
                 *ptes = aptes;
-                if (pgpt)  { kfree(pgpt);   pgpt = NULL; }
-                if (pptes) { kfree(pptes); pptes = NULL; }
+                kfree(pgpt);
+                kfree(pptes);
                 printk(KERN_WARNING 
                        "Primary GPT is invalid, using alternate GPT.\n");
                 return 1;
         }
 
  fail:
-        if (pgpt)  { kfree(pgpt);   pgpt=NULL; }
-        if (agpt)  { kfree(agpt);   agpt=NULL; }
-        if (pptes) { kfree(pptes); pptes=NULL; }
-        if (aptes) { kfree(aptes); aptes=NULL; }
+        kfree(pgpt);
+        kfree(agpt);
+        kfree(pptes);
+        kfree(aptes);
         *gpt = NULL;
         *ptes = NULL;
         return 0;
@@ -606,15 +625,15 @@
 	Dprintk("GUID Partition Table is valid!  Yea!\n");
 
 	for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
-		if (!efi_guidcmp(ptes[i].partition_type_guid, NULL_GUID))
+		if (!is_pte_valid(&ptes[i], last_lba(bdev)))
 			continue;
 
 		put_partition(state, i+1, le64_to_cpu(ptes[i].starting_lba),
 				 (le64_to_cpu(ptes[i].ending_lba) -
                                   le64_to_cpu(ptes[i].starting_lba) +
-				  1));
+				  1ULL));
 
-		/* If there's this is a RAID volume, tell md */
+		/* If this is a RAID volume, tell md */
 		if (!efi_guidcmp(ptes[i].partition_type_guid,
 				 PARTITION_LINUX_RAID_GUID))
 			state->parts[i+1].flags = 1;
@@ -624,22 +643,3 @@
 	printk("\n");
 	return 1;
 }
-
-/*
- * Overrides for Emacs so that we follow Linus's tabbing style.
- * Emacs will notice this stuff at the end of the file and automatically
- * adjust the settings for this buffer only.  This must remain at the end
- * of the file.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-indent-level: 4
- * c-brace-imaginary-offset: 0
- * c-brace-offset: -4
- * c-argdecl-indent: 4
- * c-label-offset: -4
- * c-continued-statement-offset: 4
- * c-continued-brace-offset: 0
- * indent-tabs-mode: nil
- * tab-width: 8
- * End:
- */
===== fs/partitions/Kconfig 1.6 vs edited =====
--- 1.6/fs/partitions/Kconfig	2004-11-07 12:15:48 -06:00
+++ edited/fs/partitions/Kconfig	2004-11-08 09:52:30 -06:00
@@ -219,7 +219,7 @@
 
 config EFI_PARTITION
 	bool "EFI GUID Partition support"
-	depends on PARTITION_ADVANCED && IA64
+	depends on PARTITION_ADVANCED
 	select CRC32
 	help
 	  Say Y here if you would like to use hard disks under Linux which

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-11-09 22:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <Pine.LNX.4.58.0411070959560.2223@ppc970.osdl.org>
     [not found] ` <Pine.LNX.4.58.0411071128240.24286@ppc970.osdl.org>
2004-11-07 20:03   ` EFI partition code broken Matt Domsch
2004-11-07 21:13     ` Linus Torvalds
2004-11-09 21:49       ` Matt Domsch
2004-11-09 21:59         ` [PATCH 2.6] EFI GPT: reduce alternate header probing Matt Domsch
2004-11-07 21:52   ` EFI partition code broken Matt Domsch
2004-11-07 22:41     ` Linus Torvalds
2004-11-07 23:46     ` Andries Brouwer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox