From: "Török Edwin" <edwintorok@gmail.com>
To: grub-devel@gnu.org
Subject: Re: Bug#478238: grub-probe: fails to find drive for /dev/sda10
Date: Sun, 11 May 2008 14:35:41 +0300
Message-ID: <4826DA0D.9070302@gmail.com>
In-Reply-To: <20080506133126.GG5055@thorin>

[-- Attachment #1: Type: text/plain, Size: 3823 bytes --]

[sending to grub-devel@ as requested]

Robert Millan wrote:
> On Sun, May 04, 2008 at 05:01:32PM +0300, Török Edwin wrote:
>   
>>>>    Device Boot      Start         End      Blocks   Id  System
>>>> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
>>>> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
>>>> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
>>>> /dev/sda4            6080        7296     9775552+  bf  Solaris
>>>> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
>>>> /dev/sda6            2372        3587     9767488+  83  Linux
>>>> /dev/sda7            3588        3600      104391   83  Linux
>>>> /dev/sda8            3601        4863    10145016   8e  Linux LVM
>>>> /dev/sda9            4864        5228     2931831   a6  OpenBSD
>>>> /dev/sda10           5229        5289      489951   83  Linux
>>>>         
>> [...]
>> grub> ls (hd0,10)
>> error: unknown device
>> grub> ls (hd0,11)
>> error: unknown device
>> grub>
>>     
>
> I tried reproducing your setup, but I can't hit the same bug.  This starts to
> look really nasty.  Just spotted this:
>
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x80, type 0x7, start 0x3f, len 0x1388afc
>   [...]
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x0, type 0x82, start 0x2270f07, len 0x1e267c
>
> for which I can't find any explanation other than memory corruption.  Also,
> due to a missing fflush() call the output is somewhat scrambled, which makes
> it harder to track (I fixed this already in upstream).
>
> Could you:
>
>   - Apply the attached patch & run grub-probe again (this time output
>     will be a bit more readable)
>   

There was no patch attached, so I ran 'cvs diff -u -D2008-04-30'
and applied the resulting patch instead.
I found the problem, and it also explains why you couldn't
reproduce it.

/dev/sda9 is not a valid OpenBSD partition, so in partmap/pc.c:176 the
iteration returns with the error "invalid disk label magic 0x%x".
If I replace that return with a continue, it works.

The problem is that grub2 stops looking for more partitions as soon as
it encounters the invalid partition.
grub 0.97 worked perfectly with this layout, so I never noticed that
the partition had the wrong type!

Also, if I change the partition type to 83 (as it should be), an
unpatched grub-probe finds that /boot is on /dev/sda10:
# grub-probe -t device /boot
/dev/sda10

I think grub2 should handle errors more gracefully: possibly mark the
partition as invalid, and keep going.
grub-probe was looking for /dev/sda10, and it shouldn't be affected by
/dev/sda9 being corrupted/invalid.
Think of it this way: if a partition gets corrupted, that shouldn't
prevent the system from booting, assuming the boot and root partitions
are still ok.
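
To illustrate the difference between the return and the continue, here
is a minimal standalone sketch. It is not the code from partmap/pc.c:
the struct, the function name count_visible and the skip_invalid flag
are all made up for illustration, and the only thing taken from the
real world is the BSD disklabel magic value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not GRUB's real types or names. */
#define BSD_LABEL_MAGIC 0x82564557u   /* BSD disklabel magic number */
#define TYPE_OPENBSD    0xa6          /* MBR partition type id for OpenBSD */

struct part {
    uint8_t  type;          /* MBR partition type id */
    uint32_t label_magic;   /* magic value found where a disklabel is expected */
};

/* Count the partitions that the iteration actually reaches.
   skip_invalid == 0 mimics the early return on a bad disklabel,
   skip_invalid != 0 mimics replacing that return with a continue. */
static size_t count_visible(const struct part *parts, size_t n, int skip_invalid)
{
    size_t visible = 0;

    assert(parts != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (parts[i].type == TYPE_OPENBSD
            && parts[i].label_magic != BSD_LABEL_MAGIC) {
            if (!skip_invalid)
                return visible;   /* later partitions are never seen */
            continue;             /* ignore this one and keep going */
        }
        visible++;
    }
    return visible;
}
```

With a table of three entries where the middle one has type a6 but no
valid magic, the early-return variant reaches only the first entry,
while the continue variant also reaches the third one.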

Compare what grub-emu says when sda9 has the wrong type:

grub> ls (hd0,10)
error: unknown device

And this is what it says when sda9 has the correct type:
grub> ls (hd0,10)
      Partition hd0,10: Filesystem type ext2, Label debian_BOOT



>   - Send it to grub-devel@gnu.org
>   
Done
>   ?
>
> Maybe someone there has an idea, but if it's memory corruption and we can't
> reproduce it, tracing the problem remotely isn't going to work very well.
>   

It wasn't memory corruption. However, I ran valgrind and it showed
some leaks, plus a call to stat() with a NULL parameter.
The attached patch fixes some of the valgrind warnings. Some leaks
still remain; I attached the new valgrind logs.
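
For the stat() warning, a slightly more defensive variant of the check
could look like the sketch below. This is not what the attached patch
does; is_regular_file is a hypothetical helper name. It also checks the
return value of stat() and uses the POSIX S_ISREG() macro, since
st_mode carries permission bits as well, so a plain comparison with
S_IFREG would rarely be true.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of grub-probe: returns 1 only when path
   is non-NULL, stat() succeeds, and the result is a regular file. */
static int is_regular_file(const char *path)
{
    struct stat st;

    if (path == NULL)
        return 0;               /* never hand NULL to stat() */
    if (stat(path, &st) != 0)
        return 0;               /* st is not filled in on failure */
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

A caller would then test is_regular_file (path) in place of the
separate stat() call and the mode comparison.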

P.S.: grub2 seems to work now; I am able to boot with it using the
text-mode menu. The default graphics mode doesn't work; I will open a
separate bug about that.

Best regards,
--Edwin



[-- Attachment #2: grub2.patch --]
[-- Type: text/x-diff, Size: 1033 bytes --]

diff -ur grub2-1.96+20080429/kern/disk.c ../grub2-1.96+20080429/kern/disk.c
--- grub2-1.96+20080429/kern/disk.c	2008-02-08 14:22:51.000000000 +0200
+++ ../grub2-1.96+20080429/kern/disk.c	2008-05-11 13:58:02.270673755 +0300
@@ -317,7 +317,10 @@
   /* Reset the timer.  */
   grub_last_time = grub_get_rtc ();
 
-  grub_free (disk->partition);
+  if(disk->partition) {
+	  grub_free (disk->partition->data);
+	  grub_free (disk->partition);
+  }
   grub_free ((void *) disk->name);
   grub_free (disk);
 }
diff -ur grub2-1.96+20080429/util/grub-probe.c ../grub2-1.96+20080429/util/grub-probe.c
--- grub2-1.96+20080429/util/grub-probe.c	2008-05-11 13:59:14.934811935 +0300
+++ ../grub2-1.96+20080429/util/grub-probe.c	2008-05-11 13:46:21.729236855 +0300
@@ -190,9 +190,10 @@
       struct stat st;
       grub_fs_t fs;
 
-      stat (path, &st);
+      if(path)
+	      stat (path, &st);
 
-      if (st.st_mode == S_IFREG)
+      if (path && st.st_mode == S_IFREG)
 	{
 	  /* Regular file.  Verify that we can read it properly.  */
 


[-- Attachment #3: vallog --]
[-- Type: text/plain, Size: 3800 bytes --]

==25071== Memcheck, a memory error detector.
==25071== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==25071== Using LibVEX rev 1804, a library for dynamic binary translation.
==25071== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==25071== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==25071== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==25071== For more details, rerun with: -v
==25071== 
==25071== My PID = 25071, parent PID = 5663.  Prog and args are:
==25071==    ./grub-probe
==25071==    -d
==25071==    /dev/sda10
==25071== 
==25071== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==25071==    This could cause spurious value errors to appear.
==25071==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==25071== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==25071==    This could cause spurious value errors to appear.
==25071==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==25071== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==25071==    This could cause spurious value errors to appear.
==25071==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==25071== 
==25071== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 10 from 1)
==25071== malloc/free: in use at exit: 611,077 bytes in 176 blocks.
==25071== malloc/free: 901 allocs, 725 frees, 2,397,201 bytes allocated.
==25071== For counts of detected errors, rerun with: -v
==25071== searching for pointers to 176 not-freed blocks.
==25071== checked 662,256 bytes.
==25071== 
==25071== 4,096 bytes in 1 blocks are possibly lost in loss record 3 of 5
==25071==    at 0x4006AB8: malloc (vg_replace_malloc.c:207)
==25071==    by 0x804AFE4: xmalloc (misc.c:81)
==25071==    by 0x804B41A: grub_malloc (misc.c:222)
==25071==    by 0x804C3EB: grub_disk_cache_store (disk.c:162)
==25071==    by 0x804CDC1: grub_disk_read (disk.c:461)
==25071==    by 0x8069A72: grub_lvm_scan_device (lvm.c:288)
==25071==    by 0x804C014: iterate_partition.2134 (device.c:132)
==25071==    by 0x8066C9C: pc_partition_map_iterate (pc.c:153)
==25071==    by 0x804F3AD: grub_partition_iterate (partition.c:126)
==25071==    by 0x804C09D: iterate_disk.2131 (device.c:101)
==25071==    by 0x80498FA: call_hook (biosdisk.c:132)
==25071==    by 0x804992B: grub_util_biosdisk_iterate (biosdisk.c:141)
==25071== 
==25071== 
==25071== 41,136 (41,132 direct, 4 indirect) bytes in 12 blocks are definitely lost in loss record 4 of 5
==25071==    at 0x4006AB8: malloc (vg_replace_malloc.c:207)
==25071==    by 0x804AFE4: xmalloc (misc.c:81)
==25071==    by 0x804B41A: grub_malloc (misc.c:222)
==25071==    by 0x804C3EB: grub_disk_cache_store (disk.c:162)
==25071==    by 0x804CDC1: grub_disk_read (disk.c:461)
==25071==    by 0x8066D4E: pc_partition_map_iterate (pc.c:165)
==25071==    by 0x804F3AD: grub_partition_iterate (partition.c:126)
==25071==    by 0x804C09D: iterate_disk.2131 (device.c:101)
==25071==    by 0x80498FA: call_hook (biosdisk.c:132)
==25071==    by 0x804992B: grub_util_biosdisk_iterate (biosdisk.c:141)
==25071==    by 0x804C4CC: grub_disk_dev_iterate (disk.c:205)
==25071==    by 0x804BF63: grub_device_iterate (device.c:138)
==25071== 
==25071== LEAK SUMMARY:
==25071==    definitely lost: 41,132 bytes in 12 blocks.
==25071==    indirectly lost: 4 bytes in 1 blocks.
==25071==      possibly lost: 4,096 bytes in 1 blocks.
==25071==    still reachable: 565,845 bytes in 162 blocks.
==25071==         suppressed: 0 bytes in 0 blocks.
==25071== Reachable blocks (those to which a pointer was found) are not shown.
==25071== To see them, rerun with: --leak-check=full --show-reachable=yes


Thread overview: 4+ messages
     [not found] <481590B8.7000405@gmail.com>
     [not found] ` <20080429134626.GB8328@thorin>
     [not found]   ` <481DC1BC.6020002@gmail.com>
     [not found]     ` <20080506133126.GG5055@thorin>
2008-05-11 11:35       ` Török Edwin [this message]
2008-05-12 15:44         ` Bug#478238: grub-probe: fails to find drive for /dev/sda10 Robert Millan
2008-05-12 16:02           ` Török Edwin
2008-05-12 18:43             ` Robert Millan
