To: grub-devel@gnu.org
From: bitbucket@arcor.de
Date: Sun, 29 Jun 2008 20:00:47 +0200
Message-Id: <200806292000.47091.bitbucket@arcor.de>
Subject: grub 2: xfs.mod reads some directories incorrectly

Hello,

I want to bring to your attention a problem with the XFS module of grub2 since
I was told that you might not always pay attention to the bugtracker. I've
reported the bug at savannah.gnu.org as well as on the Debian BTS, so I'm
pasting the interesting snippets from the latter one for your convenience:

From: Alex Malinovich
To: 436943@bugs.debian.org
Subject: Re: Bug#436943: another confirmation
Date: Tue, 12 Feb 2008 05:06:15 -0800

On Tue, 2008-02-12 at 13:19 +0100, Robert Millan wrote:
--snip--
> On Tue, Feb 12, 2008 at 02:14:55AM -0800, Alex Malinovich wrote:
> >
> > Oddly, trying to do "ls (hd0,1)/" works just fine. Yet when running "ls
> > (hd0,1)/boot" gives that same "out of partition error", even
> > though /boot is NOT on a separate partition.
>
> Alex, what is your filesystem in (hd0,1)/ ?
>
> Can you identify the problem if you do:
>
>   set debug=all
>   ls (hd0,1)/boot

The fs on hd0,1 is xfs. I just did an fsck on it and the fs itself is
fine. Since it's hard to paste a good amount of code that happens at
boot, here's what I get when running grub-emu from the console with the
above commands. I'll do a reboot later and verify that I get the same
errors. If I see anything different on a regular boot I'll send in a
follow-up email.
One thing that might be potentially useful, when just doing the ls
without the debug=all, I actually get a little bit of output prior to
the out of partition error. In this particular case the output is:

grub> ls (hd0,1)/boot
?^^ ?^^ ^Q^_
error: out of partition
grub>

Not sure what those extra characters are about, but they are consistent
across multiple runs of grub-emu. So, running ls after setting debug=all
I get:

grub> ls (hd0,1)/boot
--snip--
/home/rmh/hacking/grub/svn/upload/grub2-1.96+20080210/kern/disk.c:364: Reading `hd0,1'...
/home/rmh/hacking/grub/svn/upload/grub2-1.96+20080210/kern/disk.c:371: Read out of range: sector 0xffffffffef400000 (out of partition).
/home/rmh/hacking/grub/svn/upload/grub2-1.96+20080210/kern/disk.c:364: of range: sector 0xffffffffef400000 (out of partition).
--snip--

repeating a few hundred times. There's some other scattered output but
it's very hard to make out. I do see some lines about detecting and
opening the xfs filesystem early in the log.

From: Niels Boehm
To: Debian Bug Tracking System <436943@bugs.debian.org>
Subject: grub-pc: xfs.mod reads some directories incorrectly
Date: Sat, 28 Jun 2008 08:15:33 +0200

Package: grub-pc
Version: 1.96+20080512-1
Followup-For: Bug #436943

Hi,

grub2 fails for me in the aforementioned manner. It is unable to read
anything from /boot or /boot/grub which are both on my root partition
with an xfs file system.

Trying to ls some directories, I find that some read without problem
(apparently ones containing only a few entries, like / /mnt /media
/lib64 for example) and others produce garbled output and the "out of
partition" error (ones with many entries, like /boot /boot/grub /etc
/bin for example).

I checked the root fs with /usr/sbin/xfs_check, but it looks alright.
And if I remember correctly, I created the root fs not long ago, so it
should have quite recent data structures.
The log being version 2 confirms that:

# xfs_info /
meta-data=/dev/root              isize=256    agcount=4, agsize=64510 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=258040, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=1200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

From: Niels Boehm
To: Debian Bug Tracking System <436943@bugs.debian.org>
Subject: grub-pc: missing mapping from fs-block-no. to disk-block-no. in xfs.c
Date: Sat, 28 Jun 2008 14:50:36 +0200

Package: grub-pc
Followup-For: Bug #436943

Okay, I hunted the problem down myself. It's a missing mapping from the
file system block numbering scheme ((agno << agbits) | block_in_ag) to
the on-partition block numbering (agno * agsize + block_in_ag) in the
grub_xfs_read_block() function.

It would affect all users who have a partition with more than one
allocation group with an agsize which is not a power of 2. The problem
arises when grub encounters files with blocks not on ag#0 and
directories which are extent lists not stored on ag#0.

I changed the offending file like that:

----- CUT HERE ----
--- grub2-1.96+20080512/fs/xfs.c	2008-02-02 15:15:31.000000000 +0100
+++ xfs.c_Niels	2008-06-28 12:40:39.487565975 +0200
@@ -162,4 +162,8 @@
   (grub_be_to_cpu64 (ino) >> GRUB_XFS_INO_AGBITS (data))
 
+#define GRUB_XFS_FSB_TO_BLOCK(data, fsb) \
+  (((fsb) >> (data)->sblock.log2_agblk) * (data)->agsize \
+   + ((fsb) & ((1 << (data)->sblock.log2_agblk) - 1)))
+
 #define GRUB_XFS_EXTENT_OFFSET(exts,ex) \
   ((grub_be_to_cpu32 (exts[ex][0]) & ~(1 << 31)) << 23 \
@@ -309,5 +313,5 @@
     grub_free (leaf);
 
-  return ret;
+  return GRUB_XFS_FSB_TO_BLOCK(node->data, ret);
 }
 
----- CUT HERE ----

The patch works fine for me, but I can't tell if I missed any
intricacies, since I'm not into grub development.
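
As a rough check of the mapping described above, here is a minimal standalone sketch in C. It is not code from GRUB: the struct `xfs_geom` and the helpers `log2_roundup` and `fsb_to_block` are hypothetical names made up for illustration, and it assumes that log2_agblk is the base-2 logarithm of agsize rounded up. The arithmetic follows the GRUB_XFS_FSB_TO_BLOCK macro from the patch, except that the mask is computed in 64 bits, which is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values the patch reads through
   (data)->agsize and (data)->sblock.log2_agblk.  */
struct xfs_geom
{
  uint32_t agsize;      /* blocks per allocation group */
  uint8_t log2_agblk;   /* log2 of agsize, rounded up (assumption) */
};

/* Smallest n such that (1 << n) >= agsize.  */
static uint8_t
log2_roundup (uint32_t agsize)
{
  uint8_t n = 0;

  while (((uint64_t) 1 << n) < agsize)
    n++;
  return n;
}

/* Map a file system block number, laid out as
   (agno << log2_agblk) | block_in_ag, to the on-partition block
   number agno * agsize + block_in_ag.  */
static uint64_t
fsb_to_block (const struct xfs_geom *g, uint64_t fsb)
{
  uint64_t agno = fsb >> g->log2_agblk;
  uint64_t block_in_ag = fsb & (((uint64_t) 1 << g->log2_agblk) - 1);

  return agno * g->agsize + block_in_ag;
}
```

With the agsize=64510 from the xfs_info output, log2_agblk comes out as 16. File system block (1 << 16) + 5, which is block 5 of ag#1, then maps to 64510 + 5 = 64515 on the partition. Blocks in ag#0 map to themselves, which would fit the observation that only some directories fail. With an agsize of exactly 65536 the function returns its input unchanged for every allocation group, so a power-of-2 agsize would hide the missing mapping.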
From: bitbucket@arcor.de
To: Robert Millan
Cc: 436943@bugs.debian.org
Subject: Re: Bug#436943: grub-pc: xfs.mod reads some directories incorrectly
Date: Sun, 29 Jun 2008 18:36:08 +0200

On Sunday 29 June 2008, Robert Millan wrote:
> The version you're using (1.96+20080512-1) is a bit old. Could you check
> if this problem is reproducible with the sid one? Or, since I notice you
> sent this upstream (thanks!), with latest CVS.

It's the same with 1.96+20080626-1. I also had a look at the source of the
CVS version, and it looks like the mapping is still missing there, but I'm
not sure. (It's a bit strange that they didn't notice it when they added
uuid detection to xfs.c, though; maybe they happen to have agsizes that are
a power of 2.) I'd prefer to stick to the normal packages, since I don't
really feel at home with package maintenance stuff.

Regards,
Niels Böhm