From: Jason Tackaberry <tack@linux.com>
To: linux-lvm@sistina.com
Subject: [linux-lvm] Nasty bug in lvm and/or md and/or reiserfs
Date: Thu, 26 Jul 2001 13:19:28 -0400
Message-ID: <20010726131928.D1779@linux.com>
I have discovered one or more severe bugs in any or all of lvm, md, and
reiserfs. I'm not a kernel guy and I don't claim to know how these
things are supposed to interact at that level, so I'm not going to point
fingers. Instead, here's what happened and I'll let you decide. :)
The problem surfaces when you create a volume group with lvm and include
a partition that happens to be mounted. In my tests the cwd was this
mounted filesystem. This may or may not be a necessary condition to
reproduce this problem. (I didn't test otherwise.)
In my tests I have /dev/sda7 and /dev/sdb7. I created an md raid0
device and assigned that to /dev/md3. Here is the sequence of
commands I issued:
mkreiserfs /dev/md3
mount /dev/md3 /space
cd /space
pvcreate /dev/md3
vgcreate vol01 /dev/md3
Obviously it's not normal to create a vg on a device that's mounted. I
did this by accident, but the result was pretty ugly. Immediately after
doing 'vgcreate' this popped up in the console:
--- snip ---------------------------------------------------------------
is_tree_node: node level 19784 does not match to the expected one -1
vs-5150: search_by_key: invalid format found in block 0. Fsck?
kernel BUG at namei.c:343!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c017d965>]
EFLAGS: 00010286
eax: 0000001b ebx: c2047e60 ecx: c2c0e000 edx: c02d88c4
esi: c28795e0 edi: c2047ea0 ebp: c2047e04 esp: c2047df8
ds: 0018 es: 0018 ss: 0018
Process bash (pid: 834, stackpage=c2047000)
Stack: c028ac46 c028acda 00000157 00000001 00000002 7f84dbff 000001f4 00000000
00000003 c011ac84 00000001 c028ad8e c2047e60 c2047ea0 c23be0e0 c017da8b
c28795e0 c23be13c 0000000a c2047e60 c2047ea0 00000000 00000001 c03275c0
Call Trace: [<c011ac84>] [<c017da8b>] [<c0139b70>] [<c013a582>] [<c0141999>] [<c0139c2f>] [<c013a341>]
[<c013a94a>] [<c0137993>] [<c012fae3>] [<c0106ecb>]
Code: 0f 0b 83 c4 0c 8b 54 24 3c 52 8b 44 24 3c 50 57 55 e8 25 fc
--- snip ---------------------------------------------------------------
After realising that I'd tried to vgcreate a partition that was mounted,
I cd'ed out of /space and issued 'umount /space'. Then I got this:
--- snip ---------------------------------------------------------------
journal-2332: Trying to log block 16, which is a log block
kernel BUG at prints.c:332!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01887a4>]
EFLAGS: 00010286
eax: 0000001c ebx: c0335a80 ecx: cf41c000 edx: c02d88c4
esi: cc3c1e68 edi: c0335ebd ebp: c0335a80 esp: cc3c1e2c
ds: 0018 es: 0018 ss: 0018
Process umount (pid: 881, stackpage=cc3c1000)
Stack: c028c6c6 c028c9c0 0000014c cc3c1e44 cc3c1e48 00000000 000007e0 00000001
00000001 d0865024 00000000 d0865000 c0195e4e ce803800 c028fb80 00000010
00000560 00000000 00000001 00000001 00000807 00000000 00000000 ce633ea0
Call Trace: [<c0195e4e>] [<c0193546>] [<c018684a>] [<c01935d1>] [<c018685a>] [<c0135174>] [<c0139b88>]
[<c013559a>] [<c012fb83>] [<c0121ea2>] [<c01355dc>] [<c0106ecb>]
Code: 0f 0b 83 c4 0c 8b 54 24 28 85 d2 74 08 8b 44 24 28 83 48 28
Segmentation fault
--- snip ---------------------------------------------------------------
After umount segfaulted, I decided I ought to reboot. ctrl-alt-del or
halt(8) appeared to have no effect, so I hit the reset button.
As my system was coming back up, the boot sequence continually said
"Can't open /etc/mtab: input/output error." The boot and root
partitions are both md RAID1 devices with ext2. The disks seemed to
fsck okay, and when I logged in, I indeed could not read /etc/mtab. 'ls
-l /etc/mtab' even yielded the same input/output error. I rebooted with
the stock RH 7.1 kernel that has no lvm support and this error
persisted. So this seems to be a problem with md?
I managed to fix it by booting the RedHat rescue cd, mounting the root
partition, and deleting /etc/mtab. All seems okay.
Just to see if this problem was linked to md beneath LVM, I repeated the
above with /dev/sda7 instead of /dev/md3. Same result. Now, I also
repeated the above using ext2 instead of reiserfs. ext2 complained:
EXT2-fs error (device sd(8,7)): ext2_write_inode: unable to read inode
block - inode=2, block=1899261259
attempt to access beyond end of device
But it did not cause any kernel errors. This process failed gracefully
with ext2.
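For a sense of scale, the block number in the ext2 message can be converted to a byte offset. This is only a rough sketch: the filesystem's block size is not stated above, so 1 KiB and 4 KiB are assumed here as common ext2 block sizes, and block_offset_gib is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the offset, in whole GiB, of a filesystem block.
# $1: block number, $2: block size in bytes (assumed; not given in the report).
# Relies on 64-bit shell arithmetic.
block_offset_gib() {
    echo $(( $1 * $2 / 1073741824 ))
}

block_offset_gib 1899261259 1024   # assuming 1 KiB blocks
block_offset_gib 1899261259 4096   # assuming 4 KiB blocks
```

Under either assumption the offset is well over a terabyte, which fits the "attempt to access beyond end of device" message: the block number is garbage rather than a real location on the partition.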
Now, if I had to blame LVM for something, I'd say it should at least not
let me vgcreate with a mounted partition, or one that has inodes in use.
mkraid, for example, would not let me start a raid device with a mounted
partition. If vgcreate failed with an error saying "This partition is
mounted or has inodes in use" I would have realized the problem and
unmounted /space before vgcreate. The kernel errors came from Reiserfs,
so there seems to be a problem there. But like I said, I'm not a kernel
guy so I'll let you guys confirm this before I submit a bug report to
the Reiserfs team (or maybe one of the LVM hackers would be a better
candidate). Finally, md did not handle this robustly at all. The whole
/etc/mtab issue makes me extremely nervous since md is handling my boot
and root partition with RAID1. I'm just in the testing stage,
fortunately, but I'm thinking md/lvm aren't suitable for this system
which will be in production in a month.
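The kind of guard suggested above can be approximated in userspace. The following is a minimal sketch, not how LVM itself behaves: is_mounted and guarded_vgcreate are hypothetical names, and the check assumes the device shows up in /proc/mounts under exactly the path given. It would not catch a device listed under a different name, nor a partition that is only in use as a component of a mounted md array.

```shell
#!/bin/sh
# Return 0 if the device appears as a mount source in the mounts table.
# $1: device path (e.g. /dev/md3)
# $2: mounts table to read; defaults to /proc/mounts
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # The first whitespace-separated field of each line is the mounted device.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical wrapper: refuse to run pvcreate/vgcreate on a mounted device.
# $1: volume group name, $2: device path, $3: optional mounts table
guarded_vgcreate() {
    vg=$1
    dev=$2
    if is_mounted "$dev" "${3:-/proc/mounts}"; then
        echo "refusing: $dev is mounted; unmount it first" >&2
        return 1
    fi
    pvcreate "$dev" && vgcreate "$vg" "$dev"
}
```

With /dev/md3 still mounted on /space, `guarded_vgcreate vol01 /dev/md3` would print the refusal and return non-zero instead of touching the device.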
I'd be happy to repeat any experiments or give further information if
you need some. This system is mine to beat on for a little while longer
before it needs to go live.
Cheers,
Jason.
--
Academic Computing Support Specialist Assistant Section Editor
Algoma University College http://linux.com/develop
Sault Ste. Marie, Ontario
705-949-2301 x330 Personal Home Page
http://www.auc.ca http://sault.org