public inbox for linux-kernel@vger.kernel.org
* [2.4.18] oops in lvm or raid
@ 2003-12-29 14:59 Roger Gammans
  2003-12-29 17:16 ` Christophe Saout
  0 siblings, 1 reply; 3+ messages in thread
From: Roger Gammans @ 2003-12-29 14:59 UTC
  To: LKML

Hi

The system in question runs:-
   ext3-> lvm->raid1(hdc,hde) -> ide( piix  , pdc202xx )

We had a power failure here which caused severe corruption
on our system. After coming back up, one of the ext3 partitions
wouldn't recover its journal, and a manual fsck
gave warnings of severe data loss. Before running fsck I created an
LVM snapshot of the partition, desperately hoping that the LVM
metadata was intact, and used it to write the raw pre-fsck filesystem
image to a tape.

Anyway, to cut to the chase: I forgot to delete this snapshot
before recovering the filesystem from a backup, and at about the time
the snapshot got full I got the following kernel messages:


Dec 29 13:15:23 turin kernel: lvm -- giving up to snapshot /dev/rootvg/data_root on /dev/rootvg/data_20031218: out of spa
Dec 29 13:15:23 turin kernel: Unable to handle kernel paging request at virtual address 00015618
Dec 29 13:15:23 turin kernel:  printing eip:
Dec 29 13:15:23 turin kernel: c4847a7c
                              c4847a7c -> lvm_snapshot_remap_block (c4847a0c)
Dec 29 13:15:23 turin kernel: *pde = 00000000
Dec 29 13:15:23 turin kernel: Oops: 0000
Dec 29 13:15:23 turin kernel: CPU:    0
Dec 29 13:15:23 turin kernel: EIP:    0010:[raid1:__insmod_raid1_O/lib/modules/2.4.18-bf2.4/kernel/drivers/md+-169348/96]
Dec 29 13:15:23 turin kernel: EFLAGS: 00010246
Dec 29 13:15:23 turin kernel: eax: 0000ffff   ebx: 00015618   ecx: c2741600   edx: 00000000
Dec 29 13:15:23 turin kernel: esi: 0111e1f8   edi: 00000900   ebp: 00000000   esp: c1173e58
Dec 29 13:15:23 turin kernel: ds: 0018   es: 0018   ss: 0018
Dec 29 13:15:23 turin kernel: Process kswapd (pid: 4, stackpage=c1173000)
Dec 29 13:15:23 turin kernel: Stack: c2741600 0111e178 c2741770 c2741170 c0392520 00002ac3 00000038 00000000 
Dec 29 13:15:23 turin kernel:        c4844a5d c1173eb6 c1173eb8 0111e178 c2741600 00003a00 c3abc500 0111c0b8 
Dec 29 13:15:23 turin kernel:        01680000 011240b8 01680000 c2741000 c3b9f000 0111e178 000088e0 09005920 
Dec 29 13:15:23 turin kernel: Call Trace: [raid1:__insmod_raid1_O/lib/modules/2.4.18-bf2.4/kernel/drivers/md+-181667/96] 
Dec 29 13:15:23 turin kernel:    [try_to_free_buffers+174/212] [try_to_release_page+63/72] [shrink_cache+458/728] [shrink
Dec 29 13:15:23 turin kernel:    [kswapd_balance+18/40] [kswapd+153/188] [kernel_thread+40/56] 
		c0392520 -> tasklist_lock (c0392510)
		c4844a5d -> lvm_init (c4843060)
Dec 29 13:15:23 turin kernel: 
Dec 29 13:15:23 turin kernel: Code: 8b 0b eb 03 45 8b 09 39 d9 74 27 39 71 08 75 f4 66 39 79 0c 


The lines without a date stamp are ones I added by hand from my own oops
decoding, which doesn't seem to match the ksymoops decoding.
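
The hand decoding above amounts to finding the nearest symbol that starts at
or below each address and reporting the offset from it. A minimal sketch of
that lookup follows, assuming a symbol table of (start address, name) pairs;
the three entries are the ones quoted in this message, and the function name
resolve_symbol is hypothetical, not part of ksymoops or any kernel tool.

```python
from bisect import bisect_right

# Symbol start addresses taken from the hand-decoded lines in this message.
# A real table (for example one read from /proc/ksyms) would be far larger.
KNOWN_SYMBOLS = [
    (0xC0392510, "tasklist_lock"),
    (0xC4843060, "lvm_init"),
    (0xC4847A0C, "lvm_snapshot_remap_block"),
]


def resolve_symbol(addr, symbols):
    """Return (name, offset) for the nearest symbol at or below addr.

    Returns None when addr lies below every known symbol.
    """
    table = sorted(symbols)
    starts = [start for start, _ in table]
    # Index of the last symbol whose start address is <= addr.
    idx = bisect_right(starts, addr) - 1
    if idx < 0:
        return None
    start, name = table[idx]
    return name, addr - start


def format_symbol(addr, symbols):
    """Render an address as 'name+0xoffset', or as bare hex if unresolved."""
    hit = resolve_symbol(addr, symbols)
    if hit is None:
        return f"{addr:08x}"
    name, offset = hit
    return f"{name}+{offset:#x}"
```

With so few symbols the answer is only as good as the table: a large offset,
such as the one c4844a5d gets relative to lvm_init, more likely means the
address belongs to a function that is missing from the table.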

This is a stock kernel (bf2.4) from debian stable (Version: 2.4.18-5)

More information is available on request. I'm going to try to find time
to set up a similar system and reproduce this, but that might not be
this week.

TTFN
-- 
Roger. 	                        Home| http://www.sandman.uklinux.net/
Master of Peng Shui.      (Ancient oriental art of Penguin Arranging)
Work|Independent Sys Consultant | http://www.computer-surgery.co.uk/
So what are the eigenvalues and eigenvectors of 'The Matrix'? --anon

* Re: [2.4.18] oops in lvm or raid
  2003-12-29 14:59 [2.4.18] oops in lvm or raid Roger Gammans
@ 2003-12-29 17:16 ` Christophe Saout
  2003-12-30 12:02   ` Roger Gammans
  0 siblings, 1 reply; 3+ messages in thread
From: Christophe Saout @ 2003-12-29 17:16 UTC
  To: Roger Gammans; +Cc: LKML

On Mon, 2003-12-29 at 15:59, Roger Gammans wrote:

> Dec 29 13:15:23 turin kernel: lvm -- giving up to snapshot /dev/rootvg/data_root on /dev/rootvg/data_20031218: out of spa
> Dec 29 13:15:23 turin kernel: Unable to handle kernel paging request at virtual address 00015618
> Dec 29 13:15:23 turin kernel:  printing eip:
> Dec 29 13:15:23 turin kernel: c4847a7c
>                               c4847a7c -> lvm_snapshot_remap_block (c4847a0c)
>
> This is a stock kernel (bf2.4) from debian stable (Version: 2.4.18-5)

LVM1 snapshotting in the plain 2.4.18 kernel is known to have bugs.

You should upgrade to the latest LVM 1.0.8 kernel code. Well, I can't
access the Sistina website at the moment. I'm sure you can find a
lvm_1.0.8.tar.gz (or lvm_1.0.7.tar.gz which also has most bugs fixed)
somewhere.

The directory LVM/1.0.8/kernel contains some files.

Copy lvm.h to /usr/src/linux/include/linux and the rest to
/usr/src/linux/drivers/md (overwriting the old ones), then recompile the
kernel.
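
The copy step can be scripted. The following is only a sketch under stated
assumptions: the helper name install_lvm_kernel_files is made up for
illustration, both destination directories are assumed to exist already (as
they do in an unpacked kernel tree), and every regular file other than lvm.h
is treated as belonging in drivers/md. Recompiling the kernel is still a
separate step.

```python
import shutil
from pathlib import Path


def install_lvm_kernel_files(lvm_kernel_dir, kernel_tree):
    """Copy LVM kernel sources into a kernel tree, overwriting old files.

    lvm.h goes to <kernel_tree>/include/linux; every other regular file
    goes to <kernel_tree>/drivers/md. Returns the destination paths.
    """
    src = Path(lvm_kernel_dir)
    tree = Path(kernel_tree)
    include_dir = tree / "include" / "linux"
    md_dir = tree / "drivers" / "md"

    copied = []
    for entry in sorted(src.iterdir()):
        if not entry.is_file():
            continue  # skip subdirectories; only plain files are copied
        dest_dir = include_dir if entry.name == "lvm.h" else md_dir
        dest = dest_dir / entry.name
        shutil.copy2(entry, dest)  # replaces an existing file of that name
        copied.append(dest)
    return copied
```

For the layout described above it would be called as
install_lvm_kernel_files("LVM/1.0.8/kernel", "/usr/src/linux"), run from the
directory where the tarball was unpacked.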

Or you can upgrade to the 2.4.23 kernel, I think it contains the LVM
1.0.7 code.

--
Christophe Saout <christophe@saout.de>
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html


* Re: [2.4.18] oops in lvm or raid
  2003-12-29 17:16 ` Christophe Saout
@ 2003-12-30 12:02   ` Roger Gammans
  0 siblings, 0 replies; 3+ messages in thread
From: Roger Gammans @ 2003-12-30 12:02 UTC
  To: Christophe Saout; +Cc: LKML

On Mon, Dec 29, 2003 at 06:16:10PM +0100, Christophe Saout wrote:
> On Mon, 2003-12-29 at 15:59, Roger Gammans wrote:
> 
> > Dec 29 13:15:23 turin kernel: lvm -- giving up to snapshot /dev/rootvg/data_root on /dev/rootvg/data_20031218: out of spa
> > Dec 29 13:15:23 turin kernel: Unable to handle kernel paging request at virtual address 00015618
> > Dec 29 13:15:23 turin kernel:  printing eip:
> > Dec 29 13:15:23 turin kernel: c4847a7c
> >                               c4847a7c -> lvm_snapshot_remap_block (c4847a0c)
> >
> > This is a stock kernel (bf2.4) from debian stable (Version: 2.4.18-5)
> 
> LVM1 snapshotting in the plain 2.4.18 kernel is known to have bugs.

Ah. OK.
But to be honest, I could have (and indeed should have) deleted the snapshot
before doing the write to the volume that caused the snapshot to
run out of space. As long as the snapshot integrity is good, I'm happy.

My real worry is that this was triggered by a RAID problem and we've got
serious data corruption in both the ext3 and LVM metadata.

> You should upgrade to the latest LVM 1.0.8 kernel code. Well, I can't
> access the Sistina website at the moment. I'm sure you can find a
> lvm_1.0.8.tar.gz (or lvm_1.0.7.tar.gz which also has most bugs fixed)
> somewhere.

The site seems to be back now.

> [snip] 
> Or you can upgrade to the 2.4.23 kernel, I think it contains the LVM
> 1.0.7 code.

OK. We need to make some sort of decision here about which kernel to use,
then; there has been a degree of discussion about exactly how
conservative we should be anyway. Does 2.4.23 have the vfs-locking
patch?

TTFN
-- 
Roger. 	                        Home| http://www.sandman.uklinux.net/
Master of Peng Shui.      (Ancient oriental art of Penguin Arranging)
Work|Independent Sys Consultant | http://www.computer-surgery.co.uk/
So what are the eigenvalues and eigenvectors of 'The Matrix'? --anon
