linux-lvm.redhat.com archive mirror
* [linux-lvm] snapshot error with xfs and disk I/O
@ 2006-03-30  8:09 Wim Bakker
  2006-03-30 22:04 ` Alasdair G Kergon
  0 siblings, 1 reply; 3+ messages in thread
From: Wim Bakker @ 2006-03-30  8:09 UTC (permalink / raw)
  To: linux-lvm


Hello,

There seem to be serious problems with snapshots, LVM2 and XFS.
As soon as there is a slight amount of disk I/O while snapshotting
a logical volume with XFS, the following kind of kernel panic occurs:
--------------------------------------------------------------------------------------------
root@test.cashnet.nl [/root]# umount /backup
root@test.cashnet.nl [/root]# lvremove -f /dev/data/dbackup
Segmentation fault

Message from syslogd@test at Thu Mar 30 09:27:37 2006 ...
test kernel: Oops: 0000 [#1]
test kernel: SMP
test kernel: CPU:    0
test kernel: EIP is at exit_exception_table+0x48/0x8e [dm_snapshot]
test kernel: eax: 00000000   ebx: e0b62c70   ecx: 00000000   edx: dfbdaf40
test kernel: esi: 00000000   edi: dfbdaf40   ebp: 00001c70   esp: cdfb9e9c
test kernel: ds: 007b   es: 007b   ss: 0068
test kernel: Process lvremove (pid: 14480, threadinfo=cdfb8000 task=df97aa90)
test kernel: Stack: <0>dfbdaf40 d03cbf88 00002000 0000038e db2fa40c db2fa3c0
test kernel:        e0ade080 00000040
test kernel:        00000001 e0ab098f db2fa40c dfbdaf40 e0ade080 df4c1480
test kernel:        e0abc13b e0ade080
test kernel:        dc276d80 df4c1480 00000004 080e2888 e0abb5ed df4c1480
test kernel:        df4c1480 c9ff2440
test kernel: Call Trace:
test kernel:  [<e0ab098f>] snapshot_dtr+0x33/0x7c [dm_snapshot]
test kernel:  [<e0abc13b>] table_destroy+0x5b/0xbf [dm_mod]
test kernel:  [<e0abb5ed>] dm_put+0x4c/0x72 [dm_mod]
test kernel:  [<e0abe286>] __hash_remove+0x82/0xb1 [dm_mod]
test kernel:  [<e0abec26>] dev_remove+0x3b/0x85 [dm_mod]
test kernel:  [<e0abfc82>] ctl_ioctl+0xde/0x141 [dm_mod]
test kernel:  [<e0abebeb>] dev_remove+0x0/0x85 [dm_mod]
test kernel:  [<c0176e63>] do_ioctl+0x6f/0xa9
test kernel:  [<c0177046>] vfs_ioctl+0x65/0x1e1
test kernel:  [<c0177247>] sys_ioctl+0x85/0x92
test kernel:  [<c0102cd9>] syscall_call+0x7/0xb
test kernel: Code: 83 c2 01 39 54 24 0c 89 54 24 08 7d 4d 8b 50 04 31 ed 8d 1c
2a 8b 03 39 d8 8b 30 74 1b 89 44 24 04 89 3c 24 e8 bd 00 6b df 89 f0 <8b> 36
39 d8 75 ec 8b 44 24 10 8b 50 04 83 44 24 0c 01 8b 44 24
----------------------------------------------------------------------------------------------------------------
The system contains two disks, each 80 GB, with two volume groups:
  PV /dev/md3    VG data     lvm2 [55.30 GB / 4.52 GB free]
  PV /dev/sda3   VG shares   lvm2 [9.32 GB / 0    free]
  PV /dev/sdb3   VG shares   lvm2 [9.32 GB / 3.02 GB free]
  Total: 3 [73.94 GB] / in use: 3 [73.94 GB] / in no VG: 0 [0   ]

One VG is created on a PV that is a software RAID device, /dev/md3;
the other on PVs consisting of one partition on each disk.
Each VG contains an LV with the same name as the VG: data and shares.
Every ten minutes a snapshot of each logical volume was taken from
cron; meanwhile I was running a script that increased disk I/O very
slowly. After two days of running, the following happened:
6:50am  up 2 days 19:13,  1 user,  load average: 2.53, 3.06, 4.27
---------------
  Logical volume "dbackup" already exists in volume group "data"
mount: /dev/data/dbackup already mounted or /backup busy
mount: according to mtab, /dev/mapper/data-dbackup is already mounted 
on /backup
  Can't remove open logical volume "dbackup"
---------------
The script could no longer do anything with the dbackup snapshot (the
snapshot of the data LV).
I stopped the script, unmounted /backup manually, and then gave the
command:
lvremove -f /dev/data/dbackup
at which point the kernel panic shown above occurred. The same thing
happened on the original server, which has an Areca hardware RAID
controller: snapshotting an LV with XFS goes fine until, at a certain
point when moderate disk I/O occurs, the kernel panics and oopses out
of service.
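
For reference, the ten-minute cron job amounts to roughly the following
cycle (the snapshot size, mount point, and exact options here are
assumptions, since the script itself is not shown):

```
# lvcreate --snapshot --size 2G --name dbackup /dev/data/data
# mount /dev/data/dbackup /backup
# ... read the backup from /backup ...
# umount /backup
# lvremove -f /dev/data/dbackup
```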
Are there patches to fix this problem?

TIA

sincerely
Wim Bakker

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [linux-lvm] snapshot error with xfs and disk I/O
  2006-03-30  8:09 [linux-lvm] snapshot error with xfs and disk I/O Wim Bakker
@ 2006-03-30 22:04 ` Alasdair G Kergon
  0 siblings, 0 replies; 3+ messages in thread
From: Alasdair G Kergon @ 2006-03-30 22:04 UTC (permalink / raw)
  To: LVM general discussion and development

On Thu, Mar 30, 2006 at 09:09:22AM +0100, Wim Bakker wrote:
> There seem to be serious problems with snapshots , lvm2 and xfs.
> As soon as there is a slight amount of disk I/O during snapshotting
> a logical volume with xfs , the following kind of kernel panic occurs:

You don't say what kernel.

Make sure it contains the dm snapshot patches from -mm (and probably
in Linus's git tree by now - I haven't checked).

Alasdair
-- 
agk@redhat.com


* Re: [linux-lvm] snapshot error with xfs and disk I/O
@ 2006-03-31 14:21 Wim Bakker
  0 siblings, 0 replies; 3+ messages in thread
From: Wim Bakker @ 2006-03-31 14:21 UTC (permalink / raw)
  To: linux-lvm


>On Thu, Mar 30, 2006 at 09:09:22AM +0100, Wim Bakker wrote:
>> There seem to be serious problems with snapshots , lvm2 and xfs.
>> As soon as there is a slight amount of disk I/O during snapshotting
>> a logical volume with xfs , the following kind of kernel panic occurs:

>You don't say what kernel.
>
>Make sure it contains the dm snapshot patches from -mm (and probably
>in Linus's git tree by now - I haven't checked).
>
>Alasdair

On the original server 2.6.15 was used; on the test server I installed
kernel 2.6.16 with the following pending patches:
--------------
dm-snapshot-fix-kcopyd-destructor.patch
dm-flush-queue-eintr.patch
device-mapper-snapshot-fix-origin_write-pending_exception-submission.patch
device-mapper-snapshot-replace-sibling-list.patch
device-mapper-snapshot-fix-invalidation.patch
drivers-md-dm-raid1c-fix-inconsistent-mirroring-after-interrupted.patch
dm-remove-sector_format.patch
sem2mutex-misc-static-one-file-mutexes.patch
sem2mutex-drivers-md.patch
dm-make-sure-queue_flag_cluster-is-set-properly.patch
# This one still needs proper testing
#md-dm-reduce-stack-usage-with-stacked-block-devices.patch
# Submitted to -mm by me recently
dm-snapshot-fix-pending-pe-ref.patch
dm-store-md-name.patch
dm-tidy-mdptr.patch
dm-table-store-md.patch
dm-store-geometry.patch
dm-stripe-fix-bounds.patch
# Submitted to -mm by others recently
dm-md-dependency-tree-in-sysfs-kobject_add_dir.patch
dm-md-dependency-tree-in-sysfs-add_subdirs.patch
dm-md-dependency-tree-in-sysfs-bd_claim_by_kobj.patch
dm-md-dependency-tree-in-sysfs-md_deptree.patch
dm-md-dependency-tree-in-sysfs-dm_deptree.patch
# New bugfix to push to -mm next time
dm-bio-split-bvec.patch
-------------------------------
as found in 
ftp.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.16-rc1/2.6.16-rc1-dm1/
following a recommendation I read in the mail archives regarding the
problems with snapshots (and XFS?).
The LVM version is LVM2 2.02.02.
The device-mapper version is 1.02.03.


The snapshot is mounted under /backup and the original LV is mounted
under /data; the files that give I/O errors under /backup are perfectly
accessible on the original LV. A script running in the background
randomly creates files test-* and testfile-* on the data partition.
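
The load script itself was not posted; a minimal sketch of what it does,
with the file count, sizes, and target directory as assumptions, might
look like this:

```shell
#!/bin/sh
# Hypothetical reconstruction of the background load generator: it
# writes files named test-* and testfile-* of varying size into the
# data volume to produce steady disk I/O during snapshotting.
DIR="${1:-./loadtest}"   # target directory; the data LV in the test setup
mkdir -p "$DIR"
i=0
while [ "$i" -lt 10 ]; do
    # vary the file size between 1 and 8 MB per iteration
    mb=$(( (i % 8) + 1 ))
    dd if=/dev/zero of="$DIR/test-$i" bs=1048576 count="$mb" 2>/dev/null
    dd if=/dev/zero of="$DIR/testfile-$i" bs=1048576 count="$mb" 2>/dev/null
    i=$((i + 1))
done
```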
Meanwhile I have continued testing. To that end I installed kernel
2.4.32 with the dm and lock patches, reinstalled the latest stable
LVM2 and device-mapper, and ran the same tests under 2.4.32 as under
2.6.16 with the extra pending patches.
I restarted the system with 2.4.32, ran the same tests as with 2.6.16,
and no problems occurred at all.
Then I restarted the system with kernel 2.6.16, and lvremove locked up
again with the resulting kernel panic when forcefully removing the
snapshot volume, like before. I then restarted with kernel 2.4.32 and
all tests and snapshots ran smoothly, so it seems definitely related
to the 2.6.x kernel.
For the production machine, though, we definitely need 2.6 because of
the Areca controller.

sincerely
Wim Bakker


end of thread, other threads:[~2006-03-31 14:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-30  8:09 [linux-lvm] snapshot error with xfs and disk I/O Wim Bakker
2006-03-30 22:04 ` Alasdair G Kergon
  -- strict thread matches above, loose matches on Subject: below --
2006-03-31 14:21 Wim Bakker
