* [linux-lvm] snapshot error with xfs and disk I/O
From: Wim Bakker @ 2006-03-30 8:09 UTC
To: linux-lvm
Hello,
There seem to be serious problems with snapshots, lvm2 and xfs.
As soon as there is even a slight amount of disk I/O while snapshotting
a logical volume that carries xfs, the following kind of kernel panic occurs:
--------------------------------------------------------------------------------------------
root@test.cashnet.nl [/root]# umount /backup
root@test.cashnet.nl [/root]# lvremove -f /dev/data/dbackup
Segmentation fault
Message from syslogd@test at Thu Mar 30 09:27:37 2006 ...
test kernel: Oops: 0000 [#1]
test kernel: SMP
test kernel: CPU: 0
test kernel: EIP is at exit_exception_table+0x48/0x8e [dm_snapshot]
root@test.cashnet.nl [/root]#
test kernel: eax: 00000000 ebx: e0b62c70 ecx: 00000000 edx: dfbdaf40
test kernel: esi: 00000000 edi: dfbdaf40 ebp: 00001c70 esp: cdfb9e9c
test kernel: ds: 007b es: 007b ss: 0068
test kernel: Process lvremove (pid: 14480, threadinfo=cdfb8000 task=df97aa90)
test kernel: Stack: <0>dfbdaf40 d03cbf88 00002000 0000038e db2fa40c db2fa3c0 e0ade080 00000040
test kernel: 00000001 e0ab098f db2fa40c dfbdaf40 e0ade080 df4c1480 e0abc13b e0ade080
test kernel: dc276d80 df4c1480 00000004 080e2888 e0abb5ed df4c1480 df4c1480 c9ff2440
test kernel: Call Trace:
test kernel: [<e0ab098f>] snapshot_dtr+0x33/0x7c [dm_snapshot]
test kernel: [<e0abc13b>] table_destroy+0x5b/0xbf [dm_mod]
test kernel: [<e0abb5ed>] dm_put+0x4c/0x72 [dm_mod]
test kernel: [<e0abe286>] __hash_remove+0x82/0xb1 [dm_mod]
test kernel: [<e0abec26>] dev_remove+0x3b/0x85 [dm_mod]
test kernel: [<e0abfc82>] ctl_ioctl+0xde/0x141 [dm_mod]
test kernel: [<e0abebeb>] dev_remove+0x0/0x85 [dm_mod]
test kernel: [<c0176e63>] do_ioctl+0x6f/0xa9
test kernel: [<c0177046>] vfs_ioctl+0x65/0x1e1
test kernel: [<c0177247>] sys_ioctl+0x85/0x92
test kernel: [<c0102cd9>] syscall_call+0x7/0xb
test kernel: Code: 83 c2 01 39 54 24 0c 89 54 24 08 7d 4d 8b 50 04 31 ed 8d 1c 2a 8b 03 39 d8 8b 30 74 1b 89 44 24 04 89 3c 24 e8 bd 00 6b df 89 f0 <8b> 36 39 d8 75 ec 8b 44 24 10 8b 50 04 83 44 24 0c 01 8b 44 24
----------------------------------------------------------------------------------------------------------------
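Console captures of an oops often interleave a "Message from syslogd@host" broadcast header with the kernel lines, which makes the trace harder to read and to quote in a report. A minimal sketch for cleaning such a capture follows; the function name clean_oops and the file names in the example are hypothetical, and the default host name "test" is only taken from the capture in this message.

```shell
#!/bin/sh
# Strip syslogd broadcast headers and the "<host> kernel: " prefix from a
# console capture of an oops, leaving only the kernel's own lines.
# Reads the capture on stdin and writes the cleaned trace to stdout.
clean_oops() {
    host="${1:-test}"   # host name as it appears in the capture (assumed)
    grep -v '^Message from syslogd@' |
        sed "s/^${host} kernel: //"
}

# Example (hypothetical file names):
#   clean_oops test < oops-capture.txt > oops-clean.txt
```

Lines that do not carry the kernel prefix, such as an interleaved shell prompt, pass through unchanged.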
The system contains two disks, each 80 GB, with two volume groups:
PV /dev/md3 VG data lvm2 [55.30 GB / 4.52 GB free]
PV /dev/sda3 VG shares lvm2 [9.32 GB / 0 free]
PV /dev/sdb3 VG shares lvm2 [9.32 GB / 3.02 GB free]
Total: 3 [73.94 GB] / in use: 3 [73.94 GB] / in no VG: 0 [0 ]
One VG is created on a PV that is a software RAID device, /dev/md3;
the other on two PVs, one partition on each disk.
Each VG has an LV with the same name as the VG, data and shares.
A cron job took a snapshot of each logical volume every ten minutes,
while I was running a script that very slowly increased disk I/O.
After two days of running, the following happened:
6:50am up 2 days 19:13, 1 user, load average: 2.53, 3.06, 4.27
---------------
Logical volume "dbackup" already exists in volume group "data"
mount: /dev/data/dbackup already mounted or /backup busy
mount: according to mtab, /dev/mapper/data-dbackup is already mounted
on /backup
Can't remove open logical volume "dbackup"
---------------
The script could no longer do anything with the dbackup snapshot (the
snapshot of the data LV).
I stopped the script and unmounted /backup manually, after which I ran
lvremove -f /dev/data/dbackup
and then the kernel panic shown above occurred. The same happened on the
original server, which has an Areca hardware RAID controller: snapshotting
an LV with xfs goes fine until, at a certain point, moderate disk I/O
occurs; then the kernel oopses and the machine goes out of service.
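For readers who want to reproduce this, the script messages above suggest a cycle of lvcreate, mount and lvremove. The following is only a sketch of such a cycle, not the script that was actually run: the 1G snapshot size, the read-only mount, the order of the steps and the function name snapshot_cycle are all assumptions. It does not work around the oops; it only shows the sequence of commands involved.

```shell
#!/bin/sh
# Sketch of one snapshot cycle. Assumed, not from the original script:
# the 1G snapshot size, the ro mount, and the order of the steps.
# Set RUN=echo to print the commands instead of executing them.
RUN="${RUN:-}"

snapshot_cycle() {
    vg="$1"; lv="$2"; snap="$3"; mnt="$4"

    # Release the previous snapshot first. Stop if that fails rather than
    # carrying on with a snapshot device that is still in use.
    if [ -e "/dev/$vg/$snap" ]; then
        if grep -q " $mnt " /proc/mounts; then
            $RUN umount "$mnt" || return 1
        fi
        $RUN lvremove -f "/dev/$vg/$snap" || return 1
    fi

    $RUN lvcreate -s -L 1G -n "$snap" "/dev/$vg/$lv" || return 1
    # The snapshot carries the same filesystem UUID as the mounted origin,
    # and XFS refuses such a mount unless nouuid is given.
    $RUN mount -t xfs -o ro,nouuid "/dev/$vg/$snap" "$mnt"
}

# Example, as a dry run:
#   RUN=echo; snapshot_cycle data data dbackup /backup
```

Returning early when umount or lvremove fails avoids stacking a new lvcreate on top of a snapshot that could not be released, which is the state the "already exists" and "Can't remove open logical volume" messages point to.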
Are there patches to fix this problem?
TIA
sincerely
Wim Bakker
* Re: [linux-lvm] snapshot error with xfs and disk I/O
From: Alasdair G Kergon @ 2006-03-30 22:04 UTC
To: LVM general discussion and development
On Thu, Mar 30, 2006 at 09:09:22AM +0100, Wim Bakker wrote:
> There seem to be serious problems with snapshots , lvm2 and xfs.
> As soon as there is a slight amount of disk I/O during snapshotting
> a logical volume with xfs , the following kind of kernel panic occurs:
You don't say what kernel.
Make sure it contains the dm snapshot patches from -mm (and probably
in Linus's git tree by now - I haven't checked).
Alasdair
--
agk@redhat.com
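A report of this kind is easier to act on when it states the kernel, LVM2 and device-mapper versions up front. A small sketch that collects them is shown below; the function name report_versions is hypothetical, and tools that are not installed are reported rather than treated as errors.

```shell
#!/bin/sh
# Collect the version details that help when reporting a dm-snapshot problem.
report_versions() {
    echo "kernel: $(uname -r)"
    for cmd in "lvm version" "dmsetup version"; do
        tool=${cmd%% *}                 # first word: the binary to look for
        if command -v "$tool" >/dev/null 2>&1; then
            echo "--- $cmd"
            $cmd 2>&1                   # may need root for the driver version
        else
            echo "--- $cmd: $tool not found"
        fi
    done
}

# Example:
#   report_versions > versions.txt
```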
* Re: [linux-lvm] snapshot error with xfs and disk I/O
From: Wim Bakker @ 2006-03-31 14:21 UTC
To: linux-lvm
> On Thu, Mar 30, 2006 at 09:09:22AM +0100, Wim Bakker wrote:
>> There seem to be serious problems with snapshots , lvm2 and xfs.
>> As soon as there is a slight amount of disk I/O during snapshotting
>> a logical volume with xfs , the following kind of kernel panic occurs:
> You don't say what kernel.
>
> Make sure it contains the dm snapshot patches from -mm (and probably
> in Linus's git tree by now - I haven't checked).
>
> Alasdair
The original server ran 2.6.15; on the test server I installed
2.6.16 with the pending patches:
--------------
dm-snapshot-fix-kcopyd-destructor.patch
dm-flush-queue-eintr.patch
device-mapper-snapshot-fix-origin_write-pending_exception-submission.patch
device-mapper-snapshot-replace-sibling-list.patch
device-mapper-snapshot-fix-invalidation.patch
drivers-md-dm-raid1c-fix-inconsistent-mirroring-after-interrupted.patch
dm-remove-sector_format.patch
sem2mutex-misc-static-one-file-mutexes.patch
sem2mutex-drivers-md.patch
dm-make-sure-queue_flag_cluster-is-set-properly.patch
# This one still needs proper testing
#md-dm-reduce-stack-usage-with-stacked-block-devices.patch
# Submitted to -mm by me recently
dm-snapshot-fix-pending-pe-ref.patch
dm-store-md-name.patch
dm-tidy-mdptr.patch
dm-table-store-md.patch
dm-store-geometry.patch
dm-stripe-fix-bounds.patch
# Submitted to -mm by others recently
dm-md-dependency-tree-in-sysfs-kobject_add_dir.patch
dm-md-dependency-tree-in-sysfs-add_subdirs.patch
dm-md-dependency-tree-in-sysfs-bd_claim_by_kobj.patch
dm-md-dependency-tree-in-sysfs-md_deptree.patch
dm-md-dependency-tree-in-sysfs-dm_deptree.patch
# New bugfix to push to -mm next time
dm-bio-split-bvec.patch
-------------------------------
as found in
ftp.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.16-rc1/2.6.16-rc1-dm1/
following a recommendation I read in the mail archives regarding the
problems with snapshots (and xfs?).
Version of lvm is: LVM2.2.02.02
Version of dm is: device-mapper.1.02.03
The snapshot is mounted under backup, the original LV is mounted under data,
and the files that give I/O errors under backup are perfectly accessible on
the original LV. A script running in the background randomly creates
files named test-* and testfile-* on the data partition.
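A background load of this kind can be sketched as follows. This is not the script that was used: the file sizes (1 to 64 blocks of 4 KiB), the file count, the pause between rounds and the function name make_load are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch of a background load generator: writes files named test-N and
# testfile-N with random contents and random size into a target directory.
make_load() {
    dir="$1"; count="${2:-10}"; pause="${3:-1}"
    i=0
    while [ "$i" -lt "$count" ]; do
        for prefix in test testfile; do
            # Pick 1..64 blocks, derived from two random bytes.
            blocks=$(( $(od -An -N2 -tu2 /dev/urandom) % 64 + 1 ))
            dd if=/dev/urandom of="$dir/$prefix-$i" bs=4k count="$blocks" 2>/dev/null
        done
        i=$((i + 1))
        sleep "$pause"
    done
}

# Example: 100 rounds against the origin LV's mount point, 5 s apart
# (the path is an assumption):
#   make_load /data 100 5 &
```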
Meanwhile I have continued testing. To that end I installed kernel
2.4.32 with the dm and lock patches and reinstalled the latest stable lvm2
and dm. I restarted the system with 2.4.32 and ran the same tests as under
2.6.16 with the extra pending patches, and no problems occurred at all.
Then I restarted the system with kernel 2.6.16 again, and lvremove locked up
again, with the resulting kernel panic when forcefully removing the snapshot
volume, like before. I then restarted with kernel 2.4.32 once more and all
tests and snapshots ran smoothly, so it definitely seems related to kernel
2.6.x. For the production machine we definitely need 2.6, though, because of
the Areca controller.
sincerely
Wim Bakker