* [linux-lvm] lvm hangup on snapshot overflow
@ 2011-08-12 16:05 Pasha Z
2011-08-13 22:53 ` Alasdair G Kergon
0 siblings, 1 reply; 3+ messages in thread
From: Pasha Z @ 2011-08-12 16:05 UTC (permalink / raw)
To: linux-lvm
Hello everyone!
First, the issue: lvm hangs on the sync command after a snapshot
overflows.
How to reproduce the problem
You can do that with the script test.sh, which is in the attachments. It
may look rather big, but that is mostly due to debug messages; in fact it
is quite simple. First it creates a physical volume on a chosen physical
disk, then a volume group and 2 logical volumes. One of them is the
origin LV that we write data to; the other one is reserved for a snapshot.
Next it mounts the origin volume, converts the second LV to a snapshot
and writes enough data to the origin LV to overflow the snapshot. Then
sync is executed. Afterwards another logical volume is created and
converted to a snapshot, and sync is run again. This second sync hangs.
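The sequence above can be sketched roughly as follows. This is not the attached test.sh, only an illustration of the steps as described: the device path, the VG and LV names, the sizes, the mount point and the choice of ext3 are all placeholders assumed for the example. By default the sketch only prints the commands (DRY_RUN=1) and runs nothing.

```shell
#!/bin/sh
# Sketch of the reproduction sequence described above; NOT the attached test.sh.
# Device, names, sizes, mount point and file system are assumptions.
# With DRY_RUN=1 (the default) commands are printed, never executed.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    disk=$1                  # scratch disk, ideally inside a VM
    vg=VG                    # placeholder volume group name
    mnt=/mnt/origin          # placeholder mount point

    run pvcreate "$disk"
    run vgcreate "$vg" "$disk"
    run lvcreate -L 200M -n origin "$vg"      # origin LV that receives the data
    run lvcreate -L 20M -n sn_1 "$vg"         # LV reserved for the first snapshot
    run mkfs.ext3 "/dev/$vg/origin"
    run mkdir -p "$mnt"
    run mount "/dev/$vg/origin" "$mnt"

    run lvconvert -s "$vg/origin" "$vg/sn_1"  # convert sn_1 to a snapshot of origin
    # Write more than the 20M snapshot can hold so that it overflows.
    run dd if=/dev/urandom of="$mnt/fill" bs=1M count=40
    run sync                                  # first sync

    run lvcreate -L 20M -n sn_2 "$vg"
    run lvconvert -s "$vg/origin" "$vg/sn_2"  # second snapshot
    run sync                                  # the sync reported to hang
}

reproduce "${1:-/dev/sdX}"
```

Setting DRY_RUN=0 would actually execute the commands, which should only be done on a disposable disk in a virtual machine, for the reasons given below.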
We advise running the tests in a virtual environment because, among other
reasons, you won't be able to reboot normally after the hang. When you run
the script again after a reboot, it cleans up the leftovers of the previous
run; the required commands are at the very beginning of the script.
And this is what we have so far
We started off here:
http://www.redhat.com/archives/dm-devel/2011-May/msg00059.html, but after a
bunch of tests we concluded that neither the kernel version, nor its
configuration, nor the file system has an impact on the hang. So far we
know that the issue occurs on all versions of lvm after 2.02.56 (2.02.57
already fails). Interestingly, when we built the most verbose kernel
possible (in terms of the amount of kernel logging) and the system became
really slow, the newer version (2.02.57), which had previously hung,
passed! Based on this we think there might be an overrun present that leads
to a deadlock.
For now there are two basic errors:
--------------------
lvconvert device-mapper: suspend ioctl failed: Input/output error
lvconvert Unable to suspend VG-sn_x (252:3)
lvconvert Failed to suspend origin lv
------- and --------
LV VG/sn_x in use: not deactivating
Couldn't deactivate LV sn_x
--------------------
The first one always precedes the hang. The second one doesn't appear every
time, but when it does it always comes before the first one and can appear
multiple times before it. In both cases _lock_vol returns 0.
As for the second error: the function lvconvert_snapshot fails, reporting
“Couldn't deactivate LV sn_x”, because info.open_count is not equal to zero.
That is indicated by the “LV VG/sn_x in use: not deactivating” error. The
value of info.open_count is clearly set to 1 by the lv_info function, but it
never seems to be cleared. info.open_count is copied from a field stored in
the dm_ioctl struct, which is a member of the dm_task struct, but I couldn't
find where that field is assigned.
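As far as I understand it (not confirmed in this thread), the open_count field of the dm_ioctl struct is filled in by the kernel when it answers the device-mapper ioctl, which would explain why no assignment shows up in the lvm userspace code. The value can be cross-checked outside lvm with dmsetup, whose info output includes an "Open count" line. A minimal sketch, where the helper name open_count_from_info is made up for this example:

```shell
# Print the "Open count" value from `dmsetup info` output read on stdin.
# The helper name is hypothetical; it only parses text, it runs no LVM command.
open_count_from_info() {
    awk -F: '/^Open count:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Usage (device-mapper name taken from the error message above):
#   dmsetup info VG-sn_x | open_count_from_info
```

A non-zero result right before lvconvert runs would mean something still holds the device open at that moment, independent of what lv_info reports.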
Things get much more complicated because we are unable to use a debugger,
so a related question: how do you properly build lvm with debugging
symbols? Right now lvm won't build with debug symbols even though the
configure script is given the appropriate option and the option
demonstrably takes effect during the build (the configuration log says it
is on, and the corresponding flag (-g) is added to the flags passed to gcc).
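One way to narrow this down is to check whether the freshly built binary in the build tree already lacks debug sections, or whether they disappear later, for instance if an install step strips the binary (that possibility is only a guess, nothing in this thread confirms it). A small sketch that inspects a section listing produced by readelf -S; the helper name and the binary paths are assumptions:

```shell
# Succeeds if the section listing on stdin (as printed by `readelf -S`)
# contains a .debug_info section. The helper name is hypothetical.
has_debug_info() {
    grep -q '\.debug_info'
}

# Usage (paths are placeholders for the built and the installed binary):
#   readelf -S ./tools/lvm | has_debug_info && echo "build tree: debug info present"
#   readelf -S /sbin/lvm   | has_debug_info || echo "installed: no debug info"
```

If the binary in the build tree has the section and the installed one does not, the debugger could be pointed at the build-tree binary instead.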
Attachment description
In the attached archive you will find the following files:
kernel_logs - kernel logs after each tool invocation, retrieved by dmesg -c.
lvm.conf - lvm configuration file that we have used
lvm2.log - lvm logs with debug level set to 7
output_logs - 2 versions: neat and verbose. The difference is that verbose
contains commands performed (set -x)
test.sh - the main test-script
remove.sh - the portion of test.sh responsible for cleanup (sometimes
convenient to have separately)
We continue to study the problem, but any help or guidance from people who
are familiar with the structure and code of lvm would be highly appreciated.
Thanks a lot!
[-- Attachment #2: lvm-test.tar.gz --]
[-- Type: application/x-gzip, Size: 28599 bytes --]
* Re: [linux-lvm] lvm hangup on snapshot overflow
2011-08-12 16:05 [linux-lvm] lvm hangup on snapshot overflow Pasha Z
@ 2011-08-13 22:53 ` Alasdair G Kergon
2011-08-16 8:40 ` Pasha Z
0 siblings, 1 reply; 3+ messages in thread
From: Alasdair G Kergon @ 2011-08-13 22:53 UTC (permalink / raw)
To: Pasha Z; +Cc: linux-lvm
On Fri, Aug 12, 2011 at 07:05:31PM +0300, Pasha Z wrote:
> lvconvert device-mapper: suspend ioctl failed: Input/output error
> lvconvert Unable to suspend VG-sn_x (252:3)
> lvconvert Failed to suspend origin lv
What exactly are you trying to do with lvconvert rather than lvcreate?
Never let snapshots fill up - use dmeventd instead and intervene in
a controlled way before they are full.
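dmeventd is the mechanism meant here. Purely to illustrate what intervening before a snapshot is full could look like, the following hand-rolled check compares a usage percentage against a threshold. It is not a replacement for dmeventd; the helper name and the threshold of 80 are arbitrary assumptions, and the usage lines assume the lvs field snap_percent.

```shell
# Succeeds when the usage percentage (may be fractional) has reached the
# threshold. The helper name is hypothetical; it only compares two numbers.
snapshot_needs_action() {
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 >= limit + 0) }'
}

# Usage sketch (LV name from the messages above, threshold arbitrary):
#   used=$(lvs --noheadings -o snap_percent VG/sn_x | tr -d ' ')
#   snapshot_needs_action "$used" 80 && echo "VG/sn_x is nearly full"
```

What to do once the threshold is reached (extend, unmount, remove the snapshot) is a separate decision and is left out of the sketch.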
The lvconvert -s man page is too subtle and needs improving.
Alasdair
* Re: [linux-lvm] lvm hangup on snapshot overflow
2011-08-13 22:53 ` Alasdair G Kergon
@ 2011-08-16 8:40 ` Pasha Z
0 siblings, 0 replies; 3+ messages in thread
From: Pasha Z @ 2011-08-16 8:40 UTC (permalink / raw)
To: Pasha Z, linux-lvm
Thanks for the reply!
> What exactly are you trying to do with lvconvert rather than lvcreate?
Sorry, I'm not sure I understand the question.
> Never let snapshots fill up - use dmeventd instead and intervene in
> a controlled way before they are full.
Thanks for the idea. The only obstacle to using dmeventd right away is that
it can unmount/extend full snapshots, while I need to have them switched to
read-only.
And a final question: is there a plan to integrate control over snapshots
into lvm itself and, if so, how soon?
2011/8/14 Alasdair G Kergon <agk@redhat.com>
> On Fri, Aug 12, 2011 at 07:05:31PM +0300, Pasha Z wrote:
> > lvconvert device-mapper: suspend ioctl failed: Input/output error
> > lvconvert Unable to suspend VG-sn_x (252:3)
> > lvconvert Failed to suspend origin lv
>
> What exactly are you trying to do with lvconvert rather than lvcreate?
>
> Never let snapshots fill up - use dmeventd instead and intervene in
> a controlled way before they are full.
>
> The lvconvert -s man page is too subtle and needs improving.
>
> Alasdair
>
>