* [linux-lvm] access through LVM causes D state lock up
From: Ray Morris @ 2011-12-13 17:45 UTC
To: LVM general discussion and development
I've been struggling for some time with a problem wherein at times
the machine locks up, with access through LVM hanging. I can read and
write the physical volumes with "dd", but trying to read or write
the logical volume hangs. pvdisplay also hangs. The PVs, which seem
to accept writes just fine, are mdadm raid volumes.
I experienced this before under CentOS 5.7 and am now experiencing the same
with CentOS 6.0 using lvm2-2.02.72-8.el6_0.4.x86_64. I've also experienced
it on entirely different hardware, with different controller chipsets.
I'm pretty much at my wits' end and would appreciate any pointers as
to where to look next.
The differences between our current lvm.conf and the default are as
follows:
53c53
< filter = [ "a/.*/" ]
---
54a55
> filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|", "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
101,104d101
<
118,120c115,117
---
129d125
139,144d134
< disable_after_error_count = 0
<
Extra logging
191c162
< level = 0
---
> level = 5
198c169
< command_names = 0
---
> command_names = 1
270c241
< units = "h"
---
> units = "G"
331c302
< locking_dir = "/var/lock/lvm"
---
> locking_dir = "/dev/shm"
356,362d326
<
< metadata_read_only = 0
407c371
<
---
535a481
> pvmetadatasize = 32768
When the machine locks up, /var/log/messages shows processes "blocked
for more than 120 seconds" as shown below. What other information
should I be looking at to diagnose and resolve this issue?
Dec 13 09:13:26 clonebox3 lvm[32461]: Using logical volume(s) on command line
Dec 13 09:15:52 clonebox3 kernel: INFO: task kdmflush:31627 blocked for more than 120 seconds.
Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 13 09:15:52 clonebox3 kernel: kdmflush D ffff88007b824300 0 31627 2 0x00000080
Dec 13 09:15:52 clonebox3 kernel: ffff8800372af9f0 0000000000000046 ffff8800372af9b8 ffff8800372af9b4
Dec 13 09:15:52 clonebox3 kernel: ffff8800372af9e0 ffff88007b824300 ffff880001e96980 00000001083f7318
Dec 13 09:15:52 clonebox3 kernel: ffff880076f27ad8 ffff8800372affd8 0000000000010518 ffff880076f27ad8
Dec 13 09:15:52 clonebox3 kernel: Call Trace:
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa02070c1>] make_request+0x501/0x520 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8102ea69>] ? native_smp_send_reschedule+0x49/0x60
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810508e8>] ? resched_task+0x68/0x80
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff813d09fb>] md_make_request+0xcb/0x230
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c484>] ? try_to_wake_up+0x284/0x380
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] generic_make_request+0x1b2/0x4f0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8110e925>] ? mempool_alloc_slab+0x15/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8110ea33>] ? mempool_alloc+0x63/0x140
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00016bd>] __map_bio+0xad/0x130 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00017ef>] __issue_target_requests+0xaf/0xd0 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa000351f>] __split_and_process_bio+0x59f/0x630 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8109225c>] ? remove_wait_queue+0x3c/0x50
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00029c4>] ? dm_wait_for_completion+0xd4/0x100 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003836>] dm_flush+0x56/0x70 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00038a4>] dm_wq_work+0x54/0x200 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003850>] ? dm_wq_work+0x0/0x200 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091b36>] kthread+0x96/0xa0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091aa0>] ? kthread+0x0/0xa0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20
Dec 13 09:15:52 clonebox3 kernel: INFO: task kcopyd:31629 blocked for more than 120 seconds.
Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 13 09:15:52 clonebox3 kernel: kcopyd D ffff88007b824700 0 31629 2 0x00000080
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ac0 0000000000000046 ffff880044aa7a88 ffff880044aa7a84
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ae0 ffff88007b824700 ffff880001e16980 00000001083f7280
--
Ray Morris
support@bettercgi.com
Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/
Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/
Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php
* Re: [linux-lvm] access through LVM causes D state lock up
From: Peter M. Petrakis @ 2011-12-13 18:35 UTC
To: LVM general discussion and development
On 12/13/2011 12:45 PM, Ray Morris wrote:
> I've been struggling for some time with a problem wherein at times
> the machine locks up, with access through LVM hanging. I can read and
> write the physical volumes with "dd", but trying to read or write
> the logical volume hangs. pvdisplay also hangs. The PVs, which seem
> to accept writes just fine, are mdadm raid volumes.
>
> I experienced this before under CentOS 5.7 and am now experiencing the same
> with CentOS 6.0 using lvm2-2.02.72-8.el6_0.4.x86_64. I've also experienced
> it on entirely different hardware, with different controller chipsets.
>
> I'm pretty much at my wits' end and would appreciate any pointers as
> to where to look next.
> The differences between our current lvm.conf and the default are as
> follows:
>
> 53c53
> < filter = [ "a/.*/" ]
> ---
> 54a55
>> filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|", "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
>
Is it intentional to include sd devices? Just because the MD uses them doesn't mean you have
to make allowances for them here.
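For comparison, a tighter filter that accepts only the md arrays and
rejects everything else might look like this (a sketch, in the same
anchored-regex style as the line above):

    filter = [ "a|^/dev/md.*|", "r/.*/" ]

That keeps LVM from scanning the raw member disks at all, which can also
avoid duplicate-PV complaints when the same label is visible on both a
member and the assembled array.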
> 101,104d101
> <
> 118,120c115,117
> ---
> 129d125
> 139,144d134
> < disable_after_error_count = 0
> <
>
>
> Extra logging
> 191c162
> < level = 0
> ---
>> level = 5
> 198c169
> < command_names = 0
> ---
>> command_names = 1
> 270c241
> < units = "h"
> ---
>> units = "G"
> 331c302
> < locking_dir = "/var/lock/lvm"
> ---
>> locking_dir = "/dev/shm"
Why?
> 356,362d326
> <
> < metadata_read_only = 0
> 407c371
> <
> ---
>
> 535a481
>> pvmetadatasize = 32768
>
>
> When the machine locks up, /var/log/messages shows processes "blocked
> for more than 120 seconds" as shown below. What other information
> should I be looking at to diagnose and resolve this issue?
>
>
> Dec 13 09:13:26 clonebox3 lvm[32461]: Using logical volume(s) on command line
> Dec 13 09:15:52 clonebox3 kernel: INFO: task kdmflush:31627 blocked for more than 120 seconds.
> Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Dec 13 09:15:52 clonebox3 kernel: kdmflush D ffff88007b824300 0 31627 2 0x00000080
> Dec 13 09:15:52 clonebox3 kernel: ffff8800372af9f0 0000000000000046 ffff8800372af9b8 ffff8800372af9b4
> Dec 13 09:15:52 clonebox3 kernel: ffff8800372af9e0 ffff88007b824300 ffff880001e96980 00000001083f7318
> Dec 13 09:15:52 clonebox3 kernel: ffff880076f27ad8 ffff8800372affd8 0000000000010518 ffff880076f27ad8
> Dec 13 09:15:52 clonebox3 kernel: Call Trace:
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa02070c1>] make_request+0x501/0x520 [raid456]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8102ea69>] ? native_smp_send_reschedule+0x49/0x60
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810508e8>] ? resched_task+0x68/0x80
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff813d09fb>] md_make_request+0xcb/0x230
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c484>] ? try_to_wake_up+0x284/0x380
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] generic_make_request+0x1b2/0x4f0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8110e925>] ? mempool_alloc_slab+0x15/0x20
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8110ea33>] ? mempool_alloc+0x63/0x140
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00016bd>] __map_bio+0xad/0x130 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00017ef>] __issue_target_requests+0xaf/0xd0 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa000351f>] __split_and_process_bio+0x59f/0x630 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8109225c>] ? remove_wait_queue+0x3c/0x50
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00029c4>] ? dm_wait_for_completion+0xd4/0x100 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003836>] dm_flush+0x56/0x70 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00038a4>] dm_wq_work+0x54/0x200 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003850>] ? dm_wq_work+0x0/0x200 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091b36>] kthread+0x96/0xa0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81091aa0>] ? kthread+0x0/0xa0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20
> Dec 13 09:15:52 clonebox3 kernel: INFO: task kcopyd:31629 blocked for more than 120 seconds.
> Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Dec 13 09:15:52 clonebox3 kernel: kcopyd D ffff88007b824700 0 31629 2 0x00000080
> Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ac0 0000000000000046 ffff880044aa7a88 ffff880044aa7a84
> Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ae0 ffff88007b824700 ffff880001e16980 00000001083f7280
>
Do you by any chance have active LVM snapshots? If so how many and how long have they been provisioned for?
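(For reference, a quick way to enumerate them - the lvs report fields
below exist in lvm2 of this vintage, though the exact column choice is
just a suggestion:)

    lvs -o vg_name,lv_name,origin,snap_percent,lv_size

Any row with a non-empty Origin is a snapshot, and Snap% shows how full
its copy-on-write area is.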
Peter
* Re: [linux-lvm] access through LVM causes D state lock up
From: Ray Morris @ 2011-12-13 20:10 UTC
To: linux-lvm
On Tue, 13 Dec 2011 13:35:46 -0500
"Peter M. Petrakis" <peter.petrakis@canonical.com> wrote:
> Do you by any chance have active LVM snapshots? If so how many and
> how long have they been provisioned for?
I forgot to mention that. There are now three snapshots, one on each of
three LVs, that have been provisioned for a few hours. These LVs aren't
in active use, but are backups, synced daily. So basically the only
activity is rsync once daily, bandwidth limited to be fairly slow. One
logical volume that locked up when trying to write to it had a snapshot.
Prior to this most recent rebuild, there were a lot of snapshots -
three on each of fifteen LVs. I replaced that VG with a fresh one
and it seemed to work for a while. I thought the problem was likely
related to lots of long-lived snapshots, but after completely rebuilding
the VG after deleting all snapshots the problem recurred very quickly,
before there were many snapshots and before there was a lot of IO
to the snaps.
I realize I'm somewhat abusing snapshots - they weren't designed to
be long-lived. Therefore my "torture test" usage may reveal problems
that wouldn't happen often with very short-lived snapshots.
Another similar server has more snapshots on more LVs running the
same rsyncs without obvious trouble.
I should also have mentioned sequential writes to one LV at a time
don't seem to trigger the problem. I copied the whole VG one LV
at a time with:
dd if=/dev/oldvg/lv1 of=/dev/newvg/lv1
Copying the entire LVs sequentially saw no problems. Later when I tried
to rsync to the LVs the problem showed itself.
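(For anyone trying to reproduce that difference without a filesystem, a
sketch: scattered small direct writes approximate rsync's I/O pattern
better than one long sequential dd. The VG/LV names are hypothetical and
this overwrites data, so use a scratch LV only:)

    # WARNING: destructive - writes zeros at random 4k offsets on the LV
    for i in $(seq 1 1000); do
        dd if=/dev/zero of=/dev/newvg/scratchlv bs=4k count=1 \
           seek=$RANDOM oflag=direct 2>/dev/null
    done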
> >> filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|",
> >> "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
> >
> Is it intentional to include sd devices? Just because the MD uses
> them doesn't mean you have to make allowances for them here.
Some /dev/sdX devices were used, but no longer, and I have now removed
sd.* and etherd.
> > < locking_dir = "/var/lock/lvm"
> > ---
> >> locking_dir = "/dev/shm"
>
> Why?
This was changed AFTER the problem started.
Because the comment in the file says:
# Local non-LV directory that holds file-based locks while commands
# are in progress.
Because /var/lock is on an LV, I tried switching it to a directory that
will never be on an LV. That didn't seem to have any effect.
--
Ray Morris
support@bettercgi.com
Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/
Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/
Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php
* Re: [linux-lvm] access through LVM causes D state lock up
From: Peter M. Petrakis @ 2011-12-13 22:54 UTC
To: linux-lvm
On 12/13/2011 03:10 PM, Ray Morris wrote:
>
> On Tue, 13 Dec 2011 13:35:46 -0500
> "Peter M. Petrakis" <peter.petrakis@canonical.com> wrote:
>
>
>> Do you by any chance have active LVM snapshots? If so how many and
>> how long have they been provisioned for?
>
> I forgot to mention that. There are now three snapshots, one on each of
What distro and kernel are you on?
> three LVs, that have been provisioned for a few hours. These LVs aren't
> in active use, but are backups, synced daily. So basically the only
> activity is rsync once daily, bandwidth limited to be fairly slow. One
> logical volume that locked up when trying to write to it had a snapshot.
LVM snapshots can be very I/O intensive, even when you're not directly
using them: every write to the origin must first copy the old data into
each active snapshot's CoW area.
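(A rough way to measure that for yourself - VG/LV names are
hypothetical, oflag=direct bypasses the page cache, and the dd is
destructive, so test volumes only:)

    lvcreate -s -L 1G -n testsnap /dev/vg/lv1       # snapshot of the origin
    dd if=/dev/zero of=/dev/vg/lv1 bs=1M count=256 oflag=direct
    lvremove -f vg/testsnap
    # repeat the dd with no snapshot present and compare throughput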
> Prior to this most recent rebuild, there were a lot of snapshots -
> three on each of fifteen LVs. I replaced that VG with a fresh one
> and it seemed to work for a while. I thought the problem was likely
> related to lots of long-lived snapshots, but after completely rebuilding
> the VG after deleting all snapshots the problem recurred very quickly,
> before there were many snapshots and before there was a lot of IO
> to the snaps.
>
> I realize I'm somewhat abusing snapshots - they weren't designed to
> be long-lived. Therefore my "torture test" usage may reveal problems
> that wouldn't happen often with very short-lived snapshots.
That's right :). Some have reported as much as a 50% impact on performance.
> Another similar server has more snapshots on more LVs running the
> same rsyncs without obvious trouble.
>
> I should also have mentioned sequential writes to one LV at a time
> don't seem to trigger the problem. I copied the whole VG one LV
> at a time with:
> dd if=/dev/oldvg/lv1 of=/dev/newvg/lv1
> Copying the entire LVs sequentially saw no problems. Later when I tried
> to rsync to the LVs the problem showed itself.
That's remarkable as it removes the fs from the equation. What
fs are you using?
>
>>>> filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|",
>>>> "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
>>>
>> Is it intentional to include sd devices? Just because the MD uses
>> them doesn't mean you have to make allowances for them here.
>
>
> Some /dev/sdX devices were used, but no longer, and I have now removed
> sd.* and etherd.
>
>
>>> < locking_dir = "/var/lock/lvm"
>>> ---
>>>> locking_dir = "/dev/shm"
>>
>> Why?
>
> This was changed AFTER the problem started.
> Because the comment in the file says:
>
> # Local non-LV directory that holds file-based locks while commands
> # are in progress.
>
> Because /var/lock is on an LV, I tried switching it to a directory that
> will never be on an LV. That didn't seem to have any effect.
Not a bad idea. Returning to the backtrace:
...
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa02070c1>] make_request+0x501/0x520 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8102ea69>] ? native_smp_send_reschedule+0x49/0x60
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810508e8>] ? resched_task+0x68/0x80
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff813d09fb>] md_make_request+0xcb/0x230
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c484>] ? try_to_wake_up+0x284/0x380
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] generic_make_request+0x1b2/0x4f0
...
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00029c4>] ? dm_wait_for_completion+0xd4/0x100 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003836>] dm_flush+0x56/0x70 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00038a4>] dm_wq_work+0x54/0x200 [dm_mod]
...
Dec 13 09:15:52 clonebox3 kernel: INFO: task kcopyd:31629 blocked for more than 120 seconds.
Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 13 09:15:52 clonebox3 kernel: kcopyd D ffff88007b824700 0 31629 2 0x00000080
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ac0 0000000000000046 ffff880044aa7a88 ffff880044aa7a84
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ae0 ffff88007b824700 ffff880001e16980 00000001083f7280
raid5_quiesce should have been straightforward:
http://lxr.linux.no/linux+v3.1.5/drivers/md/raid5.c#L5422
From the stack context I expect it to be in case 2 or 0. It could also
be stuck on a lock, or it really did stop writes.
At this point I think you might have more of an MD issue than anything
else. If you could take MD out of the picture by using a single disk or
a HW RAID, that would be a really useful data point.
I would also investigate the health of your RAID. Look back in the logs
for members being ejected and then re-introduced. Also, if you have any
scripts that use mdadm to ping the array for status, you might want to
stop those too. Sounds like the linux-raid list is your next stop.
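(A minimal checklist for that, with md0 standing in for whichever array
is involved:)

    cat /proc/mdstat                        # array state; shows any resync/check in flight
    mdadm --detail /dev/md0                 # per-member state, failed/spare counts
    grep -i raid /var/log/messages | tail   # members being ejected and re-added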
Peter
* Re: [linux-lvm] access through LVM causes D state lock up
From: Ray Morris @ 2011-12-13 23:33 UTC
To: linux-lvm
> > On Tue, 13 Dec 2011 13:35:46 -0500
> > "Peter M. Petrakis" <peter.petrakis@canonical.com> wrote
> What distro and kernel are you on?
2.6.32-71.29.1.el6.x86_64 (CentOS 6)
> > Copying the entire LVs sequentially saw no problems. Later when I
> > tried to rsync to the LVs the problem showed itself.
>
> That's remarkable as it removes the fs from the equation. What
> fs are you using?
ext3
> Not a bad idea. Returning to the backtrace:
...
> raid5_quiesce should have been straight forward
>
> http://lxr.linux.no/linux+v3.1.5/drivers/md/raid5.c#L5422
Interesting. Not that I speak kernel, but I may have to learn.
Please note the other partial stack trace included refers to a
different function.
Dec 13 09:15:52 clonebox3 kernel: Call Trace:
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
--
Dec 13 09:15:52 clonebox3 kernel: Call Trace:
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0009a15>] sync_io+0xe5/0x180 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] ? generic_make_request+0x1b2/0x4f0
--
Dec 13 09:15:52 clonebox3 kernel: Call Trace:
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00046ec>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8109bba9>] ? ktime_get_ts+0xa9/0xe0
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8119e960>] ? sync_buffer+0x0/0x50
an earlier occurrence:
Dec 5 23:31:34 clonebox3 kernel: Call Trace:
Dec 5 23:31:34 clonebox3 kernel: [<ffffffff8134ac7d>] ? scsi_setup_blk_pc_cmnd+0x13d/0x170
Dec 5 23:31:34 clonebox3 kernel: [<ffffffffa01e7ca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 5 23:31:34 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
--
Dec 5 23:31:34 clonebox3 kernel: Call Trace:
Dec 5 23:31:34 clonebox3 kernel: [<ffffffffa00098fb>] ? dispatch_io+0x22b/0x260 [dm_mod]
Dec 5 23:31:34 clonebox3 kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0
Dec 5 23:31:34 clonebox3 kernel: [<ffffffffa0009a15>] sync_io+0xe5/0x180 [dm_mod]
--
Dec 5 23:31:34 clonebox3 kernel: Call Trace:
Dec 5 23:31:34 clonebox3 kernel: [<ffffffff81267d7e>] ? __sg_alloc_table+0x7e/0x130
Dec 5 23:31:34 clonebox3 kernel: [<ffffffffa01e7ca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 5 23:31:34 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
--
Dec 5 23:31:35 clonebox3 kernel: Call Trace:
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa00098fb>] ? dispatch_io+0x22b/0x260 [dm_mod]
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa0009a15>] sync_io+0xe5/0x180 [dm_mod]
--
Dec 5 23:31:35 clonebox3 kernel: Call Trace:
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa01e7ca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa01f00c1>] make_request+0x501/0x520 [raid456]
--
Dec 5 23:31:35 clonebox3 kernel: Call Trace:
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa00098fb>] ? dispatch_io+0x22b/0x260 [dm_mod]
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0
Dec 5 23:31:35 clonebox3 kernel: [<ffffffffa0009a15>] sync_io+0xe5/0x180 [dm_mod]
--
Dec 5 23:31:35 clonebox3 kernel: Call Trace:
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff81091eb6>] ? autoremove_wake_function+0x16/0x40
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff810500e9>] ? __wake_up_common+0x59/0x90
Dec 5 23:31:35 clonebox3 kernel: [<ffffffff8109218e>] ? prepare_to_wait+0x4e/0x80
> At this point I think you might have more of an MD issue than
> anything else. If you could take MD out of the picture by using a
> single disk or a HW RAID, that would be a really useful data
> point.
I _THINK_ it was all hardware RAID when this happened before, but I
can't be sure.
--
Ray Morris
support@bettercgi.com
Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/
Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/
Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php
* Re: [linux-lvm] access through LVM causes D state lock up
From: Peter M. Petrakis @ 2011-12-14 14:50 UTC
To: linux-lvm
On 12/13/2011 06:33 PM, Ray Morris wrote:
>>> On Tue, 13 Dec 2011 13:35:46 -0500
>>> "Peter M. Petrakis" <peter.petrakis@canonical.com> wrote
>
>> What distro and kernel are you on?
>
>
> 2.6.32-71.29.1.el6.x86_64 (CentOS 6)
>
>
>>> Copying the entire LVs sequentially saw no problems. Later when I
>>> tried to rsync to the LVs the problem showed itself.
>>
>> That's remarkable as it removes the fs from the equation. What
>> fs are you using?
>
> ext3
>
>> Not a bad idea. Returning to the backtrace:
> ...
>> raid5_quiesce should have been straightforward:
>>
>> http://lxr.linux.no/linux+v3.1.5/drivers/md/raid5.c#L5422
>
> Interesting. Not that I speak kernel, but I may have to learn.
> Please note the other partial stack trace included refers to a
> different function.
>
>
> Dec 13 09:15:52 clonebox3 kernel: Call Trace:
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
> --
> Dec 13 09:15:52 clonebox3 kernel: Call Trace:
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0009a15>] sync_io+0xe5/0x180 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] ? generic_make_request+0x1b2/0x4f0
> --
> Dec 13 09:15:52 clonebox3 kernel: Call Trace:
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00046ec>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8109bba9>] ? ktime_get_ts+0xa9/0xe0
> Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8119e960>] ? sync_buffer+0x0/0x50
>
> an earlier occurrence:
>
> Dec 5 23:31:34 clonebox3 kernel: Call Trace:
> Dec 5 23:31:34 clonebox3 kernel: [<ffffffff8134ac7d>] ? scsi_setup_blk_pc_cmnd+0x13d/0x170
> Dec 5 23:31:34 clonebox3 kernel: [<ffffffffa01e7ca5>] raid5_quiesce+0x125/0x1a0 [raid456]
> Dec 5 23:31:34 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
[snip]
Still in the RAID code, just a tiny bit further. I assume that when you
examine lsscsi -l, all the disks are 'running' at this point?
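(For example - both of these should work on a kernel of that era:)

    lsscsi -l                           # each device should report state=running
    cat /sys/block/sd*/device/state     # 'running', not 'offline' or 'blocked'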
>
>
>> At this point I think you might have more of an MD issue than
>> anything else. If you could take MD out of the picture by using a
>> single disk or a HW RAID, that would be a really useful data
>> point.
>
> I _THINK_ it was all hardware RAID when this happened before, but I
> can't be sure.
Then you're not at your wits' end, and you possess the HW to isolate this
issue. Please retry your experiment and keep us posted.
Peter
* Re: [linux-lvm] access through LVM causes D state lock up
From: Ray Morris @ 2011-12-15 21:38 UTC
To: linux-lvm
Peter, thank you very much for your comments. I'll continue testing and cussing.
It's beginning to look like snapshots + md raid check or resync =
freeze in some cases.
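(If that's the pattern, it should be visible the next time it happens -
md0 is a placeholder for the real array name, and I believe CentOS ships
a weekly raid-check cron job that could be disabled while testing:)

    cat /proc/mdstat                            # a running check/resync shows progress
    cat /sys/block/md0/md/sync_action           # 'check', 'resync', or 'idle'
    echo idle > /sys/block/md0/md/sync_action   # stop a running check during the test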
--
Ray Morris
support@bettercgi.com
Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/
Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/
Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php