Re: [linux-lvm] access through LVM causes D state lock up

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Peter M. Petrakis" <peter.petrakis@canonical.com>
To: linux-lvm@redhat.com
Subject: Re: [linux-lvm] access through LVM causes D state lock up
Date: Tue, 13 Dec 2011 17:54:02 -0500	[thread overview]
Message-ID: <4EE7D78A.2080704@canonical.com> (raw)
In-Reply-To: <20111213141040.3b090df3@bettercgi.com>



On 12/13/2011 03:10 PM, Ray Morris wrote:
> 
> On Tue, 13 Dec 2011 13:35:46 -0500
> "Peter M. Petrakis" <peter.petrakis@canonical.com> wrote:
> 
> 
>> Do you by any chance have active LVM snapshots? If so how many and
>> how long have they been provisioned for?
> 
> I forgot to mention that. There are now three snapshots, one on each of

What distro and kernel on you on?

> three LVs, that have been provisioned for a few hours. These LVs aren't 
> in active use, but are backups, synced daily. So basically the only 
> activity is rsync once daily, bandwidth limited to be fairly slow. One 
> logical volume that locked up when trying to write to it had a snapshot.

LVM snapshots can be very I/O intensive, even when you're not directly using them.


> Prior to this most recent rebuild, there were a lot of snap shots - 
> three on each of fifteen LVs. I replaced that VG with a fresh one 
> and it seemed to work for a while. I thought the problem was likely 
> related to lots of long lived snapshots, but after completely rebuilding
> the VG after deleting all snapshots the problem recurred very quickly, 
> before there were many snapshots and before there was a lot of IO 
> to the snaps
> 
> I realize I'm somewhat abusing snapshots - they weren't designed to 
> be long lived. Therefore my "torture test" usage may reveal problems 
> that wouldn't happen often with very short lived snapshots. 

That's right :). Some have reported as high as 50% impact on performance.

> Another similar server has more snapshots on more LVs running the 
> same rsyncs without obvious trouble.
> 
> I should also have mentioned sequential writes to one LV at a time 
> don't seem to trigger the problem. I copied the whole VG one LV 
> at a time with:
> dd if=/dev/oldvg/lv1 of=/dev/newvg/lv1

> Copying the entire LVs sequentially saw no problems. Later when I tried 
> to rsync to the LVs the problem showed itself.

That's remarkable as it removes the fs from the equation. What
fs are you using?

> 
>>>>    filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|",
>>>> "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
>>>
>> Is it intentional to include sd devices? Just because the MD uses
>> them doesn't mean you have to make allowances for them here.
> 
> 
> Some /dev/sdX devices were used, but no more and I have now removed 
> sd.* and etherd.
> 
> 
>>> <     locking_dir = "/var/lock/lvm"
>>> ---
>>>>     locking_dir = "/dev/shm"
>>
>> Why?
> 
> This was changed AFTER the problem started.
> Because comment in the file says:
> 
>   # Local non-LV directory that holds file-based locks while commands
>   # are in progress.  
> 
> Because /var/lock is on an LV, I tried switching it to a directory that 
> will never be on an LV. That didn't seem to have any effect.

Not a bad idea. Returning to the backtrace:

...
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa01feca5>] raid5_quiesce+0x125/0x1a0 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c580>] ? default_wake_function+0x0/0x20
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810563f3>] ? __wake_up+0x53/0x70
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa02070c1>] make_request+0x501/0x520 [raid456]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8102ea69>] ? native_smp_send_reschedule+0x49/0x60
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff810508e8>] ? resched_task+0x68/0x80
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff813d09fb>] md_make_request+0xcb/0x230

Dec 13 09:15:52 clonebox3 kernel: [<ffffffff8105c484>] ? try_to_wake_up+0x284/0x380
Dec 13 09:15:52 clonebox3 kernel: [<ffffffff81241982>] generic_make_request+0x1b2/0x4f0
...

Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00029c4>] ? dm_wait_for_completion+0xd4/0x100 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa0003836>] dm_flush+0x56/0x70 [dm_mod]
Dec 13 09:15:52 clonebox3 kernel: [<ffffffffa00038a4>] dm_wq_work+0x54/0x200 [dm_mod]
...
Dec 13 09:15:52 clonebox3 kernel: INFO: task kcopyd:31629 blocked for more than 120 seconds.
Dec 13 09:15:52 clonebox3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 13 09:15:52 clonebox3 kernel: kcopyd        D ffff88007b824700     0 31629      2 0x00000080
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ac0 0000000000000046 ffff880044aa7a88 ffff880044aa7a84
Dec 13 09:15:52 clonebox3 kernel: ffff880044aa7ae0 ffff88007b824700 ffff880001e16980 00000001083f7280

raid5_quiesce should have been straight forward

http://lxr.linux.no/linux+v3.1.5/drivers/md/raid5.c#L5422

From the stack context I expect it to in case 2 or 0. It could also be stuck on a lock or it
really did stop writes.

At this point I think you might have more of an MD issue than anything else. If you could take MD
out of the picture by using a single disk or use a HW RAID, that would be a really useful data point.

I would also investigate the health of your RAID. Look back in the logs for members being ejected
and then re-introduced. Also if you have any scripts that use mdadm to ping the array for status you might
want to stop those too. Sounds like the linux-raid list is your next stop.

Peter

next prev parent reply	other threads:[~2011-12-13 22:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-13 17:45 [linux-lvm] access through LVM causes D state lock up Ray Morris
2011-12-13 18:35 ` Peter M. Petrakis
2011-12-13 20:10   ` Ray Morris
2011-12-13 22:54     ` Peter M. Petrakis [this message]
2011-12-13 23:33       ` Ray Morris
2011-12-14 14:50         ` Peter M. Petrakis
2011-12-15 21:38           ` Ray Morris

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EE7D78A.2080704@canonical.com \
    --to=peter.petrakis@canonical.com \
    --cc=linux-lvm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.