* [linux-lvm] DM suspend locks up under load?
@ 2012-01-04 22:50 David Shaw
2012-01-05 10:44 ` Zdenek Kabelac
0 siblings, 1 reply; 3+ messages in thread
From: David Shaw @ 2012-01-04 22:50 UTC (permalink / raw)
To: linux-lvm
Hi,
I'm using some code that creates a snapshot using DM directly (we aren't using LVM), using essentially:
    suspend linear device X
    reload X as a "snapshot-origin" device
    create "snapshot" device
    resume original X device (which is now a snapshot-origin)
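For reference, the reload and create steps above each load a one-line device-mapper table. The following is only a minimal sketch of what those two lines look like, not the code from the original setup: the helper names (format_origin_table, format_snapshot_table) and the device paths in the usage note are hypothetical. It formats the lines in the "start length target args" form that dmsetup accepts and does not touch any device.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the table line for the reloaded origin:
   "<start> <length> snapshot-origin <origin device>".
   Returns the line length, or -1 on bad input or truncation. */
int
format_origin_table(char *buf, size_t size, unsigned long long sectors,
                    const char *origin_dev)
{
    int n;

    assert(buf != NULL && origin_dev != NULL);
    if (strlen(origin_dev) == 0)
        return -1;

    n = snprintf(buf, size, "0 %llu snapshot-origin %s", sectors, origin_dev);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Format the table line for the snapshot device:
   "<start> <length> snapshot <origin> <COW device> <P|N> <chunk size>".
   P means a persistent snapshot, N a non-persistent one; the chunk size
   is given in 512-byte sectors. */
int
format_snapshot_table(char *buf, size_t size, unsigned long long sectors,
                      const char *origin_dev, const char *cow_dev,
                      int persistent, unsigned chunk_sectors)
{
    int n;

    assert(buf != NULL && origin_dev != NULL && cow_dev != NULL);
    if (strlen(origin_dev) == 0 || strlen(cow_dev) == 0)
        return -1;

    n = snprintf(buf, size, "0 %llu snapshot %s %s %c %u", sectors,
                 origin_dev, cow_dev, persistent ? 'P' : 'N', chunk_sectors);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With placeholder devices, format_origin_table(buf, sizeof buf, 2048, "/dev/sdb1") fills buf with the line to load into X while it is suspended, and format_snapshot_table(buf, sizeof buf, 2048, "/dev/sdb1", "/dev/sdc1", 1, 16) gives the line for the new snapshot device.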
This has worked fine for several years. Recently, however, we moved to a newer system and to ext4, and we are seeing something odd. Under load, the process above freezes at the first suspend step and locks up the device in question, requiring a reboot to recover.
I wrote the attached program to demonstrate the problem. All it does is call DM_DEVICE_SUSPEND and DM_DEVICE_RESUME over and over on a DM device. To reproduce, run the test program on any mounted linear DM target in one shell, then delete a lot of data from a directory residing on that device in another shell. On my systems this freezes both the test program and the rm in D state, and a reboot is required to recover.
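One way to confirm that a process is stuck in D state (uninterruptible sleep) is to read the state field from /proc/<pid>/stat. The following is a minimal sketch, assuming the usual Linux layout of that file, where the state letter follows the parenthesised command name; the helper names (stat_line_state, task_state) are hypothetical and not part of the attached program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the state letter from one /proc/<pid>/stat line.
   The command name is wrapped in parentheses and may itself contain
   spaces or ')', so search for the LAST ')' and take the field after it.
   Returns '\0' if the line cannot be parsed. */
char
stat_line_state(const char *line)
{
    const char *p;

    assert(line != NULL);
    p = strrchr(line, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Read /proc/<pid>/stat and return the state letter ('D' means
   uninterruptible sleep), or '\0' if the file cannot be read. */
char
task_state(long pid)
{
    char path[64];
    char line[1024];
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", pid);
    f = fopen(path, "r");
    if (!f)
        return '\0';
    if (!fgets(line, sizeof line, f))
    {
        fclose(f);
        return '\0';
    }
    fclose(f);
    return stat_line_state(line);
}
```

Calling task_state() with the PID of the hung rm or of the test program would be expected to return 'D' while the lockup lasts.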
I've tried multiple different kernels, but at the moment, I'm using kernel-PAE-2.6.35.6-45.fc14.i686 and device-mapper-libs-1.02.63-2.fc14.i686.
One clue I can add is that it only seems to happen if the filesystem on the device is ext4. It does not happen with ext3.
Any ideas on where I should look next?
Thanks,
David
[-- Attachment #2: suspendtest.c --]
[-- Type: application/octet-stream, Size: 883 bytes --]
#include <stdio.h>
#include <libdevmapper.h>

/* Run a single device-mapper command (e.g. DM_DEVICE_SUSPEND) on the
   named device.  Returns nonzero on success, 0 on failure. */
static int
dm_command(int command,const char *device)
{
  int ret=0;
  struct dm_task *dmt;

  dmt=dm_task_create(command);
  if(!dmt)
    return 0;

  if(!dm_task_set_name(dmt,device))
    goto fail;

  ret=dm_task_run(dmt);

 fail:
  /* The task is destroyed on both the success and the failure path. */
  dm_task_destroy(dmt);

  return ret;
}

int
main(int argc,char *argv[])
{
  if(argc<2)
    {
      printf("%s <DM name>\n",argv[0]);
      return 1;
    }

  /* Do not wait for udev to finish processing each event. */
  dm_udev_set_sync_support(0);

  printf("Suspending and resuming %s\n",argv[1]);

  /* Suspend and resume forever; the spinner stops moving once a call hangs. */
  for(;;)
    {
      if(!dm_command(DM_DEVICE_SUSPEND,argv[1]))
        {
          fprintf(stderr,"Unable to suspend %s\n",argv[1]);
          break;
        }

      printf("/\r");
      fflush(stdout);

      if(!dm_command(DM_DEVICE_RESUME,argv[1]))
        {
          fprintf(stderr,"Unable to resume %s\n",argv[1]);
          break;
        }

      printf("\\\r");
      fflush(stdout);
    }

  return 0;
}
* Re: [linux-lvm] DM suspend locks up under load?
From: Zdenek Kabelac @ 2012-01-05 10:44 UTC (permalink / raw)
To: LVM general discussion and development
On 4.1.2012 23:50, David Shaw wrote:
> Hi,
>
> I'm using some code that creates a snapshot using DM directly (we aren't using LVM), using essentially:
>
> suspend linear device X
> reload X as a "snapshot-origin" device
> create "snapshot" device
> resume original X device (which is now a snapshot-origin)
>
> This has worked fine for several years. Recently, however, we moved to a newer system and to ext4, and we are seeing something odd. Under load, the process above freezes at the first suspend step and locks up the device in question, requiring a reboot to recover.
>
> I wrote the attached program to demonstrate the problem. All it does is call DM_DEVICE_SUSPEND and DM_DEVICE_RESUME over and over on a DM device. To reproduce, run the test program on any mounted linear DM target in one shell, then delete a lot of data from a directory residing on that device in another shell. On my systems this freezes both the test program and the rm in D state, and a reboot is required to recover.
>
> I've tried multiple different kernels, but at the moment, I'm using kernel-PAE-2.6.35.6-45.fc14.i686 and device-mapper-libs-1.02.63-2.fc14.i686.
>
> One clue I can add is that it only seems to happen if the filesystem on the device is ext4. It does not happen with ext3.
>
> Any ideas on where I should look next?
>
Maybe you should suspect ext4, if there is no problem with dm and ext3?
I guess you need to get a stack trace of where the system locks up
(echo t >/proc/sysrq-trigger, or SysRq+T).
You should probably also try a different kernel.
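The same stack dump can be requested from a program by writing a single key character to /proc/sysrq-trigger. The following is a minimal sketch: the helper name write_sysrq_key is hypothetical, and the path is a parameter so the function can be tried against an ordinary file first. Writing 't' dumps all tasks, and 'w' dumps only tasks in uninterruptible (blocked) state; both need root, and the output goes to the kernel log.

```c
#include <assert.h>
#include <stdio.h>

/* Write one SysRq key character to the given trigger file.
   Returns 0 on success, -1 if the file cannot be opened or written. */
int
write_sysrq_key(const char *path, char key)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "w");
    if (!f)
        return -1;

    if (fputc(key, f) == EOF)
    {
        fclose(f);
        return -1;
    }

    /* fclose flushes the buffered byte; a failure here means the write failed. */
    return fclose(f) == 0 ? 0 : -1;
}
```

For example, write_sysrq_key("/proc/sysrq-trigger", 'w') run as root while the test program and rm are hung should log the stacks of the blocked tasks, which can then be read with dmesg.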
Zdenek
* Re: [linux-lvm] DM suspend locks up under load?
From: David Shaw @ 2012-01-10 23:20 UTC (permalink / raw)
To: Zdenek Kabelac; +Cc: LVM general discussion and development
On Jan 5, 2012, at 5:44 AM, Zdenek Kabelac wrote:
> On 4.1.2012 23:50, David Shaw wrote:
>> Hi,
>>
>> I'm using some code that creates a snapshot using DM directly (we aren't using LVM), using essentially:
>>
>> suspend linear device X
>> reload X as a "snapshot-origin" device
>> create "snapshot" device
>> resume original X device (which is now a snapshot-origin)
>>
>> This has worked fine for several years. Recently, however, we moved to a newer system and to ext4, and we are seeing something odd. Under load, the process above freezes at the first suspend step and locks up the device in question, requiring a reboot to recover.
>>
>> I wrote the attached program to demonstrate the problem. All it does is call DM_DEVICE_SUSPEND and DM_DEVICE_RESUME over and over on a DM device. To reproduce, run the test program on any mounted linear DM target in one shell, then delete a lot of data from a directory residing on that device in another shell. On my systems this freezes both the test program and the rm in D state, and a reboot is required to recover.
>>
>> I've tried multiple different kernels, but at the moment, I'm using kernel-PAE-2.6.35.6-45.fc14.i686 and device-mapper-libs-1.02.63-2.fc14.i686.
>>
>> One clue I can add is that it only seems to happen if the filesystem on the device is ext4. It does not happen with ext3.
>>
>> Any ideas on where I should look next?
>>
>
> Maybe you should suspect ext4, if there is no problem with dm and ext3?
>
> I guess you need to get a stack trace of where the system locks up
> (echo t >/proc/sysrq-trigger, or SysRq+T).
>
> You should probably also try a different kernel.
Thanks for the tip! It did indeed turn out to be ext4, and it was already fixed: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commitdiff;h=be4f27d324e8ddd57cc0d4d604fe85ee0425cba9
David