From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from licorne.daevel.fr ([178.32.94.222]:33011 "EHLO
    licorne.daevel.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1755276Ab2JIKHY (ORCPT );
    Tue, 9 Oct 2012 06:07:24 -0400
Received: from local.plusdinfo.com ([82.232.160.30] helo=[192.168.0.10])
    by licorne.daevel.fr with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32)
    (Exim 4.72) (envelope-from ) id 1TLWig-0004rV-AB
    for linux-btrfs@vger.kernel.org; Tue, 09 Oct 2012 12:07:22 +0200
Message-ID: <5073F758.6080507@daevel.fr>
Date: Tue, 09 Oct 2012 12:07:20 +0200
From: Olivier Bonvalet
MIME-Version: 1.0
To: linux-btrfs@vger.kernel.org
Subject: Re: Frozen transaction
References: <5073D44C.7000601@daevel.fr> <20121009095205.GL4405@twin.jikos.cz>
In-Reply-To: <20121009095205.GL4405@twin.jikos.cz>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Thanks for your reply.

On 09/10/2012 11:52, David Sterba wrote:
> On Tue, Oct 09, 2012 at 09:37:48AM +0200, Olivier Bonvalet wrote:
>> on one system I have a "frozen transaction" since more than 24 hours,
>> without any IO.
>> I can't umount the partition, delete a snapshot or write anything.
>> I try to reboot the system, but the problem is still present.
>
> The processes could point at the cleaner deadlock, though I'm not
> completely sure without looking at the process stacks (/proc/PID/stack).

I don't see any "stack" entry in /proc/$PID/; I will try to find out
which kernel option exports it.

> If the problem persists across reboots, how long after mount does it
> take to get to this state? Cleaner usually kicks in after the 30 second
> transaction commit period, so this should be easy to verify if it's
> immediate or if it requires some load to get into the dead state.

The cleaner process enters state D between 30 and 60 seconds after the
reboot. But the cleaner process should not generate a lot of write
access, should it?
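
For collecting the stacks once the kernel exposes them, here is a minimal sketch, assuming /proc/PID/stat, /proc/PID/comm and /proc/PID/stack are readable (the last one usually only as root). The helper names `parse_state` and `blocked_tasks` are made up for illustration and are not part of any btrfs tool.

```python
import os


def parse_state(stat_line):
    """Return the one-letter task state from a /proc/PID/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take the first field after the LAST ')'.
    return stat_line.rsplit(")", 1)[1].split()[0]


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm, stack) for tasks in uninterruptible sleep (D)."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        base = os.path.join(proc, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                if parse_state(f.read()) != "D":
                    continue
            with open(os.path.join(base, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # task exited while we were scanning
        try:
            with open(os.path.join(base, "stack")) as f:
                stack = f.read()
        except OSError:
            # No "stack" entry (as on my kernel) or not permitted to read it.
            stack = "(stack unavailable)"
        yield int(entry), comm, stack
```

Looping over `blocked_tasks()` and printing each tuple about a minute after mount should list the cleaner thread if it is stuck in state D, together with its stack when the file exists.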

This time I tried to remount with the space cache enabled; there is a
lot of read access now. Will the space cache help to find free
locations?

>> The partition is mounted with this options :
>> # mount | grep btrfs
>> /dev/mapper/vg--sofia-backup on /backup type btrfs
>> (rw,noatime,compress-force=zlib,nossd)
>
> So you don't mount with autodefrag, hmm. The deadlock I had in mind
> is more likely with autodefrag but also requires umount.
>
>> The disk is near full :
>> # btrfs fi df /backup/
>> Data: total=482.68GB, used=480.89GB
>
> Quite full.

Yes, that is the problem.

>> System, DUP: total=32.00MB, used=72.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=10.12GB, used=8.82GB
>>
>> But one of the last actions was the removing of some big subvolumes (near
>> 50GB).
>
> Given the amount of free space left, this creates high pressure on data
> writes and makes the deadlock more likely.
>
>> There is no error in logs, the frozen transaction was started from a 3.5*
>> kernel (from GIT), and the system is now running on a 3.6.1 kernel
>> (vanilla).
>>
>> Is there something I can do to solve that problem ?
>
> No, there's a patch sent out in order to fix the deadlocks but it's
> unfortunately still unmerged.

I suppose I can't resize the FS without first resolving that cleanup
deadlock?

> david
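
To put numbers on "quite full", here is a small sketch that parses the `btrfs fi df` output quoted above and computes the fill ratio per block-group type. It assumes the output format shown in this message (`total=...`, `used=...`) and that the units are powers of 1024; the names `usage` and `REPORT` are only illustrative.

```python
import re

# Assumption: K/M/G/T are binary multiples, as printed by this btrfs-progs.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

LINE = re.compile(
    r"^(?P<label>[^:]+):\s*"
    r"total=(?P<total>[\d.]+)(?:(?P<tu>[KMGT])i?B)?,\s*"
    r"used=(?P<used>[\d.]+)(?:(?P<uu>[KMGT])i?B)?$"
)


def to_bytes(value, unit):
    """Convert a number and an optional unit letter to bytes."""
    return float(value) * UNITS.get(unit, 1)


def usage(report):
    """Map each label to (used_bytes, total_bytes, percent_used)."""
    result = {}
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a "label: total=, used=" line
        total = to_bytes(m["total"], m["tu"])
        used = to_bytes(m["used"], m["uu"])
        percent = 100.0 * used / total if total else 0.0
        result[m["label"]] = (used, total, percent)
    return result


# The figures from the output quoted in this thread.
REPORT = """\
Data: total=482.68GB, used=480.89GB
System, DUP: total=32.00MB, used=72.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=10.12GB, used=8.82GB
"""
```

With the figures above, `usage(REPORT)["Data"]` comes out at about 99.6% used, which leaves roughly 1.79GB inside the allocated data chunks. These numbers only describe space within already allocated chunks, not unallocated space on the device.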