From: Jojo <jojo@automatix.de>
To: Sulla <Sulla@gmx.at>, linux-btrfs@vger.kernel.org
Subject: Re: btrfs-transaction blocked for more than 120 seconds
Date: Thu, 02 Jan 2014 09:49:19 +0100
Message-ID: <52C5280F.8090208@automatix.de>
In-Reply-To: <52C2AE7C.5020601@gmx.at>
On 31.12.2013 12:46, Sulla wrote:
> Dear all!
>
> On my Ubuntu Server 13.10 I use a RAID5 blockdevice consisting of 3 WD20EARS
> drives. On this I built a LVM and in this LVM I use quite normal partitions
> /, /home, SWAP (/boot resides on a RAID1.) and also a custom /data
> partition. Everything (except boot and swap) is on btrfs.
>
> sometimes my system hangs for quite some time (top is showing a high wait
> percentage), then runs on normally. I get kernel messages into
> /var/log/syslog, see below. I am unable to make any sense of the kernel
> messages, there is no reference to the filesystem or drive affected (at
> least I can not find one).
>
> Question: What is happening here?
> * Is a HDD failing (smart looks good, however)
> * Is something wrong with my btrfs-filesystem? with which one?
> * How can I find the cause?
>
Hello Wolfgang,
first, off-topic: Happy New Year!
Over the holidays one of our servers (Ubuntu 13.04, custom kernel
3.11.04) behaved quite similarly, also on RAID5/RAID6.
In our case, writing to the backup produced much the same kernel log,
and btrfs-transaction was hanging as well.
Filesystem usage was at 83%, which looked fine, but that impression was
wrong.
After a time-consuming investigation I found that Btrfs may have a
problem with free block lists and fragmentation in 3.11.x, and possibly
in other kernels(?).
Our server recovered on its own after a defragmentation and compression
run.
We had run out of free blocks.
After rebuilding the free block list and running defrag, the server had
enough free blocks to operate well again.
To be able to do that, we had to use the btrfs git kernel and
btrfs-progs from git (3.13-rcX).
On 26.12.13 I ran:
# umount /ar
# btrfsck --repair --init-extent-tree /dev/sda1
# mount -o clear_cache,skip_balance,autodefrag /dev/sda1 /ar
# btrfs fi defragment -rc /ar/backup
But be careful: I thought that 83% used space would leave enough free
blocks, but this was wrong. It seems that the Btrfs free block lists are
somewhat erroneous.
In particular, "balance" may crash if a file has too many
extents/fragments, and space allocation may also hang when
free blocks are running low.
During the defragmentation run the server became slow to respond, but
read access never stopped.
Our state today:
root@bk:~# df -m /ar
Filesystem     1M-blocks    Used Available Use% Mounted on
/dev/sda1       13232966 7213717   3181874  70% /ar
root@bk:~# btrfs fi show /ar
Label: Archiv+Backup uuid: 72b710aa-49a0-4ff5-a470-231560bfee81
Total devices 5 FS bytes used 6.88TiB
devid 1 size 2.73TiB used 2.70TiB path /dev/sda1
devid 2 size 2.73TiB used 2.70TiB path /dev/sdb1
devid 3 size 2.73TiB used 2.70TiB path /dev/sdc1
devid 4 size 2.73TiB used 2.70TiB path /dev/sdd1
devid 5 size 1.70TiB used 4.25GiB path /dev/sde4
Btrfs v3.12
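One way to read the listing above: the per-device "used" value in "btrfs fi show" is space already allocated to chunks, not file data, so devid 1 to 4 are almost fully allocated even though df reports 70%. That may be one reason why the df percentage looked fine. A minimal sketch in shell/awk, assuming the v3.12 line layout shown above ("devid N size S used U path P") and binary unit suffixes; alloc_report and to_gib are hypothetical helper names, not part of btrfs-progs:

```shell
#!/bin/sh
# Sketch: per-device chunk allocation from "btrfs fi show"-style text on stdin.
# Assumption: lines look like "devid N size <S> used <U> path <P>".
alloc_report() {
    awk '
    # Convert a value such as "2.73TiB" or "4.25GiB" to GiB.
    function to_gib(s,    n, u) {
        n = s + 0                      # leading numeric part
        u = s; gsub(/[0-9.]/, "", u)   # remaining unit suffix
        if (u == "TiB") return n * 1024
        if (u == "GiB") return n
        if (u == "MiB") return n / 1024
        if (u == "KiB") return n / 1024 / 1024
        return n / 1024 / 1024 / 1024  # no suffix: assume plain bytes
    }
    $1 == "devid" {
        size = to_gib($4); used = to_gib($6)
        if (size > 0)
            printf "%s %.1f%% allocated\n", $8, 100 * used / size
    }'
}
```

Usage would be something like "btrfs fi show /ar | alloc_report". With the values above this reports about 98.9% for /dev/sda1 to /dev/sdd1 and about 0.2% for /dev/sde4.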
root@bk:~# btrfs fi df /ar
Data, single: total=8.00MiB, used=0.00
Data, RAID5: total=8.10TiB, used=6.87TiB
System, single: total=4.00MiB, used=0.00
System, RAID5: total=12.00MiB, used=600.00KiB
Metadata, single: total=8.00MiB, used=0.00
Metadata, RAID5: total=12.25GiB, used=10.41GiB
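The chunk-level numbers above can be reduced to a used percentage and the remaining headroom per block-group type, which is arguably a more useful figure than the df percentage when checking for low free space. A rough sketch in shell/awk, assuming the "Type, PROFILE: total=X, used=Y" layout printed by v3.12; chunk_usage and to_gib are hypothetical helper names, not part of btrfs-progs:

```shell
#!/bin/sh
# Sketch: usage inside already-allocated chunks, from "btrfs fi df"-style
# text on stdin. Assumption: lines look like
#   "Data, RAID5: total=8.10TiB, used=6.87TiB"
chunk_usage() {
    awk -F'[,:=] *' '
    # Convert a value such as "8.10TiB" or "600.00KiB" to GiB.
    function to_gib(s,    n, u) {
        n = s + 0                      # leading numeric part
        u = s; gsub(/[0-9.]/, "", u)   # remaining unit suffix
        if (u == "TiB") return n * 1024
        if (u == "GiB") return n
        if (u == "MiB") return n / 1024
        if (u == "KiB") return n / 1024 / 1024
        return n / 1024 / 1024 / 1024  # no suffix: assume plain bytes
    }
    # Fields after splitting: type, profile, "total", X, "used", Y
    $3 == "total" && $5 == "used" {
        total = to_gib($4); used = to_gib($6)
        if (total > 0)
            printf "%s/%s %.1f%% used %.2f GiB free\n", $1, $2, 100 * used / total, total - used
    }'
}
```

Usage would be something like "btrfs fi df /ar | chunk_usage". For the RAID5 lines above this gives roughly 84.8% used for data and 85.0% used for metadata, with only about 1.84 GiB of headroom left in the allocated metadata chunks.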
Today the server has completely recovered and is fully operational.
Is there a plan to handle such out-of-free-blocks/space situations more
gracefully?
TIA
J. Sauer
--
Jürgen Sauer - automatiX GmbH,
+49-4209-4699, juergen.sauer@automatix.de
Managing Director: Jürgen Sauer,
Place of jurisdiction: Amtsgericht Walsrode • HRB 120986
VAT ID: DE191468481 • Tax no.: 36/211/08000
GPG public key for signature verification:
http://www.automatix.de/juergen_sauer_publickey.gpg