From: Eugene Crosser <crosser@average.org>
To: linux-btrfs@vger.kernel.org
Subject: btrfs ops hang indefinitely (process in D state)
Date: Sat, 2 Jul 2016 12:49:53 +0300
Message-ID: <57778E41.1010000@average.org>


Hello,

This may be the same problem as "btrfs lockup".

I have two systems that have been using btrfs for several years. One is my home
desktop; it has root+home on an ext4 fs on a PCI SSD, and "big stuff" on a btrfs
using two hard disks in a RAID1 configuration:

root@pccross:/export# uname -a
Linux pccross 4.7.0-rc2-custom #2 SMP Sat Jun 11 01:13:59 MSK 2016 x86_64 x86_64
x86_64 GNU/Linux # -- Was earlier 4.x version when the problem happened
root@pccross:/export# btrfs --version
btrfs-progs v4.4
root@pccross:/export# btrfs fi show
Label: 'export'  uuid: c94c3ef6-394e-4441-8992-d7033332bdff
	Total devices 2 FS bytes used 1.26TiB
	devid    1 size 3.64TiB used 1.26TiB path /dev/sda
	devid    2 size 3.64TiB used 1.26TiB path /dev/sdb

root@pccross:/export# btrfs fi df /export
Data, RAID1: total=1.26TiB, used=1.25TiB
System, RAID1: total=32.00MiB, used=208.00KiB
Metadata, RAID1: total=5.00GiB, used=3.82GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

A month ago, I moved a directory containing a few GB from home (ext4) to btrfs
with the `mv` command. The command took some minutes and eventually finished
without error. After some hours, a cron job that uses files on btrfs did not
run. I logged in to investigate and realized that its process was in 'D' state,
and any command that I tried that would use btrfs (ls, ...) would enter 'D'
state and stay there indefinitely. There was nothing interesting (that I
remember) in dmesg. A reboot did not help and indeed could not complete, because
some of the startup jobs use files on btrfs, and they hung.
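
A minimal sketch of how processes stuck in 'D' (uninterruptible sleep) state
could be listed, together with their kernel stacks, without touching the btrfs
mount itself. It only reads /proc. The helper names `parse_stat_state` and
`find_d_state` are hypothetical, and reading /proc/<pid>/stack usually requires
root and a kernel that exposes that file, so treat this as an illustration
rather than a tested diagnostic tool.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Yield (pid, comm, stack_or_None) for tasks in uninterruptible sleep."""
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "stack")) as f:
                stack = f.read()
        except OSError:
            stack = None  # not root, or the kernel does not expose the file
        yield pid, comm, stack


if __name__ == "__main__":
    for pid, comm, stack in find_d_state():
        print(pid, comm)
        if stack:
            print(stack)
```

The kernel stack of a hung task is typically what identifies which lock or I/O
wait it is blocked on, which is why the sketch collects it alongside the PID.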

I rebooted without mounting btrfs and ran `btrfsck`. It found and fixed some
inconsistencies (no log, sorry), and I could mount again. Since then everything
works, except that the directory I had moved disappeared altogether (I had a
backup, so I could restore it). No debugging material is left, so this is just
for background.

=====
Enter the second system. It is a rented physical server in a datacenter with two
hard disks, joined into a single root btrfs (/dev/sd[ab]1 are swap partitions):

root@dehost:~# uname -a
Linux dehost 3.13.0-91-generic #138-Ubuntu SMP Fri Jun 24 17:00:34 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux
root@dehost:~# btrfs --version
Btrfs v3.12
root@dehost:~# btrfs fi show
Label: none  uuid: 67a2708c-f039-4783-a699-6f6be0dac318
	Total devices 2 FS bytes used 442.58GiB
	devid    1 size 2.72TiB used 444.04GiB path /dev/sda2
	devid    2 size 2.72TiB used 444.03GiB path /dev/sdb2

Btrfs v3.12
root@dehost:~# btrfs fi df /
Data, RAID1: total=440.00GiB, used=439.51GiB
System, RAID1: total=32.00MiB, used=72.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=4.00GiB, used=3.07GiB
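
To make the numbers above easier to compare, here is a minimal sketch that
parses `btrfs fi df` output and reports how full each allocated chunk type is.
The function names (`to_bytes`, `parse_fi_df`) are hypothetical, and the regular
expression only assumes the line format shown in this message, including the
unit-less `used=0.00`. A nearly full Data allocation is not by itself a fault
while the devices still have unallocated space, so this is only a way of
reading the output, not a diagnosis.

```python
import re

_UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
_LINE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[^,\s]+), used=(?P<used>[^,\s]+)$"
)


def to_bytes(text):
    """Convert a size such as '4.00GiB' or '0.00' to a number of bytes."""
    m = re.fullmatch(r"([0-9.]+)([KMGT]iB|B)?", text)
    if m is None:
        raise ValueError("unrecognized size: %r" % text)
    return float(m.group(1)) * _UNITS[m.group(2) or "B"]


def parse_fi_df(output):
    """Return a list of (kind, profile, total_bytes, used_bytes) tuples."""
    rows = []
    for line in output.splitlines():
        m = _LINE.match(line.strip())
        if m:
            rows.append((m["kind"], m["profile"],
                         to_bytes(m["total"]), to_bytes(m["used"])))
    return rows


SAMPLE = """\
Data, RAID1: total=440.00GiB, used=439.51GiB
System, RAID1: total=32.00MiB, used=72.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=4.00GiB, used=3.07GiB
"""

if __name__ == "__main__":
    for kind, profile, total, used in parse_fi_df(SAMPLE):
        print("%-9s %-7s %6.2f%% of allocated space used"
              % (kind, profile, 100.0 * used / total))
```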

A week ago, the system started to become unresponsive every day. The kernel
still works (it responds to ping), but no processes can start. Looking at the
logs after a reboot, I noticed that activity stops some time after the start of
the backup cron job that covers a set of directories (/etc, /home, /var/mail and
some more). I disabled the backup job, and in the several days since then the
system has not hung.

=====
My question to the developers: what can I do to (1) recover the filesystem while
it is mounted (I can use a recovery netboot system and run `btrfs check` as a
last resort), and (2) provide any useful debugging information to the developers?
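
For point (2), one commonly used source of information is the kernel's SysRq
'w' command, which logs all tasks in uninterruptible (blocked) state. The
following is a minimal sketch under stated assumptions: it must run as root
while the hang is in progress, `dmesg` must be available, and the function
names are hypothetical. On a system whose root filesystem is the one that
hangs, starting a new process may itself block, so this is more likely to be
usable from a shell that is already open.

```python
import subprocess


def request_blocked_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to log all tasks in uninterruptible (blocked) state."""
    # Writing the single character 'w' triggers the SysRq 'w' action.
    with open(trigger_path, "w") as f:
        f.write("w")


def kernel_log():
    """Return the current kernel ring buffer as text (runs `dmesg`)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True,
                            check=True)
    return result.stdout


if __name__ == "__main__":
    request_blocked_task_dump()
    # Print to the terminal rather than to a file on the hung filesystem.
    print(kernel_log())
```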

Thank you,

Eugene


