From: Marc MERLIN <marc@merlins.org>
To: linux-btrfs@vger.kernel.org
Subject: 3.15.0-rc5: btrfs and sync deadlock: call_rwsem_down_read_failed
Date: Thu, 22 May 2014 02:09:21 -0700
Message-ID: <20140522090921.GA12037@merlins.org>

I got my laptop to hang all IO to one of its devices again, this time
drive #2.
This is the 3rd time this has happened, and I've already lost data as a
result, since anything that hasn't hit disk by this point never makes it.
I was doing balance and btrfs send/receive.
Then cron started a scrub in the background too.
IO to drive #1 was working fine, so I didn't even notice that IO to
drive #2 was hung.
And then I typed sync and it never returned.
legolas:~# ps -eo pid,user,args,wchan | grep sync
23605 root sync call_rwsem_down_read_failed
31885 root sync call_rwsem_down_read_failed
What does this mean when sync is stuck that way?
When I'm in that state, accessing btrfs on drive 1 still works (read and
write).
Any access to drive 2 through btrfs hangs.
Both block devices can still be read directly:
legolas:~# dd if=/dev/sda of=/dev/null bs=1M
2593128448 bytes (2.6 GB) copied, 6.47656 s, 400 MB/s
legolas:~# dd if=/dev/sdb of=/dev/null bs=1M
148897792 bytes (149 MB) copied, 7.99576 s, 18.6 MB/s
So at least it shows that I don't have a hardware problem, right?
After the reboot, most of the data written to drive 1 had made it to disk,
so at least sync worked there.
How can I confirm that it is btrfs deadlocking and not something else in
the kernel?
The state of btrfs is:
legolas:~# ps -eo pid,user,args,wchan | grep btrfs
527 root [btrfs-worker] rescuer_thread
528 root [btrfs-worker-hi] rescuer_thread
529 root [btrfs-delalloc] rescuer_thread
530 root [btrfs-flush_del] rescuer_thread
531 root [btrfs-cache] rescuer_thread
532 root [btrfs-submit] rescuer_thread
533 root [btrfs-fixup] rescuer_thread
534 root [btrfs-endio] rescuer_thread
535 root [btrfs-endio-met] rescuer_thread
536 root [btrfs-endio-met] rescuer_thread
537 root [btrfs-endio-rai] rescuer_thread
538 root [btrfs-rmw] rescuer_thread
539 root [btrfs-endio-wri] rescuer_thread
540 root [btrfs-freespace] rescuer_thread
541 root [btrfs-delayed-m] rescuer_thread
542 root [btrfs-readahead] rescuer_thread
543 root [btrfs-qgroup-re] rescuer_thread
544 root [btrfs-cleaner] cleaner_kthread
545 root [btrfs-transacti] transaction_kthread
2111 root [btrfs-worker] rescuer_thread
2112 root [btrfs-worker-hi] rescuer_thread
2113 root [btrfs-delalloc] rescuer_thread
2114 root [btrfs-flush_del] rescuer_thread
2115 root [btrfs-cache] rescuer_thread
2116 root [btrfs-submit] rescuer_thread
2117 root [btrfs-fixup] rescuer_thread
2119 root [btrfs-endio] rescuer_thread
2120 root [btrfs-endio-met] rescuer_thread
2121 root [btrfs-endio-met] rescuer_thread
2122 root [btrfs-endio-rai] rescuer_thread
2123 root [btrfs-rmw] rescuer_thread
2124 root [btrfs-endio-wri] rescuer_thread
2125 root [btrfs-freespace] rescuer_thread
2126 root [btrfs-delayed-m] rescuer_thread
2127 root [btrfs-readahead] rescuer_thread
2128 root [btrfs-qgroup-re] rescuer_thread
3205 root [btrfs-cleaner] cleaner_kthread
3206 root [btrfs-transacti] transaction_kthread
19156 root gvim /etc/cron.d/btrfs_back poll_schedule_timeout
19729 root btrfs send var_ro.20140521_ pipe_wait
19730 root btrfs receive /mnt/btrfs_po sleep_on_page
19824 root btrfs balance start -dusage btrfs_wait_and_free_delalloc_work
24611 root /bin/sh -c cd /mnt/btrfs_po wait
24619 root btrfs subvolume snapshot /m btrfs_start_delalloc_inodes
32044 root /sbin/btrfs scrub start -Bd futex_wait_queue_me
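
Most of the lines in the listing above are kernel threads whose wait channel is rescuer_thread, which makes the handful of tasks waiting elsewhere harder to spot. A minimal sketch of a filter that hides them follows. The function name skip_rescuers is hypothetical and not part of the original report, and it assumes the wait channel is the last field on each line, as in the ps output above.

```shell
# Hypothetical helper, not from the original report.
# Reads lines in the "pid user args... wchan" layout on stdin and drops
# every line whose last field (the wait channel) is rescuer_thread.
skip_rescuers() {
    awk '$NF != "rescuer_thread"'
}

# Possible usage, assuming the same ps invocation as above:
#   ps -eo pid,user,args,wchan | grep btrfs | skip_rescuers
```

Applied to the listing above, this would leave only the cleaner and transaction threads plus the userspace commands (send, receive, balance, snapshot, scrub) with their wait channels.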
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901