linux-btrfs.vger.kernel.org archive mirror
* Machine lockup due to btrfs-transaction on AWS EC2 Ubuntu 14.04
From: Peter Waller @ 2014-07-29  8:04 UTC (permalink / raw)
  To: linux-btrfs

Hi All,

I've reported a bug with Ubuntu here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349711

The machine in question has one BTRFS volume which is 87% full and
lives on a Logical Volume Manager (LVM) block device on top of one
Amazon Elastic Block Store (EBS) device.

We have other machines in a similar configuration which have not
displayed this behaviour.

The one thing which makes this machine different is that it has
directories which contain many thousands of files. We don't make heavy
use of subvolumes or snapshots.

More details follow:

# cat /proc/version_signature
Ubuntu 3.13.0-32.57-generic 3.13.11.4

The machine had a soft-lockup with messages like this appearing on the console:

[246736.752053] INFO: rcu_sched self-detected stall on CPU { 0} (t=2220246 jiffies g=35399662 c=35399661 q=0)
[246736.756059] INFO: rcu_sched detected stalls on CPUs/tasks: { 0} (detected by 1, t=2220247 jiffies, g=35399662, c=35399661, q=0)
[246764.192014] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246764.212058] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
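
For reference, console output like the above can be summarised with a
short script. The following is only a minimal Python sketch: the function
names are made up for illustration, and the conversion from jiffies to
seconds assumes CONFIG_HZ=250, which should be checked against the
running kernel's config before the result is trusted.

```python
import re

# ASSUMPTION: 250 jiffies per second (CONFIG_HZ=250). Verify this against
# the kernel config of the affected machine; it is not stated in the report.
ASSUMED_HZ = 250

# The stall line may be wrapped by a mail client, so allow the "(t=...)"
# part to sit on the following line (re.S lets "." match a newline).
STALL_RE = re.compile(
    r"rcu_sched self-detected stall on CPU.*?\(t=(\d+) jiffies", re.S
)
# Matches e.g. "BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+?):(\d+)\]"
)

SAMPLE_CONSOLE = """\
[246736.752053] INFO: rcu_sched self-detected stall on CPU { 0}
(t=2220246 jiffies g=35399662 c=35399661 q=0)
[246764.192014] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246764.212058] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
"""


def stall_seconds(log, hz=ASSUMED_HZ):
    """Return the self-detected RCU stall duration in seconds."""
    match = STALL_RE.search(log)
    if match is None:
        raise ValueError("no self-detected rcu_sched stall found")
    return int(match.group(1)) / hz


def soft_lockups(log):
    """Return (cpu, stuck_seconds, task_name, pid) for each soft lockup."""
    return [
        (int(cpu), int(secs), task, int(pid))
        for cpu, secs, task, pid in LOCKUP_RE.findall(log)
    ]


if __name__ == "__main__":
    print("stall duration (s):", stall_seconds(SAMPLE_CONSOLE))
    for entry in soft_lockups(SAMPLE_CONSOLE):
        print("soft lockup:", entry)
```

If the HZ assumption holds, the t= value in the first line corresponds to
a stall that had lasted well over two hours by the time it was logged.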


After the first lockup and reboot, the following messages were in
dmesg. I ignored them because, after some research, I saw that they
had been changed to warnings and are considered harmless. A btrfs
scrub run afterwards completed without reporting any errors:


[ 77.609490] BTRFS error (device dm-0): block group 10766778368 has wrong amount of free space
[ 77.613678] BTRFS error (device dm-0): failed to load free space cache for block group 10766778368
[ 77.643801] BTRFS error (device dm-0): block group 19356712960 has wrong amount of free space
[ 77.648952] BTRFS error (device dm-0): failed to load free space cache for block group 19356712960
[ 77.926325] BTRFS error (device dm-0): block group 20430454784 has wrong amount of free space
[ 77.931078] BTRFS error (device dm-0): failed to load free space cache for block group 20430454784
[ 78.111437] BTRFS error (device dm-0): block group 21504196608 has wrong amount of free space
[ 78.116165] BTRFS error (device dm-0): failed to load free space cache for block group 21504196608


After I observed the lockup a second time and rebooted, these
messages appeared:


[ 45.390221] BTRFS error (device dm-0): free space inode generation (0) did not match free space cache generation (70012)
[ 45.413472] BTRFS error (device dm-0): free space inode generation (0) did not match free space cache generation (70012)
[ 467.423961] BTRFS error (device dm-0): block group 518646661120 has wrong amount of free space
[ 467.429251] BTRFS error (device dm-0): failed to load free space cache for block group 518646661120
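
To compare the free space cache messages from the two reboots, it can
help to reduce them to the affected block group numbers and the
generation pairs. The following is only a minimal Python sketch; the
helper names (unwrap, summarize) are hypothetical, and the patterns cover
just the message texts quoted in this report.

```python
import re

# A dmesg record starts with a bracketed timestamp such as "[ 45.390221]".
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]")
BLOCK_GROUP_RE = re.compile(
    r"block group (\d+) has wrong amount of free space"
)
GENERATION_RE = re.compile(
    r"free space inode generation \((\d+)\) did not match "
    r"free space cache generation \((\d+)\)"
)

SAMPLE_DMESG = """\
[ 45.390221] BTRFS error (device dm-0): free space inode generation
(0) did not match free space cache generation (70012)
[ 45.413472] BTRFS error (device dm-0): free space inode generation
(0) did not match free space cache generation (70012)
[ 467.423961] BTRFS error (device dm-0): block group 518646661120 has
wrong amount of free space
[ 467.429251] BTRFS error (device dm-0): failed to load free space
cache for block group 518646661120
"""


def unwrap(text):
    """Rejoin records that were wrapped in transit.

    A non-empty line without a leading timestamp is treated as a
    continuation of the previous record.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(text):
    """Return (block_groups, generation_pairs) found in the log text.

    block_groups keeps first-seen order without duplicates;
    generation_pairs holds (inode_generation, cache_generation) tuples.
    """
    block_groups = []
    generation_pairs = []
    for record in unwrap(text):
        match = BLOCK_GROUP_RE.search(record)
        if match and int(match.group(1)) not in block_groups:
            block_groups.append(int(match.group(1)))
        match = GENERATION_RE.search(record)
        if match:
            generation_pairs.append((int(match.group(1)), int(match.group(2))))
    return block_groups, generation_pairs


if __name__ == "__main__":
    groups, generations = summarize(SAMPLE_DMESG)
    print("block groups with a rejected cache:", groups)
    print("generation mismatches:", generations)
```

The same function works on unwrapped dmesg output, since a record that
already sits on one line simply has no continuation lines to join.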


I would like to know whether this second set of messages is harmful
and whether any remedial action is needed in response to them.
Searching for messages similar to my lockup, I found a report which
suggested the problem may be fixed in 3.14.

Any advice appreciated,

Thanks,

- Peter
