From: Chris Mason <clm@fb.com>
To: Martin Steigerwald <Martin@lichtvoll.de>, Liu Bo <bo.li.liu@oracle.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] Btrfs: fix compressed write corruption on enospc
Date: Wed, 6 Aug 2014 09:35:51 -0400
Message-ID: <53E22F37.9000308@fb.com>
In-Reply-To: <3143412.n5KSJct7YP@merkaba>
On 08/06/2014 06:21 AM, Martin Steigerwald wrote:
>> I think this should go to stable. Thanks, Liu.

I'm definitely tagging this for stable.

> Unfortunately this fix does not seem to fix all lockups.

The traces below are a little different, could you please send the whole
file?

-chris

>
> Just had a hard lockup again while the Java-based CrashPlanPROe app was
> backing up company data, which is stored on BTRFS via ecryptfs, to the
> central backup server.
>
> It basically happened on about the first heavy write I/O occasion after
> the BTRFS trees filled the complete device:
>
> I am now balancing the trees down to lower sizes manually with
>
> btrfs balance start -dusage=10 /home
>
> btrfs balance start -musage=10 /home
>
> and then raising the values. BTW, I ran out of space when trying both at the same time:
>
> merkaba:~#1> btrfs balance start -dusage=10 -musage=10 /home
> ERROR: error during balancing '/home' - No space left on device
> There may be more info in syslog - try dmesg | tail
>
> merkaba:~#1> btrfs fi sh /home
> Label: 'home' uuid: […]
> Total devices 2 FS bytes used 128.76GiB
> devid 1 size 160.00GiB used 146.00GiB path /dev/dm-0
> devid 2 size 160.00GiB used 146.00GiB path /dev/mapper/sata-home
>
> So I am pretty sure by now that hangs can best be triggered *if* BTRFS
> trees fill the complete device.
>
> I will try to keep tree sizes down as a work-around for now, even if it means
> additional write access to the SSD devices.
>
> I will also make sure tree sizes stay down on my first BTRFS server, although
> it uses the Debian backport kernel 3.14 and thus may not be affected.
>
> Are there any other fixes to try out? I would really like to see this resolved.
> It's in two stable kernel revisions already: 3.15 and 3.16. That means that,
> if it is not fixed, the next Debian stable (Jessie) will be affected by it.
>
>
> Some kern.log excerpts (I have stored the complete file):
>
> Aug 6 12:01:16 merkaba kernel: [90496.262084] INFO: task java:21301 blocked for more than 120 seconds.
> Aug 6 12:01:16 merkaba kernel: [90496.263626] Tainted: G O 3.16.0-tp520-fixcompwrite+ #3
> Aug 6 12:01:16 merkaba kernel: [90496.265159] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 6 12:01:16 merkaba kernel: [90496.266756] java D ffff880044e3cef0 0 21301 1 0x00000000
> Aug 6 12:01:16 merkaba kernel: [90496.268353] ffff8801960e3bd8 0000000000000002 ffff880407f649b0 ffff8801960e3fd8
> Aug 6 12:01:16 merkaba kernel: [90496.269980] ffff880044e3c9b0 0000000000013040 ffff880044e3c9b0 ffff88041e293040
> Aug 6 12:01:16 merkaba kernel: [90496.271766] ffff88041e5c6868 ffff8801960e3c70 0000000000000002 ffffffff810db1d9
> Aug 6 12:01:16 merkaba kernel: [90496.273383] Call Trace:
> Aug 6 12:01:16 merkaba kernel: [90496.275017] [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug 6 12:01:16 merkaba kernel: [90496.276630] [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug 6 12:01:16 merkaba kernel: [90496.278209] [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug 6 12:01:16 merkaba kernel: [90496.279817] [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug 6 12:01:16 merkaba kernel: [90496.281403] [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug 6 12:01:16 merkaba kernel: [90496.282991] [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug 6 12:01:16 merkaba kernel: [90496.284550] [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug 6 12:01:16 merkaba kernel: [90496.286156] [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.287742] [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.289344] [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.290955] [<ffffffff810dba82>] ? find_get_pages_tag+0xfc/0x123
> Aug 6 12:01:16 merkaba kernel: [90496.292574] [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.294154] [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.295760] [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.297492] [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug 6 12:01:16 merkaba kernel: [90496.299035] [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug 6 12:01:16 merkaba kernel: [90496.300561] [<ffffffff810dc8f1>] filemap_fdatawrite_range+0xe/0x10
> Aug 6 12:01:16 merkaba kernel: [90496.302118] [<ffffffffc046ea30>] btrfs_sync_file+0x67/0x2bd [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.303630] [<ffffffff810dc88a>] ? __filemap_fdatawrite_range+0x50/0x52
> Aug 6 12:01:16 merkaba kernel: [90496.305158] [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug 6 12:01:16 merkaba kernel: [90496.306669] [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug 6 12:01:16 merkaba kernel: [90496.308197] [<ffffffffc04f732a>] ecryptfs_fsync+0x2f/0x34 [ecryptfs]
> Aug 6 12:01:16 merkaba kernel: [90496.309711] [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug 6 12:01:16 merkaba kernel: [90496.311249] [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug 6 12:01:16 merkaba kernel: [90496.312771] [<ffffffff811564c9>] do_fsync+0x2c/0x45
> Aug 6 12:01:16 merkaba kernel: [90496.314288] [<ffffffff811566af>] SyS_fsync+0xb/0xf
> Aug 6 12:01:16 merkaba kernel: [90496.315800] [<ffffffff8147420b>] tracesys+0xdd/0xe2
>
>
>
> Aug 6 12:01:16 merkaba kernel: [90496.380221] INFO: task java:21563 blocked for more than 120 seconds.
> Aug 6 12:01:16 merkaba kernel: [90496.381691] Tainted: G O 3.16.0-tp520-fixcompwrite+ #3
> Aug 6 12:01:16 merkaba kernel: [90496.383192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 6 12:01:16 merkaba kernel: [90496.384687] java D ffff880038111dd0 0 21563 1 0x00000000
> Aug 6 12:01:16 merkaba kernel: [90496.386203] ffff88006df0fbd8 0000000000000002 ffffffff81a15500 ffff88006df0ffd8
> Aug 6 12:01:16 merkaba kernel: [90496.387843] ffff880038111890 0000000000013040 ffff880038111890 ffff88041e213040
> Aug 6 12:01:16 merkaba kernel: [90496.389414] ffff88041e5cc568 ffff88006df0fc70 0000000000000002 ffffffff810db1d9
> Aug 6 12:01:16 merkaba kernel: [90496.391031] Call Trace:
> Aug 6 12:01:16 merkaba kernel: [90496.392574] [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug 6 12:01:16 merkaba kernel: [90496.394154] [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug 6 12:01:16 merkaba kernel: [90496.395686] [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug 6 12:01:16 merkaba kernel: [90496.397218] [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug 6 12:01:16 merkaba kernel: [90496.398723] [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug 6 12:01:16 merkaba kernel: [90496.400232] [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug 6 12:01:16 merkaba kernel: [90496.401895] [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug 6 12:01:16 merkaba kernel: [90496.403440] [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.404942] [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.406433] [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.407950] [<ffffffff810dba82>] ? find_get_pages_tag+0xfc/0x123
> Aug 6 12:01:16 merkaba kernel: [90496.409474] [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.411020] [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.412558] [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.414102] [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug 6 12:01:16 merkaba kernel: [90496.415621] [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug 6 12:01:16 merkaba kernel: [90496.417184] [<ffffffff810dc8f1>] filemap_fdatawrite_range+0xe/0x10
> Aug 6 12:01:16 merkaba kernel: [90496.418753] [<ffffffffc046ea30>] btrfs_sync_file+0x67/0x2bd [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.420344] [<ffffffff810dc88a>] ? __filemap_fdatawrite_range+0x50/0x52
> Aug 6 12:01:16 merkaba kernel: [90496.421914] [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug 6 12:01:16 merkaba kernel: [90496.423467] [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug 6 12:01:16 merkaba kernel: [90496.425051] [<ffffffffc04f732a>] ecryptfs_fsync+0x2f/0x34 [ecryptfs]
> Aug 6 12:01:16 merkaba kernel: [90496.426593] [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug 6 12:01:16 merkaba kernel: [90496.428280] [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug 6 12:01:16 merkaba kernel: [90496.429853] [<ffffffff811564c9>] do_fsync+0x2c/0x45
> Aug 6 12:01:16 merkaba kernel: [90496.431351] [<ffffffff811566af>] SyS_fsync+0xb/0xf
> Aug 6 12:01:16 merkaba kernel: [90496.432841] [<ffffffff8147420b>] tracesys+0xdd/0xe2
>
>
>
> Aug 6 12:01:16 merkaba kernel: [90496.434306] INFO: task kworker/u8:3:21401 blocked for more than 120 seconds.
> Aug 6 12:01:16 merkaba kernel: [90496.435814] Tainted: G O 3.16.0-tp520-fixcompwrite+ #3
> Aug 6 12:01:16 merkaba kernel: [90496.437328] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 6 12:01:16 merkaba kernel: [90496.438885] kworker/u8:3 D ffff880133ebe780 0 21401 2 0x00000000
> Aug 6 12:01:16 merkaba kernel: [90496.440464] Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.442037] ffff88003e953b18 0000000000000002 ffffffff81a15500 ffff88003e953fd8
> Aug 6 12:01:16 merkaba kernel: [90496.443639] ffff880133ebe240 0000000000013040 ffff880133ebe240 ffff88041e213040
> Aug 6 12:01:16 merkaba kernel: [90496.445246] ffff88041e5bc968 ffff88003e953bb0 0000000000000002 ffffffff810db1d9
> Aug 6 12:01:16 merkaba kernel: [90496.446901] Call Trace:
> Aug 6 12:01:16 merkaba kernel: [90496.448485] [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug 6 12:01:16 merkaba kernel: [90496.450081] [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug 6 12:01:16 merkaba kernel: [90496.451682] [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug 6 12:01:16 merkaba kernel: [90496.453271] [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug 6 12:01:16 merkaba kernel: [90496.455037] [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug 6 12:01:16 merkaba kernel: [90496.456617] [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug 6 12:01:16 merkaba kernel: [90496.458203] [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug 6 12:01:16 merkaba kernel: [90496.459793] [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.461353] [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.462917] [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.464479] [<ffffffff81009e60>] ? native_sched_clock+0x3a/0x3c
> Aug 6 12:01:16 merkaba kernel: [90496.466036] [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.467632] [<ffffffff81066fbf>] ? task_group_account_field+0x3b/0x40
> Aug 6 12:01:16 merkaba kernel: [90496.469168] [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.470737] [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.472307] [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug 6 12:01:16 merkaba kernel: [90496.473885] [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug 6 12:01:16 merkaba kernel: [90496.475458] [<ffffffff810dd1b3>] filemap_flush+0x17/0x19
> Aug 6 12:01:16 merkaba kernel: [90496.477041] [<ffffffffc0465df9>] btrfs_run_delalloc_work+0x2e/0x64 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.478624] [<ffffffffc04863d7>] normal_work_helper+0xdf/0x224 [btrfs]
> Aug 6 12:01:16 merkaba kernel: [90496.480257] [<ffffffff81052d8c>] process_one_work+0x16f/0x2b8
> Aug 6 12:01:16 merkaba kernel: [90496.481977] [<ffffffff81053636>] worker_thread+0x27b/0x32e
> Aug 6 12:01:16 merkaba kernel: [90496.483544] [<ffffffff810533bb>] ? cancel_delayed_work_sync+0x10/0x10
> Aug 6 12:01:16 merkaba kernel: [90496.485082] [<ffffffff81058012>] kthread+0xb2/0xba
> Aug 6 12:01:16 merkaba kernel: [90496.486624] [<ffffffff81470000>] ? ap_handle_dropped_data+0xf/0xc8
> Aug 6 12:01:16 merkaba kernel: [90496.488148] [<ffffffff81057f60>] ? __kthread_parkme+0x62/0x62
> Aug 6 12:01:16 merkaba kernel: [90496.489719] [<ffffffff81473f6c>] ret_from_fork+0x7c/0xb0
> Aug 6 12:01:16 merkaba kernel: [90496.491265] [<ffffffff81057f60>] ? __kthread_parkme+0x62/0x62
>
>
> Ciao,
>
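
For reference, the work-around described in the quoted message (separate data and metadata balances, with the usage filter raised in steps) could be scripted roughly as below. This is only a sketch: the function name balance_stepwise and the DRY_RUN variable are made up for illustration, and any threshold other than the 10 used in the message is an assumption. Data and metadata are kept in separate invocations because the combined run is what returned ENOSPC above.

```shell
#!/bin/sh
# Sketch: run filtered balances with a rising usage threshold.
# balance_stepwise and DRY_RUN are hypothetical names, not btrfs features.
# By default the commands are only printed; set DRY_RUN=0 to run them.

balance_stepwise() {
    mnt=$1
    shift
    for pct in "$@"; do
        # Data first, then metadata, never both filters in one call.
        for filter in "-dusage=$pct" "-musage=$pct"; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "btrfs balance start $filter $mnt"
            else
                btrfs balance start "$filter" "$mnt" || return 1
            fi
        done
    done
}
```

Calling "balance_stepwise /home 10 20 30" prints the six commands it would run; with DRY_RUN=0 it stops at the first balance that fails, so an ENOSPC at a higher threshold does not get retried blindly.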
Thread overview: 24+ messages
2014-07-24 14:48 [PATCH] Btrfs: fix compressed write corruption on enospc Liu Bo
2014-07-24 14:55 ` Chris Mason
2014-07-25 1:00 ` Liu Bo
2014-07-25 9:58 ` Martin Steigerwald
2014-07-25 1:53 ` Wang Shilong
2014-07-25 2:08 ` Liu Bo
2014-07-25 2:11 ` Wang Shilong
2014-07-25 9:54 ` Martin Steigerwald
2014-08-04 12:50 ` Martin Steigerwald
2014-08-04 12:52 ` Martin Steigerwald
2014-08-06 10:21 ` Martin Steigerwald
2014-08-06 10:29 ` Hugo Mills
2014-08-06 12:28 ` Martin Steigerwald
2014-08-06 13:35 ` Chris Mason [this message]
2014-08-06 14:43 ` Martin Steigerwald
2014-08-06 15:18 ` Chris Mason
2014-08-07 0:52 ` Chris Mason
2014-08-07 7:50 ` Liu Bo
2014-08-07 8:20 ` Miao Xie
2014-08-07 14:02 ` Chris Mason
2014-08-10 14:55 ` Liu Bo
2014-08-11 20:35 ` Chris Mason
2014-08-12 2:55 ` Miao Xie
2014-08-12 7:51 ` Liu Bo