linux-btrfs.vger.kernel.org archive mirror
From: Chris Mason <clm@fb.com>
To: Martin Steigerwald <Martin@lichtvoll.de>, Liu Bo <bo.li.liu@oracle.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] Btrfs: fix compressed write corruption on enospc
Date: Wed, 6 Aug 2014 09:35:51 -0400
Message-ID: <53E22F37.9000308@fb.com>
In-Reply-To: <3143412.n5KSJct7YP@merkaba>

On 08/06/2014 06:21 AM, Martin Steigerwald wrote:

>> I think this should go to stable. Thanks, Liu.

I'm definitely tagging this for stable.

> 
> Unfortunately this fix does not seem to fix all lockups.

The traces below are a little different. Could you please send the whole
file?

-chris

> 
> Just had a hard lockup again while the Java-based CrashPlanPROe app was backing up
> company data, which is stored on BTRFS via ecryptfs, to the central backup server.
> 
> It basically happened on about the first heavy write I/O occasion after
> the BTRFS trees filled the complete device:
> 
> I am now balancing the trees down to lower sizes manually with
> 
> btrfs balance start -dusage=10 /home
> 
> btrfs balance start -musage=10 /home
> 
> and raising values. BTW I ran out of space when trying both at the same time:
> 
> merkaba:~#1> btrfs balance start -dusage=10 -musage=10 /home
> ERROR: error during balancing '/home' - No space left on device
> There may be more info in syslog - try dmesg | tail
> 
> merkaba:~#1> btrfs fi sh /home
> Label: 'home'  uuid: […]
>         Total devices 2 FS bytes used 128.76GiB
>         devid    1 size 160.00GiB used 146.00GiB path /dev/dm-0
>         devid    2 size 160.00GiB used 146.00GiB path /dev/mapper/sata-home
> 
> So I am pretty sure by now that hangs can best be triggered *if* BTRFS
> trees fill the complete device.
> 
> I will try to keep tree sizes down as a work-around for now even if it means
> additional write access towards the SSD devices.
> 
> And make sure tree sizes stay down on my first server BTRFS as well, although
> this uses Debian backport kernel 3.14 and thus may not be affected.
> 
> Are there any other fixes to try out? I would really like to see this resolved. It's
> in two stable kernel revisions already: 3.15 and 3.16. This means that,
> if not fixed, the next Debian stable (Jessie) will be affected by it.
> 
> 
> Some kern.log (have stored the complete file)
> 
> Aug  6 12:01:16 merkaba kernel: [90496.262084] INFO: task java:21301 blocked for more than 120 seconds.
> Aug  6 12:01:16 merkaba kernel: [90496.263626]       Tainted: G           O  3.16.0-tp520-fixcompwrite+ #3
> Aug  6 12:01:16 merkaba kernel: [90496.265159] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug  6 12:01:16 merkaba kernel: [90496.266756] java            D ffff880044e3cef0     0 21301      1 0x00000000
> Aug  6 12:01:16 merkaba kernel: [90496.268353]  ffff8801960e3bd8 0000000000000002 ffff880407f649b0 ffff8801960e3fd8
> Aug  6 12:01:16 merkaba kernel: [90496.269980]  ffff880044e3c9b0 0000000000013040 ffff880044e3c9b0 ffff88041e293040
> Aug  6 12:01:16 merkaba kernel: [90496.271766]  ffff88041e5c6868 ffff8801960e3c70 0000000000000002 ffffffff810db1d9
> Aug  6 12:01:16 merkaba kernel: [90496.273383] Call Trace:
> Aug  6 12:01:16 merkaba kernel: [90496.275017]  [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug  6 12:01:16 merkaba kernel: [90496.276630]  [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug  6 12:01:16 merkaba kernel: [90496.278209]  [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug  6 12:01:16 merkaba kernel: [90496.279817]  [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug  6 12:01:16 merkaba kernel: [90496.281403]  [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug  6 12:01:16 merkaba kernel: [90496.282991]  [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug  6 12:01:16 merkaba kernel: [90496.284550]  [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug  6 12:01:16 merkaba kernel: [90496.286156]  [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.287742]  [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.289344]  [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.290955]  [<ffffffff810dba82>] ? find_get_pages_tag+0xfc/0x123
> Aug  6 12:01:16 merkaba kernel: [90496.292574]  [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.294154]  [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.295760]  [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.297492]  [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug  6 12:01:16 merkaba kernel: [90496.299035]  [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug  6 12:01:16 merkaba kernel: [90496.300561]  [<ffffffff810dc8f1>] filemap_fdatawrite_range+0xe/0x10
> Aug  6 12:01:16 merkaba kernel: [90496.302118]  [<ffffffffc046ea30>] btrfs_sync_file+0x67/0x2bd [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.303630]  [<ffffffff810dc88a>] ? __filemap_fdatawrite_range+0x50/0x52
> Aug  6 12:01:16 merkaba kernel: [90496.305158]  [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug  6 12:01:16 merkaba kernel: [90496.306669]  [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug  6 12:01:16 merkaba kernel: [90496.308197]  [<ffffffffc04f732a>] ecryptfs_fsync+0x2f/0x34 [ecryptfs]
> Aug  6 12:01:16 merkaba kernel: [90496.309711]  [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug  6 12:01:16 merkaba kernel: [90496.311249]  [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug  6 12:01:16 merkaba kernel: [90496.312771]  [<ffffffff811564c9>] do_fsync+0x2c/0x45
> Aug  6 12:01:16 merkaba kernel: [90496.314288]  [<ffffffff811566af>] SyS_fsync+0xb/0xf
> Aug  6 12:01:16 merkaba kernel: [90496.315800]  [<ffffffff8147420b>] tracesys+0xdd/0xe2
> 
> 
> 
> Aug  6 12:01:16 merkaba kernel: [90496.380221] INFO: task java:21563 blocked for more than 120 seconds.
> Aug  6 12:01:16 merkaba kernel: [90496.381691]       Tainted: G           O  3.16.0-tp520-fixcompwrite+ #3
> Aug  6 12:01:16 merkaba kernel: [90496.383192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug  6 12:01:16 merkaba kernel: [90496.384687] java            D ffff880038111dd0     0 21563      1 0x00000000
> Aug  6 12:01:16 merkaba kernel: [90496.386203]  ffff88006df0fbd8 0000000000000002 ffffffff81a15500 ffff88006df0ffd8
> Aug  6 12:01:16 merkaba kernel: [90496.387843]  ffff880038111890 0000000000013040 ffff880038111890 ffff88041e213040
> Aug  6 12:01:16 merkaba kernel: [90496.389414]  ffff88041e5cc568 ffff88006df0fc70 0000000000000002 ffffffff810db1d9
> Aug  6 12:01:16 merkaba kernel: [90496.391031] Call Trace:
> Aug  6 12:01:16 merkaba kernel: [90496.392574]  [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug  6 12:01:16 merkaba kernel: [90496.394154]  [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug  6 12:01:16 merkaba kernel: [90496.395686]  [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug  6 12:01:16 merkaba kernel: [90496.397218]  [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug  6 12:01:16 merkaba kernel: [90496.398723]  [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug  6 12:01:16 merkaba kernel: [90496.400232]  [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug  6 12:01:16 merkaba kernel: [90496.401895]  [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug  6 12:01:16 merkaba kernel: [90496.403440]  [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.404942]  [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.406433]  [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.407950]  [<ffffffff810dba82>] ? find_get_pages_tag+0xfc/0x123
> Aug  6 12:01:16 merkaba kernel: [90496.409474]  [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.411020]  [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.412558]  [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.414102]  [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug  6 12:01:16 merkaba kernel: [90496.415621]  [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug  6 12:01:16 merkaba kernel: [90496.417184]  [<ffffffff810dc8f1>] filemap_fdatawrite_range+0xe/0x10
> Aug  6 12:01:16 merkaba kernel: [90496.418753]  [<ffffffffc046ea30>] btrfs_sync_file+0x67/0x2bd [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.420344]  [<ffffffff810dc88a>] ? __filemap_fdatawrite_range+0x50/0x52
> Aug  6 12:01:16 merkaba kernel: [90496.421914]  [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug  6 12:01:16 merkaba kernel: [90496.423467]  [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug  6 12:01:16 merkaba kernel: [90496.425051]  [<ffffffffc04f732a>] ecryptfs_fsync+0x2f/0x34 [ecryptfs]
> Aug  6 12:01:16 merkaba kernel: [90496.426593]  [<ffffffff81156289>] vfs_fsync_range+0x1c/0x1e
> Aug  6 12:01:16 merkaba kernel: [90496.428280]  [<ffffffff811562a2>] vfs_fsync+0x17/0x19
> Aug  6 12:01:16 merkaba kernel: [90496.429853]  [<ffffffff811564c9>] do_fsync+0x2c/0x45
> Aug  6 12:01:16 merkaba kernel: [90496.431351]  [<ffffffff811566af>] SyS_fsync+0xb/0xf
> Aug  6 12:01:16 merkaba kernel: [90496.432841]  [<ffffffff8147420b>] tracesys+0xdd/0xe2
> 
> 
> 
> Aug  6 12:01:16 merkaba kernel: [90496.434306] INFO: task kworker/u8:3:21401 blocked for more than 120 seconds.
> Aug  6 12:01:16 merkaba kernel: [90496.435814]       Tainted: G           O  3.16.0-tp520-fixcompwrite+ #3
> Aug  6 12:01:16 merkaba kernel: [90496.437328] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug  6 12:01:16 merkaba kernel: [90496.438885] kworker/u8:3    D ffff880133ebe780     0 21401      2 0x00000000
> Aug  6 12:01:16 merkaba kernel: [90496.440464] Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.442037]  ffff88003e953b18 0000000000000002 ffffffff81a15500 ffff88003e953fd8
> Aug  6 12:01:16 merkaba kernel: [90496.443639]  ffff880133ebe240 0000000000013040 ffff880133ebe240 ffff88041e213040
> Aug  6 12:01:16 merkaba kernel: [90496.445246]  ffff88041e5bc968 ffff88003e953bb0 0000000000000002 ffffffff810db1d9
> Aug  6 12:01:16 merkaba kernel: [90496.446901] Call Trace:
> Aug  6 12:01:16 merkaba kernel: [90496.448485]  [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
> Aug  6 12:01:16 merkaba kernel: [90496.450081]  [<ffffffff81470fd0>] schedule+0x64/0x66
> Aug  6 12:01:16 merkaba kernel: [90496.451682]  [<ffffffff81471157>] io_schedule+0x57/0x76
> Aug  6 12:01:16 merkaba kernel: [90496.453271]  [<ffffffff810db1e2>] sleep_on_page+0x9/0xd
> Aug  6 12:01:16 merkaba kernel: [90496.455037]  [<ffffffff8147152d>] __wait_on_bit_lock+0x41/0x85
> Aug  6 12:01:16 merkaba kernel: [90496.456617]  [<ffffffff810db29f>] __lock_page+0x70/0x7c
> Aug  6 12:01:16 merkaba kernel: [90496.458203]  [<ffffffff81070f3a>] ? autoremove_wake_function+0x2f/0x2f
> Aug  6 12:01:16 merkaba kernel: [90496.459793]  [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.461353]  [<ffffffffc0476617>] ? lock_page+0x1e/0x21 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.462917]  [<ffffffffc047a8f4>] extent_write_cache_pages.isra.21.constprop.42+0x1a7/0x2d9 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.464479]  [<ffffffff81009e60>] ? native_sched_clock+0x3a/0x3c
> Aug  6 12:01:16 merkaba kernel: [90496.466036]  [<ffffffffc047ae1c>] extent_writepages+0x46/0x57 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.467632]  [<ffffffff81066fbf>] ? task_group_account_field+0x3b/0x40
> Aug  6 12:01:16 merkaba kernel: [90496.469168]  [<ffffffffc04650b2>] ? btrfs_submit_direct+0x3ef/0x3ef [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.470737]  [<ffffffffc046361f>] btrfs_writepages+0x23/0x25 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.472307]  [<ffffffff810e556b>] do_writepages+0x1b/0x24
> Aug  6 12:01:16 merkaba kernel: [90496.473885]  [<ffffffff810dc88a>] __filemap_fdatawrite_range+0x50/0x52
> Aug  6 12:01:16 merkaba kernel: [90496.475458]  [<ffffffff810dd1b3>] filemap_flush+0x17/0x19
> Aug  6 12:01:16 merkaba kernel: [90496.477041]  [<ffffffffc0465df9>] btrfs_run_delalloc_work+0x2e/0x64 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.478624]  [<ffffffffc04863d7>] normal_work_helper+0xdf/0x224 [btrfs]
> Aug  6 12:01:16 merkaba kernel: [90496.480257]  [<ffffffff81052d8c>] process_one_work+0x16f/0x2b8
> Aug  6 12:01:16 merkaba kernel: [90496.481977]  [<ffffffff81053636>] worker_thread+0x27b/0x32e
> Aug  6 12:01:16 merkaba kernel: [90496.483544]  [<ffffffff810533bb>] ? cancel_delayed_work_sync+0x10/0x10
> Aug  6 12:01:16 merkaba kernel: [90496.485082]  [<ffffffff81058012>] kthread+0xb2/0xba
> Aug  6 12:01:16 merkaba kernel: [90496.486624]  [<ffffffff81470000>] ? ap_handle_dropped_data+0xf/0xc8
> Aug  6 12:01:16 merkaba kernel: [90496.488148]  [<ffffffff81057f60>] ? __kthread_parkme+0x62/0x62
> Aug  6 12:01:16 merkaba kernel: [90496.489719]  [<ffffffff81473f6c>] ret_from_fork+0x7c/0xb0
> Aug  6 12:01:16 merkaba kernel: [90496.491265]  [<ffffffff81057f60>] ? __kthread_parkme+0x62/0x62
> 
> 
> Ciao,
> 
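
For comparing hung-task reports like the three quoted above across a complete kern.log, a small script can reduce each report to the task name, the PID, and the reliable stack frames. The following is only a minimal sketch: the function name `summarize_hung_tasks` and the `SAMPLE` excerpt are made up for illustration (the sample lines are copied from the log above), and frames marked with `?` are skipped on the assumption that they are unreliable stack entries.

```python
import re

# Header line printed by the kernel's hung-task detector.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# A stack frame such as "[<ffffffff81470fd0>] schedule+0x64/0x66".
# Frames printed as "[<addr>] ? symbol+..." do not match and are skipped.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(lines):
    """Return one dict per hung-task report: comm, pid and list of frame symbols."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame["sym"])
    return reports


# Short excerpt of the quoted log, used only to exercise the parser.
SAMPLE = """\
Aug  6 12:01:16 merkaba kernel: [90496.262084] INFO: task java:21301 blocked for more than 120 seconds.
Aug  6 12:01:16 merkaba kernel: [90496.275017]  [<ffffffff810db1d9>] ? wait_on_page_read+0x37/0x37
Aug  6 12:01:16 merkaba kernel: [90496.276630]  [<ffffffff81470fd0>] schedule+0x64/0x66
Aug  6 12:01:16 merkaba kernel: [90496.286156]  [<ffffffffc0476617>] lock_page+0x1e/0x21 [btrfs]
Aug  6 12:01:16 merkaba kernel: [90496.434306] INFO: task kworker/u8:3:21401 blocked for more than 120 seconds.
Aug  6 12:01:16 merkaba kernel: [90496.456617]  [<ffffffff810db29f>] __lock_page+0x70/0x7c
""".splitlines()

if __name__ == "__main__":
    for report in summarize_hung_tasks(SAMPLE):
        print(report["comm"], report["pid"], " <- ".join(report["frames"]))
```

Because the patterns are applied with `search`, the same function should also work on lines that carry a leading `> ` quote marker.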

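The stepped balance described in the quoted message (start at `usage=10`, then raise the value, running data and metadata separately because the combined run hit ENOSPC) could be scripted roughly as follows. This is a dry-run sketch under stated assumptions: the helper name `balance_commands` is hypothetical, the thresholds above 10 are illustrative values that do not come from the message, and the script only prints the commands instead of executing them.

```python
import shlex


def balance_commands(mountpoint, thresholds=(10, 20, 30)):
    """Build separate data and metadata balance commands per usage threshold.

    Only the value 10 appears in the quoted message; the other defaults
    are assumptions chosen for illustration.
    """
    commands = []
    for pct in thresholds:
        # Data block groups first, then metadata, never both in one call.
        commands.append(["btrfs", "balance", "start", f"-dusage={pct}", mountpoint])
        commands.append(["btrfs", "balance", "start", f"-musage={pct}", mountpoint])
    return commands


if __name__ == "__main__":
    # Print the commands for review; nothing is run against the filesystem.
    for cmd in balance_commands("/home"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later, once the printed sequence has been checked by hand.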
Thread overview: 24+ messages
2014-07-24 14:48 [PATCH] Btrfs: fix compressed write corruption on enospc Liu Bo
2014-07-24 14:55 ` Chris Mason
2014-07-25  1:00   ` Liu Bo
2014-07-25  9:58     ` Martin Steigerwald
2014-07-25  1:53 ` Wang Shilong
2014-07-25  2:08   ` Liu Bo
2014-07-25  2:11     ` Wang Shilong
2014-07-25  9:54 ` Martin Steigerwald
2014-08-04 12:50   ` Martin Steigerwald
2014-08-04 12:52     ` Martin Steigerwald
2014-08-06 10:21     ` Martin Steigerwald
2014-08-06 10:29       ` Hugo Mills
2014-08-06 12:28         ` Martin Steigerwald
2014-08-06 13:35       ` Chris Mason [this message]
2014-08-06 14:43         ` Martin Steigerwald
2014-08-06 15:18           ` Chris Mason
2014-08-07  0:52             ` Chris Mason
2014-08-07  7:50               ` Liu Bo
2014-08-07  8:20                 ` Miao Xie
2014-08-07 14:02                   ` Chris Mason
2014-08-10 14:55                     ` Liu Bo
2014-08-11 20:35                       ` Chris Mason
2014-08-12  2:55                       ` Miao Xie
2014-08-12  7:51                         ` Liu Bo
