All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miao Xie <miaox@cn.fujitsu.com>
To: Liu Bo <bo.li.liu@oracle.com>, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] Btrfs: fix abnormal long waiting in fsync
Date: Tue, 15 Jul 2014 18:22:10 +0800	[thread overview]
Message-ID: <53C500D2.3060801@cn.fujitsu.com> (raw)
In-Reply-To: <1405416674-17208-1-git-send-email-bo.li.liu@oracle.com>

On Tue, 15 Jul 2014 17:31:14 +0800, Liu Bo wrote:
> xfstests generic/127 detected this problem.
> 
> With commit 7fc34a62ca4434a79c68e23e70ed26111b7a4cf8, now fsync will only flush
> data within the passed range.  This is the cause of the above problem,
> -- btrfs's fsync has a stage called 'sync log' which will wait for all the
> ordered extents it've recorded to finish.
> 
> In xfstests/generic/127, with mixed operations such as truncate, fallocate,
> punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
> mmap, and then msync.  And I find that msync will wait for quite a long time
> (about 20s in my case), thanks to ftrace, it turns out that the previous
> fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
> range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,

btrfs_sync_file also calls 'btrfs_wait_ordered_range()' and introduces the same
problem.

> there can be some ordered extents created but not getting corresponding pages
> flushed, then they're left in memory until we fsync which runs into the
> stage 'sync log', and fsync will just wait for the system writeback thread
> to flush those pages and get ordered extents finished, so the latency is
> inevitable.
> 
> This adds a non-blocked flush, filemap_flush(), in btrfs_sync_file() to fix
> that.

I think this fix is not so good, because it will flush the pages that is not
relative to the current sync. I think the key reason is btrfs_wait_logged_extents(),
that just wait the ordered extents, not flush the relative dirty pages.

So the more reasonable fix is to use btrfs_start_ordered_extent() instead of
wait_event in btrfs_wait_logged_extents().

(This above is just my analysis)

Thanks
Miao

> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/file.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 1f2b99c..1af395d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2002,6 +2002,8 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
>  
>  	if (ret != BTRFS_NO_LOG_SYNC) {
>  		if (!ret) {
> +			filemap_flush(inode->i_mapping);
> +
>  			ret = btrfs_sync_log(trans, root, &ctx);
>  			if (!ret) {
>  				ret = btrfs_end_transaction(trans, root);
> 


  reply	other threads:[~2014-07-15 10:20 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-15  9:31 [PATCH] Btrfs: fix abnormal long waiting in fsync Liu Bo
2014-07-15 10:22 ` Miao Xie [this message]
2014-07-15 10:41   ` constantine
2014-07-15 10:47   ` Liu Bo
2014-07-16  7:37 ` [PATCH v2] " Liu Bo
2014-07-16  9:36   ` Miao Xie
2014-07-17  3:00     ` Liu Bo
2014-07-17  8:08 ` [PATCH v3] " Liu Bo
2014-07-17 12:36   ` Chris Mason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53C500D2.3060801@cn.fujitsu.com \
    --to=miaox@cn.fujitsu.com \
    --cc=bo.li.liu@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.