From: stummala@codeaurora.org
To: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: stummala@codeaurora.org
Subject: huge fsync latencies for a small file on ext4
Date: Tue, 19 Feb 2019 15:50:23 +0530
Message-ID: <b1ba05256affe171c2afca6c3aceeb7a@codeaurora.org>

Hi,

I am observing huge fsync latencies for a small file under the test
scenario below -

process A -
Issue an async write of 4GB using the dd command (say, large_file) on
/data mounted with ext4:
dd if=/dev/zero of=/data/testfile bs=1M count=4096

process B -
In parallel, another process writes a small 4KB of data to another file
(say, small_file) and issues fsync() on this file.
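For reference, process B can be approximated by a small userspace
program along the lines of the sketch below. This is only an
illustration, not the actual test code: the function name
timed_small_fsync() and the choice of CLOCK_MONOTONIC for timing are
assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes to `path`, then fsync() the file
 * and report how long the fsync() call alone took, in milliseconds.
 * Returns 0 on success and -1 on any I/O error.
 */
static int timed_small_fsync(const char *path, size_t len, double *fsync_ms)
{
	char buf[4096];
	struct timespec t0, t1;
	size_t done = 0;
	int fd;

	assert(path != NULL && fsync_ms != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	memset(buf, 'a', sizeof(buf));
	while (done < len) {
		size_t chunk = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = write(fd, buf, chunk);

		if (n <= 0) {
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}

	/* Time only the fsync(), which is the latency of interest here. */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	*fsync_ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
		    (t1.tv_nsec - t0.tv_nsec) / 1e6;
	return 0;
}
```

Calling timed_small_fsync("/data/small_file", 4096, &ms) in a loop while
the dd from process A is running, and keeping the maximum of ms, would
give a worst-case figure comparable to the one reported below.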

Problem -
The fsync() on the 4KB file takes up to ~30 sec (worst-case latency).
This was tested on an eMMC-based device.

Observations -
This happens when small_file and large_file are both part of the same
committing transaction, or when small_file is part of the running
transaction while large_file is part of the committing transaction.

During the commit of a transaction that includes large_file, the jbd2
thread runs journal_finish_inode_data_buffers(), which calls
filemap_fdatawait_keep_errors() on the file's inode address space. If
the writeback thread is running in parallel for large_file while this
is happening, filemap_fdatawait_keep_errors() can end up looping over
all the pages (up to 4GB of data) and waiting for all of the file's data
to be written to disk in the context of the current transaction. At the
time journal_finish_inode_data_buffers() is called, the file size is
only 150MB. By the time filemap_fdatawait_keep_errors() returns, the
file size is 4GB and the page index in __filemap_fdatawait_range() also
points to the 4GB file offset, indicating that it has scanned and waited
for writeback of all the pages up to 4GB and not just 150MB.
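As a rough illustration of the effect described above, here is a
userspace toy model (not kernel code). It assumes that the writer tags
new pages for writeback faster than the waiter retires them; the names
simulate_wait(), struct sim_result and the growth_per_step parameter are
made up for this sketch, and one "page" stands for 1MB to keep the
numbers small.

```c
#include <stdio.h>

struct sim_result {
	unsigned long pages_waited;	/* pages the waiter had to wait on */
	unsigned long final_size;	/* file size when the waiter returned */
};

/*
 * Toy model: the waiter walks pages from index 0 while a parallel writer
 * keeps growing the file by `growth_per_step` pages per page waited on,
 * up to `max_pages`. If `bound_to_snapshot` is set, the waiter stops at
 * the size seen when it started; otherwise it chases the growing file.
 */
static struct sim_result simulate_wait(unsigned long start_pages,
				       unsigned long max_pages,
				       unsigned long growth_per_step,
				       int bound_to_snapshot)
{
	unsigned long size = start_pages;
	unsigned long end = bound_to_snapshot ? start_pages : (unsigned long)-1;
	unsigned long index = 0;
	struct sim_result res;

	while (index < size && index < end) {
		index++;		/* wait for one page under writeback */
		if (size < max_pages) {	/* writer tags more pages meanwhile */
			size += growth_per_step;
			if (size > max_pages)
				size = max_pages;
		}
	}

	res.pages_waited = index;
	res.final_size = size;
	return res;
}

static void print_result(const char *label, struct sim_result r)
{
	printf("%s: waited on %lu, final size %lu\n",
	       label, r.pages_waited, r.final_size);
}
```

With simulate_wait(150, 4096, 2, 0) the unbounded waiter ends up waiting
on all 4096 units, whereas simulate_wait(150, 4096, 2, 1) stops after
the 150 units that existed when the wait began.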

Ideally, I think the jbd2 thread should wait only for the data it has
submitted as part of the current transaction, and not for pages that
are being tagged for writeback in parallel from another context.
Along these lines, I have tried using the inode's size at the time of
calling journal_finish_inode_data_buffers(), as below -

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 2eb55c3..e86cf67 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -261,8 +261,8 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
                         continue;
                 jinode->i_flags |= JI_COMMIT_RUNNING;
                 spin_unlock(&journal->j_list_lock);
-               err = filemap_fdatawait_keep_errors(
-                               jinode->i_vfs_inode->i_mapping);
+               err = filemap_fdatawait_range(jinode->i_vfs_inode->i_mapping,
+                               0, i_size_read(jinode->i_vfs_inode->i_mapping->host));
                 if (!ret)
                         ret = err;
                 spin_lock(&journal->j_list_lock);

With this change, the fsync latencies for small_file are reduced
significantly: the worst-case latency dropped to ~5 sec.

Although this was seen with test code, it could potentially impact a
phone's performance if any application or the main UI thread in Android
issues fsync() in the foreground while a large data transfer is going on
in another context.

Please share your thoughts and comments on this issue and on the fix
suggested above.

Thanks,
Sahitya.

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project.
