linux-fsdevel.vger.kernel.org archive mirror
From: Vivek Goyal <vgoyal@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, jaxboe@fusionio.com,
	linux-fsdevel@vger.kernel.org, andrea@betterlinux.com,
	linux-ext4@vger.kernel.org
Subject: fsync serialization on ext4 with blkio throttling (Was: Re: [PATCH 0/8][V2] blk-throttle: Throttle buffered WRITEs in balance_dirty_pages())
Date: Thu, 30 Jun 2011 16:04:59 -0400
Message-ID: <20110630200459.GI27889@redhat.com>
In-Reply-To: <20110629015336.GA19082@redhat.com>

On Tue, Jun 28, 2011 at 09:53:36PM -0400, Vivek Goyal wrote:

[..]
> > FYI, filesystem development cycles are slow and engineers are
> > conservative because of the absolute requirement for data integrity.
> > Hence we tend to focus development on problems that users are
> > reporting (i.e. known pain points) or functionality they have
> > requested.
> > 
> > In this case, block throttling works OK on most filesystems out of
> > the box, but it has some known problems. If there are people out
> > there hitting these known problems then they'll report them, we'll
> > hear about them and they'll eventually get fixed.
> > 
> > However, if no-one is reporting problems related to block throttling
> > then it either works well enough for the existing user base or
> > nobody is using the functionality. Either way we don't need to spend
> > time on optimising the filesystem for such functionality.
> > 
> > So while you may be skeptical about whether filesystems will be
> > changed, it really comes down to behaviour in real-world
> > deployments. If what we already have is good enough, then we don't
> > need to spend resources on fixing problems no-one is seeing...
>

[CC linux-ext4 list]

Dave,

Just another example where serialization is taking place with ext4.

I created a group with a 1MB/s write limit and ran Ted Ts'o's fsync tester
program with a small modification: I used the write() system call instead
of pwrite() so that the file size grows. The program basically writes
1MB of data, fsync's it, and measures the fsync time.
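
To illustrate why the switch from pwrite() to write() matters, here is a
minimal sketch, not part of the original test. It assumes the unmodified
tester rewrites a fixed offset (offset 0 is an assumption here); the helper
name size_after_rounds(), the CHUNK size and the /tmp path are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096      /* small stand-in for the 1MB buffer (assumption) */

/*
 * Hypothetical helper: write CHUNK bytes 'rounds' times to a fresh
 * temporary file and return the final file size, or -1 on error.
 *   use_pwrite != 0: pwrite() at offset 0 every time, file is overwritten.
 *   use_pwrite == 0: write() at the advancing file offset, file grows.
 */
static off_t size_after_rounds(int use_pwrite, int rounds)
{
        char path[] = "/tmp/fsync-grow-XXXXXX";
        char buf[CHUNK];
        struct stat st;
        int fd, i;

        assert(rounds > 0);

        fd = mkstemp(path);
        if (fd < 0)
                return -1;
        unlink(path);           /* file goes away once fd is closed */

        memset(buf, 'a', sizeof(buf));
        for (i = 0; i < rounds; i++) {
                ssize_t n = use_pwrite ? pwrite(fd, buf, sizeof(buf), 0)
                                       : write(fd, buf, sizeof(buf));
                if (n != (ssize_t) sizeof(buf)) {
                        close(fd);
                        return -1;
                }
        }
        if (fstat(fd, &st) < 0) {
                close(fd);
                return -1;
        }
        close(fd);
        return st.st_size;
}
```

With pwrite() at a fixed offset the size stays at one chunk no matter how
many rounds run, while write() extends the file by one chunk per round, so
each fsync also has to commit a size change.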

I ran two instances of the program in two groups on two separate files. One
instance is throttled to 1MB/s and the other is in the root group, unthrottled.
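
The mail does not show how the 1MB/s limit was configured. As a rough
sketch only: the cgroup v1 blkio controller takes per-device limits as
"major:minor bytes_per_second" written to blkio.throttle.write_bps_device.
The helper names below and any cgroup mount path passed to them are
hypothetical, not taken from the original setup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format a throttle rule "major:minor bps" for the
 * given device into 'out'. Returns 0 on success, -1 if it does not fit.
 */
static int format_bps_rule(char *out, size_t len, dev_t dev,
                           unsigned long long bps)
{
        int n;

        assert(out != NULL);
        n = snprintf(out, len, "%u:%u %llu", major(dev), minor(dev), bps);
        return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write the rule into a cgroup control file such as
 * <blkio mount>/<group>/blkio.throttle.write_bps_device (path is up to
 * the caller; writing there normally needs root). Returns 0 or -1.
 */
static int apply_bps_rule(const char *cgroup_file, const char *rule)
{
        size_t len = strlen(rule);
        FILE *f = fopen(cgroup_file, "w");

        if (!f)
                return -1;
        if (fwrite(rule, 1, len, f) != len) {
                fclose(f);
                return -1;
        }
        return fclose(f) == 0 ? 0 : -1;
}
```

A 1MB/s limit on device 8:16 would then be the rule "8:16 1048576"; the
throttled instance additionally has to be placed into that group, while
the other instance stays in the root group.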

The unthrottled program gets serialized behind the throttled one. The
fsync times (in seconds) follow.

Throttled instance	Unthrottled Instance
------------------ 	--------------------
fsync time: 1.0051	fsync time: 1.0067
fsync time: 1.0049	fsync time: 1.0075
fsync time: 1.0048	fsync time: 1.0063
fsync time: 1.0073	fsync time: 1.0062
fsync time: 1.0070	fsync time: 1.0078
fsync time: 1.0032	fsync time: 1.0049
fsync time: 0.0154	fsync time: 1.0068
fsync time: 0.0137	fsync time: 1.0048

Without any throttling both the instances do fine
-------------------------------------------------
Throttled instance	Unthrottled Instance
------------------ 	--------------------
fsync time: 0.0139	fsync time: 0.0162
fsync time: 0.0132	fsync time: 0.0156
fsync time: 0.0149	fsync time: 0.0169
fsync time: 0.0165	fsync time: 0.0152
fsync time: 0.0188	fsync time: 0.0135
fsync time: 0.0137	fsync time: 0.0142
fsync time: 0.0148	fsync time: 0.0149
fsync time: 0.0168	fsync time: 0.0163
fsync time: 0.0153	fsync time: 0.0143

So when we are increasing the size of a file and fsyncing it, other
unthrottled instances doing similar activity end up effectively throttled
behind it.

IMHO, this is a problem and should be fixed. If the filesystem can fix it, great.
But if not, then we should consider the option of throttling buffered writes
in balance_dirty_pages().

Following is the test program.

/*
 * fsync-tester.c
 *
 * Written by Theodore Ts'o, 3/21/09.
 *
 * This file may be redistributed under the terms of the GNU Public
 * License, version 2.
 */

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>   /* struct timeval, gettimeofday() */
#include <fcntl.h>
#include <string.h>

#define SIZE (1024*1024)

static float timeval_subtract(struct timeval *tv1, struct timeval *tv2)
{
        return ((tv1->tv_sec - tv2->tv_sec) +
                ((float) (tv1->tv_usec - tv2->tv_usec)) / 1000000);
}

int main(int argc, char **argv)
{
        int     fd;
        struct timeval tv, tv2;
        char buf[SIZE];

        /* O_CREAT requires an explicit mode argument */
        fd = open("fsync-tester.tst-file", O_WRONLY|O_CREAT, 0644);
        if (fd < 0) {
                perror("open");
                exit(1);
        }
        memset(buf, 'a', SIZE);
        while (1) {
                /* append 1MB so the file grows on every iteration */
                if (write(fd, buf, SIZE) < 0) {
                        perror("write");
                        exit(1);
                }
                gettimeofday(&tv, NULL);
                fsync(fd);
                gettimeofday(&tv2, NULL);
                printf("fsync time: %5.4f\n",
                       timeval_subtract(&tv2, &tv));
                sleep(1);
        }
}

Thanks
Vivek
