From: Vivek Goyal <vgoyal@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, jaxboe@fusionio.com,
	linux-fsdevel@vger.kernel.org, andrea@betterlinux.com,
	linux-ext4@vger.kernel.org
Subject: fsync serialization on ext4 with blkio throttling (Was: Re: [PATCH 0/8][V2] blk-throttle: Throttle buffered WRITEs in balance_dirty_pages())
Date: Thu, 30 Jun 2011 16:04:59 -0400
Message-ID: <20110630200459.GI27889@redhat.com>
In-Reply-To: <20110629015336.GA19082@redhat.com>

On Tue, Jun 28, 2011 at 09:53:36PM -0400, Vivek Goyal wrote:

[..]
> > FYI, filesystem development cycles are slow and engineers are
> > conservative because of the absolute requirement for data integrity.
> > Hence we tend to focus development on problems that users are
> > reporting (i.e. known pain points) or functionality they have
> > requested.
> > 
> > In this case, block throttling works OK on most filesystems out of
> > the box, but it has some known problems. If there are people out
> > there hitting these known problems then they'll report them, we'll
> > hear about them and they'll eventually get fixed.
> > 
> > However, if no-one is reporting problems related to block throttling
> > then it either works well enough for the existing user base or
> > nobody is using the functionality. Either way we don't need to spend
> > time on optimising the filesystem for such functionality.
> > 
> > So while you may be skeptical about whether filesystems will be
> > changed, it really comes down to behaviour in real-world
> > deployments. If what we already have is good enough, then we don't
> > need to spend resources on fixing problems no-one is seeing...
>

[CC linux-ext4 list]

Dave,

Here is another example of serialization taking place with ext4.

I created a group with a 1MB/s write limit and ran Ted Ts'o's fsync tester
program with a small modification: I used the write() system call instead
of pwrite() so that the file size grows. The program writes 1MB of data,
fsyncs it, and measures how long the fsync takes.

I ran two instances of the program in two groups, on two separate files. One
instance is throttled to 1MB/s and the other is in the root group, unthrottled.

The unthrottled program gets serialized behind the throttled one. The
fsync times are as follows.

Throttled instance	Unthrottled Instance
------------------ 	--------------------
fsync time: 1.0051	fsync time: 1.0067
fsync time: 1.0049	fsync time: 1.0075
fsync time: 1.0048	fsync time: 1.0063
fsync time: 1.0073	fsync time: 1.0062
fsync time: 1.0070	fsync time: 1.0078
fsync time: 1.0032	fsync time: 1.0049
fsync time: 0.0154	fsync time: 1.0068
fsync time: 0.0137	fsync time: 1.0048

Without any throttling both the instances do fine
-------------------------------------------------
Throttled instance	Unthrottled Instance
------------------ 	--------------------
fsync time: 0.0139	fsync time: 0.0162
fsync time: 0.0132	fsync time: 0.0156
fsync time: 0.0149	fsync time: 0.0169
fsync time: 0.0165	fsync time: 0.0152
fsync time: 0.0188	fsync time: 0.0135
fsync time: 0.0137	fsync time: 0.0142
fsync time: 0.0148	fsync time: 0.0149
fsync time: 0.0168	fsync time: 0.0163
fsync time: 0.0153	fsync time: 0.0143

So when we are increasing the size of a file and fsyncing it, other
unthrottled instances doing similar work end up throttled behind it.

IMHO, this is a problem and should be fixed. If the filesystem can fix it,
great. But if not, then we should consider the option of throttling buffered
writes in balance_dirty_pages().

Following is the test program.

/*
 * fsync-tester.c
 *
 * Written by Theodore Ts'o, 3/21/09.
 *
 * This file may be redistributed under the terms of the GNU Public
 * License, version 2.
 */

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <time.h>
#include <fcntl.h>
#include <string.h>

#define SIZE (1024*1024)

static float timeval_subtract(struct timeval *tv1, struct timeval *tv2)
{
        return ((tv1->tv_sec - tv2->tv_sec) +
                ((float) (tv1->tv_usec - tv2->tv_usec)) / 1000000);
}

int main(int argc, char **argv)
{
        int     fd;
        struct timeval tv, tv2;
        char buf[SIZE];

        /* O_CREAT requires a mode argument */
        fd = open("fsync-tester.tst-file", O_WRONLY|O_CREAT, 0644);
        if (fd < 0) {
                perror("open");
                exit(1);
        }
        memset(buf, 'a', SIZE);
        while (1) {
                /* write() rather than pwrite(), so the file grows */
                if (write(fd, buf, SIZE) != SIZE) {
                        perror("write");
                        exit(1);
                }
                gettimeofday(&tv, NULL);
                fsync(fd);
                gettimeofday(&tv2, NULL);
                printf("fsync time: %5.4f\n", timeval_subtract(&tv2, &tv));
                sleep(1);
        }
}

Thanks
Vivek
