From: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
To: "Andrew Morton" <akpm@linux-foundation.org>,
"Bryan Henderson" <hbryan@us.ibm.com>,
"Jörn Engel" <joern@logfs.org>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] ext3,4:fdatasync should skip metadata writeout
Date: Tue, 20 Nov 2007 16:20:56 +0900
Message-ID: <6.0.0.20.2.20071120142849.03f5f5e0@172.19.0.2>
In-Reply-To: <20071115185919.7df4cda9.akpm@linux-foundation.org>
At 11:59 07/11/16, Andrew Morton wrote:
>
>I suppose so. Although one wonders what earthly point there is in syncing
>a file's data if we haven't yet written out the metadata which is required
>for locating that data.
>
>IOW, fdatasync() is only useful if the application knows that it is overwriting
>already-instantiated blocks.
>
>In which case it might as well have used fsync(). For ext2-style filesystems,
>anyway.
>
>hm. It needs some thought.
I ran a test to measure the difference in file-overwrite performance between
the original fdatasync and one that skips the journal flush.
The test program and the results are as follows:
Test program source code:
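#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

#define BUFSIZE 8192
#define LOOP (1024*1024)

int main(void)
{
	int i;
	int fd;
	char buf[BUFSIZE];
	time_t t1, t2;

	memset(buf, 0, BUFSIZE);
	/* O_CREAT requires a mode argument. */
	fd = open("testfile", O_CREAT|O_RDWR, 0644);
	if (fd < 0) {
		perror("cannot open file");
		exit(1);
	}

	/* Instantiate all blocks first so the timed loop only overwrites. */
	for (i = 0; i < LOOP; i++)
		write(fd, buf, BUFSIZE);
	fsync(fd);
	lseek(fd, 0, SEEK_SET);

	/* Timed loop: overwrite each block and fdatasync after every write. */
	time(&t1);
	for (i = 0; i < LOOP; i++) {
		write(fd, buf, BUFSIZE);
		fdatasync(fd);
	}
	time(&t2);
	printf("%ld sec\n", (long)(t2 - t1));

	close(fd);
	return 0;
}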
Result:
2.6.24-rc3:
264 sec
2.6.23-rc3-fdatasync-skips-journal-flush-patched:
253 sec
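To put the two totals in per-call terms, here is a small sketch of the arithmetic. The helper names are hypothetical and only illustrate the calculation. Since time() has one-second resolution and each figure comes from a single run, the derived numbers are rough.

```c
#include <assert.h>

/* Loop count of the test program: 1024 * 1024 write+fdatasync pairs. */
#define ITERATIONS (1024L * 1024L)

/* Average wall-clock cost of one write()+fdatasync() pair, in microseconds. */
double usec_per_iteration(double total_sec, long iterations)
{
	assert(iterations > 0);
	return total_sec * 1e6 / (double)iterations;
}

/* Reduction in run time, as a percentage of the baseline run time. */
double percent_saved(double baseline_sec, double patched_sec)
{
	assert(baseline_sec > 0.0);
	return (baseline_sec - patched_sec) * 100.0 / baseline_sec;
}
```

With the totals above, usec_per_iteration() gives roughly 252 usec per iteration for the unpatched kernel and 241 usec for the patched one, and percent_saved(264, 253) is about 4.2.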
Hardware environment:
Dell Poweredge 850
CPU Pentium D 3GHz
memory 4GB
HDD Maxtor 6L160M0
I got a somewhat better result from the patched ext3, which skips the
journal flush.
Some DBMSs, such as PostgreSQL, can use fdatasync, so I think skipping the
journal flush on overwrite improves performance for these applications.
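The access pattern I have in mind is a log-style file that is preallocated once and then rewritten in place. A minimal sketch, assuming a fixed block size; preallocate_file(), overwrite_block() and BLOCK_SIZE are hypothetical names for illustration and are not taken from any DBMS:

```c
#define _POSIX_C_SOURCE 200809L	/* for pwrite() and fdatasync() */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192	/* same size as BUFSIZE in the test program */

/*
 * Create `path` with `nblocks` zero-filled blocks and fsync() it once, so
 * that the blocks and the file size are on disk before any overwrite.
 * Returns an open descriptor, or -1 on error.
 */
int preallocate_file(const char *path, int nblocks)
{
	char zero[BLOCK_SIZE];
	int fd, i;

	assert(nblocks > 0);
	memset(zero, 0, sizeof(zero));
	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0644);
	if (fd < 0)
		return -1;
	for (i = 0; i < nblocks; i++) {
		if (write(fd, zero, sizeof(zero)) != (ssize_t)sizeof(zero)) {
			close(fd);
			return -1;
		}
	}
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Overwrite block `index` in place and wait until the data is on disk.
 * The file size does not change and no new blocks are needed here.
 * Returns 0 on success, -1 on error.
 */
int overwrite_block(int fd, int index, const char *buf)
{
	off_t off = (off_t)index * BLOCK_SIZE;

	if (pwrite(fd, buf, BLOCK_SIZE, off) != BLOCK_SIZE)
		return -1;
	return fdatasync(fd);
}
```

Every overwrite_block() call pays for one fdatasync(), so any journal flush that can be skipped on that path is saved once per call.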
I agree that skipping metadata writeout unconditionally is wrong:
"important metadata" such as i_size and the block bitmap should be
synced even when fdatasync is issued, but unimportant metadata such as
mtime and ctime updates can be ignored when a file is overwritten.