From mboxrd@z Thu Jan 1 00:00:00 1970 From: Li Zefan Subject: Re: [RFC PATCH] Btrfs: do not flush csum items of unchanged file data during treelog Date: Fri, 22 Apr 2011 08:55:40 +0800 Message-ID: <4DB0D20C.9060407@cn.fujitsu.com> References: <4DAFE39D.4040309@cn.fujitsu.com> <1303391634-sup-1145@think> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: liubo , Linux Btrfs , Josef Bacik To: Chris Mason Return-path: In-Reply-To: <1303391634-sup-1145@think> List-ID: Chris Mason wrote: > Excerpts from liubo's message of 2011-04-21 03:58:21 -0400: >> >> The current code relogs the entire inode every time during fsync log, >> and it is much better suited to small files rather than large ones. >> >> During my performance test, the fsync performace of large files sucks, >> and we can ascribe this to the tremendous amount of csum infos of the >> large ones, cause we have to flush all of these csum infos into log trees >> even when there are only _one_ change in the whole file data. Apparently, >> to optimize fsync, we need to create a filter to skip the unnecessary csum >> ones, that is, the corresponding file data remains unchanged before this fsync. >> >> Here I have some test results to show, I use sysbench to do "random write + fsync". >> >> Sysbench args: >> - Number of threads: 1 >> - Extra file open flags: 0 >> - 2 files, 4Gb each >> - Block size 4Kb >> - Number of random requests for random IO: 10000 >> - Read/Write ratio for combined random IO test: 1.50 >> - Periodic FSYNC enabled, calling fsync() each 100 requests. >> - Calling fsync() at the end of test, Enabled. >> - Using synchronous I/O mode >> - Doing random write test >> >> Sysbench results: >> === >> Operations performed: 0 Read, 10000 Write, 200 Other = 10200 Total >> Read 0b Written 39.062Mb Total transferred 39.062Mb >> === >> a) without patch: (*SPEED* : 451.01Kb/sec) >> 112.75 Requests/sec executed >> >> b) with patch: (*SPEED* : 5.1537Mb/sec) >> 1319.34 Requests/sec executed > > Really nice results! Especially considering the small size of the patch. > > But, I'd really like to look at using sub transaction ids for this, and > then logging just the part of the inode that had changed since the last > log commit. It's more complex, but will also help reduce tree searches > for the file items. > And this patch forgot to mention it has compatability issue.