From: Robert Yang <liezhi.yang@windriver.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: <tytso@mit.edu>, <dvhart@linux.intel.com>, <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 2/2] debugfs.c: do sparse copy when src is a sparse file
Date: Sun, 21 Jul 2013 10:38:12 +0800 [thread overview]
Message-ID: <51EB4994.3050703@windriver.com> (raw)
In-Reply-To: <20130719185515.GC5785@blackbox.djwong.org>
Hi Darrick,
Thanks for your reply, it seems that 64K is a good idea since put 128K
on the stack might cause problems, I will wait for one or two days for
more comments on other parts of the patches, then send a V2 with the
updates.
// Robert
On 07/20/2013 02:55 AM, Darrick J. Wong wrote:
> On Fri, Jul 19, 2013 at 10:17:37AM +0800, Robert Yang wrote:
>> Let debugfs do sparse copy when src is a sparse file, just like
>> "cp --sparse=auto"
>>
>> * For the
>> #define IO_BUFSIZE 32*1024
>>
>> This is from coreutils-8.13/src/ioblksize.h (GPL V3):
>> /* As of Mar 2009, 32KiB is determined to be the minimium
>> blksize to best minimize system call overhead.
>> This can be tested with this script with the results
>> shown for a 1.7GHz pentium-m with 2GB of 400MHz DDR2 RAM:
>
> Um.... GNU updated this to 64K a couple of years ago:
> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/ioblksize.h;h=1ae93255e7d0ccf0855208c7ae5888209997bf16;hb=HEAD
>
> Just for laughs I tried it on a T430 with an i5-3320M and 16G of DDR3-1600 RAM:
>
> 1024=3.7 GB/s
> 2048=7.1 GB/s
> 4096=8.8 GB/s
> 8192=14.9 GB/s
> 16384=14.3 GB/s
> 32768=13.4 GB/s
> 65536=15.8 GB/s
> 131072=20.7 GB/s
> 262144=16.4 GB/s
> 524288=15.9 GB/s
> 1048576=15.8 GB/s
> 2097152=15.1 GB/s
> 4194304=11.7 GB/s
> 8388608=9.9 GB/s
> 16777216=9.4 GB/s
> 33554432=9.3 GB/s
> 67108864=9.3 GB/s
> 134217728=8.8 GB/s
>
> For that matter, a 2010-era i7-950/DDR3-1066 system showed this:
>
> 1024=3.4 GB/s
> 2048=5.6 GB/s
> 4096=7.8 GB/s
> 8192=9.5 GB/s
> 16384=10.8 GB/s
> 32768=11.4 GB/s
> 65536=11.6 GB/s
> 131072=12.2 GB/s
> 262144=11.9 GB/s
> 524288=12.3 GB/s
> 1048576=12.4 GB/s
> 2097152=12.5 GB/s
> 4194304=12.5 GB/s
> 8388608=10.3 GB/s
> 16777216=8.0 GB/s
> 33554432=7.6 GB/s
> 67108864=7.8 GB/s
> 134217728=7.5 GB/s
>
> And for good measure, a cruddy old T2300 Core Duo from 2006 spat out this:
>
> 1024=1.1 GB/s
> 2048=2.1 GB/s
> 4096=3.6 GB/s
> 8192=5.0 GB/s
> 16384=6.3 GB/s
> 32768=6.5 GB/s
> 65536=6.6 GB/s
> 131072=7.0 GB/s
> 262144=7.1 GB/s
> 524288=7.1 GB/s
> 1048576=6.8 GB/s
> 2097152=4.4 GB/s
> 4194304=2.3 GB/s
> 8388608=2.0 GB/s
> 16777216=2.0 GB/s
> 33554432=2.0 GB/s
> 67108864=2.0 GB/s
> 134217728=1.9 GB/s
>
> I suspect you could increase the buffer size to 128K (or possibly even BLKRAGET
> size?) without much of a problem...
>
>>
>> for i in $(seq 0 10); do
>> size=$((8*1024**3)) #ensure this is big enough
>> bs=$((1024*2**$i))
>> printf "%7s=" $bs
>> dd bs=$bs if=/dev/zero of=/dev/null count=$(($size/$bs)) 2>&1 |
>> sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
>> done
>>
>> 1024=734 MB/s
>> 2048=1.3 GB/s
>> 4096=2.4 GB/s
>> 8192=3.5 GB/s
>> 16384=3.9 GB/s
>> 32768=5.2 GB/s
>> 65536=5.3 GB/s
>> 131072=5.5 GB/s
>> 262144=5.7 GB/s
>> 524288=5.7 GB/s
>> 1048576=5.8 GB/s
>>
>> Note that this is to minimize system call overhead.
>> Other values may be appropriate to minimize file system
>> or disk overhead. For example on my current GNU/Linux system
>> the readahead setting is 128KiB which was read using:
>>
>> file="."
>> device=$(df -P --local "$file" | tail -n1 | cut -d' ' -f1)
>> echo $(( $(blockdev --getra $device) * 512 ))
>>
>> However there isn't a portable way to get the above.
>> In the future we could use the above method if available
>> and default to io_blksize() if not.
>> */
>> enum { IO_BUFSIZE = 32*1024 };
>>
>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>> Acked-by: Darren Hart <dvhart@linux.intel.com>
>> ---
>> debugfs/debugfs.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 60 insertions(+), 10 deletions(-)
>>
>> diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
>> index b77d0b5..e443703 100644
>> --- a/debugfs/debugfs.c
>> +++ b/debugfs/debugfs.c
>> @@ -37,6 +37,16 @@ extern char *optarg;
>> #include "../version.h"
>> #include "jfs_user.h"
>>
>> +/* 32KiB is the minimium blksize to best minimize system call overhead. */
>> +#ifndef IO_BUFSIZE
>> +#define IO_BUFSIZE 32*1024
>> +#endif
>> +
>> +/* Block size for `st_blocks' */
>> +#ifndef S_BLKSIZE
>> +#define S_BLKSIZE 512
>> +#endif
>> +
>> ss_request_table *extra_cmds;
>> const char *debug_prog_name;
>> int sci_idx;
>> @@ -1571,14 +1581,17 @@ void do_find_free_inode(int argc, char *argv[])
>> }
>>
>> #ifndef READ_ONLY
>> -static errcode_t copy_file(int fd, ext2_ino_t newfile)
>> +static errcode_t copy_file(int fd, ext2_ino_t newfile, int bufsize,
>> + int make_holes, int *zero_written)
>> {
>> ext2_file_t e2_file;
>> errcode_t retval;
>> int got;
>> unsigned int written;
>> - char buf[8192];
>> + char buf[bufsize];
>
> ...well, I guess it could be more of a problem if you put 128K on the stack.
>
> --D
>
>> char *ptr;
>> + char *cp;
>> + int count;
>>
>> retval = ext2fs_file_open(current_fs, newfile,
>> EXT2_FILE_WRITE, &e2_file);
>> @@ -1594,14 +1607,30 @@ static errcode_t copy_file(int fd, ext2_ino_t newfile)
>> goto fail;
>> }
>> ptr = buf;
>> + cp = ptr;
>> + count = got;
>> while (got > 0) {
>> - retval = ext2fs_file_write(e2_file, ptr,
>> - got, &written);
>> - if (retval)
>> - goto fail;
>> -
>> - got -= written;
>> - ptr += written;
>> + if (make_holes) {
>> + /* Check whether all is zero */
>> + while (count-- && *cp++ == 0)
>> + continue;
>> + if (count < 0) {
>> + /* The whole block is zero, make a hole */
>> + retval = ext2fs_file_lseek(e2_file, got, EXT2_SEEK_CUR, NULL);
>> + if (retval)
>> + goto fail;
>> + got = 0;
>> + }
>> + }
>> + /* Normal copy */
>> + if (got > 0) {
>> + *zero_written = 0;
>> + retval = ext2fs_file_write(e2_file, ptr, got, &written);
>> + if (retval)
>> + goto fail;
>> + got -= written;
>> + ptr += written;
>> + }
>> }
>> }
>> retval = ext2fs_file_close(e2_file);
>> @@ -1620,6 +1649,9 @@ void do_write(int argc, char *argv[])
>> ext2_ino_t newfile;
>> errcode_t retval;
>> struct ext2_inode inode;
>> + int bufsize = IO_BUFSIZE;
>> + int make_holes = 0;
>> + int zero_written = 1;
>>
>> if (common_args_process(argc, argv, 3, 3, "write",
>> "<native file> <new file>", CHECK_FS_RW))
>> @@ -1684,9 +1716,27 @@ void do_write(int argc, char *argv[])
>> return;
>> }
>> if (LINUX_S_ISREG(inode.i_mode)) {
>> - retval = copy_file(fd, newfile);
>> + if (statbuf.st_blocks < statbuf.st_size / S_BLKSIZE) {
>> + make_holes = 1;
>> + /*
>> + * Use I/O blocksize as buffer size when
>> + * copying sparse files.
>> + */
>> + bufsize = statbuf.st_blksize;
>> + }
>> + retval = copy_file(fd, newfile, bufsize, make_holes, &zero_written);
>> if (retval)
>> com_err("copy_file", retval, 0);
>> +
>> + if ((inode.i_flags & EXT4_EXTENTS_FL) && zero_written) {
>> + /*
>> + * If no data is copied which indicateds that no write
>> + * happens, we need to turn off the EXT4_EXTENTS_FL.
>> + */
>> + inode.i_flags &= ~EXT4_EXTENTS_FL;
>> + if (debugfs_write_inode(newfile, &inode, argv[0]))
>> + close(fd);
>> + }
>> }
>> close(fd);
>> }
>> --
>> 1.8.1.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
next prev parent reply other threads:[~2013-07-21 2:38 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-19 2:17 [PATCH 0/2] e2fsprogs/debugfs: do sparse copy when src is a sparse file Robert Yang
2013-07-19 2:17 ` [PATCH 1/2] debugfs.c: the max length of debugfs argument is too short Robert Yang
2013-07-19 2:17 ` [PATCH 2/2] debugfs.c: do sparse copy when src is a sparse file Robert Yang
2013-07-19 18:55 ` Darrick J. Wong
2013-07-21 2:38 ` Robert Yang [this message]
2013-07-22 17:30 ` Darrick J. Wong
2013-07-23 9:44 ` Robert Yang
2013-07-26 10:18 ` Robert Yang
-- strict thread matches above, loose matches on Subject: below --
2013-07-26 10:30 [PATCH 0/2 V2] e2fsprogs/debugfs: " Robert Yang
2013-07-26 10:30 ` [PATCH 2/2] debugfs.c: " Robert Yang
2013-07-26 16:02 ` Darrick J. Wong
2013-07-29 7:11 ` Robert Yang
2013-07-29 9:06 [PATCH 0/2 V3] e2fsprogs/debugfs: " Robert Yang
2013-07-29 9:06 ` [PATCH 2/2] debugfs.c: " Robert Yang
2013-08-19 22:43 ` Darren Hart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51EB4994.3050703@windriver.com \
--to=liezhi.yang@windriver.com \
--cc=darrick.wong@oracle.com \
--cc=dvhart@linux.intel.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).