From: Robert Yang <liezhi.yang@windriver.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: <tytso@mit.edu>, <dvhart@linux.intel.com>, <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 2/2] debugfs.c: do sparse copy when src is a sparse file
Date: Sun, 21 Jul 2013 10:38:12 +0800 [thread overview]
Message-ID: <51EB4994.3050703@windriver.com> (raw)
In-Reply-To: <20130719185515.GC5785@blackbox.djwong.org>
Hi Darrick,
Thanks for your reply, it seems that 64K is a good idea since put 128K
on the stack might cause problems, I will wait for one or two days for
more comments on other parts of the patches, then send a V2 with the
updates.
// Robert
On 07/20/2013 02:55 AM, Darrick J. Wong wrote:
> On Fri, Jul 19, 2013 at 10:17:37AM +0800, Robert Yang wrote:
>> Let debugfs do sparse copy when src is a sparse file, just like
>> "cp --sparse=auto"
>>
>> * For the
>> #define IO_BUFSIZE 32*1024
>>
>> This is from coreutils-8.13/src/ioblksize.h (GPL V3):
>> /* As of Mar 2009, 32KiB is determined to be the minimium
>> blksize to best minimize system call overhead.
>> This can be tested with this script with the results
>> shown for a 1.7GHz pentium-m with 2GB of 400MHz DDR2 RAM:
>
> Um.... GNU updated this to 64K a couple of years ago:
> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/ioblksize.h;h=1ae93255e7d0ccf0855208c7ae5888209997bf16;hb=HEAD
>
> Just for laughs I tried it on a T430 with an i5-3320M and 16G of DDR3-1600 RAM:
>
> 1024=3.7 GB/s
> 2048=7.1 GB/s
> 4096=8.8 GB/s
> 8192=14.9 GB/s
> 16384=14.3 GB/s
> 32768=13.4 GB/s
> 65536=15.8 GB/s
> 131072=20.7 GB/s
> 262144=16.4 GB/s
> 524288=15.9 GB/s
> 1048576=15.8 GB/s
> 2097152=15.1 GB/s
> 4194304=11.7 GB/s
> 8388608=9.9 GB/s
> 16777216=9.4 GB/s
> 33554432=9.3 GB/s
> 67108864=9.3 GB/s
> 134217728=8.8 GB/s
>
> For that matter, a 2010-era i7-950/DDR3-1066 system showed this:
>
> 1024=3.4 GB/s
> 2048=5.6 GB/s
> 4096=7.8 GB/s
> 8192=9.5 GB/s
> 16384=10.8 GB/s
> 32768=11.4 GB/s
> 65536=11.6 GB/s
> 131072=12.2 GB/s
> 262144=11.9 GB/s
> 524288=12.3 GB/s
> 1048576=12.4 GB/s
> 2097152=12.5 GB/s
> 4194304=12.5 GB/s
> 8388608=10.3 GB/s
> 16777216=8.0 GB/s
> 33554432=7.6 GB/s
> 67108864=7.8 GB/s
> 134217728=7.5 GB/s
>
> And for good measure, a cruddy old T2300 Core Duo from 2006 spat out this:
>
> 1024=1.1 GB/s
> 2048=2.1 GB/s
> 4096=3.6 GB/s
> 8192=5.0 GB/s
> 16384=6.3 GB/s
> 32768=6.5 GB/s
> 65536=6.6 GB/s
> 131072=7.0 GB/s
> 262144=7.1 GB/s
> 524288=7.1 GB/s
> 1048576=6.8 GB/s
> 2097152=4.4 GB/s
> 4194304=2.3 GB/s
> 8388608=2.0 GB/s
> 16777216=2.0 GB/s
> 33554432=2.0 GB/s
> 67108864=2.0 GB/s
> 134217728=1.9 GB/s
>
> I suspect you could increase the buffer size to 128K (or possibly even BLKRAGET
> size?) without much of a problem...
>
>>
>> for i in $(seq 0 10); do
>> size=$((8*1024**3)) #ensure this is big enough
>> bs=$((1024*2**$i))
>> printf "%7s=" $bs
>> dd bs=$bs if=/dev/zero of=/dev/null count=$(($size/$bs)) 2>&1 |
>> sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
>> done
>>
>> 1024=734 MB/s
>> 2048=1.3 GB/s
>> 4096=2.4 GB/s
>> 8192=3.5 GB/s
>> 16384=3.9 GB/s
>> 32768=5.2 GB/s
>> 65536=5.3 GB/s
>> 131072=5.5 GB/s
>> 262144=5.7 GB/s
>> 524288=5.7 GB/s
>> 1048576=5.8 GB/s
>>
>> Note that this is to minimize system call overhead.
>> Other values may be appropriate to minimize file system
>> or disk overhead. For example on my current GNU/Linux system
>> the readahead setting is 128KiB which was read using:
>>
>> file="."
>> device=$(df -P --local "$file" | tail -n1 | cut -d' ' -f1)
>> echo $(( $(blockdev --getra $device) * 512 ))
>>
>> However there isn't a portable way to get the above.
>> In the future we could use the above method if available
>> and default to io_blksize() if not.
>> */
>> enum { IO_BUFSIZE = 32*1024 };
>>
>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>> Acked-by: Darren Hart <dvhart@linux.intel.com>
>> ---
>> debugfs/debugfs.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 60 insertions(+), 10 deletions(-)
>>
>> diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
>> index b77d0b5..e443703 100644
>> --- a/debugfs/debugfs.c
>> +++ b/debugfs/debugfs.c
>> @@ -37,6 +37,16 @@ extern char *optarg;
>> #include "../version.h"
>> #include "jfs_user.h"
>>
>> +/* 32KiB is the minimium blksize to best minimize system call overhead. */
>> +#ifndef IO_BUFSIZE
>> +#define IO_BUFSIZE 32*1024
>> +#endif
>> +
>> +/* Block size for `st_blocks' */
>> +#ifndef S_BLKSIZE
>> +#define S_BLKSIZE 512
>> +#endif
>> +
>> ss_request_table *extra_cmds;
>> const char *debug_prog_name;
>> int sci_idx;
>> @@ -1571,14 +1581,17 @@ void do_find_free_inode(int argc, char *argv[])
>> }
>>
>> #ifndef READ_ONLY
>> -static errcode_t copy_file(int fd, ext2_ino_t newfile)
>> +static errcode_t copy_file(int fd, ext2_ino_t newfile, int bufsize,
>> + int make_holes, int *zero_written)
>> {
>> ext2_file_t e2_file;
>> errcode_t retval;
>> int got;
>> unsigned int written;
>> - char buf[8192];
>> + char buf[bufsize];
>
> ...well, I guess it could be more of a problem if you put 128K on the stack.
>
> --D
>
>> char *ptr;
>> + char *cp;
>> + int count;
>>
>> retval = ext2fs_file_open(current_fs, newfile,
>> EXT2_FILE_WRITE, &e2_file);
>> @@ -1594,14 +1607,30 @@ static errcode_t copy_file(int fd, ext2_ino_t newfile)
>> goto fail;
>> }
>> ptr = buf;
>> + cp = ptr;
>> + count = got;
>> while (got > 0) {
>> - retval = ext2fs_file_write(e2_file, ptr,
>> - got, &written);
>> - if (retval)
>> - goto fail;
>> -
>> - got -= written;
>> - ptr += written;
>> + if (make_holes) {
>> + /* Check whether all is zero */
>> + while (count-- && *cp++ == 0)
>> + continue;
>> + if (count < 0) {
>> + /* The whole block is zero, make a hole */
>> + retval = ext2fs_file_lseek(e2_file, got, EXT2_SEEK_CUR, NULL);
>> + if (retval)
>> + goto fail;
>> + got = 0;
>> + }
>> + }
>> + /* Normal copy */
>> + if (got > 0) {
>> + *zero_written = 0;
>> + retval = ext2fs_file_write(e2_file, ptr, got, &written);
>> + if (retval)
>> + goto fail;
>> + got -= written;
>> + ptr += written;
>> + }
>> }
>> }
>> retval = ext2fs_file_close(e2_file);
>> @@ -1620,6 +1649,9 @@ void do_write(int argc, char *argv[])
>> ext2_ino_t newfile;
>> errcode_t retval;
>> struct ext2_inode inode;
>> + int bufsize = IO_BUFSIZE;
>> + int make_holes = 0;
>> + int zero_written = 1;
>>
>> if (common_args_process(argc, argv, 3, 3, "write",
>> "<native file> <new file>", CHECK_FS_RW))
>> @@ -1684,9 +1716,27 @@ void do_write(int argc, char *argv[])
>> return;
>> }
>> if (LINUX_S_ISREG(inode.i_mode)) {
>> - retval = copy_file(fd, newfile);
>> + if (statbuf.st_blocks < statbuf.st_size / S_BLKSIZE) {
>> + make_holes = 1;
>> + /*
>> + * Use I/O blocksize as buffer size when
>> + * copying sparse files.
>> + */
>> + bufsize = statbuf.st_blksize;
>> + }
>> + retval = copy_file(fd, newfile, bufsize, make_holes, &zero_written);
>> if (retval)
>> com_err("copy_file", retval, 0);
>> +
>> + if ((inode.i_flags & EXT4_EXTENTS_FL) && zero_written) {
>> + /*
>> + * If no data is copied which indicateds that no write
>> + * happens, we need to turn off the EXT4_EXTENTS_FL.
>> + */
>> + inode.i_flags &= ~EXT4_EXTENTS_FL;
>> + if (debugfs_write_inode(newfile, &inode, argv[0]))
>> + close(fd);
>> + }
>> }
>> close(fd);
>> }
>> --
>> 1.8.1.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
next prev parent reply other threads:[~2013-07-21 2:38 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-19 2:17 [PATCH 0/2] e2fsprogs/debugfs: do sparse copy when src is a sparse file Robert Yang
2013-07-19 2:17 ` [PATCH 1/2] debugfs.c: the max length of debugfs argument is too short Robert Yang
2013-07-19 2:17 ` [PATCH 2/2] debugfs.c: do sparse copy when src is a sparse file Robert Yang
2013-07-19 18:55 ` Darrick J. Wong
2013-07-21 2:38 ` Robert Yang [this message]
2013-07-22 17:30 ` Darrick J. Wong
2013-07-23 9:44 ` Robert Yang
2013-07-26 10:18 ` Robert Yang
-- strict thread matches above, loose matches on Subject: below --
2013-07-26 10:30 [PATCH 0/2 V2] e2fsprogs/debugfs: " Robert Yang
2013-07-26 10:30 ` [PATCH 2/2] debugfs.c: " Robert Yang
2013-07-26 16:02 ` Darrick J. Wong
2013-07-29 7:11 ` Robert Yang
2013-07-29 9:06 [PATCH 0/2 V3] e2fsprogs/debugfs: " Robert Yang
2013-07-29 9:06 ` [PATCH 2/2] debugfs.c: " Robert Yang
2013-08-19 22:43 ` Darren Hart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51EB4994.3050703@windriver.com \
--to=liezhi.yang@windriver.com \
--cc=darrick.wong@oracle.com \
--cc=dvhart@linux.intel.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.