* worse than expected compression ratios with -o compress
@ 2010-01-16 16:16 Jim Faulkner
2010-01-17 14:34 ` Sander
2010-01-18 14:12 ` Josef Bacik
0 siblings, 2 replies; 13+ messages in thread
From: Jim Faulkner @ 2010-01-16 16:16 UTC (permalink / raw)
To: linux-btrfs
I have a mysql database which consists of hundreds of millions, if not
billions of Usenet newsgroup headers. This data should be highly
compressible, so I put the mysql data directory on a btrfs filesystem
mounted with the compress option:
/dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)
However, I'm not seeing the kind of compression ratios that I would expect
with this type of data. FYI, all my tests are using Linux 2.6.32.3.
Here's my current disk usage:
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 302G 122G 181G 41% /var/news/mysql
and here's the actual size of all files:
delta-9 mysql # pwd
/var/news/mysql
delta-9 mysql # du -h --max-depth=1
747K ./mysql
0 ./test
125G ./urd
125G .
delta-9 mysql #
As you can see, I am only shaving off 3 gigs out of 125 gigs worth of what
should be very compressible data. The compressed data ends up being
around 98% of the size of the original data.
By contrast, rzip can compress a database dump of this data to around 7%
of its original size. This is an older database dump, which is why it is
smaller. Before:
-rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
and after:
-rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
Of course it took 15 hours to compress the data, and btrfs wouldn't be
able to use rzip for compression anyway.
However, I still would expect to see better compression ratios than 98% on
such data. Are there plans to implement a better compression algorithm?
Alternatively, is there a way to tune btrfs compression to achieve better
ratios?
thanks,
Jim Faulkner
Please CC my e-mail address on any replies.
* Re: worse than expected compression ratios with -o compress
2010-01-16 16:16 worse than expected compression ratios with -o compress Jim Faulkner
@ 2010-01-17 14:34 ` Sander
2010-01-18 14:46 ` Jim Faulkner
2010-01-18 14:12 ` Josef Bacik
1 sibling, 1 reply; 13+ messages in thread
From: Sander @ 2010-01-17 14:34 UTC (permalink / raw)
To: Jim Faulkner; +Cc: linux-btrfs
Hello Jim,
Jim Faulkner wrote (ao):
> By contrast, rzip can compress a database dump of this data to
> around 7% of its original size. This is an older database dump,
> which is why it is smaller. Before:
> -rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
> and after:
> -rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
>
> Of course it took 15 hours to compress the data, and btrfs wouldn't
> be able to use rzip for compression anyway.
The difference between a live MySQL database and a dump of that database
is that the dump is text, while the database files are binary.
A fair comparison would be to compress the actual database files.
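For example (a rough sketch, assuming GNU tar and gzip are available, and
using the path from your mount output), something like this would
approximate what a gzip-based filesystem could achieve on the raw files:
  # stream the raw database files through gzip and count the
  # compressed bytes, without writing an archive to disk
  tar cf - /var/news/mysql/urd | gzip -c | wc -c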
With kind regards, Sander
--
Humilis IT Services and Solutions
http://www.humilis.net
* Re: worse than expected compression ratios with -o compress
2010-01-16 16:16 worse than expected compression ratios with -o compress Jim Faulkner
2010-01-17 14:34 ` Sander
@ 2010-01-18 14:12 ` Josef Bacik
2010-01-18 21:29 ` Chris Mason
1 sibling, 1 reply; 13+ messages in thread
From: Josef Bacik @ 2010-01-18 14:12 UTC (permalink / raw)
To: Jim Faulkner; +Cc: linux-btrfs
On Sat, Jan 16, 2010 at 11:16:50AM -0500, Jim Faulkner wrote:
>
> I have a mysql database which consists of hundreds of millions, if not
> billions of Usenet newsgroup headers. This data should be highly
> compressible, so I put the mysql data directory on a btrfs filesystem
> mounted with the compress option:
> /dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)
>
> However, I'm not seeing the kind of compression ratios that I would
> expect with this type of data. FYI, all my tests are using Linux
> 2.6.32.3. Here's my current disk usage:
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdi 302G 122G 181G 41% /var/news/mysql
>
> and here's the actual size of all files:
> delta-9 mysql # pwd
> /var/news/mysql
> delta-9 mysql # du -h --max-depth=1
> 747K ./mysql
> 0 ./test
> 125G ./urd
> 125G .
> delta-9 mysql #
>
> As you can see, I am only shaving off 3 gigs out of 125 gigs worth of
> what should be very compressible data. The compressed data ends up being
> around 98% of the size of the original data.
>
> By contrast, rzip can compress a database dump of this data to around 7%
> of its original size. This is an older database dump, which is why it is
> smaller. Before:
> -rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
> and after:
> -rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
>
> Of course it took 15 hours to compress the data, and btrfs wouldn't be
> able to use rzip for compression anyway.
>
> However, I still would expect to see better compression ratios than 98%
> on such data. Are there plans to implement a better compression
> algorithm? Alternatively, is there a way to tune btrfs compression to
> achieve better ratios?
>
Currently the only compression algorithm we support is gzip, so try gzipping
your database to get a better comparison. The plan is to eventually support
other compression algorithms, but currently we do not. Thanks,
Josef
* Re: worse than expected compression ratios with -o compress
2010-01-17 14:34 ` Sander
@ 2010-01-18 14:46 ` Jim Faulkner
2010-01-18 16:06 ` Jim Faulkner
0 siblings, 1 reply; 13+ messages in thread
From: Jim Faulkner @ 2010-01-18 14:46 UTC (permalink / raw)
To: Sander; +Cc: linux-btrfs
On Sun, 17 Jan 2010, Sander wrote:
> A fair comparison would be to compress the actual database files.
You are absolutely right. I've run some more tests, this time against the
database files themselves.
This time there are 73 GB worth of database files:
delta-9 mysql # du -h
747K ./mysql
0 ./test
73G ./urd
73G .
btrfs compresses this to 70 GB:
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 70G 117G 38% /var/news/mysql
which is a 96% compression ratio. I then tried compressing /var/news/mysql
with some popular compressors.
zip -5, zip -9, and gzip all ended up producing archives that are roughly
11 GB:
delta-9 btrfs-mysql-test-jim # ls -lh btrfs-mysql-test.gz btrfs-mysql-test-9.zip btrfs-mysql-test-5.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 01:50 btrfs-mysql-test-5.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 06:56 btrfs-mysql-test-9.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 02:17 btrfs-mysql-test.gz
delta-9 btrfs-mysql-test-jim #
This is a 15% compression ratio.
bzip2 produced an 8 GB archive, which is an 11% compression ratio:
-rw-r--r-- 1 jim jim 8.0G 2010-01-18 09:08 btrfs-mysql-test.bz2
7z produced a 6.1 GB archive, which is an 8% compression ratio:
-rw-r--r-- 1 jim jim 6.1G 2010-01-18 07:36 btrfs-mysql-test.7z
Finally, all of these are just command-line compressors, so I wanted to get
a test in with actual disk compression software. I haven't had a DOS box
running DoubleSpace since I was rather young, so I plugged an extra drive
into a Windows Vista machine, formatted it with NTFS, and enabled
compression via the drive properties menu. I then copied the mysql data
directory onto the compressed NTFS drive:
http://www.ccs.neu.edu/home/jfaulkne/ntfscompression1.jpg
The end result was 72.4 GB of data using 29.5 GB of disk space:
http://www.ccs.neu.edu/home/jfaulkne/ntfscompression2.jpg
This is a 41% compression ratio.
So, in summary, the compression ratios are:
btrfs: 96%
zip/gzip: 15%
bzip2: 11%
7z: 8%
NTFS: 41%
I think most would agree that btrfs is doing a rather poor job of
compressing my data, even compared to gzip and NTFS compression.
Thoughts?
* Re: worse than expected compression ratios with -o compress
2010-01-18 14:46 ` Jim Faulkner
@ 2010-01-18 16:06 ` Jim Faulkner
0 siblings, 0 replies; 13+ messages in thread
From: Jim Faulkner @ 2010-01-18 16:06 UTC (permalink / raw)
To: linux-btrfs
On Mon, 18 Jan 2010, Jim Faulkner wrote:
> So, in summary, the compression ratios are:
> btrfs: 96%
> zip/gzip: 15%
> bzip2: 11%
> 7z: 8%
> NTFS: 41%
One minor follow-up. I used "mysql < mysqldump.sql" to initially populate
the database on the btrfs filesystem. I thought that this might affect the
compression ratio, since mysql probably keeps changing the same blocks as
the sql import progresses. Compare this to the NTFS test, in which I
simply copied the populated database files onto the filesystem.
So, I created a new btrfs filesystem, mounted it with the compress option,
and simply copied my database files onto the filesystem, just like I did
on the NTFS test. This did make a small difference:
delta-9 mysql # du -h
0 ./btrfs-mysql-test/test
73G ./btrfs-mysql-test/urd
743K ./btrfs-mysql-test/mysql
73G ./btrfs-mysql-test
73G .
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 67G 120G 36% /var/news/mysql
This shaved another 3 GB off of the disk usage, so btrfs has now achieved
a 92% compression ratio.
This is still rather poor compared to the 41% compression ratio achieved
by NTFS. Surely btrfs should be better at compressing this data.
* Re: worse than expected compression ratios with -o compress
2010-01-18 14:12 ` Josef Bacik
@ 2010-01-18 21:29 ` Chris Mason
2010-01-18 22:11 ` Jim Faulkner
0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2010-01-18 21:29 UTC (permalink / raw)
To: Josef Bacik; +Cc: Jim Faulkner, linux-btrfs
On Mon, Jan 18, 2010 at 09:12:40AM -0500, Josef Bacik wrote:
> On Sat, Jan 16, 2010 at 11:16:50AM -0500, Jim Faulkner wrote:
> >
> > I have a mysql database which consists of hundreds of millions, if not
> > billions of Usenet newsgroup headers. This data should be highly
> > compressible, so I put the mysql data directory on a btrfs filesystem
> > mounted with the compress option:
> > /dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)
> >
> > However, I'm not seeing the kind of compression ratios that I would
> > expect with this type of data. FYI, all my tests are using Linux
> > 2.6.32.3. Here's my current disk usage:
> > Filesystem Size Used Avail Use% Mounted on
> > /dev/sdi 302G 122G 181G 41% /var/news/mysql
> >
> > and here's the actual size of all files:
> > delta-9 mysql # pwd
> > /var/news/mysql
> > delta-9 mysql # du -h --max-depth=1
> > 747K ./mysql
> > 0 ./test
> > 125G ./urd
> > 125G .
> > delta-9 mysql #
> >
> > As you can see, I am only shaving off 3 gigs out of 125 gigs worth of
> > what should be very compressible data. The compressed data ends up being
> > around 98% of the size of the original data.
> >
> > By contrast, rzip can compress a database dump of this data to around 7%
> > of its original size. This is an older database dump, which is why it is
> > smaller. Before:
> > -rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
> > and after:
> > -rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
> >
> > Of course it took 15 hours to compress the data, and btrfs wouldn't be
> > able to use rzip for compression anyway.
> >
> > However, I still would expect to see better compression ratios than 98%
> > on such data. Are there plans to implement a better compression
> > algorithm? Alternatively, is there a way to tune btrfs compression to
> > achieve better ratios?
> >
>
> Currently the only compression algorithm we support is gzip, so try gzipping
> your database to get a better comparison. The plan is to eventually support
> other compression algorithms, but currently we do not. Thanks,
The compression code backs off compression pretty quickly if parts of
the file do not compress well. This is another way of saying it favors
saving CPU time over the best possible compression. If gzip ends up better
than what you're getting from btrfs, I can give you a patch to force
compression all the time.
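In outline, the back-off amounts to the following sketch; the flag and
accessor are the real names visible in the patch later in this thread,
but the surrounding control flow is simplified:
	/* sketch: after a compression attempt on an extent */
	if (!will_compress) {
		/*
		 * the compressed version was not worth keeping, so
		 * flag the whole file and never try to compress it
		 * again -- this is the quick back-off described above
		 */
		BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
	}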
-chris
* Re: worse than expected compression ratios with -o compress
2010-01-18 21:29 ` Chris Mason
@ 2010-01-18 22:11 ` Jim Faulkner
2010-01-20 16:30 ` Chris Mason
0 siblings, 1 reply; 13+ messages in thread
From: Jim Faulkner @ 2010-01-18 22:11 UTC (permalink / raw)
To: Chris Mason; +Cc: Josef Bacik, linux-btrfs
On Mon, 18 Jan 2010, Chris Mason wrote:
>> Currently the only compression algorithm we support is gzip, so try gzipping
>> your database to get a better comparison. The plan is to eventually support
>> other compression algorithms, but currently we do not. Thanks,
>
> The compression code backs off compression pretty quickly if parts of
> the file do not compress well. This is another way of saying it favors
> CPU time over the best possible compression. If gzip ends up better
> than what you're getting from btrfs, I can give you a patch to force
> compression all the time.
Yes, gzip compresses much better than btrfs. I'd greatly appreciate a
patch to force compression all the time.
It would be nice if such an ability were merged into mainline. Perhaps
there could be a mount option or tunable parameter to force compression?
* Re: worse than expected compression ratios with -o compress
2010-01-18 22:11 ` Jim Faulkner
@ 2010-01-20 16:30 ` Chris Mason
2010-01-21 18:16 ` Jim Faulkner
0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2010-01-20 16:30 UTC (permalink / raw)
To: Jim Faulkner; +Cc: Josef Bacik, linux-btrfs
On Mon, Jan 18, 2010 at 05:11:53PM -0500, Jim Faulkner wrote:
>
> On Mon, 18 Jan 2010, Chris Mason wrote:
>
> >>Currently the only compression algorithm we support is gzip, so try gzipping
> >>your database to get a better comparison. The plan is to eventually support
> >>other compression algorithms, but currently we do not. Thanks,
> >
> >The compression code backs off compression pretty quickly if parts of
> >the file do not compress well. This is another way of saying it favors
> >CPU time over the best possible compression. If gzip ends up better
> >than what you're getting from btrfs, I can give you a patch to force
> >compression all the time.
>
> Yes, gzip compresses much better than btrfs. I'd greatly appreciate
> a patch to force compression all the time.
>
> It would be nice if such an ability were merged in the mainline.
> Perhaps there could be a mount option or tunable parameter to force
> compression?
Let's start by making sure that this patch works for you. Just apply it
(2.6.32 or 2.6.33-rc) and then mount -o compress-force. Normally, mount
-o compress will set a flag on a file after it fails to get good
compression for that file.
With this patch, mount -o compress-force will still honor that flag,
but it will skip setting it during new writes. This is a long way of
saying you'll have to copy your data files to new files for the new
mount option to do anything.
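For reference, a minimal sketch of the steps; the patch file name is made
up, and the device and mount point are the ones from earlier in this thread:
	cd linux-2.6.32
	patch -p1 < compress-force.patch   # the diff below, saved to a file
	# rebuild and install the kernel (or just the btrfs module), then:
	mount -o noatime,noacl,compress-force /dev/sdi /var/news/mysql
	# rewrite the data into new files so the forced compression applies:
	cp -a /path/to/old/files /var/news/mysql/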
Please let me know if this improves your ratios.
-chris
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 9f806dd..2aa8ec6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1161,6 +1161,7 @@ struct btrfs_root {
 #define BTRFS_MOUNT_SSD_SPREAD		(1 << 8)
 #define BTRFS_MOUNT_NOSSD		(1 << 9)
 #define BTRFS_MOUNT_DISCARD		(1 << 10)
+#define BTRFS_MOUNT_FORCE_COMPRESS	(1 << 11)
 
 #define btrfs_clear_opt(o, opt)	((o) &= ~BTRFS_MOUNT_##opt)
 #define btrfs_set_opt(o, opt)	((o) |= BTRFS_MOUNT_##opt)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b330e27..f46c572 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -483,7 +483,8 @@ again:
 		nr_pages_ret = 0;
 
 		/* flag the file so we don't compress in the future */
-		BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
+		if (!btrfs_test_opt(root, FORCE_COMPRESS))
+			BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
 	}
 	if (will_compress) {
 		*num_added += 1;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3f9b457..8a1ea6e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -66,7 +66,8 @@ enum {
 	Opt_degraded, Opt_subvol, Opt_device, Opt_nodatasum, Opt_nodatacow,
 	Opt_max_extent, Opt_max_inline, Opt_alloc_start, Opt_nobarrier,
 	Opt_ssd, Opt_nossd, Opt_ssd_spread, Opt_thread_pool, Opt_noacl,
-	Opt_compress, Opt_notreelog, Opt_ratio, Opt_flushoncommit,
+	Opt_compress, Opt_compress_force, Opt_notreelog, Opt_ratio,
+	Opt_flushoncommit,
 	Opt_discard, Opt_err,
 };
 
@@ -82,6 +83,7 @@ static match_table_t tokens = {
 	{Opt_alloc_start, "alloc_start=%s"},
 	{Opt_thread_pool, "thread_pool=%d"},
 	{Opt_compress, "compress"},
+	{Opt_compress_force, "compress-force"},
 	{Opt_ssd, "ssd"},
 	{Opt_ssd_spread, "ssd_spread"},
 	{Opt_nossd, "nossd"},
@@ -173,6 +175,11 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			printk(KERN_INFO "btrfs: use compression\n");
 			btrfs_set_opt(info->mount_opt, COMPRESS);
 			break;
+		case Opt_compress_force:
+			printk(KERN_INFO "btrfs: forcing compression\n");
+			btrfs_set_opt(info->mount_opt, FORCE_COMPRESS);
+			btrfs_set_opt(info->mount_opt, COMPRESS);
+			break;
 		case Opt_ssd:
 			printk(KERN_INFO "btrfs: use ssd allocation scheme\n");
 			btrfs_set_opt(info->mount_opt, SSD);
* Re: worse than expected compression ratios with -o compress
2010-01-20 16:30 ` Chris Mason
@ 2010-01-21 18:16 ` Jim Faulkner
2010-01-21 20:04 ` Gregory Maxwell
2010-01-21 20:05 ` Chris Mason
0 siblings, 2 replies; 13+ messages in thread
From: Jim Faulkner @ 2010-01-21 18:16 UTC (permalink / raw)
To: Chris Mason; +Cc: Josef Bacik, linux-btrfs
On Wed, 20 Jan 2010, Chris Mason wrote:
> Please let me know if this improves your ratios
It most certainly does! It also greatly reduced the time required to copy
the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
The line numbers in your patch were a little off for 2.6.32.4, but I did
manage to apply it cleanly. Here are the results of my testing:
First let's see the results with plain old mount -o compress:
delta-9 ~ # mkfs.btrfs /dev/sdi
WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/sdi
nodesize 4096 leafsize 4096 sectorsize 4096 size 186.31GB
Btrfs Btrfs v0.19
delta-9 ~ # mount -o noacl,compress,noatime /dev/sdi /var/news/mysql
delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql
real 57m6.983s
user 0m0.807s
sys 1m28.494s
delta-9 ~ # cd /var/news/mysql
delta-9 mysql # du -h --max-depth=1
73G ./btrfs-mysql-test
73G .
delta-9 mysql # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 67G 120G 36% /var/news/mysql
So, with plain mount -o compress, it took 57 minutes to copy the data to
my btrfs disk, and it achieved a 92% compression ratio.
Now let's test with mount -o compress-force. First I'll create a new
btrfs filesystem so we're getting a fresh start:
delta-9 ~ # mkbtrfs /dev/sdi
WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/sdi
nodesize 4096 leafsize 4096 sectorsize 4096 size 186.31GB
Btrfs Btrfs v0.19
delta-9 ~ # mount -o noatime,noacl,compress-force /dev/sdi /var/news/mysql
delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql/
real 14m45.742s
user 0m0.547s
sys 1m30.551s
delta-9 ~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 14G 173G 8% /var/news/mysql
delta-9 ~ # cd /var/news/mysql
delta-9 mysql # du -h --max-depth=1
73G ./btrfs-mysql-test
73G .
delta-9 mysql #
Wow! Not only did mount -o compress-force achieve a 19% compression
ratio, using 53 GB less disk space than mount -o compress, it also
managed to copy the data in only 15 minutes, compared to 57 minutes
with mount -o compress.
The disk in question is an old IDE disk in a cheap external USB 2.0
enclosure, which is probably not exactly the type of storage that btrfs is
being developed for. Nevertheless, it is nice to see such a huge
improvement in the time required to copy the data around.
I'd be very happy to see the -o compress-force option in the mainline
kernel someday!
* Re: worse than expected compression ratios with -o compress
2010-01-21 18:16 ` Jim Faulkner
@ 2010-01-21 20:04 ` Gregory Maxwell
2010-01-21 20:07 ` Chris Mason
2010-01-21 20:05 ` Chris Mason
1 sibling, 1 reply; 13+ messages in thread
From: Gregory Maxwell @ 2010-01-21 20:04 UTC (permalink / raw)
To: Jim Faulkner; +Cc: Chris Mason, Josef Bacik, linux-btrfs
On Thu, Jan 21, 2010 at 1:16 PM, Jim Faulkner <jfaulkne@ccs.neu.edu> wrote:
>
> On Wed, 20 Jan 2010, Chris Mason wrote:
>
>> Please let me know if this improves your ratios.
>
> It most certainly does! It also greatly reduced the time required to copy
> the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
> The line numbers in your patch were a little off for 2.6.32.4, but I did
> manage to apply it cleanly. Here are the results of my testing:
[snip]
> I'd be very happy to see the -o compress-force option in the mainline
> kernel someday!
Sweet. But I think a force mount option is an unreasonably blunt tool.
I think two things would be nice:
(1) Fix the compression decision. I think this example suggests that
something is broken. (I'd noticed poorer than expected compression on
my laptop, but I'd chalked it up to the 64k blocks… now I'm not so
confident.)
(2) An IOCTL for compression control. Userspace knows best; some
files ought to have a different compression policy.
* Re: worse than expected compression ratios with -o compress
2010-01-21 18:16 ` Jim Faulkner
2010-01-21 20:04 ` Gregory Maxwell
@ 2010-01-21 20:05 ` Chris Mason
2010-01-21 22:38 ` Jim Faulkner
1 sibling, 1 reply; 13+ messages in thread
From: Chris Mason @ 2010-01-21 20:05 UTC (permalink / raw)
To: Jim Faulkner; +Cc: Josef Bacik, linux-btrfs
On Thu, Jan 21, 2010 at 01:16:31PM -0500, Jim Faulkner wrote:
> delta-9 ~ # mount -o noatime,noacl,compress-force /dev/sdi /var/news/mysql
> delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql/
>
> real 14m45.742s
> user 0m0.547s
> sys 1m30.551s
> delta-9 ~ # df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdi 187G 14G 173G 8% /var/news/mysql
> delta-9 ~ # cd /var/news/mysql
> delta-9 mysql # du -h --max-depth=1
> 73G ./btrfs-mysql-test
> 73G .
> delta-9 mysql #
>
> Wow! Not only did mount -o compress-force achieve a 19%
> compression ratio, using 53 GB less disk space than mount -o
> compress, it also managed to copy the data in only 15 minutes,
> compared to 57 minutes with mount -o compress.
>
> The disk in question is an old IDE disk in a cheap external USB 2.0
> enclosure, which is probably not exactly the type of storage that
> btrfs is being developed for. Nevertheless, it is nice to see such
> a huge improvement in the time required to copy the data around.
>
> I'd be very happy to see the -o compress-force option in the
> mainline kernel someday!
Great, it is working here so I'll queue it up as well.
The performance improvement just comes from writing less to the disk.
Your first run copied 67GB in 57m6s, which comes out to about 20MB/s
write throughput.
Your compress-force run copied 14GB in 14m45s, which gives us
around 16 or 17MB/s (depending on df rounding). So, we either lost a
little drive throughput due to overhead in the compression code or the
run was totally CPU bound.
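Spelled out with the df and 'time' figures above:
	67 GB x 1024 MB/GB / (57*60 + 7) s  = 68608 MB / 3427 s ~= 20.0 MB/s
	14 GB x 1024 MB/GB / (14*60 + 46) s = 14336 MB /  886 s ~= 16.2 MB/s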
If you look at the output from 'time', the two runs appear to be using
the same amount of system time (CPU time in the kernel). But most of
the compression is done by helper threads, so the CPU time spent
compressing data doesn't show up.
Just a quick clarification: we didn't make your drive faster, we
just traded CPU for disk. Most of the time that's a safe trade,
especially when you put USB and IDE into the same sentence ;)
Either way, thanks for testing this out. Do you happen to remember how
small the file becomes when you just plain gzip it?
-chris
* Re: worse than expected compression ratios with -o compress
2010-01-21 20:04 ` Gregory Maxwell
@ 2010-01-21 20:07 ` Chris Mason
0 siblings, 0 replies; 13+ messages in thread
From: Chris Mason @ 2010-01-21 20:07 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: Jim Faulkner, Josef Bacik, linux-btrfs
On Thu, Jan 21, 2010 at 03:04:56PM -0500, Gregory Maxwell wrote:
> On Thu, Jan 21, 2010 at 1:16 PM, Jim Faulkner <jfaulkne@ccs.neu.edu> wrote:
> >
> > On Wed, 20 Jan 2010, Chris Mason wrote:
> >
> >> Please let me know if this improves your ratios.
> >
> > It most certainly does! It also greatly reduced the time required to copy
> > the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
> > The line numbers in your patch were a little off for 2.6.32.4, but I did
> > manage to apply it cleanly. Here are the results of my testing:
> [snip]
> > I'd be very happy to see the -o compress-force option in the mainline
> > kernel someday!
> Sweet. But I think a force mount option is an unreasonably blunt tool.
> I think two things would be nice:
> (1) Fix the compression decision. I think this example suggests that
> something is broken. (I'd noticed poorer than expected compression on
> my laptop, but I'd chalked it up to the 64k blocks… now I'm not so
> confident.)
The current code assumes that files have consistent data in them. This
is very true for the average data set, but it'll be horribly wrong for
something like a database file.
> (2) An IOCTL for compression control. Userspace knows best; some
> files ought to have a different compression policy.
Yes, we'll get #2 added.
-chris
* Re: worse than expected compression ratios with -o compress
2010-01-21 20:05 ` Chris Mason
@ 2010-01-21 22:38 ` Jim Faulkner
0 siblings, 0 replies; 13+ messages in thread
From: Jim Faulkner @ 2010-01-21 22:38 UTC (permalink / raw)
To: Chris Mason; +Cc: Josef Bacik, linux-btrfs
On Thu, 21 Jan 2010, Chris Mason wrote:
> Either way, thanks for testing this out. Do you happen to remember how
> small the file becomes when you just plain gzip it?
Gzipping it, I end up with an 11 GB file.