public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* Compressed Filesystem
@ 2008-10-27 14:54 Lee Trager
  2008-10-28 15:47 ` Chris Mason
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Trager @ 2008-10-27 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: balajirrao, miguel.filipe

I have read on the mailing list that there has been some interest in
implementing transparent compression on btrfs and I too am thinking about
trying to implement it. Before I start from scratch I am wondering if
anyone else has started to work on this and if so how far along have
they gotten? I would be happy to work on this alone or with someone else
but currently I am doing some preliminary research.

Thanks,

Lee

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-27 14:54 Compressed Filesystem Lee Trager
@ 2008-10-28 15:47 ` Chris Mason
  2008-10-28 16:33   ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Mason @ 2008-10-28 15:47 UTC (permalink / raw)
  To: Lee Trager; +Cc: linux-btrfs, balajirrao, miguel.filipe

On Mon, 2008-10-27 at 10:54 -0400, Lee Trager wrote:
> I have read on the mailing list that there has been some interest in
> implementing transparent compression on btrfs and I too am thinking about
> trying to implement it. Before I start from scratch I am wondering if
> anyone else has started to work on this and if so how far along have
> they gotten? I would be happy to work on this alone or with someone else
> but currently I am doing some preliminary research.

Compression is working on my machine, I'm just running some long tests
before I push it out to the unstable repo.  The current code uses the
in-kernel zlib implementation.

-chris



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-28 15:47 ` Chris Mason
@ 2008-10-28 16:33   ` Lee Trager
  2008-10-28 17:38     ` Chris Mason
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Trager @ 2008-10-28 16:33 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs, balajirrao, miguel.filipe

On Tue, Oct 28, 2008 at 11:47:27AM -0400, Chris Mason wrote:
> On Mon, 2008-10-27 at 10:54 -0400, Lee Trager wrote:
> > I have read on the mailing list that there has been some interest in
> > implementing transparent compression on btrfs and I too am thinking about
> > trying to implement it. Before I start from scratch I am wondering if
> > anyone else has started to work on this and if so how far along have
> > they gotten? I would be happy to work on this alone or with someone else
> > but currently I am doing some preliminary research.
> 
> Compression is working on my machine, I'm just running some long tests
> before I push it out to the unstable repo.  The current code uses the
> in-kernel zlib implementation.
> 
> -chris
>

That's great; I am eager to try it. How long will these tests take? I
would love to look at the code.  Is compression done for every file by
default, or does a user-space program have to set a compression flag on
the file?

Thanks,

Lee

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-28 16:33   ` Lee Trager
@ 2008-10-28 17:38     ` Chris Mason
  2008-10-28 17:40       ` Zach Brown
       [not found]       ` <53696.2001:470:e828:1::2:2.1225304096.squirrel@avalon.arbitraryconstant.com>
  0 siblings, 2 replies; 17+ messages in thread
From: Chris Mason @ 2008-10-28 17:38 UTC (permalink / raw)
  To: Lee Trager; +Cc: linux-btrfs, balajirrao, miguel.filipe

On Tue, 2008-10-28 at 12:33 -0400, Lee Trager wrote:
> On Tue, Oct 28, 2008 at 11:47:27AM -0400, Chris Mason wrote:
> > On Mon, 2008-10-27 at 10:54 -0400, Lee Trager wrote:
> > > I have read on the mailing list that there has been some interest in
> > > implementing transparent compression on btrfs and I too am thinking about
> > > trying to implement it. Before I start from scratch I am wondering if
> > > anyone else has started to work on this and if so how far along have
> > > they gotten? I would be happy to work on this alone or with someone else
> > > but currently I am doing some preliminary research.
> > 
> > Compression is working on my machine, I'm just running some long tests
> > before I push it out to the unstable repo.  The current code uses the
> > in-kernel zlib implementation.
> > 
> > -chris
> >
> 
> That's great; I am eager to try it. How long will these tests take? I
> would love to look at the code.  Is compression done for every file by
> default, or does a user-space program have to set a compression flag on
> the file?

This is a fairly large change, I plan on running it overnight.

Compression is optional and off by default (mount -o compress to enable
it).  When enabled, every file is compressed.

-chris



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-28 17:38     ` Chris Mason
@ 2008-10-28 17:40       ` Zach Brown
  2008-10-28 17:46         ` Chris Mason
       [not found]       ` <53696.2001:470:e828:1::2:2.1225304096.squirrel@avalon.arbitraryconstant.com>
  1 sibling, 1 reply; 17+ messages in thread
From: Zach Brown @ 2008-10-28 17:40 UTC (permalink / raw)
  To: Chris Mason; +Cc: Lee Trager, linux-btrfs, balajirrao, miguel.filipe


> Compression is optional and off by default (mount -o compress to enable
> it).  When enabled, every file is compressed.

Compression is attempted as files are written when the mount option is
enabled, right?

There isn't a background scrubber that tries to compress files which are
already written?

- z

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-28 17:40       ` Zach Brown
@ 2008-10-28 17:46         ` Chris Mason
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Mason @ 2008-10-28 17:46 UTC (permalink / raw)
  To: Zach Brown; +Cc: Lee Trager, linux-btrfs, balajirrao, miguel.filipe

On Tue, 2008-10-28 at 10:40 -0700, Zach Brown wrote:
> > Compression is optional and off by default (mount -o compress to enable
> > it).  When enabled, every file is compressed.
> 
> Compression is attempted as files are written when the mount option is
> enabled, right?

Yes, and if the compression doesn't make a given set of pages smaller it
quickly backs off and goes back to writing it straight through.
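
[Editorial aside: the back-off behaviour described above can be sketched in a few lines of Python. The chunk size and helper name are made up for illustration; the real btrfs code works on sets of pages in C.]

```python
import os
import zlib

CHUNK = 128 * 1024  # illustrative chunk size; btrfs works on sets of pages


def write_chunk(chunk: bytes) -> tuple[str, bytes]:
    """Try to compress a chunk; back off to raw storage if it doesn't shrink."""
    packed = zlib.compress(chunk)
    if len(packed) < len(chunk):
        return ("compressed", packed)
    return ("raw", chunk)  # write it straight through


# Repetitive data shrinks, so it is stored compressed:
kind, _ = write_chunk(b"abcd" * (CHUNK // 4))
print(kind)  # compressed

# Random data does not shrink, so it is written straight through:
kind, _ = write_chunk(os.urandom(CHUNK))
print(kind)  # raw
```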

> 
> There isn't a background scrubber that tries to compress files which are
> already written?

No, but if you mount with compression on and use the single file defrag
ioctl (btrfsctl -d some_file) it'll compress it.

-chris



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
       [not found]       ` <53696.2001:470:e828:1::2:2.1225304096.squirrel@avalon.arbitraryconstant.com>
@ 2008-10-29 20:08         ` Chris Mason
  2008-11-04  0:08           ` Chris Samuel
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Mason @ 2008-10-29 20:08 UTC (permalink / raw)
  To: btrfs-devel; +Cc: linux-btrfs

On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> Hi, I have a few questions about this:
> 
> > Compression is optional and off by default (mount -o compress to enable
> > it).  When enabled, every file is compressed.
> 
> Do you know what the CPU load is like with this enabled?

Now that I've finally pushed the code out, you can try it ;)  One part
of the implementation I need to revisit is where in the code the
compression happens: right now it means that most of the time the
single-threaded pdflush is the one compressing.

This doesn't spread the load very well across the cpus.  It can be
fixed, but I wanted to get the code out there.

The decompression does spread across cpus, and I've gotten about 800MB/s
doing decompress and checksumming on a zero filled compressed file.  At
the time, the disk was reading 14MB/s.

> 
> Do you know whether data can be compressed at a sufficient rate to still
> saturate the disk on recent-ish AMD/Intel CPUs?

My recent-ish Intel CPU can compress and checksum at about 120MB/s.
> 
> If no, is the effective pre-compression I/O rate still comparable to the
> disk without compression?
> 

It depends on your disks...

> I'm pretty sure that won't even matter in many cases (eg you're seeking
> too much to care, or you're on a VM with lots of cores but congested
> disks, or you're dealing with media files that it doesn't bother
> compressing, etc), but I'm curious what sort of overhead this adds. :)
> 
> Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> disk resources.

This varies quite a bit from workload to workload, in some places it'll
make a big difference, but many workloads are seek bound and not
bandwidth bound.

-chris





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-10-29 20:08         ` Chris Mason
@ 2008-11-04  0:08           ` Chris Samuel
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Samuel @ 2008-11-04  0:08 UTC (permalink / raw)
  To: linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 886 bytes --]

On Thu, 30 Oct 2008 7:08:42 am Chris Mason wrote:

> The decompression does spread across cpus, and I've gotten about 800MB/s
> doing decompress and checksumming on a zero filled compressed file.  At
> the time, the disk was reading 14MB/s.

FWIW I've got a pretty ugly patch to Bonnie++ that makes it use data from 
/dev/urandom for writes rather than just blocks of zeros, which give, um, 
optimistic values for throughput on filesystems that do compression.
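
[Editorial aside: the effect is easy to reproduce. Zeros compress so well that any throughput number measured with them is wildly optimistic; the ratios below are approximate and machine-dependent only in the milliseconds, not the orders of magnitude.]

```python
import os
import zlib

size = 1 << 20  # 1 MiB samples
zeros = bytes(size)
random_data = os.urandom(size)

zero_ratio = size / len(zlib.compress(zeros))
rand_ratio = size / len(zlib.compress(random_data))

print(f"zero-filled compresses {zero_ratio:.0f}x")   # roughly three orders of magnitude
print(f"random data compresses {rand_ratio:.2f}x")   # slightly below 1x
```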

Still not particularly realistic in terms of an actual workload, but maybe 
just a tad less unrealistic. :-)

Caveat emptor - I've not tried this since I sent it to Russell Coker in 
January '07.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP


[-- Attachment #1.2: bonnie++-1.03a-urand.patch --]
[-- Type: text/x-patch, Size: 1422 bytes --]

diff -ur bonnie++-1.03a/bonnie++.cpp bonnie++-1.03a-urand/bonnie++.cpp
--- bonnie++-1.03a/bonnie++.cpp	2002-12-04 00:40:35.000000000 +1100
+++ bonnie++-1.03a-urand/bonnie++.cpp	2007-01-01 13:03:41.644378000 +1100
@@ -41,6 +41,9 @@
 #include <string.h>
 #include <sys/utsname.h>
 #include <signal.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
 
 #ifdef AIX_MEM_SIZE
 #include <cf.h>
@@ -148,6 +151,28 @@
   }
 }
 
+void load_random_data(char *temp_buffer,int length)
+{
+	int filedes, numbytes;
+
+	filedes=open("/dev/urandom",O_RDONLY);
+	if(filedes<0)
+	{
+		perror("Open of /dev/urandom failed, falling back to 0's");
+		memset(temp_buffer, 0, length);
+	}
+	else
+	{
+		numbytes=read(filedes,temp_buffer,length);
+		if(numbytes!=length)
+			{
+				perror("Read from /dev/urandom failed, falling back to 0's");
+				memset(temp_buffer, 0, length);
+			}
+		close(filedes);
+	}
+}
+
 int main(int argc, char *argv[])
 {
   int    file_size = DefaultFileSize;
@@ -477,7 +502,8 @@
       return 1;
     globals.decrement_and_wait(FastWrite);
     if(!globals.quiet) fprintf(stderr, "Writing intelligently...");
-    memset(buf, 0, globals.chunk_size());
+    // memset(buf, 0, globals.chunk_size());
+    load_random_data(buf, globals.chunk_size());
     globals.timer.timestamp();
     bufindex = 0;
     // for the number of chunks of file data

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 481 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
@ 2008-12-15 22:14 devzero
  2008-12-15 23:07 ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: devzero @ 2008-12-15 22:14 UTC (permalink / raw)
  To: linux-btrfs

fantastic feature!

I'm curious: can btrfs support more than one compression scheme at the
same time, i.e. is compression "pluggable"?

LZO compression comes to mind, as it gives real-time compression and may
even speed up disk access.

Its compression ratio isn't too bad, but its speed is awesome and it
doesn't need as much CPU as gzip.

Experimental LZO compression in zfs-fuse showed that it could compress a
tarred kernel source with a 2.99x compression ratio (where gzip gave
3.41x), so maybe LZO is a better algorithm for real-time filesystem
compression...

regards
roland
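
[Editorial aside: LZO is not in the Python standard library, so as a stand-in, zlib at level 1 versus level 9 illustrates the same speed-versus-ratio tradeoff roland describes. The sample data is synthetic; real numbers depend heavily on the input.]

```python
import time
import zlib

# Synthetic, source-code-like sample: repetitive but not trivial.
sample = b"static int btrfs_zlib_compress_pages(struct address_space *m);\n" * 16384

for level, label in ((1, "fastest"), (9, "best ratio")):
    start = time.perf_counter()
    packed = zlib.compress(sample, level)
    elapsed = time.perf_counter() - start
    ratio = len(sample) / len(packed)
    print(f"zlib level {level} ({label}): {ratio:.2f}x in {elapsed * 1000:.1f} ms")
```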



From: Chris Mason <chris.mason <at> oracle.com>
Subject: Re: Compressed Filesystem
Newsgroups: gmane.comp.file-systems.btrfs
Date: 2008-10-29 20:08:42 GMT (6 weeks, 5 days, 1 hour and 53 minutes ago)

On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> Hi, I have a few questions about this:
> 
> > Compression is optional and off by default (mount -o compress to enable
> > it).  When enabled, every file is compressed.
> 
> Do you know what the CPU load is like with this enabled?

Now that I've finally pushed the code out, you can try it ;)  One part
of the implementation I need to revisit is where in the code the
compression happens: right now it means that most of the time the
single-threaded pdflush is the one compressing.

This doesn't spread the load very well across the cpus.  It can be
fixed, but I wanted to get the code out there.

The decompression does spread across cpus, and I've gotten about 800MB/s
doing decompress and checksumming on a zero filled compressed file.  At
the time, the disk was reading 14MB/s.

> 
> Do you know whether data can be compressed at a sufficient rate to still
> saturate the disk on recent-ish AMD/Intel CPUs?

My recent-ish Intel CPU can compress and checksum at about 120MB/s.
> 
> If no, is the effective pre-compression I/O rate still comparable to the
> disk without compression?
> 

It depends on your disks...

> I'm pretty sure that won't even matter in many cases (eg you're seeking
> too much to care, or you're on a VM with lots of cores but congested
> disks, or you're dealing with media files that it doesn't bother
> compressing, etc), but I'm curious what sort of overhead this adds. :)
> 
> Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> disk resources.

This varies quite a bit from workload to workload, in some places it'll
make a big difference, but many workloads are seek bound and not
bandwidth bound.

-chris


____________________________________________________________________
Psssst! Already heard about the new WEB.DE MultiMessenger?
It can do it all: http://www.produkte.web.de/messenger/?did=3123

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-15 22:14 devzero
@ 2008-12-15 23:07 ` Lee Trager
  0 siblings, 0 replies; 17+ messages in thread
From: Lee Trager @ 2008-12-15 23:07 UTC (permalink / raw)
  To: devzero; +Cc: linux-btrfs

If multiple compression schemes are implemented, how should the user go
about choosing which one they want? Should it be done at kernel time? Or
with the userland tools on a per-file basis (maybe zlib is the default,
but a user could say I want this directory to be bzip)?

On Mon, Dec 15, 2008 at 11:14:01PM +0100, devzero@web.de wrote:
> fantastic feature!
> 
> i`m curious: can btrfs support more than one compression scheme at the same time, i.e. is compression "pluggable" ?
If you look at compression.c, compression.h, and ctree.h you can clearly
see that support for multiple compression schemes was in mind. Implementing
a new one shouldn't be too hard, but you'd probably want to make the current
system a little more pluggable and move all the zlib stuff into
zlib.c.
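
[Editorial aside: what "pluggable" could look like in miniature is a table mapping a scheme identifier to compress/decompress routines, so adding a scheme is one new entry. The names below are invented; the real kernel code records a numeric compression type and calls the in-kernel zlib.]

```python
import bz2
import zlib

# Registry of compression schemes: name -> (compress, decompress).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
}


def pack(scheme: str, data: bytes) -> bytes:
    return CODECS[scheme][0](data)


def unpack(scheme: str, blob: bytes) -> bytes:
    return CODECS[scheme][1](blob)


payload = b"btrfs extent data " * 100
for scheme in CODECS:
    assert unpack(scheme, pack(scheme, payload)) == payload
print("round-trip OK for:", ", ".join(CODECS))  # round-trip OK for: zlib, bzip2
```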
> 
> lzo compression coming to my mind, as this is giving real-time compession and may even speed up disk access.
> 
> compression ratio isn`t too bad, but speed is awesome and doesn`t need as much cpu as gzip.
> 
In some tests I've run, zlib is actually faster than no compression
because less data has to be transferred to and from
the disk. It would be interesting to see how bzip2 works with this too.
> experimental lzo compression in zfs-fuse showed that it could compress tarred kernel-source with 2.99x compressratio (where gzip gave 3.41x), so maybe lzo is a better algorithm for realtime filesystem compression...
> 
> regards
> roland
> 
> 
> 
> From: Chris Mason <chris.mason <at> oracle.com>
> Subject: Re: Compressed Filesystem
> Newsgroups: gmane.comp.file-systems.btrfs
> Date: 2008-10-29 20:08:42 GMT (6 weeks, 5 days, 1 hour and 53 minutes ago)
> 
> On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> > Hi, I have a few questions about this:
> > 
> > > Compression is optional and off by default (mount -o compress to enable
> > > it).  When enabled, every file is compressed.
> > 
> > Do you know what the CPU load is like with this enabled?
> 
> Now that I've finally pushed the code out, you can try it ;)  One part
> of the implementation I need to revisit is the place in the code where I
> do compression means that most of the time the single threaded pdflush
> is the one compressing.
> 
> This doesn't spread the load very well across the cpus.  It can be
> fixed, but I wanted to get the code out there.
> 
> The decompression does spread across cpus, and I've gotten about 800MB/s
> doing decompress and checksumming on a zero filled compressed file.  At
> the time, the disk was reading 14MB/s.
> 
> > 
> > Do you know whether data can be compressed at a sufficient rate to still
> > saturate the disk on recent-ish AMD/Intel CPUs?
> 
> My recentish intel cpu can compress and checksum at about 120MB/s.  
> > 
> > If no, is the effective pre-compression I/O rate still comparable to the
> > disk without compression?
> > 
> 
> It depends on your disks...
> 
> > I'm pretty sure that won't even matter in many cases (eg you're seeking
> > too much to care, or you're on a VM with lots of cores but congested
> > disks, or you're dealing with media files that it doesn't bother
> > compressing, etc), but I'm curious what sort of overhead this adds. :)
> > 
> > Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> > disk resources.
> 
> This varies quite a bit from workload to workload, in some places it'll
> make a big difference, but many workloads are seek bound and not
> bandwidth bound.
> 
> -chris
> 
> 
> ____________________________________________________________________
> Psssst! Already heard about the new WEB.DE MultiMessenger?
> It can do it all: http://www.produkte.web.de/messenger/?did=3123
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
@ 2008-12-15 23:19 devzero
  2008-12-16 15:20 ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: devzero @ 2008-12-15 23:19 UTC (permalink / raw)
  To: Lee Trager; +Cc: linux-btrfs

> If multiple compression schemes are implemented, how should the user go
> about choosing which one they want? Should it be done at kernel time? Or
> with the userland tools on a per-file basis (maybe zlib is the default,
> but a user could say I want this directory to be bzip)?

yes, why not...

doing that at mount time like

mount -o compress,cscheme=myzip /dev/xyz /mntpoint

would be a good start....



> -----Original message-----
> From: "Lee Trager" <lt73@cs.drexel.edu>
> Sent: 16.12.08 00:07:32
> To: devzero@web.de
> CC: linux-btrfs@vger.kernel.org
> Subject: Re: Compressed Filesystem


> If multiple compression schemes are implemented, how should the user go
> about choosing which one they want? Should it be done at kernel time? Or
> with the userland tools on a per-file basis (maybe zlib is the default,
> but a user could say I want this directory to be bzip)?
> 
> On Mon, Dec 15, 2008 at 11:14:01PM +0100, devzero@web.de wrote:
> > fantastic feature!
> > 
> > I'm curious: can btrfs support more than one compression scheme at the same time, i.e. is compression "pluggable"?
> If you look at compression.c, compression.h, and ctree.h you can clearly
> see that support for multiple compression schemes was in mind. Implementing
> a new one shouldn't be too hard, but you'd probably want to make the current
> system a little more pluggable and move all the zlib stuff into
> zlib.c.
> > 
> > LZO compression comes to mind, as it gives real-time compression and may even speed up disk access.
> > 
> > Its compression ratio isn't too bad, but its speed is awesome and it doesn't need as much CPU as gzip.
> > 
> In some tests I've run, zlib is actually faster than no compression
> because less data has to be transferred to and from
> the disk. It would be interesting to see how bzip2 works with this too.
> > Experimental LZO compression in zfs-fuse showed that it could compress a tarred kernel source with a 2.99x compression ratio (where gzip gave 3.41x), so maybe LZO is a better algorithm for real-time filesystem compression...
> > 
> > regards
> > roland
> > 
> > 
> > 
> > From: Chris Mason <chris.mason <at> oracle.com>
> > Subject: Re: Compressed Filesystem
> > Newsgroups: gmane.comp.file-systems.btrfs
> > Date: 2008-10-29 20:08:42 GMT (6 weeks, 5 days, 1 hour and 53 minutes ago)
> > 
> > On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> > > Hi, I have a few questions about this:
> > > 
> > > > Compression is optional and off by default (mount -o compress to enable
> > > > it).  When enabled, every file is compressed.
> > > 
> > > Do you know what the CPU load is like with this enabled?
> > 
> > Now that I've finally pushed the code out, you can try it ;)  One part
> > of the implementation I need to revisit is where in the code the
> > compression happens: right now it means that most of the time the
> > single-threaded pdflush is the one compressing.
> > 
> > This doesn't spread the load very well across the cpus.  It can be
> > fixed, but I wanted to get the code out there.
> > 
> > The decompression does spread across cpus, and I've gotten about 800MB/s
> > doing decompress and checksumming on a zero filled compressed file.  At
> > the time, the disk was reading 14MB/s.
> > 
> > > 
> > > Do you know whether data can be compressed at a sufficient rate to still
> > > saturate the disk on recent-ish AMD/Intel CPUs?
> > 
> > My recent-ish Intel CPU can compress and checksum at about 120MB/s.
> > > 
> > > If no, is the effective pre-compression I/O rate still comparable to the
> > > disk without compression?
> > > 
> > 
> > It depends on your disks...
> > 
> > > I'm pretty sure that won't even matter in many cases (eg you're seeking
> > > too much to care, or you're on a VM with lots of cores but congested
> > > disks, or you're dealing with media files that it doesn't bother
> > > compressing, etc), but I'm curious what sort of overhead this adds. :)
> > > 
> > > Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> > > disk resources.
> > 
> > This varies quite a bit from workload to workload, in some places it'll
> > make a big difference, but many workloads are seek bound and not
> > bandwidth bound.
> > 
> > -chris
> > 
> > 
> > ____________________________________________________________________
> > Psssst! Already heard about the new WEB.DE MultiMessenger?
> > It can do it all: http://www.produkte.web.de/messenger/?did=3123
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


_______________________________________________________________________
Sensational offer extended: WEB.DE FreeDSL - phone line + DSL
for only 16.37 Euro/month!* http://dsl.web.de/?ac=OM.AD.AD008K15039B7069a

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-15 23:19 devzero
@ 2008-12-16 15:20 ` Lee Trager
  2008-12-16 15:26   ` Chris Mason
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Trager @ 2008-12-16 15:20 UTC (permalink / raw)
  To: devzero; +Cc: Lee Trager, linux-btrfs

While I agree that the command you sent should be possible, it wasn't
exactly what I was thinking. Currently I am working on a way for the
user to individually set which files/directories they want compressed or
not. What I was saying is that, assuming you are in a mounted btrfs
directory, you could do something like

chattr -R +c zlib dir1	Compress dir1 and all its contents with zlib
chattr -R +c bzip dir2	Compress dir2 and all its contents with bzip
chattr +c lzo file1	Compress file1 with lzo
chattr -c file2		Uncompress file2
chattr +c none dir3	Uncompress dir3 but leave contents as is

If the user did something like 
mount -o compress,cscheme=zlib /dev/xyz /mntpoint
and then
chattr +c /mntpoint/dir
/mntpoint/dir would default to zlib as would anything else written to
the disk.
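
[Editorial aside: the inheritance rule sketched above (explicit per-file setting wins, otherwise the nearest ancestor directory's setting, otherwise the mount default) can be modelled in a few lines. Paths and scheme names here are hypothetical.]

```python
import posixpath

# Hypothetical per-path settings, as a chattr-like tool might record them.
SCHEMES = {
    "/mnt/dir1": "zlib",
    "/mnt/dir2": "bzip",
    "/mnt/file1": "lzo",
}
MOUNT_DEFAULT = "zlib"  # from: mount -o compress,cscheme=zlib


def effective_scheme(path: str) -> str:
    """Walk up the tree until an explicit setting is found, else the mount default."""
    while True:
        if path in SCHEMES:
            return SCHEMES[path]
        parent = posixpath.dirname(path)
        if parent == path:  # reached the filesystem root
            return MOUNT_DEFAULT
        path = parent


print(effective_scheme("/mnt/dir2/a/b"))  # bzip (inherited from dir2)
print(effective_scheme("/mnt/file1"))     # lzo (explicit)
print(effective_scheme("/mnt/other"))     # zlib (mount default)
```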

Lee

On Tue, Dec 16, 2008 at 12:19:13AM +0100, devzero@web.de wrote:
> > If multiple compression schemes are implemented how should the user go
> > about choosing which one they want? Should it be done at kernel time? Or
> > with the userland tools on a per file basis(maybe zlib is the default
> > but a user could say I want this directory to be bzip)?
> 
> yes, why not...
> 
> doing that at mounttime like
> 
> mount -o compress,cscheme=myzip /dev/xyz /mntpoint
> 
> would be a good start....
> 
> 
> 
> > -----Original message-----
> > From: "Lee Trager" <lt73@cs.drexel.edu>
> > Sent: 16.12.08 00:07:32
> > To: devzero@web.de
> > CC: linux-btrfs@vger.kernel.org
> > Subject: Re: Compressed Filesystem
> 
> 
> > If multiple compression schemes are implemented how should the user go
> > about choosing which one they want? Should it be done at kernel time? Or
> > with the userland tools on a per file basis(maybe zlib is the default
> > but a user could say I want this directory to be bzip)?
> > 
> > On Mon, Dec 15, 2008 at 11:14:01PM +0100, devzero@web.de wrote:
> > > fantastic feature!
> > > 
> > > i`m curious: can btrfs support more than one compression scheme at the same time, i.e. is compression "pluggable" ?
> > If you look at compression.c, compression.h, and ctree.h you can clearly
> > see that support for multiple compression scheme was in mind. Implmented
> > a new one shouldn't be to hard but you probably want to make the current
> > system a little bit more pluggable and move all the zlib stuff into
> > zlib.c.
> > > 
> > > lzo compression coming to my mind, as this is giving real-time compession and may even speed up disk access.
> > > 
> > > compression ratio isn`t too bad, but speed is awesome and doesn`t need as much cpu as gzip.
> > > 
> > In some tests I've run zlib is actually faster then nocompression
> > because of the lesser amount of data that has to transfer to and from
> > the disk. It would be instresting to see how bzip works with this to.
> > > experimental lzo compression in zfs-fuse showed that it could compress tarred kernel-source with 2.99x compressratio (where gzip gave 3.41x), so maybe lzo is a better algorithm for realtime filesystem compression...
> > > 
> > > regards
> > > roland
> > > 
> > > 
> > > 
> > > From: Chris Mason <chris.mason <at> oracle.com>
> > > Subject: Re: Compressed Filesystem
> > > Newsgroups: gmane.comp.file-systems.btrfs
> > > Date: 2008-10-29 20:08:42 GMT (6 weeks, 5 days, 1 hour and 53 minutes ago)
> > > 
> > > On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> > > > Hi, I have a few questions about this:
> > > > 
> > > > > Compression is optional and off by default (mount -o compress to enable
> > > > > it).  When enabled, every file is compressed.
> > > > 
> > > > Do you know what the CPU load is like with this enabled?
> > > 
> > > Now that I've finally pushed the code out, you can try it ;)  One part
> > > of the implementation I need to revisit is the place in the code where I
> > > do compression means that most of the time the single threaded pdflush
> > > is the one compressing.
> > > 
> > > This doesn't spread the load very well across the cpus.  It can be
> > > fixed, but I wanted to get the code out there.
> > > 
> > > The decompression does spread across cpus, and I've gotten about 800MB/s
> > > doing decompress and checksumming on a zero filled compressed file.  At
> > > the time, the disk was reading 14MB/s.
> > > 
> > > > 
> > > > Do you know whether data can be compressed at a sufficient rate to still
> > > > saturate the disk on recent-ish AMD/Intel CPUs?
> > > 
> > > My recentish intel cpu can compress and checksum at about 120MB/s.  
> > > > 
> > > > If no, is the effective pre-compression I/O rate still comparable to the
> > > > disk without compression?
> > > > 
> > > 
> > > It depends on your disks...
> > > 
> > > > I'm pretty sure that won't even matter in many cases (eg you're seeking
> > > > too much to care, or you're on a VM with lots of cores but congested
> > > > disks, or you're dealing with media files that it doesn't bother
> > > > compressing, etc), but I'm curious what sort of overhead this adds. :)
> > > > 
> > > > Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> > > > disk resources.
> > > 
> > > This varies quite a bit from workload to workload, in some places it'll
> > > make a big difference, but many workloads are seek bound and not
> > > bandwidth bound.
> > > 
> > > -chris
> > > 
> > > 
> > > ____________________________________________________________________
> > > Psssst! Already heard about the new WEB.DE MultiMessenger?
> > > It can do it all: http://www.produkte.web.de/messenger/?did=3123
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> 
> _______________________________________________________________________
> Sensational offer extended: WEB.DE FreeDSL - phone line + DSL
> for only 16.37 Euro/month!* http://dsl.web.de/?ac=OM.AD.AD008K15039B7069a
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-16 15:20 ` Lee Trager
@ 2008-12-16 15:26   ` Chris Mason
  2008-12-16 16:25     ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Mason @ 2008-12-16 15:26 UTC (permalink / raw)
  To: Lee Trager; +Cc: devzero, linux-btrfs

On Tue, 2008-12-16 at 10:20 -0500, Lee Trager wrote:
> While I agree that the command you sent should be possible, it wasn't
> exactly what I was thinking. Currently I am working on a way for the
> user to individually set which files/directories they want compressed or
> not. What I was saying is that assuming you are in a mounted btrfs
> directory you could do something like
> 
> chattr -R +c zlib dir1	Compress dir1 and all its contents with zlib
> chattr -R +c bzip dir2	Compress dir2 and all its contents with bzip
> chattr +c lzo file1	Compress file1 with lzo
> chattr -c file2		Uncompress file2
> chattr +c none dir3	Uncompress dir3 but leave contents as is
> 
> If the user did something like 
> mount -o compress,cscheme=zlib /dev/xyz /mntpoint
> and then
> chattr +c /mntpoint/dir
> /mntpoint/dir would default to zlib as would anything else written to
> the disk.
> 

This is one of those places where more options isn't always better.
Every option adds complexity to the filesystem and the testing matrix.  

I'd much rather have just one compression scheme per FS.  If people need
a specific compression scheme for a specific file, they can just
compress it in userland.

-chris



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-16 15:26   ` Chris Mason
@ 2008-12-16 16:25     ` Lee Trager
  2008-12-16 19:45       ` Roland
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Trager @ 2008-12-16 16:25 UTC (permalink / raw)
  To: Chris Mason; +Cc: Lee Trager, devzero, linux-btrfs

I agree that adding more options adds complexity, but it seems the same
amount of kernel-space work will have to be done either way. If we
support multiple compression schemes, the scheme used has to be stored
somewhere so we know what to use in the future. If we store it in the
super block, the user has to choose it at format time, when they may not
yet see the need for compression; or they may choose one scheme and
later want to change to something else. It doesn't make sense to have to
reformat your drive just to change compression schemes. That leaves
storing the compression scheme on each inode.  We already store whether
compression is used on a per-inode basis, so storing the type wouldn't
be a huge leap.

Lee
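[As a sketch of the per-inode lookup described above — all names here are
hypothetical, not btrfs's actual on-disk format: store a small
compression-type code on each inode, with an "unset" value meaning
inherit the filesystem-wide default chosen at mount time.]

```python
# Hypothetical type codes; btrfs's real on-disk format is not shown here.
COMPRESS_NONE, COMPRESS_ZLIB, COMPRESS_BZIP, COMPRESS_LZO = range(4)

FS_DEFAULT = COMPRESS_ZLIB  # e.g. from: mount -o compress,cscheme=zlib


class Inode:
    def __init__(self, compress_type=None):
        # None = inherit the filesystem default; otherwise an explicit code
        # set per file/directory (e.g. via a chattr-style interface).
        self.compress_type = compress_type


def effective_scheme(inode):
    """Resolve which codec to use when writing this inode's data."""
    if inode.compress_type is None:
        return FS_DEFAULT
    return inode.compress_type


assert effective_scheme(Inode()) == COMPRESS_ZLIB            # inherits default
assert effective_scheme(Inode(COMPRESS_LZO)) == COMPRESS_LZO # explicit wins
assert effective_scheme(Inode(COMPRESS_NONE)) == COMPRESS_NONE
```

[The point being that the mount option only supplies a default; an
explicit per-inode value, including "none", always takes precedence.]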
On Tue, Dec 16, 2008 at 10:26:10AM -0500, Chris Mason wrote:
> On Tue, 2008-12-16 at 10:20 -0500, Lee Trager wrote:
> > While I agree that the command you send should be possible it wasn't
> > exactly what I was thinking. Currently I am working on a way for the
> > user to individually set which files/directories they want compressed or
> > not. What I was saying is that assuming you are in a mounted btrfs
> > directory you could do something like
> > 
> > chattr -R +c zlib dir1	Compress dir1 and all its contents with zlib
> > chattr -R +c bzip dir2	Compress dir2 and all its contents with bzip
> > chattr +c lzo file1	Compress file1 with lzo
> > chattr -c file2		Uncompress file2
> > chattr +c none dir3	Uncompress dir3 but leave contents as is
> > 
> > If the user did something like 
> > mount -o compress,cscheme=zlib /dev/xyz /mntpoint
> > and then
> > chattr +c /mntpoint/dir
> > /mntpoint/dir would default to zlib as would anything else written to
> > the disk.
> > 
> 
> This is one of those places where more options isn't always better.
> Every option adds complexity to the filesystem and the testing matrix.  
> 
> I'd much rather have just one compression scheme per FS.  If people need
> a specific compression scheme for a specific file, they can just
> compress it in userland.
> 
> -chris
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
@ 2008-12-16 18:14 devzero
  0 siblings, 0 replies; 17+ messages in thread
From: devzero @ 2008-12-16 18:14 UTC (permalink / raw)
  To: Chris Mason, Lee Trager; +Cc: linux-btrfs

> I'd much rather have just one compression scheme per FS.  If people need
> a specific compression scheme for a specific file, they can just
> compress it in userland.

Yes, I also think one compression scheme per FS is absolutely sufficient.

> ----- Original Message -----
> From: "Chris Mason" <chris.mason@oracle.com>
> Sent: 16.12.08 16:26:28
> To: Lee Trager <lt73@cs.drexel.edu>
> CC: linux-btrfs@vger.kernel.org
> Subject: Re: Compressed Filesystem


> On Tue, 2008-12-16 at 10:20 -0500, Lee Trager wrote:
> > While I agree that the command you send should be possible it wasn't
> > exactly what I was thinking. Currently I am working on a way for the
> > user to individually set which files/directories they want compressed or
> > not. What I was saying is that assuming you are in a mounted btrfs
> > directory you could do something like
> > 
> > chattr -R +c zlib dir1	Compress dir1 and all its contents with zlib
> > chattr -R +c bzip dir2	Compress dir2 and all its contents with bzip
> > chattr +c lzo file1	Compress file1 with lzo
> > chattr -c file2		Uncompress file2
> > chattr +c none dir3	Uncompress dir3 but leave contents as is
> > 
> > If the user did something like
> > mount -o compress,cscheme=zlib /dev/xyz /mntpoint
> > and then
> > chattr +c /mntpoint/dir
> > /mntpoint/dir would default to zlib as would anything else written to
> > the disk.
> > 
> 
> This is one of those places where more options isn't always better.
> Every option adds complexity to the filesystem and the testing matrix.
> 
> I'd much rather have just one compression scheme per FS.  If people need
> a specific compression scheme for a specific file, they can just
> compress it in userland.
> 
> -chris



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-16 16:25     ` Lee Trager
@ 2008-12-16 19:45       ` Roland
  2008-12-18 15:55         ` Chris Mason
  0 siblings, 1 reply; 17+ messages in thread
From: Roland @ 2008-12-16 19:45 UTC (permalink / raw)
  To: Lee Trager, Chris Mason; +Cc: linux-btrfs

>I agree that adding more options will add more complexity but it seems
> the same amount of work in kernel space will have to be done

regarding lzo compression itself - it's already there (since July 2007);
the in-kernel lzo is equivalent to minilzo.
(http://www.oberhumer.com/opensource/lzo/)

regards
roland


----- Original Message ----- 
From: "Lee Trager" <lt73@cs.drexel.edu>
To: "Chris Mason" <chris.mason@oracle.com>
Cc: "Lee Trager" <lt73@cs.drexel.edu>; <devzero@web.de>; 
<linux-btrfs@vger.kernel.org>
Sent: Tuesday, December 16, 2008 5:25 PM
Subject: Re: Compressed Filesystem


>I agree that adding more options will add more complexity but it seems
> the same amount of work in kernel space will have to be done. If we
> support multiple compression schemes somewhere the compression scheme
> used will have to be stored so we know what to use in the future. If we
> store it on the super block the user will have to choose when they
> format at which point they may not see the need to use compression. Or
> they may choose one compression scheme and later want to change to
> something else. It doesn't make sense to have to reformat your drive
> just to change compression scheme. This leaves us with storing what the
> compression scheme is on each inode.  We currently store if compression
> is used on a per inode basis so storing the type wouldn't be a huge
> leap.
>
> Lee
> On Tue, Dec 16, 2008 at 10:26:10AM -0500, Chris Mason wrote:
>> On Tue, 2008-12-16 at 10:20 -0500, Lee Trager wrote:
>> > While I agree that the command you send should be possible it wasn't
>> > exactly what I was thinking. Currently I am working on a way for the
>> > user to individually set which files/directories they want compressed 
>> > or
>> > not. What I was saying is that assuming you are in a mounted btrfs
>> > directory you could do something like
>> >
>> > chattr -R +c zlib dir1 Compress dir1 and all its contents with zlib
>> > chattr -R +c bzip dir2 Compress dir2 and all its contents with bzip
>> > chattr +c lzo file1 Compress file1 with lzo
>> > chattr -c file2 Uncompress file2
>> > chattr +c none dir3 Uncompress dir3 but leave contents as is
>> >
>> > If the user did something like
>> > mount -o compress,cscheme=zlib /dev/xyz /mntpoint
>> > and then
>> > chattr +c /mntpoint/dir
>> > /mntpoint/dir would default to zlib as would anything else written to
>> > the disk.
>> >
>>
>> This is one of those places where more options isn't always better.
>> Every option adds complexity to the filesystem and the testing matrix.
>>
>> I'd much rather have just one compression scheme per FS.  If people need
>> a specific compression scheme for a specific file, they can just
>> compress it in userland.
>>
>> -chris
>>
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Compressed Filesystem
  2008-12-16 19:45       ` Roland
@ 2008-12-18 15:55         ` Chris Mason
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Mason @ 2008-12-18 15:55 UTC (permalink / raw)
  To: Roland; +Cc: Lee Trager, linux-btrfs

On Tue, 2008-12-16 at 20:45 +0100, Roland wrote:
> >I agree that adding more options will add more complexity but it seems
> > the same amount of work in kernel space will have to be done
> 
> regarding lzo compression itself - it`s already there(since july 2007).
> the in-kernel lzo is equivalent to minilzo. 
> (http://www.oberhumer.com/opensource/lzo/)

The compression code initially used the in-kernel LZO modules.  Even
though the zlib API is clunky and strange, it is actually a better fit
for the multi-page compression that btrfs needs to do.  So adding LZO
support would require some work to compress over multiple pages at a
time.

-chris
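[The multi-page point above can be illustrated with zlib's streaming
interface — shown here through Python's stdlib zlib binding purely as a
sketch (the in-kernel zlib API differs in detail): a zlib stream can
consume input one page at a time, while a minilzo-style one-shot call
wants the whole input in a single contiguous buffer.]

```python
import zlib

PAGE_SIZE = 4096


def compress_pages(pages):
    """Feed fixed-size pages into one zlib stream, one page at a time."""
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
    out = bytearray()
    for page in pages:
        # Each call may emit output or buffer internally; no contiguous
        # full-input buffer is ever required.
        out += c.compress(page)
    out += c.flush()  # Z_FINISH: emit whatever remains in the stream
    return bytes(out)


data = b"btrfs " * 10000
pages = [data[i:i + PAGE_SIZE] for i in range(0, len(data), PAGE_SIZE)]
streamed = compress_pages(pages)

# Streaming page-by-page yields a valid zlib stream over the whole input.
assert zlib.decompress(streamed) == data
assert len(streamed) < len(data)
```

[A one-shot API like minilzo's would instead need the pages copied into
one buffer first, which is the extra work referred to above.]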



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2008-12-18 15:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 14:54 Compressed Filesystem Lee Trager
2008-10-28 15:47 ` Chris Mason
2008-10-28 16:33   ` Lee Trager
2008-10-28 17:38     ` Chris Mason
2008-10-28 17:40       ` Zach Brown
2008-10-28 17:46         ` Chris Mason
     [not found]       ` <53696.2001:470:e828:1::2:2.1225304096.squirrel@avalon.arbitraryconstant.com>
2008-10-29 20:08         ` Chris Mason
2008-11-04  0:08           ` Chris Samuel
  -- strict thread matches above, loose matches on Subject: below --
2008-12-15 22:14 devzero
2008-12-15 23:07 ` Lee Trager
2008-12-15 23:19 devzero
2008-12-16 15:20 ` Lee Trager
2008-12-16 15:26   ` Chris Mason
2008-12-16 16:25     ` Lee Trager
2008-12-16 19:45       ` Roland
2008-12-18 15:55         ` Chris Mason
2008-12-16 18:14 devzero
