public inbox for linux-btrfs@vger.kernel.org
* Compressed Filesystem
@ 2008-10-27 14:54 Lee Trager
  2008-10-28 15:47 ` Chris Mason
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Trager @ 2008-10-27 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: balajirrao, miguel.filipe

I have read on the mailing list that there has been some interest in
implementing transparent compression on btrfs and I too am thinking about
trying to implement it. Before I start from scratch I am wondering if
anyone else has started to work on this and if so how far along have
they gotten? I would be happy to work on this alone or with someone else
but currently I am doing some preliminary research.

Thanks,

Lee

* Re: Compressed Filesystem
@ 2008-12-15 22:14 devzero
  2008-12-15 23:07 ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: devzero @ 2008-12-15 22:14 UTC (permalink / raw)
  To: linux-btrfs

fantastic feature!

i'm curious: can btrfs support more than one compression scheme at the
same time, i.e. is compression "pluggable"?

lzo compression comes to mind, as it gives real-time compression and
may even speed up disk access.

compression ratio isn't too bad, but speed is awesome and it doesn't need
as much cpu as gzip.

experimental lzo compression in zfs-fuse showed that it could compress
tarred kernel-source with a 2.99x compression ratio (where gzip gave 3.41x),
so maybe lzo is a better algorithm for realtime filesystem compression...

regards
roland



From: Chris Mason <chris.mason <at> oracle.com>
Subject: Re: Compressed Filesystem
Newsgroups: gmane.comp.file-systems.btrfs
Date: 2008-10-29 20:08:42 GMT (6 weeks, 5 days, 1 hour and 53 minutes ago)

On Wed, 2008-10-29 at 12:14 -0600, Anthony Roberts wrote:
> Hi, I have a few questions about this:
>
> > Compression is optional and off by default (mount -o compress to enable
> > it).  When enabled, every file is compressed.
>
> Do you know what the CPU load is like with this enabled?

Now that I've finally pushed the code out, you can try it ;)  One part
of the implementation I need to revisit: the place in the code where I
do compression means that most of the time the single-threaded pdflush
is the one compressing.

This doesn't spread the load very well across the cpus.  It can be
fixed, but I wanted to get the code out there.

The decompression does spread across cpus, and I've gotten about 800MB/s
doing decompress and checksumming on a zero filled compressed file.  At
the time, the disk was reading 14MB/s.

>
> Do you know whether data can be compressed at a sufficient rate to still
> saturate the disk on recent-ish AMD/Intel CPUs?

My recentish intel cpu can compress and checksum at about 120MB/s.

>
> If no, is the effective pre-compression I/O rate still comparable to the
> disk without compression?
>

It depends on your disks...

> I'm pretty sure that won't even matter in many cases (eg you're seeking
> too much to care, or you're on a VM with lots of cores but congested
> disks, or you're dealing with media files that it doesn't bother
> compressing, etc), but I'm curious what sort of overhead this adds. :)
>
> Mostly it seems like a good tradeoff, it trades plentiful cores for scarce
> disk resources.

This varies quite a bit from workload to workload, in some places it'll
make a big difference, but many workloads are seek bound and not
bandwidth bound.

-chris
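
The trade-off discussed in this message can be put into a rough back-of-the-envelope model. The sketch below is not taken from the btrfs code; it assumes a purely sequential, bandwidth-bound write stream in which compression and disk I/O overlap perfectly, and the function names are made up for illustration. The 120MB/s compression rate and the 3.41x ratio are figures quoted in this thread (the ratio comes from the zfs-fuse gzip test, not from btrfs); the 60MB/s and 200MB/s disk rates are hypothetical.

```python
def effective_write_rate(cpu_mb_s, disk_mb_s, ratio):
    """Logical (uncompressed) MB/s an application could see on writes.

    cpu_mb_s  -- uncompressed MB/s the CPU can compress and checksum
    disk_mb_s -- MB/s the disk can write sequentially
    ratio     -- uncompressed size / compressed size (1.0 = incompressible)

    Assumption: compression and disk I/O are fully pipelined, so the
    slower of the two stages sets the pace. Seeks are not modelled.
    """
    if min(cpu_mb_s, disk_mb_s, ratio) <= 0:
        raise ValueError("rates and ratio must be positive")
    # The disk only has to absorb 1/ratio of the logical bytes, so in
    # logical terms it keeps up with disk_mb_s * ratio.
    return min(cpu_mb_s, disk_mb_s * ratio)


def compression_helps(cpu_mb_s, disk_mb_s, ratio):
    """True when the compressed path beats writing the raw data."""
    return effective_write_rate(cpu_mb_s, disk_mb_s, ratio) > disk_mb_s


if __name__ == "__main__":
    # Hypothetical 60MB/s disk: the CPU becomes the limit, still a win.
    print(effective_write_rate(120, 60, 3.41), compression_helps(120, 60, 3.41))
    # Hypothetical 200MB/s disk: the 120MB/s compressor is now the bottleneck.
    print(effective_write_rate(120, 200, 3.41), compression_helps(120, 200, 3.41))
```

Under these assumptions compression pays off whenever the disk is slower than the compressor and the data actually shrinks. A seek-bound workload falls outside the model entirely, which matches the caveat above that many workloads are seek bound rather than bandwidth bound.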



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Compressed Filesystem
@ 2008-12-15 23:19 devzero
  2008-12-16 15:20 ` Lee Trager
  0 siblings, 1 reply; 17+ messages in thread
From: devzero @ 2008-12-15 23:19 UTC (permalink / raw)
  To: Lee Trager; +Cc: linux-btrfs

> If multiple compression schemes are implemented how should the user go
> about choosing which one they want? Should it be done at kernel time? Or
> with the userland tools on a per file basis (maybe zlib is the default
> but a user could say I want this directory to be bzip)?

yes, why not...

doing that at mounttime like

mount -o compress,cscheme=myzip /dev/xyz /mntpoint

would be a good start....
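
A minimal sketch of how such a mount-time choice might be parsed. The `cscheme` option exists only as a proposal in this message; the set of scheme names and the zlib default are assumptions taken from names mentioned in the thread, and the helper functions are hypothetical.

```python
# Scheme names that come up in this thread; the set itself is an assumption.
KNOWN_SCHEMES = {"zlib", "lzo", "bzip"}
DEFAULT_SCHEME = "zlib"  # the thread suggests zlib as the likely default


def parse_mount_options(option_string):
    """Split 'a,b=c' style mount options into a dict.

    Flags without a value map to True, key=value pairs map to the value.
    """
    opts = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


def select_scheme(option_string):
    """Return the compression scheme for a mount, or None if disabled."""
    opts = parse_mount_options(option_string)
    if "compress" not in opts:
        if "cscheme" in opts:
            raise ValueError("cscheme given without compress")
        return None
    scheme = opts.get("cscheme", DEFAULT_SCHEME)
    if scheme not in KNOWN_SCHEMES:
        raise ValueError("unknown compression scheme: %r" % (scheme,))
    return scheme


if __name__ == "__main__":
    print(select_scheme("compress,cscheme=lzo"))
    print(select_scheme("compress"))
    print(select_scheme("noatime"))
```

Rejecting unknown names at mount time keeps a misspelled scheme from silently falling back to the default.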



> -----Original Message-----
> From: "Lee Trager" <lt73@cs.drexel.edu>
> Sent: 16.12.08 00:07:32
> To: devzero@web.de
> CC: linux-btrfs@vger.kernel.org
> Subject: Re: Compressed Filesystem


> If multiple compression schemes are implemented how should the user go
> about choosing which one they want? Should it be done at kernel time? Or
> with the userland tools on a per file basis (maybe zlib is the default
> but a user could say I want this directory to be bzip)?
>
> On Mon, Dec 15, 2008 at 11:14:01PM +0100, devzero@web.de wrote:
> > fantastic feature!
> >
> > i'm curious: can btrfs support more than one compression scheme at the same time, i.e. is compression "pluggable"?
> If you look at compression.c, compression.h, and ctree.h you can clearly
> see that support for multiple compression schemes was in mind. Implementing
> a new one shouldn't be too hard but you probably want to make the current
> system a little bit more pluggable and move all the zlib stuff into
> zlib.c.
> >
> > lzo compression comes to mind, as it gives real-time compression and may even speed up disk access.
> >
> > compression ratio isn't too bad, but speed is awesome and it doesn't need as much cpu as gzip.
> >
> In some tests I've run zlib is actually faster than no compression
> because of the smaller amount of data that has to transfer to and from
> the disk. It would be interesting to see how bzip works with this too.
> > experimental lzo compression in zfs-fuse showed that it could compress tarred kernel-source with a 2.99x compression ratio (where gzip gave 3.41x), so maybe lzo is a better algorithm for realtime filesystem compression...
> >
> > regards
> > roland

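Lee's remark about making the system "a little bit more pluggable" can be illustrated with a small registry sketch. This is a userland analogy, not the layout of compression.c or ctree.h: the one-byte tag stored in front of each blob, the tag values, and every name below are assumptions made for the example. LZO is left out because the Python standard library does not ship it.

```python
import bz2
import zlib
from typing import Callable, Dict, NamedTuple


class Scheme(NamedTuple):
    tag: int                              # hypothetical per-blob identifier
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


_BY_NAME: Dict[str, Scheme] = {}
_BY_TAG: Dict[int, Scheme] = {}


def register(name: str, scheme: Scheme) -> None:
    """Add a scheme; adding one never touches the existing ones."""
    if name in _BY_NAME or scheme.tag in _BY_TAG:
        raise ValueError("scheme name or tag already registered")
    _BY_NAME[name] = scheme
    _BY_TAG[scheme.tag] = scheme


register("none", Scheme(0, lambda data: data, lambda data: data))
register("zlib", Scheme(1, zlib.compress, zlib.decompress))
register("bzip", Scheme(2, bz2.compress, bz2.decompress))


def pack(name: str, data: bytes) -> bytes:
    """Compress data and prefix it with the tag of the scheme used."""
    scheme = _BY_NAME[name]
    return bytes([scheme.tag]) + scheme.compress(data)


def unpack(blob: bytes) -> bytes:
    """Read the tag, then dispatch to the matching decompressor."""
    if not blob:
        raise ValueError("empty blob")
    return _BY_TAG[blob[0]].decompress(blob[1:])


if __name__ == "__main__":
    sample = b"btrfs " * 1000
    for name in ("none", "zlib", "bzip"):
        packed = pack(name, sample)
        assert unpack(packed) == sample
        print(name, len(sample), "->", len(packed))
```

Because each blob records which scheme produced it, data written under one scheme stays readable after the default changes, which is the property a "pluggable" design needs.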

* Re: Compressed Filesystem
@ 2008-12-16 18:14 devzero
  0 siblings, 0 replies; 17+ messages in thread
From: devzero @ 2008-12-16 18:14 UTC (permalink / raw)
  To: Chris Mason, Lee Trager; +Cc: linux-btrfs

> I'd much rather have just one compression scheme per FS.  If people need
> a specific compression scheme for a specific file, they can just
> compress it in userland.

yes, i also think one compression scheme per FS is absolutely sufficient.

> -----Original Message-----
> From: "Chris Mason" <chris.mason@oracle.com>
> Sent: 16.12.08 16:26:28
> To: Lee Trager <lt73@cs.drexel.edu>
> CC: linux-btrfs@vger.kernel.org
> Subject: Re: Compressed Filesystem


> On Tue, 2008-12-16 at 10:20 -0500, Lee Trager wrote:
> > While I agree that the command you sent should be possible it wasn't
> > exactly what I was thinking. Currently I am working on a way for the
> > user to individually set which files/directories they want compressed or
> > not. What I was saying is that assuming you are in a mounted btrfs
> > directory you could do something like
> >
> > chattr -R +c zlib dir1    Compress dir1 and all its contents with zlib
> > chattr -R +c bzip dir2    Compress dir2 and all its contents with bzip
> > chattr +c lzo file1       Compress file1 with lzo
> > chattr -c file2           Uncompress file2
> > chattr +c none dir3       Uncompress dir3 but leave contents as is
> >
> > If the user did something like
> > mount -o compress,cscheme=zlib /dev/xyz /mntpoint
> > and then
> > chattr +c /mntpoint/dir
> > /mntpoint/dir would default to zlib as would anything else written to
> > the disk.
> >
>
> This is one of those places where more options isn't always better.
> Every option adds complexity to the filesystem and the testing matrix.
>
> I'd much rather have just one compression scheme per FS.  If people need
> a specific compression scheme for a specific file, they can just
> compress it in userland.
>
> -chris

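The precedence rules in Lee's proposal (which Chris argues against in the same quote) can be modelled roughly as follows. Nothing here reflects an actual chattr or btrfs interface: the attribute table, the empty string standing for a plain `chattr +c`, the zlib fallback when the mount names no scheme, and both function names are assumptions made for illustration.

```python
import posixpath

FALLBACK_SCHEME = "zlib"  # assumption: used when +c is set but the mount names no scheme


def chattr_recursive(attrs, root, all_paths, scheme):
    """Model `chattr -R +c <scheme> root`: tag root and everything below it."""
    root = posixpath.normpath(root)
    attrs[root] = scheme
    for path in all_paths:
        path = posixpath.normpath(path)
        if path.startswith(root + "/"):
            attrs[path] = scheme


def effective_scheme(path, attrs, mount_scheme=None):
    """Return the scheme that would apply to one path.

    attrs maps a path to an explicit scheme name; "" models a plain
    `chattr +c` with no scheme, "none" models explicitly uncompressed.
    mount_scheme models `mount -o compress,cscheme=...` (None if absent).
    """
    path = posixpath.normpath(path)
    if path in attrs:
        explicit = attrs[path]
        if explicit == "":
            # Plain +c: defer to the mount-wide choice, as in Lee's example.
            return mount_scheme or FALLBACK_SCHEME
        return explicit
    # No per-file attribute: whatever the mount asked for, if anything.
    return mount_scheme or "none"


if __name__ == "__main__":
    attrs = {}
    files = ["/mnt/dir1/a", "/mnt/dir1/sub/b", "/mnt/dir2/c"]
    chattr_recursive(attrs, "/mnt/dir1", files, "zlib")
    attrs["/mnt/file1"] = "lzo"
    attrs["/mnt/dir"] = ""
    for p in files + ["/mnt/file1", "/mnt/dir"]:
        print(p, effective_scheme(p, attrs, mount_scheme="zlib"))
```

Even this toy version needs three sources of truth (per-file attribute, mount option, built-in fallback), which gives a sense of the testing-matrix growth Chris is worried about.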


end of thread, other threads:[~2008-12-18 15:55 UTC]

Thread overview: 17+ messages
2008-10-27 14:54 Compressed Filesystem Lee Trager
2008-10-28 15:47 ` Chris Mason
2008-10-28 16:33   ` Lee Trager
2008-10-28 17:38     ` Chris Mason
2008-10-28 17:40       ` Zach Brown
2008-10-28 17:46         ` Chris Mason
     [not found]       ` <53696.2001:470:e828:1::2:2.1225304096.squirrel@avalon.arbitraryconstant.com>
2008-10-29 20:08         ` Chris Mason
2008-11-04  0:08           ` Chris Samuel
  -- strict thread matches above, loose matches on Subject: below --
2008-12-15 22:14 devzero
2008-12-15 23:07 ` Lee Trager
2008-12-15 23:19 devzero
2008-12-16 15:20 ` Lee Trager
2008-12-16 15:26   ` Chris Mason
2008-12-16 16:25     ` Lee Trager
2008-12-16 19:45       ` Roland
2008-12-18 15:55         ` Chris Mason
2008-12-16 18:14 devzero
