public inbox for linux-btrfs@vger.kernel.org
* Transparent compression for Btrfs
@ 2008-09-01  2:09 Balaji Rao
  2008-09-01  2:49 ` Eric Anopolsky
  2008-09-01 13:52 ` Chris Mason
  0 siblings, 2 replies; 3+ messages in thread
From: Balaji Rao @ 2008-09-01  2:09 UTC (permalink / raw)
  To: linux-btrfs

Hi,

For a medium-term project, I'm thinking of working on transparent compression 
for Btrfs. Please share any hints and comments on how we should go about 
this, the features we would want, and common pitfalls to avoid.

Is looking at how it's done in Reiser4 a good idea? Could the compression 
algorithm be made configurable on a per-file basis, maybe using an xattr? 
That would, for example, let us trade speed against compression ratio.

Any other ideas welcome.

-- 
Thanks,
Balaji Rao


* Re: Transparent compression for Btrfs
  2008-09-01  2:09 Transparent compression for Btrfs Balaji Rao
@ 2008-09-01  2:49 ` Eric Anopolsky
  2008-09-01 13:52 ` Chris Mason
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Anopolsky @ 2008-09-01  2:49 UTC (permalink / raw)
  To: Balaji Rao; +Cc: linux-btrfs

On Mon, 2008-09-01 at 07:39 +0530, Balaji Rao wrote:
> Hi,
> 
> For a medium-term project, I'm thinking of working on transparent compression 
> for Btrfs. Please share any hints and comments on how we should go about 
> this, the features we would want, and common pitfalls to avoid.
> 
> Is looking at how it's done in Reiser4 a good idea? Could the compression 
> algorithm be made configurable on a per-file basis, maybe using an xattr? 
> That would, for example, let us trade speed against compression ratio.
> 
> Any other ideas welcome.

If the algorithm is tunable on a per-file basis, why not make the
compression level tunable on a per-file basis too? I think it's also
important to consider the tradeoff between the learning
curve/administrative overhead and the granularity of control over
compression, but I don't have any good answers.

While we're on the subject, someone on the ZFS list expressed a need to
tune redundancy on a per-file basis (or even tune redundancy at all
after the filesystem is created). Currently ZFS is very inflexible in
this regard, so it could be a way for btrfs to get ahead.

Cheers,
Eric



* Re: Transparent compression for Btrfs
  2008-09-01  2:09 Transparent compression for Btrfs Balaji Rao
  2008-09-01  2:49 ` Eric Anopolsky
@ 2008-09-01 13:52 ` Chris Mason
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Mason @ 2008-09-01 13:52 UTC (permalink / raw)
  To: Balaji Rao; +Cc: linux-btrfs

On Mon, 2008-09-01 at 07:39 +0530, Balaji Rao wrote:
> Hi,
> 
> For a medium-term project, I'm thinking of working on transparent compression 
> for Btrfs. Please share any hints and comments on how we should go about 
> this, the features we would want, and common pitfalls to avoid.
> 
> Is looking at how it's done in Reiser4 a good idea? Could the compression 
> algorithm be made configurable on a per-file basis, maybe using an xattr? 
> That would, for example, let us trade speed against compression ratio.
> 
> Any other ideas welcome.
> 

Actually coding up the compression is fairly simple; the hard part is
making it work with the rest of the writeback code.  The good news is
that the data=ordered mode from v0.16 should make this easier.  There are
a few rules for how writeback works in general:

1) COW rules mean that we only write a block once.  This helps quite a
lot with compression.

2) Delayed allocation and data=ordered pair up, and once a given extent
is recorded in the data=ordered tree, those pages in the file will not
be modified again until the ordered IO is complete.

So, the very natural place to do compression is while
inode.c:cow_file_range is doing btrfs_add_ordered_extent.  You'll
actually want to compress it just before btrfs_reserve_extent is called
so that you know how much space should be reserved on disk.

The basic formula would be:

* allocate pages to hold the compressed result
* compress the data and copy it into those pages
* allocate extents to hold the compressed copy on disk
* checksum the compressed copy
* write the compressed copy to disk.

I would suggest that you put the compressed data into an address space
dedicated just to compressed data blocks.  This gives you something to
allocate pages against, but more importantly it gives you a place to
cache the compressed blocks.

In terms of customizing things, I would like the compression algorithm
to be selectable and stored in a field in the inode.  We can make
interfaces for changing the field later.

-chris



