Re: Can compression at filesystem level improve overall performance?

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Hans Reiser <reiser@namesys.com>
To: Scott Young <youngs1@sunyit.edu>
Cc: reiserfs-list@namesys.com
Subject: Re: Can compression at filesystem level improve	overall	performance?
Date: Tue, 23 Mar 2004 13:59:39 +0300	[thread overview]
Message-ID: <4060189B.2000500@namesys.com> (raw)
In-Reply-To: <1080011016.4658.408.camel@localhost.localdomain>

Scott Young wrote:

>On Mon, 2004-03-22 at 15:04, Hans Reiser wrote:
>  
>
>>Scott Young wrote:
>>
>>    
>>
>>>>That's common misconception. :)
>>>>
>>>>The goal of compression is to conserve disk bandwidth rather than space.
>>>>
>>>>By compressing it is possible to transfer data (== uncompressed data
>>>>user works with), at a rate higher than raw device bandwidth.
>>>>   
>>>>
>>>>        
>>>>
>>>I will be doing some research on an algorithm that speeds up data
>>>transfers over a network by adaptively selecting a compression
>>>algorithm.  It can be applied to filesystem reads and writes too.  When
>>>the send queue is reasonably full on the server, it starts compressing
>>>data at the tail of the queue while sending the data at the head of the
>>>queue.  If the output stream catches up to segment currently being
>>>compressed, then that segment is sent uncompressed.  If the compressed
>>>data is not significantly smaller, then the uncompressed data is sent
>>>instead.  For network applications that are not network interface bound
>>>(like rsync over a 100mbit connection), the buffer will be empty most of
>>>the time and therefore little compression would be needed or wanted as
>>>it would only slow the application down.  Compression is chosen from a
>>>pool of algorithms and varied depending on the history of buffer
>>>overflows and under-runs.  Slower, better compression algorithms are
>>>used when the buffer is mostly full and the compression is observably
>>>effective.  The idea here is to minimize the time between the client
>>>requesting the data and having the usable data in a minimal amount of
>>>time.  This can be seen as a time-verses-amount-of-usable-data-on-client
>>>graph, and some applications prefer a low latency for the initial stream
>>>of data (such as a web page) whereas some prefer the time to retrieve a
>>>very large piece of data (such as scp scott@1.2.3.4/SomeBigDocument.sxw
>>>/home/scott over a 56k modem).
>>>
>>>Adapting this to filesystem concepts, the server can be seen as the
>>>write process and the client can be seen as the read process. 
>>>
>>>      
>>>
>>I don't understand.  Why not view the client as the disk drive and the 
>>bus as the network?
>>    
>>
>
>Interesting view.  The "server," as I see it, is the source of the data,
>and the client is the ultimate destination.  In the context of a
>filesystem, one could see the disk as the ultimate destination. 
>However, data on a disk has no use at all until it is read and used by
>some application.  Therefore, I see writing to the filesystem as a
>server and, reading from the filesystem as a client, and the bus and
>disk drive together as the network (at least when translating the
>concept from a network context to a filesystem context).
>
>  
>
>>>The idea
>>>can be applied to Reiser4 by compressing the overwrite set while the
>>>journal data is being written, and then compressing the tail of the
>>>relocate set moving backwards until the write stream catches up to the
>>>compression.  It could also take into account the estimated
>>>decompression time when reading the data back, and use it for deciding
>>>whether the compression ratio is good enough to write the compressed
>>>data instead of the uncompressed data.
>>> 
>>>
>>>      
>>>
>>I didn't understand the above.
>>    
>>
>
>What I'm saying is that you can start writing uncompressed data to the
>drive while the yet-to-be-written data is being compressed.  The goal is
>to have some segments of data compressed, and have them compressed
>before they come up next for writing to the drive.  Virtually no time is
>lost if the data cannot be compressed because the data will be sent to
>the disk at full speed anyway, whether or not the system had time to
>compress it.  The repacker could compress the data that was written to
>the drive before it could be compressed if the repacker thinks that
>compression would speed up reading of that data (or significantly reduce
>space usage, which will generally happen at the same time as data moving
>faster because of compression).
>  
>
Too much complexity.  LZW is fast and effective.

>
>  
>
>>>Another interesting twist would be to cache the compressed data if the
>>>same data is going to be sent from the server several times.  This
>>>reduces CPU overhead on the server (and possibly it's memory
>>>requirements for caching the data, and reduces the amount of data that
>>>needs to be read from the drive), but it is complicated in the context
>>>of a network algorithm and is mostly application-dependent.  This is
>>>research for another day, maybe in the form of a derived-data plugin for
>>>ReiserFS where an application tells the filesystem how to construct the
>>>file, and the filesystem can store the original, the result, or both,
>>>depending on space needs and performance analysis, with copy-on-write
>>>metadata flags when appropriate.
>>> 
>>>
>>>      
>>>
>>I didn't understand the above.
>>    
>>
>
>Yeah, I realize now what I wrote was an incoherent mess.  I think too
>much and write too little sometimes :-)  I was combining the idea for
>networking with another different idea for ReiserFS, and that
>combination came out unclear.
>
>As an example, take a web server that has HTTP compression enabled. 
>Instead of compressing the data once per request, simply store the
>compressed version and send that when the next request occurs.  The
>server isn't compressing data all the time, so the CPU can be used for
>other tasks.  To make life easier on the web-server developers, the
>filesystem could have an interface that allows for defining a file as a
>derivative of another with some transformation done to that file, such
>as a compression algorithm.  It'd be like telling the filesystem that
>"File B is what happens when you do X to file A."  The web server
>developer would not have to worry about storing the compressed file to
>increase performance because it would be handled by the filesystem. 
>Furthermore, the web server developer would not have to use the
>compression libraries directly, making their job much easier.
>
>There could be a "live" derivative, where any change to the original
>file reflects in the derivative file, or they could be constant, where
>any change in the original does not reflect in the derivative.  When I
>say "constant" I mean that the source file can be thought of as being
>constant because any change to the original will not be reflected in the
>derivative.  Think of the simplest case: a derivative file with the
>identity transform applied to the original file, or simply stated,
>copying a file.  A "live" derivative with the identity transform would
>act like a hard link, and a constant derivative with the identity
>transform would act like copying the entire file and storing it
>separately (Underneath the covers, the transform could just link the
>derived file to the original with copy-on-write enabled, thereby
>improving performance).  This becomes complicated depending on what
>files are actually stored and what is the fastest way to arrange
>things.  The derivative could simply be calculated from the original,
>and space can be saved by not storing the derivative on disk, or CPU
>time could be saved by storing it on disk (e.g. an MD5 derived-file). 
>Sometimes the derivative may load faster if it is calculated from the
>original (e.g. if /dev/urandom is implemented as a derived stream from
>some seed file.  In this case the file size could be infinite too, so
>storing it would not be prudent).  If the derivative file is not stored
>on the disk, and it is marked as a constant derivative, then the
>original file would have to use copy-on-write so that the constant
>derivative file isn't altered.
>
>Your idea of /etc/.passwd as the combination of small per-user password
>files could be implemented using derived files.  They would need some
>way of moving the writes to the combined file back to the original small
>files, but that could be a part of the derived-data plugin for combined
>files.
>
>As another example, gcc could create a "live" derived file a.out that
>that means "a.out is the live derivative of files src1.c, src2.c,
>src3.c, src4.c when applied using the command gcc -o a.out src1.c src2.c
>src3.c src4.c".  Of course it would be encoded in the syntax that the
>derived-file plugin would understand.  When the source files change, the
>developer wouldn't have to rerun gcc to run the executable.  The file
>would automagically be compiled from the sources.
>
>The more I think about it, it seems that derived-files could be
>implemented as plugin(s) with only one extra feature added to the core
>filesystem: copy-on-write.  (Copy-on-write could be extended to only
>write changes, and have the new version be constructed as a derived file
>from the original, thereby compactly maintaining multiple versions of a
>file and improving write performance, but I digress)
>  
>
Consider our use of compression atoms, it may reduce the incentive for 
what you describe.

>  
>
>>>I haven't started coding the adaptive compression algorithm yet, but I
>>>have a general idea about how I am going to implement it.  For the
>>>proof-of-concept, I want to write this using sockets and some basic
>>>library compression algorithms (gzip, bzip2, and maybe a simple MTF +
>>>Adaptive Huffman).  Later variants may work with TCP or other protocols
>>>around that layer.  Any suggestions will be appreciated.
>>> 
>>>
>>>      
>>>
>>I think we need to use adaptive compression in Reiser4, based on the 
>>type of file being compressed,  
>>    
>>
>
>I think it should be based on how well compression is working during the
>actual write.  That way you don't have the overhead of compression when
>there is no benefit of compression, and CPU-bound applications will not
>run slower because of CPU-intensive compression.  The type of file could
>be used, but I'd like to see it as only a suggestion to the adaptive
>compression heuristic.
>
>  
>
>>and anyone who finds it interesting to 
>>develop heuristics for selecting compression strategies is welcome to 
>>help and join the fun.
>>    
>>
>
>And it shall be fun!
>
>I have many other ideas relating to compression.  In high school I
>designed a filesystem that used the ideas in rsync to find identical
>segments of data between files, and only write the matching segments
>once.  It could barely be called a filesystem (It was written as a
>filesystem-within-a-file using java and I didn't implement deletion or
>mounting), but it gave me many ideas and directions to go for making the
>general idea work in a real filesystem.  One thing I realized is that an
>array of millions or billions of random hashes can be compressed, and a
>Trie is a good basis for implementing a compressed data structure for
>random hashes (I could further describe the complete data structure I
>designed, but that is probably worth writing a coherent peer-reviewed
>paper instead of quickly writing an incoherent email on the topic). 
>  
>
I think ATT and MS have done work in this area.  It is interesting 
topic, you need to (deeply) worry about adding additional seeks though.

>Furthermore, I realized that my filesystem algorithm would significantly
>slow down the system, and only be applicable in specific circumstances
>(such as a when there is a dedicated file server that has plenty of CPU
>and RAM to give up).  This leads to the network algorithm for
>compressing when compressing is good:  it embodies a general heuristic 
>for algorithms that have parts that don't have to be done but can have
>positive effects they are done.  For example, compression does not have
>to be done to send data across a network, and an rsync-like signature
>search does not need to be done to write data to a filesystem, but using
>them will lead to a performance improvement in some circumstances.
>
>If/when I implement these ideas, I will contribute them to ReiserFS. 
>I'd have more time to work on this if I didn't have college and a
>part-time job, but there's no rush to implement all these features.
>
>
>
>
>
>
>
>
>
>  
>


-- 
Hans

next prev parent reply	other threads:[~2004-03-23 10:59 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-03-19 14:25 Can compression at filesystem level improve overall performance? Erik Terpstra
2004-03-19 16:29 ` Redeeman
2004-03-19 16:53   ` Nikita Danilov
2004-03-21 14:29     ` Sean Johnson
2004-03-21 23:17       ` Can compression at filesystem level improve overall The Amazing Dragon
2004-03-21 23:23         ` Sean Johnson
2004-03-22  9:14         ` Hans Reiser
2004-03-22  8:01     ` Can compression at filesystem level improve overall performance? Kris Van Bruwaene
2004-03-22 18:00     ` Scott Young
2004-03-22 20:04       ` Hans Reiser
2004-03-23  3:03         ` Scott Young
2004-03-23 10:59           ` Hans Reiser [this message]
2004-03-24 16:19             ` Scott Young
2004-03-29  5:25               ` Tom Vier
2004-03-29  5:16           ` Tom Vier
2004-03-30  3:34             ` Scott Young
2004-03-30  4:53               ` Tom Vier
2004-03-31  4:51                 ` Scott Young
2004-04-08 21:46                   ` Tom Vier
2004-04-08 11:47                 ` Stewart Smith
2004-03-19 18:59 ` Hans Reiser
2004-03-23  0:17 ` Miguel
     [not found] <no.id>
2004-03-24  0:08 ` The Amazing Dragon
2004-03-24  0:12   ` Alan Horn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4060189B.2000500@namesys.com \
    --to=reiser@namesys.com \
    --cc=reiserfs-list@namesys.com \
    --cc=youngs1@sunyit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.