From: Richard Weinberger <richard@nod.at>
To: Tom Marshall <tom@cyngn.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC] Per-file compression
Date: Sat, 18 Apr 2015 17:07:16 +0200
Message-ID: <55327324.3030400@nod.at>
In-Reply-To: <20150418145828.GA13809@eden.sea.cyngn.com>

Hi!

On 18.04.2015 at 16:58, Tom Marshall wrote:
> On Sat, Apr 18, 2015 at 01:41:09PM +0200, Richard Weinberger wrote:
>> On Sat, Apr 18, 2015 at 12:20 AM, Tom Marshall <tom@cyngn.com> wrote:
>>> So, I wrote a thing called 'zfile' that hooks into the VFS layer and
>>> intercepts file_operations to do file (de)compression on the fly. When a
>>> file is opened, it reads and decompresses the data into memory.  The file
>>> may be read, written, and mmaped in the usual way.  If the contents are
>>> changed, the data is compressed and written back.  A working patch for an
>>> older kernel version may be found at:
>>> http://review.cyanogenmod.org/#/c/95220/
>>
>> So, I've extracted the patch from that website and gave a quick review.
>>
>> I'm pretty sure VFS folks will hate the VFS layering you do.
> 
> This, I'm afraid, is the biggest obstacle to such a solution.  I know that
> OverlayFS has been merged, so filesystem stacking is acceptable.  Perhaps
> there would be a way to design a filesystem that stacks compression?

That's why I said think of adding compression support to ecryptfs.

>> Besides that, you decompress the *whole* file into memory at open() time.
>> This will explode as soon as you deal with bigger files.
> 
> I was thinking that a header with compressed offsets might be an option.  Or
> in the case of lz4 it's not terribly inefficient to scan the blocks.
> 
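
For illustration only, here is a minimal sketch in C of the "header with
compressed offsets" idea. Nothing in it is taken from the zfile patch: the
struct layout and the names zindex, zextent, zindex_validate and
zindex_locate are hypothetical. It assumes fixed-size uncompressed blocks
and an offset table with one extra entry marking the end of the last
compressed block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of a per-file header. */
struct zindex {
    uint32_t block_size;      /* uncompressed bytes per block */
    uint32_t nblocks;         /* number of compressed blocks */
    uint64_t real_size;       /* claimed uncompressed file size */
    const uint64_t *offsets;  /* nblocks + 1 entries, in file order */
};

/* Where to read and how far to skip after decompressing one block. */
struct zextent {
    uint64_t comp_off;  /* start of the compressed block in the file */
    uint64_t comp_len;  /* length of the compressed block */
    uint32_t skip;      /* bytes to skip inside the decompressed block */
};

/* Reject a header whose size and offsets do not agree with each other. */
static int zindex_validate(const struct zindex *idx)
{
    assert(idx != NULL);
    if (idx->block_size == 0 || idx->offsets == NULL)
        return -1;

    uint64_t max = (uint64_t)idx->nblocks * idx->block_size;
    if (idx->real_size > max)
        return -1;  /* claims more data than the blocks can hold */
    if (idx->nblocks > 0 && idx->real_size <= max - idx->block_size)
        return -1;  /* last block would be empty */

    for (uint32_t i = 0; i < idx->nblocks; i++)
        if (idx->offsets[i] >= idx->offsets[i + 1])
            return -1;  /* offsets must be strictly increasing */
    return 0;
}

/* Map an uncompressed position to the single block that contains it. */
static int zindex_locate(const struct zindex *idx, uint64_t pos,
                         struct zextent *out)
{
    assert(idx != NULL && out != NULL);
    if (idx->block_size == 0 || pos >= idx->real_size)
        return -1;

    uint64_t blk = pos / idx->block_size;
    out->comp_off = idx->offsets[blk];
    out->comp_len = idx->offsets[blk + 1] - idx->offsets[blk];
    out->skip = (uint32_t)(pos % idx->block_size);
    return 0;
}
```

With such an index, a read at an arbitrary position only needs to
decompress the one block returned by zindex_locate(), and the size check in
zindex_validate() ties the reported file size to the block table instead of
taking it on trust.
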
>> Also you seem to trust the user.compression.realsize xattr provided by
>> userspace.  That looks exploitable.
> 
> This is only used to provide a fast stat().  It could be put into a header
> or even removed entirely in favor of scanning the blocks.
> 
>> Back to my original question, why not FUSE?
> 
> Mostly because I'm not very familiar with FUSE.  But I suppose it could be
> an option.  I have some immediate concerns though:
> 
> * How would it affect performance?  FUSE passes all operations through user
>   space, correct?

Yeah, but you'll have to do benchmarks to find out the real trade-off.
I'd give it a try. Your targets are smartphones, not HPC clusters.

> * How big might a reasonably complete implementation be for ARM?  The
>   implementation would need to be stored in the initrd.

I bet you can do it in less than 1000 lines of C...

>> Or add compression support to ecryptfs...
> 
> Several filesystems have native compression support.  But this would violate
> the goal of not switching filesystems.

...your goals. Be flexible. ;)
BTW: ecryptfs is an overlay filesystem.

Thanks,
//richard
