From: Tom Marshall <tom@cyngn.com>
To: Theodore Ts'o <tytso@mit.edu>, Richard Weinberger <richard@nod.at>
Cc: Tyler Hicks <tyhicks@canonical.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC] Per-file compression
Date: Wed, 29 Apr 2015 16:15:42 -0700
Message-ID: <5541661E.7000006@cyngn.com>
In-Reply-To: <20150421151811.GJ3238@thunk.org>

I've done some investigation into stacking compression on top of an
existing filesystem with the ideas you suggested. I'm not really much
of a filesystem guy, so maybe I'm off base. But here's what I've come
up with so far:

Provide a function that the underlying filesystem can call to wrap the
inode. This allows a new inode to be created and passed back to the VFS
layer. Since this won't be a stacking filesystem, I was thinking of
using new_inode_pseudo for this purpose, as is done in pipe.c. The
underlying inode is stored and referenced in inode.i_private.

The compressed inode implements all necessary operations (e.g. read,
write, mmap), using generic_* functions where appropriate. This mostly
leaves inode_operations.getattr and address_space_operations.readpage to
be implemented.

getattr is implemented by calling the underlying getattr and then
substituting in the uncompressed file size.
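
A minimal userspace sketch of that getattr substitution, not kernel
code. Every name here (struct file_attrs, struct wrapped_inode,
wrapped_getattr, demo_lower_getattr) is hypothetical and only models
the idea; the "lower" pointer plays the role of i_private.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the attributes a getattr call reports. */
struct file_attrs {
    uint64_t size;   /* file size in bytes */
    uint64_t blocks; /* blocks allocated on disk */
    uint32_t mode;   /* type and permission bits */
};

/* Hypothetical wrapper: "lower" plays the role of i_private. */
struct wrapped_inode {
    int (*lower_getattr)(const void *lower, struct file_attrs *out);
    const void *lower;
    uint64_t uncompressed_size; /* assumed to come from the header */
};

/* Demo lower layer: reports the on-disk (compressed) attributes. */
static int demo_lower_getattr(const void *lower, struct file_attrs *out)
{
    *out = *(const struct file_attrs *)lower;
    return 0;
}

/* Ask the lower layer first, then substitute the uncompressed size. */
static int wrapped_getattr(const struct wrapped_inode *w,
                           struct file_attrs *out)
{
    int err;

    assert(w->lower_getattr != NULL);
    err = w->lower_getattr(w->lower, out);
    if (err)
        return err;
    out->size = w->uncompressed_size; /* only the size is replaced */
    return 0;
}
```

Everything except the size (block count, mode, and so on) passes
through from the underlying file unchanged in this sketch.
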
readpage is implemented by finding the compressed offset of the chunk
that contains the requested page, reading the underlying pages,
decompressing the chunk, and copying out the desired data. I'm looking
at the squashfs implementation for clues as to how this should be done.
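
The lookup step could be sketched roughly as follows, in userspace C
rather than kernel code. It assumes an in-memory chunk directory like
the one described in the quoted message below: a fixed chunk size plus
a list of starting offsets. The struct names, the chunk_lookup()
helper, and the extra end-of-data entry at the end of the offset table
are assumptions made for illustration, not a proposed on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the chunk directory. */
struct chunk_dir {
    uint32_t chunk_size;     /* uncompressed bytes per chunk */
    uint64_t file_size;      /* uncompressed file size */
    size_t nr_chunks;        /* number of chunks in the file */
    const uint64_t *offsets; /* nr_chunks + 1 entries: offsets[i] is
                              * where chunk i starts in the underlying
                              * file; the last entry marks the end of
                              * the compressed data (assumed sentinel) */
};

/* Where the data for one uncompressed file position lives. */
struct chunk_extent {
    size_t index;             /* chunk number */
    uint64_t comp_start;      /* start of compressed chunk */
    uint64_t comp_len;        /* compressed length in bytes */
    uint32_t offset_in_chunk; /* position inside decompressed chunk */
    uint32_t uncomp_len;      /* decompressed length; last chunk may
                               * be shorter than chunk_size */
};

/* Returns 0 on success, -1 if pos is at or beyond end of file. */
static int chunk_lookup(const struct chunk_dir *dir, uint64_t pos,
                        struct chunk_extent *out)
{
    assert(dir->chunk_size != 0);
    if (pos >= dir->file_size)
        return -1;

    size_t idx = (size_t)(pos / dir->chunk_size);
    if (idx >= dir->nr_chunks)
        return -1; /* directory is inconsistent with file_size */

    uint64_t chunk_start = (uint64_t)idx * dir->chunk_size;
    uint64_t remaining = dir->file_size - chunk_start;

    out->index = idx;
    out->comp_start = dir->offsets[idx];
    out->comp_len = dir->offsets[idx + 1] - dir->offsets[idx];
    out->offset_in_chunk = (uint32_t)(pos - chunk_start);
    out->uncomp_len = remaining < dir->chunk_size
                          ? (uint32_t)remaining
                          : dir->chunk_size;
    return 0;
}
```

A readpage-style caller would read comp_len bytes starting at
comp_start from the underlying file, decompress them into a buffer of
uncomp_len bytes, and copy out the page starting at offset_in_chunk.
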
Does that sound like a reasonable plan, or am I off base?

On 04/21/2015 08:18 AM, Theodore Ts'o wrote:
> On Mon, Apr 20, 2015 at 06:51:03PM +0200, Richard Weinberger wrote:
>> My thought was that compression is not far away from crypto and hence
>> a lot of ecryptfs could be reused.
> The problem with using eCryptfs as a base is that it assumes that the
> encryption is constant-sized --- i.e., that a 4096 plaintext block
> encrypts to a 4096 ciphertext block. This is *not* true for compression.
>
> The other problem with eCryptfs is that since the underlying file
> system doesn't know it's being stacked, you end up burning memory for
> both the plaintext and ciphertext versions of the file. This is one
> of the reasons why eCryptfs wasn't considered for future versions of
> Android; instead we've added encryption into the ext4 file system
> layer. (With most of the interesting bits in separate files,
> and where I've been communicating with the f2fs maintainer so that
> f2fs can add the same encryption feature into f2fs).
>
> For compression, what I'd recommend doing is something similar; do it
> at the file system level, but structure it such that it's relatively
> easy for other file systems to reuse "library code" for the core data
> transforms. However, allow the underlying file system to use its own
> specialized storage for things like flags, xattrs, etc., since it can
> be made more efficient.
>
> What I'd also suggest is that you support read-only compression (which
> is what MacOS did as well), and do it by using a chunksize of say, 32k
> or 64k, and at the very end of the file, store a pointer to the
> compressed chunk directory, which is simply a header which describes
> the chunk size (and other useful bits, such as the compression
> algorithm, and *possibly* space for a preset compression dictionary
> that would be shared across all of the chunks, if that makes sense),
> and then a list of offsets into the file which gives the starting
> offset for chunk #0, chunk #1, chunk #2, etc.
>
> This file would be created with some help from a userspace
> application; said userspace application would do the compression and
> write out the compressed file, and then call an ioctl which sets an
> attribute which (a) flushes the page cache from containing the
> compressed version of the file, and (b) marks the inode as read-only
> and containing compressed data.
>
> When the kernel reads from the file, it reads the compression header
> and directory, and then pages into the page cache a chunk at a time
> --- that is, if userspace requests a single 4k page, the kernel will
> read in whatever blocks are needed to decompress the 64k chunk
> containing that page, and populate the page cache with that 64k chunk.
>
> I've sketched this design out a few times, hoping to interest someone
> in implementing it for ext4, but this is the sort of thing that
> could be implemented as a library, and then easily spliced into
> multiple file systems.
>
> Cheers,
>
> - Ted
>
> P.S. Note that one of the things about this design is that although
> it requires userspace support, it's *perfect* for files which are
> installed via a package, whether that be an RPM, dpkg, or apk. You
> just need to create a userspace library which takes the incoming file
> stream from the package file, and then writes out the compressed
> version of the file and marks the file as containing compressed data.
> It shouldn't be hard, once the userspace library is created, to modify
> rpm, dpkg, etc., to take advantage of this feature. And these package
> files are the ones which are *perfect* candidates for compression;
> they tend to be written once, and read many times, and in general they
> are read-only. (Yes, there are exceptions for config files, but rpm
> and dpkg already have a way of specifying which files are config
> files, which is important if you want to verify that the unpacked
> package is consistent with what was installed originally.)
>