From: "Theodore Ts'o" <tytso@mit.edu>
To: "Jörn Engel" <joern@logfs.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>,
Dhaval Giani <dgiani@mozilla.com>, Taras Glek <tglek@mozilla.com>,
linux-kernel@vger.kernel.org, vdjeric@mozilla.com,
glandium@mozilla.com, linux-ext4@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFC/PATCH 0/2] ext4: Transparent Decompression Support
Date: Sat, 3 Aug 2013 20:33:16 -0400
Message-ID: <20130804003316.GA19781@thunk.org>
In-Reply-To: <20130726132034.GB21977@logfs.org>
On Fri, Jul 26, 2013 at 09:20:34AM -0400, Jörn Engel wrote:
>
> I don't think the e2compr patches are strictly necessary. They are a
> good option, but not the only one.
Sorry for not chiming in earlier; I've been travelling this past week,
and between that and a bunch of other things I've fallen a bit behind
on my e-mail.
> One trick to simplify the problem is to make Dhaval's compressed files
> strictly read-only. It will require some dance to load the compressed
> content, flip the switch, then uncompress data on the fly and disallow
> writes. Not the most pleasing of interfaces, but yet another option.
Yeah, this is something that I've wanted for a while. (In fact a few
years ago I shopped around this design to some folks who were
associated with Firefox.) MacOS has something rather similar to this.
I haven't had a chance to look at Dhaval's patches yet, but the way
I've been thinking about this is that the compression and building the
table mapping compressed clusters to byte offsets in the file would be
done in userspace. Once the compressed file plus the table is written
to the disk, the userspace program would then close the file
descriptor, and then set the "compressed" bit.
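A minimal userspace sketch of the compress-and-build-table step described
above. The choice of zlib, the 64 KiB cluster size, and the names
CLUSTER_SIZE and compress_clusters are all assumptions made for
illustration; none of them come from this thread or from Dhaval's patches,
and no on-disk format is implied.

```python
import zlib

# Hypothetical uncompressed cluster size; the thread does not specify one.
CLUSTER_SIZE = 64 * 1024


def compress_clusters(data: bytes, cluster_size: int = CLUSTER_SIZE):
    """Compress `data` one cluster at a time.

    Returns (payload, table). table[i] is the byte offset in `payload`
    where compressed cluster i starts; one trailing entry marks the end
    of the last cluster, so cluster i occupies table[i]:table[i + 1].
    """
    payload = bytearray()
    table = []
    for start in range(0, len(data), cluster_size):
        table.append(len(payload))
        # Each cluster is compressed independently so it can later be
        # decompressed without touching its neighbours.
        payload += zlib.compress(data[start:start + cluster_size])
    table.append(len(payload))  # end-of-data sentinel
    return bytes(payload), table
```

Because every cluster is an independent zlib stream, a reader only needs
the table to find and decompress the cluster covering a given file offset.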
When the bit is set, we flush all of its pages from the page cache,
and the file becomes immutable. At that point, the kernel will handle
the decompression, by implementing readpages() by reading the pages
into the buffer cache, and then decompressing the compressed cluster
of pages into the page cache. This gives us transparent compression,
with a fraction of the complexity of supporting read/write
compression. In addition, since we don't have to worry about rewriting a
cluster (and having the modified compressed cluster taking up more
space), the on-disk representation can be a lot more efficient, since
you don't have to use a stacker-style design.
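The read path described above can be modelled in userspace as follows.
This is only an illustration of the lookup-and-decompress logic, not
kernel code: read_range is a hypothetical name, zlib is an assumed
compressor, and the table layout (one start offset per compressed cluster
plus a trailing end offset) is an assumption rather than a format taken
from this thread.

```python
import zlib


def read_range(payload, table, cluster_size, offset, length):
    """Return up to `length` uncompressed bytes starting at `offset`.

    `table[i]` is the offset of compressed cluster i inside `payload`,
    with one trailing entry marking the end of the last cluster.
    Only the clusters that overlap the requested range are decompressed.
    """
    if length <= 0:
        return b""
    first = offset // cluster_size
    # Clamp to the last cluster that actually exists in the table.
    last = min((offset + length - 1) // cluster_size, len(table) - 2)
    out = bytearray()
    for i in range(first, last + 1):
        out += zlib.decompress(payload[table[i]:table[i + 1]])
    # Drop the bytes that precede `offset` inside the first cluster.
    skip = offset - first * cluster_size
    return bytes(out[skip:skip + length])
```

Since the file is immutable once the bit is set, the table never changes
after creation, which is what keeps this lookup so simple.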
One of the cool things about this design is that the vast majority of
files on a typical distribution are write-once, and better yet, they
are written by the package manager. So once you teach dpkg, rpm, and
the Android package installer how to write the file in this compressed
format and set the compressed bit, we can get the vast majority of the
benefits of using compressed files with minimal effort.
- Ted
P.S. This is interesting not just for systems with slow HDDs, but
also for cheap, single-channel MMC flash, the kind found in low-end
handsets and embedded systems.
P.P.S. At least in theory, nothing of what I've described here has to
be ext4 specific. We could implement this in the VFS layer, at which
point not only ext4 would benefit, but also btrfs, xfs, f2fs, etc.