linux-fsdevel.vger.kernel.org archive mirror
* [RFC] Per-file compression
@ 2015-04-17 22:20 Tom Marshall
  2015-04-18  8:06 ` Richard Weinberger
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Tom Marshall @ 2015-04-17 22:20 UTC
  To: linux-fsdevel

I'd like to get some thoughts on whether having transparent per-file 
compression implemented in the kernel would be desirable.

Details:

We are running into space issues when upgrading Android devices to the 
latest version.  This issue is particularly troublesome for 64-bit 
devices that initially shipped without 64-bit support, as the multi-lib 
files nearly double the space requirements.  As it happens, many system 
files are quite compressible -- namely shared libs and apk files.  So 
implementing filesystem compression is a natural solution.

After investigating several well known compression solutions, we did not 
find any that met our goals:

* Compression should be transparent to user space.  It would be possible 
to identify subsets of the filesystem that compress well and handle 
compression in user space.  However, doing this involves repeating 
similar work in different areas of code, and can never fully utilize 
compression across the entire filesystem.

* Compression must work with existing Android filesystems.  There are 
filesystems that have native compression.  However, switching to a new 
filesystem on a released device is not really desirable.

* Compression should be implemented as transparently as possible. cloop 
is filesystem independent, but it makes the underlying device 
read-only.  This could be mitigated somewhat with overlayfs, but that 
would be a rather complex solution.

* Compression should be independent of the filesystem.  e2compr may work 
for ext2, but it seems abandoned and does not address other filesystem 
types.

So, I wrote a thing called 'zfile' that hooks into the VFS layer and 
intercepts file_operations to do file (de)compression on the fly. When a 
file is opened, it reads and decompresses the data into memory.  The 
file may be read, written, and mmaped in the usual way.  If the contents 
are changed, the data is compressed and written back.  A working patch 
for an older kernel version may be found at: 
http://review.cyanogenmod.org/#/c/95220/
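
The open/decompress and change/recompress behaviour described above can be illustrated with a minimal user-space sketch. This is only an analogy under stated assumptions: the `ZFile` class, its method names, and the use of zlib are hypothetical and are not taken from the patch, which works on kernel file_operations rather than in Python.

```python
import os
import tempfile
import zlib


class ZFile:
    """Hypothetical user-space analogue: the whole file is held decompressed in memory."""

    def __init__(self, path):
        self.path = path
        self.dirty = False
        # "open": read the compressed bytes and inflate them into memory.
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.data = bytearray(zlib.decompress(f.read()))
        else:
            self.data = bytearray()

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])

    def write(self, offset, payload):
        # Zero-fill any gap, overwrite in place, and mark for write-back.
        if offset > len(self.data):
            self.data.extend(b"\0" * (offset - len(self.data)))
        self.data[offset:offset + len(payload)] = payload
        self.dirty = True

    def close(self):
        # "release": recompress and write back only if the contents changed.
        if self.dirty:
            with open(self.path, "wb") as f:
                f.write(zlib.compress(bytes(self.data)))
            self.dirty = False


def demo():
    path = os.path.join(tempfile.mkdtemp(), "lib.z")
    z = ZFile(path)
    z.write(0, b"ELF" * 1000)
    z.close()
    on_disk = os.path.getsize(path)
    reopened = ZFile(path)
    return reopened.read(0, 6), len(reopened.data), on_disk
```

Holding the entire decompressed file in memory keeps the sketch short, but it is also the source of the memory cost of this design: every open pays for the full file, however little of it is read.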

Note this is mostly just a prototype at the moment.  I'm fully aware 
that it has some bugs and limitations.  Pretty major ones, I'm sure.  
However, if this is something that is generally desirable, I would look 
forward to working with the VFS maintainers to make it merge-worthy.

Thanks!


* Re: [RFC] Per-file compression
@ 2015-04-19 21:15 Tom Marshall
  0 siblings, 0 replies; 18+ messages in thread
From: Tom Marshall @ 2015-04-19 21:15 UTC
  To: Theodore Ts'o; +Cc: linux-fsdevel, Richard Weinberger


On Apr 18, 2015 4:09 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Sat, Apr 18, 2015 at 10:06:14AM +0200, Richard Weinberger wrote: 
> > On Sat, Apr 18, 2015 at 12:20 AM, Tom Marshall <tom@cyngn.com> wrote: 
> > > We are running into space issues when upgrading Android devices to the 
> > > latest version.  This issue is particularly troublesome for 64-bit devices 
> > > that initially shipped without 64-bit support, as the multi-lib files nearly 
> > > double the space requirements.  As it happens, many system files are quite 
> > > compressible -- namely shared libs and apk files.  So implementing 
> > > filesystem compression is a natural solution. 
>
> Tom, have you verified whether or not this approach actually is 
> sufficient to address the problem you are trying to solve? 
> Specifically, (a) can you get enough space compression that it will 
> make things fit?  Given that apk files are already compressed using 
> zip, I'm a bit surprised you would get enough of a compression savings 
> that it's going to make things fit.

Yes, it works surprisingly well. Both .so files and .apk files compress by roughly 40%. I'm also a bit surprised at the .apk results, but it works great.

>  And secondly (b) after 
> you decompress the shared libraries, is there enough memory that the 
> resulting system functions well.

Yes, on our particular device, it actually performs at least as well as before, possibly better. I haven't done an extensive survey of memory usage though.

The majority of compression comes from .so and .apk files. Shared libs are typically mmap'd and most of the contents accessed. apk files are not referenced much after initial dexopt.

> Often times these older Android 
> devices don't have a huge amount of memory, and so either with or 
> without the overhead of needing to decompress the entire file (which 
> can be solved with chunking, although this tends to reduce the 
> compression efficiency), is the system going to function well using 
> 64-bit binaries that take up a lot more room than the older 32-bit 
> binaries? 

Our test device is not so much an older device, just a mid-range device (msm8916). It has mediocre emmc speed and decent CPU (though thermal throttling is pretty aggressive).

>
> > > So, I wrote a thing called 'zfile' that hooks into the VFS layer and 
> > > intercepts file_operations to do file (de)compression on the fly. When a 
> > > file is opened, it reads and decompresses the data into memory.  The file 
> > > may be read, written, and mmaped in the usual way.  If the contents are 
> > > changed, the data is compressed and written back. 
>
> Do you really need to be able to rewrite the files?  Life gets much 
> easier if you can assume that the compressed files are read-only; 
> especially once you start trying to use chunking, it *definitely* 
> becomes easier if you don't need to support modifying the compressed 
> files.  And in the case of the system partition, this might be a safe 
> assumption, yes?

You are correct. Writing is not really a requirement for us, just nice to have.

* Re: [RFC] Per-file compression
@ 2015-04-21  3:29 Tom Marshall
  0 siblings, 0 replies; 18+ messages in thread
From: Tom Marshall @ 2015-04-21  3:29 UTC
  To: Richard Weinberger; +Cc: Tyler Hicks, linux-fsdevel

I'll try my hand at ecompressfs shortly, thanks!

On Apr 20, 2015 9:51 AM, Richard Weinberger <richard@nod.at> wrote:
>
> Am 20.04.2015 um 16:53 schrieb Tyler Hicks: 
> > On 2015-04-18 17:07:16, Richard Weinberger wrote: 
> >> Hi! 
> >> 
> >> Am 18.04.2015 um 16:58 schrieb Tom Marshall: 
> >>> On Sat, Apr 18, 2015 at 01:41:09PM +0200, Richard Weinberger wrote: 
> >>>> On Sat, Apr 18, 2015 at 12:20 AM, Tom Marshall <tom@cyngn.com> wrote: 
> >>>>> So, I wrote a thing called 'zfile' that hooks into the VFS layer and 
> >>>>> intercepts file_operations to do file (de)compression on the fly. When a 
> >>>>> file is opened, it reads and decompresses the data into memory.  The file 
> >>>>> may be read, written, and mmaped in the usual way.  If the contents are 
> >>>>> changed, the data is compressed and written back.  A working patch for an 
> >>>>> older kernel version may be found at: 
> >>>>> http://review.cyanogenmod.org/#/c/95220/ 
> >>>> 
> >>>> So, I've extracted the patch from that website and gave a quick review. 
> >>>> 
> >>>> I'm pretty sure VFS folks will hate the VFS layering you do. 
> >>> 
> >>> This, I'm afraid, is the biggest obstacle to such a solution.  I know that 
> >>> OverlayFS has been merged, so filesystem stacking is acceptable.  Perhaps 
> >>> there would be a way to design a filesystem that stacks compression? 
> >> 
> >> That's why I said think of adding compression support to ecryptfs. 
> > 
> > I think adding compression support to eCryptfs is the wrong approach. 
> > The "X is already a stacked filesystem so look into adding compression 
> > support to it" logic also works when X=overlayfs. I doubt that Miklos 
> > would be willing to accept such a feature. :) 
>
> My thought was that compression is not far away from crypto and hence 
> a lot of ecryptfs could be reused. 
>
> > A stacked filesystem that implements compression should be fairly 
> > simple. If it is not simple, it is too complicated to try to wedge into 
> > an unrelated stacked filesystem. 
> > 
> > While it may be the quickest route to your end goal, it will overly 
> > complicate eCryptfs. eCryptfs already has plenty of complexity around 
> > the file offset since metadata may be stored in the first 8192 bytes of 
> > the lower file, which offsets the entire file, and the end of the file 
> > has to be padded. Mixing in compression would make things much worse. 
>
> I assumed that you also need some metadata for compression. 
> At least if you do it in a non-trivial way. 
>
> > Also, eCryptfs has lots of cruft that is definitely not needed for 
> > compression. 
>
> As you're the maintainer of ecryptfs you know obviously better than I do. :) 
> Tom, to make the story short, you'll have to experiment a bit. 
>
> Thanks, 
> //richard 


end of thread, other threads:[~2015-05-01 18:09 UTC | newest]

Thread overview: 18+ messages
2015-04-17 22:20 [RFC] Per-file compression Tom Marshall
2015-04-18  8:06 ` Richard Weinberger
2015-04-18 23:09   ` Theodore Ts'o
2015-04-20  3:00     ` Alex Elsayed
2015-04-18 11:41 ` Richard Weinberger
2015-04-18 14:58   ` Tom Marshall
2015-04-18 15:07     ` Richard Weinberger
2015-04-18 15:48       ` Tom Marshall
2015-04-18 15:52         ` Richard Weinberger
2015-04-20 14:53       ` Tyler Hicks
2015-04-20 16:51         ` Richard Weinberger
2015-04-21 15:18           ` Theodore Ts'o
2015-04-21 15:37             ` Jeff Moyer
2015-04-21 16:54               ` Theodore Ts'o
2015-04-29 23:15             ` Tom Marshall
2015-05-01 18:09 ` Steve French
  -- strict thread matches above, loose matches on Subject: below --
2015-04-19 21:15 Tom Marshall
2015-04-21  3:29 Tom Marshall
