From: Yue Hu <zbestahu@gmail.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: huyue2@coolpad.com, linux-erofs@lists.ozlabs.org,
	zbestahu@163.com, shaojunjun@coolpad.com, zhangwen@coolpad.com
Subject: Re: [RFC PATCH v3 0/3] erofs-utils: compressed fragments feature
Date: Wed, 3 Aug 2022 15:33:05 +0800
Message-ID: <20220803153305.00000ec4.zbestahu@gmail.com>
In-Reply-To: <YuoQnLQTcp/bTere@B-P7TQMD6M-0146.local>

Hi Xiang,

On Wed, 3 Aug 2022 14:07:24 +0800
Gao Xiang <hsiangkao@linux.alibaba.com> wrote:

> Hi Yue,
> 
> On Wed, Aug 03, 2022 at 11:51:27AM +0800, Yue Hu wrote:
> > To achieve a greater compression ratio, let's introduce the
> > compressed fragments feature, which merges either the tail of each
> > file or whole files into one special inode.
> > 
> > We can also set a separate pcluster size for the fragments inode to
> > meet different compression requirements.
> > 
> > This patchset also improves the uncompressed data layout of
> > compressed files: the data is now written starting at 'clusterofs'
> > instead of 0 so that it can benefit from in-place I/O. For now, this
> > only applies together with fragments.
> > 
> > The main idea above is from Xiang.  
> 
> Thanks for your hard work! I will give it a thorough try this weekend.

Got it.

> 
> Also I'd like to enable logical cluster size != 4k for big pcluster with
> large pclustersize in order to reduce the size of compression indexes.

Let me think about this first.

Thanks.

> 
> In such cases, I think compact indexes are unnecessary.  I think it's
> already supported on the kernel side, so we just need to implement the
> userspace side.
> 
> Thanks,
> Gao Xiang
> 
> > 
> > Here is some test data of Linux 5.10.87 source code under Ubuntu 18.04:
> > 
> > linux-5.10.87 (erofs, uncompressed)                1.1G
> > 
> > linux-5.10.87 (erofs, lz4hc,12 4k fragments,4k)    301M
> > linux-5.10.87 (erofs, lz4hc,12 8k fragments,8k)    268M
> > linux-5.10.87 (erofs, lz4hc,12 16k fragments,16k)  242M
> > linux-5.10.87 (erofs, lz4hc,12 32k fragments,32k)  225M
> > linux-5.10.87 (erofs, lz4hc,12 64k fragments,64k)  217M
> > 
> > linux-5.10.87 (erofs, lz4hc,12 4k vanilla)         396M
> > linux-5.10.87 (erofs, lz4hc,12 8k vanilla)         376M
> > linux-5.10.87 (erofs, lz4hc,12 16k vanilla)        364M
> > linux-5.10.87 (erofs, lz4hc,12 32k vanilla)        359M
> > linux-5.10.87 (erofs, lz4hc,12 64k vanilla)        358M
> > 
> > Usage:
> > mkfs.erofs -zlz4hc,12 -C65536 -Efragments,65536 foo.erofs.img foo/
> > 
> > Changes since v2:
> >  - mainly reimplement the decompression logic for the fragment inode
> >    due to the kernel side;
> >  - fix a compatibility issue with old images using the ztailpacking
> >    feature;
> >  - move code of super.c in patch 3/3 to patch 1/3;
> >  - minor naming change.
> > 
> > Changes since v1:
> >  - mainly optimize index space for the fragment inode;
> >  - merge tails with len <= pclustersize into fragments directly;
> >  - use an inode instead of a nid to avoid loading fragments multiple
> >    times;
> >  - fix a memory leak when building fragments;
> >  - minor change to distinguish the special fragments inode from
> >    normal inodes;
> >  - rebase to commit cb058526 with patch [1];
> >  - code cleanup.
> > 
> > Note that the inode will use the extended version (64 bytes) because
> > of mtime; the 'force-inode-compact' option may be used to reduce the
> > size if mtime does not matter.
> > 
> > [1] https://lore.kernel.org/linux-erofs/20220722053610.23912-1-huyue2@coolpad.com/
> > 
> > Yue Hu (3):
> >   erofs-utils: lib: add support for fragments data decompression
> >   erofs-utils: lib: support on-disk offset for shifted decompression
> >   erofs-utils: introduce compressed fragments support
> > 
> >  include/erofs/compress.h   |   3 +-
> >  include/erofs/config.h     |   3 +-
> >  include/erofs/decompress.h |   3 ++
> >  include/erofs/fragments.h  |  25 +++++++++
> >  include/erofs/inode.h      |   2 +
> >  include/erofs/internal.h   |   9 ++++
> >  include/erofs_fs.h         |  27 +++++++---
> >  lib/Makefile.am            |   4 +-
> >  lib/compress.c             | 108 +++++++++++++++++++++++++++----------
> >  lib/data.c                 |  28 +++++++++-
> >  lib/decompress.c           |  10 +++-
> >  lib/fragments.c            |  76 ++++++++++++++++++++++++++
> >  lib/inode.c                |  43 ++++++++++-----
> >  lib/super.c                |  24 ++++++++-
> >  lib/zmap.c                 |  26 +++++++++
> >  mkfs/main.c                |  64 +++++++++++++++++++---
> >  16 files changed, 393 insertions(+), 62 deletions(-)
> >  create mode 100644 include/erofs/fragments.h
> >  create mode 100644 lib/fragments.c
> > 
> > -- 
> > 2.17.1  

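The size figures quoted above can be compared pairwise. A minimal Python sketch, assuming the reported sizes are in MiB exactly as printed in the table, computes how much smaller each fragments image is than the vanilla image at the same pcluster size. The helper names `fragment_saving` and `savings_table` are hypothetical and are not part of erofs-utils.

```python
# Sizes copied from the quoted table: pcluster size -> (vanilla, fragments).
# Assumption: both columns are in MiB, as printed ("396M", "301M", ...).
SIZES_MIB = {
    "4k": (396, 301),
    "8k": (376, 268),
    "16k": (364, 242),
    "32k": (359, 225),
    "64k": (358, 217),
}


def fragment_saving(vanilla_mib, fragments_mib):
    """Return the percentage by which the fragments image is smaller."""
    return (vanilla_mib - fragments_mib) / vanilla_mib * 100


def savings_table(sizes):
    """Map each pcluster size to its saving, rounded to one decimal place."""
    return {
        pcluster: round(fragment_saving(vanilla, fragments), 1)
        for pcluster, (vanilla, fragments) in sizes.items()
    }


if __name__ == "__main__":
    for pcluster, saving in savings_table(SIZES_MIB).items():
        print(f"{pcluster:>4}: fragments image is {saving}% smaller than vanilla")
```

With these figures the relative saving grows with pcluster size, from roughly 24% at 4k to roughly 39% at 64k.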
