linux-fsdevel.vger.kernel.org archive mirror
From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Ariel Miculas <amiculas@cisco.com>
Cc: Benno Lossin <benno.lossin@proton.me>,
	Gary Guo <gary@garyguo.net>, Yiyang Wu <toolmanp@tlmp.cc>,
	rust-for-linux@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC PATCH 03/24] erofs: add Errno in Rust
Date: Thu, 26 Sep 2024 16:25:07 +0800
Message-ID: <54bf7cc6-a62a-44e9-9ff0-ca2e334d364f@linux.alibaba.com>
In-Reply-To: <20240926081007.6amk4xfuo6l4jhsc@amiculas-l-PF3FCGJH>



On 2024/9/26 16:10, Ariel Miculas wrote:
> On 24/09/26 09:04, Gao Xiang wrote:
>>


...

> 
> And here [4] you can see the space savings achieved by PuzzleFS. In
> short, if you take 10 versions of Ubuntu Jammy from dockerhub, they take
> up 282 MB. Convert them to PuzzleFS and they only take up 130 MB (this
> is before applying any compression, the space savings are only due to
> the chunking algorithm). If we enable compression (PuzzleFS uses Zstd
> seekable compression), which is a fairer comparison (considering that
> the OCI image uses gzip compression), then we get down to 53 MB for
> storing all 10 Ubuntu Jammy versions using PuzzleFS.
> 
> Here's a summary:
> # Steps
> 
> * I’ve downloaded 10 versions of Jammy from hub.docker.com
> * These images only have one layer which is in tar.gz format
> * I’ve built 10 equivalent puzzlefs images
> * Compute the tarball_total_size by summing the sizes of every Jammy
>    tarball (uncompressed) => 766 MB (use this as baseline)
> * Sum the sizes of every oci/puzzlefs image => total_size
> * Compute the total size as if all the versions were stored in a single
>    oci/puzzlefs repository => total_unified_size
> * Saved space = tarball_total_size - total_unified_size
> 
> # Results
> (See [5] if you prefer the video format)
> 
> | Type | Total size (MB) | Average layer size (MB) | Unified size (MB) | Saved (MB) / 766 MB |
> | --- | --- | --- | --- | --- |
> | Oci (uncompressed) | 766 | 77 | 766 | 0 (0%) |
> | PuzzleFS uncompressed | 748 | 74 | 130 | 635 (83%) |
> | Oci (compressed) | 282 | 28 | 282 | 484 (63%) |
> | PuzzleFS (compressed) | 298 | 30 | 53 | 713 (93%) |
> 
> Here's the script I used to download the Ubuntu Jammy versions and
> generate the PuzzleFS images [6] to get an idea about how I got to these
> results.
> 
> Can we achieve these results with the current erofs features?  I'm
> referring specifically to this comment: "EROFS already supports
> variable-sized chunks + CDC" [7].

Please see
https://erofs.docs.kernel.org/en/latest/comparsion/dedupe.html

                          Total Size (MiB)  Average layer size (MiB)  Saved / 766.1MiB
Compressed OCI (tar.gz)   282.5             28.3                      63%
Uncompressed OCI (tar)    766.1             76.6                      0%
Uncompressed EROFS        109.5             11.0                      86%
EROFS (DEFLATE,9,32k)     46.4              4.6                       94%
EROFS (LZ4HC,12,64k)      54.2              5.4                       93%

I don't know which compression algorithm you are using (maybe Zstd?),
but the results are:
   EROFS (LZ4HC,12,64k)  54.2
   PuzzleFS compressed   53?
   EROFS (DEFLATE,9,32k) 46.4

I could rerun with EROFS + Zstd, and the result should be smaller. This
feature has been supported since Linux 6.1, thanks.
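
As a rough cross-check of the "Saved" percentages in both tables, here is a
minimal sketch of the arithmetic described in the quoted steps (saved space =
baseline - unified size). It assumes the EROFS "Total Size" column is the
counterpart of the PuzzleFS "Unified size" column, and that the MB and MiB
figures are close enough to compare directly; the name saved_percent is only
illustrative.

```python
# Baseline: total size of the 10 uncompressed Jammy tarballs (from the tables).
BASELINE = 766.1

# Sizes of all 10 versions stored together, as reported in the two tables.
# Assumption: MB (PuzzleFS table) and MiB (EROFS table) are treated alike.
unified_sizes = {
    "Compressed OCI (tar.gz)": 282.5,
    "Uncompressed EROFS": 109.5,
    "EROFS (DEFLATE,9,32k)": 46.4,
    "EROFS (LZ4HC,12,64k)": 54.2,
    "PuzzleFS (uncompressed)": 130.0,
    "PuzzleFS (compressed)": 53.0,
}


def saved_percent(unified, baseline=BASELINE):
    """Return the space saved relative to the baseline, as a whole percent."""
    return round(100.0 * (baseline - unified) / baseline)


# Print the entries from smallest to largest unified size.
for name, size in sorted(unified_sizes.items(), key=lambda kv: kv[1]):
    print(f"{name:26s} {size:6.1f}  saved {saved_percent(size)}%")
```

The rounded values agree with the percentages reported in both tables.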

Thanks,
Gao Xiang
