From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Ariel Miculas <amiculas@cisco.com>
Cc: Benno Lossin <benno.lossin@proton.me>,
Gary Guo <gary@garyguo.net>, Yiyang Wu <toolmanp@tlmp.cc>,
rust-for-linux@vger.kernel.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC PATCH 03/24] erofs: add Errno in Rust
Date: Thu, 26 Sep 2024 16:25:07 +0800
Message-ID: <54bf7cc6-a62a-44e9-9ff0-ca2e334d364f@linux.alibaba.com>
In-Reply-To: <20240926081007.6amk4xfuo6l4jhsc@amiculas-l-PF3FCGJH>
On 2024/9/26 16:10, Ariel Miculas wrote:
> On 24/09/26 09:04, Gao Xiang wrote:
>>
...
>
> And here [4] you can see the space savings achieved by PuzzleFS. In
> short, if you take 10 versions of Ubuntu Jammy from dockerhub, they take
> up 282 MB. Convert them to PuzzleFS and they only take up 130 MB (this
> is before applying any compression, the space savings are only due to
> the chunking algorithm). If we enable compression (PuzzleFS uses Zstd
> seekable compression), which is a fairer comparison (considering that
> the OCI image uses gzip compression), then we get down to 53 MB for
> storing all 10 Ubuntu Jammy versions using PuzzleFS.
>
> Here's a summary:
> # Steps
>
> * I’ve downloaded 10 versions of Jammy from hub.docker.com
> * These images only have one layer which is in tar.gz format
> * I’ve built 10 equivalent puzzlefs images
> * Compute the tarball_total_size by summing the sizes of every Jammy
> tarball (uncompressed) => 766 MB (use this as baseline)
> * Sum the sizes of every oci/puzzlefs image => total_size
> * Compute the total size as if all the versions were stored in a single
> oci/puzzlefs repository => total_unified_size
> * Saved space = tarball_total_size - total_unified_size
>
> # Results
> (See [5] if you prefer the video format)
>
> | Type | Total size (MB) | Average layer size (MB) | Unified size (MB) | Saved (MB) / 766 MB |
> | --- | --- | --- | --- | --- |
> | Oci (uncompressed) | 766 | 77 | 766 | 0 (0%) |
> | PuzzleFS uncompressed | 748 | 74 | 130 | 635 (83%) |
> | Oci (compressed) | 282 | 28 | 282 | 484 (63%) |
> | PuzzleFS (compressed) | 298 | 30 | 53 | 713 (93%) |
>
> Here's the script I used to download the Ubuntu Jammy versions and
> generate the PuzzleFS images [6] to get an idea about how I got to these
> results.
>
> Can we achieve these results with the current erofs features? I'm
> referring specifically to this comment: "EROFS already supports
> variable-sized chunks + CDC" [7].
Please see
https://erofs.docs.kernel.org/en/latest/comparsion/dedupe.html
                          Total Size (MiB)  Average layer size (MiB)  Saved / 766.1MiB
Compressed OCI (tar.gz)   282.5             28.3                      63%
Uncompressed OCI (tar)    766.1             76.6                      0%
Uncompressed EROFS        109.5             11.0                      86%
EROFS (DEFLATE,9,32k)     46.4              4.6                       94%
EROFS (LZ4HC,12,64k)      54.2              5.4                       93%
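(Illustrative sketch only, not part of the measurement itself: the "Saved" column can be reproduced from the total sizes with the formula quoted above, saved = baseline - total size, expressed as a share of the 766.1 MiB uncompressed tarball baseline. The function and variable names here are hypothetical; the figures are the ones listed in the table.)

```python
# Baseline: total size of the 10 uncompressed Jammy tarballs, in MiB.
BASELINE_MIB = 766.1

# Total sizes (MiB) as listed in the table above.
sizes_mib = {
    "Compressed OCI (tar.gz)": 282.5,
    "Uncompressed OCI (tar)": 766.1,
    "Uncompressed EROFS": 109.5,
    "EROFS (DEFLATE,9,32k)": 46.4,
    "EROFS (LZ4HC,12,64k)": 54.2,
}


def saved_percent(total_mib, baseline_mib=BASELINE_MIB):
    """Return the space saved relative to the baseline, as a whole percent."""
    return round((1 - total_mib / baseline_mib) * 100)


for name, size in sizes_mib.items():
    print(f"{name:26} {size:7.1f} MiB  saved {saved_percent(size)}%")
```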
I don't know which compression algorithm you are using (maybe Zstd?),
but the results are:
EROFS (LZ4HC,12,64k)    54.2
PuzzleFS compressed     53?
EROFS (DEFLATE,9,32k)   46.4
I could rerun with EROFS + Zstd, and the result should be smaller. This
feature has been supported since Linux 6.1.
Thanks,
Gao Xiang