[PATCH 0/2] fsck.erofs: introduce multi-threaded decompression

From: Nithurshen <nithurshen.dev@gmail.com>
To: linux-erofs@lists.ozlabs.org
Cc: hsiangkao@linux.alibaba.com, xiang@kernel.org,
	Nithurshen <nithurshen.dev@gmail.com>
Subject: [PATCH 0/2] fsck.erofs: introduce multi-threaded decompression
Date: Sat, 23 May 2026 06:07:55 +0530
Message-ID: <20260523003757.13078-1-nithurshen.dev@gmail.com>

Hi,

As part of my GSoC 2026 proposal to introduce multi-threaded
decompression support in fsck.erofs, I am submitting this two-patch
series, which establishes the core workqueue offloading infrastructure.

Baseline profiling of fsck.erofs extracting LZ4HC 4K pclusters showed
the main thread bottlenecked on synchronous VFS writes, which in turn
block decompression tasks. This series offloads the decompression work
to the existing erofs_workqueue.

- Patch 1 introduces the baseline producer-consumer logic. To avoid
  heavy futex scheduling overhead on small 4K pclusters, it implements
  a batching context that groups sequential pclusters into a single
  erofs_work unit. Ownership of the output buffers is handed over to
  the workers, and the buffers are allocated with calloc() so that
  uninitialized bytes cannot leak into the extracted data.
  
- Patch 2 implements dynamic, algorithm-aware batching. Fast algorithms
  (LZ4) may use the maximum batch size (32 pclusters) to hide
  scheduling latency, whereas compute-heavy algorithms (LZMA) use much
  smaller batches (8 pclusters) to limit memory usage and keep the
  thread pool continuously fed.
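
For illustration only, the batching idea from the two patches can be
sketched roughly as below. This is not the code from the series: every
name here (struct batch, batch_add(), submit_all(), enum alg) is
hypothetical, and batch_run() is a single-threaded stand-in for handing
an erofs_work unit to the workqueue. Only the two batch limits (32 and
8 pclusters) and the use of calloc() are taken from the description
above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the erofs-utils API. */
enum alg { ALG_LZ4, ALG_LZMA };

#define BATCH_MAX_FAST  32	/* limit quoted above for LZ4 */
#define BATCH_MAX_HEAVY 8	/* limit quoted above for LZMA */

struct batch {
	unsigned int limit;	/* pclusters per work unit */
	unsigned int count;	/* pclusters queued so far */
	unsigned char **bufs;	/* output buffers, owned by the batch */
};

/* Pick the batch size from the algorithm's relative cost. */
static unsigned int batch_limit(enum alg a)
{
	return a == ALG_LZMA ? BATCH_MAX_HEAVY : BATCH_MAX_FAST;
}

/* Allocate an empty batch sized for the algorithm; NULL on failure. */
static struct batch *batch_new(enum alg a)
{
	struct batch *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->limit = batch_limit(a);
	b->bufs = calloc(b->limit, sizeof(*b->bufs));
	if (!b->bufs) {
		free(b);
		return NULL;
	}
	return b;
}

/* Stand-in for a worker: it owns the batch and frees everything in it. */
static void batch_run(struct batch *b)
{
	for (unsigned int i = 0; i < b->count; i++)
		free(b->bufs[i]);
	free(b->bufs);
	free(b);
}

/*
 * Queue one pcluster with a zero-filled output buffer of outlen bytes.
 * Returns 1 if the batch is now full, 0 if it has room, -1 on ENOMEM.
 */
static int batch_add(struct batch *b, size_t outlen)
{
	unsigned char *buf;

	assert(b && b->count < b->limit && outlen > 0);
	buf = calloc(1, outlen);	/* zeroed: no stale bytes in output */
	if (!buf)
		return -1;
	b->bufs[b->count++] = buf;
	return b->count == b->limit;
}

/* Producer loop: returns the number of work units used for n pclusters. */
static int submit_all(enum alg a, unsigned int n, size_t outlen)
{
	struct batch *b = NULL;
	int units = 0;

	for (unsigned int i = 0; i < n; i++) {
		int full;

		if (!b && !(b = batch_new(a)))
			return -1;
		full = batch_add(b, outlen);
		if (full < 0) {
			batch_run(b);
			return -1;
		}
		if (full) {		/* hand the full batch to a worker */
			batch_run(b);
			b = NULL;
			units++;
		}
	}
	if (b) {			/* flush the partial tail batch */
		batch_run(b);
		units++;
	}
	return units;
}
```

Under these assumptions, submit_all(ALG_LZ4, 100, 4096) hands over 4
work units (32 + 32 + 32 + 4 pclusters), while
submit_all(ALG_LZMA, 100, 4096) hands over 13, so the heavier algorithm
is split into smaller pieces and each unit pins less buffer memory at
once.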

The implementation has been verified to produce bit-perfect extractions 
against heavily packed LZ4HC test images.

Nithurshen (2):
  fsck.erofs: introduce multi-threaded decompression with static
    batching
  fsck.erofs: implement dynamic pcluster batching based on algorithm
    complexity

 fsck/main.c              | 234 +++++++++++++++++----------------------
 include/erofs/internal.h |  18 ++-
 lib/data.c               | 206 ++++++++++++++++++++++++----------
 3 files changed, 268 insertions(+), 190 deletions(-)

-- 
2.52.0
