public inbox for ntfs3@lists.linux.dev
From: Eric Biggers <ebiggers@kernel.org>
To: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>, Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, Nicolas Pitre <nico@fluxnic.net>,
	Gao Xiang <xiang@kernel.org>, Chao Yu <chao@kernel.org>,
	linux-erofs@lists.ozlabs.org, Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-f2fs-devel@lists.sourceforge.net, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org,
	David Woodhouse <dwmw2@infradead.org>,
	Richard Weinberger <richard@nod.at>,
	linux-mtd@lists.infradead.org,
	David Howells <dhowells@redhat.com>,
	netfs@lists.linux.dev, Paulo Alcantara <pc@manguebit.org>,
	Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
	ntfs3@lists.linux.dev, Steve French <sfrench@samba.org>,
	linux-cifs@vger.kernel.org
Subject: Re: Compressed files & the page cache
Date: Wed, 16 Jul 2025 19:49:03 -0700
Message-ID: <20250717024903.GA1288@sol>
In-Reply-To: <f4b9faf9-8efd-4396-b080-e712025825ab@squashfs.org.uk>

On Wed, Jul 16, 2025 at 11:37:28PM +0100, Phillip Lougher wrote:
> > There also seems to be some discrepancy between filesystems whether the
> > decompression involves vmap() of all the memory allocated or whether the
> > decompression routines can handle doing kmap_local() on individual pages.
> > 
> 
> Squashfs does both, and this depends on whether the decompression
> algorithm implementation in the kernel is multi-shot or single-shot.
> 
> The zlib/xz/zstd decompressors are multi-shot, in that you can call them
> multiple times, giving them a new input or output buffer when one runs out.
> This means you can get them to output into a 4K page at a time, without
> requiring the pages to be contiguous.  kmap_local() can be called on each
> page before passing it to the decompressor.
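
As a rough userspace illustration of the multi-shot pattern described
above, here is a sketch using Python's zlib module rather than the
kernel's decompressor interface.  The helper name and the 4096-byte
page size are assumptions made for the example; each call is capped at
one page of output, and the pages need not be adjacent in memory:

```python
import zlib

PAGE_SIZE = 4096  # assumed page size, for illustration only


def decompress_into_pages(compressed: bytes,
                          page_size: int = PAGE_SIZE) -> list[bytes]:
    """Decompress a zlib stream, producing at most page_size bytes per call."""
    d = zlib.decompressobj()
    pages = []
    data = compressed
    while not d.eof:
        # max_length caps the output of this call, much like handing the
        # decompressor one fresh output page at a time.
        chunk = d.decompress(data, page_size)
        if not chunk and not d.unconsumed_tail:
            break  # truncated input: nothing more can be produced
        if chunk:
            pages.append(chunk)
        # Input left unconsumed because of the output cap is fed back in.
        data = d.unconsumed_tail
    return pages
```

Joining the returned pages gives back the original data, even though
no single output buffer ever held more than one page.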

While those compression libraries do provide streaming APIs, it's sort
of an illusion.  They still need the uncompressed data in a virtually
contiguous buffer for the LZ77 match finding and copying to work.  So,
internally they copy the uncompressed data into a virtually contiguous
buffer.  I suspect that vmap() (or vm_map_ram() which is what f2fs uses)
is actually more efficient than these streaming APIs, since it avoids
the internal copy.  But it would need to be measured.
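
To make the internal copy concrete, here is a toy sketch.  It is not
how zlib, xz, or zstd are actually implemented: the token format
(literal bytes, or a (distance, length) match), the function names, and
the 32 KiB window are all assumptions chosen for illustration.  The
contiguous decoder resolves matches directly in its output buffer,
while the paged decoder has to keep a private contiguous window and
copy every byte out of it a second time:

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def copy_match(buf: bytearray, distance: int, length: int) -> None:
    """Append `length` bytes copied from `distance` bytes back in buf."""
    start = len(buf) - distance
    if distance <= 0 or start < 0:
        raise ValueError("match reaches before the start of the buffer")
    for i in range(length):
        # Byte by byte, so overlapping matches (distance < length) work.
        buf.append(buf[start + i])


def decode_contiguous(tokens) -> bytes:
    """Decode straight into one contiguous buffer: no extra copy."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, bytes):
            out += tok
        else:
            copy_match(out, *tok)
    return bytes(out)


def decode_paged(tokens, page_size=PAGE_SIZE, window_size=32768):
    """Decode into separate pages; returns (pages, bytes_copied_again)."""
    window = bytearray()  # contiguous history plus undelivered output
    pending = 0           # tail bytes of window not yet copied to a page
    pages = []
    extra_copied = 0

    def flush(final: bool) -> None:
        nonlocal pending, extra_copied
        while pending >= page_size or (final and pending > 0):
            n = min(page_size, pending)
            start = len(window) - pending
            pages.append(bytes(window[start:start + n]))  # the extra copy
            pending -= n
            extra_copied += n
        # Drop history older than window_size, keeping undelivered bytes.
        excess = len(window) - max(window_size, pending)
        if excess > 0:
            del window[:excess]

    for tok in tokens:
        before = len(window)
        if isinstance(tok, bytes):
            window += tok
        else:
            copy_match(window, *tok)
        pending += len(window) - before
        flush(final=False)
    flush(final=True)
    return pages, extra_copied
```

Both decoders produce the same bytes; the paged one reports that every
output byte was copied once more on its way out of the window, which is
the cost a vmap()-style contiguous mapping would avoid in this model.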

> > So, my proposal is that filesystems tell the page cache that their minimum
> > folio size is the compression block size.  That seems to be around 64k,
> > so not an unreasonable minimum allocation size.  That removes all the
> > extra code in filesystems to allocate extra memory in the page cache.
> > It means we don't attempt to track dirtiness at a sub-folio granularity
> > (there's no point, we have to write back the entire compressed block
> > at once).  We also get a single virtually contiguous block ... if you're
> > willing to ditch HIGHMEM support.  Or there's a proposal to introduce a
> > vmap_file() which would give us a virtually contiguous chunk of memory
> > (and could be trivially turned into a noop for the case of trying to
> > vmap a single large folio).

... but of course, if we could get a virtually contiguous buffer
"for free" (at least in the !HIGHMEM case) as in the above proposal,
that would clearly be the best option.

- Eric
