The Linux Kernel Mailing List
* [PATCH RFC 0/4] fs/pipe: unify the page pools into a single per-pipe pool
@ 2026-06-26 10:26 Breno Leitao
From: Breno Leitao @ 2026-06-26 10:26 UTC
  To: Alexander Viro, Christian Brauner, oleg, mjguzik, josh, Jan Kara,
	jlayton
  Cc: axboe, shakeel.butt, linux-fsdevel, linux-kernel, Breno Leitao,
	kernel-team

TL;DR: This series simplifies the pipe code, unifies the page pools,
reduces the code by 11 lines, and improves the microbenchmark by up to
23%, so it's probably wrong (!?).

Summary:
=======

I've spent some time converging tmp_page[] and the on-stack
anon_pipe_prealloc pool of pages into a single per-pipe pool, as
discussed previously in a few places, most recently at:

https://lore.kernel.org/all/ajLA_zxsYyKISkwp@redhat.com/

Problem:
========

1) We have two types of page caches in the pipe mechanism today:
   * tmp_page[]
   * anon_pipe_prealloc

2) They operate in different ways:
   * tmp_page[] is protected by the pipe lock
     * per-pipe, persistent, 2 pages
   * anon_pipe_prealloc is an on-stack pool, not lock protected
     * burst, up to 8 pages

Proposal/Design:
================

1) Keep the same page budget as today
   a) up to two per-pipe persistent pages
   b) burst of up to 8 pages

2) No pages are allocated unless necessary
   * Pages are _ONLY_ allocated based on the length of the write,
     minus the pages already available in the pool.
   * No page is allocated and then left unused

3) Keep allocation and freeing outside of the lock
   * Only the assignment of pages stays lock-protected
   * Currently, tmp_page[] pages are allocated under the lock, so
     this series improves on that (hence the performance numbers)
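
For illustration only, the sizing rule in point 2 can be sketched in
userspace C as below. This is not the code from the series: the helper
name sketch_pages_to_alloc(), the 4 KiB page size, and the decision to
cap the request at the 8-page burst budget are all assumptions made for
the example.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE    4096u /* assumed page size, for illustration */
#define SKETCH_PREALLOC_MAX 8u    /* burst budget quoted in the cover letter */

/*
 * Hypothetical helper: how many pages a write of `len` bytes would still
 * have to allocate when `pooled` pages already sit in the per-pipe pool.
 */
static unsigned int sketch_pages_to_alloc(size_t len, unsigned int pooled)
{
	/* Round the write length up to whole pages without overflowing. */
	size_t want = len / SKETCH_PAGE_SIZE + (len % SKETCH_PAGE_SIZE != 0);

	/* Assumption: never ask for more than the burst budget. */
	if (want > SKETCH_PREALLOC_MAX)
		want = SKETCH_PREALLOC_MAX;

	/* Pages already in the pool are reused, so nothing extra is needed. */
	if (want <= pooled)
		return 0;

	return (unsigned int)(want - pooled);
}
```

Under these assumptions, a 64 KiB write against a pool that already holds
the two persistent pages would allocate six pages, while a write of two
pages or less would allocate none.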

How:
====

1) Replace tmp_page[] with anon_pipe_prealloc in pipe_inode_info
2) At write time (anon_pipe_write), allocate the pages outside the lock in a
   helper called anon_pipe_prefill()
   a) the assignment into the pool must be lock protected
      * anon_pipe_prefill() does it
   b) anon_pipe_prefill() can populate up to PIPE_PREALLOC_MAX pages in the
      pool
3) Once anon_pipe_write is done, the pool is trimmed back to at most
   PIPE_PREALLOC_KEEP (2) pages by anon_pipe_trim_pool()
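
The prefill/trim flow above can be modelled in userspace roughly as
follows. This is a sketch under stated assumptions, not the kernel
implementation: a pthread mutex stands in for the pipe lock, malloc()
and free() stand in for page allocation and freeing, and every sketch_*
name is hypothetical. Unlike the design goal in the cover letter, the
sketch may free a freshly allocated page if the pool filled up in the
meantime.

```c
#include <pthread.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_PREALLOC_MAX  8u  /* stands in for PIPE_PREALLOC_MAX */
#define SKETCH_PREALLOC_KEEP 2u  /* stands in for PIPE_PREALLOC_KEEP */

struct sketch_pool {
	pthread_mutex_t lock;              /* models the pipe lock */
	unsigned int count;                /* pages currently pooled */
	void *pages[SKETCH_PREALLOC_MAX];
};

static void sketch_pool_init(struct sketch_pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	p->count = 0;
}

/* Grow the pool towards `want` pages; returns the resulting pool size. */
static unsigned int sketch_prefill(struct sketch_pool *p, unsigned int want)
{
	void *fresh[SKETCH_PREALLOC_MAX];
	unsigned int have, nfresh = 0, used = 0, i;

	if (want > SKETCH_PREALLOC_MAX)
		want = SKETCH_PREALLOC_MAX;

	pthread_mutex_lock(&p->lock);
	have = p->count;
	pthread_mutex_unlock(&p->lock);

	/* Allocation happens with the lock dropped. */
	while (have + nfresh < want) {
		void *pg = malloc(SKETCH_PAGE_SIZE);

		if (!pg)
			break;
		fresh[nfresh++] = pg;
	}

	/* Only the assignment into the pool is lock protected. */
	pthread_mutex_lock(&p->lock);
	while (used < nfresh && p->count < SKETCH_PREALLOC_MAX)
		p->pages[p->count++] = fresh[used++];
	have = p->count;
	pthread_mutex_unlock(&p->lock);

	/* Anything that did not fit is freed outside the lock. */
	for (i = used; i < nfresh; i++)
		free(fresh[i]);

	return have;
}

/* Trim the pool back to at most SKETCH_PREALLOC_KEEP pages. */
static unsigned int sketch_trim(struct sketch_pool *p)
{
	void *extra[SKETCH_PREALLOC_MAX];
	unsigned int nextra = 0, left, i;

	/* Detach surplus pages under the lock... */
	pthread_mutex_lock(&p->lock);
	while (p->count > SKETCH_PREALLOC_KEEP)
		extra[nextra++] = p->pages[--p->count];
	left = p->count;
	pthread_mutex_unlock(&p->lock);

	/* ...and free them after dropping it. */
	for (i = 0; i < nextra; i++)
		free(extra[i]);

	return left;
}
```

A caller would run sketch_prefill() before the write, consume pages from
the pool while holding the lock, and call sketch_trim() once the write
has finished.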

Testing:
========

Tested on a bare-metal Intel(R) Xeon(R) Platinum 8321HC (52 CPUs) using the
pipe_bench selftest (tools/testing/selftests/pipe/pipe_bench).

Two kernels were built from the same configuration (no debug options),
differing only by this series:

  - baseline: on-stack anon_pipe_prealloc pool + tmp_page[]
    Commit 4e5dfb7c84012 ("Add linux-next specific files for 20260623")
  - patched:  this series (unified per-pipe pool)

Each kernel was booted on the same host and benchmarked with 5 writers /
5 readers, 64 KiB messages, 5s per run, with and without memory pressure
(stress-ng --vm 4 --vm-bytes 80%). Comparing writes/s and average write
latency:

  - no memory pressure:    ~+11% throughput, ~-10% avg write latency
  - under memory pressure: ~+23% throughput, ~-18% avg write latency

The improvement comes from the larger persistent cache (up to 8 reusable
pages vs the old 2-page tmp_page cache), which reduces alloc_page()/
free_page() traffic; the effect is largest when reclaim is active.

Future:
=======

If this approach is accepted, we could keep all allocated pages in the pool
and rely on a shrinker to trim it under memory pressure.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (4):
      fs/pipe: make the prealloc pool per-pipe infrastructure
      fs/pipe: add per-pipe pool push, prefill and trim helpers
      fs/pipe: switch the write path to the per-pipe pool
      fs/pipe: remove the old on-stack prealloc helpers and tmp_page[2]

 fs/pipe.c                 | 162 +++++++++++++++++++---------------------------
 include/linux/pipe_fs_i.h |  21 +++++-
 2 files changed, 86 insertions(+), 97 deletions(-)
---
base-commit: 4e5dfb7c84012007c3c7061126491bbc92d71bf1
change-id: 20260625-b4-pipe-unification-aba7b8525de7

Best regards,
-- 
Breno Leitao <leitao@debian.org>


Thread overview: 7+ messages
2026-06-26 10:26 [PATCH RFC 0/4] fs/pipe: unify the page pools into a single per-pipe pool Breno Leitao
2026-06-26 10:26 ` [PATCH RFC 1/4] fs/pipe: make the prealloc pool per-pipe infrastructure Breno Leitao
2026-06-26 10:26 ` [PATCH RFC 2/4] fs/pipe: add per-pipe pool push, prefill and trim helpers Breno Leitao
2026-06-26 10:26 ` [PATCH RFC 3/4] fs/pipe: switch the write path to the per-pipe pool Breno Leitao
2026-06-26 10:26 ` [PATCH RFC 4/4] fs/pipe: remove the old on-stack prealloc helpers and tmp_page[2] Breno Leitao
2026-07-03 10:19 ` [PATCH RFC 0/4] fs/pipe: unify the page pools into a single per-pipe pool Christian Brauner
2026-07-03 15:27   ` Breno Leitao
