netfs.lists.linux.dev archive mirror
* New cachefilesd implementation
From: Max Kellermann @ 2024-11-29 14:54 UTC (permalink / raw)
  To: netfs; +Cc: David Howells

Hi,

the venerable cachefilesd has caused us numerous problems. We have a
rather large cache partition (several terabytes containing hundreds of
millions of cached files), and cachefilesd's culling would take a
thousand times longer than it would take the kernel to refill the
space that was freed. cachefilesd had been running for months at 100%
CPU with no visible progress. This made our fscache unusable.

This problem is caused by cachefilesd's bad choice of data structure:
it maintains a sorted array of object pointers, making
insert_into_cull_table() extremely slow and CPU-intensive. I figured
reimplementing it from scratch in a saner language was faster than
fixing the old C code base. So I did just that, with C++ and io_uring.
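
To illustrate the data-structure point: inserting into the middle of a
sorted array means shifting every element behind the insertion point,
so each insert costs time linear in the table size. A bounded heap
that keeps only the N oldest entries costs logarithmic time per insert
instead. The following is a minimal sketch of that alternative. It is
not taken from cachefilesd or from cash, and may not match what either
does; CullCandidate, CullTable, Offer and Drain are hypothetical names
chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one cached object; the field names are
// illustrative and not taken from cachefilesd or cash.
struct CullCandidate {
    std::uint64_t atime;  // last access time; smaller means older
    std::string name;     // object identifier
};

// Orders candidates so that the newest one sits on top of the heap.
struct OlderThan {
    bool operator()(const CullCandidate &a, const CullCandidate &b) const {
        return a.atime < b.atime;
    }
};

// Keeps the `capacity` oldest candidates seen so far. Each Offer()
// costs O(log capacity) and never shifts a whole array.
class CullTable {
    std::size_t capacity_;
    std::priority_queue<CullCandidate, std::vector<CullCandidate>,
                        OlderThan> heap_;

public:
    explicit CullTable(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    void Offer(CullCandidate c) {
        if (heap_.size() < capacity_) {
            heap_.push(std::move(c));
        } else if (c.atime < heap_.top().atime) {
            // Older than the newest retained entry: replace it.
            heap_.pop();
            heap_.push(std::move(c));
        }
    }

    // Empties the table and returns its entries, oldest first.
    std::vector<CullCandidate> Drain() {
        std::vector<CullCandidate> out;
        while (!heap_.empty()) {
            out.push_back(heap_.top());
            heap_.pop();
        }
        // The heap yields newest first, so reverse the result.
        std::reverse(out.begin(), out.end());
        return out;
    }
};
```

A caller would Offer() every object found while scanning the cache and
then Drain() the table to obtain the culling order. Entries newer than
everything already retained are rejected after a single comparison
once the table is full.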

This is it, after barely one day of hacking: https://github.com/CM4all/cash
It is not yet complete (hard-coded configuration, no graveyard
cleanup, bad error handling) and the code sure could be prettier, but
it's already running on our production servers and has cleaned up the
mess that cachefilesd has left behind.

Max


