From: "Francesco Mazzoli" <f@mazzo.li>
To: linux-fsdevel@vger.kernel.org
Subject: Mainlining the kernel module for TernFS, a distributed filesystem
Date: Fri, 03 Oct 2025 13:13:39 +0100
Message-ID: <bc883a36-e690-4384-b45f-6faf501524f0@app.fastmail.com>

My workplace (XTX Markets) has open sourced TernFS, a distributed
filesystem which we have used internally for a few years:
<https://github.com/XTXMarkets/ternfs>. The repository includes both the
server code for the filesystem and several clients. The main client we use
is a kernel module which allows you to mount TernFS from Linux systems. The
current codebase is not ready for upstreaming, but I wanted to gauge
whether eventual upstreaming would even be possible in this case and, if so,
what the process would be.

Admittedly, TernFS currently has only one user, although we run it on more
than 100 thousand machines, spanning relatively diverse hardware and running
fairly diverse software. This might change, naturally, if other
organizations adopt TernFS now that it is open source.

The kernel module has been fairly stable, although we still need to properly
adapt it to the folio world. However, it would be much easier to maintain if
it were mainlined, and I wanted to describe the peculiarities of TernFS to
see whether that would even be possible. For those interested, we also
have a blog post going into a lot more detail about the design of TernFS
(<https://www.xtxmarkets.com/tech/2025-ternfs/>), but hopefully this email
is enough for the purposes of this discussion.

TernFS files are immutable: they're written once and then can't be modified.
Moreover, when files are created they're not actually linked into the
directory structure until they're closed. One way to think about it is that
in TernFS every file follows the semantics you'd get if you opened it
with `O_TMPFILE` and then linked it with `linkat`. This is the most "odd"
part of the kernel module, since it runs counter to pretty baked-in
assumptions about how the file lifecycle works.
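
For readers less familiar with that pattern, here is a rough userspace sketch
of the `O_TMPFILE` plus `linkat` lifecycle on an ordinary Linux filesystem.
It is not TernFS code: the helper name `write_then_link` and its parameters
are made up for illustration, and it assumes the target filesystem supports
`O_TMPFILE` and that /proc is mounted.

```c
#define _GNU_SOURCE          /* needed for O_TMPFILE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes into an unnamed file created in
 * `dir`, and only then give it the name `dest`.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_then_link(const char *dir, const char *dest,
                    const void *buf, size_t len)
{
    assert(dir && dest && buf);

    /* Unnamed inode inside `dir`: no directory entry exists for it yet. */
    int fd = open(dir, O_TMPFILE | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    /* Write the whole buffer; the file is still invisible to readers. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Link the finished file into the namespace via its /proc/self/fd
     * entry, the idiom described in open(2). */
    char path[64];
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    int rc = linkat(AT_FDCWD, path, AT_FDCWD, dest, AT_SYMLINK_FOLLOW);

    int saved = errno;
    close(fd);
    errno = saved;
    return rc;
}
```

With this shape, a call such as `write_then_link("/tmp", "/tmp/out.bin", data, len)`
never exposes a partially written file under `dest`; the name appears only
once the contents are complete. In TernFS, as described above, the
equivalent linking step happens when the file is closed.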

TernFS also does not support many things, for example hardlinks, permissions,
any sort of extended attribute, and so on. I would imagine this is less
unpleasant, though, since it's just a matter of getting ENOTSUP out of a
bunch of syscalls.
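
From a caller's point of view, handling this could look like the sketch
below, which uses extended attributes as the example. The helper names
`is_unsupported` and `try_set_xattr` are hypothetical, and the exact errno
TernFS returns for each syscall is an assumption here; the code only relies
on the documented setxattr(2) behaviour of reporting ENOTSUP when a
filesystem has no xattr support.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/xattr.h>

/* True if `err` means "operation not supported". ENOTSUP and EOPNOTSUPP
 * share a value on Linux, but POSIX allows them to differ. */
bool is_unsupported(int err)
{
    return err == ENOTSUP || err == EOPNOTSUPP;
}

/*
 * Hypothetical helper: try to set an extended attribute.
 * Returns 1 if it was set, 0 if the filesystem does not support the
 * operation, and -1 on any other error (errno is left set).
 */
int try_set_xattr(const char *path, const char *name, const char *value)
{
    assert(path && name && value);

    if (setxattr(path, name, value, strlen(value), 0) == 0)
        return 1;
    return is_unsupported(errno) ? 0 : -1;
}
```

A tool that wants to work on such a filesystem can treat a return value of 0
as "skip this feature" and carry on, while still failing on real errors.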

Apart from that I wouldn't expect TernFS to be that different from Ceph or
other networked storage codebases inside the kernel.

Let me know what you think,
Francesco
