From: "Theodore Ts'o" <tytso@mit.edu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: NeilBrown <neilb@suse.de>,
James Bottomley <James.Bottomley@hansenpartnership.com>,
Dave Chinner <david@fromorbit.com>,
Eric Sandeen <sandeen@sandeen.net>,
Steven Rostedt <rostedt@goodmis.org>,
Guenter Roeck <linux@roeck-us.net>,
Christoph Hellwig <hch@infradead.org>,
ksummit@lists.linux.dev, linux-fsdevel@vger.kernel.org
Subject: Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems
Date: Sun, 17 Sep 2023 14:57:42 -0400 [thread overview]
Message-ID: <20230917185742.GA19642@mit.edu> (raw)
In-Reply-To: <CAHk-=wg_p7g=nonWOqgHGVXd+ZwZs8im-G=pNHP6hW60c8=UHw@mail.gmail.com>
On Sun, Sep 17, 2023 at 10:30:55AM -0700, Linus Torvalds wrote:
> And yes, *within* the context of a filesystem or two, the whole "try
> to avoid the buffer cache" can be a real thing.
Ext4 uses buffer_heads, and wasn't on your list because we don't use
sb_bread(). And we are thinking about getting rid of buffer heads,
mostly because (a) we want to have more control over which metadata
blocks gets cached and which doesn't, and (b) the buffer cache doesn't
have a callback function to inform the file system if the writeback
fails, so that the file system can try to work around the issue, or at
the very least, mark the file system as corrupted and to signal it via
fsnotify.
Attempts to fix (b) via enhancements buffer cache where shot down by
the linux-fsdevel bike-shedding cabal, because "the buffer cache is
deprecated", and at the time, I decided it wasn't worth pushing it,
since (a) was also a consideration, and I expect we can also (c)
reduce the memory overhead since there are large parts of struct
buffer_head that ext4 doesn't need.
There was *one* one technical argument raised by people who want to
get rid of buffer heads, which is that the change from set_bh_page()
to folio_set_bh() introduced a bug which broke bh_offset() in a way
that only showed up if you were using bh_offset() and the file system
block size was less than the page size.
Eh, it was a bug, and we caught it quickly enough once someone
actually tried to run xfstests on the commit, and it bisected pretty
quickly. (Unfortunately, the change went in via the mm tree, and so
it wasn't noticed by the ext4 file system developers; but
fortunatelly, Zorro flagged it, and once that showed up, I
investigated it.) As far as I'm concerned, that's working as
intended, and these sorts of things happen. So personally, I don't
consider this an argument for nuking the buffer cache.
I *do* view it as only one of many concerns when we do make these
tree-wide changes, such as the folio migration. Yes, these these
tree-wide can introduce regressions, such as breaking bh_offset() for
a week or so before the regression tests catch it, and another week
before the fix makes its way to Linus's tree. That's the system
working as designed.
But that's not the only concern; the other problem with these
tree-wide changes is that it tends to break automatic backports of bug
fixes to the LTS kernels, which now require manual handling by the
file system developers (or we could leave the LTS kernels with the
bugs unfixed, but that tends to make customers cranky :-).
Anyway, it's perhaps natural that the people who make these sorts of
tree-wide changes may get cranky when they need to modify, or at least
regression test, 20+ legacy file systems, and it does kind of suck
that many of these legacy file systems can't easily be tested by
xfstests because we don't even have a mkfs program for them. (OTOH,
we recently merged ntfs3 w/o a working, or at least open-source, mkfs
program, so this isn't *just* the fault of legacy file systems.)
So sure, they may wish that we could make the job of landing these
sorts of tree-wide changes to make their job easier. But we don't do
tree-wide changes all that often, and so it's a mistake to try to
optimize for this non-common case.
Cheers,
- Ted
next prev parent reply other threads:[~2023-09-17 18:59 UTC|newest]
Thread overview: 97+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-30 14:07 [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems Christoph Hellwig
2023-09-05 23:06 ` Dave Chinner
2023-09-05 23:23 ` Matthew Wilcox
2023-09-06 2:09 ` Dave Chinner
2023-09-06 15:06 ` Christian Brauner
2023-09-06 15:59 ` Christian Brauner
2023-09-06 19:09 ` Geert Uytterhoeven
2023-09-08 8:34 ` Christoph Hellwig
2023-09-07 0:46 ` Bagas Sanjaya
2023-09-09 12:50 ` James Bottomley
2023-09-09 15:44 ` Matthew Wilcox
2023-09-10 19:51 ` James Bottomley
2023-09-10 20:19 ` Kent Overstreet
2023-09-10 21:15 ` Guenter Roeck
2023-09-11 3:10 ` Theodore Ts'o
2023-09-11 19:03 ` James Bottomley
2023-09-12 0:23 ` Dave Chinner
2023-09-12 16:52 ` H. Peter Anvin
2023-09-09 22:42 ` Kent Overstreet
2023-09-10 8:19 ` Geert Uytterhoeven
2023-09-10 8:37 ` Bernd Schubert
2023-09-10 16:35 ` Kent Overstreet
2023-09-10 17:26 ` Geert Uytterhoeven
2023-09-10 17:35 ` Kent Overstreet
2023-09-11 1:05 ` Dave Chinner
2023-09-11 1:29 ` Kent Overstreet
2023-09-11 2:07 ` Dave Chinner
2023-09-11 13:35 ` David Disseldorp
2023-09-11 17:45 ` Bart Van Assche
2023-09-11 19:11 ` David Disseldorp
2023-09-11 23:05 ` Dave Chinner
2023-09-26 5:24 ` Eric W. Biederman
2023-09-08 8:55 ` Christoph Hellwig
2023-09-08 22:47 ` Dave Chinner
2023-09-06 22:32 ` Guenter Roeck
2023-09-06 22:54 ` Dave Chinner
2023-09-07 0:53 ` Bagas Sanjaya
2023-09-07 3:14 ` Dave Chinner
2023-09-07 1:53 ` Steven Rostedt
2023-09-07 2:22 ` Dave Chinner
2023-09-07 2:51 ` Steven Rostedt
2023-09-07 3:26 ` Matthew Wilcox
2023-09-07 8:04 ` Thorsten Leemhuis
2023-09-07 10:29 ` Christian Brauner
2023-09-07 11:18 ` Thorsten Leemhuis
2023-09-07 12:04 ` Matthew Wilcox
2023-09-07 12:57 ` Guenter Roeck
2023-09-07 13:56 ` Christian Brauner
2023-09-08 8:44 ` Christoph Hellwig
2023-09-07 3:38 ` Dave Chinner
2023-09-07 11:18 ` Steven Rostedt
2023-09-13 16:43 ` Eric Sandeen
2023-09-13 16:58 ` Guenter Roeck
2023-09-13 17:03 ` Linus Torvalds
2023-09-15 22:48 ` Dave Chinner
2023-09-16 19:44 ` Steven Rostedt
2023-09-16 21:50 ` James Bottomley
2023-09-17 1:40 ` NeilBrown
2023-09-17 17:30 ` Linus Torvalds
2023-09-17 18:09 ` Linus Torvalds
2023-09-17 18:57 ` Theodore Ts'o [this message]
2023-09-17 19:45 ` Linus Torvalds
2023-09-18 11:14 ` Jan Kara
2023-09-18 17:26 ` Linus Torvalds
2023-09-18 19:32 ` Jiri Kosina
2023-09-18 19:59 ` Linus Torvalds
2023-09-18 20:50 ` Theodore Ts'o
2023-09-18 22:48 ` Linus Torvalds
2023-09-18 20:33 ` H. Peter Anvin
2023-09-19 4:56 ` Dave Chinner
2023-09-25 9:43 ` Christoph Hellwig
2023-09-27 22:23 ` Dave Kleikamp
2023-09-19 1:15 ` Dave Chinner
2023-09-19 5:17 ` Matthew Wilcox
2023-09-19 16:34 ` Theodore Ts'o
2023-09-19 16:45 ` Matthew Wilcox
2023-09-19 17:15 ` Linus Torvalds
2023-09-19 22:57 ` Dave Chinner
2023-09-18 14:54 ` Bill O'Donnell
2023-09-19 2:44 ` Dave Chinner
2023-09-19 16:57 ` James Bottomley
2023-09-25 9:38 ` Christoph Hellwig
2023-09-25 14:14 ` Dan Carpenter
2023-09-25 16:50 ` Linus Torvalds
2023-09-07 9:48 ` Dan Carpenter
2023-09-07 11:04 ` Segher Boessenkool
2023-09-07 11:22 ` Steven Rostedt
2023-09-07 12:24 ` Segher Boessenkool
2023-09-07 11:23 ` Dan Carpenter
2023-09-07 12:30 ` Segher Boessenkool
2023-09-12 9:50 ` Richard Biener
2023-10-23 5:19 ` Eric Gallager
2023-09-08 8:39 ` Christoph Hellwig
2023-09-08 8:38 ` Christoph Hellwig
2023-09-08 23:21 ` Dave Chinner
2023-09-07 0:48 ` Bagas Sanjaya
2023-09-07 3:07 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230917185742.GA19642@mit.edu \
--to=tytso@mit.edu \
--cc=James.Bottomley@hansenpartnership.com \
--cc=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=ksummit@lists.linux.dev \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=neilb@suse.de \
--cc=rostedt@goodmis.org \
--cc=sandeen@sandeen.net \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).