linux-fsdevel.vger.kernel.org archive mirror
From: "Darrick J. Wong" <djwong@kernel.org>
To: tytso@mit.edu
Cc: John@groves.net, linux-ext4@vger.kernel.org, miklos@szeredi.hu,
	joannelkoong@gmail.com, bernd@bsbernd.com,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH 09/10] libext2fs: allow clients to ask to write full superblocks
Date: Wed, 21 May 2025 17:10:35 -0700
Message-ID: <174787198229.1484572.9967956151235591161.stgit@frogsfrogsfrogs>
In-Reply-To: <174787198025.1484572.10345977324531146086.stgit@frogsfrogsfrogs>

From: Darrick J. Wong <djwong@kernel.org>

write_primary_superblock currently does this weird dance where it will
try to write only the dirty bytes of the primary superblock to disk.  In
theory, this is done so that tune2fs can incrementally update superblock
bytes when the filesystem is mounted; ext2 was famous for allowing this
dance to be used to set new fs parameters and have them take effect in
real time.

The ability to do this safely was obliterated back in 2001 when ext3 was
introduced with journalling, because tune2fs has no way to know if the
journal has already logged an updated primary superblock but not yet
written it to disk, which means that tune2fs and the journal can race to
write, and changes can be lost.

This (non-)safety was further obliterated back in 2012 when I added
checksums to all the metadata blocks in ext4, because anyone else with
the block device open can see the primary superblock in an intermediate
state where the checksum does not match the superblock contents.

At this point in 2025 it's kind of stupid to still be doing this, and it
makes fuse2fs syncfs slow because we now perform a bunch of small writes
and introduce extra fsyncs.  It will become especially painful when
fuse2fs turns on iomap, at which point it will need to use directio to
access the disk, which then runs the Really Sad Path where we change the
blocksize and completely obliterate the cache contents.

So, add a new flag to ask for full superblock writes, which fuse2fs will
use later.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
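To illustrate the difference between the two write strategies described
above, here is a minimal toy model. It is a sketch only and not libext2fs
code: struct toy_io, toy_write_byte, toy_write_super, and
TOY_FLAG_WRITE_FULL_SUPER are all hypothetical names, and grouping dirty
bytes into runs of 16-bit words is an assumption about how the incremental
path behaves. Only the 1024-byte superblock size and the "try the full
write first, fall through on error" control flow come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SUPER_SIZE            1024  /* matches SUPERBLOCK_SIZE in libext2fs */
#define TOY_FLAG_WRITE_FULL_SUPER 0x1   /* hypothetical stand-in for the new flag */

/* Hypothetical in-memory "device" that records how many writes were issued. */
struct toy_io {
	unsigned int	nr_writes;
	size_t		bytes_written;
	unsigned char	disk[TOY_SUPER_SIZE];
};

/* Copy len bytes to the toy device at off; returns 0 on success. */
static int toy_write_byte(struct toy_io *io, size_t off, size_t len,
			  const unsigned char *buf)
{
	assert(off + len <= TOY_SUPER_SIZE);
	memcpy(io->disk + off, buf, len);
	io->nr_writes++;
	io->bytes_written += len;
	return 0;
}

/*
 * Write the superblock image cur.  With the flag set, issue one full-size
 * write; otherwise issue one write per contiguous run of 16-bit words that
 * differ from orig (assumed grouping, see the note above).
 */
static int toy_write_super(struct toy_io *io, unsigned int flags,
			   const unsigned char *orig,
			   const unsigned char *cur)
{
	size_t idx = 0;

	if (flags & TOY_FLAG_WRITE_FULL_SUPER) {
		if (!toy_write_byte(io, 0, TOY_SUPER_SIZE, cur))
			return 0;
		/* on failure, fall through to the incremental path */
	}

	while (idx < TOY_SUPER_SIZE) {
		size_t start;
		int ret;

		/* Skip words that did not change. */
		if (!memcmp(orig + idx, cur + idx, 2)) {
			idx += 2;
			continue;
		}

		/* Extend the run while consecutive words keep differing. */
		start = idx;
		while (idx < TOY_SUPER_SIZE && memcmp(orig + idx, cur + idx, 2))
			idx += 2;

		ret = toy_write_byte(io, start, idx - start, cur + start);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model, changing three scattered fields costs three small writes on
the incremental path and a single 1024-byte write with the flag set. The
final device contents are the same either way, but another reader of the
device can observe the state between the small writes.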
 lib/ext2fs/ext2fs.h  |    1 +
 lib/ext2fs/closefs.c |    7 +++++++
 2 files changed, 8 insertions(+)


diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 2661e10f57c047..22d56ad7554496 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -220,6 +220,7 @@ typedef struct ext2_file *ext2_file_t;
 #define EXT2_FLAG_IBITMAP_TAIL_PROBLEM	0x2000000
 #define EXT2_FLAG_THREADS		0x4000000
 #define EXT2_FLAG_IGNORE_SWAP_DIRENT	0x8000000
+#define EXT2_FLAG_WRITE_FULL_SUPER	0x10000000
 
 /*
  * Internal flags for use by the ext2fs library only
diff --git a/lib/ext2fs/closefs.c b/lib/ext2fs/closefs.c
index 8e5bec03a050de..9a67db76e7b326 100644
--- a/lib/ext2fs/closefs.c
+++ b/lib/ext2fs/closefs.c
@@ -196,6 +196,13 @@ static errcode_t write_primary_superblock(ext2_filsys fs,
 	int		check_idx, write_idx, size;
 	errcode_t	retval;
 
+	if (fs->flags & EXT2_FLAG_WRITE_FULL_SUPER) {
+		retval = io_channel_write_byte(fs->io, SUPERBLOCK_OFFSET,
+					       SUPERBLOCK_SIZE, super);
+		if (!retval)
+			return 0;
+	}
+
 	if (!fs->io->manager->write_byte || !fs->orig_super) {
 	fallback:
 		io_channel_set_blksize(fs->io, SUPERBLOCK_OFFSET);


