From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, aivazian.tigran@gmail.com,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	Ted Tso <tytso@mit.edu>,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH 3/9] fs: Writeout inode buffer from mmb_sync()
Date: Mon, 11 May 2026 15:27:30 +0200
Message-ID: <20260511-marder-showprogramm-9c7a3198ef15@brauner>
In-Reply-To: <20260511121356.241821-12-jack@suse.cz>

On Mon, May 11, 2026 at 02:13:53PM +0200, Jan Kara wrote:
> Currently metadata bh tracking does not track inode buffers because they
> are usually shared by several inodes, so our linked list tracking
> cannot be used. On fsync we instead call sync_inode_metadata() to write
> the inode, and filesystems' .write_inode methods detect data integrity
> writeback and take care to submit the inode buffer to disk and wait for
> it in that case. This is however racy: for example, the flush worker can
> submit normal (WB_SYNC_NONE) inode writeback first, which makes the
> inode clean and copies the inode to the buffer but doesn't submit the
> buffer for IO. The sync_inode_metadata() call then does nothing and we
> fail to persist the inode buffer to disk on fsync(2).
>
> Fix the problem by allowing the filesystem to set the number of the
> block backing the inode in the mmb structure; mmb_sync() then takes care
> to write out the corresponding buffer and wait for it.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/buffer.c        | 34 +++++++++++++++++++++++-----------
>  include/linux/fs.h |  1 +
>  2 files changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index b0b3792b1496..dba29a45346b 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -477,12 +477,14 @@ EXPORT_SYMBOL(mark_buffer_async_write);
>   * using RCU, grab the lock, verify we didn't race with somebody detaching the
>   * bh / moving it to different inode and only then proceeding.
>   */
> +#define INVALID_BLK (~0ULL)
>
>  void mmb_init(struct mapping_metadata_bhs *mmb, struct address_space *mapping)
>  {
>  	spin_lock_init(&mmb->lock);
>  	INIT_LIST_HEAD(&mmb->list);
>  	mmb->mapping = mapping;
> +	mmb->inode_blk = INVALID_BLK;
>  }
>  EXPORT_SYMBOL(mmb_init);
>
> @@ -593,8 +595,18 @@ int mmb_sync(struct mapping_metadata_bhs *mmb)
>  			}
>  		}
>  	}
> -
>  	spin_unlock(&mmb->lock);
> +
> +	/* Writeout inode buffer head */
> +	if (mmb->inode_blk != INVALID_BLK) {
> +		bh = sb_find_get_block(mmb->mapping->host->i_sb, mmb->inode_blk);
> +		write_dirty_buffer(bh, REQ_SYNC);
> +		wait_on_buffer(bh);
> +		if (!buffer_uptodate(bh))
> +			err = -EIO;
> +		brelse(bh);
> +	}
> +
>  	blk_finish_plug(&plug);
>  	spin_lock(&mmb->lock);
>
> @@ -646,18 +658,18 @@ int mmb_fsync_noflush(struct file *file, struct mapping_metadata_bhs *mmb,
>  	if (err)
>  		return err;
>
> -	if (mmb)
> -		ret = mmb_sync(mmb);
>  	if (!(inode_state_read_once(inode) & I_DIRTY_ALL))
> -		goto out;
> +		goto sync_buffers;
>  	if (datasync && !(inode_state_read_once(inode) & I_DIRTY_DATASYNC))
> -		goto out;
> -
> -	err = sync_inode_metadata(inode, 1);
> -	if (ret == 0)
> -		ret = err;
> -
> -out:
> +		goto sync_buffers;
> +
> +	ret = sync_inode_metadata(inode, 1);
> +sync_buffers:
> +	if (mmb) {
> +		err = mmb_sync(mmb);
> +		if (ret == 0)
> +			ret = err;
> +	}
>  	/* check and advance again to catch errors after syncing out buffers */
>  	err = file_check_and_advance_wb_err(file);
>  	if (ret == 0)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 11559c513dfb..435a41e4c90f 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -446,6 +446,7 @@ extern const struct address_space_operations empty_aops;
>  /* Structure for tracking metadata buffer heads associated with the mapping */
>  struct mapping_metadata_bhs {
>  	struct address_space *mapping;	/* Mapping bhs are associated with */
> +	sector_t inode_blk;	/* Number of block containing the inode */

This is great, thanks!
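
For readers following the thread, the race described in the quoted commit
message can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not kernel code: struct model_inode
and every model_* function are hypothetical names invented for this sketch.
The .write_inode behaviour is reduced to "copy the inode into the buffer,
and submit the buffer only for data-integrity writeback", and the patched
fsync path is reduced to "additionally write out the inode buffer".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; hypothetical names, not kernel structures. */
struct model_inode {
	bool inode_dirty;	/* in-memory inode has changes not yet copied out */
	bool buffer_dirty;	/* inode buffer holds changes not yet on disk */
	bool on_disk;		/* latest inode contents have reached the disk */
};

enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

/* Stand-in for submitting the inode buffer and waiting for the IO. */
void model_write_buffer(struct model_inode *mi)
{
	if (mi->buffer_dirty) {
		mi->buffer_dirty = false;
		mi->on_disk = true;
	}
}

/*
 * Stand-in for a filesystem's .write_inode: the inode is copied into the
 * buffer and becomes clean, but the buffer is only submitted when the
 * caller asks for data-integrity writeback.
 */
void model_write_inode(struct model_inode *mi, enum model_sync_mode mode)
{
	mi->inode_dirty = false;
	mi->buffer_dirty = true;
	if (mode == MODEL_SYNC_ALL)
		model_write_buffer(mi);
}

/*
 * Stand-in for the fsync path. Without the patch, only a dirty inode
 * triggers any work. With the patch, the inode buffer is written out
 * afterwards regardless of the inode's dirty state.
 */
void model_fsync(struct model_inode *mi, bool patched)
{
	if (mi->inode_dirty)
		model_write_inode(mi, MODEL_SYNC_ALL);
	if (patched)
		model_write_buffer(mi);
}

/* Returns whether the inode reached the disk when the flush worker wins. */
bool model_race(bool patched)
{
	struct model_inode mi = { .inode_dirty = true };

	model_write_inode(&mi, MODEL_SYNC_NONE);	/* flush worker runs first */
	model_fsync(&mi, patched);
	return mi.on_disk;
}

/* Usage: the unpatched model loses the write, the patched one does not. */
void model_check(void)
{
	assert(!model_race(false));
	assert(model_race(true));
}
```

In this model the unpatched fsync finds a clean inode and returns without
touching the still-dirty buffer, which is the missed write the series sets
out to fix.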

Thread overview: 18+ messages
2026-05-11 12:13 [PATCH 0/9] fs: Fix missed inode write during fsync Jan Kara
2026-05-11 12:13 ` [PATCH 1/9] affs: Drop support for metadata bh tracking Jan Kara
2026-05-11 12:13 ` [PATCH 2/9] ext4: Allocate mapping_metadata_bhs struct on demand Jan Kara
2026-05-11 12:13 ` [PATCH 3/9] fs: Writeout inode buffer from mmb_sync() Jan Kara
2026-05-11 13:27 ` Christian Brauner [this message]
2026-05-11 12:13 ` [PATCH 4/9] ext2: Fix possibly missing inode write on fsync(2) Jan Kara
2026-05-11 12:13 ` [PATCH 5/9] udf: " Jan Kara
2026-05-11 12:13 ` [PATCH 6/9] fat: " Jan Kara
2026-05-11 14:32 ` OGAWA Hirofumi
2026-05-11 17:03 ` Jan Kara
2026-05-11 18:02 ` OGAWA Hirofumi
2026-05-12 7:29 ` Jan Kara
2026-05-12 14:17 ` OGAWA Hirofumi
2026-05-11 12:13 ` [PATCH 7/9] minix: " Jan Kara
2026-05-11 12:13 ` [PATCH 8/9] bfs: " Jan Kara
2026-05-11 12:13 ` [PATCH 9/9] ext4: Use mmb infrastructure for inode buffer writeout Jan Kara
2026-05-11 13:30 ` Christian Brauner
2026-05-11 20:49 ` [syzbot ci] Re: fs: Fix missed inode write during fsync syzbot ci