From: Jan Kara <jack@suse.cz>
To: <linux-fsdevel@vger.kernel.org>
Cc: <linux-block@vger.kernel.org>,
Christian Brauner <brauner@kernel.org>,
Al Viro <viro@ZenIV.linux.org.uk>, <linux-ext4@vger.kernel.org>,
Ted Tso <tytso@mit.edu>,
"Tigran A. Aivazian" <aivazian.tigran@gmail.com>,
David Sterba <dsterba@suse.com>,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, linux-aio@kvack.org,
Benjamin LaHaise <bcrl@kvack.org>, Jan Kara <jack@suse.cz>
Subject: [PATCH 33/42] fs: Provide functions for handling mapping_metadata_bhs directly
Date: Thu, 26 Mar 2026 10:54:27 +0100
Message-ID: <20260326095354.16340-75-jack@suse.cz>
In-Reply-To: <20260326082428.31660-1-jack@suse.cz>
As part of the transition toward moving mapping_metadata_bhs to the
fs-private part of the inode, provide functions that operate on this
list directly instead of going through the inode / mapping.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/buffer.c | 110 +++++++++++++++++-------------------
include/linux/buffer_head.h | 44 ++++++++++++---
2 files changed, 87 insertions(+), 67 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index b0436481d0f1..cbed175f418b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -467,31 +467,25 @@ EXPORT_SYMBOL(mark_buffer_async_write);
* a successful fsync(). For example, ext2 indirect blocks need to be
* written back and waited upon before fsync() returns.
*
- * The functions mark_buffer_dirty_inode(), fsync_inode_buffers(),
- * mmb_has_buffers() and invalidate_inode_buffers() are provided for the
- * management of a list of dependent buffers in mapping_metadata_bhs struct.
+ * The functions mmb_mark_buffer_dirty(), mmb_sync(), mmb_has_buffers()
+ * and mmb_invalidate() are provided for the management of a list of dependent
+ * buffers in mapping_metadata_bhs struct.
*
* The locking is a little subtle: The list of buffer heads is protected by
* the lock in mapping_metadata_bhs so functions coming from bdev mapping
* (such as try_to_free_buffers()) need to safely get to mapping_metadata_bhs
* using RCU, grab the lock, verify we didn't race with somebody detaching the
* bh / moving it to different inode and only then proceeding.
- *
- * FIXME: mark_buffer_dirty_inode() is a data-plane operation. It should
- * take an address_space, not an inode. And it should be called
- * mark_buffer_dirty_fsync() to clearly define why those buffers are being
- * queued up.
- *
- * FIXME: mark_buffer_dirty_inode() doesn't need to add the buffer to the
- * list if it is already on a list. Because if the buffer is on a list,
- * it *must* already be on the right one. If not, the filesystem is being
- * silly. This will save a ton of locking. But first we have to ensure
- * that buffers are taken *off* the old inode's list when they are freed
- * (presumably in truncate). That requires careful auditing of all
- * filesystems (do it inside bforget()). It could also be done by bringing
- * b_inode back.
*/
+void mmb_init(struct mapping_metadata_bhs *mmb, struct address_space *mapping)
+{
+ spin_lock_init(&mmb->lock);
+ INIT_LIST_HEAD(&mmb->list);
+ mmb->mapping = mapping;
+}
+EXPORT_SYMBOL(mmb_init);
+
static void __remove_assoc_queue(struct mapping_metadata_bhs *mmb,
struct buffer_head *bh)
{
@@ -533,12 +527,12 @@ bool mmb_has_buffers(struct mapping_metadata_bhs *mmb)
EXPORT_SYMBOL_GPL(mmb_has_buffers);
/**
- * sync_mapping_buffers - write out & wait upon a mapping's "associated" buffers
- * @mapping: the mapping which wants those buffers written
+ * mmb_sync - write out & wait upon all buffers in a list
+ * @mmb: the list of buffers to write
*
- * Starts I/O against the buffers at mapping->i_metadata_bhs and waits upon
- * that I/O. Basically, this is a convenience function for fsync(). @mapping
- * is a file or directory which needs those buffers to be written for a
+ * Starts I/O against the buffers in the given list and waits upon
+ * that I/O. Basically, this is a convenience function for fsync(). @mmb is
+ * for a file or directory which needs those buffers to be written for a
* successful fsync().
*
* We have conflicting pressures: we want to make sure that all
@@ -553,9 +547,8 @@ EXPORT_SYMBOL_GPL(mmb_has_buffers);
* buffer stays on our list until IO completes (at which point it can be
* reaped).
*/
-int sync_mapping_buffers(struct address_space *mapping)
+int mmb_sync(struct mapping_metadata_bhs *mmb)
{
- struct mapping_metadata_bhs *mmb = &mapping->i_metadata_bhs;
struct buffer_head *bh;
int err = 0;
struct blk_plug plug;
@@ -626,33 +619,35 @@ int sync_mapping_buffers(struct address_space *mapping)
spin_unlock(&mmb->lock);
return err;
}
-EXPORT_SYMBOL(sync_mapping_buffers);
+EXPORT_SYMBOL(mmb_sync);
/**
- * generic_buffers_fsync_noflush - generic buffer fsync implementation
- * for simple filesystems with no inode lock
+ * mmb_fsync_noflush - fsync implementation for simple filesystems with
+ * metadata buffers list
*
* @file: file to synchronize
+ * @mmb: list of metadata bhs to flush
* @start: start offset in bytes
* @end: end offset in bytes (inclusive)
* @datasync: only synchronize essential metadata if true
*
- * This is a generic implementation of the fsync method for simple
- * filesystems which track all non-inode metadata in the buffers list
- * hanging off the address_space structure.
+ * This is an implementation of the fsync method for simple filesystems which
+ * track all non-inode metadata in the buffers list hanging off the @mmb
+ * structure.
*/
-int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end,
- bool datasync)
+int mmb_fsync_noflush(struct file *file, struct mapping_metadata_bhs *mmb,
+ loff_t start, loff_t end, bool datasync)
{
struct inode *inode = file->f_mapping->host;
int err;
- int ret;
+ int ret = 0;
err = file_write_and_wait_range(file, start, end);
if (err)
return err;
- ret = sync_mapping_buffers(inode->i_mapping);
+ if (mmb)
+ ret = mmb_sync(mmb);
if (!(inode_state_read_once(inode) & I_DIRTY_ALL))
goto out;
if (datasync && !(inode_state_read_once(inode) & I_DIRTY_DATASYNC))
@@ -669,34 +664,35 @@ int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end,
ret = err;
return ret;
}
-EXPORT_SYMBOL(generic_buffers_fsync_noflush);
+EXPORT_SYMBOL(mmb_fsync_noflush);
/**
- * generic_buffers_fsync - generic buffer fsync implementation
- * for simple filesystems with no inode lock
+ * mmb_fsync - fsync implementation for simple filesystems with metadata
+ * buffers list
*
* @file: file to synchronize
+ * @mmb: list of metadata bhs to flush
* @start: start offset in bytes
* @end: end offset in bytes (inclusive)
* @datasync: only synchronize essential metadata if true
*
- * This is a generic implementation of the fsync method for simple
- * filesystems which track all non-inode metadata in the buffers list
- * hanging off the address_space structure. This also makes sure that
- * a device cache flush operation is called at the end.
+ * This is an implementation of the fsync method for simple filesystems which
+ * track all non-inode metadata in the buffers list hanging off the @mmb
+ * structure. This also makes sure that a device cache flush operation is
+ * called at the end.
*/
-int generic_buffers_fsync(struct file *file, loff_t start, loff_t end,
- bool datasync)
+int mmb_fsync(struct file *file, struct mapping_metadata_bhs *mmb,
+ loff_t start, loff_t end, bool datasync)
{
struct inode *inode = file->f_mapping->host;
int ret;
- ret = generic_buffers_fsync_noflush(file, start, end, datasync);
+ ret = mmb_fsync_noflush(file, mmb, start, end, datasync);
if (!ret)
ret = blkdev_issue_flush(inode->i_sb->s_bdev);
return ret;
}
-EXPORT_SYMBOL(generic_buffers_fsync);
+EXPORT_SYMBOL(mmb_fsync);
/*
* Called when we've recently written block `bblock', and it is known that
@@ -717,20 +713,18 @@ void write_boundary_block(struct block_device *bdev,
}
}
-void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
+void mmb_mark_buffer_dirty(struct buffer_head *bh,
+ struct mapping_metadata_bhs *mmb)
{
- struct address_space *mapping = inode->i_mapping;
-
mark_buffer_dirty(bh);
if (!bh->b_mmb) {
- spin_lock(&mapping->i_metadata_bhs.lock);
- list_move_tail(&bh->b_assoc_buffers,
- &mapping->i_metadata_bhs.list);
- bh->b_mmb = &mapping->i_metadata_bhs;
- spin_unlock(&mapping->i_metadata_bhs.lock);
+ spin_lock(&mmb->lock);
+ list_move_tail(&bh->b_assoc_buffers, &mmb->list);
+ bh->b_mmb = mmb;
+ spin_unlock(&mmb->lock);
}
}
-EXPORT_SYMBOL(mark_buffer_dirty_inode);
+EXPORT_SYMBOL(mmb_mark_buffer_dirty);
/**
* block_dirty_folio - Mark a folio as dirty.
@@ -797,14 +791,12 @@ bool block_dirty_folio(struct address_space *mapping, struct folio *folio)
EXPORT_SYMBOL(block_dirty_folio);
/*
- * Invalidate any and all dirty buffers on a given inode. We are
+ * Invalidate any and all dirty buffers on a given buffers list. We are
* probably unmounting the fs, but that doesn't mean we have already
* done a sync(). Just drop the buffers from the inode list.
*/
-void invalidate_inode_buffers(struct inode *inode)
+void mmb_invalidate(struct mapping_metadata_bhs *mmb)
{
- struct mapping_metadata_bhs *mmb = &inode->i_data.i_metadata_bhs;
-
if (mmb_has_buffers(mmb)) {
spin_lock(&mmb->lock);
while (!list_empty(&mmb->list))
@@ -812,7 +804,7 @@ void invalidate_inode_buffers(struct inode *inode)
spin_unlock(&mmb->lock);
}
}
-EXPORT_SYMBOL(invalidate_inode_buffers);
+EXPORT_SYMBOL(mmb_invalidate);
/*
* Create the appropriate buffers when given a folio for data area and
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 44094fd476f5..e207dcca7a25 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -205,12 +205,30 @@ struct buffer_head *create_empty_buffers(struct folio *folio,
void end_buffer_read_sync(struct buffer_head *bh, int uptodate);
void end_buffer_write_sync(struct buffer_head *bh, int uptodate);
-/* Things to do with buffers at mapping->private_list */
-void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode);
-int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end,
- bool datasync);
-int generic_buffers_fsync(struct file *file, loff_t start, loff_t end,
- bool datasync);
+/* Things to do with metadata buffers list */
+void mmb_mark_buffer_dirty(struct buffer_head *bh, struct mapping_metadata_bhs *mmb);
+static inline void mark_buffer_dirty_inode(struct buffer_head *bh,
+ struct inode *inode)
+{
+ mmb_mark_buffer_dirty(bh, &inode->i_data.i_metadata_bhs);
+}
+int mmb_fsync_noflush(struct file *file, struct mapping_metadata_bhs *mmb,
+ loff_t start, loff_t end, bool datasync);
+static inline int generic_buffers_fsync_noflush(struct file *file,
+ loff_t start, loff_t end,
+ bool datasync)
+{
+ return mmb_fsync_noflush(file, &file->f_mapping->i_metadata_bhs,
+ start, end, datasync);
+}
+int mmb_fsync(struct file *file, struct mapping_metadata_bhs *mmb,
+ loff_t start, loff_t end, bool datasync);
+static inline int generic_buffers_fsync(struct file *file,
+ loff_t start, loff_t end, bool datasync)
+{
+ return mmb_fsync(file, &file->f_mapping->i_metadata_bhs,
+ start, end, datasync);
+}
void clean_bdev_aliases(struct block_device *bdev, sector_t block,
sector_t len);
static inline void clean_bdev_bh_alias(struct buffer_head *bh)
@@ -515,9 +533,18 @@ bool block_dirty_folio(struct address_space *mapping, struct folio *folio);
void buffer_init(void);
bool try_to_free_buffers(struct folio *folio);
+void mmb_init(struct mapping_metadata_bhs *mmb, struct address_space *mapping);
bool mmb_has_buffers(struct mapping_metadata_bhs *mmb);
-void invalidate_inode_buffers(struct inode *inode);
-int sync_mapping_buffers(struct address_space *mapping);
+void mmb_invalidate(struct mapping_metadata_bhs *mmb);
+int mmb_sync(struct mapping_metadata_bhs *mmb);
+static inline void invalidate_inode_buffers(struct inode *inode)
+{
+ mmb_invalidate(&inode->i_data.i_metadata_bhs);
+}
+static inline int sync_mapping_buffers(struct address_space *mapping)
+{
+ return mmb_sync(&mapping->i_metadata_bhs);
+}
void invalidate_bh_lrus(void);
void invalidate_bh_lrus_cpu(void);
bool has_bh_in_lru(int cpu, void *dummy);
@@ -527,6 +554,7 @@ extern int buffer_heads_over_limit;
static inline void buffer_init(void) {}
static inline bool try_to_free_buffers(struct folio *folio) { return true; }
+static inline int mmb_sync(struct mapping_metadata_bhs *mmb) { return 0; }
static inline void invalidate_inode_buffers(struct inode *inode) {}
static inline int sync_mapping_buffers(struct address_space *mapping) { return 0; }
static inline void invalidate_bh_lrus(void) {}
--
2.51.0
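
For readers following the series, the list-centric calling convention
introduced by this patch can be modelled in userspace. The sketch below is
not kernel code: the model_* names, the hand-rolled list, and the omission
of the spinlock and RCU handling are all assumptions made for illustration.
It only mirrors the shape of mmb_init(), mmb_mark_buffer_dirty(),
mmb_has_buffers() and mmb_invalidate() as they appear in the diff, plus a
compatibility wrapper that forwards from an inode-like object. The body of
__remove_assoc_queue() is not shown in the diff, so the model assumes that
removal also clears the buffer's back pointer.

```c
/* Userspace model for illustration only; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_node { struct list_node *prev, *next; };

static void node_init(struct list_node *n) { n->prev = n->next = n; }

static void node_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_move_tail(struct list_node *n, struct list_node *head)
{
	node_unlink(n);
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-ins for mapping_metadata_bhs, buffer_head, inode. */
struct model_mmb { struct list_node list; const void *mapping; };
struct model_bh { struct list_node assoc; struct model_mmb *mmb; bool dirty; };
struct model_inode { struct model_mmb legacy_mmb; };

void model_bh_init(struct model_bh *bh)
{
	node_init(&bh->assoc);
	bh->mmb = NULL;
	bh->dirty = false;
}

/* Mirrors mmb_init(); the spinlock is omitted (single-threaded model). */
void model_mmb_init(struct model_mmb *mmb, const void *mapping)
{
	node_init(&mmb->list);
	mmb->mapping = mapping;
}

bool model_mmb_has_buffers(const struct model_mmb *mmb)
{
	return mmb->list.next != &mmb->list;
}

/* Mirrors mmb_mark_buffer_dirty(): queue the bh only if it is on no list. */
void model_mmb_mark_buffer_dirty(struct model_bh *bh, struct model_mmb *mmb)
{
	bh->dirty = true;
	if (!bh->mmb) {
		node_move_tail(&bh->assoc, &mmb->list);
		bh->mmb = mmb;
	}
}

/* Mirrors mmb_invalidate(); assumes removal also clears the back pointer. */
void model_mmb_invalidate(struct model_mmb *mmb)
{
	while (model_mmb_has_buffers(mmb)) {
		struct list_node *n = mmb->list.next;
		struct model_bh *bh = (struct model_bh *)
			((char *)n - offsetof(struct model_bh, assoc));

		node_unlink(n);
		bh->mmb = NULL;
	}
}

/* Compatibility wrapper in the style of mark_buffer_dirty_inode(). */
void model_mark_buffer_dirty_inode(struct model_bh *bh, struct model_inode *inode)
{
	model_mmb_mark_buffer_dirty(bh, &inode->legacy_mmb);
}
```

In this model, a filesystem that embeds the list in its private inode
would call the model_mmb_* functions on its own struct, while unconverted
callers keep going through the inode wrapper; both end up on the same code
path. Note that a buffer already queued on one list is not moved when it
is marked dirty against another, matching the !bh->b_mmb check in the diff.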
Thread overview: 62+ messages
2026-03-26 9:53 [PATCH v3 0/42] fs: Move metadata bh tracking from address_space Jan Kara
2026-03-26 9:53 ` [PATCH 01/42] ext4: Use inode_has_buffers() Jan Kara
2026-03-26 9:53 ` [PATCH 02/42] gfs2: Don't zero i_private_data Jan Kara
2026-03-26 9:53 ` [PATCH 03/42] ntfs3: Drop pointless sync_mapping_buffers() and invalidate_inode_buffers() calls Jan Kara
2026-03-26 9:53 ` [PATCH 04/42] ocfs2: Drop pointless sync_mapping_buffers() calls Jan Kara
2026-03-26 9:53 ` [PATCH 05/42] bdev: Drop pointless invalidate_inode_buffers() call Jan Kara
2026-03-27 6:20 ` Christoph Hellwig
2026-03-26 9:54 ` [PATCH 06/42] ufs: Drop pointless invalidate_mapping_buffers() call Jan Kara
2026-03-26 9:54 ` [PATCH 07/42] exfat: Drop pointless invalidate_inode_buffers() call Jan Kara
2026-03-26 9:54 ` [PATCH 08/42] fs: Remove inode lock from __generic_file_fsync() Jan Kara
2026-03-27 6:20 ` Christoph Hellwig
2026-03-26 9:54 ` [PATCH 09/42] udf: Switch to generic_buffers_fsync() Jan Kara
2026-03-26 9:54 ` [PATCH 10/42] minix: " Jan Kara
2026-03-26 9:54 ` [PATCH 11/42] bfs: " Jan Kara
2026-03-26 9:54 ` [PATCH 12/42] fat: Switch to generic_buffers_fsync_noflush() Jan Kara
2026-03-26 9:54 ` [PATCH 13/42] fs: Drop sync_mapping_buffers() from __generic_file_fsync() Jan Kara
2026-03-27 6:21 ` Christoph Hellwig
2026-03-26 9:54 ` [PATCH 14/42] fs: Rename generic_file_fsync() to simple_fsync() Jan Kara
2026-03-27 6:22 ` Christoph Hellwig
2026-03-27 16:26 ` Jan Kara
2026-03-26 9:54 ` [PATCH 15/42] fat: Sync and invalidate metadata buffers from fat_evict_inode() Jan Kara
2026-03-29 13:55 ` OGAWA Hirofumi
2026-03-30 9:08 ` Jan Kara
2026-03-30 11:29 ` OGAWA Hirofumi
2026-03-31 8:49 ` Jan Kara
2026-03-31 10:40 ` OGAWA Hirofumi
2026-04-01 9:11 ` Jan Kara
2026-04-01 9:41 ` OGAWA Hirofumi
2026-04-01 10:36 ` Jan Kara
2026-04-01 12:50 ` OGAWA Hirofumi
2026-03-26 9:54 ` [PATCH 16/42] udf: Sync and invalidate metadata buffers from udf_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 17/42] minix: Sync and invalidate metadata buffers from minix_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 18/42] ext2: Sync and invalidate metadata buffers from ext2_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 19/42] ext4: Sync and invalidate metadata buffers from ext4_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 20/42] bfs: Sync and invalidate metadata buffers from bfs_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 21/42] affs: Sync and invalidate metadata buffers from affs_evict_inode() Jan Kara
2026-03-26 9:54 ` [PATCH 22/42] fs: Ignore inode metadata buffers in inode_lru_isolate() Jan Kara
2026-03-27 6:22 ` Christoph Hellwig
2026-03-26 9:54 ` [PATCH 23/42] fs: Stop using i_private_data for metadata bh tracking Jan Kara
2026-03-26 9:54 ` [PATCH 24/42] hugetlbfs: Stop using i_private_data Jan Kara
2026-03-26 9:54 ` [PATCH 25/42] aio: Stop using i_private_data and i_private_lock Jan Kara
2026-03-26 9:54 ` [PATCH 26/42] fs: Remove i_private_data Jan Kara
2026-03-26 9:54 ` [PATCH 27/42] kvm: Use private inode list instead of i_private_list Jan Kara
2026-03-26 9:54 ` [PATCH 28/42] fs: Drop osync_buffers_list() Jan Kara
2026-03-26 9:54 ` [PATCH 29/42] fs: Fold fsync_buffers_list() into sync_mapping_buffers() Jan Kara
2026-03-26 9:54 ` [PATCH 30/42] fs: Move metadata bhs tracking to a separate struct Jan Kara
2026-03-26 9:54 ` [PATCH 31/42] fs: Make bhs point to mapping_metadata_bhs Jan Kara
2026-03-26 9:54 ` [PATCH 32/42] fs: Switch inode_has_buffers() to take mapping_metadata_bhs Jan Kara
2026-03-26 9:54 ` Jan Kara [this message]
2026-03-27 6:23 ` [PATCH 33/42] fs: Provide functions for handling mapping_metadata_bhs directly Christoph Hellwig
2026-03-26 9:54 ` [PATCH 34/42] ext2: Track metadata bhs in fs-private inode part Jan Kara
2026-03-26 9:54 ` [PATCH 35/42] affs: " Jan Kara
2026-03-26 9:54 ` [PATCH 36/42] bfs: " Jan Kara
2026-03-26 9:54 ` [PATCH 37/42] fat: " Jan Kara
2026-03-26 9:54 ` [PATCH 38/42] udf: " Jan Kara
2026-03-26 9:54 ` [PATCH 39/42] minix: " Jan Kara
2026-03-26 9:54 ` [PATCH 40/42] ext4: " Jan Kara
2026-03-26 9:54 ` [PATCH 41/42] fs: Drop mapping_metadata_bhs from address space Jan Kara
2026-03-27 6:24 ` Christoph Hellwig
2026-03-26 9:54 ` [PATCH 42/42] fs: Drop i_private_list from address_space Jan Kara
2026-03-27 6:24 ` Christoph Hellwig
2026-03-26 14:06 ` [PATCH v3 0/42] fs: Move metadata bh tracking " Christian Brauner