From: Jan Kara <jack@suse.cz>
To: <linux-fsdevel@vger.kernel.org>
Cc: Christian Brauner <brauner@kernel.org>,
Al Viro <viro@ZenIV.linux.org.uk>, <linux-ext4@vger.kernel.org>,
Ted Tso <tytso@mit.edu>,
"Tigran A. Aivazian" <aivazian.tigran@gmail.com>,
David Sterba <dsterba@suse.com>,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, linux-aio@kvack.org,
Benjamin LaHaise <bcrl@kvack.org>, Jan Kara <jack@suse.cz>
Subject: [PATCH 16/32] fs: Fold fsync_buffers_list() into sync_mapping_buffers()
Date: Tue, 3 Mar 2026 11:34:05 +0100
Message-ID: <20260303103406.4355-48-jack@suse.cz>
In-Reply-To: <20260303101717.27224-1-jack@suse.cz>
There's only a single caller of fsync_buffers_list(), so untangle the
code a bit by folding fsync_buffers_list() into sync_mapping_buffers().
Also merge the comments and update them to reflect the current state of
the code.
Signed-off-by: Jan Kara <jack@suse.cz>
---
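For readers who want the two-stage scheme from the merged comment in
isolation, here is a minimal single-threaded userspace sketch. It is not
the kernel code: struct toy_buf, struct toy_list and the toy_*() helpers
are hypothetical names invented for illustration, and locking, memory
barriers, reference counting and block plugging are all left out. It only
shows that stage one snapshots the dirty buffers onto a temporary list and
that stage two waits on exactly that snapshot, refiling anything that was
redirtied in the meantime.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer_head; not the kernel structure. */
struct toy_buf {
	bool dirty;		/* contents still need to be written */
	bool submitted;		/* a write was queued at least once */
	bool io_error;		/* simulated write failure */
	struct toy_buf *next;	/* link on whichever list holds the buffer */
};

/* Hypothetical stand-in for the per-mapping association list. */
struct toy_list {
	struct toy_buf *head;
};

static void toy_push(struct toy_list *l, struct toy_buf *b)
{
	b->next = l->head;
	l->head = b;
}

static struct toy_buf *toy_pop(struct toy_list *l)
{
	struct toy_buf *b = l->head;

	if (b) {
		l->head = b->next;
		b->next = NULL;
	}
	return b;
}

/* Stage 1: drain @assoc, queue a write for each dirty buffer onto @tmp. */
static void toy_queue_writes(struct toy_list *assoc, struct toy_list *tmp)
{
	struct toy_buf *b;

	while ((b = toy_pop(assoc)) != NULL) {
		if (!b->dirty)
			continue;	/* clean buffers just drop off the list */
		b->dirty = false;	/* "submitting" the write cleans it */
		b->submitted = true;
		toy_push(tmp, b);
	}
}

/* Stage 2: wait only for the snapshot on @tmp; refile redirtied buffers. */
static int toy_wait_writes(struct toy_list *tmp, struct toy_list *assoc)
{
	struct toy_buf *b;
	int err = 0;

	while ((b = toy_pop(tmp)) != NULL) {
		if (b->dirty)
			toy_push(assoc, b);	/* left for the next sync */
		if (b->io_error)
			err = -EIO;
	}
	return err;
}

/* Both stages back to back, mirroring the overall shape of the function. */
int toy_sync(struct toy_list *assoc)
{
	struct toy_list tmp = { NULL };

	toy_queue_writes(assoc, &tmp);
	assert(assoc->head == NULL);	/* stage 1 always empties the list */
	return toy_wait_writes(&tmp, assoc);
}
```

Because stage two never queues new writes, a writer that keeps redirtying
buffers in this toy model cannot extend the amount of work a single sync
call has to wait for.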
fs/buffer.c | 180 +++++++++++++++++++++++-----------------------------
1 file changed, 80 insertions(+), 100 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 1c0e7c81a38b..18012afb8289 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -54,7 +54,6 @@
#include "internal.h"
-static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
enum rw_hint hint, struct writeback_control *wbc);
@@ -531,22 +530,96 @@ EXPORT_SYMBOL_GPL(inode_has_buffers);
* @mapping: the mapping which wants those buffers written
*
* Starts I/O against the buffers at mapping->i_private_list, and waits upon
- * that I/O.
+ * that I/O. Basically, this is a convenience function for fsync(). @mapping
+ * is a file or directory which needs those buffers to be written for a
+ * successful fsync().
*
- * Basically, this is a convenience function for fsync().
- * @mapping is a file or directory which needs those buffers to be written for
- * a successful fsync().
+ * We have conflicting pressures: we want to make sure that all
+ * initially dirty buffers get waited on, but that any subsequently
+ * dirtied buffers don't. After all, we don't want fsync to last
+ * forever if somebody is actively writing to the file.
+ *
+ * Do this in two main stages: first we copy dirty buffers to a
+ * temporary inode list, queueing the writes as we go. Then we clean
+ * up, waiting for those writes to complete. mark_buffer_dirty_inode()
+ * doesn't touch b_assoc_buffers list if b_assoc_map is not NULL so we
+ * are sure the buffer stays on our list until IO completes (at which point
+ * it can be reaped).
*/
int sync_mapping_buffers(struct address_space *mapping)
{
struct address_space *buffer_mapping =
mapping->host->i_sb->s_bdev->bd_mapping;
+ struct buffer_head *bh;
+ int err = 0;
+ struct blk_plug plug;
+ LIST_HEAD(tmp);
if (list_empty(&mapping->i_private_list))
return 0;
- return fsync_buffers_list(&buffer_mapping->i_private_lock,
- &mapping->i_private_list);
+ blk_start_plug(&plug);
+
+ spin_lock(&buffer_mapping->i_private_lock);
+ while (!list_empty(&mapping->i_private_list)) {
+ bh = BH_ENTRY(mapping->i_private_list.next);
+ WARN_ON_ONCE(bh->b_assoc_map != mapping);
+ __remove_assoc_queue(bh);
+ /* Avoid race with mark_buffer_dirty_inode() which does
+ * a lockless check and we rely on seeing the dirty bit */
+ smp_mb();
+ if (buffer_dirty(bh) || buffer_locked(bh)) {
+ list_add(&bh->b_assoc_buffers, &tmp);
+ bh->b_assoc_map = mapping;
+ if (buffer_dirty(bh)) {
+ get_bh(bh);
+ spin_unlock(&buffer_mapping->i_private_lock);
+ /*
+ * Ensure any pending I/O completes so that
+ * write_dirty_buffer() actually writes the
+ * current contents - it is a noop if I/O is
+ * still in flight on potentially older
+ * contents.
+ */
+ write_dirty_buffer(bh, REQ_SYNC);
+
+ /*
+ * Kick off IO for the previous mapping. Note
+ * that we will not run the very last mapping,
+ * wait_on_buffer() will do that for us
+ * through sync_buffer().
+ */
+ brelse(bh);
+ spin_lock(&buffer_mapping->i_private_lock);
+ }
+ }
+ }
+
+ spin_unlock(&buffer_mapping->i_private_lock);
+ blk_finish_plug(&plug);
+ spin_lock(&buffer_mapping->i_private_lock);
+
+ while (!list_empty(&tmp)) {
+ bh = BH_ENTRY(tmp.prev);
+ get_bh(bh);
+ __remove_assoc_queue(bh);
+ /* Avoid race with mark_buffer_dirty_inode() which does
+ * a lockless check and we rely on seeing the dirty bit */
+ smp_mb();
+ if (buffer_dirty(bh)) {
+ list_add(&bh->b_assoc_buffers,
+ &mapping->i_private_list);
+ bh->b_assoc_map = mapping;
+ }
+ spin_unlock(&buffer_mapping->i_private_lock);
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh))
+ err = -EIO;
+ brelse(bh);
+ spin_lock(&buffer_mapping->i_private_lock);
+ }
+ spin_unlock(&buffer_mapping->i_private_lock);
+ return err;
}
EXPORT_SYMBOL(sync_mapping_buffers);
@@ -719,99 +792,6 @@ bool block_dirty_folio(struct address_space *mapping, struct folio *folio)
}
EXPORT_SYMBOL(block_dirty_folio);
-/*
- * Write out and wait upon a list of buffers.
- *
- * We have conflicting pressures: we want to make sure that all
- * initially dirty buffers get waited on, but that any subsequently
- * dirtied buffers don't. After all, we don't want fsync to last
- * forever if somebody is actively writing to the file.
- *
- * Do this in two main stages: first we copy dirty buffers to a
- * temporary inode list, queueing the writes as we go. Then we clean
- * up, waiting for those writes to complete.
- *
- * During this second stage, any subsequent updates to the file may end
- * up refiling the buffer on the original inode's dirty list again, so
- * there is a chance we will end up with a buffer queued for write but
- * not yet completed on that list. So, as a final cleanup we go through
- * the osync code to catch these locked, dirty buffers without requeuing
- * any newly dirty buffers for write.
- */
-static int fsync_buffers_list(spinlock_t *lock, struct list_head *list)
-{
- struct buffer_head *bh;
- struct address_space *mapping;
- int err = 0;
- struct blk_plug plug;
- LIST_HEAD(tmp);
-
- blk_start_plug(&plug);
-
- spin_lock(lock);
- while (!list_empty(list)) {
- bh = BH_ENTRY(list->next);
- mapping = bh->b_assoc_map;
- __remove_assoc_queue(bh);
- /* Avoid race with mark_buffer_dirty_inode() which does
- * a lockless check and we rely on seeing the dirty bit */
- smp_mb();
- if (buffer_dirty(bh) || buffer_locked(bh)) {
- list_add(&bh->b_assoc_buffers, &tmp);
- bh->b_assoc_map = mapping;
- if (buffer_dirty(bh)) {
- get_bh(bh);
- spin_unlock(lock);
- /*
- * Ensure any pending I/O completes so that
- * write_dirty_buffer() actually writes the
- * current contents - it is a noop if I/O is
- * still in flight on potentially older
- * contents.
- */
- write_dirty_buffer(bh, REQ_SYNC);
-
- /*
- * Kick off IO for the previous mapping. Note
- * that we will not run the very last mapping,
- * wait_on_buffer() will do that for us
- * through sync_buffer().
- */
- brelse(bh);
- spin_lock(lock);
- }
- }
- }
-
- spin_unlock(lock);
- blk_finish_plug(&plug);
- spin_lock(lock);
-
- while (!list_empty(&tmp)) {
- bh = BH_ENTRY(tmp.prev);
- get_bh(bh);
- mapping = bh->b_assoc_map;
- __remove_assoc_queue(bh);
- /* Avoid race with mark_buffer_dirty_inode() which does
- * a lockless check and we rely on seeing the dirty bit */
- smp_mb();
- if (buffer_dirty(bh)) {
- list_add(&bh->b_assoc_buffers,
- &mapping->i_private_list);
- bh->b_assoc_map = mapping;
- }
- spin_unlock(lock);
- wait_on_buffer(bh);
- if (!buffer_uptodate(bh))
- err = -EIO;
- brelse(bh);
- spin_lock(lock);
- }
-
- spin_unlock(lock);
- return err;
-}
-
/*
* Invalidate any and all dirty buffers on a given inode. We are
* probably unmounting the fs, but that doesn't mean we have already
--
2.51.0
Thread overview: 65+ messages
2026-03-03 10:33 [PATCH 0/32] fs: Move metadata bh tracking from address_space Jan Kara
2026-03-03 10:33 ` [PATCH 01/32] fat: Sync and invalidate metadata buffers from fat_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 02/32] udf: Sync and invalidate metadata buffers from udf_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 03/32] minix: Sync and invalidate metadata buffers from minix_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 04/32] ext2: Sync and invalidate metadata buffers from ext2_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 05/32] ext4: Sync and invalidate metadata buffers from ext4_evict_inode() Jan Kara
2026-03-04 14:14 ` Theodore Tso
2026-03-03 10:33 ` [PATCH 06/32] ext4: Use inode_has_buffers() Jan Kara
2026-03-04 14:14 ` Theodore Tso
2026-03-03 10:33 ` [PATCH 07/32] bfs: Sync and invalidate metadata buffers from bfs_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 08/32] affs: Sync and invalidate metadata buffers from affs_evict_inode() Jan Kara
2026-03-03 10:33 ` [PATCH 09/32] fs: Ignore inode metadata buffers in inode_lru_isolate() Jan Kara
2026-03-03 10:33 ` [PATCH 10/32] fs: Stop using i_private_data for metadata bh tracking Jan Kara
2026-03-03 10:34 ` [PATCH 11/32] gfs2: Don't zero i_private_data Jan Kara
2026-03-03 12:32 ` Andreas Gruenbacher
2026-03-04 10:39 ` Jan Kara
2026-03-03 10:34 ` [PATCH 12/32] hugetlbfs: Stop using i_private_data Jan Kara
2026-03-10 7:24 ` kernel test robot
2026-03-10 7:24 ` [LTP] " kernel test robot
2026-03-03 10:34 ` [PATCH 13/32] aio: Stop using i_private_data and i_private_lock Jan Kara
2026-03-03 10:34 ` [PATCH 14/32] fs: Remove i_private_data Jan Kara
2026-03-03 10:34 ` [PATCH 15/32] fs: Drop osync_buffers_list() Jan Kara
2026-03-03 10:34 ` Jan Kara [this message]
2026-03-04 13:38 ` [PATCH 16/32] fs: Fold fsync_buffers_list() into sync_mapping_buffers() Christian Brauner
2026-03-05 16:14 ` Jan Kara
2026-03-03 10:34 ` [PATCH 17/32] fs: Move metadata bhs tracking to a separate struct Jan Kara
2026-03-04 13:38 ` Christoph Hellwig
2026-03-05 16:42 ` Jan Kara
2026-03-04 13:40 ` Christoph Hellwig
2026-03-05 16:39 ` Jan Kara
2026-03-03 10:34 ` [PATCH 18/32] fs: Provide operation for fetching mapping_metadata_bhs Jan Kara
2026-03-04 12:48 ` Christian Brauner
2026-03-04 13:19 ` Christoph Hellwig
2026-03-04 13:38 ` Jan Kara
2026-03-04 13:44 ` Christoph Hellwig
2026-03-03 10:34 ` [PATCH 19/32] ntfs3: Drop pointless sync_mapping_buffers() call Jan Kara
2026-03-04 13:41 ` Christoph Hellwig
2026-03-05 16:26 ` Jan Kara
2026-03-03 10:34 ` [PATCH 20/32] ocfs2: Drop pointless sync_mapping_buffers() calls Jan Kara
2026-03-03 10:34 ` [PATCH 21/32] bdev: Drop pointless invalidate_mapping_buffers() call Jan Kara
2026-03-03 14:03 ` Christoph Hellwig
2026-03-04 10:30 ` Jan Kara
2026-03-03 14:09 ` Christoph Hellwig
2026-03-04 10:36 ` Jan Kara
2026-03-04 13:29 ` Christoph Hellwig
2026-03-04 13:39 ` Christian Brauner
2026-03-05 15:58 ` Jan Kara
2026-03-03 10:34 ` [PATCH 22/32] fs: Switch inode_has_buffers() to take mapping_metadata_bhs Jan Kara
2026-03-03 10:34 ` [PATCH 23/32] ext2: Track metadata bhs in fs-private inode part Jan Kara
2026-03-03 10:34 ` [PATCH 24/32] affs: " Jan Kara
2026-03-03 10:34 ` [PATCH 25/32] bfs: " Jan Kara
2026-03-03 10:34 ` [PATCH 26/32] fat: " Jan Kara
2026-03-03 10:34 ` [PATCH 27/32] udf: " Jan Kara
2026-03-03 10:34 ` [PATCH 28/32] minix: " Jan Kara
2026-03-03 10:34 ` [PATCH 29/32] ext4: " Jan Kara
2026-03-03 10:34 ` [PATCH 30/32] vfs: Drop mapping_metadata_bhs from address space Jan Kara
2026-03-03 10:34 ` [PATCH 31/32] kvm: Use private inode list instead of i_private_list Jan Kara
2026-03-04 13:40 ` Christian Brauner
2026-03-05 16:25 ` Jan Kara
2026-03-04 13:42 ` Christoph Hellwig
2026-03-05 16:25 ` Jan Kara
2026-03-03 10:34 ` [PATCH 32/32] fs: Drop i_private_list from address_space Jan Kara
2026-03-04 13:43 ` Christoph Hellwig
2026-03-03 23:35 ` [syzbot ci] Re: fs: Move metadata bh tracking " syzbot ci
2026-03-04 12:32 ` [PATCH 0/32] " Christian Brauner