From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
Christian Brauner <brauner@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-ext4@vger.kernel.org, Ted Tso <tytso@mit.edu>,
"Tigran A. Aivazian" <aivazian.tigran@gmail.com>,
David Sterba <dsterba@suse.com>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, linux-aio@kvack.org,
Benjamin LaHaise <bcrl@kvack.org>
Subject: Re: [PATCH 15/42] fat: Sync and invalidate metadata buffers from fat_evict_inode()
Date: Tue, 31 Mar 2026 19:40:01 +0900
Message-ID: <87wlyss2ny.fsf@mail.parknet.co.jp>
In-Reply-To: <mjraa5thsnzchejoomqgiahwajx4cs5ryqnmn7zsi3fylzhrrr@5ip65jcrlknn>
Jan Kara <jack@suse.cz> writes:
>> >> Hm, why do we have to add this here? For FAT, if buffers are still
>> >> dirty, buffers will be flushed via bdev flush?
>> >
>> > The reason why I've put sync_mapping_buffers() here is the following
>> > sequence:
>> > fd = open("file")
>> > write(fd)
>> > close(fd)
>> > - now data gets written out, dentry & inode can get evicted from memory
>> > fd = open("file")
>> > fsync(fd)
>> > - this should flush all dirty metadata associated with "file" but if we
>> > didn't call sync_mapping_buffers() during inode eviction we wouldn't
>> > have a way to do that.
>> >
>> > So in general I think sync_mapping_buffers() call is indeed needed.
>>
>> Hm, it looks like not new issue, isn't it? Why we have changed now in
>> this series?
>
> It isn't a new issue. But so far inode_lru_isolate() was checking whether
> the metadata buffers list has any dirty buffers and if yes, it skipped the
> inode. So inodes with dirty buffers in this list could reach .evict method
> only for deleted inodes or during unmount and either case makes above
> problem impossible to happen. This is however a layering violation (generic
> inode handling code shouldn't care about details of buffer heads) and as a
> result it makes it difficult to abstract the metadata buffer handling this
> series is doing. And all this for a handful of filesystems which,
> honestly, aren't used in performance critical settings.
I see, and I understand why you want to change it. However, I think
this change has a disadvantage (for the reason below), so changing this
behavior only because of a layering violation is not a good way, IMO.
>> It is including trade off write amplification vs reliability (i.e. may
>> not call fsync()), for example. So I think we should not add it easily.
>
> I expect in practice you'll hardly be able to observe the difference as
> inodes usually get quite a while to be reclaimed at which point the dirty
> buffers would be already flushed by background writeback. I don't see how
> this change would lead specifically to "write amplification" - that would
> mean frequent redirtying of the same metadata buffer of an inode
> interleaved with frequent reclaims of the inode and I don't see how that
> would happen in a realistic setting.
>
> If someone comes with a realistic workload which would suffer significant
> regression from this change, then of course we should address it. I have
> plans for adding an interface for filesystems to expose the information
> that inode has some pending dirty metadata and a way to flush them from
> flush worker because that is a common need a lot of filesystems has and
> doing the flushing from .evict isn't always doable due to locking
> constraints.
I think it would happen in normal operation, for example when copying
many files whose total size exceeds total memory. I think this is much
more common than the write=>close=>open=>fsync sequence in your example.
Anyway, with this change, the metadata of reclaimed inodes will be
flushed forcibly and frequently (it may not be significant, but I can't
see the benefit for users from this change), and we lose the chance to
combine multiple dirtyings of the same buffer into one write while
copying many files.
> I'm still thinking about details but this has to be a properly
> abstracted interface all filesystems can use and not a special hack for a
> handful of old filesystems.
Sounds great. How about we delay this behavior change until this
interface is available?
Thanks.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>