From: Chao Yu via Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net>
To: Yongpeng Yang <monty_pavel@sina.com>, Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Yongpeng Yang <yangyongpeng@xiaomi.com>,
linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: invalidate block device page cache on umount
Date: Tue, 24 Mar 2026 16:34:57 +0800 [thread overview]
Message-ID: <95c41d72-0bcf-46d3-883b-793265cfd8cc@kernel.org> (raw)
In-Reply-To: <eb60f2ef-2c74-4c9d-9547-323042be5233@sina.com>
On 3/13/26 17:42, Yongpeng Yang wrote:
>
> On 3/13/26 15:19, Chao Yu via Linux-f2fs-devel wrote:
>> On 3/12/2026 7:34 PM, Yongpeng Yang wrote:
>>>
>>> On 3/12/26 18:49, Chao Yu via Linux-f2fs-devel wrote:
>>>> On 2026/3/12 18:41, Chao Yu wrote:
>>>>> On 2026/3/12 11:56, Yongpeng Yang wrote:
>>>>>>
>>>>>> On 3/12/26 08:49, Chao Yu via Linux-f2fs-devel wrote:
>>>>>>> On 2026/2/16 19:27, Yongpeng Yang wrote:
>>>>>>>> From: Yongpeng Yang <yangyongpeng@xiaomi.com>
>>>>>>>>
>>>>>>>> Neither F2FS nor VFS invalidates the block device page cache, which
>>>>>>>> results in reading stale metadata. An example scenario is shown
>>>>>>>> below:
>>>>>>>>
>>>>>>>> Terminal A Terminal B
>>>>>>>> mount /dev/vdb /mnt/f2fs
>>>>>>>> touch mx // ino = 4
>>>>>>>> sync
>>>>>>>> dump.f2fs -i 4 /dev/vdb// block on "[Y/N]"
>>>>>>>> touch mx2 // ino = 5
>>>>>>>> sync
>>>>>>>> umount /mnt/f2fs
>>>>>>>> dump.f2fs -i 5 /dev/vdb // block addr
>>>>>>>> is 0
>>>>>>>>
>>>>>>>> After umount, the block device page cache is not purged, causing
>>>>>>>> `dump.f2fs -i 5 /dev/vdb` to read stale metadata and see inode 5
>>>>>>>> with
>>>>>>>> block address 0.
>>>>>>>>
>>>>>>>> This patch calls invalidate_bdev during umount to invalidate the
>>>>>>>> block
>>>>>>>> device page cache, preventing stale metadata from being read.
>>>>>>>>
>>>>>>>> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
>>>>>>>> ---
>>>>>>>> fs/f2fs/super.c | 6 ++++++
>>>>>>>> 1 file changed, 6 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>>> index 1a755997aff5..39d3b52ceac1 100644
>>>>>>>> --- a/fs/f2fs/super.c
>>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>>> @@ -2091,6 +2091,12 @@ static void f2fs_put_super(struct
>>>>>>>> super_block *sb)
>>>>>>>> #if IS_ENABLED(CONFIG_UNICODE)
>>>>>>>> utf8_unload(sb->s_encoding);
>>>>>>>> #endif
>>>>>>>> + sync_blockdev(sb->s_bdev);
>>>>>>>
>>>>>>> We will call sync_blockdev in below path?
>>>>>>>
>>>>>>> - kill_f2fs_super
>>>>>>> - kill_block_super
>>>>>>> - generic_shutdown_super
>>>>>>> - put_super
>>>>>>> - sync_blockdev
>>>>>>>
>>>>>>> 1721 void kill_block_super(struct super_block *sb)
>>>>>>> 1722 {
>>>>>>> 1723 struct block_device *bdev = sb->s_bdev;
>>>>>>> 1724
>>>>>>> 1725 generic_shutdown_super(sb);
>>>>>>> 1726 if (bdev) {
>>>>>>> 1727 sync_blockdev(bdev);
>>>>>>> 1728 bdev_fput(sb->s_bdev_file);
>>>>>>> 1729 }
>>>>>>> 1730 }
>>>>>>>
>>>>>>>> + invalidate_bdev(sb->s_bdev);
>>>>>>
>>>>>> This works for an f2fs instance mounted on a single device, but it
>>>>>> does
>>>>>> not work for multi-device configurations, because the vfs cannot be
>>>>>> aware of FDEV(1).
>>>>>
>>>>> Yeah, I meant we can avoid duplicated sync_blockdev() for main
>>>>> device in
>>>>> f2fs_put_super().
>>>>>
>>>
>>> The call trace corresponding to this patch is as follows:
>>>
>>> kill_block_super
>>> - generic_shutdown_super
>>> - f2fs_put_super
>>> - sync_blockdev
>>> - invalidate_bdev
>>> - sync_blockdev
>>>
>>> There is indeed a duplicated call to sync_blockdev(). However, since all
>>> data must be written to disk before invalidate_bdev() is called,
>>> f2fs_put_super() must call sync_blockdev(). Because invalidate_bdev()
>>> has already cleared the page cache, the second call to sync_blockdev()
>>> does nothing.
>>>
>>>>>>
>>>>>>>
>>>>>>> I guess we can leave the device w/ uptodate cache, in case if
>>>>>>> there are
>>>>>>> multiple user on the device?
>>>>>>
>>>>>> The page cache of the block device file may contain data that is not
>>>>>> uptodate. For example, data may first be read directly through the
>>>>>> block
>>>>> > device file, and then the same blocks may be written through f2fs.
>>>>> Since> f2fs writes to the block device via the submit_bio path, it
>>>>> does not
>>>>>> update the page cache of the block device file. As a result, the
>>>>>> data in
>>>>>> the block device file’s page cache may become stale. Therefore, the
>>>>>> page
>>>>>> cache must be invalidated during unmount.
>>>>>
>>>>> Well, can we call ioctl(fd, BLKFLSBUF) for all devices belong to f2fs
>>>>> img like
>>>>> you did in ("f2fs-tools: invalidate block device page cache before
>>>>> reading
>>>>> metadata"), does that fix the issue?
>>>>
>>>> Oh, you already called the ioctl for all devices. Does that fix the
>>>> issue?
>>>>
>>>
>>> Yes, that can fix this issue. From a system robustness perspective, both
>>> the kernel and f2fs-tools should ensure that the uptodate data is read.
>>> Therefore, I made changes in both.
>>
>> Well, I think it will be a little bit overprotective, because f2fs kernel
>> module itself doesn't suffer any issue, and dump.f2fs already has its
>> way to avoid accessing stale data w/ your change.
>
> I overlooked the permission check in blkdev_flushbuf. After unmount,
> non-root users may still read stale data. Therefore, it is still
> necessary to ensure that the page cache is invalidated after unmount.
>
> blkdev_ioctl
> - blkdev_common_ioctl
> - blkdev_flushbuf
>
> static int blkdev_flushbuf(struct block_device *bdev, unsigned cmd,
> unsigned long arg)
> {
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
Well, can you describe the race case that non-root user still read stale data
in commit message?
>
> I also noticed that btrfs has encountered a similar issue:
> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg54188.html
And including this example as well.
Thanks,
>
> In that discussion, the solution was to call invalidate_bdev() when the
> device is closed.
>
> Thanks
> Yongpeng,
>
>>
>> Thanks,
>>
>>>
>>> Thanks
>>> Yongpeng,
>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Yongpeng,
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>> + for (i = 1; i < sbi->s_ndevs; i++) {
>>>>>>>> + sync_blockdev(FDEV(i).bdev);
>>>>>>>> + invalidate_bdev(FDEV(i).bdev);
>>>>>>>> + }
>>>>>>>> }
>>>>>>>> int f2fs_sync_fs(struct super_block *sb, int sync)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Linux-f2fs-devel mailing list
>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Linux-f2fs-devel mailing list
>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
>>
>>
>>
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> Linux-f2fs-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
next prev parent reply other threads:[~2026-03-24 8:35 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-16 11:27 [f2fs-dev] [PATCH] f2fs: invalidate block device page cache on umount Yongpeng Yang
2026-03-11 9:31 ` Yongpeng Yang
2026-03-12 0:49 ` Chao Yu via Linux-f2fs-devel
2026-03-12 3:56 ` Yongpeng Yang
2026-03-12 10:41 ` Chao Yu via Linux-f2fs-devel
2026-03-12 10:49 ` Chao Yu via Linux-f2fs-devel
2026-03-12 11:34 ` Yongpeng Yang
2026-03-13 7:19 ` Chao Yu via Linux-f2fs-devel
2026-03-13 9:42 ` Yongpeng Yang
2026-03-24 8:34 ` Chao Yu via Linux-f2fs-devel [this message]
2026-03-24 9:36 ` Yongpeng Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=95c41d72-0bcf-46d3-883b-793265cfd8cc@kernel.org \
--to=linux-f2fs-devel@lists.sourceforge.net \
--cc=chao@kernel.org \
--cc=jaegeuk@kernel.org \
--cc=monty_pavel@sina.com \
--cc=yangyongpeng@xiaomi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.