From: Chao Yu via Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net>
To: hanqi <hanqi@vivo.com>, Jens Axboe <axboe@kernel.dk>, jaegeuk@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: f2fs supports uncached buffered I/O
Date: Fri, 25 Jul 2025 10:37:42 +0800
Message-ID: <1d03de75-26c4-4a58-af46-dafb319bed89@kernel.org>
In-Reply-To: <06b9d287-816c-4347-945b-8fda83a6f557@vivo.com>
On 7/25/2025 9:44 AM, hanqi wrote:
>
>
> On 2025/7/24 21:09, Chao Yu wrote:
>> On 2025/7/16 16:27, hanqi wrote:
>>>
>>>
>>> On 2025/7/16 11:43, Jens Axboe wrote:
>>>> On 7/15/25 9:34 PM, hanqi wrote:
>>>>>
>>>>> On 2025/7/15 22:28, Jens Axboe wrote:
>>>>>> On 7/14/25 9:10 PM, Qi Han wrote:
>>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>>> in git [1], and in f2fs, the feature can be enabled simply by setting
>>>>>>> the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>>> You need to ensure that, for any DONTCACHE IO, the completion is
>>>>>> routed via non-irq context, if applicable. I didn't verify that this is
>>>>>> the case for f2fs. Generally you can deduce this as well through
>>>>>> testing, I'd say the following cases would be interesting to test:
>>>>>>
>>>>>> 1) Normal DONTCACHE buffered read
>>>>>> 2) Overwrite DONTCACHE buffered write
>>>>>> 3) Append DONTCACHE buffered write
>>>>>>
>>>>>> Test those with DEBUG_ATOMIC_SLEEP set in your config, and if that
>>>>>> doesn't complain, that's a great start.
>>>>>>
>>>>>> For the above test cases as well, verify that page cache doesn't grow as
>>>>>> IO is performed. A bit is fine for things like meta data, but generally
>>>>>> you want to see it remain basically flat in terms of page cache usage.
>>>>>>
>>>>>> Maybe this is all fine, like I said I didn't verify. Just mentioning it
>>>>>> for completeness' sake.
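
For reference, the three cases listed above could be driven from userspace roughly as follows. This is only a minimal sketch, assuming glibc's preadv2()/pwritev2() wrappers and the RWF_DONTCACHE flag value from the Linux UAPI header; the helper names (read_dontcache, write_dontcache, dontcache_selftest) and the /tmp scratch path are made up for illustration and are not taken from the uncached_io_test tool used in this thread. When the kernel or filesystem rejects the flag, the helpers fall back to ordinary cached I/O.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for preadv2()/pwritev2() in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Value from the Linux UAPI <linux/fs.h>; older libc headers may lack it. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/* Nonzero if errno says the kernel or filesystem rejected the flag. */
static int dontcache_unsupported(int err)
{
    return err == EOPNOTSUPP || err == EINVAL || err == ENOSYS;
}

/* Case 1: buffered read that asks for the pages to be dropped afterwards. */
static ssize_t read_dontcache(int fd, void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t ret;

    assert(buf != NULL);
    ret = preadv2(fd, &iov, 1, off, RWF_DONTCACHE);
    if (ret < 0 && dontcache_unsupported(errno))
        ret = pread(fd, buf, len, off);     /* fall back to a cached read */
    return ret;
}

/* Cases 2 and 3: overwrite (off inside the file) or append (off == size). */
static ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    ssize_t ret;

    assert(buf != NULL);
    ret = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);
    if (ret < 0 && dontcache_unsupported(errno))
        ret = pwrite(fd, buf, len, off);    /* fall back to a cached write */
    return ret;
}

/* Hypothetical round trip on a scratch file: append, overwrite, read back. */
static int dontcache_selftest(void)
{
    char path[] = "/tmp/dontcache-XXXXXX";
    char out[8] = { 0 };
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    ok = write_dontcache(fd, "aaaabbbb", 8, 0) == 8   /* append to empty file */
      && write_dontcache(fd, "cccc", 4, 0) == 4       /* overwrite first half */
      && read_dontcache(fd, out, sizeof(out), 0) == 8
      && memcmp(out, "ccccbbbb", 8) == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Running such helpers against a file on the filesystem under test, on a kernel built with DEBUG_ATOMIC_SLEEP, would exercise the completion paths in question; the sketch by itself says nothing about whether f2fs handles them correctly.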
>>>>> Hi, Jens
>>>>> Thanks for your suggestion. As I mentioned earlier in [1], in f2fs,
>>>>> the regular buffered write path invokes folio_end_writeback from a
>>>>> softirq context. Therefore, it seems that f2fs may not be suitable
>>>>> for DONTCACHE I/O writes.
>>>>>
>>>>> I'd like to ask a question: why is DONTCACHE I/O write restricted to
>>>>> non-interrupt context only? Is it because dropping the page might be
>>>>> too time-consuming to be done safely in interrupt context? This might
>>>>> be a naive question, but I'd really appreciate your clarification.
>>>>> Thanks in advance.
>>>> Because (as of right now, at least) the code doing the invalidation
>>>> needs process context. There are various reasons for this, which you'll
>>>> see if you follow the path off folio_end_writeback() ->
>>>> filemap_end_dropbehind_write() -> filemap_end_dropbehind() ->
>>>> folio_unmap_invalidate(). unmap_mapping_folio() is one case, and while
>>>> that may be doable, the inode i_lock is not IRQ safe.
>>>>
>>>> Most file systems have a need to punt some writeback completions to
>>>> non-irq context, eg for file extending etc. Hence for most file systems,
>>>> the dontcache case just becomes another case that needs to go through
>>>> that path.
>>>>
>>>> It'd certainly be possible to improve upon this, for example by having
>>>> an opportunistic dontcache unmap from IRQ/soft-irq context, and then
>>>> punting to a workqueue if that doesn't pan out. But this doesn't exist
>>>> as of yet, hence the need for the workqueue punt.
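
To make the punt pattern described above concrete, here is a small userspace model, not kernel code. Every name in it (fake_folio, deferred_list, complete_write, drain_deferred) is hypothetical, and the "atomic context" is just a boolean argument. It only illustrates the control flow: a completion that needs the drop-behind invalidation is queued when it arrives in atomic context and is finished later by a worker that runs in process context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio whose writeback just completed. */
struct fake_folio {
    bool dropbehind;            /* written with DONTCACHE semantics */
    bool in_cache;              /* still present in the modelled page cache */
    struct fake_folio *next;    /* link in the deferred list */
};

/* Completions punted from atomic context, drained later by a worker. */
struct deferred_list {
    struct fake_folio *head;
};

/* Finish the completion; in this model only legal in process context. */
static void end_writeback_and_drop(struct fake_folio *f)
{
    assert(f != NULL);
    if (f->dropbehind)
        f->in_cache = false;    /* models the invalidation step */
}

/* Completion entry point; in_atomic models IRQ/softirq context. */
static void complete_write(struct deferred_list *dl, struct fake_folio *f,
                           bool in_atomic)
{
    if (in_atomic && f->dropbehind) {
        f->next = dl->head;     /* punt: no invalidation from atomic context */
        dl->head = f;
        return;
    }
    end_writeback_and_drop(f);
}

/* Worker body: runs in process context and finishes the punted folios. */
static size_t drain_deferred(struct deferred_list *dl)
{
    size_t n = 0;

    while (dl->head) {
        struct fake_folio *f = dl->head;

        dl->head = f->next;
        f->next = NULL;
        end_writeback_and_drop(f);
        n++;
    }
    return n;
}
```

In a real filesystem the deferred list and worker would be a workqueue and the drop would go through folio_end_writeback(); the model leaves out locking and everything else that makes the real path hard.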
>>
>> Thanks Jens for the detailed explanation.
>>
>>>
>>> Hi, Jens
>>> Thank you for your response. I tested uncached buffered I/O reads with
>>> a 50GB dataset on a local F2FS filesystem, and the page cache size
>>> only increased slightly, which I believe aligns with expectations.
>>> After clearing the page cache, the page cache size returned to its
>>> initial state. The test results are as follows:
>>>
>>> stat 50G.txt
>>> File: 50G.txt
>>> Size: 53687091200 Blocks: 104960712 IO Blocks: 512 regular file
>>>
>>> [read before]:
>>> echo 3 > /proc/sys/vm/drop_caches
>>> 01:48:17 kbmemfree kbavail kbmemused %memused kbbuffers kbcached  kbcommit %commit kbactive kbinact kbdirty
>>> 01:50:59   6404648 8149508   2719384    23.40       512  1898092 199384760  823.75  1846756  466832      44
>>>
>>> ./uncached_io_test 8192 1 1 50G.txt
>>> Starting 1 threads
>>> reading bs 8192, uncached 1
>>> 1s: 754MB/sec, MB=754
>>> ...
>>> 64s: 844MB/sec, MB=262144
>>>
>>> [read after]:
>>> 01:52:33   6326664 8121240   2747968    23.65       728  1947656 199384788  823.75  1887896  502004      68
>>> echo 3 > /proc/sys/vm/drop_caches
>>> 01:53:11   6351136 8096936   2772400    23.86       512  1900500 199385216  823.75  1847252  533768     104
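
The numbers above can be checked mechanically. The sketch below assumes the kbcached column corresponds to the "Cached:" line of /proc/meminfo; the function names are made up for illustration. It parses that value from meminfo-style text and expresses the page cache growth as a percentage of the bytes read. With the figures reported here (kbcached 1898092 before, 1947656 after, file size 53687091200 bytes) the growth is 49564 kB, a little under 0.1% of the file size.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the "Cached:" value in kB from /proc/meminfo-style text, or -1. */
static long parse_cached_kb(const char *text)
{
    const char *p = text;

    while (p && *p) {
        long kb;

        /* Match only at line start so "SwapCached:" is not picked up. */
        if (strncmp(p, "Cached:", 7) == 0 && sscanf(p + 7, "%ld", &kb) == 1)
            return kb;
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    return -1;
}

/* Page cache growth as a percentage of the bytes transferred. */
static double cache_growth_pct(long before_kb, long after_kb,
                               long long io_bytes)
{
    assert(io_bytes > 0);
    return 100.0 * (double)(after_kb - before_kb) * 1024.0 / (double)io_bytes;
}
```

Sampling the value before and after the read and passing both to cache_growth_pct() gives a single figure to compare across runs.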
>>>
>>> Hi Chao,
>>> Given that F2FS currently calls folio_end_writeback in softirq
>>> context for normal write scenarios, could we first support uncached
>>> buffered I/O reads? For normal uncached buffered I/O writes, would it be
>>> feasible for F2FS to introduce an asynchronous workqueue to handle the
>>> page drop operation in the future? What are your thoughts on this?
>>
>> Qi,
>>
>> Sorry for the delay.
>>
>> I think it would be good to support uncached buffered I/O in the read
>> path first, and then take a look at what we can do for the write path.
>> Let's do this step by step.
>>
>> Can you please update the patch?
>> - support read path only
>> - include test data in commit message
> Chao
>
> I will re-submit a patch to first enable F2FS support for uncached
> buffered I/O reads. Following that, I will work on implementing
> asynchronous page dropping in F2FS.
Qi, sure, please go ahead, thanks for the work. :)
Thanks,
>
> Thank you!
>>
>>> Thank you!
>>>
>>>
>