From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Theodore Tso <tytso@mit.edu>
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
Matthew Wilcox <willy@infradead.org>,
linux-f2fs-devel@lists.sourceforge.net,
Christoph Hellwig <hch@infradead.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
Akilesh Kailash <akailash@google.com>,
Christian Brauner <christian@brauner.io>
Subject: Re: [f2fs-dev] [PATCH v2] f2fs: another way to set large folio by remembering inode number
Date: Wed, 27 May 2026 02:43:21 +0000
Message-ID: <ahZaScMpx19ZLQi4@google.com>
In-Reply-To: <psj3kr2gcze2yll5xdbvyyzxwcwhds5gh55poobpkfxrkpbgr7@ljdindismzd4>
On 05/26, Theodore Tso wrote:
> On Tue, May 26, 2026 at 09:52:40PM +0000, Jaegeuk Kim wrote:
> > > It seems... surprising that the additional I/O operations are actually
> > > throttling UFS device bandwidth by 2x (4GB/s vs 2GB/s). Have you dug
> > > into why this is happening, and whether there is anything that can be
> > > optimized below the file system?
> >
> > I can't tell the exact size tho, roughly it's between 1GB and
> > 4GB. And, per lots of test results with various tunings, it turned
> > out memory allocation speed was the culprit. If we use 4KB pages, we
> > couldn't get the full bandwidth unless we set the biggest core
> > running at the highest frequency.
>
> OK, if we assume that the model file that you want to load is 2GB,
> then the number of 4k pages that you need is a bit over half a million
> (524288). So if it takes 1 second without large folios (2 GB/s as you
> stated above), and a half-second with them (4 GB/s), then you're basically
> saying that it was costing you a half-second to allocate 524288
> singleton pages. And the whole point of this exercise is to save that
> half second?
>
> And I assume that these timings were using performance cores, and part
> of the goal here is to be able to use an efficiency core instead.
>
> Did I get that right?
Yes, right.
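
For reference, the arithmetic being confirmed here can be written out as a
small sketch. It assumes a 2 GiB model file, 4 KiB base pages, and treats
"GB" as GiB; it also assumes the 4 KiB-page path is the 2 GB/s one and the
large-folio path is the 4 GB/s one, as described earlier in the thread. The
function names are illustrative only, and attributing the whole time
difference to page allocation is the simplification made in the discussion,
not a measurement.

```python
GIB = 1 << 30  # "GB" in the thread is treated as GiB here (assumption)


def page_count(size_bytes, page_size=4096):
    """Number of base pages needed to cover size_bytes (rounded up)."""
    return -(-size_bytes // page_size)


def load_seconds(size_bytes, bytes_per_second):
    """Time to read size_bytes at a sustained bandwidth."""
    return size_bytes / bytes_per_second


model_bytes = 2 * GIB                                  # assumed model size
pages = page_count(model_bytes)                        # 4 KiB pages needed
slow_seconds = load_seconds(model_bytes, 2 * GIB)      # 4 KiB pages, 2 GB/s
fast_seconds = load_seconds(model_bytes, 4 * GIB)      # large folios, 4 GB/s

# If the whole difference is charged to single-page allocation, this is the
# implied cost per 4 KiB page in nanoseconds.
per_page_ns = (slow_seconds - fast_seconds) / pages * 1e9

if __name__ == "__main__":
    print(f"pages={pages} slow={slow_seconds}s fast={fast_seconds}s "
          f"per-page={per_page_ns:.0f}ns")
```

Under those assumptions the half-second gap works out to a bit under one
microsecond per 4 KiB page.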
>
> > > But the problem with using small folios is that if you want to
> > > actually *use* the memory, unless you want to segment out the memory
> > > so it can't be used for anything other than the AI models (e.g., by
> > > using something like hugetlbfs) it's just going to break up the memory
> > > into smaller folios. So that's not actually going to *help* in actual
> > > real life use cases. It might help for your artificial benchmarks /
> > > experiments, but in the real life case where Android applications are
> > > running and fragmenting all of the device memory, the large folios
> > > won't be available *anyway*.
> >
> > Agreed it's hard to get this done perfectly tho, as the best effort on this
> > particular AI model case, I focused on two timings when loading the models:
> > 1) right after device boot, 2) dynamic loading when required. To secure high
> > order pages, for 1), I disabled the large folio consumed by EROFS, while for
> > 2), I tried to call compact_memory before loading the model. In both cases,
> > I could observe we could get a fair amount of large folios. Yes, not 100% tho.
>
> If (1) is a common case in real life, the thing to do would be to grab
> 2GB of large folios early in the startup sequence, and then let
> erofs do its thing --- and then at the end of the startup, right before you
> load the model, you can release the 2GB worth of large folios.
>
> (That being said, I'm guessing #1 is actually not that interesting,
> since as a percentage of the time that it takes for an Android device
> to start up, is adding an extra half-second *really* going to be
> noticeable by the user?)
>
> But for case #2, that's the much more challenging case. If you don't
> call compact_memory() you're going to burn half a second to allocate
> the 4k pages, since the large folios won't be available. But if you
> *do* call compact_memory() in a production ROM, depending on how fragmented
> the memory is and how much memory you have, calling compact_memory() could take
> **minutes**. So what's the point?
>
> The bottom line is if it's right after device boot, there are simple
> techniques that don't require hacking up the f2fs. But in the
> demand-loaded case, calling compact_memory() is the last thing you'll
> want to do. You're better off either asking the mm to allocate the 4k
> pages, or do whatever compaction it can do to just free up 2GB worth
> of folios. (Calling compact_memory() is overkill, and only makes
> sense in the context of benchmark / proof of concept demo.)
>
> Either way, trying to get file systems to avoid using large folios in
> the hopes that this will speed up large AI model loading.... doesn't
> seem to make sense.
>
> If the problem is fundamentally about making 2GB worth of large folios
> available in a way that takes significantly less time than just
> allocating the model using a half-million 4k pages, that's the question
> that we should be asking Matthew and the mm folks. Which is why it
> was too bad we didn't raise this issue at LSF/MM earlier this month.
Thanks for the context. To clarify a piece I missed earlier: the model pages
are also utilized for inference. Our data shows that larger chunks yield
higher inference speeds. Consequently, I required high-order pages to optimize
both read throughput and inference latency. I will halt my current efforts
and wait for alternative suggestions.
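
On the "fair amount of large folios, not 100%" observation quoted above: one
rough way to see how much high-order memory is free before a load is to sum
the per-order free-block counts in /proc/buddyinfo. The following is a
minimal sketch, not what was used for the results in this thread. It assumes
4 KiB base pages and the usual buddyinfo layout ("Node N, zone NAME" followed
by one free-block count per order, starting at order 0); the function names
and the choice of order threshold are hypothetical. Free buddy blocks are
only an indication; they do not guarantee that a large folio allocation will
succeed.

```python
PAGE_SIZE = 4096  # assumed base page size


def free_bytes_at_or_above(buddyinfo_text, min_order, page_size=PAGE_SIZE):
    """Sum free memory held in buddy blocks of order >= min_order.

    buddyinfo_text is the content of /proc/buddyinfo: each line looks like
    "Node 0, zone Normal c0 c1 c2 ...", where cK is the number of free
    blocks of 2**K contiguous pages.
    """
    total = 0
    for line in buddyinfo_text.splitlines():
        fields = line.split()
        # Expect: "Node", "<id>,", "zone", "<name>", then the counts.
        if len(fields) < 5 or fields[0] != "Node":
            continue
        counts = [int(c) for c in fields[4:]]
        for order, count in enumerate(counts):
            if order >= min_order:
                total += count * (page_size << order)
    return total


def read_buddyinfo(path="/proc/buddyinfo"):
    """Read the live buddyinfo file (Linux only)."""
    with open(path) as f:
        return f.read()
```

For example, free_bytes_at_or_above(read_buddyinfo(), 4) gives the bytes
currently free in blocks of 64 KiB or larger on a 4 KiB-page system, which
can be compared against the model size before deciding how to load it.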
>
> > Indeed, I was off from LSF/MM for years due to various product issues, not
> > related to F2FS tho. Let me make some effort to attend upcoming ones like LPC,
> > if I can get the budget from the company.
>
> Next time, as a suggestion, feel free to raise the issue when the
> LSF/MM CFP goes out, even if you don't think it's likely you will get
> an invite. Indeed, with a sufficiently interesting topic, that's the
> way to *get* an invitation. It will require breaking down the
> technical requirements as you and I have done for the last few messages on
> this thread.
>
> Even if you can't attend LSF/MM due to time or budget reasons, there
> are a number of your colleagues who are attending, who could raise the
> question on your behalf. I've been known to do that once or twice on
> behalf of other Google teams. But it does require that you approach
> the usual LSF/MM suspects a good 2-3 months before the conference so
> we can help you craft an appropriate response to the CFP.
Thanks for the suggestion. Will definitely do.
>
> Cheers,
>
> - Ted
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel