From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Juhyung Park <qkrwngud825@gmail.com>
Cc: Alexander Koskovich <akoskovich@pm.me>,
    linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [DISCUSSION] f2fs for desktop
Date: Tue, 4 Apr 2023 16:35:40 -0700
Message-ID: <ZCy0TBDkzh+VRrnU@google.com>
In-Reply-To: <CAD14+f3z=kS9E+NTKH7t1J2xL1PpLOVMNx=CabD_t2K6U=T9uQ@mail.gmail.com>

Hi Juhyung,

On 04/04, Juhyung Park wrote:
> Hi everyone,
>
> I want to start a discussion on using f2fs for regular desktops/workstations.
>
> There is growing interest in using f2fs as a general-purpose
> root file-system:
> 2018: https://www.phoronix.com/news/GRUB-Now-Supports-F2FS
> 2020: https://www.phoronix.com/news/Clear-Linux-F2FS-Root-Option
> 2023: https://code.launchpad.net/~nexusprism/curtin/+git/curtin/+merge/439880
> 2023: https://code.launchpad.net/~nexusprism/grub/+git/ubuntu/+merge/440193

This is quite promising. :)

>
> I've been personally running f2fs on all of my x86 Linux boxes since
> 2015, and I have several concerns that I think we need to collectively
> address for regular non-Android normies to use f2fs:
>
> A. Bootloader and installer support
> B. Host-side GC
> C. Extended node bitmap
>
> I'll go through each one.
>
> === A. Bootloader and installer support ===
>
> It seems that both GRUB and systemd-boot support f2fs without the
> need for a separate ext4-formatted /boot partition.
> Some distros seem to be disabling the f2fs module for GRUB, though, for
> security reasons:
> https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1868664
>
> It's ultimately up to the distro folks to enable this, and still in
> the worst-case scenario, they can specify a separate /boot partition
> and format it to ext4 upon installation.
>
> Installer support for offering f2fs and calling mkfs.f2fs is currently
> being worked on for Ubuntu. See the 2023 links above.
>
> Nothing f2fs mainline developers should do here, imo.
>
> === B. Host-side GC ===
>
> f2fs relieves most of the device-side GC but introduces a new
> host-side GC. This is extremely confusing for people who have no
> background in SSDs and flash storage to understand, let alone
> discard/trim/erase complications.
>
> In most consumer-grade blackbox SSDs, device-side GCs are handled
> automatically for various workloads. f2fs, however, leaves that
> responsibility to the userspace with conservative tuning on the
> kernel-side by default. Android handles this by init.rc tunings and a
> separate code running in vold to trigger gc_urgent.
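A minimal sketch of what such a userspace trigger could look like, assuming the kernel exposes the gc_urgent knob under /sys/fs/f2fs/<device>/. The helper name, the device name in the usage note, and the fixed duration are hypothetical, and this is not the actual vold code:

```python
import time
from pathlib import Path


def run_urgent_gc(device, seconds, sysfs_root="/sys/fs/f2fs"):
    """Enable gc_urgent for `seconds`, then always switch it back off.

    `device` is the block device name as it appears under sysfs_root
    (hypothetical example: "sda2").
    """
    knob = Path(sysfs_root) / device / "gc_urgent"
    knob.write_text("1")  # ask the background GC thread to run aggressively
    try:
        time.sleep(seconds)
    finally:
        knob.write_text("0")  # restore the conservative default
    return knob
```

Something like run_urgent_gc("sda2", 600) could be called from an idle-time job; writing to sysfs requires root, and a real tool would likely stop early once few dirty segments remain.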
>
> For regular Linux desktop distros, f2fs just runs on the default
> configuration set on the kernel and unless it’s running 24/7 with
> plentiful idle time, it quickly runs out of free segments and starts
> triggering foreground GC. This is giving people the wrong impression
> that f2fs slows down far more drastically than other file-systems when
> the opposite is true (i.e., less fragmentation over time).
>
> This is almost the equivalent of re-living the nightmare of trim. On
> SSDs with very small to no over-provisioned space, running a
> file-system with no discard whatsoever (sadly still a common case
> when an external SSD is used with no UAS) will also drastically slow
> the performance down. On file-systems with no asynchronous discard,
> mounting a file-system with the discard option adds a non-negligible
> overhead on every remove/delete operation, so most distros now
> (thankfully) use a timer job registered to systemd to trigger fstrim:
> https://github.com/util-linux/util-linux/commits/master/sys-utils/fstrim.timer
>
> This is still far from ideal. The default file-system, ext4, slows
> down drastically almost to a halt when fstrim -a is called, especially
> on SATA. For some reason that is still a mystery to me, people seem
> to be happy with it. No one bothered to improve it for years
> ¯\_(ツ)_/¯.
>
> So here’s my proposal:
> As Linux distros don’t have a good mechanism for hinting when to
> trigger GC, introduce a new Kconfig, CONFIG_F2FS_GC_UPON_FSTRIM and
> enable it by default.
> This config will hook up ioctl(FITRIM), which is currently ignored on
> f2fs
> (https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=master&id=e555da9f31210d2b62805cd7faf29228af7c3cfb),
> to perform discard and GC on all invalid segments.
> Userspace configuration with enough f2fs/GC knowledge such as Android
> should disable it.
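For reference, a rough sketch of the userspace side that such a hook would piggyback on: roughly what fstrim does per mount point. The FITRIM value below assumes the generic ioctl encoding (as on x86 and arm), and the helper names are made up for illustration. It shows only the existing ioctl, not the proposed kernel change:

```python
import fcntl
import os
import struct

# FITRIM is _IOWR('X', 121, struct fstrim_range). This constant assumes the
# generic ioctl number encoding; some architectures encode it differently.
FITRIM = 0xC0185879


def fstrim_range(start=0, length=2**64 - 1, minlen=0):
    """Pack struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }."""
    return struct.pack("=QQQ", start, length, minlen)


def trim(mountpoint):
    """Issue FITRIM on a mounted file-system and return the bytes trimmed."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        buf = bytearray(fstrim_range())
        fcntl.ioctl(fd, FITRIM, buf)  # the kernel writes the result into len
        return struct.unpack("=QQQ", buf)[1]
    finally:
        os.close(fd)
```

Calling trim("/") needs root privileges. With the proposed config enabled, the same call would additionally kick off GC on f2fs.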

How about adding an option like "memory=high" to tune background GC parameters
seamlessly?

>
> This will ensure that Linux distros that blindly call fstrim will at
> least trade the constant slowdowns caused by depleted free segments for
> an occasional (once a week) slowdown, which *people are already
> living with on ext4*. I'll even go further and mention that since f2fs
> GC is a regular R/W workload, it doesn't cause an extreme slowdown
> comparable to that of a full file-system trim operation.
>
> If this is acceptable, I’ll cook up a patch.
>
> In an ideal world, all Linux distros should have an explicit f2fs GC
> trigger mechanism (akin to
> https://github.com/kdave/btrfsmaintenance#distro-integration ), but
> it’s practically unrealistic to expect that, given the installer
> doesn’t even support f2fs for now.
>
> === C. Extended node bitmap ===
>
> f2fs by default has a very limited number of allowed inodes compared
> to other file-systems. Just 2 AOSP syncs are enough to exhaust them
> and result in -ENOSPC.
>
> Here are some stats collected from machines that my colleague and I
> use daily as regular desktops with GUI, web-browsing and everything:
> 1. Laptop
> Utilization: 68% (182914850 valid blocks, 462 discard blocks)
> - Node: 10234905 (Inode: 10106526, Other: 128379)
> - Data: 172679945
> - Inline_xattr Inode: 2004827
> - Inline_data Inode: 867204
> - Inline_dentry Inode: 51456
>
> 2. Desktop #1
> Utilization: 55% (133310465 valid blocks, 0 discard blocks)
> - Node: 6389660 (Inode: 6289765, Other: 99895)
> - Data: 126920805
> - Inline_xattr Inode: 2253838
> - Inline_data Inode: 1119109
> - Inline_dentry Inode: 187958
>
> 3. Desktop #2
> Utilization: 83% (202222003 valid blocks, 1 discard blocks)
> - Node: 21887836 (Inode: 21757139, Other: 130697)
> - Data: 180334167
> - Inline_xattr Inode: 39292
> - Inline_data Inode: 35213
> - Inline_dentry Inode: 1127
>
> 4. Colleague
> Utilization: 22% (108652929 valid blocks, 362420605 discard blocks)
> - Node: 5629348 (Inode: 5542909, Other: 86439)
> - Data: 103023581
> - Inline_xattr Inode: 655752
> - Inline_data Inode: 259900
> - Inline_dentry Inode: 193000
>
> 5. Android phone (for reference)
> Utilization: 78% (36505713 valid blocks, 1074 discard blocks)
> - Node: 704698 (Inode: 683337, Other: 21361)
> - Data: 35801015
> - Inline_xattr Inode: 683333
> - Inline_data Inode: 237470
> - Inline_dentry Inode: 112177
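To put the five reports side by side, here is a small sketch that computes how much of each file-system's valid blocks are node blocks, and how many of those nodes are inodes. The numbers are copied from the stats above; the dictionary layout and the function name are just for illustration:

```python
# (valid blocks, node blocks, inode blocks), copied from the stats above
stats = {
    "Laptop": (182914850, 10234905, 10106526),
    "Desktop #1": (133310465, 6389660, 6289765),
    "Desktop #2": (202222003, 21887836, 21757139),
    "Colleague": (108652929, 5629348, 5542909),
    "Android phone": (36505713, 704698, 683337),
}


def node_share(valid, nodes):
    """Percentage of valid blocks that are node blocks."""
    return 100.0 * nodes / valid


for name, (valid, nodes, inodes) in stats.items():
    inode_share = 100.0 * inodes / nodes
    print(f"{name:14s} nodes: {node_share(valid, nodes):5.2f}% of valid blocks, "
          f"inodes: {inode_share:5.2f}% of nodes")
```

In every report, node blocks plus data blocks add up exactly to the valid block count, so the percentages are directly comparable across machines.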
>
> Chao Yu added functionality to expand this via the -i flag passed to
> mkfs.f2fs back in 2018 -
> https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git/commit/?id=baaa076b4d576042913cfe34169442dfda651ca4
>
> I occasionally find myself in a weird position of having to tell
> people "Oh you should use the -i option of mkfs.f2fs" when they
> encounter this issue only after they’ve migrated most of the data and
> ask back "Why isn’t this enabled by default?".
>
> While this might not be an issue for the foreseeable future in
> Android, I’d argue that this is a feature that needs to be enabled by
> default for desktop environments with preferably a robust testing
> infrastructure. Guarding this with #ifndef __ANDROID__ doesn’t seem to
> make much sense as it introduces more complications to how
> fuzzing/testing should be done.
>
> I’ll also add that it’s a common practice for userspace mkfs tools to
> change defaults in ways that break older kernels (with options to
> produce a legacy image, of course).

Do you have any measurements of the additional space that the large NAT
occupies?

Thanks,

>
> This was a lengthy email, but I hope I was being reasonable.
>
> Jaegeuk and Chao, let me know what you think.
> And as always, thanks for your hard work :)
>
> Thanks,
> regards