From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Juhyung Park <qkrwngud825@gmail.com>
Cc: Alexander Koskovich <akoskovich@pm.me>,
linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [DISCUSSION] f2fs for desktop
Date: Tue, 4 Apr 2023 16:35:40 -0700
Message-ID: <ZCy0TBDkzh+VRrnU@google.com>
In-Reply-To: <CAD14+f3z=kS9E+NTKH7t1J2xL1PpLOVMNx=CabD_t2K6U=T9uQ@mail.gmail.com>

Hi Juhyung,

On 04/04, Juhyung Park wrote:
> Hi everyone,
>
> I want to start a discussion on using f2fs for regular desktops/workstations.
>
> There is growing interest in using f2fs as a general-purpose
> root file-system:
> 2018: https://www.phoronix.com/news/GRUB-Now-Supports-F2FS
> 2020: https://www.phoronix.com/news/Clear-Linux-F2FS-Root-Option
> 2023: https://code.launchpad.net/~nexusprism/curtin/+git/curtin/+merge/439880
> 2023: https://code.launchpad.net/~nexusprism/grub/+git/ubuntu/+merge/440193
This is quite promising. :)
>
> I've been personally running f2fs on all of my x86 Linux boxes since
> 2015, and I have several concerns that I think we need to collectively
> address for regular non-Android normies to use f2fs:
>
> A. Bootloader and installer support
> B. Host-side GC
> C. Extended node bitmap
>
> I'll go through each one.
>
> === A. Bootloader and installer support ===
>
> It seems that both GRUB and systemd-boot support f2fs without the
> need for a separate ext4-formatted /boot partition.
> Some distros seem to be disabling the f2fs module for GRUB, though,
> for security reasons:
> https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1868664
>
> It's ultimately up to the distro folks to enable this, and still in
> the worst-case scenario, they can specify a separate /boot partition
> and format it to ext4 upon installation.
>
> Support in the installer itself for offering f2fs and calling
> mkfs.f2fs is currently being worked on for Ubuntu. See the 2023
> links above.
>
> Nothing f2fs mainline developers should do here, imo.
>
> === B. Host-side GC ===
>
> f2fs relieves most of the device-side GC but introduces a new
> host-side GC. This is extremely confusing for people who have no
> background in SSDs and flash storage to understand, let alone
> discard/trim/erase complications.
>
> In most consumer-grade blackbox SSDs, device-side GCs are handled
> automatically for various workloads. f2fs, however, leaves that
> responsibility to the userspace with conservative tuning on the
> kernel-side by default. Android handles this with init.rc tunings and
> separate code running in vold to trigger gc_urgent.
>
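For readers who have not seen the knob mentioned above, here is a minimal sketch of what toggling gc_urgent from userspace could look like. It assumes the per-device sysfs node /sys/fs/f2fs/<device>/gc_urgent, where writing 1 turns urgent GC on and 0 turns it off. The helper name and the device name are made up for illustration, and this is not what vold actually runs.

```python
from pathlib import Path


def set_gc_urgent(device: str, enabled: bool, sysfs_root: str = "/sys/fs/f2fs") -> Path:
    """Write 1 or 0 to the gc_urgent node of one mounted f2fs device.

    `device` is the block device name as it appears under /sys/fs/f2fs
    (for example "nvme0n1p2", a hypothetical name). Needs root on a real
    system. `sysfs_root` is only a parameter so the helper can be tried
    against a scratch directory.
    """
    node = Path(sysfs_root) / device / "gc_urgent"
    node.write_text("1" if enabled else "0")
    return node
```

A desktop distro could call something like this from an idle-time job and switch it back off afterwards, much like the fstrim timer mentioned further down.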
> For regular Linux desktop distros, f2fs just runs on the default
> configuration set in the kernel and, unless it’s running 24/7 with
> plentiful idle time, it quickly runs out of free segments and starts
> triggering foreground GC. This is giving people the wrong impression
> that f2fs slows down far more drastically than other file-systems
> when quite the contrary is true (i.e., less fragmentation over time).
>
> This is almost the equivalent of re-living the nightmare of trim. On
> SSDs with very small to no over-provisioned space, running a
> file-system with no discard whatsoever (sadly still a common case
> when an external SSD is used with no UAS) will also drastically slow
> the performance down. On file-systems with no asynchronous discard,
> mounting a file-system with the discard option adds a non-negligible
> overhead on every remove/delete operation, so most distros now
> (thankfully) use a timer job registered to systemd to trigger fstrim:
> https://github.com/util-linux/util-linux/commits/master/sys-utils/fstrim.timer
>
> This is still far from ideal. The default file-system, ext4, slows
> down drastically almost to a halt when fstrim -a is called, especially
> on SATA. For some reason that is still a mystery to me, people seem
> to be happy with it. No one bothered to improve it for years
> ¯\_(ツ)_/¯.
>
> So here’s my proposal:
> As Linux distros don’t have a good mechanism for hinting when to
> trigger GC, introduce a new Kconfig, CONFIG_F2FS_GC_UPON_FSTRIM, and
> enable it by default.
> This config will hook up ioctl(FITRIM), which is currently ignored on
> f2fs ( https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=master&id=e555da9f31210d2b62805cd7faf29228af7c3cfb ),
> to perform discard and GC on all invalid segments.
> Userspace with enough f2fs/GC knowledge, such as Android, should
> disable it.
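
As background on the call the proposal would hook: a rough sketch of the FITRIM request that fstrim issues against a mount point. It assumes the struct fstrim_range layout from <linux/fs.h> (three u64 fields: start, len, minlen) and the _IOWR('X', 121, ...) encoding used on x86 and arm; a few architectures encode ioctl direction bits differently. The function names are illustrative only.

```python
import fcntl
import os
import struct

# struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }
_RANGE_FMT = "QQQ"

# _IOWR('X', 121, struct fstrim_range) with the common (x86/arm) bit layout:
# direction (read|write = 3) << 30 | size << 16 | type << 8 | nr
FITRIM = (3 << 30) | (struct.calcsize(_RANGE_FMT) << 16) | (ord("X") << 8) | 121


def fstrim_range(start: int = 0, length: int = 2**64 - 1, minlen: int = 0) -> bytes:
    """Pack a trim request; the defaults cover the whole file-system."""
    return struct.pack(_RANGE_FMT, start, length, minlen)


def trim_mountpoint(mountpoint: str) -> int:
    """Issue FITRIM on a mount point and return the byte count the kernel
    reports back in the len field. Needs root and a file-system that
    implements the ioctl."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        buf = bytearray(fstrim_range())
        fcntl.ioctl(fd, FITRIM, buf)  # the kernel updates buf in place
        _, trimmed, _ = struct.unpack(_RANGE_FMT, buf)
        return trimmed
    finally:
        os.close(fd)
```

Since distros already make this call on a timer, a kernel-side hook would need no userspace changes at all, which is the appeal of the proposal.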
How about adding an option like "memory=high" to tune background GC parameters
seamlessly?
>
> This will ensure that Linux distros that blindly call fstrim will at
> least avoid constant slowdowns when free segments are depleted with
> the occasional (once a week) slowdown, which *people are already
> living with on ext4*. I'll even go further and mention that since f2fs
> GC is a regular R/W workload, it doesn't cause an extreme slowdown
> comparable to that of a full file-system trim operation.
>
> If this is acceptable, I’ll cook up a patch.
>
> In an ideal world, all Linux distros should have an explicit f2fs GC
> trigger mechanism (akin to
> https://github.com/kdave/btrfsmaintenance#distro-integration ), but
> it’s practically unrealistic to expect that, given the installer
> doesn’t even support f2fs for now.
>
> === C. Extended node bitmap ===
>
> f2fs by default has a very limited number of allowed inodes compared
> to other file-systems. Just 2 AOSP syncs are enough to exhaust f2fs
> and result in -ENOSPC.
>
> Here are some stats collected from machines that my colleague and I
> use daily as regular desktops with GUI, web browsing and everything:
> 1. Laptop
> Utilization: 68% (182914850 valid blocks, 462 discard blocks)
> - Node: 10234905 (Inode: 10106526, Other: 128379)
> - Data: 172679945
> - Inline_xattr Inode: 2004827
> - Inline_data Inode: 867204
> - Inline_dentry Inode: 51456
>
> 2. Desktop #1
> Utilization: 55% (133310465 valid blocks, 0 discard blocks)
> - Node: 6389660 (Inode: 6289765, Other: 99895)
> - Data: 126920805
> - Inline_xattr Inode: 2253838
> - Inline_data Inode: 1119109
> - Inline_dentry Inode: 187958
>
> 3. Desktop #2
> Utilization: 83% (202222003 valid blocks, 1 discard blocks)
> - Node: 21887836 (Inode: 21757139, Other: 130697)
> - Data: 180334167
> - Inline_xattr Inode: 39292
> - Inline_data Inode: 35213
> - Inline_dentry Inode: 1127
>
> 4. Colleague
> Utilization: 22% (108652929 valid blocks, 362420605 discard blocks)
> - Node: 5629348 (Inode: 5542909, Other: 86439)
> - Data: 103023581
> - Inline_xattr Inode: 655752
> - Inline_data Inode: 259900
> - Inline_dentry Inode: 193000
>
> 5. Android phone (for reference)
> Utilization: 78% (36505713 valid blocks, 1074 discard blocks)
> - Node: 704698 (Inode: 683337, Other: 21361)
> - Data: 35801015
> - Inline_xattr Inode: 683333
> - Inline_data Inode: 237470
> - Inline_dentry Inode: 112177
>
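To make the figures above easier to compare, here is a small sketch that computes what share of the valid blocks are node blocks on each machine. All numbers are copied from the stats quoted above; the dictionary layout and the helper name are just for illustration.

```python
# (valid blocks, node blocks, inode blocks, other node blocks, data blocks)
STATS = {
    "Laptop":        (182914850, 10234905, 10106526, 128379, 172679945),
    "Desktop #1":    (133310465, 6389660, 6289765, 99895, 126920805),
    "Desktop #2":    (202222003, 21887836, 21757139, 130697, 180334167),
    "Colleague":     (108652929, 5629348, 5542909, 86439, 103023581),
    "Android phone": (36505713, 704698, 683337, 21361, 35801015),
}


def node_share(name: str) -> float:
    """Percentage of valid blocks that are node blocks."""
    valid, node, _inode, _other, _data = STATS[name]
    return 100.0 * node / valid


if __name__ == "__main__":
    for name in STATS:
        print(f"{name:14s} node share: {node_share(name):5.2f}%")
```

On these numbers the desktops land at roughly 5 to 11 percent node blocks, against about 2 percent on the phone, which fits the point that desktop workloads use far more nodes relative to data.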
> Chao Yu added functionality to expand this via the -i flag passed to
> mkfs.f2fs back in 2018 -
> https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git/commit/?id=baaa076b4d576042913cfe34169442dfda651ca4
>
> I occasionally find myself in a weird position of having to tell
> people "Oh you should use the -i option from mkfs.f2fs" when they
> encounter this issue only after they’ve migrated most of the data and
> ask back "Why isn’t this enabled by default?".
>
> While this might not be an issue for the foreseeable future in
> Android, I’d argue that this is a feature that needs to be enabled by
> default for desktop environments, preferably with a robust testing
> infrastructure. Guarding this with #ifndef __ANDROID__ doesn’t seem to
> make much sense as it introduces more complications to how
> fuzzing/testing should be done.
>
> I’ll also add that it’s a common practice for userspace mkfs tools to
> introduce default changes that break compatibility with older kernels
> (with options to produce a legacy image, of course).
Do you have some measurements regarding the additional space that a large NAT
occupies?
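
As a starting point for that measurement, here is a back-of-the-envelope sketch rather than a real measurement. It assumes 4 KiB blocks, a 9-byte packed NAT entry (1-byte version, 4-byte ino, 4-byte block address, so 455 entries per block) and two on-disk NAT copies. It only estimates the NAT blocks needed to cover a given number of node IDs, not what mkfs.f2fs actually reserves with or without -i.

```python
BLOCK_SIZE = 4096          # assumed f2fs block size in bytes
NAT_ENTRY_SIZE = 9         # assumed: version (1) + ino (4) + block_addr (4)
ENTRIES_PER_BLOCK = BLOCK_SIZE // NAT_ENTRY_SIZE  # 455


def nat_bytes(node_count: int, copies: int = 2) -> int:
    """Bytes of NAT needed to map `node_count` node IDs, counting
    whole blocks and both NAT copies."""
    blocks = -(-node_count // ENTRIES_PER_BLOCK)  # ceiling division
    return blocks * BLOCK_SIZE * copies


if __name__ == "__main__":
    # Node count of "Desktop #2" from the stats quoted above.
    nodes = 21887836
    print(f"{nat_bytes(nodes) / 2**20:.1f} MiB of NAT for {nodes} nodes")
```

Under those assumptions even the busiest machine in the stats would need only a few hundred MiB of NAT, but real numbers from mkfs.f2fs -i would be better.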
Thanks,
>
> This was a lengthy email, but I hope I was being reasonable.
>
> Jaegeuk and Chao, let me know what you think.
> And as always, thanks for your hard work :)
>
> Thanks,
> regards
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel