On 3/16/26 19:41, Darrick J. Wong wrote:
> On Mon, Mar 16, 2026 at 04:08:55PM -0700, Joanne Koong wrote:
>> On Mon, Mar 16, 2026 at 11:04 AM Darrick J. Wong wrote:
>>>
>>> On Mon, Mar 16, 2026 at 10:56:21AM -0700, Joanne Koong wrote:
>>>> On Mon, Feb 23, 2026 at 2:46 PM Darrick J. Wong wrote:
>>>>>
>>>>> There are some warts remaining:
>>>>>
>>>>> a. I would like to continue the discussion about how the design review
>>>>> of this code should be structured, and how might I go about creating
>>>>> new userspace filesystem servers -- lightweight new ones based off
>>>>> the existing userspace tools? Or by merging lklfuse?
>>>>
>>>> What do you mean by "merging lklfuse"?
>>>
>>> Merging the lklfuse project into upstream Linux, which involves running
>>> the whole kit and caboodle through our review process, and then fixing
>>
>> Gotcha, so it would basically be having to port this arch/lkl
>> directory [1] into the linux tree
>
> Right.
>
>>> user-mode-linux to work anywhere other than x86.
>>
>> Are lklfuse and user-mode-linux (UML) two separate things or is
>> lklfuse dependent on user-mode-linux?
>
> I was under the impression that lklfuse uses UML. Given the weird
> things in arch/lkl/Kconfig:
>
> config 64BIT
>         bool "64bit kernel"
>         default y if OUTPUT_FORMAT = "pe-x86-64"
>         default $(success,$(srctree)/arch/lkl/scripts/cc-objdump-file-format.sh|grep -q '^elf64-') if OUTPUT_FORMAT != "pe-x86-64"
>
> I was kinda guessing x86_64 was the primary target of the developers?
>
> /me notes that he's now looked into libguestfs per Demi Marie's comments
> and some curiosity on the part of ngompa and i>
>
> Whatever it is that libguestfs does to stand up unprivileged fs mounts
> also could fit this bill. It's *really* slow to start because it takes
> the booted kernel, creates a largeish initramfs, boots that combo via
> libvirt, and then fires up a fuse server to talk to the vm kernel.
>
> I think all you'd have to do is change libguestfs to start the VM and
> run the fuse server inside a systemd container instead of directly from
> the CLI.

The feedback I have gotten from ngompa is that libguestfs is just too
slow for distros to use it to mount stuff.

>>>> Could you explain what the limitations of lklfuse are compared to the
>>>> fuse iomap approach in this patchset?
>>>
>>> The ones I know about are:
>>>
>>> 1> There's no support for vmapped kernel memory in UML mode, so anyone
>>> who requires a large contiguous memory buffer cannot assemble them out
>>> of "physical" pages. This has been a stumbling block for XFS in the
>>> past.
>>>
>>> 2> LKLFUSE still uses the classic fuse IO paths, which means that at
>>> best you can directio the IO through the lklfuse kernel. At worst you
>>> have to use the pagecache inside the lklfuse kernel, which is very
>>> wasteful.
>>
>> For the security / isolation use cases you've described, is
>> near-native performance a hard requirement?
>
> Not a hard requirement, just a means to convince people that they can
> choose containment without completely collapsing performance.
>
>> As I understand it, the main use cases of this will be for mounting
>> untrusted disk images and CI/filesystem testing, or are there broader
>> use cases beyond this?
>
> That covers nearly all of it.

It's worth noting that on ChromeOS and Android, the only trusted disk
images are those that are read-only and protected by dm-verity.
*Every* writable image is considered untrusted. I don't know if doing
a full fsck at each boot is considered acceptable, but I suspect it
would slow boot far too much.

Yes, Google ought to be paying for the kernel changes to fix this mess.

>>> 3> lklfuse hasn't been updated since 6.6.
>>
>> Gotcha. So if I'm understanding it correctly, the pros/cons come down to:
>> lklfuse pros:
>> - (arguably) easier setup cost.
>> once it's setup (assuming it's
>> possible to add support for the vmapped kernel memory thing you
>> mentioned above), it'll automatically work for every filesystem vs.
>> having to implement a fuse-iomap server for every filesystem
>
> Or even a good non-iomap fuse server for every filesystem. Admittedly
> the weak part of fuse4fs is that libext2fs is not as robust as the
> kernel is.
>
>> - easier to maintain vs. having to maintain each filesystem's
>> userspace server implementation
>
> Yeah.
>
>> lklfuse cons:
>> - worse (not sure by how much) performance
>
> Probably a lot, because now you have to run a full IO stack all the way
> through lklfuse.

How much is "a lot"? Is it "this is only useful for non-interactive
overnight backups", "you will notice this in benchmarks but it's okay
for normal use", or somewhere in between?

Could lklfuse and iomap be combined?

>> - once it's merged into the kernel, we can't choose to not
>> maintain/support it in the future
>
> Correct.
>
>> Am I understanding this correctly?
>>
>> In my opinion, if near-native performance is not a hard requirement,
>> it seems like less pain overall to go with lklfuse. lklfuse seems a
>> lot easier to maintain and I'm not sure if some complexities like
>> btrfs's copy-on-write could be handled properly with fuse-iomap.
>
> btrfs cow can be done with iomap, at least on the directio end. It's
> the other features like fsverity/fscrypt/data checksumming that aren't
> currently supported by iomap.

Pretty much everyone on btrfs uses data checksumming.

>> What are your thoughts on this?
>
> "Gee, what if I could simplify most of my own work out of existence?"

What is that work?

> --D
>
>> Thanks,
>> Joanne
>>
>> [1] https://github.com/lkl/linux/tree/master/arch/lkl
>>
>>>
>>> --D
>>>
>>>> Thanks,
>>>> Joanne
>>>>
>>>>>
>>>>> b. ext4 doesn't support out of place writes so I don't know if that
>>>>> actually works correctly.
>>>>>
>>>>> c. fuse2fs doesn't support the ext4 journal. Urk.
>>>>>
>>>>> d.
>>>>> There's a VERY large quantity of fuse2fs improvements that need to be
>>>>> applied before we get to the fuse-iomap parts. I'm not sending these
>>>>> (or the fstests changes) to keep the size of the patchbomb at
>>>>> "unreasonably large". :P As a result, the fstests and e2fsprogs
>>>>> postings are very targeted.
>>>>>
>>>>> e. I've dropped the fstests part of the patchbomb because v6 was just
>>>>> way too long.
>>>>>
>>>>> I would like to get the main parts of this submission reviewed for 7.1
>>>>> now that this has been collecting comments and tweaks in non-rfc status
>>>>> for 3.5 months.
>>>>>
>>>>> Kernel:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-bpf
>>>>>
>>>>> libfuse:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/libfuse.git/log/?h=fuse-iomap-bpf
>>>>>
>>>>> e2fsprogs:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/e2fsprogs.git/log/?h=fuse-iomap-bpf
>>>>>
>>>>> fstests:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuse2fs
>>>>>
>>>>> --Darrick
>>

--
Sincerely,
Demi Marie Obenour (she/her/hers)