From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from cn.fujitsu.com ([59.151.112.132]:19150 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S939690AbdAJAff (ORCPT ); Mon, 9 Jan 2017 19:35:35 -0500 Subject: Re: [RFC] Converging userspace and kernel code To: Goldwyn Rodrigues , , Jeff Mahoney , David Sterba References: <512c15bf-3c4b-3b88-1106-a7d43386400f@suse.de> <65341974-d15f-408c-abad-98c2a5bb8743@cn.fujitsu.com> <2a7a4b30-3a90-1aaf-b9ba-9269c58a31c9@suse.de> From: Qu Wenruo Message-ID: <27be1998-e535-b630-e10c-778cc53acd2e@cn.fujitsu.com> Date: Tue, 10 Jan 2017 08:35:26 +0800 MIME-Version: 1.0 In-Reply-To: <2a7a4b30-3a90-1aaf-b9ba-9269c58a31c9@suse.de> Content-Type: text/plain; charset="utf-8"; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: At 01/09/2017 08:06 PM, Goldwyn Rodrigues wrote: > > > On 01/08/2017 08:11 PM, Qu Wenruo wrote: >> >> >> At 01/08/2017 09:16 PM, Goldwyn Rodrigues wrote: >>> >>> 1. Motivation >>> While fixing user space tools for btrfs-progs, I found a couple of bugs >>> which are already solved in kernel space but were not ported to user >>> space. User space is a little ignored when it comes to fixing bugs in >>> the core functionality. XFS developers have already performed this and >>> the userspace and kernel code walks in lockstep for libxfs. >> >> Personally speaking, I'm not a fan of re-using kernel code in btrfs-progs. >> >> In fact, in btrfs-progs, we don't need a lot of kernel facilities, like >> page/VFS/lock(btrfs-progs works in single thread under most case). > > The core functionality would be specific to btrfs algorithms. While I > agree it cannot do away with kernel components completely, we will be > minimizing the dependency of kernel components. The parts which interact > with kernel such as inode/dentry/etc handling would be moved out of this > core. If we can move kernel facilities completely (or at least most of them), then the idea sounds good for me. So the first step would be isolating btrfs algorithms from kernel facilities. > >> >> And that should make btrfs-progs easier to maintain. >> >> Furthermore, there are cases while kernel is doing things wrong while >> btrfs-progs does it right. >> >> Like RAID56 scrub, kernel scrub can screw up P/Q while (still >> out-of-tree though) scrub won't. >> > > Time to fix it? Patch already sent. BTW, in kernel and in btrfs-progs, scrub are completely different. Kernel scrub uses different flags and have tons of dirty hacks for variants fixes. While in btrfs-progs offline scrub, everything is just as easy as reading things out, check csum and parity. > >> >> Personally speaking, I'd like to see btrfs-progs as a lightweight >> re-implementation without the mess from kernel. >> Btrfs-progs and kernel can do cross-check, while btrfs-progs can be more >> developer friendly. > > This cross-check, and lack of thereof, is what I wish to save. As for > being developer-friendly, you need to know the internal workings of the > kernel components to work on btrfs-progs. The core shall act as a > library on top of which btrfs-progs will be built/migrated. That depends on how independent the "core" btrfs part is. If completely independent,(Don't ever has any page/VFS/race code) I'm completely fine with that. But the problem is, at least from current code base, even btrfs_path structure has quite a lot kernel facilities, mainly locks. (Just check the different size of btrfs_path in kernel and btrfs-progs) So although the idea itself is very nice, but the practice may be quite hard. That's why I prefer current btrfs-progs method, because it's more close to the core of btrfs algorithm, unlike kernel, which is a superset. In this case, I prefer to let kernel use and expand btrfs-progs code, other than pull in a superset from kernel. > >> >> (I'm already trying to re-implement btrfs_map_block() in btrfs-progs >> with a better interface in out-of-tree offline scrub patchset) >> >>> >>> >>> 2. Implementation >>> >>> 2.1 Code Re-arranaging >>> Re-arrange the kernel code so that core functions are in a separate >>> directory "libbtrfs" or "core". (There is already libbtrfs_objects >>> keyword defined in Makefile.in, so I am not sure if we should use >>> libbtrfs, or perhaps just core). The core directory will contain >>> algorithms, function prototypes and implementations spcific to btrfs. >> >> Well, we have kernel-lib/ for it already. >> And it sounds much like what you want. > > While kernel-lib exists, it is not complete to perform a drop-in > replacement. > >> >> Further more, there is already work to separate fs/btrfs/ctree.h and >> include/uapi/linux/btrfs_tree.h (already done in kernel though). > > Another advantage of performing this. doing the same thing twice ;) > >> >> We could start from the same thing in btrfs-progs. >> >>> >>> Comparing the current situation, ctree.h is pretty "polluted" with >>> functions prototypes which do not belong there, the definition of which >>> are not in ctree.c. An example: functions which use struct dentry, inode >>> or other kernel component can be moved out of the core. Besides, >>> functions which could survive with btrfs_inode as opposed to inode >>> should be changed so. We would need new files to split the logic, such >>> as creating inode.c to keep all inode related code. >>> >>> 2.2 Kernel Stubs >>> Making the core directory a drop-in replacement will require kernel >>> stubs which would make some meaning in user-space. This may or may not >>> be included in kerncompat.h. Personally, I would prefer to move >>> kerncompat.h into kernel-stub linux/*.h files. The down-side is we could >>> have a lot of files and directories for stubs, not forgetting the ones >>> for trace/*.h or event asm/*.h. >> >> Errr, I'm not a fan of 2.2 and 2.3 though. >> >> Yes, we have cases that over 90% code can be reused from kernel, like >> raid6 recovery tables. >> >> But that's almost pure algorithm part. (And I followed David's >> suggestion to put them into kernel-lib) >> > > Yes, it is algorithmic specific code which we want to pull. _Not_ the > entire fs/btrfs/*. This would require some additions to kernel-lib and > quite a few kernel stubs. > >> For anything kernel specific, like ftrace/mutex/page/VFS, I prefer a >> lightweight re-write, which is easier to write and review. > > And it should be so. > >> >> In my experience, I just used about 300 lines to rewrite >> btrfs_map_block(), compare to kernel 500+ lines and variant wrappers, no >> to mention easier to understand interface. >> Examples are: >> https://patchwork.kernel.org/patch/9488519/ >> https://patchwork.kernel.org/patch/9488505/ >> >> >> Especially since there may be more hidden bugs in kernel code. > > Like Christoph said, We'd rather fix "hidden bugs" in both kernel and > tools rather than just the tools. While most of the bugs I found in RAID5/6 and dev-replace are all related to race, which doesn't ever exist in btrfs-progs. (That's why I love btrfs-progs so much than the kernel part) So the problem is still how independent we can extract the core functions. Thanks, Qu > >> >> Thanks, >> Qu >> >>> >>> 2.3 Flag day >>> Flag day would be the day we move to the new directory structure. Until >>> then, we send the patches with the current directory structure. After >>> flag day, all patches must be ported to the new directory structure. We >>> could request developers to repost/retest patches leading up to the flag >>> day. >>> >>> >>> 3. Post converging >>> Checks/scripts to make sure that patches which are pushed to kernel >>> space will not render user space tools uncompilable. >>> >>> While these are my (and my teams) thoughts, I would like suggestions >>> and/or criticism to improve the idea. >>> >> >> >