Re: [RFC] Converging userspace and kernel code

From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Jeff Mahoney <jeffm@suse.com>
Cc: Omar Sandoval <osandov@osandov.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	<linux-btrfs@vger.kernel.org>, David Sterba <dsterba@suse.com>
Subject: Re: [RFC] Converging userspace and kernel code
Date: Tue, 10 Jan 2017 10:24:13 +0800
Message-ID: <63f4eab8-c85d-4f0f-ab32-775db38cc388@cn.fujitsu.com>
In-Reply-To: <20170110014633.GA14032@birch.djwong.org>



At 01/10/2017 09:46 AM, Darrick J. Wong wrote:
> On Mon, Jan 09, 2017 at 04:38:22PM -0500, Jeff Mahoney wrote:
>> On 1/9/17 4:34 PM, Omar Sandoval wrote:
>>> On Mon, Jan 09, 2017 at 09:31:39AM -0600, Eric Sandeen wrote:
>>>> On 1/8/17 8:11 PM, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> At 01/08/2017 09:16 PM, Goldwyn Rodrigues wrote:
>>>>>>
>>>>>> 1. Motivation
>>>>>> While fixing user space tools for btrfs-progs, I found a couple of bugs
>>>>>> which are already solved in kernel space but were not ported to user
>>>>>> space. User space is a little ignored when it comes to fixing bugs in
>>>>>> the core functionality. XFS developers have already done this, and
>>>>>> the userspace and kernel code walk in lockstep for libxfs.
>
> Eh, I've wrangled the two other FSes mentioned, so I guess I can
> reiterate for a while too. :P
>
> Yes, it's very nice to be able to apply kernel patches of core libxfs
> code onto xfsprogs and have it work.  It's /annoying/ to have it sort
> of work but with a lot of fuzz and somewhere midway through the apply
> loop the whole thing crashes and burns due to some minor merge conflict.
>
> (It's not so bad for libext2fs since at least it's a totally different
> implementation.  But I'll get to that later.)
>
> The core XFS algorithms (btrees, space management, inodes, directories,
> attributes, on-disk formats, and some of the log stuff) live in
> libxfs/xfs_*.[ch].  The stuff that sits between those algorithms and
> the kernel all lives in fs/xfs/*.[ch], and the stuff between the
> algorithms and the C library lives in libxfs/ and include/ in files that
> don't start with "xfs_".  As I understand it, the dividing line is that
> core algorithms are in libxfs, and everything else isn't.

Thanks for pointing out libxfs, which I hadn't known about before.

After a quick glance, I must say, this is AWESOME!!!

Over 55K lines of code, yet the use of kernel facilities is kept to a minimum!
No mutexes/spinlocks, and VFS inodes/pages seldom appear.

What a wonderful implementation!
This is an almost perfect extraction, keeping the core logic really isolated
from all the kernel mess.
(btrfs, on the other hand, is just a hell of random/ancient fixes.)

If btrfs can do it like this (although I still have doubts), then I'm
completely OK with that.

However, from what I see in current code like btrfs_map_block(), I still
wonder whether we can do it without a large code/interface rework.
(Personally speaking, I'd be quite happy if we did a large rework;
quite a lot of the interfaces are just insane and hard to understand.)

Thanks,
Qu

>
>>>>> Personally speaking, I'm not a fan of re-using kernel code in
>>>>> btrfs-progs.
>>>>
>>>> But it already does re-use kernel code, it's just that the re-use is
>>>> extremely stale, with unfixed bugs in both directions as a result
>>>> (at least last time I looked.)
>>>>
>>>>> In fact, in btrfs-progs, we don't need a lot of kernel facilities,
>>>>> like page/VFS/lock (btrfs-progs is single-threaded in most
>>>>> cases).
>>>>>
>>>>> And that should make btrfs-progs easier to maintain.
>
> The way I look at it is that there's core algorithms and on-disk format
> stuff that can be the same wherever it is, and this code ought to be
> kept in sync so that bugfixes and features end up the same in both.
> The 'core algorithms' library can talk to the same interfaces in the
> kernel and userspace, even though the implementations are different, or,
> as Eric points out, even #define'd away.
>
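To make the "same interface, different implementations" idea above concrete, here is a minimal sketch. Every name in it (demo_dev, demo_read_block, demo_sum_block, DEMO_BLOCK_SIZE) is hypothetical and taken from neither libxfs nor btrfs-progs. It assumes the shared "core" code only ever calls demo_read_block(); the version shown is a userspace stand-in that reads from a memory buffer, and a kernel build would supply its own implementation behind the same signature.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK_SIZE 16	/* hypothetical block size, for illustration only */

/* Hypothetical device handle: here just an in-memory image. */
struct demo_dev {
	const uint8_t *data;
	size_t nblocks;
};

/*
 * The interface the core code relies on. This is the userspace
 * implementation; a kernel build would provide a different body
 * with the same signature.
 */
static int demo_read_block(const struct demo_dev *dev, size_t blkno,
			   uint8_t *buf)
{
	if (blkno >= dev->nblocks)
		return -1;
	memcpy(buf, dev->data + blkno * DEMO_BLOCK_SIZE, DEMO_BLOCK_SIZE);
	return 0;
}

/*
 * "Core" code: it knows nothing about where blocks come from and
 * could be shared unchanged between kernel and userspace.
 */
static int demo_sum_block(const struct demo_dev *dev, size_t blkno,
			  uint32_t *sum)
{
	uint8_t buf[DEMO_BLOCK_SIZE];
	uint32_t s = 0;
	size_t i;

	if (demo_read_block(dev, blkno, buf) != 0)
		return -1;
	for (i = 0; i < DEMO_BLOCK_SIZE; i++)
		s += buf[i];
	*sum = s;
	return 0;
}
```

With a two-block image, demo_sum_block() succeeds for blocks 0 and 1 and returns -1 for block 2. Only demo_read_block() would need to change between the two environments.
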
>>>> But as Goldwyn already pointed out, many bugs have gone un-fixed
>>>> in userspace, in code which was forked long ago from kernelspace.
>
> Ick.
>
>>>> For things like locking it's trivial to define that away.
>>>>
>>>> xfsprogs does this, e.g.:
>>>>
>>>> /* miscellaneous kernel routines not in user space */
>>>> #define down_read(a)            ((void) 0)
>>>> #define up_read(a)              ((void) 0)
>>>> #define spin_lock_init(a)       ((void) 0)
>>>> #define spin_lock(a)            ((void) 0)
>>>> #define spin_unlock(a)          ((void) 0)
>>>> #define likely(x)               (x)
>>>> #define unlikely(x)             (x)
>>>> #define rcu_read_lock()         ((void) 0)
>>>> #define rcu_read_unlock()       ((void) 0)
>>>> #define WARN_ON_ONCE(expr)      ((void) 0)
>>>>
>>>>
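As a rough illustration of how such defines let lock-using code build unchanged in userspace, the sketch below reuses four of the macros quoted above. The spinlock_t placeholder and the demo_counter structure and functions are hypothetical, invented for this example, and are not taken from xfsprogs or btrfs-progs. It assumes a single-threaded userspace tool, where dropping the locking is safe.

```c
#include <stddef.h>

/* Userspace stand-ins, as in the xfsprogs excerpt quoted above. */
#define spin_lock_init(a)       ((void) 0)
#define spin_lock(a)            ((void) 0)
#define spin_unlock(a)          ((void) 0)
#define unlikely(x)             (x)

/* Hypothetical placeholder; a kernel build would use the real type. */
typedef struct { int unused; } spinlock_t;

/* Hypothetical shared structure protected by a lock in the kernel. */
struct demo_counter {
	spinlock_t lock;
	unsigned long value;
};

static void demo_counter_init(struct demo_counter *c)
{
	spin_lock_init(&c->lock);	/* expands to a no-op here */
	c->value = 0;
}

/* Written as it would be for the kernel; the lock calls vanish here. */
static int demo_counter_add(struct demo_counter *c, unsigned long n)
{
	if (unlikely(c == NULL))
		return -1;
	spin_lock(&c->lock);
	c->value += n;
	spin_unlock(&c->lock);
	return 0;
}
```

After demo_counter_init(&c), calling demo_counter_add(&c, 5) returns 0 and leaves c.value at 5; passing NULL returns -1. The function bodies stay identical to what a kernel build would compile, which is what makes patches portable between the two trees.
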
>>>>> Furthermore, there are cases where the kernel does things wrong while
>>>>> btrfs-progs does it right.
>>>>
>>>> All the more reason to sync it up, fixes should always be in both
>>>> places, right?
>>>>
>>>> I had looked at this a few years ago, and started trying to sync things
>>>> up, but got daunted and busy and never completed anything.  :(  I sent
>>>> a few fixups back in April 2013 to get things /slightly/ closer.
>>>>
>>>> The libxfs sync in xfs has borne fruit; I'm of the opinion that similar
>>>> work would help btrfs too, though it can be a long road.
>>>>
>>>> (e2fsprogs has gone the other way, and has a completely separate
>>>> re-implementation in userspace; it works, I guess, but I have to say
>>>> that I really like the code commonality in xfs.)
>
> Having worked on ext* also I will say that as far as finding and
> resolving ambiguities in the on-disk specification, having two
> independent and maintained implementations of ext4 is wonderful.
>
> Unfortunately, it's /really/ expensive to engineer both, and there are
> plenty of small behavioral discrepancies between the two.  In your free
> time, run xfstests against fuse2fs and regular ext4. ;)
>
>>> Yup, I also think we should be going in the XFS direction. It's a big
>>> maintenance burden to have to worry about the code in two places. (E.g.,
>>> the biggest reason I haven't gotten around to implementing full free
>>> space tree support in btrfs-progs is that it's such a pain in the ass to
>>> port new kernel code to the outdated progs code.)
>
> Yeah.  That sucked for the (very) few times I had to do that for
> xfsprogs.
>
>> Another advantage is that we will be able to at least write test cases
>> for the shared code that can be run entirely in userspace.  That
>> obviously doesn't address a huge class of potential problems, but it's
>> better than what we have now.
>>
>>> As far as I know, the only reason it hasn't happened yet is that no one
>>> has agreed to do it, and we're all just hoping someone else will take
>>> care of it.
>>
>> Yeah, I think that's exactly it.  Having one shared source base is going
>> to be easier to maintain for everyone.  The longer we wait to do it, the
>> more it will diverge.  This is a topic we've been discussing in our
>> team's weekly call for the past few weeks.  I think Goldwyn is granting
>> that last wish. :)
>
> I hope you succeed.
>
> --D
>
>>
>> -Jeff
>>
>> --
>> Jeff Mahoney
>> SUSE Labs
>>
