* [RFC] Code reorgnization
@ 2016-03-17 11:11 Duy Nguyen
  2016-03-17 13:32 ` Johannes Schindelin
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Duy Nguyen @ 2016-03-17 11:11 UTC (permalink / raw)
  To: git

Git's top directory is crowded and I think it's agreed that moving
test-* to t/helper is a good move. I just wanted to check if we could
take this opportunity (after v2.8.0) to move some other files too. I
propose the following new subdirs:

lib
---
This contains very general-purpose files that implement data
structures or algorithms. This directory includes:

argv-array.[ch] base85.c column.[ch] delta.h diff-delta.c hashmap.[ch]
hex.c khash.h kwset.[ch] levenshtein.[ch] mergesort.[ch] patch-delta.c
prio-queue.[ch] sha1-array.[ch] sha1-lookup.[ch] strbuf.[ch]
string-list.[ch] url.[ch] urlmatch.[ch] utf8.[ch] varint.[ch]
versioncmp.c wildmatch.[ch]

odb
---
Grouping the object database files makes it easy to see the
connections between them. Their names give no such hint, unlike, for
example, the diff-related files, which either start with "diff" or
have that word in the file name.

alloc.c blob.[ch] bulk-checkin.[ch] commit-slab.h commit.[ch]
object.[ch] pack.h pack-revindex.[ch] replace_object.c sha1_file.c
streaming.[ch] tag.[ch] tree.[ch]

index
-----
For the same reason as the odb subdir. This directory contains:

cache-tree.[ch] name-hash.c preload-index.c read-cache.c
split-index.[ch] unpack-trees.[ch]

sys (or maybe util or support)
------------------------------
These are still general purpose but are usually system-related. They
are still far away from git's core logic. I want to separate them to
make it easier to spot "important" files in the top dir.

abspath.c color.[ch] copy.c csum-file.[ch] ctype.c date.c editor.c
exec_cmd.[ch] gettext.[ch] gpg-interface.[ch] ident.c
lockfile.[ch] mailinfo.[ch] mailmap.[ch] pager.c parse-options-cb.c
parse-options.[ch] pathspec.[ch] pkt-line.[ch] progress.[ch]
prompt.[ch] quote.[ch] run-command.[ch] sideband.[ch] sigchain.[ch]
symlinks.c tar.h tempfile.[ch] thread-utils.[ch] trace.[ch]
unix-socket.[ch] usage.c userdiff.[ch] wrapper.c write_or_die.c zlib.c

Good? Bad? Ugly?
--
Duy


* Re: [RFC] Code reorgnization
  2016-03-17 11:11 [RFC] Code reorgnization Duy Nguyen
@ 2016-03-17 13:32 ` Johannes Schindelin
  2016-03-17 13:35   ` Duy Nguyen
  2016-03-17 16:21 ` Junio C Hamano
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Johannes Schindelin @ 2016-03-17 13:32 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git

Hi Duy,

On Thu, 17 Mar 2016, Duy Nguyen wrote:

> Git's top directory is crowded and I think it's agreed that moving
> test-* to t/helper is a good move. I just wanted to check if we could
> take this opportunity (after v2.8.0) to move some other files too. I
> propose the following new subdirs
> 
> lib
> ---
> This contains files that are about data structures or algorithms. Very
> general purpose. This directory includes
> 
> argv-array.[ch] base85.c column.[ch] delta.h diff-delta.c hashmap.[ch]
> hex.c khash.h kwset.[ch] levenshtein.[ch] mergesort.[ch] patch-delta.c
> prio-queue.[ch] sha1-array.[ch] sha1-lookup.[ch] strbuf.[ch]
> string-list.[ch] url.[ch] urlmatch.[ch] utf8.[ch] varint.[ch]
> versioncmp.c wildmatch.[ch]

The name "lib" makes it sound as if this contains the source code of
libgit.a. Maybe "generic" or "common" or "util" would be better (my
favorite would be "util").

> odb
> ---
> The grouping of object database files is to easily make connections
> between them. Unlike, for example, diff-related files which either
> start with "diff" or has that word in the file name to make
> connections.
> 
> alloc.c blob.[ch] bulk-checkin.[ch] commit-slab.h commit.[ch]
> object.[ch] pack.h pack-revindex.[ch] replace_object.c sha1_file.c
> streaming.[ch] tag.[ch] tree.[ch]
> 
> index
> -----
> For the same reason of odb subdir. This directory contains
> 
> cache-tree.[ch] name-hash.c preload-index.c read-cache.c
> split-index.[ch] unpack-trees.[ch]
> 
> sys (or maybe util or support)
> ------------------------------
> These are still general purpose but is usually system-related. They
> are still far away from git's core logic. I want to separate them to
> make it easier to spot "important" files at top dir.
> 
> abspath.c color.[ch] copy.c csum-file.[ch] ctype.c date.c editor.c
> exec_cmd.[ch] gettext.[ch] gettext.h gpg-interface.[ch] ident.c
> lockfile.[ch] mailinfo.[ch] mailmap.[ch] pager.c parse-options-cb.c
> parse-options.[ch] pathspec.[ch] pkt-line.[ch] progress.[ch]
> prompt.[ch] quote.[ch] run-command.[ch] sideband.[ch] sigchain.[ch]
> symlinks.c tar.h tempfile.[ch] thread-utils.[ch] trace.[ch]
> unix-socket.[ch] usage.c userdiff.[ch] wrapper.c write_or_die.c zlib.c
> 
> Good? Bad? Ugly?

Disruptive. Probably a change for 3.0?

Ciao,
Dscho


* Re: [RFC] Code reorgnization
  2016-03-17 13:32 ` Johannes Schindelin
@ 2016-03-17 13:35   ` Duy Nguyen
  0 siblings, 0 replies; 14+ messages in thread
From: Duy Nguyen @ 2016-03-17 13:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List

On Thu, Mar 17, 2016 at 8:32 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>> Good? Bad? Ugly?
>
> Disruptive. Probably a change for 3.0?

We tested it with the builtin rename a long time ago, so it's probably
not bad. By the principle of "dogfooding", we should try it soon and
make sure it's not disruptive, or prepare ourselves for such a change
(I think git-am can't track renames, for example, without us giving it
a clue somehow).
-- 
Duy


* Re: [RFC] Code reorgnization
  2016-03-17 11:11 [RFC] Code reorgnization Duy Nguyen
  2016-03-17 13:32 ` Johannes Schindelin
@ 2016-03-17 16:21 ` Junio C Hamano
  2016-03-17 17:00 ` Thomas Adam
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2016-03-17 16:21 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git

Duy Nguyen <pclouds@gmail.com> writes:

> Good? Bad? Ugly?

Too fine-grained; it would induce confusion for things that have to
work as a bridge between two categories (e.g. odb & index).  In short,
bad and ugly.

I am OK with a looser classification e.g. (1) things that can be
used without Git at all like strbuf, string-list vs (2) things that
are Git but shared across different subcommands like read-cache,
sha1_file, vs (3) command implementations (e.g. builtin/ and also
standalone).


* Re: [RFC] Code reorgnization
  2016-03-17 11:11 [RFC] Code reorgnization Duy Nguyen
  2016-03-17 13:32 ` Johannes Schindelin
  2016-03-17 16:21 ` Junio C Hamano
@ 2016-03-17 17:00 ` Thomas Adam
  2016-03-17 17:48   ` Junio C Hamano
  2016-03-17 18:37 ` Stefan Beller
  2016-03-18  5:24 ` Jeff King
  4 siblings, 1 reply; 14+ messages in thread
From: Thomas Adam @ 2016-03-17 17:00 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git list

On 17 March 2016 at 11:11, Duy Nguyen <pclouds@gmail.com> wrote:
> Git's top directory is crowded and I think it's agreed that moving
> test-* to t/helper is a good move. I just wanted to check if we could
> take this opportunity (after v2.8.0) to move some other files too. I
> propose the following new subdirs

I wonder whether previous discussions on this still count?  See:

http://marc.info/?l=git&m=129650572621523&w=1

-- Thomas Adam


* Re: [RFC] Code reorgnization
  2016-03-17 17:00 ` Thomas Adam
@ 2016-03-17 17:48   ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2016-03-17 17:48 UTC (permalink / raw)
  To: Thomas Adam; +Cc: Duy Nguyen, git list

Thomas Adam <thomas@xteddy.org> writes:

> On 17 March 2016 at 11:11, Duy Nguyen <pclouds@gmail.com> wrote:
>> Git's top directory is crowded and I think it's agreed that moving
>> test-* to t/helper is a good move. I just wanted to check if we could
>> take this opportunity (after v2.8.0) to move some other files too. I
>> propose the following new subdirs
>
> I wonder whether previous discussions on this still count?  See:
>
> http://marc.info/?l=git&m=129650572621523&w=1

If you refer to an ancient discussion, especially to a large thread
like that one, please spend a bit more time to summarize it.  The
choice is between one person spending a bit more time and all the
others independently going there to read it.

The essence of the proposal [1] back then was to move all the source
files to src/ and rename t/ to testsuite.  And I think [2] is a pretty
good summary of the common feeling back then that explains why the
proposal died out:

    Moving everything into src/ and calling it "organized" doesn't
    actually accomplish much other than perhaps making the README
    file more visible to newbs; things are _still_ a mess, just a
    mess with four more letters...

This round is slightly more organized, so many points the old thread
raised would not apply, I suspect.


[References]

*1* http://thread.gmane.org/gmane.comp.version-control.git/165720/focus=165748

*2* http://thread.gmane.org/gmane.comp.version-control.git/165720/focus=166019


* Re: [RFC] Code reorgnization
  2016-03-17 11:11 [RFC] Code reorgnization Duy Nguyen
                   ` (2 preceding siblings ...)
  2016-03-17 17:00 ` Thomas Adam
@ 2016-03-17 18:37 ` Stefan Beller
  2016-03-17 19:10   ` Junio C Hamano
  2016-03-18  5:24 ` Jeff King
  4 siblings, 1 reply; 14+ messages in thread
From: Stefan Beller @ 2016-03-17 18:37 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git@vger.kernel.org

On Thu, Mar 17, 2016 at 4:11 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> Good? Bad? Ugly?

For now I would just go with 3 directories:

non-git/ (or util, helpers; anything that could be ripped out and be
    useful elsewhere, e.g. strbufs, argv-array, run-command, lockfile)

git/ (maybe called lib? All stuff that is pure Git and is used for
    libgit)

builtin/ (as we have it today + all that stuff that doesn't go into
    git/ very well?)

Thanks,
Stefan


* Re: [RFC] Code reorgnization
  2016-03-17 18:37 ` Stefan Beller
@ 2016-03-17 19:10   ` Junio C Hamano
  2016-03-17 21:03     ` Pranit Bauva
  2016-03-17 21:43     ` John Keeping
  0 siblings, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2016-03-17 19:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, git@vger.kernel.org

Stefan Beller <sbeller@google.com> writes:

> For now I would just go with 3 directories:
>
> non-git/ (or util, helpers, or anything that could be ripped out and be useful
>     e.g. strbufs, argv-array run-command, lockfile
> git/ (maybe called lib? All stuff that is pure Git and is used for libgit
>
> builtin/ (as we have it today + all that stuff that doesn't go into
> git/ very well?)

It is unclear where you want to have standalone programs in the
above.  I'd say lib/ and src/ for the first two, where lib/ is for
things that could be lifted without any Git dependencies and src/
for everything else.

Aren't there some folks who link directly with our codebase (I am
thinking about cgit, but hjemli.net/git/cgit does not seem to be
responding anymore)?


* Re: [RFC] Code reorgnization
  2016-03-17 19:10   ` Junio C Hamano
@ 2016-03-17 21:03     ` Pranit Bauva
  2016-03-17 21:43     ` John Keeping
  1 sibling, 0 replies; 14+ messages in thread
From: Pranit Bauva @ 2016-03-17 21:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, Duy Nguyen, git@vger.kernel.org

On Fri, Mar 18, 2016 at 12:40 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> For now I would just go with 3 directories:
>>
>> non-git/ (or util, helpers, or anything that could be ripped out and be useful
>>     e.g. strbufs, argv-array run-command, lockfile
>> git/ (maybe called lib? All stuff that is pure Git and is used for libgit
>>
>> builtin/ (as we have it today + all that stuff that doesn't go into
>> git/ very well?)
>
> It is unclear where you want to have standalone programs in the
> above.  I'd say lib/ and src/ for the first two, where lib/ is for
> things that could be lifted without any Git dependencies and src/
> for everything else.
>
> Aren't there some folks who link directly with our codebase (I am
> thinking about cgit, but hjemli.net/git/cgit does not seem to be
> responding anymore)?

They now have a new mailing list.[1] I guess they forward all the
mails to the new one[2]. The new one seems active. [3]

[1] : cgit@lists.zx2c4.com
[2] : https://lists.zx2c4.com/pipermail/cgit/2013-May/001380.html
[3] : http://news.gmane.org/gmane.comp.version-control.cgit


* Re: [RFC] Code reorgnization
  2016-03-17 19:10   ` Junio C Hamano
  2016-03-17 21:03     ` Pranit Bauva
@ 2016-03-17 21:43     ` John Keeping
  2016-03-17 21:49       ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: John Keeping @ 2016-03-17 21:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, Duy Nguyen, git@vger.kernel.org

On Thu, Mar 17, 2016 at 12:10:44PM -0700, Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:
> 
> > For now I would just go with 3 directories:
> >
> > non-git/ (or util, helpers, or anything that could be ripped out and be useful
> >     e.g. strbufs, argv-array run-command, lockfile
> > git/ (maybe called lib? All stuff that is pure Git and is used for libgit
> >
> > builtin/ (as we have it today + all that stuff that doesn't go into
> > git/ very well?)
> 
> It is unclear where you want to have standalone programs in the
> above.  I'd say lib/ and src/ for the first two, where lib/ is for
> things that could be lifted without any Git dependencies and src/
> for everything else.
> 
> Aren't there some folks who link directly with our codebase (I am
> thinking about cgit, but hjemli.net/git/cgit does not seem to be
> responding anymore)?

CGit lives at https://git.zx2c4.com/cgit/ these days.

The organisation of the git code shouldn't make a difference since CGit
just links with libgit.a. Even if it does, CGit pulls in git.git as a
submodule, so it can just fix any problems in the same commit that
updates the submodule reference.


* Re: [RFC] Code reorgnization
  2016-03-17 21:43     ` John Keeping
@ 2016-03-17 21:49       ` Junio C Hamano
  2016-03-18  0:28         ` Duy Nguyen
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2016-03-17 21:49 UTC (permalink / raw)
  To: John Keeping; +Cc: Stefan Beller, Duy Nguyen, git@vger.kernel.org

John Keeping <john@keeping.me.uk> writes:

> The organisation of the git code shouldn't make a difference since CGit
> just links with libgit.a, even if it does CGit pulls in git.git as a
> submodule so it can just fix any problems in the same commit that
> updates the submodule reference.

I was mostly worried about where Duy and Stefan want to place *.h


* Re: [RFC] Code reorgnization
  2016-03-17 21:49       ` Junio C Hamano
@ 2016-03-18  0:28         ` Duy Nguyen
  0 siblings, 0 replies; 14+ messages in thread
From: Duy Nguyen @ 2016-03-18  0:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Stefan Beller, git@vger.kernel.org

On Fri, Mar 18, 2016 at 4:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
> John Keeping <john@keeping.me.uk> writes:
>
>> The organisation of the git code shouldn't make a difference since CGit
>> just links with libgit.a, even if it does CGit pulls in git.git as a
>> submodule so it can just fix any problems in the same commit that
>> updates the submodule reference.
>
> I was mostly worried about where Duy and Stefan want to place *.h

The *.h files stay with their *.c files. CFLAGS gains two more flags,
-Isrc and -Ilib. I don't expect any #include line changes. Maybe we
can start moving stuff to "lib" soon. Many of those files rarely
receive changes these days. The
creation of "src" could be more disruptive and can wait until
$(topdir) is once again unbearable.
-- 
Duy


* Re: [RFC] Code reorgnization
  2016-03-17 11:11 [RFC] Code reorgnization Duy Nguyen
                   ` (3 preceding siblings ...)
  2016-03-17 18:37 ` Stefan Beller
@ 2016-03-18  5:24 ` Jeff King
  2016-03-18  5:59   ` Duy Nguyen
  4 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2016-03-18  5:24 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git

On Thu, Mar 17, 2016 at 06:11:36PM +0700, Duy Nguyen wrote:

> Git's top directory is crowded and I think it's agreed that moving
> test-* to t/helper is a good move. I just wanted to check if we could
> take this opportunity (after v2.8.0) to move some other files too. I
> propose the following new subdirs

I guess I don't really see the "crowded" problem, but perhaps that is
because I am more or less familiar with where things are in git's code
base. I suppose if you were looking for a "utility" function, you might
look in "util" and therefore have a smaller set of files to check.

But I think we also run into the opposite problem: I am looking for some
particular function, but I can't find it, because I am looking in "util"
and it is in some other directory. And when files move around, it makes
history harder to follow (maybe that is because git sucks and we need to
make it better, but certainly I run into mild annoyances with the
builtin/ rename when digging in history).

And you have a similar problem when creating new files. Which slot do
they go in? What if they could feasibly go into two slots?

So there can be friction either way. In practice I find I just use ctags
to jump to the functions I am interested in, and I don't care that much
about filenames.

The reorganization that _would_ be more interesting to me is not files
in directories, but rather functions in files. I wish everything were
designed more as modules with a pair of matching ".c" and ".h" files,
with a public interface defined in the ".h", and messier, private stuff
in the ".c". But we have some real dumping grounds:

  1. cache.h has the declarations for at least a dozen different
     modules; besides being hard to navigate, it causes more frequent
     recompilation than necessary.

  2. a few of the .c files could probably be split (e.g., dir.c is where
     all of the pathspec code lives, even though that is used for much
     more than filesystem access these days).

Splitting those up would _also_ introduce friction (and actually worse
than whole-file renames, because finding code movement between files is
an even harder / more expensive problem). But I feel like it would buy a
lot more in terms of code clarity, and in reducing the scope of code
which has access to private, static interfaces.

-Peff
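
The point about history becoming harder to follow after a file moves
can be reproduced in a throwaway repository. This is a minimal sketch,
assuming git is installed; the repository, the file contents and the
commit messages are made up for illustration, and strbuf.c is only a
stand-in name. It counts the commits that "git log" reports for the
new path with and without --follow.

```shell
#!/bin/sh
# Throwaway repository; nothing here touches a real checkout.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# First commit: the file lives in the top directory.
echo 'int placeholder;' >strbuf.c
git add strbuf.c
git commit -q -m "add strbuf.c"

# Second commit: a pure rename into lib/.
mkdir lib
git mv strbuf.c lib/strbuf.c
git commit -q -m "move strbuf.c to lib/"

# Path-limited log stops at the rename unless --follow is given.
plain=$(git log --oneline -- lib/strbuf.c | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- lib/strbuf.c | wc -l | tr -d ' ')
echo "without --follow: $plain, with --follow: $followed"
```

Without --follow the log for lib/strbuf.c starts at the move, so
anyone digging further back has to know the old path or remember the
option; --follow only handles a single file at a time.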


* Re: [RFC] Code reorgnization
  2016-03-18  5:24 ` Jeff King
@ 2016-03-18  5:59   ` Duy Nguyen
  0 siblings, 0 replies; 14+ messages in thread
From: Duy Nguyen @ 2016-03-18  5:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Fri, Mar 18, 2016 at 12:24 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Mar 17, 2016 at 06:11:36PM +0700, Duy Nguyen wrote:
>
>> Git's top directory is crowded and I think it's agreed that moving
>> test-* to t/helper is a good move. I just wanted to check if we could
>> take this opportunity (after v2.8.0) to move some other files too. I
>> propose the following new subdirs
>
> I guess I don't really see the "crowded" problem, but perhaps that is
> because I am more or less familiar with where things are in git's code
> base. I suppose if you were looking for a "utility" function, you might
> look in "util" and therefore have a smaller set of files to check.
>
> But I think we also run into the opposite problem: I am looking for some
> particular function, but I can't find it, because I am looking in "util"
> and it is in some other directory. And when files move around, it makes
> history harder to follow (maybe that is because git sucks and we need to
> make it better, but certainly I run into mild annoyances with the
> builtin/ rename when digging in history).

Yeah, for finding a particular function, I just "git grep" (or rgrep
from emacs) if I fail to locate it after the first guess. We have this
problem nowadays anyway. Besides builtin, we also have ewah, refs and
some more subdirs.

> And you have a similar problem when creating new files. Which slot do
> they go in? What if they could feasibly go into two slots?

Everything goes to topdir (or later on "src") by default and only goes
to "lib" when it's _obvious_ that it's disconnected from git (I'm
talking about the "lib/src" layout).

> So there can be friction either way. In practice I find I just use ctags
> to jump to the functions I am interested in, and I don't care that much
> about filenames.
>
> The reorganization that _would_ be more interesting to me is not files
> in directories, but rather functions in files. I wish everything were
> designed more as modules with a pair of matching ".c" and ".h" files,
> with a public interface defined in the ".h", and messier, private stuff
> in the ".c". But we have some real dumping grounds:
>
>   1. cache.h has the declarations for at least a dozen different
>      modules; besides being hard to navigate, it causes more frequent
>      recompilation than necessary.
>
>   2. a few of the .c files could probably be split (e.g., dir.c is where
>      all of the pathspec code lives, even though that is used for much
>      more than filesystem access these days).

Heh.. that's what I wanted to do (or at least discuss) after files are moved :)

> Splitting those up would _also_ introduce friction (and actually worse
> than whole-file renames, because finding code movement between files is
> an even harder / more expensive problem).

.. and this is why I did not raise it in the first mail.

> But I feel like it would buy a
> lot more in terms of code clarity, and in reducing the scope of code
> which has access to private, static interfaces.
-- 
Duy
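
The "git grep" habit mentioned above finds a function no matter which
subdirectory it ends up in. A small sketch, assuming git is installed;
the two files and the function names strbuf_init_demo and run_demo are
hypothetical and exist only for this example.

```shell
#!/bin/sh
# Throwaway repository with files split across two subdirectories.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir lib src
printf 'void strbuf_init_demo(void) {}\n' >lib/strbuf.c
printf 'void run_demo(void) {}\n' >src/run.c

# git grep searches tracked files, so staging them is enough.
git add lib/strbuf.c src/run.c

# Search every tracked .c file, whatever directory it lives in.
hits=$(git grep -n 'strbuf_init_demo' -- '*.c')
echo "$hits"
```

Each hit is reported as path:line:text, so the directory a file lives
in shows up in the result instead of having to be guessed beforehand.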
