git.vger.kernel.org archive mirror
* start of git2 (based on libgit2)
@ 2011-03-25 23:12 Motiejus Jakštys
  2011-03-25 23:54 ` Vincent van Ravesteijn
  2011-03-26  6:33 ` Sam Vilain
  0 siblings, 2 replies; 7+ messages in thread
From: Motiejus Jakštys @ 2011-03-25 23:12 UTC
  To: git

Hello,

I wrote a similar letter before, but did not receive the feedback I was expecting.

I think libgit2 is an amazing thing, and I have started writing a CLI client[1]
for it. This is what it can do now:
    $ git2 rev-list <anything>

Which is roughly equivalent to:
    $ git rev-list HEAD

I do not know how it will figure out past merge history, but that's for
the future.

I want to get started with it, but before that I want to discuss some
architectural questions.

According to Jeff King[2], I should start with plumbing commands. I
agree.  However, how deep?  I.e. do I have to make sure all git rev-list
possible arguments are implemented?

Are we aiming for an architecture spread across hundreds of executables
(current git), or a single huge binary? I would go for a single executable
for higher portability. Is that OK?

Build tool. Currently libgit2 uses waf. I am not against it (I've chosen
waf for one of my own C++ projects). However, it's too clumsy for me. Is
it me who lacks experience? SCons looks much easier to me. Moreover, we
do not need automatic configuration, which makes waf "overfeatured".

Build configuration. Git-send-email is not really a must-have for an
embedded device, so we should be able to select such features at
configure time. How do you think this should be handled?

1) <buildtool> configure  --disable-everything --enable-email
2) make menuconfig and enjoy the blue screen of choice
3) anything else?

Waiting for your answers, will go on working.

I am a student and would like to take this up in GSoC. I just
received a letter from Vicent Marti with a sort of confirmation that the
project is interesting to the community. I'm happy about it.  Currently
I am a full-time Python programmer, but have done some C++. I created
SoundPatty[3], a real-time sound recognition (record) application for the
VoIP recognition needs of my job.

If you have any questions or opinions, please let me know. Thank you.

[1] https://github.com/Motiejus/git2/
[2] http://marc.info/?l=git&m=130081966214059&w=4
[3] https://github.com/Motiejus/SoundPatty/ 
[CV] http://m.jakstys.lt/

Motiejus Jakštys


* Re: start of git2 (based on libgit2)
  2011-03-25 23:12 start of git2 (based on libgit2) Motiejus Jakštys
@ 2011-03-25 23:54 ` Vincent van Ravesteijn
  2011-03-26  2:13   ` Motiejus Jakštys
  2011-03-26 13:29   ` Jeff King
  2011-03-26  6:33 ` Sam Vilain
  1 sibling, 2 replies; 7+ messages in thread
From: Vincent van Ravesteijn @ 2011-03-25 23:54 UTC
  To: Motiejus Jakštys; +Cc: git

On 26-3-2011 0:12, Motiejus Jakštys wrote:
> Hello,
>
> I wrote similar letter before, but did not receive feedback I was expecting.

I wrote a mail on the same topic to the libgit2@librelist.org
mailing list, because I got interested in the same project (although I
will not be a GSoC student).

http://librelist.com/browser/libgit2/

> According to Jeff King[2], I should start with plumbing commands. I
> agree.  However, how deep?  I.e. do I have to make sure all git rev-list
> possible arguments are implemented?

I guess a lot can be copied from Git itself. Actually, builtin/rev-list.c
consists mostly of command-line argument parsing and output functions.
The key is to parse what you want to know and ask libgit2 to provide
the info. If libgit2 has implemented the basic functionality that is
needed, the rest would be relatively simple.

> Are we aiming for a distributed 100s of executables architecture
> (current git), or single huge binary? I would go for single executable
> for to higher portability. Is that ok?

AFAICS, current git is a single binary on Windows already.

> Build tool. Currently libgit2 uses waf. I am not against it (I've chosen
> waf for one of my own C++ projects), However, it's too clumsy for me. Is
> it me who lacks experience? Scons looks much easier for me. Moreover, we
> do not need automatic configuration, so it makes waf "overfeatured".

Why not CMake, which is also used for libgit2?

I already wrote a CMakeLists file for your git2 app.

> I am a student and would like to do this take this up in GSOC. I just
> received a letter from Vicent Marti with sort of confirmation that the
> project is interesting for the community.

As you know, this project may be taken on by a GSoC student
(either you or someone else). Maybe people are waiting for that before
diving into the project.

Vincent


* Re: start of git2 (based on libgit2)
  2011-03-25 23:54 ` Vincent van Ravesteijn
@ 2011-03-26  2:13   ` Motiejus Jakštys
  2011-03-26 13:29   ` Jeff King
  1 sibling, 0 replies; 7+ messages in thread
From: Motiejus Jakštys @ 2011-03-26  2:13 UTC
  To: Vincent van Ravesteijn; +Cc: git, libgit2

On Sat, Mar 26, 2011 at 12:54:25AM +0100, Vincent van Ravesteijn wrote:
> On 26-3-2011 0:12, Motiejus Jakštys wrote:
> >According to Jeff King[2], I should start with plumbing commands. I
> >agree.  However, how deep?  I.e. do I have to make sure all git rev-list
> >possible arguments are implemented?
> 
> I guess a lot can be copied from Git itself. Actually
> builtin/rev-list.c consists mostly of command line arguments parsing
> methods, and outputting functions. The key is to parse what you want
> to know and ask libgit2 to provide the info. If libgit2 has
> implemented the basic functionality that is needed, the rest would
> be relatively simple.
> AFAICS, current git is a single binary on Windows already.

So I have the answer. Thank you. The way forward is getting clearer:
finish rev-list and make it work with the tests in t/. Then pick up the
dependencies of one of the must-have commands (commit/merge/diff?),
implement them, and implement the command itself.

> 
> >Build tool. Currently libgit2 uses waf. I am not against it (I've chosen
> >waf for one of my own C++ projects), However, it's too clumsy for me. Is
> >it me who lacks experience? Scons looks much easier for me. Moreover, we
> >do not need automatic configuration, so it makes waf "overfeatured".
> 
> Why not CMake which is also used for libgit2 ?

I did not notice that. I saw the wscript and stopped looking... I have
never tried CMake before, but I have nothing against it.

> 
> I already wrote a CMakeLists file for your git2 app.

Very nice. Pull request? Patch?

> 
> As you know, this project can be possibly fulfilled by a GSoC
> student (either you or someone else). Maybe people are awaiting this
> before diving into the project.

Competition is a good thing. What matters most is that the best option
gets picked.

Thank you Vincent,
Motiejus


* Re: start of git2 (based on libgit2)
  2011-03-25 23:12 start of git2 (based on libgit2) Motiejus Jakštys
  2011-03-25 23:54 ` Vincent van Ravesteijn
@ 2011-03-26  6:33 ` Sam Vilain
  1 sibling, 0 replies; 7+ messages in thread
From: Sam Vilain @ 2011-03-26  6:33 UTC
  To: Motiejus Jakštys; +Cc: git

On 26/03/11 12:12, Motiejus Jakštys wrote:
> Build tool. Currently libgit2 uses waf. I am not against it (I've chosen
> waf for one of my own C++ projects), However, it's too clumsy for me. Is
> it me who lacks experience? Scons looks much easier for me. Moreover, we
> do not need automatic configuration, so it makes waf "overfeatured".

Another one you might like to look at is "ccanlint". It wraps a whole
bunch of checks that make for exceptional quality code, such as code
coverage by the test suite, documentation coverage, and compilable
examples, and it even runs things under valgrind to check that they are right.

As far as your question about how much to implement or bring across from
git - try to do it feature by feature, with reference to the test suite
and make sure each feature has a test.  It's a very bad idea IMHO to
port across untested features.  I'd much rather have a core set of
commands which are well tested and stable, than a handful of
fully-implemented but buggy commands.

Sam


* Re: start of git2 (based on libgit2)
  2011-03-25 23:54 ` Vincent van Ravesteijn
  2011-03-26  2:13   ` Motiejus Jakštys
@ 2011-03-26 13:29   ` Jeff King
  2011-03-27  8:34     ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2011-03-26 13:29 UTC
  To: Vincent van Ravesteijn; +Cc: Motiejus Jakštys, git

On Sat, Mar 26, 2011 at 12:54:25AM +0100, Vincent van Ravesteijn wrote:

> http://librelist.com/browser/libgit2/
> >According to Jeff King[2], I should start with plumbing commands. I
> >agree.  However, how deep?  I.e. do I have to make sure all git rev-list
> >possible arguments are implemented?
> 
> I guess a lot can be copied from Git itself. Actually
> builtin/rev-list.c consists mostly of command line arguments parsing
> methods, and outputting functions. The key is to parse what you want
> to know and ask libgit2 to provide the info. If libgit2 has
> implemented the basic functionality that is needed, the rest would be
> relatively simple.

I wouldn't worry about having _every_ argument. Some arguments are much
less frequently used than others. For example, start with basic stuff,
like including and excluding commits (e.g., "branch1 ^branch2"),
--max-count, --{min,max}-age, --grep, and others. Do common things like
path limiting. And then once all that is done and tested, start worrying
about things like --cherry-pick (or maybe not, and focus on the basics
of other simple commands).
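
A minimal sketch of that "basic stuff" subset, written in Python for
brevity rather than in C. The function name parse_rev_args and the choice
to reject everything else are assumptions made for illustration; this is
not how builtin/rev-list.c or libgit2 is structured. It only classifies
arguments: positive revisions, "^rev" exclusions, the "A..B" shorthand
(which means "B ^A", with an omitted side defaulting to HEAD), and
--max-count.

```python
def parse_rev_args(args):
    """Split rev-list style arguments into (include, exclude, max_count).

    Hypothetical helper: only a small, common subset is understood and
    anything else is rejected instead of being silently ignored.
    """
    include, exclude, max_count = [], [], None
    for arg in args:
        if arg.startswith("--max-count="):
            max_count = int(arg.split("=", 1)[1])
        elif arg.startswith("^"):
            # "^rev" hides rev and everything reachable from it.
            exclude.append(arg[1:])
        elif "..." in arg:
            # Symmetric difference is left for later, like --cherry-pick.
            raise ValueError("unsupported revision syntax: " + arg)
        elif ".." in arg:
            # "A..B" is shorthand for "B ^A"; an empty side means HEAD.
            left, right = arg.split("..", 1)
            exclude.append(left or "HEAD")
            include.append(right or "HEAD")
        elif arg.startswith("--"):
            raise ValueError("unsupported option: " + arg)
        else:
            include.append(arg)
    return include, exclude, max_count
```

With this, parse_rev_args(["branch1", "^branch2"]) separates the tip to
show from the tip to hide, and each unsupported option fails loudly, which
makes it easy to add options one at a time together with their tests.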

> >Are we aiming for a distributed 100s of executables architecture
> >(current git), or single huge binary? I would go for single executable
> >for to higher portability. Is that ok?
> 
> AFAICS, current git is a single binary on Windows already.

Even on Linux, most of the commands are just hardlinks to the git
executable. Most commands are built-in these days. A few are still
external but written in C (sometimes because we want to keep them small
and external, like git-daemon and git-shell). But there are still some
commands written in other languages, like pull, stash, and
add--interactive.

Check out the BUILTIN_OBJS, PROGRAM_OBJS, and SCRIPT_* variables in the
Makefile.

So yeah, for basic commands, one monolithic binary is probably fine.
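
The hardlink arrangement can be sketched roughly as follows, in Python
rather than C to keep it short. The table BUILTINS, the two cmd_*
functions and dispatch() are hypothetical names for illustration and do
not mirror git's actual source; the point is only that one executable can
serve both "git rev-list" and a link named "git-rev-list" by looking at
the name it was invoked under.

```python
import os


def cmd_rev_list(args):
    # Placeholder for a real built-in; it just echoes what it was given.
    return "rev-list " + " ".join(args)


def cmd_log(args):
    return "log " + " ".join(args)


# Hypothetical table mapping command names to built-in functions.
BUILTINS = {"rev-list": cmd_rev_list, "log": cmd_log}


def dispatch(argv):
    """Pick a built-in from either argv[0] ("git-foo") or argv[1] ("git foo")."""
    name = os.path.basename(argv[0])
    args = argv[1:]
    if name.startswith("git-"):
        # Invoked through a link such as git-rev-list.
        cmd = name[len("git-"):]
    else:
        if not args:
            raise ValueError("usage: git <command> [<args>]")
        cmd, args = args[0], args[1:]
    handler = BUILTINS.get(cmd)
    if handler is None:
        raise ValueError("not a built-in command: " + cmd)
    return handler(args)
```

Both dispatch(["git", "rev-list", "HEAD"]) and
dispatch(["/usr/libexec/git-core/git-rev-list", "HEAD"]) end up in the
same function, so a command that has to stay external only needs to be
left out of the table.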

-Peff


* Re: start of git2 (based on libgit2)
  2011-03-26 13:29   ` Jeff King
@ 2011-03-27  8:34     ` Junio C Hamano
  2011-03-27  9:56       ` Vincent van Ravesteijn
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2011-03-27  8:34 UTC
  To: Jeff King; +Cc: Vincent van Ravesteijn, Motiejus Jakštys, git

Jeff King <peff@peff.net> writes:

> On Sat, Mar 26, 2011 at 12:54:25AM +0100, Vincent van Ravesteijn wrote:
>> 
>> I guess a lot can be copied from Git itself. Actually
>> builtin/rev-list.c consists mostly of command line arguments parsing
>> methods, and outputting functions. The key is to parse what you want
>> to know and ask libgit2 to provide the info. If libgit2 has
>> implemented the basic functionality that is needed, the rest would be
>> relatively simple.
>
> I wouldn't worry about having _every_ argument. Some arguments are much
> less frequently used than others. For example, start with basic stuff,
> like including and excluding commits (e.g., "branch1 ^branch2"),
> --max-count, --{min,max}-age, --grep, and others. Do common things like
> path limiting. And then once all that is done and tested, start worrying
> about things like --cherry-pick (or maybe not, and focus on the basics
> of other simple commands).

I agree that for a summer student project, aiming at basic stuff makes
more sense than trying to chew a large bite that cannot be managed within
the timeframe and not achieving anything.

"A..B" requires you to walk the ancestry chain. Limiting history with
pathspec while simplifying merges needs to use the tree-diff machinery;
and filtering commits by looking at the message with "--grep" needs to
call into the grep machinery.  Depending on how many of the basic blocks
libgit2 already covers, even the above list might be too much, I am
afraid.
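
To make the "A..B" case concrete, here is a small sketch over a toy
commit graph, again in Python. The dictionary format and the name
rev_list are assumptions for illustration; neither git nor libgit2
stores history this way, and real rev-list orders its output by commit
date, which this sketch does not attempt.

```python
def rev_list(parents, include, exclude=()):
    """Return commits reachable from `include` but not from `exclude`.

    `parents` maps a commit name to the list of its parent names
    (a toy stand-in for the object database).
    """
    # First pass: everything reachable from the excluded tips is hidden.
    hidden, stack = set(), list(exclude)
    while stack:
        commit = stack.pop()
        if commit not in hidden:
            hidden.add(commit)
            stack.extend(parents.get(commit, ()))

    # Second pass: walk from the included tips, stopping at hidden commits.
    out, seen, stack = [], set(hidden), list(include)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        out.append(commit)
        stack.extend(parents.get(commit, ()))
    return out
```

For a history where "m" merges "c" and "d", and both descend from "b",
asking for "c..m" (include ["m"], exclude ["c"]) yields only "m" and "d".
Merges need no special handling in the walk itself; the harder parts are
the path limiting and --grep filtering mentioned above.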

The good news is that among the larger and more important basic building
blocks in C git, there is only one part that was designed from day one to
disregard the reusability and instead aimed for speed and simplicity, and
that is the history and object walking. The way the in-core object pool is
managed and especially the way per-object flags are designed to be used
clearly show that the revision walker machinery can take it for granted that
the calling programs are run-once-and-clean-via-exit.
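
The reusability problem with per-object flags can be shown with a toy
model, in Python. The Commit class, the SEEN bit, walk() and
clear_flags() are hypothetical and are not git's data structures; the
sketch only shows why state stored on shared objects suits a
run-once process and gets in the way of a library.

```python
SEEN = 1 << 0  # hypothetical flag bit kept on the object itself


class Commit:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.flags = 0  # lives in the shared object pool, not in the walk


def walk(tip):
    """List commits reachable from tip, marking each one as SEEN."""
    out, stack = [], [tip]
    while stack:
        commit = stack.pop()
        if commit.flags & SEEN:
            continue
        commit.flags |= SEEN
        out.append(commit.name)
        stack.extend(commit.parents)
    return out


def clear_flags(tip):
    """Undo the marks so that a later walk starts from a clean state."""
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit.flags & SEEN:
            commit.flags &= ~SEEN
            stack.extend(commit.parents)
```

A second walk(tip) in the same process returns nothing, because every
commit is still marked from the first one. A program that exits after a
single walk never notices; a long-lived caller has to clear the marks
between walks or keep the walk state outside the objects.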

But other major parts are designed to be reusable and I would imagine that
it shouldn't be hard to link with them (or better yet, find counterparts
in libgit2). "diff" machinery below the diffcore layer (i.e. the entry
points "diff-lib.c" calls into, e.g. starting at diff_addremove(), then
running the diffcore machinery with diffcore_std() and finally getting the
result from diff_flush() callchain) and "grep" machinery below the
"grep.c" (but not "builtin/grep.c") are designed not to depend on the
process level global variables.


* Re: start of git2 (based on libgit2)
  2011-03-27  8:34     ` Junio C Hamano
@ 2011-03-27  9:56       ` Vincent van Ravesteijn
  0 siblings, 0 replies; 7+ messages in thread
From: Vincent van Ravesteijn @ 2011-03-27  9:56 UTC
  To: Junio C Hamano; +Cc: Jeff King, Motiejus Jakštys, git

> I agree that for a summer student project, aiming at basic stuff makes
> more sense than trying to chew a large bite that cannot be managed within
> the timeframe and not achieving anything

If things work out, I will at least not disappear after the
summer. I'm quite new here, but I'd like to help out in coordinating the
student(s) and to continue with it after the summer (if the student(s) do
not stick with git development). I'm still missing quite a bit of
knowledge about git, but I hope that will come with time.

> "A..B" requires you to walk the ancestry chain. Limiting history with
> pathspec while simplifying merges needs to use the tree-diff machinery;
> and filtering commits by looking at the message with "--grep" needs to
> call into the grep machinery.  Depending on how much libgit2 has already
> covered the basic blocks, even the above list might be too much, I am
> afraid.

Yes, it would be important to understand how much is already covered by
libgit2. It would help if someone could shed some light on this (see also
my message on the libgit2@librelist.org mailing list).

> A good news is

[..] still re-reading this paragraph to find out what the actual 'good' 
part of this news is ;)...[..]

> that among the larger and more important basic building
> blocks in C git, there is only one part that was designed from day one to
> disregard the reusability and instead aimed for speed and simplicity, and
> that is the history and object walking. The way the in-core object pool is
> managed and especially the way per-object flags are designed to be used
> clearly show that the revision walker machinery can take it granted that
> the calling programs are run-once-and-clean-via-exit.

That's what I meant by my previous message. I was not aiming to
implement all exotic features, but I think it would be a good
design if git and git2 share a lot and differ only in how they
actually use the git/libgit backend. As part of the process, the git
code can be adjusted as well to "libify" it (as it was called in another
thread).

Vincent

