* libgit2 - a true git library
@ 2008-10-31 17:07 Shawn O. Pearce
2008-10-31 17:28 ` Pieter de Bie
` (4 more replies)
0 siblings, 5 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 17:07 UTC (permalink / raw)
To: git; +Cc: Scott Chacon
During the GitTogether we were kicking around the idea of a ground-up
implementation of a Git library. This may be easier than trying
to grind down git.git into a library, as we aren't tied to any
of the current global state baggage or the current die() based
error handling.
I've started an _extremely_ rough draft. The code compiles into a
libgit.a but it doesn't even implement what it describes in the API,
let alone a working Git implementation. Really what I'm trying to
incite here is some discussion on what the API looks like.
API Docs:
http://www.spearce.org/projects/scm/libgit2/apidocs/html/modules.html
Source Code Clone URL:
http://www.spearce.org/projects/scm/libgit2/libgit2.git
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
@ 2008-10-31 17:28 ` Pieter de Bie
2008-10-31 17:29 ` Pieter de Bie
2008-10-31 17:47 ` Pierre Habouzit
` (3 subsequent siblings)
4 siblings, 1 reply; 83+ messages in thread
From: Pieter de Bie @ 2008-10-31 17:28 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
On 31 okt 2008, at 18:07, Shawn O. Pearce wrote:
> Source Code Clone URL:
> http://www.spearce.org/projects/scm/libgit2/libgit2.git
This 404's for me
- Pieter
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:28 ` Pieter de Bie
@ 2008-10-31 17:29 ` Pieter de Bie
0 siblings, 0 replies; 83+ messages in thread
From: Pieter de Bie @ 2008-10-31 17:29 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Git Mailing List, Scott Chacon
On 31 okt 2008, at 18:28, Pieter de Bie wrote:
>> Source Code Clone URL:
>> http://www.spearce.org/projects/scm/libgit2/libgit2.git
>
> This 404's for me
Nevermind, it's only a clone url.. I'd expected a gitweb or so ;)
Sorry for the noise,
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
2008-10-31 17:28 ` Pieter de Bie
@ 2008-10-31 17:47 ` Pierre Habouzit
2008-10-31 18:41 ` Shawn O. Pearce
` (3 more replies)
2008-10-31 23:18 ` Bruno Santos
` (2 subsequent siblings)
4 siblings, 4 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 17:47 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
[-- Attachment #1.1: Type: text/plain, Size: 3900 bytes --]
On Fri, Oct 31, 2008 at 05:07:04PM +0000, Shawn O. Pearce wrote:
> During the GitTogether we were kicking around the idea of a ground-up
> implementation of a Git library. This may be easier than trying
> to grind down git.git into a library, as we aren't tied to any
> of the current global state baggage or the current die() based
> error handling.
>
> I've started an _extremely_ rough draft. The code compiles into a
> libgit.a but it doesn't even implement what it describes in the API,
> let alone a working Git implementation. Really what I'm trying to
> incite here is some discussion on what the API looks like.
I know this isn't actually helping a lot to define the real APIs, but we
should really not repeat current git mistakes and have a really uniform
APIs, meaning that first we must decide:
* proper namespacing (e.g. OBJ_* looks like failure to me, it's a way
too common prefix);
* proper public "stuff" naming (I e.g. realy like types names -- not
struct or enum tags, that I don't really care -- ending with _t as
it helps navigating source.
* ...
And write that down _first_. It's not a lot of work, but it must be
done. Working on a library really asks us to create something coherent
for our users.
Second, if we want this to be a successful stuff, we all agree we must
let git be able to use it medium term. That means that when git-core is
experimenting with new interfaces, it will probably need to hook into
some more internal aspects of the library. This is a problem to solve
elegantly, linking git against a static library won't please a lot of
vendors and linux distributions, and exporting "private" symbols is a
sure way to see them being abused.
Last but not least, I believe parts of git-core are currently easy to
just take. For example, any code *I* wrote, I hereby give permission to
relicense it in any of the following licenses: BSD-like, MIT-like,
WTFPL.
For example, on parse-options.c, git blame yields:
git blame -C -C -M parse-options.c|cut -d\( -f2|cut -d2 -f1|sort|uniq -c
16 Alex Riesen
6 Jeff King
47 Johannes Schindelin
12 Junio C Hamano
19 Michele Ballabio
1 Nanako Shiraishi
1 Olivier Marin
395 Pierre Habouzit
Okay, arguably parse-options.c in libgit quite doesn't makes sense
(though it can help bringing some kind of uniformity to other git
ecosystem tools built on libgit but that's not the point I'm trying to
make), I'm sure this kind of pattern where it's likely to be easy to
relicense code happens to some source files. Nicolas already said I
think that he was okay with relicensing his work too e.g.
Maybe we could, in parallel to that, contact people who "own" code in
the core parts of git to ask them where they stand, and see if that can
free some bits of the code. Attached is the current owners of the non
builtin-* C, non header, code in git core, got using this on top of
next:
for i in *.c; do
case $i in
builtin-*)
continue;;
*)
git blame -C -C -M $i|cut -d\( -f2|cut -d2 -f1;;
esac
done
and doing sort | uniq -c | sort -n >owners on it is attached.
Interestingly, it yields around 200 contributors "only" but more
interestingly, only 41 people "own" more than 100 lines of code in
there, and 23 more if you add people with more than 50. IOW, it wouldn't
be absurd to mail those roughly 65 people ask them what they think of
relicensing their work (for those where it's needed because of current
GPL-ness of the code) and see what result it yields. Worst case we lost
like 2 or 3 weeks, best case scenario, we can reuse some bits of git to
reimplement some of the algorithms verbatim.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #1.2: owners --]
[-- Type: text/plain, Size: 4029 bytes --]
16818 Junio C Hamano
8892 Linus Torvalds
4978 Johannes Schindelin
3664 Shawn O. Pearce
2938 Nick Hengeveld
2857 Nicolas Pitre
2455 Daniel Barkalow
1660 Pierre Habouzit
1294 Adam Simpkins
1126 Mike McCormack
1110 René Scharfe
861 Jeff King
653 Lukas Sandström
564 Miklos Vajna
531 Johannes Sixt
495 Alex Riesen
400 Martin Koegler
332 Petr Baudis
310 Jon Loeliger
263 Timo Hirvonen
236 Brandon Casey
231 Lars Hjemli
221 Sergey Vlasov
200 H. Peter Anvin
181 Dmitry Potapov
165 Mike Hommey
163 Matthias Lederhofer
158 Robert Shearman
155 Andreas Ericsson
151 YOSHIFUJI Hideaki
147 Franck Bui-Huu
137 Christian Couder
127 Stephan Beyer
127 David Reiss
123 Paolo Bonzini
120 Steffen Prohaska
115 Scott R Parish
113 Bradford C. Smith
109 Kristian Høgsberg
105 Heikki Orsila
100 Andy Whitcroft
93 Jim Meyering
90 Wincent Colaiuta
90 Julian Phillips
87 Stephen R. van den Berg
85 Marco Costalba
82 David Rientjes
76 Sven Verdoolaege
75 Eric Wong
72 Edgar Toernig
71 Ramsay Allan Jones
69 Martin Waitz
68 Mark Wooding
68 Eric W. Biederman
67 Geert Bosch
64 Michal Ostrowski
64 David Kastrup
63 Theodore Ts'o
63 Alexandre Julliard
61 Brian Downing
59 Alexander Gavrilov
58 Andy Parkins
56 Jon Seymour
52 Carlos Rica
47 Sean Estabrooks
44 Dana L. How
43 J. Bruce Fields
41 Ping Yin
40 Florian Forster
38 Tilman Sauerbeck
38 Serge E. Hallyn
38 Kay Sievers
38 Jason Riedy
37 Dustin Sallings
35 Nguyễn Thái Ngọc Duy
35 Luiz Fernando N. Capitulino
35 Clemens Buchacher
34 Fredrik Kuivinen
33 Pavel Roskin
33 Lars Knoll
32 Josef Weidendorfer
32 Jonas Fonseca
30 Peter Eriksen
28 Jay Soffian
26 Grégoire Barbier
25 Adam Roben
24 Marius Storm-Olsen
24 Luben Tuikov
24 Jürgen Rühle
24 Bryan Larsen
24 Anders Melchiorsen
23 Pieter de Bie
23 Nanako Shiraishi
23 Michael S. Tsirkin
23 Brian Hetro
22 Paul Mackerras
22 Jens Axboe
20 Thomas Rast
19 Paul Collins
19 Matthias Kestenholz
19 Joachim Berdal Haga
19 David Woodhouse
18 Mark Levedahl
17 Michele Ballabio
15 Gerrit Pape
14 Sam Vilain
13 Olivier Marin
13 Adam Brewster
12 SZEDER Gábor
12 Kai Ruemmler
12 Govind Salinas
11 Sasha Khapyorsky
11 Raphael Zimmerer
11 Brian Gernhardt
10 Steven Grimm
10 Holger Eitzenberger
10 Dmitry V. Levin
10 Dennis Stosberg
10 Avery Pennarun
9 Willy Tarreau
9 Santi Béjar
9 Robin H. Johnson
9 James Bowes
9 Björn Steinbrink
8 Markus Amsler
8 Jonathan del Strother
8 Johan Herland
8 Chris Parsons
8 Boyd Lynn Gerber
7 Robin Rosenberg
7 Paul Serice
7 Matt Kraai
7 Jason McMullan
7 Frank Lichtenheld
7 Christopher Li
6 Qingning Huo
6 Han-Wen Nienhuys
6 David Soria Parra
6 Björn Engelmann
6 Ariel Badichi
5 Sam Ravnborg
5 Peter Hagervall
5 Paul T Darga
5 Michal Vitecek
5 James Bottomley
5 Dotan Barak
5 David S. Miller
5 André Goddard Rosa
4 Uwe Kleine-König
4 Teemu Likonen
4 Samuel Tardieu
4 Patrick Welche
4 Michael Spang
4 Matthieu Moy
4 Joey Hess
4 Finn Arne Gangstad
4 Dmitry Kakurin
4 Carl Worth
4 Arjen Laarhoven
3 Steven Drake
3 Matthew Ogilvie
3 Li Hong
3 Josh Triplett
3 Jakub Narebski
3 Eygene Ryabinkin
3 Brian Gerst
2 Tom Prince
2 Todd Zullinger
2 Shawn Bohrer
2 Matthias Urlichs
2 Martin Sivak
2 Krzysztof Kowalczyk
2 Kevin Ballard
2 Jan Harkes
2 Fernando J. Pereda
2 Eyvind Bernhardsen
2 Deskin Miller
2 Charles Bailey
2 Andrew Ruder
2 Amos Waterland
2 Alp Toker
2 Alexey Nezhdanov
1 Yann Dirson
1 Tuncer Ayaz
1 Tony Luck
1 Timo Sirainen
1 Thomas Harning
1 Thomas Glanzmann
1 Sverre Hvammen Johansen
1 Stephan Feder
1 Simon Hausmann
1 Salikh Zakirov
1 Ryan Anderson
1 Rutger Nijlunsing
1 Randal L. Schwartz
1 Peter Valdemar Mørch
1 Paul Eggert
1 Patrick Higgins
1 Mika Kukkonen
1 Matt Draisey
1 Marco Roeland
1 Lars Doelle
1 Jerald Fitzjerald
1 Jean-Luc Herren
1 Jan Andres
1 Ingo Molnar
1 David Symonds
1 David Meybohm
1 Darrin Thompson
1 Bryan Donlan
1 Brad Roberts
1 Blake Ramsdell
1 Benoit Sigoure
1 Adeodato Simó
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:47 ` Pierre Habouzit
@ 2008-10-31 18:41 ` Shawn O. Pearce
2008-10-31 18:54 ` Pierre Habouzit
` (2 more replies)
2008-10-31 20:24 ` Nicolas Pitre
` (2 subsequent siblings)
3 siblings, 3 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 18:41 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: git, Scott Chacon
Pierre Habouzit <madcoder@debian.org> wrote:
>
> I know this isn't actually helping a lot to define the real APIs, but we
> should really not repeat current git mistakes and have a really uniform
> APIs, meaning that first we must decide:
Agreed.
> * proper namespacing (e.g. OBJ_* looks like failure to me, it's a way
> too common prefix);
Fixed. Its now GIT_OBJ_*.
> * proper public "stuff" naming (I e.g. realy like types names -- not
> struct or enum tags, that I don't really care -- ending with _t as
> it helps navigating source.
Fixed, types now end in _t.
> And write that down _first_. It's not a lot of work, but it must be
> done. Working on a library really asks us to create something coherent
> for our users.
How about this?
http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
> Second, if we want this to be a successful stuff, we all agree we must
> let git be able to use it medium term. That means that when git-core is
> experimenting with new interfaces, it will probably need to hook into
> some more internal aspects of the library. This is a problem to solve
> elegantly, linking git against a static library won't please a lot of
> vendors and linux distributions, and exporting "private" symbols is a
> sure way to see them being abused.
Private symbols are a problem. On some systems we can use link
editing to strip them out of the .so, but that isn't always going to
work everywhere. I've outlined the idea of using double underscore
to name private functions, and we can link-edit out '*__*' if the
platform's linker supports it (e.g. GNU ld).
> Last but not least, I believe parts of git-core are currently easy to
> just take. For example, any code *I* wrote, I hereby give permission to
> relicense it in any of the following licenses: BSD-like, MIT-like,
> WTFPL.
Yea. We could try to do that. I don't know how far it will get us,
but if we have to "steal" code we can rip a good part from JGit.
Its BSD-like, but has that "icky Java smell" to it. :-)
Before worrying about where we get implementation bits from I'm
more interested in trying to get a consistent view of what our
namespace looks like, and what our calling conventions are, so we
have some sort of benchmark to measure APIs against as we add them
to the implementation.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 18:41 ` Shawn O. Pearce
@ 2008-10-31 18:54 ` Pierre Habouzit
2008-10-31 19:57 ` Shawn O. Pearce
2008-10-31 20:05 ` Junio C Hamano
2008-11-01 17:30 ` Pierre Habouzit
2 siblings, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 18:54 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 3787 bytes --]
On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
> Pierre Habouzit <madcoder@debian.org> wrote:
> How about this?
>
> http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
looks like a good start.
> > Second, if we want this to be a successful stuff, we all agree we must
> > let git be able to use it medium term. That means that when git-core is
> > experimenting with new interfaces, it will probably need to hook into
> > some more internal aspects of the library. This is a problem to solve
> > elegantly, linking git against a static library won't please a lot of
> > vendors and linux distributions, and exporting "private" symbols is a
> > sure way to see them being abused.
>
> Private symbols are a problem. On some systems we can use link
> editing to strip them out of the .so, but that isn't always going to
> work everywhere. I've outlined the idea of using double underscore
> to name private functions, and we can link-edit out '*__*' if the
> platform's linker supports it (e.g. GNU ld).
Well, I propose the following: we set-up on GNU-ld + gcc enabled systems
all what is needed to use symbol visibility, which isn't that intrusive,
and also rather easy given your GIT_EXPORT macro definition.
This way, people who care about portability across all libgit2 supported
platforms will have to align on the lowest common denominator, which
will not have any kind of private stuff available, so we're safe. And
GCC/GNU-ld enabled platforms cover most of the popular platforms (namely
linux and *BSD, I'm not sure about Macos dynlibs). Even win32 has kind
of what you need to do visibility I think.
IOW prehistoric systems will be have to cope with that because of Linux
(yeah, this is kind of deliciously backwards ;p).
No, my worry was rather wrt git core itself, I really think we _must_
make it link against libgit2 if we want libgit2 to stay current, but git
core will _very likely_ need the private stuff, and it _will_ be a
problem. I mean we cannot seriously so-name a library and show its guts
at the same time, and I'm unsure how to fix that problem. _that_ was my
actual question.
> > Last but not least, I believe parts of git-core are currently easy to
> > just take. For example, any code *I* wrote, I hereby give permission to
> > relicense it in any of the following licenses: BSD-like, MIT-like,
> > WTFPL.
>
> Yea. We could try to do that. I don't know how far it will get us,
> but if we have to "steal" code we can rip a good part from JGit.
> Its BSD-like, but has that "icky Java smell" to it. :-)
>
> Before worrying about where we get implementation bits from I'm
> more interested in trying to get a consistent view of what our
> namespace looks like, and what our calling conventions are, so we
> have some sort of benchmark to measure APIs against as we add them
> to the implementation.
I'd say we should do both at the same time. Asking people if they would
agree to relicense code can be done in parallel. We could extract a list
of source files that we may need (my extraction included stuff that is
very unlikely to be useful like test-*.c that aren't useful, and some
that are already BSD I think), and see who it yields. It should be
possible to do a matrix source-file x people and see on a per-file basis
what they think.
If someone gives me the list of files we should consider (I'm not sure
about a good list right now) I could do the matrix at some fixed sha1
from git.git using git blame -C -M -M -w, and ask people see where it
leads us ?
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 18:54 ` Pierre Habouzit
@ 2008-10-31 19:57 ` Shawn O. Pearce
2008-10-31 20:12 ` Pierre Habouzit
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 19:57 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: git, Scott Chacon
Yet more updates to the API:
http://www.spearce.org/projects/scm/libgit2/apidocs/html/modules.html
In particular I've started to define what revision machinary might
look like, based somewhat on JGit's (public) RevWalk API. The guts
of how to make it work of course aren't yet defined.
My goal here is to have the git_revp_t be non-thread safe, but
also to contain the entire object pool, so git_revp_free() will
dispose of any and all git_commit_t's which were created from it.
This "fixes" the "we leak everything" behavior and allows
applications to have multiple pools on different threads if
it needs to. Thus the core git_revp_t isn't thread-safe and
doesn't get weighed down by locking if the application wants
multiple threads.
Pierre Habouzit <madcoder@debian.org> wrote:
>
> Well, I propose the following: we set-up on GNU-ld + gcc enabled systems
> all what is needed to use symbol visibility, which isn't that intrusive,
> and also rather easy given your GIT_EXPORT macro definition.
Yes, agreed. Only I don't know how to do it myself. I know its
possible, so if someone wants to contribute a patch for this ... :-)
> No, my worry was rather wrt git core itself, I really think we _must_
> make it link against libgit2 if we want libgit2 to stay current, but git
> core will _very likely_ need the private stuff, and it _will_ be a
> problem. I mean we cannot seriously so-name a library and show its guts
> at the same time, and I'm unsure how to fix that problem. _that_ was my
> actual question.
Hmmph. I agree git-core needs to link to libgit2.
I disagree it needs private bits. If we do the library right
git-core can use the public API. And where it cannot its either
not something that is "right" for libgit2 (e.g. its parseopts and
we aren't committed yet to including option parsing) or its highly
experimental and we shouldn't put it into the library (and thus
also git-core) until its more frozen.
That said, I don't think its criminal to have git-core include and
link to a static libgit2, especially if git-core's usage of the
library is ahead of what the library itself is able to expose at
the present time.
> I'd say we should do both at the same time. Asking people if they would
> agree to relicense code can be done in parallel. We could extract a list
> of source files that we may need (my extraction included stuff that is
> very unlikely to be useful like test-*.c that aren't useful, and some
> that are already BSD I think), and see who it yields. It should be
> possible to do a matrix source-file x people and see on a per-file basis
> what they think.
>
> If someone gives me the list of files we should consider (I'm not sure
> about a good list right now) I could do the matrix at some fixed sha1
> from git.git using git blame -C -M -M -w, and ask people see where it
> leads us ?
Off the top of my head some really important ones:
diff-delta.c
object.c
patch-delta.c
refs.c
revision.c
sha1_file.c
sha1_name.c
They form a pretty large part of the guts of what most people want
from a git library.
Slightly less important, but still fairly core:
builtin-fetch-pack.c
builtin-send-pack.c
connect.c
remote.c
transport.c
Is most of the client side of the git:// transport, something people want.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 18:41 ` Shawn O. Pearce
2008-10-31 18:54 ` Pierre Habouzit
@ 2008-10-31 20:05 ` Junio C Hamano
2008-10-31 21:58 ` Shawn O. Pearce
2008-11-01 17:30 ` Pierre Habouzit
2 siblings, 1 reply; 83+ messages in thread
From: Junio C Hamano @ 2008-10-31 20:05 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Pierre Habouzit, git, Scott Chacon
"Shawn O. Pearce" <spearce@spearce.org> writes:
>> * proper public "stuff" naming (I e.g. realy like types names -- not
>> struct or enum tags, that I don't really care -- ending with _t as
>> it helps navigating source.
>
> Fixed, types now end in _t.
Ugh.
You could talk me into it if you promise never typedef structures (or
pointer to structures) with such symbols, I guess.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 19:57 ` Shawn O. Pearce
@ 2008-10-31 20:12 ` Pierre Habouzit
0 siblings, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 20:12 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 4248 bytes --]
On Fri, Oct 31, 2008 at 07:57:11PM +0000, Shawn O. Pearce wrote:
> This "fixes" the "we leak everything" behavior and allows
> applications to have multiple pools on different threads if
> it needs to. Thus the core git_revp_t isn't thread-safe and
> doesn't get weighed down by locking if the application wants
> multiple threads.
Note that on Linux (and also BSDs I think) many pthread locking
functions have stubs in the glibc and cost 0 until you use the
libpthread. So unless we need to use locking inside the tight-loop, it's
virtually free to write a thread-safe library (a function call is
basically 10 times cheaper than an xchg-based lock).
(Though for git it would suck as we get libpthread in our depends
through some of the other dependencies.
Another way is to use function pointers for the locking, and have a
function to make the object store thread safe by setting the pointers to
functions actually doing locking, and to let it point to functions doing
nothing else.
It looks unrealistic to me to let people deal with the locking,
especially if we mean this library to _also_ be used in language
bindings, hence used in sloppily written scripts.
But of course, if locking calls are used in the tight loop that would
rather suck :/
> Pierre Habouzit <madcoder@debian.org> wrote:
> >
> > Well, I propose the following: we set-up on GNU-ld + gcc enabled systems
> > all what is needed to use symbol visibility, which isn't that intrusive,
> > and also rather easy given your GIT_EXPORT macro definition.
>
> Yes, agreed. Only I don't know how to do it myself. I know its
> possible, so if someone wants to contribute a patch for this ... :-)
I will, basically, you need to build everything with -fvisibilty=hidden
in your CFLAGS, and mark the prototypes of symbols you want to export
with __attribute__((visibility("default"))) (that you can set into your
EXPORT_GIT macro when you're building with __GNUC__).
You don't even need a linker script (unless we're going to do some
symbol versioning but I'm unsure whether it's that useful for now).
> > No, my worry was rather wrt git core itself, I really think we _must_
> > make it link against libgit2 if we want libgit2 to stay current, but git
> > core will _very likely_ need the private stuff, and it _will_ be a
> > problem. I mean we cannot seriously so-name a library and show its guts
> > at the same time, and I'm unsure how to fix that problem. _that_ was my
> > actual question.
>
> Hmmph. I agree git-core needs to link to libgit2.
>
> I disagree it needs private bits. If we do the library right
> git-core can use the public API. And where it cannot its either
> not something that is "right" for libgit2
> (e.g. its parseopts and
> we aren't committed yet to including option parsing)
Sure, I didn't mean we have to put it in libgit2, it's highly UI related
and other tool may want to use something else, and it makes no sense for
many languages that have their library already (python, perl do e.g.).
> or its highly
> experimental and we shouldn't put it into the library (and thus
> also git-core) until its more frozen.
>
> That said, I don't think its criminal to have git-core include and
> link to a static libgit2, especially if git-core's usage of the
> library is ahead of what the library itself is able to expose at
> the present time.
Okay, we'll see how that turns out to work then :)
> Off the top of my head some really important ones:
>
> diff-delta.c
> object.c
> patch-delta.c
> refs.c
> revision.c
> sha1_file.c
> sha1_name.c
>
> They form a pretty large part of the guts of what most people want
> from a git library.
>
> Slightly less important, but still fairly core:
>
> builtin-fetch-pack.c
> builtin-send-pack.c
> connect.c
> remote.c
> transport.c
>
> Is most of the client side of the git:// transport, something people want.
Okay I'll let people mention what they would like to see too, and I'll
work from that then.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:47 ` Pierre Habouzit
2008-10-31 18:41 ` Shawn O. Pearce
@ 2008-10-31 20:24 ` Nicolas Pitre
2008-10-31 20:29 ` david
` (2 more replies)
2008-10-31 20:24 ` Brian Gernhardt
2008-10-31 21:59 ` Andreas Ericsson
3 siblings, 3 replies; 83+ messages in thread
From: Nicolas Pitre @ 2008-10-31 20:24 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Shawn O. Pearce, git, Scott Chacon
On Fri, 31 Oct 2008, Pierre Habouzit wrote:
> Last but not least, I believe parts of git-core are currently easy to
> just take. For example, any code *I* wrote, I hereby give permission to
> relicense it in any of the following licenses: BSD-like, MIT-like,
> WTFPL.
First........... is there really a need to re-license it?
If so then the choice of license is IMHO rather important.
> Nicolas already said I think that he was okay with relicensing his
> work too e.g.
Depends. Sure, I gave permission to copy some of my code for JGIT
because 1) JGIT is Java code in which I have little interest, 2) the
different license was justified by the nature of the JGIT project, and
3) although no license convey this I asked for the C version of git to
remain the authoritative reference and that any improvements done to JGIT
first be usable in the C version under the GPL.
Of course a library might need a different license than the GPL to be
widely useful from a linkage stand point, but the code within that
library does not need to be miles away from the GPL. What I personally
care about is for improvements to my code to always be contributed back,
which pretty much discards BSD-like licenses.
My favorite license for a library is the GPL with the gcc exception,
i.e. what libraries coming with gcc are using. They're GPL but with an
exception allowing them to be linked with anything. And because
everything on a Linux system, including proprietary applications, is
likely linked against those gcc libs, then there is nothing that would
prevent libgit to be linked against anything as well. But the library
code itself has GPL protection.
For reference, here's the exception text:
In addition to the permissions in the GNU General Public License, the
Free Software Foundation gives you unlimited permission to link the
compiled version of this file into combinations with other programs,
and to distribute those combinations without any restriction coming
from the use of this file. (The General Public License restrictions
do apply in other respects; for example, they cover modification of
the file, and distribution when not linked into a combine
executable.)
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:47 ` Pierre Habouzit
2008-10-31 18:41 ` Shawn O. Pearce
2008-10-31 20:24 ` Nicolas Pitre
@ 2008-10-31 20:24 ` Brian Gernhardt
2008-10-31 21:59 ` Andreas Ericsson
3 siblings, 0 replies; 83+ messages in thread
From: Brian Gernhardt @ 2008-10-31 20:24 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Shawn O. Pearce, git, Scott Chacon
On Oct 31, 2008, at 1:47 PM, Pierre Habouzit wrote:
> Last but not least, I believe parts of git-core are currently easy to
> just take. For example, any code *I* wrote, I hereby give permission
> to
> relicense it in any of the following licenses: BSD-like, MIT-like,
> WTFPL.
FWIW, I give the same permissions.
> 11 Brian Gernhardt
~~ Brian Gernhardt
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:24 ` Nicolas Pitre
@ 2008-10-31 20:29 ` david
2008-10-31 20:56 ` Nicolas Pitre
2008-10-31 21:31 ` Pierre Habouzit
2008-10-31 23:14 ` Junio C Hamano
2 siblings, 1 reply; 83+ messages in thread
From: david @ 2008-10-31 20:29 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Pierre Habouzit, Shawn O. Pearce, git, Scott Chacon
On Fri, 31 Oct 2008, Nicolas Pitre wrote:
> On Fri, 31 Oct 2008, Pierre Habouzit wrote:
>
>> Last but not least, I believe parts of git-core are currently easy to
>> just take. For example, any code *I* wrote, I hereby give permission to
>> relicense it in any of the following licenses: BSD-like, MIT-like,
>> WTFPL.
>
> First........... is there really a need to re-license it?
> If so then the choice of license is IMHO rather important.
at the very least you should go from GPLv2 to LGPLv2 for the library.
while it can be argued that this really shouldn't be nessasary, the water
is muddy enough that it would be a very good thing to do this.
I don't see any need to switch to a BSD/MIT/etc license for a library, the
LGPL lets it get linked with those licenses anyway.
> My favorite license for a library is the GPL with the gcc exception,
> i.e. what libraries coming with gcc are using. They're GPL but with an
> exception allowing them to be linked with anything. And because
> everything on a Linux system, including proprietary applications, is
> likely linked against those gcc libs, then there is nothing that would
> prevent libgit to be linked against anything as well. But the library
> code itself has GPL protection.
>
> For reference, here's the exception text:
>
> In addition to the permissions in the GNU General Public License, the
> Free Software Foundation gives you unlimited permission to link the
> compiled version of this file into combinations with other programs,
> and to distribute those combinations without any restriction coming
> from the use of this file. (The General Public License restrictions
> do apply in other respects; for example, they cover modification of
> the file, and distribution when not linked into a combine
> executable.)
<shrug>, I don't see why this is needed with the LGPL, but I'm not a
lawyer.
David Lang
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:29 ` david
@ 2008-10-31 20:56 ` Nicolas Pitre
2008-10-31 21:43 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Nicolas Pitre @ 2008-10-31 20:56 UTC (permalink / raw)
To: david; +Cc: Pierre Habouzit, Shawn O. Pearce, git, Scott Chacon
On Fri, 31 Oct 2008, david@lang.hm wrote:
> On Fri, 31 Oct 2008, Nicolas Pitre wrote:
>
> > On Fri, 31 Oct 2008, Pierre Habouzit wrote:
> >
> > > Last but not least, I believe parts of git-core are currently easy to
> > > just take. For example, any code *I* wrote, I hereby give permission to
> > > relicense it in any of the following licenses: BSD-like, MIT-like,
> > > WTFPL.
> >
> > First........... is there really a need to re-license it?
> > If so then the choice of license is IMHO rather important.
>
> at the very least you should go from GPLv2 to LGPLv2 for the library.
Sure.
> while it can be argued that this really shouldn't be nessasary, the water is
> muddy enough that it would be a very good thing to do this.
>
> I don't see any need to switch to a BSD/MIT/etc license for a library, the
> LGPL lets it get linked with those licenses anyway.
Right.
> > My favorite license for a library is the GPL with the gcc exception,
> > i.e. what libraries coming with gcc are using. They're GPL but with an
> > exception allowing them to be linked with anything. And because
> > everything on a Linux system, including proprietary applications, is
> > likely linked against those gcc libs, then there is nothing that would
> > prevent libgit to be linked against anything as well. But the library
> > code itself has GPL protection.
> >
> > For reference, here's the exception text:
> >
> > In addition to the permissions in the GNU General Public License, the
> > Free Software Foundation gives you unlimited permission to link the
> > compiled version of this file into combinations with other programs,
> > and to distribute those combinations without any restriction coming
> > from the use of this file. (The General Public License restrictions
> > do apply in other respects; for example, they cover modification of
> > the file, and distribution when not linked into a combine
> > executable.)
>
> <shrug>, I don't see why this is needed with the LGPL, but I'm not a lawyer.
The LGPL also asks that proprietary applications provides necessary
object files so you can link it against an alternative implementation of
the LGPL library if you so wish. With dynamic libraries this is rather
moot but I think that's the main difference.
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:24 ` Nicolas Pitre
2008-10-31 20:29 ` david
@ 2008-10-31 21:31 ` Pierre Habouzit
2008-10-31 22:10 ` Nicolas Pitre
2008-10-31 23:24 ` Pieter de Bie
2008-10-31 23:14 ` Junio C Hamano
2 siblings, 2 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 21:31 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Shawn O. Pearce, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 2214 bytes --]
On Fri, Oct 31, 2008 at 08:24:14PM +0000, Nicolas Pitre wrote:
> On Fri, 31 Oct 2008, Pierre Habouzit wrote:
>
> > Last but not least, I believe parts of git-core are currently easy to
> > just take. For example, any code *I* wrote, I hereby give permission to
> > relicense it in any of the following licenses: BSD-like, MIT-like,
> > WTFPL.
>
> First........... is there really a need to re-license it?
> If so then the choice of license is IMHO rather important.
Hmm yeah, GPL has the viral link issue, and there is a real use of
embedding the libgit in many projects. There is a use for it: front-ends
to git from editors, for closed-source projects of various sorts, ...
All of those right now must do a reimplementation of what they need. As
soon as they need some kind of performance, exec()ing git-* commands
doesn't fly.
Git is currently mostly "GPLv2 or later". A BSDish license was
mentioned, because it's the most permissive one and that nobody cared
that much, though a LGPL/GPL-with-GCC-exception would probably fly.
Many of the people needing a library for libgit are probably reading the
list, I'll let them comment. The kind of license you propose would
totally suite my needs, and I think, most of the one discussed at
GitTogether'08 (except for the eclipse people disliking GPL'ed stuff,
but anyways there was the issue of C code being non pure java anyways,
so maybe Shawn can comment on that bit, I don't recall the exact
specifics I must reckon).
OT: FWIW I prefer BSDish licenses (even the MIT actually) for libraries
because I believe that computing is overall better if everyone can use
the right tool for the task, and I don't want to prevent people from
using good stuff (I hope I write good stuff ;P) because of the license.
And I don't care about people don't giving back to me, those are not the
kind of people who would have given back if it was GPL'ed anyways.
But I understand this is a completely personal view, and I'm not even
trying to persuade you :)
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:56 ` Nicolas Pitre
@ 2008-10-31 21:43 ` Shawn O. Pearce
2008-10-31 21:50 ` Shawn O. Pearce
2008-10-31 21:51 ` Pierre Habouzit
0 siblings, 2 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 21:43 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: david, Pierre Habouzit, git, Scott Chacon
Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 31 Oct 2008, david@lang.hm wrote:
> > On Fri, 31 Oct 2008, Nicolas Pitre wrote:
> > > On Fri, 31 Oct 2008, Pierre Habouzit wrote:
> > >
> > > > Last but not least, I believe parts of git-core are currently easy to
> > > > just take. For example, any code *I* wrote, I hereby give permission to
> > > > relicense it in any of the following licenses: BSD-like, MIT-like,
> > > > WTFPL.
> > >
> > > First........... is there really a need to re-license it?
> > > If so then the choice of license is IMHO rather important.
Some people want to be able to link the library into an application
that they redistribute binaries of, but not sources to. Those folks
have also volunteered to help write the library. If they put their
code where their mouth is, then I think they should be able to use
their code the way they want to.
That said, I think the license choice that makes the most sense
here is probably LGPL or GPL+gcc exception, like you note below.
BSD and MIT are probably not serious contenders.
> > at the very least you should go from GPLv2 to LGPLv2 for the library.
>
> Sure.
Well, we cannot do a GPL->LGPL switch on code without author
permission for that sort of re-licensing.
That said, I think many authors of git.git code would be more
comfortable with a GPL->LGPL change, where they wouldn't be OK with
a GPL->BSD/MIT change. There may be some folks though who still
wouldn't accept a GPL->LGPL move.
> > > My favorite license for a library is the GPL with the gcc exception,
...
> > >
> > > For reference, here's the exception text:
> > >
> > > In addition to the permissions in the GNU General Public License, the
> > > Free Software Foundation gives you unlimited permission to link the
> > > compiled version of this file into combinations with other programs,
> > > and to distribute those combinations without any restriction coming
> > > from the use of this file. (The General Public License restrictions
> > > do apply in other respects; for example, they cover modification of
> > > the file, and distribution when not linked into a combine
> > > executable.)
> >
> > <shrug>, I don't see why this is needed with the LGPL, but I'm not a lawyer.
>
> The LGPL also asks that proprietary applications provides necessary
> object files so you can link it against an alternative implementation of
> the LGPL library if you so wish. With dynamic libraries this is rather
> moot but I think that's the main difference.
I'm happy with either the LGPL or the GPL+exception above. If I
read these correctly the GPL+exception allows one to distribute
static executables without source or object files, so long as the
library source wasn't modified. I'd almost prefer just using the
standard LGPL then, static linking isn't very common anymore.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:43 ` Shawn O. Pearce
@ 2008-10-31 21:50 ` Shawn O. Pearce
2008-10-31 21:51 ` Pierre Habouzit
1 sibling, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 21:50 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: david, Pierre Habouzit, git, Scott Chacon
"Shawn O. Pearce" <spearce@spearce.org> wrote:
> That said, I think the license choice that makes the most sense
> here is probably LGPL or GPL+gcc exception, like you note below.
> BSD and MIT are probably not serious contenders.
I should clarify that I said the above paragraph...
> That said, I think many authors of git.git code would be more
> comfortable with a GPL->LGPL change, where they wouldn't be OK with
> a GPL->BSD/MIT change. There may be some folks though who still
> wouldn't accept a GPL->LGPL move.
because of this paragraph.
Like Pierre I also prefer a BSD style license, and JGit is under
that, as it offers quite a bit of freedom for the consumer of
the code.
I'm also not too worried about not getting changes back. If someone
forks away from the base project and doesn't contribute back,
that's their problem. So long as the base project has sufficient
momentum under it making changes and improving things, everyone
else will want to pull and either face merge-hell once in a while,
or send changes back upstream to avoid merge-hell.
But I doubt Git regulars share our views on this, and I think most
of the major contributors to git.git have stated multiple times
that they prefer a GPL style license on their code. I want those
people to contribute to libgit2 (assuming the project moves past the
pie-in-the-sky theory stage), so I want the license to be something
they will be comfortable with.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:43 ` Shawn O. Pearce
2008-10-31 21:50 ` Shawn O. Pearce
@ 2008-10-31 21:51 ` Pierre Habouzit
1 sibling, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 21:51 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Nicolas Pitre, david, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 959 bytes --]
On Fri, Oct 31, 2008 at 09:43:56PM +0000, Shawn O. Pearce wrote:
> Nicolas Pitre <nico@cam.org> wrote:
> > On Fri, 31 Oct 2008, david@lang.hm wrote:
> > > at the very least you should go from GPLv2 to LGPLv2 for the library.
> >
> > Sure.
>
> Well, we cannot do a GPL->LGPL switch on code without author
> permission for that sort of re-licensing.
>
> That said, I think many authors of git.git code would be more
> comfortable with a GPL->LGPL change, where they wouldn't be OK with
> a GPL->BSD/MIT change. There may be some folks though who still
> wouldn't accept a GPL->LGPL move.
FWIW, I dislike the LGPL for many reasons, and I prefer 100x times a GPL
with GCC kind of exceptions. But if other people hate it, I wont be
be a problem and refuse it.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:05 ` Junio C Hamano
@ 2008-10-31 21:58 ` Shawn O. Pearce
0 siblings, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 21:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Pierre Habouzit, git, Scott Chacon
Junio C Hamano <gitster@pobox.com> wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
>
> >> * proper public "stuff" naming (I e.g. realy like types names -- not
> >> struct or enum tags, that I don't really care -- ending with _t as
> >> it helps navigating source.
> >
> > Fixed, types now end in _t.
>
> Ugh.
>
> You could talk me into it if you promise never typedef structures (or
> pointer to structures) with such symbols, I guess.
I should write that one down in CONVENTIONS.
IMHO:
typedef uint32_t uid_t; /* sane */
typedef enum {...} status_t; /* sane */
typedef struct foo_t foo_t; /* sane */
typedef struct {...} foo_t; /* borderline insane */
typedef char* str_t; /* totally nuts */
typedef char**** str_pppp_t; /* totally nuts */
Hiding the fact that scalar types like a uid_t are 32 bits on
this system is reasonable. Heck, uid_t is already in POSIX,
we shouldn't fight that sort of idea. It at least improves
documentation somewhat.
Hiding the fact that some scalar type is an enum, so you don't have
to type "enum blah" everywhere is also reasonable. Its slightly
better than #define some magic constants and passing an int
everywhere. Its a reasonable balance between reducing keystrokes
and keeping the code semi-self-documenting.
Hiding the fact that an opaque struct (or union) you cannot ever
see the members of is a struct or union is good API design. You can
later change the major class from struct to union or back, or totally
redefine it, but the caller never needs to know what is going on.
Hiding a pointer is wrong. Callers should know they are getting a
pointer, or are being asked to supply a pointer-to-a-pointer. So the
"FILE*" stdio functions are sane, because we don't know what is under
a FILE type but we do know when we are dealing with a pointer to one.
My original proposal didn't stick _t onto the end of everything,
because I didn't think it was really necessary. I'm fine with it
either way. It may be better to include the _t suffix, it seems
to be somewhat common in libraries.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:47 ` Pierre Habouzit
` (2 preceding siblings ...)
2008-10-31 20:24 ` Brian Gernhardt
@ 2008-10-31 21:59 ` Andreas Ericsson
2008-10-31 22:01 ` Shawn O. Pearce
2008-10-31 23:22 ` Johannes Schindelin
3 siblings, 2 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-10-31 21:59 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Shawn O. Pearce, git, Scott Chacon
Pierre Habouzit wrote:
> On Fri, Oct 31, 2008 at 05:07:04PM +0000, Shawn O. Pearce wrote:
>> During the GitTogether we were kicking around the idea of a ground-up
>> implementation of a Git library. This may be easier than trying
>> to grind down git.git into a library, as we aren't tied to any
>> of the current global state baggage or the current die() based
>> error handling.
>>
>> I've started an _extremely_ rough draft. The code compiles into a
>> libgit.a but it doesn't even implement what it describes in the API,
>> let alone a working Git implementation. Really what I'm trying to
>> incite here is some discussion on what the API looks like.
>
> I know this isn't actually helping a lot to define the real APIs, but we
> should really not repeat current git mistakes and have a really uniform
> APIs, meaning that first we must decide:
> * proper namespacing (e.g. OBJ_* looks like failure to me, it's a way
> too common prefix);
>
As it's the git-lib, all public functions should almost certainly be
prefixed with "git" or "git_". I favor "git_".
> * proper public "stuff" naming (I e.g. realy like types names -- not
> struct or enum tags, that I don't really care -- ending with _t as
> it helps navigating source.
>
*_t types are reserved by POSIX for future implementations, so that's
a no-go (although I doubt POSIX will ever make types named git_*_t).
Apart from that, please consider reading Ulrich Drepper's musings on
library design at http://people.redhat.com/drepper/goodpractice.pdf
It's pretty short but brings up nearly all the crucial points one really
don't want to forget.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:59 ` Andreas Ericsson
@ 2008-10-31 22:01 ` Shawn O. Pearce
2008-10-31 22:51 ` Junio C Hamano
2008-11-01 11:17 ` Andreas Ericsson
2008-10-31 23:22 ` Johannes Schindelin
1 sibling, 2 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 22:01 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Pierre Habouzit, git, Scott Chacon
Andreas Ericsson <ae@op5.se> wrote:
>> * proper public "stuff" naming (I e.g. realy like types names -- not
>> struct or enum tags, that I don't really care -- ending with _t as
>> it helps navigating source.
>
> *_t types are reserved by POSIX for future implementations, so that's
> a no-go (although I doubt POSIX will ever make types named git_*_t).
Yikes. Anyone know where a concise list of the reserved names are?
> Apart from that, please consider reading Ulrich Drepper's musings on
> library design at http://people.redhat.com/drepper/goodpractice.pdf
I think I've read that before, but I'll skim over it again.
Thanks for the link.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:31 ` Pierre Habouzit
@ 2008-10-31 22:10 ` Nicolas Pitre
2008-11-01 10:52 ` Andreas Ericsson
2008-10-31 23:24 ` Pieter de Bie
1 sibling, 1 reply; 83+ messages in thread
From: Nicolas Pitre @ 2008-10-31 22:10 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Shawn O. Pearce, git, Scott Chacon
On Fri, 31 Oct 2008, Pierre Habouzit wrote:
> Git is currently mostly "GPLv2 or later". A BSDish license was
> mentioned, because it's the most permissive one and that nobody cared
> that much, though a LGPL/GPL-with-GCC-exception would probably fly.
I do care. I think the BSD license is too permissive. There are really
nifty pieces of code in Git that I would be really sorry to see go
proprietary.
> Many of the people needing a library for libgit are probably reading the
> list, I'll let them comment. The kind of license you propose would
> totally suite my needs, and I think, most of the one discussed at
> GitTogether'08 (except for the eclipse people disliking GPL'ed stuff,
> but anyways there was the issue of C code being non pure java anyways,
> so maybe Shawn can comment on that bit, I don't recall the exact
> specifics I must reckon).
Eclipse is Java and that issue is already solved with JGIT which doesn't
reuse C code from git.
> OT: FWIW I prefer BSDish licenses (even the MIT actually) for libraries
> because I believe that computing is overall better if everyone can use
> the right tool for the task, and I don't want to prevent people from
> using good stuff (I hope I write good stuff ;P) because of the license.
Everybody can and does link against glibc on Linux which is LGPL. So
that doesn't affect "usage".
> And I don't care about people don't giving back to me, those are not the
> kind of people who would have given back if it was GPL'ed anyways.
> But I understand this is a completely personal view, and I'm not even
> trying to persuade you :)
Sure, and that's where we differ. I let you use my code for free, as
long as you give me back your improvements to it. This way everybody
stays honnest. I think this is Linus' view as well which he often
resume as "tit for tat".
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 22:01 ` Shawn O. Pearce
@ 2008-10-31 22:51 ` Junio C Hamano
2008-11-01 11:17 ` Andreas Ericsson
1 sibling, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2008-10-31 22:51 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Andreas Ericsson, Pierre Habouzit, git, Scott Chacon
"Shawn O. Pearce" <spearce@spearce.org> writes:
> Andreas Ericsson <ae@op5.se> wrote:
>>> * proper public "stuff" naming (I e.g. realy like types names -- not
>>> struct or enum tags, that I don't really care -- ending with _t as
>>> it helps navigating source.
>>
>> *_t types are reserved by POSIX for future implementations, so that's
>> a no-go (although I doubt POSIX will ever make types named git_*_t).
>
> Yikes. Anyone know where a concise list of the reserved names are?
Essentially, anything that ends with "_t" ;-)
http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_02.html#tag_02_02_02
Look for "Any Header" in the table.
>> Apart from that, please consider reading Ulrich Drepper's musings on
>> library design at http://people.redhat.com/drepper/goodpractice.pdf
>
> I think I've read that before, but I'll skim over it again.
> Thanks for the link.
>
> --
> Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 20:24 ` Nicolas Pitre
2008-10-31 20:29 ` david
2008-10-31 21:31 ` Pierre Habouzit
@ 2008-10-31 23:14 ` Junio C Hamano
2008-10-31 23:33 ` Pierre Habouzit
2008-10-31 23:41 ` Shawn O. Pearce
2 siblings, 2 replies; 83+ messages in thread
From: Junio C Hamano @ 2008-10-31 23:14 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Pierre Habouzit, Shawn O. Pearce, git, Scott Chacon
Nicolas Pitre <nico@cam.org> writes:
> Depends. Sure, I gave permission to copy some of my code for JGIT
> because 1) JGIT is Java code in which I have little interest, 2) the
> different license was justified by the nature of the JGIT project, and
> 3) although no license convey this I asked for the C version of git to
> remain the authoritative reference and that any improvements done to JGIT
> first be usable in the C version under the GPL.
This reminds me that Shawn earlier in an unrelated thread asked me if I
can relicense builtin-blame.c for JGIT; your reasoning fully matches that
of mine regarding that part of the code.
> My favorite license for a library is the GPL with the gcc exception,
> i.e. what libraries coming with gcc are using. They're GPL but with an
> exception allowing them to be linked with anything.
Although I'd be Ok with either GPL + gcc exception on whatever core-ish
(i.e. what will be necessary for libgit2; "blame" would not count) pieces
I have in C-git codebase, "can be linked with anything" allows a gaping
hole to the library, which I'm a bit hesitant to swallow without thinking.
E.g. our read_object() may look like this:
void *read_object(const object_name_t sha1,
enum object_type *type,
size_t *size)
{
...
}
but an extension a closed-source person may sell you back may do:
+typedef void *read_object_fn(const object_name_t,
+ enum object_type *,
+ size_t *);
+read_object_fn read_object_custom = NULL;
void *read_object(const object_name_t sha1,
enum object_type *type,
size_t *size)
{
+ if (read_object_custom != NULL)
+ return read_object_custom(sha1, type, size);
...
}
I.e. use the supplied custom function to do proprietary magic, such as
reading the object lazily from elsewhere over the network. And we will
never get that magic bit back.
Although no license asks this, my wish is that if somebody built on top of
what I wrote to make the world a better place, I'd like the same access to
that additional code so that I too can enjoy the improved world. Because
almost all of my code in git.git are under GPLv2, in reality I do not have
any access to your software as long as you do not distribute your
additional code that made the world a better place, which is a bit sad.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
2008-10-31 17:28 ` Pieter de Bie
2008-10-31 17:47 ` Pierre Habouzit
@ 2008-10-31 23:18 ` Bruno Santos
2008-10-31 23:25 ` Shawn O. Pearce
2008-11-01 19:18 ` Andreas Ericsson
2008-11-08 13:26 ` Steve Frécinaux
4 siblings, 1 reply; 83+ messages in thread
From: Bruno Santos @ 2008-10-31 23:18 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
Shawn O. Pearce wrote:
> During the GitTogether we were kicking around the idea of a ground-up
> implementation of a Git library. This may be easier than trying
> to grind down git.git into a library, as we aren't tied to any
> of the current global state baggage or the current die() based
> error handling.
>
> I've started an _extremely_ rough draft. The code compiles into a
> libgit.a but it doesn't even implement what it describes in the API,
> let alone a working Git implementation. Really what I'm trying to
> incite here is some discussion on what the API looks like.
>
> API Docs:
> http://www.spearce.org/projects/scm/libgit2/apidocs/html/modules.html
>
> Source Code Clone URL:
> http://www.spearce.org/projects/scm/libgit2/libgit2.git
>
We should take the opportunity a make it more portable. Instead of using
the posix api directly we should warp it in "git_" APIs. And be carefull
with certain APIs like fork or fork+exec and instead provided a more
generic solution: for fork one that would use the best solution in the
given platform, either by forking or threading; and for fork+exec a
generic create_process/run_command.
Here's an example, for the 'read' API, on how we can simply do this
without worries for the posix crowd:
ssize_t git_read(git_fildes_t fildes, void* buf, size_t bufsize);
Were git_fildes_t would be an int for posix and an HANDLE for win32.
For the posix case git_read can be simply inlined and we get zero overhead:
static inline ssize_t git_read(git_fildes_t fildes, void *buf,
size_t bufsize)
{
return read(fildes, buf, bufsize);
}
And for the win32 case it would be much more easier to implement the
equivalent, something like:
ssize_t git_read(git_fildes_t fildes, void *buf, size_t bufsize)
{
DWORD rd;
if (!ReadFile(fildes, buf, bufsize, &rd, NULL)) {
//translate win32 error to errno
return -1;
}
return rd;
}
Of course, there is also the issue of using the c runtime on win32, but
that problem can be easily solved outside git, provided that we don't
use a 'fileno' like API.
Bruno Santos
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:59 ` Andreas Ericsson
2008-10-31 22:01 ` Shawn O. Pearce
@ 2008-10-31 23:22 ` Johannes Schindelin
1 sibling, 0 replies; 83+ messages in thread
From: Johannes Schindelin @ 2008-10-31 23:22 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Pierre Habouzit, Shawn O. Pearce, git, Scott Chacon
Hi,
On Fri, 31 Oct 2008, Andreas Ericsson wrote:
> Apart from that, please consider reading Ulrich Drepper's musings on
> library design at http://people.redhat.com/drepper/goodpractice.pdf
I do not know if I want to trust a person that has shown a certain
eagerness to ignore good library design by breaking the well-established
dont-change-apis-on-minor-versions idiom, and instead of listening to
users that have problems as a consequence rather ignore them.
Instead, let's build on the knowledge of people we have learnt to trust,
on this list.
Thank you,
Dscho
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 21:31 ` Pierre Habouzit
2008-10-31 22:10 ` Nicolas Pitre
@ 2008-10-31 23:24 ` Pieter de Bie
2008-10-31 23:28 ` Shawn O. Pearce
1 sibling, 1 reply; 83+ messages in thread
From: Pieter de Bie @ 2008-10-31 23:24 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Nicolas Pitre, Shawn O. Pearce, git, Scott Chacon
On 31 okt 2008, at 22:31, Pierre Habouzit wrote:
> Git is currently mostly "GPLv2 or later". A BSDish license was
> mentioned, because it's the most permissive one and that nobody cared
> that much, though a LGPL/GPL-with-GCC-exception would probably fly.
>
> Many of the people needing a library for libgit are probably reading
> the
> list, I'll let them comment
As an implementor of a git GUI, I don't really care what license
libgit2 will be. GitX is currently GPLv2, though that might change
to LGPL to allow it to ship/link with non-GPL libraries. GPL and LGPL
would both suit me.
As a more concrete comment, is there anything you would like to hear
from GUI developers during the development of libgit2? I'm not sure I
can contribute much in terms of code, but if you need any constructive
comments, I can help with that.
- Pieter
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:18 ` Bruno Santos
@ 2008-10-31 23:25 ` Shawn O. Pearce
0 siblings, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 23:25 UTC (permalink / raw)
To: Bruno Santos; +Cc: git
Bruno Santos <nayart3@gmail.com> wrote:
> Shawn O. Pearce wrote:
> > During the GitTogether we were kicking around the idea of a ground-up
> > implementation of a Git library.
>
> We should take the opportunity a make it more portable. Instead of using
> the posix api directly we should warp it in "git_" APIs. And be carefull
> with certain APIs like fork or fork+exec [...]
>
> Here's an example, for the 'read' API, on how we can simply do this
> without worries for the posix crowd:
>
> ssize_t git_read(git_fildes_t fildes, void* buf, size_t bufsize);
>
> Were git_fildes_t would be an int for posix and an HANDLE for win32.
...
Yes, already on my mind. _IF_ this carries further I'll be involved
with its development, and I'll make certain its reasonably portable
like you are asking.
There are only a handful of things we really need to from the OS
and I think we can wrap most of them up into little inline stubs
on POSIX (so POSIX folks have no impact) and on Win32 we can have
small stubs (or again inline) so its pretty native on Win32.
Yes, I have considered say NPR or APR; I'm not going there.
Both are nice packages but there's also downsides to bringing them
into libgit2. IMHO its just easier to wrap the handful of things
we really need.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:24 ` Pieter de Bie
@ 2008-10-31 23:28 ` Shawn O. Pearce
2008-10-31 23:49 ` Junio C Hamano
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 23:28 UTC (permalink / raw)
To: Pieter de Bie; +Cc: Pierre Habouzit, Nicolas Pitre, git, Scott Chacon
Pieter de Bie <pdebie@ai.rug.nl> wrote:
> On 31 okt 2008, at 22:31, Pierre Habouzit wrote:
>>
>> Many of the people needing a library for libgit are probably reading
>> the
>> list, I'll let them comment
>
> As a more concrete comment, is there anything you would like to hear
> from GUI developers during the development of libgit2? I'm not sure I
> can contribute much in terms of code, but if you need any constructive
> comments, I can help with that.
We'd like to hear feedback on the API. Look at the operations
your application does with git-core.
Can you express that with the libgit2 API?
If not, why not?
Is it just that the docs are unclear?
Is the API missing?
What would you like to invoke to get the data you need?
...
You are the end-user of the library, so it needs to suit you. Ok,
you aren't the only end-user, but you and other developers like
you... :-)
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:14 ` Junio C Hamano
@ 2008-10-31 23:33 ` Pierre Habouzit
2008-10-31 23:41 ` Shawn O. Pearce
1 sibling, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-10-31 23:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, Shawn O. Pearce, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 2834 bytes --]
On Fri, Oct 31, 2008 at 11:14:44PM +0000, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > My favorite license for a library is the GPL with the gcc exception,
> > i.e. what libraries coming with gcc are using. They're GPL but with an
> > exception allowing them to be linked with anything.
>
> Although I'd be Ok with either GPL + gcc exception on whatever core-ish
> (i.e. what will be necessary for libgit2; "blame" would not count) pieces
> I have in C-git codebase, "can be linked with anything" allows a gaping
> hole to the library, which I'm a bit hesitant to swallow without thinking.
Well I wasn't thinking about anything else than what is needed for the
libgit2. I love BSDish for libraries, though like GPL for the actual
_tools_ I write with it.
> E.g. our read_object() may look like this:
>
> void *read_object(const object_name_t sha1,
> enum object_type *type,
> size_t *size)
> {
> ...
> }
>
>
> but an extension a closed-source person may sell you back may do:
>
> +typedef void *read_object_fn(const object_name_t,
> + enum object_type *,
> + size_t *);
> +read_object_fn read_object_custom = NULL;
> void *read_object(const object_name_t sha1,
> enum object_type *type,
> size_t *size)
> {
> + if (read_object_custom != NULL)
> + return read_object_custom(sha1, type, size);
> ...
> }
>
> I.e. use the supplied custom function to do proprietary magic, such as
> reading the object lazily from elsewhere over the network. And we will
> never get that magic bit back.
Well, for one "we're" not supposed to accept any patch that does that,
and I don't expect that the people who end up maintaining libgit2 will
become rogue. Though if such bits of APIs do exist one day, then well, I
see no license except the GPL that can prevent you from that.
My idea of trying to be able to reuse git.git code is not a necessity,
a new implementation from scratch is likely to be possible. Though we
all know that if the core git contributors don't contribute and
eventually use libgit2 this will not fly. That's why we must think about
it.
I assume given your answer that if libgit2 is BSD you may not be as
motivated to contribute code to it as you are to git.git, and this IMHO
would be a big no-go, like shawn said in another mail.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:14 ` Junio C Hamano
2008-10-31 23:33 ` Pierre Habouzit
@ 2008-10-31 23:41 ` Shawn O. Pearce
2008-10-31 23:56 ` Jakub Narebski
2008-11-01 0:41 ` david
1 sibling, 2 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-10-31 23:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, Pierre Habouzit, git, Scott Chacon
Junio C Hamano <gitster@pobox.com> wrote:
>
> Although I'd be Ok with either GPL + gcc exception on whatever core-ish
> (i.e. what will be necessary for libgit2; "blame" would not count) pieces
> I have in C-git codebase,
Someday I'm going to come back to you and ask for "blame" in libgit2.
Its an important function to be able to execute for an end-user.
Look at "git gui blame", its a major feature of the GUI.
If your blame implementation will never be available except under
the GPL then either it should be clean-room rewritten under the
library's license, or maybe there is a "libgitblame" that is GPL
and can be optionally linked with libgit2 and a GPL'd application
to get blame support.
>"can be linked with anything" allows a gaping
> hole to the library, which I'm a bit hesitant to swallow without thinking.
>
> E.g. our read_object() may look like this:
>
> void *read_object(const object_name_t sha1,
> enum object_type *type,
> size_t *size)
> {
> ...
> }
>
>
> but an extension a closed-source person may sell you back may do:
>
> +typedef void *read_object_fn(const object_name_t,
> + enum object_type *,
> + size_t *);
> +read_object_fn read_object_custom = NULL;
> void *read_object(const object_name_t sha1,
> enum object_type *type,
> size_t *size)
> {
> + if (read_object_custom != NULL)
> + return read_object_custom(sha1, type, size);
> ...
> }
>
> I.e. use the supplied custom function to do proprietary magic, such as
> reading the object lazily from elsewhere over the network. And we will
> never get that magic bit back.
As a maintainer I'd never accept such a patch. I'd ask for the
code under read_object_custom, or toss the patch on the floor.
But that doesn't stop them from distributing the patched sources
like above, keeping the fun bits in the closed source portion of
the executable they distribute.
Maybe I just think too highly of the other guy, but I'd hope that
anyone patching libgit2 like above would try to avoid it, because
they'd face merge issues in the future.
> Although no license asks this, my wish is that if somebody built on top of
> what I wrote to make the world a better place, I'd like the same access to
> that additional code so that I too can enjoy the improved world. Because
> almost all of my code in git.git are under GPLv2, in reality I do not have
> any access to your software as long as you do not distribute your
> additional code that made the world a better place, which is a bit sad.
IMHO, its a flaw of the GPL. GitHub anyone? Heck, even Google uses
a lot of GPL'd software internally (yes, we have Linux desktops and
servers) but not all of the software we distribute internally goes
external, so not all of our patches are published. *sigh*
I've actually stayed awake at night sometimes wondering what the
world would be like if the GPL virual clause forced the source code
for a website to be opened, or forced you to publish your code
even if you never distribute binaries beyond "you" (where "you"
is some mega corp in many countries with many employees).
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:28 ` Shawn O. Pearce
@ 2008-10-31 23:49 ` Junio C Hamano
2008-11-01 0:02 ` Pierre Habouzit
2008-11-01 0:13 ` Shawn O. Pearce
0 siblings, 2 replies; 83+ messages in thread
From: Junio C Hamano @ 2008-10-31 23:49 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Pieter de Bie, Pierre Habouzit, Nicolas Pitre, git, Scott Chacon
"Shawn O. Pearce" <spearce@spearce.org> writes:
> You are the end-user of the library, so it needs to suit you. Ok,
> you aren't the only end-user, but you and other developers like
> you... :-)
I will be the end-user of the library because if we want libgit2 to be
anywhere close to successful, you should be able to port C-git to it.
I understand that the apidocs/ is a very early work-in-progress, but
still, it bothers me that it is unclear to me what lifetime rules are in
effect on the in-core objects. For example, in C-git, commit objects are
not just parsed but are modified in place as history is traversed
(e.g. their flags smudged and their parents simplified). You have "flags"
field in commit, which implies to me that the design shares this same
"modified by processing in-place" assumption. It is great for processing
efficiency as long as you are a "run once and let exit(3) clean-up" type
of program, but is quite problematic otherwise. commit.flags that C-git
uses for traversal marker purposes, together with "who are parents and
children of this commit", should probably be kept inside traversal module,
if you want to make this truly reusable.
By the way, I hate git_result_t. That should be "int", the most natural
integral type on the platform.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:41 ` Shawn O. Pearce
@ 2008-10-31 23:56 ` Jakub Narebski
2008-11-01 0:41 ` david
1 sibling, 0 replies; 83+ messages in thread
From: Jakub Narebski @ 2008-10-31 23:56 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Nicolas Pitre, Pierre Habouzit, git, Scott Chacon
"Shawn O. Pearce" <spearce@spearce.org> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
> > Although no license asks this, my wish is that if somebody built on top of
> > what I wrote to make the world a better place, I'd like the same access to
> > that additional code so that I too can enjoy the improved world. Because
> > almost all of my code in git.git are under GPLv2, in reality I do not have
> > any access to your software as long as you do not distribute your
> > additional code that made the world a better place, which is a bit sad.
>
> IMHO, its a flaw of the GPL. GitHub anyone? Heck, even Google uses
> a lot of GPL'd software internally (yes, we have Linux desktops and
> servers) but not all of the software we distribute internally goes
> external, so not all of our patches are published. *sigh*
>
> I've actually stayed awake at night sometimes wondering what the
> world would be like if the GPL virual clause forced the source code
> for a website to be opened, or forced you to publish your code
> even if you never distribute binaries beyond "you" (where "you"
> is some mega corp in many countries with many employees).
There is such license, and it is called AGPLv3, Affero GPL[1].
And of course Google prohibits (or did prohibit) using it for projects
hosted at Google Code... wonder why... ;-)
[1] http://en.wikipedia.org/wiki/AGPL
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:49 ` Junio C Hamano
@ 2008-11-01 0:02 ` Pierre Habouzit
2008-11-01 0:19 ` Shawn O. Pearce
2008-11-01 0:13 ` Shawn O. Pearce
1 sibling, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 0:02 UTC (permalink / raw)
To: Junio C Hamano
Cc: Shawn O. Pearce, Pieter de Bie, Nicolas Pitre, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 1992 bytes --]
On Fri, Oct 31, 2008 at 11:49:11PM +0000, Junio C Hamano wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
>
> > You are the end-user of the library, so it needs to suit you. Ok,
> > you aren't the only end-user, but you and other developers like
> > you... :-)
>
> I will be the end-user of the library because if we want libgit2 to be
> anywhere close to successful, you should be able to port C-git to it.
>
> I understand that the apidocs/ is a very early work-in-progress, but
> still, it bothers me that it is unclear to me what lifetime rules are in
> effect on the in-core objects. For example, in C-git, commit objects are
> not just parsed but are modified in place as history is traversed
> (e.g. their flags smudged and their parents simplified). You have "flags"
> field in commit, which implies to me that the design shares this same
> "modified by processing in-place" assumption. It is great for processing
> efficiency as long as you are a "run once and let exit(3) clean-up" type
> of program, but is quite problematic otherwise. commit.flags that C-git
> uses for traversal marker purposes, together with "who are parents and
> children of this commit", should probably be kept inside traversal module,
> if you want to make this truly reusable.
I don't think it's impossible to have something efficient without this
kind of hacks. You just need to dissociate the objects from their
annotations, though use some kind of allocator that allow numbering of
the objects, and use that number as a lookup in an array of annotations.
It will require pool allocators for the annotations, but that should
work fine and efficientely.
> By the way, I hate git_result_t. That should be "int", the most natural
> integral type on the platform.
I concur.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:49 ` Junio C Hamano
2008-11-01 0:02 ` Pierre Habouzit
@ 2008-11-01 0:13 ` Shawn O. Pearce
2008-11-01 1:15 ` Nicolas Pitre
1 sibling, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 0:13 UTC (permalink / raw)
To: Junio C Hamano
Cc: Pieter de Bie, Pierre Habouzit, Nicolas Pitre, git, Scott Chacon
Junio C Hamano <gitster@pobox.com> wrote:
> I understand that the apidocs/ is a very early work-in-progress, but
> still, it bothers me that it is unclear to me what lifetime rules are in
> effect on the in-core objects.
Yes, this needs a lot more documentation.
> For example, in C-git, commit objects are
> not just parsed but are modified in place as history is traversed
> (e.g. their flags smudged and their parents simplified). You have "flags"
> field in commit, which implies to me that the design shares this same
> "modified by processing in-place" assumption.
Yup. I was assuming the same model, we modify in-place.
> It is great for processing
> efficiency as long as you are a "run once and let exit(3) clean-up" type
> of program, but is quite problematic otherwise. commit.flags that C-git
> uses for traversal marker purposes, together with "who are parents and
> children of this commit", should probably be kept inside traversal module,
> if you want to make this truly reusable.
Its not efficient to keep this data inside of the "traversal module"
instance (aka what I called git_revp_t). You really want it inside
of the commit itself (aka git_commit_t).
My thought here is that git_commit_t's are scoped within a given
git_revp_t that was used when they were parsed. That is:
git_revp_t *pool_a = git_revp_alloc(db, NULL);
git_revp_t *pool_b = git_revp_alloc(db, NULL);
git_oid_t id;
git_commit_t *commit_a, *commit_b, *commit_c;
git_oid_mkstr(&id, "3c223b36af9cace4f802a855fbb588b1dccf0648");
commit_a = git_commit_parse(pool_a, &id);
commit_b = git_commit_parse(pool_b, &id);
commit_c = git_commit_parse(pool_a, &id);
if (commit_a == commit_b)
die("the world just exploded");
else
printf("this was correct behavior\n");
if (commit_a == commit_c)
printf("this was correct behavior\n");
else
die("the hash table is broken");
To completely different git_revp_t's on the same database yeild
different commit pointers, but successive calls to parse the same
commit in the same pool yield the same pointer.
Certain operations on the pool can cause it to alter its state
in a way that cannot be reversed (e.g. rewrite parents). In such
cases the caller should free the pool and alloc a new one in order
to issue new traversals against the same object database, but with
the original (or differently rewritten) parent information.
> By the way, I hate git_result_t. That should be "int", the most natural
> integral type on the platform.
Yea, I'm torn on git_result_t myself. Some library APIs use their
own result type, but as a typedef off int.
I'm tempted to stick with int for the result type, but I don't
want readers to confuse our result type of 0 == success, <0 ==
failure with some case where we return a signed integral value as
a result of a computation.
I'm also debating the error handling. Do we return the error
code as the return value from the function, or do we stick it into
some sort of thread-global like classic "errno", or do we ask the
application to pass in a structure to us?
E.g.:
Return code:
git_result_t r = git_foo_bar(...);
if (r < 0)
die("foo_bar failed: %s", git_strerr(r));
Use an errno:
if (git_foo_bar(...))
die("foo_bar failed: %s", git_strerr(git_errno));
Use a caller allocated struct:
git_error_t err;
if (git_foo_bar(..., &err))
die("foo_bar failed: %s", git_strerr(&err));
I'm slightly leaning towards the result code approach, as it means
we don't have to mess around with thread local variables.
We don't get to pass back anything more complex than an int (possibly
losing context about parameter values and/or on-disk state we want
to report on), but we also don't have to deal with thread-locals
or some messy "always pass a thread context parameter".
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 0:02 ` Pierre Habouzit
@ 2008-11-01 0:19 ` Shawn O. Pearce
2008-11-01 1:02 ` Pierre Habouzit
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 0:19 UTC (permalink / raw)
To: Pierre Habouzit
Cc: Junio C Hamano, Pieter de Bie, Nicolas Pitre, git, Scott Chacon
Pierre Habouzit <madcoder@debian.org> wrote:
> On Fri, Oct 31, 2008 at 11:49:11PM +0000, Junio C Hamano wrote:
> >
> > I understand that the apidocs/ is a very early work-in-progress, but
> > still, it bothers me that it is unclear to me what lifetime rules are in
> > effect on the in-core objects. For example, in C-git, commit objects are
> > not just parsed but are modified in place as history is traversed
> > (e.g. their flags smudged and their parents simplified). You have "flags"
> > field in commit, which implies to me that the design shares this same
> > "modified by processing in-place" assumption.
>
> I don't think it's impossible to have something efficient without this
> kind of hacks. You just need to dissociate the objects from their
> annotations, though use some kind of allocator that allow numbering of
> the objects, and use that number as a lookup in an array of annotations.
> It will require pool allocators for the annotations, but that should
> work fine and efficientely.
Interesting approach. I don't know why I didn't think of that one.
You'll still need to be able to toss parts of the git graph though.
If you just pin everything in memory under a single global object
table you'll run server processes out of memory as they chug through
large numbers of repositories.
> > By the way, I hate git_result_t. That should be "int", the most natural
> > integral type on the platform.
>
> I concur.
int it is then.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 23:41 ` Shawn O. Pearce
2008-10-31 23:56 ` Jakub Narebski
@ 2008-11-01 0:41 ` david
2008-11-01 1:00 ` Shawn O. Pearce
2008-11-01 1:06 ` Pierre Habouzit
1 sibling, 2 replies; 83+ messages in thread
From: david @ 2008-11-01 0:41 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Nicolas Pitre, Pierre Habouzit, git, Scott Chacon
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
>>
>>
>> I.e. use the supplied custom function to do proprietary magic, such as
>> reading the object lazily from elsewhere over the network. And we will
>> never get that magic bit back.
>
> As a maintainer I'd never accept such a patch. I'd ask for the
> code under read_object_custom, or toss the patch on the floor.
> But that doesn't stop them from distributing the patched sources
> like above, keeping the fun bits in the closed source portion of
> the executable they distribute.
>
> Maybe I just think too highly of the other guy, but I'd hope that
> anyone patching libgit2 like above would try to avoid it, because
> they'd face merge issues in the future.
the issue that I see is that libgit2 will be (on most systems) a shared
library.
what's to stop someone from taking the libgit2 code, adding the magic
proprietary piece, and selling a new libgit2 library binary 'just replace
your existing shared library with this new one and all your git related
programs gain this feature'
they would only face merge issues if they need to keep up to date with
you, and git makes it pretty easy to maintain a fork if you only have to
do one-way merging (rere)
David Lang
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 0:41 ` david
@ 2008-11-01 1:00 ` Shawn O. Pearce
2008-11-01 1:04 ` david
2008-11-01 1:06 ` Pierre Habouzit
1 sibling, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 1:00 UTC (permalink / raw)
To: david; +Cc: git
david@lang.hm wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>> Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> I.e. use the supplied custom function to do proprietary magic, such as
>>> reading the object lazily from elsewhere over the network. And we will
>>> never get that magic bit back.
>>
>> Maybe I just think too highly of the other guy, but I'd hope that
>> anyone patching libgit2 like above would try to avoid it, because
>> they'd face merge issues in the future.
>
> the issue that I see is that libgit2 will be (on most systems) a shared
> library.
>
> what's to stop someone from taking the libgit2 code, adding the magic
> proprietary piece, and selling a new libgit2 library binary 'just replace
> your existing shared library with this new one and all your git related
> programs gain this feature'
True. The only thing that prevents that is the normal GPL. The
LGPL and GPL+"gcc exception" allow this sort of mean behavior.
I doubt there's enough of a market for that; replacing a library
is something of a pain and if the feature really is interesting or
useful someone will write a clean-room re-implementation and submit
patches to do the same thing.
> they would only face merge issues if they need to keep up to date with
> you, and git makes it pretty easy to maintain a fork if you only have to
> do one-way merging (rere)
In other words, we're too good for our own good. ;-)
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 0:19 ` Shawn O. Pearce
@ 2008-11-01 1:02 ` Pierre Habouzit
0 siblings, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 1:02 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Pieter de Bie, Nicolas Pitre, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 5028 bytes --]
On Sat, Nov 01, 2008 at 12:19:26AM +0000, Shawn O. Pearce wrote:
> Pierre Habouzit <madcoder@debian.org> wrote:
> > On Fri, Oct 31, 2008 at 11:49:11PM +0000, Junio C Hamano wrote:
> > >
> > > I understand that the apidocs/ is a very early work-in-progress, but
> > > still, it bothers me that it is unclear to me what lifetime rules are in
> > > effect on the in-core objects. For example, in C-git, commit objects are
> > > not just parsed but are modified in place as history is traversed
> > > (e.g. their flags smudged and their parents simplified). You have "flags"
> > > field in commit, which implies to me that the design shares this same
> > > "modified by processing in-place" assumption.
> >
> > I don't think it's impossible to have something efficient without this
> > kind of hacks. You just need to dissociate the objects from their
> > annotations, though use some kind of allocator that allow numbering of
> > the objects, and use that number as a lookup in an array of annotations.
> > It will require pool allocators for the annotations, but that should
> > work fine and efficientely.
>
> Interesting approach. I don't know why I didn't think of that one.
>
> You'll still need to be able to toss parts of the git graph though.
> If you just pin everything in memory under a single global object
> table you'll run server processes out of memory as they chug through
> large numbers of repositories.
Sure, but for that you just need to reinject the numbers into some kind
of free list (hint a bitmap) to reuse old slots. Of course this is some
kind of take-once never-release approach _BUT_ one can do better and
"defrag" this at times.
E.g. for a server if we take your idea, there are some times we probably
*know* nobody has kept a reference to one of the pointers and we can
reorganize some pointers around to free chunks of data not allocated.
An escape way is to use mmap + madvise for those pools, the former to
allocate the memory and the latter to drop large unused ranges when
needed (remapping with MAP_FIXED is also supposed to work but fails on
Macos it seems). Even win32 has what it takes to do so (just skim
through the code of jemalloc, that is used for mozilla, it's quite
portable on POSIX + Windows).
I was thinking that one should create and register pools of annotations
as such, define the size of the annotation (IOW the size of cell), and
let deal with that. I imagine the stuff as some kind of allocator that
would allocate the first time e.g. 4k of objects, and if you need more
8k, and if need more 16k and so on exponentially. Mapping an integer to
a cell in this kind of ropes is _really_ efficient: you need to know the
first bit set (__builtin_clz is of help with gcc, it can be emulated
with proper asm for many non-gcc platforms also) to give you the number
of the rope component that you intend to address, you clear that bit,
the resulting number is the index in that rope component of the cell
you're interested in. On most machines such an operation would be a few
cycles, which should be fairly little wrt the operation you would do
during a traversal.
Maintaining the bitmap is important, so that we can give back large
chunks of physical memory back to the system, and many malloc
implementations would likely use such kind of implementations, we would
just do our (which is arguably heavy) but gives us control on the cell
number, which is just what we need.
Of _course_ when we have a better / more natural place to put
annotations we need, let's do it ! But the kind of thing I just imagined
is more generic and can help to do arbitrary complex stuff during a
traversal.
We may want to explore a storage that is aware of the objects type also,
as I expect the liveness of objects to depend on the objects type a lot,
and some traversal to not care about some kind of objects at all, which
when combined with lazy allocation in the annotations pools, could end
up with reduced memory footprint. We could e.g. use two bits of the
"object handler" to store the type and dispatch into 3 (commit, tree,
blob) ropes instead of a big one.
The object and annotations store _is_ what makes git efficient, and is
the very service the library will have to support well. We _will_ have
to put some clever code in there and we cannot sacrifice any kind of
performance, this will be the tight loop every time. I don't think
there will be anything else in git that is so central for performance.
On the other hand, it's probably ok for the first versions of the
library to have some things in it that don't perform well in long lived
processes (repacking e.g., if we ever want to do that in the library, as
it's slow and that forking a git-repack away should not be a too
expensive cost anyway).
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:00 ` Shawn O. Pearce
@ 2008-11-01 1:04 ` david
2008-11-01 1:08 ` Pierre Habouzit
0 siblings, 1 reply; 83+ messages in thread
From: david @ 2008-11-01 1:04 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> david@lang.hm wrote:
>> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>>> Junio C Hamano <gitster@pobox.com> wrote:
>>>>
>>>> I.e. use the supplied custom function to do proprietary magic, such as
>>>> reading the object lazily from elsewhere over the network. And we will
>>>> never get that magic bit back.
>>>
>>> Maybe I just think too highly of the other guy, but I'd hope that
>>> anyone patching libgit2 like above would try to avoid it, because
>>> they'd face merge issues in the future.
>>
>> the issue that I see is that libgit2 will be (on most systems) a shared
>> library.
>>
>> what's to stop someone from taking the libgit2 code, adding the magic
>> proprietary piece, and selling a new libgit2 library binary 'just replace
>> your existing shared library with this new one and all your git related
>> programs gain this feature'
>
> True. The only thing that prevents that is the normal GPL. The
> LGPL and GPL+"gcc exception" allow this sort of mean behavior.
> I doubt there's enough of a market for that; replacing a library
> is something of a pain and if the feature really is interesting or
> useful someone will write a clean-room re-implementation and submit
> patches to do the same thing.
how would the LGPL of GPL+gcc extention allow this? if they modify the
code in the library and then distribute the modified library wouldn't they
be required to distribute the changes to that library?
they could use the LGPL or GPL+exception library with their propriatary
program, but I don't see how they could get away with modifying the
library.
David Lang
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 0:41 ` david
2008-11-01 1:00 ` Shawn O. Pearce
@ 2008-11-01 1:06 ` Pierre Habouzit
2008-11-01 1:36 ` david
1 sibling, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 1:06 UTC (permalink / raw)
To: david; +Cc: Shawn O. Pearce, Junio C Hamano, Nicolas Pitre, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 2074 bytes --]
On Sat, Nov 01, 2008 at 12:41:22AM +0000, david@lang.hm wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> >Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >>
> >>I.e. use the supplied custom function to do proprietary magic, such as
> >>reading the object lazily from elsewhere over the network. And we will
> >>never get that magic bit back.
> >
> >As a maintainer I'd never accept such a patch. I'd ask for the
> >code under read_object_custom, or toss the patch on the floor.
> >But that doesn't stop them from distributing the patched sources
> >like above, keeping the fun bits in the closed source portion of
> >the executable they distribute.
> >
> >Maybe I just think too highly of the other guy, but I'd hope that
> >anyone patching libgit2 like above would try to avoid it, because
> >they'd face merge issues in the future.
>
> the issue that I see is that libgit2 will be (on most systems) a shared
> library.
>
> what's to stop someone from taking the libgit2 code, adding the magic
> proprietary piece, and selling a new libgit2 library binary 'just replace
> your existing shared library with this new one and all your git related
> programs gain this feature'
Its license. GPL even with GCC exception would not allow you to do that.
Though they could propose a fork of the library patched, with the patch
distributed. The downside would be that their code would not be binary
compatible with the "true" libgit2, so they would probably have to
change the name to avoid namespace clashes, or overwrite the "real"
library.
But yes, it's theoretically feasible. I'm not sure it would be worth the
hassle, and if they respect the license (if they don't they can already
do that with the current git anyway) then the fact that someone would
want to do something like that would be known fact, probably not
avoided, but known.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:04 ` david
@ 2008-11-01 1:08 ` Pierre Habouzit
2008-11-01 1:33 ` Nicolas Pitre
0 siblings, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 1:08 UTC (permalink / raw)
To: david; +Cc: Shawn O. Pearce, git
[-- Attachment #1: Type: text/plain, Size: 2225 bytes --]
On Sat, Nov 01, 2008 at 01:04:50AM +0000, david@lang.hm wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> >david@lang.hm wrote:
> >>On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> >>>Junio C Hamano <gitster@pobox.com> wrote:
> >>>>
> >>>>I.e. use the supplied custom function to do proprietary magic, such
> >>>>as
> >>>>reading the object lazily from elsewhere over the network. And we
> >>>>will
> >>>>never get that magic bit back.
> >>>
> >>>Maybe I just think too highly of the other guy, but I'd hope that
> >>>anyone patching libgit2 like above would try to avoid it, because
> >>>they'd face merge issues in the future.
> >>
> >>the issue that I see is that libgit2 will be (on most systems) a shared
> >>library.
> >>
> >>what's to stop someone from taking the libgit2 code, adding the magic
> >>proprietary piece, and selling a new libgit2 library binary 'just
> >>replace
> >>your existing shared library with this new one and all your git related
> >>programs gain this feature'
> >
> >True. The only thing that prevents that is the normal GPL. The
> >LGPL and GPL+"gcc exception" allow this sort of mean behavior.
> >I doubt there's enough of a market for that; replacing a library
> >is something of a pain and if the feature really is interesting or
> >useful someone will write a clean-room re-implementation and submit
> >patches to do the same thing.
>
> how would the LGPL of GPL+gcc extention allow this? if they modify the
> code in the library and then distribute the modified library wouldn't
> they be required to distribute the changes to that library?
See junio's example. It's rather easy to add hooks into the library to
implement a feature outside of it. It's even possible to do it while
preserving the ABI fully IMHO (by being a strict superset of it).
The patch would be so trivial, that I see no reason why they wouldn't
provide it. Though the real implementation of the feature that would be
delegated through it would be in their closed source stuff.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 0:13 ` Shawn O. Pearce
@ 2008-11-01 1:15 ` Nicolas Pitre
2008-11-01 1:19 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 1:15 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Pieter de Bie, Pierre Habouzit, git, Scott Chacon
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> I'm tempted to stick with int for the result type, but I don't
> want readers to confuse our result type of 0 == success, <0 ==
> failure with some case where we return a signed integral value as
> a result of a computation.
Does this actually happen ?
> I'm also debating the error handling. Do we return the error
> code as the return value from the function, or do we stick it into
> some sort of thread-global like classic "errno", or do we ask the
> application to pass in a structure to us?
Passing a structure pointer for errors is adding overhead to the API in
all cases, please don't do that.
Both the negative code and errno style are lightweight in the common "no
error" case. The errno style is probably more handy for those functions
returning a pointer which should be NULL in the error case.
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:15 ` Nicolas Pitre
@ 2008-11-01 1:19 ` Shawn O. Pearce
2008-11-01 1:45 ` Nicolas Pitre
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 1:19 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Junio C Hamano, Pieter de Bie, Pierre Habouzit, git, Scott Chacon
Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> > I'm tempted to stick with int for the result type, but I don't
> > want readers to confuse our result type of 0 == success, <0 ==
> > failure with some case where we return a signed integral value as
> > a result of a computation.
>
> Does this actually happen ?
Sometimes, but yea, its not that often. I think we've already
settled that it shall be "int".
I'm updating some stuff like that (and dropping _t) as I write
this email.
> > I'm also debating the error handling. Do we return the error
> > code as the return value from the function, or do we stick it into
> > some sort of thread-global like classic "errno", or do we ask the
> > application to pass in a structure to us?
>
> Passing a structure pointer for errors is adding overhead to the API in
> all cases, please don't do that.
Agreed. Not going there.
> Both the negative code and errno style are lightweight in the common "no
> error" case. The errno style is probably more handy for those functions
> returning a pointer which should be NULL in the error case.
I'm sticking with return a negative code for now, to the extent
that some functions which return a pointer but also have many
common failure modes (e.g. git_odb_open) use an output parameter
as their first arg, so the error code can be returned as the result
of the function.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:08 ` Pierre Habouzit
@ 2008-11-01 1:33 ` Nicolas Pitre
2008-11-01 1:38 ` Pierre Habouzit
2008-11-01 1:43 ` Shawn O. Pearce
0 siblings, 2 replies; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 1:33 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: david, Shawn O. Pearce, git
On Sat, 1 Nov 2008, Pierre Habouzit wrote:
> See junio's example. It's rather easy to add hooks into the library to
> implement a feature outside of it. It's even possible to do it while
> preserving the ABI fully IMHO (by being a strict superset of it).
>
> The patch would be so trivial, that I see no reason why they wouldn't
> provide it. Though the real implementation of the feature that would be
> delegated through it would be in their closed source stuff.
But at that point it's a matter of public perception. It would clearly
be against the spirit of the license even though the license itself
couldn't prevent such tortuous practices. Those people doing such
things would clearly be identified as bad guys and get bad press, and
libgit contributors could even attempt law suits based on the derived
work angle. That might be just enough to prevent such things to happen.
OTOH they would certainly come out clean if the license was BSD since
the spirit of that license explicitly allows closing up the whole
library and adding extra features.
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:06 ` Pierre Habouzit
@ 2008-11-01 1:36 ` david
0 siblings, 0 replies; 83+ messages in thread
From: david @ 2008-11-01 1:36 UTC (permalink / raw)
To: Pierre Habouzit
Cc: Shawn O. Pearce, Junio C Hamano, Nicolas Pitre, git, Scott Chacon
On Sat, 1 Nov 2008, Pierre Habouzit wrote:
> On Sat, Nov 01, 2008 at 12:41:22AM +0000, david@lang.hm wrote:
>> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>>
>>> Junio C Hamano <gitster@pobox.com> wrote:
>>>>
>>>>
>>>> I.e. use the supplied custom function to do proprietary magic, such as
>>>> reading the object lazily from elsewhere over the network. And we will
>>>> never get that magic bit back.
>>>
>>> As a maintainer I'd never accept such a patch. I'd ask for the
>>> code under read_object_custom, or toss the patch on the floor.
>>> But that doesn't stop them from distributing the patched sources
>>> like above, keeping the fun bits in the closed source portion of
>>> the executable they distribute.
>>>
>>> Maybe I just think too highly of the other guy, but I'd hope that
>>> anyone patching libgit2 like above would try to avoid it, because
>>> they'd face merge issues in the future.
>>
>> the issue that I see is that libgit2 will be (on most systems) a shared
>> library.
>>
>> what's to stop someone from taking the libgit2 code, adding the magic
>> proprietary piece, and selling a new libgit2 library binary 'just replace
>> your existing shared library with this new one and all your git related
>> programs gain this feature'
>
> Its license. GPL even with GCC exception would not allow you to do that.
> Though they could propose a fork of the library patched, with the patch
> distributed. The downside would be that their code would not be binary
> compatible with the "true" libgit2, so they would probably have to
> change the name to avoid namespace clashes, or overwrite the "real"
> library.
the proposal had been for the library to be BSD, not GPL. and I don't see
any reason why such a thing couldn't be done with a BSE library.
David Lang
> But yes, it's theoretically feasible. I'm not sure it would be worth the
> hassle, and if they respect the license (if they don't they can already
> do that with the current git anyway) then the fact that someone would
> want to do something like that would be known fact, probably not
> avoided, but known.
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:33 ` Nicolas Pitre
@ 2008-11-01 1:38 ` Pierre Habouzit
2008-11-01 1:49 ` Nicolas Pitre
2008-11-01 1:43 ` Shawn O. Pearce
1 sibling, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 1:38 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: david, Shawn O. Pearce, git
[-- Attachment #1: Type: text/plain, Size: 592 bytes --]
On Sat, Nov 01, 2008 at 01:33:51AM +0000, Nicolas Pitre wrote:
> OTOH they would certainly come out clean if the license was BSD since
> the spirit of that license explicitly allows closing up the whole
> library and adding extra features.
I think it's pretty clear most contributor from git.git would be driven
away if the license isn't GPLish. So I was basing my answer on a
supposition it would be.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:33 ` Nicolas Pitre
2008-11-01 1:38 ` Pierre Habouzit
@ 2008-11-01 1:43 ` Shawn O. Pearce
2008-11-01 1:53 ` Nicolas Pitre
1 sibling, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 1:43 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Pierre Habouzit, david, git
Nicolas Pitre <nico@cam.org> wrote:
> On Sat, 1 Nov 2008, Pierre Habouzit wrote:
>
> > See junio's example. It's rather easy to add hooks into the library to
> > implement a feature outside of it. It's even possible to do it while
> > preserving the ABI fully IMHO (by being a strict superset of it).
> >
> > The patch would be so trivial, that I see no reason why they wouldn't
> > provide it. Though the real implementation of the feature that would be
> > delegated through it would be in their closed source stuff.
>
> But at that point it's a matter of public perception. It would clearly
> be against the spirit of the license even though the license itself
> couldn't prevent such tortuous practices. Those people doing such
> things would clearly be identified as bad guys and get bad press, and
> libgit contributors could even attempt law suits based on the derived
> work angle. That might be just enough to prevent such things to happen.
Yes.
> OTOH they would certainly come out clean if the license was BSD since
> the spirit of that license explicitly allows closing up the whole
> library and adding extra features.
My take on the consensus for the license part of the discussion is
that libgit2 should be under the "GPL gcc library" license.
BTW, I can't actually find a copy of that license; the only thing
I can locate in the GCC SVN tree is a copy of the LGPL.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:19 ` Shawn O. Pearce
@ 2008-11-01 1:45 ` Nicolas Pitre
2008-11-01 1:52 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 1:45 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Pieter de Bie, Pierre Habouzit, git, Scott Chacon
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> > Both the negative code and errno style are lightweight in the common "no
> > error" case. The errno style is probably more handy for those functions
> > returning a pointer which should be NULL in the error case.
>
> I'm sticking with return a negative code for now, to the extent
> that some functions which return a pointer but also have many
> common failure modes (e.g. git_odb_open) use an output parameter
> as their first arg, so the error code can be returned as the result
> of the function.
Actually, the pointer-returning functions can encode error cases into a
"negative" pointer. See include/linux/err.h for example.
void *ptr = git_alloc_foo(...);
if (IS_ERR(ptr))
die("git_alloc_foo failed: %s", git_strerr(PTR_ERR(ptr)));
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:38 ` Pierre Habouzit
@ 2008-11-01 1:49 ` Nicolas Pitre
0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 1:49 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: david, Shawn O. Pearce, git
On Sat, 1 Nov 2008, Pierre Habouzit wrote:
> On Sat, Nov 01, 2008 at 01:33:51AM +0000, Nicolas Pitre wrote:
> > OTOH they would certainly come out clean if the license was BSD since
> > the spirit of that license explicitly allows closing up the whole
> > library and adding extra features.
>
> I think it's pretty clear most contributor from git.git would be driven
> away if the license isn't GPLish. So I was basing my answer on a
> supposition it would be.
Sure. My point is the difference in guiltiness. Even if a LGPL license
doesn't prevent all abuses, it at least allows for public denunciations
at the very least.
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:45 ` Nicolas Pitre
@ 2008-11-01 1:52 ` Shawn O. Pearce
2008-11-01 2:26 ` Johannes Schindelin
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 1:52 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Junio C Hamano, Pieter de Bie, Pierre Habouzit, git, Scott Chacon
Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> > > Both the negative code and errno style are lightweight in the common "no
> > > error" case. The errno style is probably more handy for those functions
> > > returning a pointer which should be NULL in the error case.
> >
> > I'm sticking with return a negative code for now, to the extent
> > that some functions which return a pointer but also have many
> > common failure modes (e.g. git_odb_open) use an output parameter
> > as their first arg, so the error code can be returned as the result
> > of the function.
>
> Actually, the pointer-returning functions can encode error cases into a
> "negative" pointer. See include/linux/err.h for example.
>
> void *ptr = git_alloc_foo(...);
> if (IS_ERR(ptr))
> die("git_alloc_foo failed: %s", git_strerr(PTR_ERR(ptr)));
Oh, good point. We could also stagger the errors so they are
always odd, and never return an odd-alignment pointer from a
successful function. Thus IS_ERR can be written as:
#define IS_ERR(ptr) (((intptr_t)(ptr)) & 1)
which is quite cheap, and given the (probably required anyway)
aligned allocation policy means we still have 2^31 possible
error codes available.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:43 ` Shawn O. Pearce
@ 2008-11-01 1:53 ` Nicolas Pitre
2008-11-01 22:57 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 1:53 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Pierre Habouzit, david, git
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> My take on the consensus for the license part of the discussion is
> that libgit2 should be under the "GPL gcc library" license.
>
> BTW, I can't actually find a copy of that license; the only thing
> I can locate in the GCC SVN tree is a copy of the LGPL.
The exception is usually found at the top of files constituting
libgcc.a. One example is gcc/config/arm/ieee754-df.S. ;-)
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:52 ` Shawn O. Pearce
@ 2008-11-01 2:26 ` Johannes Schindelin
2008-11-01 11:01 ` Pierre Habouzit
0 siblings, 1 reply; 83+ messages in thread
From: Johannes Schindelin @ 2008-11-01 2:26 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Nicolas Pitre, Junio C Hamano, Pieter de Bie, Pierre Habouzit,
git, Scott Chacon
Hi,
On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> Nicolas Pitre <nico@cam.org> wrote:
> > On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> >
> > > > Both the negative code and errno style are lightweight in the
> > > > common "no error" case. The errno style is probably more handy
> > > > for those functions returning a pointer which should be NULL in
> > > > the error case.
Unfortunately, errno would not be thread-safe, unless you can guarantee
that errno is a thread-local variable.
> > >
> > > I'm sticking with return a negative code for now, to the extent that
> > > some functions which return a pointer but also have many common
> > > failure modes (e.g. git_odb_open) use an output parameter as their
> > > first arg, so the error code can be returned as the result of the
> > > function.
> >
> > Actually, the pointer-returning functions can encode error cases into a
> > "negative" pointer. See include/linux/err.h for example.
> >
> > void *ptr = git_alloc_foo(...);
> > if (IS_ERR(ptr))
> > die("git_alloc_foo failed: %s", git_strerr(PTR_ERR(ptr)));
>
> Oh, good point. We could also stagger the errors so they are
> always odd, and never return an odd-alignment pointer from a
> successful function. Thus IS_ERR can be written as:
>
> #define IS_ERR(ptr) (((intptr_t)(ptr)) & 1)
>
> which is quite cheap, and given the (probably required anyway)
> aligned allocation policy means we still have 2^31 possible
> error codes available.
Oh boy, both solutions are ugly as hell. Although the &1 method does not
limit the memory space as much (except if you plan to work in
space-contrained environments, where you do not want to be forced to
word-align structs).
The only pointer game I would remotely consider clean is if you had
const char *errors[] = {
...
};
inline int is_error(void *ptr) {
return ptr >= errors && ptr < errors + ARRAY_SIZE(errors);
}
although that would be less performant.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 22:10 ` Nicolas Pitre
@ 2008-11-01 10:52 ` Andreas Ericsson
0 siblings, 0 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-01 10:52 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Pierre Habouzit, Shawn O. Pearce, git, Scott Chacon
Nicolas Pitre wrote:
> On Fri, 31 Oct 2008, Pierre Habouzit wrote:
>
>> Git is currently mostly "GPLv2 or later". A BSDish license was
>> mentioned, because it's the most permissive one and that nobody cared
>> that much, though a LGPL/GPL-with-GCC-exception would probably fly.
>
> I do care. I think the BSD license is too permissive. There are really
> nifty pieces of code in Git that I would be really sorry to see go
> proprietary.
>
I agree with Nicolas, for what it's worth. Although I realize my small
contributions to the git code can be trivially rewritten, I think that
will definitely be trickier with Nicolas' contributions.
>
>> OT: FWIW I prefer BSDish licenses (even the MIT actually) for libraries
>> because I believe that computing is overall better if everyone can use
>> the right tool for the task, and I don't want to prevent people from
>> using good stuff (I hope I write good stuff ;P) because of the license.
>
> Everybody can and does link against glibc on Linux which is LGPL. So
> that doesn't affect "usage".
>
For dynamic libraries, yes, as you can replace them in-flight if you need
to. For static libraries, it's a different matter. I think the gcc exception
is rather special, as parts of gcc is the dynamic linker initialization code,
which has to be shipped in object form and included in all executables
produced with gcc. Perhaps we'd be better off asking one of EFF's lawyers
what each license means, exactly.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 2:26 ` Johannes Schindelin
@ 2008-11-01 11:01 ` Pierre Habouzit
2008-11-01 13:50 ` Nicolas Pitre
0 siblings, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 11:01 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Shawn O. Pearce, Nicolas Pitre, Junio C Hamano, Pieter de Bie,
git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 4088 bytes --]
On Sat, Nov 01, 2008 at 02:26:45AM +0000, Johannes Schindelin wrote:
> Hi,
>
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> > Nicolas Pitre <nico@cam.org> wrote:
> > > On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
> > >
> > > > > Both the negative code and errno style are lightweight in the
> > > > > common "no error" case. The errno style is probably more handy
> > > > > for those functions returning a pointer which should be NULL in
> > > > > the error case.
>
> Unfortunately, errno would not be thread-safe, unless you can guarantee
> that errno is a thread-local variable.
Well, TLS afaict is implemented on arches that have GNU ld, or on win32
with a recent enough mingw. Though this is quite a requirement.
> > Oh, good point. We could also stagger the errors so they are
> > always odd, and never return an odd-alignment pointer from a
> > successful function. Thus IS_ERR can be written as:
> >
> > #define IS_ERR(ptr) (((intptr_t)(ptr)) & 1)
> >
> > which is quite cheap, and given the (probably required anyway)
> > aligned allocation policy means we still have 2^31 possible
> > error codes available.
>
> Oh boy, both solutions are ugly as hell. Although the &1 method does not
> limit the memory space as much (except if you plan to work in
> space-contrained environments, where you do not want to be forced to
> word-align structs).
>
> The only pointer game I would remotely consider clean is if you had
>
> const char *errors[] = {
> ...
> };
>
> inline int is_error(void *ptr) {
> return ptr >= errors && ptr < errors + ARRAY_SIZE(errors);
> }
Well, you can't return _sanely_ an error through a pointer. The &1
method is broken as soon as you return a char* (there is an alignment
requirement for malloc, not for any pointer out there), hence shall not
be used, as it would not be the sole way to test for error.
Another option, that is _theorically_ not portable, but is ttbomk on all
the platforms we intend to support (IOW POSIX-ish and windows), is to
use "small" values of the pointers for errors. [NULL .. (void *)(PAGE_SIZE - 1)[
cannot exist, which gives us probably always 512 different errors, and
the test is ((uintptr_t)ptr < (PAGE_SIZE)) which is cheap. It's butt
ugly, but encoding errors into pointers is butt ugly in the first place.
Another option that is what I would prefer, would be for the use of
errno where it makes sense. E.g. if you want a function that fetches an
object, this is somehow what read(3) would look like on our store, more
or less, and errno's are enough. For the other functions where errno
cannot be used, I'm pretty sure we will always pass a handle to some
kind of libgit2 stuff, like the "repository" we're working on. The
_easiest_ way is to put the "last error" into that structure and use
that. I mean, if we want libgit2 to be useful for _everyone_ we *WILL*
have to pass a repository context around. I see almost no way around it.
And there, NULL means error, and if you want to know about the specific
error, git_repo_errno(&ctx) / git_repo_strerr(&ctx) is just easy.
Note: What is important is to be able to check for errors _fast_, I
don't think printing out the error and knowing which error it was would
be in the fast path, so it's less useful to have this information
immediately.
_My_ taste (but again, like the _t I would use what is there, and won't
make a fuss about it at all) is to see function return -1 or NULL for
errors, and "abuse" errno for system-like functions, or put the last
error into the context on which you're working (your "this" more or
less). We don't need a specific error context at all, because we already
have a repository context available.
I'm not really a fan of pointer semantics abuse (it's sometimes useful,
but as the public interface of a library, this is butt-ugly.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 22:01 ` Shawn O. Pearce
2008-10-31 22:51 ` Junio C Hamano
@ 2008-11-01 11:17 ` Andreas Ericsson
1 sibling, 0 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-01 11:17 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Pierre Habouzit, git, Scott Chacon
Shawn O. Pearce wrote:
> Andreas Ericsson <ae@op5.se> wrote:
>>> * proper public "stuff" naming (I e.g. realy like types names -- not
>>> struct or enum tags, that I don't really care -- ending with _t as
>>> it helps navigating source.
>> *_t types are reserved by POSIX for future implementations, so that's
>> a no-go (although I doubt POSIX will ever make types named git_*_t).
>
> Yikes. Anyone know where a concise list of the reserved names are?
>
The ones I know of off the top of my head are:
* Anything ending with _t (posix)
pthread_t, size_t, socklen_t, ...
* str/mem/? function name prefix NOT followed by an underscore (C lib)
strlen, strchr, memcmp, memcpy, ...
(strbuf_ stuff doesn't break this, as it contains an underscore in the
name - the underscore following it doesn't have to be immediate).
* Double underscore prefix (compiler / C lib)
__WORDSIZE, __cplusplus, ...
It's also a good idea to stay away from single underscore prefix for
generic-ish names, as some compilers/architectures abuse it extensively.
Prefixing all public functions and macros of the library with 'git_'
(lower-case for functions and function-like macros, upper-case for
the rest) will probably see us through safe. Exceptions can probably
be made for already completed API's, such as the arg-parsing stuff
and the strbuf code.
>> Apart from that, please consider reading Ulrich Drepper's musings on
>> library design at http://people.redhat.com/drepper/goodpractice.pdf
>
> I think I've read that before, but I'll skim over it again.
> Thanks for the link.
>
I swear by it at work, where I'm "the library guy". I'll make sure to
review stuff and can chip in code or thoughts to make the library fly
with a minimum amount of problems and maintenance burden.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 11:01 ` Pierre Habouzit
@ 2008-11-01 13:50 ` Nicolas Pitre
2008-11-01 17:01 ` Pierre Habouzit
2008-11-01 20:26 ` Johannes Schindelin
0 siblings, 2 replies; 83+ messages in thread
From: Nicolas Pitre @ 2008-11-01 13:50 UTC (permalink / raw)
To: Pierre Habouzit
Cc: Johannes Schindelin, Shawn O. Pearce, Junio C Hamano,
Pieter de Bie, git, Scott Chacon
On Sat, 1 Nov 2008, Pierre Habouzit wrote:
> Well, you can't return _sanely_ an error through a pointer. The &1
> method is broken as soon as you return a char* (there is an alignment
> requirement for malloc, not for any pointer out there), hence shall not
> be used, as it would not be the sole way to test for error.
>
> Another option, that is _theorically_ not portable, but is ttbomk on all
> the platforms we intend to support (IOW POSIX-ish and windows), is to
> use "small" values of the pointers for errors. [NULL .. (void *)(PAGE_SIZE - 1)[
> cannot exist, which gives us probably always 512 different errors, and
4095 actually. You don't need to align error codes.
> the test is ((uintptr_t)ptr < (PAGE_SIZE)) which is cheap. It's butt
> ugly, but encoding errors into pointers is butt ugly in the first place.
Or use "negative" pointers. Again, please have a look at
include/linux/err.h. The pointer range from 0xffffffff (or
0xffffffffffffffff on 64-bit machines) down to the range you want is for
errors, and the top of the address range is almost certain to never be
valid in user space either.
Nicolas
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 13:50 ` Nicolas Pitre
@ 2008-11-01 17:01 ` Pierre Habouzit
2008-11-01 20:26 ` Johannes Schindelin
1 sibling, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 17:01 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Johannes Schindelin, Shawn O. Pearce, Junio C Hamano,
Pieter de Bie, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 1585 bytes --]
On Sat, Nov 01, 2008 at 01:50:45PM +0000, Nicolas Pitre wrote:
> On Sat, 1 Nov 2008, Pierre Habouzit wrote:
>
> > Well, you can't return _sanely_ an error through a pointer. The &1
> > method is broken as soon as you return a char* (there is an alignment
> > requirement for malloc, not for any pointer out there), hence shall not
> > be used, as it would not be the sole way to test for error.
> >
> > Another option, that is _theorically_ not portable, but is ttbomk on all
> > the platforms we intend to support (IOW POSIX-ish and windows), is to
> > use "small" values of the pointers for errors. [NULL .. (void *)(PAGE_SIZE - 1)[
> > cannot exist, which gives us probably always 512 different errors, and
>
> 4095 actually. You don't need to align error codes.
Sure, I'm just not sure there isn't an arch where a page would be 512
octets. And you really have 4096 errors, from 0 to 4095 *included* :)
> > the test is ((uintptr_t)ptr < (PAGE_SIZE)) which is cheap. It's butt
> > ugly, but encoding errors into pointers is butt ugly in the first place.
>
> Or use "negative" pointers. Again, please have a look at
> include/linux/err.h. The pointer range from 0xffffffff (or
> 0xffffffffffffffff on 64-bit machines) down to the range you want is for
> errors, and the top of the address range is almost certain to never be
> valid in user space either.
Indeed.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 18:41 ` Shawn O. Pearce
2008-10-31 18:54 ` Pierre Habouzit
2008-10-31 20:05 ` Junio C Hamano
@ 2008-11-01 17:30 ` Pierre Habouzit
2008-11-01 18:44 ` Andreas Ericsson
2008-11-02 1:56 ` Shawn O. Pearce
2 siblings, 2 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 17:30 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 1501 bytes --]
On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
> How about this?
>
> http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
FWIW I've read what you say about types, while this is good design to
make things abstract, accessors are slower _and_ disallow many
optimizations as it's a function call and that it may clobber all your
pointers values.
For types that _will_ be in the tight loops, we must make the types
explicit or it'll bite us hard performance-wise. I'm thinking what is
"struct object" or "struct commit" in git.git. It's likely that we will
loose a *lot* of those types are opaque.
struct object in git has not changed since 2006.06. struct commit hasn't
since 2005.04 if you ignore { unsigned int indegree; void *util; } that
if I'm correct are annotations, and is a problem we (I think) have to
address differently anyways (I gave my proposal on this, I'm eager to
hear about what other think on the subject). So if in git.git that _is_
a moving target we have had a 2 year old implementation for those types,
it's that they're pretty well like this.
It's IMNSHO on the matter that core structures of git _will_ have to be
made explicit. I'm thinking objects and their "subtypes" (commits,
trees, blobs). Maybe a couple of things on the same vein.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 17:30 ` Pierre Habouzit
@ 2008-11-01 18:44 ` Andreas Ericsson
2008-11-01 18:48 ` Pierre Habouzit
2008-11-01 20:29 ` Shawn O. Pearce
2008-11-02 1:56 ` Shawn O. Pearce
1 sibling, 2 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-01 18:44 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Shawn O. Pearce, git, Scott Chacon
Pierre Habouzit wrote:
> On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
>> How about this?
>>
>> http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
>
> FWIW I've read what you say about types, while this is good design to
> make things abstract, accessors are slower _and_ disallow many
> optimizations as it's a function call and that it may clobber all your
> pointers values.
>
Accessors are very nifty for one thing though; With a debugging flag,
you can use an accessor-function, while without that debugging flag you
can use a macro instead of a function. In other words, you use the
compiler as a sort of sanity-checker that you're only accessing the
variables through the proper macros.
This method introduces a bit of extra code (50% of which is always
dead) for each struct it's used on, but it makes debugging large-ish
pieces of software relatively simple, since access to all object types
is controlled through the use of macros.
> For types that _will_ be in the tight loops, we must make the types
> explicit or it'll bite us hard performance-wise. I'm thinking what is
> "struct object" or "struct commit" in git.git. It's likely that we will
> loose a *lot* of those types are opaque.
>
The last sentence doesn't parse. I assume you mean "if those types are..",
in which case it'll be solved by using accessor-macros and forward-declaring
the structs.
> struct object in git has not changed since 2006.06. struct commit hasn't
> since 2005.04 if you ignore { unsigned int indegree; void *util; } that
> if I'm correct are annotations, and is a problem we (I think) have to
> address differently anyways (I gave my proposal on this, I'm eager to
> hear about what other think on the subject). So if in git.git that _is_
> a moving target we have had a 2 year old implementation for those types,
> it's that they're pretty well like this.
>
> It's IMNSHO on the matter that core structures of git _will_ have to be
> made explicit. I'm thinking objects and their "subtypes" (commits,
> trees, blobs). Maybe a couple of things on the same vein.
>
I agree. "git_commit", "git_tree", "git_blob" and "git_tag" can almost
certainly be set in stone straight away.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 18:44 ` Andreas Ericsson
@ 2008-11-01 18:48 ` Pierre Habouzit
2008-11-01 20:29 ` Shawn O. Pearce
1 sibling, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-01 18:48 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Shawn O. Pearce, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 663 bytes --]
On Sat, Nov 01, 2008 at 06:44:12PM +0000, Andreas Ericsson wrote:
> Pierre Habouzit wrote:
> >For types that _will_ be in the tight loops, we must make the types
> >explicit or it'll bite us hard performance-wise. I'm thinking what is
> >"struct object" or "struct commit" in git.git. It's likely that we will
> >loose a *lot* of those types are opaque.
>
> The last sentence doesn't parse. I assume you mean "if those types
> are..",
This was a typo, indeed s/of/if/
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
` (2 preceding siblings ...)
2008-10-31 23:18 ` Bruno Santos
@ 2008-11-01 19:18 ` Andreas Ericsson
2008-11-01 20:42 ` Shawn O. Pearce
2008-11-08 13:26 ` Steve Frécinaux
4 siblings, 1 reply; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-01 19:18 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
Shawn O. Pearce wrote:
> During the GitTogether we were kicking around the idea of a ground-up
> implementation of a Git library. This may be easier than trying
> to grind down git.git into a library, as we aren't tied to any
> of the current global state baggage or the current die() based
> error handling.
>
> I've started an _extremely_ rough draft. The code compiles into a
> libgit.a but it doesn't even implement what it describes in the API,
> let alone a working Git implementation. Really what I'm trying to
> incite here is some discussion on what the API looks like.
>
> API Docs:
> http://www.spearce.org/projects/scm/libgit2/apidocs/html/modules.html
>
> Source Code Clone URL:
> http://www.spearce.org/projects/scm/libgit2/libgit2.git
>
Having looked briefly at the code, I've got a couple of comments:
* GIT_EXTERN() does nothing. Ever. It's noise and should be removed.
Instead it would be better to have GIT_PRIVATE(), which could
set visibility to "internal" or "hidden", meaning the symbol it's
attached to can be used for lookups when creating a shared library
but won't be usable from programs linking to that shared library
(visibility-attributes have zero effect on static libraries). At
least on all archs anyone really cares about.
* Prefixing the files themselves with git_ is useless and only leads
to developer frustration. I imagine we'd be installing any header
files in a git/ directory anyway, so we're gaining absolutely
nothing with the git_ prefix on source-files.
Apart from that, it seems you've been designing a lot rather than
trying to use the API to actually do something. It would, imo, be
a lot better to start development with adding functionality shared
between all programs and then expand further on that, such as
incorporating all functions needed for manipulating tags into the
library and then modify existing code to use the library to get
tag-ish things done. That would also mean that the library would
quickly get used by core git, as once a certain part of it is
complete patches can be fitted to the library rather than to the
current non-libish dying() functions.
I also think it's quite alright to not strive *too* hard to make
all functions thread-safe, as very few of them will actually need
that. It's unlikely that a user program will spawn one thread to
write a lot of tags while another is trying to parse them, for
example.
By adding an init routine that determines the workdir and the
gitdir, one could start using the library straight away.
int git_init(const char *db, const char *worktree)
{
if (git_set_db_dir(db))
return -1;
git_set_worktree((worktree))
return -1;
return 0;
}
and already you have a some few small helpers that are nifty to
to have around:
int git_is_gitdir(const char *path); /* returns 1 on success */
int git_has_gitdir(const char *path); /* returns 1 on success */
const char *git_mkpath(const char *fmt, ...)
This way one will notice rather quickly what's needed (making it
easy to keep a more-or-less public TODO available, with small stuff
on it for the most part), and one can then go look for it in the
existing git code and, if possible, convert stuff or, best case
scenario, steal it straight off so that more apps can benefit from
tried and tested code.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 13:50 ` Nicolas Pitre
2008-11-01 17:01 ` Pierre Habouzit
@ 2008-11-01 20:26 ` Johannes Schindelin
1 sibling, 0 replies; 83+ messages in thread
From: Johannes Schindelin @ 2008-11-01 20:26 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Pierre Habouzit, Shawn O. Pearce, Junio C Hamano, Pieter de Bie,
git, Scott Chacon
Hi,
On Sat, 1 Nov 2008, Nicolas Pitre wrote:
> Again, please have a look at include/linux/err.h. The pointer range
> from 0xffffffff (or 0xffffffffffffffff on 64-bit machines) down to the
> range you want is for errors, and the top of the address range is almost
> certain to never be valid in user space either.
How certain may I be of that assumption? Because an assumption it is.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 18:44 ` Andreas Ericsson
2008-11-01 18:48 ` Pierre Habouzit
@ 2008-11-01 20:29 ` Shawn O. Pearce
2008-11-01 21:58 ` Andreas Ericsson
1 sibling, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 20:29 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Pierre Habouzit, git, Scott Chacon
Andreas Ericsson <ae@op5.se> wrote:
> Pierre Habouzit wrote:
>> On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
>>> How about this?
>>>
>>> http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
>>
>> FWIW I've read what you say about types, while this is good design to
>> make things abstract, accessors are slower _and_ disallow many
>> optimizations as it's a function call and that it may clobber all your
>> pointers values.
True, accessors slow things down. But I'm not sure that the
accessors at the application level are going to be a huge problem.
Where the CPU time really matters is inside the tight loops of the
library, where we can expose the struct to ourselves, because if
the layout changes we'd be relinking the library anyway with the
updated object code.
I would rather stick with accessors right now. We could in the
future expose the structs and convert the accessors to macros or
inline functions in a future version of the ABI if performance is
really shown to be a problem here from the *application*.
Remember we are mostly talking about applications that are happy to
fork+exec git right now. A little accessor function call is *still*
faster than that fork call was.
>> struct object in git has not changed since 2006.06. struct commit hasn't
>> since 2005.04 if you ignore { unsigned int indegree; void *util; } that
>> if I'm correct are annotations, and is a problem we (I think) have to
>> address differently anyways (I gave my proposal on this, I'm eager to
>> hear about what other think on the subject). So if in git.git that _is_
>> a moving target we have had a 2 year old implementation for those types,
>> it's that they're pretty well like this.
>>
>> It's IMNSHO on the matter that core structures of git _will_ have to be
>> made explicit. I'm thinking objects and their "subtypes" (commits,
>> trees, blobs). Maybe a couple of things on the same vein.
>
> I agree. "git_commit", "git_tree", "git_blob" and "git_tag" can almost
> certainly be set in stone straight away.
Eh, I disagree here. In git.git today "struct commit" exposes its
buffer with the canonical commit encoding. Having that visible
wrecks what Nico and I were thinking about doing with pack v4 and
encoding commits in a non-canonical format when stored in packs.
Ditto with trees.
Because git.git code goes against that canonical buffer we cannot
easily insert pack v4 and test the improvements we want to make.
The refactoring required is one of the reasons we haven't done pack
v4 yet. _IF_ we really are going through this effort of building
a different API and shifting to its use in git.git I want to make
sure we at least initially leave the door open to make changes
without rewriting everything *again*.
Accessor functions can usually be inlined or macro'd away. But
they cannot be magically inserted by the compiler if they aren't
there in the first place. This isn't Python... :-)
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 19:18 ` Andreas Ericsson
@ 2008-11-01 20:42 ` Shawn O. Pearce
2008-11-02 2:30 ` Johannes Schindelin
` (2 more replies)
0 siblings, 3 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 20:42 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: git, Scott Chacon
Andreas Ericsson <ae@op5.se> wrote:
> Shawn O. Pearce wrote:
>> During the GitTogether we were kicking around the idea of a ground-up
>> implementation of a Git library.
>
> Having looked briefly at the code, I've got a couple of comments:
> * GIT_EXTERN() does nothing. Ever. It's noise and should be removed.
I feel the same way.
But I was also under the impression that the brilliant engineers
who work for Microsoft decided that on their platform special
annotations have to be inserted on functions that a DLL wants to
export to applications.
Hence any cross-platform library that I have seen annotates their
exported functions this way, with the macro being empty on POSIX
and expanding to some magic keyword on Microsoft's OS. I think it
goes between the return type and the function name too...
> Instead it would be better to have GIT_PRIVATE(),
I can see why you said this; needing GIT_PRIVATE() is a lot more
rare than needing GIT_EXTERN(). Only a handful of cross-module,
but private, functions are likely to exist, so it makes sense to
mark the smaller subset. But see above. *sigh*
> * Prefixing the files themselves with git_ is useless and only leads
> to developer frustration. I imagine we'd be installing any header
> files in a git/ directory anyway, so we're gaining absolutely
> nothing with the git_ prefix on source-files.
Yes, I realized that this morning. I plan on changing that mess
around so we have "include/git/oid.h" and library and application
code can use "#include <git/oid.h>". Library modules should just
be "src/oid.c" then.
> Apart from that, it seems you've been designing a lot rather than
> trying to use the API to actually do something.
I wanted to get a solid idea of what our API conventions should be,
before we started writing a lot of code around them. Part of the
problem with the git.git code is we don't have conventions that are
really suited for use in a shared library (assuming we even have
conventions in there) so we can't use that code as a library today.
> It would, imo, be
> a lot better to start development with adding functionality shared
> between all programs and then expand further on that, such as
> incorporating all functions needed for manipulating tags into the
> library and then modify existing code to use the library to get
> tag-ish things done.
Tags are mostly pointless. Its a tiny part of the code that isn't
that interesting to most people. And it requires object database
access anyway if you want to talk about parsing or reading a tag.
There's almost no point in a git library that can't read the on
disk object database, or write to it.
> I also think it's quite alright to not strive *too* hard to make
> all functions thread-safe, as very few of them will actually need
> that. It's unlikely that a user program will spawn one thread to
> write a lot of tags while another is trying to parse them, for
> example.
Oh really?
Maybe true for tags, just because they are such an unimportant part
of the git suite compared to everything else.
But right now I'm running a production system using a threaded server
process that is operating on Git repositories. Fortunately threads
suck less on Java than they do on POSIX, and we have a 100% pure
Java library available for Git.
It would be nice if a library created in the late part of 2008
recognized that threads exist, aren't going to disappear tomorrow,
and that consumers of libraries actually may need to run the library
within a threaded process.
Or are you one of those developers who think threads only exist
in the giant monolithic kernel land, and all user space should
be isolated process? I often wonder who such people can justify
the kernel address space being multi-threaded but userland being
stuck to single threaded applications. Oh, right, the kernel has
to go fast...
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 20:29 ` Shawn O. Pearce
@ 2008-11-01 21:58 ` Andreas Ericsson
2008-11-02 1:50 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-01 21:58 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Pierre Habouzit, git, Scott Chacon
Shawn O. Pearce wrote:
> Andreas Ericsson <ae@op5.se> wrote:
>> Pierre Habouzit wrote:
>>> On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
>>>> How about this?
>>>>
>>>> http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
>>> FWIW I've read what you say about types, while this is good design to
>>> make things abstract, accessors are slower _and_ disallow many
>>> optimizations as it's a function call and that it may clobber all your
>>> pointers values.
>
> True, accessors slow things down. But I'm not sure that the
> accessors at the application level are going to be a huge problem.
>
> Where the CPU time really matters is inside the tight loops of the
> library, where we can expose the struct to ourselves, because if
> the layout changes we'd be relinking the library anyway with the
> updated object code.
>
> I would rather stick with accessors right now. We could in the
> future expose the structs and convert the accessors to macros or
> inline functions in a future version of the ABI if performance is
> really shown to be a problem here from the *application*.
>
> Remember we are mostly talking about applications that are happy to
> fork+exec git right now. A little accessor function call is *still*
> faster than that fork call was.
>
>>> struct object in git has not changed since 2006.06. struct commit hasn't
>>> since 2005.04 if you ignore { unsigned int indegree; void *util; } that
>>> if I'm correct are annotations, and is a problem we (I think) have to
>>> address differently anyways (I gave my proposal on this, I'm eager to
>>> hear about what other think on the subject). So if in git.git that _is_
>>> a moving target we have had a 2 year old implementation for those types,
>>> it's that they're pretty well like this.
>>>
>>> It's IMNSHO on the matter that core structures of git _will_ have to be
>>> made explicit. I'm thinking objects and their "subtypes" (commits,
>>> trees, blobs). Maybe a couple of things on the same vein.
>> I agree. "git_commit", "git_tree", "git_blob" and "git_tag" can almost
>> certainly be set in stone straight away.
>
> Eh, I disagree here. In git.git today "struct commit" exposes its
> buffer with the canonical commit encoding. Having that visible
> wrecks what Nico and I were thinking about doing with pack v4 and
> encoding commits in a non-canonical format when stored in packs.
> Ditto with trees.
>
Err... isn't that backwards? Surely you want to store stuff in the
canonical format so you're forced to do as few translations as
possible? Or are you trying to speed up packing by skipping the
canonicalization part? If so, that would slow down reading (or
rather, presenting) the commits, wouldn't it?
> Because git.git code goes against that canonical buffer we cannot
> easily insert pack v4 and test the improvements we want to make.
> The refactoring required is one of the reasons we haven't done pack
> v4 yet. _IF_ we really are going through this effort of building
> a different API and shifting to its use in git.git I want to make
> sure we at least initially leave the door open to make changes
> without rewriting everything *again*.
>
Well, if macro usage is adhered to one wouldn't have to worry,
since the macro can just be rewritten with a function later (if,
for example, translation or some such happens to be required).
Older code linking to a newer library would work (assuming the
size of the commit object doesn't change anyway), but newer code
linking to an older library would not. Otoh, they wouldn't even
build unless they used the wrong header files, so there is nothing
to worry about there.
> Accessor functions can usually be inlined or macro'd away. But
> they cannot be magically inserted by the compiler if they aren't
> there in the first place. This isn't Python... :-)
>
What I meant was this (I'm a tad drunk, so read the spirit, not
the letter):
in "foo-api.h":
--%<--%<--
#ifdef BUILDING_FOR_DEPLOYING
#include "git_foo_decls.h"
# define git_foo_get_buf(git_foo) (git_foo->buf)
#else
#include "git_foo_fwd_decls.h"
extern const char *git_foo_get_buf(git_foo *foo);
#endif
--%<--%<--
foo.c
--%<--%<--
#include "foo-api.h"
#include "git_foo_decls.h"
#ifndef BUILDING_FOR_DEPLOYING
const char git_foo_get_buf(git_foo *foo)
{
return foo->buf;
}
/* other accessors go here */
#endif
/* rest of git_foo manipulators go here */
--%<--%<--
It's almost certainly not worth it for libgit2 though, as git@vger
provides a good review system.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 1:53 ` Nicolas Pitre
@ 2008-11-01 22:57 ` Shawn O. Pearce
2008-11-02 0:26 ` Scott Chacon
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-01 22:57 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Pierre Habouzit, david, git
Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>
> > My take on the consensus for the license part of the discussion is
> > that libgit2 should be under the "GPL gcc library" license.
> >
> > BTW, I can't actually find a copy of that license; the only thing
> > I can locate in the GCC SVN tree is a copy of the LGPL.
>
> The exception is usually found at the top of files constituting
> libgcc.a. One example is gcc/config/arm/ieee754-df.S. ;-)
Headers updated. Its now GPL+gcc library exception.
Not that the 5 lines of useful code there really needs copyright,
but hey, whatever.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 22:57 ` Shawn O. Pearce
@ 2008-11-02 0:26 ` Scott Chacon
2008-11-02 1:07 ` Scott Chacon
0 siblings, 1 reply; 83+ messages in thread
From: Scott Chacon @ 2008-11-02 0:26 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Nicolas Pitre, Pierre Habouzit, david, git
I'm sorry - why is that better than LGPL? Wouldn't it be better to
use a license that people have heard of rather than one that can't be
looked up or it's implications easily researched? What is this
affording the library that offsets the headaches of everyone trying to
figure out if they can use it or not?
Scott
On Sat, Nov 1, 2008 at 3:57 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Nicolas Pitre <nico@cam.org> wrote:
>> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>>
>> > My take on the consensus for the license part of the discussion is
>> > that libgit2 should be under the "GPL gcc library" license.
>> >
>> > BTW, I can't actually find a copy of that license; the only thing
>> > I can locate in the GCC SVN tree is a copy of the LGPL.
>>
>> The exception is usually found at the top of files constituting
>> libgcc.a. One example is gcc/config/arm/ieee754-df.S. ;-)
>
> Headers updated. Its now GPL+gcc library exception.
>
> Not that the 5 lines of useful code there really needs copyright,
> but hey, whatever.
>
> --
> Shawn.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 0:26 ` Scott Chacon
@ 2008-11-02 1:07 ` Scott Chacon
2008-11-02 1:36 ` Shawn O. Pearce
2008-11-02 5:09 ` David Brown
0 siblings, 2 replies; 83+ messages in thread
From: Scott Chacon @ 2008-11-02 1:07 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Nicolas Pitre, Pierre Habouzit, david, git
> On Sat, Nov 1, 2008 at 3:57 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
>> Nicolas Pitre <nico@cam.org> wrote:
>>> On Fri, 31 Oct 2008, Shawn O. Pearce wrote:
>>>
>>> > My take on the consensus for the license part of the discussion is
>>> > that libgit2 should be under the "GPL gcc library" license.
>>> >
>>> > BTW, I can't actually find a copy of that license; the only thing
>>> > I can locate in the GCC SVN tree is a copy of the LGPL.
>>>
>>> The exception is usually found at the top of files constituting
>>> libgcc.a. One example is gcc/config/arm/ieee754-df.S. ;-)
>>
>> Headers updated. Its now GPL+gcc library exception.
>>
>> Not that the 5 lines of useful code there really needs copyright,
>> but hey, whatever.
I guess my main concern is that if a company wanted to direct
resources at supporting Git in something (say, an editor or GUI or
whatnot), and that company is of _any_ size, they are going to have to
get their legal department to review this strange and almost totally
unused license - only knowing that it's barely different than GPL and
they know GPL will not fly. LGPL will likely be known to them and a
policy may already be in place.
Think about trying to incorporate this into something proprietary,
Shawn - how much of a pain is it going to be to get that license
reviewed in Google? However, LGPL I'm sure there is already a
reviewed policy. Now, since that may be a pain, time that Shawn could
have been spending being paid to work on the library is lost because
they can't use it, or it takes weeks/months to review it. That's my
concern.
I personally would rather see it BSD or something more permissive so
that no human has to waste even a second of their valuable time
figuring out if they can work with it or not, but I understand that
many people here are much more protective of their code. I simply
think that LGPL is a much more widely used and understood compromise
that affords nearly the same protectionism.
Scott
>>
>> --
>> Shawn.
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 1:07 ` Scott Chacon
@ 2008-11-02 1:36 ` Shawn O. Pearce
2008-11-02 5:09 ` David Brown
1 sibling, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-02 1:36 UTC (permalink / raw)
To: Scott Chacon; +Cc: Nicolas Pitre, Pierre Habouzit, david, git
Scott Chacon <schacon@gmail.com> wrote:
> > On Sat, Nov 1, 2008 at 3:57 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> >>
> >> Headers updated. Its now GPL+gcc library exception.
>
> I personally would rather see it BSD or something more permissive so
> that no human has to waste even a second of their valuable time
> figuring out if they can work with it or not, but I understand that
> many people here are much more protective of their code. I simply
> think that LGPL is a much more widely used and understood compromise
> that affords nearly the same protectionism.
Apparently BSD won't fly, as you have already seen on the list.
If we did put the library under a BSD license we'd lose some core
contributors. Or they at least wouldn't improve the library code,
even if git.git linked to it in the future. I don't want to lose
these folks.
IANAL, but from what I can tell the main difference between LGPL
and GPL+"gcc library exception" is that the LGPL requires that
the end-user must be able to relink the derived executable with
their own replacement library. The GPL+"gcc library exception"
makes no such requirement.
If you read the exception clause it practically makes the library
even easier to use commerically than the BSD license does, however
modifications to the library sources must still be distributed.
Isn't that actually somewhat close to the Mozilla Public License?
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 21:58 ` Andreas Ericsson
@ 2008-11-02 1:50 ` Shawn O. Pearce
2008-11-03 10:17 ` Andreas Ericsson
0 siblings, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-02 1:50 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Pierre Habouzit, git, Scott Chacon
Andreas Ericsson <ae@op5.se> wrote:
> Shawn O. Pearce wrote:
>>
>> Eh, I disagree here. In git.git today "struct commit" exposes its
>> buffer with the canonical commit encoding. Having that visible
>> wrecks what Nico and I were thinking about doing with pack v4 and
>> encoding commits in a non-canonical format when stored in packs.
>> Ditto with trees.
>
> Err... isn't that backwards?
No.
> Surely you want to store stuff in the
> canonical format so you're forced to do as few translations as
> possible?
No. We suspect that canonical format is harder to decompress and
parse during revision traversal. Other encodings in the pack file
may produce much faster runtime performance, and reduce page faults
(due to smaller pack sizes).
We hardly ever use the canonical format for actual output; most
output rips the canonical format apart and then formats the data
the way it was requested. If we have the data *already* parsed in
the pack its much faster to output.
> Or are you trying to speed up packing by skipping the
> canonicalization part?
Wrong; we're trying to speed up reading. Packing may go slower,
especially during the first conversion of v2->v4 for any given
repository, but packing is infrequent so the minor (if any) drop
in performance here is probably worth the reading performance gains.
> Well, if macro usage is adhered to one wouldn't have to worry,
> since the macro can just be rewritten with a function later (if,
> for example, translation or some such happens to be required).
> Older code linking to a newer library would work (assuming the
> size of the commit object doesn't change anyway),
You are assuming too much magic. If the older ABI used a macro
and the newer one (which supports pack v4) organized struct commit
differently and the user upgrades libgit2.so the older applications
just broke, horribly.
We know we want to do pack v4 in the near future. Or at least
experiment with it and see if it works. If it does, we don't
want to have to cause a major ABI breakage across all those newly
installed libgit2s... yikes.
I'm really in favor of accessor functions for the first version of
the library. They can always be converted to macros once someone
shows that their git visualizer program saves 10 ms on a 8,000 ms
render operation by avoiding accessor functions. I'd rather spend
our brain cycles optimizing the runtime and the in-core data so
we spend less time in our tight revision traversal loops.
Seriously. We make at least 10 or 11 function calls *per commit*
that comes out of get_revision(). If the formatting application is
really suffering from its 4 or 5 accessor function calls in order
to get that returned data, we probably should also be looking at
how we can avoid function cals in the library.
Oh, and even with 4 or 5 accessor functions per commit in the
application that is *still* better than the 10 or so calls the
application probably makes today scraping "git log --format=raw"
off a pipe and segment it into the different fields it needs.
Unless pipes in Linux somehow allow negative time warping with
CPU counters. Though on dual-core systems they might, since the
two processes can run on different cores. But oh, you didn't want
to worry about threading support too much in libgit2, so I guess
you also don't want to use multi-core systems.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 17:30 ` Pierre Habouzit
2008-11-01 18:44 ` Andreas Ericsson
@ 2008-11-02 1:56 ` Shawn O. Pearce
2008-11-02 9:25 ` Pierre Habouzit
1 sibling, 1 reply; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-02 1:56 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: git, Scott Chacon
Pierre Habouzit <madcoder@debian.org> wrote:
> On Fri, Oct 31, 2008 at 06:41:54PM +0000, Shawn O. Pearce wrote:
> > How about this?
> >
> > http://www.spearce.org/projects/scm/libgit2/apidocs/CONVENTIONS
>
> FWIW I've read what you say about types, while this is good design to
> make things abstract, accessors are slower _and_ disallow many
> optimizations as it's a function call and that it may clobber all your
> pointers values.
Yea, optimizing C is a bitch.
I'm in favor of accessors *IN THE APPLICATION*.
Within the library's own C code, I think we should expose the struct,
and use its members where it makes sense to. Especially in the
really tight loops where we don't want to introduce more overhead.
My rationale here is that we can change the struct at any time,
and yet not change the ABI.
> For types that _will_ be in the tight loops, we must make the types
> explicit or it'll bite us hard performance-wise. I'm thinking what is
> "struct object" or "struct commit" in git.git. It's likely that we will
> loose a *lot* of those types are opaque.
Yes, but I'm arguing they should be opaque to the application, and
visible to the library. Today the application is suffering from
massive fork+exec overhead. I really don't give a damn if the
application's compiler has to deal with a function call to read
from a private member of an opaque type. Its still thousands of
CPU instructions less per operation.
Come back to me a year after libgit2 has been widely deployed on
Linux distros and we have multiple applications linking to it.
Lets talk then about the harmful performance problems caused by
making these types opaque to the application. About that time
we'll also be talking about how great pack v4 is and why its a good
thing those types were opaque, as we didn't have to break the ABI
to introduce it.
> It's IMNSHO on the matter that core structures of git _will_ have to be
> made explicit. I'm thinking objects and their "subtypes" (commits,
> trees, blobs). Maybe a couple of things on the same vein.
Sure, but in the library only.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 20:42 ` Shawn O. Pearce
@ 2008-11-02 2:30 ` Johannes Schindelin
2008-11-02 9:19 ` Pierre Habouzit
2008-11-03 13:08 ` Andreas Ericsson
2 siblings, 0 replies; 83+ messages in thread
From: Johannes Schindelin @ 2008-11-02 2:30 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Andreas Ericsson, git, Scott Chacon
Hi,
On Sat, 1 Nov 2008, Shawn O. Pearce wrote:
> But I was also under the impression that the brilliant engineers who
> work for Microsoft decided that on their platform special annotations
> have to be inserted on functions that a DLL wants to export to
> applications.
Exactly. This is the "good" old __declspec(dllexport) for you. It is a
pain in the butt, but that is what you have to go for if libgit2 is
supposed to be any more portable than ligit.a.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 1:07 ` Scott Chacon
2008-11-02 1:36 ` Shawn O. Pearce
@ 2008-11-02 5:09 ` David Brown
2008-11-03 16:20 ` Shawn O. Pearce
1 sibling, 1 reply; 83+ messages in thread
From: David Brown @ 2008-11-02 5:09 UTC (permalink / raw)
To: Scott Chacon; +Cc: Shawn O. Pearce, Nicolas Pitre, Pierre Habouzit, david, git
On Sat, Nov 01, 2008 at 06:07:04PM -0700, Scott Chacon wrote:
>Think about trying to incorporate this into something proprietary,
>Shawn - how much of a pain is it going to be to get that license
>reviewed in Google? However, LGPL I'm sure there is already a
>reviewed policy. Now, since that may be a pain, time that Shawn could
>have been spending being paid to work on the library is lost because
>they can't use it, or it takes weeks/months to review it. That's my
>concern.
The gcc exception license should have been reviewed by anyone who has
ever build anything proprietary out of gcc.
GPL+link exception is a very common license. It's most common use is
for runtime libraries for various programming languages.
Lawyers I know are significantly less fearful of the GPL+exception
than the LGPL. The exception basically says that if you use it in a
certain way, then none of the GPL applies.
David
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 20:42 ` Shawn O. Pearce
2008-11-02 2:30 ` Johannes Schindelin
@ 2008-11-02 9:19 ` Pierre Habouzit
2008-11-03 13:08 ` Andreas Ericsson
2 siblings, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-02 9:19 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Andreas Ericsson, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 2330 bytes --]
On Sat, Nov 01, 2008 at 08:42:59PM +0000, Shawn O. Pearce wrote:
> Andreas Ericsson <ae@op5.se> wrote:
> > Shawn O. Pearce wrote:
> >> During the GitTogether we were kicking around the idea of a ground-up
> >> implementation of a Git library.
> >
> > Having looked briefly at the code, I've got a couple of comments:
> > * GIT_EXTERN() does nothing. Ever. It's noise and should be removed.
>
> I feel the same way.
>
> But I was also under the impression that the brilliant engineers
> who work for Microsoft decided that on their platform special
> annotations have to be inserted on functions that a DLL wants to
> export to applications.
>
> Hence any cross-platform library that I have seen annotates their
> exported functions this way, with the macro being empty on POSIX
> and expanding to some magic keyword on Microsoft's OS. I think it
> goes between the return type and the function name too...
>
> > Instead it would be better to have GIT_PRIVATE(),
>
> I can see why you said this; needing GIT_PRIVATE() is a lot more
> rare than needing GIT_EXTERN(). Only a handful of cross-module,
> but private, functions are likely to exist, so it makes sense to
> mark the smaller subset. But see above. *sigh*
Not only there is the windows thing, but the *best* way to design a
library is to make _everything_ by default and only show what you mean
to.
GIT_EXTERN allow us to do just that with
__attribute__((visibility("default"))) on GNU-ld/GCC.
Of course we can achieve the same using nothing on public symbols and
__attribute__((visibility("hidden"))) on the private ones, but it's way
more cumbersome and there's always a chance to forget one. It's bad
design IMHO.
FWIW GIT_EXTERN(...) prototype(arguments); isn't unredable to me,
everyone does that, it doesn't break tags, it's _good_. And you can ask
doxygen to substitute this GIT_EXTERN(x) with x so that it doesn't shows
up in the documentation at all (it can include these kind of partially
preprocessed headers in the documentation) for the people who are really
annoyed with them.
But it's definitely a necessary evil.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 1:56 ` Shawn O. Pearce
@ 2008-11-02 9:25 ` Pierre Habouzit
0 siblings, 0 replies; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-02 9:25 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 1904 bytes --]
On Sun, Nov 02, 2008 at 01:56:11AM +0000, Shawn O. Pearce wrote:
> Pierre Habouzit <madcoder@debian.org> wrote:
> > For types that _will_ be in the tight loops, we must make the types
> > explicit or it'll bite us hard performance-wise. I'm thinking what is
> > "struct object" or "struct commit" in git.git. It's likely that we will
> > loose a *lot* of those types are opaque.
>
> Yes, but I'm arguing they should be opaque to the application, and
> visible to the library. Today the application is suffering from
> massive fork+exec overhead. I really don't give a damn if the
> application's compiler has to deal with a function call to read
> from a private member of an opaque type. Its still thousands of
> CPU instructions less per operation.
The problem is not the function call, it's really cheap on modern CPUs.
The problem is that across a function call the compiler cannot make
some kind of optimizations and _that_ isn't cheap.
For example, pointers whose value have been loaded from the store into a
register have to be loaded again. A function call trashes all the
registers, etc...
> Come back to me a year after libgit2 has been widely deployed on
> Linux distros and we have multiple applications linking to it.
> Lets talk then about the harmful performance problems caused by
> making these types opaque to the application. About that time
> we'll also be talking about how great pack v4 is and why its a good
> thing those types were opaque, as we didn't have to break the ABI
> to introduce it.
Well I fear it'll be more than a few percent harm, and I can already see
Linus argue against the switch to libgit2 for git-core because it lost
2% performances ;P
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 1:50 ` Shawn O. Pearce
@ 2008-11-03 10:17 ` Andreas Ericsson
0 siblings, 0 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-03 10:17 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Pierre Habouzit, git, Scott Chacon
Shawn O. Pearce wrote:
> Andreas Ericsson <ae@op5.se> wrote:
>> Shawn O. Pearce wrote:
>>> Eh, I disagree here. In git.git today "struct commit" exposes its
>>> buffer with the canonical commit encoding. Having that visible
>>> wrecks what Nico and I were thinking about doing with pack v4 and
>>> encoding commits in a non-canonical format when stored in packs.
>>> Ditto with trees.
>> Err... isn't that backwards?
>
> No.
>
>> Surely you want to store stuff in the
>> canonical format so you're forced to do as few translations as
>> possible?
>
> No. We suspect that canonical format is harder to decompress and
> parse during revision traversal. Other encodings in the pack file
> may produce much faster runtime performance, and reduce page faults
> (due to smaller pack sizes).
>
> We hardly ever use the canonical format for actual output; most
> output rips the canonical format apart and then formats the data
> the way it was requested. If we have the data *already* parsed in
> the pack its much faster to output.
>
I'll have to look into the pack v4 stuff, as I can't get what you're
saying to make sense to me. Not canonicalizing the data when storing
it means you'll have to have conversion routines from all the various
encodings, unless you first canonicalize it and then encode it in the
way you need it in the pack way. Never using canonical format means
it has no potential for combinatorial explosion, and every converter
needs to know about every other format.
>> Or are you trying to speed up packing by skipping the
>> canonicalization part?
>
> Wrong; we're trying to speed up reading. Packing may go slower,
> especially during the first conversion of v2->v4 for any given
> repository, but packing is infrequent so the minor (if any) drop
> in performance here is probably worth the reading performance gains.
>
>> Well, if macro usage is adhered to one wouldn't have to worry,
>> since the macro can just be rewritten with a function later (if,
>> for example, translation or some such happens to be required).
>> Older code linking to a newer library would work (assuming the
>> size of the commit object doesn't change anyway),
>
> You are assuming too much magic. If the older ABI used a macro
> and the newer one (which supports pack v4) organized struct commit
> differently and the user upgrades libgit2.so the older applications
> just broke, horribly.
>
Naturally. Re-sizing objects that weren't previously protected by
accessor functions will always break the ABI unless the layout of
the previously existing items in the object doesn't change, which
is exactly what I said.
> We know we want to do pack v4 in the near future. Or at least
> experiment with it and see if it works. If it does, we don't
> want to have to cause a major ABI breakage across all those newly
> installed libgit2s... yikes.
>
That's not necessary, although it requires a bit more thought when
changing the objects.
> I'm really in favor of accessor functions for the first version of
> the library. They can always be converted to macros once someone
> shows that their git visualizer program saves 10 ms on a 8,000 ms
> render operation by avoiding accessor functions. I'd rather spend
> our brain cycles optimizing the runtime and the in-core data so
> we spend less time in our tight revision traversal loops.
>
I agree that for early versions they should definitely be functions,
but since we've been talking about "final design" earlier in the
thread, that's what I was referring to here to.
> Seriously. We make at least 10 or 11 function calls *per commit*
> that comes out of get_revision(). If the formatting application is
> really suffering from its 4 or 5 accessor function calls in order
> to get that returned data, we probably should also be looking at
> how we can avoid function cals in the library.
>
> Oh, and even with 4 or 5 accessor functions per commit in the
> application that is *still* better than the 10 or so calls the
> application probably makes today scraping "git log --format=raw"
> off a pipe and segment it into the different fields it needs.
>
Right.
> Unless pipes in Linux somehow allow negative time warping with
> CPU counters. Though on dual-core systems they might, since the
> two processes can run on different cores. But oh, you didn't want
> to worry about threading support too much in libgit2, so I guess
> you also don't want to use multi-core systems.
>
All comps where I use git today are multi-core systems, so I'm very
much in favour of parallell processing. Otoh, I also don't think it
makes sense to jump through hoops to make sure one can, fe, read and
parse all tags in multiple threads simultaneously. In other words, I
don't think we need to bother adding thread-safety for stuff that
really only should happen if the application is poorly designed
(unless it's straightforward to do so, ofcourse).
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-01 20:42 ` Shawn O. Pearce
2008-11-02 2:30 ` Johannes Schindelin
2008-11-02 9:19 ` Pierre Habouzit
@ 2008-11-03 13:08 ` Andreas Ericsson
2 siblings, 0 replies; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-03 13:08 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
Shawn O. Pearce wrote:
> Andreas Ericsson <ae@op5.se> wrote:
>> Shawn O. Pearce wrote:
>>> During the GitTogether we were kicking around the idea of a ground-up
>>> implementation of a Git library.
>> Having looked briefly at the code, I've got a couple of comments:
>> * GIT_EXTERN() does nothing. Ever. It's noise and should be removed.
>
> I feel the same way.
>
> But I was also under the impression that the brilliant engineers
> who work for Microsoft decided that on their platform special
> annotations have to be inserted on functions that a DLL wants to
> export to applications.
>
> Hence any cross-platform library that I have seen annotates their
> exported functions this way, with the macro being empty on POSIX
> and expanding to some magic keyword on Microsoft's OS. I think it
> goes between the return type and the function name too...
>
>> Instead it would be better to have GIT_PRIVATE(),
>
> I can see why you said this; needing GIT_PRIVATE() is a lot more
> rare than needing GIT_EXTERN(). Only a handful of cross-module,
> but private, functions are likely to exist, so it makes sense to
> mark the smaller subset. But see above. *sigh*
>
Thanks for the detailed explanation.
>> * Prefixing the files themselves with git_ is useless and only leads
>> to developer frustration. I imagine we'd be installing any header
>> files in a git/ directory anyway, so we're gaining absolutely
>> nothing with the git_ prefix on source-files.
>
> Yes, I realized that this morning. I plan on changing that mess
> around so we have "include/git/oid.h" and library and application
> code can use "#include <git/oid.h>". Library modules should just
> be "src/oid.c" then.
>
I noticed when I fetched the latest head today that it's already
done. I fail to understand why headers need to be in a separate
path so that oid.c can't just '#include "oid.h"'.
With the risk of nitpicking you to death, put public headers in a
separate dir (I'd suggest public/%.h in Make-speak, but I have no
strong preference) and keep private headers next to %.c. Always
#include the public header file from the private one (that should
probably be in CONVENTIONS).
>> Apart from that, it seems you've been designing a lot rather than
>> trying to use the API to actually do something.
>
> I wanted to get a solid idea of what our API conventions should be,
> before we started writing a lot of code around them. Part of the
> problem with the git.git code is we don't have conventions that are
> really suited for use in a shared library (assuming we even have
> conventions in there) so we can't use that code as a library today.
>
Right. I guess I'm too firm a believer in system evolution by constant
refactoring (with fluctuating api's, yes) rather than thinking initial
design can ever be done exactly right.
>> It would, imo, be
>> a lot better to start development with adding functionality shared
>> between all programs and then expand further on that, such as
>> incorporating all functions needed for manipulating tags into the
>> library and then modify existing code to use the library to get
>> tag-ish things done.
>
> Tags are mostly pointless. Its a tiny part of the code that isn't
> that interesting to most people. And it requires object database
> access anyway if you want to talk about parsing or reading a tag.
> There's almost no point in a git library that can't read the on
> disk object database, or write to it.
>
True, but designing top-down means you'll need to write one more
API to get the first stuff working, so you'll always be using the
new code you write immediately and for something real. IMO, that
makes it much more fun and productive to write the lib itself.
>> I also think it's quite alright to not strive *too* hard to make
>> all functions thread-safe, as very few of them will actually need
>> that. It's unlikely that a user program will spawn one thread to
>> write a lot of tags while another is trying to parse them, for
>> example.
>
> Oh really?
>
> Maybe true for tags, just because they are such an unimportant part
> of the git suite compared to everything else.
>
> But right now I'm running a production system using a threaded server
> process that is operating on Git repositories. Fortunately threads
> suck less on Java than they do on POSIX, and we have a 100% pure
> Java library available for Git.
>
> It would be nice if a library created in the late part of 2008
> recognized that threads exist, aren't going to disappear tomorrow,
> and that consumers of libraries actually may need to run the library
> within a threaded process.
>
> Or are you one of those developers who think threads only exist
> in the giant monolithic kernel land, and all user space should
> be isolated process? I often wonder who such people can justify
> the kernel address space being multi-threaded but userland being
> stuck to single threaded applications. Oh, right, the kernel has
> to go fast...
>
No, I'm one of those developers who think that if implementing a
function as thread-safe means it'll take 50 times longer than just
writing something that works, the right decision is to go with the
faster way to get the job done and then expand on it later when the
need arises. Reading my original post, I realize I should have made
that more clear. Sorry for making your gall rise unnecessarily.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-02 5:09 ` David Brown
@ 2008-11-03 16:20 ` Shawn O. Pearce
0 siblings, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-03 16:20 UTC (permalink / raw)
To: David Brown; +Cc: Scott Chacon, Nicolas Pitre, Pierre Habouzit, david, git
David Brown <git@davidb.org> wrote:
> On Sat, Nov 01, 2008 at 06:07:04PM -0700, Scott Chacon wrote:
>
>> Think about trying to incorporate this into something proprietary,
>> Shawn - how much of a pain is it going to be to get that license
>> reviewed in Google? However, LGPL I'm sure there is already a
>> reviewed policy. Now, since that may be a pain, time that Shawn could
>> have been spending being paid to work on the library is lost because
>> they can't use it, or it takes weeks/months to review it. That's my
>> concern.
>
> The gcc exception license should have been reviewed by anyone who has
> ever build anything proprietary out of gcc.
Indeed. And gcc is a *very* popular compiler on Linux distributions.
A lot of commerical software is built under it, and distributed in
binary-only form. Those companies have already done the legal review
they felt necessary before shipping (and selling) that software.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
` (3 preceding siblings ...)
2008-11-01 19:18 ` Andreas Ericsson
@ 2008-11-08 13:26 ` Steve Frécinaux
2008-11-08 14:35 ` Andreas Ericsson
4 siblings, 1 reply; 83+ messages in thread
From: Steve Frécinaux @ 2008-11-08 13:26 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git, Scott Chacon
Shawn O. Pearce wrote:
> During the GitTogether we were kicking around the idea of a ground-up
> implementation of a Git library. This may be easier than trying
> to grind down git.git into a library, as we aren't tied to any
> of the current global state baggage or the current die() based
> error handling.
>
> I've started an _extremely_ rough draft. The code compiles into a
> libgit.a but it doesn't even implement what it describes in the API,
> let alone a working Git implementation. Really what I'm trying to
> incite here is some discussion on what the API looks like.
Just a random question: is there a reason why you have put all the .h in
a separate includes/ directory instead of relying on the install target
to put the include files at the right place ?
To me it makes it much harder to hack on the files as one is always
required to switch between both directories...
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-08 13:26 ` Steve Frécinaux
@ 2008-11-08 14:35 ` Andreas Ericsson
2008-11-08 17:27 ` Pierre Habouzit
0 siblings, 1 reply; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-08 14:35 UTC (permalink / raw)
To: Steve Frécinaux; +Cc: Shawn O. Pearce, git, Scott Chacon
Steve Frécinaux wrote:
> Shawn O. Pearce wrote:
>> During the GitTogether we were kicking around the idea of a ground-up
>> implementation of a Git library. This may be easier than trying
>> to grind down git.git into a library, as we aren't tied to any
>> of the current global state baggage or the current die() based
>> error handling.
>>
>> I've started an _extremely_ rough draft. The code compiles into a
>> libgit.a but it doesn't even implement what it describes in the API,
>> let alone a working Git implementation. Really what I'm trying to
>> incite here is some discussion on what the API looks like.
>
> Just a random question: is there a reason why you have put all the .h in
> a separate includes/ directory instead of relying on the install target
> to put the include files at the right place ?
>
> To me it makes it much harder to hack on the files as one is always
> required to switch between both directories...
I agree with this, but as I guess Shawn will do roughly 45 times more
work on it than me (according to current commit-count in git.git), I'll
live with it.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-08 14:35 ` Andreas Ericsson
@ 2008-11-08 17:27 ` Pierre Habouzit
2008-11-09 10:17 ` Andreas Ericsson
0 siblings, 1 reply; 83+ messages in thread
From: Pierre Habouzit @ 2008-11-08 17:27 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Steve Frécinaux, Shawn O. Pearce, git, Scott Chacon
[-- Attachment #1: Type: text/plain, Size: 1272 bytes --]
On Sat, Nov 08, 2008 at 02:35:55PM +0000, Andreas Ericsson wrote:
> Steve Frécinaux wrote:
> > Just a random question: is there a reason why you have put all the
> > .h in a separate includes/ directory instead of relying on the
> > install target to put the include files at the right place ?
> > To me it makes it much harder to hack on the files as one is always
> > required to switch between both directories...
>
> I agree with this, but as I guess Shawn will do roughly 45 times more
> work on it than me (according to current commit-count in git.git), I'll
> live with it.
I don't, modifying the public includes may break the ABI and the API.
I believe it to be a good practice to put them in a separate directory
so that people modifying them will know this particular header is
public. Yes you can name your private headers differently, but it's not
really the same, it doesn't make editing public headers hard, and it has
to. People modifying them _have_ to thing "err why am I modifying this
specific header in the first place" before doing anything in it.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-08 17:27 ` Pierre Habouzit
@ 2008-11-09 10:17 ` Andreas Ericsson
2008-11-09 21:02 ` Shawn O. Pearce
0 siblings, 1 reply; 83+ messages in thread
From: Andreas Ericsson @ 2008-11-09 10:17 UTC (permalink / raw)
To: Pierre Habouzit; +Cc: Steve Frécinaux, Shawn O. Pearce, git, Scott Chacon
Pierre Habouzit wrote:
> On Sat, Nov 08, 2008 at 02:35:55PM +0000, Andreas Ericsson wrote:
>> Steve Frécinaux wrote:
>>> Just a random question: is there a reason why you have put all the
>>> .h in a separate includes/ directory instead of relying on the
>>> install target to put the include files at the right place ?
>>> To me it makes it much harder to hack on the files as one is always
>>> required to switch between both directories...
>> I agree with this, but as I guess Shawn will do roughly 45 times more
>> work on it than me (according to current commit-count in git.git), I'll
>> live with it.
>
> I don't, modifying the public includes may break the ABI and the API.
>
> I believe it to be a good practice to put them in a separate directory
> so that people modifying them will know this particular header is
> public. Yes you can name your private headers differently, but it's not
> really the same, it doesn't make editing public headers hard, and it has
> to. People modifying them _have_ to thing "err why am I modifying this
> specific header in the first place" before doing anything in it.
>
Well, I suggested putting "src/public/public_header.h" quite early on,
with private headers next to the source. AFAIU, the private and public
headers both are now located in the same directory, and that directory is
separate from the .c files.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: libgit2 - a true git library
2008-11-09 10:17 ` Andreas Ericsson
@ 2008-11-09 21:02 ` Shawn O. Pearce
0 siblings, 0 replies; 83+ messages in thread
From: Shawn O. Pearce @ 2008-11-09 21:02 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Pierre Habouzit, Steve Frrrcinaux, git, Scott Chacon
Andreas Ericsson <ae@op5.se> wrote:
>
> Well, I suggested putting "src/public/public_header.h" quite early on,
I must have missed that suggestion. Its not a bad idea.
> with private headers next to the source. AFAIU, the private and public
> headers both are now located in the same directory, and that directory is
> separate from the .c files.
Currently there are only public headers, and the public headers are
all under include/git/. Private headers are going to be under src/
so they are isolated from the public headers.
But I haven't had a chance to touch libgit2 in over a week. :-\
I've simply got too many projects going on at once. This is one
I really want to work on though, so I'm going to try and make time
for it next week. But I'm also in the middle of a major overhaul
of Gerrit, so it can run on non-Google infrastructure and thus is
usable by pretty much anyone.
--
Shawn.
^ permalink raw reply [flat|nested] 83+ messages in thread
end of thread, other threads:[~2008-11-09 21:04 UTC | newest]
Thread overview: 83+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-31 17:07 libgit2 - a true git library Shawn O. Pearce
2008-10-31 17:28 ` Pieter de Bie
2008-10-31 17:29 ` Pieter de Bie
2008-10-31 17:47 ` Pierre Habouzit
2008-10-31 18:41 ` Shawn O. Pearce
2008-10-31 18:54 ` Pierre Habouzit
2008-10-31 19:57 ` Shawn O. Pearce
2008-10-31 20:12 ` Pierre Habouzit
2008-10-31 20:05 ` Junio C Hamano
2008-10-31 21:58 ` Shawn O. Pearce
2008-11-01 17:30 ` Pierre Habouzit
2008-11-01 18:44 ` Andreas Ericsson
2008-11-01 18:48 ` Pierre Habouzit
2008-11-01 20:29 ` Shawn O. Pearce
2008-11-01 21:58 ` Andreas Ericsson
2008-11-02 1:50 ` Shawn O. Pearce
2008-11-03 10:17 ` Andreas Ericsson
2008-11-02 1:56 ` Shawn O. Pearce
2008-11-02 9:25 ` Pierre Habouzit
2008-10-31 20:24 ` Nicolas Pitre
2008-10-31 20:29 ` david
2008-10-31 20:56 ` Nicolas Pitre
2008-10-31 21:43 ` Shawn O. Pearce
2008-10-31 21:50 ` Shawn O. Pearce
2008-10-31 21:51 ` Pierre Habouzit
2008-10-31 21:31 ` Pierre Habouzit
2008-10-31 22:10 ` Nicolas Pitre
2008-11-01 10:52 ` Andreas Ericsson
2008-10-31 23:24 ` Pieter de Bie
2008-10-31 23:28 ` Shawn O. Pearce
2008-10-31 23:49 ` Junio C Hamano
2008-11-01 0:02 ` Pierre Habouzit
2008-11-01 0:19 ` Shawn O. Pearce
2008-11-01 1:02 ` Pierre Habouzit
2008-11-01 0:13 ` Shawn O. Pearce
2008-11-01 1:15 ` Nicolas Pitre
2008-11-01 1:19 ` Shawn O. Pearce
2008-11-01 1:45 ` Nicolas Pitre
2008-11-01 1:52 ` Shawn O. Pearce
2008-11-01 2:26 ` Johannes Schindelin
2008-11-01 11:01 ` Pierre Habouzit
2008-11-01 13:50 ` Nicolas Pitre
2008-11-01 17:01 ` Pierre Habouzit
2008-11-01 20:26 ` Johannes Schindelin
2008-10-31 23:14 ` Junio C Hamano
2008-10-31 23:33 ` Pierre Habouzit
2008-10-31 23:41 ` Shawn O. Pearce
2008-10-31 23:56 ` Jakub Narebski
2008-11-01 0:41 ` david
2008-11-01 1:00 ` Shawn O. Pearce
2008-11-01 1:04 ` david
2008-11-01 1:08 ` Pierre Habouzit
2008-11-01 1:33 ` Nicolas Pitre
2008-11-01 1:38 ` Pierre Habouzit
2008-11-01 1:49 ` Nicolas Pitre
2008-11-01 1:43 ` Shawn O. Pearce
2008-11-01 1:53 ` Nicolas Pitre
2008-11-01 22:57 ` Shawn O. Pearce
2008-11-02 0:26 ` Scott Chacon
2008-11-02 1:07 ` Scott Chacon
2008-11-02 1:36 ` Shawn O. Pearce
2008-11-02 5:09 ` David Brown
2008-11-03 16:20 ` Shawn O. Pearce
2008-11-01 1:06 ` Pierre Habouzit
2008-11-01 1:36 ` david
2008-10-31 20:24 ` Brian Gernhardt
2008-10-31 21:59 ` Andreas Ericsson
2008-10-31 22:01 ` Shawn O. Pearce
2008-10-31 22:51 ` Junio C Hamano
2008-11-01 11:17 ` Andreas Ericsson
2008-10-31 23:22 ` Johannes Schindelin
2008-10-31 23:18 ` Bruno Santos
2008-10-31 23:25 ` Shawn O. Pearce
2008-11-01 19:18 ` Andreas Ericsson
2008-11-01 20:42 ` Shawn O. Pearce
2008-11-02 2:30 ` Johannes Schindelin
2008-11-02 9:19 ` Pierre Habouzit
2008-11-03 13:08 ` Andreas Ericsson
2008-11-08 13:26 ` Steve Frécinaux
2008-11-08 14:35 ` Andreas Ericsson
2008-11-08 17:27 ` Pierre Habouzit
2008-11-09 10:17 ` Andreas Ericsson
2008-11-09 21:02 ` Shawn O. Pearce
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).