* Libification project (SoC)
@ 2007-03-16 4:24 Luiz Fernando N. Capitulino
2007-03-16 4:59 ` Shawn O. Pearce
2007-03-17 2:24 ` Jakub Narebski
0 siblings, 2 replies; 62+ messages in thread
From: Luiz Fernando N. Capitulino @ 2007-03-16 4:24 UTC (permalink / raw)
To: gsoc; +Cc: git
Hi Shawn,
I'm going to apply for the libification project and, in order to help
me get started, it would be good to get some feedback regarding the
project's goal and your expectations.
I'll just dump some thoughts/questions I had, so that we can
start some discussion.
1. This is a more complete todo list, based on the wiki and a
quick look at the code.
o Remove static variables
o Avoid dying when a function call fails (eg, malloc())
o Input parameter checking (plus errno setting)
o Documentation (eg, doxygen)
o Unit-tests
o Add prefix (eg, git_*) to public API functions
Do we agree here? Are there more suggestions?
2. What's the minimum amount of work that needs to be done for
the SoC project to be considered successful?
3. I don't code in Perl, is it a problem? I mean, the project's
goal is to have a Perl binding but I think it goes far beyond
that: we could have a python module, a C program, or anything
that shows that libgit is useful.
Thanks,
--
Luiz Fernando N. Capitulino
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC)
2007-03-16 4:24 Libification project (SoC) Luiz Fernando N. Capitulino
@ 2007-03-16 4:59 ` Shawn O. Pearce
2007-03-16 5:30 ` Junio C Hamano
` (2 more replies)
2007-03-17 2:24 ` Jakub Narebski
1 sibling, 3 replies; 62+ messages in thread
From: Shawn O. Pearce @ 2007-03-16 4:59 UTC (permalink / raw)
To: Luiz Fernando N. Capitulino; +Cc: git

"Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
> I'm going to apply for the libification project and, in order to help
> me get started, it would be good to get some feedback regarding the
> project's goal and your expectations.

Excellent!

> 1. This is a more complete todo list, based on the wiki and a
> quick look at the code.
>
> o Remove static variables

Yes. Removing all of these is not completely necessary in the first
version; in fact I would recommend against it. For example the
active_cache variable and its related friends are referenced a lot.
It contains the index in memory. I think it's perfectly OK to say
that in the first iteration of a public libgit.a the process
may only use one index at a time, if it can even use the index at all
(see below). But if you eventually got around to even helping the
index parts of "the Git library", that would certainly be appreciated!

On the other hand, many of the variables declared in environment.c
are repository specific configuration variables. These probably
should be abstracted into some sort of wrapper, so that multiple
repositories can be accessed from within the same process. Why?
a future mod_perl running gitweb.cgi accessing repositories through
libgit.a and Perl bindings of course!

But static variable removal is low on the priority list for this
project I think. Our more important issues are related to some of
the other items.

> o Avoid dying when a function call fails (eg, malloc())

malloc is a huge problem in the Git code today. Almost all
of our malloc calls are actually through the xmalloc wrapper.
All xmalloc callers assume xmalloc will *never* fail. This
makes it, uh, interesting. ;-)

Although one could argue that being unable to malloc needed memory
probably means you're toast, so die()'ing is good.

But other areas die when they get given a bad SHA-1 (for example).
If the library caller can supply that (possibly bad) SHA-1 to an
API function, that's just mean to die out. ;-)

> o Input parameter checking (plus errno setting)

Yes, of course. But most functions (at least those that should
be made public) probably already do check their arguments. Some
return an error code back to their caller; others die() and abort
the current process. And there are probably a few that don't check
their arguments enough. But I think input parameter checking is
probably going to be a relatively small task here.

Although sometimes the input checking is done in the program that
calls the function, and not the function itself. So that might
need to be refactored in a few spots.

> o Documentation (eg, doxygen)

Yes; very important for the library to be of any use to anyone else.

> o Unit-tests

Of the public API, yes. Our current test suite covers some of that
code that we want to make public, but does so through programs that
call those functions. We would want unit tests to verify the public
API conforms to the expectations of the unit test's writer. ;-)

> o Add prefix (eg, git_*) to public API functions

Yes. But which functions shall we expose? ;-)

See below for functionality I'm thinking about; others may have
different ideas.

o Build system issues

You missed this, but I think it's an important consideration.
Our current libgit.a is a static library that has a relatively large
number of symbols its modules are exporting. These symbol names
aren't namespace-ized (e.g. git_* prefix) so we wouldn't want to
just offer this library up in its current form.

Some of those symbols would get name changes (as you suggest above),
but others might not (e.g. the active_cache that I suggest further
above). These modules might need to be moved out of libgit.a and
moved into say a new libgitprivate.a, that our own code can link
against, but that isn't offered to the public as a stable API.

o Public header definition

Whatever we expose, we will need to draft a public "git.h" (or
somesuch) that callers can rely upon. It will need to be fairly
stable, and handle revisions as new features get added. E.g.
version testing support like the zlib and cURL libraries have, and
that we rely upon in Git to do feature checks. ;-)

> 2. What's the minimum amount of work that needs to be done for
> the SoC project to be considered successful?

I'd like to see enough API support that gitweb.cgi could:

* get the most recent commit date of all refs in all projects
  (the toplevel project index page);
* get a shortlog for the main summary page of a project;
* get the full content of a single commit;
* get the "raw" diff (paths that changed) for two commits;

There's a thousand other things that gitweb.cgi would still need to
fully avoid forking Git processes. But that's a really good start,
and is probably going to be a decent chunk of work. Especially to
create high-quality patches that pass our standards review. ;-)

In some cases much of the above is already "internally public";
meaning we already treat parts of that code as a library and invoke
them from within processes to get work done. Much of this project
is about improving the interfaces and behavior enough to make those
existing APIs truly public.

See refs.h, diff.h, revision.h, commit.h...

> 3. I don't code in Perl, is it a problem? I mean, the project's
> goal is to have a Perl binding but I think it goes far beyond
> that: we could have a python module, a C program, or anything
> that shows that libgit is useful.

No, I don't see that as a problem at all. We have some Perl experts
on the mailing list who would like to see Perl bindings. Some of
the Perl binding is pure C code, and some of it is this weird Perl
macro language... so I expect those Perl experts to come out of
the woodwork and help the community to create a prototype set of
bindings. There's also Ruby and Python interests around, so we
may see bindings for those too. ;-)

From a goal perspective of this SoC project, any functioning binding
that can support a gitweb type of application would be great.
It shows the library works as intended, is useful, and can be
continued to be built upon. That's a pretty successful project in
my mind.

--
Shawn.

^ permalink raw reply [flat|nested] 62+ messages in thread
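A minimal sketch of the "return an error instead of die()" style discussed above for caller-supplied SHA-1s. The names example_hexval() and example_git_parse_sha1() are hypothetical and are not part of Git; the only assumption taken from the thread is that a library entry point should report bad input to its caller rather than abort the process.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_SHA1_RAWSZ 20	/* raw SHA-1 size in bytes */

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
static int example_hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical library entry point: parse a 40-character hex SHA-1.
 * Returns 0 on success and -1 on malformed input; it never calls die(),
 * so the caller decides what a bad SHA-1 means for its own process.
 */
int example_git_parse_sha1(const char *hex, unsigned char *sha1)
{
	size_t i;

	if (!hex || !sha1)
		return -1;
	for (i = 0; i < EXAMPLE_SHA1_RAWSZ; i++) {
		int hi, lo;

		/* A NUL byte is not a hex digit, so short input stops here. */
		hi = example_hexval(hex[2 * i]);
		if (hi < 0)
			return -1;
		lo = example_hexval(hex[2 * i + 1]);
		if (lo < 0)
			return -1;
		sha1[i] = (unsigned char)((hi << 4) | lo);
	}
	assert(i == EXAMPLE_SHA1_RAWSZ);
	/* Reject trailing garbage after the 40 hex digits. */
	return hex[2 * EXAMPLE_SHA1_RAWSZ] == '\0' ? 0 : -1;
}
```

A binding (Perl, Python, ...) could then turn the -1 into an exception in its own language, which is friendlier than having the whole interpreter exit.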
* Re: Libification project (SoC)
2007-03-16 4:59 ` Shawn O. Pearce
@ 2007-03-16 5:30 ` Junio C Hamano
2007-03-16 6:00 ` Shawn O. Pearce
` (2 more replies)
2007-03-16 8:06 ` Johannes Sixt
2007-03-16 12:55 ` Petr Baudis
2 siblings, 3 replies; 62+ messages in thread
From: Junio C Hamano @ 2007-03-16 5:30 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Luiz Fernando N. Capitulino, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> On the other hand, many of the variables declared in environment.c
> are repository specific configuration variables. These probably
> should be abstracted into some sort of wrapper, so that multiple
> repositories can be accessed from within the same process. Why?
> a future mod_perl running gitweb.cgi accessing repositories through
> libgit.a and Perl bindings of course!

I think if you are abstracting them out, into "struct repo_state",
the index and object store related variables such as packed_git
should go there as well, so your recommendation feels very
inconsistent to me.

>> o Avoid dying when a function call fails (eg, malloc())
>
> malloc is a huge problem in the Git code today. Almost all
> of our malloc calls are actually through the xmalloc wrapper.
> All xmalloc callers assume xmalloc will *never* fail. This
> makes it, uh, interesting. ;-)

Actually they do not assume such. What they assume is worse.
They assume that there is nothing else you can do other than
dying when allocation fails.

> But other areas die when they get given a bad SHA-1 (for example).
> If the library caller can supply that (possibly bad) SHA-1 to an
> API function, that's just mean to die out. ;-)

That's a real problem, but on the other hand, perl or whatever
wrapped ones can do the dying (or not dying) before calling into
libgit, so it may not be such a big issue.

>> o Documentation (eg, doxygen)
>> o Unit-tests
>> o Add prefix (eg, git_*) to public API functions
>
> Yes. But which functions shall we expose? ;-)

Before going into that topic, a bigger question is if we are
happy with the current internal API and what the goal of
libification is. If the libification is going to say that "this
is a published API so we are not going to change it", I would
imagine that it would be very hard to accept in the mainline.
Improvements like the earlier sliding mmap() series need to be
able to change the interfaces without backward compatibility
wart.

In other words, I do not know what idiot ^W ^W who listed the
libification stuff on the SoC "ideas" page, but I think (1) it
is premature to promise stable ABI, and (2) if it does not
promise stable ABI a library is not very useful.

> o Build system issues
>
> You missed this, but I think it's an important consideration.
> Our current libgit.a is a static library that has a relatively large
> number of symbols its modules are exporting. These symbol names
> aren't namespace-ized (e.g. git_* prefix) so we wouldn't want to
> just offer this library up in its current form.

Very true, in fact, the current libgit.a is _NOT_ a library at
all. It is just a way to be terse in our Makefile to make the
linker do the work for us, nothing more.

And I do not think we would want to rename our "internally
public" functions such as find_pack_entry_one() and
sha1_object_info() with git_ prefix only for the purpose of this
libification. If we can trick the linker to create gitlib.so
which defines the symbol git_sha1_object_info() that lets the
caller call our internal sha1_object_info(), without exposing
the internal name sha1_object_info(), and strip other global
names libgit.a and plumbing internally use to communicate with
each other, such as find_pack_entry_one(), from the gitlib.so
library, that would be a good solution.

>> 2. What's the minimum amount of work that needs to be done for
>> the SoC project to be considered successful?
>
> I'd like to see enough API support that gitweb.cgi could:
>
> * get the most recent commit date of all refs in all projects
>   (the toplevel project index page);
> * get a shortlog for the main summary page of a project;
> * get the full content of a single commit;
> * get the "raw" diff (paths that changed) for two commits;

I would disagree with tying libification and Perl binding this
way. If the goal is to get faster gitweb, then that does not
necessarily have to be libified git. Let one person who does
the libification come up with a decent C binding and let others
worry about Perl bindings.

> In some cases much of the above is already "internally public";
> meaning we already treat parts of that code as a library and invoke
> them from within processes to get work done. Much of this project
> is about improving the interfaces and behavior enough to make those
> existing APIs truly public.

One big thing you forgot to mention is that whatever form it
takes, the libification should not impact performance of
existing plumbing. These interfaces are "internally" public
exactly because the callers still honor underlying convention
such as not having to clean-up the object flags for the last
invocation. If you libify in a wrong way, you would end up with
an implementation of the interface that always cleans up (because
you would not know if you are part of a long-living process so
you will clean-up just in case you will still be called later),
which would be unusable from the plumbing point-of-view.

^ permalink raw reply [flat|nested] 62+ messages in thread
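The "public git_ symbol wrapping a hidden internal name" idea above can be sketched with the GCC/Clang visibility attribute. This is only one possible mechanism (a linker version script or building with -fvisibility=hidden are alternatives), and both function names below are hypothetical stand-ins: example_internal_object_info() is not Git's real sha1_object_info(), and its body is a dummy.

```c
#include <assert.h>
#include <stddef.h>

/* Symbol visibility control; a no-op on compilers without the attribute. */
#if defined(__GNUC__)
#define EXAMPLE_HIDDEN __attribute__((visibility("hidden")))
#define EXAMPLE_EXPORT __attribute__((visibility("default")))
#else
#define EXAMPLE_HIDDEN
#define EXAMPLE_EXPORT
#endif

/*
 * Stand-in for an "internally public" function. Marked hidden, it stays
 * callable from every object file in the library but is not exported
 * from the resulting shared object. Internal callers pass valid input.
 */
EXAMPLE_HIDDEN int example_internal_object_info(const unsigned char *sha1,
						unsigned long *sizep)
{
	assert(sha1 != NULL);
	if (sizep)
		*sizep = sha1[0];	/* dummy "size" for illustration only */
	return 0;
}

/*
 * Hypothetical exported entry point: the only name library users see.
 * It checks its input, then forwards to the unrenamed internal function.
 */
EXAMPLE_EXPORT int git_example_object_info(const unsigned char *sha1,
					   unsigned long *sizep)
{
	if (!sha1)
		return -1;
	return example_internal_object_info(sha1, sizep);
}
```

The internal name keeps its short spelling for plumbing code, while the exported surface stays small and prefixed.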
* Re: Libification project (SoC)
2007-03-16 5:30 ` Junio C Hamano
@ 2007-03-16 6:00 ` Shawn O. Pearce
2007-03-16 6:54 ` Junio C Hamano
2007-03-16 12:53 ` Petr Baudis
2007-03-16 13:47 ` Luiz Fernando N. Capitulino
2 siblings, 1 reply; 62+ messages in thread
From: Shawn O. Pearce @ 2007-03-16 6:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Luiz Fernando N. Capitulino, git

Junio C Hamano <junkio@cox.net> wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
> > On the other hand, many of the variables declared in environment.c
> > are repository specific configuration variables. These probably
> > should be abstracted into some sort of wrapper, so that multiple
> > repositories can be accessed from within the same process. Why?
> > a future mod_perl running gitweb.cgi accessing repositories through
> > libgit.a and Perl bindings of course!
>
> I think if you are abstracting them out, into "struct repo_state",
> the index and object store related variables such as packed_git
> should go there as well, so your recommendation feels very
> inconsistent to me.

I missed packed_git, but you are right, that should definitely go
with a struct repo_state. And maybe you are right that the index
should go with it... but I'm not sure the index should be tied to the
repository at all. It's strictly convention that the index goes with
the repository; GIT_INDEX_FILE lets you say otherwise at the command
line level, why can't we do otherwise from a library level too?

> >> o Add prefix (eg, git_*) to public API functions
> >
> > Yes. But which functions shall we expose? ;-)
>
> Before going into that topic, a bigger question is if we are
> happy with the current internal API and what the goal of
> libification is. If the libification is going to say that "this
> is a published API so we are not going to change it", I would
> imagine that it would be very hard to accept in the mainline.

I'm looking at a middle ground between our current "moving target"
internal API and our "frozen" plumbing process based API. There are
a number of places where just being able to get data *out* of Git
easily would be useful, but doing so right now is awkward. Either
you code against our "moving target" internal API by creating a
new builtin (e.g. my builtin-statplog) where it's easy to get what
you want, or you code against the plumbing based tools, where it's
sometimes not so easy...

Most of the data formats aren't changing; a commit is a commit is
a commit. It has a tree, parents, author, committer, message.

> Improvements like the earlier sliding mmap() series need to be
> able to change the interfaces without backward compatibility
> wart.

I agree. But I also think the use_mmap() API is just way too low
level for a public library. That particular change was pretty low
level. Think higher, like "struct commit". That is actually too
low still, as it doesn't really help you with the author and
committer.

> In other words, I do not know what idiot ^W ^W who listed the
> libification stuff on the SoC "ideas" page,

I'm the idiot ^W individual responsible. ;-)

> I would disagree with tying libification and Perl binding this
> way. If the goal is to get faster gitweb, then that does not
> necessarily have to be libified git. Let one person who does
> the libification come up with a decent C binding and let others
> worry about Perl bindings.

Yes. However Perl bindings are often asked for. And Marco Costalba
might like a working libgit that he could use for revision fetching
in qgit. I think that if patches for a library started to appear,
another interested party would start to at least play with them.

> One big thing you forgot to mention is that whatever form it
> takes, the libification should not impact performance of
> existing plumbing. These interfaces are "internally" public
> exactly because the callers still honor underlying convention
> such as not having to clean-up the object flags for the last
> invocation. If you libify in a wrong way, you would end up with
> an implementation of the interface that always cleans up (because
> you would not know if you are part of a long-living process so
> you will clean-up just in case you will still be called later),
> which would be unusable from the plumbing point-of-view.

I didn't forget; I just simply did not mention it. I was considering
writing something to that effect, and probably should have.

This is a really valid point. Git is insanely fast, partly because
we have a lot of "run once" types of applications and we have
optimized for those. Any sort of "run many times" reuse needs to
not make the "run once" guy pay for something he will not use.

A good example of this is in git-describe, where we use the object
flags, and only bother to clear them out if there is another commit
remaining to be described.

--
Shawn.

^ permalink raw reply [flat|nested] 62+ messages in thread
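The "only clean up if there is more work" convention described above can be sketched as follows. This is not the git-describe code; struct example_object, EXAMPLE_SEEN and the functions are hypothetical, and the "walk" is reduced to marking every object.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_SEEN 1u		/* hypothetical per-object flag bit */

struct example_object {
	unsigned flags;
};

/* Stand-in for a revision walk: marks everything it visits. */
static void example_walk(struct example_object *objs, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		objs[i].flags |= EXAMPLE_SEEN;
}

/* The cleanup pass that a "run once" caller should never pay for. */
static void example_clear_flags(struct example_object *objs, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		objs[i].flags &= ~EXAMPLE_SEEN;
}

/*
 * Handle nr_requests walks over the same object table. Flags are
 * cleared only when another request is still pending, so the last
 * (or only) request skips the cleanup entirely.
 * Returns the number of cleanup passes performed.
 */
size_t example_describe_all(struct example_object *objs, size_t nr_objs,
			    size_t nr_requests)
{
	size_t i, cleanups = 0;

	assert(objs != NULL || nr_objs == 0);
	for (i = 0; i < nr_requests; i++) {
		example_walk(objs, nr_objs);
		if (i + 1 < nr_requests) {
			example_clear_flags(objs, nr_objs);
			cleanups++;
		}
	}
	return cleanups;
}
```

A library interface could keep this property by making the cleanup an explicit call that long-lived callers invoke, rather than doing it unconditionally inside every entry point.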
* Re: Libification project (SoC)
2007-03-16 6:00 ` Shawn O. Pearce
@ 2007-03-16 6:54 ` Junio C Hamano
2007-03-16 11:54 ` Johannes Schindelin
0 siblings, 1 reply; 62+ messages in thread
From: Junio C Hamano @ 2007-03-16 6:54 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Luiz Fernando N. Capitulino, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>> > On the other hand, many of the variables declared in environment.c
>> > are repository specific configuration variables. These probably
>> > should be abstracted into some sort of wrapper, so that multiple
>> > repositories can be accessed from within the same process. Why?
>> > a future mod_perl running gitweb.cgi accessing repositories through
>> > libgit.a and Perl bindings of course!
>>
>> I think if you are abstracting them out, into "struct repo_state",
>> the index and object store related variables such as packed_git
>> should go there as well, so your recommendation feels very
>> inconsistent to me.
>
> I missed packed_git, but you are right, that should definitely go
> with a struct repo_state. And maybe you are right that the index
> should go with it... but I'm not sure the index should be tied to the
> repository at all. It's strictly convention that the index goes with
> the repository; GIT_INDEX_FILE lets you say otherwise at the command
> line level, why can't we do otherwise from a library level too?

Even within a plumbing, being able to shuffle multiple indices
at once would be very useful. For example, if I were to rewrite
unpack-trees, I would most likely read from the current index
and trees and populate a new index from emptiness by appending
to it, thereby avoiding the binary-search and insert costs.

I've thought about the layering when Smurf first brought up the
libification (which was a loooong time ago), and concluded that a
three-layered approach would be most useful.

The bottom layer is object store across repositories. If we
ignore SHA-1 collisions as an issue (and we _will_ ignore it for
foreseeable future), unless you are doing "read from one
repository and write that to another repository", it is more
handy to be able to name an object and get its data without
knowing which repository's object store it comes from, and it
would make "git log master~A..master~B" across repositories
(i.e. 'master' of repository A and 'master' of repository B)
possible. An example interface would be like:

  (current)
    void *read_sha1_file(const unsigned char *sha1,
                         enum object_type *type,
                         unsigned long *size);

  (libified)
    void *git_read_sha1_file(struct gitlib *,
                             const unsigned char *sha1,
                             enum object_type *type,
                             unsigned long *size);

where "struct gitlib" has a list of "struct object_store", and
we will have:

    int git_add_object_store(struct gitlib *, const char *path);

to add one directory as object store the toplevel gitlib structure
knows about. In a sense, "struct gitlib" and object store are so
global that we might not even need to have it as a parameter
(iow, it and "struct object **obj_hash" from object.c can stay
global).

The middle layer is repositories, primarily their refs and
reflogs. An example interface would be like:

  (current)
    int get_sha1(const char *name, unsigned char *sha1);

  (libified)
    int git_get_sha1(struct git_repo *, const char *name, unsigned char *sha1);

where "struct git_repo" is one repository (and it would have a
pointer to "struct gitlib *" so that we can follow objects to
follow parents and stuff).

And the top layer would have indices, and working trees as
per-invocation parameter.

  (current)
    int cache_name_pos(const char *name, int namelen);
    int unpack_trees(struct object_list *trees, struct unpack_trees_options *o);

  (libified)
    int git_cache_name_pos(struct git_cache *, const char *name, int namelen);
    int git_unpack_trees(struct object_list *trees, struct git_unpack_trees_options *o);

where "struct git_cache" has "index" thingies, such as
active_cache, active_nr, active_alloc, and active_cache_tree.
And we would have pointer to "struct git_cache *" in unpack_trees_options
structure.

^ permalink raw reply [flat|nested] 62+ messages in thread
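A rough sketch of how the three layers proposed above might hang together as data structures. The struct and function names gitlib, object_store, git_repo, git_cache and git_add_object_store() come from the message; every field, the growth strategy, and example_gitlib_release() are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bottom layer: one object directory (fields are assumptions). */
struct object_store {
	char *path;
};

/* Bottom layer: the set of object stores shared across repositories. */
struct gitlib {
	struct object_store *stores;
	size_t nr, alloc;
};

/* Middle layer: one repository (refs, reflogs), pointing at the library. */
struct git_repo {
	struct gitlib *lib;
	const char *git_dir;	/* assumed field */
};

/* Top layer: index state, mirroring the globals named in the message. */
struct cache_entry;	/* opaque in this sketch */
struct cache_tree;	/* opaque in this sketch */
struct git_cache {
	struct cache_entry **active_cache;
	unsigned int active_nr, active_alloc;
	struct cache_tree *active_cache_tree;
};

/* Register one directory as an object store; 0 on success, -1 on error. */
int git_add_object_store(struct gitlib *lib, const char *path)
{
	char *copy;
	size_t len;

	if (!lib || !path)
		return -1;
	if (lib->nr == lib->alloc) {
		size_t alloc = lib->alloc ? 2 * lib->alloc : 4;
		struct object_store *grown;

		grown = realloc(lib->stores, alloc * sizeof(*grown));
		if (!grown)
			return -1;	/* report failure instead of die() */
		lib->stores = grown;
		lib->alloc = alloc;
	}
	len = strlen(path) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, path, len);
	lib->stores[lib->nr++].path = copy;
	return 0;
}

/* Hypothetical cleanup helper, not named in the thread. */
void example_gitlib_release(struct gitlib *lib)
{
	size_t i;

	assert(lib != NULL);
	for (i = 0; i < lib->nr; i++)
		free(lib->stores[i].path);
	free(lib->stores);
	lib->stores = NULL;
	lib->nr = lib->alloc = 0;
}
```

With this shape, two struct git_repo values can share one struct gitlib, which is what would let an object be looked up without knowing which repository's store holds it.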
* Re: Libification project (SoC)
2007-03-16 6:54 ` Junio C Hamano
@ 2007-03-16 11:54 ` Johannes Schindelin
2007-03-16 13:09 ` Rocco Rutte
0 siblings, 1 reply; 62+ messages in thread
From: Johannes Schindelin @ 2007-03-16 11:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Shawn O. Pearce, Luiz Fernando N. Capitulino, git

Hi,

On Thu, 15 Mar 2007, Junio C Hamano wrote:

> "Shawn O. Pearce" <spearce@spearce.org> writes:
>
> > Junio C Hamano <junkio@cox.net> wrote:
> >> "Shawn O. Pearce" <spearce@spearce.org> writes:
> >> > On the other hand, many of the variables declared in environment.c
> >> > are repository specific configuration variables. These probably
> >> > should be abstracted into some sort of wrapper, so that multiple
> >> > repositories can be accessed from within the same process. Why?
> >> > a future mod_perl running gitweb.cgi accessing repositories through
> >> > libgit.a and Perl bindings of course!
> >>
> >> I think if you are abstracting them out, into "struct repo_state",
> >> the index and object store related variables such as packed_git
> >> should go there as well, so your recommendation feels very
> >> inconsistent to me.
> >
> > I missed packed_git, but you are right, that should definitely go
> > with a struct repo_state. And maybe you are right that the index
> > should go with it... but I'm not sure the index should be tied to the
> > repository at all. It's strictly convention that the index goes with
> > the repository; GIT_INDEX_FILE lets you say otherwise at the command
> > line level, why can't we do otherwise from a library level too?
>
> Even within a plumbing, being able to shuffle multiple indices
> at once would be very useful. For example, if I were to rewrite
> unpack-trees, I would most likely read from the current index
> and trees and populate a new index from emptiness by appending
> to it, thereby avoiding the binary-search and insert costs.
>
> I've thought about the layering when Smurf first brought up the
> libification (which was a loooong time ago), and concluded that a
> three-layered approach would be most useful.
>
> The bottom layer is object store across repositories. If we
> ignore SHA-1 collisions as an issue (and we _will_ ignore it for
> foreseeable future), unless you are doing "read from one
> repository and write that to another repository", it is more
> handy to be able to name an object and get its data without
> knowing which repository's object store it comes from, and it
> would make "git log master~A..master~B" across repositories
> (i.e. 'master' of repository A and 'master' of repository B)
> possible. An example interface would be like:
>
>   (current)
>     void *read_sha1_file(const unsigned char *sha1,
>                          enum object_type *type,
>                          unsigned long *size);
>
>   (libified)
>     void *git_read_sha1_file(struct gitlib *,
>                              const unsigned char *sha1,
>                              enum object_type *type,
>                              unsigned long *size);
>
> where "struct gitlib" has a list of "struct object_store", and
> we will have:
>
>     int git_add_object_store(struct gitlib *, const char *path);
>
> to add one directory as object store the toplevel gitlib structure
> knows about. In a sense, "struct gitlib" and object store are so
> global that we might not even need to have it as a parameter
> (iow, it and "struct object **obj_hash" from object.c can stay
> global).
>
> The middle layer is repositories, primarily their refs and
> reflogs. An example interface would be like:
>
>   (current)
>     int get_sha1(const char *name, unsigned char *sha1);
>
>   (libified)
>     int git_get_sha1(struct git_repo *, const char *name, unsigned char *sha1);
>
> where "struct git_repo" is one repository (and it would have a
> pointer to "struct gitlib *" so that we can follow objects to
> follow parents and stuff).
>
> And the top layer would have indices, and working trees as
> per-invocation parameter.
>
>   (current)
>     int cache_name_pos(const char *name, int namelen);
>     int unpack_trees(struct object_list *trees, struct unpack_trees_options *o);
>
>   (libified)
>     int git_cache_name_pos(struct git_cache *, const char *name, int namelen);
>     int git_unpack_trees(struct object_list *trees, struct git_unpack_trees_options *o);
>
> where "struct git_cache" has "index" thingies, such as
> active_cache, active_nr, active_alloc, and active_cache_tree.
> And we would have pointer to "struct git_cache *" in unpack_trees_options
> structure.

Isn't this an awfully long shot?

I'd be happy if the libification project resulted

- in a (static!) libgit.a which can be linked to qgit or similar (being
  reentrant, or at least optionally so, and not die()ing all the time),
  and

- which does not fix the API yet (at least for the most part).

We _can_ -- once we agree on a stable API -- expose _some_ functions in a
libgit.so, but that does not have to be the goal for the first step!

Ciao,
Dscho

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC)
2007-03-16 11:54 ` Johannes Schindelin
@ 2007-03-16 13:09 ` Rocco Rutte
2007-03-16 15:12 ` Johannes Schindelin
0 siblings, 1 reply; 62+ messages in thread
From: Rocco Rutte @ 2007-03-16 13:09 UTC (permalink / raw)
To: git

Hi,

* Johannes Schindelin [07-03-16 12:54:52 +0100] wrote:

[...]

>Isn't this an awfully long shot?

>I'd be happy if the libification project resulted

>- in a (static!) libgit.a which can be linked to qgit or similar (being
>  reentrant, or at least optionally so, and not die()ing all the time),
>  and

>- which does not fix the API yet (at least for the most part).

>We _can_ -- once we agree on a stable API -- expose _some_ functions in a
>libgit.so, but that does not have to be the goal for the first step!

First, I think that would be some cleanup "only" since that basically
would mean to

1) make all functions die()ing return some value and handle it and
2) wrap all static vars into structures and pass them around

If you don't choose a design before wrapping things up in structures,
you'll probably end up having one structure per source file (at least
too many structures).

Porting things like qgit to it or writing proper perl/python bindings
is wasted time since you'd have to rewrite all of it once you decided
which functions to expose and which structures to use (calling the
main() routines of builtin's doesn't count as real libification, it
would rather be a performance improvement only).

I'd simply try to find a rough consensus on the data structures and
the layer model before starting the project, solve 1), afterwards
implement 2) according to it. While 2) happens it would make sense to
try to develop perl, python, C and C++ bindings in parallel to find
out early enough whether the design details chosen are useful for real
consumers outside the git-* tools.

You could put big fat warnings everywhere that parts of the API which
are exposed are heavily unstable and likely subject to change and that
programmers using them will have to frequently start over.

Once it turns out that all the git-tools and all "reference consumers"
work with it, you can do some cleanup to get to the final first API
version after the libification project is done.

bye, Rocco
--
:wq!

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC)
2007-03-16 13:09 ` Rocco Rutte
@ 2007-03-16 15:12 ` Johannes Schindelin
2007-03-16 15:55 ` Nicolas Pitre
` (2 more replies)
0 siblings, 3 replies; 62+ messages in thread
From: Johannes Schindelin @ 2007-03-16 15:12 UTC (permalink / raw)
To: Rocco Rutte; +Cc: git

Hi,

[please do not cull the Cc: list]

On Fri, 16 Mar 2007, Rocco Rutte wrote:

> First, I think that would be some cleanup "only" since that basically would
> mean to
>
> 1) make all functions die()ing return some value and handle it and
> 2) wrap all static vars into structures and pass them around
>
> If you don't choose a design before wrapping things up in structures, you'll
> probably end up having one structure per source file (at least too many
> structures).

Why? For some tasks, it should be 1) easier, 2) more elegant, and 3)
faster to write a function which re-initialises the static variables.

Of course, if you want to work with multiple repos _at the same time_,
this does not help you. But frankly, we don't support that with core-git,
so why should we in libgit?

> Porting things like qgit to it or writing proper perl/python bindings
> is wasted time since you'd have to rewrite all of it once you decided
> which functions to expose and which structures to use (calling the
> main() routines of builtin's doesn't count as real libification, it would
> rather be a performance improvement only).

Nope. It is _not_ a complete rewrite. More likely, it is minimal
adjustments. It's not like we will replace apples with cars...

> I'd simply try to find a rough consensus on the data structures and the
> layer model before starting the project, solve 1), afterwards implement
> 2) according to it.

We already _have_ the data structures!

Also, in my experience, defining a complete API and only then
implementing it never works.

Rather, start with a _small_ part you want to do. Define a clean API _just
for that part_. Implement it. Verify that it indeed does what it should do
(and that means not just _you_ should verify it, but it should be stress
tested on the list).

We don't have to create the whole world in one day, you know?

Ciao,
Dscho

^ permalink raw reply [flat|nested] 62+ messages in thread
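The "function which re-initialises the static variables" alternative mentioned above could look roughly like this. All names are hypothetical; the two file-scope variables merely stand in for the kind of per-repository statics under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope state, standing in for per-repository static variables. */
static const char *example_git_dir;
static unsigned long example_objects_seen;

/*
 * Reset every static in this module so the process can move on to
 * another repository. This supports one repository at a time, in
 * sequence; it does not allow two repositories to be open at once.
 */
void example_reinit(const char *git_dir)
{
	example_git_dir = git_dir;
	example_objects_seen = 0;
}

/* Some operation that accumulates state in the statics. */
void example_note_object(void)
{
	assert(example_git_dir != NULL);	/* example_reinit() comes first */
	example_objects_seen++;
}

unsigned long example_objects_seen_count(void)
{
	return example_objects_seen;
}

const char *example_current_git_dir(void)
{
	return example_git_dir;
}
```

The trade-off is the one stated in the message: existing callers keep their signatures, at the cost of not being able to use several repositories simultaneously.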
* Re: Libification project (SoC)
2007-03-16 15:12 ` Johannes Schindelin
@ 2007-03-16 15:55 ` Nicolas Pitre
2007-03-16 16:13 ` Johannes Schindelin
2007-03-16 16:17 ` Shawn O. Pearce
2007-03-16 18:20 ` Marco Costalba
2007-03-18 14:08 ` Petr Baudis
2 siblings, 2 replies; 62+ messages in thread
From: Nicolas Pitre @ 2007-03-16 15:55 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Rocco Rutte, git

On Fri, 16 Mar 2007, Johannes Schindelin wrote:

> We already _have_ the data structures!

Well... Shawn and I are contemplating alternate data structures to
improve things dramatically. With a fixed public API I doubt such
improvements could be as effective.

One thing that was really done right in the Linux kernel is to _not_
have any sort of fixed API at all for drivers. This is a big upside for
progress. Yet the Linux kernel is regarded as highly useful.

So... if any API is to be developed, I'd argue that it must be done
_above_ the existing code with a higher level of abstraction and a much
narrower scope.

Nicolas

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 15:55 ` Nicolas Pitre @ 2007-03-16 16:13 ` Johannes Schindelin 2007-03-16 16:26 ` Nicolas Pitre 2007-03-16 16:17 ` Shawn O. Pearce 1 sibling, 1 reply; 62+ messages in thread From: Johannes Schindelin @ 2007-03-16 16:13 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Rocco Rutte, git Hi, On Fri, 16 Mar 2007, Nicolas Pitre wrote: > On Fri, 16 Mar 2007, Johannes Schindelin wrote: > > > We already _have_ the data structures! > > Well... Shawn and I are contemplating alternate data structures to > improve things dramatically. I was alluding to rev_info, not pack_window and friends. > With a fixed public API I doubt such improvements could be as effective. Just think of the "API" we have for porcelains. It is literally unchanged since the beginning. You can even use the original script git-log.sh today! _That_ is what I mean by fixed public API: give certain guarantees about what will not go away. > One thing that was really done right in the Linux kernel is to _not_ > have any sort of fixed API at all for drivers. This is a big upside for > progress. Yet the Linux kernel is regarded as highly useful. Yes. I am a Linux user myself. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 16:13 ` Johannes Schindelin @ 2007-03-16 16:26 ` Nicolas Pitre 2007-03-16 18:22 ` Steve Frécinaux 2007-03-16 23:26 ` Johannes Schindelin 0 siblings, 2 replies; 62+ messages in thread From: Nicolas Pitre @ 2007-03-16 16:26 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Rocco Rutte, git On Fri, 16 Mar 2007, Johannes Schindelin wrote: > Hi, > > On Fri, 16 Mar 2007, Nicolas Pitre wrote: > > > On Fri, 16 Mar 2007, Johannes Schindelin wrote: > > > > > We already _have_ the data structures! > > > > Well... Shawn and I are contemplating alternate data structures to > > improve things dramatically. > > I was alluding to rev_info, not pack_window and friends. > > > With a fixed public API I doubt such improvements could be as effective. > > Just think of the "API" we have for porcelains. It is literally unchanged > since the beginning. You can even use the original script git-log.sh > today! _That_ is what I mean by fixed public API: give certain guarantees > about what will not go away. Sure. But the output from an executable is a damn good abstraction and the executable itself is an impenetrable boundary. Anything can change (and did change) underneath. This is why a public API must be done at a higher level to allow for anything to change at the lower level as we wish. Nicolas ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 16:26 ` Nicolas Pitre @ 2007-03-16 18:22 ` Steve Frécinaux 2007-03-16 18:53 ` Nicolas Pitre 2007-03-16 23:26 ` Johannes Schindelin 1 sibling, 1 reply; 62+ messages in thread From: Steve Frécinaux @ 2007-03-16 18:22 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Johannes Schindelin, Rocco Rutte, git On Fri, 2007-03-16 at 12:26 -0400, Nicolas Pitre wrote: > Sure. But the output from an executable is a damn good abstraction and > the executable itself is an impenetrable boundary. Anything can change > (and did change) underneath. Strictly speaking, you can use opaque structures for commits and so on (so that the outside world will only ever see a pointer), and use some getter/setters for commonly used stuffs (like datum, title, content). Also, I guess what people would expect from a C library is roughly the same as for the current plumbing... just easier to use from another program. It doesn't need a low-level access to data structure (most applications would be to interact with an existing repo or to store data for a third-party software, something that is high-level) and I don't think such an opaque API would be a huge constraint as soon as you keep the Object/Index/Tree/Commit/etc basic opaque structs. ^ permalink raw reply [flat|nested] 62+ messages in thread
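The opaque-structure idea above can be sketched in C. This is only an illustration under stated assumptions: every name here (demo_commit, demo_commit_title, and so on) is hypothetical and is not part of Git or of any proposed libgit API. The caller only ever holds a pointer and reads fields through getters, so the private layout can change without breaking callers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What a public header might expose (all names hypothetical). */
struct demo_commit;                      /* opaque: layout is private */
struct demo_commit *demo_commit_new(const char *title, long date);
const char *demo_commit_title(const struct demo_commit *c);
long demo_commit_date(const struct demo_commit *c);
void demo_commit_free(struct demo_commit *c);

/* Private implementation, free to change between releases. */
struct demo_commit {
    char *title;
    long date;
};

struct demo_commit *demo_commit_new(const char *title, long date)
{
    struct demo_commit *c;
    size_t n;

    if (!title)
        return NULL;                     /* report bad input, do not die */
    c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    n = strlen(title) + 1;
    c->title = malloc(n);
    if (!c->title) {
        free(c);
        return NULL;
    }
    memcpy(c->title, title, n);
    c->date = date;
    return c;
}

const char *demo_commit_title(const struct demo_commit *c)
{
    assert(c != NULL);
    return c->title;
}

long demo_commit_date(const struct demo_commit *c)
{
    assert(c != NULL);
    return c->date;
}

void demo_commit_free(struct demo_commit *c)
{
    if (!c)
        return;
    free(c->title);
    free(c);
}
```

In a real library the struct definition would live in a private source file, so callers could not depend on its fields even by accident.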
* Re: Libification project (SoC) 2007-03-16 18:22 ` Steve Frécinaux @ 2007-03-16 18:53 ` Nicolas Pitre 2007-03-18 13:57 ` Petr Baudis 0 siblings, 1 reply; 62+ messages in thread From: Nicolas Pitre @ 2007-03-16 18:53 UTC (permalink / raw) To: Steve Frécinaux; +Cc: Johannes Schindelin, Rocco Rutte, git [-- Attachment #1: Type: TEXT/PLAIN, Size: 1484 bytes --] On Fri, 16 Mar 2007, Steve Frécinaux wrote: > Also, I guess what people would expect from a C library is roughly the > same as for the current plumbing... just easier to use from another > program. It doesn't need a low-level access to data structure (most > applications would be to interact with an existing repo or to store data > for a third-party software, something that is high-level) and I don't > think such an opaque API would be a huge constraint as soon as you keep > the Object/Index/Tree/Commit/etc basic opaque structs. Right. I like that idea. A good way to define the lib API needs then might be expressed as follows: Each existing plumbing commands must be turned into the minimal implementation required to interact with the libgit public API and display results. In other words, the public libgit API should provide the same functionality as existing plumbing commands such that those existing commands will only need the necessary code to bridge the C interface with the existing command line interface. Then, of course, there is the matter of reentrancy. But that's still a minor API detail even if it is not a trivial issue implementation wise. But the API must be right as this is what we'll be stuck with even if the implementation may change. And as far as an API definition is needed I think that it should reflect the current plumbing which is actually the real API that grew naturally and has been proven useful. Nicolas ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 18:53 ` Nicolas Pitre @ 2007-03-18 13:57 ` Petr Baudis 0 siblings, 0 replies; 62+ messages in thread From: Petr Baudis @ 2007-03-18 13:57 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Steve Frécinaux, Johannes Schindelin, Rocco Rutte, git On Fri, Mar 16, 2007 at 07:53:06PM CET, Nicolas Pitre wrote: > A good way to define the lib API needs then might be expressed as > follows: > > Each existing plumbing commands must be turned into the minimal > implementation required to interact with the libgit public API and > display results. > > In other words, the public libgit API should provide the same > functionality as existing plumbing commands such that those existing > commands will only need the necessary code to bridge the C interface > with the existing command line interface. I think this is good definition if interpreted well - that is, git-log library equivalent shouldn't spew out textual output but provide interface to retrieve revision information in easy-to-use format. > Then, of course, there is the matter of reentrancy. But that's still a > minor API detail even if it is not a trivial issue implementation wise. > But the API must be right as this is what we'll be stuck with even if > the implementation may change. And as far as an API definition is > needed I think that it should reflect the current plumbing which is > actually the real API that grew naturally and has been proven useful. Well what you said about reentrancy is that "it's minor API detail but even minor API details must be right because we will be stuck with them". And I don't think it's minor at all either. :-) Also, even if the implementation won't be completely re-entrant initially, the question of re-entrancy is something we should decide since it still affects the scope of the librarification work. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. 
-- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
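The point about a git-log equivalent handing back revision data rather than text could look roughly like the sketch below. It is an assumption-laden illustration, not an existing interface: demo_rev, demo_walker and both functions are hypothetical names, and the "history" is just an in-memory array standing in for a real revision walk.

```c
#include <assert.h>
#include <stddef.h>

/* One revision, handed to the caller as data instead of formatted text. */
struct demo_rev {
    const char *id;
    const char *subject;
    long date;
};

/* Walker state is held by the caller, not in static variables. */
struct demo_walker {
    const struct demo_rev *revs;
    size_t count;
    size_t pos;
};

void demo_walker_init(struct demo_walker *w,
                      const struct demo_rev *revs, size_t count)
{
    assert(w != NULL);
    w->revs = revs;
    w->count = revs ? count : 0;
    w->pos = 0;
}

/* Returns the next revision, or NULL once the walk is finished. */
const struct demo_rev *demo_walker_next(struct demo_walker *w)
{
    assert(w != NULL);
    if (w->pos >= w->count)
        return NULL;
    return &w->revs[w->pos++];
}
```

A command-line front end would then only loop over demo_walker_next() and print each field, which is the "minimal bridge" role described above.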
* Re: Libification project (SoC) 2007-03-16 16:26 ` Nicolas Pitre 2007-03-16 18:22 ` Steve Frécinaux @ 2007-03-16 23:26 ` Johannes Schindelin 1 sibling, 0 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-16 23:26 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Rocco Rutte, git Hi, On Fri, 16 Mar 2007, Nicolas Pitre wrote: > [...] the output from an executable is a damn good abstraction and the > executable itself is an impenetrable boundary. Anything can change (and > did change) underneath. > > This is why a public API must be done at a higher level to allow for > anything to change at the lower level as we wish. Absolutely. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 15:55 ` Nicolas Pitre 2007-03-16 16:13 ` Johannes Schindelin @ 2007-03-16 16:17 ` Shawn O. Pearce 1 sibling, 0 replies; 62+ messages in thread From: Shawn O. Pearce @ 2007-03-16 16:17 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Johannes Schindelin, Rocco Rutte, git Nicolas Pitre <nico@cam.org> wrote: > On Fri, 16 Mar 2007, Johannes Schindelin wrote: > > > We already _have_ the data structures! > > Well... Shawn and I are contemplating alternate data structures to > improve things dramatically. Hang on. Yes, Nico and I are contemplating alternate disk based data structure, and in some cases, alternate memory based data structures to improve things. But these structures are not changing the basic Git data structures that have been with us since way back when. ;-) Commits still have the same fields, with the same data and the same meaning. Trees still have the same fields, and same meaning... etc. > With a fixed public API I doubt such improvements could be as effective. They still can be, and without shooting ourselves in the foot in the process. > So... if any API is to be developed, I'd argue that it must be done > _above_ the existing code with a higher level of abstraction and a much > narrower scope. Yes. Today we have a frozen API for commit walking. Its called `git rev-list --pretty=raw A ^B`. That output format is pretty well set in stone, and we cannot change it. Everyone knows what each field means, and hopefully knows that additional fields can be added. ;-) Instead of formatting out those fields as hex strings, or as decimal integer dates, we can offer them in a struct. 
E.g.:

    struct git_objid {
        const unsigned char *obj_name;
    };

    struct git_commit {
        struct git_objid tree;
        struct git_objid *parents;
        uint32_t nr_parent;
        const char *author;
        time_t author_date;
        int author_tz;
        const char *committer;
        time_t committer_date;
        int committer_tz;
        const char *message;
    };

With the rule that the pointers are to static memory buffers that libgit is loaning out to the caller (the caller should *not* free these buffers). This lets us play cute tricks down in the lower tiers by pointing directly into the packfile dictionary tables (saves memcpys); or xstrdup/xmalloc everything we give out if we want to be really paranoid.

Just tossing ideas out - don't think that what I wrote above is my final suggestion on the matter. It may change in another day or two if I think about it more. ;-) -- Shawn. ^ permalink raw reply [flat|nested] 62+ messages in thread
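The "loaned buffer" rule above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the record format ("author", NUL, "message", NUL) is invented and is not Git's commit format, and demo_commit_view, demo_parse_commit and demo_own_string are hypothetical names. The view's pointers aim straight into the library's buffer (no copies), and a caller who needs to keep a string takes an explicit copy that it then owns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Borrowed view: pointers aim into a buffer owned by the library. */
struct demo_commit_view {
    const char *author;
    const char *message;
};

/*
 * Parse an invented record "author\0message\0" of length len.
 * Fills *out with pointers into raw; nothing is copied.
 * Returns 0 on success, -1 if the record is malformed.
 */
int demo_parse_commit(const char *raw, size_t len,
                      struct demo_commit_view *out)
{
    const char *end_author;
    size_t rest;

    assert(out != NULL);
    if (!raw)
        return -1;
    end_author = memchr(raw, '\0', len);
    if (!end_author)
        return -1;
    rest = len - (size_t)(end_author - raw) - 1;
    if (!memchr(end_author + 1, '\0', rest))
        return -1;
    out->author = raw;
    out->message = end_author + 1;
    return 0;
}

/* For callers that must keep a string: an owned copy they free themselves. */
char *demo_own_string(const char *borrowed)
{
    size_t n;
    char *copy;

    if (!borrowed)
        return NULL;
    n = strlen(borrowed) + 1;
    copy = malloc(n);
    if (copy)
        memcpy(copy, borrowed, n);
    return copy;
}
```

The trade-off is the one named above: borrowed pointers avoid memcpys but are only valid while the library keeps the underlying buffer alive.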
* Re: Libification project (SoC) 2007-03-16 15:12 ` Johannes Schindelin 2007-03-16 15:55 ` Nicolas Pitre @ 2007-03-16 18:20 ` Marco Costalba 2007-03-16 18:38 ` Marco Costalba 2007-03-18 14:08 ` Petr Baudis 2 siblings, 1 reply; 62+ messages in thread From: Marco Costalba @ 2007-03-16 18:20 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Rocco Rutte, git On 3/16/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > Porting things like qgit to it or writting proper perl/python bindings > > is wasted time since you'd have to rewrite all of it once you decided > > which functions to expose and which structures to use (calling the > > main() routines of builtin's doesn't count as real libifaction, it would > > rather be a performance improvement only). > > Nope. It is _not_ a complete rewrite. More likely, it is minimal > adjustments. It's not like we will replace apples with cars... > IMHO probably the truth is in the middle. I wouldn't call it a trivial porting, at least for me, but anyway it would be interesting to have fun with linking libgit. *The most important thing for a libgit to be used by qgit is reentrancy* Currently an unlimited number of tabs could be open in qgit, I'm not talking about tabs open on different repos, but different views on the same repo: main view, file history of file A, file history of file B, tree view, i.e. select some files/directory from directory tree and view the revisions that modified that repo subset, and so on. Other different views could be added in the future. Because each view has a dedicated tab and each tab calls _his_ 'git rev-list' instance (could be called also at the same time) this libgit thing should be able to support many instance of the libified git-rev-list function running at the same time. 
Perhaps currently this need exists only for qgit among the GUI browsers, but it is not too difficult to foresee a multi-view GUI interface becoming a relatively common feature in the future for the remaining crop of git tools as well. Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 18:20 ` Marco Costalba @ 2007-03-16 18:38 ` Marco Costalba 2007-03-16 18:59 ` Nicolas Pitre 2007-03-16 19:09 ` Andy Parkins 0 siblings, 2 replies; 62+ messages in thread From: Marco Costalba @ 2007-03-16 18:38 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Rocco Rutte, git On 3/16/07, Marco Costalba <mcostalba@gmail.com> wrote: > > *The most important thing for a libgit to be used by qgit is reentrancy* > Another critical feature is that this call to a git-rev-list-like function MUST be non-blocking. Reading a big repo could take many seconds, even more than 10 seconds in the cold cache case for the Linux tree, as an example. Getting the history of a file ('git rev-list -- /path/to/file') is also very slow. There is no way that a GUI tool is allowed to *freeze* for that amount of time. Currently, because an external process is forked when running 'git rev-list', the whole problem is happily handled by the kernel scheduler and the QProcess callback mechanism (based on select()). In case of a libified git-rev-list this could be an issue. Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 18:38 ` Marco Costalba @ 2007-03-16 18:59 ` Nicolas Pitre 2007-03-16 21:07 ` Marco Costalba 2007-03-16 19:09 ` Andy Parkins 1 sibling, 1 reply; 62+ messages in thread From: Nicolas Pitre @ 2007-03-16 18:59 UTC (permalink / raw) To: Marco Costalba; +Cc: Johannes Schindelin, Rocco Rutte, git On Fri, 16 Mar 2007, Marco Costalba wrote: > On 3/16/07, Marco Costalba <mcostalba@gmail.com> wrote: > > > > *The most important thing for a libgit to be used by qgit is reentrancy* > > > > Another crtitical feature is that this call to git-rev-list-like > function MUST be non-blocking. I'm not sure I agree. The non-blockingness can be (and probably should be) handled at a higher level with your own threading facility of choice. Making GIT restartable has the potential for making the core code much too complex. Nicolas ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 18:59 ` Nicolas Pitre @ 2007-03-16 21:07 ` Marco Costalba 2007-03-16 23:24 ` Johannes Schindelin 0 siblings, 1 reply; 62+ messages in thread From: Marco Costalba @ 2007-03-16 21:07 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Johannes Schindelin, Rocco Rutte, git On 3/16/07, Nicolas Pitre <nico@cam.org> wrote: > On Fri, 16 Mar 2007, Marco Costalba wrote: > > > On 3/16/07, Marco Costalba <mcostalba@gmail.com> wrote: > > > > > > *The most important thing for a libgit to be used by qgit is reentrancy* > > > > > > > Another crtitical feature is that this call to git-rev-list-like > > function MUST be non-blocking. > > I'm not sure I agree. > > The non-blockingness can be (and probably should be) handled at a higher > level with your own threading facility of choice. Making GIT > restartable has the potential for making the core code much too complex. > The fact is that the solution is complex anyway, moving the complex code at higher level doesn't simplify the whole issue, it just *moves* the issue somewhere else. BTW now qgit is single-threaded (as gitk), you suggest that linking with libgit it will involve to go on the multi threading side and I think you are right. But it will be not that easy. Currently we have both single threaded GUI tools and blocking git commands and it works nicely not because it's simple but because the 'complex code' is hidden inside the OS process handling and scheduling stuff. Linking with a synchronous libgit it means, roughly speaking, take the 'complex code' out from the OS and put somewhere in user space, or in libgit or in the user GUI tool linked with the library. Now, it happens that Qt has a good multi thread support, but this is just incidental and of course cannot be taken as granted by a git library that aims to be broadly and possibly easily used. 
Because we are just speaking (well, writing ;-) ) about a possible library, I think we could take into account what providing a callback mechanism in the API would involve, at least for the slowest operations. Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 21:07 ` Marco Costalba @ 2007-03-16 23:24 ` Johannes Schindelin 2007-03-17 7:04 ` Marco Costalba 0 siblings, 1 reply; 62+ messages in thread From: Johannes Schindelin @ 2007-03-16 23:24 UTC (permalink / raw) To: Marco Costalba; +Cc: Nicolas Pitre, Rocco Rutte, git Hi, On Fri, 16 Mar 2007, Marco Costalba wrote: > On 3/16/07, Nicolas Pitre <nico@cam.org> wrote: > > On Fri, 16 Mar 2007, Marco Costalba wrote: > > > > > On 3/16/07, Marco Costalba <mcostalba@gmail.com> wrote: > > > > > > > > *The most important thing for a libgit to be used by qgit is > > > > reentrancy* > > > > > > > > > > Another crtitical feature is that this call to git-rev-list-like > > > function MUST be non-blocking. > > > > I'm not sure I agree. I am sure I don't agree. > > The non-blockingness can be (and probably should be) handled at a > > higher level with your own threading facility of choice. Making GIT > > restartable has the potential for making the core code much too > > complex. > > The fact is that the solution is complex anyway, moving the complex code > at higher level doesn't simplify the whole issue, it just *moves* the > issue somewhere else. It not only *moves* the issue somewhere else, but it also cleanly separates the issues. > BTW now qgit is single-threaded (as gitk), you suggest that linking with > libgit it will involve to go on the multi threading side and I think you > are right. But it will be not that easy. Why? First, it _is_ multi-threaded, since it calls external programs. That is even more than a thread. It is a process. Second, it _would_ be easy to just use the threads provided by Qt. > Because we are just speaking (well, writing ;-) ) about a possible > library I think we could take in account what would involve to foreseen > a callback mechanism in the API, at least for the slowest ones. We are talking about libgit. Which should make access to certain common functions on Git repositories easy. Nothing more than that. 
If you need to do that asynchronously, do _not_ fiddle with libgit. Just imagine what this would involve: you'd have to have timeouts (since there is _NO_ other way to find out when to return with empty hands, instead of blocking), which is _not_ portable. You'd soon be in the same _mess_ we are talking about with respect to exceptions. Also, you would make _all_ operations expensive, since they _would_ have to store state to be restartable. The common solution for your problem _is_ to use threads. And you have to admit that _only_ viewers would need asynchronous access anyway. I doubt that other tools -- which could take their advantage of a libgit -- would need such an access. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 23:24 ` Johannes Schindelin @ 2007-03-17 7:04 ` Marco Costalba 2007-03-17 17:29 ` Johannes Schindelin 0 siblings, 1 reply; 62+ messages in thread From: Marco Costalba @ 2007-03-17 7:04 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Nicolas Pitre, Rocco Rutte, git On 3/17/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > We are talking about libgit. Which should make access to certain common > functions on Git repositories easy. Nothing more than that. > Fair enough. > If you need to do that asynchronously, do _not_ fiddle with libgit. Just > imagine what this would involve: you'd have to have timeouts (since there > is _NO_ other way to find out when to return with empty hands, instead of > blocking), which is _not_ portable. You'd soon be in the same _mess_ we > are talking about with respect to exceptions. > > Also, you would make _all_ operations expensive, since they _would_ have > to store state to be restartable. > > The common solution for your problem _is_ to use threads. > I would say, the common solution to have non blocking libgit is to use threads in the tool linked with libgit. This is clearly a design choice and I agree it's an important statement to keep libgit simple and portable (otherwise you'd probably need to use a thread library as pthread in libgit). Thread facility in Qt is instead already portable and well integrated. Anyway it's a design choice perhaps worth documenting. > And you have to admit that _only_ viewers would need asynchronous access > anyway. I doubt that other tools -- which could take their advantage of a > libgit -- would need such an access. > Yes, and you have to admit ;-) that viewers are the tools that mostly will use libgit. Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-17 7:04 ` Marco Costalba @ 2007-03-17 17:29 ` Johannes Schindelin 0 siblings, 0 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-17 17:29 UTC (permalink / raw) To: Marco Costalba; +Cc: Nicolas Pitre, Rocco Rutte, git Hi, On Sat, 17 Mar 2007, Marco Costalba wrote: > On 3/17/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > The common solution for your problem _is_ to use threads. > > I would say, the common solution to have non blocking libgit is to use > threads in the tool linked with libgit. Yes, that's what I tried to say. > > And you have to admit that _only_ viewers would need asynchronous > > access anyway. I doubt that other tools -- which could take their > > advantage of a libgit -- would need such an access. > > Yes, and you have to admit ;-) that viewers are the tools that mostly > will use libgit. I hope that there are many more users. _And_ not all viewers want to do the display asynchronously. For example, statplot takes the time it takes. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 18:38 ` Marco Costalba 2007-03-16 18:59 ` Nicolas Pitre @ 2007-03-16 19:09 ` Andy Parkins 1 sibling, 0 replies; 62+ messages in thread From: Andy Parkins @ 2007-03-16 19:09 UTC (permalink / raw) To: git; +Cc: Marco Costalba, Johannes Schindelin, Rocco Rutte On Friday 2007, March 16, Marco Costalba wrote: > There is no way that a GUI tool is allowed to *freeze* for that > amount of time. Currently, because an external process is forked when > running 'git rev-list' all the problem is happly handled by the > kernel scheduler and the QProcess callback mechanism (based on > select()). In case of a libified git-rev-list this could be an issue. I don't think that is ever going to be an issue. At the worst you could just fork() and run the libgit command in that. Threads are fairly easy in Qt as well. In short, I wouldn't worry about libgit blocking - in fact it's almost a guarantee that libgit /will/ block; it would be a nightmare to write an asynchronous libgit. Andy -- Dr Andy Parkins, M Eng (hons), MIET andyparkins@gmail.com ^ permalink raw reply [flat|nested] 62+ messages in thread
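The "just fork() and run the libgit command in that" suggestion can be sketched with plain POSIX calls. This is a minimal illustration under assumptions: demo_blocking_count() is a hypothetical stand-in for a slow, blocking library call, and demo_job_start()/demo_job_finish() are invented helper names. The child does the blocking work and writes the result to a pipe; the parent keeps a file descriptor it could hand to its own select()-based event loop.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a slow, blocking library call. */
static long demo_blocking_count(long n)
{
    long sum = 0;
    long i;
    for (i = 1; i <= n; i++)
        sum += i;
    return sum;
}

struct demo_job {
    pid_t pid;
    int read_fd;    /* becomes readable when the child has a result */
};

/* Start the blocking call in a child process; returns immediately. */
int demo_job_start(struct demo_job *job, long n)
{
    int fds[2];
    pid_t pid;

    assert(job != NULL);
    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: do the slow work */
        long result;
        ssize_t w;
        close(fds[0]);
        result = demo_blocking_count(n);
        w = write(fds[1], &result, sizeof(result));
        _exit(w == (ssize_t)sizeof(result) ? 0 : 1);
    }
    close(fds[1]);                       /* parent keeps the read end */
    job->pid = pid;
    job->read_fd = fds[0];
    return 0;
}

/* Collect the result; blocks only until the child is done. */
int demo_job_finish(struct demo_job *job, long *result)
{
    char *p = (char *)result;
    size_t total = 0;
    int status;

    assert(job != NULL && result != NULL);
    while (total < sizeof(*result)) {
        ssize_t r = read(job->read_fd, p + total, sizeof(*result) - total);
        if (r < 0 && errno == EINTR)
            continue;
        if (r <= 0)
            break;
        total += (size_t)r;
    }
    close(job->read_fd);
    if (waitpid(job->pid, &status, 0) < 0)
        return -1;
    if (total != sizeof(*result))
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A GUI would normally watch read_fd from its event loop and call demo_job_finish() only once the descriptor is readable, which keeps the library itself synchronous.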
* Re: Libification project (SoC) 2007-03-16 15:12 ` Johannes Schindelin 2007-03-16 15:55 ` Nicolas Pitre 2007-03-16 18:20 ` Marco Costalba @ 2007-03-18 14:08 ` Petr Baudis 2007-03-18 23:48 ` Johannes Schindelin 2 siblings, 1 reply; 62+ messages in thread From: Petr Baudis @ 2007-03-18 14:08 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Rocco Rutte, git On Fri, Mar 16, 2007 at 04:12:17PM CET, Johannes Schindelin wrote: > Hi, > > [please do not cull the Cc: list] > > On Fri, 16 Mar 2007, Rocco Rutte wrote: > > > First, I think that would be some cleanup "only" since that basically would > > mean to > > > > 1) make all functions die()ing return some value and handle it and > > 2) wrap all static vars into structures and pass them around > > > > If you don't choose a design before wrapping things up in structures, you'll > > probably end up having one structure per source file (at least too many > > structures). > > Why? For some tasks, it should be 1) easier, 2) more elegant, and 3) > faster to write a function which re-initialises the static variables. > > Of course, if you want to work with multiple repos _at the same time_, > this does not help you. But frankly, we don't support that with core-git, > so why should we in libgit? Because you don't know who will want to use libgit. Maybe perl bindings from inside of mod_perl, where single process can multiplex between many repositories based on whichever request just arrived. You talked about memory usage issues, but I think that's just a minor technical issue that can be adjusted, while this is _conceptual_. Maybe someone will want to write repodiff which looks at two repositories and compares them (without fetching massive data around). Maybe someone will want to write some other cool hack we didn't think about. Because in the other subthread you just suggested the git viewers should be multi-threaded. 
Of course you can state that "only a single thread can use libgit at a time", but then multithreading is just a hack to work around libgit limitations (albeit still legitimate) while it could be used to do so much more cool stuff like fetching old history information on background while you can already _work_ with the tool and look at the new stuff details (isn't this actually exactly how gitk and qgit already work? they couldn't with non-reentrant libgit!). Because if you look at the UNIX history, you'll notice that first people started with non-reentrant stuff because it was "good enough" and then came back later and added reentrant versions anyway. Let's learn from history. It's question of probability but it's very likely this will happen to us as well. This is why the _API_ should be designed to be re-entrant. The implementation may not be re-entrant right away, it may take a while to get there, but the API really should be. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
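One way to read "the API should be designed to be re-entrant" is that every call takes an explicit context instead of relying on file-scope statics. The sketch below assumes that reading; demo_repo and its functions are hypothetical names, and the lookup counter merely stands in for whatever per-repository state a real library would keep.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical per-repository context: state that would otherwise be
 * a file-scope static lives here, so two repositories never share it.
 */
struct demo_repo {
    char path[128];
    unsigned long lookups;
};

/* Returns 0 on success, -1 on a missing or over-long path. */
int demo_repo_init(struct demo_repo *repo, const char *path)
{
    assert(repo != NULL);
    if (!path || strlen(path) >= sizeof(repo->path))
        return -1;
    strcpy(repo->path, path);
    repo->lookups = 0;
    return 0;
}

/* Stand-in for any operation; touches only state reachable from repo. */
unsigned long demo_repo_lookup(struct demo_repo *repo)
{
    assert(repo != NULL);
    return ++repo->lookups;
}
```

With this shape, a first implementation could still permit only one live context at a time and lift that restriction later without changing any function signature.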
* Re: Libification project (SoC) 2007-03-18 14:08 ` Petr Baudis @ 2007-03-18 23:48 ` Johannes Schindelin 2007-03-19 1:21 ` Petr Baudis 0 siblings, 1 reply; 62+ messages in thread From: Johannes Schindelin @ 2007-03-18 23:48 UTC (permalink / raw) To: Petr Baudis; +Cc: Rocco Rutte, git Hi, On Sun, 18 Mar 2007, Petr Baudis wrote: > [...] if you look at the UNIX history, you'll notice that first people > started with non-reentrant stuff because it was "good enough" and then > came back later and added reentrant versions anyway. Let's learn from > history. It's question of probability but it's very likely this will > happen to us as well. Yes, let's learn from history. Start with a libgit that is good enough. And when somebody actually needs it to behave a little differently, or more sophisticated, then let that somebody work on it! Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-18 23:48 ` Johannes Schindelin @ 2007-03-19 1:21 ` Petr Baudis 2007-03-19 1:43 ` Johannes Schindelin 0 siblings, 1 reply; 62+ messages in thread From: Petr Baudis @ 2007-03-19 1:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Rocco Rutte, git Hi, On Mon, Mar 19, 2007 at 12:48:27AM CET, Johannes Schindelin wrote: > On Sun, 18 Mar 2007, Petr Baudis wrote: > > > [...] if you look at the UNIX history, you'll notice that first people > > started with non-reentrant stuff because it was "good enough" and then > > came back later and added reentrant versions anyway. Let's learn from > > history. It's question of probability but it's very likely this will > > happen to us as well. > > Yes, let's learn from history. Start with a libgit that is good enough. > And when somebody actually needs it to behave a little differently, or > more sophisticated, then let that somebody work on it! I was talking about the API. The API has to be designed to be reentrant. And you get pretty much stuck with the API. And requiring reentrance isn't that far off once libgit is there, as I tried to point out; it's not really any obscure requirement. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 1:21 ` Petr Baudis @ 2007-03-19 1:43 ` Johannes Schindelin 2007-03-19 2:56 ` Theodore Tso 2007-03-19 7:01 ` Marco Costalba 0 siblings, 2 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-19 1:43 UTC (permalink / raw) To: Petr Baudis; +Cc: Rocco Rutte, git Hi, On Mon, 19 Mar 2007, Petr Baudis wrote: > On Mon, Mar 19, 2007 at 12:48:27AM CET, Johannes Schindelin wrote: > > On Sun, 18 Mar 2007, Petr Baudis wrote: > > > > > [...] if you look at the UNIX history, you'll notice that first > > > people started with non-reentrant stuff because it was "good enough" > > > and then came back later and added reentrant versions anyway. Let's > > > learn from history. It's question of probability but it's very > > > likely this will happen to us as well. > > > > Yes, let's learn from history. Start with a libgit that is good > > enough. And when somebody actually needs it to behave a little > > differently, or more sophisticated, then let that somebody work on it! > > I was talking about the API. The API has to be designed to be > reentrant. And you get pretty much stuck with the API. And requiring > reentrance isn't that far off once libgit is there, as I tried to point > out; it's not really any obscure requirement. I don't see _any_ problem in making an API which works with _one_ repo first. This has several advantages: - most users (if any!) will work that way, - it is easier to implement, - you are more likely to get that right than the more complex thing you seem to want already in the first version, and - it is easy enough to extend the API later, _retaining_ the small and beautiful functions. As for the memory problems I was pointing out to you on IRC: if you do some operation on one repo, and run out of memory, okay, there is not much you can do about it. Tough luck. 
If you cache different repos in the _same_ process, and run out of memory, you should free the caches of the _other_ repos first, instead of just erroring out. This is not entirely trivial, likely to make libgit fragile, and quite possibly a performance hit (making libgit unattractive for plumbing, which would take away the best test case for libgit). Also, when you cache different repos, you want to avoid duplicating identical objects in different caches, which makes the cache handling no easier. But even if these issues would not exist, isn't it obvious that you should start with something _simple_? Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 1:43 ` Johannes Schindelin @ 2007-03-19 2:56 ` Theodore Tso 2007-03-19 3:55 ` Shawn O. Pearce ` (2 more replies) 2007-03-19 7:01 ` Marco Costalba 1 sibling, 3 replies; 62+ messages in thread From: Theodore Tso @ 2007-03-19 2:56 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Petr Baudis, Rocco Rutte, git On Mon, Mar 19, 2007 at 02:43:54AM +0100, Johannes Schindelin wrote: > > I was talking about the API. The API has to be designed to be > > reentrant. And you get pretty much stuck with the API. And requiring > > reentrance isn't that far off once libgit is there, as I tried to point > > out; it's not really any obscure requirement. > > - it is easy enough to extend the API later, _retaining_ the small and > beautiful functions. Um, look at what we had to do with gethostbyname() and gethostbyname_r(). It wasn't possible to sweep through and fix all of the programs that used gethostbyname(), despite the fact that if a program called gethostbyname(), then called a library function which, unknown to the application, could possibly do a DNS or YP lookup (and whose behavior could change depending on some config file like /etc/nsswitch.conf), which would blow away the static information. So if the application tried to use the information returned by _its_ call to gethostbyname after calling some other library function, it could get some completely random hostname that wasn't what it expected. Yelch! And so we have two APIs that libc has to support, gethostbyname(), and gethostbyname_r(), with the ugly _r() suffix, and which in a sane world most programs should use since otherwise they can be incredibly fragile unless the _first_ thing they do after calling gethostbyname is to copy the information to someplace stable, instead of relying on the static buffer to remain sane. (And yet they don't, which means bugs that only show up if optional YP or Hesiod lookups are enabled, etc.) 
Berkeley got it horribly wrong when it tried to start with the "small and beautiful" functions that were non-reentrant, and we've been paying the price ever since. Do we really want to support two versions of the API forever? Is it really that hard to support a reentrant API from the beginning? I'd submit the answer to these two questions are no, and no, respectively. - Ted ^ permalink raw reply [flat|nested] 62+ messages in thread
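A minimal sketch of the static-buffer problem described in the message above. The functions lookup_name(), lookup_name_r() and library_helper() are hypothetical stand-ins for gethostbyname(), gethostbyname_r() and "some other library function"; they are not real libc or git calls and only illustrate the shape of the two API styles.

```c
#include <stdio.h>

/* Hypothetical lookup, for illustration only. */

/* Non-reentrant style: the result lives in one static buffer, so every
 * call overwrites whatever the previous call returned. */
const char *lookup_name(int id)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "host-%d", id);
	return buf;
}

/* Reentrant style: the caller owns the storage, so results from
 * different calls cannot clobber each other. Returns 0 on success,
 * -1 if the caller's buffer is too small. */
int lookup_name_r(int id, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "host-%d", id);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Stands in for a library function that performs its own lookup
 * behind the application's back. */
void library_helper(void)
{
	(void)lookup_name(2);
}
```

With the first style, a pointer obtained from lookup_name(1) reads "host-2" once library_helper() has run, unless the caller copied the string first. With the second style, each caller-supplied buffer keeps its own result.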
* Re: Libification project (SoC) 2007-03-19 2:56 ` Theodore Tso @ 2007-03-19 3:55 ` Shawn O. Pearce 2007-03-19 14:57 ` Johannes Schindelin 2007-03-19 16:28 ` Linus Torvalds 2 siblings, 0 replies; 62+ messages in thread From: Shawn O. Pearce @ 2007-03-19 3:55 UTC (permalink / raw) To: Theodore Tso; +Cc: Johannes Schindelin, Petr Baudis, Rocco Rutte, git Theodore Tso <tytso@mit.edu> wrote: > Berkeley got it horribly wrong when it tried to start with the "small > and beautiful" functions that were non-reentrant, and we've been > paying the price ever since. Do we really want to support two > versions of the API forever? Is it really that hard to support a > reentrant API from the beginning? I'd submit the answer to these two > questions are no, and no, respectively. I agree entirely, for every reason mentioned by Ted (including those not quoted). ;-) I learned about gethostbyname after gethostbyname_r was already introduced, so I have always been asking myself "uhhhhh, why do we have gethostbyname?". ;-) -- Shawn. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 2:56 ` Theodore Tso 2007-03-19 3:55 ` Shawn O. Pearce @ 2007-03-19 14:57 ` Johannes Schindelin 2007-03-19 16:28 ` Linus Torvalds 2 siblings, 0 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-19 14:57 UTC (permalink / raw) To: Theodore Tso; +Cc: Petr Baudis, Rocco Rutte, git Hi, On Sun, 18 Mar 2007, Theodore Tso wrote: > On Mon, Mar 19, 2007 at 02:43:54AM +0100, Johannes Schindelin wrote: > > > I was talking about the API. The API has to be designed to be > > > reentrant. And you get pretty much stuck with the API. And requiring > > > reentrance isn't that far off once libgit is there, as I tried to point > > > out; it's not really any obscure requirement. > > > > - it is easy enough to extend the API later, _retaining_ the small and > > beautiful functions. > > Um, look at what we had to do with gethostbyname() and > gethostbyname_r(). It wasn't possible to sweep through and fix all of > the programs that used gethostbyname(), despite the fact that if a > program called gethostbyname(), then called a library function which, > unknown to the application, could possibly do a DNS or YP lookup (and > whose behavior could change depending on some config file like > /etc/nsswitch.conf), which would blow away the static information. So > if the application tried to use the information returned by _its_ call > to gethostbyname after calling some other library function, it could get > some completely random hostname that wasn't what it expected. > > Yelch! And so we have two APIs that libc has to support, > gethostbyname(), and gethostbyname_r(), with the ugly _r() suffix, and > which in a sane world most programs should use since otherwise they can > be incredibly fragile unless the _first_ thing they do after calling > gethostbyname is to copy the information to someplace stable, instead of > relying on the static buffer to remain sane. 
(And yet they don't, which > means bugs that only show up if optional YP or Hesiod lookups are > enabled, etc.) > > Berkeley got it horribly wrong when it tried to start with the "small and > beautiful" functions that were non-reentrant, and we've been paying the > price ever since. Do we really want to support two versions of the API > forever? Is it really that hard to support a reentrant API from the > beginning? I'd submit the answer to these two questions are no, and no, > respectively. You make a good case why gethostbyname() was wrong, and should have been defined as gethostbyname_r() to begin with. However, as I wrote in another reply in this thread, I am not prepared to sink more time in this discussion, _unless_ somebody who cares about it enough shows me some code and/or numbers. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 2:56 ` Theodore Tso 2007-03-19 3:55 ` Shawn O. Pearce 2007-03-19 14:57 ` Johannes Schindelin @ 2007-03-19 16:28 ` Linus Torvalds 2007-03-19 16:32 ` Linus Torvalds 2007-03-21 11:17 ` Andreas Ericsson 2 siblings, 2 replies; 62+ messages in thread From: Linus Torvalds @ 2007-03-19 16:28 UTC (permalink / raw) To: Theodore Tso; +Cc: Johannes Schindelin, Petr Baudis, Rocco Rutte, git On Sun, 18 Mar 2007, Theodore Tso wrote: > > Berkeley got it horribly wrong when it tried to start with the "small > and beautiful" functions that were non-reentrant, and we've been > paying the price ever since. I don't think that's a good argument, ESPECIALLY when coming from somebody from MIT. Berkeley may have gotten it "horribly wrong", but the fact is, BSD kicked ass and took over the world, in a way that nothing comparable I know of from MIT ever did. Exactly *because* the BSD people didn't try to make it perfect, but made things "small and easy to *implement*". (I would not say "small and beautiful". "Beauty" had nothing to do with it. "simple" had. And unlike beauty, simplicity really *is* more than skin deep, and is a fundamentally good design). I'm a *huge* believer in "Worse is Better" (for people who don't know it, just google for that phrase, with the quotes around it). In fact, I'd argue that the reason git kicks ass is exactly that "Worse is Better" design: you need to have a few conceptual (good) ideas to base your design on, but given those good ideas, it's more important that things _work_well_in_practice_ than some "wouldn't it be better.." kind of mentality. The "paying the price ever since" argument is bogus. If you get to that point, you've by definition *already*won*! 
Here's the real world according to Linus: 1) everybody makes mistakes 2) only the winners "pay the price" of those mistakes ever since, since the losers will not be around to pay it, and the winners will have made mistakes too (see #1) 3) the more complex and subtle you make the interfaces, the more mistakes you'll make, AND the less likely you are to be a winner anyway, since you'll have problems implementing it *and* it will probably be subtle to use too! So the motto should always be: "Just Do It!", and screw worrying about paying the price. You *want* to have to pay the price. It's the best thing that can ever happen to you. And you want to have to start paying the price as early as possible - because that not only means that you won, it also means that you'll now be learning from your mistakes instead of trying to anticipate them, and I will *guarantee* that learning from mistakes is going to be a lot more productive than trying to worry about them up-front. > Do we really want to support two versions of the API forever? I'd personally strongly vote for a "simple library" interface as a first cut. And yes, if that means supporting two versions, I think it's better. You can easily have "libgit-simple.a" for trivial non-threaded accesses with out-of-memory conditions causing the process to die. That really *is* a very useful scenario, as shown by the fact that *every*single*core*git program has been happy with it. Claiming that you need a complicated interface in the face of the *proof* that git itself doesn't need that complicated an interface is to me a bit disingenuous. Yes, *some* people will want a thread-safe one. But we're not talking something like libc here, where the library is so fundamental that it needs to be acceptable for everybody. It's perfectly possible to have a "libgit-simple.a" that is good for 99% of all uses, and that is simple to use, and less bug-prone simply because it is *simpler* (not just for users, but as an implementation). 
And then for the small small minority of programs that want something fancier, do a "libgit-complicated.a" library. IF you ever get it working and complete, you can always then implement "libgit-simple" in terms of the complicated version. Is it > really that hard to support a reentrant API from the beginning? I'd > submit the answer to these two questions are no, and no, respectively. > > - Ted > - > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 62+ messages in thread
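A rough sketch of the layering suggested in the message above, where a "simple" die-on-error interface is implemented in terms of a context-based one. Every name here (struct repo_ctx, repo_open(), repo_read_object(), simple_open(), simple_read_object()) is hypothetical; nothing of the sort is claimed to exist in git, and the exit status is chosen only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* "Complicated" layer: state lives in a caller-owned context and
 * failures are reported through return values. */
struct repo_ctx {
	const char *path;
	int objects_read;
};

int repo_open(struct repo_ctx *ctx, const char *path)
{
	if (!path || !*path)
		return -1;
	ctx->path = path;
	ctx->objects_read = 0;
	return 0;
}

/* Pretends to read one object; returns the running count, or -1 if
 * the context was never opened. */
int repo_read_object(struct repo_ctx *ctx)
{
	if (!ctx->path)
		return -1;
	return ++ctx->objects_read;
}

/* "Simple" layer: one implicit repository per process, and any
 * failure ends the process. */
static struct repo_ctx the_repo;

static void die(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}

void simple_open(const char *path)
{
	if (repo_open(&the_repo, path) < 0)
		die("cannot open repository");
}

int simple_read_object(void)
{
	int n = repo_read_object(&the_repo);

	if (n < 0)
		die("cannot read object");
	return n;
}
```

In this arrangement the simple functions stay one-liners for callers, while a program that needs several repositories at once uses its own struct repo_ctx values and handles the -1 returns itself.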
* Re: Libification project (SoC) 2007-03-19 16:28 ` Linus Torvalds @ 2007-03-19 16:32 ` Linus Torvalds 2007-03-21 11:17 ` Andreas Ericsson 1 sibling, 0 replies; 62+ messages in thread From: Linus Torvalds @ 2007-03-19 16:32 UTC (permalink / raw) To: Theodore Tso; +Cc: Johannes Schindelin, Petr Baudis, Rocco Rutte, git Oops. My fingers are faster than my brain, and that email got sent out half-completed and without the final editing. But it wasn't really missing anything else than editing away the parts of the original I didn't respond to, and my normal sign-off. So I'll just sign this one off twice, instead.. Linus Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 16:28 ` Linus Torvalds 2007-03-19 16:32 ` Linus Torvalds @ 2007-03-21 11:17 ` Andreas Ericsson 2007-03-21 17:24 ` Linus Torvalds 1 sibling, 1 reply; 62+ messages in thread From: Andreas Ericsson @ 2007-03-21 11:17 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Tso, Johannes Schindelin, Petr Baudis, Rocco Rutte, git Linus Torvalds wrote: > > I'm a *huge* believer in "Worse is Better" (for people who don't know it, > just google for that phrase, with the quotes around it). > I just did, and having read the first page of the document found at http://www.jwz.org/doc/worse-is-better.html, I must say "worse-is-better" sounds an awful lot like evolution; "Start with something that works. When something else works better, jump train and embrace The New Thing". -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-21 11:17 ` Andreas Ericsson @ 2007-03-21 17:24 ` Linus Torvalds 2007-03-22 9:51 ` Andreas Ericsson 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2007-03-21 17:24 UTC (permalink / raw) To: Andreas Ericsson Cc: Theodore Tso, Johannes Schindelin, Petr Baudis, Rocco Rutte, git On Wed, 21 Mar 2007, Andreas Ericsson wrote: > Linus Torvalds wrote: > > > > I'm a *huge* believer in "Worse is Better" (for people who don't know it, > > just google for that phrase, with the quotes around it). > > I just did, and having read the first page of the document found at > http://www.jwz.org/doc/worse-is-better.html, I must say "worse-is-better" > sounds an awful lot like evolution; "Start with something that works. When > something else works better, jump train and embrace The New Thing". Yeah. I'm a huge believer in evolution too (and not just the biological kind ;) The thing is, most "designers" are just totally clueless. Even the smartest people that have done something similar five times before are prone to totally mis-design something if they start from scratch and try to "think it through". You tend to concentrate on the problems of the previous generation, and not even think about everything that worked wonderfully well, because that wasn't something you *needed* to think about. So "designing" stuff is way overrated. You can spend years designing something that is total crap, just because you didn't actually try it out and _realize_ that it wasn't what the user wanted (it may have been what the user _thought_ and _claimed_ that he wanted, but that was before he actually tried to use it, and realized that he was wrong). Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-21 17:24 ` Linus Torvalds @ 2007-03-22 9:51 ` Andreas Ericsson 0 siblings, 0 replies; 62+ messages in thread From: Andreas Ericsson @ 2007-03-22 9:51 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Tso, Johannes Schindelin, Petr Baudis, Rocco Rutte, git Linus Torvalds wrote: > > On Wed, 21 Mar 2007, Andreas Ericsson wrote: > >> Linus Torvalds wrote: >>> I'm a *huge* believer in "Worse is Better" (for people who don't know it, >>> just google for that phrase, with the quotes around it). >> I just did, and having read the first page of the document found at >> http://www.jwz.org/doc/worse-is-better.html, I must say "worse-is-better" >> sounds an awful lot like evolution; "Start with something that works. When >> something else works better, jump train and embrace The New Thing". > > Yeah. I'm a huge believer in evolution too (and not just the biological > kind ;) > > So "designing" stuff is way overrated. You can spend years designing > something that is total crap, just because you didn't actually try it out > and _realize_ that it wasn't what the user wanted (it may have been what > the user _thought_ and _claimed_ that he wanted, but that was before he > actually tried to use it, and realized that he was wrong). > Indeed. That's probably why Extreme Programming (silly hype-name, but what to call it otherwise?) has gained so much popularity from the people that really understand the concept. To those that don't wish to google for it, Extreme Programming is about taking small steps that lead to a diffuse goal ("We shall make a fantasy video game that millions of people would like to play. Significant lore is here, here and here"). The goal and any of the steps might change along the way. Basically, it puts "re-think, re-design, re-factor" on the table for corporate software production and promotes rapid implementation over correctness. 
-- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 1:43 ` Johannes Schindelin 2007-03-19 2:56 ` Theodore Tso @ 2007-03-19 7:01 ` Marco Costalba 2007-03-19 9:46 ` Steve Frécinaux ` (2 more replies) 1 sibling, 3 replies; 62+ messages in thread From: Marco Costalba @ 2007-03-19 7:01 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Petr Baudis, Rocco Rutte, git, tytso, spearce On 3/19/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > I don't see _any_ problem in making an API which works with _one_ repo > first. This has several advantages: > > - most users (if any!) will work that way, > Sometimes it could be useful to write a list of possible users before starting to code. Please which are, in your opinion, the possible tools that could use a non-reentrant, blocking libgit? In case tool is already existent please write the name, in case it's a 'would be' one give a brief description. I have tried to do the list myself, but I found only viewers ;-) among the _current_ tools I know of, and all the viewers allow loading in background _now_ so will not be portable to libgit without major surgery, read multi-thread (BTW none is currently multi-thread). Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 7:01 ` Marco Costalba @ 2007-03-19 9:46 ` Steve Frécinaux 2007-03-19 10:33 ` Steve Frécinaux 2007-03-19 12:37 ` Johannes Schindelin 2 siblings, 0 replies; 62+ messages in thread From: Steve Frécinaux @ 2007-03-19 9:46 UTC (permalink / raw) To: Marco Costalba; +Cc: git On Mon, 2007-03-19 at 08:01 +0100, Marco Costalba wrote: > I have tried to do the list myself, but I found only viewers ;-) > among the _current_ tools I know of, and all the viewers allow loading > in background _now_ so will not be portable to libgit without major > surgery, read multi-thread (BTW none is currently multi-thread). I thought about configuration tools (gconf, kconfig, etc), that could then implement something similar to what the recovery system of WinXP does: they could store a history of the configuration state, and then recover a previous state if things go wrong. This would be incredibly useful for system administrators. Also, more generally, git can be used as a versioned storage system without direct link to source control. I'm thinking about ikiwiki for instance. More SCM-oriented, a cron script that manages a website by checking out several repositories (one for the wiki module, another for the blog module, another for the forum, etc) using, say, the python bindings. There are probably a zillion other possible uses. The common thing when exposing an API is that it ends up being used in a way nobody had thought of. So it's dangerous to say "it's useless" or "nobody will do it". You can be sure someone will, it's just a matter of time. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 7:01 ` Marco Costalba 2007-03-19 9:46 ` Steve Frécinaux @ 2007-03-19 10:33 ` Steve Frécinaux 2007-03-19 12:37 ` Johannes Schindelin 2 siblings, 0 replies; 62+ messages in thread From: Steve Frécinaux @ 2007-03-19 10:33 UTC (permalink / raw) To: Marco Costalba; +Cc: git On Mon, 2007-03-19 at 08:01 +0100, Marco Costalba wrote: > Please which are, in your opinion, the possible tools that could use a > non-reentrant, blocking libgit? In case tool is already existent > please write the name, in case it's a 'would be' one give a brief > description. Another idea that I just remembered about: two years ago there was a SoC project to make nautilus (the file manager from gnome) able to version directories. It was using SVN (and failed, but it's another story). While nautilus is heavily multi-threaded, it's a "single-instance app", so there is at most only one instance of nautilus ever running. Under the hypothesis of a "versioned directories" support using libgit (that would be easier to do and support since it doesn't need to set up a server), it's quite obvious that a non-reentrant git would not be enough: you are likely to have more than one versioned directories on screen at the same time! OTOH, blocking doesn't look like an issue since gnomevfs already deals with quite a number of blocking synchronous libs and exposes an async API on top of those (similar to what QT Threading does, I guess). BTW, if some Gnome people are reading, if libgit comes into life, such a project is something I'd like to see for real ;-) ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 7:01 ` Marco Costalba 2007-03-19 9:46 ` Steve Frécinaux 2007-03-19 10:33 ` Steve Frécinaux @ 2007-03-19 12:37 ` Johannes Schindelin 2007-03-19 12:52 ` Petr Baudis 2007-03-19 13:04 ` Marco Costalba 2 siblings, 2 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-19 12:37 UTC (permalink / raw) To: Marco Costalba; +Cc: Petr Baudis, Rocco Rutte, git, tytso, spearce Hi, On Mon, 19 Mar 2007, Marco Costalba wrote: > On 3/19/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > > I don't see _any_ problem in making an API which works with _one_ repo > > first. This has several advantages: > > > > - most users (if any!) will work that way, > > > > Sometimes it could be useful to write a list of possible users before > starting to code. Fair enough. I expect the most visible users of libgit to be: the core Git programs! Because if we don't eat our own dog food, why should anybody else? And I am absolutely utterly opposed to making them slower just to support a program which wants to cache meta data from multiple repositories. Yes, you could write a program which can compare objects from several repos, but that is easy in fact: just set GIT_ALTERNATE_OBJECT_DIRECTORIES and you're done. Without changing the core of Git at all! Having said that, I never liked the idea of having static variables to talk with config handlers, and would have preferred cb_data like for_each_ref() does. That is a low hanging fruit, which does not affect performance, and is _definitely_ a cleanup. I am not so sure about the impact of changing the index to a non-static structure. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
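A minimal sketch of the cb_data idea mentioned in the message above: the handler's state is passed through an opaque pointer instead of file-scope static variables. The walker for_each_config(), the struct config_entry table and the collect_user() handler are hypothetical and only mimic the (callback, cb_data) shape the message attributes to for_each_ref(); this is not git's actual config API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: cb_data is passed through untouched. */
typedef int (*config_fn)(const char *key, const char *value, void *cb_data);

struct config_entry {
	const char *key;
	const char *value;
};

/* Calls fn for every entry; stops early if fn returns non-zero. */
int for_each_config(const struct config_entry *entries, size_t n,
		    config_fn fn, void *cb_data)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = fn(entries[i].key, entries[i].value, cb_data);

		if (ret)
			return ret;
	}
	return 0;
}

/* The handler's state travels through cb_data rather than a static
 * variable, so two walks can keep independent results. */
struct user_info {
	const char *name;
	const char *email;
};

int collect_user(const char *key, const char *value, void *cb_data)
{
	struct user_info *info = cb_data;

	if (!strcmp(key, "user.name"))
		info->name = value;
	else if (!strcmp(key, "user.email"))
		info->email = value;
	return 0;
}
```

Two callers can each pass their own struct user_info and walk different tables without stepping on each other, which is the property a static-variable handler cannot offer.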
* Re: Libification project (SoC) 2007-03-19 12:37 ` Johannes Schindelin @ 2007-03-19 12:52 ` Petr Baudis 2007-03-19 13:55 ` Johannes Schindelin 2007-03-19 13:04 ` Marco Costalba 1 sibling, 1 reply; 62+ messages in thread From: Petr Baudis @ 2007-03-19 12:52 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Marco Costalba, Rocco Rutte, git, tytso, spearce On Mon, Mar 19, 2007 at 01:37:18PM CET, Johannes Schindelin wrote: > Yes, you could write a program which can compare objects from several > repos, but that is easy in fact: just set GIT_ALTERNATE_OBJECT_DIRECTORIES > and you're done. Without changing the core of Git at all! But you'll also need to access refs. And the key point here is reentrance - handling multiple repositories at once is only part of this, actually probably the much bigger customer would be multi-threaded programs. And easier creation of reusable components and other libraries, and so on... I believe the performance impact will be most likely absolutely negligible. Of course we have no hard data, but I doubt it's this where most of the CPU crunching is. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 12:52 ` Petr Baudis @ 2007-03-19 13:55 ` Johannes Schindelin 0 siblings, 0 replies; 62+ messages in thread From: Johannes Schindelin @ 2007-03-19 13:55 UTC (permalink / raw) To: Petr Baudis; +Cc: Marco Costalba, Rocco Rutte, git, tytso, spearce Hi, On Mon, 19 Mar 2007, Petr Baudis wrote: > On Mon, Mar 19, 2007 at 01:37:18PM CET, Johannes Schindelin wrote: > > Yes, you could write a program which can compare objects from several > > repos, but that is easy in fact: just set GIT_ALTERNATE_OBJECT_DIRECTORIES > > and you're done. Without changing the core of Git at all! > > But you'll also need to access refs. Yes, and you want it to bake some fine pizza, too. > And the key point here is reentrance - handling multiple repositories at > once is only part of this, actually probably the much bigger customer > would be multi-threaded programs. And easier creation of reusable > components and other libraries, and so on... > > I believe the performance impact will be most likely absolutely > negligible. Of course we have no hard data, but I doubt it's this where > most of the CPU crunching is. My time is very limited, and I see this thread going nowhere since everybody says "I like this, I like that", and nobody shows some hard data (me included). It almost feels like a Windows user community. Or Slashdot. Anyway, I refuse to comment on these issues until somebody proves me wrong or right in my assumption that the impact on core Git (in terms of time _or_ lines of code) would be huge. Ciao, Dscho ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-19 12:37 ` Johannes Schindelin 2007-03-19 12:52 ` Petr Baudis @ 2007-03-19 13:04 ` Marco Costalba 1 sibling, 0 replies; 62+ messages in thread From: Marco Costalba @ 2007-03-19 13:04 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Petr Baudis, Rocco Rutte, git, tytso, spearce On 3/19/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi, > > On Mon, 19 Mar 2007, Marco Costalba wrote: > > > On 3/19/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > > > > I don't see _any_ problem in making an API which works with _one_ repo > > > first. This has several advantages: > > > > > > - most users (if any!) will work that way, > > > > > > > Sometimes it could be useful to write a list of possible users before > > starting to code. > > Fair enough. > > I expect the most visible users of libgit to be: the core Git programs! > Because if we don't eat our own dog food, why should anybody else? > But even if you eat your own food, why should others do the same? > And I am absolutely utterly opposed to making them slower just to support a > program which wants to cache meta data from multiple repositories. > The problem, at least with the viewers I know, is not with multiple repositories but with multiple views of the same repo. Anyway. Just to give my two cents: The two possible features we are talking about are: - reentrancy (many views open on the same repo) - non-blocking behaviour (loading repo in background) These two features are _very_ different. I agree an async library is not a small thing, and probably it involves using an external thread library in libgit itself, like pthread, just to avoid reinventing the (difficult) wheel. 
Regarding reentrancy I don't know what is involved in avoiding globals and the like, but I would think it's really an absolute minimum to get people eating your food ;-) I completely agree that it's impossible to know how a library will be used when you write it, but taking a good look around before starting allows you to get a minimum subset of needed features and if you add a little bit of generalization and you are lucky enough perhaps you will avoid rewriting the library in the future. From the viewers survey and also from the interesting examples of Steve I would say that not planning for reentrancy would be a big no-no. Marco ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 5:30 ` Junio C Hamano 2007-03-16 6:00 ` Shawn O. Pearce @ 2007-03-16 12:53 ` Petr Baudis 2007-03-16 13:47 ` Luiz Fernando N. Capitulino 2 siblings, 0 replies; 62+ messages in thread From: Petr Baudis @ 2007-03-16 12:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, Luiz Fernando N. Capitulino, git On Fri, Mar 16, 2007 at 06:30:46AM CET, Junio C Hamano wrote: > "Shawn O. Pearce" <spearce@spearce.org> writes: > > But other areas die when they get given a bad SHA-1 (for example). > > If the library caller can supply that (possibly bad) SHA-1 to an > > API function, that's just mean to die out. ;-) > > That's a real problem, but on the other hand, perl or whatever > wrapped ones can do the dying (or not dying) before calling into > libgit, so it may not be such a big issue. At least you can catch the die from the library caller using set_*_routine(). ;-) > >> o Documentation (eg, doxygen) > >> o Unit-tests > >> o Add prefix (eg, git_*) to public API functions > > > > Yes. But which functions shall we expose? ;-) > > Before going into that topic, a bigger question is if we are > happy with the current internal API and what the goal of > libification is. If the libification is going to say that "this > is a published API so we are not going to change it", I would > imagine that it would be very hard to accept in the mainline. > Improvements like the earlier sliding mmap() series need to be > able to change the interfaces without backward compatibility > wart. > > In other words, I do not know what idiot ^W ^W who listed the > libification stuff on the SoC "ideas" page, but I think (1) it > is premature to promise stable ABI, and (2) if it does not > promise stable ABI a library is not very useful. I disagree, it can live in the "zero major version" realm and already be very useful for language bindings (say whatever is bundled with git itself) and other nifty stuff. 
-- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 5:30 ` Junio C Hamano 2007-03-16 6:00 ` Shawn O. Pearce 2007-03-16 12:53 ` Petr Baudis @ 2007-03-16 13:47 ` Luiz Fernando N. Capitulino 2007-03-16 14:08 ` Petr Baudis 2007-03-16 15:16 ` Johannes Schindelin 2 siblings, 2 replies; 62+ messages in thread From: Luiz Fernando N. Capitulino @ 2007-03-16 13:47 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, git Em Thu, 15 Mar 2007 22:30:46 -0700 Junio C Hamano <junkio@cox.net> escreveu: | "Shawn O. Pearce" <spearce@spearce.org> writes: | | >> o Documentation (eg, doxygen) | >> o Unit-tests | >> o Add prefix (eg, git_*) to public API functions | > | > Yes. But which functions shall we expose? ;-) | | Before going into that topic, a bigger question is if we are | happy with the current internal API and what the goal of | libification is. If the libification is going to say that "this | is a published API so we are not going to change it", I would | imagine that it would be very hard to accept in the mainline. I think you can put it this way: do you want/wish to make git more useful than it is today? If so, such a library is important because it will allow users to write applications that use git in a reasonable way. It doesn't need to be the next five-zillion-function-library that will provide the wonders of git in several different ways. We could start by fixing the got-an-error-die behavior and define an _experimental_ API (just a few functions) just to get data out of git. This would be enough to write the Perl binding I think? -- Luiz Fernando N. Capitulino ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Libification project (SoC) 2007-03-16 13:47 ` Luiz Fernando N. Capitulino @ 2007-03-16 14:08 ` Petr Baudis 2007-03-16 18:38 ` Luiz Fernando N. Capitulino 2007-03-16 15:16 ` Johannes Schindelin 1 sibling, 1 reply; 62+ messages in thread From: Petr Baudis @ 2007-03-16 14:08 UTC (permalink / raw) To: Luiz Fernando N. Capitulino; +Cc: Junio C Hamano, Shawn O. Pearce, git On Fri, Mar 16, 2007 at 02:47:15PM CET, Luiz Fernando N. Capitulino wrote: > We could start by fixing the got-an-error-die behavior and > define an _experimental_ API (just a few functions) just to get > data out of git. > > This would be enough to write the Perl binding I think? Actually, well, I've already done this. :-) The trouble begins when you want to access multiple repositories from the same process, etc. Without that, writing the Perl binding is trivial; there's already a hook the binding can use to catch dies, I've added it. So, the main point of the work is to define a _good_ API and get rid of the static state, I guess. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett ^ permalink raw reply [flat|nested] 62+ messages in thread
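A rough sketch of how a binding might use a replaceable die hook to regain control when the library dies, as the message above describes. All names here (set_die_hook(), lib_lookup(), binding_lookup(), binding_last_error()) are hypothetical and merely modelled on the set_*_routine() idea from the thread; git's real declarations are not reproduced. The sketch assumes the hook escapes with longjmp(), which is one possible approach rather than what the existing binding is known to do.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Library side (hypothetical): die() calls a replaceable routine. */
typedef void (*die_fn)(const char *msg);

static void die_default(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}

static die_fn die_routine = die_default;

void set_die_hook(die_fn fn)
{
	die_routine = fn ? fn : die_default;
}

static void die(const char *msg)
{
	die_routine(msg);
	abort(); /* a die routine is not expected to return */
}

int lib_lookup(int id)
{
	if (id < 0)
		die("bad object id");
	return id * 2;
}

/* Binding side: turn a die into an error return via longjmp(). */
static jmp_buf die_jmp;
static const char *last_error;

static void binding_die(const char *msg)
{
	last_error = msg;
	longjmp(die_jmp, 1);
}

const char *binding_last_error(void)
{
	return last_error;
}

int binding_lookup(int id, int *result)
{
	set_die_hook(binding_die);
	if (setjmp(die_jmp)) {
		/* We arrive here only if lib_lookup() died. */
		set_die_hook(NULL);
		return -1;
	}
	*result = lib_lookup(id);
	set_die_hook(NULL);
	return 0;
}
```

Note that the hook, the jump buffer and the last error are themselves static state, so this approach catches the die without making the library any more reentrant.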
* Re: Libification project (SoC)
From: Luiz Fernando N. Capitulino @ 2007-03-16 18:38 UTC
To: Petr Baudis; +Cc: Junio C Hamano, Shawn O. Pearce, git

On Fri, 16 Mar 2007 15:08:55 +0100, Petr Baudis <pasky@suse.cz> wrote:

| On Fri, Mar 16, 2007 at 02:47:15PM CET, Luiz Fernando N. Capitulino wrote:
| > We could start by fixing the got-an-error-die behavior and
| > define an _experimental_ API (just a few functions) just to get
| > data out of git.
| >
| > This would be enough to write the Perl binding, I think?
|
| Actually, well, I've already done this. :-)

Not exactly, at least not the way I think it should be done.

| The trouble begins when you want to access multiple repositories from
| the same process, etc. Without that, writing the Perl binding is
| trivial; there's already a hook the binding can use to catch dies, I've
| added it.
|
| So, the main point of the work is to define a _good_ API and get rid of
| the static state, I guess.

Yes, the set_*_routine()s seem like a workaround to me: you're only
fixing die()'s final effect.

I think the right solution is to get rid of die() from functions that
are supposed to be an interface, set errno if needed and return -1
or NULL.

That looks like a lot of work, BTW, but I'll be pleased to work on it.

Are there more things like the set_*_routine()s added to fix
other problems?

--
Luiz Fernando N. Capitulino
* Re: Libification project (SoC)
From: Shawn O. Pearce @ 2007-03-16 23:16 UTC
To: Luiz Fernando N. Capitulino; +Cc: Petr Baudis, Junio C Hamano, git

"Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
> I think the right solution is to get rid of die() from functions that
> are supposed to be an interface, set errno if needed and return -1
> or NULL.

And then make their callers (if they are above the public API layer)
die instead. In some cases this might imply an undesirable change
in the error message produced, as necessary details that are included
today would be unavailable in the caller.

> Are there more things like the set_*_routine()s added to fix
> other problems?

Not that I am aware of.

--
Shawn.
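The split discussed above (library functions return -1 and set errno, while callers above the API layer decide whether to die) might look roughly like the following. This is only a sketch: `lib_parse_version()` and `cmd_parse_version_or_die()` are hypothetical names invented for illustration, not functions from the Git source.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical library-level function: it never exits the process.
 * On failure it sets errno and returns -1; on success it returns 0.
 */
int lib_parse_version(const char *s, unsigned *out)
{
	char *end;
	unsigned long v;

	if (!s || !out) {
		errno = EINVAL;
		return -1;
	}
	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || end == s || *end != '\0') {
		if (!errno)
			errno = EINVAL;	/* not a number, or trailing junk */
		return -1;
	}
	*out = (unsigned)v;
	return 0;
}

/*
 * Hypothetical command-level caller, above the API layer: it is
 * free to print a message and terminate, as die() does today.
 */
unsigned cmd_parse_version_or_die(const char *s)
{
	unsigned v;

	if (lib_parse_version(s, &v) < 0) {
		fprintf(stderr, "fatal: bad version '%s'\n", s ? s : "(null)");
		exit(128);
	}
	return v;
}
```

Note that the caller can only report what it knows (the input string and errno), which is exactly the loss of detail raised in the message above.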
* Re: Libification project (SoC)
From: Luiz Fernando N. Capitulino @ 2007-03-17 19:58 UTC
To: Shawn O. Pearce; +Cc: Petr Baudis, Junio C Hamano, git

On Fri, 16 Mar 2007 19:16:46 -0400
"Shawn O. Pearce" <spearce@spearce.org> wrote:

| "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
| > I think the right solution is to get rid of die() from functions that
| > are supposed to be an interface, set errno if needed and return -1
| > or NULL.
|
| And then make their callers (if they are above the public API layer)
| die instead. In some cases this might imply an undesirable change
| in the error message produced, as necessary details that are included
| today would be unavailable in the caller.

Exactly!

One simple example of an important error message that would be
lost can be found in read-cache.c:read_cache_from():

 o index file smaller than expected

I've found a possible solution, though.

Take a look at Rusty's solution for the same problem in
module-init-tools:

"""
/* We use error numbers in a loose translation... */
static const char *insert_moderror(int err)
{
	switch (err) {
	case ENOEXEC:
		return "Invalid module format";
	case ENOENT:
		return "Unknown symbol in module, or unknown parameter (see dmesg)";
	case ENOSYS:
		return "Kernel does not have module support";
	default:
		return strerror(err);
	}
}
"""

Instead of calling strerror() directly for errors generated when
inserting a module, the insmod() function calls insert_moderror(),
which provides the desired mapping.

I think we could have something like that for each of git's modules,
e.g., git_cache_strerror(), git_commit_strerror() and so on.

Does this look reasonable?

--
Luiz Fernando N. Capitulino
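Applied to the index code, the proposed per-module mapping could be sketched as below. The function name `git_cache_strerror()` comes from the message above; the error codes `GIT_CACHE_ESHORT` and `GIT_CACHE_EBADSIG` and their negative values are assumptions made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical module-specific error codes. They are negative so
 * that they cannot collide with positive errno values.
 */
enum {
	GIT_CACHE_ESHORT  = -1,
	GIT_CACHE_EBADSIG = -2
};

/*
 * Map a module error code to a message; anything unknown is
 * treated as a plain errno value and passed to strerror().
 */
const char *git_cache_strerror(int err)
{
	switch (err) {
	case GIT_CACHE_ESHORT:
		return "index file smaller than expected";
	case GIT_CACHE_EBADSIG:
		return "bad index file signature";
	default:
		return strerror(err);
	}
}
```

A mapping like this can only return fixed strings, so it cannot carry per-call details such as a file name.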
* Re: Libification project (SoC)
From: Shawn O. Pearce @ 2007-03-18 5:23 UTC
To: Luiz Fernando N. Capitulino; +Cc: Petr Baudis, Junio C Hamano, git

"Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
> On Fri, 16 Mar 2007 19:16:46 -0400
> "Shawn O. Pearce" <spearce@spearce.org> wrote:
> | And then make their callers (if they are above the public API layer)
> | die instead. In some cases this might imply an undesirable change
> | in the error message produced, as necessary details that are included
> | today would be unavailable in the caller.
>
> I've found a possible solution, though.
>
> Take a look at Rusty's solution for the same problem in
> module-init-tools:
>
> """
> /* We use error numbers in a loose translation... */
> static const char *insert_moderror(int err)
> {
> 	switch (err) {
> 	case ENOEXEC:
> 		return "Invalid module format";
> 	case ENOENT:
> 		return "Unknown symbol in module, or unknown parameter (see dmesg)";
> 	case ENOSYS:
> 		return "Kernel does not have module support";
> 	default:
> 		return strerror(err);
> 	}
> }
> """

Take a look at sha1_file.c, open_packed_git_1:

	...
	if (!pack_version_ok(hdr.hdr_version))
		return error("packfile %s is version %u and not supported"
			" (try upgrading GIT to a newer version)",
			p->pack_name, ntohl(hdr.hdr_version));
	...

Here we are supplying a lot more than just a simple error code
that can be mapped to a static string.

Of course that code is currently feeding it to the error function,
which today calls the error_routine (see usage.c). We could buffer
the strings sent to error()/warn() and let the caller obtain all
strings that occurred during the last API call.

--
Shawn.
* Re: Libification project (SoC)
From: Junio C Hamano @ 2007-03-18 5:52 UTC
To: Shawn O. Pearce; +Cc: Luiz Fernando N. Capitulino, Petr Baudis, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Take a look at sha1_file.c, open_packed_git_1:
>
> 	...
> 	if (!pack_version_ok(hdr.hdr_version))
> 		return error("packfile %s is version %u and not supported"
> 			" (try upgrading GIT to a newer version)",
> 			p->pack_name, ntohl(hdr.hdr_version));
> 	...
>
> Here we are supplying a lot more than just a simple error code
> that can be mapped to a static string.
>
> Of course that code is currently feeding it to the error function,
> which today calls the error_routine (see usage.c). We could buffer
> the strings sent to error()/warn() and let the caller obtain all
> strings that occurred during the last API call.

Actually, since we are talking about the error path,

 (1) we do not care that much about the performance of what
     happens there, but
 (2) we *do* care about not doing extra allocation.

So it might make sense to have a preallocated "error string"
buffer, sprintf the error message in there and return error
codes.
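A preallocated error buffer of the kind suggested above could be sketched as follows. The names `lib_error()`, `lib_last_error()` and `lib_check_pack_version()` are hypothetical, and the "only version 2 is supported" rule is an assumption chosen to keep the example small; it is not the real pack_version_ok() logic.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LIB_ERRBUF_SIZE 256

/* Preallocated once, so the error path never calls malloc(). */
static char lib_errbuf[LIB_ERRBUF_SIZE];

/* Format the message into the fixed buffer and return an error code. */
int lib_error(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(lib_errbuf, sizeof(lib_errbuf), fmt, ap);	/* truncates if too long */
	va_end(ap);
	return -1;
}

/* Let the caller fetch the message from the last failed call. */
const char *lib_last_error(void)
{
	return lib_errbuf;
}

/* Example low-level check that reports details only it knows about. */
int lib_check_pack_version(const char *pack_name, unsigned version)
{
	assert(pack_name != NULL);
	if (version != 2)	/* assumption for this sketch only */
		return lib_error("packfile %s is version %u and not supported",
				 pack_name, version);
	return 0;
}
```

The single static buffer keeps the sketch short, but it is itself static state; a library aiming to support several repositories per process would presumably attach the buffer to a per-repository or per-thread context instead.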
* Re: Libification project (SoC)
From: Luiz Fernando N. Capitulino @ 2007-03-18 16:18 UTC
To: Junio C Hamano; +Cc: Shawn O. Pearce, Petr Baudis, git

On Sat, 17 Mar 2007 22:52:52 -0700
Junio C Hamano <junkio@cox.net> wrote:

| "Shawn O. Pearce" <spearce@spearce.org> writes:
|
| > Take a look at sha1_file.c, open_packed_git_1:
| >
| > 	...
| > 	if (!pack_version_ok(hdr.hdr_version))
| > 		return error("packfile %s is version %u and not supported"
| > 			" (try upgrading GIT to a newer version)",
| > 			p->pack_name, ntohl(hdr.hdr_version));
| > 	...
| >
| > Here we are supplying a lot more than just a simple error code
| > that can be mapped to a static string.
| >
| > Of course that code is currently feeding it to the error function,
| > which today calls the error_routine (see usage.c). We could buffer
| > the strings sent to error()/warn() and let the caller obtain all
| > strings that occurred during the last API call.
|
| Actually, since we are talking about the error path,
|
|  (1) we do not care that much about the performance of what
|      happens there, but
|  (2) we *do* care about not doing extra allocation.
|
| So it might make sense to have a preallocated "error string"
| buffer, sprintf the error message in there and return error
| codes.

Another possibility is to let the caller do the job.

I mean, if the information needed to print the error message (packfile
name and version in this example) is available to the caller, or the
caller can get it some way, then the caller could check which error
it got and build the message itself.

That seems simpler to me, provided the caller has the needed
info, of course...

--
Luiz Fernando N. Capitulino
* Re: Libification project (SoC)
From: Junio C Hamano @ 2007-03-18 19:31 UTC
To: Luiz Fernando N. Capitulino; +Cc: Shawn O. Pearce, Petr Baudis, git

"Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br>
writes:

> I mean, if the information needed to print the error message (packfile
> name and version in this example) is available to the caller, or the
> caller can get it some way, then the caller could check which error
> it got and build the message itself.
>
> That seems simpler to me, provided the caller has the needed
> info, of course...

It's a possibility, but that would make it much less nice to
diagnose and debug problems, as the caller does not usually have
the necessary information.

The caller may ask for object A, and the error is triggered
because a different object C is missing, which is the delta base
of object B, which in turn is the delta base of object A. The
best your "caller" can say is "cannot read object A for some
reason", and it cannot say "cannot read object A because object
C is missing".
* Re: Libification project (SoC)
From: Luiz Fernando N. Capitulino @ 2007-03-19 16:09 UTC
To: Junio C Hamano; +Cc: Shawn O. Pearce, Petr Baudis, git

On Sun, 18 Mar 2007 12:31:13 -0700, Junio C Hamano <junkio@cox.net> wrote:

| "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br>
| writes:
|
| > I mean, if the information needed to print the error message (packfile
| > name and version in this example) is available to the caller, or the
| > caller can get it some way, then the caller could check which error
| > it got and build the message itself.
| >
| > That seems simpler to me, provided the caller has the needed
| > info, of course...
|
| It's a possibility, but that would make it much less nice to
| diagnose and debug problems, as the caller does not usually have
| the necessary information.
|
| The caller may ask for object A, and the error is triggered
| because a different object C is missing, which is the delta base
| of object B, which in turn is the delta base of object A. The
| best your "caller" can say is "cannot read object A for some
| reason", and it cannot say "cannot read object A because object
| C is missing".

Okay, you're right. I'm going to let the low-level functions
fill the error buffer then.

Thanks,

--
Luiz Fernando N. Capitulino
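The delta-chain example (A depends on B, which depends on C) shows why the message has to be written where the failure happens. A toy model of it follows; `struct toy_object`, `toy_read_object()` and `toy_last_error()` are invented for illustration and bear no relation to Git's real object code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy object: a name, an optional delta base, and a "present" flag. */
struct toy_object {
	const char *name;
	const struct toy_object *base;
	int present;
};

static char toy_errbuf[128];

const char *toy_last_error(void)
{
	return toy_errbuf;
}

/*
 * Read an object by first resolving its delta base chain. The
 * function that detects the missing object fills the buffer;
 * the outer frames only propagate the error code, so the root
 * cause is preserved.
 */
int toy_read_object(const struct toy_object *o)
{
	assert(o != NULL);
	if (!o->present) {
		snprintf(toy_errbuf, sizeof(toy_errbuf),
			 "object %s is missing", o->name);
		return -1;
	}
	if (o->base && toy_read_object(o->base) < 0)
		return -1;	/* keep the deepest message */
	return 0;
}
```

A caller that asked for A only sees -1 from the outermost call, yet the buffer still names C, which is the information the caller could not have reconstructed on its own.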
* Re: Libification project (SoC)
From: Nicolas Pitre @ 2007-03-18 21:15 UTC
To: Luiz Fernando N. Capitulino
Cc: Junio C Hamano, Shawn O. Pearce, Petr Baudis, git

On Sun, 18 Mar 2007, Luiz Fernando N. Capitulino wrote:

> Another possibility is to let the caller do the job.
>
> I mean, if the information needed to print the error message (packfile
> name and version in this example) is available to the caller, or the
> caller can get it some way, then the caller could check which error
> it got and build the message itself.

Nah... The error details should be handled at the failure location.
Any error-code-based mechanism is bound to get out of sync at some
point, or people simply won't bother adding new codes for new error
conditions and will reuse an existing, generic enough code instead.

We already have this nice error() function. Right now it simply dumps
the message to stderr, but it could be made more sophisticated if needed.

Nicolas
* Re: Libification project (SoC)
From: Johannes Schindelin @ 2007-03-16 15:16 UTC
To: Luiz Fernando N. Capitulino; +Cc: Junio C Hamano, Shawn O. Pearce, git

Hi,

On Fri, 16 Mar 2007, Luiz Fernando N. Capitulino wrote:

> It doesn't need to be the next five-zillion-function library that
> will provide the wonders of git in several different ways.

Yes. Just as we have a really small, really stable part of core-git,
which can be used by porcelains and is expected to work the same in
future versions, we could eventually have the same with libgit.

That would mean, for example, that rev_info should always be initialised
with malloc() so that future versions can make it bigger, and that new
members always be added at the end.

> We could start by fixing the got-an-error-die behavior and define an
> _experimental_ API (just a few functions) just to get data out of git.

That sounds very reasonable. And if it does not work out as expected, we
don't have to make it part of "official" Git. It can live on as a fork.

Ciao,
Dscho
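One common way to get the property described above (the library owns the allocation, so the struct can grow in later versions) is an opaque type with constructor and accessor functions. The sketch below uses invented names (`git_rev_info_new()` and friends) and a single illustrative member; it is not the layout of Git's actual rev_info.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a public header might expose (hypothetical) ---- */
struct git_rev_info;	/* opaque: callers never see the layout or size */

struct git_rev_info *git_rev_info_new(void);
void git_rev_info_set_max_count(struct git_rev_info *revs, int n);
int git_rev_info_max_count(const struct git_rev_info *revs);
void git_rev_info_free(struct git_rev_info *revs);

/* ---- library side ---- */
struct git_rev_info {
	int max_count;
	/* later versions append new members here, never in the middle */
};

struct git_rev_info *git_rev_info_new(void)
{
	/* The library allocates, so sizeof() is never baked into callers. */
	struct git_rev_info *revs = calloc(1, sizeof(*revs));

	if (revs)
		revs->max_count = -1;	/* assumed to mean "no limit" in this sketch */
	return revs;
}

void git_rev_info_set_max_count(struct git_rev_info *revs, int n)
{
	assert(revs != NULL);
	revs->max_count = n;
}

int git_rev_info_max_count(const struct git_rev_info *revs)
{
	assert(revs != NULL);
	return revs->max_count;
}

void git_rev_info_free(struct git_rev_info *revs)
{
	free(revs);
}
```

With accessors, even the "append only at the end" rule becomes an internal convention rather than an ABI promise, at the cost of one function call per field access.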
* Re: Libification project (SoC)
From: Johannes Sixt @ 2007-03-16 8:06 UTC
To: git

"Shawn O. Pearce" wrote:
> "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
> > o Avoid dying when a function call fails (eg, malloc())
>
> malloc is a huge problem in the Git code today. Almost all
> of our malloc calls are actually through the xmalloc wrapper.
> All xmalloc callers assume xmalloc will *never* fail. This
> makes it, uh, interesting. ;-)

You could think about longjmp(3)ing out into main(), which would have to
setjmp(3). But in order to clean up intermediate frames, you would have
to have a stack of setjmp/longjmp buffers.

Oh, well, how do I *love* them C++ exceptions!

-- Hannes
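The setjmp/longjmp idea can be sketched with a single recovery point. All names here (`lib_fatal()`, `lib_xmalloc()`, `lib_call()`, `op_ok()`, `op_fatal()`) are hypothetical, and `lib_fatal()` only stands in for die(); the sketch deliberately shows one jump buffer, not the stack of buffers that cleanup of intermediate frames would require.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf recover_env;
static int recover_armed;

/* Stand-in for die(): jump back to the API boundary instead of exiting. */
void lib_fatal(void)
{
	if (recover_armed)
		longjmp(recover_env, 1);
	abort();	/* no recovery point set: behave like a hard failure */
}

/* xmalloc-style wrapper: callers may still assume it never returns NULL. */
void *lib_xmalloc(size_t n)
{
	void *p = malloc(n ? n : 1);

	if (!p)
		lib_fatal();
	return p;
}

/*
 * API boundary: run fn(arg); return its result, or -1 if lib_fatal()
 * was reached somewhere below. Anything allocated between setjmp()
 * and longjmp() is leaked, which is the cleanup problem noted above.
 */
int lib_call(int (*fn)(void *), void *arg)
{
	int ret;

	if (setjmp(recover_env)) {
		recover_armed = 0;
		return -1;
	}
	recover_armed = 1;
	ret = fn(arg);
	recover_armed = 0;
	return ret;
}

/* Sample operation that succeeds. */
int op_ok(void *arg)
{
	char *p = lib_xmalloc(16);

	(void)arg;
	free(p);
	return 0;
}

/* Sample operation that hits a fatal error. */
int op_fatal(void *arg)
{
	(void)arg;
	lib_fatal();
	return 0;	/* not reached */
}
```

Calling `lib_call(op_fatal, NULL)` returns -1 to the caller rather than terminating the process, and a later `lib_call(op_ok, NULL)` still works.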
* Re: Libification project (SoC)
From: Matthieu Moy @ 2007-03-16 8:58 UTC
To: git

Johannes Sixt <J.Sixt@eudaptics.com> writes:

> You could think about longjmp(3)ing out into main(), which would have to
> setjmp(3). But in order to clean up intermediate frames, you would have
> to have a stack of setjmp/longjmp buffers.
>
> Oh, well, how do I *love* them C++ exceptions!

You can have exceptions in C too.

I've used it a bit while contributing to Baz 1.x (the fork of tla).
The library used was cexcept ( http://cexcept.sourceforge.net/ ).

As you mention, jumping is the easy part, and cleaning up is the hard
one. Baz was using talloc, hacked to somehow work with cexcept. The
mini-library doesn't seem to be available as a tarball anymore, so I
did the checkout+targz in case someone's curious to have a look, and
lazy enough not to install baz to get it:

http://www-verimag.imag.fr/~moy/tmp/talloc-except--2.0.1--patch-2.tar.gz

This stuff is not supported anymore, but very small anyway.

--
Matthieu
* Re: Libification project (SoC)
From: Johannes Schindelin @ 2007-03-16 11:51 UTC
To: Matthieu Moy; +Cc: git

Hi,

On Fri, 16 Mar 2007, Matthieu Moy wrote:

> Johannes Sixt <J.Sixt@eudaptics.com> writes:
>
> > You could think about longjmp(3)ing out into main(), which would have to
> > setjmp(3). But in order to clean up intermediate frames, you would have
> > to have a stack of setjmp/longjmp buffers.
> >
> > Oh, well, how do I *love* them C++ exceptions!
>
> You can have exceptions in C too.
>
> I've used it a bit while contributing to Baz 1.x (the fork of tla).
> The library used was cexcept ( http://cexcept.sourceforge.net/ ).
>
> As you mention, jumping is the easy part, and cleaning up is the hard
> one. Baz was using talloc, hacked to somehow work with cexcept. The
> mini-library doesn't seem to be available as a tarball anymore, so I
> did the checkout+targz in case someone's curious to have a look, and
> lazy enough not to install baz to get it:
>
> http://www-verimag.imag.fr/~moy/tmp/talloc-except--2.0.1--patch-2.tar.gz
>
> This stuff is not supported anymore, but very small anyway.

I was thinking about a similar approach some time ago. But that means that
you _must not_ have static variables that you rely on being initialised
correctly.

I mean, we have xmalloc(), and it would be easy to enforce xfree(), too
(which would be good for memory profiling anyway), and we _could_ hack
that into tracking which pointers were returned after which checkpoint.

But we _cannot_ say which static variables should be initialised (and
how), after some "exception" was thrown at a certain point.

Ciao,
Dscho
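The pointer-tracking half of this idea might be sketched as below: every allocation is recorded, and everything allocated after a checkpoint can be released in one go. The names (`xmalloc_tracked()`, `xfree_tracked()`, `checkpoint_mark()`, `checkpoint_release()`) are hypothetical, and the sketch assumes that allocations older than a live checkpoint are not freed while that checkpoint is still in use.

```c
#include <assert.h>
#include <stdlib.h>

/* One bookkeeping node per live allocation, newest first. */
struct track_node {
	void *ptr;
	struct track_node *next;
};

static struct track_node *track_head;

typedef struct track_node *checkpoint_t;

/* Allocate and record the pointer; returns NULL on failure. */
void *xmalloc_tracked(size_t n)
{
	struct track_node *node = malloc(sizeof(*node));
	void *p = malloc(n ? n : 1);

	if (!node || !p) {
		free(node);
		free(p);
		return NULL;
	}
	node->ptr = p;
	node->next = track_head;
	track_head = node;
	return p;
}

/* Free a tracked pointer and drop its bookkeeping node. */
void xfree_tracked(void *p)
{
	struct track_node **pp;

	for (pp = &track_head; *pp; pp = &(*pp)->next) {
		if ((*pp)->ptr == p) {
			struct track_node *dead = *pp;

			*pp = dead->next;
			free(dead);
			break;
		}
	}
	free(p);
}

/* Remember the current position in the allocation list. */
checkpoint_t checkpoint_mark(void)
{
	return track_head;
}

/* Free everything allocated since the mark; returns how many blocks. */
int checkpoint_release(checkpoint_t mark)
{
	int released = 0;

	while (track_head && track_head != mark) {
		struct track_node *dead = track_head;

		track_head = dead->next;
		free(dead->ptr);
		free(dead);
		released++;
	}
	return released;
}
```

This handles heap memory only. As the message points out, nothing comparable can restore static variables to a known state after an "exception".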
* Re: Libification project (SoC)
From: Petr Baudis @ 2007-03-16 12:55 UTC
To: Shawn O. Pearce; +Cc: Luiz Fernando N. Capitulino, git

On Fri, Mar 16, 2007 at 05:59:28AM CET, Shawn O. Pearce wrote:
> "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br> wrote:
> > 3. I don't code in Perl, is it a problem? I mean, the project's
> > goal is to have a Perl binding but I think it goes far from
> > that: we could have a python module, a C program, or anything
> > that shows the libgit is useful.
>
> No, I don't see that as a problem at all. We have some Perl
> experts on the mailing list who would like to see Perl bindings.
> Some of the Perl binding is pure C code, and some of it is this
> weird Perl macro language... so I expect those Perl experts to come
> out of the woodwork and help the community to create a prototype
> set of bindings. There's also Ruby and Python interests around,
> so we may see bindings for those too. ;-)

I'll add the Perl binding as soon as the libgit part is there. The
infrastructure is already in place (not in the current tree, but it's
in git's history; you just have to dig it out), so it should be pretty
easy. Even if I don't, someone surely will. ;-)

I don't think knowing Perl, let alone the Perl XS horrors, should be
a prerequisite for this project.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Ever try. Ever fail. No matter. // Try again. Fail again. Fail better.
	-- Samuel Beckett
* Re: Libification project (SoC)
From: Jakub Narebski @ 2007-03-17 2:24 UTC
To: git

[Cc: git@vger.kernel.org]

Luiz Fernando N. Capitulino wrote:

> o Documentation (eg, doxygen)

I wonder if documenting and finishing documentation of git storage structure
(format description of: loose objects, packs, pack indices, index, refs and
symbolic refs, packed refs) and git protocols (git protocol description,
local/ssh fetch/push pipeline description), perhaps using RFC or RFC-like
notation could (and should) be made part of libification effort...

--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Libification project (SoC)
From: Shawn O. Pearce @ 2007-03-17 5:22 UTC
To: Jakub Narebski; +Cc: git, lcapitulino

Jakub Narebski <jnareb@gmail.com> wrote:
> [Cc: git@vger.kernel.org]
>
> Luiz Fernando N. Capitulino wrote:
>
> > o Documentation (eg, doxygen)
>
> I wonder if documenting and finishing documentation of git storage structure
> (format description of: loose objects, packs, pack indices, index, refs and
> symbolic refs, packed refs) and git protocols (git protocol description,
> local/ssh fetch/push pipeline description), perhaps using RFC or RFC-like
> notation could (and should) be made part of libification effort...

I would consider that out of scope for this project.

It would be nice if someone did this work, or at least dusted off
"A Large Angry SCM"'s document and made that available in the
Documentation/technical folder. But I don't think it should be part
of the Libification SoC project, or any of our other current ideas.

Users of a public API don't need to know the internal formatting of
an object within a packfile. They do however need to know that a
commit has a tree, and 0-n parents. And that's already covered in
our existing docs.

And *please* stop breaking the CC chains, Jakub. We've asked you to
not do that. I had to go look up Luiz's email address so I could get
him back onto it.

--
Shawn.