* Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Pavel Raiskup @ 2011-03-20 10:55 UTC
To: git

Hi Git's community!

I'd like to ask you for some details about "histogram diff" and "libgit"
enhancement/git-merge tasks for this year's GSOC.

Histogram diff:
There is no mentor mentioned in [1]. Does it mean that there is no person
who can be a mentor for this task or is that assignment possible to be
mentored by everyone mentioned in other tasks? I'd like to do this task very
much. After doing a small observing around source code of git/jgit it looks
feasible for me.

There is a goal "Get this feature merged to the upstream git." -- but I have
one theoretical question -- what if the benchmarking/study of histogram diff
leads to conclusion that this algorithm will not be useful for upstream?
Does it mean "fail" in terms of GSOC? I have to think about it even if it
looks that there should be speedup quite obvious. I don't want to fail
a priori :).

libgit2:
I really like the concept of libraries for to be binding-able from dozens of
languages - this leads to expanding functionality among masses users
almost everywhere. In this part I like the idea of implementing new features
inside library (diff, config file parsing) but also maybe the task of merging
libgit2 into git upstream. Basically I don't know much about that.. and
you wrote that this task is more difficult than others, so I probably need
to study git's and libgit's architecture very precisely beforehand .. but
could you tell me some details about that? Is it impossible to do it before
GSOC deadline and is it worth making a serious big efforts to this task
(from your point of view onto project objectives)? How big are requirements
for this task in term of GSOC?

Now is quite a hectic time because of my studies :) It's been a long time
since I've had time for myself, but I'd like to prepare some patch to prove
my interest and abilities.

====

And now the not so important part of the message (you can skip it).. I plan
to write this information up more precisely on google-melange later.

Something about me || I am:
-- I like the C language, but it is no problem to study other commonly
   used languages more deeply (I need only a little brainstorming),
-- interested in Open Source in general, programming (especially in
   parallel), chess playing and challenges,
-- a master's degree student at BUT (CZ), penultimate year of study, my
   last summer :(
-- a fan of Git for many reasons; I'd like to become a contributor even if
   the GSOC opportunity won't come,
-- not so good an English speaker, so sometimes my messages could be a
   little harder to understand.

Experiences:
In most cases I have only school project experience (even if programming
projects are some kind of evergreen here in Brno). But I've had one Open
Source experience -- an enhancement for Daniel Stenberg's libcurl [2],
followed by some follow-up patches. The main patch implements shell-like
wildcard pattern matching functionality for the FTP protocol and enhances
the API to allow implementing this functionality for other protocols.
(I've done the implementation of the wildcard "*.txt, [a-z]???.txt"
compiler, an auto testing script, an enhancement for the testing FTP server
inside libcurl, man pages, .. ) The most difficult part was to understand
how things work inside the curl library -- but now I think I'm better in
that aspect, so I think I can do some useful work for Git too.

====

Don't worry please, my next messages will be much briefer :)

Pavel

[1] https://git.wiki.kernel.org/index.php/SoC2011Ideas
[2] https://github.com/bagder/curl/commit/0825cd80a62c21725fb3615f1fdd3aa6cc5f0f34

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Shawn Pearce @ 2011-03-20 18:06 UTC
To: Pavel Raiskup; +Cc: git

On Sun, Mar 20, 2011 at 03:55, Pavel Raiskup <xraisk00@gmail.com> wrote:
> I'd like to ask you for some details about "histogram diff" and "libgit"
> enhancement/git-merge tasks for this year's GSOC.
>
> Histogram diff:
> There is no mentor mentioned in [1]. Does it mean that there is no person
> who can be a mentor for this task or is that assignment possible to be
> mentored by everyone mentioned in other tasks? I'd like to do this task very
> much. After doing a small observing around source code of git/jgit it looks
> feasible for me.

As the original author of HistogramDiff in JGit, and a contributor to
C Git... I'm probably the best person to mentor this task. I'm really
busy, so I didn't sign up to mentor anything else this year, but I
think I would make time for this project.

> There is a goal "Get this feature merged to the upstream git." -- but I have
> one theoretical question -- what if the benchmarking/study of histogram diff
> leads to conclusion that this algorithm will not be useful for upstream?

Then the project doesn't merge. :-)

> Does it mean "fail" in terms of GSOC? I have to think about it even if it
> looks that there should be speedup quite obvious. I don't want to fail
> a priori :).

I don't think so.

I think the success of this project is if the code is of the quality
that upstream would accept it, and if the final analysis data makes it
clear whether or not it's worth including. It's probably not worth
including if it's the same speed as the current Myers diff
implementation from libxdiff or slower. But if it's 2x faster, it's
probably worth merging, if the code quality is acceptable to the
upstream maintainers.

> [1] https://git.wiki.kernel.org/index.php/SoC2011Ideas

--
Shawn.

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Pavel Raiskup @ 2011-03-22 12:32 UTC
To: git@vger.kernel.org

>> Histogram diff:
>> There is no mentor mentioned in [1]. Does it mean that there is no person
>> ..
>
> As the original author of HistogramDiff in JGit, and a contributor to
> C Git... I'm probably the best person to mentor this task. I'm really
> busy, so I didn't sign up to mentor anything else this year, but I
> think I would make time for this project.

Thanks for your answer and for your willingness to mentor this task.

>> There is a goal "Get this feature merged to the upstream git." -- but I have
>> one theoretical question -- what if the benchmarking/study of histogram diff
>> leads to conclusion that this algorithm will not be useful for upstream?
>
> Then the project doesn't merge. :-)
>
>> Does it mean "fail" in terms of GSOC? I have to think about it even if it
>> looks that there should be speedup quite obvious. I don't want to fail
>> a priori :).
>
> I don't think so.
>
> I think the success of this project is if the code is of the quality
> that upstream would accept it, and if the final analysis data makes it
> clear whether or not it's worth including. It's probably not worth
> including if it's the same speed as the current Myers diff
> implementation from libxdiff or slower. But if it's 2x faster, it's
> probably worth merging, if the code quality is acceptable to the
> upstream maintainers.

I wanted to know exactly this kind of information. Of course I don't
want to write code of unacceptable quality from any perspective. And I
think that you probably don't expect histogram diff to be significantly
faster in general :)

Thanks again - it is good to know that you as the author of histogram
diff are here. And sorry for my latency ..

[ot] this is because of a hectic school schedule now - which is actually
not good :( I need to study the git source very deeply _NOW_ (I wanted to
reply earlier but..) [/ot]

Thanks to Junio C Hamano, who gave almost the same answer here:
http://thread.gmane.org/gmane.comp.version-control.git/169498/focus=169516

Pavel

>> [1] https://git.wiki.kernel.org/index.php/SoC2011Ideas

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Junio C Hamano @ 2011-03-20 18:25 UTC
To: Pavel Raiskup; +Cc: git

"Pavel Raiskup" <xraisk00@gmail.com> writes:

> I have one theoretical question -- what if the benchmarking/study of
> histogram diff leads to conclusion that this algorithm will not be
> useful for upstream? Does it mean "fail" in terms of GSOC?

Not necessarily. A negative result is often as valuable as a positive
result.

It will take a clearly good implementation to justify why a negative
result is a success, though. If it is clear to the reviewers that the
implementation is poorly done, the negative conclusion does not
necessarily mean that use of the histogram algorithm is a bad
approach---it would just mean the particular implementation that didn't
implement it well was, and then the GSoC task may have to be marked as a
failure.

But otherwise, if the submission is done with the usual code quality we
would expect from contributors and explained well in its log message
(either positive or negative), I would say it should be considered a
"success".

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Vicent Marti @ 2011-03-20 21:01 UTC
To: Pavel Raiskup; +Cc: git

Yo!

On Sun, Mar 20, 2011 at 12:55 PM, Pavel Raiskup <xraisk00@gmail.com> wrote:
> libgit2:
> I really like the concept of libraries for to be binding-able from dozens of
> languages - this leads to expanding functionality among masses users
> almost everywhere. In this part I like the idea of implementing new features
> inside library (diff, config file parsing) but also maybe the task of merging
> libgit2 into git upstream. Basically I don't know much about that.. and
> you wrote that this task is more difficult than others, so I probably need
> to study git's and libgit's architecture very precisely beforehand .. but
> could you tell me some details about that? Is it impossible to do it before
> GSOC deadline and is it worth making a serious big efforts to this task
> (from your point of view onto project objectives)? How big are requirements
> for this task in term of GSOC?

Merging libgit2 into upstream Git is a scary as fuck task. Somebody
put it up on the Wiki ideas page, but that was not me -- I'm
personally doubtful of anybody succeeding on doing that project during
the SoC, so I have very little interest on mentoring the task.

Here's what's going on: The Git code base is hairy and not that well
documented, so you're gonna need to study that quite a bit. I like to
think that the libgit2 code base is not hairy, and is pretty well
documented (I'm an optimistic guy), but you're still going to need
quite a bit of research to understand the whole architecture before
you can actually merge anything into Git.

You could try to port just some selected parts of the library to Git
(i.e. the parts which benchmark to be faster than their Git
counterparts), but the interdependency chain of libgit2 internals is
not going to be pretty, embedding into the Git core is not going to be
easy (libgit2 is reentrant and mostly threadsafe, so there's quite the
architecture mismatch there), and there's no guarantee that the final
implementation is going to be faster once it's in there.

Overall, you'd need balls of steel and a lot of spare time and
interest to accomplish anything significant with this task, so my
personal opinion as a very old wise man is to forget about it.

HOWEVER. If you want to do something libgit2-related for the SoC
(which would be awesome), there's still two options:

a) Help us make the library more awesome by implementing new features!
This task is the opposite of the previous one; it's like full of unicorns
and rainbows. You can choose one (or more) features we are missing,
and see how to implement them in libgit2 while making them reentrant,
threadsafe AND faster. It's not easy, but it's fucking cool. And you
get to do a lot of micro-optimization if you're into that.

b) Write a minimal Git client using libgit2. Peff keeps bringing this
up and I think it's a bangin' good idea. Write something small and
100% self contained in a C executable that runs everywhere with 0
dependencies -- don't aim for full feature completion, just the basic
stuff to interoperate with a Git repository. Clone, checkout, branch,
commit, push, pull, log. I would totally use that shit on my Windows
boxes. And since it'll be externally compatible with the original Git
client, we can reuse the Git unit tests to test libgit2. HA. Awesome!

So, yeah. That's pretty much my libgit2-related advice for the SoC.

Best of luck with your application process with whatever project you decide,
Vicent

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Jeff King @ 2011-03-20 23:44 UTC
To: Vicent Marti; +Cc: Pavel Raiskup, git

On Sun, Mar 20, 2011 at 11:01:25PM +0200, Vicent Marti wrote:

> b) Write a minimal Git client using libgit2. Peff keeps bringing this
> up and I think it's a bangin' good idea. Write something small and
> 100% self contained in a C executable that runs everywhere with 0
> dependencies -- don't aim for full feature completion, just the basic
> stuff to interoperate with a Git repository. Clone, checkout, branch,
> commit, push, pull, log. I would totally use that shit on my Windows
> boxes. And since it'll be externally compatible with the original Git
> client, we can reuse the Git unit tests to test libgit2. HA. Awesome!

Yeah, I would be happy to mentor or co-mentor with Vicent on a project
like that. Not only might it be useful to actually _use_, but my secret
motive is that I'd like to start testing libgit2 using some of the
regular git tests, both for interoperability and for performance.

-Peff

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Vicent Marti @ 2011-03-21 0:38 UTC
To: Jeff King; +Cc: Pavel Raiskup, git

On Mon, Mar 21, 2011 at 1:44 AM, Jeff King <peff@github.com> wrote:
> Yeah, I would be happy to mentor or co-mentor with Vicent on a project
> like that. Not only might it be useful to actually _use_, but my secret
> motive is that I'd like to start testing libgit2 using some of the
> regular git tests, both for interoperability and for performance.

Right on! I've just added this task to the wiki so other prospective
students can take it into account, and listed you as a possible
mentor.

While I'm at it, I removed the "merge libgit2 into mainstream" task
from there. Feel free to re-add it again if you find a suitable mentor
-- I'm getting diarrhea just by thinking about it.

Cheers,
Vicent

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Pavel Raiskup @ 2011-03-22 17:32 UTC
To: Vicent Marti, Jeff King; +Cc: git

Hi again!

This sounds probably like the most exciting task for me:

>> b) Write a minimal Git client using libgit2. Peff keeps bringing this
>> up and I think it's a bangin' good idea. Write something small and
>> 100% self contained in a C executable that runs everywhere with 0
>> dependencies -- don't aim for full feature completion, just the basic
>> stuff to interoperate with a Git repository. Clone, checkout, branch,
>> commit, push, pull, log. I would totally use that shit on my Windows
>> boxes. And since it'll be externally compatible with the original Git
>> client, we can reuse the Git unit tests to test libgit2. HA. Awesome!
>
> Yeah, I would be happy to mentor or co-mentor with Vicent on a project
> like that. Not only might it be useful to actually _use_, but my secret
> motive is that I'd like to start testing libgit2 using some of the
> regular git tests, both for interoperability and for performance.

Do you mean git tests in directory "/t"?

Could you give me a list of possible reusable unit tests? After a quick
overview of test suite in git it looks quite complex to reuse. I haven't
spent a lot of time studying test-suite, but calling:

    test_expect_success 'plain' 'command && command && ..'

reinterprets chain of commands given in (2nd) string and in this
commands is often called git as utility with arguments. Even in this
very easy test feature is expected some command-line-interface behavior
from tested utility.. Is this the way how do you want to test this new
libgit2-like tool? So this standalone utility is going to have the
same interface as git has -- kind of substitution of git with "git2"
inside test suite?

This probably will lead to some test suite changes, is it truth?
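
For readers following along, here is a minimal sketch of what a helper in the style of test_expect_success does with its second argument: it evaluates the string as a chain of shell commands and reports success or failure. The name mini_expect_success and the two counters are hypothetical stand-ins written for illustration; the real helper in git's t/test-lib.sh does considerably more.

```sh
#!/bin/sh
# Simplified stand-in for illustration only; NOT git's real test_expect_success.
test_count=0
test_failures=0

# mini_expect_success <description> <script>
# Evaluates <script> in a subshell and reports whether the whole chain succeeded.
mini_expect_success () {
	test_count=$((test_count + 1))
	if (eval "$2") >/dev/null 2>&1
	then
		echo "ok $test_count - $1"
	else
		test_failures=$((test_failures + 1))
		echo "not ok $test_count - $1"
	fi
}

# The script is an ordinary "&&" chain; one failing command fails the test.
mini_expect_success 'chain passes' 'true && test 1 -eq 1'
mini_expect_success 'chain fails' 'true && false'
```

Because the second argument is only a string of shell commands, the commands in it are looked up like any other shell command, so the tested program is addressed purely through its command-line interface.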

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Jeff King @ 2011-03-22 18:47 UTC
To: Pavel Raiskup; +Cc: Vicent Marti, git

On Tue, Mar 22, 2011 at 06:32:54PM +0100, Pavel Raiskup wrote:

> >Yeah, I would be happy to mentor or co-mentor with Vicent on a project
> >like that. Not only might it be useful to actually _use_, but my secret
> >motive is that I'd like to start testing libgit2 using some of the
> >regular git tests, both for interoperability and for performance.
>
> Do you mean git tests in directory "/t"?

Yes.

> Could you give me a list of possible reusable unit tests? After a quick
> overview of test suite in git it looks quite complex to reuse. I haven't
> spent a lot of time studying test-suite, but calling:
>
> test_expect_success 'plain' 'command && command && ..'
>
> reinterprets chain of commands given in (2nd) string and in this
> commands is often called git as utility with arguments. Even in this
> very easy test feature is expected some command-line-interface behavior
> from tested utility.. Is this the way how do you want to test this new
> libgit2-like tool? So this standalone utility is going to have the
> same interface as git has -- kind of substitution of git with "git2"
> inside test suite?

Exactly. My plan was to implement a few of the simpler git commands (or
at least the basic parts of them) using libgit2, and then test them with
unmodified scripts from git's t/ directory.

Of course, many of the tests won't pass because of obscure features that
we haven't implemented. But that's OK. Even getting a partial list of
passing tests will be useful. And tests known not to work because of
unimplemented features can often be skipped (see the description of
GIT_SKIP_TESTS in t/README). Part of the project would be sorting out
which tests will be useful.

It may also be necessary to use a mixture of git and libgit2 commands to
finish tests. For example, a test which is really about checking "log"
might use "commit", but "commit" hasn't been implemented yet. But it is
still useful information if we cheat and use regular git's "commit", but
test the libgit2 log command.

As far as which commands to start with, I would start with plumbing
commands like "update-index", "commit-tree", "update-ref", "rev-list",
etc. Those are basic building blocks that have reasonably simple
interfaces, and they're easy to test. And once you start, I think it
will become more obvious where to go next (because some of the commands
build on the results of others).

> This probably will lead to some test suite changes, is it truth?

There may be modifications necessary to the test suite to make this
easier to do. But rather than forking the test suite and changing the
tests, I would much rather see whatever support is needed done in a
generalized way and merged to regular git.

-Peff
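
The mixed setup described above could be sketched as a small wrapper that sends already-implemented subcommands to the libgit2-based client and everything else to regular git. The binary name `git2`, the function names, and the list of implemented commands are assumptions made up for illustration; nothing in the thread fixes them.

```sh
#!/bin/sh
# Hypothetical list of subcommands the libgit2-based client already implements.
GIT2_IMPLEMENTED="update-index commit-tree update-ref rev-list"

# pick_backend <subcommand>: print the program that should handle it.
pick_backend () {
	for implemented in $GIT2_IMPLEMENTED
	do
		if [ "$implemented" = "$1" ]
		then
			echo git2	# assumed name of the libgit2-based client
			return 0
		fi
	done
	echo git	# not implemented yet: fall back to regular git
}

# run_git <subcommand> [args...]: run the subcommand with the chosen backend.
run_git () {
	backend=$(pick_backend "$1")
	"$backend" "$@"
}

pick_backend rev-list
pick_backend commit
```

With this list, "rev-list" goes to the new client while "commit" still uses regular git. Growing the list one command at a time matches the incremental approach suggested in the message; tests that still cannot pass can be listed in GIT_SKIP_TESTS, which t/README describes.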

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Junio C Hamano @ 2011-03-22 19:18 UTC
To: Jeff King; +Cc: Pavel Raiskup, Vicent Marti, git

Jeff King <peff@github.com> writes:

> It may also be necessary to use a mixture of git and libgit2 commands to
> finish tests. For example, a test which is really about checking "log"
> might use "commit", but "commit" hasn't been implemented yet. But it is
> still useful information if we cheat and use regular git's "commit", but
> test the libgit2 log command.

Absolutely, and I don't even think that is "cheating"; it is merely a
natural way to work incrementally.

> As far as which commands to start with, I would start with plumbing
> commands like "update-index", "commit-tree", "update-ref", "rev-list",
> etc. Those are basic building blocks that have reasonably simple
> interfaces, and they're easy to test. And once you start, I think it
> will become more obvious where to go next (because some of the commands
> build on the results of others).
>
>> This probably will lead to some test suite changes, is it truth?

Some tests _might_ depend on implementation detail that we would rather
not, but I don't think there are too many of them, unless you count the
stuff that use "test-<something>" helper binary that link with libgit.a
to make direct calls to the internal.

I would suggest to consider a failure an uncovered bug in the new
implementation by default, and discuss the tests that do depend on the
implementation detail of C git on case-by-case basis to be fixed.

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Jonathan Nieder @ 2011-03-21 1:27 UTC
To: Vicent Marti; +Cc: Pavel Raiskup, git, Jeff King, Ramkumar Ramachandra

Hi,

Vicent Marti wrote:

> Merging libgit2 into upstream Git is a scary as fuck task. Somebody
> put it up on the Wiki ideas page, but that was not me

Cc-ing Ram (who added it), in case he has anything to add.

> -- I'm
> personally doubtful of anybody succeeding on doing that project during
> the SoC,

I agree there --- it is a huge task. But maybe it could inspire
someone to come up with a smaller task. One long-term goal might be
to get libgit2 and core git to share revision walking APIs; a baby
step towards that would be a proof-of-concept patch to share object
access APIs.

If someone wants to work on this, I'd be glad to talk over what would
be needed to make a realistic proposal.

> so I have very little interest on mentoring the task.

That's okay, of course. What's probably important for people
considering this project is: would you be willing to answer questions
and consider patches from a person working on this? That is, do you
consider the goal even worthwhile?

I am probably not the best person to mentor this but if no one else
wants to then I would be interested.

> Here's what's going on: The Git code base is hairy and not that well
> documented, so you're gonna need to study that quite a bit. I like to
> think that the libgit2 code base is not hairy, and is pretty well
> documented (I'm an optimistic guy), but you're still going to need
> quite a bit of research to understand the whole architecture before
> you can actually merge anything into Git.

Like the Linux kernel, the git codebase does not have many comments
alongside the code, it is true. But it is actually incredibly well
documented in my experience. The best documentation is in the
history. In addition to that, there is some API documentation in
Documentation/technical.

A good place to start is the initial commit e83c516 (Initial revision
of "git", the information manager from hell, 2005-04-07). The
architecture described therein is very simple and still exists today
with few changes. To explain something that has come later, the
easiest way is to learn how the author explained it when the change
was made.

Let me give an example. Suppose I am wondering how git decides what
commits to show when I say "git log ^topic1 topic2". In particular, I
wonder what the performance characteristics of that operation are and
how it is able to print the first result without spending O(depth of
history) to traverse all the ancestors of topic1 going back to the
beginning of time.

First step: what does "git log" do with that "^topic1 topic2"? Wait,
where is the "log" command defined in the first place?

    $ git grep -e '"log"'
    [...]
    git.c:          { "log", cmd_log, RUN_SETUP },
    [...]

Ok, it's the cmd_log function. Looking at the definition of that
function, it seems that it does

    init_revisions(&rev, prefix);
    rev.always_show_header = 1;
    memset(&opt, 0, sizeof(opt));
    opt.def = "HEAD";
    cmd_log_init(argc, argv, prefix, &rev, &opt);
    return cmd_log_walk(&rev);

    $ git grep -e init_revisions -- Documentation
    Documentation/technical/api-revision-walking.txt:`init_revisions`::

The revision walking API is explained in the api-revision-walking.txt
document. From this we learn that responsibility for the revision walk
is divided between prepare_revision_walk and get_revision, defined in
revision.c. prepare_revision_walk seems to use functions
"handle_commit" and "commit_list_insert_by_date". What do they do?

    $ git log -p -Shandle_commit -- revision.c
    commit cd2bdc5309461034e5cc58e1d3e87535ed9e093b
    Author: Linus Torvalds <torvalds@osdl.org>
    Date:   Fri Apr 14 16:52:13 2006 -0700

        Common option parsing for "git log --diff" and friends

        This basically does a few things that are sadly somewhat
        interdependent,
    [...]
        Now, that was the easy and straightforward part.

        The slightly more involved part is that some of the programs
        that want to use the new-and-improved rev_info parsing don't
        actually want _commits_, they may want tree'ish arguments
        instead. That meant that I had to change setup_revision() to
        parse the arguments not into the "revs->commits" list, but
        into the "revs->pending_objects" list.

        Then, when we do "prepare_revision_walk()", we walk that list,
        and create the sorted commit list from there.

Okay: so in revision walking:

 - first (in setup_revisions), git pushes the ^topic1 and topic2
   commits onto a list called "pending_objects";

 - next, in prepare_revision_walk, it walks through the pending
   objects list and inserts them in a commit list, sorted by date;

and next?

    $ git log -Sget_revision -- revision.c
    [...]
    commit a4a88b2bab3b6fb0b30f63418701f42388e0fe0a
    Author: Linus Torvalds <torvalds@osdl.org>
    Date:   Tue Feb 28 11:24:00 2006 -0800

        git-rev-list libification: rev-list walking

        This actually moves the "meat" of the revision walking from
        rev-list.c to the new library code in revision.h. It introduces
        the new functions

            void prepare_revision_walk(struct rev_info *revs);
            struct commit *get_revision(struct rev_info *revs);

        to prepare and then walk the revisions that we have.

        Signed-off-by: Linus Torvalds <torvalds@osdl.org>
        Signed-off-by: Junio C Hamano <junkio@cox.net>

Well, that's actually not so helpful. I mean, it tells us that
get_revision is what takes care of the revision walk, but it doesn't
tell us what the revision walk consists of.

So here we need another trick to get at the meat of the matter --- we
need to know where this "revision walking from rev-list.c" came from.
Ah:

    $ git log -- rev-list.c
    [...]
    commit 64745109c41a5c4a66b9e3df6bca2fd4abf60d48
    Author: Linus Torvalds <torvalds@ppc970.osdl.org>
    Date:   Sat Apr 23 19:04:40 2005 -0700

        Add "rev-list" program that uses the new time-based commit listing.

        This is probably what you'd want to see for "git log".

And the answer is there in the patch for a commit that comes after
that (8906300, git-rev-list: use proper lazy reachability analysis,
2005-05-30).

Heh, probably I didn't choose the best example. :) A short article
about this in Documentation/technical certainly wouldn't be a bad
thing.

In addition to "git log -S" as used above, I tend to find "git blame -L"
helpful FWIW. And people on the list can be helpful, too.

> (libgit2 is reentrant and mostly threadsafe, so there's quite the
> architecture mismatch there),

Could you expand on that a little? I understand that a lot of git
code wouldn't be usable for libgit2 as-is and that there is going to
be some overhead from, say, using malloc to initialize buffers instead
of relying on static ones. But does that deserve to be called an
architecture mismatch? Would that make it hard to reuse libgit2 code
within git?

I'd be very interested in learning about more substantial differences
in approach. Probably the two codebases could learn a lot from each
other's design.

> Overall, you'd need balls of steel

Here I agree.

> HOWEVER. If you want to do something libgit2-related for the SoC
> (which would be awesome), there's still two options:
>
> a) Help us make the library more awesome by implementing new features!
> This task is the opposite of the previous one; it's like full of unicorns
> and rainbows. You can choose one (or more) features we are missing,
> and see how to implement them in libgit2 while making them reentrant,
> threadsafe AND faster. It's not easy, but it's fucking cool. And you
> get to do a lot of micro-optimization if you're into that.

Note that if this is your kind of thing, you might consider sending
"libification patches" to modify the code in git while at it. That
means free code review and free bugfixes from then on if your changes
are accepted.

> b) Write a minimal Git client using libgit2. Peff keeps bringing this
> up and I think it's a bangin' good idea. Write something small and
> 100% self contained in a C executable that runs everywhere with 0
> dependencies -- don't aim for full feature completion, just the basic
> stuff to interoperate with a Git repository.

I agree that this would be very neat, too.

> So, yeah. That's pretty much my libgit2-related advice for the SoC.

Thanks again, Vicent, for these very useful explanations.

> Best of luck with your application process with whatever project you decide,
> Vicent

Seconded. :)

Hope that helps,
Jonathan
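
The history-digging commands from the message above ("git log -S" and "git blame -L") can be tried without a git.git checkout. What follows is a minimal sketch against a throwaway repository; the file name demo.c, the commit messages, and the committer identity are made up for illustration, and it assumes git is installed.

```sh
#!/bin/sh
# Build a throwaway repository so the commands can be tried safely.
repo=$(mktemp -d) || exit 1
cd "$repo" || exit 1
git init -q .
git config user.name "Example Author"
git config user.email "author@example.com"

# First commit: the file does not mention handle_commit yet.
printf 'int main(void)\n{\n\treturn 0;\n}\n' >demo.c
git add demo.c
git commit -q -m "Add demo.c"

# Second commit: introduces the string we will search for.
printf 'static int handle_commit(void)\n{\n\treturn 1;\n}\n' >>demo.c
git commit -q -a -m "Introduce handle_commit"

# Pickaxe: list commits that changed the number of occurrences of the string.
introduced=$(git log --format=%s -Shandle_commit -- demo.c)
echo "$introduced"

# Blame a line range: show which commit last touched line 5 of demo.c.
git blame -L 5,5 -- demo.c
```

The pickaxe search reports only the second commit, because that is the one that changed how often the string occurs. In a real clone the same two commands would be pointed at revision.c or whichever file is being studied.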

* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
From: Pavel Raiskup @ 2011-03-22 16:43 UTC
To: Vicent Marti, Jonathan Nieder; +Cc: git, Jeff King, Ramkumar Ramachandra

Hello,

Jonathan Nieder wrote:
> Vicent Marti wrote:
>
>> -- I'm
>> personally doubtful of anybody succeeding on doing that project during
>> the SoC,
>
> ...
> If someone wants to work on this, I'd be glad to talk over what would
> be needed to make a realistic proposal.
>
>> so I have very little interest on mentoring the task.
>
> That's okay, of course. What's probably important for people
> considering this project is: would you be willing to answer questions
> and consider patches from a person working on this? That is, do you
> consider the goal even worthwhile?
>
> I am probably not the best person to mentor this but if no one else
> wants to then I would be interested.

As I can see now, it could be too heavy for me to produce results as
good as would be needed. This is probably quite a difficult task for
starting to contribute to git. I'd rather consider the other git topics
for now (but I'm not rejecting this idea yet).

> A good place to start is the initial commit e83c516 (Initial revision
> of "git", the information manager from hell, 2005-04-07).
> ........
> Heh, probably I didn't choose the best example. :) A short article
> about this in Documentation/technical certainly wouldn't be a bad
> thing.
>
> In addition to "git log -S" as used above, I tend to find "git blame -L"
> helpful FWIW. And people on the list can be helpful, too.

This "short" article is very helpful, thank you for that! I think it
can help all contributors (not only students) at the beginning of their
git journey.

>> b) Write a minimal Git client using libgit2. Peff keeps bringing this
>> up and I think it's a bangin' good idea. Write something small and
>> 100% self contained in a C executable that runs everywhere with 0
>> dependencies -- don't aim for full feature completion, just the basic
>> stuff to interoperate with a Git repository.
>
> I agree that this would be very neat, too.

The idea of a git client based on libgit2 sounds VERY interesting. I'm
going to ask for some details in the neighboring sub-thread.

>> Best of luck with your application process with whatever project you decide,
>> Vicent
>
> Seconded. :)

Thank you both. I'm not going to try other projects; there is not
enough time now for researching other projects, and git is my only
choice and desire.

Pavel
* Re: Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC)
  2011-03-20 21:01 ` Vicent Marti
  2011-03-20 23:44 ` Jeff King
  2011-03-21  1:27 ` Jonathan Nieder
@ 2011-03-23  0:24 ` Vincent van Ravesteijn
  2 siblings, 0 replies; 13+ messages in thread
From: Vincent van Ravesteijn @ 2011-03-23 0:24 UTC
To: Vicent Marti; +Cc: git

> b) Write a minimal Git client using libgit2. Peff keeps bringing this
> up and I think it's a bangin' good idea. Write something small and
> 100% self contained in a C executable that runs everywhere with 0
> dependencies -- don't aim for full feature completion, just the basic
> stuff to interoperate with a Git repository. Clone, checkout, branch,
> commit, push, pull, log. I would totally use that shit on my Windows
> boxes. And since it'll be externally compatible with the original Git
> client, we can reuse the Git unit tests to test libgit2. HA. Awesome!
>

I dream of having a platform-independent GUI based on libgit2 which
could be used to manage a large project. Set up the workflow in the
app, requiring only single mouse clicks to promote a topic branch into
the stable series. Have a button to merge all maint-branch updates
into the other branches. And more...

In order to come up with a possible workflow for our project, I have
been checking out how Git itself is managed. I was a little
disappointed that Junio uses some home-brewed scripts for Git. I don't
want to write them myself (on Windows).

I'm happy to see that the 'vger' people are supporting libgit2.

Anyway, when I do have some time, I am willing to contribute to the
libgit2 project.

Greetings,

Vincent
end of thread, other threads:[~2011-03-23 0:24 UTC | newest]

Thread overview: 13+ messages
2011-03-20 10:55 Histogram diff, libgit2 enhancement, libgit2 => git merge (GSOC) Pavel Raiskup
2011-03-20 18:06 ` Shawn Pearce
2011-03-22 12:32 ` Pavel Raiskup
2011-03-20 18:25 ` Junio C Hamano
2011-03-20 21:01 ` Vicent Marti
2011-03-20 23:44 ` Jeff King
2011-03-21  0:38 ` Vicent Marti
2011-03-22 17:32 ` Pavel Raiskup
2011-03-22 18:47 ` Jeff King
2011-03-22 19:18 ` Junio C Hamano
2011-03-21  1:27 ` Jonathan Nieder
2011-03-22 16:43 ` Pavel Raiskup
2011-03-23  0:24 ` Vincent van Ravesteijn