* native-git-svn: A Summer of Code 2010 proposal
@ 2010-03-19 17:18 Ramkumar Ramachandra
  2010-03-19 18:32 ` Avery Pennarun
  0 siblings, 1 reply; 33+ messages in thread
From: Ramkumar Ramachandra @ 2010-03-19 17:18 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Sverre Rabbelier

Hi,

I picked up a project I liked from the Wiki
[https://git.wiki.kernel.org/index.php/SoC2010Ideas#A_remote_helper_for_svn]
and discussed it with Sverre. I now have a preliminary draft of my
proposal ready, and I'd really appreciate feedback.

=====================================
Project Proposal: native-git-svn | Native SVN support in Git

== The Outline ==
Currently, git-svn.perl is used to interface with SVN repositories.
However, it has serious shortcomings:
1. It is essentially an arcane 5000-line Perl script that doesn't use
git-fast-import/ git-fast-export. It converts an SVN repository to a
Git repository by hand. This makes it virtually unmaintainable.
2. Its UI is unnecessarily complex. git-svn-* has some commands
corresponding to git-* commands, and it can be quite difficult for the
user to understand which one to use in different situations. These can
be merged easily.
3. It handles the standard trunk/branches/tags layout well, but it
doesn't know how to handle non-standard/ changing SVN layout.
4. There's an array of other annoyances which makes it quite
imperfect. For example, it ignores all SVN properties except
svn:executable.

While many of these problems can be tackled in git-svn.perl itself,
problem 1 is the most prominent. git-svn.perl is very difficult to
modify or even maintain. A more permanent solution is required.

My proposal is to start from scratch and build an application that
makes dealing with SVN repositories very easy. The plan is to build
component-wise, in a modular manner. The project can be considered
fully successful only after the functionality described in all the
components has been written, and the project is merged into upstream.
It will involve minimal changes to the current Git codebase, if any at
all. I additionally hope that this project will serve as a roadmap for
other projects that involve natively supporting other versioning
systems in Git.

== The Technicalities ==
The distinct components I plan to write are:
1. An SVN client that uses libsvn to fetch/ push revisions to a remote
SVN repository.
2. An exporter for SVN repositories, which will extract all the
relevant revision history and metadata to import into Git.
3. A remote helper for Git that takes the data from this SVN exporter,
and uses git-fast-import to create corresponding commits in Git.
4. Another remote helper to export commit data and metadata from Git
to import into SVN.
5. An importer for SVN, which will create revisions in SVN
corresponding to commits in Git.
6. A UI that glues all the components together into one large
consistent interface.

Due to a licensing conflict, the details of which can be found here
[1], native-git-svn will link to libsvn, but will NOT link to Git. It
will simply use a thin wrapper to call compiled Git executables
(referred to as remote helper in article). The six components will be
developed and tested independently.

The following resources are relevant to the project:
1. git_remote_helpers/git/git.py is a minimalistic remote helper
written by Sverre. I plan to extend this as much as possible before
rewriting it in C.
2. libsvn contains excellent documentation and clear examples to
create the SVN client.
3. git-svn.perl has a lot of functionality that I plan to re-implement
in native-git-svn:
3.1 parse_svn_date: Given a date (in UTC) from Subversion, return a
string in the format "<TZ Offset> <local date/time>" that Git will use
3.2 load_authors: <svn username> = real-name <email address> mapping
based on git-svnimport
3.3 do_git_init_db: Create and maintain svn-remotes
3.4 get_commit_entry: Parse commit messages, and encode them; SVN
requires messages to be UTF-8 when entering the repo
3.5 cmd_branch: Handle branching/ tagging
3.6 cmd_create_ignore: Reads svn:ignore and puts the information into
.gitignore
4. There are several existing third-party SVN exporters worth looking
into [2].

I've additionally discussed the project with Sverre Rabbelier at
length over email.

== Who am I? ==
I'm Ramkumar, a student at the Indian Institute of Technology,
Kharagpur. I haven't contributed more than a few small patches to Git
[3], and I look at this project as a fantastic opportunity to get more
involved with the community.

In the summer and winter of 2008, I worked with a Django-based
startup. The team consisted of three experienced Python developers,
one designer to steer the project, and an undergraduate student- me.
We versioned everything on Git, deployed on Apache/ PostgreSQL, using
Amazon S3 for static content. While working with the startup, I also
contributed to South, a migration framework for Django. A lot more
about this is mentioned on my resume [4].

C, C++ [5], and Python are my strongest languages. I've additionally
learnt Common Lisp through an Emacs Lisp application I wrote in summer
2009 [6].

I'm known to be very communicative, both in person, and over email/
chat. The style and clarity of my communication can be seen in the
slides I used at FOSS.IN/2009 in winter 2009 [7].

== Notes ==
[1] http://thread.gmane.org/gmane.comp.version-control.git/139545
[2] svn-all-fast-export | git://repo.or.cz/svn-all-fast-export.git and
fast-export | git://repo.or.cz/fast-export.git
[3] 52eb5173ac and 88d50e78c3
[4] TODO
[5] On a related note, I've also contributed a little to Chromium
[6] http://github.com/artagnon/ublog.el
[7] http://artagnon.com/wp-content/uploads/haskell-internals.pdf and
http://artagnon.com/wp-content/uploads/unladen-swallow.pdf
=====================================

Thanks!

Regards,
Ramkumar

^ permalink raw reply [flat|nested] 33+ messages in thread
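
Items 3.1 and 3.2 of the proposal above (date conversion and the author mapping) can be sketched in Python as follows. This is only an illustration, not git-svn's code: the function names `svn_date_to_git` and `load_authors` are made up here, the date is emitted in the raw "<epoch seconds> <offset>" form that git-fast-import accepts rather than the "<TZ Offset> <local date/time>" string the proposal mentions, and the result is always left in UTC.

```python
import re
from datetime import datetime, timezone

# svn:date values look like 2010-03-19T17:18:00.000000Z (always UTC).
SVN_DATE = re.compile(
    r"^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(?:\.\d+)?Z$")

# One mapping per line: svnuser = Real Name <email@example.com>
AUTHOR_LINE = re.compile(r"^(.+?)\s*=\s*(.+?)\s*<(.*)>\s*$")


def svn_date_to_git(svn_date):
    """Turn an svn:date string into '<epoch seconds> +0000'."""
    m = SVN_DATE.match(svn_date)
    if m is None:
        raise ValueError("unrecognised svn:date: %r" % svn_date)
    fields = [int(g) for g in m.groups()]
    stamp = datetime(*fields, tzinfo=timezone.utc)
    return "%d +0000" % int(stamp.timestamp())


def load_authors(lines):
    """Parse authors-file lines into {svn_user: (name, email)}."""
    authors = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines carry no mapping
        m = AUTHOR_LINE.match(line)
        if m is None:
            raise ValueError("malformed authors line: %r" % line)
        user, name, email = m.groups()
        authors[user] = (name, email)
    return authors
```

A caller would look up each revision's svn:author in the dictionary and combine the result with the converted date to form the author and committer lines of the Git commit.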
* Re: native-git-svn: A Summer of Code 2010 proposal
  2010-03-19 17:18 native-git-svn: A Summer of Code 2010 proposal Ramkumar Ramachandra
@ 2010-03-19 18:32 ` Avery Pennarun
  2010-03-19 18:39 ` Sverre Rabbelier
  2010-03-19 20:53 ` Jonathan Nieder
  0 siblings, 2 replies; 33+ messages in thread
From: Avery Pennarun @ 2010-03-19 18:32 UTC (permalink / raw)
  To: Ramkumar Ramachandra; +Cc: Git Mailing List, Sverre Rabbelier

On Fri, Mar 19, 2010 at 1:18 PM, Ramkumar Ramachandra
<artagnon@gmail.com> wrote:
> 1. It is essentially an arcane 5000-line Perl script that doesn't use
> git-fast-import/ git-fast-export. It converts an SVN repository to a
> Git repository by hand. This makes it virtually unmaintainable.
> [...]
> My proposal is to start from scratch and build an application that
> makes dealing with SVN repositories very easy. The plan is to build
> component-wise, in a modular manner. The project can be considered
> fully successful only after the functionality described in all the
> components has been written, and the project is merged into upstream.

"I don't understand the current implementation" is approximately the
worst possible reason to rewrite something from scratch. Here is a
great article about this problem:
http://www.joelonsoftware.com/articles/fog0000000069.html (from back
before Joel "junked the sharp").

Now, git-svn is not as big a project as, say, Netscape or dBase or
Quattro Pro. It's just a little piece. So rewriting from scratch
*might* actually work in this case. But you should be aware that:

- a lot of the complexity is there for a reason

- all those "extra commands" that git-svn supports are considered
backwards compatibility, even if they're absolutely obsolete because
of newer commands, and therefore will be very hard to justify getting
rid of

- getting your replacement merged into git is rather unlikely unless
you can provide this backwards compatibility *and* a comparable
feature set, or at least a compelling reason that yours should be
merged in *alongside* the existing git-svn, resulting in duplicate
functionality.

So if your goal is to write a possibly better replacement to git-svn,
that's a potentially great goal with an unfortunately high probability
of failure (but great upside if you don't fail). If you won't
consider it successful unless it gets merged upstream... then you're
setting yourself up for disappointment, at least if you expect to be
done within the GSoC timeframe.

So that's the "downer" part of my feedback. On the other hand, I
don't really like how git-svn works either, so I'd be happy to also
offer some constructive suggestions if you really are brave enough to
start from scratch :)

> 3. It handles the standard trunk/branches/tags layout well, but it
> doesn't know how to handle non-standard/ changing SVN layout.

Indeed. After fiddling with git-svn more than I'd like to think about
(we use a mishmash of git and svn at work, both with lots of branches,
and I'm basically the junction point between them), I have come to a
simple conclusion:

** Trying to map the svn branch-is-a-folder model to git's
branch-is-a-branch model is wrong. **

git-svn tries to take /trunk/* and /branches/foo/* and pretend that
the '*' in each corresponds to the same set of files, separated only
by history. This is more-or-less true in most svn repositories...
some more and some less. But the mapping is never 100% correct, and
once you've done the mapping, it's impossible to fix without doing a
whole new import.

Moreover, extracting these branches separately from the svn history
often involves downloading the same file contents over and over, which
is stupid and slow. git-svn goes through many contortions to *avoid*
such re-downloading, but it isn't always successful; sooner or later
it ends up re-downloading a tonne of stuff. And this is despite the
fact that svn has *explicit* recording of renames and copies, which
ought to be easy for git-svn to read and say "just copy the treeid
corresponding to the source and paste it into the destination" as a
trivial O(1) operation.

So anyway, my suggestion here is to do something so simple it's not
obvious: import the entire history, one revision at a time, from the
very top of the svn tree. That means the git history will show
/branches/whatever/* *and* /trunk/*, and 99% of the files will be the
same. Of course, the fact that they're the same costs you nothing,
because git is awesome. You only pay for it if you check out that
magic history branch into your working tree.

And that's the trick: you never want to actually check out this
full-history branch. Instead, use something like the technique from
'git subtree split' (http://github.com/apenwarr/git-subtree) to
extract sub-histories *from that primary imported history*.

That split operation can happen *really* fast, since all you're doing
is creating a bunch of synthetic git commits where the toplevel tree
points at the subtree from the corresponding commit in the primary
history. Creating a bunch of tiny commit objects is quick and easy.
And if you screw it up, no problem; just throw it away and regenerate
the subtree. None of that requires re-importing stuff from the svn
server, which is the unbelievably slow and expensive part.

Separating these two concepts, svn importing from
svn-branch-directory-swizzling, ought to make git-svn much more
manageable, much faster, and much more flexible.

> 4. There's an array of other annoyances which makes it quite
> imperfect. For example, it ignores all SVN properties except
> svn:executable.

Yes, you'll have to find a good solution to this that balances one set
of disadvantages (file attributes that aren't stored, or are stored
invisibly) with another (file attributes that are stored in dotfiles
or something and thus clutter up the work tree). Maybe git-notes
would help. Or maybe not :)

> The following resources are relevant to the project:
> 1. git_remote_helpers/git/git.py is a minimalistic remote helper
> written by Sverre. I plan to extend this as much as possible before
> rewriting it in C.

Are you sure you really want to rewrite git-svn in C? svn is so slow
that interpreted vs. native performance is unlikely to be an issue.
git-svn is probably not going to be needed on embedded systems where
installing python or perl is a problem. And managing the data
structures in a high-level language should be a lot easier.

You could always do your whole project in python or perl and make it
*work* the way you want. If it's really good, you can maybe get that
accepted into the git core. Then, if it's really modular enough, you
ought to be able to rewrite the modules one by one into C as needed.

Have fun,

Avery

^ permalink raw reply [flat|nested] 33+ messages in thread
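
The "import everything from the top, then split" idea described in the message above can be illustrated with a toy model in Python. This is a sketch under loud assumptions: `Store` is a made-up in-memory stand-in for git's object database (JSON plus SHA-1, not git's real object format), the imported history is assumed to be linear as in svn, and `split` is a simplified take on the idea rather than the algorithm used by git-subtree.

```python
import hashlib
import json


class Store:
    """Toy content-addressed store; identical content gets one id."""

    def __init__(self):
        self.objects = {}

    def put(self, kind, payload):
        raw = json.dumps([kind, payload], sort_keys=True).encode()
        oid = hashlib.sha1(raw).hexdigest()
        self.objects[oid] = (kind, payload)
        return oid

    def get(self, oid):
        return self.objects[oid]


def subtree_id(store, tree_id, path):
    """Follow 'a/b' below tree_id; return that tree's id or None."""
    for name in path.strip("/").split("/"):
        kind, entries = store.get(tree_id)
        if kind != "tree" or name not in entries:
            return None
        tree_id = entries[name]
    kind, _ = store.get(tree_id)
    return tree_id if kind == "tree" else None


def split(store, history, path):
    """Create synthetic commits for `path` from the full import.

    `history` lists commit ids of the whole-repository import, oldest
    first. Only small commit objects are written; trees and blobs are
    reused by id, so nothing is fetched from svn again.
    """
    result = []
    last_tree = None
    for commit_id in history:
        _, commit = store.get(commit_id)
        sub = subtree_id(store, commit["tree"], path)
        if sub is None or sub == last_tree:
            continue  # path absent, or untouched by this revision
        synthetic = {"tree": sub, "parents": result[-1:],
                     "msg": commit["msg"]}
        result.append(store.put("commit", synthetic))
        last_tree = sub
    return result
```

In this model an svn copy of /trunk to /branches/foo adds one entry pointing at an existing tree id, which is the "costs you nothing" property the message relies on. If the mapping turns out wrong, the synthetic commits can be discarded and `split` run again with a different path.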
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-19 18:32 ` Avery Pennarun @ 2010-03-19 18:39 ` Sverre Rabbelier 2010-03-19 21:30 ` Avery Pennarun 2010-03-19 20:53 ` Jonathan Nieder 1 sibling, 1 reply; 33+ messages in thread From: Sverre Rabbelier @ 2010-03-19 18:39 UTC (permalink / raw) To: Avery Pennarun; +Cc: Ramkumar Ramachandra, Git Mailing List Heya, On Fri, Mar 19, 2010 at 19:32, Avery Pennarun <apenwarr@gmail.com> wrote: > - all those "extra commands" that git-svn supports are considered > backwards compatibility, even if they're absolutely obsolete because > of newer commands, and therefore will be very hard to justify getting > rid of I don't think this is true. The proposal is to implement git-remote-svn, which would allow _native_ interaction with svn repositories, so without using 'git svn'. It would allow 'git clone svn://example.com/myrepo' and subsequent "git pull"s from that svn source. Do you agree that makes (part of) your comments moot, or am I missing something? -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-19 18:39 ` Sverre Rabbelier @ 2010-03-19 21:30 ` Avery Pennarun 2010-03-20 9:19 ` Ramkumar Ramachandra ` (2 more replies) 0 siblings, 3 replies; 33+ messages in thread From: Avery Pennarun @ 2010-03-19 21:30 UTC (permalink / raw) To: Sverre Rabbelier; +Cc: Ramkumar Ramachandra, Git Mailing List On Fri, Mar 19, 2010 at 2:39 PM, Sverre Rabbelier <srabbelier@gmail.com> wrote: > On Fri, Mar 19, 2010 at 19:32, Avery Pennarun <apenwarr@gmail.com> wrote: >> - all those "extra commands" that git-svn supports are considered >> backwards compatibility, even if they're absolutely obsolete because >> of newer commands, and therefore will be very hard to justify getting >> rid of > > I don't think this is true. The proposal is to implement > git-remote-svn, which would allow _native_ interaction with svn > repositories, so without using 'git svn'. It would allow 'git clone > svn://example.com/myrepo' and subsequent "git pull"s from that svn > source. Do you agree that makes (part of) your comments moot, or am I > missing something? I don't know enough about the proposal to comment on this part of the design. I do know that where git-svn fits into git's UI has not been the problem for me or my co-workers; we can learn some weirdo syntax if needed. Things like branching and merging, and git-svn redownloading the same stuff 100 times, and oddly-named-svn-branch-hierarchies, and git pulling between git-svn users, however, have given us lots of grief. For example, I'd be very happy to learn that your new design would allow two people to independently pull from svn://, do work in their respective copies of the git repositories, branch and merge all day long, pull from each other, and then push back to svn without a) making a mess of the svn repo and causing zillions of conflicts, or b) linearizing history and losing git's complex DAG. In the current version of git-svn this is very hard. 
'git svn dcommit' generates entirely new git commit objects corresponding to the ones that were created in svn... but which nevertheless have your merge history included, which is awesome. But if a new person clones the svn repo from scratch, he will end up with git commits corresponding to those same ones from svn, but *without* the merge history, and therefore with different commit ids, and which therefore prevent push/pulling between other people who have cloned the repo. If the above explanation doesn't make any sense, let me know and I can clarify it further. If you know what I'm talking about and have either solved it or don't care about that use case, please just ignore me and I'll go back to hide in my hole :) Have fun, Avery ^ permalink raw reply [flat|nested] 33+ messages in thread
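
The commit-id mismatch described above follows from how Git names commits: the id is a SHA-1 over the commit's contents, and the parent list is part of those contents. A minimal Python sketch makes this visible; the helper name `commit_id`, the author string, and the placeholder parent ids are invented for illustration, while the header layout and the empty-tree id are Git's own.

```python
import hashlib

# Id of git's empty tree, used here so the example needs no real files.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
AUTHOR = "A U Thor <author@example.com> 1269019080 +0000"


def commit_id(tree, parents, author, message):
    """Compute the SHA-1 git would assign to a commit object."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]
    lines += ["author " + author, "committer " + author, "", message]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    header = b"commit " + str(len(body)).encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Same tree, author and message; only the recorded parents differ.
# Placeholder parent ids stand in for real commits.
with_merge = commit_id(EMPTY_TREE, ["1" * 40, "2" * 40], AUTHOR, "r42")
linear = commit_id(EMPTY_TREE, ["1" * 40], AUTHOR, "r42")
print(with_merge != linear)
```

So a clone that reconstructs svn revision r42 with one parent and a clone whose dcommit kept two parents end up with different ids for "the same" revision, and every descendant commit differs as well. That is why the two repositories no longer share history for push and pull.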
* Re: native-git-svn: A Summer of Code 2010 proposal
  2010-03-19 21:30 ` Avery Pennarun
@ 2010-03-20 9:19 ` Ramkumar Ramachandra
  2010-03-20 10:48 ` Johannes Schindelin
  2010-03-21 23:51 ` Dave Olszewski
  2 siblings, 0 replies; 33+ messages in thread
From: Ramkumar Ramachandra @ 2010-03-20 9:19 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Johannes.Schindelin, Sverre Rabbelier, Git Mailing List,
      Jonathan Nieder

Hi,

> So if your goal is to write a possibly better replacement to git-svn,
> that's a potentially great goal with an unfortunately high probability
> of failure (but great upside if you don't fail). If you won't
> consider it successful unless it gets merged upstream... then you're
> setting yourself up for disappointment, at least if you expect to be
> done within the GSoC timeframe.

As Sverre pointed out, this is not the goal of the project. The
proposal is not to rewrite git-svn. You're probably right to assume
that any such endeavor would be unsuccessful in one summer. The
proposal is to create an application that will natively support SVN
repositories in Git. I'm simply pointing out the limitations of
git-svn as a motivation for this project. As I've mentioned in my
proposal, good SVN exporters already exist, and creating an SVN client
can be fairly elementary. The whole point of the project is to move
away from the "git-svn.perl approach". Of course, that doesn't mean
that I won't use some parts of git-svn in native-git-svn. Along with
creating the infrastructure for this approach, I do expect to have
*working* native SVN support at the end of summer merged into
mainline. I'll make this clearer in the next revision of my proposal.

> You could always do your whole project in python or perl and make it
> *work* the way you want. If it's really good, you can maybe get that
> accepted into the git core. Then, if it's really modular enough, you
> ought to be able to rewrite the modules one by one into C as needed.

Writing everything in C can be quite painful. I plan to start off by
prototyping the various components in Python anyway. If and when it's
necessary, components can be re-implemented in C.

> In the current version of git-svn this is very hard. 'git svn dcommit'
> generates entirely new git commit objects corresponding to the ones
> that were created in svn... but which nevertheless have your merge
> history included, which is awesome. But if a new person clones the
> svn repo from scratch, he will end up with git commits corresponding
> to those same ones from svn, but *without* the merge history, and
> therefore with different commit ids, and which therefore prevent
> push/pulling between other people who have cloned the repo.

Oh, that's terribly ugly. Thanks for pointing it out. I haven't
thought of a solution yet, but yes- it would be really nice if the new
design could handle this elegantly.

Do feel free to tell me what you'd like to see in the next revision of
my proposal, and what you'd like to see omitted. A proposal can't run
into many pages, so I'll attach anything that's very detailed as
notes.

Thanks,
Ramkumar

^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-19 21:30 ` Avery Pennarun 2010-03-20 9:19 ` Ramkumar Ramachandra @ 2010-03-20 10:48 ` Johannes Schindelin 2010-03-20 20:34 ` Ramkumar Ramachandra 2010-03-21 23:51 ` Dave Olszewski 2 siblings, 1 reply; 33+ messages in thread From: Johannes Schindelin @ 2010-03-20 10:48 UTC (permalink / raw) To: Avery Pennarun; +Cc: Sverre Rabbelier, Ramkumar Ramachandra, Git Mailing List Hi, On Fri, 19 Mar 2010, Avery Pennarun wrote: > On Fri, Mar 19, 2010 at 2:39 PM, Sverre Rabbelier <srabbelier@gmail.com> wrote: > > On Fri, Mar 19, 2010 at 19:32, Avery Pennarun <apenwarr@gmail.com> wrote: > >> - all those "extra commands" that git-svn supports are considered > >> backwards compatibility, even if they're absolutely obsolete because > >> of newer commands, and therefore will be very hard to justify getting > >> rid of > > > > I don't think this is true. The proposal is to implement > > git-remote-svn, which would allow _native_ interaction with svn > > repositories, so without using 'git svn'. It would allow 'git clone > > svn://example.com/myrepo' and subsequent "git pull"s from that svn > > source. Do you agree that makes (part of) your comments moot, or am I > > missing something? > > I don't know enough about the proposal to comment on this part of the > design. How about reading it? It's on the Wiki. Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal
  2010-03-20 10:48 ` Johannes Schindelin
@ 2010-03-20 20:34 ` Ramkumar Ramachandra
  2010-03-20 20:55 ` Ramkumar Ramachandra
  ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Ramkumar Ramachandra @ 2010-03-20 20:34 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Avery Pennarun, Sverre Rabbelier, Git Mailing List,
      Jonathan Nieder

Hi,

I just prepared another revision of my proposal- I've tried to be
clearer about the objective, and included a timeline this time. Note
that I've also changed the name from native-git-svn to git-remote-svn,
as recommended by Sverre.

======================================
Project Proposal: git-remote-svn | Native SVN support in Git

== The Outline ==
The objective of git-remote-svn is to allow native interaction with
SVN repositories in Git. The motivation for writing this comes from
the shortcomings of the current approach: git-svn.
1. It is essentially an arcane 5000-line Perl script that doesn't use
git-fast-import/ git-fast-export. It converts an SVN repository to a
Git repository by hand. This makes it virtually unmaintainable.
2. The UI is unnatural and complex. git-svn-* has some commands
corresponding to git-* commands, and it can be quite difficult for the
user to understand which one to use in different situations.
3. It handles the standard trunk/branches/tags layout well, but it
doesn't know how to handle non-standard/ changing SVN layout.
4. There's an array of other annoyances which makes it quite
imperfect. For example, it ignores all SVN properties except
svn:executable.

While the last two problems can be tackled in git-svn.perl itself, a
fresh approach is required to tackle the first two. git-remote-svn is
a proposal for an alternative approach.
1. Several good SVN exporters already exist, and using them with
git-fast-import should simplify a lot of the plumbing git-svn tackles
by hand.
2. Using a remote helper to keep track of SVN remotes will simplify
the UI greatly. The fresh UI will allow for a simple `git clone
svn://example.com/myrepo` and multiple subsequent `git pull`
invocations.

However, the project does not aim to be compatible with git-svn, and
does not serve as an immediate replacement. It can be considered fully
successful after the functionality described in all the components has
been written. Merging the project to upstream will involve small
changes to the Git codebase to incorporate the native UI. I
additionally hope that this project will serve as a roadmap for other
projects that involve natively supporting other versioning systems in
Git.

== The Technicalities ==
I've discussed the project with Sverre Rabbelier at length over email.
The plan is to build component-wise. The distinct components are:
1. An SVN client that uses libsvn to fetch/ push revisions to a remote
SVN repository.
2. An exporter for SVN repositories, which will extract all the
relevant revision history and metadata to import into Git.
3. A remote helper for Git that takes the data from this SVN exporter,
and uses git-fast-import to create corresponding commits in Git.
4. Another remote helper to export commit data and metadata from Git
to import into SVN.
5. An importer for SVN, which will create revisions in SVN
corresponding to commits in Git.
6. A UI that glues all the components together.

Due to a licensing conflict, the details of which can be found here
[1], git-remote-svn will link to libsvn, but will NOT link to Git. It
will simply use a thin wrapper to call compiled Git executables
(referred to as remote helper in article).

The following resources will help build the various components:
1. git_remote_helpers/git/git.py is a small remote helper written by
Sverre that wraps around git-fast-import. I plan to extend this to
wrap around git-fast-export as well.
2. git-svn.perl contains a two-way mapping, parts of which I plan to
implement.
3. Thiago Macieira's svn-all-fast-export [2] has a complete SVN -> Git
mapping. I plan to take several ideas from the branch/ tag mapper in
repository.cpp.

== Timeline ==
April 26 - May 24: Study svn-all-fast-export extensively, and chalk
out a mapper for SVN branches and tags. Take time to become
comfortable with libsvn.

May 24 - June 10: Write a minimal importer/ exporter for SVN in
Python. It should allow interconversion between commit messages,
timestamps, authors corresponding to each revision/ commit. Also
figure out how to preserve the data in SVN properties.

June 10 - June 25: Extend the remote helper in
git_remote_helpers/git/git.py to do git-fast-export as well. Make the
importer/ exporter for SVN work with it.

June 25 - July 5: Implement a minimal SVN client, and make it work
with the SVN exporter.

July 5 - July 10: Write a quick UI binder. This will involve
modifications to builtin-push, builtin-fetch, builtin-clone, and
remote to use the SVN client to perform the corresponding actions in
the case of an SVN remote.

July 10 - July 16: Scrub code and write documentation for mid-term
evaluations. Commit changes to the remote helper to upstream. Try to
get the other changes into `next`.

July 17 - August 9: Implement the branch and tag mapper.

August 9 - August 16: Write more documentation and get everything
ready for final evaluations. Commit the SVN importer/ exporter and UI
changes.

== Who am I? ==
I'm Ramkumar, a student at the Indian Institute of Technology,
Kharagpur. I haven't contributed more than a few small patches to Git
[3], and I look at this project as a fantastic opportunity to get more
involved with the community.

In the summer and winter of 2008, I worked with a Django-based
startup. The team consisted of three experienced Python developers,
one designer to steer the project, and an undergraduate student- me.
We versioned everything on Git, deployed on Apache/ PostgreSQL, using
Amazon S3 for static content. While working for the startup, I also
contributed to South.

C, C++ [4], and Python are my strongest languages. I've additionally
learnt Common Lisp through an Emacs Lisp application I wrote in summer
2009 [5].

I'm known to be very communicative, both in person, and over email/
chat. The style and clarity of my communication can be seen in the
slides I used at FOSS.IN/2009 in winter 2009 [6].

== Notes ==
[1] http://thread.gmane.org/gmane.comp.version-control.git/139545
[2] git://repo.or.cz/svn-all-fast-export.git
[3] 52eb5173ac and 88d50e78c3
[4] On a related note, I've also contributed a little to Chromium
[5] http://github.com/artagnon/ublog.el
[6] http://artagnon.com/wp-content/uploads/haskell-internals.pdf and
http://artagnon.com/wp-content/uploads/unladen-swallow.pdf
======================================

Again, I'd really appreciate comments.

Thanks,
Ramkumar

^ permalink raw reply [flat|nested] 33+ messages in thread
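
Component 3 of the revised proposal (a helper that turns exporter output into a git-fast-import stream) might emit something like the following. This is a hedged Python sketch, not part of git_remote_helpers: the function `write_commit` and its arguments are invented for illustration, and it only handles paths that need no quoting, regular non-executable files, and a single linear branch. The stream commands themselves (`commit`, `mark`, `committer`, `data`, `from`, `M`, `D`) are those of git-fast-import.

```python
import io


def write_commit(out, ref, mark, committer, when, message, files,
                 parent_mark=None):
    """Append one commit to a git-fast-import stream.

    `files` maps a path to its new content as bytes, or to None when
    the revision deleted that path. `when` is the raw git date, for
    example '1269019080 +0000'.
    """
    msg = message.encode("utf-8")
    out.write(("commit %s\n" % ref).encode())
    out.write(("mark :%d\n" % mark).encode())
    out.write(("committer %s %s\n" % (committer, when)).encode())
    out.write(("data %d\n" % len(msg)).encode() + msg + b"\n")
    if parent_mark is not None:
        out.write(("from :%d\n" % parent_mark).encode())
    for path, content in files.items():
        if content is None:
            out.write(("D %s\n" % path).encode())
        else:
            # 100644 = regular file; svn:executable is ignored here.
            out.write(("M 100644 inline %s\n" % path).encode())
            out.write(("data %d\n" % len(content)).encode())
            out.write(content + b"\n")
    out.write(b"\n")


# Two svn revisions rendered as two commits on one branch.
stream = io.BytesIO()
who = "A U Thor <author@example.com>"
write_commit(stream, "refs/heads/master", 1, who, "0 +0000",
             "r1", {"README": b"hi\n"})
write_commit(stream, "refs/heads/master", 2, who, "60 +0000",
             "r2", {"README": None, "NEWS": b"moved\n"}, parent_mark=1)
```

The bytes in `stream` could then be piped into `git fast-import` inside an empty repository. Using marks instead of commit ids lets the exporter refer to earlier commits without asking Git what ids it assigned.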
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-20 20:34 ` Ramkumar Ramachandra @ 2010-03-20 20:55 ` Ramkumar Ramachandra 2010-03-20 21:04 ` Jonathan Nieder ` (2 subsequent siblings) 3 siblings, 0 replies; 33+ messages in thread From: Ramkumar Ramachandra @ 2010-03-20 20:55 UTC (permalink / raw) To: Johannes Schindelin Cc: Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder > July 5 - July 10: Write a quick UI binder. This will involve > modifications to builtin-push, builtin-fetch, builtin-clone, and > remote to use the SVN client to perform the corresponding actions in > the case of an SVN remote. Correction: This functionality is already present. I'll just continue working on my SVN client for this period. Regards, Ramkumar ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-20 20:34 ` Ramkumar Ramachandra 2010-03-20 20:55 ` Ramkumar Ramachandra @ 2010-03-20 21:04 ` Jonathan Nieder 2010-03-21 10:26 ` Johannes Schindelin 2010-03-20 21:58 ` native-git-svn: A Summer of Code 2010 proposal Daniel Barkalow 2010-03-21 7:40 ` Peter Baumann 3 siblings, 1 reply; 33+ messages in thread From: Jonathan Nieder @ 2010-03-20 21:04 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List Hi, Ramkumar Ramachandra wrote: > I just prepared another revision of my proposal- I've tried to be > clearer about the objective, and included a timeline this time. Very nice. Thanks! > == Timeline == The one thing I worry about is that you are proposing to wait a while before submitting your changes upstream. I would suggest pushing whatever pieces work to contrib/ early on to get more feedback from reviewers and testers. (I am saying this selfishly, as a potential tester.) > July 10 - July 16: Scrub code and write documentation for mid-term > evaluations. Commit changes to the remote helper to upstream. Try to > get the other changes into `next`. In particular, this seems like very little time to get the code into shape for git.git. Hope that helps. Cheers, Jonathan ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-20 21:04 ` Jonathan Nieder @ 2010-03-21 10:26 ` Johannes Schindelin 2010-03-21 11:08 ` Jonathan Nieder 0 siblings, 1 reply; 33+ messages in thread From: Johannes Schindelin @ 2010-03-21 10:26 UTC (permalink / raw) To: Jonathan Nieder Cc: Ramkumar Ramachandra, Avery Pennarun, Sverre Rabbelier, Git Mailing List Hi, On Sat, 20 Mar 2010, Jonathan Nieder wrote: > Ramkumar Ramachandra wrote: > > > == Timeline == > > The one thing I worry about is that you are proposing to wait a while > before submitting your changes upstream. I would suggest pushing > whatever pieces work to contrib/ early on to get more feedback from > reviewers and testers. (I am saying this selfishly, as a potential > tester.) I would rather have frequent updates about the progress on the mailing list, and a long-running branch in which the code is developed, only rebasing to Junio's next/pu when absolutely necessary. After all, it would be additional work to put it first into contrib/ and then to integrate it fully into git.git. Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 10:26 ` Johannes Schindelin @ 2010-03-21 11:08 ` Jonathan Nieder 2010-03-21 11:47 ` Johannes Schindelin 0 siblings, 1 reply; 33+ messages in thread From: Jonathan Nieder @ 2010-03-21 11:08 UTC (permalink / raw) To: Johannes Schindelin Cc: Ramkumar Ramachandra, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Hi, Johannes Schindelin wrote: > On Sat, 20 Mar 2010, Jonathan Nieder wrote: >> Ramkumar Ramachandra wrote: >>> == Timeline == >> >> The one thing I worry about is that you are proposing to wait a while >> before submitting your changes upstream. I would suggest pushing >> whatever pieces work to contrib/ early on to get more feedback from >> reviewers and testers. (I am saying this selfishly, as a potential >> tester.) > > I would rather have frequent updates about the progress on the mailing > list, and a long-running branch in which the code is developed, only > rebasing to Junio's next/pu when absolutely necessary. You are usually right about this kind of thing, so I will not disagree too strongly. But I will say: I think this was a mistake in the git sequencer project. Stephan did excellent work both on and off list, and I think it is a shame that as little of his code reached mainline by the end of the summer as did. I imagine that submitting bit by bit would have required a different approach: maybe a sequencer--helper that would gradually grow to absorb more of the functionality of the prototype script. Harder, but the result would be working code. Now it is hard enough to merge current master into the sequencer branch... Whether to use stable topic branches or rebased-against-master patch series as the means of submission is a decision that matters less to me. (I prefer the former.) > After all, it would be additional work to put it first into contrib/ and > then to integrate it fully into git.git. I am not sure I understand this point. 
Are you saying the change in filenames would be problematic? Curious, Jonathan ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 11:08 ` Jonathan Nieder @ 2010-03-21 11:47 ` Johannes Schindelin 2010-03-21 12:25 ` Ramkumar Ramachandra 2010-03-21 16:43 ` Best example of GSoC student participation (was: Re: native-git-svn: A Summer of Code 2010 proposal) Jakub Narebski 0 siblings, 2 replies; 33+ messages in thread From: Johannes Schindelin @ 2010-03-21 11:47 UTC (permalink / raw) To: Jonathan Nieder Cc: Ramkumar Ramachandra, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Hi, On Sun, 21 Mar 2010, Jonathan Nieder wrote: > Johannes Schindelin wrote: > > On Sat, 20 Mar 2010, Jonathan Nieder wrote: > >> Ramkumar Ramachandra wrote: > > >>> == Timeline == > >> > >> The one thing I worry about is that you are proposing to wait a while > >> before submitting your changes upstream. I would suggest pushing > >> whatever pieces work to contrib/ early on to get more feedback from > >> reviewers and testers. (I am saying this selfishly, as a potential > >> tester.) > > > > I would rather have frequent updates about the progress on the mailing > > list, and a long-running branch in which the code is developed, only > > rebasing to Junio's next/pu when absolutely necessary. > > You are usually right about this kind of thing, so I will not disagree > too strongly. > > But I will say: I think this was a mistake in the git sequencer project. The mistakes in the sequencer project were more than this. Not only was the development of the branch almost invisible, when it was done, it was basically with a comment "here it is, take it or leave it", and good suggestions as to improve the code went unheeded. That's why I suggested frequent progress reports on the mailing list. 
Of course, these reports should only be commented upon by people who are fully informed about the project, they should not be invitations to everybody and her dog to distract the student by putting in unreasonable or uninformed wishes. > Now it is hard enough to merge current master into the sequencer > branch... The problem is not the merging. The problem is that the code is not in a form I (or certain others) want to see in git.git. > > After all, it would be additional work to put it first into contrib/ > > and then to integrate it fully into git.git. > > I am not sure I understand this point. Are you saying the change in > filenames would be problematic? I say that distracting the student from the real task is problematic. The real task does not involve putting the code into contrib/ first, and then move it into the final location. Personally, I would have little problems just adding the remote and checking out the branch, just to test the thing after I got a promising progress report. And I think those who are truly interested in git-remote-svn will have little problems, either. The important part would be the visible progress (i.e. mails by the student to this list). Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 11:47 ` Johannes Schindelin @ 2010-03-21 12:25 ` Ramkumar Ramachandra 2010-03-21 12:31 ` Johannes Schindelin ` (3 more replies) 2010-03-21 16:43 ` Best example of GSoC student participation (was: Re: native-git-svn: A Summer of Code 2010 proposal) Jakub Narebski 1 sibling, 4 replies; 33+ messages in thread From: Ramkumar Ramachandra @ 2010-03-21 12:25 UTC (permalink / raw) To: Johannes Schindelin Cc: Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Hi, > Personally, I would have little problems just adding the remote and > checking out the branch, just to test the thing after I got a promising > progress report. And I think those who are truly interested in > git-remote-svn will have little problems, either. The important part would > be the visible progress (i.e. mails by the student to this list). Thanks for the elaborate explanation. The way I see it, there are two extreme situations I must avoid. The first is being so opaque that I risk not being able to integrate it into git.git at the end of the summer term. The other extreme is worrying so much about the integration of each little bit that the project keeps getting distracted, and eventually loses focus. To strike a balance, I will post progress reports to the mailing list (at least) once a week, and keep a public development branch for myself. Occasionally, it might help to post patches for small components of the project with unit tests to get a wider test audience. Ramkumar ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 12:25 ` Ramkumar Ramachandra @ 2010-03-21 12:31 ` Johannes Schindelin 2010-03-21 12:36 ` Sverre Rabbelier ` (2 subsequent siblings) 3 siblings, 0 replies; 33+ messages in thread From: Johannes Schindelin @ 2010-03-21 12:31 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Hi, On Sun, 21 Mar 2010, Ramkumar Ramachandra wrote: > Dscho wrote: > > > Personally, I would have little problems just adding the remote and > > checking out the branch, just to test the thing after I got a > > promising progress report. And I think those who are truly interested > > in git-remote-svn will have little problems, either. The important > > part would be the visible progress (i.e. mails by the student to this > > list). > > Thanks for the elaborate explanation. The way I see it, there are two > extreme situations I must avoid. The first is being so opaque that I risk > not being able to integrate it into git.git at the end of the summer > term. The other extreme is worrying so much about the integration of > each little bit that the project keeps getting distracted, and eventually > loses focus. To strike a balance, I will post progress reports to the > mailing list (at least) once a week, and keep a public development branch > for myself. Occasionally, it might help to post patches for small > components of the project with unit tests to get a wider test audience. That sounds very good! Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 12:25 ` Ramkumar Ramachandra 2010-03-21 12:31 ` Johannes Schindelin @ 2010-03-21 12:36 ` Sverre Rabbelier 2010-03-21 17:58 ` Jonathan Nieder 2010-03-22 0:33 ` Daniel Barkalow 3 siblings, 0 replies; 33+ messages in thread From: Sverre Rabbelier @ 2010-03-21 12:36 UTC (permalink / raw) To: Ramkumar Ramachandra, Junio C Hamano Cc: Johannes Schindelin, Jonathan Nieder, Avery Pennarun, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Heya, On Sun, Mar 21, 2010 at 13:25, Ramkumar Ramachandra <artagnon@gmail.com> wrote: > Occasionally, it might help to post patches for small components > of the project with unit tests to get a wider test audience. I suggest (and this is a suggestion for all git GSoC projects) that all students be required to post a weekly status update to the mailing list. If they then prefix those mails with [GSoC update] it is 1) easy to filter out these emails for those who are not interested in them and 2) easy later on to get a quick overview of who did what when, by searching for all [GSoC update] emails. At first these emails can just be a description of what was done in the past week, which design decisions were made (and why), and what the current status of the project is. Later on these should be complemented by patches to the git list (prefixed with [RFC/GSoC/PATCH] or such) with the progress of the student so far. In this particular case I think it'd be a good idea to start sending patches around 3-4 weeks into the project, since by June 20 you hope to have something minimally usable. I reckon it would be good to keep these patches in pu to give them wider exposure, and to start merging them to next as appropriate. For example, as soon as there is full read support that would be a good time to merge to next to allow more people to give it a try. Junio, how do you feel about keeping these (most likely fairly-unstable) GSoC patches in pu? 
I think it would be a great motivation for our students (look, I've got my work into pu!), and it'd be a great way to make it easy for everybody else to try out the work of our students. -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 12:25 ` Ramkumar Ramachandra 2010-03-21 12:31 ` Johannes Schindelin 2010-03-21 12:36 ` Sverre Rabbelier @ 2010-03-21 17:58 ` Jonathan Nieder 2010-03-22 0:33 ` Daniel Barkalow 3 siblings, 0 replies; 33+ messages in thread From: Jonathan Nieder @ 2010-03-21 17:58 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer Ramkumar Ramachandra wrote: > Thanks for the elaborate explanation. The way I see it, there are two > extreme situations I must avoid. The first is being so opaque that I > risk not being able to integrate it into git.git at the end of the > summer term. The other extreme is worrying so much about the > integration of each little bit that the project keeps getting > distracted, and eventually loses focus. To strike a balance, I will > post progress reports to the mailing list (at least) once a week, and > keep a public development branch for myself. Occasionally, it might > help to post patches for small components of the project with > unit tests to get a wider test audience. Sounds great to me, too. It seems you understand the issues well. Cheers, Jonathan ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-21 12:25 ` Ramkumar Ramachandra ` (2 preceding siblings ...) 2010-03-21 17:58 ` Jonathan Nieder @ 2010-03-22 0:33 ` Daniel Barkalow 2010-03-22 2:41 ` Christian Couder 3 siblings, 1 reply; 33+ messages in thread From: Daniel Barkalow @ 2010-03-22 0:33 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Johannes Schindelin, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Christian Couder, Stephan Beyer On Sun, 21 Mar 2010, Ramkumar Ramachandra wrote: > Hi, > > > Personally, I would have little problems just adding the remote and > > checking out the branch, just to test the thing after I got a promising > > progress report. And I think those who are truly interested in > > git-remote-svn will have little problems, either. The important part would > > be the visible progress (i.e. mails by the student to this list). > > Thanks for the elaborate explanation. The way I see it, there are two > extreme situations I must avoid. The first is being so opaque that I > risk not being able to integrate it into git.git at the end of the > summer term. The other extreme is worrying so much about the > integration of each little bit that the project keeps getting > distracted, and eventually loses focus. To strike a balance, I will > post progress reports to the mailing list (at least) once a week, and > keep a public development branch for myself. Occasionally, it might > help to post patches for small components of the project with > unit tests to get a wider test audience. One thing to keep in mind is that you'll get review at a slower rate than you'll make progress, and you'll need progress, review, and fixes to get integration. This means that the optimal pattern is to post incomplete things (marked [RFC PATCH]) when you've got enough there to show where you're going and you think the quality of the code you have is pretty good. 
Your patches go out, and you work on the next step while other people find them, read them, write comments, and you get the comments. Then you incorporate the changes for the comments into the next round (or you acknowledge the need for changes, but defer them to the third round, if you've got a second round ready). The thing that really stalls a project, either in the middle or at the end, is when you can't do anything while you wait for a round-trip exchange with reviewers (or multiple round-trips, if the comments are non-trivial and you need further explanation or to propose alternatives). The longer you anticipate between sending the patches out and having them included, and the busier you can stay in that time, the better. Overlapping does mean that you end up reworking later patches, but (unless you can save up hours of work) it's better to have patches to rework than to be starting from scratch at that point, and it's better to know what you'll have to rework as early as is feasible. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-22 0:33 ` Daniel Barkalow @ 2010-03-22 2:41 ` Christian Couder 2010-03-22 3:49 ` Ramkumar Ramachandra 0 siblings, 1 reply; 33+ messages in thread From: Christian Couder @ 2010-03-22 2:41 UTC (permalink / raw) To: Daniel Barkalow Cc: Ramkumar Ramachandra, Johannes Schindelin, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Stephan Beyer On Monday 22 March 2010 01:33:47 Daniel Barkalow wrote: > One thing to keep in mind is that you'll get review at a slower rate than > you'll make progress, and you'll need progress, review, and fixes to get > integration. This means that the optimal pattern is to post incomplete > things (marked [RFC PATCH]) when you've got enough there to show where > you're going and you think the quality of the code you have is pretty > good. Your patches go out, and you work on the next step while other > people find them, read them, write comments, and you get the comments. > Then you incorporate the changes for the comments into the next round (or > you acknowledge the need for changes, but defer them to the third round, > if you've got a second round ready). I agree but I think that it should be stressed that a GSoC project should be split into many milestones and that each milestone should be in itself a worthwhile improvement to the previous state. So that when a milestone is reached, the code can be sent to the list for review (marked [RFC PATCH]) and then improved and sent again as many times as needed (marked with v1, v2, ...) to get it merged. And it is important to understand that responding to reviews and doing whatever is needed to get the code for the first milestones merged _is more important_ than developing code for the next milestones. Because it's much better for everyone at the end of the GSoC if only half of the project is finished but merged, rather than if all the project is "finished" but nothing can be merged. 
The code that can't be merged will rust very fast and will probably need quite some work that unfortunately few people may want or be able to do fast enough after the end of the GSoC. And that means that basically the work that has been done will be mostly lost which is very _very_ frustrating for students, mentors, reviewers and everyone involved... Best regards, Christian. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-22 2:41 ` Christian Couder @ 2010-03-22 3:49 ` Ramkumar Ramachandra 2010-03-22 11:33 ` Johannes Schindelin 0 siblings, 1 reply; 33+ messages in thread From: Ramkumar Ramachandra @ 2010-03-22 3:49 UTC (permalink / raw) To: Christian Couder Cc: Daniel Barkalow, Johannes Schindelin, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Stephan Beyer > Don't know about importer modes, but in native connection mode it is > possible to avoid calling or linking to git in any way (been there, done > that). > Mostly, except that I think it should be possible to avoid having > git-remote-svn actually link to the git core, because the git core should > be taking care of everything git-specific for you. Of course, the git core > also provides a bunch of useful C library code that you may want to use, > such as a nice string buffer implementation, so you may want to link to > git even if you don't actually need it, if licenses are suitable and it > would be convenient. As of this point, I'm undecided about which parts of Git Core to link to, if at all. I'll try to avoid linking, but I'll do whatever is most convenient within the bounds of the license as I write the remote helper. > I solved this problem you mention by rebasing in both directions onto > detached HEADs and exporting the result, meaning that the history is > permanently diverged from a DAG standpoint. Of course, over time, the > rebase would become increasingly messy and horrible, so I created a > couple of placeholder refs which are updated after the import/export is > finished. These mark the last time it was done, and allow you only to > attempt to apply the commits which are new on each side. Ah. Could you please post a link to your code? > Because it's much better for everyone at the end of the GSoC if only half of > the project is finished but merged, rather than if all the project is "finished" > but nothing can be merged. Right. 
I'll merge the whole thing in 3-4 phases then. -- Ram ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-22 3:49 ` Ramkumar Ramachandra @ 2010-03-22 11:33 ` Johannes Schindelin [not found] ` <f3271551003220643j3a726d09o2d3a078292fd8bf6@mail.gmail.com> 0 siblings, 1 reply; 33+ messages in thread From: Johannes Schindelin @ 2010-03-22 11:33 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Christian Couder, Daniel Barkalow, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Stephan Beyer Hi Ram, On Mon, 22 Mar 2010, Ramkumar Ramachandra wrote: > > Don't know about importer modes, but in native connection mode it is > > possible to avoid calling or linking to git in any way (been there, > > done that). > > > Mostly, except that I think it should be possible to avoid having > > git-remote-svn actually link to the git core, because the git core > > should be taking care of everything git-specific for you. Of course, > > the git core also provides a bunch of useful C library code that you > > may want to use, such as a nice string buffer implementation, so you > > may want to link to git even if you don't actually need it, if > > licenses are suitable and it would be convenient. > > As of this point, I'm undecided about which parts of Git Core to link > to, if at all. I'll try to avoid linking, but I'll do whatever is most > convenient within the bounds of the license as I write the remote > helper. AFAICT the git-remote idea is to use a text protocol (which may lend itself to be copied by Mercurial at some stage, a clear sign that we did something well when that happens). > > Because it's much better for everyone at the end of the GSoC if only > > half of the project is finished but merged, rather than if all the > > project is "finished" but nothing can be merged. > > Right. I'll merge the whole thing in 3-4 phases then. I am sure that Sverre will be of tremendous help to decide when is the opportune moment to show off your code. 
Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal [not found] ` <f3271551003220643j3a726d09o2d3a078292fd8bf6@mail.gmail.com> @ 2010-03-22 19:52 ` Johannes Schindelin 2010-03-23 7:49 ` Ramkumar Ramachandra 0 siblings, 1 reply; 33+ messages in thread From: Johannes Schindelin @ 2010-03-22 19:52 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Christian Couder, Daniel Barkalow, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Stephan Beyer Hi, On Mon, 22 Mar 2010, Ramkumar Ramachandra wrote: > Thank you all for the feedback. Here's an updated version of the diagram > I posted earlier with some corrections. I've assumed that I won't be > linking to Git core at all. The flow looks good (although I had to jump through hoops to look at it, I am reading mails via ssh'ed text console). I have to admit that I would prefer a more detailed timeline, if only to show how well you understand the individual requirements of trying to interpret the fast-import protocol in an svn context. Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-22 19:52 ` Johannes Schindelin @ 2010-03-23 7:49 ` Ramkumar Ramachandra 0 siblings, 0 replies; 33+ messages in thread From: Ramkumar Ramachandra @ 2010-03-23 7:49 UTC (permalink / raw) To: Johannes Schindelin Cc: Christian Couder, Daniel Barkalow, Jonathan Nieder, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Stephan Beyer Hi, > The flow looks good (although I had to jump through hoops to look at it, I > am reading mails via ssh'ed text console). In that case, I shouldn't put any crucial information in the figure like the timeline, in the interest of several reviewers who might be in a similar position. > I have to admit that I would prefer a more detailed timeline, if only to > show how well you understand the individual requirements of trying to > interpret the fast-import protocol in an svn context. Is the timeline in the text of the proposal detailed enough? I'm planning to submit that, and remove the one in the figure. -- Ram ^ permalink raw reply [flat|nested] 33+ messages in thread
* Best example of GSoC student participation (was: Re: native-git-svn: A Summer of Code 2010 proposal) 2010-03-21 11:47 ` Johannes Schindelin 2010-03-21 12:25 ` Ramkumar Ramachandra @ 2010-03-21 16:43 ` Jakub Narebski 2010-03-21 17:27 ` Best example of GSoC student participation Johannes Schindelin 1 sibling, 1 reply; 33+ messages in thread From: Jakub Narebski @ 2010-03-21 16:43 UTC (permalink / raw) To: Johannes Schindelin Cc: Jonathan Nieder, Ramkumar Ramachandra, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer, Shawn O. Pearce Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > The mistakes in the sequencer project were more than this. Not only was > the development of the branch almost invisible, when it was done, it was > basically with a comment "here it is, take it or leave it", and good > suggestions as to improve the code went unheeded. > > That's why I suggested frequent progress reports on the mailing list. > Of course, these reports should only be commented upon by people who are > fully informed about the project, they should not be invitations to > everybody and her dog to distract the student by putting in unreasonable > or uninformed wishes. By the way, which of the former GSoC students (and for which project) can serve as the best example of good interaction with the git community and with GSoC mentors? -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Best example of GSoC student participation 2010-03-21 16:43 ` Best example of GSoC student participation (was: Re: native-git-svn: A Summer of Code 2010 proposal) Jakub Narebski @ 2010-03-21 17:27 ` Johannes Schindelin 0 siblings, 0 replies; 33+ messages in thread From: Johannes Schindelin @ 2010-03-21 17:27 UTC (permalink / raw) To: Jakub Narebski Cc: Jonathan Nieder, Ramkumar Ramachandra, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Daniel Barkalow, Christian Couder, Stephan Beyer, Shawn O. Pearce Hi, On Sun, 21 Mar 2010, Jakub Narebski wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > The mistakes in the sequencer project were more than this. Not only was > > the development of the branch almost invisible, when it was done, it was > > basically with a comment "here it is, take it or leave it", and good > > suggestions as to improve the code went unheeded. > > > > That's why I suggested frequent progress reports on the mailing list. > > Of course, these reports should only be commented upon by people who are > > fully informed about the project, they should not be invitations to > > everybody and her dog to distract the student by putting in unreasonable > > or uninformed wishes. > > By the way, which of the former GSoC students (and for which project) > can serve as the best example of good interaction with the git > community and with GSoC mentors? I think the best examples we have to show are - Sverre (even if his work is not in git.git; Junio thought that the project could gain more visibility outside of Git, but that had rather the opposite effect), who interacted with the Git community and with his mentor rather well, - Marek, who did a good job with the push support in JGit/EGit (even if he did not communicate with the community frequently, he did often enough, and listened to advice), and of course - Miklos, who was already known on the Git mailing list before getting really involved through the Summer of Code. 
Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: native-git-svn: A Summer of Code 2010 proposal 2010-03-20 20:34 ` Ramkumar Ramachandra 2010-03-20 20:55 ` Ramkumar Ramachandra 2010-03-20 21:04 ` Jonathan Nieder @ 2010-03-20 21:58 ` Daniel Barkalow 2010-03-20 22:19 ` Ramkumar Ramachandra ` (2 more replies) 2010-03-21 7:40 ` Peter Baumann 3 siblings, 3 replies; 33+ messages in thread From: Daniel Barkalow @ 2010-03-20 21:58 UTC (permalink / raw) To: Ramkumar Ramachandra Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder On Sun, 21 Mar 2010, Ramkumar Ramachandra wrote: > Hi, > > I just prepared another revision of my proposal- I've tried to be > clearer about the objective, and included a timeline this time. Note > that I've also changed the name from native-git-svn to git-remote-svn, > as recommended by Sverre. > > ====================================== > Project Proposal: git-remote-svn | Native SVN support in Git > > == The Outline == > The objective of git-remote-svn is to allow native interaction with > SVN repositories in Git. The motivation for writing this comes from > the shortcomings of the current approach: git-svn. > 1. It is essentially an arcane 5000-line Perl script that doesn't use > git-fast-import/ git-fast-export. It converts an SVN repository to a > Git repository by hand. This makes it virtually unmaintainable. > 2. The UI is unnatural and complex. git-svn-* has some commands > corresponding to git-* commands, and it can be quite difficult for the > user to understand which one to use in different situations. > 3. It handles the standard trunk/branches/tags layout well, but it > doesn't know how to handle non-standard/ changing SVN layout. > 4. There's an array of other annoyances which makes it quite > imperfect. For example, it ignores all SVN properties except > svn:executable. > > While the last two problems can be tackled in git-svn.perl itself, a > fresh approach is required to tackle the first two. 
git-remote-svn is > a proposal for an alternative approach. > 1. Several good SVN exporters already exist, and using them with > git-fast-import should simplify a lot of the plumbing git-svn tackles > by hand. > 2. Using a remote helper to keep track of SVN remotes will simplify > the UI greatly. The fresh UI will allow for a simple `git clone > svn://example.com/myrepo` and multiple subsequent `git pull` > invocations. > > However, the project does not aim to be compatible with git-svn, and > does not serve as an immediate replacement. It can be considered fully > successful after the functionality described in all the components > have been written. Merging the project to upstream will involve small > changes to the Git codebase to incorporate the native UI. I > additionally hope that this project will serve as a roadmap for other > projects that involve natively supporting other versioning systems in > Git. > > == The Technicalities == > I've discussed the project with Sverre Rabbelier at length over email. > The plan is to build component-wise. The distinct components are: > 1. An SVN client that uses libsvn to fetch/ push revisions to a remote > SVN repository. > 2. An exporter for SVN repositories, which will extract all the > relevant revision history and metadata to import into Git. > 3. A remote helper for Git that takes the data from this SVN exporter, > and uses git-fast-import to create corresponding commits in Git. > 4. Another remote helper to export commit data and metadata from Git > to import into SVN. > 5. An importer for SVN, which will create revisions in SVN > corresponding to commits in Git. The structure for remote helpers should be that each foreign system has a single helper which git can call with instructions on what to do (both for foreign-to-git and for git-to-foreign operations). So 3 and 4 have to be functions of the same program, and it's probably best for 2 and 5 and maybe 1 to also be part of this program. 
The structure is that git will essentially call you in a pipeline like: commands | you | git-fast-import or: git-fast-export | you | git-fast-import So the helper wouldn't be running git-fast-export or git-fast-import, unless it was a helper for using git as the foreign system. > 6. A UI that glues all the components together. If you use the remote-helper framework (and probably extend it as necessary), there shouldn't need to be a UI. The grand idea is that, regardless of what mechanism you use to interact with git, it would use the transport code from the library, which would know how to interact with remote helpers, and users and UI developers don't need to know about SVN or Perforce or any other particular system. An extra-grand idea would be to allow helpers to be agnostic about the local system they're helping, so that Mercurial could use an SVN helper you developed for git, and git could use a Darcs helper that the Darcs people developed without a particular local system in mind, just to be interoperable. > Due to a licensing conflict, the details of which can be found here > [1], git-remote-svn will link to libsvn, but will NOT link to Git. It > will simply use a thin wrapper to call compiled Git executables > (referred to as remote helper in article). It should be possible to avoid calling any git executables (directly or otherwise); git should call you with all the information you need. > The following resources will help build the various components: > 1. git_remote_helpers/git/git.py is a small remote helper written by > Sverre that wraps around git-fast-import. I plan to extend this to > wrap around git-fast-export as well. > 2. git-svn.perl contains a two-way mapping, parts of which I plan to implement. > 3. Thiago Macieira's svn-all-fast-export [2] has a complete SVN -> Git > mapping. I plan to take several ideas from the branch/ tag mapper in > repository.cpp. If you're going to work in C, you should look at my Perforce helper. 
It's unsuitable for mainline inclusion, due to using a free-as-in-beer, made-available-without-license-terms C++ library for the Perforce side, but may be a better model for a C remote helper than git.py is. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 33+ messages in thread
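[Editorial sketch, not part of the original message.] The "commands | you | git-fast-import" structure Daniel describes can be illustrated with a toy helper loop. This is a minimal sketch under several assumptions: it follows the `capabilities` / `list` / `import` commands as documented in gitremote-helpers for later Git releases (the `feature done` / `done` framing postdates this thread), it talks to no SVN server at all, and the ref names, committer identity, and file contents are made up for illustration.

```python
import sys

# Hypothetical ref namespaces, chosen for illustration only.
PUBLIC_PREFIX = "refs/heads/"
PRIVATE_PREFIX = "refs/svn/origin/"


def data(text):
    """Format a fast-import 'data' block; the count is in bytes."""
    return "data %d\n%s\n" % (len(text.encode("utf-8")), text)


def fake_stream(refs):
    """Build a fast-import stream with one placeholder commit per ref.

    A real helper would walk SVN revisions here instead.
    """
    lines = ["feature done\n"]
    for mark, ref in enumerate(refs, start=1):
        name = ref[len(PUBLIC_PREFIX):] if ref.startswith(PUBLIC_PREFIX) else ref
        lines.append("commit %s%s\n" % (PRIVATE_PREFIX, name))
        lines.append("mark :%d\n" % mark)
        lines.append("committer SVN User <user@example.com> 1269100800 +0000\n")
        lines.append(data("Placeholder for an imported SVN revision\n"))
        lines.append("M 100644 inline README\n")
        lines.append(data("hello from svn\n"))
    lines.append("done\n")
    return "".join(lines)


def respond(commands):
    """Yield one reply per command line that git would send on stdin."""
    batch = []
    for line in commands:
        cmd = line.rstrip("\n")
        if cmd == "capabilities":
            yield "import\nrefspec %s*:%s*\n\n" % (PUBLIC_PREFIX, PRIVATE_PREFIX)
        elif cmd == "list":
            # "?" means the helper cannot cheaply tell git the ref's value.
            yield "? refs/heads/trunk\n\n"
        elif cmd.startswith("import "):
            batch.append(cmd[len("import "):])
        elif cmd == "" and batch:
            # A blank line closes a batch of import commands.
            yield fake_stream(batch)
            batch = []
        elif cmd == "":
            return  # assumption of this sketch: a bare blank line ends the session
        else:
            raise ValueError("unsupported command: %r" % cmd)


def main():
    # git waits for each reply, so flush after every chunk.
    for chunk in respond(sys.stdin):
        sys.stdout.write(chunk)
        sys.stdout.flush()


# git invokes a helper as: git-remote-<name> <remote> <url>
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Note that the helper never runs git-fast-import itself: it only writes the stream to stdout, and git is the one that pipes it onward, which matches the point made in the message above.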
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Ramkumar Ramachandra @ 2010-03-20 22:19 UTC
To: Daniel Barkalow
Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder

> The one thing I worry about is that you are proposing to wait a while
> before submitting your changes upstream. I would suggest pushing
> whatever pieces work to contrib/ early on to get more feedback from
> reviewers and testers. (I am saying this selfishly, as a potential
> tester.)

Okay, I'll try to get patches integrated immediately then.

> The structure for remote helpers should be that each foreign system has a
> single helper which git can call with instructions on what to do (both for
> foreign-to-git and for git-to-foreign operations). So 3 and 4 have to be
> functions of the same program, and it's probably best for 2 and 5 and
> maybe 1 to also be part of this program.

Right. I only split it up for the purposes of illustration. 3 and 4 will be merged into a program called `git-remote-svn` that will automatically be invoked when Git encounters an SVN remote. 2 and 5 will be merged into another program `svn-export-import` which can be thought of as the fusion of svn-fast-export and svn-fast-import. `git-remote-svn` will invoke it when necessary. And yeah, I don't know if I want to write the SVN client into `svn-export-import` or leave it as a separate program.

> So the helper wouldn't be running git-fast-export or git-fast-import,
> unless it was a helper for using git as the foreign system.

Ah. I just realized that :)

> If you're going to work in C, you should look at my Perforce helper. It's
> unsuitable for mainline inclusion, due to using a free-as-in-beer,
> made-available-without-license-terms C++ library for the Perforce side,
> but may be a better model for a C remote helper than git.py is.

Thanks. I'll have a look. git.py isn't very useful.

Regards,
Ramkumar
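The "single helper which git can call with instructions" structure discussed above can be sketched roughly as follows. This is a minimal illustration and not code from the thread: it assumes the line-oriented command protocol described in Git's remote-helper documentation (`capabilities`, `list`, `import <ref>`, with a blank line ending the session), the `REFS` table and `run_demo` are hypothetical, and batching of `import` commands is left out. The SVN side is stubbed with a comment line instead of a real stream.

```python
import io
import sys

# Hypothetical ref table; a real helper would query the SVN side instead.
# "?" tells git that the helper does not know the object id of the ref.
REFS = {"refs/heads/master": "?"}


def serve(stdin, stdout):
    """Answer remote-helper commands, one per line, until EOF or a blank line."""
    for raw in stdin:
        line = raw.rstrip("\n")
        if line == "":
            break  # a blank line ends the session in this sketch
        if line == "capabilities":
            # One capability per line, terminated by a blank line.
            stdout.write("import\n\n")
        elif line == "list":
            for name, value in REFS.items():
                stdout.write(f"{value} {name}\n")
            stdout.write("\n")
        elif line.startswith("import "):
            ref = line.split(" ", 1)[1]
            # Placeholder only: a real helper would write a fast-import
            # stream for `ref` here, and git itself would consume it.
            stdout.write(f"# fast-import stream for {ref} would go here\n")
        else:
            raise ValueError(f"unsupported command: {line}")
        stdout.flush()


def run_demo():
    """Feed two commands to the helper and return what it answered."""
    out = io.StringIO()
    serve(io.StringIO("capabilities\nlist\n\n"), out)
    return out.getvalue()


if __name__ == "__main__":
    serve(sys.stdin, sys.stdout)
```

Note that the helper never starts git-fast-import on its own; it only writes to standard output, which matches Daniel's remark that the helper would not be running git-fast-export or git-fast-import itself.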
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Ramkumar Ramachandra @ 2010-03-21 5:36 UTC
To: Daniel Barkalow
Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder

Hi,

> The structure for remote helpers should be that each foreign system has a
> single helper which git can call with instructions on what to do (both for
> foreign-to-git and for git-to-foreign operations). So 3 and 4 have to be
> functions of the same program, and it's probably best for 2 and 5 and
> maybe 1 to also be part of this program.

I've attached a small image specifying the relationship between various components. I plan to include a more elaborate version of this in my final proposal. Is this what you had in mind?

Regards,
Ramkumar

[Attachment: flow.png, image/png, 33412 bytes]
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Daniel Barkalow @ 2010-03-21 22:56 UTC
To: Ramkumar Ramachandra
Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder

On Sun, 21 Mar 2010, Ramkumar Ramachandra wrote:

> Hi,
>
> > The structure for remote helpers should be that each foreign system has a
> > single helper which git can call with instructions on what to do (both for
> > foreign-to-git and for git-to-foreign operations). So 3 and 4 have to be
> > functions of the same program, and it's probably best for 2 and 5 and
> > maybe 1 to also be part of this program.
>
> I've attached a small image specifying the relationship between
> various components. I plan to include a more elaborate version of this
> in my final proposal. Is this what you had in mind?

Mostly, except that I think it should be possible to avoid having git-remote-svn actually link to the git core, because the git core should be taking care of everything git-specific for you.

Of course, the git core also provides a bunch of useful C library code that you may want to use, such as a nice string buffer implementation, so you may want to link to git even if you don't actually need it, if licenses are suitable and it would be convenient.

-Daniel
*This .sig left intentionally blank*
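One way to see why a helper need not link to the git core: the data it hands over can be plain text in the git-fast-import stream format, which any program can produce. The sketch below is an illustration under assumptions, not code from the thread. The function name `fast_import_commit` and its parameters are hypothetical; only the stream syntax (`commit`, `mark`, `committer`, `data`, `M ... inline`) follows the git-fast-import documentation. Paths that would need quoting and the `from`/`merge` lines are left out for brevity.

```python
def fast_import_commit(ref, mark, name, email, when, message, files):
    """Return the bytes of one fast-import 'commit' command.

    `files` maps path -> bytes content. Every file is written as a regular
    (mode 100644) inline blob. `when` is seconds since the epoch, and the
    timezone is fixed to +0000 to keep the sketch short.
    """
    msg = message.encode("utf-8")
    out = [
        f"commit {ref}\n".encode(),
        f"mark :{mark}\n".encode(),
        f"committer {name} <{email}> {when} +0000\n".encode(),
        f"data {len(msg)}\n".encode(), msg, b"\n",
    ]
    for path, content in sorted(files.items()):
        out += [
            f"M 100644 inline {path}\n".encode(),
            f"data {len(content)}\n".encode(), content, b"\n",
        ]
    out.append(b"\n")  # optional blank line closing the commit command
    return b"".join(out)


# Example: one SVN revision turned into one commit on master.
stream = fast_import_commit(
    "refs/heads/master", 1, "A U Thor", "author@example.com",
    1269000000, "Import r1\n", {"README": b"hello\n"},
)
```

Nothing here touches Git's C code or even runs a git executable; the bytes are simply written to standard output for git to consume.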
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Ilari Liusvaara @ 2010-03-21 17:08 UTC
To: Daniel Barkalow
Cc: Ramkumar Ramachandra, Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder

On Sat, Mar 20, 2010 at 05:58:34PM -0400, Daniel Barkalow wrote:
> On Sun, 21 Mar 2010, Ramkumar Ramachandra wrote:
>
> > Due to a licensing conflict, the details of which can be found here
> > [1], git-remote-svn will link to libsvn, but will NOT link to Git. It
> > will simply use a thin wrapper to call compiled Git executables
> > (referred to as remote helper in article).
>
> It should be possible to avoid calling any git executables (directly or
> otherwise); git should call you with all the information you need.

Don't know about importer modes, but in native connection mode it is possible to avoid calling or linking to git in any way (been there, done that).

And if one doesn't need git APIs, it's completely possible to avoid linking into git (this may mean some more coding effort, however).

-Ilari
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Peter Baumann @ 2010-03-21 7:40 UTC
To: Ramkumar Ramachandra
Cc: Johannes Schindelin, Avery Pennarun, Sverre Rabbelier, Git Mailing List, Jonathan Nieder

On Sun, Mar 21, 2010 at 02:04:40AM +0530, Ramkumar Ramachandra wrote:
> Hi,
>
> I just prepared another revision of my proposal- I've tried to be
> clearer about the objective, and included a timeline this time. Note
> that I've also changed the name from native-git-svn to git-remote-svn,
> as recommended by Sverre.
> ...
> == The Technicalities ==
> I've discussed the project with Sverre Rabbelier at length over email.
> The plan is to build component-wise. The distinct components are:
> 1. An SVN client that uses libsvn to fetch/ push revisions to a remote
> SVN repository.
> 2. An exporter for SVN repositories, which will extract all the
> relevant revision history and metadata to import into Git.

Isn't that called an importer? At least if I am looking from the Git side it imports a SVN repository.

> 3. A remote helper for Git that takes the data from this SVN exporter,
> and uses git-fast-import to create corresponding commits in Git.

Ditto.

> 4. Another remote helper to export commit data and metadata from Git
> to import into SVN.
     ^^^^^^
     export
> 5. An importer for SVN, which will create revisions in SVN
        ^^^^^^^
        exporter
> corresponding to commits in Git.

I have to admit, I like your proposal. Your first points sound a little bit too negative for my taste considering git svn serves me well on my day job, but fair enough.

--
Peter
* Re: Re: native-git-svn: A Summer of Code 2010 proposal
From: Dave Olszewski @ 2010-03-21 23:51 UTC
To: Avery Pennarun; +Cc: Sverre Rabbelier, Ramkumar Ramachandra, Git Mailing List

On Fri, 19 Mar 2010, Avery Pennarun wrote:

> For example, I'd be very happy to learn that your new design would
> allow two people to independently pull from svn://, do work in their
> respective copies of the git repositories, branch and merge all day
> long, pull from each other, and then push back to svn without a)
> making a mess of the svn repo and causing zillions of conflicts, or b)
> linearizing history and losing git's complex DAG.
>
> In the current version of git-svn this is very hard. 'git svn dcommit'
> generates entirely new git commit objects corresponding to the ones
> that were created in svn... but which nevertheless have your merge
> history included, which is awesome. But if a new person clones the
> svn repo from scratch, he will end up with git commits corresponding
> to those same ones from svn, but *without* the merge history, and
> therefore with different commit ids, and which therefore prevent
> push/pulling between other people who have cloned the repo.

I've been working on a script that does 2-way integration with an upstream CVS repo, using git-cvsimport and git-cvsexportcommit to do the difficult parts.

I solved this problem you mention by rebasing in both directions onto detached HEADs and exporting the result, meaning that the history is permanently diverged from a DAG standpoint. Of course, over time, the rebase would become increasingly messy and horrible, so I created a couple of placeholder refs which are updated after the import/export is finished. These mark the last time it was done, and allow you only to attempt to apply the commits which are new on each side.

It's still very green and I've already worked through a number of pretty hairy problems, so I'm not going to say it's a bulletproof solution. But it does work.

Dave
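Avery's point quoted in this message, that the same svn revision imported with and without merge history ends up with different commit ids, follows from how Git names commits: the id is the SHA-1 of the commit object, and the parent lines are part of that object. The sketch below is only an illustration; the helper name `commit_id`, the author string, and the parent ids are made up, while the object layout (`commit <size>\0` header, `tree`, `parent`, `author`, `committer` lines) is Git's standard commit format.

```python
import hashlib

# The well-known id of Git's empty tree, used so the example needs no repo.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, author, message):
    """Compute the id Git would give a commit object with these fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    body = "\n".join(lines).encode("utf-8")
    header = f"commit {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


who = "A U Thor <author@example.com> 1269000000 +0000"
first_parent = "1" * 40   # made-up parent ids, for illustration only
second_parent = "2" * 40

# Same tree, author, date and message; only the recorded parents differ.
with_merge = commit_id(EMPTY_TREE, [first_parent, second_parent], who, "r42\n")
linearized = commit_id(EMPTY_TREE, [first_parent], who, "r42\n")
```

Because `with_merge` and `linearized` differ, every descendant commit differs too, which is why two independently created clones cannot simply push and pull between each other.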
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Jonathan Nieder @ 2010-03-19 20:53 UTC
To: Avery Pennarun; +Cc: Ramkumar Ramachandra, Git Mailing List, Sverre Rabbelier

Avery Pennarun wrote:
> On Fri, Mar 19, 2010 at 1:18 PM, Ramkumar Ramachandra

>> The following resources are relevant to the project:
>> 1. git_remote_helpers/git/git.py is a minimalistic remote helper
>> written by Sverre. I plan to extend this as much as possible before
>> rewriting it in C.
>
> Are you sure you really want to rewrite git-svn in C? svn is so slow
> that interpreted vs. native performance is unlikely to be an issue.
> git-svn is probably not going to be needed on embedded systems where
> installing python or perl is a problem. And managing the data
> structures in a high-level language should be a lot easier.

Hmm. Sverre discussed why this is more about a redesign of svn interop support than a C reimplementation of git-svn. I wouldn’t mind if at the end of the summer, all we have is some working Python code. Still, it would have to be rewritten in C or Perl before msysgit could use it unless some hero packages a Python interpreter for them.

Jonathan
* Re: native-git-svn: A Summer of Code 2010 proposal
From: Johannes Schindelin @ 2010-03-19 21:00 UTC
To: Jonathan Nieder
Cc: Avery Pennarun, Ramkumar Ramachandra, Git Mailing List, Sverre Rabbelier

Hi,

On Fri, 19 Mar 2010, Jonathan Nieder wrote:

> Avery Pennarun wrote:
> > On Fri, Mar 19, 2010 at 1:18 PM, Ramkumar Ramachandra
> >
> >> The following resources are relevant to the project:
> >> 1. git_remote_helpers/git/git.py is a minimalistic remote helper
> >> written by Sverre. I plan to extend this as much as possible before
> >> rewriting it in C.
> >
> > Are you sure you really want to rewrite git-svn in C? svn is so slow
> > that interpreted vs. native performance is unlikely to be an issue.
> > git-svn is probably not going to be needed on embedded systems where
> > installing python or perl is a problem. And managing the data
> > structures in a high-level language should be a lot easier.
>
> Hmm. Sverre discussed why this is more about a redesign of svn
> interop support than a C reimplementation of git-svn. I wouldn’t mind
> if at the end of the summer, all we have is some working Python code.
> Still, it would have to be rewritten in C or Perl before msysgit could
> use it unless some hero packages a Python interpreter for them.

It's not about packaging the interpreter. It is about _compiling_ it, so that we can also compile native extensions for performance.

Last time I tried to compile Python with MSys, I gave up. After a full week of trying.

Ciao,
Dscho