From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jakub Narebski Subject: Re: [PATCH 2/3] add new Git::Repo API Date: Mon, 14 Jul 2008 01:28:43 +0200 Message-ID: <200807140128.44923.jnareb@gmail.com> References: <4876B223.4070707@gmail.com> <1215738665-5153-1-git-send-email-LeWiemann@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: git@vger.kernel.org, John Hawley , Petr Baudis To: Lea Wiemann X-From: git-owner@vger.kernel.org Mon Jul 14 01:29:52 2008 Return-path: Envelope-to: gcvg-git-2@gmane.org Received: from vger.kernel.org ([209.132.176.167]) by lo.gmane.org with esmtp (Exim 4.50) id 1KIB0q-0005k9-3F for gcvg-git-2@gmane.org; Mon, 14 Jul 2008 01:29:52 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752219AbYGMX2x convert rfc822-to-quoted-printable (ORCPT ); Sun, 13 Jul 2008 19:28:53 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752192AbYGMX2x (ORCPT ); Sun, 13 Jul 2008 19:28:53 -0400 Received: from gv-out-0910.google.com ([216.239.58.186]:44358 "EHLO gv-out-0910.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751153AbYGMX2w (ORCPT ); Sun, 13 Jul 2008 19:28:52 -0400 Received: by gv-out-0910.google.com with SMTP id e6so829276gvc.37 for ; Sun, 13 Jul 2008 16:28:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:subject:date :user-agent:cc:references:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:message-id; bh=xLp0cK1OamMTfjauFrs27XTvVQc/k1yPViUJqiS2pP8=; b=MCMNl9LZKEVuDWk9XbJUN2n4ObUjF9ZLzKQidK6ftWHeYbMlAsshMYa9t20xbP1xxh PoupIvo3zQhJwgBk2bVpspqnSTUJPdh/ATjKI5mjcFEkRBSgVD3JiqM38HQvcUsXcEJg nHm+DduQo7CD0p8cPN7vh2t8GXeHwRyAW56hc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:subject:date:user-agent:cc:references:in-reply-to :mime-version:content-type:content-transfer-encoding :content-disposition:message-id; b=ATPN6oT1EO8Ra6m8VvT9mgHuDB2MiF9AEfVSAF9/vlbLKkzlNiktDKtD5SUe+N7rCF 8YwMTk8bKsDWG79ZiI96+MqHRRrlwwERc+gwxOExxVs+Rqvrr+uVnGCB9pzWdoMv4yO3 fj3dd0AEr+gwttAZJuW1fqdaTyczhQXdxOhfs= Received: by 10.103.49.12 with SMTP id b12mr7350284muk.72.1215991730409; Sun, 13 Jul 2008 16:28:50 -0700 (PDT) Received: from ?192.168.1.11? ( [83.8.211.66]) by mx.google.com with ESMTPS id u9sm17376756muf.12.2008.07.13.16.28.47 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sun, 13 Jul 2008 16:28:49 -0700 (PDT) User-Agent: KMail/1.9.3 In-Reply-To: <1215738665-5153-1-git-send-email-LeWiemann@gmail.com> Content-Disposition: inline Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: On Fri, 11 July 2008 03:11, Lea Wiemann napisa=B3: > This also adds the Git::Commit and Git::Tag classes, which are used b= y > Git::Repo, the Git::Object base class, and the Git::RepoRoot helper > factory class. I think it would be perhaps better to explain relationship and purpose of each class in more detail, including Git::Repo. No signoff - is it deliberate, or just omission? > --- > Here's some elaboration on why I didn't use or extend Git.pm. >=20 > Please note before starting a reply to this: This is not an argument; > I'm just explaining why I implemented it the way I did. So please > don't try to argue with me about what I should or should have done. > I'm not going to refactor Git::Repo to use Git.pm or vice versa; it's > really a much more non-trivial task than you might think at first > glance. >=20 > Anyways, the following bullet points are my reasons for not extending > Git.pm: >=20 > - Git.pm doesn't do what I want: It's designed to provide access to > working copies. Extending it to have more repository-access > functions might have resulted in a mess. I don't quite understand. True, Git->repository(...) has a bit strange [optional] parameters, but you should remember that IIRC it predates possibility of workdir / worktree / working copy separation. That I think is why it supports non-bare repositories (default), and bare repositories, not separating working copy from repository. Also, from what I understand and remember, Git.pm was created to have one place, one single implementation for safe and operating system=20 independent (which means that it works with ActiveState Perl on=20 Windows) "pipe" and "backticks". Everything else is just convenience. =20 > Some long-term thoughts on this issue: I don't think Git.pm's desig= n > approach is sensible: It tries to do (a) WC access, (b) repo access= , > and (c) frontend error handling (with sensible error messages). I can see (b) and (c), but I have trouble seeing (a). > Those things should really be separated; e.g. one could write a > Git::WC class that *has* a Git::Repo instance (since every working > copy has a repository associated with it); so you can use $wc =3D > Git::WC->new to access the working copy, and $wc->repo to access it= s > repository. Please remember that the opposite relation is also true. Non-bare=20 repository has working copy / worktree associated with it, either=20 implicitly (.git/.. is working copy), or explicitly (core.worktree). Also even if worktree and repository are separated, there are a few files in non-bare repository which refer to worktree condition, namely = =20 $GIT_DIR/HEAD, $GIT_DIR/logs/HEAD and $GIT_DIR/index. > Git.pm will obviously have to stay since a few git=20 > commands use it,=20 The following commands use Git.pm: contrib/examples/git-remote.perl, git-add--interactive.perl (helper script), git-cvsexportcommit, git-svn= ,=20 git-send-email. The following commands are written in Perl, but *do not* use Git.pm: git-archimport (which probably should go to contrib, and be replaced by= =20 fast-import / fast-export based Bazaar import), git-cvsimport,=20 git-cvsserver, git-relink (rarely used now, I think). > but I'd probably let it die a slow death, and (cleanly!) copy > functionality to a Git::WC module (and perhaps a Git::Frontend > module) in the long run.=20 I'm not so sure if it is a way to go. Most git commands wants to just=20 invoke other git commands safely, either creating pipeline, or reading=20 whole input; sometimes command output is not important. Non OO things, like ability to write print color('reset') . "\n"; is also important. > - It would have needed refactoring and more features. >=20 > - For example, the Error module should be removed (and normal die > and eval should be used instead). At some point when I was tryin= g > to debug it through gitweb, Git.pm would error out somewhere and = I > would randomly get either a blank page (usually) or an error page > (rarely). I suspect that this is because of the Error module or > some strange interaction with another. I'm not sure if using Error module and syntactic sugar in the form of=20 try { ... } catch with { ... } was a good idea for (c),=20 frontend error handling. Or if it was a good idea (the explanation in=20 http://www.perl.com/pub/a/2002/11/14/exception.html is compelling) if it was implemented fully, completely, and correctly. How would you like to catch errors from frontend in Git::Repo and=20 friends? > [...] (Besides, Error is not in > the Perl distribution, so it would be an unnecessary dependency > for gitweb; the only one as far as I can see.) Git Makefile installs Error module from local copy if one is not=20 available. perl-Git module (which would be unfortunately required to=20 use gitweb; well, that or you could install it locally) has=20 "perl(Error)" as one of dependencies. perl-Error module is a standard module and I think it can be found in any modern Linux distribution; if= =20 not, see first sentence in this paragraph. > - Also, I needed something like max_exit_code and a custom path to > the git binary; adding max_exit_code would have been non-trivial. What is max_exit_code (and why for example you couldn't just create you= r=20 own derivative of Error)? =20 > Now I'm all in favor of re-using existing code, but refactoring > Git.pm would have taken *much* longer than simply writing a new > module. I'm working on caching for gitweb, not on implementing the > next great Perl API for Git. (And Git::Repo isn't great, FTR.) Well, I can understand that. Besides, having more than one implementation of some new feature was=20 quite common in early stages of feature development. See for example=20 wit and gitweb, git-applymbox and git-am, git-annotate and git-blame. > - It's buggy and untested. Neither of these is a problem by itself, > but the combination is deadly. Haven't you added t/t9700-perl-git.sh? > E.g. I was trying to refactor the=20 > 'repository' constructor (to be able to do instantaneous > instantiation) and stumbled upon a missing semicolon that rendered > the surrounding code syntactically correct but obviously buggy (lin= e > 214 on master). Adding a semicolon there seemed to cause other > errors, and given that (a) I don't understand what the code does an= d > no test or comment tells me what it should, and (b) it doesn't work > (or maybe it half-works?), I lost all my confidence that I could do > anything resembling a (behavior-preserving) refactoring on that > code. This is not that the code should not be rewritten... but git-blame (and= =20 git log -S, aka pickaxe search) can be created to find who is the=20 author of said code, and ask him via email (probably Cc-ing git mailing= =20 list). > I might have been able to work with this particular problem, but > such a problem (buggy and completely untested code) are indicative > that the rest of the code might bear similar surprises. I think that the pipe and command code might be tested best, as it is=20 collation and condensation of many different "safe pipe" (etc.) code=20 fragments. > - It's overly lengthy, and it's a lot of code for not much > functionality. Git.pm has 1200 LOC, with a large stack of tiny > subroutines to handle pipes, and a complicated error handling > system. I can make out ~400 lines of actual Git functionality, plu= s > a 100-lines constructor (all including documentation). The part of > Git::Repo that overlaps with Git.pm (in terms of duplicate > functionality) seems to be in the range of 150-200 lines, and it's > mostly pipe-handling. That's not a whole lot. Pipe handling is IMHO most important part. Well, other parts such as=20 color(...) are important too, but not for gitweb. > - When I decided I didn't want to use Git.pm, it took me 2-3 hours to > replicate the functionality in Git.pm that I needed; it would have > taken me *much* longer to extend Git.pm to do what I want. Again, > I'm not working on the next great Git Perl API. >=20 > So where do we go with Git.pm and Git::Repo? I would suggest that > they both stay. >=20 > Thus we'd have two APIs (both of them obviously incomplete). If > there's enough objection to having two competing official APIs, I'll > be happy to move Git::Repo to the gitweb directory, and make it a > gitweb-specific thing (though it's not gitweb-specific at all in term= s > of functionality); I really don't care, as my goal is to get caching > for gitweb working. Again, if you're bothered by the fact that > there's duplicate functionality at all, please don't complain, but > send patches to reconcile the APIs; I will not take care of that > myself, since it's a very much non-trivial task. What I worry about is that dependence on Git.pm or Git::Repo would make= =20 gitweb installation too hard for some. > Unrelatedly, should I add copyright notices at the bottom of each per= l > module so they are displayed in the perldoc/man pages? I'm not > generally a fan of such notices, since they tend to establish code > ownership, but if it's desired I can add them. Well, most manpages have information about who made them... which means= =20 who was initial author, usually, and/or who is current maintainer. [Comments on patch itself in separate email, later] --=20 Jakub Narebski Poland