* [RFC] Git Perl bindings, and OO interface
@ 2008-11-27  1:58 Jakub Narebski
  2008-11-30 13:45 ` nadim khemir
  2009-07-10  2:08 ` Tom Lanyon
  0 siblings, 2 replies; 4+ messages in thread
From: Jakub Narebski @ 2008-11-27  1:58 UTC
  To: git; +Cc: Petr Baudis, Lea Wiemann, Nadim Khemir

There exist many Git bindings for various programming languages, some
of them using git commands, some of them reimplementing Git, or parts 
of Git.  There is GitPython and PyGit (with some native implementation)
for Python, there is (deprecated) Ruby/Git and Grit (with some native
implementation) for Ruby, there is #Git for C#, there is ObjectiveGit
for Objective C (native), there is JGit (native) and JavaGit for Java,
there is Gat for Haskell and eWiki contains something for PHP.

And of course there are Git.pm (included with Git) and Git::Repo (part
of the 'gitweb caching' GSoC 2008 project by Lea Wiemann) for Perl.
Git::Repo was not accepted into the git.git codebase, but developing it
sparked some discussion about a Perl interface to Git commands and an
object-oriented interface to Git.

I'd like to start a discussion in this thread about interfaces to Git
and an object-oriented interface to Git, mainly but not only in Perl.
I hope that the authors of the bindings, interfaces and implementations
mentioned above (and of those not mentioned) will contribute to this
thread.


0. One of the points of disagreement between Git.pm and the new Git::Repo
   was the use of the Error module for frontend error handling.  While the
   explanation in http://www.perl.com/pub/a/2002/11/14/exception.html
   is compelling, it is not a standard Perl technique.  Additionally,
   adding the "git_cmd_try { CODE } ERRORMSG" syntactic sugar was not a
   very good idea.

   So the first thing I'd like to discuss: to use Error and try/catch,
   or not in Perl interface (bindings) to Git?  I would really like to
   hear from Perl experts / Perl hackers here...

1. Git::Cmd

   If I remember correctly, Git.pm started as a way to gather in one
   place the safe_cmd and safe_pipe like constructs from various git
   commands implemented in Perl.  The goal here is to provide an
   interface that is portable, safe, and compatible with old Perl:
    * portable: this means trying to work with ActiveState Perl on
      MS Windows; I don't know how important it is _now_ (if other
      Perl distributions are now common on MS Windows).
    * safe: if some of the arguments to git commands come from variables,
      then they have to be safe against shell expansion (whitespace,
      quoting characters, escape characters, metacharacters, etc.).
    * compatible: it should work with as old a Perl version as is
      reasonable; it is possible that you can install git locally, but
      cannot upgrade Perl.

   Note that some git commands, for example 'git version', 'git
   ls-remote' and 'git clone', don't need a git repository to work on.

   We would want to be able to capture git command output into a scalar,
   into a list (line by line), and into a filehandle.  More advanced
   features are a bidirectional pipe (watch for deadlocks!) and
   redirecting both stdout and stderr of a git command to a filehandle.

   What an instance of Git::Cmd should know is where to find the 'git'
   binary (what is $GIT in gitweb, for example).  It could cache/store
   exec_path internally.
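
   The "safe" requirement above could be sketched as follows.  This is
   only an illustration under stated assumptions: My::Git::Cmd and its
   method names are hypothetical, not an existing module.  It relies on
   the list form of pipe open, which bypasses the shell but needs Perl
   5.8 and was for a long time not available on native MS Windows Perl
   builds, so it does not by itself meet the portability goal.

```perl
use strict;
use warnings;

{
    package My::Git::Cmd;    # hypothetical name, not a real module

    sub new {
        my ($class, %opts) = @_;
        # 'git' is the path to the binary (what $GIT is in gitweb).
        my $git = defined $opts{git} ? $opts{git} : 'git';
        return bless { git => $git }, $class;
    }

    # Open a read pipe from the command.  The list form of open never
    # involves a shell, so arguments need no quoting or escaping.
    sub output_pipe {
        my ($self, @args) = @_;
        open(my $fh, '-|', $self->{git}, @args)
            or die "cannot run $self->{git}: $!";
        return $fh;
    }

    # Command output as a list of chomped lines.
    sub output_lines {
        my ($self, @args) = @_;
        my $fh = $self->output_pipe(@args);
        my @lines = <$fh>;
        close($fh);          # the exit status is left in $?
        chomp(@lines);
        return @lines;
    }

    # Command output as a single scalar.
    sub output_scalar {
        my ($self, @args) = @_;
        my $fh = $self->output_pipe(@args);
        local $/;            # slurp mode
        my $out = <$fh>;
        close($fh);
        return $out;
    }
}
```

   A caller would then write something like
   My::Git::Cmd->new(git => '/usr/bin/git')->output_lines('ls-remote', $url)
   without worrying about what characters $url contains.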

2. Git::Config

   If a git command (a piece of code) uses more than one configuration
   variable, then one would want to get the relevant configuration using
   as few calls to git commands as possible.  Therefore using git-config
   to read each config variable is usually out of the question (but it
   is sometimes useful); we would want to read all config in one go,
   either by using "git config -l -z", or by writing a config parser in
   Perl (as some command(s) did).

   The problem with this solution is that we have to implement "type
   casting", i.e. the equivalent of the --int and --bool options to
   git-config, ourselves.  This means converting to integer with an
   optional size suffix, converting to boolean, and asking for the escape
   codes corresponding to a given color.  And if we add a new type (like
   the proposed --path, expanding for example '~' to HOME, and ~user to
   the home directory of the given user) we would have to add it to the
   Perl interface too.
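
   A rough sketch of what "read everything in one go, then cast in
   Perl" might look like.  The function names are made up for this
   example.  It assumes the "git config -l -z" layout of key, newline,
   value, NUL, with a bare key (no newline) for a variable without a
   value.  Color conversion is left out.

```perl
use strict;
use warnings;

# Parse the output of "git config -l -z" into key => [values...].
# A variable may occur several times; the last value normally wins.
sub parse_config_z {
    my ($raw) = @_;
    my %config;
    foreach my $entry (split /\0/, $raw) {
        next if $entry eq '';
        my ($key, $value) = split /\n/, $entry, 2;
        push @{ $config{$key} }, $value;    # undef means "no value"
    }
    return \%config;
}

# Rough equivalent of "git config --bool": 1, 0, or undef if invalid.
sub config_bool {
    my ($value) = @_;
    return 1 unless defined $value;         # "[core] bare" style entry
    return 1 if $value =~ /^(?:true|yes|on)$/i;
    return 0 if $value =~ /^(?:false|no|off|)$/i;
    return $value != 0 ? 1 : 0 if $value =~ /^[-+]?\d+$/;
    return undef;
}

# Rough equivalent of "git config --int": optional k/m/g suffix,
# interpreted as powers of 1024.  Returns undef if invalid.
sub config_int {
    my ($value) = @_;
    return undef
        unless defined $value && $value =~ /^([-+]?\d+)([kmg]?)$/i;
    my %mult = ('' => 1, k => 1024, m => 1024**2, g => 1024**3);
    return $1 * $mult{ lc $2 };
}
```

   The cost is exactly the one described above: every new type added
   to git-config needs a matching conversion function here.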
  
3. Git::Repo::Bare and Git::Repo::Nonbare

   Git.pm partially implements those, in a kind of mixed way.  Git::Repo
   from Lea Wiemann implements, if I remember correctly, bare repos only.

   What Git::Repo::Bare (or just Git::Repo) should support is passing the
   appropriate '--git-dir=<dir>' to Git::Cmd, and accessing the git
   repository config via Git::Config.  It could also use a long-running
   pipe to a "git cat-file --batch / --batch-check" invocation.  For
   gitweb we only need that part.
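
   To illustrate the long-running pipe idea, here is a sketch of the
   reading side only.  read_batch_record is a hypothetical helper; it
   assumes the "git cat-file --batch" record layout of
   "<id> SP <type> SP <size> LF <contents> LF", and "<name> SP missing
   LF" for an unknown object.  Setting up the bidirectional pipe itself
   (and avoiding deadlocks) is not shown.

```perl
use strict;
use warnings;

# Read one "git cat-file --batch" record from an open filehandle.
# Returns undef at end of input, { missing => 1 } for an unknown
# object, or a hash with id, type, size and content.
sub read_batch_record {
    my ($fh) = @_;
    my $header = <$fh>;
    return undef unless defined $header;
    chomp $header;
    return { missing => 1 } if $header =~ / missing$/;

    my ($id, $type, $size) = $header =~ /^([0-9a-f]+) (\w+) (\d+)$/
        or die "unexpected header: $header";

    my $content = '';
    if ($size > 0) {
        # read() keeps reading until $size bytes arrive or EOF is hit.
        my $got = read($fh, $content, $size);
        die "short read for $id" unless defined $got && $got == $size;
    }
    read($fh, my $lf, 1);    # consume the LF that follows the contents

    return { id => $id, type => $type, size => $size,
             content => $content };
}
```

   The size in the header is also what a Git::Blob object would need
   if it wants to report the blob size without loading the contents
   (with --batch-check only the header line is printed).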

   Git::Repo::Nonbare has to additionally pass '--work-tree=<dir>' if
   needed, and be able to keep track of and manipulate where in the
   working directory we are, i.e. what for example "git rev-parse
   --show-prefix" does.

4. Git::Object: Git::Commit, Git::Tag, Git::Blob and Git::Tree

   Here begins the "true" object-oriented part of the Git Perl API.

   The easy part is for Git::Commit and Git::Tag to parse commit and tag
   objects (perhaps Git::Object should have an interface for a long-lived
   "git cat-file --batch") into headers and body (commit/tag message).
   I think we can borrow from / be inspired by parse_commit() and other
   such code in gitweb; we have to remember that new headers, which we
   don't know about but which are perfectly valid, may appear in the
   future (see for example the "encoding" header in the commit object
   format, which was added later, not during the initial design).

   The harder part would be to be able to deal with author and committer
   info, splitting it into parts (author name, author email, date and
   timezone, etc.), and also generating dates in various formats, like
   RFC-2822 or ISO-8601.
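
   A minimal sketch of both parts, not taken from gitweb: the function
   names are invented for illustration, the input is assumed to be the
   raw text printed by "git cat-file commit <id>", and headers that
   continue over several lines are not handled.  Unknown headers are
   simply kept.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Split a raw commit object into parents, other headers and message.
sub parse_commit {
    my ($raw) = @_;
    my ($head, $body) = split /\n\n/, $raw, 2;
    my %commit = (
        parents => [],
        headers => {},
        message => defined $body ? $body : '',
    );
    foreach my $line (split /\n/, $head) {
        my ($name, $value) = split / /, $line, 2;
        if ($name eq 'parent') {
            push @{ $commit{parents} }, $value;
        } else {
            $commit{headers}{$name} = $value;   # known or not, keep it
        }
    }
    return \%commit;
}

# Split "A U Thor <author@example.com> 1227751080 +0100" into parts.
sub parse_ident {
    my ($ident) = @_;
    my ($name, $email, $epoch, $tz) =
        $ident =~ /^(.*?) ?<([^>]*)> (\d+) ([-+]\d{4})$/
        or return undef;
    return { name => $name, email => $email,
             epoch => $epoch, tz => $tz };
}

# Format a parsed identity's date as ISO-8601 in its own timezone.
sub ident_iso8601 {
    my ($id) = @_;
    my ($sign, $h, $m) = $id->{tz} =~ /^([-+])(\d\d)(\d\d)$/;
    my $offset = ($h * 60 + $m) * 60;
    $offset = -$offset if $sign eq '-';
    my $stamp = strftime('%Y-%m-%dT%H:%M:%S',
                         gmtime($id->{epoch} + $offset));
    return "$stamp$sign$h:$m";
}
```

   Shifting the epoch by the recorded offset and formatting with
   gmtime avoids depending on the local timezone of the machine
   running the code.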

   The easiest part would be the structureless Git::Blob... but there we
   might want the size of the blob.

   A bit harder would be Git::Tree object and dealing with elements of
   a tree (tree entries).  I'm not sure if some kind of iterator access
   would be useful here.

   Note that for Git::Commit, if we are to use plumbing like git-cat-file,
   we would have to take care of fake parents info, namely grafts and
   shallow info, ourselves, in Perl, to have 'effective parents'.

5. Git::Diff::Raw and Git::Diff::Patchset

   Here I am thinking simply about parsing difftree (raw diff output
   format) and patchset format, as it is used in gitweb.  It is meant
   to give access to, for example, the permissions of a file, the diff
   status, the diff stats, etc.

   Here we would want to be able to deal also with merge commits and
   combined diff output format.
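
   For the raw format, a parser for one line might look like the
   sketch below.  It is an illustration only: parse_raw_diff_line is a
   made-up name, full 40-character object ids and unquoted pathnames
   are assumed, and the combined ("::"-prefixed) format for merges is
   not handled.

```perl
use strict;
use warnings;

# Parse one line of raw diff output, e.g.
#   :100644 100755 <id> <id> M<TAB>file
#   :100644 100644 <id> <id> R086<TAB>old<TAB>new
# Returns a hash reference, or undef if the line does not match.
sub parse_raw_diff_line {
    my ($line) = @_;
    my $id = qr/[0-9a-f]{40}/;
    my ($from_mode, $to_mode, $from_id, $to_id, $status, $score, $paths) =
        $line =~ /^:([0-7]{6}) ([0-7]{6}) ($id) ($id) ([A-Z])(\d*)\t(.*)$/
        or return undef;

    # Renames and copies carry two tab-separated paths.
    my ($from_file, $to_file) = split /\t/, $paths, 2;

    return {
        from_mode => $from_mode,
        to_mode   => $to_mode,
        from_id   => $from_id,
        to_id     => $to_id,
        status    => $status,
        (length($score) ? (similarity => $score + 0) : ()),
        from_file => $from_file,
        to_file   => defined $to_file ? $to_file : $from_file,
    };
}
```

   With this, the permissions of a file are just $d->{to_mode}, and a
   mode change shows up as from_mode differing from to_mode.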

6. Git::Log or Git::RevList

   The only difference from a list of Git::Commit objects is that,
   depending on parameters like path limiting, the commits might have
   different effective parents if there is history simplification.

7. Git::Refs

   It is meant to represent references, mainly branches, and be filled
   using git-for-each-ref... and for example used for ref markers.
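
   As a sketch of filling such an object, assuming the default
   git-for-each-ref output of "<objectname> SP <objecttype> TAB
   <refname>" (the function names are hypothetical):

```perl
use strict;
use warnings;

# Parse default "git for-each-ref" output lines into
#   refname => { id => ..., type => ... }
sub parse_for_each_ref {
    my (@lines) = @_;
    my %refs;
    foreach my $line (@lines) {
        chomp $line;
        my ($id, $type, $name) = $line =~ /^([0-9a-f]+) (\S+)\t(.+)$/
            or next;    # skip anything that does not look like a ref
        $refs{$name} = { id => $id, type => $type };
    }
    return \%refs;
}

# Invert the mapping for ref markers: object id => sorted ref names.
sub refs_by_id {
    my ($refs) = @_;
    my %by_id;
    foreach my $name (sort keys %$refs) {
        push @{ $by_id{ $refs->{$name}{id} } }, $name;
    }
    return \%by_id;
}
```

   The inverted table is what a log view needs in order to decorate a
   commit with the names of the branches pointing at it.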

There are probably a few things I have forgotten about...
-- 
Jakub Narebski
Poland


* Re: [RFC] Git Perl bindings, and OO interface
  2008-11-27  1:58 [RFC] Git Perl bindings, and OO interface Jakub Narebski
@ 2008-11-30 13:45 ` nadim khemir
  2008-11-30 14:50   ` Jakub Narebski
  2009-07-10  2:08 ` Tom Lanyon
  1 sibling, 1 reply; 4+ messages in thread
From: nadim khemir @ 2008-11-30 13:45 UTC
  To: git

On Thursday 27 November 2008 02.58.49 Jakub Narebski wrote:
> ...
>
> 7. Git::Refs
>
>    It is meant to represent references, mainly branches, and be filled
>    using git-for-each-ref... and for example used for ref markers.
>
> There are probably a few things I have forgotten about...

Thank you for writing the RFC, it's a very good start. I would like to see 
some strategy for libgit[2] in the RFC. What is your opinion about that?

Nadim.


* Re: [RFC] Git Perl bindings, and OO interface
  2008-11-30 13:45 ` nadim khemir
@ 2008-11-30 14:50   ` Jakub Narebski
  0 siblings, 0 replies; 4+ messages in thread
From: Jakub Narebski @ 2008-11-30 14:50 UTC
  To: nadim khemir; +Cc: git

nadim khemir <nadim@khemir.net> writes:
> On Thursday 27 November 2008 02.58.49 Jakub Narebski wrote:
> > ...
> >
> > 7. Git::Refs
> >
> >    It is meant to represent references, mainly branches, and be filled
> >    using git-for-each-ref... and for example used for ref markers.
> >
> > There are probably a few things I have forgotten about...
> 
> Thank you for writing the RFC, it's a very good start. I would like to see 
> some strategy for libgit[2] in the RFC. What is your opinion about that?

I do not know enough about libgit2, or even about git's unofficial
internal C API, to talk about it.

I did not plan for the Perl interface to be actual Perl bindings using
libgit2.  Please remember that an earlier effort to use XS (the Perl <-> C
interface) failed because it relied on GCC support for -fPIC and was
not sufficiently portable... if I remember it correctly.  Calling Git
commands and massaging their output would be enough for me.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: [RFC] Git Perl bindings, and OO interface
  2008-11-27  1:58 [RFC] Git Perl bindings, and OO interface Jakub Narebski
  2008-11-30 13:45 ` nadim khemir
@ 2009-07-10  2:08 ` Tom Lanyon
  1 sibling, 0 replies; 4+ messages in thread
From: Tom Lanyon @ 2009-07-10  2:08 UTC
  To: Jakub Narebski; +Cc: git, Petr Baudis, Lea Wiemann, Nadim Khemir

On 27/11/2008, at 12:28 PM, Jakub Narebski wrote:
> 0. One of points of disagreement between Git.pm and new Git::Repo was
>   using Error module for frontend error handling.  While the
>   explanation in http://www.perl.com/pub/a/2002/11/14/exception.html
>   is compelling, it is not standard Perl technique.  Additionally
>   adding "git_cmd_try { CODE } ERRORMSG" syntactic sugar was not very
>   good idea.
>
>   So the first thing I'd like to discuss: to use Error and try/catch,
>   or not in Perl interface (bindings) to Git?  I would really like to
>   hear from Perl experts / Perl hackers here...


Sorry to bring up an old thread - but there was no further discussion  
on this and I've recently run into some grief with Git.pm.

I'm new to Git, but not new to Perl and recently attempted to perform  
some simple operations over Git repositories from a Perl application  
(it needs to clone, push, checkout, merge and that's about it) and  
found the Error.pm style handling of errors unintuitive and annoying.  
It is currently fairly simple to capture errors into the application  
by wrapping git_cmd_try { CODE } ERROR into an eval {} block but this  
really only provides you with the command's exit status and no  
meaningful error messages to display to your users; not to mention  
it's fairly ugly.

A long standing Perl motto is 'There Is More Than One Way To Do It'  
and the use of Error.pm here forces developers down a specific path  
for error handling - some may like this, some may not, but there's not  
a lot they can do about it. I would suggest that the Perl way for  
Git.pm to handle errors is for its methods to return the standard 1 or  
0 for success or failure and perhaps store some meaningful error  
messages in an accessor or variable. The module should also not die()  
if there's an error - leave this up to the users of the module to  
handle errors how they prefer - if it dies, we must wrap the methods  
in eval{} blocks or handle with $SIG{__DIE__}, making for some messy  
and ugly code.

I would love to be able to:

	my $repo = Git->repository( directory => '/some/repo' )
		or die "Unable to load git repo /some/repo: $Git::errstr";

	$repo->command( 'push', [ 'some-remote' ] )
		or die "Unable to push to some-remote: $Git::errstr";

... or similar, and have $Git::errstr set to something meaningful like  
the "fatal: 'some-remote': unable to chdir or not a git archive"  
returned by git-push. This also leads into some discussion around git  
commands printing to STDERR when there is no error -- example: if  
everything is fine and up to date, I don't need git-push to tell me  
"Everything up-to-date" in STDERR...

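One way such an interface could be built is sketched below. This is only
an illustration of the idea, not how Git.pm works: run_quietly and the
package-level $errstr are hypothetical names. STDERR is pointed at a
temporary file while the command runs, so its messages are available on
failure and silently dropped on success.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our $errstr = '';

# Run a command without a shell.  Returns 1 on success and 0 on
# failure; on failure $errstr holds what the command wrote to STDERR.
sub run_quietly {
    my (@cmd) = @_;
    $errstr = '';
    my ($errfh, $errfile) = tempfile(UNLINK => 1);
    close($errfh);

    # Temporarily point STDERR at the temporary file; the child
    # process inherits the redirection.
    open(my $saved, '>&', \*STDERR) or die "cannot dup STDERR: $!";
    open(STDERR, '>', $errfile)     or die "cannot redirect STDERR: $!";
    my $status = system { $cmd[0] } @cmd;
    open(STDERR, '>&', $saved)      or die "cannot restore STDERR: $!";

    return 1 if $status == 0;

    if (open(my $in, '<', $errfile)) {
        local $/;    # slurp mode
        $errstr = <$in>;
        close($in);
    }
    $errstr = "command failed: @cmd"
        unless defined $errstr && length $errstr;
    chomp $errstr;
    return 0;
}
```

With this, run_quietly('git', 'push', 'some-remote') or die $errstr
reads much like the example above, and an "Everything up-to-date"
message on a successful push never reaches the user.
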
Hope this helps.

Regards,
Tom
