git.vger.kernel.org archive mirror
From: Joshua Jensen <jjensen@workspacewhiz.com>
To: esr@thyrsus.com
Cc: Sitaram Chamarty <sitaramc@gmail.com>,
	Patrick Donnelly <batrick@batbytes.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Michael Haggerty <mhagger@alum.mit.edu>,
	Felipe Contreras <felipe.contreras@gmail.com>,
	git@vger.kernel.org
Subject: Re: Python extension commands in git - request for policy change
Date: Tue, 11 Dec 2012 22:11:09 -0700
Message-ID: <50C811ED.4000600@workspacewhiz.com>
In-Reply-To: <20121212033043.GA24937@thyrsus.com>

----- Original Message -----
From: Eric S. Raymond
Date: 12/11/2012 8:30 PM
> It might be a good fit for extending git; I wouldn't be very surprised if
> that worked. However, I do have concerns about the "Oh, we'll just
> lash together a binding to C" attitude common among Lua programmers; I
> foresee maintainability problems and the possibility of slow death by
> low-level details as that strategy tries to scale up.
I don't understand what you mean by the "Oh, we'll just lash together a
binding to C" attitude.  Could you elaborate?
> My sense is that git's use cases are better served by a glue language
> in the Python/Perl/Ruby class rather than an extension language. But
> my mind is open on this issue.
I spend nearly 100% of my Git time on Windows.

Spawning new processes in Windows is dog slow.  Using 'git rebase',
arguably my favorite Git command, means a torturous amount of waiting,
and that is on about as fast a Windows machine as money can buy these
days.

I have a Git add-on similar to git-media that uses the smudge and clean 
filters to read/write large binary files into a separate storage 
location.  When checking out a workspace, Git shells out to run a filter 
for each file it needs to write to the workspace.
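
For readers unfamiliar with the mechanism, here is a minimal sketch of
what such a smudge/clean filter pair might look like.  It is not the
actual add-on: the pointer format, the BIGFILE_STORE variable, and the
function names are assumptions made for illustration.  Git runs the clean
command when staging a file and the smudge command when checking it out,
passing content on stdin and reading the result from stdout.

```python
import hashlib
import os
import sys
import tempfile

# Hypothetical storage location; a real add-on might use a local cache
# or a network share instead.
STORE = os.environ.get(
    "BIGFILE_STORE", os.path.join(tempfile.gettempdir(), "bigfile-store")
)
PREFIX = b"bigfile-sha1:"  # hypothetical pointer format


def clean(data: bytes, store: str = STORE) -> bytes:
    """Store the real content out of tree; return a small pointer for Git."""
    digest = hashlib.sha1(data).hexdigest()
    os.makedirs(store, exist_ok=True)
    path = os.path.join(store, digest)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    return PREFIX + digest.encode("ascii") + b"\n"


def smudge(pointer: bytes, store: str = STORE) -> bytes:
    """Swap a pointer for the stored content; pass other input through."""
    if not pointer.startswith(PREFIX):
        return pointer
    digest = pointer[len(PREFIX):].strip().decode("ascii")
    with open(os.path.join(store, digest), "rb") as f:
        return f.read()


# Command-line entry point: "clean" or "smudge" as the only argument.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("clean", "smudge"):
    func = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(func(sys.stdin.buffer.read()))
```

To wire such a script in (the driver name "bigfile" and the script path
are placeholders):

    git config filter.bigfile.clean "python /path/to/bigfile.py clean"
    git config filter.bigfile.smudge "python /path/to/bigfile.py smudge"
    echo "*.bin filter=bigfile" >> .gitattributes

Each checked-out file matching the pattern then costs one process
launch, which is exactly the overhead described here.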

I can get a maximum of 100 processes per second with this technique,
which means at most 100 files written to disk per second.  In practice,
I tend to see closer to 60 files per second.
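
To put those rates in perspective, here is a back-of-the-envelope
calculation.  The workspace size of 30,000 files is an assumed figure
chosen for illustration only; it does not come from the setup described
above.

```python
def checkout_seconds(num_files: int, files_per_second: float) -> float:
    """Time spent on filters when each file costs one process launch."""
    return num_files / files_per_second


ASSUMED_FILES = 30_000  # hypothetical workspace size

for rate in (100, 60):
    minutes = checkout_seconds(ASSUMED_FILES, rate) / 60
    print(f"{rate} files/s: {minutes:.1f} minutes")
# prints:
# 100 files/s: 5.0 minutes
# 60 files/s: 8.3 minutes
```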

So, I patched Git to allow the smudge/clean filters to load up a DLL
that executes a Lua script.  The Lua script retrieves and caches a
file locally, or it puts the file on a network share.

The in-process DLL checkout ends up being every bit as fast as when we 
use Perforce to sync files to our local workspace.  Git, then, can be a 
Perforce replacement for our needs.

(For those who don't know, Perforce handles large workspaces with 
massive binary files very efficiently.)

Anyway, my preference is to allow scripts to run in-process within Git, 
because it is far, far faster on Windows.  I imagine it is faster than 
forking processes on non-Windows machines, too, but I have no statistics 
to back that up.
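
Anyone wanting to gather such statistics could start from a sketch like
the one below.  It compares running a trivial filter in-process against
spawning one child process per file.  The filter body, the payload size,
the iteration counts, and the use of the Python interpreter as the child
process are all assumptions; absolute numbers will vary widely by
machine and operating system.

```python
import subprocess
import sys
import time


def trivial_filter(data: bytes) -> bytes:
    """Stand-in for a real smudge filter (hypothetical)."""
    return data.upper()


def in_process_rate(n: int, payload: bytes = b"x" * 1024) -> float:
    """Files per second when the filter is a plain function call."""
    start = time.perf_counter()
    for _ in range(n):
        trivial_filter(payload)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n / elapsed


def spawn_rate(n: int, payload: bytes = b"x" * 1024) -> float:
    """Files per second when every file launches a new child process."""
    cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
    ]
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(cmd, input=payload, stdout=subprocess.PIPE, check=True)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n / elapsed


if __name__ == "__main__":
    print(f"in-process:           {in_process_rate(1000):.0f} files/s")
    print(f"one process per file: {spawn_rate(20):.0f} files/s")
```

Running the same script on Windows and on a Unix-like system would give
a rough idea of how much the per-process cost differs between the two.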

Python, Perl, or Ruby can be embedded, too, but Lua is probably the
easiest to embed and the smallest of the four.

And shell scripts tend to be the slowest on Windows due to the sheer
number of process invocations needed to get anything useful done.

-Josh
