Git development
* Re: [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Eric Wong @ 2006-02-16  8:48 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4q2zg2an.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> 
> > @ Junio: Is there room for this in the git distribution alongside
> > git-svnimport?
> 
> Surely.  Things that superficially do similar things are not
> necessarily mutually exclusive, if that is what you are worried
> about.  There is not much incumbent advantage for tools that
> support a narrowly defined specific task (e.g. interfacing with
> foreign SCM X) on the periphery, while I would perhaps feel more
> hesitant to support 47 different variants of git-commit ;-).

<snip>
 
> Even having some experimental tools that are only starting to do
> useful things might be useful, if we had it in the git.git
> repository.  For one thing, it would give more exposure to them
> and help improve things.

Good to know.  I fully agree on this point.

> How about first adding a contrib/ directory and see how it goes?

Sure thing.  Don't worry about development history, there's hardly any
as it was all done pretty quickly.  Being able to draw from my
experiences with svn-arch-mirror, arch-svn-merge (this one sucked), and
git-archimport helped greatly; as did the very simple and flexible
nature of git.

-- 
Eric Wong


* [PATCH] Allow building Git in systems without iconv
From: Fernando J. Pereda @ 2006-02-16  8:38 UTC
  To: git; +Cc: Junio C Hamano

Systems using some uClibc versions do not properly support
iconv. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is that mailinfo won't do charset conversion on those
systems.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>

---

 Makefile   |    6 ++++++
 mailinfo.c |    4 ++++
 2 files changed, 10 insertions(+), 0 deletions(-)

ca34460f60a4e0e037953124b91d3377db2cd1c8
diff --git a/Makefile b/Makefile
index 648469e..317be3c 100644
--- a/Makefile
+++ b/Makefile
@@ -53,6 +53,8 @@ all:
 # Define NO_SOCKADDR_STORAGE if your platform does not have struct
 # sockaddr_storage.
 #
+# Define NO_ICONV if your libc does not properly support iconv.
+#
 # Define COLLISION_CHECK below if you believe that SHA1's
 # 1461501637330902918203684832716283019655932542976 hashes do not give you
 # sufficient guarantee that no collisions between objects will ever happen.
@@ -380,6 +382,10 @@ else
 endif
 endif
 
+ifdef NO_ICONV
+	ALL_CFLAGS += -DNO_ICONV
+endif
+
 ifdef PPC_SHA1
 	SHA1_HEADER = "ppc/sha1.h"
 	LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
diff --git a/mailinfo.c b/mailinfo.c
index ff2d4d4..3c56f8c 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -7,7 +7,9 @@
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
+#ifndef NO_ICONV
 #include <iconv.h>
+#endif
 #include "git-compat-util.h"
 #include "cache.h"
 
@@ -469,6 +471,7 @@ static int decode_b_segment(char *in, ch
 
 static void convert_to_utf8(char *line, char *charset)
 {
+#ifndef NO_ICONV
 	char *in, *out;
 	size_t insize, outsize, nrc;
 	char outbuf[4096]; /* cheat */
@@ -501,6 +504,7 @@ static void convert_to_utf8(char *line, 
 		return;
 	*out = 0;
 	strcpy(line, outbuf);
+#endif
 }
 
 static void decode_header_bq(char *it)
-- 
1.2.0
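
As a rough illustration of the behaviour this patch produces, the sketch below
models the guarded convert_to_utf8() in Python. It is not part of the patch:
the no_iconv argument is a hypothetical stand-in for the NO_ICONV preprocessor
define, and returning the input unchanged on a failed conversion is an
assumption based on the early "return;" visible in the hunk above.

```python
# Hypothetical Python model of the patched convert_to_utf8().
# NO_ICONV is an ordinary flag standing in for the C preprocessor define.
NO_ICONV = False


def convert_to_utf8(line, charset, no_iconv=NO_ICONV):
    """Return `line` (bytes) re-encoded as UTF-8, or unchanged if disabled."""
    if no_iconv:
        # Mirrors the empty function body under NO_ICONV: no conversion at all.
        return line
    try:
        return line.decode(charset).encode("utf-8")
    except (LookupError, UnicodeDecodeError):
        # Assumption: an unknown charset or undecodable input leaves the
        # line untouched, as the early return in the C code appears to do.
        return line


if __name__ == "__main__":
    print(convert_to_utf8(b"caf\xe9", "latin-1"))
    print(convert_to_utf8(b"caf\xe9", "latin-1", no_iconv=True))
```

With the flag set, the header bytes pass through as received, which is the
drawback the commit message describes for mailinfo.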


* Re: [PATCH] pack-objects: reuse data from existing pack.
From: Andreas Ericsson @ 2006-02-16  8:32 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7vbqx8m62q.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> When generating a new pack, notice if we already have the wanted
> object in existing packs.  If the object has a deltified
> representation, and its base object is also what we are going to
> pack, then reuse the existing deltified representation
> unconditionally, bypassing all the expensive find_deltas() and
> try_deltas() routines.
> 
> Also, when writing out such deltified representation and
> undeltified representation, if matching data already exists in
> an existing pack, just write it out without uncompressing &
> recompressing.
> 
> Without this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time git-pack-objects p <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects....................
>     60a88b3979df41e22d1edc3967095e897f720192
> 
>     real    0m32.751s
>     user    0m27.090s
>     sys     0m2.750s
> 
> With this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time ../git.junio/git-pack-objects q <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects.....................
>     60a88b3979df41e22d1edc3967095e897f720192
>     Total 12233, written 12233, reused 12177
> 
>     real    0m4.007s
>     user    0m3.360s
>     sys     0m0.090s
> 

Whoa! Columbus and the egg. Strange no one saw it before. It's so obvious 
when you shove it under the nose like that. :)

Now that pack-creation went from bizarrely expensive to insanely cheap 
(well, comparable to "tar czf" anyways), what's BCP for packing a public 
repository? Always one mega-pack and never worry, or should one still 
use incremental and sometimes overlapping pack-files?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
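
The reuse rule quoted above can be sketched roughly as follows. This is a
simplified model, not the pack-objects implementation: PackedEntry, plan_pack
and the "reuse-delta" / "search" labels are hypothetical names, and the second
optimisation (copying already-compressed data without recompressing) is not
modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class PackedEntry:
    """Hypothetical record of how an object is stored in an existing pack."""
    sha1: str
    delta_base: Optional[str]  # None when the object is stored undeltified


def plan_pack(wanted: Set[str], existing: Dict[str, PackedEntry]) -> Dict[str, str]:
    """Decide, per wanted object, whether an existing delta can be reused."""
    plan = {}
    for sha1 in sorted(wanted):
        entry = existing.get(sha1)
        if (entry is not None
                and entry.delta_base is not None
                and entry.delta_base in wanted):
            # Deltified in an existing pack and its base is also being
            # packed: reuse the delta and skip the expensive search.
            plan[sha1] = "reuse-delta"
        else:
            # Not packed yet, stored undeltified, or the base is not part
            # of this pack: fall back to the normal delta search.
            plan[sha1] = "search"
    return plan


if __name__ == "__main__":
    existing = {
        "a": PackedEntry("a", None),
        "b": PackedEntry("b", "a"),
        "c": PackedEntry("c", "z"),
    }
    print(plan_pack({"a", "b", "c", "d"}, existing))
```

In this toy run only "b" is reused, because its base "a" is also wanted; "c"
has a delta, but its base "z" is not going into the new pack.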


* Re: [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Aneesh Kumar @ 2006-02-16  8:30 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr763emwx.fsf@assigned-by-dhcp.cox.net>

On 2/16/06, Junio C Hamano <junkio@cox.net> wrote:
> Aneesh Kumar <aneesh.kumar@gmail.com> writes:
>
> > On 2/16/06, Junio C Hamano <junkio@cox.net> wrote:
> >>
> >> How about first adding a contrib/ directory and see how it goes?
> >
> > I am all for it. Attaching the latest gitview. This includes branch and
> > tag display support and also the option to save diffs to a file.
>
> Now how do you want to proceed?  I could just dump the thing in
> say contrib/gitview subdirectory, and then afterwards you could
> either keep feeding me patches or sending me pull requests.
>
> There are two downsides to doing things that way:
>
>  (1) you would lose the development history so far;
>
>  (2) if gitview script is the only thing you care about, I
>      suspect you would want to have that at the project
>      toplevel, like the "coolest merge ever" gitk merge did, but
>      that is not what you will be getting.
>
> Ideally, if we had a proper "subproject" support, I would merge
> your project with full development history so far as a
> subproject, with your toplevel grafted at contrib/gitview
> subdirectory.  That would have neither of the above two
> downsides.  But that hasn't happened yet (and that was one of
> the reasons that I was reluctant initially -- I was hoping that
> subproject stuff would materialize sooner).
>
> For now, I'd do the easy approach (easy for me, that is) with
> both of the two downsides.  If we end up doing "subproject"
> thing, we could rectify things later, if this is OK with you.
>


It would be fine with me if you just drop the script into the
contrib/gitview directory.

-aneesh


* Re: [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Junio C Hamano @ 2006-02-16  8:19 UTC
  To: Aneesh Kumar; +Cc: git
In-Reply-To: <cc723f590602160008v5fcc0e35h6d9296bd0572fac2@mail.gmail.com>

Aneesh Kumar <aneesh.kumar@gmail.com> writes:

> On 2/16/06, Junio C Hamano <junkio@cox.net> wrote:
>>
>> How about first adding a contrib/ directory and see how it goes?
>
> I am all for it. Attaching the latest gitview. This includes branch and
> tag display support and also the option to save diffs to a file.

Now how do you want to proceed?  I could just dump the thing in
say contrib/gitview subdirectory, and then afterwards you could
either keep feeding me patches or sending me pull requests.

There are two downsides to doing things that way:

 (1) you would lose the development history so far;

 (2) if gitview script is the only thing you care about, I
     suspect you would want to have that at the project
     toplevel, like the "coolest merge ever" gitk merge did, but
     that is not what you will be getting.

Ideally, if we had a proper "subproject" support, I would merge
your project with full development history so far as a
subproject, with your toplevel grafted at contrib/gitview
subdirectory.  That would have neither of the above two
downsides.  But that hasn't happened yet (and that was one of
the reasons that I was reluctant initially -- I was hoping that
subproject stuff would materialize sooner).

For now, I'd do the easy approach (easy for me, that is) with
both of the two downsides.  If we end up doing "subproject"
thing, we could rectify things later, if this is OK with you.


* Re: [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Aneesh Kumar @ 2006-02-16  8:08 UTC
  To: Junio C Hamano; +Cc: Eric Wong, git, Martin Langhoff
In-Reply-To: <7v4q2zg2an.fsf@assigned-by-dhcp.cox.net>

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

On 2/16/06, Junio C Hamano <junkio@cox.net> wrote:
>
>
> should not be too hesitant to expand the scope of the project.
> Also there are some interesting developments such as Martin's
> git-backed fake CVS server and Aneesh's gitview that I have been
> interested in, among other things.
>
> Even having some experimental tools that are only starting to do
> useful things might be useful, if we had it in the git.git
> repository.  For one thing, it would give more exposure to them
> and help improve things.
>
> How about first adding a contrib/ directory and see how it goes?
>


I am all for it. Attaching the latest gitview. This includes branch and
tag display support and also the option to save diffs to a file.

For a screenshot, see
http://kvaneesh.livejournal.com

-aneesh

[-- Attachment #2: gitview --]
[-- Type: application/octet-stream, Size: 27329 bytes --]

#! /usr/bin/env python

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.

""" gitview
GUI browser for git repository 
This program is based on bzrk by Scott James Remnant <scott@ubuntu.com>
"""
__copyright__ = "Copyright (C) 2006 Hewlett-Packard Development Company, L.P."
__author__    = "Aneesh Kumar K.V <aneesh.kumar@hp.com>"


import sys
import os
import gtk
import pygtk
import pango
import re
import time
import gobject
import cairo
import math
import string

try:
    import gtksourceview
    have_gtksourceview = True
except ImportError:
    have_gtksourceview = False
    print "Running without gtksourceview module"

re_ident = re.compile(r'(author|committer) (?P<ident>.*) (?P<epoch>\d+) (?P<tz>[+-]\d{4})')

def list_to_string(args, skip):
	count = len(args)
	i = skip
	str_arg=" "
	while (i < count ):
		str_arg = str_arg + args[i]
		str_arg = str_arg + " "
		i = i+1

	return str_arg

def show_date(epoch, tz):
	secs = float(epoch)
	tzsecs = float(tz[1:3]) * 3600
	tzsecs += float(tz[3:5]) * 60
	if (tz[0] == "+"):
		secs += tzsecs
	else:
		secs -= tzsecs

	return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(secs))
		
def get_sha1_from_tags(line):
	fp = os.popen("git cat-file -t " + line)
	entry = string.strip(fp.readline())
	fp.close()
	if (entry == "commit"):
		return line
	elif (entry == "tag"):
		fp = os.popen("git cat-file tag "+ line)
		entry = string.strip(fp.readline())
		fp.close()
		obj = re.split(" ", entry)
		if (obj[0] == "object"):
			return obj[1]
	return None

class CellRendererGraph(gtk.GenericCellRenderer):
	"""Cell renderer for directed graph.

	This module contains the implementation of a custom GtkCellRenderer that
	draws part of the directed graph based on the lines suggested by the code
	in graph.py.

	Because we're shiny, we use Cairo to do this, and because we're naughty
	we cheat and draw over the bits of the TreeViewColumn that are supposed to
	just be for the background.

	Properties:
	node              (column, colour, [ names ]) tuple to draw revision node,
	in_lines          (start, end, colour) tuple list to draw inward lines,
	out_lines         (start, end, colour) tuple list to draw outward lines.
	"""

	__gproperties__ = {
	"node":         ( gobject.TYPE_PYOBJECT, "node",
			  "revision node instruction",
			  gobject.PARAM_WRITABLE
			),
	"in-lines":     ( gobject.TYPE_PYOBJECT, "in-lines",
			  "instructions to draw lines into the cell",
			  gobject.PARAM_WRITABLE
			),
	"out-lines":    ( gobject.TYPE_PYOBJECT, "out-lines",
			  "instructions to draw lines out of the cell",
			  gobject.PARAM_WRITABLE
			),
	}

	def do_set_property(self, property, value):
		"""Set properties from GObject properties."""
		if property.name == "node":
			self.node = value
		elif property.name == "in-lines":
			self.in_lines = value
		elif property.name == "out-lines":
			self.out_lines = value
		else:
			raise AttributeError, "no such property: '%s'" % property.name

	def box_size(self, widget):
		"""Calculate box size based on widget's font.

		Cache this as it's probably expensive to get.  It ensures that we
		draw the graph at least as large as the text.
		"""
		try:
			return self._box_size
		except AttributeError:
			pango_ctx = widget.get_pango_context()
			font_desc = widget.get_style().font_desc
			metrics = pango_ctx.get_metrics(font_desc)

			ascent = pango.PIXELS(metrics.get_ascent())
			descent = pango.PIXELS(metrics.get_descent())

			self._box_size = ascent + descent + 6
			return self._box_size

	def set_colour(self, ctx, colour, bg, fg):
		"""Set the context source colour.

		Picks a distinct colour based on an internal wheel; the bg
		parameter provides the value that should be assigned to the 'zero'
		colours and the fg parameter provides the multiplier that should be
		applied to the foreground colours.
		"""
		colours = [
		    ( 1.0, 0.0, 0.0 ),
		    ( 1.0, 1.0, 0.0 ),
		    ( 0.0, 1.0, 0.0 ),
		    ( 0.0, 1.0, 1.0 ),
		    ( 0.0, 0.0, 1.0 ),
		    ( 1.0, 0.0, 1.0 ),
		    ]

		colour %= len(colours)
		red   = (colours[colour][0] * fg) or bg
		green = (colours[colour][1] * fg) or bg
		blue  = (colours[colour][2] * fg) or bg

		ctx.set_source_rgb(red, green, blue)

	def on_get_size(self, widget, cell_area):
		"""Return the size we need for this cell.

		Each cell is drawn individually and is only as wide as it needs
		to be, we let the TreeViewColumn take care of making them all
		line up.
		"""
		box_size = self.box_size(widget)

		cols = self.node[0]
		for start, end, colour in self.in_lines + self.out_lines:
			cols = max(cols, start, end)

		(column, colour, names) = self.node
		names_len = 0
		if (len(names) != 0):
			for item in names:
				names_len += len(item)/3
		
		width = box_size * (cols + 1 + names_len )
		height = box_size

		# FIXME I have no idea how to use cell_area properly
		return (0, 0, width, height)

	def on_render(self, window, widget, bg_area, cell_area, exp_area, flags):
		"""Render an individual cell.

		Draws the cell contents using cairo, taking care to clip what we
		do to within the background area so we don't draw over other cells.
		Note that we're a bit naughty there and should really be drawing
		in the cell_area (or even the exposed area), but we explicitly don't
		want any gutter.

		We try and be a little clever, if the line we need to draw is going
		to cross other columns we actually draw it as in the .---' style
		instead of a pure diagonal ... this reduces confusion by an
		incredible amount.
		"""
		ctx = window.cairo_create()
		ctx.rectangle(bg_area.x, bg_area.y, bg_area.width, bg_area.height)
		ctx.clip()

		box_size = self.box_size(widget)

		ctx.set_line_width(box_size / 8)
		ctx.set_line_cap(cairo.LINE_CAP_SQUARE)

		# Draw lines into the cell
		for start, end, colour in self.in_lines:
			ctx.move_to(cell_area.x + box_size * start + box_size / 2,
					bg_area.y - bg_area.height / 2)

			if start - end > 1:
				ctx.line_to(cell_area.x + box_size * start, bg_area.y)
				ctx.line_to(cell_area.x + box_size * end + box_size, bg_area.y)
			elif start - end < -1:
				ctx.line_to(cell_area.x + box_size * start + box_size,
						bg_area.y)
				ctx.line_to(cell_area.x + box_size * end, bg_area.y)

			ctx.line_to(cell_area.x + box_size * end + box_size / 2,
					bg_area.y + bg_area.height / 2)

			self.set_colour(ctx, colour, 0.0, 0.65)
			ctx.stroke()

		# Draw lines out of the cell
		for start, end, colour in self.out_lines:
			ctx.move_to(cell_area.x + box_size * start + box_size / 2,
					bg_area.y + bg_area.height / 2)

			if start - end > 1:
				ctx.line_to(cell_area.x + box_size * start,
						bg_area.y + bg_area.height)
				ctx.line_to(cell_area.x + box_size * end + box_size,
						bg_area.y + bg_area.height)
			elif start - end < -1:
				ctx.line_to(cell_area.x + box_size * start + box_size,
						bg_area.y + bg_area.height)
				ctx.line_to(cell_area.x + box_size * end,
						bg_area.y + bg_area.height)

			ctx.line_to(cell_area.x + box_size * end + box_size / 2,
					bg_area.y + bg_area.height / 2 + bg_area.height)

			self.set_colour(ctx, colour, 0.0, 0.65)
			ctx.stroke()

		# Draw the revision node in the right column
		(column, colour, names) = self.node
		ctx.arc(cell_area.x + box_size * column + box_size / 2,
				cell_area.y + cell_area.height / 2,
				box_size / 4, 0, 2 * math.pi)


		if (len(names) != 0):
			name = " "
			for item in names:
				name = name + item + " "

			ctx.text_path(name)

		self.set_colour(ctx, colour, 0.0, 0.5)
		ctx.stroke_preserve()

		self.set_colour(ctx, colour, 0.5, 1.0)
		ctx.fill()

class Commit:
	""" This represents a commit object obtained by parsing the git-rev-list
	output """

	children_sha1 = {}

	def __init__(self, commit_lines):
		self.message 		= ""
		self.author		= ""
		self.date 		= ""
		self.committer 		= ""
		self.commit_date 	= ""
		self.commit_sha1	= ""
		self.parent_sha1	= [ ]
		self.parse_commit(commit_lines)


	def parse_commit(self, commit_lines):

		# The first line holds the commit sha1 followed by its parent sha1s
		line = string.strip(commit_lines[0])
		sha1 = re.split(" ", line)
		self.commit_sha1 = sha1[0]
		self.parent_sha1 = sha1[1:]

		#build the child list
		for parent_id in self.parent_sha1:
			try:
				Commit.children_sha1[parent_id].append(self.commit_sha1)
			except KeyError:
				Commit.children_sha1[parent_id] = [self.commit_sha1]

		# If the commit has no parent, use 0 as a placeholder
		if (len(self.parent_sha1) == 0):
			self.parent_sha1 = [0]

		for line in commit_lines[1:]:
			m = re.match("^ ", line)
			if (m != None):
				# First line of the commit message used for short log
				if self.message == "":
					self.message = string.strip(line)
				continue

			m = re.match("tree", line)
			if (m != None):
				continue

			m = re.match("parent", line)
			if (m != None):
				continue

			m = re_ident.match(line)
			if (m != None):
				date = show_date(m.group('epoch'), m.group('tz'))
				if m.group(1) == "author":
					self.author = m.group('ident')
					self.date = date
				elif m.group(1) == "committer":
					self.committer = m.group('ident')
					self.commit_date = date

				continue

	def get_message(self, with_diff=0):
		if (with_diff == 1):
			message = self.diff_tree()
		else:
			fp = os.popen("git cat-file commit " + self.commit_sha1)
			message = fp.read()
			fp.close()

		return message

	def diff_tree(self):
		fp = os.popen("git diff-tree --pretty --cc  -v -p --always " +  self.commit_sha1)
		diff = fp.read()
		fp.close()
		return diff

class DiffWindow:
	"""Diff window.
	This object represents and manages a single window containing the
	differences between two revisions on a branch.
	"""

	def __init__(self):
		self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)
		self.window.set_border_width(0)
		self.window.set_title("Git repository browser diff window")

		# Use two thirds of the screen by default
		screen = self.window.get_screen()
		monitor = screen.get_monitor_geometry(0)
		width = int(monitor.width * 0.66)
		height = int(monitor.height * 0.66)
		self.window.set_default_size(width, height)

		self.construct()

	def construct(self):
		"""Construct the window contents."""
		vbox = gtk.VBox()
		self.window.add(vbox)
		vbox.show()

		menu_bar = gtk.MenuBar()
		save_menu = gtk.ImageMenuItem(gtk.STOCK_SAVE)
		save_menu.connect("activate", self.save_menu_response, "save")
		save_menu.show()
		menu_bar.append(save_menu)
		vbox.pack_start(menu_bar, False, False, 2)
		menu_bar.show()

		scrollwin = gtk.ScrolledWindow()
		scrollwin.set_policy(gtk.POLICY_AUTOMATIC, gtk.POLICY_AUTOMATIC)
		scrollwin.set_shadow_type(gtk.SHADOW_IN)
		vbox.pack_start(scrollwin, expand=True, fill=True)
		scrollwin.show()

		if have_gtksourceview:
			self.buffer = gtksourceview.SourceBuffer()
			slm = gtksourceview.SourceLanguagesManager()
			gsl = slm.get_language_from_mime_type("text/x-patch")
			self.buffer.set_highlight(True)
			self.buffer.set_language(gsl)
			sourceview = gtksourceview.SourceView(self.buffer)
		else:
			self.buffer = gtk.TextBuffer()
			sourceview = gtk.TextView(self.buffer)

		sourceview.set_editable(False)
		sourceview.modify_font(pango.FontDescription("Monospace"))
		scrollwin.add(sourceview)
		sourceview.show()


	def set_diff(self, commit_sha1, parent_sha1):
		"""Set the differences shown by this window.
		Compares the two trees and populates the window with the
		differences.
		"""
		# Diff with the first commit or the last commit shows nothing
		if (commit_sha1 == 0 or parent_sha1 == 0 ):
			return 

		fp = os.popen("git diff-tree -p " + parent_sha1 + " " + commit_sha1)
		self.buffer.set_text(fp.read())
		fp.close()
		self.window.show()

	def save_menu_response(self, widget, string):
		dialog = gtk.FileChooserDialog("Save..", None, gtk.FILE_CHOOSER_ACTION_SAVE,
				(gtk.STOCK_CANCEL, gtk.RESPONSE_CANCEL,
					gtk.STOCK_SAVE, gtk.RESPONSE_OK))
		dialog.set_default_response(gtk.RESPONSE_OK)
		response = dialog.run()
		if response == gtk.RESPONSE_OK:
			patch_buffer = self.buffer.get_text(self.buffer.get_start_iter(),
					self.buffer.get_end_iter())
			fp = open(dialog.get_filename(), "w")
			fp.write(patch_buffer)
			fp.close()
		dialog.destroy()

class GitView:
	""" This is the main class
	"""

	def __init__(self, with_diff=0):
		self.with_diff = with_diff
		self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)
		self.window.set_border_width(0)
		self.window.set_title("Git repository browser")

		self.get_bt_sha1()

		# Use three-quarters of the screen by default
		screen = self.window.get_screen()
		monitor = screen.get_monitor_geometry(0)
		width = int(monitor.width * 0.75)
		height = int(monitor.height * 0.75)
		self.window.set_default_size(width, height)

		# FIXME AndyFitz!
		icon = self.window.render_icon(gtk.STOCK_INDEX, gtk.ICON_SIZE_BUTTON)
		self.window.set_icon(icon)

		self.accel_group = gtk.AccelGroup()
		self.window.add_accel_group(self.accel_group)

		self.construct()

	def get_bt_sha1(self):
		""" Update the bt_sha1 dictionary with the 
		respective sha1 details """

		self.bt_sha1 = { }
		git_dir = os.getenv("GIT_DIR")
		if (git_dir == None):
			git_dir = ".git"

		#FIXME the path separator
		ref_files = os.listdir(git_dir + "/refs/tags")
		for file in ref_files:
			fp = open(git_dir + "/refs/tags/"+file)
			sha1 = get_sha1_from_tags(string.strip(fp.readline()))
			try:
				self.bt_sha1[sha1].append(file)
			except KeyError:
				self.bt_sha1[sha1] = [file]
			fp.close()


		#FIXME the path separator
		ref_files = os.listdir(git_dir + "/refs/heads")
		for file in ref_files:
			fp = open(git_dir + "/refs/heads/" + file)
			sha1 = get_sha1_from_tags(string.strip(fp.readline()))
			try:
				self.bt_sha1[sha1].append(file)
			except KeyError:
				self.bt_sha1[sha1] = [file]
			fp.close()


	def construct(self):
		"""Construct the window contents."""
		paned = gtk.VPaned()
		paned.pack1(self.construct_top(), resize=False, shrink=True)
		paned.pack2(self.construct_bottom(), resize=False, shrink=True)
		self.window.add(paned)
		paned.show()

	def construct_top(self):
		"""Construct the top-half of the window."""
		vbox = gtk.VBox(spacing=6)
		vbox.set_border_width(12)
		vbox.show()

		scrollwin = gtk.ScrolledWindow()
		scrollwin.set_policy(gtk.POLICY_NEVER, gtk.POLICY_AUTOMATIC)
		scrollwin.set_shadow_type(gtk.SHADOW_IN)
		vbox.pack_start(scrollwin, expand=True, fill=True)
		scrollwin.show()

		self.treeview = gtk.TreeView()
		self.treeview.set_rules_hint(True)
		self.treeview.set_search_column(4)
		self.treeview.connect("cursor-changed", self._treeview_cursor_cb)
		scrollwin.add(self.treeview)
		self.treeview.show()

		cell = CellRendererGraph()
		column = gtk.TreeViewColumn()
		column.set_resizable(False)
		column.pack_start(cell, expand=False)
		column.add_attribute(cell, "node", 1)
		column.add_attribute(cell, "in-lines", 2)
		column.add_attribute(cell, "out-lines", 3)
		self.treeview.append_column(column)

		cell = gtk.CellRendererText()
		cell.set_property("width-chars", 65)
		cell.set_property("ellipsize", pango.ELLIPSIZE_END)
		column = gtk.TreeViewColumn("Message")
		column.set_resizable(True)
		column.pack_start(cell, expand=True)
		column.add_attribute(cell, "text", 4)
		self.treeview.append_column(column)

		cell = gtk.CellRendererText()
		cell.set_property("width-chars", 40)
		cell.set_property("ellipsize", pango.ELLIPSIZE_END)
		column = gtk.TreeViewColumn("Author")
		column.set_resizable(True)
		column.pack_start(cell, expand=True)
		column.add_attribute(cell, "text", 5)
		self.treeview.append_column(column)

		cell = gtk.CellRendererText()
		cell.set_property("ellipsize", pango.ELLIPSIZE_END)
		column = gtk.TreeViewColumn("Date")
		column.set_resizable(True)
		column.pack_start(cell, expand=True)
		column.add_attribute(cell, "text", 6)
		self.treeview.append_column(column)

		return vbox


	def construct_bottom(self):
		"""Construct the bottom half of the window."""
		vbox = gtk.VBox(False, spacing=6)
		vbox.set_border_width(12)
		(width, height) = self.window.get_size()
		vbox.set_size_request(width, int(height / 2.5))
		vbox.show()

		self.table = gtk.Table(rows=4, columns=4)
		self.table.set_row_spacings(6)
		self.table.set_col_spacings(6)
		vbox.pack_start(self.table, expand=False, fill=True)
		self.table.show()

		align = gtk.Alignment(0.0, 0.5)
		label = gtk.Label()
		label.set_markup("<b>Revision:</b>")
		align.add(label)
		self.table.attach(align, 0, 1, 0, 1, gtk.FILL, gtk.FILL)
		label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		self.revid_label = gtk.Label()
		self.revid_label.set_selectable(True)
		align.add(self.revid_label)
		self.table.attach(align, 1, 2, 0, 1, gtk.EXPAND | gtk.FILL, gtk.FILL)
		self.revid_label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		label = gtk.Label()
		label.set_markup("<b>Committer:</b>")
		align.add(label)
		self.table.attach(align, 0, 1, 1, 2, gtk.FILL, gtk.FILL)
		label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		self.committer_label = gtk.Label()
		self.committer_label.set_selectable(True)
		align.add(self.committer_label)
		self.table.attach(align, 1, 2, 1, 2, gtk.EXPAND | gtk.FILL, gtk.FILL)
		self.committer_label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		label = gtk.Label()
		label.set_markup("<b>Timestamp:</b>")
		align.add(label)
		self.table.attach(align, 0, 1, 2, 3, gtk.FILL, gtk.FILL)
		label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		self.timestamp_label = gtk.Label()
		self.timestamp_label.set_selectable(True)
		align.add(self.timestamp_label)
		self.table.attach(align, 1, 2, 2, 3, gtk.EXPAND | gtk.FILL, gtk.FILL)
		self.timestamp_label.show()
		align.show()

		align = gtk.Alignment(0.0, 0.5)
		label = gtk.Label()
		label.set_markup("<b>Parents:</b>")
		align.add(label)
		self.table.attach(align, 0, 1, 3, 4, gtk.FILL, gtk.FILL)
		label.show()
		align.show()
		self.parents_widgets = []

		align = gtk.Alignment(0.0, 0.5)
		label = gtk.Label()
		label.set_markup("<b>Children:</b>")
		align.add(label)
		self.table.attach(align, 2, 3, 3, 4, gtk.FILL, gtk.FILL)
		label.show()
		align.show()
		self.children_widgets = []

		scrollwin = gtk.ScrolledWindow()
		scrollwin.set_policy(gtk.POLICY_AUTOMATIC, gtk.POLICY_AUTOMATIC)
		scrollwin.set_shadow_type(gtk.SHADOW_IN)
		vbox.pack_start(scrollwin, expand=True, fill=True)
		scrollwin.show()

		if have_gtksourceview:
			self.message_buffer = gtksourceview.SourceBuffer()
			slm = gtksourceview.SourceLanguagesManager()
			gsl = slm.get_language_from_mime_type("text/x-patch")
			self.message_buffer.set_highlight(True)
			self.message_buffer.set_language(gsl)
			sourceview = gtksourceview.SourceView(self.message_buffer)
		else:
			self.message_buffer = gtk.TextBuffer()
			sourceview = gtk.TextView(self.message_buffer)

		sourceview.set_editable(False)
		sourceview.modify_font(pango.FontDescription("Monospace"))
		scrollwin.add(sourceview)
		sourceview.show()

		return vbox

	def _treeview_cursor_cb(self, *args):
		"""Callback for when the treeview cursor changes."""
		(path, col) = self.treeview.get_cursor()
		commit = self.model[path][0]

		if commit.committer is not None:
			committer = commit.committer
			timestamp = commit.commit_date
			message   =  commit.get_message(self.with_diff)
			revid_label = commit.commit_sha1
		else:
			committer = ""
			timestamp = ""
			message = ""
			revid_label = ""

		self.revid_label.set_text(revid_label)
		self.committer_label.set_text(committer)
		self.timestamp_label.set_text(timestamp)
		self.message_buffer.set_text(message)

		for widget in self.parents_widgets:
			self.table.remove(widget)

		self.parents_widgets = []
		self.table.resize(4 + len(commit.parent_sha1) - 1, 4)
		for idx, parent_id in enumerate(commit.parent_sha1):
			self.table.set_row_spacing(idx + 3, 0)

			align = gtk.Alignment(0.0, 0.0)
			self.parents_widgets.append(align)
			self.table.attach(align, 1, 2, idx + 3, idx + 4,
					gtk.EXPAND | gtk.FILL, gtk.FILL)
			align.show()

			hbox = gtk.HBox(False, 0)
			align.add(hbox)
			hbox.show()

			label = gtk.Label(parent_id)
			label.set_selectable(True)
			hbox.pack_start(label, expand=False, fill=True)
			label.show()

			image = gtk.Image()
			image.set_from_stock(gtk.STOCK_JUMP_TO, gtk.ICON_SIZE_MENU)
			image.show()

			button = gtk.Button()
			button.add(image)
			button.set_relief(gtk.RELIEF_NONE)
			button.connect("clicked", self._go_clicked_cb, parent_id)
			hbox.pack_start(button, expand=False, fill=True)
			button.show()

			image = gtk.Image()
			image.set_from_stock(gtk.STOCK_FIND, gtk.ICON_SIZE_MENU)
			image.show()

			button = gtk.Button()
			button.add(image)
			button.set_relief(gtk.RELIEF_NONE)
			button.set_sensitive(True)
			button.connect("clicked", self._show_clicked_cb,
					commit.commit_sha1, parent_id)
			hbox.pack_start(button, expand=False, fill=True)
			button.show()

		# Populate with child details
		for widget in self.children_widgets:
			self.table.remove(widget)

		self.children_widgets = []
		try:
			child_sha1 = Commit.children_sha1[commit.commit_sha1]
		except KeyError:
			# This commit has no children
			child_sha1 = [ 0 ]

		if ( len(child_sha1) > len(commit.parent_sha1)):
			self.table.resize(4 + len(child_sha1) - 1, 4)

		for idx, child_id in enumerate(child_sha1):
			self.table.set_row_spacing(idx + 3, 0)

			align = gtk.Alignment(0.0, 0.0)
			self.children_widgets.append(align)
			self.table.attach(align, 3, 4, idx + 3, idx + 4,
					gtk.EXPAND | gtk.FILL, gtk.FILL)
			align.show()

			hbox = gtk.HBox(False, 0)
			align.add(hbox)
			hbox.show()

			label = gtk.Label(child_id)
			label.set_selectable(True)
			hbox.pack_start(label, expand=False, fill=True)
			label.show()

			image = gtk.Image()
			image.set_from_stock(gtk.STOCK_JUMP_TO, gtk.ICON_SIZE_MENU)
			image.show()

			button = gtk.Button()
			button.add(image)
			button.set_relief(gtk.RELIEF_NONE)
			button.connect("clicked", self._go_clicked_cb, child_id)
			hbox.pack_start(button, expand=False, fill=True)
			button.show()

			image = gtk.Image()
			image.set_from_stock(gtk.STOCK_FIND, gtk.ICON_SIZE_MENU)
			image.show()

			button = gtk.Button()
			button.add(image)
			button.set_relief(gtk.RELIEF_NONE)
			button.set_sensitive(True)
			button.connect("clicked", self._show_clicked_cb,
					child_id, commit.commit_sha1)
			hbox.pack_start(button, expand=False, fill=True)
			button.show()

	def _destroy_cb(self, widget):
		"""Callback for when a window we manage is destroyed."""
		self.quit()


	def quit(self):
		"""Stop the GTK+ main loop."""
		gtk.main_quit()

	def run(self, args):
		self.set_branch(args)
		self.window.connect("destroy", self._destroy_cb)
		self.window.show()
		gtk.main()

	def set_branch(self, args):
		"""Fill in different windows with info from the repository"""
		fp = os.popen("git rev-parse --sq --default HEAD " + list_to_string(args, 1))
		git_rev_list_cmd = fp.read()
		fp.close()
		fp = os.popen("git rev-list  --header --topo-order --parents " + git_rev_list_cmd)
		self.update_window(fp)

	def update_window(self, fp):
		commit_lines = []

		self.model = gtk.ListStore(gobject.TYPE_PYOBJECT, gobject.TYPE_PYOBJECT,
				gobject.TYPE_PYOBJECT, gobject.TYPE_PYOBJECT, str, str, str)

		# used for cursor positioning 
		self.index = {}

		self.colours = {}
		self.nodepos = {}
		self.incomplete_line = {}

		index = 0
		last_colour = 0
		last_nodepos = -1
		out_line = []	
		input_line = fp.readline()
		while (input_line != ""):
			# The commit header ends with '\0'
			# This NULL is immediately followed by the sha1 of the 
			# next commit
			if (input_line[0] != '\0'):
				commit_lines.append(input_line)
				input_line = fp.readline()
				continue

			commit = Commit(commit_lines)
			if (commit != None ):
				(out_line, last_colour, last_nodepos) = self.draw_graph(commit,
										index, out_line,
										last_colour,
										last_nodepos)
				self.index[commit.commit_sha1] = index
				index += 1

			# Skip the '\0'
			commit_lines = []
			commit_lines.append(input_line[1:])
			input_line = fp.readline()

		fp.close()

		self.treeview.set_model(self.model)
		self.treeview.show()

	def draw_graph(self, commit, index, out_line, last_colour, last_nodepos):
		in_line=[]

		#   |   -> outline
		#   X
		#   |\  <- inline 

		# Reset node position
		if (last_nodepos > 5):
			last_nodepos = 0

		# Carry the incomplete lines of the previous cell into this one
		for sha1 in self.incomplete_line.keys():
			if ( sha1 != commit.commit_sha1):
				for pos in self.incomplete_line[sha1]:
					in_line.append((pos, pos, self.colours[sha1]))
			else:
				del self.incomplete_line[sha1]

		try:
			colour = self.colours[commit.commit_sha1]
		except KeyError:
			last_colour +=1
			self.colours[commit.commit_sha1] = last_colour
			colour =  last_colour
		try:
			node_pos = self.nodepos[commit.commit_sha1]
		except KeyError:
			last_nodepos +=1
			self.nodepos[commit.commit_sha1] = last_nodepos
			node_pos = last_nodepos

		# The first parent always continues on the same line
		try:
			# check whether we already have the value
			tmp_node_pos = self.nodepos[commit.parent_sha1[0]]
		except KeyError:
			self.colours[commit.parent_sha1[0]] = colour
			self.nodepos[commit.parent_sha1[0]] = node_pos

		in_line.append((node_pos, self.nodepos[commit.parent_sha1[0]],
					self.colours[commit.parent_sha1[0]]))

		self.add_incomplete_line(commit.parent_sha1[0], index+1)

		if (len(commit.parent_sha1) > 1):
			for parent_id in commit.parent_sha1[1:]:
				try:
					tmp_node_pos = self.nodepos[parent_id]
				except KeyError:
					last_colour += 1
					self.colours[parent_id] = last_colour
					last_nodepos +=1
					self.nodepos[parent_id] = last_nodepos

				in_line.append((node_pos, self.nodepos[parent_id],
							self.colours[parent_id]))
				self.add_incomplete_line(parent_id, index+1)

		
		try:
			branch_tag = self.bt_sha1[commit.commit_sha1]
		except KeyError:
			branch_tag = [ ]


		node = (node_pos, colour, branch_tag) 

		self.model.append([commit, node, out_line, in_line,
				commit.message, commit.author, commit.date]) 

		return (in_line, last_colour, last_nodepos)

	def add_incomplete_line(self, sha1, index):
		try:
			self.incomplete_line[sha1].append(self.nodepos[sha1])
		except KeyError:
			self.incomplete_line[sha1] = [self.nodepos[sha1]]


	def _go_clicked_cb(self, widget, revid):
		"""Callback for when the go button for a parent is clicked."""
		try:
			self.treeview.set_cursor(self.index[revid])
		except KeyError:
			print "Revision %s not present in the list" % revid
			# revid == 0 is the parent of the first commit
			if (revid != 0 ):
				print "Try running gitview without any options"

		self.treeview.grab_focus()

	def _show_clicked_cb(self, widget,  commit_sha1, parent_sha1):
		"""Callback for when the show button for a parent is clicked."""
		window = DiffWindow()
		window.set_diff(commit_sha1, parent_sha1)
		self.treeview.grab_focus()

if __name__ == "__main__":
	without_diff = 0

	if (len(sys.argv) > 1 ):
		if (sys.argv[1] == "--without-diff"):
			without_diff = 1

	view = GitView( without_diff != 1)
	view.run(sys.argv[without_diff:])
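
A minimal sketch of the record splitting that update_window() performs,
written in Python 3 rather than the Python 2 of the patch above.  It
assumes the input has the shape the comments in the patch describe: each
commit record ends with a NUL, and with --parents the first line holds
the commit sha1 followed by its parent sha1s.  The name split_commits
and the dictionary keys are illustrative only, not part of gitview.

```python
def split_commits(raw):
    """Split NUL-terminated commit records into small dictionaries.

    `raw` is assumed to look like the output of
    `git rev-list --header --parents`: one record per commit, each
    terminated by a NUL byte (hypothetical input shape, see lead-in).
    """
    commits = []
    for record in raw.split("\0"):
        if not record.strip():
            # The text after the final NUL is empty; skip it.
            continue
        lines = record.split("\n")
        ids = lines[0].split()
        commits.append({
            "sha1": ids[0],        # first id on the first line
            "parents": ids[1:],    # remaining ids, empty for a root commit
            "lines": lines[1:],    # tree/author/committer lines and message
        })
    return commits
```

A root commit simply ends up with an empty "parents" list, which is the
case the patch handles with its "revid == 0" placeholder.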



* Re: [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Junio C Hamano @ 2006-02-16  8:01 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, Aneesh Kumar, Martin Langhoff
In-Reply-To: <20060216073826.GA12055@hand.yhbt.net>

Eric Wong <normalperson@yhbt.net> writes:

> @ Junio: Is there room for this in the git distribution alongside
> git-svnimport?

Surely.  Things that superficially do similar things are not
necessarily mutually exclusive, if that is what you are worried
about.  There is not much incumbent advantage for tools that
support a narrowly defined specific task (e.g. interfacing with
foreign SCM X) on the periphery, while I would perhaps feel more
hesitant to support 47 different variants of git-commit ;-).

In particular, from your description (I haven't looked at the
code), its point is to give better support for an alternative
workflow to the one svnimport supports.

I was privately advised (by somebody I respect and trust) that I
should not be too hesitant to expand the scope of the project.
Also there are some interesting developments such as Martin's
git-backed fake CVS server and Aneesh's gitview that I have been
interested in, among other things.

Even having some experimental tools that are only starting to do
useful things might be useful, if we had it in the git.git
repository.  For one thing, it would give more exposure to them
and help improve things.

How about first adding a contrib/ directory and see how it goes?

* Re: [ANNOUNCE] pg - A patch porcelain for GIT
From: Karl Hasselström @ 2006-02-16  7:54 UTC (permalink / raw)
  To: git
In-Reply-To: <b0943d9e0602150925v6f01accfw@mail.gmail.com>

On 2006-02-15 17:25:30 +0000, Catalin Marinas wrote:

> On 14/02/06, Petr Baudis <pasky@suse.cz> wrote:
>
> > It is ok as long as you know what are you doing - if you don't
> > push out the commits you've just "undid" (or work on a public
> > accessible repository in the first place, but I think that's kind
> > of rare these days; quick survey - does anyone reading these lines
> > do that?), there's nothing wrong on it, and it gives you nice
> > flexibility.
> >
> > For example, to import bunch of patches (I guess that's the
> > original intention behind this) you just run git-am on them and
> > then stg uncommit all of the newly added commits.
>
> This is a sensible way of using an uncommit command but I initially
> thought it would be better to make things harder for people wanting
> to re-write the history. Anyway, I'll keep this command on my todo
> list.

stgit rewrites history all the time anyway. And as far as I recall,
there's nothing in the documentation that warns the user not to
publish stgit-managed branches. :-)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

* Re: git faq : draft and rfc
From: Alan Chandler @ 2006-02-16  7:50 UTC (permalink / raw)
  To: git
In-Reply-To: <22e91bb0602151636r2e70e60cpa5038f4b6caccc9c@mail.gmail.com>

On Thursday 16 February 2006 00:36, Thomas Riboulet wrote:
> . What's the difference between fetch and pull ?
> Fetch : download objects and a head from another repository.
> Pull : pull and merge from another repository.
> See man git-fetch and git-pull for more.

Surely you are using "pull" to mean "fetch" here.  Shouldn't this be

Fetch: download objects and a head from another repository.
Pull: fetch (as defined above) and then merge this with current development

or something
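
A rough way to picture the definition suggested above, as a Python
sketch.  It is a simplified model under stated assumptions: pull is
treated as nothing more than a fetch followed by a merge of FETCH_HEAD,
and the helper names fetch_steps and pull_steps are made up for this
illustration.  The real git-pull accepts more options than this shows.

```python
def fetch_steps(remote):
    """Commands for a fetch: download objects and a head, merge nothing."""
    return [["git", "fetch", remote]]


def pull_steps(remote):
    """Commands for a pull: the fetch above, then a merge of what it got."""
    return fetch_steps(remote) + [["git", "merge", "FETCH_HEAD"]]
```

The point is only that the first step of a pull is exactly a fetch; the
merge into the current branch is what pull adds.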

-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.

* [ANNOUNCE] git-svn - bidirection operations between svn and git
From: Eric Wong @ 2006-02-16  7:38 UTC (permalink / raw)
  To: git list

[-- Attachment #1: Type: text/plain, Size: 2171 bytes --]

Hello, I've written a simple tool for interoperating between git and
svn.  I wrote this so I could use git to work on projects where other
developers use Subversion.  I really hate using svn, but some projects I
work on require it, and svk isn't nearly as fast nor simple as git.

git-svn does not replace git-svnimport.  git-svnimport handles branches
and tags automatically, but is too inflexible about repository layouts
to be useful for a good number of projects I follow, and of course
git-svnimport can't commit to Subversion repositories :)

git-svn only cares about a single branch/trunk in SVN[1], but you can
use as many branches in git as you want.  This makes it much easier to
use and allows it to handle just about any repository layout, not just
those recommended in the SVN book/developers.

Although importing changesets from SVN is mostly a linear affair,
committing to SVN is the opposite.  You may commit git tree objects in
any order you want.  It simply clobbers the existing svn tree as
'git-checkout -f' would, but tags file renames/copies carefully so users
on the SVN side can see them.  You can even do some wacky things with
patch reordering.

Basic day-to-day usage is pretty simple, and is designed to work with
and also work like normal git commands:

# Initialize a tree (like git init-db)::
	git-svn init http://svn.foo.org/project/trunk

# Fetch remote revisions::
	git-svn fetch

# Create your own branch to hack on::
	git checkout -b my-branch git-svn-HEAD

# Commit only the git commits you want to SVN::
	git-svn commit <tree-ish> [<tree-ish_2> ...] 

# Commit all the git commits from my-branch that don't exist in SVN::
	git rev-list --pretty=oneline git-svn-HEAD..my-branch | git-svn commit

# Something is committed to SVN, pull the latest into your branch::
	git-svn fetch && git pull . git-svn-HEAD
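
The rev-list pipeline above works because the commit command in the
attached script reads lines from stdin, keeps the leading sha1 of each
line, and reverses the order: rev-list prints the newest commit first,
while SVN has to receive the oldest first.  A minimal Python sketch of
that one step follows.  The function name commits_oldest_first is
hypothetical; the pattern mirrors the one used in the script's commit().

```python
import re

# A line is accepted if it starts with 6 to 40 lowercase hex digits
# followed by a word boundary, as in the script's commit() function.
SHA1_PREFIX = re.compile(r"^([0-9a-f]{6,40})\b")


def commits_oldest_first(lines):
    """Extract commit ids from rev-list style lines, oldest first."""
    commits = []
    for line in lines:
        match = SHA1_PREFIX.match(line)
        if match:
            # Prepending reverses the newest-first order of rev-list.
            commits.insert(0, match.group(1))
    return commits
```

Lines that do not start with a sha1 are silently skipped, so the
--pretty=oneline subject text after each id does no harm.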

@ Junio: Is there room for this in the git distribution alongside
git-svnimport?

Thanks for reading,

[1] - there are some hacks that let you handle branches and tags, but
it's not automated in any way, requires a bit of imagination to use to
its full potential, and is very much a hack.  See the man page :)

-- 
Eric Wong

[-- Attachment #2: git-svn --]
[-- Type: text/plain, Size: 21108 bytes --]

#!/usr/bin/env perl
use warnings;
use strict;
use vars qw/	$AUTHOR $VERSION
		$SVN_URL $SVN_INFO $SVN_WC
		$GIT_SVN_INDEX $GIT_SVN
		$GIT_DIR $REV_DIR/;
$AUTHOR = 'Eric Wong <normalperson@yhbt.net>';
$VERSION = '0.9.0';
$GIT_DIR = $ENV{GIT_DIR} || "$ENV{PWD}/.git";
$GIT_SVN = $ENV{GIT_SVN_ID} || 'git-svn';
$GIT_SVN_INDEX = "$GIT_DIR/$GIT_SVN/index";
$ENV{GIT_DIR} ||= $GIT_DIR;
$SVN_URL = undef;
$REV_DIR = "$GIT_DIR/$GIT_SVN/revs";
$SVN_WC = "$GIT_DIR/$GIT_SVN/tree";

# make sure the svn binary gives consistent output between locales and TZs:
$ENV{TZ} = 'UTC';
$ENV{LC_ALL} = 'C';

# If SVN:: library support is added, please make the dependencies
# optional and preserve the capability to use the command-line client.
# See what I do with XML::Simple to make the dependency optional.
use Carp qw/croak/;
use IO::File qw//;
use File::Basename qw/dirname basename/;
use File::Path qw/mkpath/;
use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
use File::Spec qw//;
my $sha1 = qr/[a-f\d]{40}/;
my $sha1_short = qr/[a-f\d]{6,40}/;
my ($_revision,$_stdin,$_no_ignore_ext,$_no_stop_copy,$_help,$_rmdir,$_edit);

GetOptions(	'revision|r=s' => \$_revision,
		'no-ignore-externals' => \$_no_ignore_ext,
		'stdin|' => \$_stdin,
		'edit|e' => \$_edit,
		'rmdir' => \$_rmdir,
		'help|H|h' => \$_help,
		'no-stop-copy' => \$_no_stop_copy );
my %cmd = (
	fetch => [ \&fetch, "Download new revisions from SVN" ],
	init => [ \&init, "Initialize and fetch (import)"],
	commit => [ \&commit, "Commit git revisions to SVN" ],
	rebuild => [ \&rebuild, "Rebuild git-svn metadata (after git clone)" ],
	help => [ \&usage, "Show help" ],
);
my $cmd;
for (my $i = 0; $i < @ARGV; $i++) {
	if (defined $cmd{$ARGV[$i]}) {
		$cmd = $ARGV[$i];
		splice @ARGV, $i, 1;
		last;
	}
};

# we may be called as git-svn-(command), or git-svn(command).
foreach (keys %cmd) {
	if ($0 =~ /git\-svn\-?($_)(?:\.\w+)?$/) {
		$cmd = $1;
		last;
	}
}
usage(0) if $_help;
usage(1) unless (defined $cmd);
svn_check_ignore_externals();
$cmd{$cmd}->[0]->(@ARGV);
exit 0;

####################### primary functions ######################
sub usage {
	my $exit = shift || 0;
	my $fd = $exit ? \*STDERR : \*STDOUT;
	print $fd <<"";
git-svn - bidirectional operations between a single Subversion tree and git
Usage: $0 <command> [options] [arguments]\n
Available commands:

	foreach (sort keys %cmd) {
		print $fd '  ',pack('A10',$_),$cmd{$_}->[1],"\n";
	}
	print $fd <<"";
\nGIT_SVN_ID may be set in the environment to an arbitrary identifier if
you're tracking multiple SVN branches/repositories in one git repository
and want to keep them separate.

	exit $exit;
}

sub rebuild {
	$SVN_URL = shift or undef;
	my $repo_uuid;
	my $newest_rev = 0;
	
	my $pid = open(my $rev_list,'-|');
	defined $pid or croak $!;
	if ($pid == 0) {
		exec("git-rev-list","$GIT_SVN-HEAD") or croak $!;
	}
	my $first;
	while (<$rev_list>) {
		chomp;
		my $c = $_;
		croak "Non-SHA1: $c\n" unless $c =~ /^$sha1$/o;
		my @commit = grep(/^git-svn-id: /,`git-cat-file commit $c`);
		next if (!@commit); # skip merges
		my $id = $commit[$#commit];
		my ($url, $rev, $uuid) = ($id =~ /^git-svn-id:\s(\S+?)\@(\d+)
						\s([a-f\d\-]+)$/x);
		if (!$rev || !$uuid || !$url) {
			# some of the original repositories I made had
			# identifiers like this:
			($rev, $uuid) = ($id =~/^git-svn-id:\s(\d+)
							\@([a-f\d\-]+)/x);
			if (!$rev || !$uuid) {
				croak "Unable to extract revision or UUID from ",
					"$c, $id\n";
			}
		}
		print "r$rev = $c\n";
		unless (defined $first) {
			if (!$SVN_URL && !$url) {
				croak "SVN repository location required\n";
			}
			$SVN_URL ||= $url;
			$repo_uuid = setup_git_svn();
			$first = $rev;
		}
		if ($uuid ne $repo_uuid) {
			croak "Repository UUIDs do not match!\ngot: $uuid\n",
						"expected: $repo_uuid\n";
		}
		assert_revision_eq_or_unknown($rev, $c);
		sys('git-update-ref',"$GIT_SVN/revs/$rev",$c);
		$newest_rev = $rev if ($rev > $newest_rev);
	}
	close $rev_list or croak $?;
	if (!chdir $SVN_WC) {
		my @svn_co = ('svn','co',"-r$first");
		push @svn_co, '--ignore-externals' unless $_no_ignore_ext;
		sys(@svn_co, $SVN_URL, $SVN_WC);
		chdir $SVN_WC or croak $!;
	}
	
	$pid = fork;
	defined $pid or croak $!;
	if ($pid == 0) {
		my @svn_up = qw(svn up);
		push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
		sys(@svn_up,"-r$newest_rev");
		$ENV{GIT_INDEX_FILE} = $GIT_SVN_INDEX; 
		git_addremove();
		exec('git-write-tree');
	}
	waitpid $pid, 0;
}

sub init {
	$SVN_URL = shift or croak "SVN repository location required\n";
	unless (-d $GIT_DIR) {
		sys('git-init-db');
	}
	setup_git_svn();
}

sub fetch {
	my (@parents) = @_;
	$SVN_URL ||= file_to_s("$GIT_DIR/$GIT_SVN/info/url");
	my @log_args = -d $SVN_WC ? ($SVN_WC) : ($SVN_URL);
	if (-d $SVN_WC && !$_revision) {
		$_revision = 'BASE:HEAD';
	}
	push @log_args, "-r$_revision" if $_revision;
	push @log_args, '--stop-on-copy' unless $_no_stop_copy;

	eval { require XML::Simple or croak $! };
	my $svn_log = $@ ? svn_log_raw(@log_args) : svn_log_xml(@log_args);
	
	my $base = shift @$svn_log or croak "No base revision!\n";
	my $last_commit = undef;
	unless (-d $SVN_WC) {
		my @svn_co = ('svn','co',"-r$base->{revision}");
		push @svn_co,'--ignore-externals' unless $_no_ignore_ext;
		sys(@svn_co, $SVN_URL, $SVN_WC);
		chdir $SVN_WC or croak $!;
		$last_commit = git_commit($base, @parents);
		unless (-f "$GIT_DIR/refs/heads/master") {
			sys(qw(git-update-ref refs/heads/master),$last_commit);
		}
		assert_svn_wc_clean($base->{revision}, $last_commit);
	} else {
		chdir $SVN_WC or croak $!;
		$last_commit = file_to_s("$REV_DIR/$base->{revision}");
	}
	my @svn_up = qw(svn up);
	push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
	my $last_rev = $base->{revision};
	foreach my $log_msg (@$svn_log) {
		assert_svn_wc_clean($last_rev, $last_commit);
		$last_rev = $log_msg->{revision};
		sys(@svn_up,"-r$last_rev");
		$last_commit = git_commit($log_msg, $last_commit, @parents);
	}
	assert_svn_wc_clean($last_rev, $last_commit);
	return pop @$svn_log;
}

sub commit {
	my (@commits) = @_;
	if ($_stdin || !@commits) {
		print "Reading from stdin...\n";
		@commits = ();
		while (<STDIN>) {
			if (/^([a-f\d]{6,40})\b/) {
				unshift @commits, $1;
			}
		}
	}
	my @revs;
	foreach (@commits) {
		push @revs, (safe_qx('git-rev-parse',$_));
	}
	chomp @revs;
	
	fetch();
	chdir $SVN_WC or croak $!;
	my $svn_current_rev =  svn_info('.')->{'Last Changed Rev'};
	foreach my $c (@revs) {
		print "Committing $c\n";
		svn_checkout_tree($svn_current_rev, $c);
		$svn_current_rev = svn_commit_tree($svn_current_rev, $c);
	}
	print "Done committing ",scalar @revs," revisions to SVN\n";
		
}

########################### utility functions #########################

sub setup_git_svn {
	defined $SVN_URL or croak "SVN repository location required\n";
	unless (-d $GIT_DIR) {
		croak "GIT_DIR=$GIT_DIR does not exist!\n";
	}
	mkpath(["$GIT_DIR/$GIT_SVN"]);
	mkpath(["$GIT_DIR/$GIT_SVN/info"]);
	mkpath([$REV_DIR]);
	s_to_file($SVN_URL,"$GIT_DIR/$GIT_SVN/info/url");
	my $uuid = svn_info($SVN_URL)->{'Repository UUID'} or
					croak "Repository UUID unreadable\n";
	s_to_file($uuid,"$GIT_DIR/$GIT_SVN/info/uuid");

	open my $fd, '>>', "$GIT_DIR/$GIT_SVN/info/exclude" or croak $!;
	print $fd '.svn',"\n";
	close $fd or croak $!;
	return $uuid;
}

sub assert_svn_wc_clean {
	my ($svn_rev, $commit) = @_;
	croak "$svn_rev is not an integer!\n" unless ($svn_rev =~ /^\d+$/);
	croak "$commit is not a sha1!\n" unless ($commit =~ /^$sha1$/o);
	my $svn_info = svn_info('.');
	if ($svn_rev != $svn_info->{'Last Changed Rev'}) {
		croak "Expected r$svn_rev, got r",
				$svn_info->{'Last Changed Rev'},"\n";
	}
	my @status = grep(!/^Performing status on external/,(`svn status`));
	@status = grep(!/^\s*$/,@status);
	if (scalar @status) {
		print STDERR "Tree ($SVN_WC) is not clean:\n";
		print STDERR $_ foreach @status;
		croak;
	}
	my ($tree_a) = grep(/^tree $sha1$/o,`git-cat-file commit $commit`);
	$tree_a =~ s/^tree //;
	chomp $tree_a;
	chomp(my $tree_b = `GIT_INDEX_FILE=$GIT_SVN_INDEX git-write-tree`);
	if ($tree_a ne $tree_b) {
		croak "$svn_rev != $commit, $tree_a != $tree_b\n";
	}
}

sub parse_diff_tree {
	my $diff_fh = shift;
	local $/ = "\0";
	my $state = 'meta';
	my @mods;
	while (<$diff_fh>) {
		chomp $_; # this gets rid of the trailing "\0"
		print $_,"\n";
		if ($state eq 'meta' && /^:(\d{6})\s(\d{6})\s
					$sha1\s($sha1)\s([MTCRAD])\d*$/xo) {
			push @mods, {	mode_a => $1, mode_b => $2,
					sha1_b => $3, chg => $4 };
			if ($4 =~ /^(?:C|R)$/) {
				$state = 'file_a';
			} else {
				$state = 'file_b';
			}
		} elsif ($state eq 'file_a') {
			my $x = $mods[$#mods] or croak __LINE__,": Empty array\n";
			if ($x->{chg} !~ /^(?:C|R)$/) {
				croak __LINE__,": Error parsing $_, $x->{chg}\n";
			}
			$x->{file_a} = $_;
			$state = 'file_b';
		} elsif ($state eq 'file_b') {
			my $x = $mods[$#mods] or croak __LINE__,": Empty array\n";
			if (exists $x->{file_a} && $x->{chg} !~ /^(?:C|R)$/) {
				croak __LINE__,": Error parsing $_, $x->{chg}\n";
			}
			if (!exists $x->{file_a} && $x->{chg} =~ /^(?:C|R)$/) {
				croak __LINE__,": Error parsing $_, $x->{chg}\n";
			}
			$x->{file_b} = $_;
			$state = 'meta';
		} else {
			croak __LINE__,": Error parsing $_\n";
		}
	}
	close $diff_fh or croak $!;
	return \@mods;
}

sub svn_check_prop_executable {
	my $m = shift;
	if ($m->{mode_b} =~ /755$/ && $m->{mode_a} !~ /755$/) {
		sys(qw(svn propset svn:executable 1), $m->{file_b});
	} elsif ($m->{mode_b} !~ /755$/ && $m->{mode_a} =~ /755$/) {
		sys(qw(svn propdel svn:executable), $m->{file_b});
	}
}

sub svn_ensure_parent_path {
	my $dir_b = dirname(shift);
	svn_ensure_parent_path($dir_b) if ($dir_b ne File::Spec->curdir);
	mkpath([$dir_b]) unless (-d $dir_b);
	sys(qw(svn add -N), $dir_b) unless (-d "$dir_b/.svn");
}

sub svn_checkout_tree {
	my ($svn_rev, $commit) = @_;
	my $from = file_to_s("$REV_DIR/$svn_rev");
	assert_svn_wc_clean($svn_rev,$from);
	print "diff-tree '$from' '$commit'\n";
	my $pid = open my $diff_fh, '-|';
	defined $pid or croak $!;
	if ($pid == 0) {
		exec(qw(git-diff-tree -z -r -C), $from, $commit) or croak $!;
	}
	my $mods = parse_diff_tree($diff_fh);
	unless (@$mods) {
		# git can do empty commits, SVN doesn't allow it...
		return $svn_rev;
	}
	my %rm;
	foreach my $m (@$mods) {
		if ($m->{chg} eq 'C') {
			svn_ensure_parent_path( $m->{file_b} );
			sys(qw(svn cp),		$m->{file_a}, $m->{file_b});
			blob_to_file(		$m->{sha1_b}, $m->{file_b});
			svn_check_prop_executable($m);
		} elsif ($m->{chg} eq 'D') {
			$rm{dirname $m->{file_b}}->{basename $m->{file_b}} = 1;
			sys(qw(svn rm --force), $m->{file_b});
		} elsif ($m->{chg} eq 'R') {
			svn_ensure_parent_path( $m->{file_b} );
			sys(qw(svn mv --force), $m->{file_a}, $m->{file_b});
			blob_to_file(		$m->{sha1_b}, $m->{file_b});
			svn_check_prop_executable($m);
			$rm{dirname $m->{file_a}}->{basename $m->{file_a}} = 1;
		} elsif ($m->{chg} eq 'M') {
			if ($m->{mode_b} =~ /^120/ && $m->{mode_a} =~ /^120/) {
				unlink $m->{file_b} or croak $!;
				blob_to_symlink($m->{sha1_b}, $m->{file_b});
			} else {
				blob_to_file($m->{sha1_b}, $m->{file_b});
			}
			svn_check_prop_executable($m);
		} elsif ($m->{chg} eq 'T') {
			sys(qw(svn rm --force),$m->{file_b});
			if ($m->{mode_b} =~ /^120/ && $m->{mode_a} =~ /^100/) {
				blob_to_symlink($m->{sha1_b}, $m->{file_b});
			} else {
				blob_to_file($m->{sha1_b}, $m->{file_b});
			}
			svn_check_prop_executable($m);
			sys(qw(svn add --force), $m->{file_b});
		} elsif ($m->{chg} eq 'A') {
			svn_ensure_parent_path( $m->{file_b} );
			blob_to_file(		$m->{sha1_b}, $m->{file_b});
			if ($m->{mode_b} =~ /755$/) {
				chmod 0755, $m->{file_b};
			}
			sys(qw(svn add --force), $m->{file_b});
		} else {
			croak "Invalid chg: $m->{chg}\n";
		}
	}
	if ($_rmdir) {
		my $old_index = $ENV{GIT_INDEX_FILE};
		$ENV{GIT_INDEX_FILE} = $GIT_SVN_INDEX;
		foreach my $dir (keys %rm) {
			my $files = $rm{$dir};
			my @files;
			foreach (safe_qx('svn','ls',$dir)) {
				chomp;
				push @files, $_ unless $files->{$_};
			}
			sys(qw(svn rm),$dir) unless @files;
		}
		if ($old_index) {
			$ENV{GIT_INDEX_FILE} = $old_index;
		} else {
			delete $ENV{GIT_INDEX_FILE};
		}
	}
}

sub svn_commit_tree {
	my ($svn_rev, $commit) = @_;
	my $commit_msg = "$GIT_DIR/$GIT_SVN/.svn-commit.tmp.$$";
	open my $msg, '>', $commit_msg  or croak $!;
	
	chomp(my $type = `git-cat-file -t $commit`);
	if ($type eq 'commit') {
		my $pid = open my $msg_fh, '-|';
		defined $pid or croak $!;

		if ($pid == 0) {
			exec(qw(git-cat-file commit), $commit) or croak $!;
		}
		my $in_msg = 0;
		while (<$msg_fh>) {
			if (!$in_msg) {
				$in_msg = 1 if (/^\s*$/);
			} else {
				print $msg $_ or croak $!;
			}
		}
		close $msg_fh or croak $!;
	}
	close $msg or croak $!;

	if ($_edit || ($type eq 'tree')) {
		my $editor = $ENV{VISUAL} || $ENV{EDITOR} || 'vi';
		system($editor, $commit_msg);
	}
	my @ci_output = safe_qx(qw(svn commit -F),$commit_msg);
	my ($committed) = grep(/^Committed revision \d+\./,@ci_output);
	unlink $commit_msg;
	defined $committed or croak
			"Commit output failed to parse committed revision!\n",
			join("\n",@ci_output),"\n";
	my ($rev_committed) = ($committed =~ /^Committed revision (\d+)\./);

	# resync immediately
	my @svn_up = (qw(svn up), "-r$svn_rev");
	push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
	sys(@svn_up);
	return fetch("$rev_committed=$commit")->{revision};
}

sub svn_log_xml {
	my (@log_args) = @_;
	my $log_fh = IO::File->new_tmpfile or croak $!;
	
	my $pid = fork;
	defined $pid or croak $!;
	
	if ($pid == 0) {
		open STDOUT, '>&', $log_fh or croak $!;
		exec (qw(svn log --xml), @log_args) or croak $!
	}
	
	waitpid $pid, 0;
	croak $? if $?;

	seek $log_fh, 0, 0;
	my @svn_log;
	my $log = XML::Simple::XMLin( $log_fh,
				ForceArray => ['path','revision','logentry'],
				KeepRoot => 0,
				KeyAttr => {	logentry => '+revision',
						paths => '+path' },
			)->{logentry};
	foreach my $r (sort {$a <=> $b} keys %$log) {
		my $log_msg = $log->{$r};
		my ($Y,$m,$d,$H,$M,$S) = ($log_msg->{date} =~
					/(\d{4})\-(\d\d)\-(\d\d)T
					 (\d\d)\:(\d\d)\:(\d\d)\.\d+Z$/x)
					 or croak "Failed to parse date: ",
						 $log->{$r}->{date};
		$log_msg->{date} = "+0000 $Y-$m-$d $H:$M:$S";

		# XML::Simple can't handle <msg></msg> as a string:
		if (ref $log_msg->{msg} eq 'HASH') {
			$log_msg->{msg} = "\n";
		} else {
			$log_msg->{msg} .= "\n";
		}
		push @svn_log, $log->{$r};
	}
	return \@svn_log;
}

sub svn_log_raw {
	my (@log_args) = @_;
	my $pid = open my $log_fh,'-|';
	defined $pid or croak $!;
	
	if ($pid == 0) {
		exec (qw(svn log), @log_args) or croak $!
	}
	
	my @svn_log;
	my $state;
	while (<$log_fh>) {
		chomp;
		if (/^\-{72}$/) {
			$state = 'rev';
			
			# if we have an empty log message, put something there:
			if (@svn_log) {
				$svn_log[0]->{msg} ||= "\n";
			}
			next;
		}
		if ($state eq 'rev' && s/^r(\d+)\s*\|\s*//) {
			my $rev = $1;
			my ($author, $date) = split(/\s*\|\s*/, $_, 2);
			my ($Y,$m,$d,$H,$M,$S,$tz) = ($date =~
					/(\d{4})\-(\d\d)\-(\d\d)\s
					 (\d\d)\:(\d\d)\:(\d\d)\s([\-\+]\d+)/x)
					 or croak "Failed to parse date: $date\n";
			my %log_msg = (	revision => $rev,
					date => "$tz $Y-$m-$d $H:$M:$S",
					author => $author,
					msg => '' );
			unshift @svn_log, \%log_msg;
			$state = 'msg_start';
			next;
		}
		# skip the first blank line of the message:
		if ($state eq 'msg_start' && /^$/) {
			$state = 'msg';
		} elsif ($state eq 'msg') {
			$svn_log[0]->{msg} .= $_."\n";
		}
	}
	close $log_fh or croak $?;
	return \@svn_log;
}

sub svn_info {
	my $url = shift || $SVN_URL;

	my $pid = open my $info_fh, '-|';
	defined $pid or croak $!;
	
	if ($pid == 0) {
		exec(qw(svn info),$url) or croak $!;
	}
	
	my $ret = {};
	# only single-lines seem to exist in svn info output
	while (<$info_fh>) {
		chomp $_;
		if (m#^([^:]+)\s*:\s*(\S*)$#) {
			$ret->{$1} = $2;
			push @{$ret->{-order}}, $1;
		}
	}
	close $info_fh or croak $!;
	return $ret;
}

sub sys { system(@_) == 0 or croak $? }

sub git_addremove {
	system(	"git-ls-files -z --others ".
			"'--exclude-from=$GIT_DIR/$GIT_SVN/info/exclude'".
				"| git-update-index --add -z --stdin; ".
		"git-ls-files -z --deleted ".
				"| git-update-index --remove -z --stdin; ".
		"git-ls-files -z --modified".
				"| git-update-index -z --stdin") == 0 or croak $?
}

sub s_to_file {
	my ($str, $file, $mode) = @_;
	open my $fd,'>',$file or croak $!;
	print $fd $str,"\n" or croak $!;
	close $fd or croak $!;
	chmod ($mode &~ umask, $file) if (defined $mode);
}

sub file_to_s {
	my $file = shift;
	open my $fd,'<',$file or croak "$!: file: $file\n";
	local $/;
	my $ret = <$fd>;
	close $fd or croak $!;
 	$ret =~ s/\s*$//s;
	return $ret;
}

sub assert_revision_unknown {
	my $revno = shift;
	if (-f "$REV_DIR/$revno") {
		croak "$REV_DIR/$revno already exists! ",
				"Why are we refetching it?";
	}
}

sub assert_revision_eq_or_unknown {
	my ($revno, $commit) = @_;
	if (-f "$REV_DIR/$revno") {
		my $current = file_to_s("$REV_DIR/$revno");
		if ($commit ne $current) {
			croak "$REV_DIR/$revno already exists!\n",
				"current: $current\nexpected: $commit\n";
		}
		return;
	}
}

sub git_commit {
	my ($log_msg, @parents) = @_;
	assert_revision_unknown($log_msg->{revision});
	my $out_fh = IO::File->new_tmpfile or croak $!;
	my $info = svn_info('.');
	my $uuid = $info->{'Repository UUID'};
	defined $uuid or croak "Unable to get Repository UUID\n";

	# commit parents can be conditionally bound to a particular
	# svn revision via: "svn_revno=commit_sha1", filter them out here:
	my @exec_parents;
	foreach my $p (@parents) {
		next unless defined $p;
		if ($p =~ /^(\d+)=($sha1_short)$/o) {
			if ($1 == $log_msg->{revision}) {
				push @exec_parents, $2;
			}
		} else {
			push @exec_parents, $p if $p =~ /$sha1_short/o;
		}
	}

	my $pid = fork;
	defined $pid or croak $!;
	if ($pid == 0) {
		$ENV{GIT_INDEX_FILE} = $GIT_SVN_INDEX;
		git_addremove();
		chomp(my $tree = `git-write-tree`);
		croak if $?;
		my $msg_fh = IO::File->new_tmpfile or croak $!;
		print $msg_fh $log_msg->{msg}, "\ngit-svn-id: ", 
					"$SVN_URL\@$log_msg->{revision}",
					" $uuid\n" or croak $!;
		$msg_fh->flush == 0 or croak $!;
		seek $msg_fh, 0, 0 or croak $!;

		$ENV{GIT_AUTHOR_NAME} = $ENV{GIT_COMMITTER_NAME} =
						$log_msg->{author};
		$ENV{GIT_AUTHOR_EMAIL} = $ENV{GIT_COMMITTER_EMAIL} =
						$log_msg->{author}."\@$uuid";
		$ENV{GIT_AUTHOR_DATE} = $ENV{GIT_COMMITTER_DATE} =
						$log_msg->{date};
		my @exec = ('git-commit-tree',$tree);
		push @exec, '-p', $_  foreach @exec_parents;
		open STDIN, '<&', $msg_fh or croak $!;
		open STDOUT, '>&', $out_fh or croak $!;
		exec @exec or croak $!;
	}
	waitpid($pid,0);
	croak if $?;

	$out_fh->flush == 0 or croak $!;
	seek $out_fh, 0, 0 or croak $!;
	chomp(my $commit = do { local $/; <$out_fh> });
	if ($commit !~ /^$sha1$/o) {
		croak "Failed to commit, invalid sha1: $commit\n";
	}
	my @update_ref = ('git-update-ref',"refs/heads/$GIT_SVN-HEAD",$commit);
	if (my $primary_parent = shift @exec_parents) {
		push @update_ref, $primary_parent;
	}
	sys(@update_ref);
	sys('git-update-ref',"$GIT_SVN/revs/$log_msg->{revision}",$commit);
	print "r$log_msg->{revision} = $commit\n";
	return $commit;
}

sub blob_to_symlink {
	my ($blob, $link) = @_;
	defined $link or croak "\$link not defined!\n";
	croak "Not a sha1: $blob\n" unless $blob =~ /^$sha1$/o;
	my $dest = `git-cat-file blob $blob`; # no newline, so no chomp
	symlink $dest, $link or croak $!;
}

sub blob_to_file {
	my ($blob, $file) = @_;
	defined $file or croak "\$file not defined!\n";
	croak "Not a sha1: $blob\n" unless $blob =~ /^$sha1$/o;
	open my $blob_fh, '>', $file or croak "$!: $file\n";
	my $pid = fork;
	defined $pid or croak $!;

	if ($pid == 0) {
		open STDOUT, '>&', $blob_fh or croak $!;
		exec('git-cat-file','blob',$blob);
	}
	waitpid $pid, 0;
	croak $? if $?;
	
	close $blob_fh or croak $!;
}

sub safe_qx {
	my $pid = open my $child, '-|';
	defined $pid or croak $!;
	if ($pid == 0) {
		exec(@_) or croak $?;
	}
	my @ret = (<$child>);
	close $child or croak $?;
	die $? if $?; # just in case close didn't error out
	return wantarray ? @ret : join('',@ret);
}

sub svn_check_ignore_externals {
	return if $_no_ignore_ext;
	unless (grep /ignore-externals/,(safe_qx(qw(svn co -h)))) {
		print STDERR "W: Installed svn version does not support ",
				"--ignore-externals\n";
		$_no_ignore_ext = 1;
	}
}
__END__

Data structures:

@svn_log = array of log_msg hashes

$log_msg hash 
{ 
	msg => 'whitespace-formatted log entry
',						# trailing newline is preserved
	revision => '8',			# integer
	date => '2004-02-24T17:01:44.108345Z',	# commit date
	author => 'committer name' 
};


@mods = array of diff-index line hashes, each element represents one line
	of diff-index output

diff-index line ($m hash)
{
	mode_a => first column of diff-index output, no leading ':',
	mode_b => second column of diff-index output,
	sha1_b => sha1sum of the final blob,
	chg => change type [MCRAD],
	file_a => original file name of a file (iff chg is 'C' or 'R')
	file_b => new/current file name of a file (any chg)
}
;

[-- Attachment #3: git-svn.txt --]
[-- Type: text/plain, Size: 7405 bytes --]

git-svn(1)
==========

NAME
----
git-svn - bidirectional operation between a single Subversion branch and git

SYNOPSIS
--------
'git-svn' <command> [options] [arguments]

DESCRIPTION
-----------
git-svn is a simple conduit for changesets between a single Subversion
branch and git.

git-svn is not to be confused with git-svnimport.  They were designed
with very different goals in mind.

git-svn is designed for an individual developer who wants a
bidirectional flow of changesets between a single branch in Subversion
and an arbitrary number of branches in git.  git-svnimport is designed
for read-only operation on repositories that match a particular layout
(albeit the recommended one by SVN developers).

For importing svn, git-svnimport is potentially more powerful when
operating on repositories organized under the recommended
trunk/branch/tags structure, and should be faster, too.

git-svn completely ignores the very limited view of branching that
Subversion has.  This allows git-svn to be much easier to use,
especially on repositories that are not organized in a manner that
git-svnimport is designed for.

COMMANDS
--------
init::
	Creates an empty git repository with additional metadata
	directories for git-svn.  The SVN_URL must be specified
	at this point.

fetch::
	Fetch unfetched revisions from the SVN_URL we are tracking.
	refs/heads/git-svn-HEAD will be updated to the latest revision.
	
commit::
	Commit specified commit or tree objects to SVN.  This relies on
	your imported fetch data being up-to-date.  This makes
	absolutely no attempt to do patching when committing to SVN; it
	simply overwrites files with those specified in the tree or
	commit.  All merging is assumed to have taken place
	independently of git-svn functions.

rebuild::
	Not a part of daily usage, but this is a useful command if
	you've just cloned a repository (using git-clone) that was
	tracked with git-svn.  Unfortunately, git-clone does not clone
	git-svn metadata and the svn working tree that git-svn uses for
	its operations.  This rebuilds the metadata so git-svn can
	resume fetch operations.  SVN_URL may be optionally specified if
	the directory/repository you're tracking has moved or changed
	protocols.

OPTIONS
-------
-r <ARG>::
--revision <ARG>::
	Only used with the 'fetch' command.

	Takes any valid -r<argument> svn would accept and passes it
	directly to svn.  -r<ARG1>:<ARG2> ranges and "{" DATE "}" syntax
	are also supported; see the svn documentation for more details.

	This can allow you to make partial mirrors when running fetch.

-::
--stdin::
	Only used with the 'commit' command.
	
	Read a list of commits from stdin and commit them in reverse
	order.  Only the leading sha1 is read from each line, so
	git-rev-list --pretty=oneline output can be used.

--rmdir::
	Only used with the 'commit' command.

	Remove directories from the SVN tree if there are no files left
	behind.  SVN can version empty directories, and they are not
	removed by default if there are no files left in them.  git
	cannot version empty directories.  Enabling this flag will make
	the commit to SVN act like git.

-e::
--edit::
	Only used with the 'commit' command.
	
	Edit the commit message before committing to SVN.  This is off by
	default for objects that are commits, and forced on when committing
	tree objects.

COMPATIBILITY OPTIONS
---------------------
--no-ignore-externals::
	Only used with the 'fetch' and 'rebuild' commands.

	By default, git-svn passes --ignore-externals to svn to avoid
	fetching svn:external trees into git.  Pass this flag to enable
	externals tracking directly via git.

	Versions of svn that do not support --ignore-externals are
	automatically detected and this flag will be automatically
	enabled for them.

	Otherwise, do not enable this flag unless you know what you're
	doing.

--no-stop-on-copy::
	Only used with the 'fetch' command.

	By default, git-svn passes --stop-on-copy to avoid dealing with
	the copied/renamed branch directory problem entirely.  A
	copied/renamed branch is the result of a <SVN_URL> being created
	in the past from a different source.  These are problematic to
	deal with even when working purely with svn if you work inside
	subdirectories.

	Do not use this flag unless you know exactly what you're getting
	yourself into.  You have been warned.

Examples
~~~~~~~~

Tracking and contributing to a Subversion-managed project:

# Initialize a tree (like git init-db)::
	git-svn init http://svn.foo.org/project/trunk
# Fetch remote revisions::
	git-svn fetch
# Create your own branch to hack on::
	git checkout -b my-branch git-svn-HEAD
# Commit only the git commits you want to SVN::
	git-svn commit <tree-ish> [<tree-ish_2> ...] 
# Commit all the git commits from my-branch that don't exist in SVN::
	git rev-list --pretty=oneline git-svn-HEAD..my-branch | git-svn commit
# Something is committed to SVN, pull the latest into your branch::
	git-svn fetch && git pull . git-svn-HEAD

DESIGN PHILOSOPHY
-----------------
Merge tracking in Subversion is lacking and doing branched development
with Subversion is cumbersome as a result.  git-svn completely forgoes
any automated merge/branch tracking on the Subversion side and leaves it
entirely up to the user on the git side.  It's simply not worth it to do
a useful translation when the original signal is weak.

TRACKING MULTIPLE REPOSITORIES OR BRANCHES
------------------------------------------
This is for advanced users; most users should ignore this section.

Because git-svn does not care about relationships between different
branches or directories in a Subversion repository, git-svn has a simple
hack to allow it to track an arbitrary number of related _or_ unrelated
SVN repositories via one git repository.  Simply set the GIT_SVN_ID
environment variable to a name other than "git-svn" (the default)
and git-svn will ignore the contents of the $GIT_DIR/git-svn directory
and instead do all of its work in $GIT_DIR/$GIT_SVN_ID for that
invocation.
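
As a rough illustration of the rule above (not the actual Perl code), the
metadata directory is simply $GIT_DIR joined with either GIT_SVN_ID or the
default name.  The helper name git_svn_dir is made up for this sketch, and
treating an empty GIT_SVN_ID like an unset one is an assumption:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<git_dir>/<id>" into buf, using "git-svn" when id is NULL or
 * empty (the empty case is an assumption of this sketch).
 * Returns 0 on success, -1 if buf is too small.
 */
static int git_svn_dir(const char *git_dir, const char *id,
		       char *buf, size_t len)
{
	int n;

	assert(git_dir && buf);
	if (!id || strlen(id) == 0)
		id = "git-svn";		/* the documented default */
	n = snprintf(buf, len, "%s/%s", git_dir, id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass getenv("GIT_SVN_ID") as the id argument, so each
invocation picks its working directory from the environment.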

ADDITIONAL FETCH ARGUMENTS
--------------------------
This is for advanced users; most users should ignore this section.

Unfetched SVN revisions may be imported as children of existing commits
by specifying additional arguments to 'fetch'.  Additional parents may
optionally be specified in the form of sha1 hex sums at the
command-line.  Unfetched SVN revisions may also be tied to particular
git commits with the following syntax:
		
	svn_revision_number=git_commit_sha1

This allows you to tie unfetched SVN revision 375 to your current HEAD::

	git-svn fetch 375=$(git-rev-parse HEAD)
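
The svn_revision_number=git_commit_sha1 form is easy to validate before
use.  The following is only a sketch of such a check, written in C rather
than the Perl that git-svn itself uses; parse_rev_parent is a hypothetical
name, and accepting upper-case hex digits is an assumption:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "REV=SHA1" argument.  On success, store the revision in
 * *rev, copy the 40 hex digits into sha1 (41 bytes, NUL-terminated)
 * and return 0.  Return -1 on malformed input, leaving outputs alone.
 */
static int parse_rev_parent(const char *arg, unsigned long *rev,
			    char sha1[41])
{
	unsigned long r;
	char *end;
	size_t i;

	assert(arg && rev && sha1);
	if (!isdigit((unsigned char)arg[0]))
		return -1;		/* revision must be a number */
	r = strtoul(arg, &end, 10);
	if (*end != '=')
		return -1;
	end++;				/* skip the '=' */
	if (strlen(end) != 40)
		return -1;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)end[i]))
			return -1;
	memcpy(sha1, end, 41);		/* includes the terminating NUL */
	*rev = r;
	return 0;
}
```

Anything that fails this check would presumably be treated as a plain
additional parent sha1 or rejected; the sketch does not cover that part.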

BUGS
----
If somebody commits a conflicting changeset to SVN at a bad moment
(right before you commit) causing a conflict and your commit to fail,
your svn working tree ($GIT_DIR/git-svn/tree) may be dirtied.  The
easiest thing to do is probably just to rm -rf $GIT_DIR/git-svn/tree and
run 'rebuild'.

We ignore all SVN properties except svn:executable.  Too difficult to
map them since we rely heavily on git write-tree being _exactly_ the
same on both the SVN and git working trees and I prefer not to clutter
working trees with metadata files.

svn:keywords can't be ignored in Subversion (at least I don't know of
a way to ignore them).

Author
------
Written by Eric Wong <normalperson@yhbt.net>.

Documentation
-------------
Written by Eric Wong <normalperson@yhbt.net>.

^ permalink raw reply

* Re: Make "git clone" less of a deathly quiet experience
From: Junio C Hamano @ 2006-02-16  7:33 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andreas Ericsson, Keith Packard, Linus Torvalds, Git Mailing List,
	Petr Baudis
In-Reply-To: <m1ek23rduh.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

> I don't know how well multiple packs will work with the current git
> protocol...

Then I wonder why you are making this observation ... ;-)

In any case, I suspect this would be helped to a certain degree
by the pack-object that reuses delta data from existing packs,
if your repository is reasonably packed.

^ permalink raw reply

* Re: Make "git clone" less of a deathly quiet experience
From: Eric W. Biederman @ 2006-02-16  6:56 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Keith Packard, Linus Torvalds, Junio C Hamano, Git Mailing List,
	Petr Baudis
In-Reply-To: <43EF15D1.1050609@op5.se>

Andreas Ericsson <ae@op5.se> writes:

> Keith Packard wrote:
>> On Sun, 2006-02-12 at 04:43 +0100, Andreas Ericsson wrote:
>>
>>>A weird oddity; Cloning is faster over rsync, day-to-day pulling is not.
>> Precisely. If the protocol could deliver existing packs instead of
>> unpacking and repacking them, then git would be as fast as rsync and I
>> wouldn't have to worry about supporting two protocols.
>>
>
> Caching features have been discussed, but that means the daemon needs to have
> write-access to some directory within the repository. It would also work poorly
> for projects that see very rapid development unless the cached pack-files can be
> amended to. A sort of "create packs on demand". It shouldn't be too difficult,
> really.

Actually for the clone case we don't need a writable directory for the
git-daemon. 

If we assume that a repository up for download is reasonably packed,
we can just lob all of the packs in the current repository, and then
pack the few remaining objects and send them.

I don't know how well multiple packs will work with the current git
protocol but it should be pretty natural, and the clone case is easy to
detect as there are no heads in common.  Can that be detected quickly?

I don't have a patch but it feels like a pretty straightforward thing
to implement.

Eric

^ permalink raw reply

* What's in git.git
From: Junio C Hamano @ 2006-02-16  6:57 UTC (permalink / raw)
  To: git

* Master branch has these since 1.2.1 maintenance release.

  - documentation fixes:
    git-commit: Now --only semantics is the default.

  - usability:
    - rebase acquired a hook to refuse rebasing.
    - commit and add detect misspelled pathspec while making a partial commit.
    - git-svnimport: -r adds svn revision number to commit messages
    - properly git-bisect reset after bisecting from non-master head
    - send-email: Add some options for controlling how addresses
      are automatically added to the cc: list.
    - send-email: Add --cc

* Next branch has these, that are not in master.  If you feel
  you would benefit from these, testing and feedback is greatly
  appreciated.

 - "Assume unchanged git" series (7 commits):

   This was done in response to people on filesystems with slow
   lstat(2).  I do not have such an environment, so I cannot say
   I tested it that much.

 - "Rebase to different branch" (1 commit):

   This was previously discussed on the list.  With this command
   line:

    	$ git rebase --onto master~1 master topic
    
   would rebase this ancestry graph to:
    
              A---B---C topic
             /
        D---E---F---G master
    
    
   another graph that looks like this:
    
                  A'--B'--C' topic
                 /
        D---E---F---G master

   Earlier, you couldn't rebase to anywhere other than on top of
   "the other branch".


* Proposed updates "pu" branch has these, not in "next".  Some
  of them are of iffy nature, and without further work will not
  go anywhere.

 - "merge-tree" series by Linus (2 commits).

   I haven't spent enough time looking at and thinking about
   this yet.

 - "reuse pack data" (1 commit).

   I still haven't seen data corruption with this one, which is
   a good sign, but would like to keep beating it privately for
   a while.  Perhaps will graduate to "next" by next week.

 - "bind commit" series (6 commits).

   I think the core-side is more or less done with this one.
   Anybody interested in doing Porcelain side?

 - "shallow clone" series (1 commit).

   I should drop this one for now and perhaps when enough people
   are interested reopen the issue.

^ permalink raw reply

* RE: [ANNOUNCE] GIT 1.2.1
From: Brown, Len @ 2006-02-16  6:47 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: linux-kernel

Happy to notice Documentation/git-send-email
to standardize greg's scripts, but don't see it in the release.

anybody using it?

-Len

^ permalink raw reply

* [ANNOUNCE] GIT 1.2.1
From: Junio C Hamano @ 2006-02-16  6:25 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

The latest maintenance release GIT 1.2.1 is available at the
usual places:

	http://www.kernel.org/pub/software/scm/git/

	git-1.2.1.tar.{gz,bz2}			(tarball)
	RPMS/$arch/git-*-1.2.1-1.$arch.rpm	(RPM)


Nothing earth-shattering but cleanups and cleanups and cleanups.

All the interesting things are happening in "master" and "pu",
which will be a topic for a separate message.

----------------------------------------------------------------

Changes since v1.2.0 are as follows:

Fernando J. Pereda:
      Print an error if cloning a http repo and NO_CURL is set

Fredrik Kuivinen:
      s/SHELL/SHELL_PATH/ in Makefile

Josef Weidendorfer:
      More useful/hinting error messages in git-checkout

Junio C Hamano:
      Documentation: git-commit in 1.2.X series defaults to --include.
      Documentation: git-ls-files asciidocco.
      bisect: remove BISECT_NAMES after done.
      combine-diff: diff-files fix.
      combine-diff: diff-files fix (#2)
      checkout: fix dirty-file display.

^ permalink raw reply

* [PATCH] topo-order: make --date-order optional.
From: Junio C Hamano @ 2006-02-16  6:18 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17395.58926.26670.23572@cargo.ozlabs.ibm.com>

This adds --date-order to rev-list; it is similar to topo order
in the sense that no parent comes before all of its children,
but otherwise things are still ordered in the commit timestamp
order.

The same flag is also added to show-branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * Paul, this supersedes the previous one, which made topo-order
   behave date-order unconditionally.

   The thing is, topological ordering code has a nice property
   that the equally eligible ones are pushed into a LIFO, which
   keeps single strand of pearls together and as good as the
   merge order for practical purposes.  Doing date-order breaks
   it.  Since we would want both, there is a new option as you
   originally wanted.

   I've tested it to see that it passes the tests, but that is
   not saying much ;-).
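
   To make the difference concrete, here is a toy, self-contained
   sketch of the same idea (not the code in the patch below): a
   topological sort whose ready list is either a LIFO or kept sorted
   newest-first.  The struct and function names are made up for the
   illustration, and each commit is limited to two parents.

```c
#include <assert.h>
#include <string.h>

#define MAX_COMMITS 16

struct toy_commit {
	int parents[2];	/* indices into the array, -1 for "none" */
	long date;	/* commit timestamp, larger is newer */
};

/* Add idx to the ready list: at the front if lifo, else newest-first. */
static void work_insert(const struct toy_commit *c, int *work, int *len,
			int idx, int lifo)
{
	int pos = 0;

	if (!lifo)
		while (pos < *len && c[work[pos]].date >= c[idx].date)
			pos++;
	memmove(work + pos + 1, work + pos, (*len - pos) * sizeof(*work));
	work[pos] = idx;
	(*len)++;
}

/* Fill out[] with commit indices, children always before parents. */
static int toy_topo_sort(const struct toy_commit *c, int n, int lifo,
			 int *out)
{
	int indegree[MAX_COMMITS] = {0};	/* number of children */
	int work[MAX_COMMITS];
	int len = 0, done = 0, i, p;

	assert(n <= MAX_COMMITS);
	for (i = 0; i < n; i++)
		for (p = 0; p < 2; p++)
			if (c[i].parents[p] >= 0)
				indegree[c[i].parents[p]]++;
	for (i = 0; i < n; i++)
		if (!indegree[i])
			work_insert(c, work, &len, i, lifo);
	while (len) {
		int item = work[0];

		len--;
		memmove(work, work + 1, len * sizeof(*work));
		out[done++] = item;
		for (p = 0; p < 2; p++) {
			int parent = c[item].parents[p];

			/* a parent becomes ready once all children are out */
			if (parent >= 0 && !--indegree[parent])
				work_insert(c, work, &len, parent, lifo);
		}
	}
	return done;
}
```

   With a merge of two two-commit branches, the LIFO variant emits one
   whole branch before the other, while the date variant interleaves
   the branches by timestamp; both keep every child ahead of its
   parents.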

 commit.c      |   13 ++++++++++---
 commit.h      |    4 +++-
 rev-list.c    |   11 ++++++++++-
 rev-parse.c   |    1 +
 show-branch.c |    9 +++++----
 5 files changed, 29 insertions(+), 9 deletions(-)

4c8725f16abff4be4812d0d07a663250bef3ef0e
diff --git a/commit.c b/commit.c
index 67e11d7..c550a00 100644
--- a/commit.c
+++ b/commit.c
@@ -571,7 +571,7 @@ int count_parents(struct commit * commit
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list)
+void sort_in_topological_order(struct commit_list ** list, int lifo)
 {
 	struct commit_list * next = *list;
 	struct commit_list * work = NULL, **insert;
@@ -630,7 +630,10 @@ void sort_in_topological_order(struct co
 		}
 		next=next->next;
 	}
+
 	/* process the list in topological order */
+	if (!lifo)
+		sort_by_date(&work);
 	while (work) {
 		struct commit * work_item = pop_commit(&work);
 		struct sort_node * work_node = (struct sort_node *)work_item->object.util;
@@ -647,8 +650,12 @@ void sort_in_topological_order(struct co
                                  * guaranteeing topological order.
                                  */
 				pn->indegree--;
-				if (!pn->indegree) 
-					commit_list_insert(parent, &work);
+				if (!pn->indegree) {
+					if (!lifo)
+						insert_by_date(parent, &work);
+					else
+						commit_list_insert(parent, &work);
+				}
 			}
 			parents=parents->next;
 		}
diff --git a/commit.h b/commit.h
index 986b22d..70a7c75 100644
--- a/commit.h
+++ b/commit.h
@@ -72,6 +72,8 @@ int count_parents(struct commit * commit
  * Post-conditions: 
  *   invariant of resulting list is:
  *      a reachable from b => ord(b) < ord(a)
+ *   in addition, when lifo == 0, commits on parallel tracks are
+ *   sorted in the dates order.
  */
-void sort_in_topological_order(struct commit_list ** list);
+void sort_in_topological_order(struct commit_list ** list, int lifo);
 #endif /* COMMIT_H */
diff --git a/rev-list.c b/rev-list.c
index 63391fc..f2d1105 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -27,6 +27,7 @@ static const char rev_list_usage[] =
 "  ordering output:\n"
 "    --merge-order [ --show-breaks ]\n"
 "    --topo-order\n"
+"    --date-order\n"
 "  formatting output:\n"
 "    --parents\n"
 "    --objects\n"
@@ -56,6 +57,7 @@ static int merge_order = 0;
 static int show_breaks = 0;
 static int stop_traversal = 0;
 static int topo_order = 0;
+static int lifo = 1;
 static int no_merges = 0;
 static const char **paths = NULL;
 static int remove_empty_trees = 0;
@@ -856,6 +858,13 @@ int main(int argc, const char **argv)
 		}
 		if (!strcmp(arg, "--topo-order")) {
 		        topo_order = 1;
+			lifo = 1;
+		        limited = 1;
+			continue;
+		}
+		if (!strcmp(arg, "--date-order")) {
+		        topo_order = 1;
+			lifo = 0;
 		        limited = 1;
 			continue;
 		}
@@ -940,7 +949,7 @@ int main(int argc, const char **argv)
 	        if (limited)
 			list = limit_list(list);
 		if (topo_order)
-			sort_in_topological_order(&list);
+			sort_in_topological_order(&list, lifo);
 		show_commit_list(list);
 	} else {
 #ifndef NO_OPENSSL
diff --git a/rev-parse.c b/rev-parse.c
index b82f294..9161fae 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -48,6 +48,7 @@ static int is_rev_argument(const char *a
 		"--show-breaks",
 		"--sparse",
 		"--topo-order",
+		"--date-order",
 		"--unpacked",
 		NULL
 	};
diff --git a/show-branch.c b/show-branch.c
index 511fd3b..5a86ae2 100644
--- a/show-branch.c
+++ b/show-branch.c
@@ -535,6 +535,7 @@ int main(int ac, char **av)
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_tags = 0;
 	int all_mask, all_revs;
+	int lifo = 1;
 	char head_path[128];
 	const char *head_path_p;
 	int head_path_len;
@@ -544,7 +545,6 @@ int main(int ac, char **av)
 	int no_name = 0;
 	int sha1_name = 0;
 	int shown_merge_point = 0;
-	int topo_order = 0;
 	int with_current_branch = 0;
 	int head_at = -1;
 
@@ -586,7 +586,9 @@ int main(int ac, char **av)
 		else if (!strcmp(arg, "--independent"))
 			independent = 1;
 		else if (!strcmp(arg, "--topo-order"))
-			topo_order = 1;
+			lifo = 1;
+		else if (!strcmp(arg, "--date-order"))
+			lifo = 0;
 		else
 			usage(show_branch_usage);
 		ac--; av++;
@@ -710,8 +712,7 @@ int main(int ac, char **av)
 		exit(0);
 
 	/* Sort topologically */
-	if (topo_order)
-		sort_in_topological_order(&seen);
+	sort_in_topological_order(&seen, lifo);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
-- 
1.2.1.gbf0a

^ permalink raw reply related

* Re: [PATCH] pack-objects: reuse data from existing pack.
From: Junio C Hamano @ 2006-02-16  4:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602151953290.916@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> No ugly special-case caching, just automatically "the right thing", with 
> very little overhead.
>
> It just makes sense.

Thanks.  I threw away two rounds of crap before this one, which
were full of ugly special cases ;-).

^ permalink raw reply

* Re: git faq : draft and rfc
From: Martin Langhoff @ 2006-02-16  4:04 UTC (permalink / raw)
  To: Thomas Riboulet; +Cc: git
In-Reply-To: <22e91bb0602151636r2e70e60cpa5038f4b6caccc9c@mail.gmail.com>

On 2/16/06, Thomas Riboulet <riboulet@gmail.com> wrote:

> . Can I import from cvs ?
> Yes. Use git-cvsimport. See the cvs-migration doc for more details.
>
> . Can I import from svn ?
> Yes. Use git-svnimport. See the svn-import doc for more details.

+ Can I import from arch/baz/tla? Use git-archimport.

+ Can I import from others? Maybe -- check if tailor.py can do it.

cheers,


martin

^ permalink raw reply

* Re: [PATCH] pack-objects: reuse data from existing pack.
From: Junio C Hamano @ 2006-02-16  3:59 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602152226130.5606@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> On Wed, 15 Feb 2006, Junio C Hamano wrote:
>
>>     Generating pack...
>>     Done counting 12233 objects.
>>     Packing 12233 objects.....................
>>     60a88b3979df41e22d1edc3967095e897f720192
>>     Total 12233, written 12233, reused 12177
>> ...
> In fact, the resulting pack should be identical with or without this 
> patch, shouldn't it?

Not necessarily.  The delta-depth limitation is currently lifted
when reusing deltified objects (finding out the current depth is
not so expensive compared to uncompress-delta-recompress cycle,
but still costs somewhat, and the objective of this exercise is
to gain performance).

Notice the numbers 'written' and 'reused' in the output?
The difference in that example comes from the fact that I am
omitting some objects from the set of objects to be packed
(v1.0.0 is ancient) in a repository where some newer objects are
packed.  Since packing-delta goes backwards, what is in v1.0.0
but not in my tip tends to be deltified in the original pack,
but the resulting pack needs to have them expanded -- that is
where the difference comes from.

A cleaned-up patch will be in "pu" branch tonight.  I considered
putting it in "next", but decided against it.  I have not spent
enough time really beating on it, although I haven't seen major
breakage.

^ permalink raw reply

* Re: [PATCH] pack-objects: reuse data from existing pack.
From: Linus Torvalds @ 2006-02-16  3:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vbqx8m62q.fsf@assigned-by-dhcp.cox.net>



On Wed, 15 Feb 2006, Junio C Hamano wrote:
>
> When generating a new pack, notice if we have already the wanted
> object in existing packs.  If the object has a deltified
> representation, and its base object is also what we are going to
> pack, then reuse the existing deltified representation
> unconditionally, bypassing all the expensive find_deltas() and
> try_deltas() routines.

I bow down before you.

No ugly special-case caching, just automatically "the right thing", with 
very little overhead.

It just makes sense.

We have a winner.

		Linus

^ permalink raw reply

* Re: [PATCH] pack-objects: reuse data from existing pack.
From: Nicolas Pitre @ 2006-02-16  3:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7vbqx8m62q.fsf@assigned-by-dhcp.cox.net>

On Wed, 15 Feb 2006, Junio C Hamano wrote:

> When generating a new pack, notice if we have already the wanted
> object in existing packs.  If the object has a deltified
> representation, and its base object is also what we are going to
> pack, then reuse the existing deltified representation
> unconditionally, bypassing all the expensive find_deltas() and
> try_deltas() routines.
> 
> Also, when writing out such deltified representation and
> undeltified representation, if a matching data already exists in
> an existing pack, just write it out without uncompressing &
> recompressing.

Great !

> Without this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time git-pack-objects p <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects....................
>     60a88b3979df41e22d1edc3967095e897f720192
> 
>     real    0m32.751s
>     user    0m27.090s
>     sys     0m2.750s
> 
> With this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time ../git.junio/git-pack-objects q <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects.....................
>     60a88b3979df41e22d1edc3967095e897f720192
>     Total 12233, written 12233, reused 12177
> 
>     real    0m4.007s
>     user    0m3.360s
>     sys     0m0.090s
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> 
> ---
> 
>  * This may depend on one cleanup patch I have not sent out, but
>    I am so excited that I could not help sending this out first.
> 
>    Admittedly this is hot off the press, I have not had enough
>    time to beat this too hard, but the resulting pack from the
>    above passed unpack-objects, index-pack and verify-pack.

In fact, the resulting pack should be identical with or without this 
patch, shouldn't it?

FYI: I have a list of patches to produce even smaller (yet still 
compatible) packs, or less dense ones but with much reduced CPU usage.  
All depending on a new --speed argument to git-pack-objects.  I've been 
able to produce 15-20% smaller packs with the same depth and window 
size, but taking twice as much CPU time to produce. Combined with your 
patch, one could repack the object store with the maximum compression 
even if it is expensive CPU wise, but any pull will benefit from it 
afterwards with no additional cost.

I only need to find some time to finally clean and re-test those 
patches...


Nicolas

^ permalink raw reply

* Re: Handling large files with GIT
From: Junio C Hamano @ 2006-02-16  3:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602151915010.916@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Junio, that "traverse_trees()" logic is totally independent of whether we 
> actually do "git-merge-tree" or not, so if you want to, I could split up 
> the patches the other way (and merge "traverse_trees()" first as a new 
> interface, independently).

I won't have time to look at the actual patch tonight but I am
interested.  I think the general idea should work nice with both
multi-base and octopus merge cases as well ;-).

^ permalink raw reply

* Re: Handling large files with GIT
From: Linus Torvalds @ 2006-02-16  3:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Kuivinen, Git Mailing List
In-Reply-To: <7vd5hpj6ab.fsf@assigned-by-dhcp.cox.net>



Btw, here's one last gasp on this thread: it generalizes the notion of 
traversing several trees in sync, which could be used to do the n-way diff 
for the "-c" and "--cc" style merge diffs a lot more efficiently.

I didn't check, but I'm pretty sure that this would bring the cost of 
doing the 12-way diff down to way under a second. Right now:

	[torvalds@g5 linux]$ time git-diff-tree -c 9fdb62a > /dev/null 

	real    0m1.279s
	user    0m1.272s
	sys     0m0.008s

and that's a bit too much. I'd really have expected us to be able to do 
better.

It should be possible to do this as a 

	traverse_trees(12, &trees, "", combined_diff_callback);

fairly cheaply (and quickly throw away anything where any of the parents 
was the same as the result).
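
For anyone who wants to see the lock-step idea in isolation, here is a toy 
sketch that walks <n> sorted name lists instead of real tree objects. It is 
not the code in the patch below: walk_in_sync, name_list, walk_log and 
log_entry are made-up names, and plain strcmp() stands in for the real 
tree-entry comparison.

```c
#include <assert.h>
#include <string.h>

struct name_list {
	const char **names;	/* sorted, NULL-terminated */
};

typedef void (*walk_callback_t)(int n, unsigned long mask,
				const char *name, void *data);

/* Visit every name once, with a bitmask of the lists that contain it. */
static void walk_in_sync(int n, struct name_list *t,
			 walk_callback_t cb, void *data)
{
	assert(n > 0 && (size_t)n <= 8 * sizeof(unsigned long));
	for (;;) {
		const char *first = NULL;
		unsigned long mask = 0;
		int i;

		for (i = 0; i < n; i++) {
			const char *name = *t[i].names;
			int cmp;

			if (!name)
				continue;	/* this list is exhausted */
			cmp = first ? strcmp(name, first) : -1;
			if (cmp > 0)
				continue;	/* sorts later, next round */
			if (cmp < 0) {
				mask = 0;	/* new smallest name */
				first = name;
			}
			mask |= 1ul << i;
		}
		if (!mask)
			break;
		/* advance only the lists that had the current name */
		for (i = 0; i < n; i++)
			if (mask & (1ul << i))
				t[i].names++;
		cb(n, mask, first, data);
	}
}

struct walk_log {
	int count;
	unsigned long masks[16];
	const char *names[16];
};

/* Example callback: remember which lists contained each name. */
static void log_entry(int n, unsigned long mask, const char *name,
		      void *data)
{
	struct walk_log *rec = data;

	(void)n;
	if (rec->count < 16) {
		rec->masks[rec->count] = mask;
		rec->names[rec->count] = name;
		rec->count++;
	}
}
```

A combined-diff style callback would look at the mask and the per-list 
entries and return early whenever any parent already matches the result.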

Junio, that "traverse_trees()" logic is totally independent of whether we 
actually do "git-merge-tree" or not, so if you want to, I could split up 
the patches the other way (and merge "traverse_trees()" first as a new 
interface, independently).

		Linus

----
git-merge-tree: generalize the "traverse <n> trees in sync" functionality

It's actually very useful for other things too. Notably, we could do the
combined diff a lot more efficiently with this.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/merge-tree.c b/merge-tree.c
index 6381118..2a9a013 100644
--- a/merge-tree.c
+++ b/merge-tree.c
@@ -125,44 +125,19 @@ static void unresolved(const char *base,
 		printf("3 %06o %s %s%s\n", n[2].mode, sha1_to_hex(n[2].sha1), base, n[2].path);
 }
 
-/*
- * Merge two trees together (t[1] and t[2]), using a common base (t[0])
- * as the origin.
- *
- * This walks the (sorted) trees in lock-step, checking every possible
- * name. Note that directories automatically sort differently from other
- * files (see "base_name_compare"), so you'll never see file/directory
- * conflicts, because they won't ever compare the same.
- *
- * IOW, if a directory changes to a filename, it will automatically be
- * seen as the directory going away, and the filename being created.
- *
- * Think of this as a three-way diff.
- *
- * The output will be either:
- *  - successful merge
- *	 "0 mode sha1 filename"
- *    NOTE NOTE NOTE! FIXME! We really really need to walk the index
- *    in parallel with this too!
- * 
- *  - conflict:
- *	"1 mode sha1 filename"
- *	"2 mode sha1 filename"
- *	"3 mode sha1 filename"
- *    where not all of the 1/2/3 lines may exist, of course.
- *
- * The successful merge rules are the same as for the three-way merge
- * in git-read-tree.
- */
-static void merge_trees(struct tree_desc t[3], const char *base)
+typedef void (*traverse_callback_t)(int n, unsigned long mask, struct name_entry *entry, const char *base);
+
+static void traverse_trees(int n, struct tree_desc *t, const char *base, traverse_callback_t callback)
 {
+	struct name_entry *entry = xmalloc(n*sizeof(*entry));
+
 	for (;;) {
 		struct name_entry entry[3];
-		unsigned int mask = 0;
+		unsigned long mask = 0;
 		int i, last;
 
 		last = -1;
-		for (i = 0; i < 3; i++) {
+		for (i = 0; i < n; i++) {
 			if (!t[i].size)
 				continue;
 			entry_extract(t+i, entry+i);
@@ -182,7 +157,7 @@ static void merge_trees(struct tree_desc
 				if (cmp < 0)
 					mask = 0;
 			}
-			mask |= 1u << i;
+			mask |= 1ul << i;
 			last = i;
 		}
 		if (!mask)
@@ -192,38 +167,77 @@ static void merge_trees(struct tree_desc
 		 * Update the tree entries we've walked, and clear
 		 * all the unused name-entries.
 		 */
-		for (i = 0; i < 3; i++) {
-			if (mask & (1u << i)) {
+		for (i = 0; i < n; i++) {
+			if (mask & (1ul << i)) {
 				update_tree_entry(t+i);
 				continue;
 			}
 			entry_clear(entry + i);
 		}
+		callback(n, mask, entry, base);
+	}
+	free(entry);
+}
 
-		/* Same in both? */
-		if (same_entry(entry+1, entry+2)) {
-			if (entry[0].sha1) {
-				resolve(base, NULL, entry+1);
-				continue;
-			}
+/*
+ * Merge two trees together (t[1] and t[2]), using a common base (t[0])
+ * as the origin.
+ *
+ * This walks the (sorted) trees in lock-step, checking every possible
+ * name. Note that directories automatically sort differently from other
+ * files (see "base_name_compare"), so you'll never see file/directory
+ * conflicts, because they won't ever compare the same.
+ *
+ * IOW, if a directory changes to a filename, it will automatically be
+ * seen as the directory going away, and the filename being created.
+ *
+ * Think of this as a three-way diff.
+ *
+ * The output will be either:
+ *  - successful merge
+ *	 "0 mode sha1 filename"
+ *    NOTE NOTE NOTE! FIXME! We really really need to walk the index
+ *    in parallel with this too!
+ * 
+ *  - conflict:
+ *	"1 mode sha1 filename"
+ *	"2 mode sha1 filename"
+ *	"3 mode sha1 filename"
+ *    where not all of the 1/2/3 lines may exist, of course.
+ *
+ * The successful merge rules are the same as for the three-way merge
+ * in git-read-tree.
+ */
+static void threeway_callback(int n, unsigned long mask, struct name_entry *entry, const char *base)
+{
+	/* Same in both? */
+	if (same_entry(entry+1, entry+2)) {
+		if (entry[0].sha1) {
+			resolve(base, NULL, entry+1);
+			return;
 		}
+	}
 
-		if (same_entry(entry+0, entry+1)) {
-			if (entry[2].sha1 && !S_ISDIR(entry[2].mode)) {
-				resolve(base, entry+1, entry+2);
-				continue;
-			}
+	if (same_entry(entry+0, entry+1)) {
+		if (entry[2].sha1 && !S_ISDIR(entry[2].mode)) {
+			resolve(base, entry+1, entry+2);
+			return;
 		}
+	}
 
-		if (same_entry(entry+0, entry+2)) {
-			if (entry[1].sha1 && !S_ISDIR(entry[1].mode)) {
-				resolve(base, NULL, entry+1);
-				continue;
-			}
+	if (same_entry(entry+0, entry+2)) {
+		if (entry[1].sha1 && !S_ISDIR(entry[1].mode)) {
+			resolve(base, NULL, entry+1);
+			return;
 		}
-
-		unresolved(base, entry);
 	}
+
+	unresolved(base, entry);
+}
+
+static void merge_trees(struct tree_desc t[3], const char *base)
+{
+	traverse_trees(3, t, base, threeway_callback);
 }
 
 static void *get_tree_descriptor(struct tree_desc *desc, const char *rev)
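
For anyone reading the refactoring above without the tree-walking code at hand, here is a toy sketch of the decision order that threeway_callback() applies to one name. It is only an illustration and not part of the patch: struct toy_entry, toy_same(), toy_is_dir() and threeway_pick() are made-up names, the object name is a plain string instead of a SHA-1, and toy_same() assumes that "same" means both entries exist and agree in both object name and mode.

```c
#include <assert.h>
#include <string.h>

#define TOY_FILE 0100644u
#define TOY_DIR  0040000u

/* Hypothetical stand-in for a name entry; id == NULL means "absent". */
struct toy_entry {
	const char *id;     /* stand-in for the object name */
	unsigned int mode;  /* file mode bits */
};

/* Assumption: entries are "same" only if both exist and id and mode match. */
static int toy_same(const struct toy_entry *a, const struct toy_entry *b)
{
	return a->id && b->id && !strcmp(a->id, b->id) && a->mode == b->mode;
}

static int toy_is_dir(const struct toy_entry *e)
{
	return (e->mode & 0170000u) == TOY_DIR;
}

/*
 * e[0] is the base, e[1] and e[2] are the two sides.
 * Returns the index (1 or 2) of the entry that wins, or -1 if the
 * name is left unresolved.  The checks run in the same order as in
 * the callback above.
 */
static int threeway_pick(const struct toy_entry e[3])
{
	assert(e != NULL);

	/* Both sides ended up identical and the name existed in the base. */
	if (toy_same(&e[1], &e[2]) && e[0].id)
		return 1;

	/* Side 1 did not touch it; side 2 has a non-directory entry. */
	if (toy_same(&e[0], &e[1]) && e[2].id && !toy_is_dir(&e[2]))
		return 2;

	/* Side 2 did not touch it; side 1 has a non-directory entry. */
	if (toy_same(&e[0], &e[2]) && e[1].id && !toy_is_dir(&e[1]))
		return 1;

	return -1;
}
```

Note that in this sketch a name removed on one side and unchanged on the other falls through to "unresolved", because the winning side must still have an entry; that mirrors the entry[N].sha1 checks in the callback.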

^ permalink raw reply related

* Re: git-rev-list --date-order ?
From: Junio C Hamano @ 2006-02-16  3:11 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17395.58926.26670.23572@cargo.ozlabs.ibm.com>

Paul Mackerras <paulus@samba.org> writes:

> Junio,
>
> Gitk has a -d option that tells it to reorder the commits in
> decreasing order of their commit time, subject to the constraint that
> parents come after all of their children.  Currently it uses
> git-rev-list --header --topo-order --parents and then reorders the
> commits internally.
>
> How hard would it be to add a --date-order flag to git-rev-list to
> make it order the commits in decreasing commit time order, subject to
> the constraint that parents come after their children?
>
> If we had that then I could remove another chunk of code from gitk and
> make it a bit faster.

It's been a while since I read the topo-order code, but I suspect
something like this would do it.  I may be completely off the mark here.

--
diff --git a/commit.c b/commit.c
index 67e11d7..0d94e4d 100644
--- a/commit.c
+++ b/commit.c
@@ -630,7 +630,9 @@ void sort_in_topological_order(struct co
 		}
 		next=next->next;
 	}
+
 	/* process the list in topological order */
+	sort_by_date(&work);
 	while (work) {
 		struct commit * work_item = pop_commit(&work);
 		struct sort_node * work_node = (struct sort_node *)work_item->object.util;
@@ -648,7 +650,7 @@ void sort_in_topological_order(struct co
                                  */
 				pn->indegree--;
 				if (!pn->indegree) 
-					commit_list_insert(parent, &work);
+					insert_by_date(parent, &work);
 			}
 			parents=parents->next;
 		}
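
To spell out the idea behind the patch above outside of commit.c, here is an untested-against-git, standalone sketch of "newest first, but never a parent before all of its children". It is not git code: struct toy_commit, toy_insert_by_date() and toy_date_order() are invented for illustration, commits are array indices rather than objects, and the work list is a small sorted array instead of a commit_list.

```c
#include <assert.h>

#define MAX_PARENTS 2

/* Hypothetical, simplified commit record; this is not git's struct commit. */
struct toy_commit {
	unsigned long date;        /* commit time; larger means newer */
	int parents[MAX_PARENTS];  /* indices into the commit array, -1 if unused */
	int indegree;              /* number of children not yet emitted */
};

/* Insert commit index c into work[0..*n), kept sorted oldest first. */
static void toy_insert_by_date(const struct toy_commit *all,
			       int *work, int *n, int c)
{
	int i = *n;

	while (i > 0 && all[work[i - 1]].date > all[c].date) {
		work[i] = work[i - 1];
		i--;
	}
	work[i] = c;
	(*n)++;
}

/*
 * Fill out[] with the indices of all[0..count) in decreasing date
 * order, subject to every commit coming before its parents.
 * work[] is scratch space with room for count entries.
 * Returns the number of commits written to out[].
 */
static int toy_date_order(struct toy_commit *all, int count,
			  int *work, int *out)
{
	int i, p, nwork = 0, nout = 0;

	/* Count, for each commit, how many children point at it. */
	for (i = 0; i < count; i++)
		all[i].indegree = 0;
	for (i = 0; i < count; i++) {
		for (p = 0; p < MAX_PARENTS; p++) {
			int parent = all[i].parents[p];

			if (parent < 0)
				continue;
			assert(parent < count);
			all[parent].indegree++;
		}
	}

	/* Start from the tips: commits that have no children. */
	for (i = 0; i < count; i++)
		if (!all[i].indegree)
			toy_insert_by_date(all, work, &nwork, i);

	while (nwork > 0) {
		int c = work[--nwork];  /* newest commit that is ready */

		out[nout++] = c;
		for (p = 0; p < MAX_PARENTS; p++) {
			int parent = all[c].parents[p];

			/* A parent becomes ready once all its children are out. */
			if (parent >= 0 && !--all[parent].indegree)
				toy_insert_by_date(all, work, &nwork, parent);
		}
	}
	return nout;
}
```

As a small check of the constraint: take a merge (index 0, date 50) of two commits (index 1, date 20 and index 2, date 40) that share a parent (index 3, date 30, i.e. a skewed clock). A plain sort by date would give 0, 2, 3, 1; the sketch gives 0, 2, 1, 3, because commit 3 has to wait for its child 1.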

^ permalink raw reply related

* git-rev-list --date-order ?
From: Paul Mackerras @ 2006-02-16  2:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio,

Gitk has a -d option that tells it to reorder the commits in
decreasing order of their commit time, subject to the constraint that
parents come after all of their children.  Currently it uses
git-rev-list --header --topo-order --parents and then reorders the
commits internally.

How hard would it be to add a --date-order flag to git-rev-list to
make it order the commits in decreasing commit time order, subject to
the constraint that parents come after their children?

If we had that then I could remove another chunk of code from gitk and
make it a bit faster.

Thanks,
Paul.

^ permalink raw reply

