Git development
 help / color / mirror / Atom feed
* Re: [PATCH 3/3] Teach git-index-pack how to keep a pack file.
From: Jakub Narebski @ 2006-10-29 10:49 UTC (permalink / raw)
  To: git
In-Reply-To: <20061029094159.GE3847@spearce.org>

Shawn Pearce wrote:

> +--keep::
> +       Before moving the index into its final destination
> +       create an empty .keep file for the associated pack file.
> +       This option is usually necessary with --stdin to prevent a
> +       simultaneous gitlink:git-repack[1] process from deleting
> +       the newly constructed pack and index before refs can be
> +       updated to use objects contained in the pack.
> +
> +--keep='why'::
> +       Like --keep create a .keep file before moving the index into
> +       its final destination, but rather than creating an empty file
> +       place 'why' followed by an LF into the .keep file.  The 'why'
> +       message can later be searched for within all .keep files to
> +       locate any which have outlived their usefulness.

Wouldn't it be better to use 'git-index-pack', perhaps with source URL if
possible, as the default 'why'?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply

* [PATCH/RFC] gitweb: New improved patchset view
From: Jakub Narebski @ 2006-10-29 10:22 UTC (permalink / raw)
  To: git

You should be able to check new commitdiff view at
  http://roke_DOT_dyndns_DOT_info/cgi-bin/gitweb/gitweb.cgi
(URL mangling courtesy vger banned words list, aaaarrghhh)
if I didn't screwed something again with firewall, and when my machine
is up; should be for at least an hour. Check for example:
  ?p=git.git;a=commitdiff;h=origin
  ?p=git.git;a=commitdiff;h=161332a521fe10c41979bcd493d95e2ac562b7f
  ?p=git.git;a=commitdiff;h=e12c095aa69d8aca0326eb11960427d9bf9e2db7
  ?p=git.git;a=commitdiff;h=82560983997c961d9deafe0074b787c8484c2e1d
and compare to (for example)
  http://repo.or.cz/w/git.git?a=commitdiff;h=82560983997c961d9deafe0074b787c8484c2e1d
or (even older gitweb)
  http://www.kernel.org/git/?p=git/git.git;a=commitdiff;h=887a612fef942dd3e7dae452e2dc582738b0fb41
BTW. this gitweb has also my previous "Slight improvement" patch applied.

Do you like it? What should be changed (code, output, style)?
-- >8 --
Replace "gitweb diff header" with its full sha1 of blobs with
"git diff" header and extended diff header. Change also
highlighting of diffs.

Changes:
* "gitweb diff header" which looked for example like below:
    file:_<sha1 before>_ -> file:_<sha1 after>_
  where 'file' is file type and '<sha1>' is full sha1 of blob, is link
  and uses default link style is changed to
    diff --git a/<file before> b/<file after>
  where <file> is hidden link (i.e. underline on hover, only)
  to appropriate version of file. If file is added, a/<file> is not
  hyperlinked, if file is deleted, b/<file> is not hyperlinked.
* there is added "extended diff header", with <path> and <hash>
  hyperlinked (and <hash> shortened to 7 characters), and <mode>
  explained: '<mode>' is extnded to '<mode>/<symbolic mode> (<file type>)'.
* <file> hyperlinking should work also when <file> is originally
  quoted. For now we present filename quoted. This needed changes to
  parse_difftree_raw_line subroutine.
* from-file/to-file two-line header lines have slightly darker color
  than removed/added lines.
* chunk header has now delicate line above for easier finding chunk
  boundary, and top margin of 1px.

Controversial ideas:
* All links in patch header are hidden
* Hashes are shortened to 7 characters
* Filenames are presented quoted
* Marking of chunk beginning
* No hyperlink for renamed from/to header (bug)

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
 gitweb/gitweb.css  |   46 ++++++++++++---
 gitweb/gitweb.perl |  159 ++++++++++++++++++++++++++++++---------------------
 2 files changed, 131 insertions(+), 74 deletions(-)

diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
index 83d900d..3aeceed 100644
--- a/gitweb/gitweb.css
+++ b/gitweb/gitweb.css
@@ -303,6 +303,33 @@ td.mode {
 	font-family: monospace;
 }
 
+
+div.diff.header,
+div.diff.extended_header {
+	white-space: normal;
+}
+
+div.diff.header {
+	font-weight: bold;
+
+	background-color: #edece6;
+
+	margin-top: 4px;
+	padding: 4px 0px 2px 0px;
+	border: solid #d9d8d1;
+	border-width: 1px 0px 1px 0px;
+}
+
+div.diff.extended_header,
+div.diff.extended_header a.list {
+	color: #777777;
+}
+
+div.diff.extended_header {
+	background-color: #f6f5ee;
+	padding: 2px 0px 2px 0px;
+}
+
 div.diff a.list {
 	text-decoration: none;
 }
@@ -312,31 +339,34 @@ div.diff a.list:hover {
 }
 
 div.diff.to_file a.list,
-div.diff.to_file,
+div.diff.to_file {
+	color: #007000;
+}
+
 div.diff.add {
 	color: #008800;
 }
 
 div.diff.from_file a.list,
-div.diff.from_file,
+div.diff.from_file {
+	color: #aa0000;
+}
+
 div.diff.rem {
 	color: #cc0000;
 }
 
 div.diff.chunk_header {
 	color: #990099;
+	border: dotted #ffbbff;
+	border-width: 1px 0px 0px 0px;
+	margin-top: 1px;
 }
 
 div.diff.incomplete {
 	color: #cccccc;
 }
 
-div.diff_info {
-	font-family: monospace;
-	color: #000099;
-	background-color: #edece6;
-	font-style: italic;
-}
 
 div.index_include {
 	border: solid #d9d8d1;
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index cbab3c9..2d971ac 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1255,9 +1255,12 @@ sub parse_difftree_raw_line {
 		$res{'status'} = $5;
 		$res{'similarity'} = $6;
 		if ($res{'status'} eq 'R' || $res{'status'} eq 'C') { # renamed or copied
-			($res{'from_file'}, $res{'to_file'}) = map { unquote($_) } split("\t", $7);
+			($res{'from_file_raw'}, $res{'to_file_raw'}) = split("\t", $7);
+			$res{'from_file'} = unquote($res{'from_file_raw'});
+			$res{'to_file'}   = unquote($res{'to_file_raw'});
 		} else {
-			$res{'file'} = unquote($7);
+			$res{'file_raw'} = $7;
+			$res{'file'} = unquote($res{'file_raw'});
 		}
 	}
 	# 'c512b523472485aef4fff9e57b229d9d243c967f'
@@ -2024,6 +2027,7 @@ sub git_patchset_body {
 	my $in_header = 0;
 	my $patch_found = 0;
 	my $diffinfo;
+	my (@from_subst, @to_subst);
 
 	print "<div class=\"patchset\">\n";
 
@@ -2033,6 +2037,7 @@ sub git_patchset_body {
 
 		if ($patch_line =~ m/^diff /) { # "git diff" header
 			# beginning of patch (in patchset)
+
 			if ($patch_found) {
 				# close previous patch
 				print "</div>\n"; # class="patch"
@@ -2042,11 +2047,59 @@ sub git_patchset_body {
 			}
 			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
 
+			# read and prepare patch information
 			if (ref($difftree->[$patch_idx]) eq "HASH") {
+				# pre-parsed (or generated by hand)
 				$diffinfo = $difftree->[$patch_idx];
 			} else {
 				$diffinfo = parse_difftree_raw_line($difftree->[$patch_idx]);
 			}
+			if ($diffinfo->{'status'} ne "A") { # not new (added) file
+				my $quot = '';
+				my $from_text;
+				my $file_raw = $diffinfo->{'from_file_raw'} || $diffinfo->{'file_raw'};
+				if ($file_raw =~ s/^"(.*)"$/\1/) {
+					$from_text = qq("a/$file_raw");
+					$quot = '"';
+				} else {
+					$from_text =  qq(a/$file_raw);
+				}
+				my $file = $diffinfo->{'from_file'} || $diffinfo->{'file'};
+				my $from_link =
+					$cgi->a({-href => href(action=>"blob", hash_base=>$hash_base,
+					                      hash=>$diffinfo->{'from_id'}, file_name=>$file),
+					        -class => "list"}, esc_html($file_raw));
+				my $hash_link =
+					$cgi->a({-href => href(action=>"blob", hash_base=>$hash_base,
+					                      hash=>$diffinfo->{'to_id'}, file_name=>$file),
+					        -class => "list"}, substr($diffinfo->{'from_id'},0,7));
+				@from_subst = 
+					($from_text, "${quot}a/$from_link${quot}",
+					$diffinfo->{'from_id'} . '\.\.', "$hash_link..");
+			}
+			if ($diffinfo->{'status'} ne "D") { # not deleted file
+				my $quot = '';
+				my $to_text;
+				my $file_raw = $diffinfo->{'to_file_raw'} || $diffinfo->{'file_raw'};
+				if ($file_raw =~ s/^"(.*)"$/\1/) {
+					$to_text = qq("b/$file_raw");
+					$quot = '"';
+				} else {
+					$to_text =  qq(b/$file_raw);
+				}
+				my $file = $diffinfo->{'to_file'} || $diffinfo->{'file'};
+				my $to_link =
+					$cgi->a({-href => href(action=>"blob", hash_base=>$hash_base,
+					                      hash=>$diffinfo->{'to_id'}, file_name=>$file),
+					        -class => "list"}, esc_html($file_raw));
+				my $hash_link =
+					$cgi->a({-href => href(action=>"blob", hash_base=>$hash_base,
+					                      hash=>$diffinfo->{'to_id'}, file_name=>$file),
+					        -class => "list"}, substr($diffinfo->{'to_id'},0,7));
+				@to_subst =
+					($to_text, "${quot}b/$to_link${quot}",
+					 '\.\.' . $diffinfo->{'to_id'}, "..$hash_link");
+			}
 			$patch_idx++;
 
 			# for now we skip empty patches
@@ -2056,82 +2109,56 @@ sub git_patchset_body {
 				next LINE;
 			}
 
-			if ($diffinfo->{'status'} eq "A") { # added
-				print "<div class=\"diff_info\">" . file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'}) . " (new)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "D") { # deleted
-				print "<div class=\"diff_info\">" . file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) . " (deleted)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "R" || # renamed
-			         $diffinfo->{'status'} eq "C" || # copied
-			         $diffinfo->{'status'} eq "2") { # with two filenames (from git_blobdiff)
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'from_file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'to_file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
-
-			} else { # modified, mode changed, ...
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
-			}
+			# print "git diff" header
+			$patch_line =~ s/$from_subst[0]/$from_subst[1]/ if @from_subst;
+			$patch_line =~ s/$to_subst[0]/$to_subst[1]/ if @to_subst;
+			print "<div class=\"diff header\">$patch_line</div>\n";
 
-			#print "<div class=\"diff extended_header\">\n";
+			print "<div class=\"diff extended_header\">\n";
 			$in_header = 1;
 			next LINE;
 		} # start of patch in patchset
 
+		if ($in_header) {
+			if ($patch_line !~ m/^---/) {
+				# match <path>
+				if ($patch_line =~ m|a/|) {
+					$patch_line =~ s/$from_subst[0]/$from_subst[1]/ if @from_subst;
+				}
+				if ($patch_line =~ m|b/|) {
+					$patch_line =~ s/$to_subst[0]/$to_subst[1]/ if @to_subst;
+				}
+				# match <mode>
+				if ($patch_line =~ m/\s(\d{6})$/) {
+					$patch_line .= '/' . mode_str($1) . ' (' . file_type($1) . ')';
+				}
+				# match <hash>
+				if ($patch_line =~ m/^index/) {
+					$patch_line =~ s/0{40}/'0' x 7/e;
+					$patch_line =~ s/$from_subst[2]/$from_subst[3]/ if @from_subst;
+					$patch_line =~ s/$to_subst[2]/$to_subst[3]/ if @to_subst;
+				}
+				print $patch_line . "<br/>\n";
 
-		if ($in_header && $patch_line =~ m/^---/) {
-			#print "</div>\n"; # class="diff extended_header"
-			$in_header = 0;
+			} else {
+				#$patch_line =~ m/^---/;
+				print "</div>\n"; # class="diff extended_header"
+				$in_header = 0;
+
+				$patch_line =~ s/$from_subst[0]/$from_subst[1]/ if @from_subst;
+				print "<div class=\"diff from_file\">$patch_line</div>\n";
 
-			my $file = $diffinfo->{'from_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-			                               hash=>$diffinfo->{'from_id'}, file_name=>$file),
-			                -class => "list"}, esc_html($file));
-			$patch_line =~ s|a/.*$|a/$file|g;
-			print "<div class=\"diff from_file\">$patch_line</div>\n";
+				$patch_line = <$fd>;
+				chomp $patch_line;
 
-			$patch_line = <$fd>;
-			chomp $patch_line;
+				#$patch_line =~ m/^+++/;
+				$patch_line =~ s/$to_subst[0]/$to_subst[1]/ if @to_subst;
+				print "<div class=\"diff to_file\">$patch_line</div>\n";
 
-			#$patch_line =~ m/^+++/;
-			$file    = $diffinfo->{'to_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-			                               hash=>$diffinfo->{'to_id'}, file_name=>$file),
-			                -class => "list"}, esc_html($file));
-			$patch_line =~ s|b/.*|b/$file|g;
-			print "<div class=\"diff to_file\">$patch_line</div>\n";
+			}
 
 			next LINE;
 		}
-		next LINE if $in_header;
 
 		print format_diff_line($patch_line);
 	}
-- 
1.4.3.3



-------------------------------------------------------

-- 
Jakub Narebski

^ permalink raw reply related

* Re: [PATCH 1/3] Allow short pack names to git-pack-objects --unpacked=.
From: Shawn Pearce @ 2006-10-29  9:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20061029093711.GC3847@spearce.org>

Ok, I'm a freaking idiot:

> Subject: [PATCH 1/3] Allow short pack names to git-pack-objects --unpacked=.
> 
> Allow short pack names to git-pack-objects --unpacked=.

I meant to delete the first two lines of the body of the message
before sending so that git-am didn't stutter the first line of the
commit message when applied, yet I managed to not do that.  *sigh*

-- 

^ permalink raw reply

* [PATCH 3/3] Teach git-index-pack how to keep a pack file.
From: Shawn Pearce @ 2006-10-29  9:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

To prevent a race condition between `index-pack --stdin` and
`repack -a -d` where the repack deletes the newly created pack
file before any refs are updated to reference objects contained
within it we mark the pack file as one that should be kept.  This
removes it from the list of packs that `repack -a -d` will consider
for removal.

Callers such as `receive-pack` which want to invoke `index-pack`
should use this new --keep option to prevent the newly created pack
and index file pair from being deleted before they have finished any
related ref updates.  Only after all ref updates have been finished
should the associated .keep file be removed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 I'm still working on the change to receive-pack.  Junio's latest
 patch (send-pack --keep) is a different way of doing at least some
 of what I'm still working on there.

 I'm throwing a pair of pipes around index-pack so I can capture
 the pack name thus allowing receive-pack to delete the correct
 .keep file after its completed the ref updates.  At that point
 receive-pack can easily examine the object count in the pack header
 and compare that against a config entry to decide if it should be
 keeping the pack or exploding it.


 Documentation/git-index-pack.txt |   24 ++++++++++++++++++--
 index-pack.c                     |   43 +++++++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt
index 9fa4847..1235416 100644
--- a/Documentation/git-index-pack.txt
+++ b/Documentation/git-index-pack.txt
@@ -9,7 +9,7 @@ git-index-pack - Build pack index file f
 SYNOPSIS
 --------
 'git-index-pack' [-v] [-o <index-file>] <pack-file>
-'git-index-pack' --stdin [--fix-thin] [-v] [-o <index-file>] [<pack-file>]
+'git-index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>] [<pack-file>]
 
 
 DESCRIPTION
@@ -38,7 +38,10 @@ OPTIONS
 	instead and a copy is then written to <pack-file>. If
 	<pack-file> is not specified, the pack is written to
 	objects/pack/ directory of the current git repository with
-	a default name determined from the pack content.
+	a default name determined from the pack content.  If
+	<pack-file> is not specified consider using --keep to
+	prevent a race condition between this process and
+	gitlink::git-repack[1] .
 
 --fix-thin::
 	It is possible for gitlink:git-pack-objects[1] to build
@@ -48,7 +51,22 @@ OPTIONS
 	and they must be included in the pack for that pack to be self
 	contained and indexable. Without this option any attempt to
 	index a thin pack will fail. This option only makes sense in
-	conjonction with --stdin.
+	conjunction with --stdin.
+
+--keep::
+	Before moving the index into its final destination
+	create an empty .keep file for the associated pack file.
+	This option is usually necessary with --stdin to prevent a
+	simultaneous gitlink:git-repack[1] process from deleting
+	the newly constructed pack and index before refs can be
+	updated to use objects contained in the pack.
+
+--keep='why'::
+	Like --keep create a .keep file before moving the index into
+	its final destination, but rather than creating an empty file
+	place 'why' followed by an LF into the .keep file.  The 'why'
+	message can later be searched for within all .keep files to
+	locate any which have outlived their usefulness.
 
 
 Author
diff --git a/index-pack.c b/index-pack.c
index 866a054..b37dd78 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -10,7 +10,7 @@
 #include <signal.h>
 
 static const char index_pack_usage[] =
-"git-index-pack [-v] [-o <index-file>] { <pack-file> | --stdin [--fix-thin] [<pack-file>] }";
+"git-index-pack [-v] [-o <index-file>] [{ ---keep | --keep=<msg> }] { <pack-file> | --stdin [--fix-thin] [<pack-file>] }";
 
 struct object_entry
 {
@@ -754,6 +754,7 @@ static const char *write_index_file(cons
 
 static void final(const char *final_pack_name, const char *curr_pack_name,
 		  const char *final_index_name, const char *curr_index_name,
+		  const char *keep_name, const char *keep_msg,
 		  unsigned char *sha1)
 {
 	char name[PATH_MAX];
@@ -780,6 +781,23 @@ static void final(const char *final_pack
 		}
 	}
 
+	if (keep_msg) {
+		int keep_fd, keep_msg_len = strlen(keep_msg);
+		if (!keep_name) {
+			snprintf(name, sizeof(name), "%s/pack/pack-%s.keep",
+				 get_object_directory(), sha1_to_hex(sha1));
+			keep_name = name;
+		}
+		keep_fd = open(keep_name, O_RDWR | O_CREAT, 0600);
+		if (keep_fd < 0)
+			die("cannot write keep file");
+		if (keep_msg_len > 0) {
+			write_or_die(keep_fd, keep_msg, keep_msg_len);
+			write_or_die(keep_fd, "\n", 1);
+		}
+		close(keep_fd);
+	}
+
 	if (final_pack_name != curr_pack_name) {
 		if (!final_pack_name) {
 			snprintf(name, sizeof(name), "%s/pack/pack-%s.pack",
@@ -807,7 +825,8 @@ int main(int argc, char **argv)
 	int i, fix_thin_pack = 0;
 	const char *curr_pack, *pack_name = NULL;
 	const char *curr_index, *index_name = NULL;
-	char *index_name_buf = NULL;
+	const char *keep_name = NULL, *keep_msg = NULL;
+	char *index_name_buf = NULL, *keep_name_buf = NULL;
 	unsigned char sha1[20];
 
 	for (i = 1; i < argc; i++) {
@@ -818,6 +837,10 @@ int main(int argc, char **argv)
 				from_stdin = 1;
 			} else if (!strcmp(arg, "--fix-thin")) {
 				fix_thin_pack = 1;
+			} else if (!strcmp(arg, "--keep")) {
+				keep_msg = "";
+			} else if (!strncmp(arg, "--keep=", 7)) {
+				keep_msg = arg + 7;
 			} else if (!strcmp(arg, "-v")) {
 				verbose = 1;
 			} else if (!strcmp(arg, "-o")) {
@@ -848,6 +871,16 @@ int main(int argc, char **argv)
 		strcpy(index_name_buf + len - 5, ".idx");
 		index_name = index_name_buf;
 	}
+	if (keep_msg && !keep_name && pack_name) {
+		int len = strlen(pack_name);
+		if (!has_extension(pack_name, ".pack"))
+			die("packfile name '%s' does not end with '.pack'",
+			    pack_name);
+		keep_name_buf = xmalloc(len);
+		memcpy(keep_name_buf, pack_name, len - 5);
+		strcpy(keep_name_buf + len - 5, ".keep");
+		keep_name = keep_name_buf;
+	}
 
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
@@ -880,9 +913,13 @@ int main(int argc, char **argv)
 	}
 	free(deltas);
 	curr_index = write_index_file(index_name, sha1);
-	final(pack_name, curr_pack, index_name, curr_index, sha1);
+	final(pack_name, curr_pack,
+		index_name, curr_index,
+		keep_name, keep_msg,
+		sha1);
 	free(objects);
 	free(index_name_buf);
+	free(keep_name_buf);
 
 	if (!from_stdin)
 		printf("%s\n", sha1_to_hex(sha1));
-- 
1.4.3.3.g7d63

^ permalink raw reply related

* [PATCH 2/3] Only repack active packs by skipping over kept packs.
From: Shawn Pearce @ 2006-10-29  9:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

During `git repack -a -d` only repack objects which are loose or
which reside in an active (a non-kept) pack.  This allows the user
to keep large packs as-is without continuous repacking and can be
very helpful on large repositories.  It should also help us resolve
a race condition between `git repack -a -d` and the new pack store
functionality in `git-receive-pack`.

Kept packs are those which have a corresponding .keep file in
$GIT_OBJECT_DIRECTORY/pack.  That is pack-X.pack will be kept
(not repacked and not deleted) if pack-X.keep exists in the same
directory when `git repack -a -d` starts.

Currently this feature is not documented and there is no user
interface to keep an existing pack.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 git-repack.sh |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index 17e2452..f150a55 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -45,11 +45,19 @@ case ",$all_into_one," in
 	args='--unpacked --incremental'
 	;;
 ,t,)
-	args=
-
-	# Redundancy check in all-into-one case is trivial.
-	existing=`test -d "$PACKDIR" && cd "$PACKDIR" && \
-	    find . -type f \( -name '*.pack' -o -name '*.idx' \) -print`
+	if [ -d "$PACKDIR" ]; then
+		for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
+			| sed -e 's/^\.\///' -e 's/\.pack$//'`
+		do
+			if [ -e "$PACKDIR/$e.keep" ]; then
+				: keep
+			else
+				args="$args --unpacked=$e.pack"
+				existing="$existing $e"
+			fi
+		done
+	fi
+	[ -z "$args" ] && args='--unpacked --incremental'
 	;;
 esac
 
@@ -86,17 +94,16 @@ fi
 
 if test "$remove_redundant" = t
 then
-	# We know $existing are all redundant only when
-	# all-into-one is used.
-	if test "$all_into_one" != '' && test "$existing" != ''
+	# We know $existing are all redundant.
+	if [ -n "$existing" ]
 	then
 		sync
 		( cd "$PACKDIR" &&
 		  for e in $existing
 		  do
 			case "$e" in
-			./pack-$name.pack | ./pack-$name.idx) ;;
-			*)	rm -f $e ;;
+			pack-$name) ;;
+			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
 			esac
 		  done
 		)
-- 
1.4.3.3.g7d63

^ permalink raw reply related

* [PATCH 1/3] Allow short pack names to git-pack-objects --unpacked=.
From: Shawn Pearce @ 2006-10-29  9:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Allow short pack names to git-pack-objects --unpacked=.

This allows us to pass just the file name of a pack rather than
the complete path when we want pack-objects to consider its
contents as though they were loose objects.  This can be helpful
if $GIT_OBJECT_DIRECTORY contains shell metacharacters which make
it cumbersome to pass complete paths safely in a shell script.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 sha1_file.c |   20 +++++++++++++++++++-
 1 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index e89d24c..5e6c8b8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1203,6 +1203,24 @@ unsigned long find_pack_entry_one(const
 	return 0;
 }
 
+static int matches_pack_name(struct packed_git *p, const char *ig)
+{
+	const char *last_c, *c;
+
+	if (!strcmp(p->pack_name, ig))
+		return 0;
+
+	for (c = p->pack_name, last_c = c; *c;)
+		if (*c == '/')
+			last_c = ++c;
+		else
+			++c;
+	if (!strcmp(last_c, ig))
+		return 0;
+
+	return 1;
+}
+
 static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e, const char **ignore_packed)
 {
 	struct packed_git *p;
@@ -1214,7 +1232,7 @@ static int find_pack_entry(const unsigne
 		if (ignore_packed) {
 			const char **ig;
 			for (ig = ignore_packed; *ig; ig++)
-				if (!strcmp(p->pack_name, *ig))
+				if (!matches_pack_name(p, *ig))
 					break;
 			if (*ig)
 				continue;
-- 
1.4.3.3.g7d63

^ permalink raw reply related

* Re: [PATCH] send-pack --keep: do not explode into loose objects on the receiving end.
From: Junio C Hamano @ 2006-10-29  8:05 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git
In-Reply-To: <20061029075638.GB3847@spearce.org>

Shawn Pearce <spearce@spearce.org> writes:

> I was thinking of just reading the pack header in receive-pack,
> checking the object count, and if its over a configured threshold
> call index-pack rather than unpack-objects.  Unfortunately I just
> realized that if we read the pack header to make that decision then
> its gone and the child process won't have it.  :-(

If you want to do that, that is certainly possible.

You can read the first block in the parent (without discarding),
make the decision and then fork()+exec() either unpack-objects
or index-pack and feed it from the parent.  The parent first
feeds the initial block it read to make that decision, and then
becomes a cat that reads from send-pack and writes to the child
process that is either unpack-objects or index-pack.

^ permalink raw reply

* Re: [PATCH] send-pack --keep: do not explode into loose objects on the receiving end.
From: Shawn Pearce @ 2006-10-29  7:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <7vwt6j4l77.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> This adds "keep-pack" extension to send-pack vs receive pack protocol,
> and makes the receiver invoke "index-pack --stdin --fix-thin".

I'm torn on this.  I see that keeping a pack vs. exploding to loose
objects is a local repository decision and thus should be determined
by the receiving repository, not the sending one.

I was thinking of just reading the pack header in receive-pack,
checking the object count, and if its over a configured threshold
call index-pack rather than unpack-objects.  Unfortunately I just
realized that if we read the pack header to make that decision then
its gone and the child process won't have it.  :-(

-- 

^ permalink raw reply

* [PATCH] send-pack --keep: do not explode into loose objects on the receiving end.
From: Junio C Hamano @ 2006-10-29  7:47 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0610252333540.12418@xanadu.home>

This adds "keep-pack" extension to send-pack vs receive pack protocol,
and makes the receiver invoke "index-pack --stdin --fix-thin".

With this, you can ask send-pack not to explode the result into
loose objects on the receiving end.

I've patched has_sha1_file() to re-check for added packs just
like is done in read_sha1_file() for now, but I think the static
"re-prepare" interface for packs was a mistake.  Creation of a
new pack inside a process that needs to read objects in them
back ought to be a rare event, so we are better off making the
callers (such as receive-pack that calls "index-pack --stdin
--fix-thin") explicitly call re-prepare.  That way we do not
have to penalize ordinary users of read_sha1_file() and
has_sha1_file().

We would need to fix this someday.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 receive-pack.c       |   19 ++++++++++++++++---
 send-pack.c          |   23 ++++++++++++++++++-----
 sha1_file.c          |    7 +++++--
 t/t5400-send-pack.sh |    9 +++++++++
 4 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/receive-pack.c b/receive-pack.c
index ea2dbd4..ef50226 100644
--- a/receive-pack.c
+++ b/receive-pack.c
@@ -8,10 +8,14 @@
 static const char receive_pack_usage[] = "git-receive-pack <git-dir>";
 
 static const char *unpacker[] = { "unpack-objects", NULL };
+static const char *keep_packer[] = {
+	"index-pack", "--stdin", "--fix-thin", NULL
+};
 
 static int report_status;
+static int keep_pack;
 
-static char capabilities[] = "report-status";
+static char capabilities[] = "report-status keep-pack";
 static int capabilities_sent;
 
 static int show_ref(const char *path, const unsigned char *sha1)
@@ -261,6 +265,8 @@ static void read_head_info(void)
 		if (reflen + 82 < len) {
 			if (strstr(refname + reflen + 1, "report-status"))
 				report_status = 1;
+			if (strstr(refname + reflen + 1, "keep-pack"))
+				keep_pack = 1;
 		}
 		cmd = xmalloc(sizeof(struct command) + len - 80);
 		hashcpy(cmd->old_sha1, old_sha1);
@@ -275,7 +281,14 @@ static void read_head_info(void)
 
 static const char *unpack(int *error_code)
 {
-	int code = run_command_v_opt(1, unpacker, RUN_GIT_CMD);
+	int code;
+
+	if (keep_pack)
+		code = run_command_v_opt(ARRAY_SIZE(keep_packer) - 1,
+					 keep_packer, RUN_GIT_CMD);
+	else
+		code = run_command_v_opt(ARRAY_SIZE(unpacker) - 1,
+					 unpacker, RUN_GIT_CMD);
 
 	*error_code = 0;
 	switch (code) {
@@ -335,7 +348,7 @@ int main(int argc, char **argv)
 	if (!dir)
 		usage(receive_pack_usage);
 
-	if(!enter_repo(dir, 0))
+	if (!enter_repo(dir, 0))
 		die("'%s': unable to chdir or not a git archive", dir);
 
 	write_head_info();
diff --git a/send-pack.c b/send-pack.c
index 5bb123a..54d218c 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -6,13 +6,14 @@
 #include "exec_cmd.h"
 
 static const char send_pack_usage[] =
-"git-send-pack [--all] [--exec=git-receive-pack] <remote> [<head>...]\n"
+"git-send-pack [--all] [--keep] [--exec=git-receive-pack] <remote> [<head>...]\n"
 "  --all and explicit <head> specification are mutually exclusive.";
 static const char *exec = "git-receive-pack";
 static int verbose;
 static int send_all;
 static int force_update;
 static int use_thin_pack;
+static int keep_pack;
 
 static int is_zero_sha1(const unsigned char *sha1)
 {
@@ -270,6 +271,7 @@ static int send_pack(int in, int out, in
 	int new_refs;
 	int ret = 0;
 	int ask_for_status_report = 0;
+	int ask_to_keep_pack = 0;
 	int expect_status_report = 0;
 
 	/* No funny business with the matcher */
@@ -279,6 +281,8 @@ static int send_pack(int in, int out, in
 	/* Does the other end support the reporting? */
 	if (server_supports("report-status"))
 		ask_for_status_report = 1;
+	if (server_supports("keep-pack") && keep_pack)
+		ask_to_keep_pack = 1;
 
 	/* match them up */
 	if (!remote_tail)
@@ -355,12 +359,17 @@ static int send_pack(int in, int out, in
 		strcpy(old_hex, sha1_to_hex(ref->old_sha1));
 		new_hex = sha1_to_hex(ref->new_sha1);
 
-		if (ask_for_status_report) {
-			packet_write(out, "%s %s %s%c%s",
+		if (ask_for_status_report || ask_to_keep_pack) {
+			packet_write(out, "%s %s %s%c%s%s",
 				     old_hex, new_hex, ref->name, 0,
-				     "report-status");
+				     ask_for_status_report
+				     ? " report-status" : "",
+				     ask_to_keep_pack
+				     ? " keep-pack" : "");
+			if (ask_for_status_report)
+				expect_status_report = 1;
 			ask_for_status_report = 0;
-			expect_status_report = 1;
+			ask_to_keep_pack = 0;
 		}
 		else
 			packet_write(out, "%s %s %s",
@@ -419,6 +428,10 @@ int main(int argc, char **argv)
 				verbose = 1;
 				continue;
 			}
+			if (!strcmp(arg, "--keep")) {
+				keep_pack = 1;
+				continue;
+			}
 			if (!strcmp(arg, "--thin")) {
 				use_thin_pack = 1;
 				continue;
diff --git a/sha1_file.c b/sha1_file.c
index e89d24c..278ba2f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1292,7 +1292,7 @@ static void *read_packed_sha1(const unsi
 	return unpack_entry(&e, type, size);
 }
 
-void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size)
+void *read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size)
 {
 	unsigned long mapsize;
 	void *map, *buf;
@@ -1757,7 +1757,10 @@ int has_sha1_file(const unsigned char *s
 
 	if (find_pack_entry(sha1, &e, NULL))
 		return 1;
-	return find_sha1_file(sha1, &st) ? 1 : 0;
+	if (find_sha1_file(sha1, &st))
+		return 1;
+	reprepare_packed_git();
+	return find_pack_entry(sha1, &e, NULL);
 }
 
 /*
diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh
index 8afb899..d831f8d 100755
--- a/t/t5400-send-pack.sh
+++ b/t/t5400-send-pack.sh
@@ -78,4 +78,13 @@ test_expect_success \
 	! diff -u .git/refs/heads/master victim/.git/refs/heads/master
 '
 
+test_expect_success 'push with --keep' '
+	t=`cd victim && git-rev-parse --verify refs/heads/master` &&
+	git-update-ref refs/heads/master $t &&
+	: > foo &&
+	git add foo &&
+	git commit -m "one more" &&
+	git-send-pack --keep ./victim/.git/ master
+'
+
 test_done
-- 
1.4.3.3.g7d63


^ permalink raw reply related

* Re: VCS comparison table
From: Ilpo Nyyssönen @ 2006-10-29  6:54 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git
In-Reply-To: <ehvnal$tjg$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Ilpo Nyyssönen wrote:
>
>> Usability:
>> 
>> I have used bzr, bk for development and git very little for following
>> kernel development. I have followed this discussion quite well.
>> 
>> 1. It is easier to start using something you are already familiar
>> with. (Just try to use Mac OS X with a Windows or Linux background.)
>> 
>> G: Something totally new and so no points from here. The way of using
>> git is just so different from any other similar software.
>> 
>> B: Quite clearly gets points from this. Normal branches work quite
>> like many other software, the checkout stuff works like CVS and SVN.
>
> I find for example concept of branches in Git extremly easy to understand.

Might be, but the point was: Git is harder as it is not like others. 
In other hand one can see Bazaar like other distributed SCMs and even
like the not distributed ones as it has the checkout stuff.

You can give Bazaar for me, a bk user, and I can understand what to do
with the branches that are like bk clones. (The repository stuff is
later development and still optional.) Switching a CVS environment to
Bazaar one can be done so that most of the users can be just told to
use bzr checkout and they don't have to care about pushing.

But with git, I clone some repository. Now it is totally new to
understand that I didn't clone only single branch. It's like nothing
else and that's what I saw when I first looked at it. I might have
even not noticed the branch stuff and just cloned it further.

> On the other side breaking with traditional concepts of _centralized_ SCM
> in _distributed_ SCM (and geared towards distributed usage) is IMVHO a good
> idea. And breaking with the cruft of bad ideas of CVS is very good idea.

Breaking concepts can be a good idea and I somewhat think that git
needed to do what it did. But do remember that it came with a cost:
git is harder to understand and use. You first have to understand that
it is different and how it is different.

> I don't understand the confusion between "git branch" and "git clone"
> commands... unless you are confused by Bazaar-NG branch-centric approach
> which mixes branch with repository.

Those commands do so different things in different SCMs. Just look at
the differences bk clone, git clone, git branch and bzr branch. You
have both. At the point where I didn't yet understand that I cloned
more than a one branch, git branch is very odd looking command.

> Which long lasting operations lack progress bar/progress reporting?
> "git clone" and "git fetch"/"git pull" both have progress report

First note that I didn't notice git repack until recently so things
got slower until that.

At least some points they just tell that they are doing something, but
not how much of it has been done and how much is still to do. Look at
Bazaar and you'll see the difference, it has progress bars.

>> G: You have only one workspace and this forces you to use git more or
>> to make several repositories. 
>
> This is your confusion stemming from Bazaar-NG branch-centricness. In Git
> working area is associated with repository, not with branch as in bzr.

Exactly my point.

>> You can't just diff branchA/foo branchB/foo.
>
> You can: either using "git diff branchA branchB -- foo" which means

Exactly my point: it forces you to use git more. In Bazaar I can do
this without Bazaar commands. I could even do it with some Windows GUI
stuff, take two files or directories and compare.

As you need to use git commands more than bzr commands, git has bigger
requirements for usability.

>> You can't just open file from old branch to check 
>> something while you are developing in some new branch.
>
> You can view file from old branch via "git cat-file -p old-branch:file".

Same thing here, in Bazaar, I can just open the file from the other
branch. I can also compile and run the other branch while I have the
other open.

Essentially I would need a separate git repository for each branch
anyway. In Bazaar I can use the same.



^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Shawn Pearce @ 2006-10-29  5:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3b9766rc.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
> >> Then we can make "pack-objects --unpacked" to pretend the ones
> >> with corresponding .volatile as if the objects in them are
> >> loose, without breaking backward compatibility.
> >
> > Currently I'm changing --unpacked= to match without needing quoting.
> > I'm allowing it to match an exact pack name or if it starts with
> > "pack-" and matches the last 50 ("pack-X{40}.pack") of the pack name.
> 
> I think is a very sane thing to do (I should have done that from
> the beginning).  I do not like "the last 50", but I do not have
> objection to make it take either full path or just the filename
> under objects/pack/ (so not "the last 50" but "filename w/o
> slash").

OK.  I coded it with the last 50 but will rewrite that commit
without as the code is slightly shorter that way.  :-)

-- 

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Junio C Hamano @ 2006-10-29  5:16 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git
In-Reply-To: <20061029043818.GA3650@spearce.org>

Shawn Pearce <spearce@spearce.org> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>> Shawn Pearce <spearce@spearce.org> writes:
>> 
>> So how about pack-X{40}.volatile that marks an eligible one for
>> repacking?
>
> Then anyone who has an existing pack would need to create that
> file first as soon as they got this newer version of Git... not
> very upgrade friendly if you ask me.

Ah, I mixed things up completely.  You're right.  Having .keep
leaves that pack as is (and lack of matching .keep causes it to
be repacked -- people do not have .keep so everything should be
repacked as before).

>> Then we can make "pack-objects --unpacked" to pretend the ones
>> with corresponding .volatile as if the objects in them are
>> loose, without breaking backward compatibility.
>
> Currently I'm changing --unpacked= to match without needing quoting.
> I'm allowing it to match an exact pack name or if it starts with
> "pack-" and matches the last 50 ("pack-X{40}.pack") of the pack name.

I think is a very sane thing to do (I should have done that from
the beginning).  I do not like "the last 50", but I do not have
objection to make it take either full path or just the filename
under objects/pack/ (so not "the last 50" but "filename w/o
slash").

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Shawn Pearce @ 2006-10-29  4:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vejsr68y9.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
> 
> > The issue is --unpacked= uses the path of the pack name, which
> > includes $GIT_OBJECT_DIRECTORY, whatever that may be.  This makes it
> > impossible for the shell script to hand through a proper --unpacked=
> > line for the active packs without including $GIT_OBJECT_DIRECTORY
> > as part of the option.
> 
> Yeah, I realize that; you need to know how to build shell script
> that is properly shell quoted to be eval'ed, which is not hard
> but is not usually done and is cumbersome.

Too much work.  :-)

> I would suspect it is probably easier to just say --unpacked
> (without packname) means "unpacked objects, and objects in packs
> that do not have corresponding .keep".  However, that would be a
> change in semantics for --unpacked (without packname), which is
> not nice.
> 
> So how about pack-X{40}.volatile that marks an eligible one for
> repacking?

Then anyone who has an existing pack would need to create that
file first as soon as they got this newer version of Git... not
very upgrade friendly if you ask me.
 
> Then we can make "pack-objects --unpacked" to pretend the ones
> with corresponding .volatile as if the objects in them are
> loose, without breaking backward compatibility.

Currently I'm changing --unpacked= to match without needing quoting.
I'm allowing it to match an exact pack name or if it starts with
"pack-" and matches the last 50 ("pack-X{40}.pack") of the pack name.

I figure this should work fine as probably anyone who has a pack
name that matches 50 characters and starts with "pack-" is using a
pack file name which has the SHA1 of the object list contained in
it and is thus probably unique.
 
-- 

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Junio C Hamano @ 2006-10-29  4:29 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git
In-Reply-To: <20061029035025.GC3435@spearce.org>

Shawn Pearce <spearce@spearce.org> writes:

> The issue is --unpacked= uses the path of the pack name, which
> includes $GIT_OBJECT_DIRECTORY, whatever that may be.  This makes it
> impossible for the shell script to hand through a proper --unpacked=
> line for the active packs without including $GIT_OBJECT_DIRECTORY
> as part of the option.

Yeah, I realize that; you need to know how to build shell script
that is properly shell quoted to be eval'ed, which is not hard
but is not usually done and is cumbersome.

I would suspect it is probably easier to just say --unpacked
(without packname) means "unpacked objects, and objects in packs
that do not have corresponding .keep".  However, that would be a
change in semantics for --unpacked (without packname), which is
not nice.

So how about pack-X{40}.volatile that marks an eligible one for
repacking?

Then we can make "pack-objects --unpacked" to pretend the ones
with corresponding .volatile as if the objects in them are
loose, without breaking backward compatibility.

> However on the git.git repository if I ran `git repack -a -d`
> with every single object in a kept pack and no loose objects I
> kept repacking the same 102 objects into a new active pack,
> even though there were no loose objects to repack and no
> active packs.  Uh, yea.

Will take a look myself if you are otherwise busy.

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Shawn Pearce @ 2006-10-29  3:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <ei18ak$nv4$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> wrote:
> Shawn Pearce wrote:
> 
> > Eran Tromer <git2eran@tromer.org> wrote:
> >>
> >> It would be nice to have whoever creates a pack-*.keep file put
> >> something useful as the content of the file, so we'll know what to clean
> >> up after abnormal termination:
> >> 
> >> $ grep -l ^git-receive-pack $GIT_DIR/objects/pack/pack-*.keep
> > 
> > Yes, that's a very good idea.  When I do the git-receive-pack
> > implementation tonight I'll try to dump useful information to the
> > .keep file such that you can easily grep for the stale .keeps
> > and decide which ones should go.
> 
> Perhaps git-count-packs (or enhanced git-count-objects, or git-count-stuff;
> whatever it would be named) could also list (with some option) the reasons
> for packs to be kept...

That would be more like 'git ls-packs' to me, but as Junio pointed
out why add a command just to count packs, and as others have said
recently "Git has too many commands!".

<funny-and-not-to-be-taken-seriously>
I like git-count-stuff.  So generic.  Maybe we can shove
git-pickaxe in as the --count-lines-and-authors-andthings option of
git-count-stuff, seeing as how some people didn't like its name.  :-)
</funny-and-not-to-be-taken-seriously>

-- 

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Shawn Pearce @ 2006-10-29  3:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfyd88d6s.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
> 
> > Shawn Pearce <spearce@spearce.org> wrote:
> >> Why not just use create a new flag file?
> >> 
> >> Lets say that a pack X is NOT eligible to be repacked if
> >> "$GIT_DIR/objects/pack/pack-X.keep" exists.
> >
> > Here's the `git repack -a -d` portion of that.
> > Thoughts?
> 
> > +	args=--unpacked
> > +	active=
> > +	if test -d "$PACKDIR"
> > +	then
> > +		for p in `find "$PACKDIR" -type f -name '*.pack' -print`
> 
> This change to run 'find "$PACKDIR"' is fragile when your
> $GIT_OBJECT_DIRECTORY has $IFS in it; running "find ." after
> "cd" in a subprocess was done very much on purpose to avoid that
> issue.  Please don't break it.

I only broke it because you backed me into a corner with --unpacked=
:-)

The issue is --unpacked= uses the path of the pack name, which
includes $GIT_OBJECT_DIRECTORY, whatever that may be.  This makes it
impossible for the shell script to hand through a proper --unpacked=
line for the active packs without including $GIT_OBJECT_DIRECTORY
as part of the option.

I agree with you about the $IFS issue.  I'll redraft this patch
tonight such that $IFS doesn't get broken here but that's going
to take a small code patch over in revisions.c, which I'll also
do tonight.

> > +	if test "X$args" = X--unpacked
> > +	then
> > +		args='--unpacked --incremental'
> > +	fi
 
> I do not remember offhand what --incremental meant, but
> presumably this is for the very initial "repack" (PACKDIR did
> not exist or find loop did not find anything to repack) and the
> flag would not make a difference?  Care to explain?
 
I think there is a bug in pack-objects but I couldn't find it last
night.  Using --incremental worked around it.  :-)

According to the documentation:

  --unpacked tells pack-objects to only pack loose objects.

  --incremental tells pack-objects to skip any object that is
  already contained in a pack even if it appears in the input list.

What I really wanted here was to just use '--unpacked'; if there
are no active packs and the user asked for '-a' we actually just
want to pack the loose objects into a new active pack as we aren't
allowed to touch any of the existing packs (they are all kept or
there simply aren't any packs yet).

However on the git.git repository if I ran `git repack -a -d`
with every single object in a kept pack and no loose objects I kept
repacking the same 102 objects into a new active pack, even though
there were no loose objects to repack and no active packs.  Uh, yea.

Adding --incremental in this case kept it from repacking those
same 102 objects on every invocation.  Yea, its a bug.  I meant to
mention it in my email.

-- 

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Jakub Narebski @ 2006-10-29  3:48 UTC (permalink / raw)
  To: git
In-Reply-To: <20061029033829.GA3435@spearce.org>

Shawn Pearce wrote:

> Eran Tromer <git2eran@tromer.org> wrote:
>>
>> It would be nice to have whoever creates a pack-*.keep file put
>> something useful as the content of the file, so we'll know what to clean
>> up after abnormal termination:
>> 
>> $ grep -l ^git-receive-pack $GIT_DIR/objects/pack/pack-*.keep
> 
> Yes, that's a very good idea.  When I do the git-receive-pack
> implementation tonight I'll try to dump useful information to the
> .keep file such that you can easily grep for the stale .keeps
> and decide which ones should go.

Perhaps git-count-packs (or enhanced git-count-objects, or git-count-stuff;
whatever it would be named) could also list (with some option) the reasons
for packs to be kept...

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Shawn Pearce @ 2006-10-29  3:38 UTC (permalink / raw)
  To: Eran Tromer; +Cc: git
In-Reply-To: <4543DA2E.9030300@tromer.org>

Eran Tromer <git2eran@tromer.org> wrote:
> Hi Shawn,
> 
> On 2006-10-28 09:21, Shawn Pearce wrote:
> > Lets say that a pack X is NOT eligible to be repacked if
> > "$GIT_DIR/objects/pack/pack-X.keep" exists.
> > 
> > Thus we want to have the new ".keep" file for historical packs and
> > incoming receive-pack between steps c and g.  In the former case
> > the historical pack is already "very large" and thus one additional
> > empty file to indicate we want to retain that pack as-is is trivial
> > overhead (relatively speaking); in the latter case the lifespan of
> > the file is relatively short and thus any overhead associated with it
> > on the local filesystem is free (it may never even hit the platter).
> 
> Sounds perfect.
> 
> It would be nice to have whoever creates a pack-*.keep file put
> something useful as the content of the file, so we'll know what to clean
> up after abnormal termination:
> 
> $ grep -l ^git-receive-pack $GIT_DIR/objects/pack/pack-*.keep

Yes, that's a very good idea.  When I do the git-receive-pack
implementation tonight I'll try to dump useful information to the
.keep file such that you can easily grep for the stale .keeps
and decide which ones should go.

-- 

^ permalink raw reply

* Re: [PATCH] gitweb: Add "next" link to commitdiff view
From: Jakub Narebski @ 2006-10-29  2:10 UTC (permalink / raw)
  To: git; +Cc: Luben Tuikov, Junio Hamano, Petr Baudis
In-Reply-To: <672246.46875.qm@web31808.mail.mud.yahoo.com>

Luben Tuikov wrote:
> --- Jakub Narebski <jnareb@gmail.com> wrote:
>>
>> It is commit-7, but that can be easily changed.
> 
> Please do change it to "commit-8" -- it'd be consistent with the rest
> of GIT.

Well, some parts of GIT use "commit-8" (for example git-describe),
other use "commit-7...": this includes git-diff both in --raw
(git diff --raw, as opposed to git-diff-tree), git-diff in it's patch 
format in "index <hash>..<hash>" extended header, in "Merge:" line for 
merges in git-show output.

-- 
Jakub Narebski

^ permalink raw reply

* Re: [PATCH] gitweb: Add "next" link to commitdiff view
From: Luben Tuikov @ 2006-10-29  2:04 UTC (permalink / raw)
  To: Jakub Narebski, git; +Cc: Luben Tuikov, Junio Hamano, Petr Baudis
In-Reply-To: <200610290258.31771.jnareb@gmail.com>

--- Jakub Narebski <jnareb@gmail.com> wrote:
> Luben Tuikov wrote:
> > --- Jakub Narebski <jnareb@gmail.com> wrote:
> >> Jakub Narebski wrote:
> >>> Add a kind of "next" view in the bottom part of navigation bar for
> >>> "commitdiff" view.
> >>> 
> >>> For commitdiff between two commits:
> >>>   (from: _commit_)
> >>> For commitdiff for one single parent commit:
> >>>   (parent: _commit_)
> >>> For commitdiff for one merge commit
> >>>   (merge: _commit_ _commit_ ...)
> >>> For commitdiff for root (parentless) commit
> >>>   (initial)
> >>> where _link_ denotes hyperlink. SHA1 is shortened to 7 characters on
> >>> display, everything is perhaps unnecessary esc_html on display.
> [...]
> >> 
> >> Any reasons not to accept this patch? I find it very useful.
> >> 
> >> The idea to use fixed string instead of shortened SHA-1 of commit
> >> was abandoned after some discussion in this thread.
> > 
> > I prefer using the commit-8 without any "..." postfixed.  Anyone who
> > knows squat about git knows very well what a commit-8 is when they
> > see one -- the first 8 hexadecimal digits of the full SHA-1.
> > 
> > I like using "next" only when there is a "prev" right next to it,
> > i.e. based on _context_, something like this:
> >    << prev next>>
> > where "<< prev" is hyperlinked, and "next>>" is also.
> 
> Unfortunately this is simply not possible in this case, as links in git
> are only from commit to paren(a), in one direction only.

I was not suggesting that.  I simply saw a "next" suggestion in this
thread and wanted to mention about "next", "prev" and context.

> It is commit-7, but that can be easily changed.

Please do change it to "commit-8" -- it'd be consistent with the rest
of GIT.

> > Acked-by: Luben Tuikov <ltuikov@yahoo.com>
> > (not that it matters in any way ;-) )
> 
> IIRC Junio asked for ACKs.

In this email, it seems that he's asking for, quoting,
 "Acked-by:":
http://marc.theaimsgroup.com/?l=git&m=116206206008374&w=2

      Luben

^ permalink raw reply

* Re: [PATCH] gitweb: Add "next" link to commitdiff view
From: Jakub Narebski @ 2006-10-29  1:58 UTC (permalink / raw)
  To: git; +Cc: Luben Tuikov, Junio Hamano, Petr Baudis
In-Reply-To: <196807.75961.qm@web31801.mail.mud.yahoo.com>

Luben Tuikov wrote:
> --- Jakub Narebski <jnareb@gmail.com> wrote:
>> Jakub Narebski wrote:
>>> Add a kind of "next" view in the bottom part of navigation bar for
>>> "commitdiff" view.
>>> 
>>> For commitdiff between two commits:
>>>   (from: _commit_)
>>> For commitdiff for one single parent commit:
>>>   (parent: _commit_)
>>> For commitdiff for one merge commit
>>>   (merge: _commit_ _commit_ ...)
>>> For commitdiff for root (parentless) commit
>>>   (initial)
>>> where _link_ denotes hyperlink. SHA1 is shortened to 7 characters on
>>> display, everything is perhaps unnecessary esc_html on display.
[...]
>> 
>> Any reasons not to accept this patch? I find it very useful.
>> 
>> The idea to use fixed string instead of shortened SHA-1 of commit
>> was abandoned after some discussion in this thread.
> 
> I prefer using the commit-8 without any "..." postfixed.  Anyone who
> knows squat about git knows very well what a commit-8 is when they
> see one -- the first 8 hexadecimal digits of the full SHA-1.
> 
> I like using "next" only when there is a "prev" right next to it,
> i.e. based on _context_, something like this:
>    << prev next>>
> where "<< prev" is hyperlinked, and "next>>" is also.

Unfortunately this is simply not possible in this case, as links in git
are only from commit to paren(a), in one direction only.

Besides as you can see we can have more than one "next>>" link: for merge
commit there are multiple parents. Well, for branch points there are
multiple commits which have given commit as (one of) its parent(s), so
there can be multiple "<<prev" links as well.

> I looked at your patch (first email in this thread) and from
> what I gathered that it does, I like it (haven't tried it yet).
> 
> What I like about it is the clear information it conveys:
> it gives me both the commit-8 and what that commit-8 _is_.

It is commit-7, but that can be easily changed.

> Acked-by: Luben Tuikov <ltuikov@yahoo.com>
> (not that it matters in any way ;-) )

IIRC Junio asked for ACKs.
-- 
Jakub Narebski

^ permalink raw reply

* Re: [PATCH] gitweb: Add "next" link to commitdiff view
From: Luben Tuikov @ 2006-10-29  1:50 UTC (permalink / raw)
  To: Jakub Narebski, git; +Cc: Junio Hamano, Petr Baudis
In-Reply-To: <200610281810.36892.jnareb@gmail.com>

--- Jakub Narebski <jnareb@gmail.com> wrote:
> Jakub Narebski wrote:
> > Add a kind of "next" view in the bottom part of navigation bar for
> > "commitdiff" view.
> > 
> > For commitdiff between two commits:
> >   (from: _commit_)
> > For commitdiff for one single parent commit:
> >   (parent: _commit_)
> > For commitdiff for one merge commit
> >   (merge: _commit_ _commit_ ...)
> > For commitdiff for root (parentless) commit
> >   (initial)
> > where _link_ denotes hyperlink. SHA1 is shortened to 7 characters on
> > display, everything is perhaps unnecessary esc_html on display.
> > 
> > Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> > ---
> > 
> > Junio (and others), is it what you wanted? The idea was to change
> > "commitdiff" view only in minimal way, and preserve similarity
> > to "commit" format.
> 
> Any reasons not to accept this patch? I find it very useful.
> 
> The idea to use fixed string instead of shortened SHA-1 of commit
> was abandoned after some discussion in this thread.

I prefer using the commit-8 without any "..." postfixed.  Anyone who
knows squat about git knows very well what a commit-8 is when they
see one -- the first 8 hexadecimal digits of the full SHA-1.

I like using "next" only when there is a "prev" right next to it,
i.e. based on _context_, something like this:
   << prev next >>
where "<< prev" is hyperlinked, and "next >>" is also.

I looked at your patch (first email in this thread) and from
what I gathered that it does, I like it (haven't tried it yet).

What I like about it is the clear information it conveys:
it gives me both the commit-8 and what that commit-8 _is_.

I sometimes remember commit-8s and this helps me a lot when
I move the mouse pointer to an xterm and want to do
git-show <commit-8>, and it futhermore helps me _know_ the
relationship between that commit-8 and the rest of the
information in the window frame gitweb is showing, (parent, etc).

Acked-by: Luben Tuikov <ltuikov@yahoo.com>
(not that it matters in any way ;-) )

   Luben



^ permalink raw reply

* Re: [PATCH/RFC] gitweb: New improved patchset view
From: Jakub Narebski @ 2006-10-28 23:16 UTC (permalink / raw)
  To: git
In-Reply-To: <200610290100.11731.jnareb@gmail.com>

By the way, one of the reasons for change is that current view is ill 
prepared for "combined diff" format I'd like to add later.

P.S. This affects of course both commitdiff and blobdiff views.
-- 
Jakub Narebski

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-28 22:46 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git, bazaar-ng
In-Reply-To: <200610290018.05884.robin.rosenberg.lists@dewire.com>

Dnia niedziela 29. października 2006 00:18, Robin Rosenberg napisał:
> lördag 28 oktober 2006 15:53 skrev Jakub Narebski:
>>
>> But for example git(7) man page lists git commands clearly divided between
>> low-level commands (plumbing): manipulation commands, interrogation
>> commands, synching commands and high level commands (porcelain): main
>> commands, ancillary commands. The "git help" and "git --help" shows the
>> most commonly used git commands with short description of each command
>> ("git help -a" show all commands).
> 
> I believe people tend to skim through documentation looking for pieces of 
> information rather than read it from start to end. So they find themselves 
> reading the plumbing documentation first. Simply reordering documentation to 
> list the porcelain commands before the plumbing would make the git man page 
> less scary to newcomers.

Good idea. Thanks.

Current ordering in git(7) man page is probably the result of bottom-up
git development. First there were plumbing commands (well, first was
repository format AFAICT, but I digress...).

-- 
Jakub Narebski

^ permalink raw reply

* Re: fetching packs and storing them as packs
From: Eran Tromer @ 2006-10-28 22:31 UTC (permalink / raw)
  To: git; +Cc: Shawn Pearce
In-Reply-To: <20061028072146.GB14607@spearce.org>

Hi Shawn,

On 2006-10-28 09:21, Shawn Pearce wrote:
> Lets say that a pack X is NOT eligible to be repacked if
> "$GIT_DIR/objects/pack/pack-X.keep" exists.
> 
> Thus we want to have the new ".keep" file for historical packs and
> incoming receive-pack between steps c and g.  In the former case
> the historical pack is already "very large" and thus one additional
> empty file to indicate we want to retain that pack as-is is trivial
> overhead (relatively speaking); in the latter case the lifespan of
> the file is relatively short and thus any overhead associated with it
> on the local filesystem is free (it may never even hit the platter).

Sounds perfect.

It would be nice to have whoever creates a pack-*.keep file put
something useful as the content of the file, so we'll know what to clean
up after abnormal termination:

$ grep -l ^git-receive-pack $GIT_DIR/objects/pack/pack-*.keep



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox