git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/n] gitweb: Better quoting and New improved patchset view
@ 2006-10-30 18:53 Jakub Narebski
  2006-10-30 18:58 ` [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames Jakub Narebski
                   ` (10 more replies)
  0 siblings, 11 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-10-30 18:53 UTC (permalink / raw)
  To: git

This series of patches is meant to introduce 
"New improved patchset view" in part-by-part
case.

Would change only gitweb/gitweb.perl and gitweb/gitweb/css
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
@ 2006-10-30 18:58 ` Jakub Narebski
  2006-11-03  8:15   ` Junio C Hamano
  2006-10-30 18:59 ` [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path Jakub Narebski
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-30 18:58 UTC (permalink / raw)
  To: git

Extend unquote subroutine, which unquotes quoted and escaped filenames
which git may return, to deal not only with octal char sequence
quoting, but also quoting ordinary characters including '\"' and '\\'
which are respectively quoted '"' and '\', and to deal also with
C escape sequences including '\t' for TAB and '\n' for LF.

Add esc_path subroutine for gitweb quoting and HTML escaping filenames
(currently it does equivalent of ls' --hide-control-chars, which means
showing undisplayable characters (including '\n' and '\t') as '?'
(question mark) character.  Convert gitweb to use esc_path instead of
simply esc_html to print pathnames.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
By the way, testing this patch uncovered some errors in gitweb, some
related to files with strange name, some unrelated. I'll address them
in further patches. They are:

  1. Using m/..\t(.+)$/; to catch filename instead of m/..\t(.+)$/s;

  2. Lack of '--' after $hash/$hash_base parameter which gives error
     if there exist branch (ref) and file (or directory) with the same
     name

The current implementation of esc_path is meant as preliminary: if you
have better idea for quoting names in gitweb, please tell us, or better
send code/patches.


 gitweb/gitweb.perl |   68 +++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index ec46b80..a15e916 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -563,12 +563,42 @@ sub esc_html {
 	return $str;
 }
 
+# quote unsafe characters and escape filename to HTML
+sub esc_path {
+	my $str = shift;
+	$str = esc_html($str);
+	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
+	return $str;
+}
+
 # git may return quoted and escaped filenames
 sub unquote {
 	my $str = shift;
+
+	sub unq {
+		my $seq = shift;
+		my %es = (
+			't' => "\t", # tab            (HT, TAB)
+			'n' => "\n", # newline        (NL)
+			'r' => "\r", # return         (CR)
+			'f' => "\f", # form feed      (FF)
+			'b' => "\b", # backspace      (BS)
+			'a' => "\a", # alarm (bell)   (BEL)
+			#'e' => "\e", # escape        (ESC)
+			'v' => "\011", # vertical tab (VT)
+		);
+
+		# octal char sequence
+		return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
+		# C escape sequence (this includes '\n' (LF) and '\t' (TAB))
+		return $es{$seq}       if ($seq =~ m/^[abefnrtv]$/);
+		# quted ordinary character (this includes '\\' and '\"')
+		return $seq;
+	}
+
 	if ($str =~ m/^"(.*)"$/) {
 		$str = $1;
-		$str =~ s/\\([0-7]{1,3})/chr(oct($1))/eg;
+		$str =~ s/\\([^0-7]|[0-7]{1,3})/unq($1)/eg;
 	}
 	return $str;
 }
@@ -1435,7 +1465,7 @@ sub git_header_html {
 		if (defined $action) {
 			$title .= "/$action";
 			if (defined $file_name) {
-				$title .= " - " . esc_html($file_name);
+				$title .= " - " . esc_path($file_name);
 				if ($action eq "tree" && $file_name !~ m|/$|) {
 					$title .= "/";
 				}
@@ -1706,20 +1736,20 @@ sub git_print_page_path {
 			$fullname .= ($fullname ? '/' : '') . $dir;
 			print $cgi->a({-href => href(action=>"tree", file_name=>$fullname,
 			                             hash_base=>$hb),
-			              -title => $fullname}, esc_html($dir));
+			              -title => $fullname}, esc_path($dir));
 			print " / ";
 		}
 		if (defined $type && $type eq 'blob') {
 			print $cgi->a({-href => href(action=>"blob_plain", file_name=>$file_name,
 			                             hash_base=>$hb),
-			              -title => $name}, esc_html($basename));
+			              -title => $name}, esc_path($basename));
 		} elsif (defined $type && $type eq 'tree') {
 			print $cgi->a({-href => href(action=>"tree", file_name=>$file_name,
 			                             hash_base=>$hb),
-			              -title => $name}, esc_html($basename));
+			              -title => $name}, esc_path($basename));
 			print " / ";
 		} else {
-			print esc_html($basename);
+			print esc_path($basename);
 		}
 	}
 	print "<br/></div>\n";
@@ -1791,7 +1821,7 @@ sub git_print_tree_entry {
 		print "<td class=\"list\">" .
 			$cgi->a({-href => href(action=>"blob", hash=>$t->{'hash'},
 			                       file_name=>"$basedir$t->{'name'}", %base_key),
-			        -class => "list"}, esc_html($t->{'name'})) . "</td>\n";
+			        -class => "list"}, esc_path($t->{'name'})) . "</td>\n";
 		print "<td class=\"link\">";
 		print $cgi->a({-href => href(action=>"blob", hash=>$t->{'hash'},
 					     file_name=>"$basedir$t->{'name'}", %base_key)},
@@ -1818,7 +1848,7 @@ sub git_print_tree_entry {
 		print "<td class=\"list\">";
 		print $cgi->a({-href => href(action=>"tree", hash=>$t->{'hash'},
 		                             file_name=>"$basedir$t->{'name'}", %base_key)},
-		              esc_html($t->{'name'}));
+		              esc_path($t->{'name'}));
 		print "</td>\n";
 		print "<td class=\"link\">";
 		print $cgi->a({-href => href(action=>"tree", hash=>$t->{'hash'},
@@ -1883,7 +1913,7 @@ sub git_difftree_body {
 			print "<td>";
 			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'to_id'},
 			                             hash_base=>$hash, file_name=>$diff{'file'}),
-			              -class => "list"}, esc_html($diff{'file'}));
+			              -class => "list"}, esc_path($diff{'file'}));
 			print "</td>\n";
 			print "<td>$mode_chng</td>\n";
 			print "<td class=\"link\">";
@@ -1899,7 +1929,7 @@ sub git_difftree_body {
 			print "<td>";
 			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'from_id'},
 			                             hash_base=>$parent, file_name=>$diff{'file'}),
-			               -class => "list"}, esc_html($diff{'file'}));
+			               -class => "list"}, esc_path($diff{'file'}));
 			print "</td>\n";
 			print "<td>$mode_chng</td>\n";
 			print "<td class=\"link\">";
@@ -1939,7 +1969,7 @@ sub git_difftree_body {
 			print "<td>";
 			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'to_id'},
 			                             hash_base=>$hash, file_name=>$diff{'file'}),
-			              -class => "list"}, esc_html($diff{'file'}));
+			              -class => "list"}, esc_path($diff{'file'}));
 			print "</td>\n";
 			print "<td>$mode_chnge</td>\n";
 			print "<td class=\"link\">";
@@ -1979,11 +2009,11 @@ sub git_difftree_body {
 			print "<td>" .
 			      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
 			                             hash=>$diff{'to_id'}, file_name=>$diff{'to_file'}),
-			              -class => "list"}, esc_html($diff{'to_file'})) . "</td>\n" .
+			              -class => "list"}, esc_path($diff{'to_file'})) . "</td>\n" .
 			      "<td><span class=\"file_status $nstatus\">[$nstatus from " .
 			      $cgi->a({-href => href(action=>"blob", hash_base=>$parent,
 			                             hash=>$diff{'from_id'}, file_name=>$diff{'from_file'}),
-			              -class => "list"}, esc_html($diff{'from_file'})) .
+			              -class => "list"}, esc_path($diff{'from_file'})) .
 			      " with " . (int $diff{'similarity'}) . "% similarity$mode_chng]</span></td>\n" .
 			      "<td class=\"link\">";
 			if ($diff{'to_id'} ne $diff{'from_id'}) {
@@ -2113,7 +2143,7 @@ sub git_patchset_body {
 			$file  ||= $diffinfo->{'file'};
 			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
 			                               hash=>$diffinfo->{'from_id'}, file_name=>$file),
-			                -class => "list"}, esc_html($file));
+			                -class => "list"}, esc_path($file));
 			$patch_line =~ s|a/.*$|a/$file|g;
 			print "<div class=\"diff from_file\">$patch_line</div>\n";
 
@@ -2125,7 +2155,7 @@ sub git_patchset_body {
 			$file  ||= $diffinfo->{'file'};
 			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
 			                               hash=>$diffinfo->{'to_id'}, file_name=>$file),
-			                -class => "list"}, esc_html($file));
+			                -class => "list"}, esc_path($file));
 			$patch_line =~ s|b/.*|b/$file|g;
 			print "<div class=\"diff to_file\">$patch_line</div>\n";
 
@@ -3373,8 +3403,8 @@ sub git_blobdiff {
 
 	} else {
 		while (my $line = <$fd>) {
-			$line =~ s!a/($hash|$hash_parent)!'a/'.esc_html($diffinfo{'from_file'})!eg;
-			$line =~ s!b/($hash|$hash_parent)!'b/'.esc_html($diffinfo{'to_file'})!eg;
+			$line =~ s!a/($hash|$hash_parent)!'a/'.esc_path($diffinfo{'from_file'})!eg;
+			$line =~ s!b/($hash|$hash_parent)!'b/'.esc_path($diffinfo{'to_file'})!eg;
 
 			print $line;
 
@@ -3729,7 +3759,7 @@ sub git_search {
 						print $cgi->a({-href => href(action=>"blob", hash_base=>$co{'id'},
 						                             hash=>$set{'id'}, file_name=>$set{'file'}),
 						              -class => "list"},
-						              "<span class=\"match\">" . esc_html($set{'file'}) . "</span>") .
+						              "<span class=\"match\">" . esc_path($set{'file'}) . "</span>") .
 						      "<br/>\n";
 					}
 					print "</td>\n" .
@@ -3863,7 +3893,7 @@ XML
 			if (!($line =~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)([0-9]{0,3})\t(.*)$/)) {
 				next;
 			}
-			my $file = esc_html(unquote($7));
+			my $file = esc_path(unquote($7));
 			$file = to_utf8($file);
 			print "$file<br/>\n";
 		}
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
  2006-10-30 18:58 ` [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames Jakub Narebski
@ 2006-10-30 18:59 ` Jakub Narebski
  2006-10-31  0:34   ` Junio C Hamano
  2006-10-30 21:25 ` [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filenames with LF Jakub Narebski
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-30 18:59 UTC (permalink / raw)
  To: git

Use "&iquot;" Latin 1 entity ("&#191;" -- inverted question mark =
turned question mark, U+00BF ISOnum) instead '?' as replacements for
control characters and other undisplayable characters.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
 gitweb/gitweb.perl |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index a15e916..edca27d 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -567,7 +567,7 @@ sub esc_html {
 sub esc_path {
 	my $str = shift;
 	$str = esc_html($str);
-	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
+	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/&iquest;/g; # like --hide-control-chars in ls
 	return $str;
 }
 
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filenames with LF
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
  2006-10-30 18:58 ` [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames Jakub Narebski
  2006-10-30 18:59 ` [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path Jakub Narebski
@ 2006-10-30 21:25 ` Jakub Narebski
  2006-10-30 21:29 ` [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path Jakub Narebski
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-10-30 21:25 UTC (permalink / raw)
  To: git

Use 's' (treat string as single line) regexp modifier in
git_get_hash_by_path (against future changes, probably unnecessary)
and in parse_ls_tree_line (when called with '-z'=>1 option) to secure
against filenames containing newline.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Without this patch filename with LF broke "tree" view.

 gitweb/gitweb.perl |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index edca27d..0fd1360 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -890,7 +890,7 @@ sub git_get_hash_by_path {
 	close $fd or return undef;
 
 	#'100644 blob 0fa3f3a66fb6a137f6ec2c19351ed4d807070ffa	panic.c'
-	$line =~ m/^([0-9]+) (.+) ([0-9a-fA-F]{40})\t(.+)$/;
+	$line =~ m/^([0-9]+) (.+) ([0-9a-fA-F]{40})\t(.+)$/s;
 	if (defined $type && $type ne $2) {
 		# type doesn't match
 		return undef;
@@ -1305,7 +1305,7 @@ sub parse_ls_tree_line ($;%) {
 	my %res;
 
 	#'100644 blob 0fa3f3a66fb6a137f6ec2c19351ed4d807070ffa	panic.c'
-	$line =~ m/^([0-9]+) (.+) ([0-9a-fA-F]{40})\t(.+)$/;
+	$line =~ m/^([0-9]+) (.+) ([0-9a-fA-F]{40})\t(.+)$/s;
 
 	$res{'mode'} = $1;
 	$res{'type'} = $2;
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (2 preceding siblings ...)
  2006-10-30 21:25 ` [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filenames with LF Jakub Narebski
@ 2006-10-30 21:29 ` Jakub Narebski
  2006-10-31 16:53   ` Jakub Narebski
  2006-10-31 14:22 ` [PATCH 5/n] [take 3] gitweb: New improved patchset view Jakub Narebski
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-30 21:29 UTC (permalink / raw)
  To: git

Add "--" after <commit-ish> or <tree-ish> argument to clearly mark it
as <commit-ish> or <tree-ish> and not pathspec, securing against refs
with the same names as files or directories in [live] repository.

Some wrapping to reduce line length as well.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
I uses branch named 'gitweb/test' to test gitweb against files with
funny characters (like '"', '\', TAB, LF) in filenames. I run gitweb
on "live" (not bare) repository, and there is directory 'gitweb/test'
in it. So I had some parts of gitweb not functioning, and errors in
the web server logs. This patch fixes that issue.

 gitweb/gitweb.perl |   38 +++++++++++++++++++++++---------------
 1 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 0fd1360..4554067 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1141,7 +1141,9 @@ sub parse_commit {
 		@commit_lines = @$commit_text;
 	} else {
 		local $/ = "\0";
-		open my $fd, "-|", git_cmd(), "rev-list", "--header", "--parents", "--max-count=1", $commit_id
+		open my $fd, "-|", git_cmd(), "rev-list",
+			"--header", "--parents", "--max-count=1",
+			$commit_id, "--"
 			or return;
 		@commit_lines = split '\n', <$fd>;
 		close $fd or return;
@@ -2559,7 +2561,7 @@ sub git_summary {
 	}
 
 	open my $fd, "-|", git_cmd(), "rev-list", "--max-count=17",
-		git_get_head_hash($project)
+		git_get_head_hash($project), "--"
 		or die_error(undef, "Open git-rev-list failed");
 	my @revlist = map { chomp; $_ } <$fd>;
 	close $fd;
@@ -2970,7 +2972,7 @@ sub git_tree {
 		}
 	}
 	$/ = "\0";
-	open my $fd, "-|", git_cmd(), "ls-tree", '-z', $hash
+	open my $fd, "-|", git_cmd(), "ls-tree", '-z', $hash, "--"
 		or die_error(undef, "Open git-ls-tree failed");
 	my @entries = map { chomp; $_ } <$fd>;
 	close $fd or die_error(undef, "Reading tree failed");
@@ -3102,7 +3104,7 @@ sub git_log {
 	my $refs = git_get_references();
 
 	my $limit = sprintf("--max-count=%i", (100 * ($page+1)));
-	open my $fd, "-|", git_cmd(), "rev-list", $limit, $hash
+	open my $fd, "-|", git_cmd(), "rev-list", $limit, $hash, "--"
 		or die_error(undef, "Open git-rev-list failed");
 	my @revlist = map { chomp; $_ } <$fd>;
 	close $fd;
@@ -3160,7 +3162,7 @@ sub git_commit {
 		$parent = "--root";
 	}
 	open my $fd, "-|", git_cmd(), "diff-tree", '-r', "--no-commit-id",
-		@diff_opts, $parent, $hash
+		@diff_opts, $parent, $hash, "--"
 		or die_error(undef, "Open git-diff-tree failed");
 	my @difftree = map { chomp; $_ } <$fd>;
 	close $fd or die_error(undef, "Reading git-diff-tree failed");
@@ -3265,7 +3267,8 @@ sub git_blobdiff {
 	if (defined $hash_base && defined $hash_parent_base) {
 		if (defined $file_name) {
 			# read raw output
-			open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts, $hash_parent_base, $hash_base,
+			open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts,
+				$hash_parent_base, $hash_base,
 				"--", $file_name
 				or die_error(undef, "Open git-diff-tree failed");
 			@difftree = map { chomp; $_ } <$fd>;
@@ -3279,7 +3282,8 @@ sub git_blobdiff {
 			# try to find filename from $hash
 
 			# read filtered raw output
-			open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts, $hash_parent_base, $hash_base
+			open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts,
+				$hash_parent_base, $hash_base, "--"
 				or die_error(undef, "Open git-diff-tree failed");
 			@difftree =
 				# ':100644 100644 03b21826... 3b93d5e7... M	ls-files.c'
@@ -3349,7 +3353,8 @@ sub git_blobdiff {
 		}
 
 		# open patch output
-		open $fd, "-|", git_cmd(), "diff", '-p', @diff_opts, $hash_parent, $hash
+		open $fd, "-|", git_cmd(), "diff", '-p', @diff_opts,
+			$hash_parent, $hash, "--"
 			or die_error(undef, "Open git-diff failed");
 	} else  {
 		die_error('404 Not Found', "Missing one of the blob diff parameters")
@@ -3480,8 +3485,8 @@ sub git_commitdiff {
 	my @difftree;
 	if ($format eq 'html') {
 		open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts,
-			"--no-commit-id",
-			"--patch-with-raw", "--full-index", $hash_parent, $hash
+			"--no-commit-id", "--patch-with-raw", "--full-index",
+			$hash_parent, $hash, "--"
 			or die_error(undef, "Open git-diff-tree failed");
 
 		while (chomp(my $line = <$fd>)) {
@@ -3492,7 +3497,7 @@ sub git_commitdiff {
 
 	} elsif ($format eq 'plain') {
 		open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts,
-			'-p', $hash_parent, $hash
+			'-p', $hash_parent, $hash, "--"
 			or die_error(undef, "Open git-diff-tree failed");
 
 	} else {
@@ -3669,7 +3674,9 @@ sub git_search {
 	my $alternate = 1;
 	if ($searchtype eq 'commit' or $searchtype eq 'author' or $searchtype eq 'committer') {
 		$/ = "\0";
-		open my $fd, "-|", git_cmd(), "rev-list", "--header", "--parents", $hash or next;
+		open my $fd, "-|", git_cmd(), "rev-list",
+			"--header", "--parents", $hash, "--"
+			or next;
 		while (my $commit_text = <$fd>) {
 			if (!grep m/$searchtext/i, $commit_text) {
 				next;
@@ -3815,7 +3822,7 @@ sub git_shortlog {
 	my $refs = git_get_references();
 
 	my $limit = sprintf("--max-count=%i", (100 * ($page+1)));
-	open my $fd, "-|", git_cmd(), "rev-list", $limit, $hash
+	open my $fd, "-|", git_cmd(), "rev-list", $limit, $hash, "--"
 		or die_error(undef, "Open git-rev-list failed");
 	my @revlist = map { chomp; $_ } <$fd>;
 	close $fd;
@@ -3843,7 +3850,8 @@ sub git_shortlog {
 
 sub git_rss {
 	# http://www.notestips.com/80256B3A007F2692/1/NAMO5P9UPQ
-	open my $fd, "-|", git_cmd(), "rev-list", "--max-count=150", git_get_head_hash($project)
+	open my $fd, "-|", git_cmd(), "rev-list", "--max-count=150",
+		git_get_head_hash($project), "--"
 		or die_error(undef, "Open git-rev-list failed");
 	my @revlist = map { chomp; $_ } <$fd>;
 	close $fd or die_error(undef, "Reading git-rev-list failed");
@@ -3867,7 +3875,7 @@ XML
 		}
 		my %cd = parse_date($co{'committer_epoch'});
 		open $fd, "-|", git_cmd(), "diff-tree", '-r', @diff_opts,
-			$co{'parent'}, $co{'id'}
+			$co{'parent'}, $co{'id'}, "--"
 			or next;
 		my @difftree = map { chomp; $_ } <$fd>;
 		close $fd
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-10-30 18:59 ` [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path Jakub Narebski
@ 2006-10-31  0:34   ` Junio C Hamano
  2006-10-31  1:27     ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-10-31  0:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Use "&iquot;" Latin 1 entity ("&#191;" -- inverted question mark =
> turned question mark, U+00BF ISOnum) instead '?' as replacements for
> control characters and other undisplayable characters.
>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>

Do you have something against our Spanish and Latin American
friends?  ;-)

I wonder if there is a more suitable replacement character that
is accepted across scripts?

Japanese printing industry has a long tradition of using U+3013
("geta") as a filler character.  Originally they placed a type
of otherwise unused character upside down while packing types
into a row, and the reverse side of a type, when inked and
printed, left imprint that looked like footprint somebody who
wore a "geta" (a traditional footware) would leave.

	http://ja.wikipedia.org/wiki/%E4%B8%8B%E9%A7%84

shows how a "geta" looks like, and

	http://unicode.org/charts/PDF/U3000.pdf

shows how the filler character looks like.

Note that I am not suggesting to use &#3013; as a replacement at
all.  I however think inverted question is inappropriate, and we
should pick something else if we are fixing the question mark
which is obviously inappropriate.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-10-31  0:34   ` Junio C Hamano
@ 2006-10-31  1:27     ` Junio C Hamano
  2006-10-31  9:23       ` Jakub Narebski
  2006-11-03 16:19       ` Jakub Narebski
  0 siblings, 2 replies; 46+ messages in thread
From: Junio C Hamano @ 2006-10-31  1:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Jakub Narebski <jnareb@gmail.com> writes:
>
>> Use "&iquot;" Latin 1 entity ("&#191;" -- inverted question mark =
>> turned question mark, U+00BF ISOnum) instead '?' as replacements for
>> control characters and other undisplayable characters.
>>
>> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
>
> Do you have something against our Spanish and Latin American
> friends?  ;-)
>
> I wonder if there is a more suitable replacement character that
> is accepted across scripts?

I have a suspicion that instead of finding an exotic character,
just showing the byte value in \octal, perhaps in different
color, might be more portable and easier.  For one thing, it
helps to show the exact byte value than just one substitution
character if you are troubleshooting gitweb.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-10-31  1:27     ` Junio C Hamano
@ 2006-10-31  9:23       ` Jakub Narebski
  2006-11-03 16:19       ` Jakub Narebski
  1 sibling, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31  9:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
>
> Junio C Hamano <junkio@cox.net> writes:
> 
>> Jakub Narebski <jnareb@gmail.com> writes:
>>
>>> Use "&iquot;" Latin 1 entity ("&#191;" -- inverted question mark =
>>> turned question mark, U+00BF ISOnum) instead '?' as replacements for
>>> control characters and other undisplayable characters.
>>>
>>> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
>>
>> Do you have something against our Spanish and Latin American
>> friends?  ;-)
>>
>> I wonder if there is a more suitable replacement character that
>> is accepted across scripts?
> 
> I have a suspicion that instead of finding an exotic character,
> just showing the byte value in \octal, perhaps in different
> color, might be more portable and easier.  For one thing, it
> helps to show the exact byte value than just one substitution
> character if you are troubleshooting gitweb.

Will do. Well, with the exception of control characters which have
literal escape characters, for example showing LF as \n instead
of \octal. By the way, wouldn't \xhex be better?

Different color/style is good. Otherwise we would have escape escape
character, i.e. write '\\' and perhaps als mark filename as quoted.
git surrounds filename with doublequotes, "<quoted filename>", adding
'"' to the list of quoted characters: for gitweb this kind of marking
filename as quoted is PITA, as we first, don't want I think to include
'"' in link ("_filename_" vs _"filename"_), and second and more important
if we hyperlink parts of filename we would want for quote to be placed
outside whole filename ("a/_filename_" vs a/_"filename"_ and also
"_path_/_to_/_file_" vs _path_/_to_/_"file"_).

Should we mark only replaced character, or the whole filename?


On the technical side, should I send the patch as replacement for this
patch, or on top of this patch?
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 5/n] [take 3] gitweb: New improved patchset view
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (3 preceding siblings ...)
  2006-10-30 21:29 ` [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path Jakub Narebski
@ 2006-10-31 14:22 ` Jakub Narebski
  2006-11-03 10:26   ` [PATCH 5/10] " Jakub Narebski
  2006-10-31 16:07 ` [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body Jakub Narebski
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31 14:22 UTC (permalink / raw)
  To: git

Replace "gitweb diff header" with its full sha1 of blobs and replace
it by "git diff" header and extended diff header. Change also somewhat
highlighting of diffs.

Added `file_type_long' subroutine to convert file mode in octal to
file type description (only for file modes which used by git).

Changes:
* "gitweb diff header" which looked for example like below:
    file:_<sha1 before>_ -> file:_<sha1 after>_
  where 'file' is file type and '<sha1>' is full sha1 of blob is
  changed to
    diff --git _a/<file before>_ _b/<file after>_
  In both cases links are visible and use default link style. If file
  is added, a/<file> is not hyperlinked, if file is deleted, b/<file>
  is not hyperlinked.
* there is added "extended diff header", with <path> and <hash>
  hyperlinked (and <hash> shortened to 7 characters), and <mode>
  explained: '<mode>' is extended to '<mode> (<file type description>)',
  where added text is slightly lighter to easy distinguish that it
  was added (and it is difference from git-diff output).
* from-file/to-file two-line header lines have slightly darker color
  than removed/added lines.
* chunk header has now delicate line above for easier finding chunk
  boundary, and top margin of 2px, both barely visible.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
It still does not output empty patches. This is to be adressed
in later patch.

As usual, you can see it at work at my web page
  http://roke (.) dyndns (.) info/cgi-bin/gitweb/gitweb.cgi

Comments?

 gitweb/gitweb.css  |   66 +++++++++++++++----
 gitweb/gitweb.perl |  187 
++++++++++++++++++++++++++++++++++------------------
 2 files changed, 176 insertions(+), 77 deletions(-)

diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
index 0eda982..87dbc52 100644
--- a/gitweb/gitweb.css
+++ b/gitweb/gitweb.css
@@ -291,40 +291,82 @@ td.mode {
 	font-family: monospace;
 }
 
-div.diff a.list {
+/* styling of diffs (patchsets): commitdiff and blobdiff views */
+div.diff.header,
+div.diff.extended_header {
+	white-space: normal;
+}
+
+div.diff.header {
+	font-weight: bold;
+
+	background-color: #edece6;
+
+	margin-top: 4px;
+	padding: 4px 0px 2px 0px;
+	border: solid #d9d8d1;
+	border-width: 1px 0px 1px 0px;
+}
+
+div.diff.header a.path {
+	text-decoration: underline;
+}
+
+div.diff.extended_header,
+div.diff.extended_header a.path,
+div.diff.extended_header a.hash {
+	color: #777777;
+}
+
+div.diff.extended_header .info {
+	color: #b0b0b0;
+}
+
+div.diff.extended_header {
+	background-color: #f6f5ee;
+	padding: 2px 0px 2px 0px;
+}
+
+div.diff a.path,
+div.diff a.hash {
 	text-decoration: none;
 }
 
-div.diff a.list:hover {
+div.diff a.path:hover,
+div.diff a.hash:hover {
 	text-decoration: underline;
 }
 
-div.diff.to_file a.list,
-div.diff.to_file,
+div.diff.to_file a.path,
+div.diff.to_file {
+	color: #007000;
+}
+
 div.diff.add {
 	color: #008800;
 }
 
-div.diff.from_file a.list,
-div.diff.from_file,
+div.diff.from_file a.path,
+div.diff.from_file {
+	color: #aa0000;
+}
+
 div.diff.rem {
 	color: #cc0000;
 }
 
 div.diff.chunk_header {
 	color: #990099;
+
+	border: dotted #ffe0ff;
+	border-width: 1px 0px 0px 0px;
+	margin-top: 2px;
 }
 
 div.diff.incomplete {
 	color: #cccccc;
 }
 
-div.diff_info {
-	font-family: monospace;
-	color: #000099;
-	background-color: #edece6;
-	font-style: italic;
-}
 
 div.index_include {
 	border: solid #d9d8d1;
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 4554067..c22fcc5 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -732,6 +732,32 @@ sub file_type {
 	}
 }
 
+# convert file mode in octal to file type description string
+sub file_type_long {
+	my $mode = shift;
+
+	if ($mode !~ m/^[0-7]+$/) {
+		return $mode;
+	} else {
+		$mode = oct $mode;
+	}
+
+	if (S_ISDIR($mode & S_IFMT)) {
+		return "directory";
+	} elsif (S_ISLNK($mode)) {
+		return "symlink";
+	} elsif (S_ISREG($mode)) {
+		if ($mode & S_IXUSR) {
+			return "executable";
+		} else {
+			return "file";
+		};
+	} else {
+		return "unknown";
+	}
+}
+
+
 ## 
----------------------------------------------------------------------
 ## functions returning short HTML fragments, or transforming HTML 
fragments
 ## which don't beling to other sections
@@ -2055,7 +2081,9 @@ sub git_patchset_body {
 	my $patch_idx = 0;
 	my $in_header = 0;
 	my $patch_found = 0;
+	my $skip_patch = 0;
 	my $diffinfo;
+	my (%from, %to);
 
 	print "<div class=\"patchset\">\n";
 
@@ -2065,6 +2093,8 @@ sub git_patchset_body {
 
 		if ($patch_line =~ m/^diff /) { # "git diff" header
 			# beginning of patch (in patchset)
+			$skip_patch = 0;
+
 			if ($patch_found) {
 				# close previous patch
 				print "</div>\n"; # class="patch"
@@ -2074,96 +2104,123 @@ sub git_patchset_body {
 			}
 			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
 
+			# read and prepare patch information
 			if (ref($difftree->[$patch_idx]) eq "HASH") {
+				# pre-parsed (or generated by hand)
 				$diffinfo = $difftree->[$patch_idx];
 			} else {
 				$diffinfo = parse_difftree_raw_line($difftree->[$patch_idx]);
 			}
+			$from{'file'} = $diffinfo->{'from_file'} || $diffinfo->{'file'};
+			$to{'file'}   = $diffinfo->{'to_file'}   || $diffinfo->{'file'};
+			if ($diffinfo->{'status'} ne "A") { # not new (added) file
+				$from{'href'} = href(action=>"blob", hash_base=>$hash_parent,
+				                     hash=>$diffinfo->{'from_id'},
+				                     file_name=>$from{'file'});
+			}
+			if ($diffinfo->{'status'} ne "D") { # not deleted file
+				$to{'href'} = href(action=>"blob", hash_base=>$hash,
+				                   hash=>$diffinfo->{'to_id'},
+				                   file_name=>$to{'file'});
+			}
 			$patch_idx++;
 
-			# for now, no extended header, hence we skip empty patches
-			# companion to	next LINE if $in_header;
+			# for now we skip empty patches
 			if ($diffinfo->{'from_id'} eq $diffinfo->{'to_id'}) { # no change
 				$in_header = 1;
+				$skip_patch = 0;
 				next LINE;
 			}
 
-			if ($diffinfo->{'status'} eq "A") { # added
-				print "<div class=\"diff_info\">" . 
file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, 
file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'}) . " (new)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "D") { # deleted
-				print "<div class=\"diff_info\">" . 
file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", 
hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, 
file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) . " (deleted)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "R" || # renamed
-			         $diffinfo->{'status'} eq "C" || # copied
-			         $diffinfo->{'status'} eq "2") { # with two filenames (from 
git_blobdiff)
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", 
hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, 
file_name=>$diffinfo->{'from_file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, 
file_name=>$diffinfo->{'to_file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
-
-			} else { # modified, mode changed, ...
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", 
hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, 
file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, 
file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
+			# print "git diff" header
+			$patch_line =~ s!^(diff (.*?) )a/.*$!$1!;
+			if ($from{'href'}) {
+				$patch_line .= $cgi->a({-href => $from{'href'}, -class => "path"},
+				                       'a/' . esc_path($from{'file'}));
+			} else { # file was added
+				$patch_line .= 'a/' . esc_path($from{'file'});
+			}
+			$patch_line .= ' ';
+			if ($to{'href'}) {
+				$patch_line .= $cgi->a({-href => $to{'href'}, -class => "path"},
+				                       'b/' . esc_path($to{'file'}));
+			} else { # file was deleted
+				$patch_line .= 'b/' . esc_path($to{'file'});
 			}
 
-			#print "<div class=\"diff extended_header\">\n";
+			print "<div class=\"diff header\">$patch_line</div>\n";
+			print "<div class=\"diff extended_header\">\n";
 			$in_header = 1;
 			next LINE;
+		} else {
+			# actual skipping of empty patches
+			next LINE if $skip_patch;
 		} # start of patch in patchset
 
+		if ($in_header) {
+			if ($patch_line !~ m/^---/) {
+				# match <path>
+				if ($patch_line =~ s!^((copy|rename) from ).*$!$1! && 
$from{'href'}) {
+					$patch_line .= $cgi->a({-href=>$from{'href'}, -class=>"path"},
+					                        esc_path($from{'file'}));
+				}
+				if ($patch_line =~ s!^((copy|rename) to ).*$!$1! && $to{'href'}) {
+					$patch_line = $cgi->a({-href=>$to{'href'}, -class=>"path"},
+					                      esc_path($to{'file'}));
+				}
+				# match <mode>
+				if ($patch_line =~ m/\s(\d{6})$/) {
+					$patch_line .= '<span class="info"> (' .
+					               file_type_long($1) .
+					               ')</span>';
+				}
+				# match <hash>
+				if ($patch_line =~ m/^index/) {
+					my ($from_link, $to_link);
+					if ($from{'href'}) {
+						$from_link = $cgi->a({-href=>$from{'href'}, -class=>"hash"},
+						                     substr($diffinfo->{'from_id'},0,7));
+					} else {
+						$from_link = '0' x 7;
+					}
+					if ($to{'href'}) {
+						$to_link = $cgi->a({-href=>$to{'href'}, -class=>"hash"},
+						                   substr($diffinfo->{'to_id'},0,7));
+					} else {
+						$to_link = '0' x 7;
+					}
+					my ($from_id, $to_id) = ($diffinfo->{'from_id'}, 
$diffinfo->{'to_id'});
+					$patch_line =~ s!$from_id\.\.$to_id!$from_link..$to_link!;
+				}
+				print $patch_line . "<br/>\n";
 
-		if ($in_header && $patch_line =~ m/^---/) {
-			#print "</div>\n"; # class="diff extended_header"
-			$in_header = 0;
+			} else {
+				#$in_header && $patch_line =~ m/^---/;
+				print "</div>\n"; # class="diff extended_header"
+				$in_header = 0;
+
+				if ($from{'href'}) {
+					$patch_line = '--- a/' .
+					              $cgi->a({-href=>$from{'href'}, -class=>"path"},
+					                      esc_path($from{'file'}));
+				}
+				print "<div class=\"diff from_file\">$patch_line</div>\n";
 
-			my $file = $diffinfo->{'from_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", 
hash_base=>$hash_parent,
-			                               hash=>$diffinfo->{'from_id'}, 
file_name=>$file),
-			                -class => "list"}, esc_path($file));
-			$patch_line =~ s|a/.*$|a/$file|g;
-			print "<div class=\"diff from_file\">$patch_line</div>\n";
+				$patch_line = <$fd>;
+				chomp $patch_line;
 
-			$patch_line = <$fd>;
-			chomp $patch_line;
+				#$patch_line =~ m/^+++/;
+				if ($to{'href'}) {
+					$patch_line = '+++ b/' .
+					              $cgi->a({-href=>$to{'href'}, -class=>"path"},
+					                      esc_html($to{'file'}));
+				}
+				print "<div class=\"diff to_file\">$patch_line</div>\n";
 
-			#$patch_line =~ m/^+++/;
-			$file    = $diffinfo->{'to_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-			                               hash=>$diffinfo->{'to_id'}, 
file_name=>$file),
-			                -class => "list"}, esc_path($file));
-			$patch_line =~ s|b/.*|b/$file|g;
-			print "<div class=\"diff to_file\">$patch_line</div>\n";
+			}
 
 			next LINE;
 		}
-		next LINE if $in_header;
 
 		print format_diff_line($patch_line);
 	}
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (4 preceding siblings ...)
  2006-10-31 14:22 ` [PATCH 5/n] [take 3] gitweb: New improved patchset view Jakub Narebski
@ 2006-10-31 16:07 ` Jakub Narebski
  2006-11-03  6:41   ` Junio C Hamano
  2006-10-31 16:36 ` [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view Jakub Narebski
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31 16:07 UTC (permalink / raw)
  To: git

This revert parts of commit 4777b0141a4812177390da4b6ebc9d40ac3da4b5
  "gitweb: Restore object-named links in item lists"
by Petr Baudis, which bring back "blob" links in difftree-raw like
part of "commit" and "commitdiff" views, namely in git_difftree_body
subroutine. First, it did that incompletely: it did not add "blob"
link for added files, and added block used mixture of tabs and spaces
for align. Second, in "difftree" view the "blob" link is not the most
interesting, *contrary* to "blob"/"tree" link in "tree" view, so it
should be enough to have hidden link in the form of file name entry.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
I'm still for having those object-named links in "tree" view.

But perhaps it would be better to add "blob" links for all possible
statuses, including added files... Pasky, Luben, Junio?

 gitweb/gitweb.perl |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index c22fcc5..542dbca 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1967,9 +1967,6 @@ sub git_difftree_body {
 				print $cgi->a({-href => "#patch$patchno"}, "patch");
 				print " | ";
 			}
-			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'from_id'},
-			                             hash_base=>$parent, file_name=>$diff{'file'})},
-				      "blob") . " | ";
 			print $cgi->a({-href => href(action=>"blame", hash_base=>$parent,
 			                             file_name=>$diff{'file'})},
 			              "blame") . " | ";
@@ -2015,9 +2012,6 @@ sub git_difftree_body {
 				}
 				print " | ";
 			}
-			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'to_id'},
-						     hash_base=>$hash, file_name=>$diff{'file'})},
-				      "blob") . " | ";
 			print $cgi->a({-href => href(action=>"blame", hash_base=>$hash,
 			                             file_name=>$diff{'file'})},
 			              "blame") . " | ";
@@ -2058,9 +2052,6 @@ sub git_difftree_body {
 				}
 				print " | ";
 			}
-			print $cgi->a({-href => href(action=>"blob", hash=>$diff{'from_id'},
-						     hash_base=>$parent, file_name=>$diff{'from_file'})},
-				      "blob") . " | ";
 			print $cgi->a({-href => href(action=>"blame", hash_base=>$parent,
 			                             file_name=>$diff{'from_file'})},
 			              "blame") . " | ";
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (5 preceding siblings ...)
  2006-10-31 16:07 ` [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body Jakub Narebski
@ 2006-10-31 16:36 ` Jakub Narebski
  2006-11-03 11:56   ` Jakub Narebski
  2006-10-31 16:43 ` [PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patchset_body Jakub Narebski
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31 16:36 UTC (permalink / raw)
  To: git

Remove skipping over empty patches (i.e. patches which consist solely
of extended headers) in git_patchset_body, and add links to those
header-only patches in git_difftree_body (but not generate blobdiff
links when there were no change in file contents).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
As promised...

 gitweb/gitweb.perl |   67 +++++++++++++++++++++-------------------------------
 1 files changed, 27 insertions(+), 40 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 542dbca..e0f7a3f 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1998,19 +1998,19 @@ sub git_difftree_body {
 			print "</td>\n";
 			print "<td>$mode_chnge</td>\n";
 			print "<td class=\"link\">";
-			if ($diff{'to_id'} ne $diff{'from_id'}) { # modified
-				if ($action eq 'commitdiff') {
-					# link to patch
-					$patchno++;
-					print $cgi->a({-href => "#patch$patchno"}, "patch");
-				} else {
-					print $cgi->a({-href => href(action=>"blobdiff",
-					                             hash=>$diff{'to_id'}, hash_parent=>$diff{'from_id'},
-					                             hash_base=>$hash, hash_parent_base=>$parent,
-					                             file_name=>$diff{'file'})},
-					              "diff");
-				}
-				print " | ";
+			if ($action eq 'commitdiff') {
+				# link to patch
+				$patchno++;
+				print $cgi->a({-href => "#patch$patchno"}, "patch") .
+				      " | ";
+			} elsif ($diff{'to_id'} ne $diff{'from_id'}) {
+				# "commit" view and modified file (not onlu mode changed)
+				print $cgi->a({-href => href(action=>"blobdiff",
+				                             hash=>$diff{'to_id'}, hash_parent=>$diff{'from_id'},
+				                             hash_base=>$hash, hash_parent_base=>$parent,
+				                             file_name=>$diff{'file'})},
+				              "diff") .
+				      " | ";
 			}
 			print $cgi->a({-href => href(action=>"blame", hash_base=>$hash,
 			                             file_name=>$diff{'file'})},
@@ -2038,19 +2038,19 @@ sub git_difftree_body {
 			              -class => "list"}, esc_path($diff{'from_file'})) .
 			      " with " . (int $diff{'similarity'}) . "% similarity$mode_chng]</span></td>\n" .
 			      "<td class=\"link\">";
-			if ($diff{'to_id'} ne $diff{'from_id'}) {
-				if ($action eq 'commitdiff') {
-					# link to patch
-					$patchno++;
-					print $cgi->a({-href => "#patch$patchno"}, "patch");
-				} else {
-					print $cgi->a({-href => href(action=>"blobdiff",
-					                             hash=>$diff{'to_id'}, hash_parent=>$diff{'from_id'},
-					                             hash_base=>$hash, hash_parent_base=>$parent,
-					                             file_name=>$diff{'to_file'}, file_parent=>$diff{'from_file'})},
-					              "diff");
-				}
-				print " | ";
+			if ($action eq 'commitdiff') {
+				# link to patch
+				$patchno++;
+				print $cgi->a({-href => "#patch$patchno"}, "patch") .
+				      " | ";
+			} elsif ($diff{'to_id'} ne $diff{'from_id'}) {
+				# "commit" view and modified file (not only pure rename or copy)
+				print $cgi->a({-href => href(action=>"blobdiff",
+				                             hash=>$diff{'to_id'}, hash_parent=>$diff{'from_id'},
+				                             hash_base=>$hash, hash_parent_base=>$parent,
+				                             file_name=>$diff{'to_file'}, file_parent=>$diff{'from_file'})},
+				              "diff") .
+				      " | ";
 			}
 			print $cgi->a({-href => href(action=>"blame", hash_base=>$parent,
 			                             file_name=>$diff{'from_file'})},
@@ -2072,7 +2072,6 @@ sub git_patchset_body {
 	my $patch_idx = 0;
 	my $in_header = 0;
 	my $patch_found = 0;
-	my $skip_patch = 0;
 	my $diffinfo;
 	my (%from, %to);
 
@@ -2084,8 +2083,6 @@ sub git_patchset_body {
 
 		if ($patch_line =~ m/^diff /) { # "git diff" header
 			# beginning of patch (in patchset)
-			$skip_patch = 0;
-
 			if ($patch_found) {
 				# close previous patch
 				print "</div>\n"; # class="patch"
@@ -2116,13 +2113,6 @@ sub git_patchset_body {
 			}
 			$patch_idx++;
 
-			# for now we skip empty patches
-			if ($diffinfo->{'from_id'} eq $diffinfo->{'to_id'}) { # no change
-				$in_header = 1;
-				$skip_patch = 0;
-				next LINE;
-			}
-
 			# print "git diff" header
 			$patch_line =~ s!^(diff (.*?) )a/.*$!$1!;
 			if ($from{'href'}) {
@@ -2143,10 +2133,7 @@ sub git_patchset_body {
 			print "<div class=\"diff extended_header\">\n";
 			$in_header = 1;
 			next LINE;
-		} else {
-			# actual skipping of empty patches
-			next LINE if $skip_patch;
-		} # start of patch in patchset
+		}
 
 		if ($in_header) {
 			if ($patch_line !~ m/^---/) {
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patchset_body
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (6 preceding siblings ...)
  2006-10-31 16:36 ` [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view Jakub Narebski
@ 2006-10-31 16:43 ` Jakub Narebski
  2006-11-01 13:33 ` [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers Jakub Narebski
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31 16:43 UTC (permalink / raw)
  To: git

This fixes the following (minor) bugs:
* gitweb didn't remove quoted filenames from "git diff" header
* the filename in to-file +++ header used esc_html not esc_path

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
 gitweb/gitweb.perl |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index e0f7a3f..837b88e 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2114,7 +2114,7 @@ sub git_patchset_body {
 			$patch_idx++;
 
 			# print "git diff" header
-			$patch_line =~ s!^(diff (.*?) )a/.*$!$1!;
+			$patch_line =~ s!^(diff (.*?) )"?a/.*$!$1!;
 			if ($from{'href'}) {
 				$patch_line .= $cgi->a({-href => $from{'href'}, -class => "path"},
 				                       'a/' . esc_path($from{'file'}));
@@ -2191,7 +2191,7 @@ sub git_patchset_body {
 				if ($to{'href'}) {
 					$patch_line = '+++ b/' .
 					              $cgi->a({-href=>$to{'href'}, -class=>"path"},
-					                      esc_html($to{'file'}));
+					                      esc_path($to{'file'}));
 				}
 				print "<div class=\"diff to_file\">$patch_line</div>\n";
 
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-10-30 21:29 ` [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path Jakub Narebski
@ 2006-10-31 16:53   ` Jakub Narebski
  2006-11-01  0:24     ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-10-31 16:53 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:
> @@ -2970,7 +2972,7 @@ sub git_tree {
>                 }
>         }
>         $/ = "\0";
> -       open my $fd, "-|", git_cmd(), "ls-tree", '-z', $hash
> +       open my $fd, "-|", git_cmd(), "ls-tree", '-z', $hash, "--"
>                 or die_error(undef, "Open git-ls-tree failed");
>         my @entries = map { chomp; $_ } <$fd>;
>         close $fd or die_error(undef, "Reading tree failed");

Please remove this chunk from patch!. It makes gitweb "tree" view
empty. I have forgot that git-ls-tree _requires_ <tree-ish> so there
is no way to mistake pathspec with <tree-ish>.

Bit overeager adding of "--"... 
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-10-31 16:53   ` Jakub Narebski
@ 2006-11-01  0:24     ` Junio C Hamano
  2006-11-01  0:40       ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-01  0:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Please remove this chunk from patch!. It makes gitweb "tree" view
> empty. I have forgot that git-ls-tree _requires_ <tree-ish> so there
> is no way to mistake pathspec with <tree-ish>.

To be honest, I dislike these */n series where the the end is
unknown.  It just confuses me what's still surviving, what's
already shot down, and what's being rerolled.

Let's step back a bit and see if we share the same view as to
the status of each one:

[PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of p...

Marked preliminary, perhaps need some discussion and rerolling
but I haven't looked at it.

[PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path

Discussed; we agreed that showing byte values in different
colors is preferable.  Waiting for re-roll.

[PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filena...

I looked at it although haven't said anything yet.  Probably a
safe and good change but I wonder how LF at the end of the line
matches /...(.+)$/s pattern; iow, if we do not use -z does it
still do the right thing?  Otherwise I suspect you would perhaps
need to chomp?

[PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same...

Good fix and even improves readability; will apply after
dropping -- from ls-tree args.

[PATCH 5/n] [take 3] gitweb: New improved patchset view
[PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_...
[PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view
[PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patc...

Haven't looked at them and I do not think people have had enough
time to comment on them yet.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-01  0:24     ` Junio C Hamano
@ 2006-11-01  0:40       ` Jakub Narebski
  2006-11-02  1:01         ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-01  0:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> 
> To be honest, I dislike these */n series where the the end is
> unknown.  It just confuses me what's still surviving, what's
> already shot down, and what's being rerolled.

Well, it looks like this patch series is closing to final patch.
The "New improved patchset view" is done.
 
> Let's step back a bit and see if we share the same view as to
> the status of each one:
> 
> [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of p...
> 
> Marked preliminary, perhaps need some discussion and rerolling
> but I haven't looked at it.

I'm not sure if without this patch (well, the unquote part) gitweb
can work with filenames which git quotes using escape sequences,
like ", \, LF, TAB. Former version didn't unquote fully, and it
passed partially unquoted filename to git.

> [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
> 
> Discussed; we agreed that showing byte values in different
> colors is preferable.  Waiting for re-roll.

The problem with using text color or background color is that
the filenames tends to be shown with different color and background
color: "tree" view, parts of difftree, parts of diff header, etc.
Perhaps text-decoration: overline;? Just kidding...

> [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filena...
> 
> I looked at it although haven't said anything yet.  Probably a
> safe and good change but I wonder how LF at the end of the line
> matches /...(.+)$/s pattern; iow, if we do not use -z does it
> still do the right thing?  Otherwise I suspect you would perhaps
> need to chomp?

We always pass chomped lines. First chunk is unnecessary (we care only
for type), without second "tree" view look strange for files with
embedded newline in filename.

> [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same...
> 
> Good fix and even improves readability; will apply after
> dropping -- from ls-tree args.

As I said, noticed while testing gitweb with strange filenames
in 'gitweb/test' branch.

> [PATCH 5/n] [take 3] gitweb: New improved patchset view
> [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_...
> [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view
> [PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patc...
> 
> Haven't looked at them and I do not think people have had enough
> time to comment on them yet.

Well, patch 5 and 8 could be collapsed.

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (7 preceding siblings ...)
  2006-10-31 16:43 ` [PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patchset_body Jakub Narebski
@ 2006-11-01 13:33 ` Jakub Narebski
  2006-11-01 13:38   ` Petr Baudis
  2006-11-01 13:36 ` [PATCH 10/n] gitweb: New improved formatting of chunk header in diff Jakub Narebski
  2006-11-01 18:52 ` [PATCH 00/10] gitweb: Better quoting and New improved patchset view Jakub Narebski
  10 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-01 13:33 UTC (permalink / raw)
  To: git

Add option to replace SPC (' ') with hard (non-breakable) space HTML
entity '&nbsp;' in esc_html subroutine.

Replace ' ' with '&nbsp;' for the code/diff display part in git_blob
and git_patchset_body; this is to be able to view code and diffs in
web browsers which doesn't understand "white-space: pre;" CSS
declaration.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
For people who want to use non-CSS aware browsers, like w3m, elinks, links
text browsers, for looking at "blob" and "commitdiff"/"blobdiff" views
of gitweb.

HTH.

 gitweb/gitweb.perl |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 5c82093..b8fcafc 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -554,12 +554,17 @@ sub esc_url {
 }
 
 # replace invalid utf8 character with SUBSTITUTION sequence
-sub esc_html {
+sub esc_html ($;%) {
 	my $str = shift;
+	my %opts = @_;
+
 	$str = to_utf8($str);
 	$str = escapeHTML($str);
 	$str =~ s/\014/^L/g; # escape FORM FEED (FF) character (e.g. in COPYING file)
 	$str =~ s/\033/^[/g; # "escape" ESCAPE (\e) character (e.g. commit 20a3847d8a5032ce41f90dcc68abfb36e6fee9b1)
+	if ($opts{'-nbsp'}) {
+		$str =~ s/ /&nbsp;/g;
+	}
 	return $str;
 }
 
@@ -840,7 +845,7 @@ sub format_diff_line {
 		$diff_class = " incomplete";
 	}
 	$line = untabify($line);
-	return "<div class=\"diff$diff_class\">" . esc_html($line) . "</div>\n";
+	return "<div class=\"diff$diff_class\">" . esc_html($line, -nbsp=>1) . "</div>\n";
 }
 
 ## ----------------------------------------------------------------------
@@ -2985,7 +2990,7 @@ sub git_blob {
 		$nr++;
 		$line = untabify($line);
 		printf "<div class=\"pre\"><a id=\"l%i\" href=\"#l%i\" class=\"linenr\">%4i</a> %s</div>\n",
-		       $nr, $nr, $nr, esc_html($line);
+		       $nr, $nr, $nr, esc_html($line, -nbsp=>1);
 	}
 	close $fd
 		or print "Reading blob failed.\n";
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/n] gitweb: New improved formatting of chunk header in diff
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (8 preceding siblings ...)
  2006-11-01 13:33 ` [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers Jakub Narebski
@ 2006-11-01 13:36 ` Jakub Narebski
  2006-11-01 18:52 ` [PATCH 00/10] gitweb: Better quoting and New improved patchset view Jakub Narebski
  10 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-01 13:36 UTC (permalink / raw)
  To: git

If we have provided enough info, and diff is not combined diff,
and if provided diff line is chunk header, then:
* split chunk header into .chunk_info and .section span elements,
  first containing proper chunk header, second section heading
  (aka. which function), for separate styling: the proper chunk
  header is on non-white background, section heading part uses
  slightly lighter color.
* hyperlink from-file-range to starting line of from-file, if file
  was not created.
* hyperlink to-file-range to starting line of to-file, if file
  was not deleted.
Links are of invisible variety (and "list" class).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
As usual, you can check out the new version of gitweb at work at my
web page:
  http://roke_._dyndns_._info/cgi-bin/gitweb/gitweb.cgi

The actual formatting is open for discussion.

 gitweb/gitweb.css  |   13 +++++++++++++
 gitweb/gitweb.perl |   18 +++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
index 87dbc52..41c3084 100644
--- a/gitweb/gitweb.css
+++ b/gitweb/gitweb.css
@@ -327,11 +327,13 @@ div.diff.extended_header {
 	padding: 2px 0px 2px 0px;
 }
 
+div.diff a.list,
 div.diff a.path,
 div.diff a.hash {
 	text-decoration: none;
 }
 
+div.diff a.list:hover,
 div.diff a.path:hover,
 div.diff a.hash:hover {
 	text-decoration: underline;
@@ -355,14 +357,25 @@ div.diff.rem {
 	color: #cc0000;
 }
 
+div.diff.chunk_header a,
 div.diff.chunk_header {
 	color: #990099;
+}
 
+div.diff.chunk_header {
 	border: dotted #ffe0ff;
 	border-width: 1px 0px 0px 0px;
 	margin-top: 2px;
 }
 
+div.diff.chunk_header span.chunk_info {
+	background-color: #ffeeff;
+}
+
+div.diff.chunk_header span.section {
+	color: #aa22aa;
+}
+
 div.diff.incomplete {
 	color: #cccccc;
 }
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index b8fcafc..a7739af 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -830,6 +830,7 @@ sub format_subject_html {
 
 sub format_diff_line {
 	my $line = shift;
+	my ($from, $to) = @_;
 	my $char = substr($line, 0, 1);
 	my $diff_class = "";
 
@@ -845,6 +846,21 @@ sub format_diff_line {
 		$diff_class = " incomplete";
 	}
 	$line = untabify($line);
+	if ($from && $to && $line =~ m/^\@{2} /) {
+		my ($from_text, $from_start, $from_lines, $to_text, $to_start, $to_lines, $section) =
+			$line =~ m/^\@{2} (-(\d+),(\d+)) (\+(\d+),(\d+)) \@{2}(.*)$/;
+		if ($from->{'href'}) {
+			$from_text = $cgi->a({-href=>"$from->{'href'}#l$from_start",
+			                     -class=>"list"}, $from_text);
+		}
+		if ($to->{'href'}) {
+			$to_text   = $cgi->a({-href=>"$to->{'href'}#l$to_start",
+			                     -class=>"list"}, $to_text);
+		}
+		$line = "<span class=\"chunk_info\">@@ $from_text $to_text @@</span>" .
+		        "<span class=\"section\">" . esc_html($section, -nbsp=>1) . "</span>";
+		return "<div class=\"diff$diff_class\">$line</div>\n";
+	}
 	return "<div class=\"diff$diff_class\">" . esc_html($line, -nbsp=>1) . "</div>\n";
 }
 
@@ -2205,7 +2221,7 @@ sub git_patchset_body {
 			next LINE;
 		}
 
-		print format_diff_line($patch_line);
+		print format_diff_line($patch_line, \%from, \%to);
 	}
 	print "</div>\n" if $patch_found; # class="patch"
 
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers
  2006-11-01 13:33 ` [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers Jakub Narebski
@ 2006-11-01 13:38   ` Petr Baudis
  0 siblings, 0 replies; 46+ messages in thread
From: Petr Baudis @ 2006-11-01 13:38 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Wed, Nov 01, 2006 at 02:33:21PM CET, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> For people who want to use non-CSS aware browsers, like w3m, elinks, links
> text browsers, for looking at "blob" and "commitdiff"/"blobdiff" views
> of gitweb.

JFYI, ELinks supports CSS. (Well, not completely, but it does handle
white-space: pre.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 00/10] gitweb: Better quoting and New improved patchset view
  2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
                   ` (9 preceding siblings ...)
  2006-11-01 13:36 ` [PATCH 10/n] gitweb: New improved formatting of chunk header in diff Jakub Narebski
@ 2006-11-01 18:52 ` Jakub Narebski
  10 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-01 18:52 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:
> This series of patches is meant to introduce 
> "New improved patchset view" in part-by-part
> case.

And thus endeth "New inproved patchset view" saga.

Patches in series:
 [PATCH 01/10] gitweb: Better git-unquoting and gitweb-quoting of
               pathnames
 [PATCH 02/10] gitweb: Use '&iquot;' instead of '?' in esc_path
 [PATCH 03/10] gitweb: Use 's' regexp modifier to secure against
               filenames with LF
 [PATCH 04/10] gitweb: Secure against commit-ish/tree-ish with
               the same name as path
 [PATCH 05/10] gitweb: New improved patchset view
 [PATCH 06/10] gitweb: Remove redundant "blob" links from
               git_difftree_body
 [PATCH 07/10] gitweb: Output also empty patches in "commitdiff" view
 [PATCH 08/10] gitweb: Fix two issues with quoted filenames
               in git_patchset_body
 [PATCH 09/10] gitweb: Better support for non-CSS aware web browsers
 [PATCH 10/10] gitweb: New improved formatting of chunk header in diff

Diffstat:
 gitweb/gitweb.css  |   79 +++++++++--
 gitweb/gitweb.perl |  378 ++++++++++++++++++++++++++++++++--------------------
 2 files changed, 303 insertions(+), 154 deletions(-)

Shortlog:
Jakub Narebski (11):
      gitweb: Better git-unquoting and gitweb-quoting of pathnames
      gitweb: Use '&iquot;' instead of '?' in esc_path
      gitweb: Use 's' regexp modifier to secure against filenames with LF
      gitweb: Secure against commit-ish/tree-ish with the same name as path
      gitweb: New improved patchset view
      gitweb: Remove redundant "blob" links from git_difftree_body
      gitweb: Output also empty patches in "commitdiff" view
      gitweb: Fix two issues with quoted filenames in git_patchset_body
[*1*] gitweb: Remove "--" from "git ls-tree -z <hash> --", added overeagerily
      gitweb: Better support for non-CSS aware web browsers
      gitweb: New improved formatting of chunk header in diff

[*1*] Not send as patch itself, but as comment about patch
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-01  0:40       ` Jakub Narebski
@ 2006-11-02  1:01         ` Junio C Hamano
  2006-11-02  8:49           ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-02  1:01 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

>> [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of p...
>> 
>> Marked preliminary, perhaps need some discussion and rerolling
>> but I haven't looked at it.
>
> I'm not sure if without this patch (well, the unquote part) gitweb
> can work with filenames which git quotes using escape sequences,

I am reasonably sure it wouldn't, and it sounded like you wanted
to fix it better than the preliminary one, so I think we are in
agreement.

>> [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
>> 
>> Discussed; we agreed that showing byte values in different
>> colors is preferable.  Waiting for re-roll.
>
> The problem with using text color or background color is that
> the filenames tends to be shown with different color and background
> color: "tree" view, parts of difftree, parts of diff header, etc.
> Perhaps text-decoration: overline;? Just kidding...

Use of overstrike may actually not be a bad thing.  It _is_
unusual situation after all.

>> [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filena...
>> 
>> I looked at it although haven't said anything yet.  Probably a
>> safe and good change but I wonder how LF at the end of the line
>> matches /...(.+)$/s pattern; iow, if we do not use -z does it
>> still do the right thing?  Otherwise I suspect you would perhaps
>> need to chomp?
>
> We always pass chomped lines. First chunk is unnecessary (we care only
> for type), without second "tree" view look strange for files with
> embedded newline in filename.

The codepath affected by the first chunk does not chomp, which
was what I was referring to.  So in the meantime will apply only
the second hunk.

>> [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same...
>> 
>> Good fix and even improves readability; will apply after
>> dropping -- from ls-tree args.

I just applied this.  I'll be pushing out a "master" update
sometime today, and do not expect to be able to get to your "n
turned out to be ten" series, so it might be worthwhile to
reroll the remaining bits that you still care about on top of
what I push out tonight to make sure we are on the same page.

Preferably:

 - you should avoid making a series out of more-or-less
   unrelated things;

 - if you are doing related things in one series, do not send
   half-baked early parts out until you are finished and are
   confident with it.  If you do not know how many patches you
   need to complete that logically single topic yet, that is a
   sure sign that you are not done.  Instead, finish writing and
   testing it, and if your test finds an earlier mistake,
   especially a trivial one, go back and fix it in the earlier
   patch in the series.  Everybody makes mistakes so fixing up
   before submission is a norm, and other people do not have to
   be forced to see your "oops" in the development history.

Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-02  1:01         ` Junio C Hamano
@ 2006-11-02  8:49           ` Jakub Narebski
  2006-11-03  6:18             ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-02  8:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:

> I'll be pushing out a "master" update
> sometime today, and do not expect to be able to get to your "n
> turned out to be ten" series, so it might be worthwhile to
> reroll the remaining bits that you still care about on top of
> what I push out tonight to make sure we are on the same page.

I'll wait a while if there are any comments (for example on formatting
used), and resend cleaned-up series.

> Preferably:
>
>  - you should avoid making a series out of more-or-less
>    unrelated things;

Well, truly unrelated were adding "--" to secure against ref with
the same name as path in the repository (but it was discovered
during testing the series) and replacing ' ' with '&nbsp;' in blob
and diff body for non-CSS aware browsers.

Better quoting and unquoting was needed for better commitdiff view.
Ah, well, perhaps it is unrelated.

Securing against filenames with LF for example has sense only if
there can be filenames with LF, and earlier gitweb unquoted it halfway
leaving '\n' instead of LF.

>  - if you are doing related things in one series, do not send
>    half-baked early parts out until you are finished and are
>    confident with it.

I've send series early to get some comments, but I see while I got
some comments on "take 1" and "take 2" on _single_ "new commitdiff"
RFC patch, I got comments only about half-baked '&iquot;' idea.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-02  8:49           ` Jakub Narebski
@ 2006-11-03  6:18             ` Junio C Hamano
  2006-11-03  9:35               ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03  6:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Junio C Hamano wrote:
>
>> I'll be pushing out a "master" update
>> sometime today, and do not expect to be able to get to your "n
>> turned out to be ten" series, so it might be worthwhile to
>> reroll the remaining bits that you still care about on top of
>> what I push out tonight to make sure we are on the same page.
>
> I'll wait a while if there are any comments (for example on formatting
> used), and resend cleaned-up series.
> ...
> I've send series early to get some comments, but I see while I got
> some comments on "take 1" and "take 2" on _single_ "new commitdiff"
> RFC patch, I got comments only about half-baked '&iquot;' idea.

Well, I think some of the major reasons you did not get any
response were:

 (1) it was unclear where it started, where it was heading to,
     and where it ended until you sent out "by the way n=10"
     message at the end;

 (2) the first major oand interesting one in the series (5/n)
     were linewrapped and could not be applied;

and it is rather hard to comment on gitweb changes unless you
view two instances of gitweb output side-by-side for before and
after each patch.

I'd see if I can add some constructive comments on patches 5-10
tonight, but I'm in the middle of other things so don't hold
your breath ;-).

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body
  2006-10-31 16:07 ` [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body Jakub Narebski
@ 2006-11-03  6:41   ` Junio C Hamano
  2006-11-03 11:01     ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03  6:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> ... First, it did that incompletely: it did not add "blob"
> link for added files, and added block used mixture of tabs and spaces
> for align. Second, in "difftree" view the "blob" link is not the most
> interesting, *contrary* to "blob"/"tree" link in "tree" view, so it
> should be enough to have hidden link in the form of file name entry.

I think these "blob" links are good thing to have, and if you
think the earlier work was incomplete and know some cases are
not covered I think it would be better to help completing it
rather than reverting.

I do not understand why you feel "blob" is not the most
interesting.  Often, when it is not obvious if a patch is
correct only with the context, it is useful to view the whole
postimage after applying the patch, and the "blob" link helps
that.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames
  2006-10-30 18:58 ` [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames Jakub Narebski
@ 2006-11-03  8:15   ` Junio C Hamano
  2006-11-03 10:59     ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03  8:15 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index ec46b80..a15e916 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -563,12 +563,42 @@ sub esc_html {
>  	return $str;
>  }
>  
> +# quote unsafe characters and escape filename to HTML
> +sub esc_path {
> +	my $str = shift;
> +	$str = esc_html($str);
> +	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
> +	return $str;
> +}
> +

When you say "[:cntrl:]" do you need to say anything more?

I was initially puzzled by "\t\011" bit, but I realize that is a
bug that is consistent with the next part I'll comment on.

>  # git may return quoted and escaped filenames
>  sub unquote {
>  	my $str = shift;
> +
> +	sub unq {
> +		my $seq = shift;
> +		my %es = (
> +			't' => "\t", # tab            (HT, TAB)
> +			'n' => "\n", # newline        (NL)
> +			'r' => "\r", # return         (CR)
> +			'f' => "\f", # form feed      (FF)
> +			'b' => "\b", # backspace      (BS)
> +			'a' => "\a", # alarm (bell)   (BEL)
> +			#'e' => "\e", # escape        (ESC)
> +			'v' => "\011", # vertical tab (VT)
> +		);
> +
> +		# octal char sequence
> +		return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
> +		# C escape sequence (this includes '\n' (LF) and '\t' (TAB))
> +		return $es{$seq}       if ($seq =~ m/^[abefnrtv]$/);

Problems in this part of the code X-<.

 * Was there a reason not to unwrap '\e' to "\e"?

 * The vertical tab is \013 (decimal 11), not \011 (which is TAB).

 * The name and the abbreviated name of the character "\n" are
   "line feed" and "LF"; I personally do not think these
   character name comments are needed in this part of the code,
   but I do not object if you want to have them there, as long
   as you spell them correctly. cf. ISO/IEC 6429:1992 or
   http://www.unicode.org/charts/PDF/U0000.pdf for example.

 * The hash %es and the pattern /[abef...]/ must be kept in
   sync; it is a maintenance nightmare,

 * Worse yet, they do not agree even in this initial version,
   which proves the previous point.

Perhaps this is better written as:

	if (exists $es{$seq}) {
        	return $es{$seq};
	}

The rest must have been a lot of work to sift through esc_html
and identify which is path and which is not.  Much appreciated
and expect a cleaned up patch to be applied.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-03  6:18             ` Junio C Hamano
@ 2006-11-03  9:35               ` Junio C Hamano
  2006-11-03 10:49                 ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03  9:35 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> I'd see if I can add some constructive comments on patches 5-10
> tonight, but I'm in the middle of other things so don't hold
> your breath ;-).

7 and 9 look obviously good, so I've applied them without
others.

        gitweb: Output also empty patches in "commitdiff" view
        gitweb: Better support for non-CSS aware web browsers

5 is terminally linewrapped and rather big to comment on without
comparing pre- and post- patch outputs, so I'll refrain from
commenting on it.  8 is "oops, I made a mistake when I did 5",
which discourages me even more from looking at 5 X-<.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 5/10] gitweb: New improved patchset view
  2006-10-31 14:22 ` [PATCH 5/n] [take 3] gitweb: New improved patchset view Jakub Narebski
@ 2006-11-03 10:26   ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 10:26 UTC (permalink / raw)
  To: git

Replace "gitweb diff header" with its full sha1 of blobs and replace
it by "git diff" header and extended diff header. Change also somewhat
highlighting of diffs.

Added `file_type_long' subroutine to convert file mode in octal to
file type description (only for file modes which used by git).

Changes:
* "gitweb diff header" which looked for example like below:
    file:_<sha1 before>_ -> file:_<sha1 after>_
  where 'file' is file type and '<sha1>' is full sha1 of blob is
  changed to
    diff --git _a/<file before>_ _b/<file after>_
  In both cases links are visible and use default link style. If file
  is added, a/<file> is not hyperlinked, if file is deleted, b/<file>
  is not hyperlinked.
* there is added "extended diff header", with <path> and <hash>
  hyperlinked (and <hash> shortened to 7 characters), and <mode>
  explained: '<mode>' is extended to '<mode> (<file type description>)',
  where added text is slightly lighter to easy distinguish that it
  was added (and it is difference from git-diff output).
* from-file/to-file two-line header lines have slightly darker color
  than removed/added lines.
* chunk header has now delicate line above for easier finding chunk
  boundary, and top margin of 2px, both barely visible.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
I'm very sorry, previous version was linewrapped.

As usual, you can see new gitweb in action (after all the series,
including this patch) at my site
  http://roke . dyndns . info/cgi-bin/gitweb/gitweb.cgi
(anti-anti-spam protected against "DOT info" being vger banned word).

You can doenload and view snapshots at:
  http://repo.or.cz/w/git/jnareb-git.git/gitweb/test:gitweb/snapshots/
(before means current 'master' gitweb, middle means after this patch,
after means after all patches in the series).

 gitweb/gitweb.css  |   66 +++++++++++++++----
 gitweb/gitweb.perl |  187 ++++++++++++++++++++++++++++++++++------------------
 2 files changed, 176 insertions(+), 77 deletions(-)

diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
index 0eda982..87dbc52 100644
--- a/gitweb/gitweb.css
+++ b/gitweb/gitweb.css
@@ -291,40 +291,82 @@ td.mode {
 	font-family: monospace;
 }
 
-div.diff a.list {
+/* styling of diffs (patchsets): commitdiff and blobdiff views */
+div.diff.header,
+div.diff.extended_header {
+	white-space: normal;
+}
+
+div.diff.header {
+	font-weight: bold;
+
+	background-color: #edece6;
+
+	margin-top: 4px;
+	padding: 4px 0px 2px 0px;
+	border: solid #d9d8d1;
+	border-width: 1px 0px 1px 0px;
+}
+
+div.diff.header a.path {
+	text-decoration: underline;
+}
+
+div.diff.extended_header,
+div.diff.extended_header a.path,
+div.diff.extended_header a.hash {
+	color: #777777;
+}
+
+div.diff.extended_header .info {
+	color: #b0b0b0;
+}
+
+div.diff.extended_header {
+	background-color: #f6f5ee;
+	padding: 2px 0px 2px 0px;
+}
+
+div.diff a.path,
+div.diff a.hash {
 	text-decoration: none;
 }
 
-div.diff a.list:hover {
+div.diff a.path:hover,
+div.diff a.hash:hover {
 	text-decoration: underline;
 }
 
-div.diff.to_file a.list,
-div.diff.to_file,
+div.diff.to_file a.path,
+div.diff.to_file {
+	color: #007000;
+}
+
 div.diff.add {
 	color: #008800;
 }
 
-div.diff.from_file a.list,
-div.diff.from_file,
+div.diff.from_file a.path,
+div.diff.from_file {
+	color: #aa0000;
+}
+
 div.diff.rem {
 	color: #cc0000;
 }
 
 div.diff.chunk_header {
 	color: #990099;
+
+	border: dotted #ffe0ff;
+	border-width: 1px 0px 0px 0px;
+	margin-top: 2px;
 }
 
 div.diff.incomplete {
 	color: #cccccc;
 }
 
-div.diff_info {
-	font-family: monospace;
-	color: #000099;
-	background-color: #edece6;
-	font-style: italic;
-}
 
 div.index_include {
 	border: solid #d9d8d1;
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 4554067..c22fcc5 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -732,6 +732,32 @@ sub file_type {
 	}
 }
 
+# convert file mode in octal to file type description string
+sub file_type_long {
+	my $mode = shift;
+
+	if ($mode !~ m/^[0-7]+$/) {
+		return $mode;
+	} else {
+		$mode = oct $mode;
+	}
+
+	if (S_ISDIR($mode & S_IFMT)) {
+		return "directory";
+	} elsif (S_ISLNK($mode)) {
+		return "symlink";
+	} elsif (S_ISREG($mode)) {
+		if ($mode & S_IXUSR) {
+			return "executable";
+		} else {
+			return "file";
+		};
+	} else {
+		return "unknown";
+	}
+}
+
+
 ## ----------------------------------------------------------------------
 ## functions returning short HTML fragments, or transforming HTML fragments
 ## which don't beling to other sections
@@ -2055,7 +2081,9 @@ sub git_patchset_body {
 	my $patch_idx = 0;
 	my $in_header = 0;
 	my $patch_found = 0;
+	my $skip_patch = 0;
 	my $diffinfo;
+	my (%from, %to);
 
 	print "<div class=\"patchset\">\n";
 
@@ -2065,6 +2093,8 @@ sub git_patchset_body {
 
 		if ($patch_line =~ m/^diff /) { # "git diff" header
 			# beginning of patch (in patchset)
+			$skip_patch = 0;
+
 			if ($patch_found) {
 				# close previous patch
 				print "</div>\n"; # class="patch"
@@ -2074,96 +2104,123 @@ sub git_patchset_body {
 			}
 			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
 
+			# read and prepare patch information
 			if (ref($difftree->[$patch_idx]) eq "HASH") {
+				# pre-parsed (or generated by hand)
 				$diffinfo = $difftree->[$patch_idx];
 			} else {
 				$diffinfo = parse_difftree_raw_line($difftree->[$patch_idx]);
 			}
+			$from{'file'} = $diffinfo->{'from_file'} || $diffinfo->{'file'};
+			$to{'file'}   = $diffinfo->{'to_file'}   || $diffinfo->{'file'};
+			if ($diffinfo->{'status'} ne "A") { # not new (added) file
+				$from{'href'} = href(action=>"blob", hash_base=>$hash_parent,
+				                     hash=>$diffinfo->{'from_id'},
+				                     file_name=>$from{'file'});
+			}
+			if ($diffinfo->{'status'} ne "D") { # not deleted file
+				$to{'href'} = href(action=>"blob", hash_base=>$hash,
+				                   hash=>$diffinfo->{'to_id'},
+				                   file_name=>$to{'file'});
+			}
 			$patch_idx++;
 
-			# for now, no extended header, hence we skip empty patches
-			# companion to	next LINE if $in_header;
+			# for now we skip empty patches
 			if ($diffinfo->{'from_id'} eq $diffinfo->{'to_id'}) { # no change
 				$in_header = 1;
+				$skip_patch = 0;
 				next LINE;
 			}
 
-			if ($diffinfo->{'status'} eq "A") { # added
-				print "<div class=\"diff_info\">" . file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'}) . " (new)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "D") { # deleted
-				print "<div class=\"diff_info\">" . file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) . " (deleted)" .
-				      "</div>\n"; # class="diff_info"
-
-			} elsif ($diffinfo->{'status'} eq "R" || # renamed
-			         $diffinfo->{'status'} eq "C" || # copied
-			         $diffinfo->{'status'} eq "2") { # with two filenames (from git_blobdiff)
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'from_file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'to_file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
-
-			} else { # modified, mode changed, ...
-				print "<div class=\"diff_info\">" .
-				      file_type($diffinfo->{'from_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$diffinfo->{'from_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'from_id'}) .
-				      " -> " .
-				      file_type($diffinfo->{'to_mode'}) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$diffinfo->{'to_id'}, file_name=>$diffinfo->{'file'})},
-				              $diffinfo->{'to_id'});
-				print "</div>\n"; # class="diff_info"
+			# print "git diff" header
+			$patch_line =~ s!^(diff (.*?) )a/.*$!$1!;
+			if ($from{'href'}) {
+				$patch_line .= $cgi->a({-href => $from{'href'}, -class => "path"},
+				                       'a/' . esc_path($from{'file'}));
+			} else { # file was added
+				$patch_line .= 'a/' . esc_path($from{'file'});
+			}
+			$patch_line .= ' ';
+			if ($to{'href'}) {
+				$patch_line .= $cgi->a({-href => $to{'href'}, -class => "path"},
+				                       'b/' . esc_path($to{'file'}));
+			} else { # file was deleted
+				$patch_line .= 'b/' . esc_path($to{'file'});
 			}
 
-			#print "<div class=\"diff extended_header\">\n";
+			print "<div class=\"diff header\">$patch_line</div>\n";
+			print "<div class=\"diff extended_header\">\n";
 			$in_header = 1;
 			next LINE;
+		} else {
+			# actual skipping of empty patches
+			next LINE if $skip_patch;
 		} # start of patch in patchset
 
+		if ($in_header) {
+			if ($patch_line !~ m/^---/) {
+				# match <path>
+				if ($patch_line =~ s!^((copy|rename) from ).*$!$1! && $from{'href'}) {
+					$patch_line .= $cgi->a({-href=>$from{'href'}, -class=>"path"},
+					                        esc_path($from{'file'}));
+				}
+				if ($patch_line =~ s!^((copy|rename) to ).*$!$1! && $to{'href'}) {
+					$patch_line = $cgi->a({-href=>$to{'href'}, -class=>"path"},
+					                      esc_path($to{'file'}));
+				}
+				# match <mode>
+				if ($patch_line =~ m/\s(\d{6})$/) {
+					$patch_line .= '<span class="info"> (' .
+					               file_type_long($1) .
+					               ')</span>';
+				}
+				# match <hash>
+				if ($patch_line =~ m/^index/) {
+					my ($from_link, $to_link);
+					if ($from{'href'}) {
+						$from_link = $cgi->a({-href=>$from{'href'}, -class=>"hash"},
+						                     substr($diffinfo->{'from_id'},0,7));
+					} else {
+						$from_link = '0' x 7;
+					}
+					if ($to{'href'}) {
+						$to_link = $cgi->a({-href=>$to{'href'}, -class=>"hash"},
+						                   substr($diffinfo->{'to_id'},0,7));
+					} else {
+						$to_link = '0' x 7;
+					}
+					my ($from_id, $to_id) = ($diffinfo->{'from_id'}, $diffinfo->{'to_id'});
+					$patch_line =~ s!$from_id\.\.$to_id!$from_link..$to_link!;
+				}
+				print $patch_line . "<br/>\n";
 
-		if ($in_header && $patch_line =~ m/^---/) {
-			#print "</div>\n"; # class="diff extended_header"
-			$in_header = 0;
+			} else {
+				#$in_header && $patch_line =~ m/^---/;
+				print "</div>\n"; # class="diff extended_header"
+				$in_header = 0;
+
+				if ($from{'href'}) {
+					$patch_line = '--- a/' .
+					              $cgi->a({-href=>$from{'href'}, -class=>"path"},
+					                      esc_path($from{'file'}));
+				}
+				print "<div class=\"diff from_file\">$patch_line</div>\n";
 
-			my $file = $diffinfo->{'from_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-			                               hash=>$diffinfo->{'from_id'}, file_name=>$file),
-			                -class => "list"}, esc_path($file));
-			$patch_line =~ s|a/.*$|a/$file|g;
-			print "<div class=\"diff from_file\">$patch_line</div>\n";
+				$patch_line = <$fd>;
+				chomp $patch_line;
 
-			$patch_line = <$fd>;
-			chomp $patch_line;
+				#$patch_line =~ m/^+++/;
+				if ($to{'href'}) {
+					$patch_line = '+++ b/' .
+					              $cgi->a({-href=>$to{'href'}, -class=>"path"},
+					                      esc_html($to{'file'}));
+				}
+				print "<div class=\"diff to_file\">$patch_line</div>\n";
 
-			#$patch_line =~ m/^+++/;
-			$file    = $diffinfo->{'to_file'};
-			$file  ||= $diffinfo->{'file'};
-			$file = $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-			                               hash=>$diffinfo->{'to_id'}, file_name=>$file),
-			                -class => "list"}, esc_path($file));
-			$patch_line =~ s|b/.*|b/$file|g;
-			print "<div class=\"diff to_file\">$patch_line</div>\n";
+			}
 
 			next LINE;
 		}
-		next LINE if $in_header;
 
 		print format_diff_line($patch_line);
 	}
-- 
1.4.3.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path
  2006-11-03  9:35               ` Junio C Hamano
@ 2006-11-03 10:49                 ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 10:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Junio C Hamano <junkio@cox.net> writes:
> 
>> I'd see if I can add some constructive comments on patches 5-10
>> tonight, but I'm in the middle of other things so don't hold
>> your breath ;-).
> 
> 7 and 9 look obviously good, so I've applied them without
> others.
> 
>         gitweb: Output also empty patches in "commitdiff" view

This patch in my opinion has no sense without having extended diff
header in commitdiff view, i.e. without "New improved patchset view"
(I have send non-line wrapped version).

>         gitweb: Better support for non-CSS aware web browsers

Thats independent from other changes, true.

> 5 is terminally linewrapped and rather big to comment on without
> comparing pre- and post- patch outputs, so I'll refrain from
> commenting on it. 

You can check out new gitweb at work at my site (when it is up)
  http://roke . dyndns . info/cgi-bin/gitweb/gitweb.cgi

>                    8 is "oops, I made a mistake when I did 5", 
> which discourages me even more from looking at 5 X-<.

Well, you wouldn't notice error corrected by 8 unless you have
files with funny filenames.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames
  2006-11-03  8:15   ` Junio C Hamano
@ 2006-11-03 10:59     ` Jakub Narebski
  2006-11-03 11:58       ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 10:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index ec46b80..a15e916 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -563,12 +563,42 @@ sub esc_html {
> >  	return $str;
> >  }
> >  
> > +# quote unsafe characters and escape filename to HTML
> > +sub esc_path {
> > +	my $str = shift;
> > +	$str = esc_html($str);
> > +	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
> > +	return $str;
> > +}
> > +
> 
> When you say "[:cntrl:]" do you need to say anything more?

Ooops. Yes, the \a\b\e\f\n\r\t\011 part is redundant.

> >  # git may return quoted and escaped filenames
> >  sub unquote {
> >  	my $str = shift;
> > +
> > +	sub unq {
> > +		my $seq = shift;
> > +		my %es = (
> > +			't' => "\t", # tab            (HT, TAB)
> > +			'n' => "\n", # newline        (NL)
> > +			'r' => "\r", # return         (CR)
> > +			'f' => "\f", # form feed      (FF)
> > +			'b' => "\b", # backspace      (BS)
> > +			'a' => "\a", # alarm (bell)   (BEL)
> > +			#'e' => "\e", # escape        (ESC)
> > +			'v' => "\011", # vertical tab (VT)
> > +		);
> > +
> > +		# octal char sequence
> > +		return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
> > +		# C escape sequence (this includes '\n' (LF) and '\t' (TAB))
> > +		return $es{$seq}       if ($seq =~ m/^[abefnrtv]$/);
> 
> Problems in this part of the code X-<.
> 
>  * Was there a reason not to unwrap '\e' to "\e"?

It was not mentioned in description of git pathname quoting in the
message
  http://marc.theaimsgroup.com/?l=git&m=112927316408690&w=2;

>  * The vertical tab is \013 (decimal 11), not \011 (which is TAB).

Oops. ASCII 11 is decimal 11.

>  * The name and the abbreviated name of the character "\n" are
>    "line feed" and "LF"; I personally do not think these
>    character name comments are needed in this part of the code,
>    but I do not object if you want to have them there, as long
>    as you spell them correctly. cf. ISO/IEC 6429:1992 or
>    http://www.unicode.org/charts/PDF/U0000.pdf for example.

Perhaps it should be "LF ('\n') and TAB ('\t')".

>  * The hash %es and the pattern /[abef...]/ must be kept in
>    sync; it is a maintenance nightmare,
> 
>  * Worse yet, they do not agree even in this initial version,
>    which proves the previous point.
> 
> Perhaps this is better written as:
> 
> 	if (exists $es{$seq}) {
>         	return $es{$seq};
> 	}

Fact.

Or
	return $es{$seq} if exists $es{$seq};

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body
  2006-11-03  6:41   ` Junio C Hamano
@ 2006-11-03 11:01     ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 11:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> ... First, it did that incompletely: it did not add "blob"
>> link for added files, and added block used mixture of tabs and spaces
>> for align. Second, in "difftree" view the "blob" link is not the most
>> interesting, *contrary* to "blob"/"tree" link in "tree" view, so it
>> should be enough to have hidden link in the form of file name entry.
> 
> I think these "blob" links are good thing to have, and if you
> think the earlier work was incomplete and know some cases are
> not covered I think it would be better to help completing it
> rather than reverting.
> 
> I do not understand why you feel "blob" is not the most
> interesting.  Often, when it is not obvious if a patch is
> correct only with the context, it is useful to view the whole
> postimage after applying the patch, and the "blob" link helps
> that.

O.K. I'll choose the "add blob links where there are none in difftree view"
approach in cleaned up and resend series.

I'll wait a while (a day or two) for further comments before redoing
the series.

P.S. Remove empty patches might produce incorrect HTML (one of <div>
is not closed). I'll correct it on resend.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view
  2006-10-31 16:36 ` [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view Jakub Narebski
@ 2006-11-03 11:56   ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 11:56 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:
> Remove skipping over empty patches (i.e. patches which consist solely
> of extended headers) in git_patchset_body, and add links to those
> header-only patches in git_difftree_body (but not generate blobdiff
> links when there were no change in file contents).

Attention: I think this patch causes that in some cases gitweb generates
incorrect HTML (one of <div> elements is not closed).
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames
  2006-11-03 10:59     ` Jakub Narebski
@ 2006-11-03 11:58       ` Junio C Hamano
  2006-11-03 12:09         ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03 11:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Perhaps it should be "LF ('\n') and TAB ('\t')".

Official terminology seems to call \t "HT", but my feeling is
that we would not need that comment there.

> Or
> 	return $es{$seq} if exists $es{$seq};

Although gitweb is full of that syntax, I personally do not like
it very much.  It is really hard to read when you are trying to
skim through the code quickly.  You would have to say "why
return it?  ah -- only when it exists, then it makes sense",
which is a hiccup that disrupts the thought process.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames
  2006-11-03 11:58       ` Junio C Hamano
@ 2006-11-03 12:09         ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 12:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Dnia piątek 3. listopada 2006 12:58, Junio C Hamano napisał:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Perhaps it should be "LF ('\n') and TAB ('\t')".
> 
> Official terminology seems to call \t "HT", but my feeling is
> that we would not need that comment there.

O.K. I'll remove that comment.

>> Or
>> 	return $es{$seq} if exists $es{$seq};
> 
> Although gitweb is full of that syntax, I personally do not like
> it very much.  It is really hard to read when you are trying to
> skim through the code quickly.  You would have to say "why
> return it?  ah -- only when it exists, then it makes sense",
> which is a hiccup that disrupts the thought process.

O.K., I'll change style from

	return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
	return $es{$seq}       if exists $es{$seq};
	return $seq;

to the more readable form

	if ($seq =~ m/^[0-7]{1,3}$/) {
		return chr(oct($seq));
	} elsif (exists $es{$seq}) {
		return $es{$seq};
	}
	return $seq;

I think it is better to leave final "return $seq;" outside else block.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-10-31  1:27     ` Junio C Hamano
  2006-10-31  9:23       ` Jakub Narebski
@ 2006-11-03 16:19       ` Jakub Narebski
  2006-11-03 21:44         ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 16:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

 Junio C Hamano wrote:
>
> Junio C Hamano <junkio@cox.net> writes:
> 
>> Jakub Narebski <jnareb@gmail.com> writes:
>>
>>> Use "&iquot;" Latin 1 entity ("&#191;" -- inverted question mark =
>>> turned question mark, U+00BF ISOnum) instead '?' as replacements for
>>> control characters and other undisplayable characters.
>>>
>>> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
>>
>> Do you have something against our Spanish and Latin American
>> friends?  ;-)
>>
>> I wonder if there is a more suitable replacement character that
>> is accepted across scripts?
> 
> I have a suspicion that instead of finding an exotic character,
> just showing the byte value in \octal, perhaps in different
> color, might be more portable and easier.  For one thing, it
> helps to show the exact byte value than just one substitution
> character if you are troubleshooting gitweb.
 
Or perhaps we can use Unicode Control Pictures U+2400 – U+243F 
(9216–9279) ("Unicode quoting"):

http://www.alanwood.net/unicode/control_pictures.html
http://www.unicode.org/charts/PDF/U2400.pdf

# quote unsafe characters and escape filename to HTML
sub esc_path {
	my $str = shift;
	$str = esc_html($str);
	$str =~ s!([[:cntrl:]])!sprintf('<span class="cntrl">&#%04d;</span>', 9216+ord($1))!eg;
	return $str;
}

with perhaps the following CSS

span.cntrl {
	border: dashed #aaaaaa;
	border-width: 1px;
	padding: 0px 2px 0px 2px;
	margin:  0px 2px 0px 2px;
}

What do you think of it?
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 16:19       ` Jakub Narebski
@ 2006-11-03 21:44         ` Junio C Hamano
  2006-11-03 22:33           ` Jakub Narebski
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03 21:44 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> # quote unsafe characters and escape filename to HTML
> sub esc_path {
> 	my $str = shift;
> 	$str = esc_html($str);
> 	$str =~ s!([[:cntrl:]])!sprintf('<span class="cntrl">&#%04d;</span>', 9216+ord($1))!eg;
> 	return $str;
> }
>
> with perhaps the following CSS
>
> span.cntrl {
> 	border: dashed #aaaaaa;
> 	border-width: 1px;
> 	padding: 0px 2px 0px 2px;
> 	margin:  0px 2px 0px 2px;
> }
>
> What do you think of it?

Probably "# quote unsafe characters" is not what it does yet (it
just quotes controls currently and nothing else), but we have to
start somewhere and I think this is a good start.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 21:44         ` Junio C Hamano
@ 2006-11-03 22:33           ` Jakub Narebski
  2006-11-03 22:44             ` Junio C Hamano
  2006-11-06 21:58             ` Jakub Narebski
  0 siblings, 2 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 22:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> # quote unsafe characters and escape filename to HTML
>> sub esc_path {
>> 	my $str = shift;
>> 	$str = esc_html($str);
>> 	$str =~ s!([[:cntrl:]])!sprintf('<span 
class="cntrl">&#%04d;</span>', 9216+ord($1))!eg;
>> 	return $str;
>> }
>>
>> with perhaps the following CSS
>>
>> span.cntrl {
>> 	border: dashed #aaaaaa;
>> 	border-width: 1px;
>> 	padding: 0px 2px 0px 2px;
>> 	margin:  0px 2px 0px 2px;
>> }
>>
>> What do you think of it?
> 
> Probably "# quote unsafe characters" is not what it does yet (it
> just quotes controls currently and nothing else), but we have to
> start somewhere and I think this is a good start.

Well, control characters (at least some of them) are not correct
characters in UTF-8 HTML output; Mozilla in strict XHTML mode complains.
Currently for example esc_html escapes FORM FEED (FF) and ESCAPE (ESC)
characters, because they happened to be present in git.git repository
(in COPYING file and in commit v1.4.2.1-g20a3847 respectively).

As I see it, we can either replace non-safe characters (control
characters) by single characters a la --hide-control-chars: that
is minimal solution, or we can quote unseafe characters somewhat,
but if we do that we have to indicate that we quote. Git core and
ls encloses material which needs escaping with quotes; in gitweb
it is somewhat impractical; besides we have more possibilities
to mark fragment of text (span element encompassing representation
of escaped characters for example).

I have thought of the following escaping:

1. Hide control characters using '?' or other similar character like
   &cdot; for example
2. Use "Unicode" quoting, i.e. replace control characters by their
   Unicode Printable Representation (PR), as shown above. Has the
   advantage that it is simple and does not need theoretically marking
   that it is quoted; has the disadvantage that browser must support
   this part of Unicode, and that those characters are less than
   readable with default font size gitweb uses.
3. Use Character Escape Codes (CEC), using alphabetic and octal
   backslash sequences like those used in C. Probably need to escape
   backslash (quoting character) too. Has the advantage of being widely
   understood in POSIX world. Has the disadvantage of need for escape
   sequence table/hash. Has the advantage that it works for all
   characters - simple octal backslash sequence if they have no special
   escape sequence.
4. Control key Sequence (CS), like the one used in esc_html currently,
   replacing control characters by key sequence that produces them,
   for example replacing LF with ^J, CR with ^M, FF with ^L, ESC with
   ^[, TAB with ^I. Has the advantage of being undestodd I think in
   MS-DOS/MS WIndows world. Has the advantage of being used in esc_html.
   Has the advantage that some text editors use this representation.
   Has the disadvantage of need for large key sequence table/hash.
   Has the disadvantage that less common control characters have cryptic
   control key sequences.
5. Percent encoding, also know as URL encoding. Use %<hex> encoding used
   in URL, taken for example from core of esc_url/esc_param subroutine.
   Simple, but does need marking that is escaped. Disadvantage of hardly
   readable.

Which solution do you think it's best? Or perhaps other solution, like 
using Unicode Printable Representation, or Character Escape Codes with 
the exception of LF which would be replaced by &para; (paragraph sign), 
RET by &crarr; and TAB by either &thorn;, &#8614; or &rarr;.

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 22:33           ` Jakub Narebski
@ 2006-11-03 22:44             ` Junio C Hamano
  2006-11-03 22:50               ` Petr Baudis
  2006-11-06 21:58             ` Jakub Narebski
  1 sibling, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-03 22:44 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Well, control characters (at least some of them) are not correct
> characters in UTF-8 HTML output; Mozilla in strict XHTML mode complains.
> Currently...

I said you quote controls (and only controls) as unsafe, so you
are not disagreeing with me here.  However "controls are unsafe"
does not necessarily mean "unsafe characters are all controls",
does it?  That was what my comments abuot the comment before the
function was.

> Which solution do you think it's best?

Sorry, if it was not clear in my message, I wanted to say that I
kinda liked those "control pictures" in U+2400 range.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 22:44             ` Junio C Hamano
@ 2006-11-03 22:50               ` Petr Baudis
  2006-11-03 23:35                 ` Jakub Narebski
  2006-11-04  0:02                 ` Junio C Hamano
  0 siblings, 2 replies; 46+ messages in thread
From: Petr Baudis @ 2006-11-03 22:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git

Dear diary, on Fri, Nov 03, 2006 at 11:44:48PM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Jakub Narebski <jnareb@gmail.com> writes:
> > Which solution do you think it's best?
> 
> Sorry, if it was not clear in my message, I wanted to say that I
> kinda liked those "control pictures" in U+2400 range.

In principle, right now it should be pretty easy for a project that for
some reason does not use UTF-8 in commits etc. to adjust gitweb to work
properly, right? Just change the encoding in HTTP headers and you're
done, I think.

Is it worth trying to preserve that flexibility?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 22:50               ` Petr Baudis
@ 2006-11-03 23:35                 ` Jakub Narebski
  2006-11-04  0:02                 ` Junio C Hamano
  1 sibling, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-03 23:35 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> Dear diary, on Fri, Nov 03, 2006 at 11:44:48PM CET, I got a letter
> where Junio C Hamano <junkio@cox.net> said that...
>> Jakub Narebski <jnareb@gmail.com> writes:
>> > Which solution do you think it's best?
>> 
>> Sorry, if it was not clear in my message, I wanted to say that I
>> kinda liked those "control pictures" in U+2400 range.
> 
> In principle, right now it should be pretty easy for a project that for
> some reason does not use UTF-8 in commits etc. to adjust gitweb to work
> properly, right? Just change the encoding in HTTP headers and you're
> done, I think.
> 
> Is it worth trying to preserve that flexibility?

It should be also quite easy to change to_uft8 subroutine to actually
convert to utf8. But even if we don't use UTF-8 encoding for HTML in
gitweb, numerical character entities (and that's how those "control
pictures" are entered) should work equally well regardless of encoding.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 22:50               ` Petr Baudis
  2006-11-03 23:35                 ` Jakub Narebski
@ 2006-11-04  0:02                 ` Junio C Hamano
  2006-11-04 10:31                   ` Petr Baudis
  1 sibling, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2006-11-04  0:02 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

Petr Baudis <pasky@suse.cz> writes:

> Dear diary, on Fri, Nov 03, 2006 at 11:44:48PM CET, I got a letter
> where Junio C Hamano <junkio@cox.net> said that...
>> Jakub Narebski <jnareb@gmail.com> writes:
>> > Which solution do you think it's best?
>> 
>> Sorry, if it was not clear in my message, I wanted to say that I
>> kinda liked those "control pictures" in U+2400 range.
>
> In principle, right now it should be pretty easy for a project that for
> some reason does not use UTF-8 in commits etc. to adjust gitweb to work
> properly, right? Just change the encoding in HTTP headers and you're
> done, I think.
>
> Is it worth trying to preserve that flexibility?

Absolutely, and I got your point.  Maybe <blink>\011</blink>
would be more portable and appropriate.

Also munging [:cntrl:] would break iso-2022 encoding if it
munges ESC, but the function under discussion was only for paths
and I think no sane platform would use iso-2022 for filenames.

Each repository has commit charset/encoding configuration item
if I am not mistaken, so it is probably a sane thing to do to
convert commit messages from that uniformly to utf-8, I think.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-04  0:02                 ` Junio C Hamano
@ 2006-11-04 10:31                   ` Petr Baudis
  0 siblings, 0 replies; 46+ messages in thread
From: Petr Baudis @ 2006-11-04 10:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git

Dear diary, on Sat, Nov 04, 2006 at 01:02:43AM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
> 
> > Dear diary, on Fri, Nov 03, 2006 at 11:44:48PM CET, I got a letter
> > where Junio C Hamano <junkio@cox.net> said that...
> >> Jakub Narebski <jnareb@gmail.com> writes:
> >> > Which solution do you think it's best?
> >> 
> >> Sorry, if it was not clear in my message, I wanted to say that I
> >> kinda liked those "control pictures" in U+2400 range.
> >
> > In principle, right now it should be pretty easy for a project that for
> > some reason does not use UTF-8 in commits etc. to adjust gitweb to work
> > properly, right? Just change the encoding in HTTP headers and you're
> > done, I think.
> >
> > Is it worth trying to preserve that flexibility?
> 
> Absolutely, and I got your point.  Maybe <blink>\011</blink>
> would be more portable and appropriate.

Actually Jakub is right, &#number; is always a Unicode codepoint,
regardless of document encoding.

Anything but <blink>! ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-03 22:33           ` Jakub Narebski
  2006-11-03 22:44             ` Junio C Hamano
@ 2006-11-06 21:58             ` Jakub Narebski
  2006-11-06 22:47               ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-06 21:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Jakub Narebski wrote:
> 3. Use Character Escape Codes (CEC), using alphabetic and octal
>    backslash sequences like those used in C. Probably need to escape
>    backslash (quoting character) too. Has the advantage of being widely
>    understood in POSIX world. Has the disadvantage of need for escape
>    sequence table/hash. Has the advantage that it works for all
>    characters - simple octal backslash sequence if they have no special
>    escape sequence.

Here is example code for this:

	sub esc_path {
		my $str = shift;

		sub quot {
			my $seq = shift;
			my %es = (
				"\t" => '\t', # tab            (HT, TAB)
				"\n" => '\n', # newline        (NL)
				"\r" => '\r', # return         (CR)
				"\f" => '\f', # form feed      (FF)
				"\b" => '\b', # backspace      (BS)
				"\a" => '\a', # alarm (bell)   (BEL)
				"\e" => '\e', # escape        (ESC)
				"\013" => '\v', # vertical tab (VT)
			);

			if (exists $es{seq}) {
				return $es{$seq};
			}
			return sprintf("\\%03o", ord($seq));
		}
		
		$str = esc_html($str);
		$str =~ s/([[:cntrl:]])/'<span class="cntrl">' . quot($1) . '</span>'/eg;
		return $str;
	}

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-06 21:58             ` Jakub Narebski
@ 2006-11-06 22:47               ` Junio C Hamano
  2006-11-06 23:16                 ` Jakub Narebski
  2006-11-07 21:53                 ` Jakub Narebski
  0 siblings, 2 replies; 46+ messages in thread
From: Junio C Hamano @ 2006-11-06 22:47 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Here is example code for this:

Ok.  The issues I raised in the previous round seem to have been
addressed.  Maybe you would want not to use nested 'sub' and it
is good to go, I think.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-06 22:47               ` Junio C Hamano
@ 2006-11-06 23:16                 ` Jakub Narebski
       [not found]                   ` <7vwt68b0f3.fsf@assigned-by-dhcp.cox.net>
  2006-11-07 21:53                 ` Jakub Narebski
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-06 23:16 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Here is example code for this:
> 
> Ok.  The issues I raised in the previous round seem to have been
> addressed.  Maybe you would want not to use nested 'sub' and it
> is good to go, I think.
 
Nested sub makes it easy to change gitweb quoting from Character Escape
Codes (CEC) to e.g. Unicode Printable Representation (PR).

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
       [not found]                   ` <7vwt68b0f3.fsf@assigned-by-dhcp.cox.net>
@ 2006-11-07  0:02                     ` Jakub Narebski
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Narebski @ 2006-11-07  0:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Junio C Hamano wrote:
>>
>>> Jakub Narebski <jnareb@gmail.com> writes:
>>> 
>>>> Here is example code for this:
>>> 
>>> Ok.  The issues I raised in the previous round seem to have been
>>> addressed.  Maybe you would want not to use nested 'sub' and it
>>> is good to go, I think.
>>  
>> Nested sub makes it easy to change gitweb quoting from Character Escape
>> Codes (CEC) to e.g. Unicode Printable Representation (PR).
> 
> Yes, sub makes it easy, it is not needed to be nested.

Well, this sub is of no use (I think) ourside esc_path subroutine.
Similarly for unq sub nested in unescape in PATCH 1/10.

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-06 22:47               ` Junio C Hamano
  2006-11-06 23:16                 ` Jakub Narebski
@ 2006-11-07 21:53                 ` Jakub Narebski
  2006-11-07 22:18                   ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Narebski @ 2006-11-07 21:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Here is example code for this:
> 
> Ok.  The issues I raised in the previous round seem to have been
> addressed.  Maybe you would want not to use nested 'sub' and it
> is good to go, I think.

Should I understand this as a statement that you prefer backslash 
sequences aka. Character Escape Codes (CEC) than "Unicode" escaping 
aka. Unicode Printable Representation (PR)?

Should I send better quoting/unquoting work as two patches: unquote 
correction plus '?' using esc_path + esc_path which uses backslash 
sequences and span.cntrl element, or should it be send as one, 
admittedly quite large patch. 

I don't think it would be good idea to separate unquote correction with 
esc_path work, because havin unquote which unquotes fully means that we 
can have filenames which have for exampl newline characters in them, 
hence the need of separate quoting subroutine, esc_path, and using it 
for filename escaping.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path
  2006-11-07 21:53                 ` Jakub Narebski
@ 2006-11-07 22:18                   ` Junio C Hamano
  0 siblings, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2006-11-07 22:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> I don't think it would be good idea to separate unquote correction with 
> esc_path work, because havin unquote which unquotes fully means that we 
> can have filenames which have for exampl newline characters in them, 
> hence the need of separate quoting subroutine, esc_path, and using it 
> for filename escaping.

Yeah, I think that makes sense.

No matter what escaping we would end up picking in the second
part, we should do the unquote plus esc_path anyway, so I think
a two patch series to replace to '?' first and then pretty-print
that '?' with whatever encoding and span.cntrl is a very
sensible way to go.

As to backslash, cute control pictures and just plain '?', I do
not have a strong preference among them.  Especially if you have
the span.cntrl around them, we can tell even a plain '?' from
the substitution of funny byte values, so it probably is just a
matter of taste and I'd leave that up to you and gitweb cabal on
the list ;-).


^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2006-11-07 22:18 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-30 18:53 [PATCH 0/n] gitweb: Better quoting and New improved patchset view Jakub Narebski
2006-10-30 18:58 ` [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames Jakub Narebski
2006-11-03  8:15   ` Junio C Hamano
2006-11-03 10:59     ` Jakub Narebski
2006-11-03 11:58       ` Junio C Hamano
2006-11-03 12:09         ` Jakub Narebski
2006-10-30 18:59 ` [PATCH 2/n] gitweb: Use '&iquot;' instead of '?' in esc_path Jakub Narebski
2006-10-31  0:34   ` Junio C Hamano
2006-10-31  1:27     ` Junio C Hamano
2006-10-31  9:23       ` Jakub Narebski
2006-11-03 16:19       ` Jakub Narebski
2006-11-03 21:44         ` Junio C Hamano
2006-11-03 22:33           ` Jakub Narebski
2006-11-03 22:44             ` Junio C Hamano
2006-11-03 22:50               ` Petr Baudis
2006-11-03 23:35                 ` Jakub Narebski
2006-11-04  0:02                 ` Junio C Hamano
2006-11-04 10:31                   ` Petr Baudis
2006-11-06 21:58             ` Jakub Narebski
2006-11-06 22:47               ` Junio C Hamano
2006-11-06 23:16                 ` Jakub Narebski
     [not found]                   ` <7vwt68b0f3.fsf@assigned-by-dhcp.cox.net>
2006-11-07  0:02                     ` Jakub Narebski
2006-11-07 21:53                 ` Jakub Narebski
2006-11-07 22:18                   ` Junio C Hamano
2006-10-30 21:25 ` [PATCH 3/n] gitweb: Use 's' regexp modifier to secure against filenames with LF Jakub Narebski
2006-10-30 21:29 ` [PATCH 4/n] gitweb: Secure against commit-ish/tree-ish with the same name as path Jakub Narebski
2006-10-31 16:53   ` Jakub Narebski
2006-11-01  0:24     ` Junio C Hamano
2006-11-01  0:40       ` Jakub Narebski
2006-11-02  1:01         ` Junio C Hamano
2006-11-02  8:49           ` Jakub Narebski
2006-11-03  6:18             ` Junio C Hamano
2006-11-03  9:35               ` Junio C Hamano
2006-11-03 10:49                 ` Jakub Narebski
2006-10-31 14:22 ` [PATCH 5/n] [take 3] gitweb: New improved patchset view Jakub Narebski
2006-11-03 10:26   ` [PATCH 5/10] " Jakub Narebski
2006-10-31 16:07 ` [PATCH 6/n] gitweb: Remove redundant "blob" links from git_difftree_body Jakub Narebski
2006-11-03  6:41   ` Junio C Hamano
2006-11-03 11:01     ` Jakub Narebski
2006-10-31 16:36 ` [PATCH 7/n] gitweb: Output also empty patches in "commitdiff" view Jakub Narebski
2006-11-03 11:56   ` Jakub Narebski
2006-10-31 16:43 ` [PATCH 8/n] gitweb: Fix two issues with quoted filenames in git_patchset_body Jakub Narebski
2006-11-01 13:33 ` [PATCH 9/n] gitweb: Better support for non-CSS aware web browsers Jakub Narebski
2006-11-01 13:38   ` Petr Baudis
2006-11-01 13:36 ` [PATCH 10/n] gitweb: New improved formatting of chunk header in diff Jakub Narebski
2006-11-01 18:52 ` [PATCH 00/10] gitweb: Better quoting and New improved patchset view Jakub Narebski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).