git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Jakub Narebski <jnareb@gmail.com>
Subject: [PATCH 2/5] gitweb: Buffer diff header to deal with split patches + git_patchset_body refactoring
Date: Sat, 18 Nov 2006 23:35:39 +0100	[thread overview]
Message-ID: <1163889344537-git-send-email-jnareb@gmail.com> (raw)
In-Reply-To: <1163889342877-git-send-email-jnareb@gmail.com>

There are some cases when one line from "raw" git-diff output (raw format)
corresponds to more than one patch in the patchset git-diff output. To deal
with this buffer git diff header and extended diff header (everything up to
actual patch) to check from information from "index <hash>..<hash>" extended
header line if the patch corresponds to the same or next difftree raw line.

This could also be used to gather information needed for hyperlinking, and
used for printing gitweb quoted filenames, from extended diff header instead
of raw git-diff output.

While at it, refactor git_patchset_body subroutine from the event-driven,
AWK-like state-machine parsing to sequential parsing: for each patch
parse (and output) git diff header, parse extended diff header, parse two-line
from-file/to-file diff header, parse patch itself; patch ends with the end
of input [file] or the line matching m/^diff /.

For better understanding the code, there were added assertions in the
comments a la Carp::Assert module. Just in case there is commented out code
dealing with unexpected end of input (should not happen, hence commented
out).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
While starting to write buffering part of this patch I have realized
that git_patchset_body needs refactoring. I think it is now much more
readable, at least for people who know the patch format.

 gitweb/gitweb.perl |  235 ++++++++++++++++++++++++++++++----------------------
 1 files changed, 136 insertions(+), 99 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index b2482fe..bf58c3b 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2202,31 +2202,56 @@ sub git_patchset_body {
 	my ($fd, $difftree, $hash, $hash_parent) = @_;
 
 	my $patch_idx = 0;
-	my $in_header = 0;
-	my $patch_found = 0;
+	my $patch_line;
 	my $diffinfo;
 	my (%from, %to);
+	my ($from_id, $to_id);
 
 	print "<div class=\"patchset\">\n";
 
-	LINE:
-	while (my $patch_line = <$fd>) {
+	# skip to first patch
+	while ($patch_line = <$fd>) {
 		chomp $patch_line;
 
-		if ($patch_line =~ m/^diff /) { # "git diff" header
-			# beginning of patch (in patchset)
-			if ($patch_found) {
-				# close extended header for previous empty patch
-				if ($in_header) {
-					print "</div>\n" # class="diff extended_header"
-				}
-				# close previous patch
-				print "</div>\n"; # class="patch"
-			} else {
-				# first patch in patchset
-				$patch_found = 1;
+		last if ($patch_line =~ m/^diff /);
+	}
+
+ PATCH:
+	while ($patch_line) {
+		my @diff_header;
+
+		# git diff header
+		#assert($patch_line =~ m/^diff /) if DEBUG;
+		#assert($patch_line !~ m!$/$!) if DEBUG; # is chomp-ed
+		push @diff_header, $patch_line;
+
+		# extended diff header
+	EXTENDED_HEADER:
+		while ($patch_line = <$fd>) {
+			chomp $patch_line;
+
+			last EXTENDED_HEADER if ($patch_line =~ m/^--- /);
+
+			if ($patch_line =~ m/^index ([0-9a-fA-F]{40})..([0-9a-fA-F]{40})/) {
+				$from_id = $1;
+				$to_id   = $2;
 			}
-			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
+
+			push @diff_header, $patch_line;
+		}
+		#last PATCH unless $patch_line;
+		my $last_patch_line = $patch_line;
+
+		# check if current patch belong to current raw line
+		# and parse raw git-diff line if needed
+		if (defined $diffinfo &&
+		    $diffinfo->{'from_id'} eq $from_id &&
+		    $diffinfo->{'to_id'}   eq $to_id) {
+			# this is split patch
+			print "<div class=\"patch cont\">\n";
+		} else {
+			# advance raw git-diff output if needed
+			$patch_idx++ if defined $diffinfo;
 
 			# read and prepare patch information
 			if (ref($difftree->[$patch_idx]) eq "HASH") {
@@ -2247,100 +2272,112 @@ sub git_patchset_body {
 				                   hash=>$diffinfo->{'to_id'},
 				                   file_name=>$to{'file'});
 			}
-			$patch_idx++;
-
-			# print "git diff" header
-			$patch_line =~ s!^(diff (.*?) )"?a/.*$!$1!;
-			if ($from{'href'}) {
-				$patch_line .= $cgi->a({-href => $from{'href'}, -class => "path"},
-				                       'a/' . esc_path($from{'file'}));
-			} else { # file was added
-				$patch_line .= 'a/' . esc_path($from{'file'});
-			}
-			$patch_line .= ' ';
-			if ($to{'href'}) {
-				$patch_line .= $cgi->a({-href => $to{'href'}, -class => "path"},
-				                       'b/' . esc_path($to{'file'}));
-			} else { # file was deleted
-				$patch_line .= 'b/' . esc_path($to{'file'});
-			}
-
-			print "<div class=\"diff header\">$patch_line</div>\n";
-			print "<div class=\"diff extended_header\">\n";
-			$in_header = 1;
-			next LINE;
+			# this is first patch for raw difftree line with $patch_idx index
+			# we index @$difftree array from 0, but number patches from 1
+			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
 		}
 
-		if ($in_header) {
-			if ($patch_line !~ m/^---/) {
-				# match <path>
-				if ($patch_line =~ s!^((copy|rename) from ).*$!$1! && $from{'href'}) {
-					$patch_line .= $cgi->a({-href=>$from{'href'}, -class=>"path"},
-					                        esc_path($from{'file'}));
-				}
-				if ($patch_line =~ s!^((copy|rename) to ).*$!$1! && $to{'href'}) {
-					$patch_line = $cgi->a({-href=>$to{'href'}, -class=>"path"},
-					                      esc_path($to{'file'}));
-				}
-				# match <mode>
-				if ($patch_line =~ m/\s(\d{6})$/) {
-					$patch_line .= '<span class="info"> (' .
-					               file_type_long($1) .
-					               ')</span>';
+		# print "git diff" header
+		$patch_line = shift @diff_header;
+		$patch_line =~ s!^(diff (.*?) )"?a/.*$!$1!;
+		if ($from{'href'}) {
+			$patch_line .= $cgi->a({-href => $from{'href'}, -class => "path"},
+			                       'a/' . esc_path($from{'file'}));
+		} else { # file was added
+			$patch_line .= 'a/' . esc_path($from{'file'});
+		}
+		$patch_line .= ' ';
+		if ($to{'href'}) {
+			$patch_line .= $cgi->a({-href => $to{'href'}, -class => "path"},
+			                       'b/' . esc_path($to{'file'}));
+		} else { # file was deleted
+			$patch_line .= 'b/' . esc_path($to{'file'});
+		}
+		print "<div class=\"diff header\">$patch_line</div>\n";
+
+		# print extended diff header
+		print "<div class=\"diff extended_header\">\n" if (@diff_header > 0);
+	EXTENDED_HEADER:
+		foreach $patch_line (@diff_header) {
+			# match <path>
+			if ($patch_line =~ s!^((copy|rename) from ).*$!$1! && $from{'href'}) {
+				$patch_line .= $cgi->a({-href=>$from{'href'}, -class=>"path"},
+				                        esc_path($from{'file'}));
+			}
+			if ($patch_line =~ s!^((copy|rename) to ).*$!$1! && $to{'href'}) {
+				$patch_line = $cgi->a({-href=>$to{'href'}, -class=>"path"},
+				                      esc_path($to{'file'}));
+			}
+			# match <mode>
+			if ($patch_line =~ m/\s(\d{6})$/) {
+				$patch_line .= '<span class="info"> (' .
+				               file_type_long($1) .
+				               ')</span>';
+			}
+			# match <hash>
+			if ($patch_line =~ m/^index/) {
+				my ($from_link, $to_link);
+				if ($from{'href'}) {
+					$from_link = $cgi->a({-href=>$from{'href'}, -class=>"hash"},
+					                     substr($diffinfo->{'from_id'},0,7));
+				} else {
+					$from_link = '0' x 7;
 				}
-				# match <hash>
-				if ($patch_line =~ m/^index/) {
-					my ($from_link, $to_link);
-					if ($from{'href'}) {
-						$from_link = $cgi->a({-href=>$from{'href'}, -class=>"hash"},
-						                     substr($diffinfo->{'from_id'},0,7));
-					} else {
-						$from_link = '0' x 7;
-					}
-					if ($to{'href'}) {
-						$to_link = $cgi->a({-href=>$to{'href'}, -class=>"hash"},
-						                   substr($diffinfo->{'to_id'},0,7));
-					} else {
-						$to_link = '0' x 7;
-					}
-					my ($from_id, $to_id) = ($diffinfo->{'from_id'}, $diffinfo->{'to_id'});
-					$patch_line =~ s!$from_id\.\.$to_id!$from_link..$to_link!;
+				if ($to{'href'}) {
+					$to_link = $cgi->a({-href=>$to{'href'}, -class=>"hash"},
+					                   substr($diffinfo->{'to_id'},0,7));
+				} else {
+					$to_link = '0' x 7;
 				}
-				print $patch_line . "<br/>\n";
-
-			} else {
-				#$in_header && $patch_line =~ m/^---/;
-				print "</div>\n"; # class="diff extended_header"
-				$in_header = 0;
+				#affirm {
+				#	my ($from_hash, $to_hash) =
+				#		($patch_line =~ m/^index ([0-9a-fA-F]{40})..([0-9a-fA-F]{40})/);
+				#	my ($from_id, $to_id) =
+				#		($diffinfo->{'from_id'}, $diffinfo->{'to_id'});
+				#	($from_hash eq $from_id) && ($to_hash eq $to_id);
+				#} if DEBUG;
+				my ($from_id, $to_id) = ($diffinfo->{'from_id'}, $diffinfo->{'to_id'});
+				$patch_line =~ s!$from_id\.\.$to_id!$from_link..$to_link!;
+			}
+			print $patch_line . "<br/>\n";
+		}
+		print "</div>\n"  if (@diff_header > 0); # class="diff extended_header"
+
+		# from-file/to-file diff header
+		$patch_line = $last_patch_line;
+		#assert($patch_line =~ m/^---/) if DEBUG;
+		if ($from{'href'}) {
+			$patch_line = '--- a/' .
+			              $cgi->a({-href=>$from{'href'}, -class=>"path"},
+			                      esc_path($from{'file'}));
+		}
+		print "<div class=\"diff from_file\">$patch_line</div>\n";
 
-				if ($from{'href'}) {
-					$patch_line = '--- a/' .
-					              $cgi->a({-href=>$from{'href'}, -class=>"path"},
-					                      esc_path($from{'file'}));
-				}
-				print "<div class=\"diff from_file\">$patch_line</div>\n";
+		$patch_line = <$fd>;
+		#last PATCH unless $patch_line;
+		chomp $patch_line;
 
-				$patch_line = <$fd>;
-				chomp $patch_line;
+		#assert($patch_line =~ m/^+++/) if DEBUG;
+		if ($to{'href'}) {
+			$patch_line = '+++ b/' .
+			              $cgi->a({-href=>$to{'href'}, -class=>"path"},
+			                      esc_path($to{'file'}));
+		}
+		print "<div class=\"diff to_file\">$patch_line</div>\n";
 
-				#$patch_line =~ m/^+++/;
-				if ($to{'href'}) {
-					$patch_line = '+++ b/' .
-					              $cgi->a({-href=>$to{'href'}, -class=>"path"},
-					                      esc_path($to{'file'}));
-				}
-				print "<div class=\"diff to_file\">$patch_line</div>\n";
+		# the patch itself
+	LINE:
+		while ($patch_line = <$fd>) {
+			chomp $patch_line;
 
-			}
+			next PATCH if ($patch_line =~ m/^diff /);
 
-			next LINE;
+			print format_diff_line($patch_line);
 		}
 
-		print format_diff_line($patch_line);
+	} continue {
+		print "</div>\n"; # class="patch"
 	}
-	print "</div>\n" if $in_header; # extended header
-
-	print "</div>\n" if $patch_found; # class="patch"
 
 	print "</div>\n"; # class="patchset"
 }
-- 
1.4.3.4

  parent reply	other threads:[~2006-11-18 22:34 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-11-18 22:35 [PATCH 0/5] gitweb: New patchset view corrections and improvements, and few other things Jakub Narebski
2006-11-18 22:35 ` [PATCH 1/5] gitweb: Protect against possible warning in git_commitdiff Jakub Narebski
2006-11-18 22:38   ` Jakub Narebski
2006-11-18 22:35 ` Jakub Narebski [this message]
2006-11-18 22:35 ` [PATCH 3/5] gitweb: New improved formatting of chunk header in diff Jakub Narebski
2006-11-18 22:35 ` [PATCH 4/5] gitweb: Default to $hash_base or HEAD for $hash in "commit" and "commitdiff" Jakub Narebski
2006-11-18 22:35 ` [PATCH 5/5] gitweb: Add an option to href() to return full URL Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1163889344537-git-send-email-jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).