git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] gitweb: fix encode handling for site_{header,footer}
@ 2008-11-17  7:53 Tatsuki Sugiura
  2008-11-17 10:28 ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Tatsuki Sugiura @ 2008-11-17  7:53 UTC (permalink / raw)
  To: git

All non-ASCII chars of site_{header,footer} will be broken
by perl IO layer without decoding to utf8.

Here is a fix to just call to_utf8 for inputs from these files.

Signed-off-by: Tatsuki Sugiura <sugi@nemui.org>
---
 gitweb/gitweb.perl |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 933e137..79ca5c2 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2929,7 +2929,7 @@ EOF
 
 	if (-f $site_header) {
 		open (my $fd, $site_header);
-		print <$fd>;
+		print map {to_utf8($_)} <$fd>;
 		close $fd;
 	}
 
@@ -3018,7 +3018,7 @@ sub git_footer_html {
 
 	if (-f $site_footer) {
 		open (my $fd, $site_footer);
-		print <$fd>;
+		print map {to_utf8($_)} <$fd>;
 		close $fd;
 	}
 
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: fix encode handling for site_{header,footer}
  2008-11-17  7:53 [PATCH] gitweb: fix encode handling for site_{header,footer} Tatsuki Sugiura
@ 2008-11-17 10:28 ` Junio C Hamano
  2008-11-17 10:40   ` Jakub Narebski
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2008-11-17 10:28 UTC (permalink / raw)
  To: Tatsuki Sugiura; +Cc: git, Jakub Narebski

Tatsuki Sugiura <sugi@nemui.org> writes:

> All non-ASCII chars of site_{header,footer} will be broken
> by perl IO layer without decoding to utf8.
>
> Here is a fix to just call to_utf8 for inputs from these files.
>
> Signed-off-by: Tatsuki Sugiura <sugi@nemui.org>

Looks good, thanks.

> ---
>  gitweb/gitweb.perl |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 933e137..79ca5c2 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -2929,7 +2929,7 @@ EOF
>  
>  	if (-f $site_header) {
>  		open (my $fd, $site_header);
> -		print <$fd>;
> +		print map {to_utf8($_)} <$fd>;
>  		close $fd;
>  	}
>  
> @@ -3018,7 +3018,7 @@ sub git_footer_html {
>  
>  	if (-f $site_footer) {
>  		open (my $fd, $site_footer);
> -		print <$fd>;
> +		print map {to_utf8($_)} <$fd>;
>  		close $fd;
>  	}
>  
> -- 
> 1.5.6.5

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: fix encode handling for site_{header,footer}
  2008-11-17 10:28 ` Junio C Hamano
@ 2008-11-17 10:40   ` Jakub Narebski
  2008-12-01 18:01     ` [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files Jakub Narebski
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2008-11-17 10:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tatsuki Sugiura, git, Gerrit Pape, Recai Oktaş

On Mon, 17 Nov 2008, Junio C Hamano wrote:
> Tatsuki Sugiura <sugi@nemui.org> writes:
> 
> > All non-ASCII chars of site_{header,footer} will be broken
> > by perl IO layer without decoding to utf8.
> >
> > Here is a fix to just call to_utf8 for inputs from these files.
> >
> > Signed-off-by: Tatsuki Sugiura <sugi@nemui.org>
> 
> Looks good, thanks.

Hmmm... it is good _partial_ solution. I wonder if gitweb doesn't
have the same problem with per-project README.html file...

> > ---
> >  gitweb/gitweb.perl |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index 933e137..79ca5c2 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -2929,7 +2929,7 @@ EOF
> >  
> >  	if (-f $site_header) {
> >  		open (my $fd, $site_header);
> > -		print <$fd>;
> > +		print map {to_utf8($_)} <$fd>;
> >  		close $fd;
> >  	}
> >  
> > @@ -3018,7 +3018,7 @@ sub git_footer_html {
> >  
> >  	if (-f $site_footer) {
> >  		open (my $fd, $site_footer);
> > -		print <$fd>;
> > +		print map {to_utf8($_)} <$fd>;
> >  		close $fd;
> >  	}
> >  
> > -- 
> > 1.5.6.5

There was some patch by Recai Oktaş, or Gerrit Pape (or me)


diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 4c104d2..8b5fcc1 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -16,7 +16,7 @@ use Encode;
 use Fcntl ':mode';
 use File::Find qw();
 use File::Basename qw(basename);
-binmode STDOUT, ':utf8';
+use open qw(:std :utf8);
 
 BEGIN {
 	CGI->compile() if $ENV{'MOD_PERL'};


Then to_utf8() wouldn't be necessary. But I'd like for Perl experts to 
check it first; for example this doesn't allow $fallback_encoding.

-- 
Jakub Narębski
Poland

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files
  2008-11-17 10:40   ` Jakub Narebski
@ 2008-12-01 18:01     ` Jakub Narebski
  2008-12-03  3:55       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2008-12-01 18:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

Use new insert_file() subroutine to insert HTML chunks from external
files: $site_header, $home_text (by default indextext.html),
$site_footer, and $projectroot/$project/REAME.html.

All non-ASCII chars of those files will be broken by Perl IO layer
without decoding to utf8, so insert_file() does to_utf8() on each
printed line; alternate solution would be to open those files with
"binmode $fh, ':utf8'", or even all files with "use open qw(:std :utf8)".

Note that inserting README.html lost one of checks for simplicity.

Noticed-by: Tatsuki Sugiura <sugi@nemui.org>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is more complete solution that the one provided by Tatsuki Sugiura
in original patch

  [PATCH] gitweb: fix encode handling for site_{header,footer}
  Msg-Id: <87vdumbxgc.wl@vaj-k-334-sugi.local.valinux.co.jp>
  http://thread.gmane.org/gmane.comp.version-control.git/101199

but it is in principle the same solution.

I think this one as it is a bugfix should go in git 1.6.1

 gitweb/gitweb.perl |   33 +++++++++++++++++----------------
 1 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 933e137..82262a3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2740,6 +2740,15 @@ sub get_file_owner {
 	return to_utf8($owner);
 }
 
+# assume that file exists
+sub insert_file {
+	my $filename = shift;
+
+	open my $fd, '<', $filename;
+	print map(to_utf8, <$fd>);
+	close $fd;
+}
+
 ## ......................................................................
 ## mimetype related functions
 
@@ -2928,9 +2937,7 @@ EOF
 	      "<body>\n";
 
 	if (-f $site_header) {
-		open (my $fd, $site_header);
-		print <$fd>;
-		close $fd;
+		insert_file($site_header);
 	}
 
 	print "<div class=\"page_header\">\n" .
@@ -3017,9 +3024,7 @@ sub git_footer_html {
 	print "</div>\n"; # class="page_footer"
 
 	if (-f $site_footer) {
-		open (my $fd, $site_footer);
-		print <$fd>;
-		close $fd;
+		insert_file($site_footer);
 	}
 
 	print "</body>\n" .
@@ -4358,9 +4363,7 @@ sub git_project_list {
 	git_header_html();
 	if (-f $home_text) {
 		print "<div class=\"index_include\">\n";
-		open (my $fd, $home_text);
-		print <$fd>;
-		close $fd;
+		insert_file($home_text);
 		print "</div>\n";
 	}
 	print $cgi->startform(-method => "get") .
@@ -4472,13 +4475,11 @@ sub git_summary {
 	print "</table>\n";
 
 	if (-s "$projectroot/$project/README.html") {
-		if (open my $fd, "$projectroot/$project/README.html") {
-			print "<div class=\"title\">readme</div>\n" .
-			      "<div class=\"readme\">\n";
-			print $_ while (<$fd>);
-			print "\n</div>\n"; # class="readme"
-			close $fd;
-		}
+		print "<div class=\"title\">readme</div>\n" .
+		      "<div class=\"readme\">\n";
+		insert_file("$projectroot/$project/README.html");
+		print "\n</div>\n"; # class="readme"
+		close $fd;
 	}
 
 	# we need to request one more than 16 (0..15) to check if

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files
  2008-12-01 18:01     ` [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files Jakub Narebski
@ 2008-12-03  3:55       ` Junio C Hamano
  2008-12-03 10:21         ` Jakub Narebski
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2008-12-03  3:55 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

Jakub Narebski <jnareb@gmail.com> writes:

> Use new insert_file() subroutine to insert HTML chunks from external
> files: $site_header, $home_text (by default indextext.html),
> $site_footer, and $projectroot/$project/REAME.html.
>
> All non-ASCII chars of those files will be broken by Perl IO layer
> without decoding to utf8, so insert_file() does to_utf8() on each
> printed line; alternate solution would be to open those files with
> "binmode $fh, ':utf8'", or even all files with "use open qw(:std :utf8)".
>
> Note that inserting README.html lost one of checks for simplicity.
>
> Noticed-by: Tatsuki Sugiura <sugi@nemui.org>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> ---
> This is more complete solution that the one provided by Tatsuki Sugiura
> in original patch
>
>   [PATCH] gitweb: fix encode handling for site_{header,footer}
>   Msg-Id: <87vdumbxgc.wl@vaj-k-334-sugi.local.valinux.co.jp>
>   http://thread.gmane.org/gmane.comp.version-control.git/101199

It may be more complete but it is obviously untested.  Please help me
trust you better with your future patches.  Because I personally do not
run gitweb myself, I really need a trustworthy lieutenant(s) in the area.

[Wed Dec  3 01:52:07 2008] gitweb.perl: Global symbol "$fd" requires explicit package name at /git.git/t/../gitweb/gitweb.perl line 4500.
[Wed Dec  3 01:52:07 2008] gitweb.perl: Execution of /git.git/t/../gitweb/gitweb.perl aborted due to compilation errors.

> but it is in principle the same solution.
>
> I think this one as it is a bugfix should go in git 1.6.1

Trading a gitweb with a small bug with a gitweb that does not even pass
its test script does not feel like a good change to me.

I think the breakage is the "close $fd" at the end of this hunk:

> @@ -4472,13 +4475,11 @@ sub git_summary {
>  	print "</table>\n";
>  
>  	if (-s "$projectroot/$project/README.html") {
> -		if (open my $fd, "$projectroot/$project/README.html") {
> -			print "<div class=\"title\">readme</div>\n" .
> -			      "<div class=\"readme\">\n";
> -			print $_ while (<$fd>);
> -			print "\n</div>\n"; # class="readme"
> -			close $fd;
> -		}
> +		print "<div class=\"title\">readme</div>\n" .
> +		      "<div class=\"readme\">\n";
> +		insert_file("$projectroot/$project/README.html");
> +		print "\n</div>\n"; # class="readme"
> +		close $fd;
>  	}
>  
>  	# we need to request one more than 16 (0..15) to check if

I'll queue it to 'pu', with the "close $fd" removed, for now.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files
  2008-12-03  3:55       ` Junio C Hamano
@ 2008-12-03 10:21         ` Jakub Narebski
  2008-12-03 21:14           ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2008-12-03 10:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > Use new insert_file() subroutine to insert HTML chunks from external
> > files: $site_header, $home_text (by default indextext.html),
> > $site_footer, and $projectroot/$project/REAME.html.
> >
> > All non-ASCII chars of those files will be broken by Perl IO layer
> > without decoding to utf8, so insert_file() does to_utf8() on each
> > printed line; alternate solution would be to open those files with
> > "binmode $fh, ':utf8'", or even all files with "use open qw(:std :utf8)".

> > This is more complete solution that the one provided by Tatsuki Sugiura
> > in original patch
> >
> >   [PATCH] gitweb: fix encode handling for site_{header,footer}
> >   Msg-Id: <87vdumbxgc.wl@vaj-k-334-sugi.local.valinux.co.jp>
> >   http://thread.gmane.org/gmane.comp.version-control.git/101199
> 
> It may be more complete but it is obviously untested.  Please help me
> trust you better with your future patches.  Because I personally do not
> run gitweb myself, I really need a trustworthy lieutenant(s) in the area.
> 
> [Wed Dec  3 01:52:07 2008] gitweb.perl: Global symbol "$fd" requires explicit package name at /git.git/t/../gitweb/gitweb.perl line 4500.
> [Wed Dec  3 01:52:07 2008] gitweb.perl: Execution of /git.git/t/../gitweb/gitweb.perl aborted due to compilation errors.
> 
> > but it is in principle the same solution.
> >
> > I think this one as it is a bugfix should go in git 1.6.1
> 
> Trading a gitweb with a small bug with a gitweb that does not even pass
> its test script does not feel like a good change to me.
> 
> I think the breakage is the "close $fd" at the end of this hunk:
[...]
> I'll queue it to 'pu', with the "close $fd" removed, for now.

I'm very sorry about that. It was a bit of time since my last patch
sent, and I forgot that nevermind how obvious and simple the change,
one should run relevant parts of test suite, or at least try to run
gitweb / changed command.

With "close $fd" removed the patch is correct (and patches t9500*).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files
  2008-12-03 10:21         ` Jakub Narebski
@ 2008-12-03 21:14           ` Junio C Hamano
  2008-12-08 13:13             ` [PATCH] gitweb: Fix bug in insert_file() subroutine Jakub Narebski
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2008-12-03 21:14 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

Jakub Narebski <jnareb@gmail.com> writes:

> With "close $fd" removed the patch is correct (and patches t9500*).

Yeah, it passes that test here, too.  Let's unleash it to 'master' and in
an unlikely case where it still has bugs we know which commit to revert
;-).

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] gitweb: Fix bug in insert_file() subroutine
  2008-12-03 21:14           ` Junio C Hamano
@ 2008-12-08 13:13             ` Jakub Narebski
  2008-12-08 17:05               ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2008-12-08 13:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

In insert_file() subroutine (which is used to insert HTML fragments as
custom header, footer, hometext (for projects list view), and per
project README.html (for summary view)) we used:

     map(to_utf8, <$fd>);

This doesn't work, and other form has to be used:

     map { to_utf8($_) } <$fd>;

Now with test for t9600 added, for $GIT_DIR/README.html.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>> 
>> With "close $fd" removed the patch is correct (and patches t9500*).
> 
> Yeah, it passes that test here, too.  Let's unleash it to 'master' and in
> an unlikely case where it still has bugs we know which commit to revert
> ;-).

I'm extremely sorry about that, but it still had bugs. I thought
I could use "map EXPR,LIST" form instead of "map BLOCK LIST" here,
i.e. "map(to_utf8, <$fd>)" instead of "map {to_utf8($_)} <$fd>;"
in original patch by Tatsuki Sugiura.

Just in case I have added check for this in t9500 gitweb test.

Once again, I'm very sorry for my buggy patches...

 gitweb/gitweb.perl                     |    2 +-
 t/t9500-gitweb-standalone-no-errors.sh |   10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 9517392..6eb370d 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2764,7 +2764,7 @@ sub insert_file {
 	my $filename = shift;
 
 	open my $fd, '<', $filename;
-	print map(to_utf8, <$fd>);
+	print map { to_utf8($_) } <$fd>;
 	close $fd;
 }
 
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 64c4cce..43cd6ee 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -673,4 +673,14 @@ test_expect_success \
 	 gitweb_run "p=.git;a=tree"'
 test_debug 'cat gitweb.log'
 
+# ----------------------------------------------------------------------
+# non-ASCII in README.html
+
+test_expect_success \
+	'README.html with non-ASCII characters (utf-8)' \
+	'echo "<b>UTF-8 example:</b><br />" > .git/README.html &&
+	 cat "$TEST_DIRECTORY"/t3900/1-UTF-8.txt >> .git/README.html &&
+	 gitweb_run "p=.git;a=summary"'
+test_debug 'cat gitweb.log'
+
 test_done

-- 
Stacked GIT 0.14.3
git version 1.6.0.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] gitweb: Fix bug in insert_file() subroutine
  2008-12-08 13:13             ` [PATCH] gitweb: Fix bug in insert_file() subroutine Jakub Narebski
@ 2008-12-08 17:05               ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-12-08 17:05 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Tatsuki Sugiura, Gerrit Pape, Recai Oktas

Jakub Narebski <jnareb@gmail.com> writes:

> +# ----------------------------------------------------------------------
> +# non-ASCII in README.html
> +
> +test_expect_success \
> +	'README.html with non-ASCII characters (utf-8)' \
> +	'echo "<b>UTF-8 example:</b><br />" > .git/README.html &&
> +	 cat "$TEST_DIRECTORY"/t3900/1-UTF-8.txt >> .git/README.html &&
> +	 gitweb_run "p=.git;a=summary"'
> +test_debug 'cat gitweb.log'
> +

Yup, a new test would certainly help to catch breakages ;-)

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-12-08 17:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-17  7:53 [PATCH] gitweb: fix encode handling for site_{header,footer} Tatsuki Sugiura
2008-11-17 10:28 ` Junio C Hamano
2008-11-17 10:40   ` Jakub Narebski
2008-12-01 18:01     ` [PATCH] gitweb: Fix handling of non-ASCII characters in inserted HTML files Jakub Narebski
2008-12-03  3:55       ` Junio C Hamano
2008-12-03 10:21         ` Jakub Narebski
2008-12-03 21:14           ` Junio C Hamano
2008-12-08 13:13             ` [PATCH] gitweb: Fix bug in insert_file() subroutine Jakub Narebski
2008-12-08 17:05               ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).