* [RFC/PATCH] gitweb: Try to sanitize mimetype for 'blob_plain' view
From: Jakub Narebski @ 2007-11-19 14:54 UTC
  To: git; +Cc: Jakub Narebski

Use 'text/plain' for files which are text and can be viewed in a
browser, and are not among a few 'text/*' mimetypes universally
recognized by web browsers.  This means files with 'text/*' which are
not text/html, text/css, text/sgml or text/xml, and files with
'application/x-*' mimetype which are nevertheless text: javascript,
shell, Perl, Tcl, (La)TeX,...

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Tired of my web browser (Mozilla) asking me what I want to do with a
shell script, Perl script or LaTeX document in the 'blob_plain' (raw)
view because of the declarations in the mimetypes file, I have added
mimetype sanitizing to gitweb.

It is an RFC partly because the list of mimetypes is a bit
arbitrary. Additionally, I guess that the mimetype sanitizing should be
separated into a subroutine.

But most of all because the proper solution is to create a mimetypes
file for use by gitweb.

 gitweb/gitweb.perl |   23 ++++++++++++++++++++++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 491a3f4..1cfe293 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2366,7 +2366,28 @@ sub blob_mimetype {
 
 	if ($filename) {
 		my $mime = mimetype_guess($filename);
-		$mime and return $mime;
+
+		if ($mime) {
+			# try to sanitize mimetype
+
+			# return text/plain on unknown text/* mimetype
+			if ($mime =~ m!^text/! &&
+			    $mime !~ m!^text/(?:html|css|sgml|xml)$!) {
+				return 'text/plain' .
+				       ($default_text_plain_charset ?
+				        '; charset='.$default_text_plain_charset : '');
+			}
+			# return text/plain for known programming languages and like
+			if ($mime =~ m!^application/x-(?:
+			                javascript|csshell|shell|csh|perl|
+			                sh|shar|tcl|latex|tex|texinfo)$!x) {
+				return 'text/plain' .
+				       ($default_text_plain_charset ?
+				        '; charset='.$default_text_plain_charset : '');
+			}
+			# otherwise return mimetype found in mimetypes file(s)
+			return $mime;
+		}
 	}
 
 	# just in case
-- 
1.5.3.5
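
The notes accompanying the patch suggest that the sanitizing could be
separated into a subroutine. A minimal sketch of what that might look
like follows. The names sanitize_mimetype and text_plain_type are
hypothetical and are not part of the patch or of gitweb, and
$default_text_plain_charset is set locally here only to stand in for
gitweb's setting. This sketch anchors the application/x-* alternation
with `$`, so that a type such as application/x-shockwave-flash is not
caught by the `sh` alternative.

```perl
use strict;
use warnings;

# Assumption: stands in for gitweb's $default_text_plain_charset setting.
our $default_text_plain_charset = 'utf-8';

# Hypothetical helper (not in the patch): build the text/plain value,
# appending the charset only when one is configured.
sub text_plain_type {
	return 'text/plain' .
	       ($default_text_plain_charset ?
	        '; charset=' . $default_text_plain_charset : '');
}

# Hypothetical subroutine holding the sanitizing rules from the patch.
sub sanitize_mimetype {
	my $mime = shift;
	return $mime unless $mime;

	# unknown text/* mimetype: fall back to text/plain
	return text_plain_type()
		if $mime =~ m!^text/! &&
		   $mime !~ m!^text/(?:html|css|sgml|xml)$!;

	# textual application/x-* types (programming languages and the like)
	return text_plain_type()
		if $mime =~ m!^application/x-(?:
			javascript|csshell|shell|csh|perl|
			sh|shar|tcl|latex|tex|texinfo)$!x;

	# otherwise keep the mimetype found in the mimetypes file(s)
	return $mime;
}

print sanitize_mimetype('application/x-sh'), "\n";
print sanitize_mimetype('text/html'), "\n";
print sanitize_mimetype('application/x-shockwave-flash'), "\n";
```

With such a helper, blob_mimetype could simply return
sanitize_mimetype($mime) when mimetype_guess finds a match, and the
duplicated text/plain construction would live in one place.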


* Re: [RFC/PATCH] gitweb: Try to sanitize mimetype for 'blob_plain' view
From: Junio C Hamano @ 2007-11-20  8:07 UTC
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Use 'text/plain' for files which are text and can be viewed in a
> browser, and are not among a few 'text/*' mimetypes universally
> recognized by web browsers.  This means files with 'text/*' which are
> not text/html, text/css, text/sgml or text/xml, and files with
> 'application/x-*' mimetype which are nevertheless text: javascript,
> shell, Perl, Tcl, (La)TeX,...
>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> ---
> Tired of my web browser (Mozilla) asking me what I want to do with
> shell script, Perl script or LaTeX document when using 'blob_plain'
> (raw) view,...

I admit that I share the irritation, but I've felt that solving
it this way by discarding information is going backwards, and
the kosher solution is to tell the browser what to do with these
unknown mimetypes.  Unfortunately this needs to be done by the
user -- I do not think the server can.

On the other hand, maybe the people who are browsing the plain
view from the browser do not need to have the content marked as
written in what programming language.  I dunno.


* Re: [RFC/PATCH] gitweb: Try to sanitize mimetype for 'blob_plain' view
From: Jakub Narebski @ 2007-11-20 10:58 UTC
  To: Wincent Colaiuta; +Cc: Junio C Hamano, git

Wincent Colaiuta wrote:

> So yes, this is discarding information -- in a sense it is actually  
> removing *correct* configuration from the server side to work around  
> undesired behaviour on the client side -- but it gave the behaviour I  
> wanted. So I think this patch is actually a good idea, because the  
> behaviour (the user experience) is more important than adhering to a  
> standard just because it's a standard.

Modifying gitweb to work around an unconfigured web browser and a
mime.types file not written with gitweb in mind, as this RFC patch
does, is one way of solving this. Is it a good way? That is why it is
an RFC... well, that and the details of the sanitization.

Another would be to use a mime.types file crafted specially for gitweb
and point $mimetypes_file at it (gitweb falls back to /etc/mime.types
if $mimetypes_file is not defined or the file is not present). This
might be a better solution.
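
To illustrate the gitweb-specific mime.types idea, here is a rough
sketch. The file content, the extension list and the subroutine names
parse_mime_types and guess_mimetype are all made up for the example;
this is not gitweb's own lookup code. It only shows how a file in the
usual "type, then extensions" layout could map source-like files to
text/plain so that browsers display them inline.

```perl
use strict;
use warnings;

# Hypothetical gitweb-specific mime.types content (illustrative only).
my $gitweb_mime_types = <<'EOT';
# mimetype        extensions
text/plain        sh csh pl pm tcl tex latex texinfo js
text/html         html htm
image/png         png
EOT

# Parse mime.types-style text into an extension => mimetype map.
sub parse_mime_types {
	my $text = shift;
	my %map;
	foreach my $line (split /\n/, $text) {
		next if $line =~ /^\s*(?:#|$)/;        # skip comments and blank lines
		my ($mime, @exts) = split ' ', $line;  # type, then its extensions
		$map{lc $_} = $mime foreach @exts;
	}
	return \%map;
}

# Look up a filename by its last extension; undef when unknown.
sub guess_mimetype {
	my ($map, $filename) = @_;
	return undef unless $filename =~ /\.([^.\/]+)$/;
	return $map->{lc $1};
}

my $map  = parse_mime_types($gitweb_mime_types);
my $type = guess_mimetype($map, 'configure.sh');
print defined $type ? $type : 'unknown', "\n";
```

The trade-off is the one discussed in this thread: the server no longer
says which language a file is written in, but the raw view opens in the
browser without extra configuration on the client side.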


QUESTION: Currently gitweb checks whether $mimetypes_file is a relative
path (using the nonportable m!^/! instead of
File::Spec->file_name_is_absolute), and if it is, the path is taken as
relative to the project. Does anyone use this feature to provide
per-project mimetypes? Would it perhaps be better to use a relative
path as-is (i.e. relative to the gitweb script), and to check a
gitweb.mimetypes repo configuration variable for a per-repo file, taken
as relative to the project if that path is relative?
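
For reference, the portable check mentioned in the question is a class
method of the core File::Spec module, called as
File::Spec->file_name_is_absolute($path). A small sketch of resolving a
configured path against a base directory might look like this; the
subroutine name resolve_mimetypes_file and the example paths are
hypothetical, and which base directory to use (project or gitweb
script) is exactly the open question above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: use an absolute path as-is, otherwise resolve
# the configured file against the given base directory.
sub resolve_mimetypes_file {
	my ($file, $base_dir) = @_;
	return $file if File::Spec->file_name_is_absolute($file);
	return File::Spec->catfile($base_dir, $file);
}

# Example paths are made up for illustration.
print resolve_mimetypes_file('/etc/mime.types', '/srv/git/project.git'), "\n";
print resolve_mimetypes_file('mime.types', '/srv/git/project.git'), "\n";
```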

-- 
Jakub Narebski
Poland
