Git development
From: Junio C Hamano <junkio@cox.net>
To: ltuikov@yahoo.com
Cc: git@vger.kernel.org
Subject: Re: [PATCH] gitweb.cgi: Use File::MMagic; "a=blob" action knows the blob/file type
Date: Fri, 07 Jul 2006 23:18:37 -0700
Message-ID: <7vzmfksjpe.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <20060708041021.24704.qmail@web31804.mail.mud.yahoo.com>

Luben Tuikov <ltuikov@yahoo.com> writes:

> Use File::MMagic to determine the MIME type of a blob/file.
> The variable magic_mime_file holds the location of the
> "magic.mime" file, usually "/usr/share/file/magic.mime".
> If not defined, the magic numbers internally stored in the
> File::MMagic module are used.

I am sorry to ask you this, but would you mind redoing this
patch without File::MMagic bits?  I think giving "a=blob" an
ability to automatically switch to git_blob_plain is a good
addition (as is your earlier patch to give a direct link to
reach blob_plain from the list), so let's have that part in
first.  I haven't applied your earlier one but it will appear in
"next" shortly.

The existing filename-based mimetype_guess should be a lot
cheaper than exploding a blob and feeding it to File::MMagic.  I
was hoping File::MMagic would be used only when we cannot guess
the content type that way (i.e. when mimetype_guess returns undef
or application/octet-stream).
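
A minimal sketch of that ordering, written in Python rather than
gitweb's Perl.  Every name here (BY_EXTENSION, guess_by_name,
blob_mimetype, and the read_blob and sniff callbacks) is
hypothetical and only illustrates the control flow: the blob is
read and sniffed only when the name-based guess is missing or is
application/octet-stream.

```python
# Hypothetical extension table standing in for /etc/mime.types.
BY_EXTENSION = {
    "html": "text/html",
    "png": "image/png",
    "dat": "application/octet-stream",
}


def guess_by_name(file_name):
    """Cheap guess from the file name alone; None when unknown."""
    base = file_name.rsplit("/", 1)[-1]
    if "." not in base:
        return None
    return BY_EXTENSION.get(base.rsplit(".", 1)[1].lower())


def blob_mimetype(file_name, read_blob, sniff):
    """Try the name first; read and sniff the blob only as a fallback.

    read_blob() returns blob bytes and sniff(data) returns a MIME type
    or None; both are caller-supplied stand-ins for the expensive path.
    """
    mime = guess_by_name(file_name) if file_name else None
    if mime is None or mime == "application/octet-stream":
        mime = sniff(read_blob()) or mime
    return mime or "application/octet-stream"
```

With this shape, a request for index.html never touches the blob,
while a .dat file or a request that carries only a hash falls
through to the content-based check.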

The repository owner can already correct misidentification by
the standard /etc/mime.types by supplying a custom per-repository
$mimetypes_file (modulo that the current implementation of
mimetype_guess_file cannot help if the file does not have an
extension that is specific enough), so File::MMagic might be
overkill, especially if used the way this patch uses it.

To allow finer-grained differentiation that cannot be done with
file extensions alone (e.g. two files may both have a .dat
extension, but one is a VCD MPEG wrapped in RIFF and the other is
a Z-machine story file), it might be simpler to let the
repository owner list the full $file_name of such an ambiguous
file in their custom $mimetypes_file, and to try matching it in
the mimetype_guess_file sub.  That way we may not even need
File::MMagic.
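
One way the full-name match could look, again as a Python sketch
and not the actual mimetype_guess_file.  It assumes the usual
mime.types layout of a type followed by whitespace-separated
keys, and assumes (hypothetically) that a key may be a full path
as well as an extension.  The function names, sample paths and
the application/x-zmachine type are illustrative only.

```python
def parse_mime_types(text):
    """Map each key (an extension or a full file name) to its MIME type."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        mime, *keys = line.split()
        for key in keys:
            table[key] = mime
    return table


def guess_from_table(file_name, table):
    """Prefer an exact full-name entry, then fall back to the extension."""
    if file_name in table:
        return table[file_name]
    base = file_name.rsplit("/", 1)[-1]
    if "." in base:
        return table.get(base.rsplit(".", 1)[1])
    return None


CUSTOM = """\
application/octet-stream  dat
video/mpeg                media/clip.dat
application/x-zmachine    games/story.dat
"""
table = parse_mime_types(CUSTOM)
video = guess_from_table("media/clip.dat", table)   # full-name entry wins
generic = guess_from_table("other/x.dat", table)    # extension fallback
```

A file whose whole name equals some extension (say a top-level
file called "dat") would collide with the extension entry in this
sketch; a real implementation would need to keep the two kinds of
key apart.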

Are there cases where only $hash is given without $file_name?
If so, we may need to fall back on File::MMagic in such a case
after all.  But the get_blob_mimetype sub copies the whole blob
to a temporary file to work around a problem with version 1.27
that you mention in the comment -- this is way too much (and
nobody seems to clean up the tempfile).  Looking at magic.mime, I
suspect we might be able to get away with the first 4k bytes or
so at most (apart from the iso9660 image entries, the largest
offset is "Biff5" at 2114, which signals an Excel spreadsheet).
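
To illustrate the "first 4k bytes" idea without a temporary file,
here is a Python sketch that reads a bounded prefix from any
binary stream and checks it against a tiny signature list.  The
names SNIFF_BYTES, MAGIC, read_head and sniff are hypothetical,
and the three signatures are a small stand-in for magic.mime, not
a copy of it.

```python
import io

SNIFF_BYTES = 4096  # assumed cap, following the "first 4k bytes" estimate

# (offset, signature, MIME type): a tiny illustrative subset.
MAGIC = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"GIF8", "image/gif"),
    (0, b"%PDF-", "application/pdf"),
]


def read_head(stream, limit=SNIFF_BYTES):
    """Read at most `limit` bytes, tolerating short reads from a pipe."""
    chunks = []
    remaining = limit
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def sniff(head):
    """Return the MIME type of the first matching signature, else None."""
    for offset, signature, mime in MAGIC:
        if head[offset:offset + len(signature)] == signature:
            return mime
    return None


# A large in-memory "blob"; only its first 4096 bytes are consumed.
blob = io.BytesIO(b"\x89PNG\r\n\x1a\n" + b"\0" * 100000)
head = read_head(blob)
kind = sniff(head)
```

The stream could equally be the output pipe of a process such as
"git cat-file blob <hash>"; because nothing is written to disk,
there is no tempfile left to clean up.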
