From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/3] diff: allow turning on textconv explicitly for plumbing
Date: Tue, 9 Dec 2008 00:48:24 -0500 [thread overview]
Message-ID: <20081209054824.GA2972@coredump.intra.peff.net> (raw)
In-Reply-To: <7v63lv842a.fsf@gitster.siamese.dyndns.org>
On Mon, Dec 08, 2008 at 12:27:25AM -0800, Junio C Hamano wrote:
> I imagine that web developers may want to use a textconv filter that
> replaces ">" with "\n>" on the HTML files their designer-colleagues throw
> in the source tree to make "git log -p" of the whole project easier to
Hrm, didn't you argue against textconv'ing non-binary contents earlier
in the cycle? At any rate, I agree that is a reasonable use. Maybe a
better name would have been "normalize" or "canonicalize".
> follow. When the developers would want to suggest improvements to what
> their designer-colleagues did, however, by running "git diff --stat -p" in
> their dirty work tree to produce a patch (like I just did just now,
> visible from the mnemonic prefixes below), they would want to disable
> textconv temporarily to get an appliable patch with --no-textconv option.
Yep, exactly. I use "git diff >patch" all the time instead of
format-patch.
> You raised an intriguing possibility to use textconv in blame. It would
> also be useful if we allowed "git show --textconv $blob" to pass the blob
> via textconv filter and any other transformation controlled by the
> attributes mechanism.. When "git show" sees the above command line, it
> only knows the blob object name and not the path, so we may need to allow
> a new option to tell the command to pretend as if the content came from a
> path, perhaps with a syntax like:
>
> $ git show --attribute-path=a/b/c $blob
> $ git show --attribute-path=a/b/c --textconv $blob
I think that makes sense. If you are going to implement a "use this as
the attribute path" feature, though, you might also want just "use these
attributes" which I can imagine being simpler in some cases. Like:
$ git show --attribute diff=whatever $blob
But on the subject of "other places to see textconv", another one I
considered was git log -S. I haven't looked to see if it even checks
binary files at all, but certainly if I was searching for some text
content I would want to find it in the text version (and depending on
the file format, there may be other binary cruft preventing you from
finding it in the binary version at all).
The series I posted was to show the text in gitk diffs. However, it does
nothing for finding text in gitk's "find commits introducing this
string", which I assume just uses git's pickaxe.
Unfortunately, it would probably be painfully slow. But still better
than nothing. One enhancement for textconv that I am considering is to
name the tempfiles based on the blob sha1. Then a smart helper could
cache expensive conversions if it wanted (and dumb helpers would
obviously just ignore the filename).
Side note: For one use case, storing word processor documents, this
approach seems sensible. But for my jpg example, it almost seems that
I could use git more efficiently by having two files: one with the exif
text, and one with the image content. And then I could "build" the
resulting tagged jpgs by combining the two, but operations on the text
would be very efficient. The downside, of course, being that the "build"
step is annoying, and consumes a ton of disk space. But it just occurred
to me as an alternate way to think about the use case.
> @@ -133,7 +133,8 @@ on.
> contents can be munged into human readable form and the difference
> between the results of the conversion can be viewed (obviously this
> cannot produce a patch that can be applied, so this is disabled in
> - format-patch among other things).
> + format-patch and plumbing, but if you really wanted to, you can enable
> + it by giving them --textconv command line option explicitly).
I wonder if we want to word this even more strongly: --textconv is there
and you can play with it if you want, but we are not promising that it
is part of the stable API yet.
-Peff
prev parent reply other threads:[~2008-12-09 5:50 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-08 2:57 [PATCH 2/3] diff: allow turning on textconv explicitly for plumbing Jeff King
2008-12-08 3:55 ` Junio C Hamano
2008-12-08 4:08 ` Junio C Hamano
2008-12-08 4:59 ` Jeff King
2008-12-08 6:52 ` Junio C Hamano
2008-12-08 7:14 ` Jeff King
2008-12-08 8:27 ` Junio C Hamano
2008-12-09 5:48 ` Jeff King [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081209054824.GA2972@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).