git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Michael J Gruber <git@drmicha.warpmail.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 0/3] fast textconv
Date: Sun, 28 Mar 2010 12:56:47 -0400
Message-ID: <20100328165646.GA10293@coredump.intra.peff.net>
In-Reply-To: <20100328161921.GA3435@coredump.intra.peff.net>

On Sun, Mar 28, 2010 at 12:19:21PM -0400, Jeff King wrote:

> On Sun, Mar 28, 2010 at 12:17:28PM -0400, Jeff King wrote:
> 
> > If I understand you right, you are proposing a separate program
> > that you would pass as a fasttextconv helper, and that would look in a
> > notes tree. So you would still have a per-diff fork/exec, and pipe all
> > the data.
> > 
> > I was thinking of actually doing it in-core, so cache hits would be as
> > lightweight as a notes lookup (and cache misses obviously would still
> > fork/exec a helper, but we don't care too much since the helper's time
> > to convert will dominate in that path).
> 
> Side note: I think I might prototype it as a separate program and see
> what kind of speed I can get.

Better, but not perfect. My script is below. I get:

  $ time git show >/dev/null
  real    0m1.036s
  user    0m0.412s
  sys     0m0.672s

which is still a 2.5x speedup (versus my other fast-textconv solution
earlier in the thread), but I suspect we can do better. The notes
mechanism does some up-front work to get very fast lookups, but because
we invoke git-notes repeatedly, we never get the amortized benefit of
that up-front work.  Doing it in-core would fix that.
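To illustrate the amortization point, here is a rough sketch (not the in-core implementation) in which two processes serve every cache entry, rather than one "git notes show" per object. The throwaway repository, the ref name refs/notes/textconv/demo, and the blob contents are all made up for the demonstration.

```shell
#!/bin/sh
# Sketch: pay the notes setup cost once for many lookups.
# Everything here lives in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

ref=refs/notes/textconv/demo   # hypothetical cache ref

# Seed the cache with three entries keyed by blob object name.
for i in 1 2 3; do
    sha1=$(echo "blob $i" | git hash-object -w --stdin)
    echo "text $i" | git notes --ref="$ref" add -f -F - "$sha1"
done

# One "git notes list" plus one "git cat-file --batch" fetch every
# cached text, instead of forking "git notes show" once per object.
count=$(git notes --ref="$ref" list | cut -d' ' -f1 |
    git cat-file --batch | grep -c '^text ')
echo "entries fetched: $count"
```

This only shows the shape of the saving; a real in-core lookup would avoid the extra processes entirely.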

My script was:

-- >8 --
#!/bin/sh

type=$1; shift
program=$1; shift
sha1=$1; shift
filename=$1; shift

GIT_NOTES_REF=refs/notes/textconv/$type; export GIT_NOTES_REF

# try the cache
git notes show "$sha1" 2>/dev/null && exit 0

# otherwise, insert the cache entry.
# We can be as slow as we like.
ext=`echo "$filename" | sed 's/.*\.//'`
tmp=`mktemp "notes-textconv-XXXXXX.$ext"` || exit 1
git cat-file blob "$sha1" >"$tmp"
# $program is deliberately unquoted so it can carry its own arguments
$program "$tmp" | git notes add -f -F - "$sha1"
git notes show "$sha1"
rm -f "$tmp"
-- 8< --
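The miss-then-hit behavior the script relies on can be checked in isolation. The following is a minimal sketch in a scratch repository; the ref name refs/notes/textconv/demo and the blob and note contents are placeholders, and a plain "echo" stands in for the real conversion program.

```shell
#!/bin/sh
# Sketch: exercise the notes-based cache in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

GIT_NOTES_REF=refs/notes/textconv/demo; export GIT_NOTES_REF

# Store a blob; its object name is the cache key.
sha1=$(echo "binary-ish content" | git hash-object -w --stdin)

# First lookup: no note exists yet, so "git notes show" fails.
if git notes show "$sha1" 2>/dev/null; then
    miss=no
else
    miss=yes
fi

# Populate the cache (echo stands in for the conversion helper).
echo "converted text" | git notes add -f -F - "$sha1"

# Second lookup: the note is found and printed.
hit=$(git notes show "$sha1")
echo "miss=$miss hit=$hit"
```

Notes can annotate any object, not just commits, which is why keying the cache on the blob's object name works.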

and my config is:

  $ git config diff.mfo.textconv "/home/peff/fast mfo mfo-tags"

where "mfo-tags" is the program to display metadata and "mfo" is a
user-selected shorthand name for it.
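For the "mfo" driver to be used at all, paths have to be mapped to it with the diff attribute. A small sketch, assuming the files carry a ".mfo" extension (the extension and the file name example.mfo are guesses for illustration, not something stated above):

```shell
#!/bin/sh
# Sketch: map a hypothetical *.mfo extension to the "mfo" diff driver
# and confirm the attribute resolves, in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo '*.mfo diff=mfo' >.gitattributes

# check-attr reports which diff driver a path would use; the path
# does not need to exist.
attr=$(git check-attr diff -- example.mfo)
echo "$attr"
```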

-Peff


Thread overview: 19+ messages
2010-03-28 14:53 [PATCH 0/3] fast textconv Jeff King
2010-03-28 14:53 ` [PATCH 1/3] textconv: refactor calls to run_textconv Jeff King
2010-03-28 14:53 ` [PATCH 2/3] textconv: refactor to handle multiple textconv types Jeff King
2010-03-28 14:54 ` [PATCH 3/3] diff: add "fasttextconv" config option Jeff King
2010-03-28 18:23   ` Johannes Sixt
2010-03-30 16:30     ` Jeff King
2010-03-30 17:36       ` [PATCH] diff: fix textconv error zombies Johannes Sixt
2010-03-30 21:46         ` Junio C Hamano
2010-03-30 22:17           ` Johannes Sixt
2010-03-30 22:56             ` Jeff King
2010-03-28 16:09 ` [PATCH 0/3] fast textconv Michael J Gruber
2010-03-28 16:17   ` Jeff King
2010-03-28 16:19     ` Jeff King
2010-03-28 16:56       ` Jeff King [this message]
2010-03-28 17:34         ` Jeff King
2010-03-28 18:13           ` Sverre Rabbelier
2010-03-30 16:04             ` Jeff King
2010-03-30  3:52 ` Junio C Hamano
2010-03-30 17:07   ` Jeff King
