git.vger.kernel.org archive mirror
From: Enrico Weigelt <enrico.weigelt@vnc.biz>
To: Thomas Rast <trast@student.ethz.ch>
Cc: git list <git@vger.kernel.org>
Subject: Re: filter-branch IO optimization
Date: Fri, 12 Oct 2012 19:20:23 +0200 (CEST)
Message-ID: <b94baafd-3813-49c6-9848-97bf11960bb9@zcs>
In-Reply-To: <d4a00074-5134-4314-aa61-f222f41712bb@zcs>

Hi folks,

I have finally got the index-filter part working.
The main problem, IIRC, was that git-update-index didn't
automatically create an empty index, so I had to copy one in
explicitly (I created it by hand from an empty repo).
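
An alternative to keeping a prepared file around might be to let git write the empty index itself. A minimal sketch, assuming a Git version that has `git read-tree --empty` and that the command runs inside a repository; the helper name `make_empty_index` is made up for illustration:

```shell
# Hypothetical helper: write an empty index file to the path given as $1.
# Assumes the current directory is inside a Git repository and that
# this Git version supports `read-tree --empty`.
make_empty_index() {
	GIT_INDEX_FILE="$1" git read-tree --empty
}
```

Inside the filter this would be called as `make_empty_index "$GIT_INDEX_FILE.new"` in place of the `cp`. I have not measured whether it is faster or slower than copying a file.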

My current filter code is:

# Fill in a missing author/committer email from the other one,
# or use a placeholder if both are missing.
if [ ! "$GIT_AUTHOR_EMAIL" ] && [ ! "$GIT_COMMITTER_EMAIL" ]; then
	export GIT_AUTHOR_EMAIL="nobody@none.org"
	export GIT_COMMITTER_EMAIL="nobody@none.org"
elif [ ! "$GIT_AUTHOR_EMAIL" ]; then
	export GIT_AUTHOR_EMAIL="$GIT_COMMITTER_EMAIL"
elif [ ! "$GIT_COMMITTER_EMAIL" ]; then
	export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
fi

# Same fallback for the author/committer name.
if [ ! "$GIT_AUTHOR_NAME" ] && [ ! "$GIT_COMMITTER_NAME" ]; then
	export GIT_AUTHOR_NAME="nobody@none.org"
	export GIT_COMMITTER_NAME="nobody@none.org"
elif [ ! "$GIT_AUTHOR_NAME" ]; then
	export GIT_AUTHOR_NAME="$GIT_COMMITTER_NAME"
elif [ ! "$GIT_COMMITTER_NAME" ]; then
	export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
fi
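
Since the two blocks above apply the same rule to the email pair and the name pair, the fallback could be folded into one helper so the variable names are spelled only once. A sketch in plain POSIX sh; the function name `fill_pair` is hypothetical:

```shell
# Hypothetical helper. $1 and $2 are variable NAMES, $3 is a placeholder.
# If both variables are empty, both get the placeholder; if only one is
# empty, it gets the other's value. Both are exported afterwards.
fill_pair() {
	eval "_a=\${$1-} _b=\${$2-}"
	if [ -z "$_a" ] && [ -z "$_b" ]; then
		_a=$3 _b=$3
	elif [ -z "$_a" ]; then
		_a=$_b
	elif [ -z "$_b" ]; then
		_b=$_a
	fi
	export "$1=$_a" "$2=$_b"
}

fill_pair GIT_AUTHOR_EMAIL GIT_COMMITTER_EMAIL "nobody@none.org"
fill_pair GIT_AUTHOR_NAME  GIT_COMMITTER_NAME  "nobody@none.org"
```

The `eval` is only there because POSIX sh has no variable indirection; the names passed in are fixed strings, not user input.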

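# Start from a pre-built empty index file.
cp ../../../../scripts/index.empty "$GIT_INDEX_FILE.new"

# Prefix every path with addons/, keep only this module's entries,
# and load them into the new index.
# (grep does not portably treat \t as a tab, so match on the path prefix only.)
git ls-files -s |
    sed "s-\t\"*-&addons/-" |
    grep -e "addons/$module" |
    ( export GIT_INDEX_FILE="$GIT_INDEX_FILE.new" ; git update-index --index-info )

mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"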
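
To show what the sed/grep stage does to the `git ls-files -s` output, here is a small stand-alone sketch that can be fed sample lines without a repository. The function name `rewrite_paths` is made up, it uses a literal tab instead of `\t` so it does not depend on GNU sed, and the trailing `/` after the module name is my own tightening so that a module `crm` does not also match `crm_foo`:

```shell
# Hypothetical helper: read `git ls-files -s` style lines on stdin
# (mode, blob id, stage, TAB, path), move every path under addons/,
# and keep only entries below addons/<module>/. $1 is the module name.
rewrite_paths() {
	tab=$(printf '\t')
	sed "s-${tab}\"*-&addons/-" |
	grep -e "${tab}\"*addons/$1/"
}

# Made-up sample input; the blob ids are placeholders, not real objects.
printf '100644 %s 0\t%s\n' \
	1111111111111111111111111111111111111111 'crm/__init__.py' \
	2222222222222222222222222222222222222222 'sale/sale.py' |
rewrite_paths crm
```

With this input only the `crm/__init__.py` entry survives, and its path becomes `addons/crm/__init__.py`.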


Now another problem: this leaves behind thousands of now-empty
merge nodes (--prune-empty doesn't seem to catch them all),
so I loop through additional `git filter-branch --prune-empty`
runs until the ref remains unchanged.
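
The loop is roughly the following. This is a sketch rather than my exact script: the helper name `repeat_until_stable` and the branch name `master` in the usage comment are assumptions.

```shell
# Hypothetical helper: run a step command repeatedly until a probe command
# prints the same value before and after a step.
# $1 = probe command, $2 = step command (both are eval'ed as shell code).
repeat_until_stable() {
	prev=
	cur=$(eval "$1")
	while [ "$cur" != "$prev" ]; do
		prev=$cur
		eval "$2"
		cur=$(eval "$1")
	done
}

# Intended use (branch name is an assumption; -f lets filter-branch
# overwrite the refs/original backup left by the previous pass):
# repeat_until_stable 'git rev-parse master' \
#	'git filter-branch -f --prune-empty master'
```

Note that the last pass is always a wasted one, since it is only there to confirm that nothing changed.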

This process is even more time-consuming, as it takes a great many
passes (I haven't counted them yet).

Does anyone have an idea why a single run doesn't catch them all?


cu
