* [PATCH] Add git-filter-branch
@ 2007-06-03 0:31 Johannes Schindelin
2007-06-03 0:46 ` Jakub Narebski
2007-06-04 7:18 ` [PATCH] Add git-filter-branch Johannes Sixt
0 siblings, 2 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-03 0:31 UTC
To: git, junkio, pasky
This script is derived from Pasky's cg-admin-rewritehist.
In fact, it _is_ the same script, minimally adapted to work without cogito.
It _should_ be able to perform the same tasks, even if only relying on
core-git programs.
All the work is Pasky's, just the adaption is mine.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Hopefully-signed-off-by: Petr "cogito master" Baudis <pasky@suse.cz>
---
I will not have time to work on this for at least 24 hours. So, if
people want to go wild with enhancing the test case (and fixing the
script), go wild!
IMHO this should go into core-git, as one of the many, many, many
enhancements that cogito brought to us.
So really, this is a way of thanks to Pasky, rather than just
saying good-bye to cogito.
Thanks, Pasky.
Makefile | 3 +-
git-filter-branch.sh | 430 ++++++++++++++++++++++++++++++++++++++++++++++
t/t7003-filter-branch.sh | 47 +++++
3 files changed, 479 insertions(+), 1 deletions(-)
create mode 100644 git-filter-branch.sh
create mode 100755 t/t7003-filter-branch.sh
diff --git a/Makefile b/Makefile
index 7ecd8f0..da271ec 100644
--- a/Makefile
+++ b/Makefile
@@ -213,7 +213,8 @@ SCRIPT_SH = \
git-am.sh \
git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
git-merge-resolve.sh git-merge-ours.sh \
- git-lost-found.sh git-quiltimport.sh git-submodule.sh
+ git-lost-found.sh git-quiltimport.sh git-submodule.sh \
+ git-filter-branch.sh
SCRIPT_PERL = \
git-add--interactive.perl \
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
new file mode 100644
index 0000000..c0a7680
--- /dev/null
+++ b/git-filter-branch.sh
@@ -0,0 +1,430 @@
+#!/bin/sh
+#
+# Rewrite revision history
+# Copyright (c) Petr Baudis, 2006
+# Minimal changes to "port" it to core-git (c) Johannes Schindelin, 2007
+#
+# Lets you rewrite GIT revision history by creating a new branch from
+# your current branch by applying custom filters on each revision.
+# Those filters can modify each tree (e.g. removing a file or running
+# a perl rewrite on all files) or information about each commit.
+# Otherwise, all information (including original commit times or merge
+# information) will be preserved.
+#
+# The command takes the new branch name as a mandatory argument and
+# the filters as optional arguments. If you specify no filters, the
+# commits will be recommitted without any changes, which would normally
+# have no effect and result with the new branch pointing to the same
+# branch as your current branch. (Nevertheless, this may be useful in
+# the future for compensating for some Git bugs or such, therefore
+# such a usage is permitted.)
+#
+# WARNING! The rewritten history will have different ids for all the
+# objects and will not converge with the original branch. You will not
+# be able to easily push and distribute the rewritten branch. Please do
+# not use this command if you do not know the full implications, and
+# avoid using it anyway - do not do what a simple single commit on top
+# of the current version would fix.
+#
+# Always verify that the rewritten version is correct before disposing
+# the original branch.
+#
+# Note that since this operation is extensively I/O expensive, it might
+# be a good idea to do it off-disk, e.g. on tmpfs. Reportedly the speedup
+# is very noticeable.
+#
+# OPTIONS
+# -------
+# -d TEMPDIR:: The path to the temporary tree used for rewriting
+# When applying a tree filter, the command needs to temporarily
+# check out the tree to some directory, which may consume
+# considerable space in case of large projects. By default it
+# does this in the '.git-rewrite/' directory but you can override
+# that choice by this parameter.
+#
+# -r STARTREV:: The commit id to start the rewrite at
+# Normally, the command will rewrite the entire history. If you
+# pass this argument, though, this will be the first commit it
+# will rewrite and keep the previous commits intact.
+#
+# -k KEEPREV:: A commit id until which _not_ to rewrite history
+# If you pass this argument, this commit and all of its
+# predecessors are kept intact.
+#
+# Filters
+# ~~~~~~~
+# The filters are applied in the order as listed below. The COMMAND
+# argument is always evaluated in shell using the 'eval' command.
+# The $GIT_COMMIT environment variable is permanently set to contain
+# the id of the commit being rewritten. The author/committer environment
+# variables are set before the first filter is run.
+#
+# A 'map' function is available that takes an "original sha1 id" argument
+# and outputs a "rewritten sha1 id" if the commit has been already
+# rewritten, fails otherwise; the 'map' function can return several
+# ids on separate lines if your commit filter emitted multiple commits
+# (see below).
+#
+# --env-filter COMMAND:: The filter for modifying environment
+# This is the filter for modifying the environment in which
+# the commit will be performed. Specifically, you might want
+# to rewrite the author/committer name/email/time environment
+# variables (see `git-commit` for details). Do not forget to
+# re-export the variables.
+#
+# --tree-filter COMMAND:: The filter for rewriting tree (and its contents)
+# This is the filter for rewriting the tree and its contents.
+# The COMMAND argument is evaluated in shell with the working
+# directory set to the root of the checked out tree. The new tree
+# is then used as-is (new files are auto-added, disappeared files
+# are auto-removed - .gitignore files nor any other ignore rules
+# HAVE NO EFFECT!).
+#
+# --index-filter COMMAND:: The filter for rewriting index
+# This is the filter for rewriting Git's directory index.
+# It is similar to the tree filter but does not check out the
+# tree, which makes it much faster. However, you must use the
+# lowlevel Git index manipulation commands to do your work.
+#
+# --parent-filter COMMAND:: The filter for rewriting parents
+# This is the filter for rewriting the commit's parent list.
+# It will receive the parent string on stdin and shall output
+# the new parent string on stdout. The parent string is in
+# format accepted by `git-commit-tree`: empty for initial
+# commit, "-p parent" for a normal commit and "-p parent1
+# -p parent2 -p parent3 ..." for a merge commit.
+#
+# --msg-filter COMMAND:: The filter for rewriting commit message
+# This is the filter for rewriting the commit messages.
+# The COMMAND argument is evaluated in shell with the original
+# commit message on standard input; its standard output
+# is used as the new commit message.
+#
+# --commit-filter COMMAND:: The filter for performing the commit
+# If this filter is passed, it will be called instead of the
+# `git-commit-tree` command, with those arguments:
+#
+# TREE_ID [-p PARENT_COMMIT_ID]...
+#
+# and the log message on stdin. The commit id is expected on
+# stdout. As a special extension, the commit filter may emit
+# multiple commit ids; in that case, all of them will be used
+# as parents instead of the original commit in further commits.
+#
+# --tag-name-filter COMMAND:: The filter for rewriting tag names.
+# If this filter is passed, it will be called for every tag ref
+# that points to a rewritten object (or to a tag object which
+# points to a rewritten object). The original tag name is passed
+# via standard input, and the new tag name is expected on standard
+# output.
+#
+# The original tags are not deleted, but can be overwritten;
+# use "--tag-name-filter=cat" to simply update the tags. In this
+# case, be very careful and make sure you have the old tags
+# backed up in case the conversion has run afoul.
+#
+# Note that there is currently no support for proper rewriting of
+# tag objects; in layman's terms, if the tag has a message or signature
+# attached, the rewritten tag won't have it. Sorry. (It is by
+# definition impossible to preserve signatures at any rate, though.)
+#
+# EXAMPLE USAGE
+# -------------
+# Suppose you want to remove a file (containing confidential information
+# or copyright violation) from all commits:
+#
+# git-filter-branch --tree-filter 'rm filename' newbranch
+#
+# A significantly faster version:
+#
+# git-filter-branch --index-filter 'git-update-index --remove filename' newbranch
+#
+# Now, you will get the rewritten history saved in the branch 'newbranch'
+# (your current branch is left untouched).
+#
+# To "etch-graft" a commit to the revision history (set a commit to be
+# the parent of the current initial commit and propagate that):
+#
+# git-filter-branch --parent-filter sed\ 's/^$/-p graftcommitid/' newbranch
+#
+# (if the parent string is empty - therefore we are dealing with the
+# initial commit - add graftcommit as a parent). Note that this assumes
+# history with a single root (that is, no git-merge without common ancestors
+# happened). If this is not the case, use:
+#
+# git-filter-branch --parent-filter 'cat; [ "$GIT_COMMIT" = "COMMIT" ] && echo "-p GRAFTCOMMIT"' newbranch
+#
+# To remove commits authored by "Darl McBribe" from the history:
+#
+# git-filter-branch --commit-filter 'if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ]; then shift; while [ -n "$1" ]; do shift; echo "$1"; shift; done; else git-commit-tree "$@"; fi' newbranch
+#
+# (the shift magic first throws away the tree id and then the -p
+# parameters). Note that this handles merges properly! In case Darl
+# committed a merge between P1 and P2, it will be propagated properly
+# and all children of the merge will become merge commits with P1,P2
+# as their parents instead of the merge commit.
+#
+# To restrict rewriting to only part of the history, use -r or -k or both.
+# Consider this history:
+#
+# D--E--F--G--H
+# / /
+# A--B-----C
+#
+# To rewrite only commits F,G,H, use:
+#
+# git-filter-branch -r F ...
+#
+# To rewrite commits E,F,G,H, use one of these:
+#
+# git-filter-branch -r E -k C ...
+# git-filter-branch -k D -k C ...
+
+# Testsuite: TODO
+
+set -e
+
+USAGE="git-filter-branch [-d TEMPDIR] [-r STARTREV]... [-k KEEPREV]... [-s SRCBRANCH] [FILTERS] DESTBRANCH"
+. git-sh-setup
+
+map()
+{
+ [ -r "$workdir/../map/$1" ] || return 1
+ cat "$workdir/../map/$1"
+}
+
+# When piped a commit, output a script to set the ident of either
+# "author" or "committer"
+
+set_ident () {
+ lid="$(echo "$1" | tr "A-Z" "a-z")"
+ uid="$(echo "$1" | tr "a-z" "A-Z")"
+ pick_id_script='
+ /^'$lid' /{
+ s/'\''/'\''\\'\'\''/g
+ h
+ s/^'$lid' \([^<]*\) <[^>]*> .*$/\1/
+ s/'\''/'\''\'\'\''/g
+ s/.*/export GIT_'$uid'_NAME='\''&'\''/p
+
+ g
+ s/^'$lid' [^<]* <\([^>]*\)> .*$/\1/
+ s/'\''/'\''\'\'\''/g
+ s/.*/export GIT_'$uid'_EMAIL='\''&'\''/p
+
+ g
+ s/^'$lid' [^<]* <[^>]*> \(.*\)$/\1/
+ s/'\''/'\''\'\'\''/g
+ s/.*/export GIT_'$uid'_DATE='\''&'\''/p
+
+ q
+ }
+ '
+
+ LANG=C LC_ALL=C sed -ne "$pick_id_script"
+ # Ensure non-empty id name.
+ echo "[ -n \"\$GIT_${uid}_NAME\" ] || export GIT_${uid}_NAME=\"\${GIT_${uid}_EMAIL%%@*}\""
+}
+
+# list all parent's object names for a given commit
+get_parents () {
+ git-rev-list -1 --parents "$1" | sed "s/^[0-9a-f]*//"
+}
+
+tempdir=.git-rewrite
+unchanged=" "
+filter_env=
+filter_tree=
+filter_index=
+filter_parent=
+filter_msg=cat
+filter_commit='git-commit-tree "$@"'
+filter_tag_name=
+srcbranch=HEAD
+while case "$#" in 0) usage;; esac
+do
+ case "$1" in
+ --)
+ shift
+ break
+ ;;
+ -*)
+ ;;
+ *)
+ break;
+ esac
+
+ # all switches take one argument
+ ARG="$1"
+ case "$#" in 1) usage ;; esac
+ shift
+ OPTARG="$1"
+ shift
+
+ case "$ARG" in
+ -d)
+ tempdir="$OPTARG"
+ ;;
+ -r)
+ unchanged="$(get_parents "$OPTARG") $unchanged"
+ ;;
+ -k)
+ unchanged="$(git-rev-parse "$OPTARG"^{commit}) $unchanged"
+ ;;
+ --env-filter)
+ filter_env="$OPTARG"
+ ;;
+ --tree-filter)
+ filter_tree="$OPTARG"
+ ;;
+ --index-filter)
+ filter_index="$OPTARG"
+ ;;
+ --parent-filter)
+ filter_parent="$OPTARG"
+ ;;
+ --msg-filter)
+ filter_msg="$OPTARG"
+ ;;
+ --commit-filter)
+ filter_commit="$OPTARG"
+ ;;
+ --tag-name-filter)
+ filter_tag_name="$OPTARG"
+ ;;
+ -s)
+ srcbranch="$OPTARG"
+ ;;
+ *)
+ usage
+ ;;
+ esac
+done
+
+dstbranch="$1"
+test -n "$dstbranch" || die "missing branch name"
+git-show-ref "refs/heads/$dstbranch" 2> /dev/null &&
+ die "branch $dstbranch already exists"
+
+test ! -e "$tempdir" || die "$tempdir already exists, please remove it"
+mkdir -p "$tempdir/t"
+cd "$tempdir/t"
+workdir="$(pwd)"
+
+case "$GIT_DIR" in
+/*)
+ ;;
+*)
+ export GIT_DIR="$(pwd)/../../$GIT_DIR"
+ ;;
+esac
+
+export GIT_INDEX_FILE="$(pwd)/../index"
+git-read-tree # seed the index file
+
+ret=0
+
+
+mkdir ../map # map old->new commit ids for rewriting parents
+
+# seed with identity mappings for the parents where we start off
+for commit in $unchanged; do
+ echo $commit > ../map/$commit
+done
+
+git-rev-list --reverse --topo-order $srcbranch --not $unchanged >../revs
+commits=$(cat ../revs | wc -l | tr -d " ")
+
+test $commits -eq 0 && die "Found nothing to rewrite"
+
+i=0
+while read commit; do
+ i=$((i+1))
+ printf "$commit ($i/$commits) "
+
+ git-read-tree -i -m $commit
+
+ export GIT_COMMIT=$commit
+ git-cat-file commit "$commit" >../commit
+
+ eval "$(set_ident AUTHOR <../commit)"
+ eval "$(set_ident COMMITTER <../commit)"
+ eval "$filter_env"
+
+ if [ "$filter_tree" ]; then
+ git-checkout-index -f -u -a
+ # files that $commit removed are now still in the working tree;
+ # remove them, else they would be added again
+ git-ls-files -z --others | xargs -0 rm -f
+ eval "$filter_tree"
+ git-diff-index -r $commit | cut -f 2- | tr '\n' '\0' | \
+ xargs -0 git-update-index --add --replace --remove
+ git-ls-files -z --others | \
+ xargs -0 git-update-index --add --replace --remove
+ fi
+
+ eval "$filter_index"
+
+ parentstr=
+ for parent in $(get_parents $commit); do
+ if [ -r "../map/$parent" ]; then
+ for reparent in $(cat "../map/$parent"); do
+ parentstr="$parentstr -p $reparent"
+ done
+ else
+ die "assertion failed: parent $parent for commit $commit not found in rewritten ones"
+ fi
+ done
+ if [ "$filter_parent" ]; then
+ parentstr="$(echo "$parentstr" | eval "$filter_parent")"
+ fi
+
+ sed -e '1,/^$/d' <../commit | \
+ eval "$filter_msg" | \
+ sh -c "$filter_commit" git-commit-tree $(git-write-tree) $parentstr | \
+ tee ../map/$commit
+done <../revs
+
+git-update-ref refs/heads/"$dstbranch" $(head -n 1 ../map/$(tail -n 1 ../revs))
+if [ "$(cat ../map/$(tail -n 1 ../revs) | wc -l)" -gt 1 ]; then
+ echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
+ sed 's/^/ /' ../map/$(tail -n 1 ../revs) >&2
+ ret=1
+fi
+
+if [ "$filter_tag_name" ]; then
+ git-for-each-ref --format='%(objectname) %(objecttype) %(refname)' refs/tags |
+ while read sha1 type ref; do
+ ref="${ref#refs/tags/}"
+ # XXX: Rewrite tagged trees as well?
+ if [ "$type" != "commit" -a "$type" != "tag" ]; then
+ continue;
+ fi
+
+ if [ "$type" = "tag" ]; then
+ # Dereference to a commit
+ sha1t="$sha1"
+ sha1="$(git-rev-parse "$sha1"^{commit} 2>/dev/null)" || continue
+ fi
+
+ [ -f "../map/$sha1" ] || continue
+ new_sha1="$(cat "../map/$sha1")"
+ export GIT_COMMIT="$sha1"
+ new_ref="$(echo "$ref" | eval "$filter_tag_name")"
+
+ echo "$ref -> $new_ref ($sha1 -> $new_sha1)"
+
+ if [ "$type" = "tag" ]; then
+ # Warn that we are not rewriting the tag object itself.
+ warn "unreferencing tag object $sha1t"
+ fi
+
+ git-update-ref "refs/tags/$new_ref" "$new_sha1"
+ done
+fi
+
+cd ../..
+rm -rf "$tempdir"
+echo "Rewritten history saved to the $dstbranch branch"
+
+exit $ret
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
new file mode 100755
index 0000000..9a4dae4
--- /dev/null
+++ b/t/t7003-filter-branch.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='git-filter-branch'
+. ./test-lib.sh
+
+make_commit () {
+ lower=$(echo $1 | tr A-Z a-z)
+ echo $lower > $lower
+ git add $lower
+ git commit -m $1
+ git tag $1
+}
+
+test_expect_success 'setup' '
+ make_commit A
+ make_commit B
+ git checkout -b branch B
+ make_commit D
+ make_commit E
+ git checkout master
+ make_commit C
+ git checkout branch
+ git merge C
+ git tag F
+ make_commit G
+ make_commit H
+'
+
+H=$(git-rev-parse H)
+
+test_expect_success 'rewrite identically' '
+ git-filter-branch H2
+'
+
+test_expect_success 'result is really identical' '
+ test $H = $(git-rev-parse H2)
+'
+
+test_expect_success 'rewrite, renaming a specific file' '
+ git-filter-branch --tree-filter "mv d doh || :" H3
+'
+
+test_expect_success 'test that the file was renamed' '
+ test d = $(git show H3:doh)
+'
+
+test_done
--
1.5.2.2663.gd77e7-dirty
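The "shift magic" in the --commit-filter example from the script's header comments can be tried in isolation. The following is only a sketch: the function name skip_commit and the ids TREE, P1 and P2 are placeholders invented here, and it assumes the filter is called with the arguments the header describes (TREE_ID followed by zero or more "-p PARENT" pairs).

```shell
#!/bin/sh
# Sketch only: skip_commit, TREE, P1 and P2 are placeholders, not part of the patch.
# A commit filter is invoked as: TREE_ID [-p PARENT_COMMIT_ID]...
skip_commit () {
	shift                 # drop the tree id
	while [ -n "$1" ]; do
		shift             # drop the "-p"
		echo "$1"         # emit the parent id in place of this commit
		shift             # move on to the next "-p", if any
	done
}

# Skipping a merge of P1 and P2 hands both parents to its children.
parents=$(skip_commit TREE -p P1 -p P2)
echo "$parents"
```

Because the filter prints several ids, the map entry for the skipped commit holds one line per parent, which is why the main loop iterates over the contents of the map file when building the parent string.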
* Re: [PATCH] Add git-filter-branch
2007-06-03 0:31 [PATCH] Add git-filter-branch Johannes Schindelin
@ 2007-06-03 0:46 ` Jakub Narebski
2007-06-03 0:50 ` Johannes Schindelin
2007-06-04 7:18 ` [PATCH] Add git-filter-branch Johannes Sixt
1 sibling, 1 reply; 27+ messages in thread
From: Jakub Narebski @ 2007-06-03 0:46 UTC
To: git
Johannes Schindelin wrote:
> This script is derived from Pasky's cg-admin-rewritehist.
>
> In fact, it _is_ the same script, minimally adapted to work without cogito.
> It _should_ be able to perform the same tasks, even if only relying on
> core-git programs.
>
> All the work is Pasky's, just the adaption is mine.
I was thinking about rewriting cg-admin-rewritehist as git-rewritehist
using Perl (IIRC it needs bash, not only POSIX shell), and make it
use git-fast-import.
But that was in planning (read: it would be nice...) phase...
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [PATCH] Add git-filter-branch
2007-06-03 0:46 ` Jakub Narebski
@ 2007-06-03 0:50 ` Johannes Schindelin
2007-06-03 10:28 ` Jakub Narebski
2007-06-05 10:18 ` Jonas Fonseca
0 siblings, 2 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-03 0:50 UTC
To: Jakub Narebski; +Cc: git
Hi,
[thank you for not including me in the Cc: list]
On Sun, 3 Jun 2007, Jakub Narebski wrote:
> Johannes Schindelin wrote:
>
> > This script is derived from Pasky's cg-admin-rewritehist.
> >
> > In fact, it _is_ the same script, minimally adapted to work without cogito.
> > It _should_ be able to perform the same tasks, even if only relying on
> > core-git programs.
> >
> > All the work is Pasky's, just the adaption is mine.
>
> I was thinking about rewriting cg-admin-rewritehist as git-rewritehist
> using Perl (IIRC it needs bash, not only POSIX shell), and make it
> use git-fast-import.
First, it does not need Perl.
Second, it does not even need bash.
At least that is what I tried to make sure. I replaced the only instance
of a bashism I was aware of, namely the arrayism of $unchanged. It can be a
string just as well, as we are only storing object names in it.
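To illustrate the point about $unchanged: POSIX sh has no arrays, but a list of object names fits in a single space-separated string, because hex ids never contain whitespace. A minimal sketch follows; the ids are made up and add_unchanged is a hypothetical helper, not something in the patch.

```shell
#!/bin/sh
# Sketch only: the ids are invented and add_unchanged is a hypothetical helper.
unchanged=""

add_unchanged () {
	# Prepend, mirroring the patch's: unchanged="... $unchanged"
	unchanged="$1 $unchanged"
}

add_unchanged 1111111
add_unchanged 2222222

# Unquoted expansion splits on whitespace, yielding one word per id.
count=0
for commit in $unchanged; do
	count=$((count + 1))
	echo "keep $commit"
done
```

The loop relies on word splitting, so this only works for values that cannot contain spaces or glob characters; object names satisfy that.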
Tell me if it does not work for you.
Or even better, provide me with a test case that fails for you.
Ciao,
Dscho
* Re: [PATCH] Add git-filter-branch
2007-06-03 0:50 ` Johannes Schindelin
@ 2007-06-03 10:28 ` Jakub Narebski
2007-06-03 18:36 ` Steven Grimm
2007-06-03 23:07 ` Johannes Schindelin
2007-06-05 10:18 ` Jonas Fonseca
1 sibling, 2 replies; 27+ messages in thread
From: Jakub Narebski @ 2007-06-03 10:28 UTC
To: Johannes Schindelin; +Cc: git
On Sun, 3 Jun 2007, Johannes Schindelin wrote:
> On Sun, 3 Jun 2007, Jakub Narebski wrote:
>> Johannes Schindelin wrote:
>>
>>> This script is derived from Pasky's cg-admin-rewritehist.
>>>
>>> In fact, it _is_ the same script, minimally adapted to work without cogito.
>>> It _should_ be able to perform the same tasks, even if only relying on
>>> core-git programs.
>>>
>>> All the work is Pasky's, just the adaption is mine.
>>
>> I was thinking about rewriting cg-admin-rewritehist as git-rewritehist
>> using Perl (IIRC it needs bash, not only POSIX shell), and make it
>> use git-fast-import.
By the way, why did you change name to git-filter-branch, instead of
leaving it [almost] as is, i.e. git-rewritehist. Or if you wanted to
emphasize that it rewrites only one branch at a time, git-rewrite-branch?
Note that history (branch) gets rewritten also in absence of filters,
if there are any grafts in place. But I might be mistaken.
> First, it does not need Perl.
>
> Second, it does not even need bash.
If I remember correctly (but I can be wrong here) Pasky said that he had
to use arrays in cg-admin-rewritehist. Because introducing dependency on
bash would be bad, that was the cause of thought to rewrite it in Perl
(which we depend on anyway).
See below.
> At least that is what I tried to make sure. I replaced the only instance
> of a bashism I was aware of, namely the arrayism of $unchanged. It can be a
> string just as well, as we are only storing object names in it.
I'm sorry, I haven't reviewed your patch carefully enough, it seems like.
If you can translate cg-admin-rewritehist to POSIX shell, more power
to you.
-- " --
Few notes of lesser importance (meaning they can go into subsequent
commits).
1. Documentation: Cogito had documentation together with the command
described, similarly to Perl POD, or LaTeX doc package + DocStrip,
etc. It has IIRC rules in Makefile to extract documentation.
In git we have documentation in separate files. The commands
themselves have only usage, and sometimes long usage embedded.
It would be nice if git-filter-branch / git-rewrite-branch also
followed this convention.
2. Using fast-import.
>> +# Note that since this operation is extensively I/O expensive, it might
>> +# be a good idea to do it off-disk, e.g. on tmpfs. Reportedly the speedup
>> +# is very noticeable.
Would it be possible to use git-fast-import to reduce I/O in this
command? Cogito didn't use it because it is quite new, but there
is no reason not to use it now, I think.
--
Jakub Narebski
Poland
* Re: [PATCH] Add git-filter-branch
2007-06-03 10:28 ` Jakub Narebski
@ 2007-06-03 18:36 ` Steven Grimm
2007-06-03 23:07 ` Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Steven Grimm @ 2007-06-03 18:36 UTC
To: Jakub Narebski; +Cc: Johannes Schindelin, git
Jakub Narebski wrote:
> By the way, why did you change name to git-filter-branch, instead of
> leaving it [almost] as is, i.e. git-rewritehist. Or if you wanted to
> emphasize that it rewrites only one branch at a time, git-rewrite-branch?
>
One argument against the name change is that one could easily imagine
this tool being extended in the future to filter all branches rather
than just one. For example, the "get rid of copyrighted file X in my
repository" use case is a bit of a pain right now using
cg-admin-rewritehist if the file was introduced early in the history of
a large repo; in that scenario you want to be able to say, "filter this
file out of my entire repo" without particularly caring which branches
it appeared in (and without losing any of the branch structure in your
history.)
-Steve
* Re: [PATCH] Add git-filter-branch
2007-06-03 10:28 ` Jakub Narebski
2007-06-03 18:36 ` Steven Grimm
@ 2007-06-03 23:07 ` Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-03 23:07 UTC
To: Jakub Narebski; +Cc: git
Hi,
On Sun, 3 Jun 2007, Jakub Narebski wrote:
> On Sun, 3 Jun 2007, Johannes Schindelin wrote:
> > On Sun, 3 Jun 2007, Jakub Narebski wrote:
> >> Johannes Schindelin wrote:
> >>
> >>> This script is derived from Pasky's cg-admin-rewritehist.
> >>>
> >>> In fact, it _is_ the same script, minimally adapted to work without cogito.
> >>> It _should_ be able to perform the same tasks, even if only relying on
> >>> core-git programs.
> >>>
> >>> All the work is Pasky's, just the adaption is mine.
> >>
> >> I was thinking about rewriting cg-admin-rewritehist as git-rewritehist
> >> using Perl (IIRC it needs bash, not only POSIX shell), and make it
> >> use git-fast-import.
>
> By the way, why did you change name to git-filter-branch, instead of
> leaving it [almost] as is, i.e. git-rewritehist. Or if you wanted to
> emphasize that it rewrites only one branch at a time, git-rewrite-branch?
It does not rewrite the branch. It writes a filtered _copy_. That is what
I wanted to make clear by that renaming.
> Note that history (branch) gets rewritten also in absence of filters, if
> there are any grafts in place. But I might be mistaken.
Actually, if you check the first non-setup test in the provided test
script, no. It is not _really_ rewritten. As the commit names stay exactly
the same.
> > First, it does not need Perl.
> >
> > Second, it does not even need bash.
>
> If I remember correctly (but I can be wrong here) Pasky said that he had
> to use arrays in cg-admin-rewritehist. Because introducing dependency on
> bash would be bad, that was the cause of thought to rewrite it in Perl
> (which we depend on anyway).
I rewrote the only instance where arrays were used:
> > At least that is what I tried to make sure. I replaced the only
> > instance of a bashism I was aware of, namely the arrayism of $unchanged.
> > It can be a string just as well, as we are only storing object names
> > in it.
>
> I'm sorry, I haven't reviewed your patch carefully enough, it seems
> like. If you can translate cg-admin-rewritehist to POSIX shell, more
> power to you.
Actually, that is my understanding.
> Few notes of lesser importance (meaning they can go into subsequent
> commits).
>
> 1. Documentation: Cogito had documentation together with the command
> described, similarly to Perl POD, or LaTeX doc package + DocStrip,
> etc. It has IIRC rules in Makefile to extract documentation.
>
> In git we have documentation in separate files. The commands
> themselves have only usage, and sometimes long usage embedded.
> It would be nice if git-filter-branch / git-rewrite-branch also
> followed this convention.
Yes, I did not plan to provide documentation with the first patch, since I
wanted to encourage _review_ of the patch. Obviously, I failed ;-)
> 2. Using fast-import.
>
> >> +# Note that since this operation is extensively I/O expensive, it might
> >> +# be a good idea to do it off-disk, e.g. on tmpfs. Reportedly the speedup
> >> +# is very noticeable.
>
> Would it be possible to use git-fast-import to reduce I/O in this
> command? Cogito didn't use it because it is quite new, but there
> is no reason not to use it now, I think.
It is overkill, usually.
The only thing that could benefit from it, would be complicated tree
filters.
But _the_ main usage for this script (in my expectation, at least) will be
to split projects into subprojects.
For this, we _still_ don't need fast-import, but maybe a better
tree-filter (something like --subdir-filter, which only checks out the
subdir (in the root), and also only takes those revisions into account
that actually touch that subdir).
Ciao,
Dscho
* Re: [PATCH] Add git-filter-branch
2007-06-03 0:31 [PATCH] Add git-filter-branch Johannes Schindelin
2007-06-03 0:46 ` Jakub Narebski
@ 2007-06-04 7:18 ` Johannes Sixt
2007-06-04 7:59 ` Johannes Sixt
2007-06-04 16:11 ` Johannes Schindelin
1 sibling, 2 replies; 27+ messages in thread
From: Johannes Sixt @ 2007-06-04 7:18 UTC
To: git
Johannes Schindelin wrote:
>
> This script is derived from Pasky's cg-admin-rewritehist.
>
> In fact, it _is_ the same script, minimally adapted to work without cogito.
> It _should_ be able to perform the same tasks, even if only relying on
> core-git programs.
>
> All the work is Pasky's, just the adaption is mine.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Hopefully-signed-off-by: Petr "cogito master" Baudis <pasky@suse.cz>
If you hadn't done that, I'd have done it sooner or later. Thanks!
> + -r)
> + unchanged="$(get_parents "$OPTARG") $unchanged"
> + ;;
> + -k)
> + unchanged="$(git-rev-parse "$OPTARG"^{commit}) $unchanged"
> + ;;
These two (-r and -k) together with...
> +# seed with identity mappings for the parents where we start off
> +for commit in $unchanged; do
> + echo $commit > ../map/$commit
> +done
... this and ...
> + if [ -r "../map/$parent" ]; then
> + for reparent in $(cat "../map/$parent"); do
> + parentstr="$parentstr -p $reparent"
> + done
> + else
> + die "assertion failed: parent $parent for commit $commit not found in rewritten ones"
... this, means that any simple command like
git filter-branch -k origin/master origin/next new-next
of your git.git clone will fail with the "assertion failed". (I haven't
tried your script, yet, but cg-admin-rewritehist fails.)
I propose that you just get rid of the "seed" stance and don't fail if a
commit cannot be mapped - just use it unchanged (don't forget to adjust
the map() function, too). Then you can get rid of -r and use -k to
specify everything you want under "--not" in the rev-list.
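The proposal above (use a commit unchanged when it has no map entry) could look roughly like the following. This is a sketch under stated assumptions, not the patch's code: the mapdir variable, the temporary directory and the ids aaaa, bbbb and cccc are invented for illustration, and the real script keeps its map under "$workdir/../map".

```shell
#!/bin/sh
# Sketch only: mapdir and the ids are invented for illustration.
mapdir=$(mktemp -d)

# Pretend commit aaaa was rewritten to bbbb; cccc was never visited.
echo bbbb > "$mapdir/aaaa"

map () {
	if [ -r "$mapdir/$1" ]; then
		cat "$mapdir/$1"    # rewritten id(s), one per line
	else
		echo "$1"           # no entry: fall back to the original id
	fi
}

rewritten=$(map aaaa)
untouched=$(map cccc)
echo "$rewritten $untouched"
rm -rf "$mapdir"
```

With such a fallback there is nothing to seed, at the cost of losing the "assertion failed" check that would otherwise flag a parent missing from the map.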
-- Hannes
* Re: [PATCH] Add git-filter-branch
2007-06-04 7:18 ` [PATCH] Add git-filter-branch Johannes Sixt
@ 2007-06-04 7:59 ` Johannes Sixt
2007-06-04 16:11 ` Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Johannes Sixt @ 2007-06-04 7:59 UTC
To: git
Johannes Sixt wrote:
> ... this, means that any simple command like
>
> git filter-branch -k origin/master origin/next new-next
Make this:
git filter-branch -k origin/master -s origin/next new-next
-- Hannes
* Re: [PATCH] Add git-filter-branch
2007-06-04 7:18 ` [PATCH] Add git-filter-branch Johannes Sixt
2007-06-04 7:59 ` Johannes Sixt
@ 2007-06-04 16:11 ` Johannes Schindelin
2007-06-04 16:34 ` Johannes Sixt
1 sibling, 1 reply; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-04 16:11 UTC
To: Johannes Sixt; +Cc: git
Hi,
On Mon, 4 Jun 2007, Johannes Sixt wrote:
> [...] any simple command like
>
> git filter-branch -k origin/master origin/next new-next
>
> of your git.git clone will fail with the "assertion failed". (I haven't
> tried your script, yet, but cg-admin-rewritehist fails.)
As you mentioned yourself, you should say "-s origin/next".
cg-admin-rewritehist will only rewrite the current branch (since cogito
started out as one-branch-per-repo).
> I propose that you just get rid of the "seed" stance and don't fail if a
> commit cannot be mapped - just use it unchanged (don't forget to adjust
> the map() function, too).
It is as much for debug reasons as for consistency, so I'd rather keep it.
One more safety valve for catching bugs.
> Then you can get rid of -r and use -k to specify everything you want
> under "--not" in the rev-list.
Actually, -r is quite useful. It means "start rewriting with this commit",
and saying "--not <commit>^" is _not_ the same when <commit> is a merge.
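The difference can be seen without a repository by feeding the sed expression from get_parents a line in the format that `git rev-list -1 --parents` prints (the commit id followed by all of its parent ids). This is only a sketch; the three ids are made up and stand for a merge commit and its two parents.

```shell
#!/bin/sh
# Sketch only: the ids are made up (a merge commit followed by its two parents).
line="aaaa1111 bbbb2222 cccc3333"

# Same sed expression as get_parents in the patch: strip the leading commit id.
all_parents=$(echo "$line" | sed "s/^[0-9a-f]*//")

# <commit>^ names only the first parent.
set -- $all_parents
first_parent=$1

echo "-r keeps intact:$all_parents"
echo "--not <commit>^ excludes only: $first_parent"
```

So -r on a merge marks every parent as unchanged, whereas "--not <commit>^" leaves the history reachable only through the second parent in the set of commits to rewrite.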
Ciao,
Dscho
* Re: [PATCH] Add git-filter-branch
2007-06-04 16:11 ` Johannes Schindelin
@ 2007-06-04 16:34 ` Johannes Sixt
2007-06-04 17:55 ` Johannes Schindelin
0 siblings, 1 reply; 27+ messages in thread
From: Johannes Sixt @ 2007-06-04 16:34 UTC
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
>
> Hi,
>
> On Mon, 4 Jun 2007, Johannes Sixt wrote:
> > I propose that you just get rid of the "seed" stance and don't fail if a
> > commit cannot be mapped - just use it unchanged (don't forget to adjust
> > the map() function, too).
>
> It is as much for debug reasons as for consistency, so I'd rather keep it.
> One more safety valve for catching bugs.
>
> > Then you can get rid of -r and use -k to specify everything you want
> > under "--not" in the rev-list.
>
> Actually, -r is quite useful. It means "start rewriting with this commit",
> and saying "--not <commit>^" is _not_ the same when <commit> is a merge.
But this only makes sense if you have a linear history. Consider this
history, where you want to rewrite the commits that are only on branch
'next':
--A--B--C--D--E--F--G--H <- master
   \  \  \  \  \  \  \  \
    X--o--o--o--o--o--o--o--o <- next
How would you go about this with the current calling convention? It's
impractical, to say the least:
git filter-branch -r X -k B -k C -k D ... -k H new-next
If you don't give all the -k, then you get the "assertion failed" error
because the parents B..H are not registered in the commit id map. This
is not something I'd like to try on a history like git.git's.
OTOH, rev-list can easily restrict the commits regardless of how many
merges there are in 'next':
git rev-list next --not master
Why not use its powers?
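A minimal sketch of what I mean, in a scratch repository with made-up names (one merge instead of eight, but the command is the same however many merges there are):

```sh
#!/bin/sh
# Throwaway repository (names are hypothetical): 'next' forks from master
# at A and later merges master's newer commit B.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m A
git checkout -q -b next
git commit -q --allow-empty -m X
git checkout -q master
git commit -q --allow-empty -m B
git checkout -q next
git merge -q -m "merge master" master

# Commits reachable from next but not from master: X and the merge.
git rev-list --reverse --topo-order next --not master >revs
commits=$(wc -l <revs | tr -d " ")
echo "$commits"
```

Nothing from master ends up in the list, so no -k per merged commit is needed.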
-- Hannes
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-04 16:34 ` Johannes Sixt
@ 2007-06-04 17:55 ` Johannes Schindelin
2007-06-05 7:01 ` Johannes Sixt
0 siblings, 1 reply; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-04 17:55 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Hi,
On Mon, 4 Jun 2007, Johannes Sixt wrote:
> Johannes Schindelin wrote:
> >
> > Hi,
> >
> > On Mon, 4 Jun 2007, Johannes Sixt wrote:
> > > I propose that you just get rid of the "seed" stance and don't fail if a
> > > commit cannot be mapped - just use it unchanged (don't forget to adjust
> > > the map() function, too).
> >
> > It is as much for debug reasons as for consistency, so I'd rather keep it.
> > One more safety valve for catching bugs.
> >
> > > Then you can get rid of -r and use -k to specify everything you want
> > > under "--not" in the rev-list.
> >
> > Actually, -r is quite useful. It means "start rewriting with this commit",
> > and saying "--not <commit>^" is _not_ the same when <commit> is a merge.
>
> But this makes only sense if you have a linear history. Consider this
> history, where you want to rewrite the commits that are only on branch
> 'next':
>
> --A--B--C--D--E--F--G--H <- master
>    \  \  \  \  \  \  \  \
>     X--o--o--o--o--o--o--o--o <- next
>
> How would you go about with the current calling convention?
Are you actually sure that this scenario makes sense? When is the last
time you wanted to filter a branch?
Anyway, for such a degenerate case I would rather try to limit
filtering in the filter expression. Remember: you don't have to change
_every_ commit.
Of course, if I am the only one defending this behaviour, I'll change it.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-04 17:55 ` Johannes Schindelin
@ 2007-06-05 7:01 ` Johannes Sixt
2007-06-05 15:58 ` Johannes Schindelin
0 siblings, 1 reply; 27+ messages in thread
From: Johannes Sixt @ 2007-06-05 7:01 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
> On Mon, 4 Jun 2007, Johannes Sixt wrote:
> > But this makes only sense if you have a linear history. Consider this
> > history, where you want to rewrite the commits that are only on branch
> > 'next':
> >
> > --A--B--C--D--E--F--G--H <- master
> >    \  \  \  \  \  \  \  \
> >     X--o--o--o--o--o--o--o--o <- next
> >
> > How would you go about with the current calling convention?
>
> Are you actually sure that this scenario makes sense? When is the last
> time you wanted to filter a branch?
Oh, this makes a lot of sense. For example after I've imported a CVS
repository I had installed grafts for a number of merges that were made
in CVS (but we all know that CVS doesn't record them, so I did that
manually this way). That would be the merge commits in 'next' of the
example above. Now a simple
git filter-branch -k master new-next
could "implant" the grafts into the commits. In this scenario I don't
need to rewrite 'master' because I know in advance that nothing would
actually be rewritten.
(Since 'master' was about 8000 commits I really didn't want to wait
until the no-ops would be completed, so I did it by actually fixing
cg-admin-rewritehist to not complain about the unmapped parents.)
> In any case, for such a degenerated test case I would rather try to limit
> filtering in the filter expression. Remember: you don't have to change
> _every_ commit.
I don't think that this is a degenerate case. See the example above.
Please observe that the only reasonable way to limit the commits to
rewrite is by giving some --not arguments to the rev-list. The filter
scriptlets themselves have no easy way to tell whether a commit should
be rewritten or not. They just rewrite - with the final result perhaps
ending up identical to the original; but no labor was saved.
-- Hannes
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-03 0:50 ` Johannes Schindelin
2007-06-03 10:28 ` Jakub Narebski
@ 2007-06-05 10:18 ` Jonas Fonseca
2007-06-05 10:26 ` David Kastrup
2007-06-05 10:30 ` Junio C Hamano
1 sibling, 2 replies; 27+ messages in thread
From: Jonas Fonseca @ 2007-06-05 10:18 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote Sun, Jun 03, 2007:
> Second, it does not even need bash.
>
> At least that is what I tried to make sure. I replaced the only instance
> of a bashism I was aware of, namely the arrayism of $unchanged. It can be a
> string just as well, as we are only storing object names in it.
>
> Tell me if it does not work for you.
>
> Or even better, provide me with a test case that fails for you.
I found a small problem when /bin/sh is linked to dash.
$ /bin/dash t*-filter-branch.sh
* ok 1: setup
* FAIL 2: rewrite identically
git-filter-branch H2
* FAIL 3: result is really identical
test $H = $(git-rev-parse H2)
* FAIL 4: rewrite, renaming a specific file
git-filter-branch --tree-filter "mv d doh || :" H3
* FAIL 5: test that the file was renamed
test d = $(git show H3:doh)
* failed 4 among 5 test(s)
$ cd trash/
$ rm -rf .git-rewrite/
$ git filter-branch H2
/home/fonseca/bin/git-filter-branch: 386: arith: syntax error: "i+1"
$
A possible fix that makes the test pass for me.
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 0c8a7df..5cf9d3c 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -339,7 +339,7 @@ test $commits -eq 0 && die "Found nothing to rewrite"
i=0
while read commit; do
- i=$((i+1))
+ i=$(echo i+1 | bc)
printf "$commit ($i/$commits) "
git-read-tree -i -m $commit
--
Jonas Fonseca
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 10:18 ` Jonas Fonseca
@ 2007-06-05 10:26 ` David Kastrup
2007-06-05 10:30 ` Junio C Hamano
1 sibling, 0 replies; 27+ messages in thread
From: David Kastrup @ 2007-06-05 10:26 UTC (permalink / raw)
To: git
Jonas Fonseca <fonseca@diku.dk> writes:
> A possible fix that makes the test pass for me.
>
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 0c8a7df..5cf9d3c 100644
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -339,7 +339,7 @@ test $commits -eq 0 && die "Found nothing to rewrite"
>
> i=0
> while read commit; do
> - i=$((i+1))
> + i=$(echo i+1 | bc)
Would not work. You need $i instead.
> printf "$commit ($i/$commits) "
>
> git-read-tree -i -m $commit
More portable would be
i=`expr $i + 1`
since not everything has bc installed. Is $(...) available generally?
I thought it was a bashism, too.
--
David Kastrup
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 10:18 ` Jonas Fonseca
2007-06-05 10:26 ` David Kastrup
@ 2007-06-05 10:30 ` Junio C Hamano
2007-06-05 10:34 ` Jonas Fonseca
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2007-06-05 10:30 UTC (permalink / raw)
To: Jonas Fonseca; +Cc: Johannes Schindelin, Jakub Narebski, git
Jonas Fonseca <fonseca@diku.dk> writes:
> $ git filter-branch H2
> /home/fonseca/bin/git-filter-branch: 386: arith: syntax error: "i+1"
> $
>
> A possible fix that makes the test pass for me.
>
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 0c8a7df..5cf9d3c 100644
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -339,7 +339,7 @@ test $commits -eq 0 && die "Found nothing to rewrite"
>
> i=0
> while read commit; do
> - i=$((i+1))
> + i=$(echo i+1 | bc)
Are you sure this is not "echo $i+1"???
There are quite a few $((arithmetic)) already in our shell code,
so I was initially a bit surprised. However, upon closer
inspection, this particular use is not kosher at all.
The portable ones we already have in the code say things like:
msgnum=$(($msgnum+1))
The one in filter-branch that bit you does not dereference 'i'.
I am reasonably sure if you fix it to read:
i=$(( $i+1 ))
dash would grok it.
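For reference, a stand-alone sketch of the counting loop with the dereferenced form. The here-document and its three lines are made up; they only stand in for the ../revs file.

```sh
#!/bin/sh
# Count the lines read, using the explicitly dereferenced form $(($i+1)).
i=0
while read commit; do
	i=$(($i+1))
done <<EOF
c1
c2
c3
EOF
echo "$i"
```

Because the loop is fed by a redirection and not by a pipe, the counter keeps its value after the loop ends.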
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 10:30 ` Junio C Hamano
@ 2007-06-05 10:34 ` Jonas Fonseca
2007-06-05 13:55 ` Johannes Schindelin
2007-06-06 15:24 ` [PATCH] filter-branch: use $(($i+1)) instead of $((i+1)) Johannes Schindelin
0 siblings, 2 replies; 27+ messages in thread
From: Jonas Fonseca @ 2007-06-05 10:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Johannes Schindelin, Jakub Narebski, git
Junio C Hamano <gitster@pobox.com> wrote Tue, Jun 05, 2007:
> Jonas Fonseca <fonseca@diku.dk> writes:
>
> > $ git filter-branch H2
> > /home/fonseca/bin/git-filter-branch: 386: arith: syntax error: "i+1"
> > $
> >
> > A possible fix that makes the test pass for me.
> >
> > diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> > index 0c8a7df..5cf9d3c 100644
> > --- a/git-filter-branch.sh
> > +++ b/git-filter-branch.sh
> > @@ -339,7 +339,7 @@ test $commits -eq 0 && die "Found nothing to rewrite"
> >
> > i=0
> > while read commit; do
> > - i=$((i+1))
> > + i=$(echo i+1 | bc)
>
> Are you sure this is not "echo $i+1"???
Yes, I noticed this right after sending the mail. People on this list
are just too damn fast for me to send a correction. ;)
> There are quite a few $((arithmetic)) already in our shell code,
> so I was initially a bit surprised. However, upon closer
> inspection, this particular use is not kosher at all.
>
> The portable ones we already have in the code say things like:
>
> msgnum=$(($msgnum+1))
Yes, I should have investigated before sending.
> The one in filter-branch that bit you does not dereference 'i'.
> I am reasonably sure if you fix it to read:
>
> i=$(( $i+1 ))
>
> dash would grok it.
This works here. Even without the spaces.
--
Jonas Fonseca
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 10:34 ` Jonas Fonseca
@ 2007-06-05 13:55 ` Johannes Schindelin
2007-06-06 15:24 ` [PATCH] filter-branch: use $(($i+1)) instead of $((i+1)) Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-05 13:55 UTC (permalink / raw)
To: Jonas Fonseca; +Cc: Junio C Hamano, Jakub Narebski, git
Hi,
On Tue, 5 Jun 2007, Jonas Fonseca wrote:
> Junio C Hamano <gitster@pobox.com> wrote Tue, Jun 05, 2007:
>
> > The one in filter-branch that bit you does not dereference 'i'.
> > I am reasonably sure if you fix it to read:
> >
> > i=$(( $i+1 ))
> >
> > dash would grok it.
>
> This works here. Even without the spaces.
Thanks for fixing up so quickly after me.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 7:01 ` Johannes Sixt
@ 2007-06-05 15:58 ` Johannes Schindelin
2007-06-06 7:43 ` Johannes Sixt
2007-06-06 15:36 ` [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped Johannes Sixt
0 siblings, 2 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-05 15:58 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Hi,
On Tue, 5 Jun 2007, Johannes Sixt wrote:
> Johannes Schindelin wrote:
> > On Mon, 4 Jun 2007, Johannes Sixt wrote:
> > > But this makes only sense if you have a linear history. Consider this
> > > history, where you want to rewrite the commits that are only on branch
> > > 'next':
> > >
> > > --A--B--C--D--E--F--G--H <- master
> > >    \  \  \  \  \  \  \  \
> > >     X--o--o--o--o--o--o--o--o <- next
> > >
> > > How would you go about with the current calling convention?
> >
> > Are you actually sure that this scenario makes sense? When is the last
> > time you wanted to filter a branch?
>
> Oh, this makes a lot of sense. For example after I've imported a CVS
> repository I had installed grafts for a number of merges that were made
> in CVS (but we all know that CVS doesn't record them, so I did that
> manually this way). That would be the merge commits in 'next' of the
> example above. Now a simple
>
> git filter-branch -k master new-next
>
> could "implant" the grafts into the commits. In this scenario I don't
> need to rewrite 'master' because I know in advance that nothing would
> actually be rewritten.
>
> (Since 'master' was about 8000 commits I really didn't want to wait
> until the no-ops would be completed, so I did it by actually fixing
> cg-admin-rewritehist to not complain about the unmapped parents.)
Okay, then. Are you okay with keeping the same options? (See proposed
patch below.)
Just out of curiosity, do you have any timing data?
Ciao,
Dscho
-- snipsnap --
[PATCH] filter-branch: fix behaviour of '-k'
The option '-k' says that the given commit and _all_ of its ancestors
are kept as-is.
However, if a to-be-rewritten commit branched from an ancestor of an
ancestor of a commit given with '-k', filter-branch would fail.
Example:
    A - B
     \
      C
If filter-branch was called with '-k B -s C', it would actually keep
B (and A as its parent), but would rewrite C, and its parent.
Noticed by Johannes Sixt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
git-filter-branch.sh | 29 +++++++++++++++++------------
t/t7003-filter-branch.sh | 9 +++++++++
2 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 0c8a7df..6807782 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -327,11 +327,6 @@ ret=0
mkdir ../map # map old->new commit ids for rewriting parents
-# seed with identity mappings for the parents where we start off
-for commit in $unchanged; do
- echo $commit > ../map/$commit
-done
-
git-rev-list --reverse --topo-order $srcbranch --not $unchanged >../revs
commits=$(cat ../revs | wc -l | tr -d " ")
@@ -372,7 +367,8 @@ while read commit; do
parentstr="$parentstr -p $reparent"
done
else
- die "assertion failed: parent $parent for commit $commit not found in rewritten ones"
+ # if it was not rewritten, take the original
+ parentstr="$parentstr -p $parent"
fi
done
if [ "$filter_parent" ]; then
@@ -385,12 +381,21 @@ while read commit; do
tee ../map/$commit
done <../revs
-git-update-ref refs/heads/"$dstbranch" $(head -n 1 ../map/$(tail -n 1 ../revs))
-if [ "$(cat ../map/$(tail -n 1 ../revs) | wc -l)" -gt 1 ]; then
- echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
- sed 's/^/ /' ../map/$(tail -n 1 ../revs) >&2
- ret=1
-fi
+src_head=$(tail -n 1 ../revs)
+target_head=$(head -n 1 ../map/$src_head)
+case "$target_head" in
+'')
+ echo Nothing rewritten
+ ;;
+*)
+ git-update-ref refs/heads/"$dstbranch" $target_head
+ if [ $(cat ../map/$src_head | wc -l) -gt 1 ]; then
+ echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
+ sed 's/^/ /' ../map/$src_head >&2
+ ret=1
+ fi
+ ;;
+esac
if [ "$filter_tag_name" ]; then
git-for-each-ref --format='%(objectname) %(objecttype) %(refname)' refs/tags |
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 9a4dae4..520963a 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -44,4 +44,13 @@ test_expect_success 'test that the file was renamed' '
test d = $(git show H3:doh)
'
+git tag oldD H3~4
+test_expect_success 'rewrite one branch, keeping a side branch' '
+ git-filter-branch --tree-filter "mv b boh || :" -k D -s oldD modD
+'
+
+test_expect_success 'common ancestor is still common (unchanged)' '
+ test "$(git-merge-base modD D)" = "$(git-rev-parse B)"
+'
+
test_done
--
1.5.2.1.2627.g8eec-dirty
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-05 15:58 ` Johannes Schindelin
@ 2007-06-06 7:43 ` Johannes Sixt
2007-06-06 8:17 ` Junio C Hamano
2007-06-06 15:00 ` Johannes Schindelin
2007-06-06 15:36 ` [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped Johannes Sixt
1 sibling, 2 replies; 27+ messages in thread
From: Johannes Sixt @ 2007-06-06 7:43 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
> Okay, then. Are you okay with keeping the same options? (See proposed
> patch below.)
I can live with it. But what do you think of this in addition? It
replaces -k, -r, -s in favor of rev-list arguments.
> Just out of curiosity, do you have any timing data?
I did one test run through 8118 commits, which took 18 minutes. But it
turns out that I have a buglet here in git-commit-tree: it does not
accept committer dates before 2000-1-1 00:00:01 UTC. Since the first
commit is from 1999, this test rewrote the entire history, which was
not intended.
--- 8< ---
From: Johannes Sixt <johannes.sixt@telecom.at>
filter-branch: Use rev-list arguments to specify revision ranges.
A subset of commits in a branch used to be specified by options (-k, -r)
as well as the branch tip itself (-s). It is more natural (for git users)
to specify revision ranges like 'master..next' instead. This makes it so.
If no range is specified it defaults to 'HEAD'.
As a consequence, the new name of the filtered branch must be the first
non-option argument. All remaining arguments are passed to 'git rev-list'
unmodified.
The tip of the branch that gets filtered is implied: It is the first
commit that git rev-list would print for the specified range.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
---
git-filter-branch.sh | 39 ++++++++++++---------------------------
t/t7003-filter-branch.sh | 2 +-
2 files changed, 13 insertions(+), 28 deletions(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 9e12a6c..190a492 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -42,15 +42,6 @@
# does this in the '.git-rewrite/' directory but you can override
# that choice by this parameter.
#
-# -r STARTREV:: The commit id to start the rewrite at
-# Normally, the command will rewrite the entire history. If you
-# pass this argument, though, this will be the first commit it
-# will rewrite and keep the previous commits intact.
-#
-# -k KEEPREV:: A commit id until which _not_ to rewrite history
-# If you pass this argument, this commit and all of its
-# predecessors are kept intact.
-#
# Filters
# ~~~~~~~
# The filters are applied in the order as listed below. The COMMAND
@@ -164,27 +155,31 @@
# and all children of the merge will become merge commits with P1,P2
# as their parents instead of the merge commit.
#
-# To restrict rewriting to only part of the history, use -r or -k or both.
+# To restrict rewriting to only part of the history, specify a revision
+# range in addition to the new branch name. The new branch name will
+# point to the top-most revision that a 'git rev-list' of this range
+# will print.
+#
# Consider this history:
#
#      D--E--F--G--H
#     /     /
# A--B-----C
#
-# To rewrite only commits F,G,H, use:
+# To rewrite commits D,E,F,G,H, use:
#
-# git-filter-branch -r F ...
+# git-filter-branch ... new-H C..H
#
# To rewrite commits E,F,G,H, use one of these:
#
-# git-filter-branch -r E -k C ...
-# git-filter-branch -k D -k C ...
+# git-filter-branch ... new-H C..H --not D
+# git-filter-branch ... new-H D..H --not C
# Testsuite: TODO
set -e
-USAGE="git-filter-branch [-d TEMPDIR] [-r STARTREV]... [-k KEEPREV]... [-s SRCBRANCH] [FILTERS] DESTBRANCH"
+USAGE="git-filter-branch [-d TEMPDIR] [FILTERS] DESTBRANCH [REV-RANGE]"
. git-sh-setup
map()
@@ -233,7 +228,6 @@ get_parents () {
}
tempdir=.git-rewrite
-unchanged=" "
filter_env=
filter_tree=
filter_index=
@@ -241,7 +235,6 @@ filter_parent=
filter_msg=cat
filter_commit='git-commit-tree "$@"'
filter_tag_name=
-srcbranch=HEAD
while case "$#" in 0) usage;; esac
do
case "$1" in
@@ -266,12 +259,6 @@ do
-d)
tempdir="$OPTARG"
;;
- -r)
- unchanged="$(get_parents "$OPTARG") $unchanged"
- ;;
- -k)
- unchanged="$(git-rev-parse "$OPTARG"^{commit}) $unchanged"
- ;;
--env-filter)
filter_env="$OPTARG"
;;
@@ -293,9 +280,6 @@ do
--tag-name-filter)
filter_tag_name="$OPTARG"
;;
- -s)
- srcbranch="$OPTARG"
- ;;
*)
usage
;;
@@ -303,6 +287,7 @@ do
done
dstbranch="$1"
+shift
test -n "$dstbranch" || die "missing branch name"
git-show-ref "refs/heads/$dstbranch" 2> /dev/null &&
die "branch $dstbranch already exists"
@@ -328,7 +313,7 @@ ret=0
mkdir ../map # map old->new commit ids for rewriting parents
-git-rev-list --reverse --topo-order $srcbranch --not $unchanged >../revs
+git-rev-list --reverse --topo-order --default HEAD "$@" >../revs
commits=$(cat ../revs | wc -l | tr -d " ")
test $commits -eq 0 && die "Found nothing to rewrite"
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 520963a..89b405b 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -46,7 +46,7 @@ test_expect_success 'test that the file was renamed' '
git tag oldD H3~4
test_expect_success 'rewrite one branch, keeping a side branch' '
- git-filter-branch --tree-filter "mv b boh || :" -k D -s oldD modD
+ git-filter-branch --tree-filter "mv b boh || :" modD D..oldD
'
test_expect_success 'common ancestor is still common (unchanged)' '
--
1.5.2.1.114.gc6c36
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-06 7:43 ` Johannes Sixt
@ 2007-06-06 8:17 ` Junio C Hamano
2007-06-06 15:00 ` Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2007-06-06 8:17 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Johannes Schindelin, git
Johannes Sixt <J.Sixt@eudaptics.com> writes:
> Johannes Schindelin wrote:
>> Okay, then. Are you okay with keeping the same options? (See proposed
>> patch below.)
>
> I can live with it. But what do you think of this in addition? It
> replaces -k, -r, -s in favor of rev-list arguments.
> ...
> A subset of commits in a branch used to be specified by options (-k, -r)
> as well as the branch tip itself (-s). It is more natural (for git users)
> to specify revision ranges like 'master..next' instead. This makes it so.
> If no range is specified it defaults to 'HEAD'.
FWIW, I find this much more sensible and gittish ;-)
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-06 7:43 ` Johannes Sixt
2007-06-06 8:17 ` Junio C Hamano
@ 2007-06-06 15:00 ` Johannes Schindelin
2007-06-06 15:22 ` Johannes Sixt
1 sibling, 1 reply; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-06 15:00 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Hi,
On Wed, 6 Jun 2007, Johannes Sixt wrote:
> It is more natural (for git users) to specify revision ranges like
> 'master..next' instead. This makes it so. If no range is specified it
> defaults to 'HEAD'.
>
> As a consequence, the new name of the filtered branch must be the first
> non-option argument. All remaining arguments are passed to 'git rev-list'
> unmodified.
I was really close to doing this myself, but I thought there would be a
problem inferring the correct source branch.
But you're right, this is more gittish. (Consider that an ACK from me.)
Of course, it would be even more so if the target branch name was
"filtered", overrideable by "--target <name>".
Ciao,
Dscho
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-06 15:00 ` Johannes Schindelin
@ 2007-06-06 15:22 ` Johannes Sixt
2007-06-06 17:59 ` Johannes Schindelin
0 siblings, 1 reply; 27+ messages in thread
From: Johannes Sixt @ 2007-06-06 15:22 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
> Of course, it would be even more so if the target branch name was
> "filtered", overrideable by "--target <name>".
My plan for this is:
1. run the rev-list args ("$@") through rev-parse
2. pick only the positive ones (/^[a-z0-9]{40}$/)
3. filter show-ref against the result of 2.
4. foreach ref in the result of 3. install a refs/rewritten/$ref
with the mapped id if and only if the mapped id is different
from the original id of $ref.
Then you can, for example, 'git filter-branch --all' to rewrite all
branches.
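Step 2 could look roughly like the following sketch. The two ids are fabricated placeholders standing in for rev-parse output, and the hex-only character class is my own narrowing of the pattern above.

```sh
#!/bin/sh
# Made-up 40-character ids; rev-parse marks negative revisions with a leading "^".
pos=$(printf '%040d' 1)
neg=$(printf '%040d' 2)

# Keep only the lines that are a bare 40-digit hex id, i.e. the positive ones.
positive=$(printf '%s\n' "$pos" "^$neg" | grep -E '^[0-9a-f]{40}$')
echo "$positive"
```

The negative entry is dropped simply because its leading "^" makes it fail the anchored pattern.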
-- Hannes
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] filter-branch: use $(($i+1)) instead of $((i+1))
2007-06-05 10:34 ` Jonas Fonseca
2007-06-05 13:55 ` Johannes Schindelin
@ 2007-06-06 15:24 ` Johannes Schindelin
1 sibling, 0 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-06 15:24 UTC (permalink / raw)
To: Jonas Fonseca; +Cc: Junio C Hamano, Jakub Narebski, git
The expression $((i+1)) is not portable at all: even some bash versions
do not grok it. So do not use it.
Noticed by Jonas Fonseca.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
On Tue, 5 Jun 2007, Jonas Fonseca wrote:
> Junio C Hamano <gitster@pobox.com> wrote Tue, Jun 05, 2007:
> > Jonas Fonseca <fonseca@diku.dk> writes:
> >
> > > $ git filter-branch H2
> > > /home/fonseca/bin/git-filter-branch: 386: arith: syntax error: "i+1"
> > > $
> > >
> > > A possible fix that makes the test pass for me.
> >
> > [...]
> >
> > The portable ones we already have in the code say things like:
> >
> > msgnum=$(($msgnum+1))
>
> Yes, I should have investigated before sending.
>
> > The one in filter-branch that bit you does not dereference 'i'.
> > I am reasonably sure if you fix it to read:
> >
> > i=$(( $i+1 ))
> >
> > dash would grok it.
>
> This works here. Even without the spaces.
Voila.
git-filter-branch.sh | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 6807782..f89cfe1 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -334,7 +334,7 @@ test $commits -eq 0 && die "Found nothing to rewrite"
i=0
while read commit; do
- i=$((i+1))
+ i=$(($i+1))
printf "$commit ($i/$commits) "
git-read-tree -i -m $commit
--
1.5.2.1.2656.g1921f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped
2007-06-05 15:58 ` Johannes Schindelin
2007-06-06 7:43 ` Johannes Sixt
@ 2007-06-06 15:36 ` Johannes Sixt
2007-06-06 17:50 ` Johannes Schindelin
1 sibling, 1 reply; 27+ messages in thread
From: Johannes Sixt @ 2007-06-06 15:36 UTC (permalink / raw)
To: git, johannes.schindelin; +Cc: Johannes Sixt
The map() function can be used by filters to map a commit id to its
rewritten id. Such a mapping may not exist, in which case the identity
mapping is used (the commit is returned unchanged).
In the rewrite loop, this mapping is also needed, but was done
explicitly in the same way. Use the map() function instead.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
---
git-filter-branch.sh | 14 +++++---------
1 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index fbbb044..9e12a6c 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -189,7 +189,8 @@ USAGE="git-filter-branch [-d TEMPDIR] [-r STARTREV]... [-k KEEPREV]... [-s SRCBR
map()
{
- [ -r "$workdir/../map/$1" ] || return 1
+ # if it was not rewritten, take the original
+ [ -r "$workdir/../map/$1" ] || echo "$1"
cat "$workdir/../map/$1"
}
@@ -362,14 +363,9 @@ while read commit; do
parentstr=
for parent in $(get_parents $commit); do
- if [ -r "../map/$parent" ]; then
- for reparent in $(cat "../map/$parent"); do
- parentstr="$parentstr -p $reparent"
- done
- else
- # if it was not rewritten, take the original
- parentstr="$parentstr -p $parent"
- fi
+ for reparent in $(map "$parent"); do
+ parentstr="$parentstr -p $reparent"
+ done
done
if [ "$filter_parent" ]; then
parentstr="$(echo "$parentstr" | eval "$filter_parent")"
--
1.5.2.1.120.gd732
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped
2007-06-06 15:36 ` [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped Johannes Sixt
@ 2007-06-06 17:50 ` Johannes Schindelin
2007-06-06 18:38 ` [PATCH v2] " Johannes Sixt
0 siblings, 1 reply; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-06 17:50 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Hi,
On Wed, 6 Jun 2007, Johannes Sixt wrote:
> - [ -r "$workdir/../map/$1" ] || return 1
> + # if it was not rewritten, take the original
> + [ -r "$workdir/../map/$1" ] || echo "$1"
Maybe "test" instead of "["? Otherwise, ACK from me.
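One more observation, offered as a sketch and not as what the patch does: as posted, the "cat" still runs after the "echo" when there is no map file, so it would presumably complain on stderr. An if/else variant avoids that. The scratch directory layout and the ids "old1", "new1" and "old2" are made up.

```sh
#!/bin/sh
# Hypothetical scratch layout mirroring $workdir/../map from the script.
base=$(mktemp -d)
workdir=$base/t
mkdir -p "$base/t" "$base/map"

map()
{
	# if it was not rewritten, take the original
	if test -r "$workdir/../map/$1"
	then
		cat "$workdir/../map/$1"
	else
		echo "$1"
	fi
}

echo new1 >"$base/map/old1"
map old1	# has a map entry: prints the rewritten id
map old2	# no map entry: prints the id unchanged
```

Filters calling map() get exactly one kind of output either way, and nothing is written to stderr for unmapped commits.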
Ciao,
Dscho
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Add git-filter-branch
2007-06-06 15:22 ` Johannes Sixt
@ 2007-06-06 17:59 ` Johannes Schindelin
0 siblings, 0 replies; 27+ messages in thread
From: Johannes Schindelin @ 2007-06-06 17:59 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Hi,
On Wed, 6 Jun 2007, Johannes Sixt wrote:
> Johannes Schindelin wrote:
> > Of course, it would be even more so if the target branch name was
> > "filtered", overrideable by "--target <name>".
>
> My plan for this is:
>
> 1. run the rev-list args ("$@") through rev-parse
> 2. pick only the positive ones (/^[a-z0-9]{40}$/)
> 3. filter show-ref against the result of 2.
> 4. foreach ref in the result of 3. install a refs/rewritten/$ref
> with the mapped id if and only if the mapped id is different
> from the original id of $ref.
>
> Then you can, for example, 'git filter-branch --all' to rewrite all
> branches.
That sounds really sensible. For (2), I suggest "git-rev-parse
--symbolic", though. And maybe you want to make sure that there are no
invalid branch names, e.g. "git-filter-branch next~2". (Otherwise, you
would try to create refs/filtered/next~2 after filtering all commits.)
Ciao,
Dscho
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v2] filter-branch: also don't fail in map() if a commit cannot be mapped
2007-06-06 17:50 ` Johannes Schindelin
@ 2007-06-06 18:38 ` Johannes Sixt
0 siblings, 0 replies; 27+ messages in thread
From: Johannes Sixt @ 2007-06-06 18:38 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
The map() function can be used by filters to map a commit id to its
rewritten id. Such a mapping may not exist, in which case the identity
mapping is used (the commit is returned unchanged).
In the rewrite loop, this mapping is also needed, but was done
explicitly in the same way. Use the map() function instead.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
---
On Wednesday 06 June 2007 19:50, Johannes Schindelin wrote:
> On Wed, 6 Jun 2007, Johannes Sixt wrote:
> > - [ -r "$workdir/../map/$1" ] || return 1
> > + # if it was not rewritten, take the original
> > + [ -r "$workdir/../map/$1" ] || echo "$1"
>
> Maybe "test" instead of "["? Otherwise, ACK from me.
Right!
git-filter-branch.sh | 14 +++++---------
1 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index fbbb044..15e9b46 100644
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -189,7 +189,8 @@ USAGE="git-filter-branch [-d TEMPDIR] [-r STARTREV]... [-k KEEPREV]... [-s SRCBR
map()
{
- [ -r "$workdir/../map/$1" ] || return 1
+ # if it was not rewritten, take the original
+ test -r "$workdir/../map/$1" || echo "$1"
cat "$workdir/../map/$1"
}
@@ -362,14 +363,9 @@ while read commit; do
parentstr=
for parent in $(get_parents $commit); do
- if [ -r "../map/$parent" ]; then
- for reparent in $(cat "../map/$parent"); do
- parentstr="$parentstr -p $reparent"
- done
- else
- # if it was not rewritten, take the original
- parentstr="$parentstr -p $parent"
- fi
+ for reparent in $(map "$parent"); do
+ parentstr="$parentstr -p $reparent"
+ done
done
if [ "$filter_parent" ]; then
parentstr="$(echo "$parentstr" | eval "$filter_parent")"
--
1.5.2
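A note on the map() hunk as posted: when no map file exists, the "test ... ||
echo" line prints the original id, but the following cat still runs on the
missing file, so it complains on stderr and map() returns non-zero. The output
captured by $(map "$parent") is still the original id, so the rewrite loop
works. One possible way to avoid the stray error, shown only as a sketch and
not as part of this patch, is an explicit if/else. $workdir is assumed to be
set by the caller, as in the script.

```shell
#!/bin/sh
# Sketch: identity fallback without running cat on a missing file.
# $workdir is assumed to point at the temporary work tree, as in the script.

map()
{
	# if it was not rewritten, take the original
	if test -r "$workdir/../map/$1"
	then
		cat "$workdir/../map/$1"
	else
		echo "$1"
	fi
}
```

With this variant a filter can call map on any commit id and get either the
rewritten id(s) or the id itself, with a zero exit status in both cases.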
Thread overview: 27+ messages
2007-06-03 0:31 [PATCH] Add git-filter-branch Johannes Schindelin
2007-06-03 0:46 ` Jakub Narebski
2007-06-03 0:50 ` Johannes Schindelin
2007-06-03 10:28 ` Jakub Narebski
2007-06-03 18:36 ` Steven Grimm
2007-06-03 23:07 ` Johannes Schindelin
2007-06-05 10:18 ` Jonas Fonseca
2007-06-05 10:26 ` David Kastrup
2007-06-05 10:30 ` Junio C Hamano
2007-06-05 10:34 ` Jonas Fonseca
2007-06-05 13:55 ` Johannes Schindelin
2007-06-06 15:24 ` [PATCH] filter-branch: use $(($i+1)) instead of $((i+1)) Johannes Schindelin
2007-06-04 7:18 ` [PATCH] Add git-filter-branch Johannes Sixt
2007-06-04 7:59 ` Johannes Sixt
2007-06-04 16:11 ` Johannes Schindelin
2007-06-04 16:34 ` Johannes Sixt
2007-06-04 17:55 ` Johannes Schindelin
2007-06-05 7:01 ` Johannes Sixt
2007-06-05 15:58 ` Johannes Schindelin
2007-06-06 7:43 ` Johannes Sixt
2007-06-06 8:17 ` Junio C Hamano
2007-06-06 15:00 ` Johannes Schindelin
2007-06-06 15:22 ` Johannes Sixt
2007-06-06 17:59 ` Johannes Schindelin
2007-06-06 15:36 ` [PATCH] filter-branch: also don't fail in map() if a commit cannot be mapped Johannes Sixt
2007-06-06 17:50 ` Johannes Schindelin
2007-06-06 18:38 ` [PATCH v2] " Johannes Sixt