* Understanding git filter-branch --subdirectory-filter behaviour
@ 2008-05-20 20:11 David Tweed
2008-05-21 6:26 ` Johannes Sixt
From: David Tweed @ 2008-05-20 20:11 UTC
To: git mailing list
Hi, I'm experimenting with git filter-branch --subdirectory-filter
(being specific since it appears to have several special code branches
in the script) and getting results that I don't understand. Firstly,
can I confirm what appears implied by the man-page but I can't find
explicitly stated:
git filter-branch <how to filter> HEAD
is expected to do its filtering on the branch HEAD is on, over the
entire DAG all the way back to the initial commit, even if this is a DAG
with multiple branches splitting off and remerging?
I'm trying this on a (copy of a) repo containing a directory WRITING,
which has existed for most (though not quite all) of the repo's history,
and getting:
$ git filter-branch --subdirectory-filter WRITING/ HEAD
Rewrite 42f24be8d8198738134a19471697b39359199fa3 (351/351)
Ref 'refs/heads/master' was rewritten
$ git rev-list HEAD | wc
55 55 2255
Looking at this with gitk and git log confirms 55 commits, and the
first commit is the one immediately after the first merge encountered
(the commit that occurred just after the merge) when walking backwards
in history. Is this something that would be expected?
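One way to check where a rewritten history starts is to compare the earliest commit touching the directory before the rewrite with the root commit afterwards. What follows is a minimal sketch on a throwaway repository, not the repository discussed here: the toy history, file names and commit subjects are made up for illustration, and `--max-parents` and `FILTER_BRANCH_SQUELCH_WARNING` assume a git much newer than 1.5.5.

```shell
#!/bin/sh
# Sketch: does the rewritten history start where the path-limited one did?
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning delay in recent git

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical toy history: one side branch that never touches WRITING,
# merged back into the main line.
mkdir WRITING other
echo one > WRITING/x.txt
echo one > other/a.txt
git add .
git commit -q -m "c1: add WRITING and other"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b side
echo "one more" > other/a.txt
git commit -q -am "s1: change other only"

git checkout -q "$main"
echo "one more" > WRITING/x.txt
git commit -q -am "m1: change WRITING"
git merge -q --no-ff -m "M: merge side" side

# Earliest commit touching WRITING before the rewrite.
first_before=$(git log --reverse --topo-order --format=%s HEAD -- WRITING | head -n 1)

git filter-branch --subdirectory-filter WRITING HEAD >/dev/null

# Root commit (no parents) of the rewritten history.
root_after=$(git log --max-parents=0 --format=%s HEAD)

echo "earliest commit touching WRITING before: $first_before"
echo "root commit after the rewrite:           $root_after"
```

If the two subjects differ on a real repository, the rewritten history was cut short somewhere between them.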
Digging a little into the shell-script I find the list of commits is
generated with
git rev-list --reverse --topo-order --default HEAD --parents HEAD
--full-history -- WRITING
and (adding --pretty so I can easily read it) running this manually
gives 351 entries and looks to contain the expected commits. So I'm
confused what's happening?
If this is expected, is there a refspec I'm missing to get
filter-branch to filter the entire repo?
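For reference, the filter-branch synopsis accepts rev-list options after a `--`, and the man page's own examples use `-- --all` to rewrite every ref rather than just the current branch. The sketch below tries that on a throwaway repository with two branches; the branch name `side` and the file names are hypothetical, and nothing in this thread establishes that `--all` would have changed the result described above.

```shell
#!/bin/sh
# Sketch: rewrite every branch, not just the one HEAD points at.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning delay in recent git

repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir WRITING
echo one > WRITING/x.txt
git add .
git commit -q -m "c1: add WRITING/x.txt"
main=$(git symbolic-ref --short HEAD)

# Two diverging branches, each adding its own file under WRITING.
git checkout -q -b side
echo side > WRITING/side.txt
git add WRITING/side.txt
git commit -q -m "s1: add WRITING/side.txt"

git checkout -q "$main"
echo main > WRITING/main.txt
git add WRITING/main.txt
git commit -q -m "m1: add WRITING/main.txt"

# Everything after "--" is handed to rev-list; --all selects all refs.
git filter-branch --subdirectory-filter WRITING -- --all >/dev/null

# Each branch should now have the former contents of WRITING at its root.
for b in "$main" side; do
    echo "$b: $(git ls-tree --name-only "$b" | tr '\n' ' ')"
done
```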
(FWIW, git version 1.5.5.1.316.g377d9 on x86-64 Linux.)
Many thanks,
--
cheers, dave tweed__________________________
david.tweed@gmail.com
Rm 124, School of Systems Engineering, University of Reading.
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
* Re: Understanding git filter-branch --subdirectory-filter behaviour
From: Johannes Sixt @ 2008-05-21 6:26 UTC
To: David Tweed; +Cc: git mailing list
David Tweed wrote:
> $ git filter-branch --subdirectory-filter WRITING/ HEAD
> Rewrite 42f24be8d8198738134a19471697b39359199fa3 (351/351)
> Ref 'refs/heads/master' was rewritten
>
> $ git rev-list HEAD | wc
> 55 55 2255
>
...
>
> Digging a little into the shell-script I find the list of commits is
> generated with
>
> git rev-list --reverse --topo-order --default HEAD --parents HEAD
> --full-history -- WRITING
>
> and (adding --pretty so I can easily read it) running this manually
> gives 351 entries and looks to contain the expected commits. So I'm
> confused what's happening?
That's difficult to tell without a peek at the repository.
Did you compare 'gitk HEAD' to 'gitk HEAD -- WRITING'? I'd expect the
latter to be a subset of the former. Note that with a path specified
"history simplification" happens, which means that you won't see as many
merges as when no path is specified.
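The effect of history simplification can be seen by counting commits with and without a path, and with `--full-history`. This is a rough sketch on a made-up toy repository (the commit subjects and file names are illustrative only), and `git rev-list --count` assumes a git newer than the 1.5.5 mentioned in this thread.

```shell
#!/bin/sh
# Sketch: how a path argument changes which commits rev-list reports.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir WRITING other
echo one > WRITING/x.txt
echo one > other/a.txt
git add .
git commit -q -m "c1: add WRITING and other"
main=$(git symbolic-ref --short HEAD)

# Side branch that changes only files outside WRITING.
git checkout -q -b side
echo "one more" > other/a.txt
git commit -q -am "s1: change other only"

# Main line changes WRITING, then merges the side branch.
git checkout -q "$main"
echo "one more" > WRITING/x.txt
git commit -q -am "m1: change WRITING"
git merge -q --no-ff -m "M: merge side" side

all=$(git rev-list --count HEAD)
simplified=$(git rev-list --count HEAD -- WRITING)
full=$(git rev-list --count --full-history HEAD -- WRITING)

echo "all commits:                    $all"
echo "HEAD -- WRITING:                $simplified"
echo "--full-history HEAD -- WRITING: $full"
```

In this toy history the merge brings in nothing for WRITING from the side branch, so the default simplification follows only the main-line parent and drops the merge itself, while `--full-history` keeps the merge but still omits the side commit that never touched the path.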
-- Hannes
* Re: Understanding git filter-branch --subdirectory-filter behaviour
From: David Tweed @ 2008-05-22 18:05 UTC
To: Johannes Sixt; +Cc: git mailing list
On Wed, May 21, 2008 at 7:26 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> David Tweed wrote:
> That's difficult to tell without a peek at the repository.
>
> Did you compare 'gitk HEAD' to 'gitk HEAD -- WRITING'? I'd expect the
> latter to be a subset of the former. Note that with a path specified
> "history simplification" happens, which means that you won't see as many
> merges as when no path is specified.
I just did that in the repository as it was before filtering. "gitk HEAD
-- WRITING" shows no branches after the simplification, but it does go
back to the first commit in the repository that creates WRITING
(presumably simplifying out several branches that didn't affect
WRITING). The filtered repository, by contrast, starts at the commit
immediately after the first merge you encounter walking backwards in
time. I was prepared for the branch structure to simplify while keeping
all the commits that change that directory, but was a bit surprised
that it stopped before the first merge.
<in original>
$ git log HEAD -- WRITING | wc -l
2033
<in filtered repo>
$ git log | wc -l
329
So filter-branch is definitely producing a smaller history than
path-limited git log shows. If you would be interested in looking at the
actual repo (about 17M), let me know and I'll send you tarball details
via personal mail.
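Counting lines of `git log` output compares message text rather than commits, so a commit-by-commit comparison may be more telling. A minimal sketch follows, assuming an unfiltered copy and a filtered copy side by side; the directory names `orig` and `filtered` and the deliberately linear toy history are hypothetical, and `git -C` needs a git newer than the one used in this thread.

```shell
#!/bin/sh
# Sketch: list commits that touch WRITING in the original but are absent
# from the filtered copy, matching on author timestamp plus subject.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning delay in recent git

work=$(mktemp -d)
git init -q "$work/orig"
cd "$work/orig"

# Toy linear history; only two of the four commits touch WRITING.
mkdir WRITING other
echo one > other/a.txt
git add other
git commit -q -m "c1: add other"
echo one > WRITING/x.txt
git add WRITING
git commit -q -m "c2: add WRITING"
echo "one more" > other/a.txt
git commit -q -am "c3: change other"
echo "one more" > WRITING/x.txt
git commit -q -am "c4: change WRITING"

# Filter a clone so the original stays available for comparison.
git clone -q "$work/orig" "$work/filtered"
(cd "$work/filtered" && git filter-branch --subdirectory-filter WRITING HEAD >/dev/null)

git -C "$work/orig" log --format='%at %s' HEAD -- WRITING | sort > "$work/before.txt"
git -C "$work/filtered" log --format='%at %s' | sort > "$work/after.txt"

# Lines only in before.txt are commits the rewrite did not carry over.
comm -23 "$work/before.txt" "$work/after.txt" > "$work/missing.txt"

echo "commits touching WRITING that are missing after filtering:"
cat "$work/missing.txt"
```

On a real pair of repositories, the setup section would be dropped and `orig` and `filtered` pointed at the two existing copies.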
Anyway, many thanks for the insight and assistance,
--
cheers, dave tweed__________________________
david.tweed@gmail.com
Rm 124, School of Systems Engineering, University of Reading.
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot