From: Michael O'Cleirigh <michael.ocleirigh@rivulet.ca>
To: Johannes Sixt <j6t@kdbg.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] git-filter-branch: add --egrep-filter option
Date: Sat, 16 Apr 2011 21:45:50 -0400
Message-ID: <4DAA464E.7010804@rivulet.ca>
In-Reply-To: <201104161016.51690.j6t@kdbg.org>

Hi Johannes,

Thanks for commenting on this patch.

> On Samstag, 16. April 2011, Michael O'Cleirigh wrote:
>> The --subdirectory-filter will look for a single directory and then rewrite
>> history to make its content the root. This is ok except for cases where we
>> want to retain history of those files before they were moved into that
>> directory.
>>
>> The --egrep-filter option allows specifying an egrep regex for the files in
>> the tree of each commit to keep. For example:
>>
>> The directories we want are A, B, C and D, and they exist over several
>> different lifetimes: sometimes A and B exist together, then B and C, and
>> finally D.
>>
>> e.g. git-filter-branch --egrep-filter "(A|B|C|D)"
>>
>> Each commit will then contain different combinations of A or B or C or D
>> (up to A and B and C and D).
> Why do you need a new --...-filter option for this? Your implementation is
> merely an instance of an --index-filter, and at that a very specialized one,
> which operates only at the top-most directory level.
>
At work we needed to split out 2 more modules from a 1400-revision
repository that we imported from Subversion.

Each had originally been created under different names at the top level
and only recently moved into a more logical single-directory-per-project
structure. When we first ran filter-branch with --subdirectory-filter we
got only 6 commits, instead of the 100 commits we ended up with after
using the --egrep-filter method.

I tried a tree-filter first but it was slow, and the same method as an
index filter was slower: I would search for the paths that didn't match
the filter (egrep -v "pattern") and then remove each of them.

By using this egrep-filter option it took only 5 minutes per repo, versus
more than 8 hours for the tree-filter approach.
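
For illustration, the index-filter variant described above (list the paths, select the ones that do not match with an inverted egrep, remove each from the index) might look roughly like the sketch below. It is not the script that was actually used. The function names `paths_to_remove` and `keep_only` and the anchored pattern form are assumptions made for this example, and `grep -z` and `xargs -0 -r` assume GNU grep and GNU xargs.

```shell
#!/bin/sh
# Hypothetical helper: read newline-separated paths on stdin and print
# those that do NOT match the keep-pattern, i.e. the removal candidates.
paths_to_remove() {
    # grep exits 1 when it prints nothing; treat that as success.
    grep -E -v -- "$1" || true
}

# Hypothetical wrapper: rewrite all refs, keeping only the paths that match
# the pattern. The pattern must not contain single quotes, because it is
# spliced into the filter string below.
keep_only() {
    pattern=$1
    git filter-branch --index-filter "
        git ls-files -z |
        grep -z -E -v -- '$pattern' |
        xargs -0 -r git rm --cached -q --ignore-unmatch --
    " -- --all
}
```

On a throwaway clone this could be invoked as `keep_only '^(A|B|C|D)/'`. Because the filter forks `git ls-files`, `grep`, `xargs` and `git rm` for every commit, the per-commit overhead adds up over a long history.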

I posted to the list in case it might be useful to others, but I didn't
really know whether it would be.

After considering your comment I have to agree that it is a special case
of index-filter and probably not general enough to justify adding a new
command-line option.

Regards,
Mike