From: Michael O'Cleirigh <michael.ocleirigh@rivulet.ca>
To: Johannes Sixt <j6t@kdbg.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] git-filter-branch: add --egrep-filter option
Date: Sat, 16 Apr 2011 21:45:50 -0400
Message-ID: <4DAA464E.7010804@rivulet.ca>
In-Reply-To: <201104161016.51690.j6t@kdbg.org>
Hi Johannes,
Thanks for commenting on this patch.
> On Saturday, 16 April 2011, Michael O'Cleirigh wrote:
>> The --subdirectory-filter will look for a single directory and then rewrite
>> history to make its content the root. This is ok except for cases where we
>> want to retain history of those files before they were moved into that
>> directory.
>>
>> The --egrep-filter option allows specifying an egrep regex for the files in
>> the tree of each commit to keep. For example:
>>
>> Directories we want are A, B, C, D and they exist in several different
>> lifetimes. A and B exist sometimes together then B and C and finally then
>> D.
>>
>> e.g. git-filter-branch --egrep-filter "(A|B|C|D)"
>>
>> Each commit will then contain different combinations of A or B or C or D
>> (up to A and B and C and D).
> Why do you need a new --...-filter option for this? Your implementation is
> merely an instance of an --index-filter, and at that a very specialized one,
> which operates only at the top-most directory level.
>
At work we needed to split out two more modules from a 1400-revision
repository that we imported from Subversion.
Each had originally been created under a different name at the top level
and only recently moved into a more logical single-directory-per-project
structure. When we first ran filter-branch with the
--subdirectory-filter we got only 6 commits, instead of the 100 commits
we ended up with after using the --egrep-filter method.
I tried a tree-filter first, but it was slow, and then the same method as
an index-filter was slower: I would search for the paths that didn't
match the filter (egrep -v "pattern") and then remove each of them.
By using this egrep-filter option it only took 5 minutes per repo vs >8
hours for the tree-filter approach.
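For reference, the "keep only matching paths" idea can be sketched as a plain --index-filter that removes all non-matching paths in one update-index call per commit, rather than one removal per path. This is only a sketch and not the patch itself: the function names paths_to_drop and rewrite_keep_only are made up for illustration, the pattern is assumed to contain no single quotes, path names are assumed to contain no newlines or characters that git would quote, and it has not been benchmarked against the timings above.

```shell
# Print, from a list of paths on stdin, those that do NOT match the
# extended regex given as $1 (the paths we want to drop).
paths_to_drop() {
    # grep exits 1 when no line is selected; that is not an error here.
    grep -Ev "$1" || true
}

# Rewrite every ref so that each commit keeps only the paths matching $1.
# Hypothetical helper: the same selection is inlined into the
# --index-filter string, because filter-branch evaluates the filter in its
# own shell, where functions from the calling shell are not defined.
rewrite_keep_only() {
    git filter-branch --prune-empty --index-filter "
        git ls-files | grep -Ev '$1' |
        git update-index --force-remove --stdin
    " -- --all
}
```

In a throwaway clone this would be run as rewrite_keep_only '^(A|B|C|D)/'. Anchoring the pattern with ^ and a trailing / restricts the match to top-level directories, which the bare "(A|B|C|D)" would not.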
I posted to the list in case it might be useful to others, but I didn't
really know whether it would be or not.
After considering your comment I have to agree with you that it is a
special case of index-filter and probably not general enough to justify
adding a new command-line option.
Regards,
Mike