* Re: How to rewrite author history
From: Andrew Arnott @ 2008-03-21 16:41 UTC
To: git

I imported my git repo from an SVN repo, and the authors have
email@SOME-GUID for their email address rather than their actual one
(probably courtesy of Google Code hosting). Rewriting history and
changing all the commit hashes isn't a problem at this point in
development, so how can I do a massive search-and-replace to replace
several specific author emails with the valid ones?

--
Andrew Arnott

* Re: How to rewrite author history
From: Dmitry Potapov @ 2008-03-21 18:40 UTC
To: Andrew Arnott; +Cc: git

On Fri, Mar 21, 2008 at 7:41 PM, Andrew Arnott <andrewarnott@gmail.com> wrote:
> I imported my git repo from an SVN repo, and the authors have
> email@SOME-GUID for their email address rather than their actual one
> (probably courtesy of Google Code hosting). Rewriting history and
> changing all the commit hashes isn't a problem at this point in
> development, so how can I do a massive search-and-replace to replace
> several specific author emails with the valid ones?

Please, take a look at man git-filter-branch. I believe you can do that
using env-filter, but I have never used this filter myself.

Dmitry

* Re: How to rewrite author history
From: Samuel Tardieu @ 2008-03-22 9:29 UTC
To: Andrew Arnott; +Cc: git

>>>>> "Andrew" == Andrew Arnott <andrewarnott@gmail.com> writes:

Andrew> I imported my git repo from an SVN repo, and the authors have
Andrew> email@SOME-GUID for their email address rather than their
Andrew> actual one (probably courtesy of Google Code hosting).
Andrew> Rewriting history and changing all the commit hashes isn't a
Andrew> problem at this point in development, so how can I do a
Andrew> massive search-and-replace to replace several specific author
Andrew> emails with the valid ones?

If you can reimport it, you can use the "--authors-file" of "git svn".

  Sam

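For anyone who can still re-import, a minimal sketch of the
--authors-file route follows. The file maps each SVN login to the
identity git svn should record. The temporary file name and the
repository URL are assumptions made for illustration, so the clone
command is left commented out.

```shell
# Write a small authors file. Left-hand side: the SVN login as it appears
# in the SVN log. Right-hand side: the name and email git should record.
authors_file=$(mktemp)
cat >"$authors_file" <<'EOF'
andrewarnott = Andrew Arnott <andrewarnott@gmail.com>
EOF

# Hypothetical URL and target directory; substitute the real ones to run it.
# git svn clone --authors-file="$authors_file" http://svn.example.com/project project
```

Every login that appears in the SVN history needs its own line in the
file, so it is worth listing the logins before starting the import.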
* Re: How to rewrite author history
From: Andrew Arnott @ 2008-03-22 13:11 UTC
To: Samuel Tardieu; +Cc: git

Thanks. Re-importing from SVN isn't an option any more, but I ended up
with something like this that seems to have worked.

    git checkout master
    git filter-branch --env-filter '
        # Replace the identity that the SVN import produced.
        if [ "$GIT_AUTHOR_NAME" = "andrewarnott" ]; then
            export GIT_AUTHOR_EMAIL="andrewarnott@gmail.com"
            export GIT_AUTHOR_NAME="Andrew Arnott"
        fi
        # Runs for every commit: the committer is set to match the author.
        export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
        export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
    '

And I did this for master, and my v1 and v0.1 branches. I'm concerned
though, that since I changed the names of all the objects by doing
this, did I somehow make my branches incompatible with each other?
Will there be any problems in the future sharing commits or merging
across branches as a result?

Thanks.

--
Andrew Arnott

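The original question asked about several authors at once. One way to
extend the filter above is a case statement keyed on the imported author
name. This is only a sketch: the second entry ("otherlogin") is a made-up
placeholder, the variable and function names are hypothetical, and unlike
the filter above it leaves the committer fields untouched.

```shell
# The filter body is kept in a variable so it can be inspected on its own.
# Unmatched names fall through the case statement and stay unchanged.
env_filter='
case "$GIT_AUTHOR_NAME" in
andrewarnott)
    GIT_AUTHOR_NAME="Andrew Arnott"
    GIT_AUTHOR_EMAIL="andrewarnott@gmail.com"
    ;;
otherlogin)
    # Placeholder entry, not taken from the thread.
    GIT_AUTHOR_NAME="Other Person"
    GIT_AUTHOR_EMAIL="other@example.com"
    ;;
esac
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
'

# Rewrites every ref in one run; "--all" after "--" is handed to rev-list.
rewrite_authors() {
    git filter-branch --env-filter "$env_filter" -- --all
}
```

Calling rewrite_authors inside the repository performs the rewrite, so it
is best tried on a scratch clone first.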
* Re: How to rewrite author history
From: Jeff King @ 2008-03-22 16:57 UTC
To: Andrew Arnott; +Cc: Samuel Tardieu, git

On Sat, Mar 22, 2008 at 06:11:12AM -0700, Andrew Arnott wrote:

> git filter-branch --env-filter '
> [...]
> And I did this for master, and my v1 and v0.1 branches. I'm concerned
> though, that since I changed the names of all the objects by doing
> this, did I somehow make my branches incompatible with each other?
> Will there be any problems in the future sharing commits or merging
> across branches as a result?

There are two concerns, and I'm not sure which you have (I think number
1):

  1. Your branches within the repository will not connect anymore. I
     believe this is a non-issue with your filter, since the generated
     commit IDs are deterministic. Certainly a toy case worked for me
     with:

       for i in master branch; do
         git filter-branch --env-filter '...' $i
       done

     You can also specify both to be done at the same time, which is
     more efficient:

       git filter-branch --env-filter '...' master branch

     You can check the graph structure with "gitk master branch" which
     should show them connecting.

  2. Your branches are now a different, rewritten history compared to
     anyone who has cloned or fetched from you. This is unavoidable, and
     the answer is either "don't use filter-branch" or "tell everyone to
     rebase their work on the new history." So the best time to
     filter-branch is right after import, but before you start work.

-Peff

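The first concern can also be checked without a GUI. Because the same
filter applied to the same commit yields the same new ID, a shared
ancestor is rewritten identically on every branch, and the rewritten
branches should still have a merge base. A minimal sketch, assuming a
hypothetical helper name (branches_connect):

```shell
# Succeeds when the two refs share at least one ancestor commit.
# "git merge-base" exits non-zero when no common ancestor exists.
branches_connect() {
    git merge-base "$1" "$2" >/dev/null 2>&1
}
```

For the branches mentioned in this thread the check would be, for
example, "branches_connect master v1 && echo connected".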
* Re: How to rewrite author history
From: Andrew Arnott @ 2008-03-22 19:06 UTC
To: Jeff King; +Cc: Samuel Tardieu, git

Thanks, Jeff. That was very helpful. I have published my repo online
already, but only a couple people (if even that) have cloned it by now
and I am prepared to email the list of interested parties letting them
know. About this rebasing thing, is there a better way than for them
to just wipe their repo and clone again? Would a simple git fetch and
git rebase do the trick?

Thanks again, everyone.

--
Andrew Arnott

* Re: How to rewrite author history
From: Jeff King @ 2008-03-23 7:09 UTC
To: Andrew Arnott; +Cc: Samuel Tardieu, git

On Sat, Mar 22, 2008 at 12:06:37PM -0700, Andrew Arnott wrote:

> Thanks, Jeff. That was very helpful. I have published my repo online
> already, but only a couple people (if even that) have cloned it by now
> and I am prepared to email the list of interested parties letting them
> know. About this rebasing thing, is there a better way than for them
> to just wipe their repo and clone again? Would a simple git fetch and
> git rebase do the trick?

Short answer: if they haven't done any work on top of yours, re-cloning
is probably the simplest route.

If they do have work, then they will want to fetch and rebase. The
commands are fairly simple, but what is happening is a little tricky, so
I'll subject you to some ascii art.

  # user has commits C..D built on top of your original A..B (in the
  # diagram, "..." refers to an arbitrary number of commits)
  #
  #   A--...--B <-- origin/master
  #            \
  #             C--...--D <-- master
  git fetch

  # after the fetch, we now have the filtered A'..B' pointed to by
  # origin/master, but the reflog for origin/master points to the
  # original.
  #
  #   A'--...--B' <-- origin/master
  #   A--...--B <-- origin/master@{1}
  #            \
  #             C--...--D <-- master
  #
  # so now we can rebase. We want all of the commits between the
  # _original_ upstream and our current state to be rebased on top
  # of the new upstream.
  git rebase --onto origin/master origin/master@{1} master

Three things to note here.

  1. This works even if C..D is empty, so it is valid even if they
     didn't do any work. Though in that case, simply doing "git reset
     --hard origin/master" would work just as well.

  2. The annoying thing is that you have to do this for every branch.
     So depending on how many branches you have and how much work they
     did, it may just be simpler to export the work as patches,
     re-clone, and then apply:

       git checkout master
       git format-patch --stdout origin/master >/some/path/outside/repo
       cd .. && rm -rf repo
       git clone /path/to/repo && cd repo
       git am /some/path/outside/repo

  3. Since you haven't changed the trees at all, a fetch will just need
     to download the new commits. Thus a fetch should be way less
     network-intensive than a re-clone. Whether that matters, of
     course, depends on your repo size and your users' bandwidth.

Hope that helps,
-Peff
