* reflogs generated by git-cvsimport
@ 2009-07-31 17:42 Kalle Olavi Niemitalo
2009-07-31 19:13 ` Jeff King
0 siblings, 1 reply; 4+ messages in thread
From: Kalle Olavi Niemitalo @ 2009-07-31 17:42 UTC (permalink / raw)
To: git
When I run git cvsimport incrementally, it generates a reflog
entry for each CVS commit, because it runs git-update-ref after
each git-commit-tree. I think it'd be nicer to make just one
reflog entry at the end, when all CVS patchsets have been
imported. I could then use git log cvs/master@{1}..cvs/master
to see all the commits on master that were imported in the
latest run.
If git-cvsimport.perl kept the commit IDs in Perl variables only,
and then updated the refs once at the end, I'd get the reflogs I
prefer. However, if the script were interrupted in the middle,
it would then leave the refs unchanged, and the next cvsimport
run would have to download the same commits again. I suppose
that could be fixed with some $SIG{'INT'} handler. But then how
about the git repack -a -d that git-cvsimport.perl runs every
1024 commits: could that lose some commits that have been
imported from CVS but not yet saved in any ref?
Would it be better to create temporary refs/cvsimport/* and then
finally update the real refs based on those?
There's also another problem with the reflogs. The current
version of git-cvsimport sets GIT_COMMITTER_DATE and related
variables for git-commit-tree, and then leaves them set for
git-update-ref. So git-update-ref saves the author and date of
the CVS commit into the reflog. It would be better to save the
name of the person who is running git cvsimport, and the local
time. That one I've already fixed in my local version, but I'm
not really happy with how the code turned out.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: reflogs generated by git-cvsimport
2009-07-31 17:42 reflogs generated by git-cvsimport Kalle Olavi Niemitalo
@ 2009-07-31 19:13 ` Jeff King
2009-07-31 20:15 ` Kalle Olavi Niemitalo
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2009-07-31 19:13 UTC (permalink / raw)
To: Kalle Olavi Niemitalo; +Cc: git
On Fri, Jul 31, 2009 at 08:42:26PM +0300, Kalle Olavi Niemitalo wrote:
> If git-cvsimport.perl kept the commit IDs in Perl variables only,
> and then updated the refs once at the end, I'd get the reflogs I
> prefer. However, if the script were interrupted in the middle,
> it would then leave the refs unchanged, and the next cvsimport
> run would have to download the same commits again. I suppose
> that could be fixed with some $SIG{'INT'} handler. But then how
> about the git repack -a -d that git-cvsimport.perl runs every
> 1024 commits: could that lose some commits that have been
> imported from CVS but not yet saved in any ref?
>
> Would it be better to create temporary refs/cvsimport/* and then
> finally update the real refs based on those?
The "rebase" command does something similar by performing its changes on
a detached HEAD, and then writing the finished result to the real ref.
This gives you a detailed log in the HEAD reflog, but branch@{1} shows
the rebase as a single step.
However, I'm not sure that such a strategy would work for cvsimport,
which (IIRC) operates on branches that are not the HEAD. So I think
using refs/cvsimport/* instead of a detached HEAD makes sense.
Trying to do it all internally to the script seems needlessly complex
and error prone (and you would lose the detailed reflog, which you might
sometimes want for debugging or whatever).
> There's also another problem with the reflogs. The current
> version of git-cvsimport sets GIT_COMMITTER_DATE and related
> variables for git-commit-tree, and then leaves them set for
> git-update-ref. So git-update-ref saves the author and date of
> the CVS commit into the reflog. It would be better to save the
> name of the person who is running git cvsimport, and the local
> time. That one I've already fixed in my local version, but I'm
> not really happy with how the code turned out.
Yeah, it probably should not munge the reflog with the CVS committer
information. I suspect it would be as easy as the following (totally
untested, not even syntax checked) patch:
---
diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index e439202..a94c48d 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -742,14 +742,16 @@ sub commit {
}
my $commit_date = strftime("+0000 %Y-%m-%d %H:%M:%S",gmtime($date));
- $ENV{GIT_AUTHOR_NAME} = $author_name;
- $ENV{GIT_AUTHOR_EMAIL} = $author_email;
- $ENV{GIT_AUTHOR_DATE} = $commit_date;
- $ENV{GIT_COMMITTER_NAME} = $author_name;
- $ENV{GIT_COMMITTER_EMAIL} = $author_email;
- $ENV{GIT_COMMITTER_DATE} = $commit_date;
- my $pid = open2(my $commit_read, my $commit_write,
- 'git-commit-tree', $tree, @commit_args);
+ my $pid = do {
+ local $ENV{GIT_AUTHOR_NAME} = $author_name;
+ local $ENV{GIT_AUTHOR_EMAIL} = $author_email;
+ local $ENV{GIT_AUTHOR_DATE} = $commit_date;
+ local $ENV{GIT_COMMITTER_NAME} = $author_name;
+ local $ENV{GIT_COMMITTER_EMAIL} = $author_email;
+ local $ENV{GIT_COMMITTER_DATE} = $commit_date;
+ open2(my $commit_read, my $commit_write,
+ 'git-commit-tree', $tree, @commit_args);
+ }
# compatibility with git2cvs
substr($logmsg,32767) = "" if length($logmsg) > 32767;
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: reflogs generated by git-cvsimport
2009-07-31 19:13 ` Jeff King
@ 2009-07-31 20:15 ` Kalle Olavi Niemitalo
2009-07-31 20:40 ` Jeff King
0 siblings, 1 reply; 4+ messages in thread
From: Kalle Olavi Niemitalo @ 2009-07-31 20:15 UTC (permalink / raw)
To: git
Jeff King <peff@peff.net> writes:
> Yeah, it probably should not munge the reflog with the CVS committer
> information. I suspect it would be as easy as the following (totally
> untested, not even syntax checked) patch:
That patch does not work because the $commit_read and
$commit_write file handles fall out of scope too early.
Those and $pid could be returned from the do {...} as
a list, but I think it's easier to remove the "do", declare
the variables above the block, and assign them in the block.
Also, there's $ENV{'TZ'}="UTC" at the beginning of the script
and it affects the reflogs too. This is the annoying part.
The script runs numerous subprocesses and it is not clear to
me which of those need TZ=UTC and which ones should use the
original TZ:
- git config: doesn't matter?
- cvs: UTC?
- rsh: UTC?
- git rev-parse --verify: depends on whether $name looks in reflog
- git-init: doesn't matter?
- git-read-tree: doesn't matter
- git-symbolic-ref: original if this can write to reflog
- git-rev-parse --verify HEAD: doesn't matter
- git-for-each-ref: doesn't matter
- cvsps: UTC?
- git-update-index: doesn't matter
- git-write-tree: doesn't matter
- git-commit-tree: doesn't matter because GIT_COMMITTER_DATE and
GIT_COMMITTER_DATE already specify "+0000". (Might be nice to
have author-specific time zones there though.)
- git-update-ref: original. (Also, -m cvsimport could be added.)
- git-tag: doesn't matter because cvsimport never uses git tag -a.
- git update-ref: original
- git-hash-object: doesn't matter
- git repack: doesn't matter?
- git-count-objects: doesn't matter
- git-merge: original
- git checkout: doesn't matter?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: reflogs generated by git-cvsimport
2009-07-31 20:15 ` Kalle Olavi Niemitalo
@ 2009-07-31 20:40 ` Jeff King
0 siblings, 0 replies; 4+ messages in thread
From: Jeff King @ 2009-07-31 20:40 UTC (permalink / raw)
To: Kalle Olavi Niemitalo; +Cc: git
On Fri, Jul 31, 2009 at 11:15:14PM +0300, Kalle Olavi Niemitalo wrote:
> That patch does not work because the $commit_read and
> $commit_write file handles fall out of scope too early.
> Those and $pid could be returned from the do {...} as
> a list, but I think it's easier to remove the "do", declare
> the variables above the block, and assign them in the block.
Oops, indeed, I clearly did not look closely. But I see you understood
what I was trying to say, and I think you are right that it is probably
cleaner to just "my" them right before the block.
> Also, there's $ENV{'TZ'}="UTC" at the beginning of the script
> and it affects the reflogs too. This is the annoying part.
Since that is covering the whole script, it is obviously a harder issue
and should probably be a separate patch from the GIT_COMMITTER_*
information.
> The script runs numerous subprocesses and it is not clear to
> me which of those need TZ=UTC and which ones should use the
> original TZ:
Sadly, there is nothing useful in the commit history as the TZ setting
goes all the way back to the script being added. I would guess it is
there to convince cvs to give us a consistent time, since its log output
usually comes out in the local timezone (though since cvsimport is based
on cvsps, I would assume cvsps handles this sanely).
I suspect if you set it for cvs and cvsps, that would be sufficient. The
rest of git should use the original.
-Peff
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2009-07-31 20:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-31 17:42 reflogs generated by git-cvsimport Kalle Olavi Niemitalo
2009-07-31 19:13 ` Jeff King
2009-07-31 20:15 ` Kalle Olavi Niemitalo
2009-07-31 20:40 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox