* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 16:31 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <20050524161745.GA9537@cip.informatik.uni-erlangen.de>
On Tue, 24 May 2005, Thomas Glanzmann wrote:
>
> And your script does that:
>
> export GIT_COMMITTER_NAME=roessler
> export GIT_COMMITTER_EMAIL=roessler
> export GIT_AUTHOR_NAME=roessler
> export GIT_AUTHOR_EMAIL=roessler
> export GIT_AUTHOR_DATE='1998/06/20 03:53:44'
> ln -sf refs/heads/'master' .git/HEAD
> git-read-tree -m HEAD
> git-checkout-cache -f -u -a
> mkdir -p doc
> co -p -r1.2.2.1 '/home/cip/adm/sithglan/work/mutt/cvsrepository/doc/Attic/manual.sgml,v' > 'doc/manual.sgml'
> git-update-cache --add -- 'doc/manual.sgml'
> tree=$(git-write-tree)
> cat > .cmitmsg <<EOFMSG
> documenting alias-path
> EOFMSG
> commit=$(cat .cmitmsg | git-commit-tree $tree -p HEAD)
> echo $commit > .git/HEAD
>
> The problem might be that this is the first commit in the branch. But I thought
> it should end up in refs/heads/mutt-0-93.
Yes, you're using the cvs2git from yesterday, which didn't write the new
commit to the right branch. This is part of the branch fixing I've done.
Wait another few minutes and I'll commit my fix for the problem with cvsps
branch handling (and I need to escape '$' in <<EOFMSG handling).
Linus
^ permalink raw reply
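An alternative to escaping individual characters, offered here only as a
sketch and not as what cvs2git actually does, is to quote the
here-document delimiter. With a quoted delimiter the shell performs no
expansion in the body, so '$' and back-ticks pass through untouched. The
function name show_literal_msg and the sample text are made up for
illustration.

```shell
#!/bin/sh
# Hypothetical example: a quoted delimiter ('EOFMSG') turns off all
# expansion inside the here-document body.
show_literal_msg() {
    cat <<'EOFMSG'
costs $5 and `uname` stays literal
EOFMSG
}

show_literal_msg
```

The catch is that a log message containing a line that consists solely of
EOFMSG would still end the here-document early, so the delimiter has to be
chosen with care.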
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 16:53 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505240929051.2307@ppc970.osdl.org>
On Tue, 24 May 2005, Linus Torvalds wrote:
>
> Wait another few minutes and I'll commit my fix for the problem with cvsps
> branch handling (and I need to escape '$' in <<EOFMSG handling).
Ok, committed. It takes a few minutes for the mirroring to pick it up, but
you should soon see a commit that says
cvs2git: escape <<EOF messages, and work around cvsps branch handling
This escapes '$' characters in <<-handling, and gives preference to
the new branch when cvsps incorrectly reports a commit as originating
on an old branch.
and once you do, you should have something that works.
Of course, I've still only tested it on syslinux, but it converts a
syslinux CVS repo in 64 seconds for me, and now the result really _does_
look correct at least superficially. Ie I can see 1029 commits on HEAD,
which is exactly what cvsps also reports.
And I see four different branches (HEAD is called "master" as per the
normal naming):
torvalds@ppc970:~/src/osscvs/syslinux> ll .git/refs/heads/
total 16K
-rw-rw-r-- 1 torvalds torvalds 41 May 24 09:36 branch-1_xx
-rw-rw-r-- 1 torvalds torvalds 41 May 24 09:37 master
-rw-rw-r-- 1 torvalds torvalds 41 May 24 09:36 syslinux
-rw-rw-r-- 1 torvalds torvalds 41 May 24 09:36 syslinux-1_6x-1
and doing a
git-rev-tree branch-1_xx master syslinux syslinux-1_6x-1 | wc -l
reports 1046 total revisions (which also matches cvsps exactly).
So things look ok, but I haven't actually checked the _contents_ of the
tree, except to check that the patches that "git-whatchanged -p" reports
look sane.
There's two remaining bad things:
- name translation doesn't exist (so all of Peter's changesets get
reported as author "hpa <hpa>")
- the commit time will be the conversion time, not the original commit
time (but the _author_ time will be correct). I suspect that for a
conversion like this, we really should add support for GIT_COMMIT_DATE.
That would also make the archive conversion 100% reproducible, ie
everybody should get the exact same objects (and thus the exact same
SHA1 values) which is good.
I'll add the GIT_COMMIT_DATE thing, but the name translation is for
somebody else.
Linus
^ permalink raw reply
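The reproducibility argument can be illustrated with a rough sketch. A
commit's SHA1 covers the committer line, so two conversions agree only if
the committer date is pinned as well as the author date. The commit_hash
helper below is hypothetical: it hashes a simplified commit-like text
record with sha1sum (assumed to be installed) and is not git's actual
object encoding.

```shell
#!/bin/sh
# Hypothetical helper: SHA-1 of a simplified commit-like record.
# $1 = tree id, $2 = author date, $3 = committer date (plain strings).
commit_hash() {
    printf 'tree %s\nauthor hpa <hpa> %s +0000\ncommitter hpa <hpa> %s +0000\n\nimport\n' \
        "$1" "$2" "$3" | sha1sum | cut -d' ' -f1
}

tree=7e68fd9a5104b61192a7da7357549d95b3a0620c

# Committer date pinned to the CVS time: every run prints the same hash.
commit_hash "$tree" '1998/06/20 03:53:44' '1998/06/20 03:53:44'
# Committer date taken from the conversion time: the hash changes.
commit_hash "$tree" '1998/06/20 03:53:44' '2005/05/24 16:53:00'
```

Only the committer date differs between the two calls, which is enough to
give a different hash.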
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 17:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Kay Sievers, Petr Baudis, Thomas Glanzmann,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505240849050.2307@ppc970.osdl.org>
Linus Torvalds wrote:
>
> On Tue, 24 May 2005, Linus Torvalds wrote:
>
>>It has the logic for branches, but it doesn't work, and I'm fed up enough
>>with CVS and RCS for the moment that I'm not going to work on it any more
>>tonight.
>
>
> I'm back, and yes, it was a really stupid thing.
>
> However, David, I need more help deciphering "cvsps" output..
>
> Fixing the branch handling shows that cvsps does some really strange
> things with the newly added "Ancestor graph". Here's one example:
>
Yes. While not falling asleep last night I realized that the
quick-and-dirty approach was bogus. I need to track what the ancestor
is as I'm building up the data structure, not while outputting it. So
I'm working on a correct version which puts ancestor_branch into the
PatchSet structure itself.
It's completely done now except that it segfaults instantly.
BTW where did you get the cvsroot for syslinux? Could I get a copy
somewhere?
David
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 17:23 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505240943080.2307@ppc970.osdl.org>
On Tue, 24 May 2005, Linus Torvalds wrote:
>
> - the commit time will be the conversion time, not the original commit
> time (but the _author_ time will be correct). I suspect that for a
> conversion like this, we really should add support for GIT_COMMIT_DATE.
>
> That would also make the archive conversion 100% reproducible, ie
> everybody should get the exact same objects (and thus the exact same
> SHA1 values) which is good.
>
> I'll add the GIT_COMMIT_DATE thing, but the name translation is for
> somebody else.
Done. I've also fixed the timezone to "+0000", so that it doesn't matter
where you do the conversion, you should always get the same results
(again, I just pushed that out, it might not have hit the public mirrors
yet).
To get GIT_COMMITTER_DATE (note: COMMITTER, not COMMIT, to illogically
match the name/email ones) you obviously also need a new git. So to have
it all working right, you should have the top commits in git-tools and git
be
cvs2git: set timezone info to UTC, the way CVS does
and
git-commit-tree: allow overriding of commit date
respectively.
And if this doesn't work for you, point me to the CVS archive that causes
you trouble.
Linus
^ permalink raw reply
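As a sketch of what a generated conversion script might contain once both
dates are pinned to UTC: the helper name emit_date_exports is made up, and
the exact date syntax accepted by git-commit-tree is an assumption (the
format is copied from the GIT_AUTHOR_DATE line quoted at the start of the
thread, with "+0000" appended).

```shell
#!/bin/sh
# Hypothetical helper: print export lines that pin both the author and
# the committer date to the original CVS timestamp, marked as UTC.
# The date syntax is an assumption, not taken from cvs2git.
emit_date_exports() {
    printf "export GIT_AUTHOR_DATE='%s +0000'\n" "$1"
    printf "export GIT_COMMITTER_DATE='%s +0000'\n" "$1"
}

emit_date_exports '1998/06/20 03:53:44'
```

Because neither value depends on the local clock or timezone, two people
running the conversion would feed git-commit-tree identical input.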
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 17:28 UTC (permalink / raw)
To: David Mansfield
Cc: H. Peter Anvin, Kay Sievers, Petr Baudis, Thomas Glanzmann,
Git Mailing List
In-Reply-To: <42935F96.8030205@cobite.com>
On Tue, 24 May 2005, David Mansfield wrote:
>
> Yes. While not falling asleep last night I realized that the
> quick-and-dirty approach was bogus. I need to track what the ancestor
> is as I'm building up the data structure, not while outputting it.
Yes.
> It's completely done now except that it segfaults instantly.
Very good. Are you going to also make a new release at some point, so that
we don't have strange random patches floating around?
> BTW where did you get the cvsroot for syslinux? Could I get a copy
> somewhere?
Peter sent it in private email, I don't know how public that is (it
probably is perfectly public and he just didn't want to spam the mailing
list or run afoul of size limits, but I just don't know for sure, so..),
but I bet he'll happily send it to you too.
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-24 18:29 UTC (permalink / raw)
To: Git Mailing List; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.58.0505240943080.2307@ppc970.osdl.org>
Hello,
> - name translation doesn't exist (so all of Peter's changesets get
> reported as author "hpa <hpa>")
I'll pick that up.
> - the commit time will be the conversion time, not the original commit
> time (but the _author_ time will be correct). I suspect that for a
> conversion like this, we really should add support for GIT_COMMIT_DATE.
Thanks. I wanted to send patches or at least ask you this for ages, but
never did. :-)
Thomas
^ permalink raw reply
* Re: gitweb wishlist
From: H. Peter Anvin @ 2005-05-24 18:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Mansfield, Kay Sievers, Petr Baudis, Thomas Glanzmann,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241024450.2307@ppc970.osdl.org>
Linus Torvalds wrote:
>
> Peter sent it in private email, I don't know how public that is (it
> probably is perfectly public and he just didn't want to spam the mailing
> list or run afoul of size limits, but I just don't know for sure, so..),
> but I bet he'll happily send it to you too.
>
Already sent... I haven't looked it over to make sure there isn't
anything that shouldn't be in there yet, so if you need to distribute it
please give me a warning so I can look it over first.
-hpa
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-24 18:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241017510.2307@ppc970.osdl.org>
[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]
Hello,
> And if this doesn't work for you, point me to the CVS archive that causes
> you trouble.
you should try the mutt cvs repository[1].
I have the following issues, all of which seem easy to fix:
- PatchSet 1 depends on PatchSet 2 (but cvsps gets the ordering wrong;
should be easily fixable) (I just switched the two before
running cvs2git)
- Some shell escapes (I haven't looked into them yet)
(faui02new) [/var/tmp/sithglan/mutt-cvs] bash ~/work/cvsps/sane
defaulting to local storage area
Committing initial tree 7e68fd9a5104b61192a7da7357549d95b3a0620c
Ignoring path .cvsignore
...
/home/cip/adm/sithglan/work/cvsps/sane: line 1: ...: command not found
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 1: unexpected EOF while looking for matching `''
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 4: syntax error: unexpected end of file
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 1: unexpected EOF while looking for matching `''
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 5: syntax error: unexpected end of file
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 1: unexpected EOF while looking for matching `''
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 2: syntax error: unexpected end of file
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 1: unexpected EOF while looking for matching `''
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 3: syntax error: unexpected end of file
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 1: unexpected EOF while looking for matching `''
/home/cip/adm/sithglan/work/cvsps/sane: command substitution: line 26: syntax error: unexpected end of file
But hey this looks really good: :-))))))
(faui02new) [/var/tmp/sithglan/mutt-cvs] git parent ~/work/mutt/git/mutt-cvs
(faui02new) [/var/tmp/sithglan/mutt-cvs] git parentdiff
(faui02new) [/var/tmp/sithglan/mutt-cvs]
I think I will run my 'import patch by patch' script again and check the
changesets against the cvs2git tree, but it looks fine to me.
Thomas
[1] To make it reproducible for you:
I used the attached patch against cvsps-2.0rc1 which fixes date
conversion problems and of course includes the ancestor thing.
rsync -r rsync://cvs.gnupg.org/mutt-cvs-rep mutt-cvs-rep
[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 12426 bytes --]
diff --git a/cvs_direct.c b/cvs_direct.c
--- a/cvs_direct.c
+++ b/cvs_direct.c
@@ -126,7 +126,7 @@ CvsServerCtx * open_cvs_server(char * p_
send_string(ctx, "Root %s\n", ctx->root);
/* this is taken from 1.11.1p1 trace - but with Mbinary removed. we can't handle it (yet!) */
- send_string(ctx, "Valid-responses ok error Valid-requests Checked-in New-entry Checksum Copy-file Updated Created Update-existing Merged Patched Rcs-diff Mode Mod-time Removed Remove-entry Set-static-directory Clear-static-directory Set-sticky Clear-sticky Template Set-checkin-prog Set-update-prog Notified Module-expansion Wrapper-rcsOption M E F MT\n", ctx->root);
+ send_string(ctx, "Valid-responses ok error Valid-requests Checked-in New-entry Checksum Copy-file Updated Created Update-existing Merged Patched Rcs-diff Mode Mod-time Removed Remove-entry Set-static-directory Clear-static-directory Set-sticky Clear-sticky Template Set-checkin-prog Set-update-prog Notified Module-expansion Wrapper-rcsOption M E F\n", ctx->root);
send_string(ctx, "valid-requests\n");
@@ -894,6 +894,7 @@ char * cvs_rlog_fgets(char * buff, int b
}
else if (strcmp(lbuff, "ok") == 0 ||strcmp(lbuff, "error") == 0)
{
+ debug(DEBUG_TCP, "cvs_direct: rlog: got command completion");
return NULL;
}
diff --git a/cvsps.1 b/cvsps.1
--- a/cvsps.1
+++ b/cvsps.1
@@ -3,7 +3,7 @@
CVSps \- create patchset information from CVS
.SH SYNOPSIS
.B cvsps
-[-h] [-x] [-u] [-z <fuzz>] [-g] [-s <patchset>] [-a <author>] [-f <file>] [-d <date1> [-d <date2>]] [-l <text>] [-b <branch>] [-r <tag> [-r <tag>]] [-p <directory>] [-v] [-t] [--norc] [--summary-first] [--test-log <filename>] [--bkcvs] [--no-rlog] [--diff-opts <option string>] [--cvs-direct] [--debuglvl <bitmask>] [-Z <compression>] [--root <cvsroot>] [-q] [<repository>]
+[\-h] [\-x] [\-u] [\-z <fuzz>] [\-g] [\-s <patchset>] [\-a <author>] [\-f <file>] [\-d <date1> [\-d <date2>]] [\-l <text>] [\-b <branch>] [\-r <tag> [\-r <tag>]] [\-p <directory>] [\-v] [\-t] [\-\-norc] [\-\-summary-first] [\-\-test\-log <filename>] [\-\-bkcvs] [\-\-no\-rlog] [\-\-diff\-opts <option string>] [\-\-cvs\-direct] [\-\-debuglvl <bitmask>] [\-Z <compression>] [\-\-root <cvsroot>] [\-q] [<repository>]
.SH DESCRIPTION
CVSps is a program for generating 'patchset' information from a CVS
repository. A patchset in this case is defined as a set of changes made
@@ -29,7 +29,7 @@ set the timestamp fuzz factor for identi
.B \-g
generate diffs of the selected patch sets
.TP
-.B \-s <patchset>[-[<patchset>]][,<patchset>...]
+.B \-s <patchset>[\-[<patchset>]][,<patchset>...]
generate a diff for a given patchsets and patchset ranges
.TP
.B \-a <author>
@@ -38,7 +38,7 @@ restrict output to patchsets created by
.B \-f <file>
restrict output to patchsets involving file
.TP
-.B \-d <date1> -d <date2>
+.B \-d <date1> \-d <date2>
if just one date specified, show
revisions newer than date1. If two dates specified,
show revisions between two dates.
@@ -50,7 +50,7 @@ restrict output to patchsets matching re
restrict output to patchsets affecting history of branch.
If you want to restrict to the main branch, use a branch of 'HEAD'.
.TP
-.B \-r <tag1> -r <tag2>
+.B \-r <tag1> \-r <tag2>
if just one tag specified, show
revisions since tag1. If two tags specified, show
revisions between the two tags.
@@ -64,47 +64,47 @@ show very verbose parsing messages
.B \-t
show some brief memory usage statistics
.TP
-.B \--norc
+.B \-\-norc
when invoking cvs, ignore the .cvsrc file
.TP
-.B \--summary-first
+.B \-\-summary\-first
when multiple patchset diffs are being generated, put the patchset
summary for all patchsets at the beginning of the output.
.TP
-.B \--test-log <captured cvs log file>
+.B \-\-test\-log <captured cvs log file>
for testing changes, you can capture cvs log output, then test against
this captured file instead of hammering some poor CVS server
.TP
-.B \--bkcvs
+.B \-\-bkcvs
(see note below) for use in parsing the BK->CVS tree log formats only. This enables
some hacks which are not generally applicable.
.TP
-.B \--no-rlog
+.B \-\-no\-rlog
disable the use of rlog internally. Note: rlog is
required for stable PatchSet numbering. Use with care.
.TP
-.B \--diffs-opts <option string>
+.B \-\-diffs\-opts <option string>
send a custom set of options to diff, for example to increase
the number of context lines, or change the diff format.
.TP
-.B \--cvs-direct (--no-cvs-direct)
-enable (disable) built-in cvs client code. This enables the 'pipelining' of multiple
+.B \-\-cvs\-direct (\-\-no-cvs\-direct)
+enable (disable) built\-in cvs client code. This enables the 'pipelining' of multiple
requests over a single client, reducing the overhead of handshaking and
authentication to one per PatchSet instead of one per file.
.TP
-.B \--debuglvl <bitmask>
+.B \-\-debuglvl <bitmask>
enable various debug output channels.
.TP
.B \-Z <compression>
A value 1-9 which specifies amount of compression. A value of 0 disables compression.
.TP
-.B \--root <cvsroot>
+.B \-\-root <cvsroot>
Override the setting of CVSROOT (overrides working dir. and environment)
.TP
.B \-q
Be quiet about warnings.
.TP
-.B \<repository>
+.B <repository>
Operate on the specified repository (overrides working dir.)
.SH "NOTE ON TAG HANDLING"
Tags are fundamentally 'file at a time' in cvs, but like everything else,
@@ -159,17 +159,17 @@ directory in the path, and -p0 will be r
diffs are generated in cvs-direct mode (see below), however, they will always
be -p1 style patches.
.SH "NOTE ON BKCVS"
-The --bkcvs option is a special operating mode that should only be used when parsing
+The \-\-bkcvs option is a special operating mode that should only be used when parsing
the log files from the BK -> CVS exported linux kernel trees. cvsps uses special
semantics for recreating the BK ChangeSet metadata that has been embedded in the log
-files for those trees. The --bkcvs option should only be specified when the cache
-file is being created or updated (i.e. initial run of cvsps, or when -u and -x options
+files for those trees. The \-\-bkcvs option should only be specified when the cache
+file is being created or updated (i.e. initial run of cvsps, or when \-u and \-x options
are used).
.SH "NOTE ON CVS-DIRECT"
As of version 2.0b6 cvsps has a partial implementation of the cvs client code built
in. This reduces the RTT and/or handshaking overhead from one per patchset member
to one per patchset. This dramatically increases the speed of generating diffs
-over a slow link, and improves the consistency of operation. Currently the --cvs-direct
+over a slow link, and improves the consistency of operation. Currently the \-\-cvs-direct
option turns on the use of this code, but it very well may be default by the time
2.0 comes out. The built-in cvs code attempts to be compatible with cvs, but may
have problems, which should be reported. It honors the CVS_RSH and CVS_SERVER
@@ -179,7 +179,9 @@ CVSps parses an rc file at startup. Thi
The file should contain arguments, in the exact syntax as the command line, one per line.
If an argument takes a parameter, the parameter should be on the same line as the argument.
.SH "NOTE ON DATE FORMATS"
-Dates have formats. Fixme.
+Dates must be in the format 'yyyy/mm/dd hh:mm:ss'; for example,
+.IP "" 4
+$ cvsps -d '2004/05/01 00:00:00' -d '2004/07/07 12:00:00'
.SH "SEE ALSO"
.BR cvs ( 1 ),
.BR ci ( 1 ),
diff --git a/cvsps.c b/cvsps.c
--- a/cvsps.c
+++ b/cvsps.c
@@ -1402,6 +1402,16 @@ static void print_patch_set(PatchSet * p
tm->tm_hour, tm->tm_min, tm->tm_sec);
printf("Author: %s\n", ps->author);
printf("Branch: %s\n", ps->branch);
+
+ /* check if ancestor was different branch */
+ if (!list_empty(&ps->members))
+ {
+ PatchSetMember * psm = list_entry(ps->members.next, PatchSetMember, link);
+ const char * abr = psm->pre_rev ? psm->pre_rev->branch : NULL;
+ if (abr && strcmp(ps->branch, abr) != 0)
+ printf("Ancestor branch: %s\n", abr);
+ }
+
printf("Tag: %s %s\n", ps->tag ? ps->tag : "(none)", tag_flag_descr[ps->tag_flags]);
printf("Log:\n%s\n", ps->descr);
printf("Members: \n");
@@ -1646,6 +1656,7 @@ static void do_cvs_diff(PatchSet * ps)
const char * dopts;
const char * utype;
char use_rep_path[PATH_MAX];
+ char esc_use_rep_path[PATH_MAX];
fflush(stdout);
fflush(stderr);
@@ -1666,6 +1677,8 @@ static void do_cvs_diff(PatchSet * ps)
dtype = "rdiff";
utype = "co";
sprintf(use_rep_path, "%s/", repository_path);
+ /* the rep_path may contain characters that the shell will barf on */
+ escape_filename(esc_use_rep_path, PATH_MAX, use_rep_path);
}
else
{
@@ -1673,6 +1686,7 @@ static void do_cvs_diff(PatchSet * ps)
dtype = "diff";
utype = "update";
use_rep_path[0] = 0;
+ esc_use_rep_path[0] = 0;
}
for (next = ps->members.next; next != &ps->members; next = next->next)
@@ -1740,7 +1754,7 @@ static void do_cvs_diff(PatchSet * ps)
else
{
snprintf(cmdbuff, PATH_MAX * 2, "cvs %s %s %s -p -r %s %s%s | diff %s %s /dev/null %s | sed -e '%s s|^\\([+-][+-][+-]\\) -|\\1 %s%s|g'",
- compress_arg, norc, utype, rev, use_rep_path, esc_file, dopts,
+ compress_arg, norc, utype, rev, esc_use_rep_path, esc_file, dopts,
cr?"":"-",cr?"-":"", cr?"2":"1",
use_rep_path, psm->file->filename);
}
@@ -1760,7 +1774,7 @@ static void do_cvs_diff(PatchSet * ps)
snprintf(cmdbuff, PATH_MAX * 2, "cvs %s %s %s %s -r %s -r %s %s%s",
compress_arg, norc, dtype, dopts, psm->pre_rev->rev, psm->post_rev->rev,
- use_rep_path, esc_file);
+ esc_use_rep_path, esc_file);
}
}
@@ -2113,7 +2127,7 @@ static void resolve_global_symbols()
Tag * tag = list_entry(next, Tag, global_link);
CvsFileRevision * rev = tag->rev;
- if (!rev->present)
+ if (!rev->present || !rev->post_psm)
{
struct list_head *tmp = next->prev;
debug(DEBUG_APPERROR, "revision %s of file %s is tagged but not present",
diff --git a/cvsps.h b/cvsps.h
--- a/cvsps.h
+++ b/cvsps.h
@@ -11,6 +11,10 @@
typedef struct _CvsServerCtx CvsServerCtx;
#endif
+#ifndef PATH_MAX
+#define PATH_MAX 4096
+#endif
+
extern struct hash_table * file_hash;
extern const char * tag_flag_descr[];
extern CvsServerCtx * cvs_direct_ctx;
diff --git a/util.c b/util.c
--- a/util.c
+++ b/util.c
@@ -13,6 +13,7 @@
#include <time.h>
#include <errno.h>
#include <signal.h>
+#include <regex.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
@@ -140,24 +141,51 @@ char *get_string(char const *str)
return *res;
}
+static int get_int_substr(const char * str, const regmatch_t * p)
+{
+ char buff[256];
+ memcpy(buff, str + p->rm_so, p->rm_eo - p->rm_so);
+ buff[p->rm_eo - p->rm_so] = 0;
+ return atoi(buff);
+}
+
void convert_date(time_t * t, const char * dte)
{
- /* HACK: this routine parses two formats,
- * 1) 'cvslog' format YYYY/MM/DD HH:MM:SS
- * 2) time_t formatted as %d
- */
-
- if (strchr(dte, '/'))
+ static regex_t date_re;
+ static int init_re;
+
+#define MAX_MATCH 16
+ size_t nmatch = MAX_MATCH;
+ regmatch_t match[MAX_MATCH];
+
+ if (!init_re)
+ {
+ if (regcomp(&date_re, "([0-9]{4})[-/]([0-9]{2})[-/]([0-9]{2}) ([0-9]{2}):([0-9]{2}):([0-9]{2})", REG_EXTENDED))
+ {
+ fprintf(stderr, "FATAL: date regex compilation error\n");
+ exit(1);
+ }
+ init_re = 1;
+ }
+
+ if (regexec(&date_re, dte, nmatch, match, 0) == 0)
{
+ regmatch_t * pm = match;
struct tm tm;
+
+ /* first regmatch_t is match location of entire re */
+ pm++;
- memset(&tm, 0, sizeof(tm));
- sscanf(dte, "%d/%d/%d %d:%d:%d",
- &tm.tm_year, &tm.tm_mon, &tm.tm_mday,
- &tm.tm_hour, &tm.tm_min, &tm.tm_sec);
-
+ tm.tm_year = get_int_substr(dte, pm++);
+ tm.tm_mon = get_int_substr(dte, pm++);
+ tm.tm_mday = get_int_substr(dte, pm++);
+ tm.tm_hour = get_int_substr(dte, pm++);
+ tm.tm_min = get_int_substr(dte, pm++);
+ tm.tm_sec = get_int_substr(dte, pm++);
+
tm.tm_year -= 1900;
tm.tm_mon--;
+ tm.tm_isdst = 0;
*t = mktime(&tm);
}
diff --git a/util.h b/util.h
--- a/util.h
+++ b/util.h
@@ -6,6 +6,10 @@
#ifndef UTIL_H
#define UTIL_H
+#ifndef PATH_MAX
+#define PATH_MAX 4096
+#endif
+
#define CVSPS_PREFIX ".cvsps"
char *xstrdup(char const *);
^ permalink raw reply
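The date format accepted by the convert_date() hunk in the patch above can
be tried out from the shell. This is only a sketch: is_cvs_date is a
hypothetical helper that feeds the same extended regular expression to
grep -E instead of regcomp()/regexec().

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument contains a date matching
# the extended regex used in the convert_date() hunk.
is_cvs_date() {
    printf '%s\n' "$1" |
        grep -Eq '([0-9]{4})[-/]([0-9]{2})[-/]([0-9]{2}) ([0-9]{2}):([0-9]{2}):([0-9]{2})'
}

for d in '2004/05/01 00:00:00' '2004-07-07 12:00:00' 'May 1 2004'; do
    if is_cvs_date "$d"; then echo "ok:  $d"; else echo "bad: $d"; fi
done
```

Like the unanchored regexec() call, grep accepts a match anywhere in the
string, so surrounding text is tolerated, and the two date separators need
not be the same character.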
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 18:52 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: Git Mailing List
In-Reply-To: <20050524182951.GB9537@cip.informatik.uni-erlangen.de>
On Tue, 24 May 2005, Thomas Glanzmann wrote:
>
> Hello,
>
> > - name translation doesn't exist (so all of Peter's changesets get
> > reported as author "hpa <hpa>")
>
> I'll pick that up.
Note that one advantage of the unconverted output is that while it's
unreadable and not very helpful, it _is_ the raw output from CVS. Again,
that means that everybody will convert the same CVS archive into exactly
the same git tree, and that means (among other things) that you can then
immediately merge between the trees.
Perhaps more interestingly, it should also mean that you can _continue_ to
use CVS, then re-convert it at a later date, and I think you should be
able to merge with somebody who has been using git in the meantime.
In contrast, if you have fancy name translation, the converted tree will
obviously depend on your translation rules.
I dunno. Maybe the advantages of having nice names outweigh the
disadvantage of possibly generating incompatible trees.
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-24 19:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241146500.2307@ppc970.osdl.org>
Hello,
> Note that one advantage of the unconverted output is that while it's
> unreadable and not very helpful, it _is_ the raw output from CVS. Again,
> that means that everybody will convert the same CVS archive into exactly
> the same git tree, and that means (among other things) that you can then
> immediately merge between the trees.
I see your point, but it depends on the usage scenario. I, for
example, would like to vendor-track a few CVS repositories, and I use git
only to maintain a few patches (branches and the merging facilities of
git come in handy here), not in a distributed environment where I need
that much reproducibility. So having the option of name translation is
fine for me, and when I need reproducibility I simply won't use it.
Thinking about this scenario, it occurs to me that it might be
helpful to have a helper application like git-merge-base which looks at
the tree ids and not the commit ids to find the merge base, or is that
just bullshit?
Thomas
^ permalink raw reply
* Re: gitweb wishlist
From: Junio C Hamano @ 2005-05-24 19:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241146500.2307@ppc970.osdl.org>
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> Perhaps more interestingly, it should also mean that you can _continue_ to
LT> use CVS, then re-convert it at a later date, and I think you should be
LT> able to merge with somebody who has been using git in the meantime.
LT> In contrast, if you have fancy name translation, the converted tree will
LT> obviously depend on your translation rules.
LT> I dunno. Maybe the advantages of having nice names outweigh the
LT> disadvantage of possibly generating incompatible trees.
LT> Linus
Porcelain layers should be capable of mapping author/committer
names taken out of the commit object, just like they already
convert the human unreadable unixtime value into something human
readable. I'd vote for keeping the original value taken from
CVS for this particular "conversion" application.
What _all_ Porcelain layer implementations would benefit from is
a common output format routine in the spirit of the
show_date() function: a format_commit_fancy()
function that takes a commit object and does the mapping.
Then everybody can use it for their own Porcelain.
The diff-tree header generation can use it when (and only when)
it is operating under a new flag (--map-author-names), to
prettyprint the author names. I'd also suggest having a flag
to reduce the prettyprinting it does in the current output (like
omitting committer information) to make its output usable for
reproducing the commit history exactly. diff-tree with the recent
enhancement you did (I am talking about single commit output and
--stdin, not my diffcore stuff) has become quite a useful tool for
this kind of thing.
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 19:34 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <20050524184612.GA23637@cip.informatik.uni-erlangen.de>
On Tue, 24 May 2005, Thomas Glanzmann wrote:
>
> I have the following issues, all of which seem easy to fix:
>
> - PatchSet 1 depends on PatchSet 2 (but cvsps gets the ordering wrong;
> should be easily fixable) (I just switched the two before
> running cvs2git)
Ok, this seems to be a cvsps bug, and I'll treat it as such. David, any
ideas? It seems to be because of how cvsps sorts things by date, which is
obviously bogus.
The cvs2git thing wouldn't normally even _care_ (ie would happily re-order
the thing), but for the fact that it causes problems with branches that
are used before they are created in this case.
cvsps really should do some kind of topo-sort. Probably doesn't need a lot
(ie it probably doesn't even need to be topological, but the "order"
should be based on trivial dependencies first, and time second. For
example, once David does the per-commit branch handling, I suspect enough
of an ordering to keep git happy falls out of that).
> - Some shell escapes (I haven't looked into them yet)
Ok, I'll check it out. I didn't figure out what characters are
shell-expanded by "<<EOF" handling, and only did '$' because that showed
up in the syslinux archives.
> (faui02new) [/var/tmp/sithglan/mutt-cvs] git parent ~/work/mutt/git/mutt-cvs
> (faui02new) [/var/tmp/sithglan/mutt-cvs] git parentdiff
> (faui02new) [/var/tmp/sithglan/mutt-cvs]
>
> I think I will run my 'import patch by patch' script again and check the
> changesets against the cvs2git tree, but it looks fine to me.
In theory, they should give the exact same results, no? At least if there
are no binary objects. Of course, you'd have to update your import script
to do the times the same way.
Linus
^ permalink raw reply
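The "dependencies first, time second" ordering can be sketched with the
standard tsort utility. Each input pair "A B" means patchset A must come
before patchset B; the pairs below are invented to mirror the mutt case,
where PatchSet 2 has to precede PatchSet 1. This only illustrates the idea
and says nothing about how cvsps orders patchsets internally; the name
order_patchsets is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "before after" pairs on stdin and print one
# ordering that respects every pair.
order_patchsets() {
    tsort
}

# PatchSet 2 creates the branch that PatchSet 1 uses; 3 follows 1.
printf '%s\n' '2 1' '1 3' | order_patchsets
```

Where several orderings satisfy the constraints, tsort picks one of them
arbitrarily, so a real implementation would still need the commit time as
a tie-breaker.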
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 19:43 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: Linus Torvalds, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <20050524184612.GA23637@cip.informatik.uni-erlangen.de>
Thomas Glanzmann wrote:
> Hello,
>
>
>>And if this doesn't work for you, point me to the CVS archive that causes
>>you trouble.
>
>
> you should try the mutt cvs repository[1].
Sounds good. I'll give it a try. I'm testing the branch ancestor
logic, which seems to be working better now. The version I sent to the
list yesterday was pretty bogus for some cases, as well as reporting the
ancestor multiple times for any given branch.
>
> I have the following issues, all of which seem easy to fix:
>
> - PatchSet 1 depends on PatchSet 2 (but cvsps gets the ordering wrong;
> should be easily fixable) (I just switched the two before
> running cvs2git)
>
There is something strange about 'cvs import' I believe which causes
various bizarre things to happen to the first cvsps patchset. I haven't
looked at mutt cvs yet, but this could be the cause. If you see a lot
of version numbers 1.1.1.1 then this is indeed the problem.
> I used the attached patch against cvsps-2.0rc1 which fixes date
> conversion problems and of course includes the ancestor thing.
I'll look at taking these patches upstream. The 'MT' fix is already in
my cvs of cvsps, and the rest looks pretty good.
Do you know where I can get attribution information for these changes?
Are they all from you? (I'm not familiar with debian at all)
David
^ permalink raw reply
* Re: gitweb wishlist
From: Junio C Hamano @ 2005-05-24 19:44 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List
In-Reply-To: <7vu0kstojw.fsf@assigned-by-dhcp.cox.net>
>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:
JCH> What _all_ Porcelain layer implementations would benefit from is
JCH> a common output format routine in the spirit of the
JCH> show_date() function: a format_commit_fancy()
JCH> function that takes a commit object and does the mapping.
Here is a small script called "whodunnit.sh". Its output could
be cleaned up if we had a git-format-commit command that used
format_commit_fancy() to massage author/committer names (and
probably do some other prettyprinting), instead of plain old
"git-cat-file commit".
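#!/bin/sh
# Count commits per author, starting from the given commit (default: HEAD).
git-rev-list "${1:-HEAD}" |
while read commit
do
    # Print the "Name <email>" part of the first author line, then stop.
    git-cat-file commit "$commit" |
    sed -ne '/^author \([^>]*>\).*/{s//\1/p;q;}'
done | sort | uniq -c | sort -n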
The lines it currently spits out look like this:
6 ...
8 Linus Torvalds <torvalds@ppc970.osdl.org.(none)>
...
155 ...
229 Linus Torvalds <torvalds@ppc970.osdl.org>
My suggestion for Thomas is not to volunteer changing
cvsps-to-git to munge names at conversion time, but instead to
volunteer doing the format_commit_fancy() on the core-ish side.
It would read from $GIT_DIR/author-names which would be a plain
text file that is a sequence of:
"bogus name" TAB "good name" LF
where you would put the ".(none)" version on the "bogus" side and the
corrected one on the "good" side.
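As a rough illustration of that mapping (a sketch only, not the proposed
format_commit_fancy() itself), the lookup could be prototyped as a shell
filter. The function name fix_author_names and passing the map file as an
argument are assumptions made for this example; the file format is the
"bogus" TAB "good" one described above.

```shell
# fix_author_names MAPFILE   (hypothetical helper, not part of git)
# Reads "Name <email>" lines on stdin and prints each one, replacing any
# line listed as "bogus" in MAPFILE with its "good" counterpart.
# MAPFILE holds one entry per line: bogus TAB good
fix_author_names() {
    awk -v map="$1" '
        BEGIN {
            # Load the bogus -> good table before reading stdin.
            while ((getline line < map) > 0) {
                n = split(line, f, "\t")
                if (n >= 2)
                    good[f[1]] = f[2]
            }
            close(map)
        }
        # Unknown names pass through unchanged.
        { print (($0 in good) ? good[$0] : $0) }
    '
}
```

In whodunnit.sh this would sit between the "done" and the first "sort",
e.g. "done | fix_author_names "$GIT_DIR/author-names" | sort | uniq -c",
so that the ".(none)" entries are counted together with the corrected ones.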
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 19:47 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <20050524184612.GA23637@cip.informatik.uni-erlangen.de>
On Tue, 24 May 2005, Thomas Glanzmann wrote:
>
> [1] To make it reproducible for you:
>
> I used the attached patch against cvsps-2.0rc1 which fixes date
> conversion problems and of course includes the ancestor thing.
>
> rsync -r rsync://cvs.gnupg.org/mutt-cvs-rep mutt-cvs-rep
Ok, that's a lot bigger and slower than syslinux. It seems to be importing
about 9.5 changesets per second, and there's 3757 patchsets, so it looks
like about 6 minutes.
Oh, done.
And yes, there's a few problems. It seems to be the fault of a frowning
"smiley" - the '\' followed by newline in this:
[unstable] Re-add in-reply-to. This time with a suitable default. #-\
and one back-tick.
Will fix. This will take another six minutes of testing ;)
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 19:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Kay Sievers, Petr Baudis, Thomas Glanzmann,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505240911050.2307@ppc970.osdl.org>
Linus Torvalds wrote:
>
> On Tue, 24 May 2005, Linus Torvalds wrote:
>
>>Fixing the branch handling shows that cvsps does some really strange
>>things with the newly added "Ancestor graph". Here's one example:
>
>
> Ahh, looking at cvsps source, I think I see what's going on.
>
> It's deciding the "previous branch" by looking at what the previous branch
> for the first individual file in the PatchSet was, which fails because in
> this case, PatchSet 372 was changing "syslinux.doc", and Patchset 374 was
> changing "syslinux.c", and thus the previous version of the individual
> _files_ were both in the HEAD branch.
>
> So it does look like I should just ignore the "Ancestor branch"
> information if the new branch already existed.
>
I now consider all files in a commit, and all commits in a branch to
determine the ancestor, and only report it in the first commit on the
branch.
Strangely, you have to look at (potentially) all commits on a branch to
find the 'true' ancestor branch.
The problem is for branch-off-branch branches where the first commit on
the new branch modifies only files never modified on the branch-off-HEAD
branch. This is because cvs only REALLY creates the branch when the
first commit is made (for that file) on the branch. Before that, it is
just a 'potential' branch...
But I now have code which seems to work, though it needs a bit more checking.
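To illustrate the "look at every file in every commit" idea, here is a
simplified sketch. It assumes that the depth of a branching can be judged
from the number of dots in a file's pre-commit revision: CVS trunk
revisions such as 1.5 have one dot, revisions on a branch such as 1.5.2.3
have three, and a branch off a branch has five. The function name
deepest_ancestor and the "revision branch" input format are made up for
this example, and this is not the actual cvsps code.

```shell
# deepest_ancestor   (hypothetical helper, simplified from the idea above)
# stdin: one line per file touched on the new branch, across all of its
#        commits: "<pre-commit revision> <branch that revision is on>"
# stdout: the branch of the revision with the most dots, i.e. the
#         deepest branching seen; on a tie the first one seen wins.
deepest_ancestor() {
    awk '
        {
            # gsub returns the number of replacements, so replacing each
            # dot with a dot counts the dots in the revision number.
            rev = $1
            dots = gsub(/[.]/, ".", rev)
            if (NR == 1 || dots > best) {
                best = dots
                branch = $2
            }
        }
        END { if (NR > 0) print branch }
    '
}
```

Fed "1.5 HEAD", "1.5.2.3 stable" and "1.7 HEAD", this picks "stable": a
single file whose previous revision lives on the intermediate branch is
enough to show that the new branch did not come straight off HEAD.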
> Of course, some semantics will never be translatable when trying to treat
> CVS as a sane system (ie treating CVS as if it was changeset-based is
> always going to cause strange corner cases since it really is file-based),
> but that should most likely give the best approximation of what a
> conversion should do.
>
Yes.
David
^ permalink raw reply
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 20:03 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Kay Sievers, Petr Baudis, Thomas Glanzmann,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505240911050.2307@ppc970.osdl.org>
[-- Attachment #1: Type: text/plain, Size: 1183 bytes --]
Linus Torvalds wrote:
>
> On Tue, 24 May 2005, Linus Torvalds wrote:
>
>>Fixing the branch handling shows that cvsps does some really strange
>>things with the newly added "Ancestor graph". Here's one example:
>
>
> Ahh, looking at cvsps source, I think I see what's going on.
>
> It's deciding the "previous branch" by looking at what the previous branch
> for the first individual file in the PatchSet was, which fails because in
> this case, PatchSet 372 was changing "syslinux.doc", and Patchset 374 was
> changing "syslinux.c", and thus the previous version of the individual
> _files_ were both in the HEAD branch.
>
> So it does look like I should just ignore the "Ancestor branch"
> information if the new branch already existed.
>
I've attached what I just committed. The previous 'show ancestor' patch
needs to be reversed and this applied. It works for me on a half-dozen
repos including syslinux.
You should no longer need to work around multiple reporting of the
ancestor for a given branch, though it couldn't hurt.
I'm going to finish getting some of Thomas's patches in and make an
actual release so people won't have to scour the lists.
David
[-- Attachment #2: show-branch-ancestry-2.patch --]
[-- Type: text/x-patch, Size: 8668 bytes --]
---------------------
PatchSet 176
Date: 2005/05/24 19:57:37
Author: david
Branch: HEAD
Tag: (none)
Log:
show branch ancestry
Members:
cvsps.c:4.99->4.100
cvsps_types.h:4.9->4.10
Index: cvsps/cvsps.c
diff -u cvsps/cvsps.c:4.99 cvsps/cvsps.c:4.100
--- cvsps/cvsps.c:4.99 Wed Jan 26 14:46:41 2005
+++ cvsps/cvsps.c Tue May 24 15:57:37 2005
@@ -26,7 +26,7 @@
#include "cap.h"
#include "cvs_direct.h"
-RCSID("$Id: cvsps.c,v 4.99 2005/01/26 19:46:41 david Exp $");
+RCSID("$Id: cvsps.c,v 4.100 2005/05/24 19:57:37 david Exp $");
#define CVS_LOG_BOUNDARY "----------------------------\n"
#define CVS_FILE_BOUNDARY "=============================================================================\n"
@@ -75,6 +75,7 @@
static int do_write_cache;
static int statistics;
static const char * test_log_file;
+static struct hash_table * branch_heads;
/* settable via options */
static int timestamp_fuzz_factor = 300;
@@ -101,6 +102,7 @@
static int cvs_direct;
static int compress;
static char compress_arg[8];
+static int track_branch_ancestry;
static void check_norc(int, char *[]);
static int parse_args(int, char *[]);
@@ -112,7 +114,7 @@
static void assign_pre_revision(PatchSetMember *, CvsFileRevision * rev);
static void check_print_patch_set(PatchSet *);
static void print_patch_set(PatchSet *);
-static void set_ps_id(const void *, const VISIT, const int);
+static void walk_all_ps(const void *, const VISIT, const int);
static void show_ps_tree_node(const void *, const VISIT, const int);
static int compare_patch_sets_bk(const void *, const void *);
static int compare_patch_sets(const void *, const void *);
@@ -131,6 +133,7 @@
static int check_rev_funk(PatchSet *, CvsFileRevision *);
static CvsFileRevision * rev_follow_branch(CvsFileRevision *, const char *);
static int before_tag(CvsFileRevision * rev, const char * tag);
+static void determine_branch_ancestor(PatchSet * ps, PatchSet * head_ps);
int main(int argc, char *argv[])
{
@@ -164,6 +167,7 @@
file_hash = create_hash_table(1023);
global_symbols = create_hash_table(111);
+ branch_heads = create_hash_table(1023);
/* this parses some of the CVS/ files, and initializes
* the repository_path and other variables
@@ -197,7 +201,7 @@
}
ps_counter = 0;
- twalk(ps_tree_bytime, set_ps_id);
+ twalk(ps_tree_bytime, walk_all_ps);
resolve_global_symbols();
@@ -536,7 +540,7 @@
debug(DEBUG_APPERROR, " [--test-log <captured cvs log file>] [--bkcvs]");
debug(DEBUG_APPERROR, " [--no-rlog] [--diff-opts <option string>] [--cvs-direct]");
debug(DEBUG_APPERROR, " [--debuglvl <bitmask>] [-Z <compression>] [--root <cvsroot>]");
- debug(DEBUG_APPERROR, " [<repository>] [-q]");
+ debug(DEBUG_APPERROR, " [-q] [-A] [<repository>]");
debug(DEBUG_APPERROR, "");
debug(DEBUG_APPERROR, "Where:");
debug(DEBUG_APPERROR, " -h display this informative message");
@@ -569,6 +573,7 @@
debug(DEBUG_APPERROR, " -Z <compression> A value 1-9 which specifies amount of compression");
debug(DEBUG_APPERROR, " --root <cvsroot> specify cvsroot. overrides env. and working directory");
debug(DEBUG_APPERROR, " -q be quiet about warnings");
+ debug(DEBUG_APPERROR, " -A track and report branch ancestry");
debug(DEBUG_APPERROR, " <repository> apply cvsps to repository. overrides working directory");
debug(DEBUG_APPERROR, "\ncvsps version %s\n", VERSION);
@@ -867,6 +872,13 @@
continue;
}
+ if (strcmp(argv[i], "-A") == 0)
+ {
+ track_branch_ancestry = 1;
+ i++;
+ continue;
+ }
+
if (argv[i][0] == '-')
return usage("invalid argument", argv[i]);
@@ -1398,6 +1410,8 @@
tm->tm_hour, tm->tm_min, tm->tm_sec);
printf("Author: %s\n", ps->author);
printf("Branch: %s\n", ps->branch);
+ if (ps->ancestor_branch)
+ printf("Ancestor branch: %s\n", ps->ancestor_branch);
printf("Tag: %s %s\n", ps->tag ? ps->tag : "(none)", tag_flag_descr[ps->tag_flags]);
printf("Log:\n%s\n", ps->descr);
printf("Members: \n");
@@ -1425,7 +1439,10 @@
printf("\n");
}
-static void set_ps_id(const void * nodep, const VISIT which, const int depth)
+/* walk all the patchsets to assign monotonic psid,
+ * and to establish branch ancestry
+ */
+static void walk_all_ps(const void * nodep, const VISIT which, const int depth)
{
PatchSet * ps;
@@ -1442,6 +1459,18 @@
{
ps_counter++;
ps->psid = ps_counter;
+
+ if (track_branch_ancestry && strcmp(ps->branch, "HEAD") != 0)
+ {
+ PatchSet * head_ps = (PatchSet*)get_hash_object(branch_heads, ps->branch);
+ if (!head_ps)
+ {
+ head_ps = ps;
+ put_hash_object(branch_heads, ps->branch, head_ps);
+ }
+
+ determine_branch_ancestor(ps, head_ps);
+ }
}
else
{
@@ -1912,6 +1941,7 @@
ps->tag_flags = 0;
ps->branch_add = 0;
ps->funk_factor = 0;
+ ps->ancestor_branch = NULL;
}
return ps;
@@ -2235,21 +2265,25 @@
return 0;
}
-/*
- * When importing vendor sources, (apparently people do this)
- * the code is added on a 'vendor' branch, which, for some reason
- * doesn't use the magic-branch-tag format. Try to detect that now
- */
-static int is_vendor_branch(const char * rev)
+static int count_dots(const char * p)
{
int dots = 0;
- const char *p = rev;
while (*p)
if (*p++ == '.')
dots++;
- return !(dots&1);
+ return dots;
+}
+
+/*
+ * When importing vendor sources, (apparently people do this)
+ * the code is added on a 'vendor' branch, which, for some reason
+ * doesn't use the magic-branch-tag format. Try to detect that now
+ */
+static int is_vendor_branch(const char * rev)
+{
+ return !(count_dots(rev)&1);
}
void patch_set_add_member(PatchSet * ps, PatchSetMember * psm)
@@ -2395,5 +2429,69 @@
break;
}
i++;
+ }
+}
+
+static void determine_branch_ancestor(PatchSet * ps, PatchSet * head_ps)
+{
+ struct list_head * next;
+ CvsFileRevision * rev;
+
+ /* PatchSet 1 has no ancestor */
+ if (ps->psid == 1)
+ return;
+
+ /* HEAD branch patchsets have no ancestry, but callers should know that */
+ if (strcmp(ps->branch, "HEAD") == 0)
+ {
+ debug(DEBUG_APPMSG1, "WARNING: no branch ancestry for HEAD");
+ return;
+ }
+
+ for (next = ps->members.next; next != &ps->members; next = next->next)
+ {
+ PatchSetMember * psm = list_entry(next, PatchSetMember, link);
+ rev = psm->pre_rev;
+ int d1, d2;
+
+ /* the reason this is at all complicated has to do with a
+ * branch off of a branch. it is possible (and indeed
+ * likely) that some file would not have been modified
+ * from the initial branch point to the branch-off-branch
+ * point, and therefore the branch-off-branch point is
+ * really branch-off-HEAD for that specific member (file).
+ * in that case, rev->branch will say HEAD but we want
+ * to know the symbolic name of the first branch
+ * so we continue to look member after member until we find
+ * the 'deepest' branching. deepest can actually be determined
+ * by considering the revision currently indicated by
+ * ps->ancestor_branch (by symbolic lookup) and rev->rev. the
+ * one with more dots wins
+ *
+ * also, the first commit in which a branch-off-branch is
+ * mentioned may ONLY modify files never committed since
+ * original branch-off-HEAD was created, so we have to keep
+ * checking, ps after ps to be sure to get the deepest ancestor
+ *
+ * note: rev is the pre-commit revision, not the post-commit
+ */
+ if (!head_ps->ancestor_branch)
+ d1 = 0;
+ else if (strcmp(ps->branch, rev->branch) == 0)
+ continue;
+ else if (strcmp(head_ps->ancestor_branch, "HEAD") == 0)
+ d1 = 1;
+ else {
+ /* branch_rev may not exist if the file was added on this branch for example */
+ const char * branch_rev = (char *)get_hash_object(rev->file->branches_sym, head_ps->ancestor_branch);
+ d1 = branch_rev ? count_dots(branch_rev) : 1;
+ }
+
+ d2 = count_dots(rev->rev);
+
+ if (d2 > d1)
+ head_ps->ancestor_branch = rev->branch;
+
+ //printf("-----> %d ancestry %s %s %s\n", ps->psid, ps->branch, head_ps->ancestor_branch, rev->file->filename);
}
}
Index: cvsps/cvsps_types.h
diff -u cvsps/cvsps_types.h:4.9 cvsps/cvsps_types.h:4.10
--- cvsps/cvsps_types.h:4.9 Mon Mar 31 18:06:18 2003
+++ cvsps/cvsps_types.h Tue May 24 15:57:37 2005
@@ -110,6 +110,7 @@
char *tag;
int tag_flags;
char *branch;
+ char *ancestor_branch;
struct list_head members;
/*
* A 'branch add' patch set is a bogus patch set created automatically
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 20:09 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241236020.2307@ppc970.osdl.org>
On Tue, 24 May 2005, Linus Torvalds wrote:
>
> Will fix. This will take another six minutes of testing ;)
Almost eight minutes. Still, the final average was 8 changesets per
second, which sounds pretty damn good to me, actually.
Anyway, I've checked in the fix for the quoting, and I now get the right
number of revisions, ie
git-rev-tree $(ls .git/refs/heads/) | wc -l
returns the same "3757" that cvsps reports.
However, "git-fsck-cache --unreachable" reports 102 unreachable blobs,
which worries me. It's really blobs only, which is strange: it implies
that we did the "git-update-cache" but not a "git-write-tree" (or that the
git-write-tree failed for some reason, but that sounds even stranger,
since we did successfully do all the commits)
The only way I can see the unreachable blobs happening is if one of the
ChangeSet entries in cvsps mentions the _same_ pathname twice for a single
ChangeSet. David, is that possible?
Exactly because it's only blobs, it really does smell like a cvsps issue.
My scripts always use "git-update-cache --add -- filename", so it never
creates any blobs _except_ when it adds them to the index (and thus
write-tree should always pick them up, unless we update the index again
before the next write-tree happens).
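One way to check that theory from the cvsps side is to scan its output for
PatchSets that list the same path more than once under "Members:". The
following is a minimal sketch. The function name find_duplicate_members is
made up, and it assumes the output layout seen in this thread: a
"PatchSet N" line, a "Members:" line, then one "path:old->new" line per
file, ended by a blank line or a line of dashes.

```shell
# find_duplicate_members   (hypothetical helper)
# stdin: cvsps output; stdout: "PatchSet N: path" for every path that
# appears more than once in the member list of PatchSet N.
find_duplicate_members() {
    awk '
        /^PatchSet [0-9]+/   { ps = $2; in_members = 0; next }
        /^Members:/          { in_members = 1; next }
        /^-+$/ || /^[ \t]*$/ { in_members = 0; next }
        in_members && /:[^:]*->[^:]*$/ {
            path = $0
            sub(/^[ \t]+/, "", path)         # drop leading indentation
            sub(/:[^:]*->[^:]*$/, "", path)  # drop the ":old->new" suffix
            # Report each duplicated path once, on its second appearance.
            if (seen[ps, path]++ == 1)
                print "PatchSet " ps ": " path
        }
    '
}
```

Running something like "cvsps | find_duplicate_members" on the converted
repository would then show whether the unreachable blobs line up with
PatchSets that touch one file several times.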
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 20:10 UTC (permalink / raw)
To: David Mansfield
Cc: Linus Torvalds, H. Peter Anvin, Kay Sievers, Petr Baudis,
Thomas Glanzmann, Git Mailing List
In-Reply-To: <42938893.9010608@cobite.com>
David Mansfield wrote:
> Linus Torvalds wrote:
>
>>On Tue, 24 May 2005, Linus Torvalds wrote:
>>
>>
>>>Fixing the branch handling shows that cvsps does some really strange
>>>things with the newly added "Ancestor graph". Here's one example:
>>
>>
>>Ahh, looking at cvsps source, I think I see what's going on.
>>
>>It's deciding the "previous branch" by looking at what the previous branch
>>for the first individual file in the PatchSet was, which fails because in
>>this case, PatchSet 372 was changing "syslinux.doc", and Patchset 374 was
>>changing "syslinux.c", and thus the previous version of the individual
>>_files_ were both in the HEAD branch.
>>
>>So it does look like I should just ignore the "Ancestor branch"
>>information if the new branch already existed.
>>
>
>
> I've attached what I just committed. The previous 'show ancestor' patch
> needs to be reversed and this applied. It works for me on a half-dozen
> repos including syslinux.
>
Oops. I forgot to mention that I made the tracking of branch ancestry an
option, because it potentially increases the CPU time by a fair margin
(though here it seemed trivial). You need to pass '-A' as an additional
argument.
David
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-24 20:16 UTC (permalink / raw)
To: David Mansfield
Cc: Linus Torvalds, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <429383D6.6010908@cobite.com>
Hello,
> I believe there is something strange about 'cvs import' which causes
> various bizarre things to happen to the first cvsps patchset. I haven't
> looked at mutt cvs yet, but this could be the cause. If you see a lot
> of version numbers 1.1.1.1 then this is indeed the problem.
Yes, that is happening. But it should be fairly easy to fix, because the
second patchset says INITIAL->1.1 a lot and the first says 1.1->1.1.1.1
a lot.
> I'll look at taking these patches upstream. The 'MT' fix is already in
> my cvs of cvsps, and the rest looks pretty good.
Good. :-)
> Do you know where I can get attribution information for these changes?
> Are they all from you? (I'm not familiar with debian at all)
None of them are from me; they're all from Debian. Here are a few URLs
showing how to get the attribution:
http://packages.qa.debian.org/c/cvsps.html
-> You can get the source files from there: DSC is the
metainformation, ORIG is your upstream version, and DIFF contains the
Debian changes against your upstream version as a patch. There is
also a Changelog in the diff for the package. You should find
everything you need in there.
http://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=cvsps
-> The bug tracking system of debian could also be of help.
Thomas
^ permalink raw reply
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-24 20:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Glanzmann, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241259250.2307@ppc970.osdl.org>
Linus Torvalds wrote:
>
> On Tue, 24 May 2005, Linus Torvalds wrote:
>
>>Will fix. This will take another six minutes of testing ;)
>
>
> Almost eight minutes. Still, the final average was 8 changesets per
> second, which sounds pretty damn good to me, actually.
>
> Anyway, I've checked in the fix for the quoting, and I now get the right
> number of revisions, ie
>
> git-rev-tree $(ls .git/refs/heads/) | wc -l
>
> returns the same "3757" that cvsps reports.
>
> However, "git-fsck-cache --unreachable" reports 102 unreachable blobs,
> which worries me. It's really blobs only, which is strange: it implies
> that we did the "git-update-cache" but not a "git-write-tree" (or that the
> git-write-tree failed for some reason, but that sounds even stranger,
> since we did successfully do all the commits)
>
> The only way I can see the unreachable blobs happening is if one of the
> ChangeSet entries in cvsps mentions the _same_ pathname twice for a single
> ChangeSet. David, is that possible?
>
Sounds possible. Unfortunately, the 'uniqueness' of a commit actually
doesn't exist. It's all smoke-and-mirrors. In order to disallow this
(which I think I need to do) I'd need to use some commit member
information, and add some heuristic: if this file is already in the
commit, then this MUST be a different commit. Unfortunately, it's
possible that the 'member' already in the commit is the wrong one and
this is the right one, which just sounds horribly ugly to me.
I'll think on it.
David
^ permalink raw reply
* Re: gitweb wishlist
From: Martin Langhoff @ 2005-05-24 20:19 UTC (permalink / raw)
To: David Mansfield; +Cc: Git Mailing List
In-Reply-To: <4292A08A.5050108@cobite.com>
On 5/24/05, David Mansfield <david@cobite.com> wrote:
> > means..
> >
>
> Ok. I'll tell you. It means that the committer uses bad practices in
> tagging ;-) It generally means that force tag (cvs tag -F <file>) was
> used on a specific file. Here's the scenario:
Projects that branch on release (and maintain a long-lived stable
branch following the release) often use a floating MERGED branch to
keep track of what bugfixes have been merged back into HEAD. This
practice, broken as it is, is the recommended approach AFAIK.
It would be a good thing to be able to tell cvsps to ignore certain
tags (by name or by regex).
martin
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-24 20:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241259250.2307@ppc970.osdl.org>
Hello,
> Almost eight minutes. Still, the final average was 8 changesets per
> second, which sounds pretty damn good to me, actually.
yes, it is. ;-)
> Anyway, I've checked in the fix for the quoting, and I now get the right
> number of revisions, ie
> git-rev-tree $(ls .git/refs/heads/) | wc -l
> returns the same "3757" that cvsps reports.
Nice! :-)
btw:
For the mutt tree there are a few 'empty commits', e.g. where the
parent tree is the same as the current one. This is because git ignores
.cvsignore and they committed some .cvsignore files without any other
deltas. I don't know if you want to handle this. Just a note.
> However, "git-fsck-cache --unreachable" reports 102 unreachable blobs,
> which worries me. It's really blobs only, which is strange: it implies
> that we did the "git-update-cache" but not a "git-write-tree" (or that the
> git-write-tree failed for some reason, but that sounds even stranger,
> since we did successfully do all the commits)
> The only way I can see the unreachable blobs happening is if one of the
> ChangeSet entries in cvsps mentions the _same_ pathname twice for a single
> ChangeSet. David, is that possible?
Yes, it is; I had that problem before. For example, tlr commits the
changelog separately with a '# changelog' or similar log message, and
because the time fuzz defaults to a way too high value, cvsps thinks
that three changelog commits are all one delta and adds three entries.
And what annoys me most: it lists them in the wrong order, so if you
applied them as patches they would not apply because of the ordering.
Reference:
PatchSet 3005
Date: 2002/12/07 19:19:42
Author: roessler
Branch: HEAD
Tag: (none)
Log:
# changelog commit
Members:
ChangeLog:3.7->3.8
ChangeLog:3.6->3.7
ChangeLog:3.5->3.6
Just call cvsps with -z 20 for the mutt repository. Even -z 1 should
work, because the timestamps of one 'commit' are all set to the same
value.
Thomas
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 20:33 UTC (permalink / raw)
To: Thomas Glanzmann
Cc: David Mansfield, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505241259250.2307@ppc970.osdl.org>
On Tue, 24 May 2005, Linus Torvalds wrote:
>
> Exactly because it's only blobs, it really does smell like a cvsps issue.
> My scripts always use "git-update-cache --add -- filename", so it never
> creates any blobs _except_ when it adds them to the index (and thus
> write-tree should always pick them up, unless we update the index again
> before the next write-tree happens).
Looking at the contents of these files, all but one of them are changelog
files, which would be consistent with this theory - if cvsps ends up
"smushing together" two separate commits (and mutt seems to have the bad
habit of having just a simple "# changelog commit" as the commit message,
so it would likely trigger the "same commit message" logic), you'd get
exactly this.
The one non-changelog file looks like some kind of message translation
thing:
# This file was prepared by (in alphabetical order):
#
# Alexey Vyskubov (alexey@pepper.spb.ru)
# Andrew W. Nosenko (awn@bcs.zp.ua)
# Michael Sobolev (mss@transas.com)
# Vsevolod Volkov (vvv@mutt.org.ua)
#
# To contact translators, please use mutt-ru mailing list:
# http://woe.spb.ru/mailman/listinfo/mutt-ru
#
msgid ""
msgstr ""
"Project-Id-Version: mutt-1.4i\n"
"POT-Creation-Date: 2002-05-02 01:08+0200\n"
"PO-Revision-Date: 2002-05-03 22:53+0300\n"
...
#: alias.c:280
#, c-format
msgid "[%s = %s] Accept?"
msgstr "[%s = %s] Принять?"
...
and it looks like it is "po/ru.po". Indeed, that's a big clue:
---------------------
PatchSet 2869
Date: 2002/05/13 21:17:48
Author: roessler
Branch: mutt-1-4-stable
Tag: (none)
Log:
From: Vsevolod Volkov <vvv@mutt.org.ua>
update
Members:
po/ru.po:1.129.2.5->1.129.2.6
po/ru.po:1.129.2.4->1.129.2.5
---------------------
and I thus rest my case. cvs2git is doing the right thing, and this is
something that needs to be fixed in cvsps in case anybody cares.
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-24 20:44 UTC (permalink / raw)
To: David Mansfield
Cc: Thomas Glanzmann, H. Peter Anvin, Kay Sievers, Petr Baudis,
Git Mailing List
In-Reply-To: <42938C5B.4000906@cobite.com>
On Tue, 24 May 2005, David Mansfield wrote:
>
> Sounds possible. Unfortunately, the 'uniqueness' of a commit actually
> doesn't exist. It's all smoke-and-mirrors. In order to disallow this
> (which I think I need to do) I'd need to use some commit member
> information, and add some heuristic: if this file is already in the
> commit, then this MUST be a different commit. Unfortunately, it's
> possible that the 'member' already in the commit is the wrong one and
> this is the right one, which just sounds horribly ugly to me.
>
> I'll think on it.
I think it's a fundamentally hard problem to fix, but it may be that the
fix is to give hints about command line options and in particular the time
fuzz thing to try.
So maybe just _detection_ logic in cvsps, along with a warning like
"time fuzz is 600 seconds, and the time difference between the two
commits of this file was 431 seconds. You may want to try a lower
-z argument"
or something.
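A minimal sketch of such a hint, done outside cvsps as a shell function.
It assumes the caller has already extracted, for one PatchSet, a
"path commit-time" pair per member, with the time in seconds since the
epoch and no whitespace in the path. The name fuzz_hint, the input format
and the exact wording of the message are assumptions for illustration only.

```shell
# fuzz_hint FUZZ   (hypothetical helper)
# stdin: "path epoch-seconds" lines for the members of ONE PatchSet.
# stdout: a hint for every file that was committed more than once
#         within that PatchSet, showing the gap between the commits.
fuzz_hint() {
    # Group by path, then order each path's commits by time.
    LC_ALL=C sort -k1,1 -k2,2n |
    awk -v fuzz="$1" '
        $1 == prev_path {
            printf "time fuzz is %d seconds, but %s was committed twice %d seconds apart; you may want to try a lower -z\n",
                   fuzz, $1, $2 - prev_time
        }
        { prev_path = $1; prev_time = $2 }
    '
}
```

With a fuzz of 600 and two ChangeLog commits 431 seconds apart, this
prints one line naming ChangeLog and the 431 second gap, which is the
number a user would need in order to pick a smaller -z value.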
It might also be possible to try to sort all the names by date of commit
first, and see if they "bunch up" into groups of low fuzz with much bigger
fuzz in between groups..
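The "bunching" idea could be prototyped as a small shell function. Given
commit times in seconds since the epoch, one per line, the sketch below
sorts them and starts a new group whenever the gap to the previous commit
exceeds a threshold. The name bunch_commits and the "first last count"
output format are invented for this example; picking the threshold
automatically, which is the hard part, is left open.

```shell
# bunch_commits GAP   (hypothetical helper)
# stdin: commit times in epoch seconds, one per line, in any order.
# stdout: one line per group: "<first time> <last time> <commit count>".
# A new group starts when two consecutive commits are more than GAP
# seconds apart.
bunch_commits() {
    sort -n |
    awk -v gap="$1" '
        NR == 1         { start = prev = $1; n = 1; next }
        $1 - prev > gap { print start, prev, n; start = $1; n = 0 }
                        { prev = $1; n++ }
        END             { if (NR > 0) print start, prev, n }
    '
}
```

If the groups come out tight (a few seconds wide) with large gaps between
them, a fuzz a little above the widest group would presumably keep real
commits together without merging separate ones.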
Linus
^ permalink raw reply