git.vger.kernel.org archive mirror
* parsecvs tool now creates git repositories
@ 2006-04-02  5:36 Keith Packard
  2006-04-02  9:39 ` Jan-Benedict Glaw
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-02  5:36 UTC (permalink / raw)
  To: Git Mailing List; +Cc: keithp

[-- Attachment #1: Type: text/plain, Size: 956 bytes --]

I've hacked in cheesy system(3) calls to invoke various git tools to
create a git repository from a parsed cvs repository. It's about the
same speed as git-cvsimport now.

The UI is a total disaster, sufficient for testing. You must create an
Authors file in the current directory which looks like the git-cvsimport
authors file. You must also have an edit-change-log program in your path
which edits the commit message in place. /bin/true will work if you
don't need to edit the messages.
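
A minimal sketch of reading one line of such an Authors file, assuming
the "login=Full Name <email>" layout used by git-cvsimport's author
conversion file. The struct and the name parse_author_line are
hypothetical and are not taken from parsecvs.

```c
#include <string.h>

/* Hypothetical record for one Authors entry; not taken from parsecvs. */
struct author {
    char login[64];
    char name[128];
    char email[128];
};

/*
 * Parse "login=Full Name <email>" into *a.
 * Returns 0 on success, -1 if the line does not match that layout
 * or a field does not fit.
 */
static int parse_author_line(const char *line, struct author *a)
{
    const char *eq = strchr(line, '=');
    const char *lt = eq ? strchr(eq, '<') : NULL;
    const char *gt = lt ? strchr(lt, '>') : NULL;
    size_t n;

    if (!eq || !lt || !gt || eq == line)
        return -1;

    /* CVS login: everything before '='. */
    n = (size_t)(eq - line);
    if (n >= sizeof a->login)
        return -1;
    memcpy(a->login, line, n);
    a->login[n] = '\0';

    /* Full name: from after '=' up to '<', minus trailing blanks. */
    n = (size_t)(lt - (eq + 1));
    while (n > 0 && eq[n] == ' ')
        n--;
    if (n >= sizeof a->name)
        return -1;
    memcpy(a->name, eq + 1, n);
    a->name[n] = '\0';

    /* Email: the text between '<' and '>'. */
    n = (size_t)(gt - (lt + 1));
    if (n >= sizeof a->email)
        return -1;
    memcpy(a->email, lt + 1, n);
    a->email[n] = '\0';
    return 0;
}
```

Given the line "keithp=Keith Packard <keithp@keithp.com>", this fills in
the login, name and email fields separately, which is the shape a commit
author string needs.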

I should clearly steal the existing git-cvsimport command line arguments
and use those.

This tool successfully, and usefully, imports the X.org xserver CVS
repository, along with correctly importing several other repositories
I've tried. It doesn't quite manage to compute correct branch points for
the postgresql CVS repository, so there is clearly work remaining to be
done.

CVS - your code's worst nightmare.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-02  5:36 Keith Packard
@ 2006-04-02  9:39 ` Jan-Benedict Glaw
  2006-04-02 19:31   ` Jan-Benedict Glaw
  2006-04-03 14:03 ` Erik Mouw
  2006-04-03 22:38 ` Martin Langhoff
  2 siblings, 1 reply; 35+ messages in thread
From: Jan-Benedict Glaw @ 2006-04-02  9:39 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 876 bytes --]

On Sat, 2006-04-01 21:36:28 -0800, Keith Packard <keithp@keithp.com> wrote:
> The UI is a total disaster, sufficient for testing. You must create an
> Authors file in the current directory which looks like the git-cvsimport
> authors file. You must also have an edit-change-log program in your path
> which edits the commit message in place. /bin/true will work if you
> don't need to edit the messages.

Well, at least this sounds quite promising. I'll give it a run once
I've arrived back home on the Binutils repository.

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-02  9:39 ` Jan-Benedict Glaw
@ 2006-04-02 19:31   ` Jan-Benedict Glaw
  2006-04-03  4:10     ` Keith Packard
  0 siblings, 1 reply; 35+ messages in thread
From: Jan-Benedict Glaw @ 2006-04-02 19:31 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 2751 bytes --]

On Sun, 2006-04-02 11:39:06 +0200, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> On Sat, 2006-04-01 21:36:28 -0800, Keith Packard <keithp@keithp.com> wrote:
> > The UI is a total disaster, sufficient for testing. You must create an
> > Authors file in the current directory which looks like the git-cvsimport
> > authors file. You must also have an edit-change-log program in your path
> > which edits the commit message in place. /bin/true will work if you
> > don't need to edit the messages.
> 
> Well, at least this sounds quite promising. I'll give it a run once
> I've arrived back home on the Binutils repository.

Doesn't build for me:

jbglaw@bixie:~/vax/gittish/parsecvs$ make clean
rm -f gram.o lex.o parsecvs.o cvsutil.o revlist.o atom.o revcvs.o git.o y.tab.h gram.c parsecvs
jbglaw@bixie:~/vax/gittish/parsecvs$ make
yacc -d gram.y 
mv -f y.tab.c gram.c
cc -O0 -g -Wall -Wpointer-arith -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -fno-strict-aliasing   -c -o gram.o gram.c
cc -O0 -g -Wall -Wpointer-arith -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -fno-strict-aliasing   -c -o lex.o lex.c
lex.l: In function ‘yylex’:
lex.l:69: warning: implicit declaration of function ‘yyget_lineno’
lex.l:69: warning: nested extern declaration of ‘yyget_lineno’
<stdout>: At top level:
<stdout>:1747: warning: no previous prototype for ‘yyget_lineno’
<stdout>:1756: warning: no previous prototype for ‘yyget_in’
<stdout>:1764: warning: no previous prototype for ‘yyget_out’
<stdout>:1772: warning: no previous prototype for ‘yyget_leng’
<stdout>:1781: warning: no previous prototype for ‘yyget_text’
<stdout>:1790: warning: no previous prototype for ‘yyset_lineno’
<stdout>:1802: warning: no previous prototype for ‘yyset_in’
<stdout>:1807: warning: no previous prototype for ‘yyset_out’
<stdout>:1812: warning: no previous prototype for ‘yyget_debug’
<stdout>:1817: warning: no previous prototype for ‘yyset_debug’
<stdout>:1823: warning: no previous prototype for ‘yylex_destroy’
lex.l: In function ‘parse_data’:
lex.l:90: error: ‘yytext_ptr’ undeclared (first use in this function)
lex.l:90: error: (Each undeclared identifier is reported only once
lex.l:90: error: for each function it appears in.)
make: *** [lex.o] Error 1

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-02 19:31   ` Jan-Benedict Glaw
@ 2006-04-03  4:10     ` Keith Packard
  2006-04-03  4:38       ` Linus Torvalds
  2006-04-03  7:25       ` Jan-Benedict Glaw
  0 siblings, 2 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-03  4:10 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 499 bytes --]

On Sun, 2006-04-02 at 21:31 +0200, Jan-Benedict Glaw wrote:

> lex.l: In function ‘parse_data’:
> lex.l:90: error: ‘yytext_ptr’ undeclared (first use in this function)
> lex.l:90: error: (Each undeclared identifier is reported only once
> lex.l:90: error: for each function it appears in.)
> make: *** [lex.o] Error 1

I think this is a bug in your version of flex; I'm using standard lex
conventions here. I don't know how to make it work for you.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03  4:10     ` Keith Packard
@ 2006-04-03  4:38       ` Linus Torvalds
  2006-04-03  7:25       ` Jan-Benedict Glaw
  1 sibling, 0 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-04-03  4:38 UTC (permalink / raw)
  To: Keith Packard; +Cc: Jan-Benedict Glaw, Git Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2111 bytes --]



On Sun, 2 Apr 2006, Keith Packard wrote:

> On Sun, 2006-04-02 at 21:31 +0200, Jan-Benedict Glaw wrote:
> 
> > lex.l: In function ‘parse_data’:
> > lex.l:90: error: ‘yytext_ptr’ undeclared (first use in this function)
> > lex.l:90: error: (Each undeclared identifier is reported only once
> > lex.l:90: error: for each function it appears in.)
> > make: *** [lex.o] Error 1
> 
> I think this is a bug in your version of flex; I'm using standard lex
> conventions here. I don't know how to make it work for you.

I need something like this to make it work with flex/lex..

The "-l" tells flex to be more traditional.

The "clean" rule is obvious.

And the "yylineno" is a lot more traditional than yyget_lineno(), which 
doesn't work for me at all. I think that's some issue with flex' support 
for re-entrant parsers.

Whether it works after this, I dunno. But at least it compiles.

		Linus

---
diff --git a/Makefile b/Makefile
index 639353a..c7e04a5 100644
--- a/Makefile
+++ b/Makefile
@@ -4,6 +4,7 @@ GCC_WARNINGS3=-Wnested-externs -fno-stri
 GCC_WARNINGS=$(GCC_WARNINGS1) $(GCC_WARNINGS2) $(GCC_WARNINGS3)
 CFLAGS=-O0 -g $(GCC_WARNINGS)
 YFLAGS=-d
+LFLAGS=-l
 
 SRCS=gram.y lex.l cvs.h parsecvs.c cvsutil.c revlist.c atom.c revcvs.c git.c
 
@@ -20,4 +21,4 @@ lex.o: lex.c
 y.tab.h: gram.c
 
 clean:
-	rm -f $(OBJS) y.tab.h gram.c parsecvs
+	rm -f $(OBJS) y.tab.h gram.c parsecvs lex.c
diff --git a/lex.l b/lex.l
index 39cafb0..c7833a4 100644
--- a/lex.l
+++ b/lex.l
@@ -65,8 +65,7 @@ parse_data (int save);
 \t				;
 \n				;
 .				{ 
-				    fprintf (stderr, "%s: (%d) ignoring %c\n", 
-					     yyfilename, yyget_lineno (),
+				    fprintf (stderr, "%s: (%d) ignoring %c\n", yyfilename, yylineno,
 					     yytext[0]);
 				}
 %%
@@ -146,8 +145,7 @@ lex_date (cvs_number *n)
 	d = mktime (&tm);
 	if (d == 0) {
 	    int i;
-	    fprintf (stderr, "%s: (%d) unparsable date: ", yyfilename,
-		     yyget_lineno ());
+	    fprintf (stderr, "%s: (%d) unparsable date: ", yyfilename, yylineno);
 	    for (i = 0; i < n->c; i++) {
 		if (i) fprintf (stderr, ".");
 		fprintf (stderr, "%d", n->n[i]);

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03  4:10     ` Keith Packard
  2006-04-03  4:38       ` Linus Torvalds
@ 2006-04-03  7:25       ` Jan-Benedict Glaw
  2006-04-03 13:58         ` Erik Mouw
  2006-04-03 16:54         ` Keith Packard
  1 sibling, 2 replies; 35+ messages in thread
From: Jan-Benedict Glaw @ 2006-04-03  7:25 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1475 bytes --]

On Sun, 2006-04-02 21:10:56 -0700, Keith Packard <keithp@keithp.com> wrote:
> On Sun, 2006-04-02 at 21:31 +0200, Jan-Benedict Glaw wrote:
> > lex.l: In function ‘parse_data’:
> > lex.l:90: error: ‘yytext_ptr’ undeclared (first use in this function)
> > lex.l:90: error: (Each undeclared identifier is reported only once
> > lex.l:90: error: for each function it appears in.)
> > make: *** [lex.o] Error 1
> 
> I think this is a bug in your version of flex; I'm using standard lex
> conventions here. I don't know how to make it work for you.

It compiles for me with this patch (thanks to Linus for the hint):

diff --git a/Makefile b/Makefile
index 639353a..b8f5014 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,8 @@ GCC_WARNINGS2=-Wmissing-prototypes -Wmis
 GCC_WARNINGS3=-Wnested-externs -fno-strict-aliasing
 GCC_WARNINGS=$(GCC_WARNINGS1) $(GCC_WARNINGS2) $(GCC_WARNINGS3)
 CFLAGS=-O0 -g $(GCC_WARNINGS)
-YFLAGS=-d
+YFLAGS=-d -l
+LFLAGS=-l
 
 SRCS=gram.y lex.l cvs.h parsecvs.c cvsutil.c revlist.c atom.c revcvs.c git.c
 

Would you please verify that it doesn't break things for you?

Thanks, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03  7:25       ` Jan-Benedict Glaw
@ 2006-04-03 13:58         ` Erik Mouw
  2006-04-03 16:54         ` Keith Packard
  1 sibling, 0 replies; 35+ messages in thread
From: Erik Mouw @ 2006-04-03 13:58 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: Keith Packard, Git Mailing List

On Mon, Apr 03, 2006 at 09:25:54AM +0200, Jan-Benedict Glaw wrote:
> On Sun, 2006-04-02 21:10:56 -0700, Keith Packard <keithp@keithp.com> wrote:
> > I think this is a bug in your version of flex; I'm using standard lex
> > conventions here. I don't know how to make it work for you.
> 
> It compiles for me with this patch (thanks to Linus for the hint):
> 
> diff --git a/Makefile b/Makefile

[...]

> Would you please verify that it doesn't break things for you?

Almost there. I applied your patch and ran "make clean", but the
Makefile forgets to remove lex.c. Here's an updated patch:

diff --git a/Makefile b/Makefile
index 639353a..5651e70 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,8 @@ GCC_WARNINGS2=-Wmissing-prototypes -Wmis
 GCC_WARNINGS3=-Wnested-externs -fno-strict-aliasing
 GCC_WARNINGS=$(GCC_WARNINGS1) $(GCC_WARNINGS2) $(GCC_WARNINGS3)
 CFLAGS=-O0 -g $(GCC_WARNINGS)
-YFLAGS=-d
+YFLAGS=-d -l
+LFLAGS=-l
 
 SRCS=gram.y lex.l cvs.h parsecvs.c cvsutil.c revlist.c atom.c revcvs.c git.c
 
@@ -20,4 +21,4 @@ lex.o: lex.c
 y.tab.h: gram.c
 
 clean:
-	rm -f $(OBJS) y.tab.h gram.c parsecvs
+	rm -f $(OBJS) y.tab.h gram.c lex.c parsecvs



It compiles! Ship it! ;-)


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-02  5:36 Keith Packard
  2006-04-02  9:39 ` Jan-Benedict Glaw
@ 2006-04-03 14:03 ` Erik Mouw
  2006-04-03 14:21   ` Jakub Narebski
                     ` (2 more replies)
  2006-04-03 22:38 ` Martin Langhoff
  2 siblings, 3 replies; 35+ messages in thread
From: Erik Mouw @ 2006-04-03 14:03 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

On Sat, Apr 01, 2006 at 09:36:28PM -0800, Keith Packard wrote:
> The UI is a total disaster, sufficient for testing. You must create an
> Authors file in the current directory which looks like the git-cvsimport
> authors file. You must also have an edit-change-log program in your path
> which edits the commit message in place. /bin/true will work if you
> don't need to edit the messages.
> 
> I should clearly steal the existing git-cvsimport command line arguments
> and use those.

What is the current way to use it? I get the impression it reads raw ,v
files, but how do I get along with a remote CVS repository?


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 14:03 ` Erik Mouw
@ 2006-04-03 14:21   ` Jakub Narebski
  2006-04-03 14:39     ` Keith Packard
  2006-04-03 14:37   ` Keith Packard
  2006-04-04  0:55   ` Anand Kumria
  2 siblings, 1 reply; 35+ messages in thread
From: Jakub Narebski @ 2006-04-03 14:21 UTC (permalink / raw)
  To: git

Erik Mouw wrote:

> On Sat, Apr 01, 2006 at 09:36:28PM -0800, Keith Packard wrote:
>> The UI is a total disaster, sufficient for testing. You must create an
>> Authors file in the current directory which looks like the git-cvsimport
>> authors file. You must also have an edit-change-log program in your path
>> which edits the commit message in place. /bin/true will work if you
>> don't need to edit the messages.
>> 
>> I should clearly steal the existing git-cvsimport command line arguments
>> and use those.
> 
> What is the current way to use it? I get the impression it reads raw ,v
> files, but how do I get along with a remote CVS repository?

From the comments on #git, parsecvs reads raw ,v files for creating history
tree, then uses 'cvs co ...' for getting the contents.

If you have access to remote CVS repository, it was suggested to use either
cvsclone or cvsup.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 14:03 ` Erik Mouw
  2006-04-03 14:21   ` Jakub Narebski
@ 2006-04-03 14:37   ` Keith Packard
  2006-04-03 15:32     ` Jeff King
  2006-04-04  0:55   ` Anand Kumria
  2 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-03 14:37 UTC (permalink / raw)
  To: Erik Mouw; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1030 bytes --]

On Mon, 2006-04-03 at 16:03 +0200, Erik Mouw wrote:
> On Sat, Apr 01, 2006 at 09:36:28PM -0800, Keith Packard wrote:
> > The UI is a total disaster, sufficient for testing. You must create an
> > Authors file in the current directory which looks like the git-cvsimport
> > authors file. You must also have an edit-change-log program in your path
> > which edits the commit message in place. /bin/true will work if you
> > don't need to edit the messages.
> > 
> > I should clearly steal the existing git-cvsimport command line arguments
> > and use those.
> 
> What is the current way to use it? I get the impression it reads raw ,v
> files, but how do I get along with a remote CVS repository?

You can't. You need to create a local copy of the repository. There is a
tool which can do that using the cvs protocol, but I don't recall the
name.

It turns out that parsing the ,v files directly is both faster and more
accurate than attempting to interpret the output of cvs log.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 14:21   ` Jakub Narebski
@ 2006-04-03 14:39     ` Keith Packard
  0 siblings, 0 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-03 14:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: keithp, git

[-- Attachment #1: Type: text/plain, Size: 601 bytes --]

On Mon, 2006-04-03 at 16:21 +0200, Jakub Narebski wrote:

> From the comments on #git, parsecvs reads raw ,v files for creating history
> tree, then uses 'cvs co ...' for getting the contents.

It's not using cvs co, it's using the rcs 'co' command. I will probably
fix it to just generate the files directly as that will be a lot faster.
If there was a git command to create blobs directly from file contents,
it would be faster still as I could create all of the blobs for a
particular file in one pass and then just build trees in the index out
of those.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 14:37   ` Keith Packard
@ 2006-04-03 15:32     ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2006-04-03 15:32 UTC (permalink / raw)
  To: Git Mailing List

On Mon, Apr 03, 2006 at 07:37:27AM -0700, Keith Packard wrote:

> You can't. You need to create a local copy of the repository. There is a
> tool which can do that using the cvs protocol, but I don't recall the
> name.

I believe you're thinking of CVSSuck:
  http://cvs.m17n.org/~akr/cvssuck/

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03  7:25       ` Jan-Benedict Glaw
  2006-04-03 13:58         ` Erik Mouw
@ 2006-04-03 16:54         ` Keith Packard
  2006-04-03 22:19           ` Keith Packard
  1 sibling, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-03 16:54 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 181 bytes --]

On Mon, 2006-04-03 at 09:25 +0200, Jan-Benedict Glaw wrote:

> -YFLAGS=-d
> +YFLAGS=-d -l
> +LFLAGS=-l

Works for me too; thanks for the fix.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 16:54         ` Keith Packard
@ 2006-04-03 22:19           ` Keith Packard
  0 siblings, 0 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-03 22:19 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On Mon, 2006-04-03 at 09:54 -0700, Keith Packard wrote:
> On Mon, 2006-04-03 at 09:25 +0200, Jan-Benedict Glaw wrote:
> 
> > -YFLAGS=-d
> > +YFLAGS=-d -l
> > +LFLAGS=-l
> 
> Works for me too; thanks for the fix.

Well, -l *kinda* works; it places a limit on the maximum token size.
And, unlike 'lex', 'flex' places all input into the token buffer, even
if handled outside the usual lexer loop. So, my external function that
sucked up file contents was losing.

I switched it over to doing one-at-a-time reads from the input file, now
the external data function can directly use stdio. This eliminates all
calls to 'input' and 'unput' which should make it work for everyone now.
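
A minimal sketch of that stdio approach, assuming the usual RCS
convention that a string in a ,v file is delimited by '@' and that a
literal '@' inside it is doubled as "@@". The name read_rcs_string is
hypothetical and this is not the function in parsecvs; it expects the
opening '@' to have been consumed already.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read the body of an RCS @-string from f, starting just after the
 * opening '@'. "@@" decodes to a single '@'; a lone '@' ends the string.
 * Returns a malloc'd, NUL-terminated buffer (caller frees) and stores
 * its length in *len. Returns NULL on EOF before the closing '@' or on
 * allocation failure.
 */
static char *read_rcs_string(FILE *f, size_t *len)
{
    size_t cap = 256, n = 0;
    char *buf = malloc(cap);
    int c;

    if (!buf)
        return NULL;
    while ((c = getc(f)) != EOF) {
        if (c == '@') {
            int next = getc(f);
            if (next != '@') {
                /* Lone '@': end of string. Push back the lookahead. */
                if (next != EOF)
                    ungetc(next, f);
                buf[n] = '\0';
                *len = n;
                return buf;
            }
            /* "@@": fall through and store one '@'. */
        }
        if (n + 1 >= cap) {
            /* Grow so there is always room for the terminator. */
            char *tmp = realloc(buf, cap *= 2);
            if (!tmp) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[n++] = (char)c;
    }
    free(buf);
    return NULL; /* hit EOF inside the string */
}
```

Because the function reads with getc and only pushes back a single
character, it never touches the lexer's token buffer, so no token size
limit applies to file contents.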

flex -- it's like lex, except less flexible.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-02  5:36 Keith Packard
  2006-04-02  9:39 ` Jan-Benedict Glaw
  2006-04-03 14:03 ` Erik Mouw
@ 2006-04-03 22:38 ` Martin Langhoff
  2006-04-04  2:07   ` Keith Packard
  2 siblings, 1 reply; 35+ messages in thread
From: Martin Langhoff @ 2006-04-03 22:38 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

Keith,

Looks nifty. Though I thought you'd go for writing a smarter cvsps, so
that git-cvsimport could take advantage of it.

Looks like I'll have to brush up on my C to get to play... :-(



m

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 14:03 ` Erik Mouw
  2006-04-03 14:21   ` Jakub Narebski
  2006-04-03 14:37   ` Keith Packard
@ 2006-04-04  0:55   ` Anand Kumria
  2 siblings, 0 replies; 35+ messages in thread
From: Anand Kumria @ 2006-04-04  0:55 UTC (permalink / raw)
  To: git

On Mon, 03 Apr 2006 16:03:48 +0200, Erik Mouw wrote:

> On Sat, Apr 01, 2006 at 09:36:28PM -0800, Keith Packard wrote:
>> The UI is a total disaster, sufficient for testing. You must create an
>> Authors file in the current directory which looks like the git-cvsimport
>> authors file. You must also have an edit-change-log program in your path
>> which edits the commit message in place. /bin/true will work if you
>> don't need to edit the messages.
>> 
>> I should clearly steal the existing git-cvsimport command line arguments
>> and use those.
> 
> What is the current way to use it? I get the impression it reads raw ,v
> files, but how do I get along with a remote CVS repository?

cvsclone, recently released, might be what you are after.

I've only used it on my own CVS repositories so I've no idea just how hard
it hits the remote side.

<http://freshmeat.net/projects/cvsclone/>

Anand

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-03 22:38 ` Martin Langhoff
@ 2006-04-04  2:07   ` Keith Packard
  2006-04-04  2:16     ` Martin Langhoff
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-04  2:07 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 776 bytes --]

On Tue, 2006-04-04 at 10:38 +1200, Martin Langhoff wrote:

> Looks nifty. Though I thought you'd go for writing a smarter cvsps, so
> that git-cvsimport could take advantage of it.

Once I had the change set information sitting in memory, it was far
easier to just generate the appropriate git commands than attempt to
recreate the cvsps output format...

> Looks like I'll have to brush up on my C to get to play... :-(

Trust me, it wasn't because I wanted to replace git-cvsimport; it's
solely that cvsps was generating complete garbage for most of my
repositories.

My new tool isn't perfect yet; it isn't getting exactly the expected
answers for the postgresql repository, but it's working perfectly for my
X server one.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-04  2:07   ` Keith Packard
@ 2006-04-04  2:16     ` Martin Langhoff
  2006-04-04  2:24       ` Keith Packard
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Langhoff @ 2006-04-04  2:16 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

On 4/4/06, Keith Packard <keithp@keithp.com> wrote:
> Trust me, it wasn't because I wanted to replace git-cvsimport; it's
> solely that cvsps was generating complete garbage for most of my
> repositories.

Oh, I don't mind -- we may as well bury cvsimport but I can't do C
like I can do Perl, and I surely want to help on this one.

> My new tool isn't perfect yet; it isn't getting exactly the expected
> answers for the postgresql repository, but it's working perfectly for my
> X server one.

Meh, had you done it in Perl, I'd be helping you with the Pg repo,
attic files and ensuring that files created on a branch and then put
into HEAD are handled gracefully. (But you'll get Linus' and Junio's
attention. Smarty cookie.)

Does it run incrementally? Can it discover non-binary files and pass -kk?

cheers,


martin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-04  2:16     ` Martin Langhoff
@ 2006-04-04  2:24       ` Keith Packard
  2006-04-04  2:42         ` Martin Langhoff
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-04  2:24 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

On Tue, 2006-04-04 at 14:16 +1200, Martin Langhoff wrote:

> Meh, had you done it in Perl, I'd be helping you with the Pg repo,
> attic files and ensuring that files created on a branch and then put
> into HEAD are handled gracefully. (But you'll get Linus' and Junio's
> attention. Smarty cookie.)

I think those parts are working correctly, I've had plenty of examples
of that kind of adventure.

> Does it run incrementally? Can it discover non-binary files and pass -kk?

It doesn't run incrementally, and it unconditionally passes -kk. It's
currently using rcs to check out versions of the files, so it should
deal with binary content as well as rcs does. Is there something magic I
need to do here? Like for DOS?

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-04  2:24       ` Keith Packard
@ 2006-04-04  2:42         ` Martin Langhoff
  2006-04-04  3:51           ` Keith Packard
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Langhoff @ 2006-04-04  2:42 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

On 4/4/06, Keith Packard <keithp@keithp.com> wrote:
> On Tue, 2006-04-04 at 14:16 +1200, Martin Langhoff wrote:
>
> > Meh, had you done it in Perl, I'd be helping you with the Pg repo,
> > attic files and ensuring that files created on a branch and then put
> > into HEAD are handled gracefully. (But you'll get Linus' and Junio's
> > attention. Smarty cookie.)
>
> I think those parts are working correctly, I've had plenty of examples
> of that kind of adventure.

Cool. What's the matter with the Pg repo? (Where can I get hold of that repo?)

> > Does it run incrementally? Can it discover non-binary files and pass -kk?
>
> It doesn't run incrementally, and it unconditionally passes -kk. It's

I thought that the .git-cvs directory it created was to be able to run
incrementally (btw, I think it's fair game to create subdirs inside
.git for this kind of status-tracking). And passing -kk unconditionally
is destructive in some cases (I know... git-cvsimport does it, and I
want to fix that). If you can ask rcs about the mode of the file and
not pass -kk for binary files...

> currently using rcs to check out versions of the files, so it should
> deal with binary content as well as rcs does. Is there something magic I
> need to do here? Like for DOS?

We'll let DOS take care of itself ;)



m

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: parsecvs tool now creates git repositories
  2006-04-04  2:42         ` Martin Langhoff
@ 2006-04-04  3:51           ` Keith Packard
  2006-04-04  6:09             ` Junio C Hamano
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-04  3:51 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 2717 bytes --]

On Tue, 2006-04-04 at 14:42 +1200, Martin Langhoff wrote:

> Cool. What's the matter with the Pg repo? (Where can I get hold of that repo?)

As usual, the detection of branch locations is messed up.

The postgresql CVS tree is available at:

        rsync anoncvs.postgresql.org::pgsql-cvs/* postgresql.cvs

It's a fairly hefty 300M.

> > > Does it run incrementally? Can it discover non-binary files and pass -kk?
> >
> > It doesn't run incrementally, and it unconditionally passes -kk. It's
> 
> I thought that the .git-cvs directory it created was to be able to run
> incrementally (btw, I think it's fair game to create subdirs inside
> .git for this kind of status-tracking). And passing -kk unconditionally
> is destructive in some cases (I know... git-cvsimport does it, and I
> want to fix that). If you can ask rcs about the mode of the file and
> not pass -kk for binary files...

nah, the .git-cvs directory is purely for debugging; I leave the various
command outputs there so I can see what went wrong.

I don't really have a good idea of how we'd do this process
incrementally; that's not something I am personally interested in
either, I want to run screaming from CVS as fast as I can at this point.

> > currently using rcs to check out versions of the files, so it should
> > deal with binary content as well as rcs does. Is there something magic I
> > need to do here? Like for DOS?
> 
> We'll let DOS take care of itself ;)

I did discover that rcs has less sophisticated keyword substitution than
cvs; not having any ability to customize stuff.

I guess we need to figure out when to pass -ko and when to pass -kk. The
other alternative I'd like to get around to trying is to directly
generate all of the revision contents from the ,v file.
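
One possible heuristic for that choice, sketched under the assumption
that the admin section of a ,v file may carry an "expand" phrase and
that CVS records "b" there for files added with -kb. The function name
keyword_flag is hypothetical; nothing here is taken from parsecvs.

```c
#include <string.h>

/*
 * Pick a keyword-substitution flag for the rcs `co` command from the
 * "expand" value found in a ,v file's admin section (NULL when the
 * phrase is absent).
 * Assumption: modes "b" and "o" mark content that must not be
 * rewritten, so keep the stored keyword strings (-ko); everything
 * else gets -kk.
 */
static const char *keyword_flag(const char *expand)
{
    if (expand && (strcmp(expand, "b") == 0 || strcmp(expand, "o") == 0))
        return "-ko";
    return "-kk";
}
```

This only helps for files that were correctly marked binary when they
were added; files committed without -kb would still need some other
form of detection.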

I've just changed parsecvs to generate blobs for every revision in
each ,v file right after they're read in; putting the necessary code
right into parsecvs should be reasonably straightforward; we don't need
the multi-patch logic as we do want to compute each intermediate version
of the file.

With the blobs all generated, the rest of the operation is a simple
matter of building suitable indices and creating commits out of them.
That's a reasonably fast operation now as it doesn't manipulate any file
contents. Plus, I can do all of the index operations using a single
git-update-index command, so I eliminate a pile of forking.

Doing the file revision generation in-line would allow us to eliminate
most of the remaining forks; we'd run one git-hash-object per file (or
so), then a git-update-index, git-write-tree and git-commit-tree per
resulting commit.
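
One way to batch the index operations into a single git-update-index
call is to feed it "mode SP sha1 TAB path" records through --index-info.
The following is only a sketch of formatting such records;
index_info_line() is a made-up helper for illustration, not code from
parsecvs, and it assumes paths contain no tabs or newlines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one "mode SP sha1 TAB path LF" record into out.
 * hex must be the 40-character hex object name of a blob.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if hex is malformed or the record does not fit.
 * Hypothetical helper; paths with tabs or newlines are not handled.
 */
static int index_info_line(char *out, size_t outlen, unsigned mode,
			   const char *hex, const char *path)
{
	int n;

	if (strlen(hex) != 40)
		return -1;
	n = snprintf(out, outlen, "%06o %s\t%s\n", mode, hex, path);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return n;
}
```

Each record produced this way would be written to the standard input of
one "git-update-index --index-info" process per commit, followed by
git-write-tree and git-commit-tree.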

-- 
keith.packard@intel.com


* Re: parsecvs tool now creates git repositories
  2006-04-04  3:51           ` Keith Packard
@ 2006-04-04  6:09             ` Junio C Hamano
  0 siblings, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-04-04  6:09 UTC (permalink / raw)
  To: Keith Packard; +Cc: git

Keith Packard <keithp@keithp.com> writes:

> I've just changed parsecvs to generate blobs for every revision in
> each ,v file right after they're read in; putting the necessary code
> right into parsecvs should be reasonably straightforward; we don't need
> the multi-patch logic as we do want to compute each intermediate version
> of the file.

If you want to go really fast without extra fork, are writing it
in C, and have the data for blob in core, you could link with
libgit.a and call write_sha1_file() yourself:

	unsigned char sha1[20];
	void *buf;
	unsigned long len;

	write_sha1_file(buf, len, "blob", sha1);

instead of forking "hash-object -w".  You feed your blob data
in buf, with its length in len, and you will get the blob object
name back in sha1[].  buf is owned by you and after
write_sha1_file() returns it is safe for you to scribble over it
or free() it.  sha1[] stores binary object name (20 bytes, not
40-byte hexadecimal), and you can use the helper function
sha1_to_hex() if you need a hex representation:

	char *sha1_to_hex(const unsigned char *sha1);

which returns a pointer to a static buffer that is valid until
next call to sha1_to_hex(), so you need to strdup it if you want
to retain it.
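
A minimal sketch of that static-buffer caveat follows. hex_of() is a
stand-in written for this example with the same contract as described
above (20 binary bytes in, pointer to a static hex string out); it is
not the libgit.a function, and hex_dup() is likewise a hypothetical
helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for sha1_to_hex(): same contract, NOT the libgit.a code.
 * The returned pointer refers to a static buffer that the next call
 * overwrites.
 */
static char *hex_of(const unsigned char *sha1)
{
	static char buf[41];
	int i;

	for (i = 0; i < 20; i++)
		sprintf(buf + 2 * i, "%02x", (unsigned)sha1[i]);
	return buf;
}

/*
 * Copy the hex name so it survives later hex_of() calls.
 * The caller owns the result and must free() it.
 */
static char *hex_dup(const unsigned char *sha1)
{
	const char *hex = hex_of(sha1);
	char *copy = malloc(strlen(hex) + 1);

	if (copy)
		strcpy(copy, hex);
	return copy;
}
```

Holding on to the raw return value of hex_of() across two calls would
leave both pointers aliasing the same buffer, which is why the copy is
needed.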


* Fixes to parsecvs
@ 2006-04-06  6:36   ` Keith Packard
  2006-04-06 12:08     ` Jan-Benedict Glaw
  2006-04-06 18:15     ` parsecvs tool now creates git repositories Jim Radford
  0 siblings, 2 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-06  6:36 UTC (permalink / raw)
  To: Git Mailing List; +Cc: keithp

note, parsecvs remains available from:

	git://git.freedesktop.org/~keithp/parsecvs

I've "fixed" the lexer to permit getc/ungetc in the data parsing
functions. This should resolve the flex -l / -X problems.

Jim Radford sent a patch to add '/' as a legal tag character.

I added my custom edit-change-log script for people dealing with
X.org-style commit messages.

And, it deals with import branch revisions that aren't supposed to
get merged back to the trunk, creating a custom branch name based on the
branch revision (which must be global across all files).

5e5f4c012aec2db012a08b1c7ed5219ed5100111

-- 
keith.packard@intel.com



* Re: Fixes to parsecvs
  2006-04-06  6:36   ` Fixes to parsecvs Keith Packard
@ 2006-04-06 12:08     ` Jan-Benedict Glaw
  2006-04-06 14:48       ` Keith Packard
  2006-04-06 18:15     ` parsecvs tool now creates git repositories Jim Radford
  1 sibling, 1 reply; 35+ messages in thread
From: Jan-Benedict Glaw @ 2006-04-06 12:08 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

On Wed, 2006-04-05 23:36:32 -0700, Keith Packard <keithp@keithp.com> wrote:
> note, parsecvs remains available from:
> 
> 	git://git.freedesktop.org/~keithp/parsecvs

It now compiles out-of-the-box for me, nice work.

However, it would be nice if you'd add a short description about how
to use it. Something like this:
---------------------------------------------------------------------
There's still a lot of work to do on parsecvs, but if you want to give
it a run, first create a copy of the whole CVS tree and go to the base
directory of this copy. (You find a lot of *,v files in this directory
and all its subdirectories.)
Now feed all ,v filenames into parsecvs. Keep in mind that an
`edit-change-log' executable needs to be in your $PATH (a one-line
script that just exits with 0 will do the job):

	find . -type f -name '*,v' -print | parsecvs

This will create the .git/ directory and put all the objects, commits
and tree information into this new git repository.
---------------------------------------------------------------------

I just ran it against a locally rsync'ed copy of the Binutils ,v
files. Looking at the progress bar, it is basically done:


Load:               winsup/configure.in,v ....................* 27704 of 27704


But it seems it now starts to really consume memory:

jbglaw@bixie:~/bin$ ps axflwww|egrep '(VSZ|parsecvs)'|grep -v grep
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
0  1000 15564 22879  18   0 2805084 549996 finish T  pts/10    30:51 |       \_ parsecvs

How well does this work with even larger repositories?

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));


* Re: Fixes to parsecvs
  2006-04-06 12:08     ` Jan-Benedict Glaw
@ 2006-04-06 14:48       ` Keith Packard
  2006-04-06 15:26         ` Johannes Schindelin
  2006-04-09 23:17         ` Francois Romieu
  0 siblings, 2 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-06 14:48 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: keithp, Git Mailing List

On Thu, 2006-04-06 at 14:08 +0200, Jan-Benedict Glaw wrote:
> On Wed, 2006-04-05 23:36:32 -0700, Keith Packard <keithp@keithp.com> wrote:
> > note, parsecvs remains available from:
> > 
> > 	git://git.freedesktop.org/~keithp/parsecvs
> 
> It now compiles out-of-the-box for me, nice work.

cool

> 
> However, it would be nice if you'd add a short description about how
> to use it. Something like this:

I'd rather just fix the usage to be more sane; that shouldn't take but a
few minutes...

> I just ran it against a locally rsync'ed copy of the Binutils ,v
> files. Looking at the progress bar, it is basically done:
> 
> 
> Load:               winsup/configure.in,v ....................* 27704 of 27704

Now all of the ,v files have been parsed and each revision placed in
the .git repository as a blob.

> But it seems it now starts to really consume memory:

Yeah, it's doing the change set computation, which is not very space
efficient; it computes the entire set of files at each commit which can
take 'a bit' of space with a large number of files over a long period of
time. Obviously computing revision deltas and saving those would make it
use a lot less memory.

> jbglaw@bixie:~/bin$ ps axflwww|egrep '(VSZ|parsecvs)'|grep -v grep
> F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
> 0  1000 15564 22879  18   0 2805084 549996 finish T  pts/10    30:51 |       \_ parsecvs

I'd run a large repository on a large machine; I managed to get
postgresql to run on my laptop (615M CVS with 6000 files), but anything
larger I'd probably want to get it onto a big enough machine. The
question is whether it needs to be more efficient so that people can
constantly convert repositories or whether moving the repository to a
sufficiently large machine for the one-time conversion is 'good enough'.

> How well does this work with even larger repositories?

postgresql is the largest I've run; starting with a 615M CVS repository,
it built a 1.7G .git tree, which packed down to 125M.

-- 
keith.packard@intel.com


* Re: Fixes to parsecvs
  2006-04-06 14:48       ` Keith Packard
@ 2006-04-06 15:26         ` Johannes Schindelin
  2006-04-06 16:09           ` Jan-Benedict Glaw
  2006-04-06 17:36           ` Keith Packard
  2006-04-09 23:17         ` Francois Romieu
  1 sibling, 2 replies; 35+ messages in thread
From: Johannes Schindelin @ 2006-04-06 15:26 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

Hi,

On Thu, 6 Apr 2006, Keith Packard wrote:

> On Thu, 2006-04-06 at 14:08 +0200, Jan-Benedict Glaw wrote:
> 
> > But it seems it now starts to really consume memory:
> 
> The question is whether it needs to be more efficient so that people can 
> constantly convert repositories or whether moving the repository to a 
> sufficiently large machine for the one-time conversion is 'good enough'.

Keep in mind that there are many more valid uses for tracking a CVS 
repository than to import it once.

Ciao,
Dscho


* Re: Fixes to parsecvs
  2006-04-06 15:26         ` Johannes Schindelin
@ 2006-04-06 16:09           ` Jan-Benedict Glaw
  2006-04-06 17:36           ` Keith Packard
  1 sibling, 0 replies; 35+ messages in thread
From: Jan-Benedict Glaw @ 2006-04-06 16:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Keith Packard, Git Mailing List

On Thu, 2006-04-06 17:26:14 +0200, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Thu, 6 Apr 2006, Keith Packard wrote:
> > On Thu, 2006-04-06 at 14:08 +0200, Jan-Benedict Glaw wrote:
> > > But it seems it now starts to really consume memory:
> > The question is whether it needs to be more efficient so that people can 
> > constantly convert repositories or whether moving the repository to a 
> > sufficiently large machine for the one-time conversion is 'good enough'.
> 
> Keep in mind that there are many more valid uses for tracking a CVS 
> repository than to import it once.

Even the simplest use case shows this. (It's also what I'm
about to do with the converted GCC repository.)

Get the repo, locally track the changes (so the imported branches are
all like "vendor branches") and do my own work in local branches.

I'll do this eg. to be able to easily re-diff patches, which I want to
put into GIT, just because it's so much more convenient than SVN.
However, this is only possible because I'm able to keep track of
upstream SVN changes. They probably won't change their SCM again, just
after they've introduced SVN.

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));


* Re: Fixes to parsecvs
  2006-04-06 15:26         ` Johannes Schindelin
  2006-04-06 16:09           ` Jan-Benedict Glaw
@ 2006-04-06 17:36           ` Keith Packard
  1 sibling, 0 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-06 17:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: keithp, Git Mailing List

On Thu, 2006-04-06 at 17:26 +0200, Johannes Schindelin wrote:

> Keep in mind that there are many more valid uses for tracking a CVS 
> repository than to import it once.

Sure, but we should fix parsecvs to handle incremental CVS tracking if
that's one of the goals for this utility. git-cvsimport does this by
skipping commits earlier than a fixed time; if we did that, we'd
eliminate the huge memory usage except for initial imports. I haven't
considered how this might be done in detail yet; I have no personal need
for this functionality.
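
As a rough illustration of the "skip commits earlier than a fixed time"
idea, here is a minimal sketch. It is not taken from git-cvsimport or
parsecvs; struct rev and drop_before() are hypothetical names made up
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-file revision record, for illustration only. */
struct rev {
	time_t date;		/* commit time of this revision */
	const char *file;	/* name of the ,v file it came from */
};

/*
 * Compact revs[] in place, keeping only revisions strictly newer
 * than cutoff (the time of the last revision already imported).
 * Returns the number of revisions kept.
 */
static size_t drop_before(struct rev *revs, size_t n, time_t cutoff)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (revs[i].date > cutoff)
			revs[kept++] = revs[i];
	return kept;
}
```

Only the revisions that survive the filter would then need to go
through the change set computation, which is where the memory goes.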

-- 
keith.packard@intel.com


* Re: parsecvs tool now creates git repositories
  2006-04-06  6:36   ` Fixes to parsecvs Keith Packard
  2006-04-06 12:08     ` Jan-Benedict Glaw
@ 2006-04-06 18:15     ` Jim Radford
  2006-04-06 20:12       ` Keith Packard
  1 sibling, 1 reply; 35+ messages in thread
From: Jim Radford @ 2006-04-06 18:15 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List

Hi Keith,

Here's one more build patch.  For some reason the Fedora lex doesn't
want a space after the -o.

Almost all of the errors I was seeing in the last version were fixed
with your "branches that don't get merged back to the trunk" fix.

Thanks,
-Jim

diff --git a/Makefile b/Makefile
index 4ca6ffd..137ed34 100644
--- a/Makefile
+++ b/Makefile
@@ -4,7 +4,7 @@ GCC_WARNINGS3=-Wnested-externs -fno-stri
 GCC_WARNINGS=$(GCC_WARNINGS1) $(GCC_WARNINGS2) $(GCC_WARNINGS3)
 CFLAGS=-O0 -g $(GCC_WARNINGS)
 YFLAGS=-d -l
-LFLAGS=-l -o lex.c
+LFLAGS=-l -olex.c

 SRCS=gram.y lex.l cvs.h parsecvs.c cvsutil.c \
        revlist.c atom.c revcvs.c git.c gitutil.c


* Re: parsecvs tool now creates git repositories
  2006-04-06 18:15     ` parsecvs tool now creates git repositories Jim Radford
@ 2006-04-06 20:12       ` Keith Packard
  2006-04-06 21:51         ` Martin Langhoff
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-06 20:12 UTC (permalink / raw)
  To: Jim Radford; +Cc: keithp, Git Mailing List

On Thu, 2006-04-06 at 11:15 -0700, Jim Radford wrote:
> Hi Keith,
> 
> Here's one more build patch.  For some reason the Fedora lex doesn't
> want a space after the -o.

I probably shouldn't even use the -o flag; all it does is change the
#line directives in the output file to point at lex.c instead of
<stdout>. I'm sure it'll break something.

> Almost all of the errors I was seeing in the last version were fixed
> with your "branches that don't get merged back to the trunk" fix.

That's good news at least.

-- 
keith.packard@intel.com


* Re: parsecvs tool now creates git repositories
  2006-04-06 20:12       ` Keith Packard
@ 2006-04-06 21:51         ` Martin Langhoff
  2006-04-06 22:19           ` Keith Packard
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Langhoff @ 2006-04-06 21:51 UTC (permalink / raw)
  To: Keith Packard; +Cc: Jim Radford, Git Mailing List

On 4/7/06, Keith Packard <keithp@keithp.com> wrote:
> > Almost all of the errors I was seeing in the last version were fixed
> > with your "branches that don't get merged back to the trunk" fix.
>
> That's good news at least.

I'm re-running my import of Moodle's cvs (20K commits) with the newer
parsecvs. The previous attempt looked very good except that

 - file additions were recorded with one-commit-per-file. I am not
sure how rcs is recording these, but the user does enter a common
message at "commit" time. Perhaps the file addition action could be
ignored then?

 - some tags made on a branch show up in HEAD. This may be due to
partial-tree branches, but I am not sure.

cheers


m


* Re: parsecvs tool now creates git repositories
  2006-04-06 21:51         ` Martin Langhoff
@ 2006-04-06 22:19           ` Keith Packard
  2006-04-06 23:22             ` Martin Langhoff
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Packard @ 2006-04-06 22:19 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Jim Radford, Git Mailing List

On Fri, 2006-04-07 at 09:51 +1200, Martin Langhoff wrote:

>  - file additions were recorded with one-commit-per-file. I am not
> sure how rcs is recording these, but the user does enter a common
> message at "commit" time. Perhaps the file addition action could be
> ignored then?

If the log message is identical, and the dates are in-range, parsecvs
"should" put the adds in the same commit. 
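
To make the rule concrete, here is a minimal sketch of grouping
per-file revisions by identical log message and nearby dates. It is not
the parsecvs code: struct file_rev, coalesce() and the 300 second
COMMIT_WINDOW are all assumptions made up for this example, and a real
tool would also have to avoid putting two revisions of the same file
into one commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define COMMIT_WINDOW 300	/* seconds; assumed value, not from parsecvs */

/* Hypothetical per-file revision record, for illustration only. */
struct file_rev {
	const char *log;	/* log message of this revision */
	time_t date;		/* commit time of this revision */
	int commit;		/* output: id of the commit it lands in */
};

/*
 * revs[] must be sorted by date, oldest first.  A revision joins an
 * existing commit when its log message is identical and it is within
 * COMMIT_WINDOW seconds of the most recent revision already in that
 * commit; otherwise it starts a new commit.
 * Returns the number of commits created.
 */
static int coalesce(struct file_rev *revs, size_t n)
{
	int next_id = 0;
	size_t i, j;

	for (i = 0; i < n; i++) {
		revs[i].commit = -1;
		/* Walk backwards over revisions still inside the window. */
		for (j = i; j-- > 0; ) {
			if (difftime(revs[i].date, revs[j].date) > COMMIT_WINDOW)
				break;
			if (strcmp(revs[i].log, revs[j].log) == 0) {
				revs[i].commit = revs[j].commit;
				break;
			}
		}
		if (revs[i].commit < 0)
			revs[i].commit = next_id++;
	}
	return next_id;
}
```

With this rule, files added in one "cvs commit" share a commit only if
their stored log messages really are identical, which is the condition
in question here.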

>  - some tags made on a branch show up in HEAD. This may be due to
> partial-tree branches, but I am not sure.

Finding branch points is not perfect; it's complicated by bizarre
behaviour when adding files and casual CVS changes which make precise
branch points hard to detect. Can I get at this repository to play with?
I'd like to see if we can't get the branch point detection more
accurate.

-- 
keith.packard@intel.com


* Re: parsecvs tool now creates git repositories
  2006-04-06 22:19           ` Keith Packard
@ 2006-04-06 23:22             ` Martin Langhoff
  2006-04-07  7:24               ` Keith Packard
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Langhoff @ 2006-04-06 23:22 UTC (permalink / raw)
  To: Keith Packard; +Cc: Jim Radford, Git Mailing List

On 4/7/06, Keith Packard <keithp@keithp.com> wrote:
> On Fri, 2006-04-07 at 09:51 +1200, Martin Langhoff wrote:
>
> >  - file additions were recorded with one-commit-per-file. I am not
> > sure how rcs is recording these, but the user does enter a common
> > message at "commit" time. Perhaps the file addition action could be
> > ignored then?
>
> If the log message is identical, and the dates are in-range, parsecvs
> "should" put the adds in the same commit.

parsecvs is committing them with the "added file foo.x" message, not
the actual commit message.

> >  - some tags made on a branch show up in HEAD. This may be due to
> > partial-tree branches, but I am not sure.
>
> Finding branch points is not perfect; it's complicated by bizarre
> behaviour when adding files and casual CVS changes which make precise
> branch points hard to detect. Can I get at this repository to play with?

I fetch it with something along the lines of...

while true; do
     wget -qc http://cvs.sourceforge.net/cvstarballs/moodle-cvsroot.tar.bz2 &&
         break
     sleep 5
done

and then import the "moodle" module.

cheers,


m


* Re: parsecvs tool now creates git repositories
  2006-04-06 23:22             ` Martin Langhoff
@ 2006-04-07  7:24               ` Keith Packard
  0 siblings, 0 replies; 35+ messages in thread
From: Keith Packard @ 2006-04-07  7:24 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Jim Radford, Git Mailing List

On Fri, 2006-04-07 at 11:22 +1200, Martin Langhoff wrote:

> parsecvs is committing them with the "added file foo.x" message, not
> the actual commit message.

heh. my cvs repositories are all so kludged that no files have ever been
added, it appears. I'll fix this when I've got a copy of the moodle
repository. sf.net is as useful as always.

I suspect the change is as simple as checking the format of the log
message and the timestamps of the commits and then just dropping the
1.1 revision from the tree entirely.

-- 
keith.packard@intel.com


* Re: Fixes to parsecvs
  2006-04-06 14:48       ` Keith Packard
  2006-04-06 15:26         ` Johannes Schindelin
@ 2006-04-09 23:17         ` Francois Romieu
  1 sibling, 0 replies; 35+ messages in thread
From: Francois Romieu @ 2006-04-09 23:17 UTC (permalink / raw)
  To: Keith Packard; +Cc: Jan-Benedict Glaw, Git Mailing List

Keith Packard <keithp@keithp.com> :
[...]
> > How well does this work with even larger repositories?
> 
> postgresql is the largest I've run; starting with a 615M CVS repository,
> it built a 1.7G .git tree, which packed down to 125M.

As a datapoint, I gave parsecvs a try on a local CVS repository.
The repository weighs 3.28 GB. It contains 53k files (45k non-attic).

.git/objects grew from ~100k files at the end of the first pass to
199k files (~11k commits). It took 18h on a 3GHz PIV with 2 GB RAM.
After 6 hours, 400 MB were pushed to swap and parsecvs took 1.95 GB
of RAM for itself. No significant swap activity. Swap grew to 900 MB
by the end of the run. A tarball (5 MB) containing vmstat + size of objects
is available at http://www.cogenit.fr/linux/misc/cvsparse-debug.tar.bz2

I have interrupted 'git repack -a -d' after 6 hours.

-- 
Ueimor


end of thread, other threads:[~2006-04-09 23:22 UTC | newest]

Thread overview: 35+ messages
     [not found] <20060405174247.GA29758@blackbean.org>
     [not found] ` <1144262498.2303.231.camel@neko.keithp.com>
2006-04-06  6:36   ` Fixes to parsecvs Keith Packard
2006-04-06 12:08     ` Jan-Benedict Glaw
2006-04-06 14:48       ` Keith Packard
2006-04-06 15:26         ` Johannes Schindelin
2006-04-06 16:09           ` Jan-Benedict Glaw
2006-04-06 17:36           ` Keith Packard
2006-04-09 23:17         ` Francois Romieu
2006-04-06 18:15     ` parsecvs tool now creates git repositories Jim Radford
2006-04-06 20:12       ` Keith Packard
2006-04-06 21:51         ` Martin Langhoff
2006-04-06 22:19           ` Keith Packard
2006-04-06 23:22             ` Martin Langhoff
2006-04-07  7:24               ` Keith Packard
2006-04-02  5:36 Keith Packard
2006-04-02  9:39 ` Jan-Benedict Glaw
2006-04-02 19:31   ` Jan-Benedict Glaw
2006-04-03  4:10     ` Keith Packard
2006-04-03  4:38       ` Linus Torvalds
2006-04-03  7:25       ` Jan-Benedict Glaw
2006-04-03 13:58         ` Erik Mouw
2006-04-03 16:54         ` Keith Packard
2006-04-03 22:19           ` Keith Packard
2006-04-03 14:03 ` Erik Mouw
2006-04-03 14:21   ` Jakub Narebski
2006-04-03 14:39     ` Keith Packard
2006-04-03 14:37   ` Keith Packard
2006-04-03 15:32     ` Jeff King
2006-04-04  0:55   ` Anand Kumria
2006-04-03 22:38 ` Martin Langhoff
2006-04-04  2:07   ` Keith Packard
2006-04-04  2:16     ` Martin Langhoff
2006-04-04  2:24       ` Keith Packard
2006-04-04  2:42         ` Martin Langhoff
2006-04-04  3:51           ` Keith Packard
2006-04-04  6:09             ` Junio C Hamano
