* parsecvs fails even on simple input
@ 2007-06-22 11:36 Roman Kagan
2007-06-24 21:31 ` Keith Packard
0 siblings, 1 reply; 5+ messages in thread
From: Roman Kagan @ 2007-06-22 11:36 UTC
To: Keith Packard, Al Viro; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1876 bytes --]
One of the patches recently merged into parsecvs master, namely
commit f5b3cb849517adfd7790c1bfa84bbb84220e3e7b
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Tue Jan 16 04:15:35 2007 -0500
[PATCH] generate tree objects just as we calculate changesets
... and don't store the fsckloads of rev_file in ExportGit mode; they
are only needed (for now) in ExportGraph one.
Tree generation is done directly, without hitting on-disk index. Very fast
now.
broke parsecvs entirely. The reproducer (attached) is very simple:
initial commit of a just added file. parsecvs now barfs on it:
# parsecvs a,v
Initialized empty shared Git repository in .git/
Load: a,v ....................* 1 of 1
Pack pack-6b955e2d966143fc957ccd272e9dd822ceaccf25 created
Removing unused objects 81%...
Removing unused objects 100%...
Done.
error: invalid object d0141680ee5324d51a558a0a48c8a867cbc6a47c
error: writing tree
Authors: No such file or directory
Save: master ....................* 1 of 1
The problem is the following: after that commit parsecvs tries to add
objects to the git tree on its own via calls to libgit; however, in
between it runs git-pack-objects. Thus objects move to pack files
without libgit being aware of it; this results in 'invalid object'
errors.
The object with that hash does exist, but it is stored in the pack
file; if parsecvs is run a second time in the same directory, it
finds the object there and happily completes.
I haven't yet had the time to dig deeper into this problem and code a
patch; however, the whole idea of doing part of the job through the
(unpublished) libgit API and the rest via callouts to git utilities
looks like asking for trouble. Wouldn't it be better to teach parsecvs
to speak git-fast-import language instead?
Roman.
[-- Attachment #2: a,v --]
[-- Type: text/plain, Size: 188 bytes --]
head 1.1;
access;
symbols;
locks; strict;
comment @# @;
1.1
date 2007.06.21.12.11.32; author tstuser; state Exp;
branches;
next ;
desc
@@
1.1
log
@test commit
@
text
@this is test
@
* Re: parsecvs fails even on simple input
2007-06-22 11:36 parsecvs fails even on simple input Roman Kagan
@ 2007-06-24 21:31 ` Keith Packard
2007-06-25 4:59 ` Shawn O. Pearce
2007-06-27 15:33 ` Roman Kagan
0 siblings, 2 replies; 5+ messages in thread
From: Keith Packard @ 2007-06-24 21:31 UTC
To: Roman Kagan; +Cc: keithp, Al Viro, git
[-- Attachment #1: Type: text/plain, Size: 633 bytes --]
On Fri, 2007-06-22 at 15:36 +0400, Roman Kagan wrote:
> The problem is the following: after that commit parsecvs tries to add
> objects to the git tree on its own via calls to libgit; however, in
> between it runs git-pack-objects. Thus objects move to pack files
> without libgit being aware of it; this results in 'invalid object'
> errors.
Sticking a call to reprepare_packed_git() after the pack creation fixes
this nicely.
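The failure and the fix can be illustrated with a toy model. The sketch below is not git's code: `ToyObjectStore` and all of its methods are hypothetical names, and it simply assumes that a process snapshots the list of pack files on its first lookup and keeps using that snapshot. `rescan_packs()` stands in for the role that reprepare_packed_git() plays in the real fix.

```python
class ToyObjectStore:
    """Toy stand-in for an object database with a cached pack list.

    Hypothetical model only; the names do not correspond to git internals.
    """

    def __init__(self):
        self.loose = {}            # object id -> data, "loose" objects
        self.packs_on_disk = []    # packs as they exist on disk right now
        self._known_packs = None   # snapshot taken on first lookup, then reused

    def write(self, oid, data):
        """Store a new object as a loose object."""
        self.loose[oid] = data

    def repack(self):
        """Model an external packer: move all loose objects into a new pack."""
        self.packs_on_disk.append(dict(self.loose))
        self.loose.clear()

    def rescan_packs(self):
        """Refresh the cached pack list from what is on disk now."""
        self._known_packs = list(self.packs_on_disk)

    def has(self, oid):
        """Look in loose objects first, then in the cached pack list."""
        if oid in self.loose:
            return True
        if self._known_packs is None:
            self.rescan_packs()
        return any(oid in pack for pack in self._known_packs)


store = ToyObjectStore()
store.has("probe")             # first lookup snapshots the (empty) pack list
store.write("d01416", b"tree")
store.repack()                 # object moves into a pack behind the cache's back
print(store.has("d01416"))     # False: the stale snapshot does not list the new pack
store.rescan_packs()
print(store.has("d01416"))     # True: the refreshed snapshot sees the pack
```

In this model a fresh process would also succeed, because its first lookup happens after the pack already exists, which matches the observation that a second parsecvs run completes.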
> Wouldn't it be better to teach parsecvs
> to speak git-fast-import language instead?
Avoiding fork/exec is rather important for parsecvs performance.
--
keith.packard@intel.com
* Re: parsecvs fails even on simple input
2007-06-24 21:31 ` Keith Packard
@ 2007-06-25 4:59 ` Shawn O. Pearce
2007-06-27 15:33 ` Roman Kagan
1 sibling, 0 replies; 5+ messages in thread
From: Shawn O. Pearce @ 2007-06-25 4:59 UTC
To: Keith Packard; +Cc: Roman Kagan, Al Viro, git
Keith Packard <keithp@keithp.com> wrote:
> On Fri, 2007-06-22 at 15:36 +0400, Roman Kagan wrote:
> > Wouldn't it be better to teach parsecvs
> > to speak git-fast-import language instead?
>
> Avoiding fork/exec is rather important for parsecvs performance.
That sort of thing is the entire point behind fast-import. It's only
one fork+exec to set up the fast-import "daemon" in the background,
and you do everything over a pipe to its stdin, including forcing
it to finish its current packfile and open a new one on the next
object (the `checkpoint` command).
fast-import is fast, its input language is fairly simple, and it's
quite stable. And it's only one fork+exec. That's peanuts compared
to the disk IO involved in any sizable import process.
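As a rough illustration of what such a stream might look like for the one-file reproducer from the first message, here is a minimal Python sketch. The helper names (`data`, `build_stream`, `run_fast_import`) and the e-mail address are made up for the example (CVS only records the user name); the commands emitted (`blob`, `mark`, `commit`, `committer`, `data`, `M`, `checkpoint`) come from the fast-import input format.

```python
import calendar
import subprocess


def data(payload: bytes) -> bytes:
    """Encode a fast-import 'data' command with an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def build_stream(path, content, author, email, log, when_utc,
                 ref="refs/heads/master"):
    """Build a stream that imports one file as a single root commit.

    when_utc is a (year, month, day, hour, minute, second) tuple in UTC.
    """
    timestamp = calendar.timegm(when_utc)
    ident = f"{author} <{email}> {timestamp} +0000".encode()
    return b"".join([
        b"blob\n",
        b"mark :1\n",
        data(content),
        b"commit " + ref.encode() + b"\n",
        b"mark :2\n",
        b"committer " + ident + b"\n",
        data(log),
        b"M 100644 :1 " + path.encode() + b"\n",
        b"\n",
        # Ask fast-import to finish the current packfile and start a new one.
        b"checkpoint\n",
    ])


def run_fast_import(stream: bytes, repo_dir: str) -> None:
    """Feed the stream to a single `git fast-import` process over stdin."""
    subprocess.run(["git", "fast-import"], input=stream, cwd=repo_dir,
                   check=True)


# Values taken from the attached a,v; the e-mail address is an assumption.
stream = build_stream(
    path="a",
    content=b"this is test\n",
    author="tstuser",
    email="tstuser@example.invalid",
    log=b"test commit\n",
    when_utc=(2007, 6, 21, 12, 11, 32),
)
print(stream.decode())
```

`run_fast_import` is only defined here, not called, since it needs an initialized repository. A real importer would more likely keep one long-lived process open and write to its stdin as it goes, rather than buffer the whole stream in memory.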
--
Shawn.
* Re: parsecvs fails even on simple input
2007-06-24 21:31 ` Keith Packard
2007-06-25 4:59 ` Shawn O. Pearce
@ 2007-06-27 15:33 ` Roman Kagan
2007-06-27 17:20 ` Keith Packard
1 sibling, 1 reply; 5+ messages in thread
From: Roman Kagan @ 2007-06-27 15:33 UTC
To: Keith Packard; +Cc: Al Viro, git
On Sun, Jun 24, 2007 at 10:31:07PM +0100, Keith Packard wrote:
> On Fri, 2007-06-22 at 15:36 +0400, Roman Kagan wrote:
>
> > The problem is the following: after that commit parsecvs tries to add
> > objects to the git tree on its own via calls to libgit; however, in
> > between it runs git-pack-objects. Thus objects move to pack files
> > without libgit being aware of it; this results in 'invalid object'
> > errors.
>
> Sticking a call to reprepare_packed_git() after the pack creation fixes
> this nicely.
Ehm, sort of... except that I wouldn't call that extern declaration
nice.
I'm now tracking down another problem which I didn't see before:
parsecvs apparently doesn't close .git-cvs/log-XXX files and ends up
exhausting the open file descriptor limit. I'll update when I have more
info.
> > Wouldn't it be better to teach parsecvs
> > to speak git-fast-import language instead?
>
> Avoiding fork/exec is rather important for parsecvs performance.
Avoiding _one_ fork/exec is certainly not.
OTOH git-fast-import seems to be essentially the public API for the
parsecvs kind of tasks. It may be wiser from the maintenance POV to use
that instead of direct libgit calls (unless parsecvs is going to land in
the git tree). I'll try to find the time and take a look at this
sometime next week.
Roman.
* Re: parsecvs fails even on simple input
2007-06-27 15:33 ` Roman Kagan
@ 2007-06-27 17:20 ` Keith Packard
0 siblings, 0 replies; 5+ messages in thread
From: Keith Packard @ 2007-06-27 17:20 UTC
To: Roman Kagan; +Cc: keithp, Al Viro, git
[-- Attachment #1: Type: text/plain, Size: 565 bytes --]
On Wed, 2007-06-27 at 19:33 +0400, Roman Kagan wrote:
> OTOH git-fast-import seems to be essentially the public API for the
> parsecvs kind of tasks. It may be wiser from the maintenance POV to use
> that instead of direct libgit calls (unless parsecvs is going to land in
> the git tree). I'll try to find the time and take a look at this
> sometime next week.
Yeah, I didn't quite understand how git-fast-import worked. Looks like
it aligns with parsecvs's structure fairly well. Let me know if you get
it working.
--
keith.packard@intel.com