* git-cvsimport fuzzy commit log matching?
@ 2008-12-23 11:03 Christoph Hellwig
  2008-12-23 11:06 ` Pierre Habouzit
  2008-12-23 12:53 ` Martin Langhoff
  0 siblings, 2 replies; 4+ messages in thread
From: Christoph Hellwig @ 2008-12-23 11:03 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git

I'm currently trying to get clean git imports of the XFS userspace
repositories.  These are funky in that they were initially kept in
ptools, an SGI-internal SCM built on top of RCS with changesets
added on top.  So we know that commits actually were done in atomic
changesets.  But ptools has the "nice" feature of allowing both per-file
and per-changeset commits.  Due to the per-file commits, git-cvsimport
often misdetects a single changeset as multiple individual changes, e.g.:


commit 0d47d43b5878c6e7d7b516a793a82f0076d22089
Author: Barry Naujok <bnaujok@sgi.com>
Date:   Mon Jul 16 15:52:53 2007 +0000

    Perform parallel processing based on AG stride/concat unit
    Merge of master-melb:xfs-cmds:29143a by kenmcd.

      Queue up AGs per thread based on ag stride

commit 1fa4685db126fd3071e008a6d18f9d51209ab305
Author: Barry Naujok <bnaujok@sgi.com>
Date:   Mon Jul 16 15:52:53 2007 +0000

    Perform parallel processing based on AG stride/concat unit
    Merge of master-melb:xfs-cmds:29143a by kenmcd.

      Handle ag stride command line option and setup threads as required

commit a73288784e77c2411687f6778adb4c0b0f9dcdff
Author: Barry Naujok <bnaujok@sgi.com>
Date:   Mon Jul 16 15:52:53 2007 +0000

    Perform parallel processing based on AG stride/concat unit
    Merge of master-melb:xfs-cmds:29143a by kenmcd.

      Execute bits changed from x-- to ---
      Queue up AGs per thread based on ag stride

and so on.

Any idea how to tell git-cvsimport that if we have exactly the same
timestamp, and maybe the same author it really is the same changeset and
we want to merge the commit message?


* Re: git-cvsimport fuzzy commit log matching?
  2008-12-23 11:03 git-cvsimport fuzzy commit log matching? Christoph Hellwig
@ 2008-12-23 11:06 ` Pierre Habouzit
  2008-12-23 12:53 ` Martin Langhoff
  1 sibling, 0 replies; 4+ messages in thread
From: Pierre Habouzit @ 2008-12-23 11:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matthias Urlichs, git

On Tue, Dec 23, 2008 at 11:03:02AM +0000, Christoph Hellwig wrote:
> I'm currently trying to get clean git imports of the XFS userspace
> repositories.  These are funky in that they were initially kept in
> ptools, an SGI-internal SCM built on top of RCS with changesets
> added on top.  So we know that commits actually were done in atomic
> changesets.  But ptools has the "nice" feature of allowing both per-file
> and per-changeset commits.  Due to the per-file commits, git-cvsimport
> often misdetects a single changeset as multiple individual changes, e.g.:
> 
> 
> commit 0d47d43b5878c6e7d7b516a793a82f0076d22089
> Author: Barry Naujok <bnaujok@sgi.com>
> Date:   Mon Jul 16 15:52:53 2007 +0000
> 
>     Perform parallel processing based on AG stride/concat unit
>     Merge of master-melb:xfs-cmds:29143a by kenmcd.
> 
>       Queue up AGs per thread based on ag stride
> 
> commit 1fa4685db126fd3071e008a6d18f9d51209ab305
> Author: Barry Naujok <bnaujok@sgi.com>
> Date:   Mon Jul 16 15:52:53 2007 +0000
> 
>     Perform parallel processing based on AG stride/concat unit
>     Merge of master-melb:xfs-cmds:29143a by kenmcd.
> 
>       Handle ag stride command line option and setup threads as required
> 
> commit a73288784e77c2411687f6778adb4c0b0f9dcdff
> Author: Barry Naujok <bnaujok@sgi.com>
> Date:   Mon Jul 16 15:52:53 2007 +0000
> 
>     Perform parallel processing based on AG stride/concat unit
>     Merge of master-melb:xfs-cmds:29143a by kenmcd.
> 
>       Execute bits changed from x-- to ---
>       Queue up AGs per thread based on ag stride
> 
> and so on.
> 
> Any idea how to tell git-cvsimport that if we have exactly the same
> timestamp, and maybe the same author it really is the same changeset and
> we want to merge the commit message?

Why not use a fancy git-filter-branch script to squash them together
instead?  It's probably less work than trying to modify your CVS
importer to work exactly the way you want.
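
As a rough illustration of the first half of such a script, the sketch
below only finds the commits that would need squashing; it does not rewrite
any history. It assumes a linear history and input lines produced by
`git log --reverse --format='%H %at %ae'` (commit hash, author timestamp,
author email). The function name `squash_runs` is made up for this example.

```python
from itertools import groupby


def squash_runs(log_lines):
    """Return runs of consecutive commits sharing author email and timestamp.

    Each input line is expected to look like '<sha> <unix-time> <email>',
    oldest commit first. Only runs longer than one commit are returned.
    """
    records = [line.split() for line in log_lines if line.strip()]
    runs = []
    # groupby only groups *adjacent* records, which is what we want here:
    # commits are squash candidates only if nothing else sits between them.
    for _, run in groupby(records, key=lambda r: (r[1], r[2])):
        shas = [r[0] for r in run]
        if len(shas) > 1:
            runs.append(shas)
    return runs
```

Each returned run lists the hashes that the actual rewrite step (filter-branch
or an interactive rebase) would fold into one commit.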

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git-cvsimport fuzzy commit log matching?
  2008-12-23 11:03 git-cvsimport fuzzy commit log matching? Christoph Hellwig
  2008-12-23 11:06 ` Pierre Habouzit
@ 2008-12-23 12:53 ` Martin Langhoff
  2008-12-23 15:16   ` Christoph Hellwig
  1 sibling, 1 reply; 4+ messages in thread
From: Martin Langhoff @ 2008-12-23 12:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matthias Urlichs, git

On Tue, Dec 23, 2008 at 9:03 AM, Christoph Hellwig <hch@lst.de> wrote:
> Any idea how to tell git-cvsimport that if we have exactly the same
> timestamp, and maybe the same author it really is the same changeset and
> we want to merge the commit message?

Right now, cvsimport relies on cvsps for this. cvsps compares author,
timestamp (with a fuzz factor, because CVS commits over slow networks or
hosts can span minutes; you could dial it down to 0 with the -z flag)
*and* commit message.
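
To make the matching rule concrete, here is a minimal sketch of that kind of
grouping. It is not cvsps's actual code: the `FileCommit` record, the
function name, and the 300-second window are assumptions made for
illustration. It shows why per-file log messages that differ, as in the
ptools history, end up as separate changesets even with identical author
and timestamp.

```python
from dataclasses import dataclass


@dataclass
class FileCommit:
    author: str
    timestamp: int  # seconds since the epoch
    message: str
    path: str


def group_changesets(commits, fuzz=300):
    """Group per-file commits into changesets.

    A commit joins an existing changeset only if author AND message match
    and it is at most `fuzz` seconds after that changeset's latest commit.
    """
    changesets = []
    latest = {}  # (author, message) -> most recent changeset for that key
    for c in sorted(commits, key=lambda c: c.timestamp):
        key = (c.author, c.message)
        current = latest.get(key)
        if current is not None and c.timestamp - current[-1].timestamp <= fuzz:
            current.append(c)
        else:
            current = [c]
            changesets.append(current)
            latest[key] = current
    return changesets
```

With this rule, two commits by the same author at the same second but with
different log text never share a changeset, no matter how small the fuzz is.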

What you could do is

 1 - run cvsps with export to a file (I've posted on this list how to
run it exactly as cvsimport does)
 2 - post-process cvsps output with perl (there's a parser already in
cvsimport ;-) )
 3 - run cvsimport with the post-processed file

Or postprocess the imported git tree as others have suggested.
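
The merge in step 2 could look roughly like the sketch below. It is written
in Python rather than perl, and it works on already-parsed patchsets; the
dict layout (`author`, `date`, `log` as a list of lines, `members` as a list
of file entries) is an assumption for this example, not the cvsps file
format. Parsing the cvsps output and writing it back are left out.

```python
def merge_same_stamp(patchsets):
    """Merge patchsets that share author and timestamp.

    Log lines are combined in order of first appearance, so the common
    header lines appear once and the per-file lines are appended.
    """
    merged = {}  # (author, date) -> combined patchset, in first-seen order
    for ps in patchsets:
        key = (ps["author"], ps["date"])
        if key not in merged:
            merged[key] = {
                "author": ps["author"],
                "date": ps["date"],
                "log": list(ps["log"]),
                "members": list(ps["members"]),
            }
            continue
        target = merged[key]
        for line in ps["log"]:
            if line not in target["log"]:
                target["log"].append(line)
        target["members"].extend(ps["members"])
    return list(merged.values())
```

Keying on author and timestamp only, and ignoring the log text, is the
behaviour asked for in the original question; it would be wrong for a
repository where one author really did make distinct commits in the same
second.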

hth,

martin
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


* Re: git-cvsimport fuzzy commit log matching?
  2008-12-23 12:53 ` Martin Langhoff
@ 2008-12-23 15:16   ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2008-12-23 15:16 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Matthias Urlichs, git

On Tue, Dec 23, 2008 at 10:53:42AM -0200, Martin Langhoff wrote:
> What you could do is
> 
>  1 - run cvsps with export to a file (I've posted on this list how to
> run it exactly as cvsimport does)
>  2 - post-process cvsps output with perl (there's a parser already in
> cvsimport ;-) )
>  3 - run cvsimport with the post-processed file
> 
> Or postprocess the imported git tree as others have suggested.

Instead of post-processing I hacked cvsps.  It already has a different
way to detect changesets when running in --bkcvs mode, and re-using that
one for ptools works great.
