From: Michael Haggerty
Subject: Questions about git-fast-import for cvs2svn
Date: Sun, 15 Jul 2007 16:11:41 +0200
Message-ID: <469A2B1D.2040107@alum.mit.edu>
To: "Shawn O. Pearce"
Cc: git@vger.kernel.org

I've been reading the documentation for git-fast-import (thanks for the
fine documentation!) as part of determining how much work it would be to
add a git back end to cvs2svn, and I have a few questions.

1. Is it a problem to create blobs that are never referenced?
   The easiest point to create blobs is when the RCS files are
   originally parsed, but later we discard some CVS revisions, meaning
   that the corresponding blobs would never be needed. Would this be a
   problem?

2. It appears that author/committer require an email address. How
   important is a valid email address here?

   a. CVS commits include a username but not an email address. If an
      email address is really required, then I suppose the person doing
      the conversion would have to supply a lookup table mapping
      username -> email address.

   b. CVS tag/branch creation events do not even include a username.
      Any suggestions for what to use here?

3. I expect we should set 'committer' to the value determined from CVS
   and leave 'author' unused. But I suppose another possibility would
   be to set the 'committer' to 'cvs2svn' and the 'author' to the
   original CVS author. Which one makes sense?

4. It appears that a commit can only have a single 'from', which I
   suppose means that files can only be added to one branch from a
   single source branch/revision in a single commit. But CVS branches
   and tags can include files from multiple source branches and/or
   revisions. What would be the most git-like way to handle this
   situation? Should the branch be created in one commit, then have
   files from other sources added to it in other commits? Or should
   (is this even possible?) all files be added to the branch in a
   single commit, using multiple "merge" sources?

5. Is there any significance at all to the order in which commits are
   output to git-fast-import? Obviously, blobs have to be defined
   before they are used, and marks have to be defined before they are
   referenced. But is there any other significance to the order of
   commits?

All in all, I don't think that a git back end for cvs2svn would be very
tricky at all.
There will be a bit of refactoring work to allow the user to switch
between SVN/git output at runtime, but so far I don't see any reason
that the fundamental algorithms of cvs2svn will have to be changed.

Thanks,
Michael
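
To make questions 2a and 3 concrete, here is a rough sketch of how a
lookup table could feed the 'committer' line of a fast-import commit.
It is only an illustration under stated assumptions: the AUTHORS table,
the ident() and commit_record() helpers, and the fallback address for
unknown users are all hypothetical, and every timestamp is assumed to
be UTC. Only 'committer' is set, which is the first option in
question 3.

```python
# Hypothetical username -> (full name, email) table that the person
# running the conversion would supply.
AUTHORS = {
    "jrandom": ("J. Random Hacker", "jrandom@example.com"),
}


def ident(username, timestamp, authors=AUTHORS):
    """Build the '<name> <email> <when>' part of a committer line.

    Unknown usernames fall back to a placeholder address; that fallback
    is an assumption of this sketch, not something git prescribes.
    """
    name, email = authors.get(
        username, (username, username + "@example.invalid"))
    # Raw date format: seconds since the epoch plus a UTC offset.
    return "%s <%s> %d +0000" % (name, email, timestamp)


def commit_record(ref, username, timestamp, message, mark):
    """Return the header and log message of one commit as bytes."""
    msg = message.encode("utf-8")
    header = [
        "commit %s" % ref,
        "mark :%d" % mark,
        "committer %s" % ident(username, timestamp),
        # The count after 'data' is the message length in bytes.
        "data %d" % len(msg),
    ]
    return ("\n".join(header) + "\n").encode("utf-8") + msg + b"\n"
```

A real back end would follow each record with the 'from' line and the
file changes; those are left out here because they depend on the
answers to questions 4 and 5.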