From mboxrd@z Thu Jan 1 00:00:00 1970
From: Uri Moszkowicz
Subject: Re: error: git-fast-import died of signal 11
Date: Wed, 17 Oct 2012 15:06:10 -0500
References: <507D0A53.1030707@alum.mit.edu>
In-Reply-To: <507D0A53.1030707@alum.mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
To: Michael Haggerty
Cc: git@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

Hi Michael,

Looks like the changes to limit solved the problem. I didn't verify
whether it was the stacksize or the descriptors limit, but it was one
of those. Final repository size was 14GB from a 328GB dump file.

Thanks,
Uri

On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty wrote:
> On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
>> I'm trying to convert a CVS repository to Git using cvs2git. I was able to
>> generate the dump file without problem but am unable to get Git to
>> fast-import it. The dump file is 328GB and I ran git fast-import on a
>> machine with 512GB of RAM.
>>
>> fatal: Out of memory? mmap failed: Cannot allocate memory
>> fast-import: dumping crash report to fast_import_crash_18192
>> error: git-fast-import died of signal 11
>>
>> How can I import the repository?
>
> What versions of git and of cvs2git are you using? If not the current
> versions, please try with the current versions.
>
> What is the nature of your repository (i.e., why is it so big)? Does it
> consist of extremely large files? A very deep history? Extremely many
> branches/tags? Extremely many files?
>
> Did you check whether the RAM usage of the git-fast-import process was
> growing gradually to fill RAM while it was running vs. whether the usage
> seemed reasonable until it suddenly crashed?
>
> There are a few obvious possibilities:
>
> 0. There is some reason that too little of your computer's RAM is
> available to git-fast-import (e.g., ulimit, other processes running at
> the same time, much RAM being used as a ramdisk, etc).
>
> 1. Your import is simply too big for git-fast-import to hold in memory
> the accumulated things that it has to remember. I'm not familiar with
> the internals of git-fast-import, but I believe that the main thing that
> it has to keep in RAM is the list of "marks" (references to git objects
> that can be referred to later in the import). From your crash file, it
> looks like there were about 350k marks loaded at the time of the crash.
> Supposing each mark is about 100 bytes, this would only amount to 35
> MB, which should not be a problem (*if* my assumptions are correct).
>
> 2. Your import contains a gigantic object which individually is so big
> that it overflows some component of the import. (I don't know whether
> large objects are handled streamily; they might be read into memory at
> some point.) But since your computer had so much RAM this is hardly
> imaginable.
>
> 3. git-fast-import has a memory leak and the accumulated memory leakage
> is exhausting your RAM.
>
> 4. git-fast-import has some other kind of a bug.
>
> 5. The contents of the dumpfile are corrupt in a way that is triggering
> the problem. This could either be invalid input (e.g., an object that
> is reported to be quaggabytes large), or some invalid input that
> triggers a bug in git-fast-import.
>
> If (1), then you either need a bigger machine or git-fast-import needs
> architectural changes.
>
> If (2), then you either need a bigger machine or git-fast-import and/or
> git needs architectural changes.
>
> If (3), then it would be good to get more information about the problem
> so that the leak can be fixed. If this is the case, it might be
> possible to work around the problem by splitting the dumpfile into
> several parts and loading them one after the other (outputting the marks
> from one run and loading them into the next).
>
> If (4) or (5), then it would be helpful to narrow down the problem. It
> might be possible to do so by following the instructions in the cvs2svn
> FAQ [1] for systematically shrinking a test case to smaller size using
> destroy_repository.py and shrink_test_case.py. If you can create a
> small repository that triggers the same problem, then there is a good
> chance that it is easy to fix.
>
> Michael
> (the cvs2git maintainer)
>
> [1] http://cvs2svn.tigris.org/faq.html#testcase
>
> --
> Michael Haggerty
> mhagger@alum.mit.edu
> http://softwareswirl.blogspot.com/
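
For anyone hitting the same crash: possibility (0) above, a per-process limit, can be checked before starting a long import. The following is only a minimal sketch, not something taken from this thread. It assumes a Unix system with Python available, and the helper names describe_limit and report are made up for illustration. It prints the soft and hard values of the stack size and open file descriptor limits, the two limits mentioned in the reply.

```python
import resource  # Unix-only standard library module

# Hypothetical selection: the two limits mentioned in the reply.
LIMITS = {
    "stack size (bytes)": resource.RLIMIT_STACK,
    "open file descriptors": resource.RLIMIT_NOFILE,
}


def describe_limit(name, rlimit):
    """Return a one-line summary of the soft and hard values of one limit."""
    soft, hard = resource.getrlimit(rlimit)

    def fmt(value):
        # RLIM_INFINITY is the sentinel the OS uses for "no limit".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


def report():
    """Summarize every limit listed in LIMITS, in insertion order."""
    return [describe_limit(name, rlimit) for name, rlimit in LIMITS.items()]


if __name__ == "__main__":
    for line in report():
        print(line)
```

The values shown are those inherited by the Python process from the shell that launched it, so the script should be run from the same shell session that will run git fast-import.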