From: Uri Moszkowicz <uri@4refs.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org
Subject: Re: error: git-fast-import died of signal 11
Date: Wed, 17 Oct 2012 15:06:10 -0500
Message-ID: <CAMJd5AS92QtPyJ0sFWSQYtd-SSoZLHtR_NidRTBuNzaxv9t94A@mail.gmail.com>
In-Reply-To: <507D0A53.1030707@alum.mit.edu>

Hi Michael,

Looks like the changes to limit solved the problem. I didn't verify
whether it was the stacksize or the descriptors, but it was one of
those. Final repository size was 14GB from a 328GB dump file.

Thanks,
Uri
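
For anyone hitting the same crash, here is a minimal sketch of how the two limits mentioned above (stack size and open file descriptors) could be inspected before starting the import. It uses Python's standard `resource` module. The helper names `fmt`, `current_limits`, and `raise_soft_to_hard` are illustrative, and including the address-space limit is an assumption on the grounds that a failing mmap can be caused by it. The thread does not say which values were actually changed.

```python
import resource

# Limits named in the message: stack size and open file descriptors.
# RLIMIT_AS (address space) is an added assumption, not from the thread.
LIMITS = {
    "stacksize": resource.RLIMIT_STACK,
    "descriptors": resource.RLIMIT_NOFILE,
    "address space": resource.RLIMIT_AS,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the limits listed above."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


def raise_soft_to_hard(res):
    """Raise the soft limit to the hard limit for this process and its
    children; returns the new (soft, hard) pair. May fail on some systems."""
    _, hard = resource.getrlimit(res)
    resource.setrlimit(res, (hard, hard))
    return resource.getrlimit(res)


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```

Limits set this way apply only to the Python process and anything it launches, so git fast-import would have to be started from the same process (or the limits raised in the shell instead).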
On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
>> I'm trying to convert a CVS repository to Git using cvs2git. I was able to
>> generate the dump file without problem but am unable to get Git to
>> fast-import it. The dump file is 328GB and I ran git fast-import on a
>> machine with 512GB of RAM.
>>
>> fatal: Out of memory? mmap failed: Cannot allocate memory
>> fast-import: dumping crash report to fast_import_crash_18192
>> error: git-fast-import died of signal 11
>>
>> How can I import the repository?
>
> What versions of git and of cvs2git are you using? If not the current
> versions, please try with the current versions.
>
> What is the nature of your repository (i.e., why is it so big)? Does it
> consist of extremely large files? A very deep history? Extremely many
> branches/tags? Extremely many files?
>
> Did you check whether the RAM usage of git-fast-import process was
> growing gradually to fill RAM while it was running vs. whether the usage
> seemed reasonable until it suddenly crashed?
>
> There are a few obvious possibilities:
>
> 0. There is some reason that too little of your computer's RAM is
> available to git-fast-import (e.g., ulimit, other processes running at
> the same time, much RAM being used as a ramdisk, etc).
>
> 1. Your import is simply too big for git-fast-import to hold in memory
> the accumulated things that it has to remember. I'm not familiar with
> the internals of git-fast-import, but I believe that the main thing that
> it has to keep in RAM is the list of "marks" (references to git objects
> that can be referred to later in the import). From your crash file, it
> looks like there were about 350k marks loaded at the time of the crash.
> Supposing each mark is about 100 bytes, this would only amount to 35
> MB, which should not be a problem (*if* my assumptions are correct).
>
> 2. Your import contains a gigantic object which individually is so big
> that it overflows some component of the import. (I don't know whether
> large objects are handled streamily; they might be read into memory at
> some point.) But since your computer had so much RAM this is hardly
> imaginable.
>
> 3. git-fast-import has a memory leak and the accumulated memory leakage
> is exhausting your RAM.
>
> 4. git-fast-import has some other kind of a bug.
>
> 5. The contents of the dumpfile are corrupt in a way that is triggering
> the problem. This could either be invalid input (e.g., an object that
> is reported to be quaggabytes large), or some invalid input that
> triggers a bug in git-fast-import.
>
> If (1), then you either need a bigger machine or git-fast-import needs
> architectural changes.
>
> If (2), then you either need a bigger machine or git-fast-import and/or
> git needs architectural changes.
>
> If (3), then it would be good to get more information about the problem
> so that the leak can be fixed. If this is the case, it might be
> possible to work around the problem by splitting the dumpfile into
> several parts and loading them one after the other (outputting the marks
> from one run and loading them into the next).
>
> If (4) or (5), then it would be helpful to narrow down the problem. It
> might be possible to do so by following the instructions in the cvs2svn
> FAQ [1] for systematically shrinking a test case to smaller size using
> destroy_repository.py and shrink_test_case.py. If you can create a
> small repository that triggers the same problem, then there is a good
> chance that it is easy to fix.
>
> Michael
> (the cvs2git maintainer)
>
> [1] http://cvs2svn.tigris.org/faq.html#testcase
>
> --
> Michael Haggerty
> mhagger@alum.mit.edu
> http://softwareswirl.blogspot.com/
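
The workaround suggested under (3), loading the dump in several parts and carrying the marks from one run into the next, could be sketched roughly as follows. This is only an illustration: it assumes the dump has already been split at command boundaries into separate files (the splitting itself is not shown), and the chunk names, the `marks.txt` file name, and the helper functions are hypothetical. The only git options used are `--import-marks` and `--export-marks` of git fast-import.

```python
import subprocess


def fast_import_command(marks_in=None, marks_out=None):
    """Build a git fast-import command line with optional marks files."""
    cmd = ["git", "fast-import"]
    if marks_in:
        cmd.append(f"--import-marks={marks_in}")
    if marks_out:
        cmd.append(f"--export-marks={marks_out}")
    return cmd


def plan_chunked_import(chunks, marks_file="marks.txt"):
    """Return [(chunk_path, command)]: every run exports its marks, and
    every run after the first imports the marks left by the previous one."""
    plan = []
    for i, chunk in enumerate(chunks):
        cmd = fast_import_command(
            marks_in=marks_file if i > 0 else None,
            marks_out=marks_file,
        )
        plan.append((chunk, cmd))
    return plan


def run_plan(plan, repo_dir):
    """Feed each chunk to git fast-import on stdin, stopping on failure."""
    for chunk, cmd in plan:
        with open(chunk, "rb") as stream:
            subprocess.run(cmd, stdin=stream, cwd=repo_dir, check=True)
```

For example, `plan_chunked_import(["part1.dat", "part2.dat"])` yields one command without `--import-marks` followed by one that both imports and exports `marks.txt`; `run_plan` would then execute them in order inside an existing repository.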