From: Ivan Kanis <expire-by-2010-08-11@kanis.fr>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>, jaredhance@gmail.com
Cc: Avery Pennarun <apenwarr@gmail.com>,
	jnareb@gmail.com, git <git@vger.kernel.org>
Subject: Re: Git server eats all memory
Date: Fri, 06 Aug 2010 19:23:17 +0200
Message-ID: <87hbj74pve.fsf@kanis.fr>
In-Reply-To: <AANLkTi=tf51FWkZZFw9cF=pcCyadgp7a9EXK=KQ6GSQS@mail.gmail.com> (Nguyen Thai Ngoc Duy's message of "Fri, 6 Aug 2010 11:51:33 +1000")

Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:

> Naah, git pack-objects needs list of commit tips. Try
> git for-each-ref|cut -c 1-40|git pack-objects --all --stdout > /dev/null

Jared Hance <jaredhance@gmail.com> wrote:

> I would look in the code for malloc calls that don't have a free call,
> or spots where free calls might not be hit.

Hello Jared and Nguyen,

Thank you, Nguyen, for your command. I can now reproduce the problem
without needing the network. I have been following Jared's lead today on
a potential memory leak. Here is what I found out.

I downloaded the latest release of git, 1.7.2.1, and compiled it with
debugging support. I ran valgrind on the command and found two memory
leaks. I put the output at the bottom of the e-mail as it's not very
interesting. I patched one of the leaks, in pack-objects.c, but got the
same problem: over 4G of memory consumption for a 4G repository.

I've come to the conclusion that it's not a memory leak. 

This afternoon I put macros around the following functions: xmalloc,
xmallocz, xrealloc, xcalloc and xmmap. They reported the line of code and
the size passed to each function. I then ran the result through a script
that totaled the amount used by each bit of code.
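
A minimal sketch of such a totaling script, in Python. The record format
"<function> <file>:<line> <size-in-bytes>", the helper names and the sample
values are assumptions for illustration, not the actual macro output.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) record format: "<function> <file>:<line> <size-in-bytes>"
LINE_RE = re.compile(r"^(\w+)\s+(\S+:\d+)\s+(\d+)$")


def total_by_call_site(lines):
    """Sum the requested bytes per (function, file:line) call site."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not an allocation record
        function, site, size = match.groups()
        totals[(function, site)] += int(size)
    return dict(totals)


def top_consumers(totals, count=3):
    """Return the `count` largest call sites, biggest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:count]


if __name__ == "__main__":
    # Made-up sample records, only to show the shape of the input.
    sample = [
        "xmallocz patch-delta.c:36 1024",
        "xmmap sha1_file.c:772 4096",
        "xmallocz patch-delta.c:36 2048",
    ]
    for (function, site), size in top_consumers(total_by_call_site(sample)):
        print(f"{function:10} {site:30} {size} bytes")
```

Note that this adds up the sizes requested over the whole run; it does not
track frees or unmaps, so it shows total requests rather than peak usage.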

Here are the top 3 consumers:

| function | source                     | size in M |
|----------+----------------------------+-----------|
| xrealloc | builtin/pack-objects.c:690 |        86 |
| xmallocz | patch-delta.c:36           |       301 |
| xmmap    | sha1_file.c:772            |      4393 |

I expected the malloc calls to account for the 4G, but was surprised that
they don't. It seems to be mmap that is taking all the memory. I am not
familiar with that function; it looks like it maps a file into memory. Is
it reasonable to mmap so much memory?
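
To make the question concrete, here is a minimal sketch of what mapping a
file looks like, using Python's mmap module rather than git's xmmap wrapper.
The helper name, the temporary file and its size are made up for
illustration. The mapping exposes the file's bytes as memory without an
explicit read call.

```python
import mmap
import os
import tempfile


def map_file_readonly(path):
    """Map a whole file read-only and return the mmap object."""
    with open(path, "rb") as handle:
        # Length 0 maps the entire file; Python's mmap keeps its own
        # duplicate of the descriptor, so the mapping outlives `handle`.
        return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)


if __name__ == "__main__":
    # Made-up stand-in for a pack file: 1 MiB of zero bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (1 << 20))
    mapping = map_file_readonly(tmp.name)
    print(len(mapping))   # size of the mapped region in bytes
    print(mapping[:4])    # slicing reads the file's bytes through the mapping
    mapping.close()
    os.unlink(tmp.name)
```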

Today I chatted with someone on freenode #git, and he reported the same
problem on his 2G repository. I am glad I am not the only one seeing
this ;)

I tried reading the code but it's going over my head. I'll look at it
some more next Monday.

If anyone is familiar with the git source code, I would love to have
some insight into this.

Take care,

Ivan Kanis

PS: output of valgrind --leak-check=full

65 bytes in 1 blocks are definitely lost in loss record 4 of 7
   at 0x4C2260E: malloc (vg_replace_malloc.c:207)
   by 0x4C22797: realloc (vg_replace_malloc.c:429)
   by 0x4C600D: xrealloc (wrapper.c:80)
   by 0x4B7939: strbuf_grow (strbuf.c:70)
   by 0x4B80BA: strbuf_addf (strbuf.c:201)
   by 0x4832EF: system_path (exec_cmd.c:37)
   by 0x483411: setup_path (exec_cmd.c:104)
   by 0x404AF2: main (git.c:536)

512 bytes in 1 blocks are definitely lost in loss record 5 of 8
   at 0x4C203E4: calloc (vg_replace_malloc.c:397)
   by 0x4C5F9D: xcalloc (wrapper.c:96)
   by 0x445741: cmd_pack_objects (pack-objects.c:2117)
   by 0x4048EE: handle_internal_command (git.c:270)
   by 0x404B03: main (git.c:470)
-- 
http://kanis.fr

Everything should be made as simple as possible, but not simpler.
    -- Albert Einstein 
