git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Jeff King <peff@peff.net>
Cc: John Hsing <tsyj2007@gmail.com>,
	Matthieu Moy <matthieu.moy@grenoble-inp.fr>,
	git@vger.kernel.org
Subject: Re: a bug when execute "git status" in git version 1.7.7.431.g89633
Date: Sun, 23 Oct 2011 19:50:11 +0200	[thread overview]
Message-ID: <4EA453D3.7080002@lsrfire.ath.cx> (raw)
In-Reply-To: <20111023162944.GB28156@sigill.intra.peff.net>

Am 23.10.2011 18:29, schrieb Jeff King:
> On Sun, Oct 23, 2011 at 03:25:17PM +0200, René Scharfe wrote:
> 
>> I can reproduce the malloc crash on Ubuntu 11.10 with these simple steps:
>> [...]
>> Bisect points to 2548183ba, "fix phantom untracked files when
>> core.ignorecase is set" from Jeff (cc:d).  If I revert that patch from
>> master (8963314c), git status works fine.
> 
> Hmm. Interesting. I can't reproduce here. And I've been running with
> this patch for over a year, and never seen that. Given your fix, I guess
> it's related to pointer size. Are you on a 32-bit machine, by any
> chance?

Yes, it's a 32-bit VM.  I think it's a case of unlucky filename lengths,
combined with the rounding up to the next multiple of 8.  The following
table lists the actual size needed for entries based on the length of
their name entry.  Length calculation uses these offsets:

	offsetof(struct cache_entry, name) == 72
	offsetof(struct ondisk_cache_entry, name) == 62

	len  ce_size                 ondisk_ce_size          delta
	  1  (72 + 1 + 8) & ~7 = 80  (62 + 1 + 8) & ~7 = 64     16
	  2  (72 + 2 + 8) & ~7 = 80  (62 + 2 + 8) & ~7 = 72      8
	  3  (72 + 3 + 8) & ~7 = 80  (62 + 3 + 8) & ~7 = 72      8
	  4  (72 + 4 + 8) & ~7 = 80  (62 + 4 + 8) & ~7 = 72      8
	  5  (72 + 5 + 8) & ~7 = 80  (62 + 5 + 8) & ~7 = 72      8
	  6  (72 + 6 + 8) & ~7 = 80  (62 + 6 + 8) & ~7 = 72      8
	  7  (72 + 7 + 8) & ~7 = 80  (62 + 7 + 8) & ~7 = 72      8
	  8  (72 + 8 + 8) & ~7 = 88  (62 + 8 + 8) & ~7 = 72     16

So in 25% of the cases an entry needs 16 bytes more in memory than on
disk and the rest needs 8 bytes more.

estimate_cache_size() calculates the amount of memory needed for the
index, based on its on-disk representation.  It simply adds the
difference of the sizes of the two structs and the size of a pointer for
each entry to its total size and returns that number.  I have:

	sizeof(void *) == 4
	sizeof(struct cache_entry) == 72
	sizeof(struct ondisk_cache_entry) == 64

So each entry gets 72 - 64 + 4 = 12 bytes extra.  If you happen to have
a lot of filenames with a delta of 16 then the resulting size won't be
enough to hold the in-memory index.

Is there a nice way to derive that we need 16 bytes per entry in the
worst case, preferably without trying all eight possibilities as I did
in the table above?  My modular math is rusty..

René

  reply	other threads:[~2011-10-23 17:50 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-22  0:20 a bug when execute "git status" in git version 1.7.7.431.g89633 John Hsing
2011-10-23  8:25 ` Matthieu Moy
2011-10-23  8:35   ` John Hsing
2011-10-23 13:25     ` René Scharfe
2011-10-23 14:28       ` René Scharfe
2011-10-23 16:29       ` Jeff King
2011-10-23 17:50         ` René Scharfe [this message]
2011-10-24  1:01           ` [PATCH] read-cache.c: fix index memory allocation René Scharfe
2011-10-24  7:07             ` Junio C Hamano
2011-10-24 15:59               ` René Scharfe
2011-10-24 21:59               ` René Scharfe
2011-10-24 23:34                 ` Nguyen Thai Ngoc Duy
2011-10-25  0:01                   ` Nguyen Thai Ngoc Duy
2011-10-25 18:00                     ` René Scharfe
2011-10-25 16:24                 ` Junio C Hamano
2011-10-24  7:28             ` Junio C Hamano
2011-10-24 15:52               ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EA453D3.7080002@lsrfire.ath.cx \
    --to=rene.scharfe@lsrfire.ath.cx \
    --cc=git@vger.kernel.org \
    --cc=matthieu.moy@grenoble-inp.fr \
    --cc=peff@peff.net \
    --cc=tsyj2007@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).