git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Rast <trast@student.ethz.ch>
To: Jeff King <peff@peff.net>
Cc: "Stefan Zager" <szager@google.com>,
	git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: regression in multi-threaded git-pack-index
Date: Tue, 19 Mar 2013 09:17:32 +0100	[thread overview]
Message-ID: <87fvzrajmr.fsf@pctrast.inf.ethz.ch> (raw)
In-Reply-To: <20130316114118.GA1940@sigill.intra.peff.net> (Jeff King's message of "Sat, 16 Mar 2013 07:41:19 -0400")

Jeff King <peff@peff.net> writes:

> On Fri, Mar 15, 2013 at 03:42:40PM -0700, Stefan Zager wrote:
>
>> We have uncovered a regression in this commit:
>> 
>> b8a2486f1524947f232f657e9f2ebf44e3e7a243
>> 
>> The symptom is that 'git fetch' dies with:
>> 
>> error: index-pack died of signal 10
>> fatal: index-pack failed
>> 
>> I have only been able to reproduce it on a Mac thus far; will try
>> ubuntu next.  We can make it go away by running:
>> 
>> git config pack.threads 1
>
> I couldn't reproduce the problem on Linux with the instructions you
> gave. I did try running it under valgrind and it produced:
>
>   ==2380== Conditional jump or move depends on uninitialised value(s)
>   ==2380==    at 0x441631: resolve_delta (index-pack.c:837)
>   ==2380==    by 0x4419D6: find_unresolved_deltas_1 (index-pack.c:898)
>   ==2380==    by 0x441A45: find_unresolved_deltas (index-pack.c:914)
>   ==2380==    by 0x4427CA: fix_unresolved_deltas (index-pack.c:1232)
>   ==2380==    by 0x4421F5: conclude_pack (index-pack.c:1111)
>   ==2380==    by 0x443A5C: cmd_index_pack (index-pack.c:1604)
>   ==2380==    by 0x4058A2: run_builtin (git.c:281)
>   ==2380==    by 0x405A35: handle_internal_command (git.c:443)
>   ==2380==    by 0x405C01: main (git.c:532)
>
> but the line in question is:
>
>   if (deepest_delta < delta_obj->delta_depth)
>
> And in the debugger, both of those variables appear to have sane values
> (nor should either impacted by the patch you bisected to). On top of
> that, running with pack.threads=1 produces the same error. So I think it
> may be a false positive from valgrind, and unrelated to your issue.

I find that somewhat unlikely, for two reasons: memcheck is actually
quite good at finding uninitialized memory use, it just isn't that good
at distinguishing if it makes a difference.  Most false positives are of
the "loading an entire word and discarding most of it" kind.

Second, the thread-debugging valgrind tools (drd and helgrind) also
complain about exactly this line:

DRD says:

  ==20987== Thread 4:
  ==20987== Conflicting load by thread 4 at 0x007a70d0 size 4
  ==20987==    at 0x43A783: resolve_delta (index-pack.c:837)
  ==20987==    by 0x43A94F: find_unresolved_deltas (index-pack.c:898)
  ==20987==    by 0x43B0F8: threaded_second_pass (index-pack.c:945)
  ==20987==    by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341)
  ==20987==    by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so)
  ==20987== Allocation context: BSS section of /home/thomas/.local/bin/git
  ==20987== Other segment start (thread 2)
  ==20987==    at 0x4C30A1F: pthread_mutex_unlock (drd_pthread_intercepts.c:665)
  ==20987==    by 0x43B06E: threaded_second_pass (index-pack.c:122)
  ==20987==    by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341)
  ==20987==    by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so)
  ==20987== Other segment end (thread 2)
  ==20987==    at 0x5436AE3: ??? (in /lib64/libpthread-2.15.so)
  ==20987==    by 0x439C76: unpack_data.constprop.8 (index-pack.c:528)
  ==20987==    by 0x439EA7: get_base_data (index-pack.c:571)
  ==20987==    by 0x43A7B4: resolve_delta (index-pack.c:841)
  ==20987==    by 0x43A94F: find_unresolved_deltas (index-pack.c:898)
  ==20987==    by 0x43B0F8: threaded_second_pass (index-pack.c:945)
  ==20987==    by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341)
  ==20987==    by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so)


helgrind says:

  ==21160== Possible data race during read of size 4 at 0x7A70D0 by thread #3
  ==21160== Locks held: none
  ==21160==    at 0x43A783: resolve_delta (index-pack.c:837)
  ==21160==    by 0x43A94F: find_unresolved_deltas (index-pack.c:898)
  ==21160==    by 0x43B0F8: threaded_second_pass (index-pack.c:945)
  ==21160==    by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219)
  ==21160==    by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so)
  ==21160== 
  ==21160== This conflicts with a previous write of size 4 by thread #2
  ==21160== Locks held: none
  ==21160==    at 0x43A78E: resolve_delta (index-pack.c:838)
  ==21160==    by 0x43A94F: find_unresolved_deltas (index-pack.c:898)
  ==21160==    by 0x43B0F8: threaded_second_pass (index-pack.c:945)
  ==21160==    by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219)
  ==21160==    by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so)


You were apparently just lucky in catching it before it was even
initialized.

I can reproduce the above warnings with

  valgrind --tool=helgrind --trace-children=yes git index-pack <any_pack>

i.e., it does not seem to depend on the pack (sample size 3, packs
looked at were from my git.git).

Duy, what was the reasoning why resolve_delta() does not need to hold
locks when it looks when it looks at deepest_delta?  My coffee levels
aren't up to this task yet.  It certainly seems extremely dubious to me,
as the code uses the global deepest_delta in threaded sections.  You can
probably argue that the load/store is atomic on most(?) platforms, but
you get no guarantees that deepest_delta at any time in fact holds the
maximum value of delta_obj->delta_depth.


Furthermore there's another warning shown by helgrind:

  ==21160== Possible data race during write of size 4 at 0x7A7060 by thread #2
  ==21160== Locks held: 1, at address 0x7A70E0
  ==21160==    at 0x43A840: resolve_delta (index-pack.c:853)
  ==21160==    by 0x43A94F: find_unresolved_deltas (index-pack.c:898)
  ==21160==    by 0x43B0F8: threaded_second_pass (index-pack.c:945)
  ==21160==    by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219)
  ==21160==    by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so)
  ==21160== 
  ==21160== This conflicts with a previous read of size 4 by thread #3
  ==21160== Locks held: 1, at address 0x7A7080
  ==21160==    at 0x43AFC8: threaded_second_pass (index-pack.c:955)
  ==21160==    by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219)
  ==21160==    by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so)

That one really seems to be a false positive in the sense that
threaded_second_pass() doesn't really care if it gets a bad value for
nr_resolved_deltas.  The only thing that matters is that the counter
increment is done with counter_mutex held.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  parent reply	other threads:[~2013-03-19  8:18 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-15 22:42 regression in multi-threaded git-pack-index Stefan Zager
2013-03-16 11:41 ` Jeff King
2013-03-16 12:38   ` Duy Nguyen
2013-03-19  8:17   ` Thomas Rast [this message]
2013-03-19  9:30     ` Jeff King
2013-03-19  9:59       ` Jeff King
2013-03-19 10:08         ` Jeff King
2013-03-19 10:24           ` Jeff King
2013-03-19 10:29             ` Thomas Rast
2013-03-19 10:33               ` Jeff King
2013-03-19 10:45                 ` Thomas Rast
2013-03-19 10:47                   ` Jeff King
2013-03-19 10:58             ` [PATCH] index-pack: always zero-initialize object_entry list Jeff King
2013-03-19 15:33               ` Thomas Rast
2013-03-19 15:43                 ` Jeff King
2013-03-19 15:52                   ` Jeff King
2013-03-19 16:17                     ` [PATCH v2] " Jeff King
2013-03-19 16:27                       ` Thomas Rast
2013-03-19 17:13                         ` Junio C Hamano
2013-03-20 19:12                       ` Eric Sunshine
2013-03-20 19:13                         ` Jeff King
2013-03-20 19:14                           ` Eric Sunshine
2013-03-19 12:35     ` regression in multi-threaded git-pack-index Duy Nguyen
2013-03-19 13:01       ` [PATCH] index-pack: protect deepest_delta in multithread code Nguyễn Thái Ngọc Duy
2013-03-19 13:25         ` Jeff King
2013-03-19 13:50         ` Thomas Rast
2013-03-19 14:07           ` Duy Nguyen
2013-03-19 14:16             ` [PATCH v2] index-pack: guard nr_resolved_deltas reads by lock Thomas Rast
2013-03-19 15:53               ` Junio C Hamano
2013-03-19 15:41 ` regression in multi-threaded git-pack-index Thomas Rast
2013-03-19 15:45   ` Thomas Rast
2013-03-19 16:11     ` Thomas Rast
2013-03-19 17:58       ` Junio C Hamano
2013-03-19 22:08         ` [PATCH] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-20 16:47           ` Junio C Hamano
2013-03-25  9:27             ` thomas
2013-03-25 18:07               ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-25 23:15                   ` Junio C Hamano
2013-03-26 11:09                     ` thomas
2013-03-25 18:07                 ` [PATCH v2 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-25 23:19                   ` Junio C Hamano
2013-03-26  3:37                 ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Nicolas Pitre
2013-03-25 18:17               ` [PATCH] sha1_file: remove recursion in packed_object_info Junio C Hamano
2013-03-27 20:03               ` [PATCH v3 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-27 20:29                   ` Junio C Hamano
2013-03-20  1:17       ` regression in multi-threaded git-pack-index Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fvzrajmr.fsf@pctrast.inf.ethz.ch \
    --to=trast@student.ethz.ch \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=szager@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).