All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Johan Herland <johan@herland.net>,
	git@vger.kernel.org
Subject: Re: [PATCH 00/12] Fix some reference-related races
Date: Sun, 16 Jun 2013 07:50:42 +0200	[thread overview]
Message-ID: <51BD5232.20205@alum.mit.edu> (raw)
In-Reply-To: <51BCCAD2.9070106@ramsay1.demon.co.uk>

Thanks for all of the information.

On 06/15/2013 10:13 PM, Ramsay Jones wrote:
> Michael Haggerty wrote:
>> *This patch series must be built on top of mh/reflife.*
>>
> 
> [...]
> 
>> The other problem was that the for_each_ref() functions will die if
>> the ref cache that they are iterating over is freed out from under
>> them.  This problem is solved by using reference counts to avoid
>> freeing the old packed ref cache (even if it is no longer valid) until
>> all users are done with it.
> 
> Yes, I found exactly this happened to me on cygwin, earlier this week,
> with the previous version of this code. After seeing this mail, I had
> decided not to describe the failure on the old version, but wait and
> test this version instead.
> 
> This version is a great improvement, but it still has some failures on
> cygwin. So, it may be worth (briefly) describing the old failure anyway!
> Note that several tests failed, but I will only mention t3211-peel-ref.sh
> tests #7-8.
> 
>   $ pwd
>   /home/ramsay/git/t/trash directory.t3211-peel-ref
>   $
> 
>   $ ../../bin-wrappers/git show-ref -d
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
>         5 [main] git 3540 _cygtls::handle_exceptions: Error while dumping state (p
>   robably corrupted stack)
>   Segmentation fault (core dumped)
>   $
> 
> The stack-trace for the faulting code looks something like:
> 
>   cmd_show_ref()
>     for_each_ref(show_ref, NULL)
>       do_for_each_ref(&ref_cache, "", show_ref, 0, 0, NULL)
>         do_for_each_entry(&ref_cache, "", do_one_ref, &data)
>           do_for_each_entry_in_dirs(packed_dir, loose_dir, do_one_ref, &data)
>             *dfeeid() recursive calls*
>               do_one_ref(entry, &data)
>                 show_ref("refs/outside/foo", sha1, NULL) [2nd match]
>                   peel_ref("refs/outside/foo", sha1)
>                     peel_entry(entry, 0)
>                       peel_object(name, sha1)
>                         deref_tag_noverify(o)
>                           parse_object(sha1 <eb0e854c2...>)
>                             lookup_replace_object(sha1)
>                               do_lookup_replace_object(sha1)
>                                 prepare_replace_object()
> 
>   [un-indent here!]
>       for_each_replace_ref(register_replace_ref, NULL)
>         do_for_each_ref(&ref_cache, "refs/replace", fn, 13, 0, NULL)
>           do_for_each_entry(&ref_cache, "refs/replace", fn, &data)
>             get_packed_refs(&ref_cache)
>               clear_packed_ref_cache(&ref_cache) *free_ref_entries etc*
>   ** return to show_ref() [2nd match] above **
>   ** return to recursive dfeeid() call in original iteration
>   ** dir1->entries has been free()-ed and reused => segmentation fault
>   [dir1->entries == 0x64633263 => dc2c => part of sha1 for refs/outside/foo]
> 
> So, the nested "replace-reference-iteration" causes the ref_cache to be
> freed out from under the initial show-ref iteration, so this works:
> 
>   $ GIT_NO_REPLACE_OBJECTS=1 ../../bin-wrappers/git show-ref -d
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/foo^{}
>   $
> 
> You may be wondering why clear_packed_ref_cache() is called? Well, that
> is because stat_validity_check() *incorrectly* indicates that the
> packed-refs file has changed. Why does it do that? Well, this is one
> more example of the problems caused by the cygwin schizophrenic stat()
> functions. :( [ARGHHHHHHHHH]
> 
> At this point, I tried running 'git show-ref' with core.checkstat set
> on the command line; but that didn't work! I had to fix show-ref and
> re-build git, and then, this works:
> 
>   $ ../../bin-wrappers/git -c core.checkstat=minimal -c core.trustctime=f
>   alse show-ref -d
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/foo^{}
>   $

So if I understand correctly, all of the above is *without* the
refcounting changes introduced in mh/ref-races?  Is so, then it is not
surprising, as this is exactly the sort of problem that the reference
counting is meant to solve.

> Now, turning to the new code, t3211-peel-ref.sh test #7 now works, but
> test #8 still fails...
> 
>   $ ./t3211-peel-ref.sh -i -v
> 
>   ...
> 
>   ok 7 - refs are peeled outside of refs/tags (old packed)
> 
>   expecting success:
>           git pack-refs --all &&
>           cp .git/packed-refs fully-peeled &&
>           git branch yadda &&
>           git pack-refs --all &&
>           git branch -d yadda &&
>           test_cmp fully-peeled .git/packed-refs
> 
>   fatal: internal error: packed-ref cache cleared while locked
>   not ok 8 - peeled refs survive deletion of packed ref
>   #
>   #               git pack-refs --all &&
>   #               cp .git/packed-refs fully-peeled &&
>   #               git branch yadda &&
>   #               git pack-refs --all &&
>   #               git branch -d yadda &&
>   #               test_cmp fully-peeled .git/packed-refs
>   #
>   $ cd trash\ directory.t3211-peel-ref/
>   $ ../../bin-wrappers/git pack-refs --all
>   fatal: internal error: packed-ref cache cleared while locked

These "internal error: packed-ref cache cleared while locked" failures
result from an internal consistency check that clear_packed_ref_cache()
is not called while the write lock is held on the packed-refs file.  A
call to c_p_r_c() could result from

* a programming error

* a determination based on the packed-refs file stat that the file needs
to be re-read

Judging from what you said about cygwin, I assume that the latter is
happening.  It should be impossible, because the current process is
holding packed-refs.lock, and therefore other git processes should
refuse to change the packed-refs file.

But if the stat information is not reliable, then the current process
would *think* that the packed-refs file has been changed even though it
hasn't, then it would call c_p_r_c() even though it holds the lock on
packed-refs, and the internal consistency check would fail.

So apparently in these cases cygwin is reporting that the packed-refs
stat information has changed (in the sense defined by the new
stat_validity_check() function, which does essentially the same checks
as the old ce_match_stat_basic() function).

>   $ ls
>   actual  base.t  expect
>   $ ls .git
>   COMMIT_EDITMSG  branches/  description      index  logs/     packed-refs
>   HEAD            config     hooks-disabled/  info/  objects/  refs/
>   $ ls -l .git/packed-refs
>   -rw-r--r-- 1 ramsay None 296 Jun 14 20:34 .git/packed-refs
>   $ cat .git/packed-refs
>   # pack-refs with: peeled
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
>   ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
>   $
> 
> Now, I have a test-stat program which prints the difference between
> the two stat implementations used in cygwin git, thus:
> 
>   $ ../../test-stat .git/packed-refs
>   stat for '.git/packed-refs':
>   *dev:   -1395862925, 0
>   *ino:   166044, 0
>   *mode:  100644 -rw-, 100600 -rw-
>    nlink: 1, 1
>   *uid:   1005, 0
>   *gid:   513, 0
>   *rdev:  -1395862925, 0
>    size:  296, 296
>    atime: 1371238550, 1371238550 Fri Jun 14 20:35:50 2013
>    mtime: 1371238469, 1371238469 Fri Jun 14 20:34:29 2013
>    ctime: 1371238469, 1371238469 Fri Jun 14 20:34:29 2013

Yikes!  ECYGWINFAIL.  I have no doubt they have reason why they cannot
implement this correctly, but this is rather limiting.  (I has assumed
that the #ifdef magic that was already in ce_match_stat_basic() would
have papered over the problems in stat, but I guess that is not the case.)

You say that there are two stat references in cygwin.  Is there a way to
ensure that the same one is used in both cases?  Or is it so hopelessly
broken that there is no point?

Or let me step back and pose a slightly more abstract question:

How, under cygwin, can we implement a quick check of whether a file
might have changed?  The check should not have any false negatives
(claiming that a file is unchanged when actually it was rewritten via
the usual lock_file mechanism) nor should it have any "strong" false
positives (claiming that a file has changed even though it has never
been touched), though a "weak" false positive would be OK (claiming that
a file has changed even though it was replaced by a version with
identical contents).

If such a check is possible, then we should build it into the
implementation of match_stat_data().  If not, we have to think of
another way to implement the checks of packed-refs cache up-to-dateness.

(One horrible hack would be: when in doubt, read the packed-refs file
into a temporary ref_dir, then compare *the contents* to the version in
the cache.  If they are the same, then discard the newly-read version,
update the stat_validity, and continue to use the old version.  If they
are different, *then* verify that the lock file was not held, call
clear_packed_ref_cache(), and start using the new version.)

>   $ ../../bin-wrappers/git -c core.checkstat=minimal pack-refs --all
>   fatal: internal error: packed-ref cache cleared while locked
>   $
> 
> Hmmm, that should have worked! Wait, fix 'git pack-refs' to support
> setting config variables on the command line, rebuild and:
> 
>   $ ../../bin-wrappers/git -c core.checkstat=minimal pack-refs --all
>   $ cat .git/packed-refs
>   # pack-refs with: peeled fully-peeled
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
>   ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
>   d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
>   eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
>   ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
>   $
> 
> I haven't checked the remaining test failures to see if they are
> caused by this code (I don't think so, but ...), but this failure
> is clearly a cygwin specific issue.

Thanks again for the testing and analysis,
Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2013-06-16  5:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-11 21:48 [PATCH 00/12] Fix some reference-related races Michael Haggerty
2013-06-11 21:48 ` [PATCH 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
2013-06-12 11:38   ` Jeff King
2013-06-12 11:56     ` Michael Haggerty
2013-06-11 21:48 ` [PATCH 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
2013-06-11 21:48 ` [PATCH 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
2013-06-11 21:48 ` [PATCH 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
2013-06-12 12:01   ` Jeff King
2013-06-11 21:48 ` [PATCH 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
2013-06-11 21:48 ` [PATCH 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
2013-06-11 21:48 ` [PATCH 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
2013-06-11 21:48 ` [PATCH 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
2013-06-11 21:48 ` [PATCH 09/12] add a stat_validity struct Michael Haggerty
2013-06-11 21:48 ` [PATCH 10/12] get_packed_ref_cache: reload packed-refs file when it changes Michael Haggerty
2013-06-11 21:48 ` [PATCH 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
2013-06-11 21:48 ` [PATCH 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
2013-06-12 12:39   ` Jeff King
2013-06-12 12:52 ` [PATCH 00/12] Fix some reference-related races Jeff King
2013-06-15 20:13 ` Ramsay Jones
2013-06-16  5:50   ` Michael Haggerty [this message]
2013-06-18 18:13     ` Ramsay Jones
2013-06-19  5:51       ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51BD5232.20205@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    --cc=ramsay@ramsay1.demon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.