All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Michael J Gruber <git@drmicha.warpmail.net>, git@vger.kernel.org
Subject: Re: [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff
Date: Tue, 17 Mar 2015 10:57:06 -0700	[thread overview]
Message-ID: <xmqq1tknpkwd.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <20150317140704.GA7248@lanh> (Duy Nguyen's message of "Tue, 17 Mar 2015 21:07:04 +0700")

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote:
>> The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a
>> paths after generating trees, 2012-12-16), which was a fix to an
>> earlier bug where a cache-tree written out of an index with i-t-a
>> entries had incorrect information and still claimed it is fully
>> valid after write-tree rebuilt it.  The test probably should add
>> another path without i-t-a bit, run the same "diff --cached" with
>> updated expectation before write-tre, and run the "diff --cached"
>> again to make sure it produces a result that match the updated
>> expectation.
>
> Would adding another non-i-t-a entry help? Before this patch
> "diff --cached" after write-tree shows the i-t-a entry only when
> eec3e7e4 is applied. But with this patch we don't show i-t-a entry any
> more, before or after write-tree, eec3e7e4 makes no visible difference.
>
> We could even revert eec3e7e4 and the outcome of "diff --cached" would
> be the same because we just sort of move the "invalidation" part from
> cache-tree to do_oneway_diff(). Not invalidating would speed up "diff
> --cached" when i-t-a entries are present. Still it may be a good idea
> to invalidate i-t-a paths to be on the safe side. Perhaps a patch like
> this to resurrect the test?

My unerstanding of what eec3e7e4 (cache-tree: invalidate i-t-a paths
after generating trees, 2012-12-16) fixed was that in this sequence:

    - You prepare an index.

    - You write-tree out of the index, which involves:

      - updating the cache-tree to match the shape of the resulting
        from writing the index out.

      - create tree objects matching all levels of the cache-tree as
        needed on disk.

      - report the top-level tree object name

   - run "diff-index --cached", which can and will take advantage of
     the fact that everything in a subtree below a known-to-be-valid
     cache-tree entry does not have to be checked one-by-one.  If a
     cache-tree says "everything under D/ in the index would hash to
     tree object T" and the HEAD has tree object T at D/, then the
     diff machinery will bypass the entire section in the index
     under D/, which is a valid optimization.

     However, when there is an i-t-a entry, we excluded that entry
     from the tree object computation, its presence did not
     contribute to the tree object name, but still marked the
     cache-tree entries that contain it as valid by mistake.  This
     old bug was what the commit fixed, so an invocation of "diff
     --cached" after a write-tree, even if the index contains an
     i-t-a entry, will not see cache-tree entries that are marked
     valid when they are not.  Instead, "diff --cached" will bypass
     the optimization and makes comparison one-by-one for the index
     entries.

So reverting the fix obviously is not the right thing to do.  If the
tests show different results from two invocations of "diff --cached"
with your patch applied, there is something that is broken by your
patch, because the index and the HEAD does not change across
write-tree in that test.

If on the other hand the tests show the same result from these two
"diff --cached" and the result is different from what the test
expects, that means your patch changed the world order, i.e. an
i-t-a entry used to be treated as if it were adding an empty blob to
the index but it is now treated as non-existent, then that is a good
thing and the only thing we need to update is what the test expects.
I am guessing that instead of expecting dir/bar to be shown, it now
should expect no output?

Does adding an non-i-t-a entry help?  It does not hurt, and it makes
the test uses a non-empty output, making its effect more visible,
which may or may not count as helping.


     

  reply	other threads:[~2015-03-17 17:57 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-09 14:14 [PATCH/RFC 0/2] Bug fixes regarding diff and "git add -N" Nguyễn Thái Ngọc Duy
2015-03-09 14:14 ` [PATCH 1/2] diff --cached: do not report i-t-a entries as "new" Nguyễn Thái Ngọc Duy
2015-03-15  6:55   ` Junio C Hamano
2015-03-16 13:56   ` [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff Nguyễn Thái Ngọc Duy
2015-03-16 15:15     ` Michael J Gruber
2015-03-16 16:05       ` Junio C Hamano
2015-03-17 14:07         ` Duy Nguyen
2015-03-17 17:57           ` Junio C Hamano [this message]
2015-03-18 12:47             ` Duy Nguyen
2015-03-18 20:30               ` Junio C Hamano
2015-03-19  6:00                 ` Junio C Hamano
2015-03-24  1:15                   ` Duy Nguyen
2015-03-24 17:00                     ` Junio C Hamano
2015-03-23 20:52                 ` Junio C Hamano
2015-03-09 14:14 ` [PATCH 2/2] diff-files: mark i-t-a paths as "new" Nguyễn Thái Ngọc Duy
2015-03-15  7:05   ` Junio C Hamano
2015-03-09 15:45 ` [PATCH] t2203,t4011: adjust to changed intent-to-add treatment Michael J Gruber
2015-03-15  7:07   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq1tknpkwd.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@drmicha.warpmail.net \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.