* [PATCH 0/3] fix "diff --raw" on the work tree side
@ 2008-03-02 9:43 Junio C Hamano
2008-03-02 9:43 ` [PATCH 1/3] diff-lib.c: constness strengthening Junio C Hamano
2008-03-02 10:41 ` [PATCH 0/3] fix "diff --raw" " Ping Yin
0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 9:43 UTC (permalink / raw)
To: git
The second patch fixes the inconsistency between "git-diff --raw" and
"git-diff-{index,files} --raw" when they are used for submodules.
The third one is a bit controversial and changes the semantics and
existing callers and won't be considered for 1.5.5.
Junio C Hamano (3):
diff-lib.c: constness strengthening
diff: make sure work tree side is shown as 0{40} when different
diff: show submodule object name when available even on the work tree side
diff-lib.c | 45 +++++++++++++++++++++++++++++++------
diff.c | 7 +----
t/t4027-diff-submodule.sh | 53 +++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 92 insertions(+), 13 deletions(-)
create mode 100755 t/t4027-diff-submodule.sh
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/3] diff-lib.c: constness strengthening
2008-03-02 9:43 [PATCH 0/3] fix "diff --raw" on the work tree side Junio C Hamano
@ 2008-03-02 9:43 ` Junio C Hamano
2008-03-02 9:43 ` [PATCH 2/3] diff: make sure work tree side is shown as 0{40} when different Junio C Hamano
2008-03-02 10:41 ` [PATCH 0/3] fix "diff --raw" " Ping Yin
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 9:43 UTC (permalink / raw)
To: git
The internal implementation of diff-index codepath used to use non const
pointer to pass sha1 around, but it did not have to. With this, we can
also lose the private no_sha1[] array, as we can use the public null_sha1[]
array that exists exactly for the same purpose.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
diff-lib.c | 13 ++++++-------
1 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/diff-lib.c b/diff-lib.c
index 94b150e..4581b59 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -472,22 +472,21 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
static void diff_index_show_file(struct rev_info *revs,
const char *prefix,
struct cache_entry *ce,
- unsigned char *sha1, unsigned int mode)
+ const unsigned char *sha1, unsigned int mode)
{
diff_addremove(&revs->diffopt, prefix[0], mode,
sha1, ce->name, NULL);
}
static int get_stat_data(struct cache_entry *ce,
- unsigned char **sha1p,
+ const unsigned char **sha1p,
unsigned int *modep,
int cached, int match_missing)
{
- unsigned char *sha1 = ce->sha1;
+ const unsigned char *sha1 = ce->sha1;
unsigned int mode = ce->ce_mode;
if (!cached) {
- static unsigned char no_sha1[20];
int changed;
struct stat st;
if (lstat(ce->name, &st) < 0) {
@@ -501,7 +500,7 @@ static int get_stat_data(struct cache_entry *ce,
changed = ce_match_stat(ce, &st, 0);
if (changed) {
mode = ce_mode_from_stat(ce, st.st_mode);
- sha1 = no_sha1;
+ sha1 = null_sha1;
}
}
@@ -514,7 +513,7 @@ static void show_new_file(struct rev_info *revs,
struct cache_entry *new,
int cached, int match_missing)
{
- unsigned char *sha1;
+ const unsigned char *sha1;
unsigned int mode;
/* New file in the index: it might actually be different in
@@ -533,7 +532,7 @@ static int show_modified(struct rev_info *revs,
int cached, int match_missing)
{
unsigned int mode, oldmode;
- unsigned char *sha1;
+ const unsigned char *sha1;
if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
if (report_missing)
--
1.5.4.3.468.g36990
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/3] diff: make sure work tree side is shown as 0{40} when different
2008-03-02 9:43 ` [PATCH 1/3] diff-lib.c: constness strengthening Junio C Hamano
@ 2008-03-02 9:43 ` Junio C Hamano
2008-03-02 9:43 ` [PATCH 3/3] diff: show submodule object name when available even on the work tree side Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 9:43 UTC (permalink / raw)
To: git
Ping Yin noticed that "git diff-index --raw" shows 0{40} when work tree
has submodule difference, but "git diff --raw" didn't correctly do so.
There was a mistake in the diffcore_skip_stat_unmatch() that was meant to
clean up the stat-only difference for running diff between the index and
work tree and diff between the tree and the work tree, to cause it re-read
from the submodule repository HEAD. When ce_stat_match() says work tree
is different, we should always say 0{40} on the work tree side.
This patch fixes the issue, and adds tests.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* The real change is a net deletion of 3 lines.
diff.c | 7 +----
t/t4027-diff-submodule.sh | 53 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 55 insertions(+), 5 deletions(-)
create mode 100755 t/t4027-diff-submodule.sh
diff --git a/diff.c b/diff.c
index b80f656..00e1590 100644
--- a/diff.c
+++ b/diff.c
@@ -3187,11 +3187,8 @@ static void diffcore_apply_filter(const char *filter)
static int diff_filespec_is_identical(struct diff_filespec *one,
struct diff_filespec *two)
{
- if (S_ISGITLINK(one->mode)) {
- diff_fill_sha1_info(one);
- diff_fill_sha1_info(two);
- return !hashcmp(one->sha1, two->sha1);
- }
+ if (S_ISGITLINK(one->mode))
+ return 0;
if (diff_populate_filespec(one, 0))
return 0;
if (diff_populate_filespec(two, 0))
diff --git a/t/t4027-diff-submodule.sh b/t/t4027-diff-submodule.sh
new file mode 100755
index 0000000..3d2d081
--- /dev/null
+++ b/t/t4027-diff-submodule.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+test_description='difference in submodules'
+
+. ./test-lib.sh
+. ../diff-lib.sh
+
+_z40=0000000000000000000000000000000000000000
+test_expect_success setup '
+ test_tick &&
+ test_create_repo sub &&
+ (
+ cd sub &&
+ echo hello >world &&
+ git add world &&
+ git commit -m submodule
+ ) &&
+
+ test_tick &&
+ echo frotz >nitfol &&
+ git add nitfol sub &&
+ git commit -m superproject &&
+
+ (
+ cd sub &&
+ echo goodbye >world &&
+ git add world &&
+ git commit -m "submodule #2"
+ ) &&
+
+ set x $(
+ cd sub &&
+ git rev-list HEAD
+ ) &&
+ echo ":160000 160000 $3 $_z40 M sub" >expect
+'
+
+test_expect_success 'git diff --raw HEAD' '
+ git diff --raw --abbrev=40 HEAD >actual &&
+ diff -u expect actual
+'
+
+test_expect_success 'git diff-index --raw HEAD' '
+ git diff-index --raw HEAD >actual.index &&
+ diff -u expect actual.index
+'
+
+test_expect_success 'git diff-files --raw' '
+ git diff-files --raw >actual.files &&
+ diff -u expect actual.files
+'
+
+test_done
--
1.5.4.3.468.g36990
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] diff: show submodule object name when available even on the work tree side
2008-03-02 9:43 ` [PATCH 2/3] diff: make sure work tree side is shown as 0{40} when different Junio C Hamano
@ 2008-03-02 9:43 ` Junio C Hamano
0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 9:43 UTC (permalink / raw)
To: git
Traditionally, we consistently showed 0{40} when work tree side was
different from what it was being compared against. This was primarily
because it did not make sense to re-hash potentially large blobs every
time diff-files and non-cached diff-index were asked for.
However, we can afford to read from submodule HEAD, as it is much cheaper
than hashing blobs.
This makes the output somewhat inconsistent, but it probably is a good
move to give easily available information than not giving it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* This could be enhanced to also hash symlink blobs as they tend to be
very short.
HOWEVER.
This changes the semantics established since almost day-one of git (at
least this 0{40} convention can be traced back to show-diff as of Apr
26, 2005), and tests (e.g. t3040) demonstrate that it breaks existing
callers such as git-commit.
Which means that I am sending this out only for discussion at this
time, not for inclusion in the near term.
diff-lib.c | 32 +++++++++++++++++++++++++++++++-
t/t4027-diff-submodule.sh | 2 +-
2 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/diff-lib.c b/diff-lib.c
index 4581b59..94e17a6 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -10,6 +10,26 @@
#include "cache-tree.h"
#include "path-list.h"
#include "unpack-trees.h"
+#include "refs.h"
+
+/*
+ * Traditionally, we have _always_ showed 0{40} on the work tree
+ * side if there is a difference. However, reading from HEAD of
+ * a submodule is much cheaper than rehashing regular blob objects.
+ */
+static const unsigned char *get_gitlink_data(const char *path)
+{
+ static unsigned char subhead[20];
+ /*
+ * NOTE: strictly speaking, this makes it non re-entrant, but
+ * we know the nature of the callchain makes it very
+ * short-lived --- the caller will give this to
+ * diff_addremove() or diff_change() immediately.
+ */
+ if (!resolve_gitlink_ref(path, "HEAD", subhead))
+ return subhead;
+ return null_sha1;
+}
/*
* diff-files
@@ -349,6 +369,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
struct stat st;
unsigned int oldmode, newmode;
struct cache_entry *ce = active_cache[i];
+ const unsigned char *worktree_sha1;
int changed;
if (DIFF_OPT_TST(&revs->diffopt, QUIET) &&
@@ -454,8 +475,15 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
continue;
oldmode = ce->ce_mode;
newmode = ce_mode_from_stat(ce, st.st_mode);
+
+ if (S_ISGITLINK(newmode))
+ worktree_sha1 = get_gitlink_data(ce->name);
+ else if (changed)
+ worktree_sha1 = null_sha1;
+ else
+ worktree_sha1 = ce->sha1;
diff_change(&revs->diffopt, oldmode, newmode,
- ce->sha1, (changed ? null_sha1 : ce->sha1),
+ ce->sha1, worktree_sha1,
ce->name, NULL);
}
@@ -501,6 +529,8 @@ static int get_stat_data(struct cache_entry *ce,
if (changed) {
mode = ce_mode_from_stat(ce, st.st_mode);
sha1 = null_sha1;
+ if (S_ISGITLINK(mode))
+ sha1 = get_gitlink_data(ce->name);
}
}
diff --git a/t/t4027-diff-submodule.sh b/t/t4027-diff-submodule.sh
index 3d2d081..c1c2500 100755
--- a/t/t4027-diff-submodule.sh
+++ b/t/t4027-diff-submodule.sh
@@ -32,7 +32,7 @@ test_expect_success setup '
cd sub &&
git rev-list HEAD
) &&
- echo ":160000 160000 $3 $_z40 M sub" >expect
+ echo ":160000 160000 $3 $2 M sub" >expect
'
test_expect_success 'git diff --raw HEAD' '
--
1.5.4.3.468.g36990
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] fix "diff --raw" on the work tree side
2008-03-02 9:43 [PATCH 0/3] fix "diff --raw" on the work tree side Junio C Hamano
2008-03-02 9:43 ` [PATCH 1/3] diff-lib.c: constness strengthening Junio C Hamano
@ 2008-03-02 10:41 ` Ping Yin
2008-03-02 17:11 ` Junio C Hamano
1 sibling, 1 reply; 8+ messages in thread
From: Ping Yin @ 2008-03-02 10:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Sun, Mar 2, 2008 at 5:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
> The second patch fixes the inconsistency between "git-diff --raw" and
> "git-diff-{index,files} --raw" when they are used for submodules.
>
> The third one is a bit controversial and changes the semantics and
> existing callers and won't be considered for 1.5.5.
>
Unfortunately, my submodule summary patch series will depend on
git-diff or git-diff-index. So should i resend my improved patch only
after the thrid one is applied?
--
Ping Yin
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] fix "diff --raw" on the work tree side
2008-03-02 10:41 ` [PATCH 0/3] fix "diff --raw" " Ping Yin
@ 2008-03-02 17:11 ` Junio C Hamano
2008-03-02 17:15 ` Ping Yin
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 17:11 UTC (permalink / raw)
To: Ping Yin; +Cc: git
"Ping Yin" <pkufranky@gmail.com> writes:
> On Sun, Mar 2, 2008 at 5:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> The second patch fixes the inconsistency between "git-diff --raw" and
>> "git-diff-{index,files} --raw" when they are used for submodules.
>>
>> The third one is a bit controversial and changes the semantics and
>> existing callers and won't be considered for 1.5.5.
>>
> Unfortunately, my submodule summary patch series will depend on
> git-diff or git-diff-index. So should i resend my improved patch only
> after the thrid one is applied?
I do not think re-introducing the inconsistency like the third one does is
a palatable option.
Wouldn't grabbing (cd $submodule && git rev-parse HEAD) yourself be more
portable across git before and after the bugfix of "git diff --raw"?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] fix "diff --raw" on the work tree side
2008-03-02 17:11 ` Junio C Hamano
@ 2008-03-02 17:15 ` Ping Yin
2008-03-02 17:48 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Ping Yin @ 2008-03-02 17:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 3/3/08, Junio C Hamano <gitster@pobox.com> wrote:
> "Ping Yin" <pkufranky@gmail.com> writes:
>
> > On Sun, Mar 2, 2008 at 5:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >> The second patch fixes the inconsistency between "git-diff --raw" and
> >> "git-diff-{index,files} --raw" when they are used for submodules.
> >>
> >> The third one is a bit controversial and changes the semantics and
> >> existing callers and won't be considered for 1.5.5.
> >>
> > Unfortunately, my submodule summary patch series will depend on
> > git-diff or git-diff-index. So should i resend my improved patch only
> > after the thrid one is applied?
>
> I do not think re-introducing the inconsistency like the third one does is
> a palatable option.
>
> Wouldn't grabbing (cd $submodule && git rev-parse HEAD) yourself be more
> portable across git before and after the bugfix of "git diff --raw"?
>
OK, i will parse it myself
--
Ping Yin
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] fix "diff --raw" on the work tree side
2008-03-02 17:15 ` Ping Yin
@ 2008-03-02 17:48 ` Junio C Hamano
0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2008-03-02 17:48 UTC (permalink / raw)
To: Ping Yin; +Cc: git
"Ping Yin" <pkufranky@gmail.com> writes:
> On 3/3/08, Junio C Hamano <gitster@pobox.com> wrote:
> ...
>> I do not think re-introducing the inconsistency like the third one does is
>> a palatable option.
>>
>> Wouldn't grabbing (cd $submodule && git rev-parse HEAD) yourself be more
>> portable across git before and after the bugfix of "git diff --raw"?
>
> OK, i will parse it myself
Just to clarify, I am somewhat torn on [3/3].
We can certainly re-introduce the special casing for submodule across diff
family consistently, but the rationale for doing that would be "The real
object name is very cheaply available in that case so why not show it."
That line of thinking would lead to re-hashing of symbolic link blobs as I
hinted in the message, but then it would also mean we would show
inconsistent output between a symlink that points at foo.c and a regular
file whose content is foo.c (without the trailing LF), the latter of which
would still show 0{40}, and we will eventually end up saying "Let's
re-hash small regular file blobs."
That would lead to inconsistencies between smaller and larger regular file
blobs, and the line between them is drawn at an arbitrary size limit. The
logical conclusion of this would be to re-hash everything when showing
"diff --raw" (and "diff-anything --raw").
Which may turn out not to be such a bad thing after all, and we might even
end up doing exactly that in future releases, although I highly doubt it.
In any case, such a huge semantics change takes time to get right.
People's scripts in the wild know 0{40} with non 0 mode in the raw format
means it refers to an entity in the work tree (which is a correct thing to
assume), but the convention has been used for the entire lifetime of git
and they can (arguably incorrectly) be assuming that the inverse is also
true (i.e. work tree entity is represented as 0{40} with non 0 mode).
And the point is that "submodule summary" can be useful without waiting
for the conclusion of the above confusion caused by the can of worms [3/3]
opens. If we decide to always show the real object name for work tree
entities in the future, we might want to update the implementation of
"submodule summary" to _take advantage of it_, but by starting the new
feature not to depend on the current misbehaviour, we can try it out much
earlier.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-03-02 17:48 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-03-02 9:43 [PATCH 0/3] fix "diff --raw" on the work tree side Junio C Hamano
2008-03-02 9:43 ` [PATCH 1/3] diff-lib.c: constness strengthening Junio C Hamano
2008-03-02 9:43 ` [PATCH 2/3] diff: make sure work tree side is shown as 0{40} when different Junio C Hamano
2008-03-02 9:43 ` [PATCH 3/3] diff: show submodule object name when available even on the work tree side Junio C Hamano
2008-03-02 10:41 ` [PATCH 0/3] fix "diff --raw" " Ping Yin
2008-03-02 17:11 ` Junio C Hamano
2008-03-02 17:15 ` Ping Yin
2008-03-02 17:48 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).