git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* filtering out mode-change-only changes
@ 2012-02-29  2:31 Neal Kreitzinger
  2012-02-29  3:40 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Neal Kreitzinger @ 2012-02-29  2:31 UTC (permalink / raw)
  To: git

What is the best way to filter out the "mode change only" entries from a 
"name-status diff result" listing of changed files?

Reason-for-this:
get a list of files whose content changed and feed that list into a 
gui-diff-tool for visual review of "merge" (rebase) results.

Partial-Solutions:
--name-status does not have mode-change info.
--raw has mode change info but I would have to parse it out and compare it 
myself.
--summary has that info, but not the content modification info.

Before I write a script to discern "mode-change-only changes" and remove 
them from the list of changed files (thus leaving only content-changed files 
in the list), I'd like to see if there is some way that git 
already-does-this-for-you, or if someone already has a script that does 
this.

Thanks in advance for any tips.

v/r,
neal 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29  2:31 filtering out mode-change-only changes Neal Kreitzinger
@ 2012-02-29  3:40 ` Junio C Hamano
  2012-02-29  3:52   ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2012-02-29  3:40 UTC (permalink / raw)
  To: Neal Kreitzinger; +Cc: git

"Neal Kreitzinger" <neal@rsss.com> writes:

> What is the best way to filter out the "mode change only" entries from a 
> "name-status diff result" listing of changed files?
>
> Reason-for-this:
> get a list of files whose content changed and feed that list into a 
> gui-diff-tool for visual review of "merge" (rebase) results.

Later I have some words on this.

> ... I'd like to see if there is some way that git 
> already-does-this-for-you, or if someone already has a script that does 
> this.

I do not know about random scripts people write, but there is nothing
built-in.

But *if* the _real_ reason you want to do this is because you do not want
to see unnecessary mode changes caused by your filesystem that screws up
file modes for whatever breakage, perhaps setting core.filemode to false
so that mode changes made to the working tree files by your mode breaking
filesystem might be the real solution.

Of course I do not know if that is the real reason you are asking for that
or not.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29  3:40 ` Junio C Hamano
@ 2012-02-29  3:52   ` Junio C Hamano
  2012-02-29 19:11     ` Neal Kreitzinger
  2012-02-29 22:17     ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Junio C Hamano @ 2012-02-29  3:52 UTC (permalink / raw)
  To: Neal Kreitzinger; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> "Neal Kreitzinger" <neal@rsss.com> writes:
>
>> What is the best way to filter out the "mode change only" entries from a 
>> "name-status diff result" listing of changed files?
>> ...
> I do not know about random scripts people write, but there is nothing
> built-in.

Having said that, if we _were_ to do this built-in, an obvious logical
place to do so is to define a new DIFF_OPT_IGNORE_EXECUTABLE_BIT, teach
"--ignore-executable-bit" command line option to diff_opt_parse(), and
then teach diff_resolve_rename_copy() to consider this bit when the code
originally set DIFF_STATUS_MODIFIED.  Instead, the updated code that is
working under --ignore-executable-bit option would drop such a filepair
from diff_queued_diff.

I do not know if such a change is worth doing, though.  It depends on the
real reason why do you have so many "mode change only" changes that would
make rebasing or cherry-picking too troublesome.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29  3:52   ` Junio C Hamano
@ 2012-02-29 19:11     ` Neal Kreitzinger
  2012-02-29 19:52       ` Junio C Hamano
  2012-02-29 22:17     ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Neal Kreitzinger @ 2012-02-29 19:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Neal Kreitzinger, git

On 2/28/2012 9:52 PM, Junio C Hamano wrote:
> Junio C Hamano<gitster@pobox.com>  writes:
>
>> "Neal Kreitzinger"<neal@rsss.com>  writes:
>>
>>> What is the best way to filter out the "mode change only" entries from a
>>> "name-status diff result" listing of changed files?
>>> ...
>> I do not know about random scripts people write, but there is nothing
>> built-in.
>
> Having said that, if we _were_ to do this built-in, an obvious logical
> place to do so is to define a new DIFF_OPT_IGNORE_EXECUTABLE_BIT, teach
> "--ignore-executable-bit" command line option to diff_opt_parse(), and
> then teach diff_resolve_rename_copy() to consider this bit when the code
> originally set DIFF_STATUS_MODIFIED.  Instead, the updated code that is
> working under --ignore-executable-bit option would drop such a filepair
> from diff_queued_diff.
>
> I do not know if such a change is worth doing, though.  It depends on the
> real reason why do you have so many "mode change only" changes that would
> make rebasing or cherry-picking too troublesome.
>
I see three parts to this issue that are related but also independent:
Questions:
(Q1) Is the user handling filemodes correctly in git?
(Q2) Why does the user need to interrogate filemodes in git?
(Q3) How are file modes interrogated by the user in git?

Some Answers:
Q1: Is the user handling filemodes correctly in git?

A1-1: (My Context)
Perhaps I'm not, but I'm not prepared to ignore filemodes.  I think I 
need to be aware of what's changing.  Blasting everything with the linux 
chmod 777 shotgun, or the git core.filemode=false shotgun does not seem 
like the right answer to me.  I need to do more homework on linux 
permissions and git executable bit tracking.

A1-2: (General Context)
Some users do legitimately choose to have core.filemode=true and are 
correct in doing so.

Q2: Why does the user need to interrogate filemodes in git?

A2-1: (My Context)
After a rebase we need to review what changed with a 4-way diff to have 
the full context of merge-base, topic, upstream, and merged. Because we 
are mere-mortals, we want to use gui side-by-side diff (ie, diffuse) 
instead of 4-way combined diff. git-difftool only takes two file parms 
so I have to write my own script. git-mergetool's can display 4-way diff 
but insist on mangling the $MERGED file with their own attempts at 
redoing the merge, ie. creating their own merge conflicts even though 
$MERGED has no conflict markers on input.

A2-2: (General Context)
"Vendor code drops" (see git-rm manpage) can have substantial 
file-mode-only changes along with real content changes due to incorrect 
tar procedures and/or the vendor's filesystem being a "mode breaking 
filesystem". Also, there are these human stdin's going about 
capriciously with a free-will doing chmod's.

Q3: How are file modes interrogated by the user in git?

A3-1: (Some Current Options)
--name-status lumps file-mode-only changes and content changes together 
under status "M".
--raw can be parsed to discern filemode changes concurrent with 
identical content sha1's.
--summary "mode change" entries might also be usable to apply a filter 
to --name-status results.

A3-2: (Some Desired Options)
--name-status learns a new status for file-mode-only changes (ie, "P" 
for "P"ermissions).
--raw learns "P+x" and "P-x" in the status column to tell you if the 
executable bit was added or removed.

I wonder if filemode tracking was somewhat of an afterthought of the 
content-is-king design of git and that is why it is semi-opaque.

v/r,
neal

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29 19:11     ` Neal Kreitzinger
@ 2012-02-29 19:52       ` Junio C Hamano
  2012-03-03 22:16         ` Pete Harlan
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2012-02-29 19:52 UTC (permalink / raw)
  To: Neal Kreitzinger; +Cc: Neal Kreitzinger, git

Neal Kreitzinger <nkreitzinger@gmail.com> writes:

> A3-2: (Some Desired Options)
> --name-status learns a new status for file-mode-only changes (ie, "P"
> for "P"ermissions).

After reading everything above I omitted from your response in my quote, I
still do not get the feeling that these willy-nilly mode changes that you
are suffering from is a problem that is general enough to warrant such a
change, even if such a change is done as an optional feature.

Calling our executable bits "Permission" is a misnomer, by the way. It is
more about "is this an executable file?" attribute, and is not "are you
allowed to execute this?" permission.

> --raw learns "P+x" and "P-x" in the status column to tell you if the
> executable bit was added or removed.

As --raw output by definition is designed to be read by scripts that can
and do parse its output, I do not see how this can be a useful addition at
all. The same information is available in the separate mode column already.

> I wonder if filemode tracking was somewhat of an afterthought of the
> content-is-king design of git and that is why it is semi-opaque.

Our blobs represent contents.  Whether your shell script has or does not
have executable bits, the file has the same contents.

The executable-ness of a particular file starts to matter only when you
extract it to your working tree.  The executable-ness matters equally as
its contents, and as the path at which it is extracted.  All three are
recorded in an entry in a tree object and in an entry in the index, as a
<mode, object name, path> tuple.

At the end-user level, you seldom use the blob contents (identified by the
object name in the three tuple explained above) alone without the
surrounding context (the other two members of the three tuple) the blob
appears in.  Especially in "diff", you not only want to view the content
change, but also need to know the change of the context in which the blobs
appear in.  And that is why we show "This content, or a related old
version of it, appeared at this old path, but now it appears at this new
path with the following change, and it lost the executable bit during the
change."

I do see nothing afterthought about the design at the storage or at the
presentation level.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29  3:52   ` Junio C Hamano
  2012-02-29 19:11     ` Neal Kreitzinger
@ 2012-02-29 22:17     ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2012-02-29 22:17 UTC (permalink / raw)
  To: git; +Cc: Neal Kreitzinger

Junio C Hamano <gitster@pobox.com> writes:

> Having said that, if we _were_ to do this built-in, an obvious logical
> place to do so is to define a new DIFF_OPT_IGNORE_EXECUTABLE_BIT, teach
> "--ignore-executable-bit" command line option to diff_opt_parse(), and
> then teach diff_resolve_rename_copy() to consider this bit when the code
> originally set DIFF_STATUS_MODIFIED.  Instead, the updated code that is
> working under --ignore-executable-bit option would drop such a filepair
> from diff_queued_diff.

A patch to do so may look like this (untested, of course).  A few things
to note on the changes:

 - The "header" strbuf holds the lines starting from "diff --git" and the
   meta-information lines such as mode change, similarity, etc.  It is
   passed to the the interface code fn_out_consume() via the xdiff
   machinery when an actual diff is found and emitted before the first
   hunk of the diff.

 - The must_show_header toggle is set at the strategic places when
   information is added to the "header" strbuf that makes the output for
   this filepair a must, even if it turns out that xdiff machinery does
   not find any content changes.  Before this patch, we flipped this
   toggle when we noticed a mode change, so that the header is shown even
   if there is no content change. The patch has to make make it
   conditional, which is what the first hunk is about.

 - The second hunk is not related but I think it is a worthy bit-rot fix.
   The original code before "must_show_header" was introduced showed the
   header upfront unless we are ignoring any whitespace changes, the
   reasoning behind it being that two different blobs may produce no patch
   under --ignore-space-change and in such a case we do not want to show
   anything for the filepair.  When 296c6bb (diff: fix "git show -C -C"
   output when renaming a binary file, 2010-05-26) introduced
   "must_show_header", it retained that logic.

   But I think it was a mistake.  If xdiff machinery decides there is no
   patch to show, taking the --ignore-space-change into account, our
   fn_out_consume() won't be called so we won't show the header
   unnecessarily.  And when we do see a line of patch, fn_out_consume()
   will show the header.



 diff.c |   31 +++++++++++++++++++++++++++++--
 diff.h |    1 +
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
old mode 100644
new mode 100755
diff --git a/diff.c b/diff.c
index a1c06b5..acf7232 100644
--- a/diff.c
+++ b/diff.c
@@ -2152,7 +2152,8 @@ static void builtin_diff(const char *name_a,
 		if (one->mode != two->mode) {
 			strbuf_addf(&header, "%s%sold mode %06o%s\n", line_prefix, set, one->mode, reset);
 			strbuf_addf(&header, "%s%snew mode %06o%s\n", line_prefix, set, two->mode, reset);
-			must_show_header = 1;
+			if (!DIFF_OPT_TST(o, IGNORE_MODE_CHANGE))
+				must_show_header = 1;
 		}
 		if (xfrm_msg)
 			strbuf_addstr(&header, xfrm_msg);
@@ -2207,7 +2208,7 @@ static void builtin_diff(const char *name_a,
 		struct emit_callback ecbdata;
 		const struct userdiff_funcname *pe;
 
-		if (!DIFF_XDL_TST(o, WHITESPACE_FLAGS) || must_show_header) {
+		if (must_show_header) {
 			fprintf(o->file, "%s", header.buf);
 			strbuf_reset(&header);
 		}
@@ -3446,6 +3447,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 	}
 	else if (!strcmp(arg, "--no-renames"))
 		options->detect_rename = 0;
+	else if (!strcmp(arg, "--ignore-mode-change"))
+		DIFF_OPT_SET(options, IGNORE_MODE_CHANGE);
 	else if (!strcmp(arg, "--relative"))
 		DIFF_OPT_SET(options, RELATIVE_NAME);
 	else if (!prefixcmp(arg, "--relative=")) {
@@ -4509,10 +4512,34 @@ void diffcore_fix_diff_index(struct diff_options *options)
 	qsort(q->queue, q->nr, sizeof(q->queue[0]), diffnamecmp);
 }
 
+static void diffcore_ignore_mode_change(struct diff_options *diffopt)
+{
+	int i;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct diff_queue_struct outq;
+	DIFF_QUEUE_CLEAR(&outq);
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+
+		if (DIFF_FILE_VALID(p->one) &&
+		    DIFF_FILE_VALID(p->two) &&
+		    (p->one->sha1_valid && p->two->sha1_valid) &&
+		    !hashcmp(p->one->sha1, p->two->sha1))
+			diff_free_filepair(p); /* skip this */
+		else
+			diff_q(&outq, p);
+	}
+	free(q->queue);
+	*q = outq;
+}
+
 void diffcore_std(struct diff_options *options)
 {
 	if (options->skip_stat_unmatch)
 		diffcore_skip_stat_unmatch(options);
+	if (DIFF_OPT_TST(options, IGNORE_MODE_CHANGE))
+		diffcore_ignore_mode_change(options);
 	if (!options->found_follow) {
 		/* See try_to_follow_renames() in tree-diff.c */
 		if (options->break_opt != -1)
diff --git a/diff.h b/diff.h
index 7af5f1e..fabcc96 100644
--- a/diff.h
+++ b/diff.h
@@ -82,6 +82,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_OVERRIDE_SUBMODULE_CONFIG (1 << 27)
 #define DIFF_OPT_DIRSTAT_BY_LINE     (1 << 28)
 #define DIFF_OPT_FUNCCONTEXT         (1 << 29)
+#define DIFF_OPT_IGNORE_MODE_CHANGE  (1 << 30)
 
 #define DIFF_OPT_TST(opts, flag)    ((opts)->flags & DIFF_OPT_##flag)
 #define DIFF_OPT_SET(opts, flag)    ((opts)->flags |= DIFF_OPT_##flag)

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: filtering out mode-change-only changes
  2012-02-29 19:52       ` Junio C Hamano
@ 2012-03-03 22:16         ` Pete Harlan
  0 siblings, 0 replies; 7+ messages in thread
From: Pete Harlan @ 2012-03-03 22:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Neal Kreitzinger, Neal Kreitzinger, git

On 2/29/2012 11:52 AM, Junio C Hamano wrote:
> Neal Kreitzinger<nkreitzinger@gmail.com>  writes:
>
>> A3-2: (Some Desired Options)
>> --name-status learns a new status for file-mode-only changes (ie, "P"
>> for "P"ermissions).
>
> After reading everything above I omitted from your response in my quote, I
> still do not get the feeling that these willy-nilly mode changes that you
> are suffering from is a problem that is general enough to warrant such a
> change, even if such a change is done as an optional feature.

I'll add a vote for the usefulness of the option.

I'm dealing with a repo whose history occasionally has commits that do a 
handful of real changes mixed in with tens of thousands of mistaken mode 
changes.  (New workflow is fixed but it is infeasible to fix history at 
this point.)  It's cumbersome filtering away the noise.

If Git's diff engine could ignore mode changes that would be a big help.

Regards,

--
Pete Harlan
pgit@pcharlan.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-03-03 22:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-29  2:31 filtering out mode-change-only changes Neal Kreitzinger
2012-02-29  3:40 ` Junio C Hamano
2012-02-29  3:52   ` Junio C Hamano
2012-02-29 19:11     ` Neal Kreitzinger
2012-02-29 19:52       ` Junio C Hamano
2012-03-03 22:16         ` Pete Harlan
2012-02-29 22:17     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).