* git diff: add option for omitting the contents of deletes @ 2011-02-26 13:16 Mart Sõmermaa 2011-02-26 20:11 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Mart Sõmermaa @ 2011-02-26 13:16 UTC (permalink / raw) To: git Paging through deletes adds a cognitive burden when reviewing code. Thus we have git config --global alias.deleteless-diff 'diff -M --diff-filter=ACMRTUXB' and using `git deleteless-diff` is standard practice during code reviews. However, the filter leaves out delete information altogether, which is error-prone. Therefore I'd like to propose adding the -D (or --omit-delete-contents) option to git diff that would behave as follows: $ git rm foo $ git diff -D diff --git a/foo b/foo delete foo I.e. instead of displaying the contents of the file as a diff minus, a descriptive message is displayed -- like with renames. Thoughts, suggestions? Can 'diff -M --diff-filter=ACMRTUXB' be improved? Best regards, Mart Sõmermaa ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-26 13:16 git diff: add option for omitting the contents of deletes Mart Sõmermaa @ 2011-02-26 20:11 ` Junio C Hamano 2011-02-27 14:41 ` Michael J Gruber 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-02-26 20:11 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: git Mart Sõmermaa <mrts.pydev@gmail.com> writes: > Paging through deletes adds a cognitive burden when reviewing code. Wasn't the ability to say '/^diff --git ' in your pager invented for exactly for that? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-26 20:11 ` Junio C Hamano @ 2011-02-27 14:41 ` Michael J Gruber 2011-02-27 22:33 ` Mart Sõmermaa 2011-02-27 22:54 ` Junio C Hamano 0 siblings, 2 replies; 32+ messages in thread From: Michael J Gruber @ 2011-02-27 14:41 UTC (permalink / raw) To: Junio C Hamano; +Cc: Mart Sõmermaa, git Junio C Hamano venit, vidit, dixit 26.02.2011 21:11: > Mart Sõmermaa <mrts.pydev@gmail.com> writes: > >> Paging through deletes adds a cognitive burden when reviewing code. > > Wasn't the ability to say '/^diff --git ' in your pager invented for > exactly for that? Wasn't the pager invented for sifting through output which has to be several pages, but not not for that which could be more concise? ;) In fact, -D would be quite analogous to -M and -C in that respect. Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 14:41 ` Michael J Gruber @ 2011-02-27 22:33 ` Mart Sõmermaa 2011-02-28 9:58 ` Michael J Gruber 2011-02-27 22:54 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Mart Sõmermaa @ 2011-02-27 22:33 UTC (permalink / raw) To: git Junio, jokes aside, do you think this is useful? Would you accept a patch that implements this? Micheal, thanks for the indirect + :). Is it +0 or +1? Best, MS On Sun, Feb 27, 2011 at 4:41 PM, Michael J Gruber <git@drmicha.warpmail.net> wrote: > Junio C Hamano venit, vidit, dixit 26.02.2011 21:11: >> Mart Sõmermaa <mrts.pydev@gmail.com> writes: >> >>> Paging through deletes adds a cognitive burden when reviewing code. >> >> Wasn't the ability to say '/^diff --git ' in your pager invented for >> exactly for that? > > Wasn't the pager invented for sifting through output which has to be > several pages, but not not for that which could be more concise? ;) > > In fact, -D would be quite analogous to -M and -C in that respect. > > Michael > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 22:33 ` Mart Sõmermaa @ 2011-02-28 9:58 ` Michael J Gruber 2011-02-28 10:51 ` Mart Sõmermaa 0 siblings, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 9:58 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: git, Junio C Hamano Mart Sõmermaa venit, vidit, dixit 27.02.2011 23:33: > Junio, jokes aside, do you think this is useful? > Would you accept a patch that implements this? > > Micheal, thanks for the indirect + :). Is it +0 or +1? It's -1 for the spelling of my name (unless it is correct in your language), and +1 for having "-D". As Junio explained, it is important to have this only for "viewing diffs" (like the color stuff) and not in the way of generation of patches which are meant to be applied (with -R, possibly), so that one can have it "on" by default for viewing. OTOH, if (it's off by default, naturally, and) there's no config and just let people have their option (like --color-words) I don't see a problem. Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 9:58 ` Michael J Gruber @ 2011-02-28 10:51 ` Mart Sõmermaa 0 siblings, 0 replies; 32+ messages in thread From: Mart Sõmermaa @ 2011-02-28 10:51 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Mon, Feb 28, 2011 at 11:58 AM, Michael J Gruber <git@drmicha.warpmail.net> wrote: > Mart Sõmermaa venit, vidit, dixit 27.02.2011 23:33: >> Junio, jokes aside, do you think this is useful? >> Would you accept a patch that implements this? >> >> Micheal, thanks for the indirect + :). Is it +0 or +1? > > It's -1 for the spelling of my name (unless it is correct in your > language I'm terribly sorry :( and no, it is not. To make amends, I hereby officially grant you the right to call me Marat or Marta (which is the female form of Mart) or Mrat or Mrata from now on :). Dygsrahpically yours, Mrat ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 14:41 ` Michael J Gruber 2011-02-27 22:33 ` Mart Sõmermaa @ 2011-02-27 22:54 ` Junio C Hamano 2011-02-27 23:07 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-02-27 22:54 UTC (permalink / raw) To: Michael J Gruber; +Cc: Mart Sõmermaa, git Michael J Gruber <git@drmicha.warpmail.net> writes: > Wasn't the pager invented for sifting through output which has to be > several pages, but not not for that which could be more concise? ;) > > In fact, -D would be quite analogous to -M and -C in that respect. There is a big difference: -M and -C lets your recipient reproduce the state using the change you are trying to convey with the diff output in either direction (iow, "apply -R" works), but your "-D" would not have that property. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 22:54 ` Junio C Hamano @ 2011-02-27 23:07 ` Junio C Hamano 2011-02-28 7:31 ` Michael J Gruber 2011-02-28 10:45 ` git diff: add option for omitting the contents of deletes Mart Sõmermaa 0 siblings, 2 replies; 32+ messages in thread From: Junio C Hamano @ 2011-02-27 23:07 UTC (permalink / raw) To: Michael J Gruber; +Cc: Mart Sõmermaa, git Junio C Hamano <gitster@pobox.com> writes: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> Wasn't the pager invented for sifting through output which has to be >> several pages, but not not for that which could be more concise? ;) >> >> In fact, -D would be quite analogous to -M and -C in that respect. > > There is a big difference: -M and -C lets your recipient reproduce the > state using the change you are trying to convey with the diff output in > either direction (iow, "apply -R" works), but your "-D" would not have > that property. Having said that we have always valued "reversibility" and a casual -D is not in line with that principle, I don't have a strong objection if the new mode of operation is marked clearly as "nonusable if you are trying to produce appliable diff (iow, don't send such a patch to mailing list--it is for viewing purposes only)", treating it just like the --color-words and the --stat options (there isn't even need to mark these as unusable for that purpose, as people with common sense would be able to guess). If we were to do this, it probably is a good idea to apply that for a typechange patch (the one that is produced when a symlink turns into a regular file and vice versa) as well. It also might make sense to apply the similar principle to shorten the output with -B when a rewrite patch is expressed as a single hunk patch that removes everything old and then adds everthing new. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 23:07 ` Junio C Hamano @ 2011-02-28 7:31 ` Michael J Gruber 2011-02-28 12:17 ` Jeff King 2011-02-28 10:45 ` git diff: add option for omitting the contents of deletes Mart Sõmermaa 1 sibling, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 7:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: Mart Sõmermaa, git Junio C Hamano venit, vidit, dixit 28.02.2011 00:07: > Junio C Hamano <gitster@pobox.com> writes: > >> Michael J Gruber <git@drmicha.warpmail.net> writes: >> >>> Wasn't the pager invented for sifting through output which has to be >>> several pages, but not not for that which could be more concise? ;) >>> >>> In fact, -D would be quite analogous to -M and -C in that respect. >> >> There is a big difference: -M and -C lets your recipient reproduce the >> state using the change you are trying to convey with the diff output in >> either direction (iow, "apply -R" works), but your "-D" would not have >> that property. > > Having said that we have always valued "reversibility" and a casual -D is I didn't know, but I guess I haven't come across those issues yet. > not in line with that principle, I don't have a strong objection if the > new mode of operation is marked clearly as "nonusable if you are trying to > produce appliable diff (iow, don't send such a patch to mailing list--it > is for viewing purposes only)", treating it just like the --color-words > and the --stat options (there isn't even need to mark these as unusable > for that purpose, as people with common sense would be able to guess). Yes, it is purely intended as "for viewing by humans". Even in the forward direction saying "delete that path" is different from saying "delete that content from that path", i.e. "diff -D" output may apply without conflicts in cases where "diff" output does not. That aspect is similar to -M and -C, though - unless we check the sha1 of the blobs before applying the patch (which would be possible for -D also) - do we? > If we were to do this, it probably is a good idea to apply that for a > typechange patch (the one that is produced when a symlink turns into a > regular file and vice versa) as well. It also might make sense to apply > the similar principle to shorten the output with -B when a rewrite patch > is expressed as a single hunk patch that removes everything old and then > adds everthing new. Reminds me of my failed attempt to make the diff output for symlinks more human-friendly. The latter can be solved with textconv, though. Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 7:31 ` Michael J Gruber @ 2011-02-28 12:17 ` Jeff King 2011-02-28 12:23 ` Jeff King 2011-02-28 12:42 ` symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) Michael J Gruber 0 siblings, 2 replies; 32+ messages in thread From: Jeff King @ 2011-02-28 12:17 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, Mart Sõmermaa, git On Mon, Feb 28, 2011 at 08:31:55AM +0100, Michael J Gruber wrote: > > Junio C Hamano <gitster@pobox.com> writes: > >> > >> There is a big difference: -M and -C lets your recipient reproduce the > >> state using the change you are trying to convey with the diff output in > >> either direction (iow, "apply -R" works), but your "-D" would not have > >> that property. > > [...] > > Yes, it is purely intended as "for viewing by humans". Even in the > forward direction saying "delete that path" is different from saying > "delete that content from that path", i.e. "diff -D" output may apply > without conflicts in cases where "diff" output does not. > > That aspect is similar to -M and -C, though - unless we check the sha1 > of the blobs before applying the patch (which would be possible for -D > also) - do we? Yes, I think we do check the sha1s for a "-M" patch. And we should do so for a "-D" patch, too. Which would make it just as likely to conflict as a version with the actual patch content, _except_ that it relies on the recipient having that sha1. I don't see how "-D" is any less reversible than -M or -C, though. If I get your -D patch, I use the index line to see that the blob went from 1234abcd to 0000000, check that we are at 1234abcd, and then delete the file. To reverse it, I reinstate 1234abcd from nothing (and conflict if the file exists). _Neither_ case works without the sha1. So I think the problem is not about "this cannot be reversed" but about "the recipient must have your sha1 to make sense of it, in either direction". Which is the same case as with "-M" and "-C", and why we have long cautioned about their use on mailing lists. But the right rule is not "do not use on mailing lists" but rather "do not use on mailing lists for projects where the recipients will not be using git to apply". > Reminds me of my failed attempt to make the diff output for symlinks > more human-friendly. The latter can be solved with textconv, though. I am still carrying around my "symlinks as a special class of diff" patches if you are interested: https://github.com/peff/git/tree/jk/userdiff-symlinks -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 12:17 ` Jeff King @ 2011-02-28 12:23 ` Jeff King 2011-02-28 12:32 ` Michael J Gruber 2011-02-28 18:11 ` Junio C Hamano 2011-02-28 12:42 ` symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) Michael J Gruber 1 sibling, 2 replies; 32+ messages in thread From: Jeff King @ 2011-02-28 12:23 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, Mart Sõmermaa, git On Mon, Feb 28, 2011 at 07:17:26AM -0500, Jeff King wrote: > > That aspect is similar to -M and -C, though - unless we check the sha1 > > of the blobs before applying the patch (which would be possible for -D > > also) - do we? > > Yes, I think we do check the sha1s for a "-M" patch. And we should do so > for a "-D" patch, too. Which would make it just as likely to conflict as > a version with the actual patch content, _except_ that it relies on the > recipient having that sha1. > > I don't see how "-D" is any less reversible than -M or -C, though. If I > get your -D patch, I use the index line to see that the blob went from > 1234abcd to 0000000, check that we are at 1234abcd, and then delete the > file. To reverse it, I reinstate 1234abcd from nothing (and conflict if > the file exists). _Neither_ case works without the sha1. > > So I think the problem is not about "this cannot be reversed" but about > "the recipient must have your sha1 to make sense of it, in either > direction". Which is the same case as with "-M" and "-C", and why we > have long cautioned about their use on mailing lists. But the right rule > is not "do not use on mailing lists" but rather "do not use on mailing > lists for projects where the recipients will not be using git to apply". Actually, thinking on this a bit more, I guess "-M" and "-C" are usable without the sha1. In fact, we don't even provide it for a strict 100% rename, and for a rename-with-patch, you can apply the patch, assuming you have the original file in any form. So they are really about "is your recipient using git", not "is your recipient using git _and_ will he/she have the right sha1". I do still think that in practice among git users that "-D" patches will be able to be applied. You won't typically be sending a patch to delete something that other people don't have. I would expect most uses of "apply -R" to be reverting an existing delete, which means you will have the sha1. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 12:23 ` Jeff King @ 2011-02-28 12:32 ` Michael J Gruber 2011-02-28 12:59 ` Jeff King 2011-02-28 18:11 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 12:32 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Mart Sõmermaa, git Jeff King venit, vidit, dixit 28.02.2011 13:23: > On Mon, Feb 28, 2011 at 07:17:26AM -0500, Jeff King wrote: > >>> That aspect is similar to -M and -C, though - unless we check the sha1 >>> of the blobs before applying the patch (which would be possible for -D >>> also) - do we? >> >> Yes, I think we do check the sha1s for a "-M" patch. And we should do so >> for a "-D" patch, too. Which would make it just as likely to conflict as >> a version with the actual patch content, _except_ that it relies on the >> recipient having that sha1. >> >> I don't see how "-D" is any less reversible than -M or -C, though. If I >> get your -D patch, I use the index line to see that the blob went from >> 1234abcd to 0000000, check that we are at 1234abcd, and then delete the >> file. To reverse it, I reinstate 1234abcd from nothing (and conflict if >> the file exists). _Neither_ case works without the sha1. >> >> So I think the problem is not about "this cannot be reversed" but about >> "the recipient must have your sha1 to make sense of it, in either >> direction". Which is the same case as with "-M" and "-C", and why we >> have long cautioned about their use on mailing lists. But the right rule >> is not "do not use on mailing lists" but rather "do not use on mailing >> lists for projects where the recipients will not be using git to apply". > > Actually, thinking on this a bit more, I guess "-M" and "-C" are usable > without the sha1. In fact, we don't even provide it for a strict 100% > rename, and for a rename-with-patch, you can apply the patch, assuming > you have the original file in any form. So they are really about "is > your recipient using git", not "is your recipient using git _and_ will > he/she have the right sha1". $ git mv Makefile Dofile $ git staged # yadayada diff --git c/Makefile i/Dofile similarity index 100% rename from Makefile rename to Dofile Same with copy. But that's not good, is it? I mean, Alice sends me her "copy patch" and I send her my Makefile patch, both on top of the same base. We both apply each other's patch cleanly. We end up with different "Dofile". Checking the sha1 would prevent this. It's no surprise that patch application is non-commutative, but shouldn't we catch this? Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 12:32 ` Michael J Gruber @ 2011-02-28 12:59 ` Jeff King 2011-02-28 13:05 ` Michael J Gruber 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2011-02-28 12:59 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, Mart Sõmermaa, git On Mon, Feb 28, 2011 at 01:32:35PM +0100, Michael J Gruber wrote: > > Actually, thinking on this a bit more, I guess "-M" and "-C" are usable > > without the sha1. In fact, we don't even provide it for a strict 100% > > rename, and for a rename-with-patch, you can apply the patch, assuming > > you have the original file in any form. So they are really about "is > > your recipient using git", not "is your recipient using git _and_ will > > he/she have the right sha1". > > $ git mv Makefile Dofile > $ git staged # yadayada > diff --git c/Makefile i/Dofile > similarity index 100% > rename from Makefile > rename to Dofile > > Same with copy. > > But that's not good, is it? I mean, Alice sends me her "copy patch" and > I send her my Makefile patch, both on top of the same base. We both > apply each other's patch cleanly. We end up with different "Dofile". > Checking the sha1 would prevent this. It's no surprise that patch > application is non-commutative, but shouldn't we catch this? Won't you either get a conflict or end up with the same Dofile? Clearly you will have a Dofile with your Makefile changes, as you applied the movement on top of your changes. Alice will either: 1. Apply not using rename detection (e.g., not using git, or using "git am" without "-3"). In this case, she gets a conflict because she no longer has Makefile. 2. Apply using rename detection (e.g., via "git am -3"). In this case, we will notice the movement of Makefile to Dofile, and apply the patch to Dofile. Still, I do wonder if we should be including an index line on a straight rename patch. It lets the recipient check that what is being renamed is what they have (IOW, it gives the same check that they would do if they ahd the whole patch text). And then the recipient can decide how to resolve the conflict. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 12:59 ` Jeff King @ 2011-02-28 13:05 ` Michael J Gruber 2011-02-28 21:54 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 13:05 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Mart Sõmermaa, git Jeff King venit, vidit, dixit 28.02.2011 13:59: > On Mon, Feb 28, 2011 at 01:32:35PM +0100, Michael J Gruber wrote: > >>> Actually, thinking on this a bit more, I guess "-M" and "-C" are usable >>> without the sha1. In fact, we don't even provide it for a strict 100% >>> rename, and for a rename-with-patch, you can apply the patch, assuming >>> you have the original file in any form. So they are really about "is >>> your recipient using git", not "is your recipient using git _and_ will >>> he/she have the right sha1". >> >> $ git mv Makefile Dofile >> $ git staged # yadayada >> diff --git c/Makefile i/Dofile >> similarity index 100% >> rename from Makefile >> rename to Dofile >> >> Same with copy. >> >> But that's not good, is it? I mean, Alice sends me her "copy patch" and >> I send her my Makefile patch, both on top of the same base. We both >> apply each other's patch cleanly. We end up with different "Dofile". >> Checking the sha1 would prevent this. It's no surprise that patch >> application is non-commutative, but shouldn't we catch this? > > Won't you either get a conflict or end up with the same Dofile? Clearly > you will have a Dofile with your Makefile changes, as you applied the > movement on top of your changes. > > Alice will either: > > 1. Apply not using rename detection (e.g., not using git, or using > "git am" without "-3"). In this case, she gets a conflict because > she no longer has Makefile. > When I said "copy patch" I actually meant a patch which records the copy "Makefile -> Dofile". What is it today? Is it me? I know I wrote the "mv" example first, but still :) I mean, Alice: cp Makefile Dofile sends me a -C patch I: Break everything by hacking Makefile send her a crappy patch Both: apply the received patch Now I end up with a borked Makefile and a borked Dofile, but Alice still has a good Dofile, and it's all my fault, so I don't deserve any better. But still. > 2. Apply using rename detection (e.g., via "git am -3"). In this case, > we will notice the movement of Makefile to Dofile, and apply the > patch to Dofile. > > Still, I do wonder if we should be including an index line on a straight > rename patch. It lets the recipient check that what is being renamed is > what they have (IOW, it gives the same check that they would do if they > ahd the whole patch text). And then the recipient can decide how to > resolve the conflict. I think so. (Or make -f force it.) This is orthogonal to the "-D" suggestion", but "-D" could write the index line to start with. Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 13:05 ` Michael J Gruber @ 2011-02-28 21:54 ` Jeff King 0 siblings, 0 replies; 32+ messages in thread From: Jeff King @ 2011-02-28 21:54 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, Mart Sõmermaa, git On Mon, Feb 28, 2011 at 02:05:37PM +0100, Michael J Gruber wrote: > When I said "copy patch" I actually meant a patch which records the copy > "Makefile -> Dofile". What is it today? Is it me? I know I wrote the > "mv" example first, but still :) Yeah, I looked at the "mv" example and totally glossed over the word "copy" in your text. > I mean, Alice: > cp Makefile Dofile > sends me a -C patch > > I: > Break everything by hacking Makefile > send her a crappy patch > > Both: > apply the received patch > > Now I end up with a borked Makefile and a borked Dofile, but Alice still > has a good Dofile, and it's all my fault, so I don't deserve any better. > But still. OK, this is a much better example. Yes, you have different state at the end, and there were no conflicts. I don't think we can resolve the situation automagically, but I think it would at least be nice to mention the conflict. I guess the big question is how to mention it. Should it cause the patch application to fail? I'm worried about that creating unnecessary false positives for cases that are really quite harmless. Should patch application just give a warning unless --strict-renames or something is used? I dunno. This is one of those corner cases where we can see that there is a potential problem, but it hasn't actually come up in practice, so it's difficult to see what would be most useful in the real world. Maybe that means we are wasting our time thinking about it. :) > This is orthogonal to the "-D" suggestion", but "-D" could write the > index line to start with. Yeah. I had just assumed that "-D" would be the same as the current text, minus the actual patch lines. IOW: diff --git a/Makefile b/Makefile deleted file mode 100644 index c9ff69c..0000000 -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 12:23 ` Jeff King 2011-02-28 12:32 ` Michael J Gruber @ 2011-02-28 18:11 ` Junio C Hamano 2011-02-28 22:23 ` Jeff King 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-02-28 18:11 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, Mart Sõmermaa, git Jeff King <peff@peff.net> writes: > Actually, thinking on this a bit more, I guess "-M" and "-C" are usable > without the sha1. In fact, we don't even provide it for a strict 100% > rename, and for a rename-with-patch, you can apply the patch, assuming > you have the original file in any form. Yes, you got it correctly. A patch is about giving you an ability to propagate a change to a similar tree, that does not have to be identical to yours (otherwise you can just send tarball of the whole thing ;-)). And a patch is not purely technical; it has human component in that the recipient needs to know enough about the context of the patch to be able to judge if the change is applicable to his tree. By seeing some context lines in -M/-C (many of them are not pure renames), the recipient can be sure that the patch is being applied to his tree that is "similar enough" to what the patch was originally produced against. It would certainly help to have git-apply that understands the rename patches to apply such a patch mechanically, but that is not a strict requirement; the recipient can move or copy the original file to manually create the target and apply the patch by hand to produce the desired result, so "do you have git?" is mostly irrelevant. The proposed -D would apply to any tree that happens to have the path being mentioned in the patch regardless of how similar the target tree is to the original. Pure renaming -M/-C patch shares the same risk, but unlike these modes, -D goes one step too far by making it impossible to recover from lossage without having a backup. But all of the above is only about principles. As I said, I am not strongly opposed to have an output mode that is primarily for reviewing, just like we have --color-words, that are not suitable for patch application, as long as users understand the implications. It may make sense to show such a patch still with a bit of context, perhaps like this: README | 54 ------------------------------------------------------ 1 files changed, 0 insertions(+), 54 deletions(-) diff --git -D a/README b/README deleted file mode 100644 index 67cfeb2..0000000 --- a/README +++ /dev/null @@ -1,54 +0,0 @@ -//////////////////////////////////////////////////////////////// - - GIT - the stupid content tracker -... -the discussion following them on the mailing list give a good -reference for project status, development direction and -remaining tasks. so that the reader can have some warm and fuzzy feeling that the correct file is being deleted, though. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 18:11 ` Junio C Hamano @ 2011-02-28 22:23 ` Jeff King 2011-02-28 23:28 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2011-02-28 22:23 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, Mart Sõmermaa, git On Mon, Feb 28, 2011 at 10:11:17AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Actually, thinking on this a bit more, I guess "-M" and "-C" are usable > > without the sha1. In fact, we don't even provide it for a strict 100% > > rename, and for a rename-with-patch, you can apply the patch, assuming > > you have the original file in any form. > > Yes, you got it correctly. A patch is about giving you an ability to > propagate a change to a similar tree, that does not have to be identical > to yours (otherwise you can just send tarball of the whole thing ;-)). Yeah, I guess I am just curious whether "similar enough" is really relevant with file deletion. Obviously if you have the sha1 then everything applies, no problem. But if you don't, how useful is the patch text? The patch application will error out with a conflict. You know in either case that the person wanted the file deleted. The patch text shows you what _their_ version of the file looked like. But that's not likely to be useful; what you really care about is what's in _your_ version of the file, and whether that should be deleted or not. If you did really care about what was in their file, the giant deletion diff is almost certainly not interesting. What you really care about is what's different in their version versus yours. Which we don't provide a simple way to access. But more likely, you're just going to email them back and say "what is this based on?". So yeah, the patch gives more information, but I am skeptical that anybody uses it in practice. > And a patch is not purely technical; it has human component in that the > recipient needs to know enough about the context of the patch to be able > to judge if the change is applicable to his tree. By seeing some context > lines in -M/-C (many of them are not pure renames), the recipient can be > sure that the patch is being applied to his tree that is "similar enough" > to what the patch was originally produced against. But we don't provide context lines for pure renames. Yet the sender might have a different version of the file than the recipient, and the recipient has no idea. We don't even detect that situation, let alone give the user any clue about what might be different. > It would certainly help to have git-apply that understands the rename > patches to apply such a patch mechanically, but that is not a strict > requirement; the recipient can move or copy the original file to > manually create the target and apply the patch by hand to produce the > desired result, so "do you have git?" is mostly irrelevant. Isn't that the same for a delete? I can see you wanted the file deleted. It didn't match up with my sha1, but I can go ahead and do the delete manually if I want. > The proposed -D would apply to any tree that happens to have the path > being mentioned in the patch regardless of how similar the target tree is > to the original. Pure renaming -M/-C patch shares the same risk, but > unlike these modes, -D goes one step too far by making it impossible to > recover from lossage without having a backup. I'm not quite sure what lossage you mean. On the recipient's end? They can just "git revert", no? Or do you mean somebody who gets a conflict and manually does the deletion anyway? If they have the patch text, then yes, they have something, but it is _not_ what they deleted. Otherwise they would not have had a conflict. > But all of the above is only about principles. As I said, I am not > strongly opposed to have an output mode that is primarily for reviewing, > just like we have --color-words, that are not suitable for patch > application, as long as users understand the implications. I am not sure I agree about the danger, but certainly I agree that for viewing it is not a problem at all. I have to admit to not caring all _that_ much about the topic. I do find deletion diffs unnecessarily spammy, but they just don't happen enough for me to be really annoyed. > It may make sense to show such a patch still with a bit of context, > perhaps like this: > > README | 54 ------------------------------------------------------ > 1 files changed, 0 insertions(+), 54 deletions(-) > > diff --git -D a/README b/README > deleted file mode 100644 > index 67cfeb2..0000000 > --- a/README > +++ /dev/null > @@ -1,54 +0,0 @@ > -//////////////////////////////////////////////////////////////// > - > - GIT - the stupid content tracker > -... > -the discussion following them on the mailing list give a good > -reference for project status, development direction and > -remaining tasks. > > so that the reader can have some warm and fuzzy feeling that the correct > file is being deleted, though. Hmm. I think that is even worse. It's _still_ spammy, and it's broken as a diff (the sha1 in the index line doesn't match the preimage that the diff text shows). Yeah, it gives you warm fuzzy feelings that the file they were thinking about deleting is at least vaguely related to what you wanted to delete. But how often do you get a deletion patch where seeing the first 10 lines is actually useful? That is, what is the case where you see the first 10 lines and immediately say "Wait, this deletion is totally bogus!" and _didn't_ just see that there was a conflicting deletion, read the commit message or email and say "Wait, why are we deleting this file?" -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 22:23 ` Jeff King @ 2011-02-28 23:28 ` Junio C Hamano 2011-03-01 0:11 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-02-28 23:28 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, Mart Sõmermaa, git Jeff King <peff@peff.net> writes: > Yeah, I guess I am just curious whether "similar enough" is really > relevant with file deletion. Obviously if you have the sha1 then > everything applies, no problem. But if you don't, how useful is the > patch text? The patch application will error out with a conflict. You > know in either case that the person wanted the file deleted. The patch > text shows you what _their_ version of the file looked like. But that's > not likely to be useful; what you really care about is what's in _your_ > version of the file, and whether that should be deleted or not. Exactly; in order to intelligently decide if that should be deleted or not, you need to be able to judge if the deletion makes sense in the context of _your_ tree. Hopefully proposed commit log message alone may clarify if the change applies to your situation, or you need to be able to see if their version and yours are still serving similar purposes in the system (e.g. you don't want to lose the definition of a function they didn't have when they decided to remove the file saying "nobody calls any function in this library anymore"). > If you did really care about what was in their file, the giant deletion > diff is almost certainly not interesting. What you really care about is > what's different in their version versus yours. Correct; the deletion diff has half of the necessary information (the other half being what you have). > I'm not quite sure what lossage you mean. On the recipient's end? They > can just "git revert", no? No, I was not particularly interested in limiting the discussion to git-using population. As I said, just like rename patch can be manually applied by first moving and then eyeballing, you can see "an instruction to delete--ok, 'rm'" to apply it and the file is gone (unlike the case where you manually did "mv"). That is what I meant by "additionally -D has lossage". >> It may make sense to show such a patch still with a bit of context, >> perhaps like this: >> >> README | 54 ------------------------------------------------------ >> 1 files changed, 0 insertions(+), 54 deletions(-) >> >> diff --git -D a/README b/README >> deleted file mode 100644 >> index 67cfeb2..0000000 >> --- a/README >> +++ /dev/null >> @@ -1,54 +0,0 @@ >> -//////////////////////////////////////////////////////////////// >> - >> - GIT - the stupid content tracker >> -... >> -the discussion following them on the mailing list give a good >> -reference for project status, development direction and >> -remaining tasks. >> >> so that the reader can have some warm and fuzzy feeling that the correct >> file is being deleted, though. > > Hmm. I think that is even worse. It's _still_ spammy, and it's broken as > a diff (the sha1 in the index line doesn't match the preimage that the That is very much intentional; I don't think we want the format to apply mechanically. I was not talking about patch application safety at all in this thread; I already dismissed "-D" as too unreliable as a patch format to be used for conveying a change for mechanical consumption, didn't I? When you want to do so, you just generate the output without -D. The suggestion to give some context is only for people who want to eyeball their own patches, helping them to make sure that they are removing what they wanted to remove. If they can tell from the filename alone, that is just as fine. I am not interested that deeply here; the only thing I care about is to make sure we don't have to keep carrying crappy options as if they are sane. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 23:28 ` Junio C Hamano @ 2011-03-01 0:11 ` Junio C Hamano 2011-03-07 20:38 ` Mart Sõmermaa 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-03-01 0:11 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, Mart Sõmermaa, git Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > >> I'm not quite sure what lossage you mean. On the recipient's end? They >> can just "git revert", no? This is beating on a dead horse, but in my previous life, in a project I wasn't deeply involved in, a deployment procedure used was to: - give git-archive tarball of a major release to the target machine that does not have git repository, and extract the tarball; - maintain 'maint' branch, and show the tip of 'maint' on a test server, but give the client vetoing power on individual changes; - hence ending up preparing a format-patch output for selected changes, and running git-apply on the target machine to update it. The -M/-C options would have worked well in this scenario, including the recovery from "oops--that didn't work, please revert asap" (I don't know if they actually used -M/-C when preparing the incremental updates, though). The -D option does not have the same reversibility. I am (have been) only reacting to the earlier statement by Michael that -D is the same as -M/-C and reversibility does not matter. It does in such a situation. Now that the project with that deployment procedure is totally behind me, I probably shouldn't care too much about such a workflow, but I suspect there still are people using that sort of workflow, and this would matter to them. In any case, a minimum patch to give what Mart wanted to see would probably look like this. I'll leave bugfixes, documentation and tests to the readers ;-). diff.c | 12 +++++++++--- diff.h | 1 + 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/diff.c b/diff.c index 5422c43..5c66a53 100644 --- a/diff.c +++ b/diff.c @@ -1943,7 +1943,11 @@ static void builtin_diff(const char *name_a, } } - if (!DIFF_OPT_TST(o, TEXT) && + if (o->irreversible_delete && lbl[1][0] == '/') { + fprintf(o->file, "%s", header.buf); + strbuf_reset(&header); + goto free_ab_and_return; + } else if (!DIFF_OPT_TST(o, TEXT) && ( (!textconv_one && diff_filespec_is_binary(one)) || (!textconv_two && diff_filespec_is_binary(two)) )) { if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0) @@ -1963,8 +1967,7 @@ static void builtin_diff(const char *name_a, fprintf(o->file, "%sBinary files %s and %s differ\n", line_prefix, lbl[0], lbl[1]); o->found_changes = 1; - } - else { + } else { /* Crazy xdl interfaces.. */ const char *diffopts = getenv("GIT_DIFF_OPTS"); xpparam_t xpp; @@ -3160,6 +3163,9 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac) return error("invalid argument to -M: %s", arg+2); options->detect_rename = DIFF_DETECT_RENAME; } + else if (!strcmp(arg, "-D") || !strcmp(arg, "--irreversible-delete")) { + options->irreversible_delete = 1; + } else if (!prefixcmp(arg, "-C") || !prefixcmp(arg, "--find-copies=") || !strcmp(arg, "--find-copies")) { if (options->detect_rename == DIFF_DETECT_COPY) diff --git a/diff.h b/diff.h index 310bd6b..11d13cf 100644 --- a/diff.h +++ b/diff.h @@ -104,6 +104,7 @@ struct diff_options { int interhunkcontext; int break_opt; int detect_rename; + int irreversible_delete; int skip_stat_unmatch; int line_termination; int output_format; ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-03-01 0:11 ` Junio C Hamano @ 2011-03-07 20:38 ` Mart Sõmermaa 2011-03-08 7:14 ` Michael J Gruber 2011-03-08 19:49 ` Junio C Hamano 0 siblings, 2 replies; 32+ messages in thread From: Mart Sõmermaa @ 2011-03-07 20:38 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Michael J Gruber, git On Tue, Mar 1, 2011 at 2:11 AM, Junio C Hamano <gitster@pobox.com> wrote: > In any case, a minimum patch to give what Mart wanted to see would > probably look like this. I'll leave bugfixes, documentation and tests to > the readers ;-). The minimum looks to be the optimum -- IMHO this is entirely sufficient, thank you very much! What would the bugfixes be (i.e. do you have any doubts about the implementation)? The change looks quite straightforward, to the point and bug-free to my eyes. I try to get to writing tests ASAP. As for the documentation, would the following section be sufficient in Documentation/diff-options.txt? -D:: --irreversible-delete:: Omit file contents when file has been deleted and only output the header. This is useful during diff review but should not be used in actual patches as these would be non-reversible due to the omitted file contents. Best regards and warm thanks once more, MS ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-03-07 20:38 ` Mart Sõmermaa @ 2011-03-08 7:14 ` Michael J Gruber 2011-03-08 19:49 ` Junio C Hamano 1 sibling, 0 replies; 32+ messages in thread From: Michael J Gruber @ 2011-03-08 7:14 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: Junio C Hamano, Jeff King, git Mart Sõmermaa venit, vidit, dixit 07.03.2011 21:38: > On Tue, Mar 1, 2011 at 2:11 AM, Junio C Hamano <gitster@pobox.com> wrote: >> In any case, a minimum patch to give what Mart wanted to see would >> probably look like this. I'll leave bugfixes, documentation and tests to >> the readers ;-). > > The minimum looks to be the optimum -- IMHO this is entirely sufficient, > thank you very much! > > What would the bugfixes be (i.e. do you have any doubts about the > implementation)? > The change looks quite straightforward, to the point and bug-free to my eyes. > > I try to get to writing tests ASAP. > > As for the documentation, would the following section be sufficient in > Documentation/diff-options.txt? > > -D:: > --irreversible-delete:: > Omit file contents when file has been deleted and only "if the file has" > output the header. This is useful during diff review > but should not be used in actual patches as these would > be non-reversible due to the omitted file contents. With the current state of git-apply, they could not even be applied, so maybe: but produces patches which can not be applied by linkgit:git-apply[1]. You can revert to the previous description once you've taught git-apply the new diff header :) Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-03-07 20:38 ` Mart Sõmermaa 2011-03-08 7:14 ` Michael J Gruber @ 2011-03-08 19:49 ` Junio C Hamano 2011-03-08 21:25 ` Mart Sõmermaa 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-03-08 19:49 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: Jeff King, Michael J Gruber, git Mart Sõmermaa <mrts.pydev@gmail.com> writes: > On Tue, Mar 1, 2011 at 2:11 AM, Junio C Hamano <gitster@pobox.com> wrote: >> In any case, a minimum patch to give what Mart wanted to see would >> probably look like this. I'll leave bugfixes, documentation and tests to >> the readers ;-). > > The minimum looks to be the optimum -- IMHO this is entirely sufficient, I suspect not, as I don't think I ever did anything to the codepath for -B output, and I also recall spending some time thinking about this issue in an earlier message that I mentioned -B besides "deletion", none of the suggestions in which I don't think I tried to implement. I also vaguely recall that I suspected that the output may not have the usual "index deadbeaf..000000 mode" line and that I chose to ignore it. So, no, not good enough ;-). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-03-08 19:49 ` Junio C Hamano @ 2011-03-08 21:25 ` Mart Sõmermaa 2011-03-08 21:31 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Mart Sõmermaa @ 2011-03-08 21:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Michael J Gruber, git On Tue, Mar 8, 2011 at 9:49 PM, Junio C Hamano <gitster@pobox.com> wrote: > Mart Sõmermaa <mrts.pydev@gmail.com> writes: > >> On Tue, Mar 1, 2011 at 2:11 AM, Junio C Hamano <gitster@pobox.com> wrote: >>> In any case, a minimum patch to give what Mart wanted to see would >>> probably look like this. I'll leave bugfixes, documentation and tests to >>> the readers ;-). >> >> The minimum looks to be the optimum -- IMHO this is entirely sufficient, > > I suspect not, as I don't think I ever did anything to the codepath for -B > output, and I also recall spending some time thinking about this issue in > an earlier message that I mentioned -B besides "deletion", none of the > suggestions in which I don't think I tried to implement. Your suggestion is as follows: "It also might make sense to apply the similar principle to shorten the output with -B when a rewrite patch is expressed as a single hunk patch that removes everything old and then adds everthing new." I have to admit that I've never used -B, only -M, so please don't mind that its semantics and exact behavior are a bit foreign to me. After running t/t4130-apply-criss-cross-rename.sh, `git diff -M -B` outputs the following: diff --git a/file2 b/file1 similarity index 100% rename from file2 rename to file1 diff --git a/file1 b/file2 similarity index 100% rename from file1 rename to file2 Can you bring a similar example of changes in the output after the above-mentioned similar principle has been implemented for -B? > I also vaguely recall that I suspected that the output may not have the > usual "index deadbeaf..000000 mode" line and that I chose to ignore it. 'deadbeaf' is present: $ ../git/bin-wrappers/git diff -D HEAD diff --git a/foo.txt b/foo.txt deleted file mode 100644 index 257cc56..0000000 Can we perhaps consider the -B behavior orthogonal to -D (unless I am missing something very important here)? Best regards, MS ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-03-08 21:25 ` Mart Sõmermaa @ 2011-03-08 21:31 ` Jeff King 0 siblings, 0 replies; 32+ messages in thread From: Jeff King @ 2011-03-08 21:31 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: Junio C Hamano, Michael J Gruber, git On Tue, Mar 08, 2011 at 11:25:04PM +0200, Mart Sõmermaa wrote: > "It also might make sense to apply the similar principle to shorten the output > with -B when a rewrite patch is expressed as a single hunk patch that removes > everything old and then adds everthing new." > > I have to admit that I've never used -B, only -M, so please don't mind that > its semantics and exact behavior are a bit foreign to me. > > After running t/t4130-apply-criss-cross-rename.sh, > `git diff -M -B` outputs the following: > > diff --git a/file2 b/file1 > similarity index 100% > rename from file2 > rename to file1 > diff --git a/file1 b/file2 > similarity index 100% > rename from file1 > rename to file2 > > Can you bring a similar example of changes in the output after > the above-mentioned similar principle has been implemented for -B? Try: $ git init $ perl -e 'print "a\n" for (1 .. 1000)' >file $ git add file && git commit -m one $ perl -e 'print "b\n" for (1 .. 1000)' >file Now you can see that -B breaks notes it as a rewrite: $ git diff --stat --summary -B file | 2000 +++++++++++++++++++++++++++---------------------------- 1 files changed, 1000 insertions(+), 1000 deletions(-) rewrite file (100%) And the diff is long: $ git diff -B diff --git a/file b/file dissimilarity index 100% index 5cfafaa..a7b871a 100644 --- a/file +++ b/file @@ -1,1000 +1,1000 @@ -a -a [... x 1000] +b +b [... x 1000] But we could perhaps drop the actual 1000-line hunks (or maybe even just the deletion half). -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) 2011-02-28 12:17 ` Jeff King 2011-02-28 12:23 ` Jeff King @ 2011-02-28 12:42 ` Michael J Gruber 2011-02-28 13:08 ` Jeff King 1 sibling, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 12:42 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git Jeff King venit, vidit, dixit 28.02.2011 13:17: > On Mon, Feb 28, 2011 at 08:31:55AM +0100, Michael J Gruber wrote: > [non symlink stuff snipped] >> Reminds me of my failed attempt to make the diff output for symlinks >> more human-friendly. The latter can be solved with textconv, though. > > I am still carrying around my "symlinks as a special class of diff" > patches if you are interested: > > https://github.com/peff/git/tree/jk/userdiff-symlinks This is marvelous (except for s/perl -pe/sed -e/, of course). Is there anything left to do to get this in pu? (The reference to "previous patch" may need to be more explicit.) Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) 2011-02-28 12:42 ` symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) Michael J Gruber @ 2011-02-28 13:08 ` Jeff King 2011-02-28 15:26 ` [PATCH/WIP] attr: make attributes depend on file type Michael J Gruber 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2011-02-28 13:08 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git On Mon, Feb 28, 2011 at 01:42:06PM +0100, Michael J Gruber wrote: > > I am still carrying around my "symlinks as a special class of diff" > > patches if you are interested: > > > > https://github.com/peff/git/tree/jk/userdiff-symlinks > > This is marvelous (except for s/perl -pe/sed -e/, of course). I used perl because many older seds have trouble with files not ending in newline. > Is there anything left to do to get this in pu? (The reference to > "previous patch" may need to be more explicit.) See this subthread for discussion: http://article.gmane.org/gmane.comp.version-control.git/156760 On the one hand, there were doubts that anybody would actually want to use it (though it looks like you may be the counterexample to that). On the other hand, it may be that the solution does not go far enough, and that .gitattributes should become aware of file types in its matching. That would be more flexible, and it would also fix other broken areas (like the fact that a symlink to foo.pdf might have its symlink text merged as if it were a pdf). -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH/WIP] attr: make attributes depend on file type 2011-02-28 13:08 ` Jeff King @ 2011-02-28 15:26 ` Michael J Gruber 2011-02-28 17:30 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 15:26 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Jeff King This is a whether balloon patch (to check whether this balloons or not, uhm). Hit it hard, not me ;) It's part of the way to revive the old idea of making attributes depend on file type. I am not following the old path pattern is_symlink otherattr because that would mean something which looks like an attribute (is_symlink) but is not. Instead, symlink:pattern attrs is to specify attrs for pattern if it is a symlink. So, e.g., symlink:* diff=symlink together with a nice diff.symlink.textconv displays symlinks nicely. Most notable rough edges: - git_checkattr() gets a new mode parameter. I've stuck it in where I could, 0 anywhere else. This is appropriate in many places (where we know IS_REG a forteriori) but I'm not sure about all of them. - We/someone could teach git_checkattr other prefixes. - It is tested with one (1) symlink and works! - I would bundle it with Jeff's doc and test ;) - This is not a real commit message. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- archive.c | 2 +- attr.c | 5 ++++- attr.h | 2 +- builtin/check-attr.c | 7 ++++++- builtin/pack-objects.c | 2 +- convert.c | 4 ++-- diff.c | 5 ++--- grep.c | 2 +- ll-merge.c | 4 ++-- userdiff.c | 4 ++-- userdiff.h | 2 +- ws.c | 2 +- 12 files changed, 24 insertions(+), 17 deletions(-) diff --git a/archive.c b/archive.c index 1944ed4..2870f04 100644 --- a/archive.c +++ b/archive.c @@ -124,7 +124,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base, path_without_prefix = path.buf + args->baselen; setup_archive_check(check); - if (!git_checkattr(path_without_prefix, ARRAY_SIZE(check), check)) { + if (!git_checkattr(path_without_prefix, ARRAY_SIZE(check), check, mode)) { if (ATTR_TRUE(check[0].value)) return 0; convert = ATTR_TRUE(check[1].value); diff --git a/attr.c b/attr.c index 6aff695..c4481ac 100644 --- a/attr.c +++ b/attr.c @@ -708,7 +708,7 @@ static int macroexpand_one(int attr_nr, int rem) return rem; } -int git_checkattr(const char *path, int num, struct git_attr_check *check) +int git_checkattr(const char *path, int num, struct git_attr_check *check, unsigned short mode) { struct attr_stack *stk; const char *cp; @@ -718,6 +718,9 @@ int git_checkattr(const char *path, int num, struct git_attr_check *check) for (i = 0; i < attr_nr; i++) check_all_attr[i].value = ATTR__UNKNOWN; + if (S_ISLNK(mode)) + path = prefix_filename("symlink:", strlen("symlink:"), path); + /* else if (IS_WHATEVER(mode)) do_what_you_feel_like */ pathlen = strlen(path); cp = strrchr(path, '/'); if (!cp) diff --git a/attr.h b/attr.h index 8b3f19b..1af0bd4 100644 --- a/attr.h +++ b/attr.h @@ -29,7 +29,7 @@ struct git_attr_check { const char *value; }; -int git_checkattr(const char *path, int, struct git_attr_check *); +int git_checkattr(const char *path, int, struct git_attr_check *, unsigned short mode); enum git_attr_direction { GIT_ATTR_CHECKIN, diff --git a/builtin/check-attr.c b/builtin/check-attr.c index 3016d29..30be2c2 100644 --- a/builtin/check-attr.c +++ b/builtin/check-attr.c @@ -24,7 +24,12 @@ static void check_attr(int cnt, struct git_attr_check *check, const char** name, const char *file) { int j; - if (git_checkattr(file, cnt, check)) + unsigned int mode = 0; + struct stat st; + + if (!lstat(file, &st)) + mode = st.st_mode; + if (git_checkattr(file, cnt, check, mode)) die("git_checkattr died"); for (j = 0; j < cnt; j++) { const char *value = check[j].value; diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index b0503b2..4e9ff68 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -633,7 +633,7 @@ static int no_try_delta(const char *path) struct git_attr_check check[1]; setup_delta_attr_check(check); - if (git_checkattr(path, ARRAY_SIZE(check), check)) + if (git_checkattr(path, ARRAY_SIZE(check), check, 0)) return 0; if (ATTR_FALSE(check->value)) return 1; diff --git a/convert.c b/convert.c index d5aebed..d5c2b9c 100644 --- a/convert.c +++ b/convert.c @@ -736,7 +736,7 @@ int convert_to_git(const char *path, const char *src, size_t len, const char *filter = NULL; setup_convert_check(check); - if (!git_checkattr(path, ARRAY_SIZE(check), check)) { + if (!git_checkattr(path, ARRAY_SIZE(check), check, 0)) { struct convert_driver *drv; action = git_path_check_crlf(path, check + 4); if (action == CRLF_GUESS) @@ -773,7 +773,7 @@ static int convert_to_working_tree_internal(const char *path, const char *src, const char *filter = NULL; setup_convert_check(check); - if (!git_checkattr(path, ARRAY_SIZE(check), check)) { + if (!git_checkattr(path, ARRAY_SIZE(check), check, 0)) { struct convert_driver *drv; action = git_path_check_crlf(path, check + 4); if (action == CRLF_GUESS) diff --git a/diff.c b/diff.c index 6640857..22b5d94 100644 --- a/diff.c +++ b/diff.c @@ -1784,8 +1784,7 @@ static void diff_filespec_load_driver(struct diff_filespec *one) if (one->driver) return; - if (S_ISREG(one->mode)) - one->driver = userdiff_find_by_path(one->path); + one->driver = userdiff_find_by_path(one->path, one->mode); /* Fallback to default settings */ if (!one->driver) @@ -2683,7 +2682,7 @@ static void run_diff_cmd(const char *pgm, if (!DIFF_OPT_TST(o, ALLOW_EXTERNAL)) pgm = NULL; else { - struct userdiff_driver *drv = userdiff_find_by_path(attr_path); + struct userdiff_driver *drv = userdiff_find_by_path(attr_path, two->mode); if (drv && drv->external) pgm = drv->external; } diff --git a/grep.c b/grep.c index 63c4280..05a77fc 100644 --- a/grep.c +++ b/grep.c @@ -877,7 +877,7 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name, memset(&xecfg, 0, sizeof(xecfg)); if (opt->funcname && !opt->unmatch_name_only && !opt->status_only && !opt->name_only && !binary_match_only && !collect_hits) { - struct userdiff_driver *drv = userdiff_find_by_path(name); + struct userdiff_driver *drv = userdiff_find_by_path(name, 0); if (drv && drv->funcname.pattern) { const struct userdiff_funcname *pe = &drv->funcname; xdiff_set_find_func(&xecfg, pe->pattern, pe->cflags); diff --git a/ll-merge.c b/ll-merge.c index 6ce512e..3f351e3 100644 --- a/ll-merge.c +++ b/ll-merge.c @@ -330,7 +330,7 @@ static int git_path_check_merge(const char *path, struct git_attr_check check[2] check[0].attr = git_attr("merge"); check[1].attr = git_attr("conflict-marker-size"); } - return git_checkattr(path, 2, check); + return git_checkattr(path, 2, check, 0); } static void normalize_file(mmfile_t *mm, const char *path) @@ -387,7 +387,7 @@ int ll_merge_marker_size(const char *path) if (!check.attr) check.attr = git_attr("conflict-marker-size"); - if (!git_checkattr(path, 1, &check) && check.value) { + if (!git_checkattr(path, 1, &check, 0) && check.value) { marker_size = atoi(check.value); if (marker_size <= 0) marker_size = DEFAULT_CONFLICT_MARKER_SIZE; diff --git a/userdiff.c b/userdiff.c index 1ff4797..e5d0adf 100644 --- a/userdiff.c +++ b/userdiff.c @@ -245,7 +245,7 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) { return userdiff_find_by_namelen(name, len); } -struct userdiff_driver *userdiff_find_by_path(const char *path) +struct userdiff_driver *userdiff_find_by_path(const char *path, unsigned short mode) { static struct git_attr *attr; struct git_attr_check check; @@ -256,7 +256,7 @@ struct userdiff_driver *userdiff_find_by_path(const char *path) if (!path) return NULL; - if (git_checkattr(path, 1, &check)) + if (git_checkattr(path, 1, &check, mode)) return NULL; if (ATTR_TRUE(check.value)) diff --git a/userdiff.h b/userdiff.h index 942d594..e516f1d 100644 --- a/userdiff.h +++ b/userdiff.h @@ -21,6 +21,6 @@ struct userdiff_driver { int userdiff_config(const char *k, const char *v); struct userdiff_driver *userdiff_find_by_name(const char *name); -struct userdiff_driver *userdiff_find_by_path(const char *path); +struct userdiff_driver *userdiff_find_by_path(const char *path, unsigned short mode); #endif /* USERDIFF */ diff --git a/ws.c b/ws.c index 9fb9b14..51929d3 100644 --- a/ws.c +++ b/ws.c @@ -88,7 +88,7 @@ unsigned whitespace_rule(const char *pathname) struct git_attr_check attr_whitespace_rule; setup_whitespace_attr_check(&attr_whitespace_rule); - if (!git_checkattr(pathname, 1, &attr_whitespace_rule)) { + if (!git_checkattr(pathname, 1, &attr_whitespace_rule, 0)) { const char *value; value = attr_whitespace_rule.value; -- 1.7.4.1.257.gb09fa ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH/WIP] attr: make attributes depend on file type 2011-02-28 15:26 ` [PATCH/WIP] attr: make attributes depend on file type Michael J Gruber @ 2011-02-28 17:30 ` Jeff King 2011-02-28 17:48 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2011-02-28 17:30 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Junio C Hamano On Mon, Feb 28, 2011 at 04:26:18PM +0100, Michael J Gruber wrote: > This is a whether balloon patch (to check whether this balloons or not, > uhm). Hit it hard, not me ;) Ouch. > It's part of the way to revive the old idea of making attributes depend > on file type. I am not following the old path > > pattern is_symlink otherattr > > because that would mean something which looks like an attribute > (is_symlink) but is not. Instead, > > symlink:pattern attrs > > is to specify attrs for pattern if it is a symlink. So, e.g., This is way better than what I proposed. From the user's perspective, it is visually clearer that the symlink bit is part of the selector, and not an attribute. And it syntactically disallows nonsense like: pattern is_symlink is_gitlink otherattr The only downside is that it is technically a regression if somebody was using gitattributes for the bizarrely named file "symlink:". It seems pretty unlikely, but possibly we should be carving out a syntactic namespace like: ^[a-z]+: or even: ^[a-z]+(=[^:]*)?: And that would later allow stuff like "submodule:" if people wanted to attach specific bits to submodules (though perhaps it is not necessary, because we have the entire separate .gitmodules file). I don't know if there are other non-name elements people would be interested in selecting on. So maybe that is over-engineering. We also need some way of quoting. Duy has a 1.8.0 proposal to handle this. If we're going to make a syntactic change to gitattributes for this, it should probably be related (since the quoting mechanism is the way you would fix it if you _are_ affected), and should probably follow the same migration mechanism (it looks like there is talk of a "# feature: foo bar" line). See: http://article.gmane.org/gmane.comp.version-control.git/165970 I didn't test, but do we now assume that a pattern like "foo.* diff=bar" will only match when foo is a regular file? I think that would fix stuff like "*.pdf merge=pdf" when there is a symlink named "file.pdf", which I'm pretty sure is currently broken (but did not test). OTOH, that is yet another behavior change if somebody had something like "foo diff=symlink", which we would be breaking. So perhaps we will need many "# feature" markers to make this right. :) -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH/WIP] attr: make attributes depend on file type 2011-02-28 17:30 ` Jeff King @ 2011-02-28 17:48 ` Junio C Hamano 2011-03-01 7:46 ` Michael J Gruber 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2011-02-28 17:48 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git Jeff King <peff@peff.net> writes: > The only downside is that it is technically a regression if somebody was > using gitattributes for the bizarrely named file "symlink:". It seems > pretty unlikely, but possibly we should be carving out a syntactic > namespace like: > > ^[a-z]+: > > or even: > > ^[a-z]+(=[^:]*)?: Or even '^:[a-z]+(=[^,=])?(,[a-z]+(=[^,=])?)*:' to (1) always have some special character at the beginning, to limit the extent of the damage to existing funny pathnames (i.e. to collide your pathname must begin with such a special character). This also has a nice side effect of making it clear that something special is going on; and (2) allow more than one such special on the line, comma-separated. But that is just a small bikeshed. Other than that, I like what Michael and you are aiming for (I am only commenting on the general direction at this moment, as I haven't looked at the patch at all yet). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH/WIP] attr: make attributes depend on file type 2011-02-28 17:48 ` Junio C Hamano @ 2011-03-01 7:46 ` Michael J Gruber 0 siblings, 0 replies; 32+ messages in thread From: Michael J Gruber @ 2011-03-01 7:46 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git Junio C Hamano venit, vidit, dixit 28.02.2011 18:48: > Jeff King <peff@peff.net> writes: > >> The only downside is that it is technically a regression if somebody was >> using gitattributes for the bizarrely named file "symlink:". It seems >> pretty unlikely, but possibly we should be carving out a syntactic >> namespace like: >> >> ^[a-z]+: >> >> or even: >> >> ^[a-z]+(=[^:]*)?: > > Or even '^:[a-z]+(=[^,=])?(,[a-z]+(=[^,=])?)*:' to > > (1) always have some special character at the beginning, to limit the > extent of the damage to existing funny pathnames (i.e. to collide > your pathname must begin with such a special character). This also > has a nice side effect of making it clear that something special is > going on; and > > (2) allow more than one such special on the line, comma-separated. > > But that is just a small bikeshed. Other than that, I like what Michael > and you are aiming for (I am only commenting on the general direction at > this moment, as I haven't looked at the patch at all yet). Thanks, Junio, and Jeff, for the helpful comments. I isolated the naming convention (now "symlink:") to a single spot in attr.c so that changing it (then ":symlink:") is always simple while this develops. (I was afraid that the low likelikehood that someone's file names would collide could be too much already.) Having a standard pattern applied to regular files only is an intended side-effect; more precisely: to non-symlinks right now. We could tighten that to IS_REG, of course, and easily add others (or :other:, :nonreg:). What I really would need help with is checking all call sites of git_checkattr(), which gained a new mode parameter: I used the proper mode where available (from diff_filespec, e.g., i.e.: taking mode from index or tree) or by looking it up in the fs (using lstat() in builtin/check-attr.c), and I used mode=0 in other cases, assuming that this conveys IS_REG and that we're sure those call sites are about regular files or blobs. I tried to avoid unmotivated lstat(). Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-27 23:07 ` Junio C Hamano 2011-02-28 7:31 ` Michael J Gruber @ 2011-02-28 10:45 ` Mart Sõmermaa 2011-02-28 16:10 ` Michael J Gruber 1 sibling, 1 reply; 32+ messages in thread From: Mart Sõmermaa @ 2011-02-28 10:45 UTC (permalink / raw) To: git Why not just make it reversible then? $ git diff -M diff --git a/foo b/bar similarity index 100% rename from foo rename to bar is nonreversible without git already (i.e. does not work with plain patch AFAIK). Adding $ git diff -D diff --git a/foo b/foo deleted file mode 100644 delete foo would be neither less nor more reversible -- it would also only work with git apply (assuming that apply is amended accordingly). Cognitive burdens aside, the "delete foo" output is both more explicit and shorter anyway. Occam's razor FTW :)! Junio, I won't push this further, so it's a final call -- if you give -1, then let -D fall to the colourless abyss of oblivion :), if +1 or +0, I'll see if I can come up with a patch. Best regards, MS On Mon, Feb 28, 2011 at 1:07 AM, Junio C Hamano <gitster@pobox.com> wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> Michael J Gruber <git@drmicha.warpmail.net> writes: >> >>> Wasn't the pager invented for sifting through output which has to be >>> several pages, but not not for that which could be more concise? ;) >>> >>> In fact, -D would be quite analogous to -M and -C in that respect. >> >> There is a big difference: -M and -C lets your recipient reproduce the >> state using the change you are trying to convey with the diff output in >> either direction (iow, "apply -R" works), but your "-D" would not have >> that property. > > Having said that we have always valued "reversibility" and a casual -D is > not in line with that principle, I don't have a strong objection if the > new mode of operation is marked clearly as "nonusable if you are trying to > produce appliable diff (iow, don't send such a patch to mailing list--it > is for viewing purposes only)", treating it just like the --color-words > and the --stat options (there isn't even need to mark these as unusable > for that purpose, as people with common sense would be able to guess). > > If we were to do this, it probably is a good idea to apply that for a > typechange patch (the one that is produced when a symlink turns into a > regular file and vice versa) as well. It also might make sense to apply > the similar principle to shorten the output with -B when a rewrite patch > is expressed as a single hunk patch that removes everything old and then > adds everthing new. > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: git diff: add option for omitting the contents of deletes 2011-02-28 10:45 ` git diff: add option for omitting the contents of deletes Mart Sõmermaa @ 2011-02-28 16:10 ` Michael J Gruber 0 siblings, 0 replies; 32+ messages in thread From: Michael J Gruber @ 2011-02-28 16:10 UTC (permalink / raw) To: Mart Sõmermaa; +Cc: git, Junio C Hamano, Jeff King Mart Sõmermaa venit, vidit, dixit 28.02.2011 11:45: > Why not just make it reversible then? > > $ git diff -M > diff --git a/foo b/bar > similarity index 100% > rename from foo > rename to bar > > is nonreversible without git already (i.e. does not work with plain > patch AFAIK). That "i.e." is a misunderstanding: I'm sure Junio meant that "git apply -R" should make sense on a path like that. You can easily undo (reverse) a rename without knowing the file contents; undeleting a file is more difficult :) > Adding > > $ git diff -D > diff --git a/foo b/foo > deleted file mode 100644 > delete foo > > would be neither less nor more reversible -- it would also only work > with git apply > (assuming that apply is amended accordingly). To reverse it, we would need the sah1 lines, and the repo would need that blob. > Cognitive burdens aside, the "delete foo" output is both > more explicit and shorter anyway. Occam's razor FTW :)! Yes. > Junio, I won't push this further, so it's a final call -- if you give -1, > then let -D fall to the colourless abyss of oblivion :), if +1 or +0, > I'll see if I can come up with a patch. You can always submit a patch, and you can never know what happens ;) > Best regards, > MS Uhm, and please don't do that top post thing and the cc culling. Readding J&J. [quote snipped] ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2011-03-08 21:31 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-26 13:16 git diff: add option for omitting the contents of deletes Mart Sõmermaa 2011-02-26 20:11 ` Junio C Hamano 2011-02-27 14:41 ` Michael J Gruber 2011-02-27 22:33 ` Mart Sõmermaa 2011-02-28 9:58 ` Michael J Gruber 2011-02-28 10:51 ` Mart Sõmermaa 2011-02-27 22:54 ` Junio C Hamano 2011-02-27 23:07 ` Junio C Hamano 2011-02-28 7:31 ` Michael J Gruber 2011-02-28 12:17 ` Jeff King 2011-02-28 12:23 ` Jeff King 2011-02-28 12:32 ` Michael J Gruber 2011-02-28 12:59 ` Jeff King 2011-02-28 13:05 ` Michael J Gruber 2011-02-28 21:54 ` Jeff King 2011-02-28 18:11 ` Junio C Hamano 2011-02-28 22:23 ` Jeff King 2011-02-28 23:28 ` Junio C Hamano 2011-03-01 0:11 ` Junio C Hamano 2011-03-07 20:38 ` Mart Sõmermaa 2011-03-08 7:14 ` Michael J Gruber 2011-03-08 19:49 ` Junio C Hamano 2011-03-08 21:25 ` Mart Sõmermaa 2011-03-08 21:31 ` Jeff King 2011-02-28 12:42 ` symling diff driver (Was: Re: git diff: add option for omitting the contents of deletes) Michael J Gruber 2011-02-28 13:08 ` Jeff King 2011-02-28 15:26 ` [PATCH/WIP] attr: make attributes depend on file type Michael J Gruber 2011-02-28 17:30 ` Jeff King 2011-02-28 17:48 ` Junio C Hamano 2011-03-01 7:46 ` Michael J Gruber 2011-02-28 10:45 ` git diff: add option for omitting the contents of deletes Mart Sõmermaa 2011-02-28 16:10 ` Michael J Gruber
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).