* [PATCHv1bis 0/2] git apply: cope with whitespace differences @ 2009-07-02 17:48 Giuseppe Bilotta 2009-07-02 17:48 ` [PATCHv1bis 1/2] git apply: option to ignore " Giuseppe Bilotta 0 siblings, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 17:48 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Robert Fitzsimons, Giuseppe Bilotta Sometimes I do some very stupid things, as was the case with the first version of this sequence. This new version still doesn't include Junio's suggestion to move whitespace fixing to before the check, because I'm not convinced it should be the case. Giuseppe Bilotta (2): git apply: option to ignore whitespace differences git apply: preserve original whitespace with --ignore-whitespace builtin-apply.c | 90 +++++++++++++++++++++++++++++-- contrib/completion/git-completion.bash | 2 + git-am.sh | 3 +- git-rebase.sh | 3 + t/t4107-apply-ignore-whitespace.sh | 74 ++++++++++++++++++++++++++ 5 files changed, 165 insertions(+), 7 deletions(-) create mode 100755 t/t4107-apply-ignore-whitespace.sh ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 17:48 [PATCHv1bis 0/2] git apply: cope with whitespace differences Giuseppe Bilotta @ 2009-07-02 17:48 ` Giuseppe Bilotta 2009-07-02 17:48 ` [PATCHv1bis 2/2] git apply: preserve original whitespace with --ignore-whitespace Giuseppe Bilotta 2009-07-02 18:27 ` [PATCHv1bis 1/2] git apply: option to ignore whitespace differences Junio C Hamano 0 siblings, 2 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 17:48 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Robert Fitzsimons, Giuseppe Bilotta Introduce --ignore-whitespace option to ignore whitespace differences while applying the patch. The patch is based on Robert Fitzsimons' work http://permalink.gmane.org/gmane.comp.version-control.git/7876 and includes his test case. 'git am' and 'git rebase' are made aware of this option and pass it through to 'git apply', and so is the bash git completion. --- builtin-apply.c | 60 ++++++++++++++++++++++++-- contrib/completion/git-completion.bash | 2 + git-am.sh | 3 +- git-rebase.sh | 3 + t/t4107-apply-ignore-whitespace.sh | 74 ++++++++++++++++++++++++++++++++ 5 files changed, 137 insertions(+), 5 deletions(-) create mode 100755 t/t4107-apply-ignore-whitespace.sh diff --git a/builtin-apply.c b/builtin-apply.c index dc0ff5e..70cc985 100644 --- a/builtin-apply.c +++ b/builtin-apply.c @@ -39,6 +39,7 @@ static int diffstat; static int numstat; static int summary; static int check; +static int ignore_whitespace; static int apply = 1; static int apply_in_reverse; static int apply_with_reject; @@ -214,6 +215,49 @@ static uint32_t hash_line(const char *cp, size_t len) return h; } +/* + * Compare two memory areas ignoring whitespace differences + */ +static int memcmp_ignore_whitespace(const char *s1, size_t n1, const char *s2, size_t n2) +{ + const char *stop1 = s1 + n1; + const char *stop2 = s2 + n2; + int result; + + if (!(n1 | n2)) + return 0; + + do { + if (isspace(*s1) && isspace(*s2)) { + while (isspace(*s1)) { + s1++; + } + while (isspace(*s2)) + s2++; + } + /* Check here instead of in the while because + the whitespace discarding might have moved us + past the end */ + if ((s1 >= stop1) || (s2 >= stop2)) + break; + result = *s1++ - *s2++; + } while (!result); + + return result; +} + +/* + * Returns true if the given lines (buffer + len) match + * according to the ignore_whitespace setting + */ +static int lines_match(const char *s1, size_t n1, const char *s2, size_t n2) +{ + if (ignore_whitespace) + return !memcmp_ignore_whitespace(s1, n1, s2, n2); + else + return (n1 == n2) && !memcmp(s1, s2, n1); +} + static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag) { ALLOC_GROW(img->line_allocated, img->nr + 1, img->alloc); @@ -1665,6 +1709,7 @@ static int match_fragment(struct image *img, { int i; char *fixed_buf, *buf, *orig, *target; + size_t img_len = 0; if (preimage->nr + try_lno > img->nr) return 0; @@ -1676,9 +1721,11 @@ static int match_fragment(struct image *img, return 0; /* Quick hash check */ - for (i = 0; i < preimage->nr; i++) + for (i = 0; i < preimage->nr; i++) { + img_len += img->line[try_lno + i].len; if (preimage->line[i].hash != img->line[try_lno + i].hash) return 0; + } /* * Do we have an exact match? If we were told to match @@ -1690,10 +1737,11 @@ static int match_fragment(struct image *img, if ((match_end ? (try + preimage->len == img->len) : (try + preimage->len <= img->len)) && - !memcmp(img->buf + try, preimage->buf, preimage->len)) + lines_match(img->buf + try, img_len, + preimage->buf, preimage->len)) return 1; - if (ws_error_action != correct_ws_error) + if (!ignore_whitespace && (ws_error_action != correct_ws_error)) return 0; /* @@ -1731,8 +1779,10 @@ static int match_fragment(struct image *img, * In either case, we are fixing the whitespace breakages * so we might as well take the fix together with their * real change. + * If we are ignoring whitespace differences, don't check + * for length equality. */ - match = (tgtfixlen == fixlen && !memcmp(tgtfix, buf, fixlen)); + match = lines_match(buf, fixlen, tgtfix, tgtfixlen); if (tgtfix != tgtfixbuf) free(tgtfix); @@ -3304,6 +3354,8 @@ int cmd_apply(int argc, const char **argv, const char *unused_prefix) { OPTION_CALLBACK, 0, "whitespace", &whitespace_option, "action", "detect new or modified lines that have whitespace errors", 0, option_parse_whitespace }, + OPT_BOOLEAN(0, "ignore-whitespace", &ignore_whitespace, + "ignore whitespace differences when finding context"), OPT_BOOLEAN('R', "reverse", &apply_in_reverse, "apply the patch in reverse"), OPT_BOOLEAN(0, "unidiff-zero", &unidiff_zero, diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index ddb71e2..d3415b5 100755 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -663,6 +663,7 @@ _git_am () --*) __gitcomp " --3way --committer-date-is-author-date --ignore-date + --ignore-whitespace --interactive --keep --no-utf8 --signoff --utf8 --whitespace= " @@ -684,6 +685,7 @@ _git_apply () --stat --numstat --summary --check --index --cached --index-info --reverse --reject --unidiff-zero --apply --no-add --exclude= + --ignore-whitespace --whitespace= --inaccurate-eof --verbose " return diff --git a/git-am.sh b/git-am.sh index d64d997..fe024b1 100755 --- a/git-am.sh +++ b/git-am.sh @@ -16,6 +16,7 @@ s,signoff add a Signed-off-by line to the commit message u,utf8 recode into utf8 (default) k,keep pass -k flag to git-mailinfo whitespace= pass it through git-apply +ignore-whitespace pass it through git-apply directory= pass it through git-apply C= pass it through git-apply p= pass it through git-apply @@ -303,7 +304,7 @@ do git_apply_opt="$git_apply_opt $(sq "$1$2")"; shift ;; --patch-format) shift ; patch_format="$1" ;; - --reject) + --reject|--ignore-whitespace) git_apply_opt="$git_apply_opt $1" ;; --committer-date-is-author-date) committer_date_is_author_date=t ;; diff --git a/git-rebase.sh b/git-rebase.sh index 18bc694..d741752 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -333,6 +333,9 @@ do ;; esac ;; + --ignore-whitespace) + git_am_opt="$git_am_opt $1" + ;; --committer-date-is-author-date|--ignore-date) git_am_opt="$git_am_opt $1" force_rebase=t diff --git a/t/t4107-apply-ignore-whitespace.sh b/t/t4107-apply-ignore-whitespace.sh new file mode 100755 index 0000000..d31e0f3 --- /dev/null +++ b/t/t4107-apply-ignore-whitespace.sh @@ -0,0 +1,74 @@ +#!/bin/sh +# +# Copyright (c) 2005 Junio C Hamano +# Copyright (c) 2005 Robert Fitzsimons +# + +test_description='git-apply --ignore-whitespace. + +' +. ./test-lib.sh + +# setup + +cat > patch1.patch <<\EOF +diff --git a/main.c b/main.c +new file mode 100644 +--- /dev/null ++++ b/main.c +@@ -0,0 +1,23 @@ ++#include <stdio.h> ++ ++void print_int(int num); ++int func(int num); ++ ++int main() { ++ int i; ++ ++ for (i = 0; i < 10; i++) { ++ print_int(func(i)); ++ } ++ ++ return 0; ++} ++ ++int func(int num) { ++ return num * num; ++} ++ ++void print_int(int num) { ++ printf("%d", num); ++} ++ +EOF +cat > patch2.patch <<\EOF +diff --git a/main.c b/main.c +--- a/main.c ++++ b/main.c +@@ -10,6 +10,8 @@ + print_int(func(i)); + } + ++ printf("\n"); ++ + return 0; + } + +EOF + +test_expect_success "S = patch1" \ + 'git-apply patch1.patch' + +test_expect_failure "F = patch2" \ + 'git-apply patch2.patch' + +test_expect_success "S = patch2 (--ignore-whitespace)" \ + 'git-apply --ignore-whitespace patch2.patch' + +rm -f main.c +test_expect_success "S = patch1 (--ignore-whitespace)" \ + 'git-apply --ignore-whitespace patch1.patch' + +test_done + + -- 1.6.3.3.511.g0ded0.dirty ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCHv1bis 2/2] git apply: preserve original whitespace with --ignore-whitespace 2009-07-02 17:48 ` [PATCHv1bis 1/2] git apply: option to ignore " Giuseppe Bilotta @ 2009-07-02 17:48 ` Giuseppe Bilotta 2009-07-02 18:27 ` [PATCHv1bis 1/2] git apply: option to ignore whitespace differences Junio C Hamano 1 sibling, 0 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 17:48 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Robert Fitzsimons, Giuseppe Bilotta --- builtin-apply.c | 32 +++++++++++++++++++++++++++++--- 1 files changed, 29 insertions(+), 3 deletions(-) diff --git a/builtin-apply.c b/builtin-apply.c index 70cc985..24ab286 100644 --- a/builtin-apply.c +++ b/builtin-apply.c @@ -1644,6 +1644,10 @@ static void update_pre_post_images(struct image *preimage, int i, ctx; char *new, *old, *fixed; struct image fixed_preimage; + /* Do we need more space for the postimage? */ + size_t newlen = postimage->len; + if (len > preimage->len) + newlen += len - preimage->len; /* * Update the preimage with whitespace fixes. Note that we @@ -1658,11 +1662,16 @@ static void update_pre_post_images(struct image *preimage, *preimage = fixed_preimage; /* - * Adjust the common context lines in postimage, in place. + * Adjust the common context lines in postimage, in place + * if we are not ignoring whitespace differences. * This is possible because whitespace fixing does not make * the string grow. */ - new = old = postimage->buf; + old = postimage->buf; + if (ignore_whitespace) + new = postimage->buf = xmalloc(newlen); + else + new = old; fixed = preimage->buf; for (i = ctx = 0; i < postimage->nr; i++) { size_t len = postimage->line[i].len; @@ -1738,8 +1747,25 @@ static int match_fragment(struct image *img, ? (try + preimage->len == img->len) : (try + preimage->len <= img->len)) && lines_match(img->buf + try, img_len, - preimage->buf, preimage->len)) + preimage->buf, preimage->len)) { + if (ignore_whitespace) { + /* + * Replace the preimage whitespace with the original one + */ + size_t newlen = 0; + for (i = 0; i < preimage->nr; i++) { + newlen += preimage->line[i].len = + img->line[try_lno + i].len; + } + fixed_buf = xmalloc(newlen); + memcpy(fixed_buf, img->buf + try, newlen); + + update_pre_post_images( + preimage, postimage, + fixed_buf, newlen); + } return 1; + } if (!ignore_whitespace && (ws_error_action != correct_ws_error)) return 0; -- 1.6.3.3.511.g0ded0.dirty ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 17:48 ` [PATCHv1bis 1/2] git apply: option to ignore " Giuseppe Bilotta 2009-07-02 17:48 ` [PATCHv1bis 2/2] git apply: preserve original whitespace with --ignore-whitespace Giuseppe Bilotta @ 2009-07-02 18:27 ` Junio C Hamano 2009-07-02 19:02 ` Giuseppe Bilotta 1 sibling, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2009-07-02 18:27 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Junio C Hamano, Robert Fitzsimons Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > +static int memcmp_ignore_whitespace(const char *s1, size_t n1, const char *s2, size_t n2) > +{ > + const char *stop1 = s1 + n1; > + const char *stop2 = s2 + n2; > + int result; > + > + if (!(n1 | n2)) > + return 0; > + > + do { > + if (isspace(*s1) && isspace(*s2)) { > + while (isspace(*s1)) { > + s1++; > + } > + while (isspace(*s2)) > + s2++; > + } > + /* Check here instead of in the while because > + the whitespace discarding might have moved us > + past the end */ > + if ((s1 >= stop1) || (s2 >= stop2)) > + break; If s1 is longer than s2 (or vice versa) but one is a prefix of the other, you will return "they match", because previous round compared *s1 and *s2 and found them the same? > +/* > + * Returns true if the given lines (buffer + len) match > + * according to the ignore_whitespace setting > + */ > +static int lines_match(const char *s1, size_t n1, const char *s2, size_t n2) > +{ > + if (ignore_whitespace) > + return !memcmp_ignore_whitespace(s1, n1, s2, n2); > + else > + return (n1 == n2) && !memcmp(s1, s2, n1); > +} > + I think this still is an abstraction at the wrong level. For one thing, if ignore-whitespace is set, you do not even need nor want to do the "fix only the ws breakages we are going to fix anyway according to the ws_rule" transformation applied to the preimage. I think the handling of correct_ws_error in the caller should also be moved here. IOW, shouldn't this function look like this? lines_match() { /* if the user wants fuzz, so be it */ if (ignore_whitespace) return compare_lines_wo_ws(); /* most common case: matches without fuzzing */ if (length matches && !memcmp()) return 1; /* we are done, if we are not correcting */ if (ws_error_action != correct_ws_error) return 0; ... apply configured ws fix to preimage ...; /* does it apply with ws breakage in its context fixed? */ if (length now matches && !memcmp(fixed preimage, postimage) return 1; /* noop, this won't apply */ return 0; } which would make the large-ish loop near the end of match_fragment() to something like: for (i = 0; i < preimage->nr; i++) { ... set up arguments to lines_match ...; match = lines_match(); if (!match) goto unmatch_exit; ... adjust for next iteration ...; } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 18:27 ` [PATCHv1bis 1/2] git apply: option to ignore whitespace differences Junio C Hamano @ 2009-07-02 19:02 ` Giuseppe Bilotta 2009-07-02 19:28 ` Giuseppe Bilotta 0 siblings, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 19:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Robert Fitzsimons On Thu, Jul 2, 2009 at 8:27 PM, Junio C Hamano<gitster@pobox.com> wrote: > Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > >> +static int memcmp_ignore_whitespace(const char *s1, size_t n1, const char *s2, size_t n2) >> +{ >> + const char *stop1 = s1 + n1; >> + const char *stop2 = s2 + n2; >> + int result; >> + >> + if (!(n1 | n2)) >> + return 0; >> + >> + do { >> + if (isspace(*s1) && isspace(*s2)) { >> + while (isspace(*s1)) { >> + s1++; >> + } >> + while (isspace(*s2)) >> + s2++; >> + } >> + /* Check here instead of in the while because >> + the whitespace discarding might have moved us >> + past the end */ >> + if ((s1 >= stop1) || (s2 >= stop2)) >> + break; > > If s1 is longer than s2 (or vice versa) but one is a prefix of the other, > you will return "they match", because previous round compared *s1 and *s2 > and found them the same? Yes. Actually, more specifically, now that I think more about this, that's the reason why my previous code was fine: when we compare the lines in the preimage against the image, we only look for a match which is as long as the preimage, we don't care what ELSE is there in the image. So it makes sense to pass a single length parameter, the preimage length. However, my first version should be fixed so that the order of the parameter becomes significant, and the early bailout is only done if the preimage length has been totally consumed. >> +/* >> + * Returns true if the given lines (buffer + len) match >> + * according to the ignore_whitespace setting >> + */ >> +static int lines_match(const char *s1, size_t n1, const char *s2, size_t n2) >> +{ >> + if (ignore_whitespace) >> + return !memcmp_ignore_whitespace(s1, n1, s2, n2); >> + else >> + return (n1 == n2) && !memcmp(s1, s2, n1); >> +} >> + > > I think this still is an abstraction at the wrong level. For one thing, > if ignore-whitespace is set, you do not even need nor want to do the "fix > only the ws breakages we are going to fix anyway according to the ws_rule" > transformation applied to the preimage. I've thought some more about this, and you are right. We still want to ws fix the postimage, but that's done elsewhere. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 19:02 ` Giuseppe Bilotta @ 2009-07-02 19:28 ` Giuseppe Bilotta 2009-07-02 19:45 ` Junio C Hamano 0 siblings, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 19:28 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Robert Fitzsimons On Thu, Jul 2, 2009 at 9:02 PM, Giuseppe Bilotta<giuseppe.bilotta@gmail.com> wrote: > On Thu, Jul 2, 2009 at 8:27 PM, Junio C Hamano<gitster@pobox.com> wrote: >>> +/* >>> + * Returns true if the given lines (buffer + len) match >>> + * according to the ignore_whitespace setting >>> + */ >>> +static int lines_match(const char *s1, size_t n1, const char *s2, size_t n2) >>> +{ >>> + if (ignore_whitespace) >>> + return !memcmp_ignore_whitespace(s1, n1, s2, n2); >>> + else >>> + return (n1 == n2) && !memcmp(s1, s2, n1); >>> +} >>> + >> >> I think this still is an abstraction at the wrong level. For one thing, >> if ignore-whitespace is set, you do not even need nor want to do the "fix >> only the ws breakages we are going to fix anyway according to the ws_rule" >> transformation applied to the preimage. > > I've thought some more about this, and you are right. We still want to > ws fix the postimage, but that's done elsewhere. Sorry for repying to myself here, but I'm not convinced again. Or to be more specific: I think this kind of refactoring is totally out of the scope of this patch. So although I agree with you in priciple, if you don't mind I'll keep the first two patches simpler and less invasive. I'll look into the refactoring as a third step. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 19:28 ` Giuseppe Bilotta @ 2009-07-02 19:45 ` Junio C Hamano 2009-07-02 20:33 ` Giuseppe Bilotta 0 siblings, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2009-07-02 19:45 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Robert Fitzsimons Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > Sorry for repying to myself here, but I'm not convinced again. Or to > be more specific: I think this kind of refactoring is totally out of > the scope of this patch. So although I agree with you in priciple, if > you don't mind I'll keep the first two patches simpler and less > invasive. I'll look into the refactoring as a third step. I am not interested in applying "this adds a broken ignore-whitespace option, but as long as you do not use it, it does not hurt the carefully crafted apply-with-context-whose-ws-breakage-was-fixed codepath." For example, I am not convinced at all that your patch does not break the update_pre_post_images() nor do I know what text the final pre/postimage will happen to end up with. In other words, I do not see a clear logic in the change. Also about the broken "only prefix matches" loop, I do not understand why you would want to consider this preimage from the patch "this has extra whitespace and other stuff\n" matches the target line "this has extra whitespace\n" only because the prefix matches. For that matter, I do not think I understand why the leading whitespaces of s1 and s2 are skipped only when they both begin with a whitespace, either. I do not want to be looking at this series until it gets into a bit more reviewable shape, which I would expect to take at least a week if you are not working on this full-time (and I presume nobody on the git list is). Please do not Cc me in the meantime, but please do ask for help from other people interested in this topic on the list. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 19:45 ` Junio C Hamano @ 2009-07-02 20:33 ` Giuseppe Bilotta 2009-07-02 21:00 ` Junio C Hamano 2009-07-02 23:55 ` Junio C Hamano 0 siblings, 2 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 20:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Robert Fitzsimons On Thu, Jul 2, 2009 at 9:45 PM, Junio C Hamano<gitster@pobox.com> wrote: > Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > >> Sorry for repying to myself here, but I'm not convinced again. Or to >> be more specific: I think this kind of refactoring is totally out of >> the scope of this patch. So although I agree with you in priciple, if >> you don't mind I'll keep the first two patches simpler and less >> invasive. I'll look into the refactoring as a third step. > > I am not interested in applying "this adds a broken ignore-whitespace > option, but as long as you do not use it, it does not hurt the carefully > crafted apply-with-context-whose-ws-breakage-was-fixed codepath." Perfectly agreed. I'm not asking you to apply the patches as they are, I sent them to the list for review, and CC'ed you since I couldn't identify a git-apply maintainer. > For > example, I am not convinced at all that your patch does not break the > update_pre_post_images() nor do I know what text the final pre/postimage > will happen to end up with. In other words, I do not see a clear logic in > the change. As you yourself pointed out in the previous email, when ignoring whitespace the code wouldn't hit the whitespace fix path at all. Indeed, I've updated the patch to this effect right after reading your previous email. > Also about the broken "only prefix matches" loop, I do not understand why > you would want to consider this preimage from the patch > > "this has extra whitespace and other stuff\n" > > matches the target line > > "this has extra whitespace\n" > > only because the prefix matches. That's totally not what I meant. The other way around though, i.e. a preimage of "this has extra whitespace \n" should match against "this has extra whitespace\nand other stuff\n" And of course "this has extra whitespace \n" should fail against "this has extra whitespace and other stuff\n" Both cases behave correctly (I just extended the test to include this case, btw) A more interesting question would be if a missing EOL at EOF should be treated as whitespace difference or not. > For that matter, I do not think I understand why the leading whitespaces > of s1 and s2 are skipped only when they both begin with a whitespace, > either. Oh, thanks, when first adapting Robert's patch I hadn't considered it missed this case. I've updated the patch accordingly and added a test case for it. > I do not want to be looking at this series until it gets into a bit more > reviewable shape, which I would expect to take at least a week if you are > not working on this full-time (and I presume nobody on the git list is). I do wonder what makes the patch 'unreviewable', since I assume that doesn't mean just 'does not include the refactoring I requested'. > Please do not Cc me in the meantime, but please do ask for help from other > people interested in this topic on the list. I thought it appropriate to cc you for this specific reply since I was addressing the doubts you raised, but I will not include you in the next resend of the series, as per your request. Thanks for your patience anyway. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 20:33 ` Giuseppe Bilotta @ 2009-07-02 21:00 ` Junio C Hamano 2009-07-02 21:05 ` Giuseppe Bilotta 2009-07-02 23:55 ` Junio C Hamano 1 sibling, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2009-07-02 21:00 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Robert Fitzsimons Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > I do wonder what makes the patch 'unreviewable', since I assume that > doesn't mean just 'does not include the refactoring I requested'. Ok, unreviewable may not be the word, and "it is apparent that the patch has not been well thought through" is the word I would want. IOW, it really should have been marked as "[RFC/PATCH]". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 21:00 ` Junio C Hamano @ 2009-07-02 21:05 ` Giuseppe Bilotta 0 siblings, 0 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-02 21:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Robert Fitzsimons On Thu, Jul 2, 2009 at 11:00 PM, Junio C Hamano<gitster@pobox.com> wrote: > Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > >> I do wonder what makes the patch 'unreviewable', since I assume that >> doesn't mean just 'does not include the refactoring I requested'. > > Ok, unreviewable may not be the word, and "it is apparent that the patch > has not been well thought through" is the word I would want. > > IOW, it really should have been marked as "[RFC/PATCH]". Ah yes, I keep forgetting about that. I remembered it on the very first try (actually, I wrote PATCH/RFC), but then I forgot about it in the next rounds. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 20:33 ` Giuseppe Bilotta 2009-07-02 21:00 ` Junio C Hamano @ 2009-07-02 23:55 ` Junio C Hamano 2009-07-03 6:40 ` Giuseppe Bilotta 1 sibling, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2009-07-02 23:55 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Robert Fitzsimons By the way, I think we need to make sure your understanding of how the current code works matches mine before you go any further. Are the words "preimage", "postimage" and "target" used consistently between us? By these words, I mean: preimage = the lines prefixed with '-' and ' ' in the patch postimage = the lines prefixed with ' ' and '+' in the patch target = lines in the file being patched that corresponds to the preimage The point of patch application is to find a block of lines in the target that matches preimage, and replace that block with postimage. When the patch applies cleanly (which is the case we should optimize for), the preimage match the target byte-for-byte. The hunk starting at line 1690 does a memcmp of the whole thing, without ws fuzz, for this reason. You do not want to touch that part with your patch (and that is why I am writing this message to make sure you understand what you are doing). After that, as a fallback, we compare line-by-line, while fixing the whitespace breakage in the preimage (what the patch author based on) and the target (what we currently have). The reason for the loop is because we are interested in two cases: (1) The patch was made against an old code without recent whitespace fix we already have. (2) The patch was made against a code with whitespace fix we do not have yet. In either case, preimage and target won't match byte-for-byte, but by applying the whitespace breakage on each of the preimage line and the corresponding target line, they will match in either of the above cases. While doing this "convert-and-match", we prepare a version of preimage with whitespace breakage fixed to give to update_pre_post_images() at the end of the function in fixed_buf. The contents of fixed_buf is used to update the preimage and the postimage by calling update_pre_post_images(). This is to avoid reverting the whitespace fix we already had in the target when we are in situation (1). The postimage is what replaces the block of lines in the image that matched the preimage, so this step is essential. This is another point I am worried about your patch. Suppose you have this target: a a a b b b c c d e e And we have a broken patch that needs --ignore-whitespace to apply: diff --git a/file b/file index xxxxxx..yyyyyy 100644 @@ -1,4, +1,5 @@ a a a b b b +q c c d Your preimage is "a a a\nb b b\nc c\n d\n", target is "a a a\nb b b\nc c\nd\ne e\n", and postimage is "a a a\nb b b\nq\nc c\n d\n". Wouldn't you want to have this as the result of patch application? a a a b b b q c c d e e With whitespace squashed, the preimage would match the target (perhaps after fixing line_matches()), but wsfix_copy() called while we fix each preimage line won't have changed anything in the fixed_buf that is to become the new preimage, and update_pre_post_images() while copying the fixed preimage to the postimage won't have corrected "a a a" back to "a a a" that was in the target as the result. So I suspect that you would instead end up with: a a a b b b c c d e e I think the intent of --ignore-whitespace is "don't worry about ws differences in the context when locating where to make the change", and it is not "I do not care about getting whitespace mangled anywhere in the file the patch touches." correct_ws_error is special in that we can afford to take the fixed pre/postimage, "because we are fixing the ws breakage anyway", but arguably it _might_ be nicer to limit the change to the lines marked with '-' and '+' in the patch even in that case. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences 2009-07-02 23:55 ` Junio C Hamano @ 2009-07-03 6:40 ` Giuseppe Bilotta 0 siblings, 0 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2009-07-03 6:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Robert Fitzsimons On Fri, Jul 3, 2009 at 1:55 AM, Junio C Hamano<gitster@pobox.com> wrote: > By the way, I think we need to make sure your understanding of how the > current code works matches mine before you go any further. Souns reasonable. > Are the words "preimage", "postimage" and "target" used consistently > between us? By these words, I mean: > > preimage = the lines prefixed with '-' and ' ' in the patch > > postimage = the lines prefixed with ' ' and '+' in the patch > > target = lines in the file being patched that corresponds to the preimage It _did_ take me a little to understand the names when I started working on the feature, but I got on track pretty soon (at the first segfault ;-)). > The point of patch application is to find a block of lines in the target > that matches preimage, and replace that block with postimage. When the > patch applies cleanly (which is the case we should optimize for), the > preimage match the target byte-for-byte. The hunk starting at line 1690 > does a memcmp of the whole thing, without ws fuzz, for this reason. You > do not want to touch that part with your patch (and that is why I am > writing this message to make sure you understand what you are doing). Of course. > After that, as a fallback, we compare line-by-line, while fixing the > whitespace breakage in the preimage (what the patch author based on) and > the target (what we currently have). > [...] preimage and target won't match byte-for-byte, but by > applying the whitespace breakage on each of the preimage line and the > corresponding target line, they will match in either of the above cases. > While doing this "convert-and-match", we prepare a version of preimage > with whitespace breakage fixed to give to update_pre_post_images() at the > end of the function in fixed_buf. Indeed. This is why in my 2/2 patch I do a similar operation to bring the preimage whitespace to match the target whitespace if matching was done ignoring whitespace (but we never got to that part, for obvious reasons). > This is another point I am worried about your patch. Suppose you have this > target: > > a a a > b b b > c c > d > e e > > And we have a broken patch that needs --ignore-whitespace to apply: > > diff --git a/file b/file > index xxxxxx..yyyyyy 100644 > @@ -1,4, +1,5 @@ > a a a > b b b > +q > c c > d > > Your preimage is "a a a\nb b b\nc c\n d\n", > target is "a a a\nb b b\nc c\nd\ne e\n", > and postimage is "a a a\nb b b\nq\nc c\n d\n". > > Wouldn't you want to have this as the result of patch application? > > a a a > b b b > q > c c > d > e e > > With whitespace squashed, the preimage would match the target (perhaps > after fixing line_matches()), but wsfix_copy() called while we fix each > preimage line won't have changed anything in the fixed_buf that is to > become the new preimage, and update_pre_post_images() while copying the > fixed preimage to the postimage won't have corrected "a a a" back to "a a > a" that was in the target as the result. > > So I suspect that you would instead end up with: > > a a a > b b b > c c > d > e e This is indeed the case with my 1/2 patch: no whitespace adjustment is done on the pre- and postimage when the preimage and target match with whitespace fuzz and ignore_whitespace is active. In the first RFC I sent I expressely mentioned that this was not what I liked about my patch. When I first sent a _series_, it was made of two patches, the second of which served the purpose of realigning the whitespaces of the patch (pre and postimage) to the whitespaces of the target (at least for the common lines). > I think the intent of --ignore-whitespace is "don't worry about ws > differences in the context when locating where to make the change", and it > is not "I do not care about getting whitespace mangled anywhere in the > file the patch touches." I totally agree. This is important because it also means that when re-diffing the applied patch you still get changes ONLY in the lines where you SHOULD get changes, and not in the nearby context that only had different whitespace. I sent the thing in two patches to make it easier to review. If you think it's more appropriate to squash them, I can do that no problem. > correct_ws_error is special in that we can > afford to take the fixed pre/postimage, "because we are fixing the ws > breakage anyway", but arguably it _might_ be nicer to limit the change to > the lines marked with '-' and '+' in the patch even in that case. But that's a path we're not going to hit in match_fragment when ignoring whitespace. Instead, one thing we could consider in this case (ignore_whitespace) is to adjust the leading space in the + lines to match the ws transformations done in the context lines, but that might be making whitespace fixing a little too far. Or we should rename it to whitespace=adjust... -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-07-03 6:40 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-07-02 17:48 [PATCHv1bis 0/2] git apply: cope with whitespace differences Giuseppe Bilotta 2009-07-02 17:48 ` [PATCHv1bis 1/2] git apply: option to ignore " Giuseppe Bilotta 2009-07-02 17:48 ` [PATCHv1bis 2/2] git apply: preserve original whitespace with --ignore-whitespace Giuseppe Bilotta 2009-07-02 18:27 ` [PATCHv1bis 1/2] git apply: option to ignore whitespace differences Junio C Hamano 2009-07-02 19:02 ` Giuseppe Bilotta 2009-07-02 19:28 ` Giuseppe Bilotta 2009-07-02 19:45 ` Junio C Hamano 2009-07-02 20:33 ` Giuseppe Bilotta 2009-07-02 21:00 ` Junio C Hamano 2009-07-02 21:05 ` Giuseppe Bilotta 2009-07-02 23:55 ` Junio C Hamano 2009-07-03 6:40 ` Giuseppe Bilotta
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).