* [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch @ 2019-03-15 16:09 Ævar Arnfjörð Bjarmason 2019-03-15 19:00 ` Eric Sunshine 0 siblings, 1 reply; 5+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2019-03-15 16:09 UTC (permalink / raw) To: Git Mailing list; +Cc: Johannes Schindelin, Eric Sunshine I just submittted https://public-inbox.org/git/20190315155959.12390-1-avarab@gmail.com/ and for 8/8 had to tweak the creation factor to 80% due to having added a large comment. Maybe something like the below makes more sense for format-patch? Also, the "Algorithm" section of git-range-diff describes how we'll try to find a percentage similarity in the *diff*, but in this case I'm fairly sure that e.g. a creation factor of 50 would do if it also considered the commit message. Maybe I'm just wrong and it does that already, but assuming I'm right in my reading and it doesn't, was that ever considered? That should result in fewer false "not the same patch" positives, but maybe if patches are split up it'll screw things up in other ways. diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index 1af85d404f5..67a4881a20f 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -261,6 +261,10 @@ material (this may change in the future). between the previous and current series of patches by adjusting the creation/deletion cost fudge factor. See linkgit:git-range-diff[1]) for details. ++ +Defaults to 90, whereas the linkgit:git-range-diff[1] default is +60. It's assumed that you're submitting a new patch series & that we +should try harder than normal to find similarities. --notes[=<ref>]:: Append the notes (see linkgit:git-notes[1]) for the commit diff --git a/builtin/log.c b/builtin/log.c index ab859f59041..ff340130826 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -1843,7 +1843,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) } if (creation_factor < 0) - creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT; + creation_factor = RANGE_DIFF_CREATION_FACTOR_FORMAT_PATCH_DEFAULT; else if (!rdiff_prev) die(_("--creation-factor requires --range-diff")); diff --git a/range-diff.h b/range-diff.h index 08a50b6e98f..634112396d3 100644 --- a/range-diff.h +++ b/range-diff.h @@ -4,6 +4,7 @@ #include "diff.h" #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60 +#define RANGE_DIFF_CREATION_FACTOR_FORMAT_PATCH_DEFAULT 90 /* * Compare series of commmits in RANGE1 and RANGE2, and emit to the ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch 2019-03-15 16:09 [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch Ævar Arnfjörð Bjarmason @ 2019-03-15 19:00 ` Eric Sunshine 2019-03-15 19:21 ` Ævar Arnfjörð Bjarmason 2019-03-18 5:23 ` Junio C Hamano 0 siblings, 2 replies; 5+ messages in thread From: Eric Sunshine @ 2019-03-15 19:00 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Git Mailing list, Johannes Schindelin On Fri, Mar 15, 2019 at 12:09 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt > @@ -261,6 +261,10 @@ material (this may change in the future). > +Defaults to 90, whereas the linkgit:git-range-diff[1] default is > +60. It's assumed that you're submitting a new patch series & that we > +should try harder than normal to find similarities. My understanding was that the primary use-case of git-range-diff itself (which existed before the --range-diff option was added to git-format-patch) was to generate a "range diff" for a cover letter of a re-rolled series. So, I'm confused about why this tweaks the default value of one command but not the other. > diff --git a/range-diff.h b/range-diff.h > @@ -4,6 +4,7 @@ > #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60 > +#define RANGE_DIFF_CREATION_FACTOR_FORMAT_PATCH_DEFAULT 90 The point of introducing RANGE_DIFF_CREATION_FACTOR_DEFAULT in the first place was to ensure that the default creation-factor didn't get out-of-sync between git-range-diff and git-format-patch., Thus, introducing this sort of inconsistency between the two would likely lead to confusion on the part of users. After all, --range-diff was added to git-format-patch merely as a convenience over having to run git-range-diff separately and copy/pasting its output into a cover letter generated by git-format-patch. If the two programs default to different values, then that "convenience equality" breaks down. So, I'm fairly negative on this change. However, that doesn't mean I would oppose tweaking the value shared between the two programs (and also the default used by GitGitGadget, if it specifies it manually), though I defer to Dscho in that regard. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch 2019-03-15 19:00 ` Eric Sunshine @ 2019-03-15 19:21 ` Ævar Arnfjörð Bjarmason 2019-03-21 11:22 ` Johannes Schindelin 2019-03-18 5:23 ` Junio C Hamano 1 sibling, 1 reply; 5+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2019-03-15 19:21 UTC (permalink / raw) To: Eric Sunshine; +Cc: Git Mailing list, Johannes Schindelin On Fri, Mar 15 2019, Eric Sunshine wrote: > On Fri, Mar 15, 2019 at 12:09 PM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: >> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt >> @@ -261,6 +261,10 @@ material (this may change in the future). >> +Defaults to 90, whereas the linkgit:git-range-diff[1] default is >> +60. It's assumed that you're submitting a new patch series & that we >> +should try harder than normal to find similarities. > > My understanding was that the primary use-case of git-range-diff > itself (which existed before the --range-diff option was added to > git-format-patch) was to generate a "range diff" for a cover letter of > a re-rolled series. So, I'm confused about why this tweaks the default > value of one command but not the other. > >> diff --git a/range-diff.h b/range-diff.h >> @@ -4,6 +4,7 @@ >> #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60 >> +#define RANGE_DIFF_CREATION_FACTOR_FORMAT_PATCH_DEFAULT 90 > > The point of introducing RANGE_DIFF_CREATION_FACTOR_DEFAULT in the > first place was to ensure that the default creation-factor didn't get > out-of-sync between git-range-diff and git-format-patch., Thus, > introducing this sort of inconsistency between the two would likely > lead to confusion on the part of users. After all, --range-diff was > added to git-format-patch merely as a convenience over having to run > git-range-diff separately and copy/pasting its output into a cover > letter generated by git-format-patch. If the two programs default to > different values, then that "convenience equality" breaks down. > > So, I'm fairly negative on this change. However, that doesn't mean I > would oppose tweaking the value shared between the two programs (and > also the default used by GitGitGadget, if it specifies it manually), > though I defer to Dscho in that regard. I think that was the intention initially, but at least I'm now using range-diff as a general comparison tool of different non-ff-branches, e.g. the force-pushes to "pu". I'd expect the tool in general to be used like that, whereas with format-patch it's safe to say we're dealing with a re-roll of the same thing. Hence the hypothesis that for format-patch we can be more aggressive about finding similarities. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch 2019-03-15 19:21 ` Ævar Arnfjörð Bjarmason @ 2019-03-21 11:22 ` Johannes Schindelin 0 siblings, 0 replies; 5+ messages in thread From: Johannes Schindelin @ 2019-03-21 11:22 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Eric Sunshine, Git Mailing list [-- Attachment #1: Type: text/plain, Size: 3579 bytes --] Hi Ævar and Eric, On Fri, 15 Mar 2019, Ævar Arnfjörð Bjarmason wrote: > > On Fri, Mar 15 2019, Eric Sunshine wrote: > > > On Fri, Mar 15, 2019 at 12:09 PM Ævar Arnfjörð Bjarmason > > <avarab@gmail.com> wrote: > >> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt > >> @@ -261,6 +261,10 @@ material (this may change in the future). > >> +Defaults to 90, whereas the linkgit:git-range-diff[1] default is > >> +60. It's assumed that you're submitting a new patch series & that we > >> +should try harder than normal to find similarities. > > > > My understanding was that the primary use-case of git-range-diff > > itself (which existed before the --range-diff option was added to > > git-format-patch) was to generate a "range diff" for a cover letter of > > a re-rolled series. So, I'm confused about why this tweaks the default > > value of one command but not the other. > > > >> diff --git a/range-diff.h b/range-diff.h > >> @@ -4,6 +4,7 @@ > >> #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60 > >> +#define RANGE_DIFF_CREATION_FACTOR_FORMAT_PATCH_DEFAULT 90 > > > > The point of introducing RANGE_DIFF_CREATION_FACTOR_DEFAULT in the > > first place was to ensure that the default creation-factor didn't get > > out-of-sync between git-range-diff and git-format-patch., Thus, > > introducing this sort of inconsistency between the two would likely > > lead to confusion on the part of users. After all, --range-diff was > > added to git-format-patch merely as a convenience over having to run > > git-range-diff separately and copy/pasting its output into a cover > > letter generated by git-format-patch. If the two programs default to > > different values, then that "convenience equality" breaks down. > > > > So, I'm fairly negative on this change. However, that doesn't mean I > > would oppose tweaking the value shared between the two programs (and > > also the default used by GitGitGadget, if it specifies it manually), > > though I defer to Dscho in that regard. GitGitGadget does not specify the creation factor manually. > I think that was the intention initially, but at least I'm now using > range-diff as a general comparison tool of different non-ff-branches, > e.g. the force-pushes to "pu". Interesting ;-) > I'd expect the tool in general to be used like that, whereas with > format-patch it's safe to say we're dealing with a re-roll of the same > thing. > > Hence the hypothesis that for format-patch we can be more aggressive > about finding similarities. I do agree that `format-patch`'s `--range-diff` is certainly exclusively used for comparing different iterations of the same patch series. As such, I do agree with Ævar that it makes sense to have a *different* default for the creation factor. Having said that, I did notice while porting `tbdiff` to C that it would be a neat idea to put more weight behind the differences of the commit message, and maybe even use a *different* measure on the commit message, too. Personally, I would try to use a variation of the word diff (maybe something that reflects my experience that it is common to change the case in the oneline, to add a sentence here and there, or to delete a sentence, but not so much to replace entire sentences), to account for the fact that the commit message rarely changes substantially between iterations. So I guess there is a lot of room for improvement here: code (read: diff) changes simply are not the same as commit message changes. Ciao, Dscho ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch 2019-03-15 19:00 ` Eric Sunshine 2019-03-15 19:21 ` Ævar Arnfjörð Bjarmason @ 2019-03-18 5:23 ` Junio C Hamano 1 sibling, 0 replies; 5+ messages in thread From: Junio C Hamano @ 2019-03-18 5:23 UTC (permalink / raw) To: Eric Sunshine Cc: Ævar Arnfjörð Bjarmason, Git Mailing list, Johannes Schindelin Eric Sunshine <sunshine@sunshineco.com> writes: > So, I'm fairly negative on this change. However, that doesn't mean I > would oppose tweaking the value shared between the two programs (and > also the default used by GitGitGadget, if it specifies it manually), > though I defer to Dscho in that regard. I do not think of a good reason why the command should behave differently only when run from inside format-patch, and if we were changing anything, perhaps raising it a bit for all may make sense. I have yet to see range-diff getting confused and matching wrong patches but often seee it being too conservative to match two iterations of the same patch after a moderate amount of update. I find myself passing "--creation-factor 99" or some absurdly looking value when accepting a rerolled series. The most recent was the updated 'clean-up merge message with scissors' from Denton [*1*]. [Footnote] *1* <6716d28a5187c44c1d90f5ce840c44441f62352c.1552275703.git.liu.denton@gmail.com> and <08426189b5a29b376581eb0172e52222ab22387a.1552817044.git.liu.denton@gmail.com> do not line up without a high creation factor. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-03-21 11:22 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-03-15 16:09 [WIP PATCH/RFC] Use a higher range-diff --creation-factor for format-patch Ævar Arnfjörð Bjarmason 2019-03-15 19:00 ` Eric Sunshine 2019-03-15 19:21 ` Ævar Arnfjörð Bjarmason 2019-03-21 11:22 ` Johannes Schindelin 2019-03-18 5:23 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).