git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3)
@ 2012-05-27 16:02 Matthieu Moy
  2012-05-29 18:25 ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Matthieu Moy @ 2012-05-27 16:02 UTC (permalink / raw)
  To: git, gitster; +Cc: Matthieu Moy

Before this patch, a character deletion has the same cost as 2 swaps, or
4 additions, so Git prefers suggesting a completely scrambled command
name to removing a character. For example, "git tags" suggests "stage",
but not "tag".

By setting the deletion cost to 3, we keep it higher than swaps or
additions, but prefer 1 deletion to 2 swaps. "git tags" now suggests
"tag" in addition to staged.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
The RFC sent earlier [1] didn't receive negative comments, so I think this
is a good change.

http://thread.gmane.org/gmane.comp.version-control.git/196457

 help.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/help.c b/help.c
index 14eefc9..fdaa90d 100644
--- a/help.c
+++ b/help.c
@@ -334,7 +334,7 @@ const char *help_unknown_cmd(const char *cmd)
 		}
 
 		main_cmds.names[i]->len =
-			levenshtein(cmd, candidate, 0, 2, 1, 4) + 1;
+			levenshtein(cmd, candidate, 0, 2, 1, 3) + 1;
 	}
 
 	qsort(main_cmds.names, main_cmds.cnt,
-- 
1.7.10.363.g7fcd3d.dirty

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3)
  2012-05-27 16:02 [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3) Matthieu Moy
@ 2012-05-29 18:25 ` Junio C Hamano
  2012-05-29 18:58   ` Matthieu Moy
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2012-05-29 18:25 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git

Matthieu Moy <Matthieu.Moy@imag.fr> writes:

> Before this patch, a character deletion has the same cost as 2 swaps, or
> 4 additions, so Git prefers suggesting a completely scrambled command
> name to removing a character. For example, "git tags" suggests "stage",
> but not "tag".
>
> By setting the deletion cost to 3, we keep it higher than swaps or
> additions, but prefer 1 deletion to 2 swaps. "git tags" now suggests
> "tag" in addition to staged.
>
> Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
> ---
> The RFC sent earlier [1] didn't receive negative comments, so I think this
> is a good change.
>
> http://thread.gmane.org/gmane.comp.version-control.git/196457

Lack of objections is never a good reason to assume it is a good
change.  In this particular case, I think the reason why you saw
no comments, either positive or negative, was because nobody came up
with a more "scientific" way to judge the weighting.  Choosing
between "tag" and "stage" given "tags" is just one datapoint, but
does not convince anybody there are not surprising combinations
where this change affects negatively.

I wonder if we can mechanically find a set of parameters that
optimally separates the built-in commands by computing N^2 distances
(e.g. compute "tag" and all other command names and record
minimum. Do so for all other commands. Now fudge the parameters and
repeat to see if it results in better minimum separation.  Something
like that).

Having said all that, until somebody comes up with a better method
of judging, I'd say that the best thing we could do is to apply this
patch and see if anybody finds a "surprising" case where this leads
to a regression.

Thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3)
  2012-05-29 18:25 ` Junio C Hamano
@ 2012-05-29 18:58   ` Matthieu Moy
  0 siblings, 0 replies; 3+ messages in thread
From: Matthieu Moy @ 2012-05-29 18:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Lack of objections is never a good reason to assume it is a good
> change.

The other reason I forgot to mention is "I've lived with this patch and
my fat fingers for a few weeks, and didn't find weird cases". But my
fingers are not _that_ fat, and not necessarily representative ;-).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-05-29 18:59 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-27 16:02 [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3) Matthieu Moy
2012-05-29 18:25 ` Junio C Hamano
2012-05-29 18:58   ` Matthieu Moy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).