From: Erik Faye-Lund <kusmabite@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ziade.tarek@gmail.com
Subject: Re: [PATCH v3] help: always suggest common-cmds if prefix of cmd
Date: Mon, 29 Nov 2010 12:20:35 +0100 [thread overview]
Message-ID: <AANLkTin34AfYnFY5e9B1cuyckfLXU2=qXFciFaaNGt9f@mail.gmail.com> (raw)
In-Reply-To: <7voc9bpqj2.fsf@alter.siamese.dyndns.org>
Sorry for the late reply, I've been out sick.
On Sat, Nov 27, 2010 at 1:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Erik Faye-Lund <kusmabite@gmail.com> writes:
>
>> @@ -320,9 +321,16 @@ const char *help_unknown_cmd(const char *cmd)
>> uniq(&main_cmds);
>>
>> /* This reuses cmdname->len for similarity index */
>> - for (i = 0; i < main_cmds.cnt; ++i)
>> - main_cmds.names[i]->len =
>> + for (i = 0; i < main_cmds.cnt; ++i) {
>> + main_cmds.names[i]->len = 1 +
>> levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4);
>> + for (n = 0; n < ARRAY_SIZE(common_cmds); ++n) {
>> + if (!strcmp(main_cmds.names[i]->name,
>> + common_cmds[n].name) &&
>> + !prefixcmp(main_cmds.names[i]->name, cmd))
>> + main_cmds.names[i]->len = 0;
>> + }
>> + }
>
> This is an error codepath so performance would not matter much, but this
> is doing it in an unnecessarily slow way, no? At this point, both arrays
> are sorted the same way, so we should be able to walk common_cmds[]
> alongside the main_cmds.names[] (see below).
>
I like it, thanks!
>> + if (n < main_cmds.cnt) {
>> + best_similarity = main_cmds.names[n++]->len;
>> + while (n < main_cmds.cnt &&
>> + best_similarity == main_cmds.names[n]->len)
>> + ++n;
>> + } else
>> + best_similarity = 0;
>
> Think about what does this case _means_... The end user input was so
> ambiguous that it prefix matched all the common commands! Is it really
> similar enough?
>
> Note that most of the time main_cmds[] has more than what common_cmds[]
> has, and because prefix match is done only against common_cmds[],
> "everything is a prefix-match" never happens. You might want to mark it
> as a BUG(), but someday we may change the rules to give 0 to non common
> commands with prefix match under some condition, so thinking these rare
> corner cases through would defend ourselves from future gotchas.
>
> How about doing it this way instead? Isn't it more readable?
>
Yes, this is better. But:
> diff --git a/help.c b/help.c
> index 7f4928e..7654f1b 100644
> --- a/help.c
> +++ b/help.c
> @@ -3,6 +3,7 @@
> #include "exec_cmd.h"
> #include "levenshtein.h"
> #include "help.h"
> +#include "common-cmds.h"
>
> /* most GUI terminals set COLUMNS (although some don't export it) */
> static int term_columns(void)
> @@ -298,7 +299,8 @@ static void add_cmd_list(struct cmdnames *cmds, struct cmdnames *old)
> }
>
> /* An empirically derived magic number */
> -#define SIMILAR_ENOUGH(x) ((x) < 6)
> +#define SIMILARITY_FLOOR 7
> +#define SIMILAR_ENOUGH(x) ((x) < SIMILARITY_FLOOR)
>
> const char *help_unknown_cmd(const char *cmd)
> {
> @@ -319,10 +321,28 @@ const char *help_unknown_cmd(const char *cmd)
> sizeof(main_cmds.names), cmdname_compare);
> uniq(&main_cmds);
>
> - /* This reuses cmdname->len for similarity index */
> - for (i = 0; i < main_cmds.cnt; ++i)
> + /* This abuses cmdname->len for levenshtein distance */
> + for (i = 0, n = 0; i < main_cmds.cnt; i++) {
> + int cmp = 0; /* avoid compiler stupidity */
> + const char *candidate = main_cmds.names[i]->name;
> +
> + /* Does the candidate appear in common_cmds list? */
> + while (n < ARRAY_SIZE(common_cmds) &&
> + (cmp = strcmp(common_cmds[n].name, candidate)) < 0)
> + n++;
> + if ((n < ARRAY_SIZE(common_cmds)) && !cmp) {
> + /* Yes, this is one of the common commands */
> + n++; /* use the entry from common_cmds[] */
> + if (!prefixcmp(candidate, cmd)) {
> + /* Give prefix match a very good score */
> + main_cmds.names[i]->len = 0;
> + continue;
> + }
> + }
> +
> main_cmds.names[i]->len =
> - levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4);
> + levenshtein(cmd, candidate, 0, 2, 1, 4) + 1;
> + }
>
> qsort(main_cmds.names, main_cmds.cnt,
> sizeof(*main_cmds.names), levenshtein_compare);
> @@ -330,10 +350,21 @@ const char *help_unknown_cmd(const char *cmd)
> if (!main_cmds.cnt)
> die ("Uh oh. Your system reports no Git commands at all.");
>
> - best_similarity = main_cmds.names[0]->len;
> - n = 1;
> - while (n < main_cmds.cnt && best_similarity == main_cmds.names[n]->len)
> - ++n;
> + /* skip and count prefix matches */
> + for (n = 0; n < main_cmds.cnt && !main_cmds.names[n]->len; n++)
> + ; /* still counting */
> +
> + if (main_cmds.cnt <= n) {
> + /* prefix matches with everything? that is too ambiguous */
> + best_similarity = SIMILARITY_FLOOR + 1;
For this code-path to trigger we would have to be able to prefix-match
every common command AND every "main command" must be included in
common commands. At the same time. The only possible way to
prefix-match all commands is if they all start with the same letter.
Do you really think this is a situation we could ever end up in? Every
git command being a common-command, starting with the same letter?
This is basically unreachable code. Perhaps it'd be even clearer just to die:
if (main_cmds.cnt <= n)
die("Prefix-matched everyting, what's going on?");
> + } else {
> + /* count all the most similar ones */
> + for (best_similarity = main_cmds.names[n++]->len;
> + (n < main_cmds.cnt &&
> + best_similarity == main_cmds.names[n]->len);
> + n++)
> + ; /* still counting */
> + }
> if (autocorrect && n == 1 && SIMILAR_ENOUGH(best_similarity)) {
> const char *assumed = main_cmds.names[0]->name;
> main_cmds.names[0] = NULL;
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2010-11-29 11:21 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-11-23 12:23 bug: unexpected output for "git st" + suggestion Tarek Ziadé
2010-11-23 12:40 ` Nguyen Thai Ngoc Duy
2010-11-23 12:49 ` Tarek Ziadé
2010-11-23 13:08 ` Nguyen Thai Ngoc Duy
2010-11-23 13:18 ` Tarek Ziadé
2010-11-23 13:26 ` Nguyen Thai Ngoc Duy
2010-11-23 20:38 ` Andreas Schwab
2010-11-23 12:43 ` Sylvain Rabot
2010-11-23 13:22 ` Erik Faye-Lund
2010-11-23 13:47 ` Tarek Ziadé
2010-11-23 13:56 ` Erik Faye-Lund
2010-11-23 13:58 ` Tarek Ziadé
2010-11-23 19:11 ` [PATCH] help: always suggest common-cmds if prefix of cmd Erik Faye-Lund
2010-11-24 19:49 ` Junio C Hamano
2010-11-24 20:20 ` Erik Faye-Lund
2010-11-24 23:53 ` Erik Faye-Lund
2010-11-25 4:49 ` Junio C Hamano
2010-11-25 10:39 ` Erik Faye-Lund
2010-11-26 16:00 ` [PATCH v3] " Erik Faye-Lund
2010-11-27 0:18 ` Junio C Hamano
2010-11-29 11:20 ` Erik Faye-Lund [this message]
2010-11-29 16:40 ` Jonathan Nieder
2010-11-29 16:53 ` Erik Faye-Lund
2010-11-29 17:44 ` Junio C Hamano
2010-12-01 14:33 ` Erik Faye-Lund
2018-11-19 20:35 ` help.autoCorrect prefix selection considered a bit dangerous Ævar Arnfjörð Bjarmason
2018-11-20 3:23 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='AANLkTin34AfYnFY5e9B1cuyckfLXU2=qXFciFaaNGt9f@mail.gmail.com' \
--to=kusmabite@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=ziade.tarek@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).