git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Erik Faye-Lund <kusmabite@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ziade.tarek@gmail.com
Subject: Re: [PATCH v3] help: always suggest common-cmds if prefix of cmd
Date: Mon, 29 Nov 2010 12:20:35 +0100	[thread overview]
Message-ID: <AANLkTin34AfYnFY5e9B1cuyckfLXU2=qXFciFaaNGt9f@mail.gmail.com> (raw)
In-Reply-To: <7voc9bpqj2.fsf@alter.siamese.dyndns.org>

Sorry for the late reply, I've been out sick.

On Sat, Nov 27, 2010 at 1:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Erik Faye-Lund <kusmabite@gmail.com> writes:
>
>> @@ -320,9 +321,16 @@ const char *help_unknown_cmd(const char *cmd)
>>       uniq(&main_cmds);
>>
>>       /* This reuses cmdname->len for similarity index */
>> -     for (i = 0; i < main_cmds.cnt; ++i)
>> -             main_cmds.names[i]->len =
>> +     for (i = 0; i < main_cmds.cnt; ++i) {
>> +             main_cmds.names[i]->len = 1 +
>>                       levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4);
>> +             for (n = 0; n < ARRAY_SIZE(common_cmds); ++n) {
>> +                     if (!strcmp(main_cmds.names[i]->name,
>> +                         common_cmds[n].name) &&
>> +                         !prefixcmp(main_cmds.names[i]->name, cmd))
>> +                             main_cmds.names[i]->len = 0;
>> +             }
>> +     }
>
> This is an error codepath so performance would not matter much, but this
> is doing it in an unnecessarily slow way, no?  At this point, both arrays
> are sorted the same way, so we should be able to walk common_cmds[]
> alongside the main_cmds.names[] (see below).
>

I like it, thanks!

>> +     if (n < main_cmds.cnt) {
>> +             best_similarity = main_cmds.names[n++]->len;
>> +             while (n < main_cmds.cnt &&
>> +                    best_similarity == main_cmds.names[n]->len)
>> +                     ++n;
>> +     } else
>> +             best_similarity = 0;
>
> Think about what does this case _means_... The end user input was so
> ambiguous that it prefix matched all the common commands!  Is it really
> similar enough?
>
> Note that most of the time main_cmds[] has more than what common_cmds[]
> has, and because prefix match is done only against common_cmds[],
> "everything is a prefix-match" never happens.  You might want to mark it
> as a BUG(), but someday we may change the rules to give 0 to non common
> commands with prefix match under some condition, so thinking these rare
> corner cases through would defend ourselves from future gotchas.
>
> How about doing it this way instead?  Isn't it more readable?
>

Yes, this is better. But:

> diff --git a/help.c b/help.c
> index 7f4928e..7654f1b 100644
> --- a/help.c
> +++ b/help.c
> @@ -3,6 +3,7 @@
>  #include "exec_cmd.h"
>  #include "levenshtein.h"
>  #include "help.h"
> +#include "common-cmds.h"
>
>  /* most GUI terminals set COLUMNS (although some don't export it) */
>  static int term_columns(void)
> @@ -298,7 +299,8 @@ static void add_cmd_list(struct cmdnames *cmds, struct cmdnames *old)
>  }
>
>  /* An empirically derived magic number */
> -#define SIMILAR_ENOUGH(x) ((x) < 6)
> +#define SIMILARITY_FLOOR 7
> +#define SIMILAR_ENOUGH(x) ((x) < SIMILARITY_FLOOR)
>
>  const char *help_unknown_cmd(const char *cmd)
>  {
> @@ -319,10 +321,28 @@ const char *help_unknown_cmd(const char *cmd)
>              sizeof(main_cmds.names), cmdname_compare);
>        uniq(&main_cmds);
>
> -       /* This reuses cmdname->len for similarity index */
> -       for (i = 0; i < main_cmds.cnt; ++i)
> +       /* This abuses cmdname->len for levenshtein distance */
> +       for (i = 0, n = 0; i < main_cmds.cnt; i++) {
> +               int cmp = 0; /* avoid compiler stupidity */
> +               const char *candidate = main_cmds.names[i]->name;
> +
> +               /* Does the candidate appear in common_cmds list? */
> +               while (n < ARRAY_SIZE(common_cmds) &&
> +                      (cmp = strcmp(common_cmds[n].name, candidate)) < 0)
> +                       n++;
> +               if ((n < ARRAY_SIZE(common_cmds)) && !cmp) {
> +                       /* Yes, this is one of the common commands */
> +                       n++; /* use the entry from common_cmds[] */
> +                       if (!prefixcmp(candidate, cmd)) {
> +                               /* Give prefix match a very good score */
> +                               main_cmds.names[i]->len = 0;
> +                               continue;
> +                       }
> +               }
> +
>                main_cmds.names[i]->len =
> -                       levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4);
> +                       levenshtein(cmd, candidate, 0, 2, 1, 4) + 1;
> +       }
>
>        qsort(main_cmds.names, main_cmds.cnt,
>              sizeof(*main_cmds.names), levenshtein_compare);
> @@ -330,10 +350,21 @@ const char *help_unknown_cmd(const char *cmd)
>        if (!main_cmds.cnt)
>                die ("Uh oh. Your system reports no Git commands at all.");
>
> -       best_similarity = main_cmds.names[0]->len;
> -       n = 1;
> -       while (n < main_cmds.cnt && best_similarity == main_cmds.names[n]->len)
> -               ++n;
> +       /* skip and count prefix matches */
> +       for (n = 0; n < main_cmds.cnt && !main_cmds.names[n]->len; n++)
> +               ; /* still counting */
> +
> +       if (main_cmds.cnt <= n) {
> +               /* prefix matches with everything? that is too ambiguous */
> +               best_similarity = SIMILARITY_FLOOR + 1;

For this code-path to trigger we would have to be able to prefix-match
every common command AND every "main command" must be included in
common commands. At the same time. The only possible way to
prefix-match all commands is if they all start with the same letter.
Do you really think this is a situation we could ever end up in? Every
git command being a common-command, starting with the same letter?

This is basically unreachable code. Perhaps it'd be even clearer just to die:

if (main_cmds.cnt <= n)
	die("Prefix-matched everyting, what's going on?");


> +       } else {
> +               /* count all the most similar ones */
> +               for (best_similarity = main_cmds.names[n++]->len;
> +                    (n < main_cmds.cnt &&
> +                     best_similarity == main_cmds.names[n]->len);
> +                    n++)
> +                       ; /* still counting */
> +       }
>        if (autocorrect && n == 1 && SIMILAR_ENOUGH(best_similarity)) {
>                const char *assumed = main_cmds.names[0]->name;
>                main_cmds.names[0] = NULL;
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

  reply	other threads:[~2010-11-29 11:21 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-23 12:23 bug: unexpected output for "git st" + suggestion Tarek Ziadé
2010-11-23 12:40 ` Nguyen Thai Ngoc Duy
2010-11-23 12:49   ` Tarek Ziadé
2010-11-23 13:08     ` Nguyen Thai Ngoc Duy
2010-11-23 13:18       ` Tarek Ziadé
2010-11-23 13:26         ` Nguyen Thai Ngoc Duy
2010-11-23 20:38         ` Andreas Schwab
2010-11-23 12:43 ` Sylvain Rabot
2010-11-23 13:22 ` Erik Faye-Lund
2010-11-23 13:47   ` Tarek Ziadé
2010-11-23 13:56     ` Erik Faye-Lund
2010-11-23 13:58       ` Tarek Ziadé
2010-11-23 19:11         ` [PATCH] help: always suggest common-cmds if prefix of cmd Erik Faye-Lund
2010-11-24 19:49           ` Junio C Hamano
2010-11-24 20:20             ` Erik Faye-Lund
2010-11-24 23:53               ` Erik Faye-Lund
2010-11-25  4:49               ` Junio C Hamano
2010-11-25 10:39                 ` Erik Faye-Lund
2010-11-26 16:00                   ` [PATCH v3] " Erik Faye-Lund
2010-11-27  0:18                     ` Junio C Hamano
2010-11-29 11:20                       ` Erik Faye-Lund [this message]
2010-11-29 16:40                         ` Jonathan Nieder
2010-11-29 16:53                           ` Erik Faye-Lund
2010-11-29 17:44                         ` Junio C Hamano
2010-12-01 14:33                           ` Erik Faye-Lund
2018-11-19 20:35                     ` help.autoCorrect prefix selection considered a bit dangerous Ævar Arnfjörð Bjarmason
2018-11-20  3:23                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='AANLkTin34AfYnFY5e9B1cuyckfLXU2=qXFciFaaNGt9f@mail.gmail.com' \
    --to=kusmabite@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ziade.tarek@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).