All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH v4 0/4] count-objects improvements
Date: Wed, 13 Feb 2013 16:13:15 +0700	[thread overview]
Message-ID: <1360746799-3668-1-git-send-email-pclouds@gmail.com> (raw)
In-Reply-To: <7va9r9igze.fsf@alter.siamese.dyndns.org>

On Wed, Feb 13, 2013 at 12:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> +/* A hook for count-objects to report invalid files in pack directory */
>> +extern void (*report_garbage)(const char *desc, const char *path, int len, const char *name);
>
> We may want to document the strange way the last three parameters
> are used somewhere.  e.g.
>
>         shows "path" (if "name" is NULL), or prepends "path" in
>         front of name (otherwise); only for the latter, "path" can
>         be a string that is not NUL-terminated but its length
>         specified with "len" and in that case a slash is inserted
>         between the path and the "name".
>
> When described clearly, it sounds somewhat ugly and incoherent API,
> even though it covers the immediate need X-<.

One of the reasons why I did not export it explicitly. Changed it to

void (*report_garbage)(const char *desc, const char *path);

and pushed the ugly part back to callers.

> How about doing it something along this line, perhaps?
>
>         int i;
>         int beginning_of_this_name = -1;
>         int seen_bits = 0; /* 01 for .idx, 02 for .pack */
>         for (i = 0; i < list->nr; i++) {
>                 if (beginning_of_this_name < 0)
>                         beginning_of_this_name = i;
>                 else if (list->items[i] and list->items[beginning_of_this_name]
>                          share the same basename)
>                         ; /* keep scanning */
>                 else {
>                         /* one name ended at (i-1) */
>                         if (seen_bits == 3)
>                                 ; /* both .idx and .pack exist; good */
>                         else
>                                 report_garbage_for_one_name(list, beginning_of_this_name, i,
>                                                 seen_bits);
>                         seen_bits = 0;
>                         beginning_of_this_name = i;
>                 }
>                 if (list->items[i] is ".idx")
>                         seen_bits |= 1;
>                 if (list->items[i] is ".pack")
>                         seen_bits |= 2;
>
>         }
>         if (0 <= beginning_of_this_name && seen_bits != 3)
>                 report_garbages_for_one_name(list, beginning_of_this_name, list->nr, seen_bits);
>
> with a helper function report_garbage_for_one_name() that would look like this:
>
>         report_garbage_for_one_name(...) {
>                 int j;
>                 const char *msg;
>                 switch (seen_bits) {
>                 case 0: msg = "no corresponding .idx nor .pack"; break;
>                 case 1: msg = "no corresponding .pack"; break;
>                 case 2: msg = "no corresponding .idx; break;
>                 }
>                 for (j = beginning_of_this_name; j < i; j++)
>                         report_garbage(msg, list->items[j]);
>         }
>
> For the above to work, prepare_packed_git_one() needs to retain only the
> paths with known extensions in garbage list. "pack-deadbeef.unk" can and
> should be reported as a garbage immediately when it is seen without being
> placed in the list.

Yup. Looks good.

>> +             } else if (has_extension(de->d_name, ".idx")) {
>> +                     struct string_list_item *item;
>> +                     int n = strlen(path) - 4;
>> +                     item = string_list_append_nodup(&garbage,
>> +                                                     xstrndup(path, n));
>> +                     item->util = ".idx";
>> +                     continue;
>> +             } else
>> +                     report_garbage("garbage found", path, 0, NULL);
>
> Hmm, where is a ".keep" file handled in this flow?

Apparently I smoked/drank while coding or something. .idx is supposed
to be .keep. This calls for a test to guard my code (part of this v4).

> The structure of the if/else cascade is much nicer than the earlier
> iterations, but wouldn't it be even more clear to do this?
>
>         if (is .idx file) {
>                 ... do that .idx thing ...
>         }
>
>         if (!report_garbage)
>                 continue; /* it does not matter what the file is */
>
>         if (is .pack) {
>                 ... remember that we saw this .pack ...
>         } else if (is .idx) {
>                 ... remember that we saw this .idx ...
>         } else if (is .keep) {
>                 ... remember that we saw this .keep ...
>         } else {
>                 ... all else --- report as garbage immediately ...
>         }

Done. 2/4 is updated to make sure the "if (is .idx file)" block does
not shortcut the loop with "continue;" so that we always get .idx
file in the end of the loop.

Nguyễn Thái Ngọc Duy (4):
  git-count-objects.txt: describe each line in -v output
  sha1_file: reorder code in prepare_packed_git_one()
  count-objects: report garbage files in pack directory too
  count-objects: report how much disk space taken by garbage files

 Documentation/git-count-objects.txt |  22 ++++++--
 builtin/count-objects.c             |  30 ++++++++---
 cache.h                             |   3 ++
 sha1_file.c                         | 101 +++++++++++++++++++++++++++++++-----
 t/t5304-prune.sh                    |  26 ++++++++++
 5 files changed, 156 insertions(+), 26 deletions(-)

-- 
1.8.1.2.536.gf441e6d

  reply	other threads:[~2013-02-13  9:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-12  9:27 [PATCH v3 0/4] count-objects improvements Nguyễn Thái Ngọc Duy
2013-02-12  9:27 ` [PATCH v3 1/4] git-count-objects.txt: describe each line in -v output Nguyễn Thái Ngọc Duy
2013-02-12  9:27 ` [PATCH v3 2/4] sha1_file: reorder code in prepare_packed_git_one() Nguyễn Thái Ngọc Duy
2013-02-12  9:27 ` [PATCH v3 3/4] count-objects: report garbage files in pack directory too Nguyễn Thái Ngọc Duy
2013-02-12 17:23   ` Junio C Hamano
2013-02-13  9:13     ` Nguyễn Thái Ngọc Duy [this message]
2013-02-13  9:13       ` [PATCH v4 1/4] git-count-objects.txt: describe each line in -v output Nguyễn Thái Ngọc Duy
2013-02-13  9:13       ` [PATCH v4 2/4] sha1_file: reorder code in prepare_packed_git_one() Nguyễn Thái Ngọc Duy
2013-02-13  9:13       ` [PATCH v4 3/4] count-objects: report garbage files in pack directory too Nguyễn Thái Ngọc Duy
2013-02-13 15:55         ` Junio C Hamano
2013-02-13 16:54           ` Junio C Hamano
2013-02-14  9:24           ` Duy Nguyen
2013-02-14 17:02             ` Junio C Hamano
2013-02-15 12:07           ` [PATCH v4+ " Nguyễn Thái Ngọc Duy
2013-02-15 23:20             ` Junio C Hamano
2013-02-13  9:13       ` [PATCH v4 4/4] count-objects: report how much disk space taken by garbage files Nguyễn Thái Ngọc Duy
2013-02-12  9:27 ` [PATCH v3 " Nguyễn Thái Ngọc Duy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360746799-3668-1-git-send-email-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.