All of lore.kernel.org
 help / color / mirror / Atom feed
From: karthik nayak <karthik.188@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, sunshine@sunshineco.com,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v8 1/4] sha1_file.c: support reading from a loose object of unknown type
Date: Sat, 18 Apr 2015 00:15:28 +0530	[thread overview]
Message-ID: <553154C8.5090001@gmail.com> (raw)
In-Reply-To: <20150417142310.GA12479@peff.net>



On 04/17/2015 07:53 PM, Jeff King wrote:
> On Wed, Apr 15, 2015 at 06:18:24PM -0400, Jeff King wrote:
>
> >> This _might_ have some performance impact in that strbuf_addch()
> >> involves strbuf_grow(*, 1), which does "does it overflow to
> >> increment sb->len by one?"; I would say it should be unmeasurable
> >> because the function is expected to be used only on loose objects
> >> and you shouldn't have very many of them without packing in your
> >> repository in the first place.
> >>
> >> I guess Peff's c1822d4f (strbuf: add an optimized 1-character
> >> strbuf_grow, 2015-04-04) may want to teach strbuf_addch() to use his
> >> new strbuf_grow_ch(), and once that happens the performance worry
> >> would disappear without this code to be changed at all.
> >
> > I haven't re-rolled that series yet, but the discussion there showed
> > that strbuf_grow_ch() is unnecessary; call-sites can just check:
> >
> >    if (!strbuf_avail(sb))
> >     strbuf_grow(sb, 1);
> >
> > to get the fast inline check. Since we go to the trouble to inline
> > strbuf_addch, we should probably teach it the same trick. It would be
> > nice to identify a place with a tight strbuf_addch() loop that could
> > demonstrate the speed increase, though.
>
> Just for reference, I did teach strbuf_addch this trick. And the config
> code is the tight loop to test it with. Results are here:
>
>    http://article.gmane.org/gmane.comp.version-control.git/267266
>
> In this code path, we are typically only seeing short strings (the
> original code used a 10-byte static buffer). So I doubt it makes all
> that much difference.
>
> If there _is_ a performance implication to worry about here, I think it
> would be that we are doing an extra malloc/free. I'm not sure I
> understand why we are copying it at all. The original code copied from
> the hdr into type[10] so that we could NUL-terminate it, which was
> required for type_from_string().
>
> But now we use type_from_string_gently, which can accept a length[1]. So
> we could just count the bytes to the first space and pass the original
> buffer along with that length, no?

Yes, we could, that would eliminate  "struct strbuf typename =
STRBUF_INIT".

Something like this perhaps :

@@ -1614,27 +1642,40 @@ static void *unpack_sha1_rest(git_zstream
*stream, void *buffer, unsigned long s
    * too permissive for what we want to check. So do an anal
    * object header parse by hand.
    */
-int parse_sha1_header(const char *hdr, unsigned long *sizep)
+static int parse_sha1_header_extended(const char *hdr, struct
object_info *oi,
+                              unsigned int flags)
   {
-       char type[10];
-       int i;
+       const char *buf = hdr;
          unsigned long size;
+       int type, len = 0;

          /*
-        * The type can be at most ten bytes (including the
-        * terminating '\0' that we add), and is followed by
+        * The type can be of any size but is followed by
           * a space.
           */
-       i = 0;
          for (;;) {
                  char c = *hdr++;
                  if (c == ' ')
                          break;
-               type[i++] = c;
-               if (i >= sizeof(type))
-                       return -1;
+               len++;
          }
-       type[i] = 0;
+
+       type = type_from_string_gently(buf, len, 1);
+       if (oi->typename) {
+               strbuf_add(oi->typename, buf, len);
+               strbuf_addch(oi->typename, '\0');
+       }
+       /*
+        * Set type to 0 if its an unknown object and
+        * we're obtaining the type using '--literally'
+        * option.
+        */
+       if ((flags & LOOKUP_LITERALLY) && (type == -1))
+               type = 0;
+       else if (type == -1)
+               die("invalid object type");
+       if (oi->typep)
+               *oi->typep = type;

          /*
           * The length must follow immediately, and be in canonical

>
> -Peff
>
> [1] Not related to your patch, but it looks like type_from_string_gently
>      is overly lax. It does a strncmp() with the length of the candidate
>      name, but does not check that we consumed all of the matching name.
>      So "tr" would match "tree", "comm" would match "commit", and so
>      forth.
>
Thanks for the patch re-roll on strbuf_addch() and the patch on
type_from_string_gently().

  parent reply	other threads:[~2015-04-17 18:45 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-15 16:55 [PATCH v8 0/4] cat-file: teach cat-file a '--literally' option karthik nayak
2015-04-15 16:59 ` [PATCH v8 1/4] sha1_file.c: support reading from a loose object of unknown type Karthik Nayak
2015-04-15 20:21   ` Junio C Hamano
2015-04-15 22:18     ` Jeff King
2015-04-17 14:23       ` Jeff King
2015-04-17 16:21         ` Junio C Hamano
2015-04-17 20:51           ` Jeff King
2015-04-17 21:10             ` Junio C Hamano
2015-04-20 18:43               ` karthik nayak
2015-04-20 18:51                 ` Jeff King
2015-04-21 11:26                   ` karthik nayak
2015-04-21 14:24                     ` Jeff King
2015-04-17 18:45         ` karthik nayak [this message]
2015-04-17 18:49           ` Jeff King
2015-04-18  8:31             ` karthik nayak
2015-04-17 19:23           ` Junio C Hamano
2015-04-18  8:32             ` karthik nayak
2015-04-17 23:31   ` Eric Sunshine
2015-04-18  9:03     ` karthik nayak
2015-04-15 16:59 ` [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option Karthik Nayak
2015-04-15 20:20   ` Junio C Hamano
2015-04-15 20:52     ` Junio C Hamano
2015-04-16  7:26       ` karthik nayak
2015-04-16 13:35         ` Junio C Hamano
2015-04-17  2:10           ` Karthik Nayak
2015-04-17  2:14             ` Junio C Hamano
2015-04-19  0:28   ` Charles Bailey
2015-04-20  5:30     ` Junio C Hamano
2015-04-20  7:44       ` Charles Bailey
2015-04-20  8:57         ` Karthik Nayak
2015-04-20  9:19           ` Charles Bailey
2015-04-20 15:52             ` karthik nayak
2015-04-21 10:16               ` Charles Bailey
2015-04-21 19:40                 ` Eric Sunshine
2015-04-21 20:36                   ` Junio C Hamano
2015-04-25 11:22                     ` karthik nayak
2015-04-25 17:04                       ` Junio C Hamano
2015-04-27 11:57                         ` karthik nayak
2015-04-27 18:38                           ` Eric Sunshine
2015-04-28 12:03                             ` karthik nayak
2015-04-15 17:00 ` [PATCH v8 3/4] cat-file: add documentation for " Karthik Nayak
2015-04-15 17:00 ` [PATCH v8 4/4] t1006: add tests for git cat-file --literally Karthik Nayak
2015-04-18  0:00   ` Eric Sunshine
2015-04-18  5:22     ` karthik nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=553154C8.5090001@gmail.com \
    --to=karthik.188@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.