git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: karthik nayak <karthik.188@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, sunshine@sunshineco.com,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v8 1/4] sha1_file.c: support reading from a loose object of unknown type
Date: Sat, 18 Apr 2015 00:15:28 +0530	[thread overview]
Message-ID: <553154C8.5090001@gmail.com> (raw)
In-Reply-To: <20150417142310.GA12479@peff.net>



On 04/17/2015 07:53 PM, Jeff King wrote:
> On Wed, Apr 15, 2015 at 06:18:24PM -0400, Jeff King wrote:
>
> >> This _might_ have some performance impact in that strbuf_addch()
> >> involves strbuf_grow(*, 1), which does "does it overflow to
> >> increment sb->len by one?"; I would say it should be unmeasurable
> >> because the function is expected to be used only on loose objects
> >> and you shouldn't have very many of them without packing in your
> >> repository in the first place.
> >>
> >> I guess Peff's c1822d4f (strbuf: add an optimized 1-character
> >> strbuf_grow, 2015-04-04) may want to teach strbuf_addch() to use his
> >> new strbuf_grow_ch(), and once that happens the performance worry
> >> would disappear without this code to be changed at all.
> >
> > I haven't re-rolled that series yet, but the discussion there showed
> > that strbuf_grow_ch() is unnecessary; call-sites can just check:
> >
> >    if (!strbuf_avail(sb))
> >     strbuf_grow(sb, 1);
> >
> > to get the fast inline check. Since we go to the trouble to inline
> > strbuf_addch, we should probably teach it the same trick. It would be
> > nice to identify a place with a tight strbuf_addch() loop that could
> > demonstrate the speed increase, though.
>
> Just for reference, I did teach strbuf_addch this trick. And the config
> code is the tight loop to test it with. Results are here:
>
>    http://article.gmane.org/gmane.comp.version-control.git/267266
>
> In this code path, we are typically only seeing short strings (the
> original code used a 10-byte static buffer). So I doubt it makes all
> that much difference.
>
> If there _is_ a performance implication to worry about here, I think it
> would be that we are doing an extra malloc/free. I'm not sure I
> understand why we are copying it at all. The original code copied from
> the hdr into type[10] so that we could NUL-terminate it, which was
> required for type_from_string().
>
> But now we use type_from_string_gently, which can accept a length[1]. So
> we could just count the bytes to the first space and pass the original
> buffer along with that length, no?

Yes, we could, that would eliminate  "struct strbuf typename =
STRBUF_INIT".

Something like this perhaps :

@@ -1614,27 +1642,40 @@ static void *unpack_sha1_rest(git_zstream
*stream, void *buffer, unsigned long s
    * too permissive for what we want to check. So do an anal
    * object header parse by hand.
    */
-int parse_sha1_header(const char *hdr, unsigned long *sizep)
+static int parse_sha1_header_extended(const char *hdr, struct
object_info *oi,
+                              unsigned int flags)
   {
-       char type[10];
-       int i;
+       const char *buf = hdr;
          unsigned long size;
+       int type, len = 0;

          /*
-        * The type can be at most ten bytes (including the
-        * terminating '\0' that we add), and is followed by
+        * The type can be of any size but is followed by
           * a space.
           */
-       i = 0;
          for (;;) {
                  char c = *hdr++;
                  if (c == ' ')
                          break;
-               type[i++] = c;
-               if (i >= sizeof(type))
-                       return -1;
+               len++;
          }
-       type[i] = 0;
+
+       type = type_from_string_gently(buf, len, 1);
+       if (oi->typename) {
+               strbuf_add(oi->typename, buf, len);
+               strbuf_addch(oi->typename, '\0');
+       }
+       /*
+        * Set type to 0 if its an unknown object and
+        * we're obtaining the type using '--literally'
+        * option.
+        */
+       if ((flags & LOOKUP_LITERALLY) && (type == -1))
+               type = 0;
+       else if (type == -1)
+               die("invalid object type");
+       if (oi->typep)
+               *oi->typep = type;

          /*
           * The length must follow immediately, and be in canonical

>
> -Peff
>
> [1] Not related to your patch, but it looks like type_from_string_gently
>      is overly lax. It does a strncmp() with the length of the candidate
>      name, but does not check that we consumed all of the matching name.
>      So "tr" would match "tree", "comm" would match "commit", and so
>      forth.
>
Thanks for the patch re-roll on strbuf_addch() and the patch on
type_from_string_gently().

  parent reply	other threads:[~2015-04-17 18:45 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-15 16:55 [PATCH v8 0/4] cat-file: teach cat-file a '--literally' option karthik nayak
2015-04-15 16:59 ` [PATCH v8 1/4] sha1_file.c: support reading from a loose object of unknown type Karthik Nayak
2015-04-15 20:21   ` Junio C Hamano
2015-04-15 22:18     ` Jeff King
2015-04-17 14:23       ` Jeff King
2015-04-17 16:21         ` Junio C Hamano
2015-04-17 20:51           ` Jeff King
2015-04-17 21:10             ` Junio C Hamano
2015-04-20 18:43               ` karthik nayak
2015-04-20 18:51                 ` Jeff King
2015-04-21 11:26                   ` karthik nayak
2015-04-21 14:24                     ` Jeff King
2015-04-17 18:45         ` karthik nayak [this message]
2015-04-17 18:49           ` Jeff King
2015-04-18  8:31             ` karthik nayak
2015-04-17 19:23           ` Junio C Hamano
2015-04-18  8:32             ` karthik nayak
2015-04-17 23:31   ` Eric Sunshine
2015-04-18  9:03     ` karthik nayak
2015-04-15 16:59 ` [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option Karthik Nayak
2015-04-15 20:20   ` Junio C Hamano
2015-04-15 20:52     ` Junio C Hamano
2015-04-16  7:26       ` karthik nayak
2015-04-16 13:35         ` Junio C Hamano
2015-04-17  2:10           ` Karthik Nayak
2015-04-17  2:14             ` Junio C Hamano
2015-04-19  0:28   ` Charles Bailey
2015-04-20  5:30     ` Junio C Hamano
2015-04-20  7:44       ` Charles Bailey
2015-04-20  8:57         ` Karthik Nayak
2015-04-20  9:19           ` Charles Bailey
2015-04-20 15:52             ` karthik nayak
2015-04-21 10:16               ` Charles Bailey
2015-04-21 19:40                 ` Eric Sunshine
2015-04-21 20:36                   ` Junio C Hamano
2015-04-25 11:22                     ` karthik nayak
2015-04-25 17:04                       ` Junio C Hamano
2015-04-27 11:57                         ` karthik nayak
2015-04-27 18:38                           ` Eric Sunshine
2015-04-28 12:03                             ` karthik nayak
2015-04-15 17:00 ` [PATCH v8 3/4] cat-file: add documentation for " Karthik Nayak
2015-04-15 17:00 ` [PATCH v8 4/4] t1006: add tests for git cat-file --literally Karthik Nayak
2015-04-18  0:00   ` Eric Sunshine
2015-04-18  5:22     ` karthik nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=553154C8.5090001@gmail.com \
    --to=karthik.188@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).