From: Eric Sunshine <sunshine@sunshineco.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH 1/7] commit: provide a function to find a header in a buffer
Date: Sun, 22 Jun 2014 21:26:44 -0400
Message-ID: <CAPig+cQuBg0sjZXfkB7c_L=PMJuaPbWZBz9UV2Jbtr3eqJuPcw@mail.gmail.com>
In-Reply-To: <20140618202737.GA23896@sigill.intra.peff.net>

On Wednesday, June 18, 2014, Jeff King <peff@peff.net> wrote:
> Usually when we parse a commit, we read it line by line and
> handle each header in a single pass (e.g., in parse_commit
> and parse_commit_header).  Sometimes, however, we only care
> about extracting a single header. Code in this situation is
> stuck doing an ad-hoc parse of the commit buffer.
>
> Let's provide a reusable function to locate a header within
> the commit.  The code is modeled after pretty.c's
> get_header, which is used to extract the encoding.
>
> Since some callers may not have the "struct commit" to go
> along with the buffer, we drop that parameter.  The only
> thing lost is a warning for truncated commits, but that's
> OK.  This shouldn't happen in practice, and even if it does,
> there's no particular reason that this function needs to
> complain about it. It either finds the header it was asked
> for, or it doesn't (and in the latter case, the caller can
> complain).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/commit.c b/commit.c
> index 11106fb..d04b525 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1652,3 +1652,26 @@ void print_commit_list(struct commit_list *list,
>                 printf(format, sha1_to_hex(list->item->object.sha1));
>         }
>  }
> +
> +const char *find_commit_header(const char *msg, const char *key, size_t *out_len)
> +{
> +       int key_len = strlen(key);
> +       const char *line = msg;
> +
> +       while (line) {
> +               const char *eol = strchrnul(line, '\n'), *next;
> +
> +               if (line == eol)
> +                       return NULL;
> +               next = *eol ? eol + 1 : NULL;
> +
> +               if (eol - line > key_len &&
> +                   !strncmp(line, key, key_len) &&
> +                   line[key_len] == ' ') {
> +                       *out_len = eol - line - key_len - 1;
> +                       return line + key_len + 1;
> +               }
> +               line = next;

This is already simplified from the original implementation in
get_header(), but it can be simplified further by dropping 'next',
which is not otherwise used, and assigning 'line' directly:

    line = *eol ? eol + 1 : NULL;

> +       }
> +       return NULL;
> +}
> diff --git a/commit.h b/commit.h
> index 61559a9..7c766e9 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -312,6 +312,17 @@ extern struct commit_extra_header *read_commit_extra_headers(struct commit *, co
>
>  extern void free_commit_extra_headers(struct commit_extra_header *extra);
>
> +/*
> + * Search the commit object contents given by "msg" for the header "key".
> + * Returns a pointer to the start of the header contents, or NULL. The length
> + * of the header, up to the first newline, is returned via out_len.
> + *
> + * Note that some headers (like mergetag) may be multi-line. It is the caller's
> + * responsibility to parse further in this case!
> + */
> +extern const char *find_commit_header(const char *msg, const char *key,
> +                                     size_t *out_len);
> +
>  struct merge_remote_desc {
>         struct object *obj; /* the named object, could be a tag */
>         const char *name;
> diff --git a/pretty.c b/pretty.c
> index cc5b45d..6081750 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -548,31 +548,11 @@ static void add_merge_info(const struct pretty_print_context *pp,
>         strbuf_addch(sb, '\n');
>  }
>
> -static char *get_header(const struct commit *commit, const char *msg,
> -                       const char *key)
> +static char *get_header(const char *msg, const char *key)
>  {
> -       int key_len = strlen(key);
> -       const char *line = msg;
> -
> -       while (line) {
> -               const char *eol = strchrnul(line, '\n'), *next;
> -
> -               if (line == eol)
> -                       return NULL;
> -               if (!*eol) {
> -                       warning("malformed commit (header is missing newline): %s",
> -                               sha1_to_hex(commit->object.sha1));
> -                       next = NULL;
> -               } else
> -                       next = eol + 1;
> -               if (eol - line > key_len &&
> -                   !strncmp(line, key, key_len) &&
> -                   line[key_len] == ' ') {
> -                       return xmemdupz(line + key_len + 1, eol - line - key_len - 1);
> -               }
> -               line = next;
> -       }
> -       return NULL;
> +       size_t len;
> +       const char *v = find_commit_header(msg, key, &len);
> +       return v ? xmemdupz(v, len) : NULL;
>  }
>
>  static char *replace_encoding_header(char *buf, const char *encoding)
> @@ -618,11 +598,10 @@ const char *logmsg_reencode(const struct commit *commit,
>
>         if (!output_encoding || !*output_encoding) {
>                 if (commit_encoding)
> -                       *commit_encoding =
> -                               get_header(commit, msg, "encoding");
> +                       *commit_encoding = get_header(msg, "encoding");
>                 return msg;
>         }
> -       encoding = get_header(commit, msg, "encoding");
> +       encoding = get_header(msg, "encoding");
>         if (commit_encoding)
>                 *commit_encoding = encoding;
>         use_encoding = encoding ? encoding : utf8;
> --
> 2.0.0.566.gfe3e6b2
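
For illustration, here is a minimal sketch of how find_commit_header()
might read with 'next' dropped as suggested in the review comment. It is
not the version that was applied, only the quoted function with that one
change. strchrnul() is a GNU extension, so the sketch uses a hypothetical
stand-in, sketch_strchrnul(), to stay compilable outside of git's tree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical portable stand-in for strchrnul() (a GNU extension):
 * return a pointer to the first 'c' in 's', or to the terminating
 * NUL if 'c' does not occur.
 */
static const char *sketch_strchrnul(const char *s, int c)
{
	while (*s && *s != c)
		s++;
	return s;
}

const char *find_commit_header(const char *msg, const char *key, size_t *out_len)
{
	int key_len = strlen(key);
	const char *line = msg;

	while (line) {
		const char *eol = sketch_strchrnul(line, '\n');

		/* An empty line ends the header section. */
		if (line == eol)
			return NULL;

		if (eol - line > key_len &&
		    !strncmp(line, key, key_len) &&
		    line[key_len] == ' ') {
			*out_len = eol - line - key_len - 1;
			return line + key_len + 1;
		}
		/* Advance directly; no 'next' temporary is needed. */
		line = *eol ? eol + 1 : NULL;
	}
	return NULL;
}
```

The returned pointer refers into 'msg' and is not NUL-terminated at the
end of the value, so a caller has to honor out_len (as get_header() does
by passing it to xmemdupz()).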


Thread overview: 16+ messages
2014-06-18 20:19 [PATCH 0/7] cleaning up determine_author_info Jeff King
2014-06-18 20:27 ` [PATCH 1/7] commit: provide a function to find a header in a buffer Jeff King
2014-06-23  1:26   ` Eric Sunshine [this message]
2014-06-23 16:47     ` Jeff King
2014-06-18 20:28 ` [PATCH 2/7] record_author_info: fix memory leak on malformed commit Jeff King
2014-06-18 20:29 ` [PATCH 3/7] record_author_info: use find_commit_header Jeff King
2014-06-18 20:31 ` [PATCH 4/7] ident_split: store begin/end pairs on their own struct Jeff King
2014-06-23  1:28   ` Eric Sunshine
2014-06-18 20:32 ` [PATCH 5/7] use strbufs in date functions Jeff King
2014-06-18 20:35 ` [PATCH 6/7] determine_author_info: reuse parsing functions Jeff King
2014-06-18 20:36 ` [PATCH 7/7] determine_author_info: stop leaking name/email Jeff King
2014-06-23  9:28   ` Eric Sunshine
2014-06-23  9:33     ` Erik Faye-Lund
2014-06-23  9:48       ` Eric Sunshine
2014-06-23 17:21       ` Jeff King
2014-06-23 17:20     ` Jeff King
