From: Eric Sunshine <sunshine@sunshineco.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH 1/7] commit: provide a function to find a header in a buffer
Date: Sun, 22 Jun 2014 21:26:44 -0400
Message-ID: <CAPig+cQuBg0sjZXfkB7c_L=PMJuaPbWZBz9UV2Jbtr3eqJuPcw@mail.gmail.com>
In-Reply-To: <20140618202737.GA23896@sigill.intra.peff.net>
On Wednesday, June 18, 2014, Jeff King <peff@peff.net> wrote:
> Usually when we parse a commit, we read it line by line and
> handle each header in a single pass (e.g., in parse_commit
> and parse_commit_header). Sometimes, however, we only care
> about extracting a single header. Code in this situation is
> stuck doing an ad-hoc parse of the commit buffer.
>
> Let's provide a reusable function to locate a header within
> the commit. The code is modeled after pretty.c's
> get_header, which is used to extract the encoding.
>
> Since some callers may not have the "struct commit" to go
> along with the buffer, we drop that parameter. The only
> thing lost is a warning for truncated commits, but that's
> OK. This shouldn't happen in practice, and even if it does,
> there's no particular reason that this function needs to
> complain about it. It either finds the header it was asked
> for, or it doesn't (and in the latter case, the caller can
> complain).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/commit.c b/commit.c
> index 11106fb..d04b525 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1652,3 +1652,26 @@ void print_commit_list(struct commit_list *list,
> printf(format, sha1_to_hex(list->item->object.sha1));
> }
> }
> +
> +const char *find_commit_header(const char *msg, const char *key, size_t *out_len)
> +{
> +	int key_len = strlen(key);
> +	const char *line = msg;
> +
> +	while (line) {
> +		const char *eol = strchrnul(line, '\n'), *next;
> +
> +		if (line == eol)
> +			return NULL;
> +		next = *eol ? eol + 1 : NULL;
> +
> +		if (eol - line > key_len &&
> +		    !strncmp(line, key, key_len) &&
> +		    line[key_len] == ' ') {
> +			*out_len = eol - line - key_len - 1;
> +			return line + key_len + 1;
> +		}
> +		line = next;

This is already simplified from the original implementation in
get_header(), but it can be simplified further by dropping 'next',
which is not otherwise used, and assigning 'line' directly:

    line = *eol ? eol + 1 : NULL;

> +	}
> +	return NULL;
> +}
> diff --git a/commit.h b/commit.h
> index 61559a9..7c766e9 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -312,6 +312,17 @@ extern struct commit_extra_header *read_commit_extra_headers(struct commit *, co
>
> extern void free_commit_extra_headers(struct commit_extra_header *extra);
>
> +/*
> + * Search the commit object contents given by "msg" for the header "key".
> + * Returns a pointer to the start of the header contents, or NULL. The length
> + * of the header, up to the first newline, is returned via out_len.
> + *
> + * Note that some headers (like mergetag) may be multi-line. It is the caller's
> + * responsibility to parse further in this case!
> + */
> +extern const char *find_commit_header(const char *msg, const char *key,
> + size_t *out_len);
> +
> struct merge_remote_desc {
> struct object *obj; /* the named object, could be a tag */
> const char *name;
> diff --git a/pretty.c b/pretty.c
> index cc5b45d..6081750 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -548,31 +548,11 @@ static void add_merge_info(const struct pretty_print_context *pp,
> strbuf_addch(sb, '\n');
> }
>
> -static char *get_header(const struct commit *commit, const char *msg,
> -			const char *key)
> +static char *get_header(const char *msg, const char *key)
> {
> -	int key_len = strlen(key);
> -	const char *line = msg;
> -
> -	while (line) {
> -		const char *eol = strchrnul(line, '\n'), *next;
> -
> -		if (line == eol)
> -			return NULL;
> -		if (!*eol) {
> -			warning("malformed commit (header is missing newline): %s",
> -				sha1_to_hex(commit->object.sha1));
> -			next = NULL;
> -		} else
> -			next = eol + 1;
> -		if (eol - line > key_len &&
> -		    !strncmp(line, key, key_len) &&
> -		    line[key_len] == ' ') {
> -			return xmemdupz(line + key_len + 1, eol - line - key_len - 1);
> -		}
> -		line = next;
> -	}
> -	return NULL;
> +	size_t len;
> +	const char *v = find_commit_header(msg, key, &len);
> +	return v ? xmemdupz(v, len) : NULL;
> }
>
> static char *replace_encoding_header(char *buf, const char *encoding)
> @@ -618,11 +598,10 @@ const char *logmsg_reencode(const struct commit *commit,
>
> if (!output_encoding || !*output_encoding) {
> if (commit_encoding)
> - *commit_encoding =
> - get_header(commit, msg, "encoding");
> + *commit_encoding = get_header(msg, "encoding");
> return msg;
> }
> - encoding = get_header(commit, msg, "encoding");
> + encoding = get_header(msg, "encoding");
> if (commit_encoding)
> *commit_encoding = encoding;
> use_encoding = encoding ? encoding : utf8;
> --
> 2.0.0.566.gfe3e6b2
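
For illustration only, here is a standalone sketch of how the function
might read with 'next' folded away as suggested above. It is not the
code from the patch: the names find_commit_header_sketch(),
get_header_sketch() and sketch_strchrnul() are hypothetical, a small
local loop stands in for strchrnul() (a GNU extension), plain malloc()
stands in for xmemdupz(), and key_len is a size_t rather than the int
used in the patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strchrnul(): returns a pointer to the
 * first occurrence of c in s, or to the terminating NUL if absent. */
static const char *sketch_strchrnul(const char *s, int c)
{
	while (*s && *s != (char)c)
		s++;
	return s;
}

/* Same search as the quoted find_commit_header(), minus 'next'. */
const char *find_commit_header_sketch(const char *msg, const char *key,
				      size_t *out_len)
{
	size_t key_len = strlen(key);
	const char *line = msg;

	while (line) {
		const char *eol = sketch_strchrnul(line, '\n');

		/* An empty line ends the header block; stop looking. */
		if (line == eol)
			return NULL;

		if ((size_t)(eol - line) > key_len &&
		    !strncmp(line, key, key_len) &&
		    line[key_len] == ' ') {
			*out_len = eol - line - key_len - 1;
			return line + key_len + 1;
		}
		/* Advance to the next line, or stop at end of buffer. */
		line = *eol ? eol + 1 : NULL;
	}
	return NULL;
}

/* Caller in the style of get_header(): returns a NUL-terminated copy
 * of the header value, or NULL. The caller frees the result. */
char *get_header_sketch(const char *msg, const char *key)
{
	size_t len;
	const char *v = find_commit_header_sketch(msg, key, &len);
	char *copy;

	if (!v)
		return NULL;
	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, v, len);
	copy[len] = '\0';
	return copy;
}
```

As in the patch, only the first line of a header is covered by the
returned length, so a multi-line header such as mergetag would still
need further parsing by the caller.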