git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 6/7] determine_author_info: reuse parsing functions
Date: Wed, 18 Jun 2014 16:35:39 -0400
Message-ID: <20140618203539.GF23896@sigill.intra.peff.net>
In-Reply-To: <20140618201944.GA23238@sigill.intra.peff.net>

Rather than parsing the header manually to find the "author"
field, and then parsing its sub-parts, let's use
find_commit_header and split_ident_line. This is shorter and
easier to read, and should do a more careful parsing job.

For example, the current parser could find the end-of-email
right-bracket across a newline (for a malformed commit), and
calculate a bogus gigantic length for the date (by using
"eol - rb").

As a bonus, this also plugs a memory leak when we pull the
date field from an existing commit (we still leak the name
and email buffers, which will be fixed in a later commit).

Signed-off-by: Jeff King <peff@peff.net>
---
The large buffer comes from wrapping around the negative side of the
size_t space.  In theory you could wrap far enough to get a buffer that
we can actually allocate (probably only on a 32-bit system), and then
we follow up by copying "len" random bytes into it. I doubt an attacker
could get that data out of the program, though, as we then run it
through fmt_ident, which should complain if it's full of garbage.
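
To make the wraparound concrete, here is a minimal standalone sketch of
the old length arithmetic. It is not the code from builtin/commit.c:
scan_to() and old_date_len() are hypothetical names, and scan_to() is
only a portable stand-in for strchrnul().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for strchrnul(): returns the NUL if c is absent. */
static const char *scan_to(const char *s, int c)
{
	while (*s && *s != c)
		s++;
	return s;
}

/*
 * Hypothetical helper mirroring the removed arithmetic: the date is
 * assumed to start two bytes past '>' and to run to end-of-line.
 */
static size_t old_date_len(const char *line)
{
	const char *lb = scan_to(line, '<');
	const char *rb = scan_to(lb, '>');
	const char *eol = scan_to(rb, '\n');

	/* If eol comes before rb + 2, the negative difference wraps. */
	return eol - (rb + strlen("> "));
}
```

For a well-formed line such as "author A <a@example.com> 1403123739 -0400\n"
this returns the length of the date string. For a line that ends right
after the '>', the difference is -1, which becomes SIZE_MAX once stored
in a size_t.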

 builtin/commit.c | 61 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 29 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index bf770cf..62abee0 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -541,6 +541,16 @@ static int parse_force_date(const char *in, struct strbuf *out)
 	return 0;
 }
 
+static void strbuf_add_pair(struct strbuf *buf, const struct pointer_pair *p)
+{
+	strbuf_add(buf, p->begin, p->end - p->begin);
+}
+
+static char *xmemdupz_pair(const struct pointer_pair *p)
+{
+	return xmemdupz(p->begin, p->end - p->begin);
+}
+
 static void determine_author_info(struct strbuf *author_ident)
 {
 	char *name, *email, *date;
@@ -552,42 +562,35 @@ static void determine_author_info(struct strbuf *author_ident)
 	date = getenv("GIT_AUTHOR_DATE");
 
 	if (author_message) {
-		const char *a, *lb, *rb, *eol;
-		size_t len;
+		struct ident_split ident;
+		unsigned long len;
+		const char *a;
 
-		a = strstr(author_message_buffer, "\nauthor ");
+		a = find_commit_header(author_message_buffer, "author", &len);
 		if (!a)
-			die(_("invalid commit: %s"), author_message);
-
-		lb = strchrnul(a + strlen("\nauthor "), '<');
-		rb = strchrnul(lb, '>');
-		eol = strchrnul(rb, '\n');
-		if (!*lb || !*rb || !*eol)
-			die(_("invalid commit: %s"), author_message);
-
-		if (lb == a + strlen("\nauthor "))
-			/* \nauthor <foo@example.com> */
-			name = xcalloc(1, 1);
-		else
-			name = xmemdupz(a + strlen("\nauthor "),
-					(lb - strlen(" ") -
-					 (a + strlen("\nauthor "))));
-		email = xmemdupz(lb + strlen("<"), rb - (lb + strlen("<")));
-		len = eol - (rb + strlen("> "));
-		date = xmalloc(len + 2);
-		*date = '@';
-		memcpy(date + 1, rb + strlen("> "), len);
-		date[len + 1] = '\0';
+			die(_("commit '%s' lacks author header"), author_message);
+		if (split_ident_line(&ident, a, len) < 0)
+			die(_("commit '%s' has malformed author line"), author_message);
+
+		name = xmemdupz_pair(&ident.name);
+		email = xmemdupz_pair(&ident.mail);
+		if (ident.date.begin) {
+			strbuf_reset(&date_buf);
+			strbuf_addch(&date_buf, '@');
+			strbuf_add_pair(&date_buf, &ident.date);
+			strbuf_addch(&date_buf, ' ');
+			strbuf_add_pair(&date_buf, &ident.tz);
+			date = date_buf.buf;
+		}
 	}
 
 	if (force_author) {
-		const char *lb = strstr(force_author, " <");
-		const char *rb = strchr(force_author, '>');
+		struct ident_split ident;
 
-		if (!lb || !rb)
+		if (split_ident_line(&ident, force_author, strlen(force_author)) < 0)
 			die(_("malformed --author parameter"));
-		name = xstrndup(force_author, lb - force_author);
-		email = xstrndup(lb + 2, rb - (lb + 2));
+		name = xmemdupz_pair(&ident.name);
+		email = xmemdupz_pair(&ident.mail);
 	}
 
 	if (force_date) {
-- 
2.0.0.566.gfe3e6b2

Thread overview: 16+ messages
2014-06-18 20:19 [PATCH 0/7] cleaning up determine_author_info Jeff King
2014-06-18 20:27 ` [PATCH 1/7] commit: provide a function to find a header in a buffer Jeff King
2014-06-23  1:26   ` Eric Sunshine
2014-06-23 16:47     ` Jeff King
2014-06-18 20:28 ` [PATCH 2/7] record_author_info: fix memory leak on malformed commit Jeff King
2014-06-18 20:29 ` [PATCH 3/7] record_author_info: use find_commit_header Jeff King
2014-06-18 20:31 ` [PATCH 4/7] ident_split: store begin/end pairs on their own struct Jeff King
2014-06-23  1:28   ` Eric Sunshine
2014-06-18 20:32 ` [PATCH 5/7] use strbufs in date functions Jeff King
2014-06-18 20:35 ` Jeff King [this message]
2014-06-18 20:36 ` [PATCH 7/7] determine_author_info: stop leaking name/email Jeff King
2014-06-23  9:28   ` Eric Sunshine
2014-06-23  9:33     ` Erik Faye-Lund
2014-06-23  9:48       ` Eric Sunshine
2014-06-23 17:21       ` Jeff King
2014-06-23 17:20     ` Jeff King