* [PATCH] mailmap: handle mailmap blobs without trailing newlines
From: Jeff King @ 2013-08-25  8:45 UTC
To: Junio C Hamano; +Cc: git

The read_mailmap_buf function reads each line of the mailmap
using strchrnul, like:

    const char *end = strchrnul(buf, '\n');
    unsigned long linelen = end - buf + 1;

But that's off-by-one when we actually hit the NUL byte; our
line does not have a terminator, and so is only "end - buf"
bytes long. As a result, when we subtract the linelen from
the total len, we end up with (unsigned long)-1 bytes left
in the buffer, and we start reading random junk from memory.

We could fix it with:

    unsigned long linelen = end - buf + !!*end;

but it is questionable for a function called read_mailmap_buf
to be using strchrnul in the first place. It happens to work
because our buffers come from blobs, and read_sha1_file always
NUL-terminates the object data. But let's future-proof the
function by actually handling non-terminated strings correctly,
and fix the off-by-one at the same time.

Signed-off-by: Jeff King <peff@peff.net>
---
Intended for 'maint'. The bug was introduced in 0861090, but I built the
fix on top of 8c473ce, the tip of the jk/mailmap-from-blob topic, as it
avoids annoying textual conflicts in the test script.

v1.8.2 was the first version with the bug, so this is not an "oops, we
failed to find this new bug during v1.8.4-rc series" problem. I found it
now because I turned on mailmap.blob for all of github.com, which
exposed the code to a much larger array of random inputs.

This is the minimal fix. Another option would be to switch
read_mailmap_buf to read_mailmap_string, and I think we could even get
away with avoiding the extra allocation/copy in the loop (because
read_mailmap_line seems to cope with newline-or-EOS just fine). But it
may be better to save that for 'master'.

 mailmap.c          | 12 +++++++++---
 t/t4203-mailmap.sh | 16 +++++++++++++++-
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index b16542f..a635873 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -192,10 +192,16 @@ static void read_mailmap_buf(struct string_list *map,
 			     char **repo_abbrev)
 {
 	while (len) {
-		const char *end = strchrnul(buf, '\n');
-		unsigned long linelen = end - buf + 1;
-		char *line = xmemdupz(buf, linelen);
+		const char *end = memchr(buf, '\n', len);
+		unsigned long linelen;
+		char *line;
 
+		if (end)
+			linelen = end - buf + 1;
+		else
+			linelen = len;
+
+		line = xmemdupz(buf, linelen);
 		read_mailmap_line(map, line, repo_abbrev);
 
 		free(line);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index aae30d9..10c7b12 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -159,7 +159,8 @@ test_expect_success 'setup mailmap blob tests' '
 	Blob Guy <author@example.com>
 	Blob Guy <bugs@company.xx>
 	EOF
-	git add just-bugs both &&
+	printf "Tricky Guy <author@example.com>" >no-newline &&
+	git add just-bugs both no-newline &&
 	git commit -m "my mailmaps" &&
 	echo "Repo Guy <author@example.com>" >.mailmap &&
 	echo "Internal Guy <author@example.com>" >internal.map
@@ -243,6 +244,19 @@ test_expect_success 'mailmap.blob defaults to HEAD:.mailmap in bare repo' '
 	)
 '
 
+test_expect_success 'mailmap.blob can handle blobs without trailing newline' '
+	cat >expect <<-\EOF &&
+	Tricky Guy (1):
+	      initial
+
+	nick1 (1):
+	      second
+
+	EOF
+	git -c mailmap.blob=map:no-newline shortlog HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'cleanup after mailmap.blob tests' '
 	rm -f .mailmap
 '
-- 
1.8.4.2.g87d4a77
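[Editorial illustration, not part of the original message.] The off-by-one described in the commit message can be reproduced outside of git with a minimal sketch like the one below. The helper names (find_chr_or_nul, linelen_buggy, linelen_fixed) are hypothetical and exist only for this illustration; find_chr_or_nul is a portable stand-in for the GNU strchrnul extension, and the two linelen functions mirror the pre-patch and post-patch length computations rather than git's actual code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Portable stand-in for GNU strchrnul(): returns a pointer to the first
 * occurrence of c in s, or to the terminating NUL if c does not occur.
 * (Hypothetical helper for this sketch only.)
 */
static const char *find_chr_or_nul(const char *s, int c)
{
	while (*s && *s != c)
		s++;
	return s;
}

/*
 * Pre-patch style computation: always counts one terminator byte,
 * even when the line ends at the NUL rather than at a newline.
 */
static size_t linelen_buggy(const char *buf)
{
	const char *end = find_chr_or_nul(buf, '\n');
	return (size_t)(end - buf) + 1;
}

/*
 * Post-patch style computation: the search is bounded by len, and a
 * missing newline means the rest of the buffer is the final line.
 */
static size_t linelen_fixed(const char *buf, size_t len)
{
	const char *end = memchr(buf, '\n', len);
	return end ? (size_t)(end - buf) + 1 : len;
}
```

For a buffer holding "Tricky Guy <author@example.com>" with no trailing newline, linelen_buggy reports one byte more than the buffer length, so subtracting it from an unsigned length wraps around to the maximum value; linelen_fixed reports exactly the remaining length, so the loop counter reaches zero.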
* Re: [PATCH] mailmap: handle mailmap blobs without trailing newlines
From: Jeff King @ 2013-08-25  8:55 UTC
To: Junio C Hamano; +Cc: git

On Sun, Aug 25, 2013 at 04:45:50AM -0400, Jeff King wrote:

> This is the minimal fix. Another option would be to switch
> read_mailmap_buf to read_mailmap_string, and I think we could even get
> away with avoiding the extra allocation/copy in the loop (because
> read_mailmap_line seems to cope with newline-or-EOS just fine). But it
> may be better to save that for 'master'.

Hmm, actually, this isn't quite true. read_mailmap_line does handle the
optional trailing newline properly, but the underlying parsing routines
really do want to see a NUL at the end of each line (because they came
from code that just calls fgets). So we really do want to tie off each
line. But given that our only caller is handing us blob contents which
get immediately freed, we could still do that without an extra
allocation if:

  1. We make it clear that the input must be NUL-terminated (i.e., by
     renaming the function and dropping the len parameter).

  2. We drop the "const" from the buf parameter so that we can simply
     terminate each line as we go.

I'll see what the patch looks like.

-Peff
* Re: [PATCH] mailmap: handle mailmap blobs without trailing newlines
From: Jeff King @ 2013-08-25  9:11 UTC
To: Junio C Hamano; +Cc: git

On Sun, Aug 25, 2013 at 04:55:00AM -0400, Jeff King wrote:

> On Sun, Aug 25, 2013 at 04:45:50AM -0400, Jeff King wrote:
>
> > This is the minimal fix. Another option would be to switch
> > read_mailmap_buf to read_mailmap_string, and I think we could even get
> > away with avoiding the extra allocation/copy in the loop (because
> > read_mailmap_line seems to cope with newline-or-EOS just fine). But it
> > may be better to save that for 'master'.
>
> Hmm, actually, this isn't quite true. read_mailmap_line does handle the
> optional trailing newline properly, but the underlying parsing routines
> really do want to see a NUL at the end of each line (because they came
> from code that just calls fgets). So we really do want to tie off each
> line. But given that our only caller is handing us blob contents which
> get immediately freed, we could still do that without an extra
> allocation if:
>
>   1. We make it clear that the input must be NUL-terminated (i.e., by
>      renaming the function and dropping the len parameter).
>
>   2. We drop the "const" from the buf parameter so that we can simply
>      terminate each line as we go.
>
> I'll see what the patch looks like.

I think the end result is actually more readable and easier to follow.
The only downside is that a true "I have a const buffer without NUL
termination" caller would have to make a copy of its buffer. But there
is no such caller currently (and I do not foresee adding one).

Here's the patch directly on top of my other one. I had originally
thought to put the first to 'maint' and then do the refactoring in
'master', but this really didn't end up being any more invasive than the
original fix. So maybe it is worth doing both on 'maint' (or squashing
them together, in which case the rationale in the commit messages need
combined).

-- >8 --
Subject: mailmap: avoid allocation when reading from blob

The read_mailmap_blob function reads the whole blob into a
buffer, then calls read_mailmap_buf to do the heavy lifting.
The latter function ends up making a NUL-terminated copy of
each line of the blob in order to call read_mailmap_line,
which was originally written to handle the line-at-a-time
input from fgets().

We can avoid the extra copy if we simply NUL-terminate each
line in place, and assume that the input buffer is itself
NUL-terminated. Neither of these is a problem, since our
only caller is read_mailmap_blob, which has a non-const
NUL-terminated buffer already.

Signed-off-by: Jeff King <peff@peff.net>
---
 mailmap.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index a635873..caa7c6b 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -187,26 +187,17 @@ static void read_mailmap_buf(struct string_list *map,
 	return 0;
 }
 
-static void read_mailmap_buf(struct string_list *map,
-			     const char *buf, unsigned long len,
-			     char **repo_abbrev)
+static void read_mailmap_string(struct string_list *map, char *buf,
+				char **repo_abbrev)
 {
-	while (len) {
-		const char *end = memchr(buf, '\n', len);
-		unsigned long linelen;
-		char *line;
-
-		if (end)
-			linelen = end - buf + 1;
-		else
-			linelen = len;
+	while (*buf) {
+		char *end = strchrnul(buf, '\n');
 
-		line = xmemdupz(buf, linelen);
-		read_mailmap_line(map, line, repo_abbrev);
+		if (*end)
+			*end++ = '\0';
 
-		free(line);
-		buf += linelen;
-		len -= linelen;
+		read_mailmap_line(map, buf, repo_abbrev);
+		buf = end;
 	}
 }
 
@@ -230,7 +221,7 @@ static int read_mailmap_blob(struct string_list *map,
 	if (type != OBJ_BLOB)
 		return error("mailmap is not a blob: %s", name);
 
-	read_mailmap_buf(map, buf, size, repo_abbrev);
+	read_mailmap_string(map, buf, repo_abbrev);
 
 	free(buf);
 	return 0;
-- 
1.8.4.2.g87d4a77
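[Editorial illustration, not part of the original message.] The in-place technique from the second patch (overwrite each newline with a NUL and hand out pointers into the same buffer) can be sketched in isolation roughly as follows. The function split_lines_in_place and its lines/max parameters are hypothetical names invented for this example; it uses standard strchr plus strlen instead of strchrnul so that it builds without GNU extensions, and it assumes, as the commit message does, that the caller owns a writable, NUL-terminated buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a writable, NUL-terminated buffer into
 * lines without copying. Each '\n' is overwritten with '\0', and the
 * start of every line is stored in lines[] (at most max entries).
 * Returns the number of lines stored.
 */
static size_t split_lines_in_place(char *buf, char **lines, size_t max)
{
	size_t n = 0;

	while (*buf && n < max) {
		char *end = strchr(buf, '\n');

		if (end)
			*end++ = '\0';           /* tie off the line, step past it */
		else
			end = buf + strlen(buf); /* final line without a newline */

		lines[n++] = buf;
		buf = end;
	}
	return n;
}
```

Called on a character array initialized with "A <a@example.com>\nB <b@example.com>", this stores two pointers into the original array, each a NUL-terminated line, with no per-line allocation. The trade-off is the one noted in the message: a caller holding a const or non-terminated buffer would have to copy it first.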