* git patch-id fails on long lines @ 2011-09-20 18:07 Andrew Pimlott 2011-09-20 20:11 ` Junio C Hamano 2011-09-20 20:18 ` git patch-id fails on long lines Jeff King 0 siblings, 2 replies; 5+ messages in thread From: Andrew Pimlott @ 2011-09-20 18:07 UTC (permalink / raw) To: git In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line.[1] This causes incorrect results on longer lines. Pasted below is a git commit (from git show) that demonstrates the problem. The result of running git patch-id on this commit is: 9220f380851be9cab1a760430e3be096dcbee8c6 9b96b6fde8f7df791a1490ae18e1fa75fbab3262 74b8ede07628a574fd586624e0c77a4b6c9967e0 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa The commit: commit 9b96b6fde8f7df791a1490ae18e1fa75fbab3262 Author: Andrew Pimlott <andrew@pimlott.net> Date: Tue Sep 20 10:53:25 2011 -0700 2 diff --git a/a b/a index e69de29..2e6adac 100644 --- a/a +++ b/a @@ -0,0 +1 @@ +aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa diff --git a/b b/b index e69de29..1425fcc 100644 --- a/b +++ b/b @@ -0,0 +1 @@ +b Andrew [1] https://github.com/git/git/blob/master/builtin/patch-id.c ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: git patch-id fails on long lines 2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott @ 2011-09-20 20:11 ` Junio C Hamano 2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert 2011-09-20 20:18 ` git patch-id fails on long lines Jeff King 1 sibling, 1 reply; 5+ messages in thread From: Junio C Hamano @ 2011-09-20 20:11 UTC (permalink / raw) To: Andrew Pimlott; +Cc: git Andrew Pimlott <andrew@pimlott.net> writes: > In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line. Thanks; builtin/patch-id.c is parhaps one of the more ancient part of the system and it hasn't been updated to use more modern facility like strbuf API. Fix should be pretty straightforward. Any takers? ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] patch-id.c: use strbuf instead of a fixed buffer 2011-09-20 20:11 ` Junio C Hamano @ 2011-09-21 12:42 ` Michael Schubert 2011-09-22 17:17 ` Jeff King 0 siblings, 1 reply; 5+ messages in thread From: Michael Schubert @ 2011-09-21 12:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Andrew Pimlott, git, Jeff King get_one_patchid() uses a rather dumb heuristic to determine if the passed buffer is part of the next commit. Whenever the first 40 bytes are a valid hexadecimal sha1 representation, get_one_patchid() returns next_sha1. Once the current line is longer than the fixed buffer, this will break (provided the additional bytes make a valid hexadecimal sha1). As a result patch-id returns incorrect results. Instead, user strbuf and read one line at a time. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Michael Schubert <mschub@elegosoft.com> --- builtin/patch-id.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/builtin/patch-id.c b/builtin/patch-id.c index f821eb3..3cfe02d 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -56,13 +56,13 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after) return 1; } -static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx) +static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx, struct strbuf *line_buf) { - static char line[1000]; int patchlen = 0, found_next = 0; int before = -1, after = -1; - while (fgets(line, sizeof(line), stdin) != NULL) { + while (strbuf_getwholeline(line_buf, stdin, '\n') != EOF) { + char *line = line_buf->buf; char *p = line; int len; @@ -133,14 +133,16 @@ static void generate_id_list(void) unsigned char sha1[20], n[20]; git_SHA_CTX ctx; int patchlen; + struct strbuf line_buf = STRBUF_INIT; git_SHA1_Init(&ctx); hashclr(sha1); while (!feof(stdin)) { - patchlen = get_one_patchid(n, &ctx); + patchlen = get_one_patchid(n, &ctx, &line_buf); flush_current_id(patchlen, sha1, &ctx); hashcpy(sha1, n); } + strbuf_release(&line_buf); } static const char patch_id_usage[] = "git patch-id < patch"; -- 1.7.7.rc2.365.g55c1f ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] patch-id.c: use strbuf instead of a fixed buffer 2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert @ 2011-09-22 17:17 ` Jeff King 0 siblings, 0 replies; 5+ messages in thread From: Jeff King @ 2011-09-22 17:17 UTC (permalink / raw) To: Michael Schubert; +Cc: Junio C Hamano, Andrew Pimlott, git On Wed, Sep 21, 2011 at 02:42:22PM +0200, Michael Schubert wrote: > get_one_patchid() uses a rather dumb heuristic to determine if the > passed buffer is part of the next commit. Whenever the first 40 bytes > are a valid hexadecimal sha1 representation, get_one_patchid() returns > next_sha1. > Once the current line is longer than the fixed buffer, this will break > (provided the additional bytes make a valid hexadecimal sha1). As a result > patch-id returns incorrect results. Instead, user strbuf and read one > line at a time. A minor nit, but I think this is probably broken even if the additional bytes don't look like a valid sha1. It would look like cruft after the diff (since the lien doesn't start with plus, minus, or space), which means it would not be stirred into the sha1 mix. > builtin/patch-id.c | 10 ++++++---- > 1 files changed, 6 insertions(+), 4 deletions(-) The patch itself looks good to me, though. Thanks. -Peff ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: git patch-id fails on long lines 2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott 2011-09-20 20:11 ` Junio C Hamano @ 2011-09-20 20:18 ` Jeff King 1 sibling, 0 replies; 5+ messages in thread From: Jeff King @ 2011-09-20 20:18 UTC (permalink / raw) To: Andrew Pimlott; +Cc: git On Tue, Sep 20, 2011 at 11:07:42AM -0700, Andrew Pimlott wrote: > In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line.[1] > This causes incorrect results on longer lines. Pasted below is a git commit > (from git show) that demonstrates the problem. The result of running git > patch-id on this commit is: > > 9220f380851be9cab1a760430e3be096dcbee8c6 9b96b6fde8f7df791a1490ae18e1fa75fbab3262 > 74b8ede07628a574fd586624e0c77a4b6c9967e0 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Hmm, yeah. My initial impression "eh, so what, it will just add the line contents to the sha1 patch-id in two separate hunks". But we actually treat lines magically based on their beginnings, and it seems we accidentally think the "aaa" is the start of the next commit header. I think this can be trivially converted to use strbuf_getwholeline, something like this (completely untested): diff --git a/builtin/patch-id.c b/builtin/patch-id.c index f821eb3..99ba2ca 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -58,11 +58,12 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after) static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx) { - static char line[1000]; + static struct strbuf line_buf = STRBUF_INIT; int patchlen = 0, found_next = 0; int before = -1, after = -1; - while (fgets(line, sizeof(line), stdin) != NULL) { + while (strbuf_getwholeline(&line_buf, stdin, '\n') != EOF) { + char *line = line_buf.buf; char *p = line; int len; That just reuses the same heap-allocated buffer over and over, and then eventually leaks it at the end (but then the program exits anyway). You could get fancier and actually pass in the strbuf from generate_id_list, and then release it when we're done. -Peff ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-09-22 17:17 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott 2011-09-20 20:11 ` Junio C Hamano 2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert 2011-09-22 17:17 ` Jeff King 2011-09-20 20:18 ` git patch-id fails on long lines Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).