* git patch-id fails on long lines
@ 2011-09-20 18:07 Andrew Pimlott
2011-09-20 20:11 ` Junio C Hamano
2011-09-20 20:18 ` git patch-id fails on long lines Jeff King
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Pimlott @ 2011-09-20 18:07 UTC (permalink / raw)
To: git
In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line.[1]
This causes incorrect results on longer lines. Pasted below is a git commit
(from git show) that demonstrates the problem. The result of running git
patch-id on this commit is:
9220f380851be9cab1a760430e3be096dcbee8c6 9b96b6fde8f7df791a1490ae18e1fa75fbab3262
74b8ede07628a574fd586624e0c77a4b6c9967e0 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
The commit:
commit 9b96b6fde8f7df791a1490ae18e1fa75fbab3262
Author: Andrew Pimlott <andrew@pimlott.net>
Date: Tue Sep 20 10:53:25 2011 -0700
2
diff --git a/a b/a
index e69de29..2e6adac 100644
--- a/a
+++ b/a
@@ -0,0 +1 @@
+aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
diff --git a/b b/b
index e69de29..1425fcc 100644
--- a/b
+++ b/b
@@ -0,0 +1 @@
+b
Andrew
[1] https://github.com/git/git/blob/master/builtin/patch-id.c
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: git patch-id fails on long lines
2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott
@ 2011-09-20 20:11 ` Junio C Hamano
2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert
2011-09-20 20:18 ` git patch-id fails on long lines Jeff King
1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2011-09-20 20:11 UTC (permalink / raw)
To: Andrew Pimlott; +Cc: git
Andrew Pimlott <andrew@pimlott.net> writes:
> In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line.
Thanks; builtin/patch-id.c is parhaps one of the more ancient part of the
system and it hasn't been updated to use more modern facility like strbuf
API.
Fix should be pretty straightforward. Any takers?
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: git patch-id fails on long lines
2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott
2011-09-20 20:11 ` Junio C Hamano
@ 2011-09-20 20:18 ` Jeff King
1 sibling, 0 replies; 5+ messages in thread
From: Jeff King @ 2011-09-20 20:18 UTC (permalink / raw)
To: Andrew Pimlott; +Cc: git
On Tue, Sep 20, 2011 at 11:07:42AM -0700, Andrew Pimlott wrote:
> In patch-id.c, get_one_patchid uses a fixed 1000-char buffer to read a line.[1]
> This causes incorrect results on longer lines. Pasted below is a git commit
> (from git show) that demonstrates the problem. The result of running git
> patch-id on this commit is:
>
> 9220f380851be9cab1a760430e3be096dcbee8c6 9b96b6fde8f7df791a1490ae18e1fa75fbab3262
> 74b8ede07628a574fd586624e0c77a4b6c9967e0 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Hmm, yeah. My initial impression "eh, so what, it will just add the
line contents to the sha1 patch-id in two separate hunks". But we
actually treat lines magically based on their beginnings, and it
seems we accidentally think the "aaa" is the start of the next commit
header.
I think this can be trivially converted to use strbuf_getwholeline,
something like this (completely untested):
diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index f821eb3..99ba2ca 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -58,11 +58,12 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after)
static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx)
{
- static char line[1000];
+ static struct strbuf line_buf = STRBUF_INIT;
int patchlen = 0, found_next = 0;
int before = -1, after = -1;
- while (fgets(line, sizeof(line), stdin) != NULL) {
+ while (strbuf_getwholeline(&line_buf, stdin, '\n') != EOF) {
+ char *line = line_buf.buf;
char *p = line;
int len;
That just reuses the same heap-allocated buffer over and over, and then
eventually leaks it at the end (but then the program exits anyway). You
could get fancier and actually pass in the strbuf from generate_id_list,
and then release it when we're done.
-Peff
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH] patch-id.c: use strbuf instead of a fixed buffer
2011-09-20 20:11 ` Junio C Hamano
@ 2011-09-21 12:42 ` Michael Schubert
2011-09-22 17:17 ` Jeff King
0 siblings, 1 reply; 5+ messages in thread
From: Michael Schubert @ 2011-09-21 12:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Andrew Pimlott, git, Jeff King
get_one_patchid() uses a rather dumb heuristic to determine if the
passed buffer is part of the next commit. Whenever the first 40 bytes
are a valid hexadecimal sha1 representation, get_one_patchid() returns
next_sha1.
Once the current line is longer than the fixed buffer, this will break
(provided the additional bytes make a valid hexadecimal sha1). As a result
patch-id returns incorrect results. Instead, user strbuf and read one
line at a time.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Schubert <mschub@elegosoft.com>
---
builtin/patch-id.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index f821eb3..3cfe02d 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -56,13 +56,13 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after)
return 1;
}
-static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx)
+static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx, struct strbuf *line_buf)
{
- static char line[1000];
int patchlen = 0, found_next = 0;
int before = -1, after = -1;
- while (fgets(line, sizeof(line), stdin) != NULL) {
+ while (strbuf_getwholeline(line_buf, stdin, '\n') != EOF) {
+ char *line = line_buf->buf;
char *p = line;
int len;
@@ -133,14 +133,16 @@ static void generate_id_list(void)
unsigned char sha1[20], n[20];
git_SHA_CTX ctx;
int patchlen;
+ struct strbuf line_buf = STRBUF_INIT;
git_SHA1_Init(&ctx);
hashclr(sha1);
while (!feof(stdin)) {
- patchlen = get_one_patchid(n, &ctx);
+ patchlen = get_one_patchid(n, &ctx, &line_buf);
flush_current_id(patchlen, sha1, &ctx);
hashcpy(sha1, n);
}
+ strbuf_release(&line_buf);
}
static const char patch_id_usage[] = "git patch-id < patch";
--
1.7.7.rc2.365.g55c1f
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] patch-id.c: use strbuf instead of a fixed buffer
2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert
@ 2011-09-22 17:17 ` Jeff King
0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2011-09-22 17:17 UTC (permalink / raw)
To: Michael Schubert; +Cc: Junio C Hamano, Andrew Pimlott, git
On Wed, Sep 21, 2011 at 02:42:22PM +0200, Michael Schubert wrote:
> get_one_patchid() uses a rather dumb heuristic to determine if the
> passed buffer is part of the next commit. Whenever the first 40 bytes
> are a valid hexadecimal sha1 representation, get_one_patchid() returns
> next_sha1.
> Once the current line is longer than the fixed buffer, this will break
> (provided the additional bytes make a valid hexadecimal sha1). As a result
> patch-id returns incorrect results. Instead, user strbuf and read one
> line at a time.
A minor nit, but I think this is probably broken even if the additional
bytes don't look like a valid sha1. It would look like cruft after the
diff (since the lien doesn't start with plus, minus, or space), which
means it would not be stirred into the sha1 mix.
> builtin/patch-id.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
The patch itself looks good to me, though. Thanks.
-Peff
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-09-22 17:17 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-20 18:07 git patch-id fails on long lines Andrew Pimlott
2011-09-20 20:11 ` Junio C Hamano
2011-09-21 12:42 ` [PATCH] patch-id.c: use strbuf instead of a fixed buffer Michael Schubert
2011-09-22 17:17 ` Jeff King
2011-09-20 20:18 ` git patch-id fails on long lines Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).