* [PATCH 0/6] Detect non email patches in git-mailinfo
@ 2006-05-23 19:42 Eric W. Biederman
2006-05-23 19:44 ` [PATCH 1/6] Make read_one_header_line return a flag not a length Eric W. Biederman
2006-05-23 23:44 ` [PATCH 0/6] Detect non email patches in git-mailinfo Junio C Hamano
0 siblings, 2 replies; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
After looking at a number of additional quilt patches I noticed
a small problem with the current git-mailinfo. On patches
without any leading headers git-mailinfo can get confused and
lose a bit of information.
So far I have only seen this in the quilt from Andi Kleen but
it is fairly straightforward to fix.
What follows is a small patch series that, one small step at
a time, refactors (and I think simplifies) git-mailinfo
so that it can detect and cope with a file without any
email headers.
Eric
* [PATCH 1/6] Make read_one_header_line return a flag not a length.
2006-05-23 19:42 [PATCH 0/6] Detect non email patches in git-mailinfo Eric W. Biederman
@ 2006-05-23 19:44 ` Eric W. Biederman
2006-05-23 19:45 ` [PATCH 2/6] Move B and Q decoding into check header Eric W. Biederman
2006-05-23 23:44 ` [PATCH 0/6] Detect non email patches in git-mailinfo Junio C Hamano
1 sibling, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Currently we only use the return value from read_one_header_line
to tell if the line we have read is a header or not. So make
it a flag. This paves the way for better email detection.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 22 +++++++++++-----------
1 files changed, 11 insertions(+), 11 deletions(-)
40f4ca44ec851e435ce9453c682c71b9c67063b9
diff --git a/mailinfo.c b/mailinfo.c
index b276519..83a2986 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -331,7 +331,7 @@ struct header_def {
int namelen;
};
-static void check_header(char *line, int len, struct header_def *header)
+static void check_header(char *line, struct header_def *header)
{
int i;
@@ -349,7 +349,7 @@ static void check_header(char *line, int
}
}
-static void check_subheader_line(char *line, int len)
+static void check_subheader_line(char *line)
{
static struct header_def header[] = {
{ "Content-Type", handle_subcontent_type },
@@ -357,9 +357,9 @@ static void check_subheader_line(char *l
handle_content_transfer_encoding },
{ NULL },
};
- check_header(line, len, header);
+ check_header(line, header);
}
-static void check_header_line(char *line, int len)
+static void check_header_line(char *line)
{
static struct header_def header[] = {
{ "From", handle_from },
@@ -370,7 +370,7 @@ static void check_header_line(char *line
handle_content_transfer_encoding },
{ NULL },
};
- check_header(line, len, header);
+ check_header(line, header);
}
static int read_one_header_line(char *line, int sz, FILE *in)
@@ -709,8 +709,8 @@ static void handle_multipart_body(void)
return;
/* We are on boundary line. Start slurping the subhead. */
while (1) {
- int len = read_one_header_line(line, sizeof(line), stdin);
- if (!len) {
+ int hdr = read_one_header_line(line, sizeof(line), stdin);
+ if (!hdr) {
if (handle_multipart_one_part() < 0)
return;
/* Reset per part headers */
@@ -718,7 +718,7 @@ static void handle_multipart_body(void)
charset[0] = 0;
}
else
- check_subheader_line(line, len);
+ check_subheader_line(line);
}
fclose(patchfile);
if (!patch_lines) {
@@ -787,15 +787,15 @@ int main(int argc, char **argv)
exit(1);
}
while (1) {
- int len = read_one_header_line(line, sizeof(line), stdin);
- if (!len) {
+ int hdr = read_one_header_line(line, sizeof(line), stdin);
+ if (!hdr) {
if (multipart_boundary[0])
handle_multipart_body();
else
handle_body();
break;
}
- check_header_line(line, len);
+ check_header_line(line);
}
return 0;
}
--
1.3.2.g5041c-dirty
* [PATCH 2/6] Move B and Q decoding into check header.
2006-05-23 19:44 ` [PATCH 1/6] Make read_one_header_line return a flag not a length Eric W. Biederman
@ 2006-05-23 19:45 ` Eric W. Biederman
2006-05-23 19:47 ` [PATCH 3/6] Refactor commit messge handling Eric W. Biederman
0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
B and Q decoding is not appropriate for in body headers, so move
it up to where we explicitly know we have a real email header.
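To illustrate what Q decoding does to a real email header value, here is a minimal sketch. It is not the decode_header_bq from mailinfo.c; the names hexval and decode_q_word are made up for this example, it handles only a single Q encoded word, and it does no charset conversion.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not hex. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode one "=?charset?Q?text?=" encoded word.
 * Returns 0 on success, -1 if the input is not a Q encoded word or
 * the result does not fit in out.  The charset is ignored here.
 */
static int decode_q_word(const char *in, char *out, size_t outsz)
{
	const char *p, *end;
	size_t n = 0;

	assert(in != NULL && out != NULL);
	if (outsz == 0 || strncmp(in, "=?", 2))
		return -1;
	p = strchr(in + 2, '?');		/* end of the charset name */
	if (!p || tolower((unsigned char)p[1]) != 'q' || p[2] != '?')
		return -1;
	p += 3;					/* start of the encoded text */
	end = strstr(p, "?=");
	if (!end)
		return -1;
	while (p < end) {
		int c = (unsigned char)*p++;
		if (c == '_') {
			c = ' ';		/* '_' stands for a space */
		} else if (c == '=' && end - p >= 2) {
			int hi = hexval((unsigned char)p[0]);
			int lo = hexval((unsigned char)p[1]);
			if (hi < 0 || lo < 0)
				return -1;
			c = hi * 16 + lo;	/* "=XX" is one raw byte */
			p += 2;
		}
		if (n + 1 >= outsz)
			return -1;
		out[n++] = (char)c;
	}
	out[n] = '\0';
	return 0;
}
```

A line in the message body that merely looks like "Subject: ..." is ordinary text in the body's own charset, so running it through a decoder like this would be wrong; that is why the call belongs next to the real header matching.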
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
3cccc5a0728a981cc6f4ea72e81513fd902e29a2
diff --git a/mailinfo.c b/mailinfo.c
index 83a2986..bee7b20 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -324,6 +324,7 @@ static void cleanup_space(char *buf)
}
}
+static void decode_header_bq(char *it);
typedef int (*header_fn_t)(char *);
struct header_def {
const char *name;
@@ -343,6 +344,10 @@ static void check_header(char *line, str
int len = header[i].namelen;
if (!strncasecmp(line, header[i].name, len) &&
line[len] == ':' && isspace(line[len + 1])) {
+ /* Unwrap inline B and Q encoding, and optionally
+ * normalize the meta information to utf8.
+ */
+ decode_header_bq(line + len + 2);
header[i].func(line + len + 2);
break;
}
@@ -597,13 +602,6 @@ static void handle_info(void)
cleanup_space(email);
cleanup_space(sub);
- /* Unwrap inline B and Q encoding, and optionally
- * normalize the meta information to utf8.
- */
- decode_header_bq(name);
- decode_header_bq(date);
- decode_header_bq(email);
- decode_header_bq(sub);
printf("Author: %s\nEmail: %s\nSubject: %s\nDate: %s\n\n",
name, email, sub, date);
}
--
1.3.2.g5041c-dirty
* [PATCH 3/6] Refactor commit messge handling.
2006-05-23 19:45 ` [PATCH 2/6] Move B and Q decoding into check header Eric W. Biederman
@ 2006-05-23 19:47 ` Eric W. Biederman
2006-05-23 19:49 ` [PATCH 4/6] In handle_body only read a line if we don't already have one Eric W. Biederman
0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
- Move handle_info into main so it is called once
after everything has been parsed. This allows the removal
of a static variable and removes two duplicate calls.
- Move parsing of inbody headers into handle_commit.
This means we parse the in-body headers after we have decoded
the character set, and it removes code duplication between
handle_multipart_one_part and handle_body.
- Change the flag indicating that we have seen an in body
prefix header into another bit in seen.
This is a little more general and allows the possibility of parsing
in body headers after the body message has begun.
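The idea of keeping the "no more in-body headers" state as one more bit in seen can be sketched roughly as follows. This is a simplified stand-in, not the code in the diff below: note_inbody_line is a made-up name, it only knows From and Subject, and it returns an int purely so the behaviour is easy to check.

```c
#include <assert.h>
#include <string.h>

#define SEEN_FROM    01
#define SEEN_SUBJECT 04
#define SEEN_PREFIX  0x08	/* the in-body header prefix has ended */

/*
 * Hypothetical helper: returns 1 if the line was consumed as an
 * in-body header, 0 if it belongs to the commit message body.
 */
static int note_inbody_line(int *seen, const char *line)
{
	assert(seen != NULL && line != NULL);
	if (*seen & SEEN_PREFIX)
		return 0;	/* prefix is over, everything is body */
	if (!strncmp(line, "From: ", 6) && !(*seen & SEEN_FROM)) {
		*seen |= SEEN_FROM;
		return 1;
	}
	if (!strncmp(line, "Subject: ", 9) && !(*seen & SEEN_SUBJECT)) {
		*seen |= SEEN_SUBJECT;
		return 1;
	}
	/* First line that is not a header ends the prefix for good. */
	*seen |= SEEN_PREFIX;
	return 0;
}
```

Because the state is a single int owned by the caller, the same value can be threaded through every part of a multipart message instead of being reset per part.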
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 58 ++++++++++++++++++++++------------------------------------
1 files changed, 22 insertions(+), 36 deletions(-)
3f6fe4d5e86c3d8d1fad75bfeb71f398966813d4
diff --git a/mailinfo.c b/mailinfo.c
index bee7b20..3fa9505 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -237,38 +237,41 @@ static int eatspace(char *line)
#define SEEN_FROM 01
#define SEEN_DATE 02
#define SEEN_SUBJECT 04
+#define SEEN_PREFIX 0x08
/* First lines of body can have From:, Date:, and Subject: */
-static int handle_inbody_header(int *seen, char *line)
+static void handle_inbody_header(int *seen, char *line)
{
+ if (*seen & SEEN_PREFIX)
+ return;
if (!memcmp("From:", line, 5) && isspace(line[5])) {
if (!(*seen & SEEN_FROM) && handle_from(line+6)) {
*seen |= SEEN_FROM;
- return 1;
+ return;
}
}
if (!memcmp("Date:", line, 5) && isspace(line[5])) {
if (!(*seen & SEEN_DATE)) {
handle_date(line+6);
*seen |= SEEN_DATE;
- return 1;
+ return;
}
}
if (!memcmp("Subject:", line, 8) && isspace(line[8])) {
if (!(*seen & SEEN_SUBJECT)) {
handle_subject(line+9);
*seen |= SEEN_SUBJECT;
- return 1;
+ return;
}
}
if (!memcmp("[PATCH]", line, 7) && isspace(line[7])) {
if (!(*seen & SEEN_SUBJECT)) {
handle_subject(line);
*seen |= SEEN_SUBJECT;
- return 1;
+ return;
}
}
- return 0;
+ *seen |= SEEN_PREFIX;
}
static char *cleanup_subject(char *subject)
@@ -590,12 +593,7 @@ static void decode_transfer_encoding(cha
static void handle_info(void)
{
char *sub;
- static int done_info = 0;
-
- if (done_info)
- return;
- done_info = 1;
sub = cleanup_subject(subject);
cleanup_space(name);
cleanup_space(date);
@@ -609,7 +607,7 @@ static void handle_info(void)
/* We are inside message body and have read line[] already.
* Spit out the commit log.
*/
-static int handle_commit_msg(void)
+static int handle_commit_msg(int *seen)
{
if (!cmitmsg)
return 0;
@@ -633,6 +631,11 @@ static int handle_commit_msg(void)
decode_transfer_encoding(line);
if (metainfo_charset)
convert_to_utf8(line, charset);
+
+ handle_inbody_header(seen, line);
+ if (!(*seen & SEEN_PREFIX))
+ continue;
+
fputs(line, cmitmsg);
} while (fgets(line, sizeof(line), stdin) != NULL);
fclose(cmitmsg);
@@ -664,26 +667,16 @@ static void handle_patch(void)
* that the first part to contain commit message and a patch, and
* handle other parts as pure patches.
*/
-static int handle_multipart_one_part(void)
+static int handle_multipart_one_part(int *seen)
{
- int seen = 0;
int n = 0;
- int len;
while (fgets(line, sizeof(line), stdin) != NULL) {
again:
- len = eatspace(line);
n++;
- if (!len)
- continue;
if (is_multipart_boundary(line))
break;
- if (0 <= seen && handle_inbody_header(&seen, line))
- continue;
- seen = -1; /* no more inbody headers */
- line[len] = '\n';
- handle_info();
- if (handle_commit_msg())
+ if (handle_commit_msg(seen))
goto again;
handle_patch();
break;
@@ -695,6 +688,7 @@ static int handle_multipart_one_part(voi
static void handle_multipart_body(void)
{
+ int seen = 0;
int part_num = 0;
/* Skip up to the first boundary */
@@ -709,7 +703,7 @@ static void handle_multipart_body(void)
while (1) {
int hdr = read_one_header_line(line, sizeof(line), stdin);
if (!hdr) {
- if (handle_multipart_one_part() < 0)
+ if (handle_multipart_one_part(&seen) < 0)
return;
/* Reset per part headers */
transfer_encoding = TE_DONTCARE;
@@ -730,18 +724,9 @@ static void handle_body(void)
{
int seen = 0;
- while (fgets(line, sizeof(line), stdin) != NULL) {
- int len = eatspace(line);
- if (!len)
- continue;
- if (0 <= seen && handle_inbody_header(&seen, line))
- continue;
- seen = -1; /* no more inbody headers */
- line[len] = '\n';
- handle_info();
- handle_commit_msg();
+ if (fgets(line, sizeof(line), stdin) != NULL) {
+ handle_commit_msg(&seen);
handle_patch();
- break;
}
fclose(patchfile);
if (!patch_lines) {
@@ -791,6 +776,7 @@ int main(int argc, char **argv)
handle_multipart_body();
else
handle_body();
+ handle_info();
break;
}
check_header_line(line);
--
1.3.2.g5041c-dirty
* [PATCH 4/6] In handle_body only read a line if we don't already have one.
2006-05-23 19:47 ` [PATCH 3/6] Refactor commit messge handling Eric W. Biederman
@ 2006-05-23 19:49 ` Eric W. Biederman
2006-05-23 19:53 ` [PATCH 5/6] More accurately detect header lines in read_one_header_line Eric W. Biederman
0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
This prepares for detecting non-email patches that don't have
mail headers. In that case we have already read the first
line, so handle_body should not ignore it.
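The "use the buffered line if there is one" pattern, in isolation, looks roughly like this. The helper name have_line is made up for the sketch, and it assumes the caller clears line[0] once a line has been consumed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if line[] holds a line to process,
 * either one left over from header parsing or one freshly read,
 * and 0 at end of input.
 */
static int have_line(char *line, int sz, FILE *in)
{
	assert(line != NULL && in != NULL && sz > 0);
	if (line[0])
		return 1;	/* already read by the header parser */
	return fgets(line, sz, in) != NULL;
}
```

The short-circuit matters: when line[0] is set, fgets is never called, so the line that turned out not to be a header is not overwritten.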
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
3ad0c255a351d771c7f301d4a4e9bfb6fdcbde5f
diff --git a/mailinfo.c b/mailinfo.c
index 3fa9505..99989c2 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -724,7 +724,7 @@ static void handle_body(void)
{
int seen = 0;
- if (fgets(line, sizeof(line), stdin) != NULL) {
+ if (line[0] || fgets(line, sizeof(line), stdin) != NULL) {
handle_commit_msg(&seen);
handle_patch();
}
--
1.3.2.g5041c-dirty
* [PATCH 5/6] More accurately detect header lines in read_one_header_line
2006-05-23 19:49 ` [PATCH 4/6] In handle_body only read a line if we don't already have one Eric W. Biederman
@ 2006-05-23 19:53 ` Eric W. Biederman
2006-05-23 19:58 ` [PATCH 6/6] Allow in body headers beyond the in body header prefix Eric W. Biederman
2006-05-26 7:29 ` [PATCH 5/6] More accurately detect header lines in read_one_header_line Junio C Hamano
0 siblings, 2 replies; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Only count lines of the form '^.*: ' and '^From ' as email
header lines.
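Pulled out of the reading loop, the test amounts to something like the sketch below. looks_like_header is a made-up name; like the diff, it looks at the first colon only, so it is slightly stricter than the regex-style description above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: 1 if the line looks like an email header,
 * i.e. an mbox "From " line or "name:" followed by whitespace.
 */
static int looks_like_header(const char *line)
{
	const char *colon;

	assert(line != NULL);
	if (!strncmp(line, "From ", 5))
		return 1;	/* mbox separator counts as a header */
	colon = strchr(line, ':');
	return colon && isspace((unsigned char)colon[1]);
}
```

With this, the first line of a bare patch such as "diff --git a/mailinfo.c b/mailinfo.c" is not taken for a header, while "Subject: ..." still is.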
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 25 +++++++++++++++++--------
1 files changed, 17 insertions(+), 8 deletions(-)
b955444f0bfb4ee9a5cd31686dd7eeec0750e235
diff --git a/mailinfo.c b/mailinfo.c
index 99989c2..c642ff4 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -385,20 +385,29 @@ static int read_one_header_line(char *li
{
int ofs = 0;
while (ofs < sz) {
+ const char *colon;
int peek, len;
if (fgets(line + ofs, sz - ofs, in) == NULL)
- return ofs;
+ break;
len = eatspace(line + ofs);
if (len == 0)
- return ofs;
- peek = fgetc(in); ungetc(peek, in);
- if (peek == ' ' || peek == '\t') {
- /* Yuck, 2822 header "folding" */
- ofs += len;
- continue;
+ break;
+ colon = strchr(line, ':');
+ if (!colon || !isspace(colon[1])) {
+ /* Readd the newline */
+ line[ofs + len] = '\n';
+ line[ofs + len + 1] = '\0';
+ break;
}
- return ofs + len;
+ ofs += len;
+ /* Yuck, 2822 header "folding" */
+ peek = fgetc(in); ungetc(peek, in);
+ if (peek != ' ' && peek != '\t')
+ break;
}
+ /* Count mbox From headers as headers */
+ if (!ofs && !memcmp(line, "From ", 5))
+ ofs = 1;
return ofs;
}
--
1.3.2.g5041c-dirty
* [PATCH 6/6] Allow in body headers beyond the in body header prefix.
2006-05-23 19:53 ` [PATCH 5/6] More accurately detect header lines in read_one_header_line Eric W. Biederman
@ 2006-05-23 19:58 ` Eric W. Biederman
2006-05-26 7:29 ` [PATCH 5/6] More accurately detect header lines in read_one_header_line Junio C Hamano
1 sibling, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-23 19:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
- handle_from is fixed to not mangle its input line.
- Then handle_inbody_header is allowed to look in
the body of a commit message for additional headers
that we haven't already seen.
This allows patches with all of the right information in
unfortunate places to be imported.
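The "do not mangle the input" part boils down to doing the destructive parsing on a private copy. A rough sketch follows; extract_email is a made-up name and is much simpler than handle_from, and it uses a bounded snprintf copy where the diff below uses strcpy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the address containing '@' out of
 * in_line into email.  Returns 1 if an address was found, else 0.
 * in_line itself is never modified.
 */
static int extract_email(const char *in_line, char *email, size_t emailsz)
{
	char line[1000];
	char *at, *start, *end;

	assert(in_line != NULL && email != NULL);
	/* Bounded copy: scribble on this, never on the caller's line. */
	snprintf(line, sizeof(line), "%s", in_line);
	at = strchr(line, '@');
	if (!at)
		return 0;
	start = at;
	while (start > line && !strchr(" \t<(", start[-1]))
		start--;
	end = at;
	while (*end && !strchr(" \t\n>)", *end))
		end++;
	*end = '\0';	/* destructive edit, confined to the copy */
	snprintf(email, emailsz, "%s", start);
	return 1;
}
```

Since the caller's line survives intact, a "From:" line found in the middle of the message body can still be written to the commit message if it turns out not to be usable.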
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
mailinfo.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
eca59d2fd60af47170cdbfdebf3384465f0e7635
diff --git a/mailinfo.c b/mailinfo.c
index c642ff4..99374b3 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -72,11 +72,14 @@ static int bogus_from(char *line)
return 1;
}
-static int handle_from(char *line)
+static int handle_from(char *in_line)
{
- char *at = strchr(line, '@');
+ char line[1000];
+ char *at;
char *dst;
+ strcpy(line, in_line);
+ at = strchr(line, '@');
if (!at)
return bogus_from(line);
@@ -242,8 +245,6 @@ #define SEEN_PREFIX 0x08
/* First lines of body can have From:, Date:, and Subject: */
static void handle_inbody_header(int *seen, char *line)
{
- if (*seen & SEEN_PREFIX)
- return;
if (!memcmp("From:", line, 5) && isspace(line[5])) {
if (!(*seen & SEEN_FROM) && handle_from(line+6)) {
*seen |= SEEN_FROM;
--
1.3.2.g5041c-dirty
* Re: [PATCH 0/6] Detect non email patches in git-mailinfo
2006-05-23 19:42 [PATCH 0/6] Detect non email patches in git-mailinfo Eric W. Biederman
2006-05-23 19:44 ` [PATCH 1/6] Make read_one_header_line return a flag not a length Eric W. Biederman
@ 2006-05-23 23:44 ` Junio C Hamano
1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2006-05-23 23:44 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: git
Thanks. Merged to "next"; this will probably graduate to
"master" by the end of the week if not earlier.
* Re: [PATCH 5/6] More accurately detect header lines in read_one_header_line
2006-05-23 19:53 ` [PATCH 5/6] More accurately detect header lines in read_one_header_line Eric W. Biederman
2006-05-23 19:58 ` [PATCH 6/6] Allow in body headers beyond the in body header prefix Eric W. Biederman
@ 2006-05-26 7:29 ` Junio C Hamano
[not found] ` <7vr72hns7h.fsf@assigned-by-dhcp.cox.net>
1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-05-26 7:29 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: git
ebiederm@xmission.com (Eric W. Biederman) writes:
> Only count lines of the form '^.*: ' and '^From ' as email
> header lines.
I am having trouble with this patch.
> diff --git a/mailinfo.c b/mailinfo.c
> index 99989c2..c642ff4 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -385,20 +385,29 @@ static int read_one_header_line(char *li
> {
> int ofs = 0;
> while (ofs < sz) {
> + const char *colon;
> int peek, len;
> if (fgets(line + ofs, sz - ofs, in) == NULL)
> + break;
> len = eatspace(line + ofs);
> if (len == 0)
> + break;
> + colon = strchr(line, ':');
> + if (!colon || !isspace(colon[1])) {
> + /* Readd the newline */
> + line[ofs + len] = '\n';
> + line[ofs + len + 1] = '\0';
> + break;
> }
Because eatspace() eats the trailing space, although your commit
message says lines matching "^.*: " are headers, the line marked
with -> below does not match the criterion:
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on
gitster.siamese.dyndns.org
-> X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00
autolearn=ham version=3.1.1
Notice that the field body for this unstructured header
X-Spam-Level (an optional field) consists of a single
whitespace. It will be gone because of eatspace() when your
check sees the line, so the header parsing stops prematurely.
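The interaction can be reproduced in a few lines. This is only a sketch: strip_trailing_space is a stand-in that assumes eatspace() strips trailing whitespace and returns the new length, as described above, and strict_check and relaxed_check are made-up names for the test with and without the isspace(colon[1]) part.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Stand-in for eatspace(): strip trailing whitespace, return new length. */
static int strip_trailing_space(char *line)
{
	int len;

	assert(line != NULL);
	len = (int)strlen(line);
	while (len > 0 && isspace((unsigned char)line[len - 1]))
		line[--len] = '\0';
	return len;
}

/* The check as posted: a colon followed by whitespace. */
static int strict_check(const char *line)
{
	const char *colon = strchr(line, ':');
	return colon && isspace((unsigned char)colon[1]);
}

/* The relaxed check: any colon will do. */
static int relaxed_check(const char *line)
{
	return strchr(line, ':') != NULL;
}
```

After stripping, "X-Spam-Level: " becomes "X-Spam-Level:", so colon[1] is the terminating NUL and the strict check rejects it, while the relaxed check still accepts it and still rejects a colon-less subject line.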
Was there a particular reason you needed this change? That is,
did you have to parse mail-looking input that does not have a
blank line between runs of headers and the body of the message?
If so, I'd at least like to remove the || !isspace(colon[1])
from the test. After all, I do not think RFC2822 requires a
whitespace after the colon there.
* Re: [PATCH 5/6] More accurately detect header lines in read_one_header_line
[not found] ` <7vr72hns7h.fsf@assigned-by-dhcp.cox.net>
@ 2006-05-26 8:16 ` Eric W. Biederman
0 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2006-05-26 8:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
> Junio C Hamano <junkio@cox.net> writes:
>
>> Was there a particular reason you needed this change? That is,
>> did you have to parse mail-looking input that does not have a
>> blank line between runs of headers and the body of the message?
Yes. I had patches that had a subject line followed by a blank line,
and the problem was that the old check thought the subject was a
header line, despite not even having a colon in it.
>> If so, I'd at least like to remove the || !isspace(colon[1])
>> from the test. After all, I do not think RFC2822 requires a
>> whitespace after the colon there.
>
> In other words, something like this (tested):
Looks good to me, sorry for missing that one.
Eric