Git development
* [PATCH] Don't parse any headers in the real body of an email message.
From: Eric W. Biederman @ 2006-06-12 19:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606121204220.5498@g5.osdl.org>


It was pointed out that the current behaviour might misparse a patch comment,
so remove this behaviour for now.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 mailinfo.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 3696d61..325c3b2 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -254,6 +254,8 @@ #define SEEN_PREFIX  020
 /* First lines of body can have From:, Date:, and Subject: or be blank */
 static void handle_inbody_header(int *seen, char *line)
 {
+	if (*seen & SEEN_PREFIX)
+		return;
 	if (!memcmp(">From", line, 5) && isspace(line[5])) {
 		if (!(*seen & SEEN_BOGUS_UNIX_FROM)) {
 			*seen |= SEEN_BOGUS_UNIX_FROM;
-- 
1.4.0.g25f48-dirty

^ permalink raw reply related

* [PATCH] Fix git-format-patch -s
From: Eric W. Biederman @ 2006-06-12 19:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


When git-format-patch was converted to a builtin, an appropriate call
to setup_ident was missed, and thus git-format-patch -s fails because
it doesn't look up anything in the password file.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 builtin-log.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..322024c 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -217,8 +217,11 @@ int cmd_format_patch(int argc, const cha
 		}
 		else if (!strcmp(argv[i], "--signoff") ||
 			 !strcmp(argv[i], "-s")) {
-			const char *committer = git_committer_info(1);
-			const char *endpos = strchr(committer, '>');
+			const char *committer;
+			const char *endpos;
+			setup_ident();
+			committer = git_committer_info(1);
+			endpos = strchr(committer, '>');
 			if (!endpos)
 				die("bogos committer info %s\n", committer);
 			add_signoff = xmalloc(endpos - committer + 2);
-- 
1.4.0.g25f48-dirty

^ permalink raw reply related

* [PATCH] Ignore blank lines among the inbody headers
From: Eric W. Biederman @ 2006-06-12 19:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <m17j3m6wmw.fsf_-_@ebiederm.dsl.xmission.com>


This is a fix for a regression introduced in:
8b4525fb3c6d79bd3a64b8f441237a4095db4e22.

When I refactored the inbody header parsing into a state machine I failed
to see the logic that skipped multiple leading spaces if they are present.
I think I assumed that logic was just there to skip the initial blank
line between the mail headers and the body.

This restores that behaviour, and since we now ignore all leading blank lines
in commit messages, this code removes the special case for the blank
line between the mail headers and the body.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---

This is a resend to add my missing Signed-off-by line.
---
 mailinfo.c |   24 ++++++++++++++++--------
 1 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 5b6c215..3696d61 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -229,6 +229,14 @@ static int is_multipart_boundary(const c
 	return (!memcmp(line, multipart_boundary, multipart_boundary_len));
 }
 
+static int is_blank(char *line)
+{
+	char *ch;
+	for (ch = line; isspace(*ch); ch++)
+		;
+	return *ch == '\0';
+}
+
 static int eatspace(char *line)
 {
 	int len = strlen(line);
@@ -243,7 +251,7 @@ #define SEEN_SUBJECT 04
 #define SEEN_BOGUS_UNIX_FROM 010
 #define SEEN_PREFIX  020
 
-/* First lines of body can have From:, Date:, and Subject: */
+/* First lines of body can have From:, Date:, and Subject: or be blank */
 static void handle_inbody_header(int *seen, char *line)
 {
 	if (!memcmp(">From", line, 5) && isspace(line[5])) {
@@ -279,6 +287,10 @@ static void handle_inbody_header(int *se
 			return;
 		}
 	}
+	if (isspace(line[0])) {
+		if (!(*seen & SEEN_PREFIX) && is_blank(line))
+			return;
+	}
 	*seen |= SEEN_PREFIX;
 }
 
@@ -420,9 +432,7 @@ static int read_one_header_line(char *li
 		if (fgets(line + ofs, sz - ofs, in) == NULL)
 			break;
 		len = eatspace(line + ofs);
-		if (len == 0)
-			break;
-		if (!is_rfc2822_header(line)) {
+		if ((len == 0) || !is_rfc2822_header(line)) {
 			/* Re-add the newline */
 			line[ofs + len] = '\n';
 			line[ofs + len + 1] = '\0';
@@ -762,10 +772,8 @@ static void handle_body(void)
 {
 	int seen = 0;
 
-	if (line[0] || fgets(line, sizeof(line), stdin) != NULL) {
-		handle_commit_msg(&seen);
-		handle_patch();
-	}
+	handle_commit_msg(&seen);
+	handle_patch();
 	fclose(patchfile);
 	if (!patch_lines) {
 		fprintf(stderr, "No patch found\n");
-- 
1.4.0.g25f48-dirty

^ permalink raw reply related

* Re: Thoughts on adding another hook to git
From: David Kowis @ 2006-06-12 19:22 UTC (permalink / raw)
  To: Yakov Lerner; +Cc: git
In-Reply-To: <f36b08ee0606121218s6cdcfec2i42482ed5284a45e3@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Yakov Lerner wrote:
> On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
>>
>> Yakov Lerner wrote:
>> > On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
>> >> I'd like to be able to modify the commit message before it ends up in
>> >> the $EDITOR.
>> >
>> > Can't you define $EDITOR to point to some script
>> > which modifies the file as you wish then calls the
>> > real editor on it ?
>> >
>>
>> I could, but then anything else that uses $EDITOR would also be affected
>> in the same way... Which would produce interesting results.
> 
> git-commit sure creates those temp files with
> specific naming in specific dir. You could check for
> that in EDITOR script. In the script, you could even check
> the name of the parent process.
> 

This is true. However, I'd be running that script every time something
invoked $EDITOR. And some people may not like that solution. I think
more people than just me would like to use this pre-edit hook,
especially in the distro I'm helping develop.


- --
David Kowis

ISO Team Lead - www.sourcemage.org
Source Mage GNU/Linux

Progress isn't made by early risers. It's made by lazy men trying to
find easier ways to do something.
  - Robert Heinlein
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)

iQGVAwUBRI2+68nf+vRw63ObAQo6Kwv/bmLf8f54lm7sXekww8olFwT3SkE8orHk
BozzIyW8foz+FLtVbjQQbbGw1RgLrL5zPn+De+BM3LwXabhqnsVca2KpQVMkLaqx
aJwjn2JvL6ujG3ponuCCPTk5VhzU0C/Su15eIMa86O2EFu6Y0HBhw/hOnyEWJiYa
tOxPETizJHD1YbneoTJmu+tQFaKbjAD/3tUMDkQBp9h1QkbZHD5LQUjozepLZplY
PfOvZIP9fT6GDsK5SuweGrYZHjzuV0RlrwN191No3FsJMMX7+PQ85MBvj1p/xJG7
VO+z33+IYPascpm/3NdpjdtKAh72+rfW5OUd8FN1ISwPtY9dJeh5zaaCTB+oEqL4
56pchLL33SFphOO5//YwHcwgX61tPK0stsVpjfdQHEjz8BHNSoBhZw3lw/AvO+63
UhiA3rjiIFrYe9piJHlX+IxoNo5OaLJNO2KjV9k15+0FxSzbKPn3Pt2Ee90ootEn
8NzHFVoyOWnt5mPM+jQr4DsPgrikeUaO
=3hwD
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: Thoughts on adding another hook to git
From: Yakov Lerner @ 2006-06-12 19:18 UTC (permalink / raw)
  To: David Kowis; +Cc: git
In-Reply-To: <448DBC2B.1070807@shlrm.org>

On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Yakov Lerner wrote:
> > On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
> >> I'd like to be able to modify the commit message before it ends up in
> >> the $EDITOR.
> >
> > Can't you define $EDITOR to point to some script
> > which modifies the file as you wish then calls the
> > real editor on it ?
> >
>
> I could, but then anything else that uses $EDITOR would also be affected
> in the same way... Which would produce interesting results.

git-commit sure creates those temp files with
specific naming in specific dir. You could check for
that in EDITOR script. In the script, you could even check
the name of the parent process.

Yakov

^ permalink raw reply

* Re: svn to git, N-squared?
From: Linus Torvalds @ 2006-06-12 19:17 UTC (permalink / raw)
  To: Yakov Lerner; +Cc: Jon Smirl, git
In-Reply-To: <f36b08ee0606121204q1f9dfb5dv3c09c4e9e6a16a0f@mail.gmail.com>



On Mon, 12 Jun 2006, Yakov Lerner wrote:
> 
> Is this related to 1-level dir tree for objects (12/object)
> vs 2-level dir tree (12/34/object) ? Does git employ more levels
> for object tree for large projects ?

The "more levels" approach was certainly an option early on, when we 
discussed how the objects should be spread out.

It was basically made a non-issue by the pack-files. These days, the rule 
is really more along the lines of "if you ever have more than a few 
thousand files, you've not repacked properly".

The git-svnimport script obviously doesn't do it right, but it should be 
trivial to fix. For the git cvsimporter, the fix was literally to just do

	$commitcount++;
	..
	if (($commitcount & 1023) == 0) {
		system("git repack -a -d");  
	}

when committing and that was it. It doesn't get much simpler than that, 
but the svnimporter just hasn't done it yet.

		Linus
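
The counter check above can be sketched in C as well. This is only an
illustration of the "repack every 1024 commits" idea, not code from any
importer; should_repack() and count_repacks() are hypothetical names, and
the repack itself is left as a comment.

```c
/* Hypothetical helper: true on every 1024th commit, mirroring the
 * ($commitcount & 1023) == 0 test. 1023 is 0x3ff, so the low ten
 * bits are zero exactly when the count is a multiple of 1024. */
static int should_repack(unsigned long commitcount)
{
	return (commitcount & 1023) == 0;
}

/* Count how many repacks an import of `total` commits would trigger. */
static unsigned long count_repacks(unsigned long total)
{
	unsigned long commitcount = 0, repacks = 0;

	while (commitcount < total) {
		commitcount++;
		if (should_repack(commitcount))
			repacks++;	/* an importer would run "git repack -a -d" here */
	}
	return repacks;
}
```

Because the counter is incremented before the test, the first repack
happens at commit 1024 rather than at commit 0, and the number of loose
objects between repacks stays bounded by whatever 1024 commits produce.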

^ permalink raw reply

* Re: Thoughts on adding another hook to git
From: David Kowis @ 2006-06-12 19:10 UTC (permalink / raw)
  To: Yakov Lerner; +Cc: git
In-Reply-To: <f36b08ee0606121206k62242354k13671f95da6b1418@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Yakov Lerner wrote:
> On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
>> I'd like to be able to modify the commit message before it ends up in
>> the $EDITOR.
> 
> Can't you define $EDITOR to point to some script
> which modifies the file as you wish then calls the
> real editor on it ?
> 

I could, but then anything else that uses $EDITOR would also be affected
in the same way... Which would produce interesting results.

- --
David Kowis

ISO Team Lead - www.sourcemage.org
Source Mage GNU/Linux

Progress isn't made by early risers. It's made by lazy men trying to
find easier ways to do something.
  - Robert Heinlein
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)

iQGVAwUBRI28K8nf+vRw63ObAQoKPgv9EvwbDkGmct7IZGFMydea+HlIMWR+Jyla
WHYnCN353Hw+WbOIvnTvlJrI1R+zSYIu2tDhZ2P/2czyWyja8HQHjGhTXbBInILX
T4ODPwZ55od4uDi1arnHgRpJwhLeGIU+1Wxc8k70tszWj2gb6sIGMHcK9LhzZ+Sf
lEY6iGF74TE3gyQsj78smxL/COvNjzoCWY4AieIVxtu7b1shb7lZXbnkfcKhs82L
0bdmHKri7999nxgWnmdyaDi9RuYOKinc/YhrKDrvY2GB5c8BQHgpFMDR/17oTREL
PTmAJwFs8dAAalGmPAajZY1gXrqo/lVb4JPK4b2QboEC8SGpFwcq4jtHCr/s2mQd
uNINnZ62+dxgRxk9koW2QZeh7hPB8rFcIufUhUC19P0+UWv5TDuKie/mR1U6uZNN
BfTIj/1AI5+l9kCJS+om9o8P1m2wPW4MsP2XaqatInUz9YXn14zrjcKTZnCuIMvw
mqrBlfI7L2KEsoL4ywJsb4ATVz7M6G0I
=IFOu
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: git-applymbox broken?
From: Linus Torvalds @ 2006-06-12 19:10 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <m13bea6w13.fsf@ebiederm.dsl.xmission.com>



On Mon, 12 Jun 2006, Eric W. Biederman wrote:
> 
> Below is an example of the kind of patch that inspired me to relax the
> rules on parsing in-body headers (this comes from Andi Kleen's quilt tree).

And this is wrong.

We should _not_ accept crappy patches, and then start guessing at what the 
person meant.

From the very beginning of git, I tried to make it extremely clear that 
there is never any guessing going on. We don't use "heuristics" except as 
a pure optimization: ie a heuristic can have a _performance_ impact, but 
it must never EVER have semantic impact.

SCM's are not about guessing. They are about saving the _exact_ state that 
the user asked for. No "let's try to be nice", no gray areas.

If the new git-applymbox just takes random lines from the body of the 
email, and decides that they may be authorship information, then that is a 
BUG. The "From: " line in the middle of an email may well be about 
somebody having _discovered_ the bug, and we're quoting him as part of the 
explanation. It does NOT mean that it's about authorship.

So we should ONLY check for "From:" (and perhaps "Subject:" and "Date:") 
at the very top of the email body. NOWHERE ELSE.

The fact that somebody has a crappy quilt tree, and the fact that quilt is 
very much an "anything goes" kind of laissez faire system does not mean, 
and should NEVER mean that git becomes the same kind of mess of "let's do 
a best effort and try to guess what somebody means" kind of thing.

I check and edit my emails before I apply them, and I try to teach the 
people who send them manners and what the rules are. THAT is the way to 
handle this, not by having the tool itself become unreliable and random.

		Linus
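
The rule stated above (headers only at the very top of the body, nothing
guessed later) can be sketched as a small state machine. This is a
simplified illustration, not the mailinfo.c code: it reuses the SEEN_*
flag names and is_blank() from the patches in this thread, but the return
value and the plain strncmp() matching are assumptions made for the
example.

```c
#include <ctype.h>
#include <string.h>

#define SEEN_FROM    01
#define SEEN_DATE    02
#define SEEN_SUBJECT 04
#define SEEN_PREFIX  020	/* real body text has started */

/* True if the line holds nothing but whitespace. */
static int is_blank(const char *line)
{
	const char *ch;

	for (ch = line; isspace((unsigned char)*ch); ch++)
		;
	return *ch == '\0';
}

/* Returns 1 if the line was consumed as an in-body header (or as a
 * leading blank line), 0 if it belongs to the commit message. */
static int handle_inbody_header(int *seen, const char *line)
{
	if (*seen & SEEN_PREFIX)
		return 0;	/* past the top of the body: never guess */
	if (!(*seen & SEEN_FROM) && !strncmp(line, "From:", 5)) {
		*seen |= SEEN_FROM;
		return 1;
	}
	if (!(*seen & SEEN_DATE) && !strncmp(line, "Date:", 5)) {
		*seen |= SEEN_DATE;
		return 1;
	}
	if (!(*seen & SEEN_SUBJECT) && !strncmp(line, "Subject:", 8)) {
		*seen |= SEEN_SUBJECT;
		return 1;
	}
	if (is_blank(line))
		return 1;	/* blank lines among the headers are skipped */
	*seen |= SEEN_PREFIX;	/* first real text line ends header parsing */
	return 0;
}
```

With this rule, a "From:" line that follows any ordinary text line is kept
as part of the message, so a body that opens with a free-form title line
followed by "From:" does not yield authorship information.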

^ permalink raw reply

* Re: Thoughts on adding another hook to git
From: Yakov Lerner @ 2006-06-12 19:06 UTC (permalink / raw)
  To: David Kowis; +Cc: git
In-Reply-To: <448DB201.5090208@shlrm.org>

On 6/12/06, David Kowis <dkowis@shlrm.org> wrote:
> I'd like to be able to modify the commit message before it ends up in
> the $EDITOR.

Can't you define $EDITOR to point to some script
which modifies the file as you wish then calls the
real editor on it ?

Yakov

^ permalink raw reply

* Re: svn to git, N-squared?
From: Yakov Lerner @ 2006-06-12 19:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git
In-Reply-To: <Pine.LNX.4.64.0606112056440.5498@g5.osdl.org>

On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Sun, 11 Jun 2006, Jon Smirl wrote:
> > I have it stopped and I am running the repack.
> > There are 1.27M files in my .git directory
> Yeah, that would do it. That's ~5000 files per object directory, so I
> assume that your directories are 200+kB in size, and for every new object
> added, you'll basically have to traverse the old directory fully in order
> to find an empty place for it

Is this related to 1-level dir tree for objects (12/object)
vs 2-level dir tree (12/34/object) ? Does git employ more levels
for object tree for large projects ?

Yakov
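
To make the two layouts in the question concrete, here is a minimal
sketch that builds the relative path for an object name with a given
number of two-character directory levels. object_path() is a hypothetical
helper written for this example, not a git function. With one level there
are 256 directories, so 1.27M loose objects come to roughly 4960 entries
each, which matches the ~5000 estimate quoted above; two levels would
spread them over 65536 directories, about 19 entries each.

```c
#include <string.h>

/* Hypothetical helper: write hex as "12/3456..." (levels == 1) or
 * "12/34/56..." (levels == 2) into out.
 * Returns 0 on success, -1 on bad input or a too-small buffer. */
static int object_path(const char *hex, int levels, char *out, size_t outsz)
{
	size_t hexlen = strlen(hex);
	size_t pos = 0;
	int i;

	/* at least one character must remain for the file name */
	if (levels < 0 || (size_t)levels * 2 >= hexlen)
		return -1;
	/* each level adds one '/', plus the terminating NUL */
	if (hexlen + (size_t)levels + 1 > outsz)
		return -1;
	for (i = 0; i < levels; i++) {
		out[pos++] = hex[2 * i];
		out[pos++] = hex[2 * i + 1];
		out[pos++] = '/';
	}
	strcpy(out + pos, hex + 2 * (size_t)levels);
	return 0;
}
```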

^ permalink raw reply

* Re: svn to git, N-squared?
From: Jon Smirl @ 2006-06-12 19:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux@horizon.com, git
In-Reply-To: <9e4733910606121106ta925b6er49fe68bf3c1031f5@mail.gmail.com>

On 6/12/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > Having that many files in a single directory (or two) is a total disaster.
> > That said, it works well enough if you don't create new files very often
> > (and _preferably_ don't look them up either, although that is effectively
> > helped by indexing). I _suspect_ that
>
> I posted to the svn list, and they said that 220K files is normal. They
> told me to turn on the ext2 dir_index option. Checking my system, I see
> that none of the partitions have it turned on, so it must not be the
> default for FC5.
>
> I have to unmount the drive to convert existing directories. I can
> try doing the file move trick while the process is running, since
> new directories will use it.

I converted the ext3 directories to dir_index on the fly using the
move trick. Switching the directory index makes it look like it is
spending even more time in the kernel.

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 1  0 636188  22380  19176 157200    0    0     0    52  436   415 13 40 48  0  0
 1  0 636188  22504  19176 157200    0    0     0     0  430   373 13 38 49  0  0
 1  0 636188  22628  19176 157064    0    0     0     0  433   380 12 39 49  0  0
 1  0 636188  22628  19184 157056    0    0     0    20  434   390 12 38 49  0  0
 1  0 636188  22628  19184 156920    0    0     0     0  431   376 11 40 49  0  0
 1  0 636188  22752  19192 156912    0    0     0    48  437   376 12 40 49  0  0
 1  0 636188  22876  19192 156912    0    0     0     0  430   386 11 40 49  0  0
 1  0 636188  22752  19192 156776    0    0     0     0  431   370 10 41 49  0  0
 1  0 636188  23016  19192 156776    0    0     8     0  422   500 22 40 37  2  0

The size of the svn directories went from 3.2MB to 4.4MB after they
were converted to ext3 indexed mode.

I'll get oprofile running when I do a reboot.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: git-applymbox broken?
From: Eric W. Biederman @ 2006-06-12 18:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606111735440.5498@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> What do you mean by "middle"?
>
> No, it should only look at From: and Subject: lines if they are at the 
> very top, with no other non-whitespace lines above them. But when it looks 
> at them and uses the data from them, it should then remove them from the 
> body - they are "conceptually" just extended header lines that just 
> happened to technically (from an rfc822 standpoint) be in the body of the 
> email.

Below is an example of the kind of patch that inspired me to relax the
rules on parsing in-body headers (this comes from Andi Kleen's quilt tree).

The first line in this instance is obviously a subject line but there
is not really a good way to detect that.  Then we get a From: line.

Now I doubt any patches ever hit the mail in this format and it probably
isn't worth it to track down every variation of patch headers in existence.
But if we don't find a From: header in the body prefix it seems to make
sense to keep looking for headers in the body, and to use the information
if we find it.

---
Kdump i386 nmi event notification fix

From: Vivek Goyal <vgoyal@in.ibm.com>

After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/i386/kernel/crash.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/i386/kernel/crash.c
===================================================================
--- linux.orig/arch/i386/kernel/crash.c
+++ linux/arch/i386/kernel/crash.c
@@ -102,7 +102,7 @@ static int crash_nmi_callback(struct not
 	struct pt_regs fixed_regs;
 	int cpu;
 
-	if (val != DIE_NMI)
+	if (val != DIE_NMI_IPI)
 		return NOTIFY_OK;
 
 	regs = ((struct die_args *)data)->regs;
@@ -113,7 +113,7 @@ static int crash_nmi_callback(struct not
 	 * an NMI if system was initially booted with nmi_watchdog parameter.
 	 */
 	if (cpu == crashing_cpu)
-		return 1;
+		return NOTIFY_STOP;
 	local_irq_disable();
 
 	if (!user_mode_vm(regs)) {

^ permalink raw reply

* [PATCH] Ignore blank lines among the inbody headers.
From: Eric W. Biederman @ 2006-06-12 18:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606111735440.5498@g5.osdl.org>


This is a fix for a regression introduced in:
8b4525fb3c6d79bd3a64b8f441237a4095db4e22.

When I refactored the inbody header parsing into a state machine I failed
to see the logic that skipped multiple leading spaces if they are present.
I think I assumed that logic was just there to skip the initial blank
line between the mail headers and the body.

This restores that behaviour, and since we now ignore all leading blank lines
in commit messages, this code removes the special case for the blank
line between the mail headers and the body.
---
 mailinfo.c |   24 ++++++++++++++++--------
 1 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 5b6c215..3696d61 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -229,6 +229,14 @@ static int is_multipart_boundary(const c
 	return (!memcmp(line, multipart_boundary, multipart_boundary_len));
 }
 
+static int is_blank(char *line)
+{
+	char *ch;
+	for (ch = line; isspace(*ch); ch++)
+		;
+	return *ch == '\0';
+}
+
 static int eatspace(char *line)
 {
 	int len = strlen(line);
@@ -243,7 +251,7 @@ #define SEEN_SUBJECT 04
 #define SEEN_BOGUS_UNIX_FROM 010
 #define SEEN_PREFIX  020
 
-/* First lines of body can have From:, Date:, and Subject: */
+/* First lines of body can have From:, Date:, and Subject: or be blank */
 static void handle_inbody_header(int *seen, char *line)
 {
 	if (!memcmp(">From", line, 5) && isspace(line[5])) {
@@ -279,6 +287,10 @@ static void handle_inbody_header(int *se
 			return;
 		}
 	}
+	if (isspace(line[0])) {
+		if (!(*seen & SEEN_PREFIX) && is_blank(line))
+			return;
+	}
 	*seen |= SEEN_PREFIX;
 }
 
@@ -420,9 +432,7 @@ static int read_one_header_line(char *li
 		if (fgets(line + ofs, sz - ofs, in) == NULL)
 			break;
 		len = eatspace(line + ofs);
-		if (len == 0)
-			break;
-		if (!is_rfc2822_header(line)) {
+		if ((len == 0) || !is_rfc2822_header(line)) {
 			/* Re-add the newline */
 			line[ofs + len] = '\n';
 			line[ofs + len + 1] = '\0';
@@ -762,10 +772,8 @@ static void handle_body(void)
 {
 	int seen = 0;
 
-	if (line[0] || fgets(line, sizeof(line), stdin) != NULL) {
-		handle_commit_msg(&seen);
-		handle_patch();
-	}
+	handle_commit_msg(&seen);
+	handle_patch();
 	fclose(patchfile);
 	if (!patch_lines) {
 		fprintf(stderr, "No patch found\n");
-- 
1.4.0.rc2.g5e3a6

^ permalink raw reply related

* Thoughts on adding another hook to git
From: David Kowis @ 2006-06-12 18:27 UTC (permalink / raw)
  To: git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

I'd like to be able to modify the commit message before it ends up in
the $EDITOR. This is a fairly trivial thing to implement:
Call ${GIT_DIR}/hooks/pre-editor on COMMIT_MESSAGE before opening it in
$EDITOR.

My question to you all is: should I set it up so that the hook only runs
when the $EDITOR is actually being called? (really easy)
Or do I set it up so that the hook always runs? In that case it's
similar to the existing commit-msg hook, except that it runs before the
message is edited instead of after.

Thanks,
- --
David Kowis

ISO Team Lead - www.sourcemage.org
Source Mage GNU/Linux

Progress isn't made by early risers. It's made by lazy men trying to
find easier ways to do something.
  - Robert Heinlein
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)

iQGVAwUBRI2yAMnf+vRw63ObAQpNSgv+OCXYSDlW96K9M5QZvSaEbdZOGorYZg5Y
RSh9WUXS2ribYRr1TbplD0Fp4vGnG8CB6qvr2QF8vP3tbEMjnwk4LobeWaUtK2Kn
Hja3TgIUPWkzHMLleToe5o99r8v/6LFf9rkBxvFw3TMkuxsFS/lFlxy1eRa43rvd
Skod2cA7RWus1IFJcbDKNonjhJkVkHylSMjT8iVQDbgY0hg7PEy2ZW3XB0MJJRZC
lLsDDIJ4msPCXSx/lDRGaJj+m7IrvUgnEDzkX0jTT8DeZqnlC8nRM/2dOS72b/5w
gIBYu49DvTL8ynod2mmYTyBynfRpVxPjxnXbubn/M+N+0WCTXIUTPCbyW2MOscjA
pFe6/S1qKaTqc06VBDabYxdvGrHG6v+KkaJhu2XoLOHWVoBblobBBNrpIkA6GNqz
H7JHNJDF+JbshlW2aU2HazDINRfD/AfrJmDx4Xn91qAKiegyO3wRA1rM6a0LEpun
zg3haF3l0rfBEdFpz21gNQbYxNHaRkwg
=Rxm/
-----END PGP SIGNATURE-----
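
The proposal above could look roughly like the following C sketch. It is
only an illustration under stated assumptions: the hook name "pre-editor"
comes from the message itself and is a proposal, not an existing git
hook; run_pre_editor_hook() is a hypothetical function; and the policy of
silently skipping a missing or non-executable hook is a choice made for
this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: run <git_dir>/hooks/pre-editor on the message file.
 * Returns 0 if the hook is absent, not executable, or exits with 0;
 * the hook's exit status if it fails; -1 on an internal error. */
static int run_pre_editor_hook(const char *git_dir, const char *msg_file)
{
	char hook[4096];
	pid_t pid;
	int status, n;

	n = snprintf(hook, sizeof(hook), "%s/hooks/pre-editor", git_dir);
	if (n < 0 || (size_t)n >= sizeof(hook))
		return -1;	/* path did not fit */
	if (access(hook, X_OK) != 0)
		return 0;	/* no hook installed: nothing to do */

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* child: the hook gets the message file as its only argument */
		execl(hook, hook, msg_file, (char *)NULL);
		_exit(127);	/* exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would invoke this just before launching $EDITOR on the message
file (the "only when the editor is called" variant), or unconditionally
before the message is finalized (the "always" variant); the function
itself is the same in both cases.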

^ permalink raw reply

* Re: gitweb: Config file support (was: Adding a `blame' interface.)
From: Florian Forster @ 2006-06-12 18:11 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90606120134n21c269bbj3e8c7e31d4d93a23@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 714 bytes --]

Hi Martin,

On Mon, Jun 12, 2006 at 08:34:43PM +1200, Martin Langhoff wrote:
> > As far as I know the Debian maintainer of the `gitweb' package has
> > asked for this before but was refused for some reason..
> BTW, I haven't seen the debian maintainer's request, was that on the list?

Yes, it was a mail by Andres Salomon on May 20th, 2005 with the subject
`add conf file support to gitweb'. A friend of mine asked him if he had
sent the patch upstream and he pointed to this message and explained he
had gotten a private reply saying that gitweb `only covers the special
needs on kernel.org'.

Regards,
-octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: svn to git, N-squared?
From: Jon Smirl @ 2006-06-12 18:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux@horizon.com, git
In-Reply-To: <Pine.LNX.4.64.0606120958230.5498@g5.osdl.org>

On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Having that many files in a single directory (or two) is a total disaster.
> That said, it works well enough if you don't create new files very often
> (and _preferably_ don't look them up either, although that is effectively
> helped by indexing). I _suspect_ that

I posted to the svn list, and they said that 220K files is normal. They
told me to turn on the ext2 dir_index option. Checking my system, I see
that none of the partitions have it turned on, so it must not be the
default for FC5.

I have to unmount the drive to convert existing directories. I can
try doing the file move trick while the process is running, since
new directories will use it.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: [PATCH] gitweb: Supporting caches (was: Adding a `blame' interface.)
From: Florian Forster @ 2006-06-12 17:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606120754460.5498@g5.osdl.org>

[-- Attachment #1: Type: text/plain, Size: 1286 bytes --]

On Mon, Jun 12, 2006 at 07:59:39AM -0700, Linus Torvalds wrote:
> The apache setup at least on kernel.org is already set up to do
> caching, as long as the generated headers for the page allow it in the
> first place.

I've actually looked into improving native HTTP caching (mostly for
small sites without reverse proxying) by providing a `Last-Modified'
header where possible and sending a `304 Not Modified' whenever
appropriate.

While it doesn't sound hard, it's next to impossible: a commit's
timestamp doesn't change when a head points to it (or no longer
points to it). Also, displaying the timestamps as `Modified xy
{seconds, minutes, hours, ...} ago' poses a big problem.

(I guess the webserver could use the `If-Modified-Since' header to check
if the displayed time needs to be updated, but if you ask me it's not
worth the effort.)

In short, the `blob', `blob_plain', and `blobdiff' pages could profit
from that because they don't display the head(s) pointing to the current
commit. On the other hand, this is a little inconsistent and could be
considered a bug. So I'll give up on that unless someone has a great
idea how to handle this.

Regards,
-octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
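
The conditional-request logic discussed in the message above can be
sketched as follows. gitweb itself is a Perl script; C is used here only
to match the patches elsewhere in this thread. Both function names are
hypothetical, and ref_mtime (the time a head last moved) is an assumed
input; the message does not say where such a value would come from.

```c
#include <time.h>

/* Hypothetical: pick the timestamp to advertise as Last-Modified.
 * A page that shows heads must also change when a head moves, so the
 * commit time alone is not enough for it. */
static time_t page_last_modified(time_t commit_time, time_t ref_mtime,
				 int shows_heads)
{
	if (shows_heads && ref_mtime > commit_time)
		return ref_mtime;
	return commit_time;
}

/* Hypothetical: choose the response status for a GET request.
 * has_ims is non-zero when the request carried If-Modified-Since. */
static int conditional_get_status(time_t last_modified, int has_ims,
				  time_t if_modified_since)
{
	if (has_ims && last_modified <= if_modified_since)
		return 304;	/* Not Modified: the client copy is current */
	return 200;		/* send the full page */
}
```

This does not address the relative "... ago" timestamps mentioned above:
a page that renders them changes with the clock even when nothing in the
repository has changed, so a 304 would leave stale text in the client.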

^ permalink raw reply

* Re: svn to git, N-squared?
From: Linus Torvalds @ 2006-06-12 17:08 UTC (permalink / raw)
  To: Jon Smirl; +Cc: linux@horizon.com, git
In-Reply-To: <9e4733910606120944p4deb170ejc2863846685917f6@mail.gmail.com>



On Mon, 12 Jun 2006, Jon Smirl wrote:
> 
> The svn repository was built by cvs2svn, none of the git tools were involved.

Ok, so that part is purely a SVN issue.

Having that many files in a single directory (or two) is a total disaster. 
That said, it works well enough if you don't create new files very often 
(and _preferably_ don't look them up either, although that is effectively 
helped by indexing). I _suspect_ that 

 - the "cvs->svn" import process was probably optimized so that it did one 
   file at a time (your "eight stages" description certainly sounds as if 
   it could do it), and in that case it's entirely possible that that can 
   be done efficiently (ie you still do file creates and lookups in an 
   increasingly big directory, but you do it only _once_ per file, rather 
   than look up old files all the time). So your lookup ratio would be 1:1 
   with the files.

   Doing a git-cvsimport would then do basically random lookups in that 
   _huge_ directory, and instead of reading the files one at a time (and 
   fully) and never again, I assume it opens them, reads one revision, 
   closes it, and then goes on to the next revision, so it will have a 
   much higher lookup ratio (you'd look up every file several times).

 - I suspect the SVN people must be hurting for performance themselves. I 
   guess they don't expect to be able to do 5-10 commits per second, the 
   way git was designed to do. So they optimized the cvs import part, but 
   their actual regular live usage is probably hitting this same directory 
   inefficiency.

Of course, the old SVN Berkeley DB usage was probably even worse (not in 
system time, but I'd expect the access patterns within the BDB file to be 
pretty nasty, and probably a lot of user time spent seeking around it). 
But in this particular case, it might even have been better.

Maybe we could teach the SVN people about pack-files? ;)

			Linus

^ permalink raw reply

* Re: svn to git, N-squared?
From: Linus Torvalds @ 2006-06-12 16:57 UTC (permalink / raw)
  To: Jon Smirl; +Cc: linux@horizon.com, git
In-Reply-To: <9e4733910606120932k5b6f7acfra3f3a26168454f47@mail.gmail.com>



On Mon, 12 Jun 2006, Jon Smirl wrote:
> > 
> > 64 files in tmp.
> > But the SVN repository itself has 411,000 files in it. Split between
> > two directories.
> 
> I'm doing all of this on ext3. I have plenty of free disk space so I
> can make another partition and switch to a new file system after I
> install the new RAM. What would be the best one to try? Doing that
> would provide a data point to determine if this is a problem with file
> system performance or the misuse of file systems.

I'm sure there are better filesystems to try for this kind of insane 
scenario, but at the same time, I really cannot imagine that the 411,000 
files is a "normal" thing. There _must_ be some way to have SVN not do 
that in the first place (or git-svnimport).

Is this what happened when the SVN people started using fsfs? 

			Linus

^ permalink raw reply

* Re: svn to git, N-squared?
From: Jon Smirl @ 2006-06-12 16:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux@horizon.com, git
In-Reply-To: <Pine.LNX.4.64.0606120938490.5498@g5.osdl.org>

On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > Is there some pack equivalent for svn that I haven't found yet?
>
> Is this literally what SVN does normally? That's just insane. I mean, even
> git tried to at least hash out the files (and yeah, admittedly even that
> worked less well than I was hoping for, but I at least fixed it within
> just a few weeks through the pack mechanism).
>
> Or is that 411,000 files a result of how git-svnimport does things, rather
> than some basic SVN approach to life: does it perhaps end up checking out
> each file under an individual temporary name?

The svn repository was built by cvs2svn, none of the git tools were involved.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: svn to git, N-squared?
From: Linus Torvalds @ 2006-06-12 16:41 UTC (permalink / raw)
  To: Jon Smirl; +Cc: linux@horizon.com, git
In-Reply-To: <9e4733910606120922g181a5aaal623fd3f29b839f4c@mail.gmail.com>



On Mon, 12 Jun 2006, Jon Smirl wrote:
>
> 64 files in tmp.
> But the SVN repository itself has 411,000 files in it. Split between
> two directories.

Ouch. That sounds like it. 

> Is there some pack equivalent for svn that I haven't found yet?

Is this literally what SVN does normally? That's just insane. I mean, even 
git tried to at least hash out the files (and yeah, admittedly even that 
worked less well than I was hoping for, but I at least fixed it within 
just a few weeks through the pack mechanism).
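
For reference, the "hash out the files" layout can be sketched as below. This is a minimal illustration, not git's own code: the helper name `loose_object_path` is made up for the example. It relies on the loose-object convention that an object's name is the SHA-1 of a short header plus the content, and that the first two hex digits pick one of 256 subdirectories.

```python
import hashlib


def loose_object_path(content: bytes, kind: str = "blob") -> str:
    """Return the loose-object path for `content` (hypothetical helper)."""
    # The object name is the SHA-1 of "<kind> <size>\0" followed by the content.
    header = f"{kind} {len(content)}".encode() + b"\0"
    digest = hashlib.sha1(header + content).hexdigest()
    # Fan out on the first two hex digits so no single directory holds everything.
    return f".git/objects/{digest[:2]}/{digest[2:]}"


print(loose_object_path(b""))
# prints .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The fan-out only divides the per-directory count by 256, which is why packing was still needed once object counts grew large.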

Or is that 411,000 files a result of how git-svnimport does things, rather 
than some basic SVN approach to life: does it perhaps end up checking out 
each file under an individual temporary name?

			Linus

^ permalink raw reply

* Re: svn to git, N-squared?
From: Jon Smirl @ 2006-06-12 16:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux@horizon.com, git
In-Reply-To: <9e4733910606120922g181a5aaal623fd3f29b839f4c@mail.gmail.com>

On 6/12/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
> >
> >
> > On Mon, 12 Jun 2006, Jon Smirl wrote:
> > >
> > >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > > 14525 jonsmirl  16   0  604m 391m 1904 S   24 38.7 916:53.39 git-svnimport
> > > 20947 jonsmirl  17   0     0    0    0 R    1  0.0   0:00.03 git-svnimport
> >
> > Hard to tell, it's obviously got short-lived processes there too that it's
> > not showing, but equally obviously that svnimport script itself is
> > spending an alarming amount of CPU time. I don't think it should do that
> > much processing, but since it's written in perl, I can't read it.
> >
> > Are there any other directories that seem to be growing (eg some temp-file
> > directory where the old files aren't cleaned away?). I can't imagine what
> > else it could be doing in kernel space than simply some silly filesystem
> > operation, but dang it all, Linux filesystems are usually very efficient
> > indeed, unless we're talking huge directories (and if it's not the git
> > object directory any more, it must be something else).
>
> 64 files in tmp.
> But the SVN repository itself has 411,000 files in it. Split between
> two directories.

I'm doing all of this on ext3. I have plenty of free disk space so I
can make another partition and switch to a new file system after I
install the new RAM. What would be the best one to try? Doing that
would provide a data point to determine if this is a problem with file
system performance or the misuse of file systems.
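
One way to get that data point without redoing the whole import is a small timing test run on each candidate filesystem. The sketch below is only an illustration under stated assumptions: the function names, the file count, and the 256-way split are all invented for the example, and it measures file creation only, not lookups or deletes.

```python
import os
import tempfile
import time


def create_files(root, count, fanout):
    """Create `count` empty files under `root`.

    With fanout=False everything lands in one flat directory; with
    fanout=True files are spread over up to 256 subdirectories.
    """
    for i in range(count):
        name = f"{i:08x}"
        directory = os.path.join(root, name[-2:]) if fanout else root
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, name), "w"):
            pass


def time_layout(count, fanout, parent=None):
    """Time file creation in a scratch directory created under `parent`."""
    with tempfile.TemporaryDirectory(dir=parent) as root:
        start = time.perf_counter()
        create_files(root, count, fanout)
        return time.perf_counter() - start


if __name__ == "__main__":
    # 5000 is an arbitrary small count; raise it toward the real file count
    # and point `parent` at the partition under test.
    for fanout in (False, True):
        elapsed = time_layout(5000, fanout)
        print(f"fanout={fanout}: {elapsed:.2f}s")
```

If the flat layout is much slower than the split one on the same filesystem, that points at directory size; if both are slow, the filesystem itself is the more likely suspect.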

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: svn to git, N-squared?
From: Randal L. Schwartz @ 2006-06-12 16:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git
In-Reply-To: <86irn6wdob.fsf@blue.stonehenge.com>

>>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes:

>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
Linus> This sounds like _exactly_ what happens if you don't repack
Linus> occasionally.  Especially if you are using a filesystem without hashed
Linus> filename lookup, but it's true to some degree even with that - the
Linus> filesystem tends to end up spending tons of time in kernel space,
Linus> trying to find a place to put new objects.

Randal> I'm using git-svn to do a similar thing with an 11K-commit history.  It's now 4
Randal> days running, and yes, I'm repacking and deleting empty dirs every 200-300
Randal> commits, but I'm only up to commit 4000 or so.  At this rate, I *may* finish
Randal> by sometime next week. :(

Randal> However, I notice one thing that can't be good: .git/git-svn/revs has one file
Randal> per revision.  Yes, I'll end up with 11000 files in a single directory.  Ugh.

Another contributing factor is that there are 2500 files in the repo (at
revision 3931).  I was recording 20 commits a minute in the early part of the
cycle, and now I'm down to 1 commit every two minutes.  Doing a bit of
back-of-the-scribbled-on-envelope calcs, I won't be finished for
another two weeks or so. :(
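
The envelope arithmetic can be written out as below. It is a rough sketch: 11000 stands in for "11K", 3931 is the revision quoted above, and the rate is held constant at the current 1 commit per two minutes. Since the rate has been falling, the result is a floor rather than a forecast.

```python
def remaining_days(total_commits, done_commits, commits_per_minute):
    """Days left if the import keeps running at a constant rate."""
    remaining = total_commits - done_commits
    minutes = remaining / commits_per_minute
    return minutes / (60 * 24)


# 1 commit every two minutes is 0.5 commits per minute.
print(round(remaining_days(11000, 3931, 0.5), 1))
# prints 9.8
```

Close to ten days at today's rate, so two weeks is about right once further slowdown is factored in.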

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply

* Re: svn to git, N-squared?
From: Jon Smirl @ 2006-06-12 16:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux@horizon.com, git
In-Reply-To: <Pine.LNX.4.64.0606120906210.5498@g5.osdl.org>

On 6/12/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Mon, 12 Jun 2006, Jon Smirl wrote:
> >
> >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 14525 jonsmirl  16   0  604m 391m 1904 S   24 38.7 916:53.39 git-svnimport
> > 20947 jonsmirl  17   0     0    0    0 R    1  0.0   0:00.03 git-svnimport
>
> Hard to tell, it's obviously got short-lived processes there too that it's
> not showing, but equally obviously that svnimport script itself is
> spending an alarming amount of CPU time. I don't think it should do that
> much processing, but since it's written in perl, I can't read it.
>
> Are there any other directories that seem to be growing (eg some temp-file
> directory where the old files aren't cleaned away?). I can't imagine what
> else it could be doing in kernel space than simply some silly filesystem
> operation, but dang it all, Linux filesystems are usually very efficient
> indeed, unless we're talking huge directories (and if it's not the git
> object directory any more, it must be something else).

64 files in tmp.
But the SVN repository itself has 411,000 files in it. Split between
two directories.

Is there some pack equivalent for svn that I haven't found yet?

> At least with the cvs importer I have _some_ clue what it's doing, since I
> wrote an earlier version myself (very different, but at least I know what
> the operations are). SVN has always just confused me, and I have no idea
> what svnimport does, so I think I'll have to defer to somebody who
> actually knows the code.
>
> Smurf, have you looked at any larger repositories?
>
>                 Linus
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

