Git development
* Remove old "git-grep.sh" remnants
From: Linus Torvalds @ 2006-05-16 23:46 UTC
  To: Junio C Hamano, Git Mailing List


It's built-in now.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
----

diff --git a/Makefile b/Makefile
index 93779b0..9ba608c 100644
--- a/Makefile
+++ b/Makefile
@@ -124,7 +124,7 @@ SCRIPT_SH = \
 	git-tag.sh git-verify-tag.sh \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
-	git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
+	git-merge-resolve.sh git-merge-ours.sh \
 	git-lost-found.sh
 
 SCRIPT_PERL = \
@@ -169,7 +169,8 @@ PROGRAMS = \
 	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
 
 BUILT_INS = git-log$X git-whatchanged$X git-show$X \
-	git-count-objects$X git-diff$X git-push$X
+	git-count-objects$X git-diff$X git-push$X \
+	git-grep$X
 
 # what 'all' will build and 'install' will install, in gitexecdir
 ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
diff --git a/git-grep.sh b/git-grep.sh
deleted file mode 100755
index ad4f2fe..0000000
--- a/git-grep.sh
+++ /dev/null
@@ -1,62 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-
-USAGE='[<option>...] [-e] <pattern> [<path>...]'
-SUBDIRECTORY_OK='Yes'
-. git-sh-setup
-
-got_pattern () {
-	if [ -z "$no_more_patterns" ]
-	then
-		pattern="$1" no_more_patterns=yes
-	else
-		die "git-grep: do not specify more than one pattern"
-	fi
-}
-
-no_more_patterns=
-pattern=
-flags=()
-git_flags=()
-while : ; do
-	case "$1" in
-	-o|--cached|--deleted|--others|--killed|\
-	--ignored|--modified|--exclude=*|\
-	--exclude-from=*|\--exclude-per-directory=*)
-		git_flags=("${git_flags[@]}" "$1")
-		;;
-	-e)
-		got_pattern "$2"
-		shift
-		;;
-	-A|-B|-C|-D|-d|-f|-m)
-		flags=("${flags[@]}" "$1" "$2")
-		shift
-		;;
-	--)
-		# The rest are git-ls-files paths
-		shift
-		break
-		;;
-	-*)
-		flags=("${flags[@]}" "$1")
-		;;
-	*)
-		if [ -z "$no_more_patterns" ]
-		then
-			got_pattern "$1"
-			shift
-		fi
-		[ "$1" = -- ] && shift
-		break
-		;;
-	esac
-	shift
-done
-[ "$pattern" ] || {
-	usage
-}
-git-ls-files -z "${git_flags[@]}" -- "$@" |
-	xargs -0 grep "${flags[@]}" -e "$pattern" --


* Re: [PATCH] improve depth heuristic for maximum delta size
From: Junio C Hamano @ 2006-05-16 23:34 UTC
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605161510200.18071@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> This provides a linear decrement on the penalty related to delta depth
> instead of being a 1/x function.  With this another 5% reduction is
> observed on packs for both the GIT repo and the Linux kernel repo, as 
> well as fixing a pack size regression in another sample repo I have.

Good job, and it does not seem to spend too many more cycles
either (it does slow it down a bit because it needs to do more
deltas, but that is to be expected).

Here is the average chain length and resulting pack size from
full repacking of git.git repository, with three versions.

        Avg 6.20   6516kB (master)
        Avg 5.97   5784kB (next, has 1/x version)
        Avg 6.89   5536kB (this patch on top of next)

What's interesting is that the 1/x version shortens the chain
(i.e. decreased runtime cost) while producing smaller results,
compared to the master version.  The story is the same on the
kernel archive.

	Avg 5.82 113808kB (master)
	Avg 4.76 108044kB (next, has 1/x version)
	Avg 5.81 105768kB (this patch on top of next)


* Re: Merge with local conflicts in new files
From: Junio C Hamano @ 2006-05-16 23:28 UTC
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161611p4c9ddbc0v@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

> 2006/5/17, Junio C Hamano <junkio@cox.net>:
>> Santi <sbejar@gmail.com> writes:
>>
>> >       In the case of:
>> >
>> > - You merge from a branch with new files
>> > - You have these files in the working directory
>> > - You do not have these files in the HEAD.
>>
>> and
>>
>>  - You have not told git that these files matter.
>
> For me it is the other way, all my files matter but git can do
> whatever it wants with the ones it controls.

You really do not mean that.

If you told git a file matters, and have local modifications to
the file in the working tree on which you have not run update-index
yet, merge and apply should be careful not to overwrite your
changes that are not ready while doing whatever they have
to do.  And they are careful, because you have told git that
they matter, and the way you tell git that they matter is to
have entries for them in the index.


* Re: [PATCH] Update the documentation for git-merge-base
From: Junio C Hamano @ 2006-05-16 23:20 UTC
  To: Linus Torvalds; +Cc: Fredrik Kuivinen, git
In-Reply-To: <Pine.LNX.4.64.0605160906150.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

>  - In contrast, for git (current master branch), the numbers are 35 out of 
>    540, and there are lots of merges with many LCA's:
>
>     505 o
>      15 oo
>      13 ooo
>       2 oooo
>       3 ooooo
>       2 ooooooo
>
> I think the difference is that Junio does a lot of these branches where he 
> keeps on pulling from them, and never syncs back (which is a great 
> workflow). In contrast, the kernel tends to try to avoid that because the 
> history gets messy enough as it is ;)
>
> Anyway, the two commits that apparently have seven (!) LCA's in the git 
> tree should probably be checked out. They are probably a good thing to see 
> if git-merge-base really _really_ does the right thing, and whether they 
> really are true LCA's.
>
> They are commits ad0b46bf.. and e6a933bd.. respectively.

The first one is because at 1.3.0 I pulled everything from
"next" to "master".

Usually "next" incorporates topic branches that stem from
different commits on "master", and when a new topic is merged to
"next", it gets the updates to "master" up to that point along
with the new topic.  When topics graduate (i.e. are merged back) to
"master", they do so at different paces.


      topic2          o---o---o---o---H---.
                     /                 \   \
      next   -----------o---o---E---o---I-------B
                   /   /       /             \   \
      topic1      /   /   o---D---.           \   \
                 /   /   /         \           \   \
      master ---G---o---C---o---o---F---o---o---A---X

The above illustration shows that two topics branched from
master were cooked in next.  Topic 1 branched from master at C,
added two commits (its tip is at D), merged to next at E and
then later merged to master at F.  Similarly, topic 2 branched
from master at G, added five commits (its tip is at H), merged
to next at I and then later merged to master at A.

When merging "next" into "master" by merging A and B to produce
X, tips of topics 1 and 2 (D and H, respectively) become the
merge base.

Merging "next" wholesale to "master" is hopefully a rare event,
but the seven bases you are seeing are the topic tips.
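
The claim can be checked on a toy version of the picture.  What follows
is only a sketch under assumptions: the unnamed "o" commits are
collapsed, "next" is assumed to have forked from the commit right after
G, and ancestors() and merge_bases() are made-up helpers, not what
git-merge-base actually does.  A merge base here means a common ancestor
that is not an ancestor of another common ancestor.

```c
#include <stdio.h>

/* Commits of the simplified picture (assumption: "o" commits collapsed). */
enum { G, O1, C, D, H, N0, E, I, B, F, A, NR_COMMITS };

/* parents[c] lists up to two parents of commit c; -1 means "none". */
const int parents[NR_COMMITS][2] = {
	[G]  = { -1, -1 },
	[O1] = { G, -1 },	/* the commit after G on master */
	[C]  = { O1, -1 },
	[D]  = { C, -1 },	/* tip of topic1 */
	[H]  = { G, -1 },	/* tip of topic2 */
	[N0] = { O1, -1 },	/* assumed fork point of "next" */
	[E]  = { N0, D },	/* topic1 merged to next */
	[I]  = { E, H },	/* topic2 merged to next */
	[B]  = { I, -1 },	/* tip of next */
	[F]  = { C, D },	/* topic1 merged to master */
	[A]  = { F, H },	/* topic2 merged to master */
};

/* Bitmask of c and everything reachable from it (fine for a toy graph). */
unsigned ancestors(int c)
{
	unsigned mask;
	int i;

	if (c < 0)
		return 0;
	mask = 1u << c;
	for (i = 0; i < 2; i++)
		mask |= ancestors(parents[c][i]);
	return mask;
}

/* Common ancestors of a and b that no other common ancestor reaches. */
unsigned merge_bases(int a, int b)
{
	unsigned common = ancestors(a) & ancestors(b);
	unsigned bases = common;
	int c;

	for (c = 0; c < NR_COMMITS; c++) {
		if (!(common & (1u << c)))
			continue;
		/* drop everything strictly below this common ancestor */
		bases &= ~(ancestors(c) & ~(1u << c));
	}
	return bases;
}
```

With this graph, merge_bases(A, B) has exactly the bits for D and H
set, i.e. the two topic tips.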

The other one is the other way around.  From time to time,
"next" itself gets updates from "master" to keep it in sync with
fixes that occurred on "master" directly.  Such a merge into
"next" will have this picture but the principles are the same.

      topic2          o---o---o---o---H---.
                     /                 \   \
      next   -----------o---o---E---o---I-------B---Y
                   /   /       /             \     /
      topic1      /   /   o---D---.           \   /
                 /   /   /         \           \ /
      master ---G---o---C---o---o---F---o---o---A


* Re: Merge with local conflicts in new files
From: Santi @ 2006-05-16 23:11 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1wut61aj.fsf@assigned-by-dhcp.cox.net>

2006/5/17, Junio C Hamano <junkio@cox.net>:
> Santi <sbejar@gmail.com> writes:
>
> >       In the case of:
> >
> > - You merge from a branch with new files
> > - You have these files in the working directory
> > - You do not have these files in the HEAD.
>
> and
>
>  - You have not told git that these files matter.

For me it is the other way, all my files matter but git can do
whatever it wants with the ones it controls.

>
> This is totally untested, but on top of "next" you could do
> something like this, perhaps.

Thanks, it works here.

Santi


* Re: Output of git diff with <ent>:<path>
From: Junio C Hamano @ 2006-05-16 22:44 UTC
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161524j5d7e672eo@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

> ... I didn't expect the rename from/to lines or the
> similarity index 0%.
>
> diff --git a/v1.3.3:Makefile b/Makefile
> similarity index 0%
> rename from v1.3.3:Makefile
> rename to Makefile
> index b808eca..55d1937 100644
> --- a/v1.3.3:Makefile
> +++ b/Makefile

Yes I am aware of this one; I just haven't bothered to deal with
it.

It looks at two strings, "v1.3.3:Makefile" and "Makefile", and
says "they have different names -- they are renamed".

Patches welcome as long as you do not break more usual cases
;-).


* Re: "git add $ignored_file" fail
From: Santi @ 2006-05-16 22:42 UTC
  To: Linus Torvalds; +Cc: git, Junio C Hamano
In-Reply-To: <Pine.LNX.4.64.0605161526210.16475@g5.osdl.org>

2006/5/17, Linus Torvalds <torvalds@osdl.org>:
>
>
> On Wed, 17 May 2006, Santi wrote:
> >
> >      When you try to add ignored files with the git-add command it
> > fails because the call to:
> >
> > git-ls-files -z \
> >        --exclude-from="$GIT_DIR/info/exclude" \
> >        --others --exclude-per-directory=.gitignore
> >
> >      does not output this file because it is ignored. I know I can do it with:
> >
> > git-update-index --add $ignored_file
> >
> > I understand the behaviour of git-ls-files but I think it is not the
> > behaviour expected of git-add, at least by me.
>
> Well, the thing is, git-add doesn't really take a "file name", it takes a
> filename _pattern_.
>
> Clearly we can't add everything that matches the pattern, because one
> common case is to add a whole subdirectory, and thus clearly the
> .gitignore file must override the pattern.
>
> So it's consistent that it overrides it also for a single filename case,
> no?
>

It's consistent from an implementation point of view, but not from the
(my?) user point of view. This is why I say I understand it for
git-ls-files. For the case of git-add even the usage and the man page
talk about <file>...

Clearly for the case of a whole subdirectory, or even ".",  the
.gitignore file must override the pattern, but not for the case of a
pattern that is a single existing file.

Santi


* Re: "git add $ignored_file" fail
From: Jakub Narebski @ 2006-05-16 22:41 UTC
  To: git
In-Reply-To: <Pine.LNX.4.64.0605161526210.16475@g5.osdl.org>

Linus Torvalds wrote:

> Well, the thing is, git-add doesn't really take a "file name", it takes a 
> filename _pattern_.
> 
> Clearly we can't add everything that matches the pattern, because one 
> common case is to add a whole subdirectory, and thus clearly the 
> .gitignore file must override the pattern.
> 
> So it's consistent that it overrides it also for a single filename case, 
> no?

Well, if shell expansion cannot find a file matching the pattern, it uses
the pattern literally as the file name.

It would be nice to have an easy (git core porcelain level) way to add files
that match an ignore pattern.

-- 
Jakub Narebski
Warsaw, Poland


* Re: Merge with local conflicts in new files
From: Junio C Hamano @ 2006-05-16 22:40 UTC
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161500m1dd8428cj@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

>       In the case of:
>
> - You merge from a branch with new files
> - You have these files in the working directory
> - You do not have these files in the HEAD.

and

 - You have not told git that these files matter.

>...
> test_expect_success 'prepare repository' \
> 'echo "Hello" > init &&
> git add init &&
> git commit -m "Initial commit" &&
> git branch B &&
> echo "foo" > foo &&
> git add foo &&
> git commit -m "File: foo" &&
> git checkout B &&
> echo "bar" > foo '

At this point, you have not told git that foo is a file that is
relevant on branch B, so git considers it fair game to
overwrite.

At least, that was the original reasoning.

It happens not just during the ordinary "git-merge", by the way.
If you are on branch B that did not have 'foo', created 'foo'
and switched to branch A (which has 'foo') before telling the
index that you care about your version of 'foo' on branch B,
'foo' from branch A will overwrite your throwaway copy in the
working tree:

	$ git branch
	* master
	$ git branch another
	$ echo 'New file' >afile
	$ git add afile
	$ git commit -m 'Add afile'
	$ git checkout another
	$ ls afile
	ls: afile: No such file or directory
	$ echo 'Lost file' >afile
	$ git checkout master
	$ cat afile
	New file

We acquired "git apply" which does take notice when you have
such an untracked file in the working tree that conflicts with
what it does to the index, and I think its behaviour sometimes
is more user friendly and safer than what the merge does
currently (but it irritates people some other times).

This is totally untested, but on top of "next" you could do
something like this, perhaps.

We _might_ want to do this conditionally, only when the user
asks, though.  I dunno.  Being able to blow away irrelevant
files is sometimes a good thing, so we _might_ want to have a
reverse logic to "git apply" that makes it blow away untracked
working tree files under "--index" option.

-- >8 --

diff --git a/read-tree.c b/read-tree.c
index aa6172b..185a73f 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -453,8 +453,18 @@ static int merged_entry(struct cache_ent
 			invalidate_ce_path(old);
 		}
 	}
-	else
+	else {
+		/*
+		 * Originally we did not have a cache entry here but
+		 * are creating a new file as a result of the merge.
+		 * Do we want to lose the untracked working tree files?
+		 */
+		struct stat st;
+
+		if (!lstat(merge->name, &st))
+			die("Untracked working tree file '%s' would be overwritten by merge.", merge->name);
 		invalidate_ce_path(merge);
+	}
 	merge->ce_flags &= ~htons(CE_STAGEMASK);
 	add_cache_entry(merge, ADD_CACHE_OK_TO_ADD);
 	return 1;
@@ -701,7 +711,7 @@ static int bind_merge(struct cache_entry
 		return error("Cannot do a bind merge of %d trees\n",
 			     merge_size);
 	if (!a)
-		return merged_entry(old, NULL);
+		return merged_entry(old, old);
 	if (old)
 		die("Entry '%s' overlaps.  Cannot bind.", a->name);
 
@@ -736,7 +746,7 @@ static int oneway_merge(struct cache_ent
 		}
 		return keep_entry(old);
 	}
-	return merged_entry(a, NULL);
+	return merged_entry(a, old);
 }
 
 static int read_cache_unmerged(void)


* Re: "git add $ignored_file" fail
From: Linus Torvalds @ 2006-05-16 22:28 UTC
  To: Santi; +Cc: git, Junio C Hamano
In-Reply-To: <8aa486160605161507w3a27152dq@mail.gmail.com>



On Wed, 17 May 2006, Santi wrote:
> 
>      When you try to add ignored files with the git-add command it
> fails because the call to:
> 
> git-ls-files -z \
>        --exclude-from="$GIT_DIR/info/exclude" \
>        --others --exclude-per-directory=.gitignore
> 
>      does not output this file because it is ignored. I know I can do it with:
> 
> git-update-index --add $ignored_file
> 
> I understand the behaviour of git-ls-files but I think it is not the
> behaviour expected of git-add, at least by me.

Well, the thing is, git-add doesn't really take a "file name", it takes a 
filename _pattern_.

Clearly we can't add everything that matches the pattern, because one 
common case is to add a whole subdirectory, and thus clearly the 
.gitignore file must override the pattern.

So it's consistent that it overrides it also for a single filename case, 
no?

		Linus


* Output of git diff with <ent>:<path>
From: Santi @ 2006-05-16 22:24 UTC
  To: git, Junio C Hamano

Hi *,

   just curious if this is the expected output. I find this syntax
very useful, but the "a/v1.3.3:" prefix, or even "a/:" without the
tree, is a bit confusing. And I didn't expect the rename from/to lines
or the similarity index 0%.

diff --git a/v1.3.3:Makefile b/Makefile
similarity index 0%
rename from v1.3.3:Makefile
rename to Makefile
index b808eca..55d1937 100644
--- a/v1.3.3:Makefile
+++ b/Makefile

Thanks.

Santi


* Re: Merge with local conflicts in new files
From: Santi @ 2006-05-16 22:12 UTC
  To: git, Junio C Hamano
In-Reply-To: <8aa486160605161500m1dd8428cj@mail.gmail.com>

Sorry, the test is wrong. Use this:

test_description='Test merge with local conflicts in new files'
. ./test-lib.sh

test_expect_success 'prepare repository' \
'echo "Hello" > init &&
git add init &&
git commit -m "Initial commit" &&
git checkout -b B &&
echo "foo" > foo &&
git add foo &&
git commit -m "File: foo" &&
git checkout master &&
echo "bar" > foo
'

test_expect_code 1 'Merge with local conflicts in new files' 'git
merge "merge msg" HEAD B'

test_done


* "git add $ignored_file" fail
From: Santi @ 2006-05-16 22:07 UTC
  To: git, Junio C Hamano

Hi *,

      When you try to add ignored files with the git-add command it
fails because the call to:

git-ls-files -z \
        --exclude-from="$GIT_DIR/info/exclude" \
        --others --exclude-per-directory=.gitignore

      does not output this file because it is ignored. I know I can do it with:

git-update-index --add $ignored_file

I understand the behaviour of git-ls-files but I think it is not the
behaviour expected of git-add, at least by me.

    Thanks

    Santi


* Merge with local conflicts in new files
From: Santi @ 2006-05-16 22:00 UTC
  To: git, Junio C Hamano

Hi *,

       In the case of:

- You merge from a branch with new files
- You have these files in the working directory
- You do not have these files in the HEAD.

   The end result is that you lose the content of these files.

   So the merge needs an additional check for files that are dirty
but not in HEAD.

   Here is a test that reproduces it. I expect the merge to fail,
with the content of foo remaining bar.

test_description='Test merge with local conflicts in new files'
. ./test-lib.sh

test_expect_success 'prepare repository' \
'echo "Hello" > init &&
git add init &&
git commit -m "Initial commit" &&
git branch B &&
echo "foo" > foo &&
git add foo &&
git commit -m "File: foo" &&
git checkout B &&
echo "bar" > foo '

test_expect_code 1 'Merge with local conflicts in new files' 'git
merge "merge msg" B master'

test_done

Thanks.


* Re: [RFC] Add "rcs format diff" support
From: Al Viro @ 2006-05-16 20:49 UTC
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Al Viro, Davide Libenzi
In-Reply-To: <20060514001214.GB27946@ftp.linux.org.uk>

Usage:
	diff-remap-data <dir1> <dir2> >map
or
	git-remap-data <git-diff arguments> >map
builds the information for the remapper;
	git-remap <map> <options>
does the line number remapping.

git-remap is a filter.  It takes the map as an argument and, in the simplest
form, looks at the lines on stdin that have the form
<filename>:<number>:<text>
If the indicated line from old tree had survived into the new one, we will
get
N:<new-filename>:<new-number>:<text>
on the output.  If it hadn't, we get
O:<filename>:<number>:<text>
Lines that do not have such form are passed unchanged.
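
The rule for a single line can be sketched in C as follows.  This is
only an illustration under assumptions: struct remap_range and
remap_line() are made up for the example and say nothing about the real
map file format or the real git-remap code.  The sketch assumes the map
lists, per changed file, the ranges of old lines that survived; a file
that the map never mentions is treated as unchanged, which matches the
"empty map means identity mapping" behaviour.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory map: one surviving run of lines per entry. */
struct remap_range {
	const char *old_file;
	const char *new_file;
	long old_start;		/* first old line of the run */
	long count;		/* number of lines in the run */
	long new_start;		/* where the run starts in the new file */
};

/*
 * Rewrite one input line into out (at most outsz bytes, NUL-terminated).
 * <filename>:<number>:<text> gets an "N:" or "O:" prefix; anything else
 * is copied unchanged.
 */
void remap_line(const char *in, char *out, size_t outsz,
		const struct remap_range *map, size_t nr)
{
	const char *colon = strchr(in, ':');
	char *end;
	long line;
	size_t i, flen;
	int file_seen = 0;

	if (!colon || colon == in || !isdigit((unsigned char)colon[1]))
		goto unchanged;
	line = strtol(colon + 1, &end, 10);
	if (*end != ':')
		goto unchanged;

	flen = colon - in;
	for (i = 0; i < nr; i++) {
		const struct remap_range *r = &map[i];

		if (strlen(r->old_file) != flen ||
		    strncmp(r->old_file, in, flen))
			continue;
		file_seen = 1;
		if (line >= r->old_start && line < r->old_start + r->count) {
			snprintf(out, outsz, "N:%s:%ld:%s", r->new_file,
				 r->new_start + (line - r->old_start),
				 end + 1);
			return;
		}
	}
	if (file_seen)
		snprintf(out, outsz, "O:%s", in);	/* line did not survive */
	else
		snprintf(out, outsz, "N:%s", in);	/* untouched file */
	return;
unchanged:
	snprintf(out, outsz, "%s", in);
}
```

A linear scan over the ranges is enough for a sketch; a real tool
working on large logs would presumably index the map by filename.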

Even that is already very useful for log comparison.  E.g. if old-log is
from the old tree and new-log is from the new one, we can do
	git-remap map <old-log >foo
	git-remap /dev/null <new-log >bar
	diff -u foo bar
and have the noise due to line number changes excluded (empty map means
identity mapping, so the second line will simply slap N: on all lines of
form <filename>:<number>:<text> in new-log).

Note that it's not just for build logs; the thing is useful for sparse logs,
grep -n output, etc., etc. 

Behaviour described above is the default; what _really_ happens is
that we take lines of form
<original_prefix><filename>:<number>:<text>
and replace them with
<prefix_for_new><new-filename>:<new-number>:<text>
or
<prefix_for_old><filename>:<number>:<text>
Defaults are "", "N:" and "O:" resp.; what it gives us is the ability to
do multiple remappings.  IOW, we can say

diff-remap-data old-tree newer-tree > map1
diff-remap-data newer-tree current-tree > map2
git-remap -o old: map1 <old-log | git-remap -p N: -o newer: -n current: map2>foo

and get three kinds of lines: those that didn't make it into the newer tree
are marked with old: and otherwise left unchanged; those that made it to
newer, but not to current, are marked with newer: and have their
filenames/line numbers remapped; and those that made it all the way are
marked with current: and remapped all the way to the current tree.

That's quite useful when you want to carry logs for a while, basically using
them as annotated TODO ("logs" here can very well be results of grep -n with
annotations added to them).  You can have all still relevant bits stay with
the locations in text and see what had fallen out.

Note on relation to git:
	* git-remap, despite the name, doesn't need git to work
	* diff-remap-data doesn't need git to work
	* git-remap-data _does_ need it.  Aside from working on revisions in
a git repository instead of a couple of directory trees, it generates a
slightly better map than diff-remap-data does.  I.e. it manages to remap more
lines, because it notices renames.

This stuff lives on ftp.linux.org.uk/pub/people/viro/remapper/; I'm not
sure what to do with it wrt distributing - submit for inclusion into
git, or leave that sucker standalone.  It can be used without git, but
OTOH having it in git would make my life easier - I wouldn't have to
think about packaging it myself ;-)

Seriously,
	a) feel free to play with it; hopefully it will be useful.
	b) review and comments are welcome.
	c) so would any thoughts regarding the right way to distribute it.


* Re: let's meet
From: Junio C Hamano @ 2006-05-16 20:40 UTC
  To: Randal L. Schwartz; +Cc: git
In-Reply-To: <86odxxn1yc.fsf@blue.stonehenge.com>

merlyn@stonehenge.com (Randal L. Schwartz) writes:

>>>>>> "Luke" == Luke  <oxwacpp@arsenal.co.uk> writes:
>
> Luke> Hire,
> Luke> i am here sittiang in the internet caffe. Found your email a!nd
> Luke> decided to write. I might be coming to your p!lace in 14 days, 
> Luke> so I decided to email you. May be we ca!n meet? I am 25 y.o.
> Luke> girl. I have a picture if you want. No need to reply here as 
> Luke> this is not my email. Write me at ex@datetodayy.com
>
> I hope she has a big table. :)

Huh?

She's coming to *your* place, so you are the one to prepare a
big table to cover the locations we all live---perhaps "earth"?

;-)


* Re: let's meet
From: Randal L. Schwartz @ 2006-05-16 20:34 UTC
  To: git
In-Reply-To: <602115DC.2C05E9D@arsenal.co.uk>

>>>>> "Luke" == Luke  <oxwacpp@arsenal.co.uk> writes:

Luke> Hire,
Luke> i am here sittiang in the internet caffe. Found your email a!nd
Luke> decided to write. I might be coming to your p!lace in 14 days, 
Luke> so I decided to email you. May be we ca!n meet? I am 25 y.o.
Luke> girl. I have a picture if you want. No need to reply here as 
Luke> this is not my email. Write me at ex@datetodayy.com

I hope she has a big table. :)

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


* [PATCH] improve depth heuristic for maximum delta size
From: Nicolas Pitre @ 2006-05-16 20:29 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4pzqhh3t.fsf@assigned-by-dhcp.cox.net>

This provides a linear decrement on the penalty related to delta depth
instead of being a 1/x function.  With this another 5% reduction is
observed on packs for both the GIT repo and the Linux kernel repo, as 
well as fixing a pack size regression in another sample repo I have.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

On Mon, 15 May 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > @@ -1038,8 +1038,8 @@ static int try_delta(struct unpacked *tr
> >  
> >  	/* Now some size filtering euristics. */
> >  	size = trg_entry->size;
> > -	max_size = size / 2 - 20;
> > -	if (trg_entry->delta)
> > +	max_size = (size/2 - 20) / (src_entry->depth + 1);
> > +	if (trg_entry->delta && trg_entry->delta_size <= max_size)
> >  		max_size = trg_entry->delta_size-1;
> >  	src_size = src_entry->size;
> >  	sizediff = src_size < size ? size - src_size : 0;
> 
> At first glance, this seems rather too aggressive.  It makes
> me wonder if it is a good balance to penalize the second
> generation base by requiring it to produce a small delta that is
> at most half as large as we normally would (and the third generation a
> third), or maybe the penalty should kick in more gradually, like
> e.g. (max_depth * 2 - src_entry->depth) / (max_depth * 2).

You are right.  However, your formula converges towards 0.5, which is not
enough to be sure the bad effect of early eviction of max-depth objects
from the object window won't come back.  I prefer this patch, with a
formula converging toward 0.
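
To make the difference concrete, the two penalties can be put side by
side.  This is a standalone sketch, not code from pack-objects.c: the
helper names max_size_inverse() and max_size_linear() are made up here,
only the arithmetic is taken from the two versions of the patch, and
size is assumed to be at least 40 so that size/2 - 20 does not wrap.

```c
#include <stdio.h>

/* Hypothetical helper: the 1/x penalty from the earlier version. */
unsigned long max_size_inverse(unsigned long size, unsigned depth)
{
	return (size / 2 - 20) / (depth + 1);
}

/* Hypothetical helper: the linear penalty, reaching 0 at max_depth. */
unsigned long max_size_linear(unsigned long size, unsigned depth,
			      unsigned max_depth)
{
	unsigned long max_size = size / 2 - 20;

	return max_size * (max_depth - depth) / max_depth;
}

/* Print both limits for every source depth below max_depth. */
void print_limits(unsigned long size, unsigned max_depth)
{
	unsigned depth;

	for (depth = 0; depth < max_depth; depth++)
		printf("depth %2u: 1/x %6lu  linear %6lu\n", depth,
		       max_size_inverse(size, depth),
		       max_size_linear(size, depth, max_depth));
}
```

For a 1040-byte target and max_depth 10, a depth-1 source is allowed a
delta of up to 250 bytes under the 1/x rule but 450 bytes under the
linear rule; both rules allow 50 bytes at depth 9.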

diff --git a/pack-objects.c b/pack-objects.c
index 566a2a2..3116020 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -1036,9 +1036,12 @@ static int try_delta(struct unpacked *tr
 	if (src_entry->depth >= max_depth)
 		return 0;
 
-	/* Now some size filtering euristics. */
+	/* Now some size filtering heuristics. */
 	size = trg_entry->size;
-	max_size = (size/2 - 20) / (src_entry->depth + 1);
+	max_size = size/2 - 20;
+	max_size = max_size * (max_depth - src_entry->depth) / max_depth;
+	if (max_size == 0)
+		return 0;
 	if (trg_entry->delta && trg_entry->delta_size <= max_size)
 		max_size = trg_entry->delta_size-1;
 	src_size = src_entry->size;


* Re: [PATCH] Implement git-quiltimport
From: Junio C Hamano @ 2006-05-16 19:01 UTC
  To: Eric W. Biederman; +Cc: Linus Torvalds, git
In-Reply-To: <m1bqtx6el6.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

>    Given the ugliness in -mm, making it an error to have a
>    non-attributed patch would result in people specifying --author
>    when they really don't know who the author is, giving us much
>    less reliable information.
>
>    Possibly what we need is an option to not make it an error so that
>    people doing this kind of thing in their own trees have useful
>    information.

I agree it is probably a good idea to error out by default, optionally
allowing the user to say "don't care".  I do not think Linus would pull
from such a tree or trees branched from it into his official
tree, so I do not think we would need to worry about commits
with incomplete information propagating for this particular
"gitified mm" usage.  But as a general purpose tool to produce
"gitified quilt series" tree, we would.

It depends on the expected use of the resulting gitified mm
tree.

If it is for an individual developer to futz with and tweak
upon, and the end result from the work leaves such a "gitified
quilt series" repository only as a patch form, then not having
to figure out and specify authorship information to many patches
is probably a plus; the information will not be part of the
official history recorded elsewhere anyway.

However, if it is to produce a reference git tree to point
people at (i.e. the quiltimport script is run once per series
by somebody and the result is published for public use), I would
imagine we would want to have the attribution straight, so if
the tool has to "guess", it should either error out or go
interactive and ask.


* Re: git-svn vs. $Id$
From: Tommi Virtanen @ 2006-05-16 18:12 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605161037220.3866@g5.osdl.org>

Linus Torvalds wrote:
> Isn't there some flag to svn to avoid keyword expansion, like "-ko" to 
> CVS?
> 
> Any import script definitely should avoid keyword expansion (and that's 
> true whether you end up wanting to use keywords or not).

Well, yes, I agree. But, at least git-svn.txt says this:

BUGS
----
...
svn:keywords can't be ignored in Subversion (at least I don't know of
a way to ignore them).

I guess one might be able to reach that information through the svn API.

Or just propget svn:keywords and sed s/\$Id\(:[^$]*\)\$/$Id$/ all files
with keywords, for all relevant keywords. Eww.
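
For what it's worth, the same substitution as the sed expression can be
sketched in C.  This is only an illustration: collapse_keyword() is a
made-up helper, not part of git-svn or of any svn API, and like the sed
version it only looks for the closing '$' on the same line.

```c
#include <string.h>

/*
 * Copy src to dst, replacing every expanded "$<kw>: ... $" with "$<kw>$".
 * dst needs room for strlen(src) + 1 bytes; the output never grows.
 */
void collapse_keyword(const char *src, char *dst, const char *kw)
{
	size_t klen = strlen(kw);

	while (*src) {
		if (src[0] == '$' && !strncmp(src + 1, kw, klen) &&
		    src[1 + klen] == ':') {
			/* look for the closing '$' on the same line */
			const char *p = src + 2 + klen;

			while (*p && *p != '$' && *p != '\n')
				p++;
			if (*p == '$') {
				*dst++ = '$';
				memcpy(dst, kw, klen);
				dst += klen;
				*dst++ = '$';
				src = p + 1;
				continue;
			}
		}
		*dst++ = *src++;
	}
	*dst = '\0';
}
```

It would have to be run once per keyword, on every file that has
svn:keywords set, which is exactly why this approach is ugly.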

-- 
Inoi Oy, Tykistökatu 4 D (4. krs), FI-20520 Turku, Finland
http://www.inoi.fi/
Mobile +358 40 762 5656


* Re: [PATCH] Implement git-quiltimport
From: Eric W. Biederman @ 2006-05-16 17:53 UTC
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0605161001190.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 16 May 2006, Eric W. Biederman wrote:
>>
>> If the --author flag was not given, the author is recorded as
>> unknown.
>
> Please don't do this. Just error out. It would be horrible to have a quilt 
> import "succeed", and then later notice that some of the patches had 
> incorrect authorship attribution just because the import script didn't 
> check it, but just made it "unknown".
>
> An un-attributed patch is simply not acceptable in any serious project. 
> It's much better to consider it an error than to say "ok".

There are two practical problems with this.
1) quilt does not force any authorship information to be preserved
   in the description, so this is probably a common case.  Although for
   most users just needing to specify --author sounds reasonable.

2) There are currently 84 out of roughly 1322 patches in
   2.6.17-rc4-mm1 that git-mailinfo cannot compute the author for.
   Generally the information is there but in such an irregular form
   that it cannot be automatically detected.

   If we can resolve that problem I am willing to make it an error.
   If we can't then sucking quilt patches into a git tree is much
   less useful.  

   Given the ugliness in -mm, making it an error to have a
   non-attributed patch would result in people specifying --author
   when they really don't know who the author is, giving us much
   less reliable information.

   Possibly what we need is an option to not make it an error so that
   people doing this kind of thing in their own trees have useful
   information.


The list of patches that git-mailinfo cannot find authorship
information for from 2.6.17-rc4-mm1 is included below.  Mostly these
are either git trees splatted into a single file, or simply fixes
added by Andrew.  But there are some like: gregkh-usb-usb-gotemp
that have no description at all and only the patch name records who
made the patch.

A really ugly case is acx1xx-wireless-driver patch, which
appears to have multiple authors and a serious history
before Andrew got it.

From acx1xx-wireless-driver.patch
> acx100.sourceforge.net (Andreas Mohr <andi@rhlx01.fht-esslingen.de>) ->
>   -> Denis Vlasenko <vda@ilport.com.ua>
>      -> Jeff Garzik <jgarzik@pobox.com>
>         -> me
> 
> DESC
> acx1xx-wireless-driver-usb-is-bust
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> 
> drivers/net/wireless/tiacx/usb.c:1116: `URB_ASYNC_UNLINK' undeclared (first use in this function)
> 
> Cc: Denis Vlasenko <vda@ilport.com.ua>
> DESC
> acx1xx-allow-modular-build
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> DESC
> acx1xx-wireless-driver-spy_offset-went-away
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> 
> Cc: Denis Vlasenko <vda@ilport.com.ua>
> DESC
> acx update
> EDESC
> From: Denis Vlasenko <vda@ilport.com.ua>
> 
> > > Attached is a patch which updates acx. All your changes are
> > > included too. allyesconfig build is fixed by unifying
> > > PCI and USB modules into one. 'acx_debug' parameter is renamed back
> > > to just 'debug' (because all previous versions used it and
> > > we don't want to add to user confusion).
> > >
> > > Please apply.
> > >
> > > Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
> >
> > I missed a spy_offset fix. Updated patch is attached.
> > Also it is at
> > http://195.66.192.167/linux/acx_patches/linux-2.6.13-mm2acx-2.patch.bz2
> 
> Oh no. Yes. I forgot to remove some standalone build aids.
> 
> DESC
> acx-update 2
> EDESC
> From: Denis Vlasenko <vda@ilport.com.ua>
> 
> [20051016] 0.3.13
> * Revert 20051013 fix, we have one which actually works.
>   Thanks Jacek Jablonski <yacek87@gmail.com> for testing!
> 
> [20051013]
> * trying to fix "yet another similar bug"
> * usb fix by Carlos Martin
> 
> [20051012] 0.3.12
> * acx_l_clean_tx_desc bug fixed - was stopping tx completely
>   at high load. (It seems there exists yet another similar bug!)
> * "unknown IE" dump was 2 bytes too short - fixed
> * DUP logging made less noisy
> * another usb fix by Carlos Martin <carlosmn@gmail.com>
> 
> [20051003]
> * several usb fixes by Carlos Martin <carlosmn@gmail.com> - thanks!
> * unknown IE logging made less noisy
> * few unknown IEs added to the growing collection
> * version bump to 0.3.11
> 
> [20050916]
> * fix bogus MTU handling, add ability to change MTU
> * fix WLAN_DATA_MAXLEN: 2312 -> 2304
> * version bump to 0.3.10
> 
> [20050915]
> * by popular request default mode is 'managed'
> * empty handler for EID 7 (country info) is added
> * fix 'timer not started - iface is not up'
> * tx[host]desc micro optimizations
> * version bump to 0.3.9
> 
> [20050914]
> * tx[host]desc ring workings brought a bit back to two-hostdesc
>   scheme. This is an attempt to fix weird WG311v2 bug.
>   I still fail to understand how same chip with same fw can
>   work for me but do not work for a WG311v2 owner. Mystery.
> * README updated
> * version bump to 0.3.8
> 
> [20050913]
> * variable and fields with awful names renamed
> * a few fields dropped (they had constant values)
> * small optimization to acx_l_clean_tx_desc()
> * version bump to 0.3.7

      origin
      git-acpi
      git-agpgart
      git-alsa
      git-block
      git-cfq
      git-cifs
      git-dvb
      git-gfs2
      git-ia64
      git-ieee1394
      git-infiniband
      git-intelfb
      sane-menuconfig-colours
      git-klibc
      git-hdrcleanup
      git-hdrinstall
      git-libata-all
      libata_resume_fix
      git-mips
      git-mtd
      git-netdev-all
      git-nfs
      git-ocfs2
      git-powerpc
      git-rbtree
      git-sas
      gregkh-pci-acpiphp-configure-_prt-v3
      gregkh-pci-acpiphp-hotplug-slot-hotplug
      gregkh-pci-acpiphp-host-and-p2p-hotplug
      gregkh-pci-acpiphp-turn-off-slot-power-at-error-case
      gregkh-pci-pci-legacy-i-o-port-free-driver-changes-to-generic-pci-code
      gregkh-pci-pci-legacy-i-o-port-free-driver-update-documentation-pci_txt
      gregkh-pci-pci-legacy-i-o-port-free-driver-make-intel-e1000-driver-legacy-i-o-port-free
      gregkh-pci-pci-64-bit-resources-drivers-pci-changes
      gregkh-pci-pci-64-bit-resources-drivers-media-changes
      gregkh-pci-pci-64-bit-resources-drivers-net-changes
      gregkh-pci-pci-64-bit-resources-drivers-pcmcia-changes
      gregkh-pci-pci-64-bit-resources-drivers-others-changes
      gregkh-pci-pci-msi-abstractions-and-support-for-altix
      git-pcmcia
      git-scsi-target
      gregkh-usb-usb-gotemp
      git-supertrak
      git-watchdog
      x86_64-mm-defconfig-update
      x86_64-mm-memset-always-inline
      x86_64-mm-amd-core-cpuid
      x86_64-mm-amd-cpuid4
      x86_64-mm-alternatives
      x86_64-mm-ia32-unistd-cleanup
      x86_64-mm-topology-comment
      x86_64-mm-new-compat-ptrace
      x86_64-mm-disable-agp-resource-check
      x86_64-mm-new-northbridge
      x86_64-mm-iommu-warning
      x86_64-mm-i386-up-generic-arch
      x86_64-mm-iommu-enodev
      x86_64-mm-compat-printk
      x86_64-mm-i386-numa-summit-check
      x86_64-mm-fix-b44-checks
      x86_64-mm-nommu-warning
      git-cryptodev
      mm
      acx1xx-wireless-driver
      reiser4-export-find_get_pages
      kgdb-core-lite
      kgdb-8250
      kgdb-netpoll_pass_skb_to_rx_hook
      kgdb-eth
      kgdb-i386-lite
      kgdb-cfi_annotations
      kgdb-sysrq_bugfix
      kgdb-module
      kgdb-core
      kgdb-i386
      journal_add_journal_head-debug
      list_del-debug
      unplug-can-sleep
      firestream-warnings
      git-viro-bird-m32r
      git-viro-bird-m68k
      git-viro-bird-frv
      git-viro-bird-upf
      git-viro-bird-volatile

Eric


* Re: git-svn vs. $Id$
From: Linus Torvalds @ 2006-05-16 17:48 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: git
In-Reply-To: <446A0CCF.2060903@inoi.fi>



On Tue, 16 May 2006, Tommi Virtanen wrote:
> 
> Just wanted to let you know of a workaround:
> manually edit the relevant file in .git/git-svn/tree/ to
> undo the $Id$ change, and git-svn fetch works again.

Isn't there some flag to svn to avoid keyword expansion, like "-ko" to 
CVS?

Any import script definitely should avoid keyword expansion (and that's 
true whether you end up wanting to use keywords or not).

(And yes, CVS is probably a bad example. Those "substitution modes" are 
confusing as hell, and I don't know which one is the right one. Is it 
"-ko" or "-kk"? Don't ask me, I'm CVS-illiterate. I don't know why the 
current cvsimport uses -kk, and only does it conditionally. Whatever.)

		Linus


* git-svn vs. $Id$
From: Tommi Virtanen @ 2006-05-16 17:33 UTC (permalink / raw)
  To: git

Hi. I just ran into trouble with git-svn, related to a file
containing $Id$. Yes, I know $Id$ sucks and should be avoided,
and I'll be removing them shortly, but that doesn't change the
fact that the history contains files with them.

Just wanted to let you know of a workaround:
manually edit the relevant file in .git/git-svn/tree/ to
undo the $Id$ change, and git-svn fetch works again.

$ git-svn fetch
Tree mismatch, Got: c242bb60d78c1dfce133e0bbaca7f13895de00b2, Expected: 07d35ac911cc56aabea86f4467cafc1d92b724c4
 at /home/tv/bin/git-svn line 426
        main::assert_tree('a5890d459de08dc8adbbe34cdfb4b1f44f377ad8') called at /home/tv/bin/git-svn line 392
        main::assert_svn_wc_clean(2039, 'a5890d459de08dc8adbbe34cdfb4b1f44f377ad8') called at /home/tv/bin/git-svn line 262
        main::fetch() called at /home/tv/bin/git-svn line 105

$ git diff-tree -p 07d35ac911cc56aabea86f4467cafc1d92b724c4 \
  c242bb60d78c1dfce133e0bbaca7f13895de00b2
diff --git a/anonymized b/anonymized
index 16b3988..f43782a 100644
--- a/anonymized
+++ b/anonymized
@@ -1,4 +1,4 @@
-## $Id: anonymized 1775 2006-04-20 09:25:22Z tv $
+## $Id: anonymized 2025 2006-05-16 07:25:24Z tv $

 blah
 blah


So editing .git/git-svn/tree/anonymized and replacing
"1775 2006-04-20 09:25:22Z tv" with "2025 2006-05-16 07:25:24Z tv"
makes git-svn happy again.
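
The manual edit can also be scripted. The following is a minimal
sketch, assuming the old and new expansions contain no sed
metacharacters (true for the revision, date and user strings above);
the helper name fix_id_keyword is made up for illustration.

```shell
# Hypothetical helper: rewrite the expanded $Id$ keyword in one file.
# usage: fix_id_keyword <file> <old-expansion> <new-expansion>
fix_id_keyword () {
	file=$1 old=$2 new=$3
	tmp="$file.tmp.$$"
	# Only touch lines that carry an expanded "$Id:" keyword.
	# Assumes neither string contains '|', '&' or regex metacharacters.
	sed -e "/[\$]Id:/s|$old|$new|" "$file" >"$tmp" &&
	mv "$tmp" "$file"
}
```

For the case above that would be:

    fix_id_keyword .git/git-svn/tree/anonymized \
        '1775 2006-04-20 09:25:22Z tv' '2025 2006-05-16 07:25:24Z tv'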

-- 
Inoi Oy, Tykistökatu 4 D (4. krs), FI-20520 Turku, Finland
http://www.inoi.fi/
Mobile +358 40 762 5656


* Re: [PATCH] Implement git-quiltimport
From: Linus Torvalds @ 2006-05-16 17:03 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Junio C Hamano, git
In-Reply-To: <m1k68l6hga.fsf@ebiederm.dsl.xmission.com>



On Tue, 16 May 2006, Eric W. Biederman wrote:
>
> If the --author flag was not given the author is recorded as 
> unknown.

Please don't do this. Just error out. It would be horrible to have a quilt 
import "succeed", and then later notice that some of the patches had 
incorrect authorship attribution just because the import script didn't 
check it, but just made it "unknown".

An un-attributed patch is simply not acceptable in any serious project. 
It's much better to consider it an error than to say "ok".
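
Something along these lines, as a rough sketch only (the helper name
resolve_author is made up here, and it assumes the "Author:" and
"Email:" lines that git-mailinfo writes to its info output):

```shell
# Hypothetical helper, not part of the posted patch.
# usage: resolve_author <mailinfo-info-file> [<fallback "Name <email>">]
# Prints "Name <email>" on success; returns 1 if no author can be determined.
resolve_author () {
	info=$1 fallback=$2
	name=$(sed -ne 's/^Author: //p' "$info")
	email=$(sed -ne 's/^Email: //p' "$info")
	if [ -n "$name" ] && [ -n "$email" ]
	then
		# The patch description carried usable authorship.
		echo "$name <$email>"
	elif [ -n "$fallback" ]
	then
		# Fall back to what the user passed with --author.
		echo "$fallback"
	else
		# Refuse to invent an "unknown" author.
		echo "no author found and no --author given" >&2
		return 1
	fi
}
```

The import loop would then stop at the first patch for which this
fails, instead of committing it.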

		Linus


* [PATCH] Implement git-quiltimport
From: Eric W. Biederman @ 2006-05-16 16:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


Importing a quilt patch series into git is not very difficult,
but parsing the patch descriptions and all of the other
minutiae take a bit of effort to get right, so this automates it.

Since git and quilt complement each other it makes sense
to make it easy to go back and forth between the two.

---

Eric

 Documentation/git-quiltimport.txt |   50 +++++++++++++++++++++
 Makefile                          |    2 -
 git-quiltimport.sh                |   88 +++++++++++++++++++++++++++++++++++++
 3 files changed, 139 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/git-quiltimport.txt
 create mode 100644 git-quiltimport.sh

2256c7e9b3913732a5c3a2e54cdea20fc951b76d
diff --git a/Documentation/git-quiltimport.txt b/Documentation/git-quiltimport.txt
new file mode 100644
index 0000000..8ea20eb
--- /dev/null
+++ b/Documentation/git-quiltimport.txt
@@ -0,0 +1,50 @@
+git-quiltimport(1)
+==================
+
+NAME
+----
+git-quiltimport - Applies a quilt patchset onto the current branch
+
+
+SYNOPSIS
+--------
+[verse]
+'git-quiltimport' [--author <author>] [--patches <dir>]
+
+
+DESCRIPTION
+-----------
+Applies a quilt patchset onto the current git branch, preserving
+the patch boundaries, patch order, and patch descriptions present
+in the quilt patchset.
+
+For each patch the code attempts to extract the author from the 
+patch description.  If that fails it falls back to the author
+specified with --author.  If the --author flag was not given
+the author is recorded as unknown.
+
+The patch name is preserved as the 1 line subject in the git
+description.
+
+OPTIONS
+-------
+--author Author Name <Author Email>::
+	The author name and email address to use when no author
+	information can be found in the patch description.
+
+--patches <dir>::
+	The directory to find the quilt patches and the
+	quilt series file.
+
+Author
+------
+Written by Eric Biederman <ebiederm@lnxi.com>
+
+Documentation
+--------------
+Documentation by Eric Biederman <ebiederm@lnxi.com>
+
+GIT
+---
+Part of the gitlink:git[7] suite
+
diff --git a/Makefile b/Makefile
index 37fbe78..1f4abe6 100644
--- a/Makefile
+++ b/Makefile
@@ -125,7 +125,7 @@ SCRIPT_SH = \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
 	git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
-	git-lost-found.sh
+	git-lost-found.sh git-quiltimport.sh
 
 SCRIPT_PERL = \
 	git-archimport.perl git-cvsimport.perl git-relink.perl \
diff --git a/git-quiltimport.sh b/git-quiltimport.sh
new file mode 100644
index 0000000..534be82
--- /dev/null
+++ b/git-quiltimport.sh
@@ -0,0 +1,88 @@
+#!/bin/sh
+USAGE='--author <author> --patches </path/to/quilt/patch/directory>'
+SUBDIRECTORY_OK=Yes
+. git-sh-setup
+
+quilt_author="Unknown <unknown>"
+while case "$#" in 0) break;; esac
+do
+	case "$1" in
+	--au=*|--aut=*|--auth=*|--autho=*|--author=*)
+		quilt_author=$(expr "$1" : '-[^=]*=\(.*\)')
+		shift
+		;;
+	
+	--au|--aut|--auth|--autho|--author)
+		case "$#" in 1) usage ;; esac
+		shift
+		quilt_author="$1"
+		shift
+		;;
+
+	--pa=*|--pat=*|--patc=*|--patch=*|--patche=*|--patches=*)
+		QUILT_PATCHES=$(expr "$1" : '-[^=]*=\(.*\)')
+		shift
+		;;
+	
+	--pa|--pat|--patc|--patch|--patche|--patches)
+		case "$#" in 1) usage ;; esac
+		shift
+		QUILT_PATCHES="$1"
+		shift
+		;;
+	
+	*)
+		break
+		;;
+	esac
+done
+
+# Quilt Author
+quilt_author_name=$(expr "z$quilt_author" : 'z\(.*[^ ]\) *<.*') &&
+quilt_author_email=$(expr "z$quilt_author" : '.*\(<.*\)') &&
+test '' != "$quilt_author_name" &&
+test '' != "$quilt_author_email" ||
+die "malformatted --author parameter"
+
+# Quilt patch directory
+: ${QUILT_PATCHES:=patches}
+if ! [ -d "$QUILT_PATCHES" ] ; then
+	echo "The \"$QUILT_PATCHES\" directory does not exist."
+	exit 1
+fi
+
+# Temporary directories
+tmp_dir=.dotest
+tmp_msg="$tmp_dir/msg"
+tmp_patch="$tmp_dir/patch"
+tmp_info="$tmp_dir/info"
+
+
+# Find the initial commit
+commit=$(git-rev-parse HEAD)
+
+mkdir $tmp_dir || exit 2
+grep -v '^#' "$QUILT_PATCHES/series" |
+while read line ; do
+	echo "$line"
+	git-mailinfo "$tmp_msg" "$tmp_patch" <"$QUILT_PATCHES/$line" >"$tmp_info" || exit 3
+	
+	# Parse the author information
+	export GIT_AUTHOR_NAME=$(sed -ne 's/Author: //p' "$tmp_info")
+	export GIT_AUTHOR_EMAIL=$(sed -ne 's/Email: //p' "$tmp_info")
+	if [ -z "$GIT_AUTHOR_EMAIL" ] ; then
+		GIT_AUTHOR_NAME=$quilt_author_name
+		GIT_AUTHOR_EMAIL=$quilt_author_email
+	fi
+	export GIT_AUTHOR_DATE=$(sed -ne 's/Date: //p' "$tmp_info")
+	export SUBJECT=$(sed -ne 's/Subject: //p' "$tmp_info")
+	if [ -z "$SUBJECT" ] ; then
+		SUBJECT=$(echo "$line" | sed -e 's/\.patch$//')
+	fi
+
+	git-apply --index -C1 "$tmp_patch" &&
+	tree=$(git-write-tree) &&
+	commit=$( (echo "$SUBJECT"; echo; cat "$tmp_msg") | git-commit-tree $tree -p $commit) &&
+	git-update-ref HEAD $commit || exit 4
+done
+rm -rf $tmp_dir || exit 5
-- 
1.3.2.g2256
