Git development

* Re: What's in git.git
From: Junio C Hamano @ 2006-02-09 10:55 UTC
  To: Andreas Ericsson; +Cc: git
In-Reply-To: <43EB1984.3040602@op5.se>

Andreas Ericsson <ae@op5.se> writes:

> This is exactly what I do when I improve upon things in master, and
> according to numerous emails this is the recommended workflow.

Yes.

> Do you mean
> 	$ git pull git://git.kernel.org/pub/scm/git/git +pu:my-pu

I do mean "+pu:pu".  In my illustration, "pu" is used in your
repository to track "pu" retrieved from me, and "my-pu" is a
fork you created from it and you build your changes upon.

	$ git pull $URL +pu:my-pu

is a shorthand for:

	$ git fetch $URL +pu:my-pu
        $ git merge "auto merge message" HEAD my-pu

and you definitely do not want to _fetch_ into my-pu when you
are on my-pu.

> ? Otherwise, I don't see how I can end up with merge-conflicts.

The problem is exactly why you need the plus sign when you fetch,
i.e. "+pu:pu".  My "pu" rebases.

Suppose I had this:

             o--o--o
            /      "pu"
	o--o
           "master"     

You do fetch +pu:pu, branch my-pu, and build on top of it:

                     o--o--o--o--o--o--o
                    /                  "my-pu"
             o--o--o
            /      "pu"
	o--o
           "master"

I add some to my "master" and rebuild "pu", maybe while adding
another commit on "pu".  You fetch +pu:pu again:

                     o--o--o--o--o--o--o
                    /                  "my-pu"
             o--o--o        o--o--o--o
            /              /         "pu" 
	o--o--o--o--o--o--o
                          "master"

Now, what happens when you merge "pu" into "my-pu"?  The three
commits I had on my previous "pu" are not part of the history of
the updated "pu" anymore, but are considered to be part of your
development trail.  If these added a file, and if
your development on top of the previous "pu" modified it, the
merge would result in:

 * originally the file did not exist.
 * "pu" adds it one way.
 * "my-pu" adds it in another way.

This requires a hand merge.  What I should do, instead of
rebasing "pu", is merge the updated master into "pu".

                     o--o--o--o--o--o--o
                    /                  "my-pu"
             o--o--o--------*--o
            /              /   "pu" 
	o--o--o--o--o--o--o
                          "master"

Then merging "my-pu" and "pu" becomes easier.  You do not
have to worry about the earlier three commits, because the point
you forked from the previous "pu" becomes the merge base.
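
To make the difference concrete, here is a minimal sketch in C of how
the merge base moves between the rebased picture and the merged one.
It models history as a toy array of commits with at most two parents
each.  The names struct graph, add_commit() and merge_base() are made
up for this illustration and are not git's own code, and the
brute-force search is only meant for tiny graphs.

```c
#include <assert.h>

#define MAX_COMMITS 32

/* Toy history: parent[i][0..1] are the parents of commit i, or -1. */
struct graph {
	int n;
	int parent[MAX_COMMITS][2];
};

/* Append a commit with up to two parents and return its index. */
static int add_commit(struct graph *g, int p1, int p2)
{
	assert(g->n < MAX_COMMITS);
	g->parent[g->n][0] = p1;
	g->parent[g->n][1] = p2;
	return g->n++;
}

/* Mark commit c and everything reachable from it. */
static void mark_ancestors(const struct graph *g, int c, unsigned char *seen)
{
	if (c < 0 || seen[c])
		return;
	seen[c] = 1;
	mark_ancestors(g, g->parent[c][0], seen);
	mark_ancestors(g, g->parent[c][1], seen);
}

/*
 * Return a common ancestor of a and b that is not an ancestor of
 * any other common ancestor, or -1 if the histories are unrelated.
 */
static int merge_base(const struct graph *g, int a, int b)
{
	unsigned char from_a[MAX_COMMITS] = {0}, from_b[MAX_COMMITS] = {0};
	int i, j;

	mark_ancestors(g, a, from_a);
	mark_ancestors(g, b, from_b);
	for (i = 0; i < g->n; i++) {
		int best = from_a[i] && from_b[i];

		for (j = 0; best && j < g->n; j++) {
			unsigned char below[MAX_COMMITS] = {0};

			if (j == i || !from_a[j] || !from_b[j])
				continue;
			mark_ancestors(g, j, below);
			if (below[i])
				best = 0;	/* j is a closer common ancestor */
		}
		if (best)
			return i;
	}
	return -1;
}
```

Built on a graph shaped like the pictures, merge_base() of "my-pu" and
the rebased "pu" comes out as the commit on "master" where the old
"pu" forked, so the three earlier commits sit on one side of the merge
only.  With the merged "pu", it comes out as the tip of the previous
"pu", which is the point "my-pu" forked from.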

The reason I have not done it that way so far is primarily that I am
lazy, and also that I do not like to see too many merges in the log.
Also "pu" tends to have really wacky stuff, so separating out
only the usable bits, excluding the wacky ones, is slightly easier if I
rebuild it from scratch.

The new "next" aka "not too close to bleeding or broken edge"
branch will be managed like the last picture above, in order to
make it easier to work with.  This only works if I do not
include topic branches that are too close to the bleeding edge in it.

* Re: What's in git.git
From: Junio C Hamano @ 2006-02-09 10:32 UTC
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0602091055540.24701@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> IMHO what you intend to put into "next" should be put into "master" 
> anyway: everyone interested in git development should try the new features 
> as early as possible.

Yes, but I've been trying to be _very_ conservative to keep
"master" clean and stable, as I said in my inauguration speech.

Since git is still young and we are building features that are
needed in the field every day, it is very beneficial for users
to keep up-to-date with "master", and I would really like to
encourage that.  It saddens me to see git patches posted to the
kernel list marked with 0.99.9.GIT by prominent kernel people.

However, I do not want to see their time wasted on getting
bitten by stupid bugs I carelessly place on the "master" branch.
So I'd like to keep "master" conservative, stable and boring, at
least for now.

Instead of introducing "next", I could treat "pu" the way I said
I would treat "next".  But even if I got rid of its constant
rewinding, "pu" tends to have intrusive stuff near its tip and is
very hard to build on top of.  Patches against the tip of
"pu" to fix things unrelated to the whacky ones often would be
inapplicable to "master".  This is especially true with what is
currently pending near the tip of "pu" (bind commits and shallow
clones).  I do not foresee them graduating to "master" any time
soon.  Not in their current shape.

The promised "next" should be much easier to build on top of,
without dissecting it into component topic branches, and it would
be the branch to track for people interested in git development
who want to stay closer to the edge without touching the bleeding
or even broken edge.  Making it easier for interested people to
participate in git development is what I am aiming at here.

I've considered publishing the topic branches individually.
Branches are cheap from the storage point of view (not really,
one inode and a filesystem block wasted to store only 41-bytes
;-)), but it needs management time and care (I will need to
remember to go to the repository and remove stale ones once they
are merged up).  Since branches in "next" are meant to be
short-lived, I am hoping it is easier for me to bundle them up
like I am planning.

On the other hand, long-lived whacky intrusive ones might be
better published as individual branches.  

* Re: What's in git.git
From: Andreas Ericsson @ 2006-02-09 10:29 UTC
  To: git
In-Reply-To: <7vk6c4etzy.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
> 
>>sean wrote:
>>
>>>I've always followed it okay by just using "git branch -d pu" each
>>>time before pulling from you.   Your "next" branch does sound like
>>>an improvement though.
>>
>>I thought
>>
>>	Pull: +pu:pu
>>
>>was supposed to handle such things automatically. It has always pulled
>>properly for me anyways.
> 
> 
> Yes, fetching to look at is no problem, but what I wanted to
> solve is that you cannot easily _touch_ it.  The point of this
> is to make improving on top of what is still _not_ in master
> easier for the contributors.
> 
> If you want to improve upon what is in the current "pu", the
> natural thing for you to do would be:
> 
> 	$ git fetch git://git.kernel.org/pub/scm/git/git +pu:pu
> 	$ git checkout -b my-pu pu ;# initial
>         $ hack on it and git commit many times
>         $ git format-patch --stdout pu..my-pu |
>           git send-email --to junkio@cox.net --cc git@vger.kernel.org
> 

This is exactly what I do when I improve upon things in master, and 
according to numerous emails this is the recommended workflow.

> (Side note: I do not know git-send-email would work like the
> above, but if it did that might be handy.  Ryan?)
> 

With my (still un-published) git-send-patch you could do

	$ work, work, work
	$ git send-patch -s "Some subject for a prelude message" pu

and it would do the right thing.

I guess I'll have to get around to sending that thing in sooner or later.

> But sometimes you may take more time than how my "pu"
> progresses, and you would want to sync your work to my updated
> "pu".  A natural thing you would want to do is this:
> 
>         $ git pull git://git.kernel.org/pub/scm/git/git +pu:pu
> 

Do you mean
	$ git pull git://git.kernel.org/pub/scm/git/git +pu:my-pu

? Otherwise, I don't see how I can end up with merge-conflicts.

> Unfortunately, this would _not_ work very well, because by the
> time you pull from my "pu" again, it would have rewound and
> rebased.  You would end up seeing unnecessary merge conflicts.
> 
> Another possibility would be:
> 
>         $ git fetch git://git.kernel.org/pub/scm/git/git +pu:pu
>         $ git rebase pu
> 

Using my own topic-branch, this is what I always do. Conflicts that 
occur that way are always in my patches, so they would have to be 
reworked anyway. The new rerere tool should help if I dally too long.

Perhaps I'm just weird, but I never touch published branches.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* [PATCH] Fix t9000-init.sh - comment for initial commit is different
From: Pavel Roskin @ 2006-02-09 10:01 UTC
  To: Petr Baudis, git

[PATCH] Fix t9000-init.sh - comment for initial commit is different

Comment for initial commit has "Initial commit" prepended to it now. 
Update test to match.

Signed-off-by: Pavel Roskin <proski@gnu.org>
---

 t/t9000-init.sh |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/t/t9000-init.sh b/t/t9000-init.sh
index aef2587..b3b9df3 100755
--- a/t/t9000-init.sh
+++ b/t/t9000-init.sh
@@ -37,7 +37,8 @@ test_expect_success 'initialize with the
 test_expect_success 'check if we have a commit' \
 	'[ -s .git/refs/heads/master ] && cg-object-id -c HEAD'
 test_expect_success 'check if the commit is proper' \
-	'[ "$(git-cat-file commit HEAD | sed -n '\''/^parent/q; /^$/{n; :a p; n; b a}'\'')" = "silly commit message" ]'
+	'[ "$(git-cat-file commit HEAD | sed -n '\''/^parent/q; /^$/{n; :a p; n; b a}'\'')" = "Initial commit
+silly commit message" ]'
 test_expect_success 'check if we have populated index' \
 	'[ "$(git-ls-files | tr '\''\n'\'' " ")" = "file1 file2 " ]'
 test_expect_success 'blow away the repository' \

-- 
Regards,
Pavel Roskin

* Re: What's in git.git
From: Johannes Schindelin @ 2006-02-09  9:58 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vslqtf2p1.fsf@assigned-by-dhcp.cox.net>

Hi,

On Wed, 8 Feb 2006, Junio C Hamano wrote:

> So I have created another branch, "next".

I never quite understood why you not just publish your topic branches. 
IMHO what you intend to put into "next" should be put into "master" 
anyway: everyone interested in git development should try the new features 
as early as possible. If there is a "whacky" feature, you can put it into 
"whacky/archexport" or something along the lines.

Ciao,
Dscho

* Re: [PATCH] Add git-annotate - a tool for annotating files with  the revision and person that created each line in the file.
From: Junio C Hamano @ 2006-02-09  9:57 UTC
  To: Andreas Ericsson; +Cc: git
In-Reply-To: <43EB093A.1060207@op5.se>

Andreas Ericsson <ae@op5.se> writes:

> So long as we never involve ruby, java or DCL, I'm a happy fellow.

Wholeheartedly seconded ;-).

* Re: What's in git.git
From: Junio C Hamano @ 2006-02-09  9:55 UTC
  To: Andreas Ericsson; +Cc: sean, Ryan Anderson, git
In-Reply-To: <43EB05B5.20307@op5.se>

Andreas Ericsson <ae@op5.se> writes:

> sean wrote:
>> I've always followed it okay by just using "git branch -d pu" each
>> time before pulling from you.   Your "next" branch does sound like
>> an improvement though.
>
> I thought
>
> 	Pull: +pu:pu
>
> was supposed to handle such things automatically. It has always pulled
> properly for me anyways.

Yes, fetching to look at is no problem, but what I wanted to
solve is that you cannot easily _touch_ it.  The point of this
is to make improving on top of what is still _not_ in master
easier for the contributors.

If you want to improve upon what is in the current "pu", the
natural thing for you to do would be:

	$ git fetch git://git.kernel.org/pub/scm/git/git +pu:pu
	$ git checkout -b my-pu pu ;# initial
        $ hack on it and git commit many times
        $ git format-patch --stdout pu..my-pu |
          git send-email --to junkio@cox.net --cc git@vger.kernel.org

(Side note: I do not know if git-send-email would work like the
above, but if it did that might be handy.  Ryan?)

But sometimes your work may take longer than it takes my "pu"
to move on, and you would want to sync your work to my updated
"pu".  A natural thing you would want to do is this:

        $ git pull git://git.kernel.org/pub/scm/git/git +pu:pu

Unfortunately, this would _not_ work very well, because by the
time you pull from my "pu" again, it would have rewound and
rebased.  You would end up seeing unnecessary merge conflicts.

Another possibility would be:

        $ git fetch git://git.kernel.org/pub/scm/git/git +pu:pu
        $ git rebase pu

This helps somewhat because "git rebase" uses "git cherry" to
detect the same patch with different commit ID in "pu" that you
already have in "my-pu".  But my topic branches have been
sometimes rewound and even rewritten to fix minor points (using
"reset --soft HEAD^" followed by "commit -a -c ORIG_HEAD"), and
when that happens "git rebase" would not be of much help.

The updated workflow on my part is trying to reduce these
problems by (1) not rewinding nor rebasing "next" and (2) not
rewinding nor rebasing the topic branches merged into "next".

Strictly speaking, the latter is not necessary (I would need to
resolve conflicts when merging the rewound/rebased topic
branches into "next", but after that is done, contributors who
pulled "next" do not have to deal with that, as long as "next"
itself is not rewound/rebased), but that way you could dissect
component topic branches more easily out of "next".

For example, as of this writing, my "master" and "next" look
like this:

    $ git show-branch --topo-order master next
    * [master] .gitignore git-rerere and config.mak
     ! [next] Merge branch 'jc/nostat'
    --
     - [next] Merge branch 'jc/nostat'
     + [next^2] "Assume unchanged" git: --really-refresh fix.
     - [next^] Merge branch 'jc/ls-files-o'
     + [next^^2] ls-files: honour per-directory ignore file ...
     - [next~2] Merge branches 'jc/nostat' and 'jc/empty-commit'
     + [next~2^3] t6000: fix a careless test library add-on.
     + [next~2^3^] Do not allow empty name or email.
     + [next^2^] ls-files: debugging aid for CE_VALID changes.
     + [next^2~2] "Assume unchanged" git: do not set CE_VALID...
     + [next^2~3] "Assume unchanged" git
    *+ [master] .gitignore git-rerere and config.mak

If you want to help fixing my thinko in jc/nostat branch, you
could:

	$ git checkout -b jc/nostat next^2
        $ fix fix fix; git commit

By convention, merge records what was the tip of the branch as
the first parent, and the second parent (and subsequent ones if
it is an Octopus) is the tip of the branch that was merged in,
so you can tell "next^2" is what was merged into the branch to
advance "next"; in other words, that is the tip of the jc/nostat
branch.  Similarly, you can tell the tip of jc/empty-commit was
merged to next~2 in an Octopus as the second merged-in branch,
so you can tell that its tip is next~2^3.
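
As a rough illustration of how such names are read, here is a small
sketch in C that follows "^N" and "~N" suffixes over a toy commit
graph.  The struct commit and walk() below are hypothetical stand-ins
written for this example, not git's own data structures or parser.
The sketch assumes at most three parents per commit, and handles only
these two suffix forms.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PARENTS 3

/* Toy commit: parent[0] is the first parent, unused slots are NULL. */
struct commit {
	const char *subject;
	const struct commit *parent[MAX_PARENTS];
};

/*
 * Follow a suffix such as "^2" or "~2^3" starting from c.
 * "^N" picks the Nth parent (a bare "^" means "^1"), and "~N"
 * follows the first parent N times (a bare "~" means "~1").
 * Returns NULL if the suffix is malformed or no such commit exists.
 */
static const struct commit *walk(const struct commit *c, const char *suffix)
{
	assert(suffix != NULL);
	while (c && *suffix) {
		char op = *suffix++;
		long n = 1;

		if (op != '^' && op != '~')
			return NULL;
		if (*suffix >= '0' && *suffix <= '9') {
			char *end;

			n = strtol(suffix, &end, 10);
			suffix = end;
		}
		if (op == '~') {
			while (c && n-- > 0)
				c = c->parent[0];
		} else if (n > MAX_PARENTS) {
			c = NULL;
		} else if (n > 0) {
			c = c->parent[n - 1];
		}
	}
	return c;
}
```

Starting from a commit that stands for "next", walk(next, "^2") lands
on the second parent of the topmost merge, and walk(next, "~2^3")
first follows the first parent twice and then takes the third parent
of the Octopus found there.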

You could even publish your jc/nostat branch after you built on
it and tell me to pull from it to fix my stupidity.

* Re: What's in git.git
From: sean @ 2006-02-09  9:40 UTC
  To: Andreas Ericsson; +Cc: junkio, git
In-Reply-To: <43EB05B5.20307@op5.se>

On Thu, 09 Feb 2006 10:04:53 +0100
Andreas Ericsson <ae@op5.se> wrote:

> I thought
> 
> 	Pull: +pu:pu
> 
> was supposed to handle such things automatically. It has always pulled 
> properly for me anyways.
> 

The only problem with that is that Junio rebases and discards commits
periodically that will still be in your local pu branch.   The fetch/merge 
logic doesn't notice that commits have disappeared from Junio's pu branch.
So you'll end up with a union of all the pu branches in your local repo 
with commits that were dropped and never merged into mainline by Junio.

Unless you add changes to the pu branch locally you should never need
anything but a fast forward when pulling from Junio.  Except it breaks
when he rebases things.   The easy hackish "fix" is just to delete and repull
the branch which is always small anyway.

Sean

* Re: [PATCH] Add git-annotate - a tool for annotating files with  the revision and person that created each line in the file.
From: Andreas Ericsson @ 2006-02-09  9:19 UTC
  To: git
In-Reply-To: <86ek2dsn5f.fsf@blue.stonehenge.com>

Randal L. Schwartz wrote:
>>>>>>"Franck" == Franck Bui-Huu <vagabon.xyz@gmail.com> writes:
> 
> 
> Franck> another perl script :(
> 
> Franck> Are there any rules on the choice of the script language ?
> 
> I could argue that they should all be Perl. :)
> 

Brave thing to do among such a bunch of hardcore C hackers. ;)

So long as we never involve ruby, java or DCL, I'm a happy fellow.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: What's in git.git
From: Andreas Ericsson @ 2006-02-09  9:04 UTC
  To: sean; +Cc: Junio C Hamano, git
In-Reply-To: <BAYC1-PASMTP1142DA49F5BC7B7B42B22FAE030@CEZ.ICE>

sean wrote:
> On Wed, 08 Feb 2006 22:47:54 -0800
> Junio C Hamano <junkio@cox.net> wrote:
> 
> 
> 
>>One *major* change I am thinking about doing is to change my
>>workflow a bit.  So far, the proposed updates branch "pu" was
>>almost impossible to follow unless you are really a devoted git
>>developer, because it is always rebased to the latest master and
>>then topic branches are merged onto it.  While that keeps the
>>number of unnecessary merge nodes between master and pu to the
>>minimum, it actively discouraged for the branch to be followed
>>by developers.
> 
> 
> I've always followed it okay by just using "git branch -d pu" each time 
> before pulling from you.   Your "next" branch does sound like an 
> improvement though.
> 

I thought

	Pull: +pu:pu

was supposed to handle such things automatically. It has always pulled 
properly for me anyways.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: What's in git.git
From: sean @ 2006-02-09  8:09 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vslqtf2p1.fsf@assigned-by-dhcp.cox.net>

On Wed, 08 Feb 2006 22:47:54 -0800
Junio C Hamano <junkio@cox.net> wrote:


> One *major* change I am thinking about doing is to change my
> workflow a bit.  So far, the proposed updates branch "pu" was
> almost impossible to follow unless you are really a devoted git
> developer, because it is always rebased to the latest master and
> then topic branches are merged onto it.  While that keeps the
> number of unnecessary merge nodes between master and pu to the
> minimum, it actively discouraged for the branch to be followed
> by developers.

I've always followed it okay by just using "git branch -d pu" each time 
before pulling from you.   Your "next" branch does sound like an 
improvement though.

Sean

* [PATCH] ls-files: honour per-directory ignore file from higher directories.
From: Junio C Hamano @ 2006-02-09  8:16 UTC
  To: Pavel Roskin; +Cc: Sam Ravnborg, git
In-Reply-To: <20060125061140.GA8408@mars.ravnborg.org>

When git-ls-files -o --exclude-per-directory=.gitignore is run
from a subdirectory, it did not read from .gitignore from its
parent directory.  Reading from them makes output from these two
commands consistent:

    $ git ls-files -o --exclude-per-directory=.gitignore Documentation
    $ cd Documentation &&
      git ls-files -o --exclude-per-directory=.gitignore
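
A minimal sketch of the idea, assuming a prefix such as
"Documentation/howto/" that ends with a slash: the per-directory
ignore file has to be looked up in every directory from the top of
the tree down to the prefix.  leading_dir_lengths() is a hypothetical
helper written for this illustration, not the function used in the
patch, and it only reports how long each leading directory name is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Store in out[] the length of every leading directory of prefix,
 * starting with the top level (length 0) and ending with prefix
 * itself.  At most max entries are written.  Returns the number of
 * entries stored.
 */
static size_t leading_dir_lengths(const char *prefix, size_t *out, size_t max)
{
	const char *p = prefix;
	size_t n = 0;

	assert(prefix != NULL);
	if (n < max)
		out[n++] = 0;	/* the top-level directory */
	while ((p = strchr(p, '/')) != NULL) {
		p++;		/* keep the trailing slash in the length */
		if (n < max)
			out[n++] = (size_t)(p - prefix);
	}
	return n;
}
```

For "Documentation/howto/" this yields the lengths of "",
"Documentation/" and "Documentation/howto/", which are the three
places where a .gitignore would have to be read.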

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * If there is positive feedback on this one, I consider it
   a safe enough candidate to be included in the 1.2.0 release.

 ls-files.c |   22 +++++++++++++++++++++-
 1 files changed, 21 insertions(+), 1 deletions(-)

701ca744e386c2429ca44072ea987bbb4bdac7ce
diff --git a/ls-files.c b/ls-files.c
index 6af3b09..7024cf1 100644
--- a/ls-files.c
+++ b/ls-files.c
@@ -474,8 +474,28 @@ static void show_files(void)
 		const char *path = ".", *base = "";
 		int baselen = prefix_len;

-		if (baselen)
+		if (baselen) {
 			path = base = prefix;
+			if (exclude_per_dir) {
+				char *p, *pp = xmalloc(baselen+1);
+				memcpy(pp, prefix, baselen+1);
+				p = pp;
+				while (1) {
+					char save = *p;
+					*p = 0;
+					push_exclude_per_directory(pp, p-pp);
+					*p++ = save;
+					if (!save)
+						break;
+					p = strchr(p, '/');
+					if (p)
+						p++;
+					else
+						p = pp + baselen;
+				}
+				free(pp);
+			}
+		}
 		read_directory(path, base, baselen);
 		qsort(dir, nr_dir, sizeof(struct nond_on_fs *), cmp_name);
 		if (show_others)
-- 
1.1.6.gbb042

* What's in git.git
From: Junio C Hamano @ 2006-02-09  6:47 UTC
  To: git

I haven't heard major breakage around the new features scheduled
for 1.2.0 so far, except for the two-tree "diff-tree --cc" Linus
has already fixed, so the previous "What's new" is pretty much
unchanged.

One *major* change I am thinking about doing is to change my
workflow a bit.  So far, the proposed updates branch "pu" has been
almost impossible to follow unless you are really a devoted git
developer, because it is always rebased to the latest master and
then topic branches are merged onto it.  While that keeps the
number of unnecessary merge nodes between master and pu to the
minimum, it actively discouraged developers from following the
branch.

I would like to rectify that.

So I have created another branch, "next".  This is managed quite
differently from "pu".  I'd promise these things:

 * It is to contain planned updates and merge from topic
   branches, just like "pu" currently does.  However, the topics
   merged there will not contain majorly whacky / unproven ones
   like bind commits and shallow clones, until the basic part
   proves sound during the list discussion.

 * I will not rewind or rebase the "next" branch.  Also I will
   not rebase the topic branches that are merged into it.

 * It would occasionally merge from "master" if only to prevent
   conflicts.

 * If there are patches sent to improve a topic branch in it,
   they will be applied to the topic branch, and then the topic
   branch is merged into "next", without any funny rewinding or
   rebasing of "next".  This will make the "next" branch
   cluttered with repeated merges from the same topic branch,
   but that is OK.  "next" will not be merged into "master",
   ever.

 * Once a topic is fully cooked, the topic branch will be merged
   into "master".

What this means is that "next" should be as easy to follow as
"master", but still is slightly ahead of "master" with not so
wildly experimental features.

Although there theoretically is no reason not to follow the
above principles I set for "next" to manage "pu", it will stay
wild for now until I get more comfortable with this workflow.

Now, what's in "next"?  Currently I have two topic branches
merged to it.

    * jc/nostat:
      ls-files: debugging aid for CE_VALID changes.
      "Assume unchanged" git: do not set CE_VALID with --refresh
      "Assume unchanged" git

    * jc/empty-commit:
      t6000: fix a careless test library add-on.
      Do not allow empty name or email.

* [PATCH] ls-files: debugging aid for CE_VALID changes.
From: Junio C Hamano @ 2006-02-09  5:50 UTC
  To: git
  Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth,
	Martin Langhoff, Linus Torvalds
In-Reply-To: <7vek2di043.fsf@assigned-by-dhcp.cox.net>

This is not really part of the proposed updates for CE_VALID,
but with this change, ls-files -t shows CE_VALID paths with
lowercase tag letters instead of the usual uppercase.  Useful
for checking out what is going on.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 ls-files.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

775ca05ee2ba7e1f54ec4db1fed7069014364f2c
diff --git a/ls-files.c b/ls-files.c
index 6af3b09..3f06ece 100644
--- a/ls-files.c
+++ b/ls-files.c
@@ -447,6 +447,22 @@ static void show_ce_entry(const char *ta
 	if (pathspec && !match(pathspec, ce->name, len))
 		return;
 
+	if (tag && *tag && (ce->ce_flags & htons(CE_VALID))) {
+		static char alttag[4];
+		memcpy(alttag, tag, 3);
+		if (isalpha(tag[0]))
+			alttag[0] = tolower(tag[0]);
+		else if (tag[0] == '?')
+			alttag[0] = '!';
+		else {
+			alttag[0] = 'v';
+			alttag[1] = tag[0];
+			alttag[2] = ' ';
+			alttag[3] = 0;
+		}
+		tag = alttag;
+	}
+
 	if (!show_stage) {
 		fputs(tag, stdout);
 		write_name_quoted("", 0, ce->name + offset,
@@ -503,7 +519,7 @@ static void show_files(void)
 			err = lstat(ce->name, &st);
 			if (show_deleted && err)
 				show_ce_entry(tag_removed, ce);
-			if (show_modified && ce_modified(ce, &st))
+			if (show_modified && ce_modified(ce, &st, 0))
 				show_ce_entry(tag_modified, ce);
 		}
 	}
-- 
1.1.6.gbb042

* [PATCH] "Assume unchanged" git: do not set CE_VALID with --refresh
From: Junio C Hamano @ 2006-02-09  5:49 UTC
  To: git
  Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth,
	Martin Langhoff, Linus Torvalds
In-Reply-To: <7vek2di043.fsf@assigned-by-dhcp.cox.net>

When working with automatic assume-unchanged mode using
core.ignorestat, setting CE_VALID after --refresh makes things
more cumbersome to use.  Consider this scenario:

 (1) the working tree is on a filesystem with slow lstat(2).
     The user sets core.ignorestat = true.

 (2) "git checkout" to switch to a different branch (or initial
     checkout) updates all paths and the index starts out with
     "all clean".

 (3) The user knows she wants to edit certain paths.  She uses
     update-index --no-assume-unchanged (we could call it --edit;
     the name is immaterial) to mark these paths and starts
     editing.

 (4) After editing half of the paths marked to be edited, she
     runs "git status".  This runs "update-index --refresh" to
     reduce the false hits from diff-files.

 (5) Now the other half of the paths, since she has not changed
     them, are found to match the index, and CE_VALID is set on
     them again.

For this reason, this commit makes update-index --refresh not
set CE_VALID even after paths without CE_VALID are verified
to be up to date.  The user can still run --really-refresh to
force lstat() to match the index entries to reality.
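
The intended rule can be sketched as a small pure function.  This is
a simplified model, not the code in update-index.c.  It assumes that
with core.ignorestat set, refreshing an up-to-date entry would
otherwise turn CE_VALID on.  flags_after_refresh() and its parameters
are names made up for the sketch, and the network byte order used for
the real flags is ignored.

```c
#include <assert.h>

#define CE_VALID 0x8000

/*
 * Flags of an index entry after a refresh found it up to date.
 * flags:      flags before the refresh (a 16-bit field)
 * ignorestat: nonzero if core.ignorestat is set
 * really:     nonzero for --really-refresh, zero for --refresh
 */
static unsigned flags_after_refresh(unsigned flags, int ignorestat, int really)
{
	unsigned updated = flags;

	assert(flags <= 0xffff);
	if (ignorestat)
		updated |= CE_VALID;	/* assumed default for verified paths */
	if (!really && ignorestat && !(flags & CE_VALID))
		updated &= ~(unsigned)CE_VALID;	/* leave paths being edited alone */
	return updated;
}
```

In this model a path the user marked with --no-assume-unchanged keeps
CE_VALID cleared across a plain --refresh, which is step (5) of the
scenario no longer happening, while a path that already had the bit
keeps it.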

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 update-index.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

fd4e57f17733d85ed5346d70005ea900cb80b9ff
diff --git a/update-index.c b/update-index.c
index 767fd49..bb73050 100644
--- a/update-index.c
+++ b/update-index.c
@@ -172,6 +172,15 @@ static struct cache_entry *refresh_entry
 	memcpy(updated, ce, size);
 	fill_stat_cache_info(updated, &st);
 
+	/* In this case, if really is not set, we should leave
+	 * CE_VALID bit alone.  Otherwise, paths marked with
+	 * --no-assume-unchanged (i.e. things to be edited) will
+	 * reacquire CE_VALID bit automatically, which is not
+	 * really what we want.
+	 */
+	if (!really && assume_unchanged && !(ce->ce_flags & htons(CE_VALID)))
+		updated->ce_flags &= ~htons(CE_VALID);
+
 	return updated;
 }
 
-- 
1.1.6.gbb042

* Re: Handling large files with GIT
From: Martin Langhoff @ 2006-02-09  5:38 UTC
  To: Greg KH; +Cc: Florian Weimer, git
In-Reply-To: <20060209045420.GB15924@kroah.com>

On 2/9/06, Greg KH <greg@kroah.com> wrote:
> Is there anyway to not repack everything if it's not
> need?

I often cg-clone large projects over stupid protocols (http/rsync) and
then hand-edit the branches/origin or remotes/origin file to say
"git://" instead.

It's called cheating but it works great ;-)



m

* [PATCH] "Assume unchanged" git
From: Junio C Hamano @ 2006-02-09  5:15 UTC
  To: git
  Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth,
	Martin Langhoff, Linus Torvalds
In-Reply-To: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> The real meat is just making sure that CE_VALID gets set/cleared properly.

Setting is the easier part.  Deciding when to ignore/clear for the
sake of safety and usability is harder.  I think I got the
basics right but we might want to pass "really" from more places.

This is _not_ 1.2 material, but I think it is ready to be tested
by people who asked for this feature.  It applies on top of the
recent master branch.

-- >8 --
[PATCH] "Assume unchanged" git

This adds "assume unchanged" logic, started by this message in the list
discussion recently:

	<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of.  On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.

You can use two new options to update-index to set and reset the
CE_VALID bit:

	git-update-index --assume-unchanged path...
	git-update-index --no-assume-unchanged path...

These forms manipulate only the CE_VALID bit; they do not change
the object name recorded in the index file.  Nor do they add a new
entry to the index.

When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:

 - update-index to explicitly register the current object name to the
   index file.

 - when update-index --refresh finds the path to be up-to-date.

 - when tools like read-tree -u and apply --index update the working
   tree file and register the current object name to the index file.

The flag is dropped upon read-tree that does not check out the index
entry.  This happens regardless of the core.ignorestat settings.

Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time.  However, there are cases where the
CE_VALID bit is ignored for the sake of safety and usability:

 - while "git-read-tree -m" or git-apply need to make sure
   that the paths involved in the merge do not have local
   modifications.  This sacrifices performance for safety.

 - when git-checkout-index -f -q -u -a tries to see if it needs
   to checkout the paths.  Otherwise you can never check
   anything out ;-).

 - when git-update-index --really-refresh (a new flag) tries to
   see if the index entry is up to date.  You can start with
   everything marked as CE_VALID and run this once to drop
   CE_VALID bit for paths that are modified.

Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
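
The rule described above can be condensed into a sketch like the
following.  It is a simplified model rather than the code in
read-cache.c: entry_changed(), stat_differs and ignore_valid are names
chosen for the illustration, with ignore_valid standing for the
callers that must not trust the bit.  Flags are kept in host byte
order for simplicity.

```c
#include <assert.h>

#define CE_VALID 0x8000

/*
 * Decide whether a path should be reported as changed.
 * flags:        index entry flags (a 16-bit field)
 * stat_differs: nonzero if lstat() data disagrees with the index
 * ignore_valid: nonzero for callers that must not trust CE_VALID
 */
static int entry_changed(unsigned flags, int stat_differs, int ignore_valid)
{
	assert(flags <= 0xffff);
	if (!ignore_valid && (flags & CE_VALID))
		return 0;	/* assumed unchanged, lstat() result not consulted */
	return stat_differs != 0;
}
```

A plain "update-index --refresh" corresponds to ignore_valid being 0,
so an edited file that still carries CE_VALID goes unnoticed.  Callers
such as git-apply correspond to ignore_valid being 1, which matches
the extra argument of 1 passed to ce_match_stat() in the apply.c hunk
of this patch.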

This version is not expected to be perfect.  I think diff
between index and/or tree and working files may need some
adjustment, and there are probably other cases where we should
automatically unmark paths that are marked CE_VALID.

But the basics seem to work, and it is ready to be tested by people
who asked for this feature.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 apply.c          |    2 +-
 cache.h          |    6 +++--
 checkout-index.c |    1 +
 config.c         |    5 ++++
 diff-files.c     |    2 +-
 diff-index.c     |    2 +-
 diff.c           |    2 +-
 entry.c          |    2 +-
 environment.c    |    1 +
 read-cache.c     |   28 +++++++++++++++++++----
 read-tree.c      |    2 +-
 update-index.c   |   65 ++++++++++++++++++++++++++++++++++++++++++++++++------
 write-tree.c     |    2 +-
 13 files changed, 99 insertions(+), 21 deletions(-)

b169290f100cfa67b785c361bcae83f807487f5e
diff --git a/apply.c b/apply.c
index 2ad47fb..35ae48e 100644
--- a/apply.c
+++ b/apply.c
@@ -1309,7 +1309,7 @@ static int check_patch(struct patch *pat
 					return -1;
 			}
 
-			changed = ce_match_stat(active_cache[pos], &st);
+			changed = ce_match_stat(active_cache[pos], &st, 1);
 			if (changed)
 				return error("%s: does not match index",
 					     old_name);
diff --git a/cache.h b/cache.h
index bdbe2d6..cd58fad 100644
--- a/cache.h
+++ b/cache.h
@@ -91,6 +91,7 @@ struct cache_entry {
 #define CE_NAMEMASK  (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_UPDATE    (0x4000)
+#define CE_VALID     (0x8000)
 #define CE_STAGESHIFT 12
 
 #define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT))
@@ -144,8 +145,8 @@ extern int add_cache_entry(struct cache_
 extern int remove_cache_entry_at(int pos);
 extern int remove_file_from_cache(const char *path);
 extern int ce_same_name(struct cache_entry *a, struct cache_entry *b);
-extern int ce_match_stat(struct cache_entry *ce, struct stat *st);
-extern int ce_modified(struct cache_entry *ce, struct stat *st);
+extern int ce_match_stat(struct cache_entry *ce, struct stat *st, int);
+extern int ce_modified(struct cache_entry *ce, struct stat *st, int);
 extern int ce_path_match(const struct cache_entry *ce, const char **pathspec);
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, const char *type);
 extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object);
@@ -161,6 +162,7 @@ extern int commit_index_file(struct cach
 extern void rollback_index_file(struct cache_file *);
 
 extern int trust_executable_bit;
+extern int assume_unchanged;
 extern int only_use_symrefs;
 extern int diff_rename_limit_default;
 extern int shared_repository;
diff --git a/checkout-index.c b/checkout-index.c
index 53dd8cb..957b4a8 100644
--- a/checkout-index.c
+++ b/checkout-index.c
@@ -116,6 +116,7 @@ int main(int argc, char **argv)
 	int all = 0;
 
 	prefix = setup_git_directory();
+	git_config(git_default_config);
 	prefix_length = prefix ? strlen(prefix) : 0;
 
 	if (read_cache() < 0) {
diff --git a/config.c b/config.c
index 8355224..7dbdce1 100644
--- a/config.c
+++ b/config.c
@@ -222,6 +222,11 @@ int git_default_config(const char *var, 
 		return 0;
 	}
 
+	if (!strcmp(var, "core.ignorestat")) {
+		assume_unchanged = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.symrefsonly")) {
 		only_use_symrefs = git_config_bool(var, value);
 		return 0;
diff --git a/diff-files.c b/diff-files.c
index d24d11c..c96ad35 100644
--- a/diff-files.c
+++ b/diff-files.c
@@ -191,7 +191,7 @@ int main(int argc, const char **argv)
 			show_file('-', ce);
 			continue;
 		}
-		changed = ce_match_stat(ce, &st);
+		changed = ce_match_stat(ce, &st, 0);
 		if (!changed && !diff_options.find_copies_harder)
 			continue;
 		oldmode = ntohl(ce->ce_mode);
diff --git a/diff-index.c b/diff-index.c
index f8a102e..12a9418 100644
--- a/diff-index.c
+++ b/diff-index.c
@@ -33,7 +33,7 @@ static int get_stat_data(struct cache_en
 			}
 			return -1;
 		}
-		changed = ce_match_stat(ce, &st);
+		changed = ce_match_stat(ce, &st, 0);
 		if (changed) {
 			mode = create_ce_mode(st.st_mode);
 			if (!trust_executable_bit &&
diff --git a/diff.c b/diff.c
index ec51e7d..c72064e 100644
--- a/diff.c
+++ b/diff.c
@@ -311,7 +311,7 @@ static int work_tree_matches(const char 
 	ce = active_cache[pos];
 	if ((lstat(name, &st) < 0) ||
 	    !S_ISREG(st.st_mode) || /* careful! */
-	    ce_match_stat(ce, &st) ||
+	    ce_match_stat(ce, &st, 0) ||
 	    memcmp(sha1, ce->sha1, 20))
 		return 0;
 	/* we return 1 only when we can stat, it is a regular file,
diff --git a/entry.c b/entry.c
index 6c47c3a..8fb99bc 100644
--- a/entry.c
+++ b/entry.c
@@ -123,7 +123,7 @@ int checkout_entry(struct cache_entry *c
 	strcpy(path + len, ce->name);
 
 	if (!lstat(path, &st)) {
-		unsigned changed = ce_match_stat(ce, &st);
+		unsigned changed = ce_match_stat(ce, &st, 1);
 		if (!changed)
 			return 0;
 		if (!state->force) {
diff --git a/environment.c b/environment.c
index 0596fc6..251e53c 100644
--- a/environment.c
+++ b/environment.c
@@ -12,6 +12,7 @@
 char git_default_email[MAX_GITNAME];
 char git_default_name[MAX_GITNAME];
 int trust_executable_bit = 1;
+int assume_unchanged = 0;
 int only_use_symrefs = 0;
 int repository_format_version = 0;
 char git_commit_encoding[MAX_ENCODING_LENGTH] = "utf-8";
diff --git a/read-cache.c b/read-cache.c
index c5474d4..efbb1be 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -27,6 +27,9 @@ void fill_stat_cache_info(struct cache_e
 	ce->ce_uid = htonl(st->st_uid);
 	ce->ce_gid = htonl(st->st_gid);
 	ce->ce_size = htonl(st->st_size);
+
+	if (assume_unchanged)
+		ce->ce_flags |= htons(CE_VALID);
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
@@ -146,9 +149,18 @@ static int ce_match_stat_basic(struct ca
 	return changed;
 }
 
-int ce_match_stat(struct cache_entry *ce, struct stat *st)
+int ce_match_stat(struct cache_entry *ce, struct stat *st, int ignore_valid)
 {
-	unsigned int changed = ce_match_stat_basic(ce, st);
+	unsigned int changed;
+
+	/*
+	 * If it's marked as always valid in the index, it's
+	 * valid whatever the checked-out copy says.
+	 */
+	if (!ignore_valid && (ce->ce_flags & htons(CE_VALID)))
+		return 0;
+
+	changed = ce_match_stat_basic(ce, st);
 
 	/*
 	 * Within 1 second of this sequence:
@@ -164,7 +176,7 @@ int ce_match_stat(struct cache_entry *ce
 	 * effectively mean we can make at most one commit per second,
 	 * which is not acceptable.  Instead, we check cache entries
 	 * whose mtime are the same as the index file timestamp more
-	 * careful than others.
+	 * carefully than others.
 	 */
 	if (!changed &&
 	    index_file_timestamp &&
@@ -174,10 +186,10 @@ int ce_match_stat(struct cache_entry *ce
 	return changed;
 }
 
-int ce_modified(struct cache_entry *ce, struct stat *st)
+int ce_modified(struct cache_entry *ce, struct stat *st, int really)
 {
 	int changed, changed_fs;
-	changed = ce_match_stat(ce, st);
+	changed = ce_match_stat(ce, st, really);
 	if (!changed)
 		return 0;
 	/*
@@ -233,6 +245,11 @@ int cache_name_compare(const char *name1
 		return -1;
 	if (len1 > len2)
 		return 1;
+
+	/* Differences between "assume up-to-date" should not matter. */
+	flags1 &= ~CE_VALID;
+	flags2 &= ~CE_VALID;
+
 	if (flags1 < flags2)
 		return -1;
 	if (flags1 > flags2)
@@ -430,6 +447,7 @@ int add_cache_entry(struct cache_entry *
 	int ok_to_add = option & ADD_CACHE_OK_TO_ADD;
 	int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
+
 	pos = cache_name_pos(ce->name, ntohs(ce->ce_flags));
 
 	/* existing match? Just replace it. */
diff --git a/read-tree.c b/read-tree.c
index 5580f15..52f06e3 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -349,7 +349,7 @@ static void verify_uptodate(struct cache
 		return;
 
 	if (!lstat(ce->name, &st)) {
-		unsigned changed = ce_match_stat(ce, &st);
+		unsigned changed = ce_match_stat(ce, &st, 1);
 		if (!changed)
 			return;
 		errno = 0;
diff --git a/update-index.c b/update-index.c
index afec98d..767fd49 100644
--- a/update-index.c
+++ b/update-index.c
@@ -23,6 +23,10 @@ static int quiet; /* --refresh needing u
 static int info_only;
 static int force_remove;
 static int verbose;
+static int mark_valid_only = 0;
+#define MARK_VALID 1
+#define UNMARK_VALID 2
+
 
 /* Three functions to allow overloaded pointer return; see linux/err.h */
 static inline void *ERR_PTR(long error)
@@ -53,6 +57,25 @@ static void report(const char *fmt, ...)
 	va_end(vp);
 }
 
+static int mark_valid(const char *path)
+{
+	int namelen = strlen(path);
+	int pos = cache_name_pos(path, namelen);
+	if (0 <= pos) {
+		switch (mark_valid_only) {
+		case MARK_VALID:
+			active_cache[pos]->ce_flags |= htons(CE_VALID);
+			break;
+		case UNMARK_VALID:
+			active_cache[pos]->ce_flags &= ~htons(CE_VALID);
+			break;
+		}
+		active_cache_changed = 1;
+		return 0;
+	}
+	return -1;
+}
+
 static int add_file_to_cache(const char *path)
 {
 	int size, namelen, option, status;
@@ -94,6 +117,7 @@ static int add_file_to_cache(const char 
 	ce = xmalloc(size);
 	memset(ce, 0, size);
 	memcpy(ce->name, path, namelen);
+	ce->ce_flags = htons(namelen);
 	fill_stat_cache_info(ce, &st);
 
 	ce->ce_mode = create_ce_mode(st.st_mode);
@@ -105,7 +129,6 @@ static int add_file_to_cache(const char 
 		if (0 <= pos)
 			ce->ce_mode = active_cache[pos]->ce_mode;
 	}
-	ce->ce_flags = htons(namelen);
 
 	if (index_path(ce->sha1, path, &st, !info_only))
 		return -1;
@@ -128,7 +151,7 @@ static int add_file_to_cache(const char 
  * For example, you'd want to do this after doing a "git-read-tree",
  * to link up the stat cache details with the proper files.
  */
-static struct cache_entry *refresh_entry(struct cache_entry *ce)
+static struct cache_entry *refresh_entry(struct cache_entry *ce, int really)
 {
 	struct stat st;
 	struct cache_entry *updated;
@@ -137,21 +160,22 @@ static struct cache_entry *refresh_entry
 	if (lstat(ce->name, &st) < 0)
 		return ERR_PTR(-errno);
 
-	changed = ce_match_stat(ce, &st);
+	changed = ce_match_stat(ce, &st, really);
 	if (!changed)
 		return NULL;
 
-	if (ce_modified(ce, &st))
+	if (ce_modified(ce, &st, really))
 		return ERR_PTR(-EINVAL);
 
 	size = ce_size(ce);
 	updated = xmalloc(size);
 	memcpy(updated, ce, size);
 	fill_stat_cache_info(updated, &st);
+
 	return updated;
 }
 
-static int refresh_cache(void)
+static int refresh_cache(int really)
 {
 	int i;
 	int has_errors = 0;
@@ -171,12 +195,19 @@ static int refresh_cache(void)
 			continue;
 		}
 
-		new = refresh_entry(ce);
+		new = refresh_entry(ce, really);
 		if (!new)
 			continue;
 		if (IS_ERR(new)) {
 			if (not_new && PTR_ERR(new) == -ENOENT)
 				continue;
+			if (really && PTR_ERR(new) == -EINVAL) {
+				/* If we are doing --really-refresh that
+				 * means the index is not valid anymore.
+				 */
+				ce->ce_flags &= ~htons(CE_VALID);
+				active_cache_changed = 1;
+			}
 			if (quiet)
 				continue;
 			printf("%s: needs update\n", ce->name);
@@ -274,6 +305,8 @@ static int add_cacheinfo(unsigned int mo
 	memcpy(ce->name, path, len);
 	ce->ce_flags = create_ce_flags(len, stage);
 	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= htons(CE_VALID);
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
 	if (add_cache_entry(ce, option))
@@ -317,6 +350,12 @@ static void update_one(const char *path,
 		fprintf(stderr, "Ignoring path %s\n", path);
 		return;
 	}
+	if (mark_valid_only) {
+		if (mark_valid(p))
+			die("Unable to mark file %s", path);
+		return;
+	}
+
 	if (force_remove) {
 		if (remove_file_from_cache(p))
 			die("git-update-index: unable to remove %s", path);
@@ -467,7 +506,11 @@ int main(int argc, const char **argv)
 				continue;
 			}
 			if (!strcmp(path, "--refresh")) {
-				has_errors |= refresh_cache();
+				has_errors |= refresh_cache(0);
+				continue;
+			}
+			if (!strcmp(path, "--really-refresh")) {
+				has_errors |= refresh_cache(1);
 				continue;
 			}
 			if (!strcmp(path, "--cacheinfo")) {
@@ -493,6 +536,14 @@ int main(int argc, const char **argv)
 					die("git-update-index: %s cannot chmod %s", path, argv[i]);
 				continue;
 			}
+			if (!strcmp(path, "--assume-unchanged")) {
+				mark_valid_only = MARK_VALID;
+				continue;
+			}
+			if (!strcmp(path, "--no-assume-unchanged")) {
+				mark_valid_only = UNMARK_VALID;
+				continue;
+			}
 			if (!strcmp(path, "--info-only")) {
 				info_only = 1;
 				continue;
diff --git a/write-tree.c b/write-tree.c
index f866059..addb5de 100644
--- a/write-tree.c
+++ b/write-tree.c
@@ -111,7 +111,7 @@ int main(int argc, char **argv)
 	funny = 0;
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = active_cache[i];
-		if (ntohs(ce->ce_flags) & ~CE_NAMEMASK) {
+		if (ce_stage(ce)) {
 			if (10 < ++funny) {
 				fprintf(stderr, "...\n");
 				break;
-- 
1.1.6.gbb042
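
As a reading aid for the cache.h, read-cache.c and write-tree.c hunks
above: CE_VALID takes the top bit of the 16-bit flags word, so a test
that treats "any bit outside the name length" as "unmerged" would
misfire on entries marked valid.  A small sketch, assuming host byte
order (the real code goes through htons/ntohs) and using helper names
(make_flags, flags_stage, old_unmerged_test, compare_flags) that are
hypothetical and not part of git:

```c
#include <assert.h>

/* Bit layout of the 16-bit flags word after this patch (cache.h). */
#define CE_NAMEMASK   0x0fff
#define CE_STAGEMASK  0x3000
#define CE_UPDATE     0x4000
#define CE_VALID      0x8000
#define CE_STAGESHIFT 12

/* Hypothetical helper: build a flags word in host byte order. */
unsigned make_flags(unsigned namelen, unsigned stage, int valid)
{
	assert(namelen <= CE_NAMEMASK && stage <= 3);
	return namelen | (stage << CE_STAGESHIFT) | (valid ? CE_VALID : 0);
}

/* Merge stage (0 = merged, 1..3 = unmerged), in the spirit of ce_stage(). */
unsigned flags_stage(unsigned flags)
{
	return (flags & CE_STAGEMASK) >> CE_STAGESHIFT;
}

/* The test write-tree used before the patch: any bit beyond the name length. */
int old_unmerged_test(unsigned flags)
{
	return (flags & ~CE_NAMEMASK) != 0;
}

/* Order two flags words with CE_VALID masked off, as cache_name_compare() does. */
int compare_flags(unsigned flags1, unsigned flags2)
{
	flags1 &= ~CE_VALID;
	flags2 &= ~CE_VALID;
	if (flags1 < flags2)
		return -1;
	if (flags1 > flags2)
		return 1;
	return 0;
}
```

For a fully merged entry marked valid, old_unmerged_test() fires even
though flags_stage() is 0, which is why write-tree switches to checking
the stage only.  compare_flags() shows why two entries that differ only
in CE_VALID still sort as the same entry.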

* Re: Handling large files with GIT
From: Greg KH @ 2006-02-09  4:54 UTC (permalink / raw)
  To: Florian Weimer; +Cc: git
In-Reply-To: <87slqty2c8.fsf@mid.deneb.enyo.de>

On Wed, Feb 08, 2006 at 10:20:39PM +0100, Florian Weimer wrote:
> * Martin Langhoff:
> 
> > SVN does reasonably well tracking his >1GB mbox file. Now, I don't
> > know if I like the idea of putting my own mbox file under version
> > control, but it looks like projects with large and slow-changing files
> > would be in trouble with GIT.
> 
> To my surprise, it's not that bad.  The Debian testing-security team
> uses a single 1.8 MB file (400 KB compressed) to keep vulnerability
> data.  Most changes to that file involve just a few lines.  But even
> in this extreme case, git doesn't compare too badly against Subversion
> if you pack regularly (but not too often).  Disk usage is actually
> *below* Subversion FSFS even with --depth=10 (the default,
> unfortunately a bit hard to override).

I have a project that has 2.5Mb files, and git handles them just fine,
even on my old slow laptop.

But when I tried to use it to back up my old email archive a few months
ago, running about 2Gb in about 300 files, it took forever.  Luckily I
only archive stuff off every other month or so, otherwise it would be
unusable.

However, I did notice one problem.  When cloning from one machine to
another, for a project that is already fully packed, it seems that the
whole project is packed again before sending it across the wire.  With
an archive this big, that takes over an hour for my slow old fileserver.
I ended up just rsyncing over the files and pointing the parent back to
the original.  Is there any way to not repack everything if it's not
needed?

thanks,

greg k-h

* Re: gitweb using "--cc"?
From: Junio C Hamano @ 2006-02-09  3:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602081817040.2458@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Now, arguably, the raw format should default to the same kind of "were 
> there data conflicts" that "-c" does for merges, but it doesn't, so it's 
> silent ;(

True.  There was a discussion to come up with sensible
semantics for -c without -p (currently --cc and -c imply -p),
but I haven't got around to it, since --cc was more useful in
general.

Volunteers?

* Re: gitweb using "--cc"?
From: Kay Sievers @ 2006-02-09  3:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602081532360.2458@g5.osdl.org>

On Wed, Feb 08, 2006 at 03:44:58PM -0800, Linus Torvalds wrote:
> 
> I just did an arm merge that needed some (very trivial) manual fixups 
> (commit ID cce0cac1, in case anybody cares).
> 
> As usual, git-diff-tree --cc does a beautiful job on it, but I also 
> checked the gitweb output, which seems to not do as well (the commit 
> message about a manual conflict merge doesn't make any sense at all).
> 
> Now, in this case, what gitweb shows is actually "sensible": it will show 
> the diff of what the merge "brought in" to the mainline kernel, and in 
> that sense I can certainly understand it. It basically diffs the merge 
> against the first parent.
> 
> So looking at that particular example, arguably gitweb does something 
> "different" from what the commit message is talking about, but in many 
> ways it's a perfectly logical thing.
> 
> However, diffing against the first parent, while it sometimes happens to 
> be a sane thing to do, really isn't very sane in general. The merge may go 
> the other way (subdevelopers merging my code), like in commit b2faf597, 
> and sometimes there might not be a single reference tree, but more of a
> "couple of main branches" approach with merging back and forth. Then the
> current gitweb behaviour makes no sense at all.
> 
> So it would be much nicer if gitweb had some alternate approach to showing 
> merge diffs. My suggested approach would be to just let the user choose: 
> have separate "diff against first/second[/third[/..]] parent" buttons. And
> one of the choices would be the "conflict view" that git-diff-tree --cc 
> gives (I'd argue for that being the default one, because it's the only one 
> that doesn't have a "preferred parent").

Hmm, I have no real clue what all the --cc is about. It's not obvious
for someone who never thought about "meta patches" or "complex merges". :)

If nobody else can do the changes to gitweb, sure, I'll do this and try
to understand what is needed, but then I will need a more detailed
explanation of the functionality we want to see here, ideally with some
commented command-line examples that produce the data you want to see,
so that I can imagine what you are looking for and give it a try ...

On the technical side for the kernel.org installation:
  Does git diff use /usr/bin/diff?

  Does git diff create temp files?

  How can I specify the location for the temp files?
  (This wasn't possible some months ago, but it is needed on kernel.org.)

  Is the temp file naming safe for a lot of git diff processes running
  in parallel?

  Is a --cc capable git already available on the kernel.org boxes?

Thanks,
Kay

* Re: gitweb using "--cc"?
From: Linus Torvalds @ 2006-02-09  2:26 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Kay Sievers, Git Mailing List
In-Reply-To: <43EAA560.8030504@didntduck.org>

On Wed, 8 Feb 2006, Brian Gerst wrote:
> 
> git-whatchanged doesn't show that merge commit either.

Actually, it does. You just have to ask it.

	git-whatchanged --cc

The thing is, "git-whatchanged" is different from "git diff" and other 
helpers, in that it by default shows the "raw" git representation. Which 
indeed doesn't show that merge as being anything interesting.

But with "--cc", the merge suddenly blossoms.

Now, arguably, the raw format should default to the same kind of "were 
there data conflicts" that "-c" does for merges, but it doesn't, so it's 
silent ;(

		Linus

* Re: gitweb using "--cc"?
From: Brian Gerst @ 2006-02-09  2:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kay Sievers, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602081532360.2458@g5.osdl.org>

Linus Torvalds wrote:
> I just did an arm merge that needed some (very trivial) manual fixups 
> (commit ID cce0cac1, in case anybody cares).
> 
> As usual, git-diff-tree --cc does a beautiful job on it, but I also 
> checked the gitweb output, which seems to not do as well (the commit 
> message about a manual conflict merge doesn't make any sense at all).
> 
> Now, in this case, what gitweb shows is actually "sensible": it will show 
> the diff of what the merge "brought in" to the mainline kernel, and in 
> that sense I can certainly understand it. It basically diffs the merge 
> against the first parent.
> 
> So looking at that particular example, arguably gitweb does something 
> "different" from what the commit message is talking about, but in many 
> ways it's a perfectly logical thing.
> 
> However, diffing against the first parent, while it sometimes happens to 
> be a sane thing to do, really isn't very sane in general. The merge may go 
> the other way (subdevelopers merging my code), like in commit b2faf597, 
> and sometimes there might not be a single reference tree, but more of a
> "couple of main branches" approach with merging back and forth. Then the
> current gitweb behaviour makes no sense at all.
> 
> So it would be much nicer if gitweb had some alternate approach to showing 
> merge diffs. My suggested approach would be to just let the user choose: 
> have separate "diff against first/second[/third[/..]] parent" buttons. And
> one of the choices would be the "conflict view" that git-diff-tree --cc 
> gives (I'd argue for that being the default one, because it's the only one 
> that doesn't have a "preferred parent").
> 
> Kay?
> 
> 		Linus

git-whatchanged doesn't show that merge commit either.

--
				Brian Gerst

* Re: Two crazy proposals for changing git's diff commands
From: Junio C Hamano @ 2006-02-09  1:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602081726390.2458@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 8 Feb 2006, Junio C Hamano wrote:
>> 
>> So we could even implement that with
>> 
>> 	$ git commit --preview [other flags] [paths...]
>
> My argument for "git status" was that it very much _is_ about "what would 
> I commit". So I'd much rather extend on "git status" than add a 
> "--preview" flag to "git commit".

Our messages crossed.  I'd agree about the reasoning completely,
as I did in my previous message.

Now we need to find a volunteer to do all that work ;-).

* Re: Two crazy proposals for changing git's diff commands
From: Junio C Hamano @ 2006-02-09  1:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602081643570.2458@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> In particular, I think the real culprit is the plain "git diff" with no 
> arguments at all. Right now it ends up showing just a piece of the 
> picture, and the piece it shows is incomplete enough to be irritating.

Not necessarily.  Your "during the merge" example is a good one,
and "so far it looks good and I do not want to see these diffs
while modifying things further -- update-index it!" workflow
benefits immensely from it.

> Right now a plain "git diff" only shows the differences in the current 
> tree against the index. I think that was just the wrong choice. I think 
> almost everybody would actually prefer the default to be to show the 
> difference against HEAD.

I somehow suspect this is welding training wheels to your
bicycle.

> Teaching "git status" to take a "-p" flag (for "patch" - or -v for 
> "verbose") might actually be a good thing. Then, instead of "git diff", 
> you'd use "git status -p" and it would show you what the differences are 
> in the index, and what they are in the tree, so you'd _really_ know what 
> "git commit" in all its glory would do.

I think this may not be a bad idea.

What we could do is this:

	$ git status -v [--only | --include] [paths...]

When -v is given, it takes the same parameters as "git commit",
and changes its output format from the usual N x "# useful info lines"
to something like:

	---
        diff ...
        --- a/path
        +++ b/path
        @@ -N,M +L,K @@

that shows the commit preview.  At the same time we change the "git
commit" log message reader to stop reading the input at
the first '---' line, just like we do for e-mailed patches.
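
A rough sketch of that log message reader change, in C.  It assumes the
cut happens at the first line that consists of exactly three dashes,
which is one reading of "the first '---' line"; the function name
log_message_length is hypothetical and not an existing git function.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the number of leading bytes of msg that belong to the log
 * message, i.e. everything before the first line that is exactly "---".
 * If there is no such line, the whole message is kept.
 */
size_t log_message_length(const char *msg)
{
	const char *line = msg;

	assert(msg != NULL);

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* A separator line is three dashes and nothing else. */
		if (len == 3 && !memcmp(line, "---", 3))
			return (size_t)(line - msg);

		if (!eol)
			break;
		line = eol + 1;
	}
	return strlen(msg);
}
```

Requiring an exact match means the "--- a/path" headers inside the
preview diff do not count as separators, so only the marker that
introduces the preview ends the message.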

Then:

 - If you want a commit preview _before_ initiating a commit,
   you can say:

	$ git status -v [whatever you planned to give git commit]

 - If you want a commit preview _while_ writing the commit log, you can
   say:

	$ git commit -v [whatever your parameters are]

   which internally would pass the -v and "$@" to git status
   that seeds the log message.

 - If you want a commit preview after you made a commit, it is
   too late ;-).

* Re: Two crazy proposals for changing git's diff commands
From: Linus Torvalds @ 2006-02-09  1:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git
In-Reply-To: <7vfymtl43b.fsf@assigned-by-dhcp.cox.net>

On Wed, 8 Feb 2006, Junio C Hamano wrote:
> 
> So we could even implement that with
> 
> 	$ git commit --preview [other flags] [paths...]

My argument for "git status" was that it very much _is_ about "what would 
I commit". So I'd much rather extend on "git status" than add a 
"--preview" flag to "git commit".

At least to me, "git status" is very much a "what is pending" kind of 
command. The fact that you can't con it into giving a diff is actually a 
downside: I would at times have preferred to have the "git commit" message 
contain an extended status that contained the diffs too.

Another way of saying that: I think it would make sense to have a verbose 
status report, and maybe also have that verbose flag passed into "git 
commit" so that you can see the verbose status when you edit the commit 
message.

Under that logic, "git status -v" would show all the diffs (not just 
filenames) and "git commit -v .." would be the same as "git commit .." but 
the "-v" flag would have been passed down to the "git status" call, so the 
commit message file would be pre-populated with the diff.

For small commits, it's actually nice to see the diff itself as you write 
the commit message.

		Linus
