Git development

Git development
 help / color / mirror / Atom feed

* Re: Unmodified submodules shows up as dirty with 1.6.6.443.gd7346
From: Junio C Hamano @ 2010-01-18 17:22 UTC (permalink / raw)
  To: Gustaf Hendeby; +Cc: Jacob Helwig, git, Jens.Lehmann
In-Reply-To: <4B549254.5090206@isy.liu.se>

Gustaf Hendeby <hendeby@isy.liu.se> writes:

> ....  I don't want to include
> these files in a .gitignore as they are named differently on different
> systems.  Hence, I include them in my .git/info/exclude file,...

I don't have a strong opinion on the submodules part of your issue, but
the above part applies to projects with or without submodules, which I
have an opinion, and because it is different from what I used to teach
people, I think it is worth mentioning..

I used to say "Never place *~ (or *.swp) in .gitignore because they are
only useful to you who use Emacs (or vim); and do have *.o in .gitignore,
because everybody who compile your checkout would see it".

But I don't think the former is a right attitude.  My thinking these days
is that keeping these in .gitignore should not just be tolerated but
should be actively encouraged, unless the project may need to track paths
that match *~ or *.swp in the future,

If it is very unlikely that the project will ever track them, there is no
harm done [*1*], and it will help other people because they don't have to
add the same and common entries in their own .git/info/excludes file.

I am suspecting that your "these files ... are named differently on
different systems" may fall into the same category.  Your build may not
produce "frotz.linux" when compiled on a FreeBSD box (and "frotz.fbsd" on
a Linux box), but would it hurt more than it helps to list them in the
same .gitignore to cover both?

[Footnote]

*1* Once it starts doing so, un-ignoring a special case can be done
at that point in the history

^ permalink raw reply

* Re: Unmodified submodules shows up as dirty with 1.6.6.443.gd7346
From: Gustaf Hendeby @ 2010-01-18 17:27 UTC (permalink / raw)
  To: Jacob Helwig; +Cc: git, Jens.Lehmann
In-Reply-To: <8c9a061001180914v42074056o3a5d077ac4cea70f@mail.gmail.com>

Jacob Helwig wrote:
> On Mon, Jan 18, 2010 at 08:54, Gustaf Hendeby <hendeby@isy.liu.se> wrote:
>> Jacob Helwig wrote:
>>> On Mon, Jan 18, 2010 at 07:30, Gustaf Hendeby <hendeby@isy.liu.se> wrote:
>>>> Hi!
>>>>
>>>> I have been using submodules for a while, and been quite happy with
>>>> them.  Just updating to the latest next (1.6.6.443.gd7346), a strange
>>>> problem has occurred.  All my submodules (which are in fact unmodified)
>>>> show as modified and dirty
>>>>
>>>> diff --git a/extern/utils b/extern/utils
>>>> --- a/extern/utils
>>>> +++ b/extern/utils
>>>> @@ -1 +1 @@
>>>> -Subproject commit 6bad20e1419f1ca61bd5a6eef9b5937122e006f1
>>>> +Subproject commit 6bad20e1419f1ca61bd5a6eef9b5937122e006f1-dirty
>>>>
>>>>
>>> Do you have any untracked files in the submodule?  git status is
>>> working as I would expect with the same version (1.6.6.443.gd7346).
>> Yes, I do.
>>
>>> If there is no output from git status in the submodule, then git
>>> status in the superproject shows the submodule as being clean.
>>> However, if there is _any_ output from git status (untracked files,
>>> modified files, deleted files, new files), then the superproject shows
>>> the submodule as being dirty.
>>>
>> I have the following use case, which is affected.  I have with in a
>> submodule some code that needs to be compiled, and hence generate some
>> object files and other files in the process.  I don't want to include
>> these files in a .gitignore as they are named differently on different
>> systems.  Hence, I include them in my .git/info/exclude file, where I am
>> developing the module.  So now, unless I do the same thing for all
>> places I checkout the repo as submodule, I end up with the module
>> indicated as dirty after I compile it.  This is a bit inconvenient.
> 
> That being said:
> The .gitignore file supports shell globs.  Are the generated files
> created with names that are so different that some simple shell globs
> used in one or more .gitignore files couldn't cover them?

Under Linux I for example get .o files whereas under Windows I get .obj
files.  Of course I could put both in .gitignore, but I don't like to
have to exclude more files than necessary it gives me a bad feeling of
accidentally one day exclude something important, which has at occasions
happened before.  (For the same reason i don't usually put *~ in
.gitignore, as I don't want to impose emacs on others, that might prefer
a different editor, but that is a bit different from the case we have
here.)  Furthermore, list could start to grow with more systems and
build chains with more intermediate files.

/Gustaf

^ permalink raw reply

* Re: [PATCH] Fix segfault in fast-export
From: Heiko Voigt @ 2010-01-18 17:44 UTC (permalink / raw)
  To: Mike Mueller; +Cc: git
In-Reply-To: <20100107035839.GM8510@samus.subfocal.net>

Hi,

if want your change included in git you probably want to CC: Junio and
inline your patch so its easier to comment. Please see the file
Documentation/SubmittingPatches for tips on how to do it with your
mailer.

cheers Heiko

P.S.: and include a commit message in your patch

On Wed, Jan 06, 2010 at 10:58:39PM -0500, Mike Mueller wrote:
> Hi all,
> 
> I'm working on a C++ static analyzer (Vigilant Sentry), and git
> is one of my test subjects.  In git-1.6.6, I found a crash in the
> fast-export command:
> 
> The problem is in builtin-fast-export.c, function export_marks:
> 
>     f = fopen(file, "w");
>     if (!f)
>         error("Unable to open marks file %s for writing.", file);
>    
>     for (i = 0; i < idnums.size; i++) {
>         if (deco->base && deco->base->type == 1) {
>             mark = ptr_to_mark(deco->decoration);
>             if (fprintf(f, ":%"PRIu32" %s\n", mark,
>                 sha1_to_hex(deco->base->sha1)) < 0) {
>                 e = 1;
>                 break;
>             }
>         }
>         deco++;
>     }
>    
>     e |= ferror(f);
>     e |= fclose(f);
> 
> If fopen() fails, the error message is printed, but the function
> doesn't exit.  The subsequent calls to fprintf and/or ferror will
> fail because f is NULL.  A simple way to reproduce is to export
> to a path you don't have write access to:
>    
>     $ git fast-export --export-marks=/foo
>     error: Unable to open marks file /foo for writing.
>     Segmentation fault (core dumped)
> 
> I've attached a trivial patch that calls die_errno instead of
> error, so the program exits if f is NULL.
> 
> Regards,
> Mike
> 
> -- 
> Mike Mueller
> mmueller@vigilantsw.com
> 
> http://www.vigilantsw.com/

> diff --git a/builtin-fast-export.c b/builtin-fast-export.c
> index b0a4029..963e89b 100644
> --- a/builtin-fast-export.c
> +++ b/builtin-fast-export.c
> @@ -503,7 +503,7 @@ static void export_marks(char *file)
>  
>  	f = fopen(file, "w");
>  	if (!f)
> -		error("Unable to open marks file %s for writing.", file);
> +		die_errno("Unable to open marks file %s for writing", file);
>  
>  	for (i = 0; i < idnums.size; i++) {
>  		if (deco->base && deco->base->type == 1) {

^ permalink raw reply

* [PATCH] Replace parse_blob() with an explanatory comment
From: Daniel Barkalow @ 2010-01-18 18:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

parse_blob() has never actually been used; it has served simply to
avoid having a confusing gap in the API. Instead of leaving it, put in
a comment that explains what "parsing a blob" entails (making sure the
object is actually readable), and why code might care whether a blob
has been parsed or not.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---
I also realized that the code that used to use the "parsed" bit to 
distinguish blobs that had actually be read from blobs that had only been 
referenced (fsck-cache, at the time) no longer works that way, as far as I 
can tell. However, I think it's worth maintaining the convention that it 
used, and better not to say that nothing known presently relies on it, 
since that could change.

 blob.c |   21 ---------------------
 blob.h |    9 ++++++++-
 2 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/blob.c b/blob.c
index bd7d078..ae320bd 100644
--- a/blob.c
+++ b/blob.c
@@ -23,24 +23,3 @@ int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size)
 	item->object.parsed = 1;
 	return 0;
 }
-
-int parse_blob(struct blob *item)
-{
-        enum object_type type;
-        void *buffer;
-        unsigned long size;
-	int ret;
-
-        if (item->object.parsed)
-                return 0;
-        buffer = read_sha1_file(item->object.sha1, &type, &size);
-        if (!buffer)
-                return error("Could not read %s",
-                             sha1_to_hex(item->object.sha1));
-        if (type != OBJ_BLOB)
-                return error("Object %s not a blob",
-                             sha1_to_hex(item->object.sha1));
-	ret = parse_blob_buffer(item, buffer, size);
-	free(buffer);
-	return ret;
-}
diff --git a/blob.h b/blob.h
index ea5d9e9..38ea395 100644
--- a/blob.h
+++ b/blob.h
@@ -13,6 +13,13 @@ struct blob *lookup_blob(const unsigned char *sha1);
 
 int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size);
 
-int parse_blob(struct blob *item);
+/** 
+ * Blobs do not contain references to other objects and do not have
+ * structured data that needs parsing. However, code may use the
+ * "parsed" bit in the struct object for a blob to determine whether
+ * its content has been found to actually be available, so
+ * parse_blob_buffer() is used (by object.c) to flag that the object
+ * has been read successfully from the database.
+ **/
 
 #endif /* BLOB_H */
-- 
1.6.6.rc0.354.g373a9.dirty

^ permalink raw reply related

* Re: [PATCH v3] Threaded grep
From: Linus Torvalds @ 2010-01-18 18:05 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: git, Junio C Hamano
In-Reply-To: <20100118103334.GA17361@fredrik-laptop>

On Mon, 18 Jan 2010, Fredrik Kuivinen wrote:
>
> Make git grep use threads when it is available.

Ok, this is much better.

On my machine (4 cores with HT, so 8 threads total), I get the 
following:

 [ Note: the --no-ext-grep is because I'm comparing with a git that has 
  the original grep optimization, but hasn't removed the external grep 
  functionality yet. I just used the same command line when then comparing 
  to next+your patch, so don't mind it.

  Also, the three examples are chosen to be: "trivial single match", 
  "regex single match", and "trivial lots of matches" ]

Before (all best-of-five), with the non-threaded internal grep:

 - git grep --no-ext-grep qwerty

	real	0m0.311s
	user	0m0.164s
	sys	0m0.144s

 - git grep --no-ext-grep qwerty.*as

	real	0m0.555s
	user	0m0.416s
	sys	0m0.132s

 - git grep --no-ext-grep function

	real	0m0.461s
	user	0m0.276s
	sys	0m0.180s

After, with threading:

 - git grep --no-ext-grep qwerty

	real	0m0.152s
	user	0m0.788s
	sys	0m0.212s

 - git grep --no-ext-grep qwerty.*as

	real	0m0.148s
	user	0m0.724s
	sys	0m0.284s

 - git grep --no-ext-grep function

	real	0m0.241s
	user	0m1.436s
	sys	0m0.216s

so now it's a clear win in all cases. And as you'd expect, the "complex 
case with single output" is the one that wins most, since it's the one 
that should have the most CPU usage, with the least synchronization.

That said, it still does waste quite a bit of time doing this, and while 
it almost doubles performance in the last case too, it does so at the cost 
of quite a bit of overhead.

(Some overhead is natural, especially since I have HT. Running two threads 
on the same core does _not_ give double the performance, so the CPU time 
almost doubling - because the threads themselves run slower - is not 
unexpected. It's when the CPU time more than quadruples that I suspect 
that it could be improved a bit).

		Linus

^ permalink raw reply

* Re: [RFC] Git Wiki Move
From: J.H. @ 2010-01-18 18:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Matthieu Moy, Petr Baudis, git
In-Reply-To: <alpine.DEB.1.00.1001181258540.3044@intel-tinevez-2-302>

On 01/18/2010 04:03 AM, Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 18 Jan 2010, Johannes Schindelin wrote:
> 
>> On Sun, 17 Jan 2010, J.H. wrote:
>>
>>> On 01/17/2010 03:06 PM, Johannes Schindelin wrote:
>>>
>>>> BTW there is a file KHTMLFixes.css in the directory skins/monobook/ 
>>>> which makes the layout break with Chromium.  Apparently, it is no 
>>>> longer needed by KHTML anyway.  So could you please replace that 
>>>> file with an empty one, or comment out the offending part, like so:
>>>>
>>>> 	/* #column-content { margin-left: 0; } */
>>>
>>> I'm not keen on making changes since that file is still coming from 
>>> the shipping version of mediawiki and I'm trying, quite a lot, to not 
>>> run a modified version of it.  I've got enough troubles with the fact 
>>> that that one change would affect 22 wikis in a single go.
>>>
>>> As bad as it is to say this, I'd rather wait for 1.16 to come out vs. 
>>> modify it in place.  Mediawiki is claiming they are in continuous 
>>> integration development with quarterly releases but their last release 
>>> was June of 2009, so take that as you will.
>>
>> Fair enough.  As long as Chromium has a fairly small share of the 
>> market, I think it is safe to tell everybody to wait a little until the 
>> side bar is no longer displayed at the left _below_ the main body text.  
>> AFAICT the problem was solved with Wikipedia, so the next release should 
>> magically fix the issue.
> 
> Actually, in a rare case of cleverness, I found out how to fix it (at 
> least for me, it works):
> 
> 	http://git.wiki.kernel.org/index.php/MediaWiki:Monobook.css
> 
> Of course, this is only a work-around, and it will get broken once 
> Monobook changes dimensions (or more).  But hopefully Chromium will be 
> fixed by then, extending the canvas into the negative arm of the x axis 
> whenever needed.  And then I will happily delete the custom Monobook.css.

That works.

> These are the things left that I would like to see soon:
> 
> - add a link from the old Wiki (with rewrite rules)

I think the rewrites should be simple enough (the pages seem to line up
1:1 easily enough) so all I would need to do is discuss this with Petr.

> - use whatever logo (anything is better than the sunflower)

It's easy enough to change just need to know which one to use.

> - enable anonymous edits

Anonymous edits will *NEVER* be enabled.  You are going to need to have
an account and your going to have to verify it's e-mail to be able to do
eits.  Completely anonymous edits are too much of a risk considering
bots and spam and the abuse they can do.

Even then there is at least one case of a spammer actually going through
and using a verified account.

> - a major cleanup of the broken autolinks and faulty formatting.
> 
> The latter point is probably the most tedious one, so I suggest that only 
> those get a vote on the logo who fix at least 3 pages.

sound fair to me.

- John 'Warthog9' Hawley

^ permalink raw reply

* Re: [PATCH] builtin-apply.c: Skip filenames without enough components
From: Andreas Gruenbacher @ 2010-01-18 19:57 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Junio C Hamano, git
In-Reply-To: <20100118192235.6117@nanako3.lavabit.com>

On Monday 18 January 2010 11:22:35 am Nanako Shiraishi wrote:
> Junio, in case you don't want to wait for Andreas, you can squash this test
>  in.

Thanks, looks good!

Andreas

^ permalink raw reply

* [PATCH v2] Performance optimization for detection of modified submodules
From: Jens Lehmann @ 2010-01-18 20:26 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List

In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time should not be thrown away.

New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.

When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---


Changed since the first version:

- diff_change() now has two dirty_submodule arguments, one for
  each side.

- If the submodules HEAD does not match the ref it is compared
  against, is_submodule_modified() is only called when the
  patch formatted output is requested.

- Changed the type of the new dirty_submodule flags to unsigned.

- Updated the commit message


 diff-lib.c  |   46 +++++++++++++++++++++++++++++++---------------
 diff.c      |   15 +++++++++++----
 diff.h      |   10 ++++++----
 diffcore.h  |    1 +
 revision.c  |    5 +++--
 tree-diff.c |    8 ++++----
 6 files changed, 56 insertions(+), 29 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 29c5915..ec2e2ac 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -73,6 +73,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		unsigned int oldmode, newmode;
 		struct cache_entry *ce = active_cache[i];
 		int changed;
+		unsigned dirty_submodule = 0;

 		if (DIFF_OPT_TST(&revs->diffopt, QUICK) &&
 			DIFF_OPT_TST(&revs->diffopt, HAS_CHANGES))
@@ -173,12 +174,16 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 			if (silent_on_removed)
 				continue;
 			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
-				       ce->sha1, ce->name);
+				       ce->sha1, ce->name, 0);
 			continue;
 		}
 		changed = ce_match_stat(ce, &st, ce_option);
-		if (S_ISGITLINK(ce->ce_mode) && !changed)
-			changed = is_submodule_modified(ce->name);
+		if (S_ISGITLINK(ce->ce_mode)
+		    && (!changed || (revs->diffopt.output_format & DIFF_FORMAT_PATCH))
+		    && is_submodule_modified(ce->name)) {
+			changed = 1;
+			dirty_submodule = 1;
+		}
 		if (!changed) {
 			ce_mark_uptodate(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
@@ -188,7 +193,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		newmode = ce_mode_from_stat(ce, st.st_mode);
 		diff_change(&revs->diffopt, oldmode, newmode,
 			    ce->sha1, (changed ? null_sha1 : ce->sha1),
-			    ce->name);
+			    ce->name, 0, dirty_submodule);

 	}
 	diffcore_std(&revs->diffopt);
@@ -204,16 +209,18 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 static void diff_index_show_file(struct rev_info *revs,
 				 const char *prefix,
 				 struct cache_entry *ce,
-				 const unsigned char *sha1, unsigned int mode)
+				 const unsigned char *sha1, unsigned int mode,
+				 unsigned dirty_submodule)
 {
 	diff_addremove(&revs->diffopt, prefix[0], mode,
-		       sha1, ce->name);
+		       sha1, ce->name, dirty_submodule);
 }

 static int get_stat_data(struct cache_entry *ce,
 			 const unsigned char **sha1p,
 			 unsigned int *modep,
-			 int cached, int match_missing)
+			 int cached, int match_missing,
+			 unsigned *dirty_submodule, int output_format)
 {
 	const unsigned char *sha1 = ce->sha1;
 	unsigned int mode = ce->ce_mode;
@@ -233,8 +240,13 @@ static int get_stat_data(struct cache_entry *ce,
 			return -1;
 		}
 		changed = ce_match_stat(ce, &st, 0);
-		if (changed
-		    || (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name))) {
+		if (S_ISGITLINK(ce->ce_mode)
+		    && (!changed || (output_format & DIFF_FORMAT_PATCH))
+		    && is_submodule_modified(ce->name)) {
+			changed = 1;
+			*dirty_submodule = 1;
+		}
+		if (changed) {
 			mode = ce_mode_from_stat(ce, st.st_mode);
 			sha1 = null_sha1;
 		}
@@ -251,15 +263,17 @@ static void show_new_file(struct rev_info *revs,
 {
 	const unsigned char *sha1;
 	unsigned int mode;
+	unsigned dirty_submodule = 0;

 	/*
 	 * New file in the index: it might actually be different in
 	 * the working copy.
 	 */
-	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0)
+	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
+	    &dirty_submodule, revs->diffopt.output_format) < 0)
 		return;

-	diff_index_show_file(revs, "+", new, sha1, mode);
+	diff_index_show_file(revs, "+", new, sha1, mode, dirty_submodule);
 }

 static int show_modified(struct rev_info *revs,
@@ -270,11 +284,13 @@ static int show_modified(struct rev_info *revs,
 {
 	unsigned int mode, oldmode;
 	const unsigned char *sha1;
+	unsigned dirty_submodule = 0;

-	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
+	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
+			  &dirty_submodule, revs->diffopt.output_format) < 0) {
 		if (report_missing)
 			diff_index_show_file(revs, "-", old,
-					     old->sha1, old->ce_mode);
+					     old->sha1, old->ce_mode, 0);
 		return -1;
 	}

@@ -309,7 +325,7 @@ static int show_modified(struct rev_info *revs,
 		return 0;

 	diff_change(&revs->diffopt, oldmode, mode,
-		    old->sha1, sha1, old->name);
+		    old->sha1, sha1, old->name, 0, dirty_submodule);
 	return 0;
 }

@@ -356,7 +372,7 @@ static void do_oneway_diff(struct unpack_trees_options *o,
 	 * Something removed from the tree?
 	 */
 	if (!idx) {
-		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
+		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode, 0);
 		return;
 	}

diff --git a/diff.c b/diff.c
index 012b3d3..f130a36 100644
--- a/diff.c
+++ b/diff.c
@@ -2032,7 +2032,7 @@ static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 	char *data = xmalloc(100), *dirty = "";

 	/* Are we looking at the work tree? */
-	if (!s->sha1_valid && is_submodule_modified(s->path))
+	if (!s->sha1_valid && s->dirty_submodule)
 		dirty = "-dirty";

 	len = snprintf(data, 100,
@@ -3736,7 +3736,7 @@ int diff_result_code(struct diff_options *opt, int status)
 void diff_addremove(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *concatpath)
+		    const char *concatpath, unsigned dirty_submodule)
 {
 	struct diff_filespec *one, *two;

@@ -3768,8 +3768,10 @@ void diff_addremove(struct diff_options *options,

 	if (addremove != '+')
 		fill_filespec(one, sha1, mode);
-	if (addremove != '-')
+	if (addremove != '-') {
 		fill_filespec(two, sha1, mode);
+		two->dirty_submodule = dirty_submodule;
+	}

 	diff_queue(&diff_queued_diff, one, two);
 	if (!DIFF_OPT_TST(options, DIFF_FROM_CONTENTS))
@@ -3780,7 +3782,8 @@ void diff_change(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *concatpath)
+		 const char *concatpath,
+		 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
 {
 	struct diff_filespec *one, *two;

@@ -3793,6 +3796,8 @@ void diff_change(struct diff_options *options,
 		const unsigned char *tmp_c;
 		tmp = old_mode; old_mode = new_mode; new_mode = tmp;
 		tmp_c = old_sha1; old_sha1 = new_sha1; new_sha1 = tmp_c;
+		tmp = old_dirty_submodule; old_dirty_submodule = new_dirty_submodule;
+			new_dirty_submodule = tmp;
 	}

 	if (options->prefix &&
@@ -3803,6 +3808,8 @@ void diff_change(struct diff_options *options,
 	two = alloc_filespec(concatpath);
 	fill_filespec(one, old_sha1, old_mode);
 	fill_filespec(two, new_sha1, new_mode);
+	one->dirty_submodule = old_dirty_submodule;
+	two->dirty_submodule = new_dirty_submodule;

 	diff_queue(&diff_queued_diff, one, two);
 	if (!DIFF_OPT_TST(options, DIFF_FROM_CONTENTS))
diff --git a/diff.h b/diff.h
index 06a9a88..2ef3341 100644
--- a/diff.h
+++ b/diff.h
@@ -14,12 +14,13 @@ typedef void (*change_fn_t)(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *fullpath);
+		 const char *fullpath,
+		 unsigned old_dirty_submodule, unsigned new_dirty_submodule);

 typedef void (*add_remove_fn_t)(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *fullpath);
+		    const char *fullpath, unsigned dirty_submodule);

 typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
 		struct diff_options *options, void *data);
@@ -177,13 +178,14 @@ extern void diff_addremove(struct diff_options *,
 			   int addremove,
 			   unsigned mode,
 			   const unsigned char *sha1,
-			   const char *fullpath);
+			   const char *fullpath, unsigned dirty_submodule);

 extern void diff_change(struct diff_options *,
 			unsigned mode1, unsigned mode2,
 			const unsigned char *sha1,
 			const unsigned char *sha2,
-			const char *fullpath);
+			const char *fullpath,
+			unsigned dirty_submodule1, unsigned dirty_submodule2);

 extern void diff_unmerge(struct diff_options *,
 			 const char *path,
diff --git a/diffcore.h b/diffcore.h
index 5b63458..66687c3 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -42,6 +42,7 @@ struct diff_filespec {
 #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
 	unsigned should_free : 1; /* data should be free()'ed */
 	unsigned should_munmap : 1; /* data should be munmap()'ed */
+	unsigned dirty_submodule : 1;  /* For submodules: its work tree is dirty */

 	struct userdiff_driver *driver;
 	/* data should be considered "binary"; -1 means "don't know yet" */
diff --git a/revision.c b/revision.c
index 25fa14d..769cfd4 100644
--- a/revision.c
+++ b/revision.c
@@ -268,7 +268,7 @@ static int tree_difference = REV_TREE_SAME;
 static void file_add_remove(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *fullpath)
+		    const char *fullpath, unsigned dirty_submodule)
 {
 	int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD;

@@ -281,7 +281,8 @@ static void file_change(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *fullpath)
+		 const char *fullpath,
+		 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
 {
 	tree_difference = REV_TREE_DIFFERENT;
 	DIFF_OPT_SET(options, HAS_CHANGES);
diff --git a/tree-diff.c b/tree-diff.c
index 7d745b4..fe9f52c 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -68,7 +68,7 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const
 		if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) {
 			newbase[baselen + pathlen1] = 0;
 			opt->change(opt, mode1, mode2,
-				    sha1, sha2, newbase);
+				    sha1, sha2, newbase, 0, 0);
 			newbase[baselen + pathlen1] = '/';
 		}
 		retval = diff_tree_sha1(sha1, sha2, newbase, opt);
@@ -77,7 +77,7 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const
 	}

 	fullname = malloc_fullname(base, baselen, path1, pathlen1);
-	opt->change(opt, mode1, mode2, sha1, sha2, fullname);
+	opt->change(opt, mode1, mode2, sha1, sha2, fullname, 0, 0);
 	free(fullname);
 	return 0;
 }
@@ -241,7 +241,7 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree

 		if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) {
 			newbase[baselen + pathlen] = 0;
-			opt->add_remove(opt, *prefix, mode, sha1, newbase);
+			opt->add_remove(opt, *prefix, mode, sha1, newbase, 0);
 			newbase[baselen + pathlen] = '/';
 		}

@@ -252,7 +252,7 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree
 		free(newbase);
 	} else {
 		char *fullname = malloc_fullname(base, baselen, path, pathlen);
-		opt->add_remove(opt, prefix[0], mode, sha1, fullname);
+		opt->add_remove(opt, prefix[0], mode, sha1, fullname, 0);
 		free(fullname);
 	}
 }
-- 
1.6.6.444.g4e34e.dirty

^ permalink raw reply related

* Re: [RFC] Git Wiki Move
From: Johannes Schindelin @ 2010-01-18 20:46 UTC (permalink / raw)
  To: J.H.; +Cc: Matthieu Moy, Petr Baudis, git
In-Reply-To: <4B54ACBD.6000000@eaglescrag.net>

Hi,

On Mon, 18 Jan 2010, J.H. wrote:

> On 01/18/2010 04:03 AM, Johannes Schindelin wrote:
> 
> > These are the things left that I would like to see soon:
> > 
> > - add a link from the old Wiki (with rewrite rules)
> 
> I think the rewrites should be simple enough (the pages seem to line up 
> 1:1 easily enough) so all I would need to do is discuss this with Petr.

Wonderful.

> > - use whatever logo (anything is better than the sunflower)
> 
> It's easy enough to change just need to know which one to use.

See below.

> > - enable anonymous edits
> 
> Anonymous edits will *NEVER* be enabled.  You are going to need to have
> an account and your going to have to verify it's e-mail to be able to do
> eits.  Completely anonymous edits are too much of a risk considering
> bots and spam and the abuse they can do.
> 
> Even then there is at least one case of a spammer actually going through
> and using a verified account.

That's good news.  Every couple of days I would look through the recent 
changes, and sadly enough, there were many spammings on the GitWiki.  No 
more!

> > - a major cleanup of the broken autolinks and faulty formatting.
> > 
> > The latter point is probably the most tedious one, so I suggest that 
> > only those get a vote on the logo who fix at least 3 pages.
> 
> sound fair to me.

I think I qualify.  My vote goes for 
http://henrik.nyh.se/uploads/git-logo.png, as it is really pretty.  If you 
agree, and put in your vote, too, you can already set it until the 
bikeshedding has settled down ;-)

Ciao,
Dscho

^ permalink raw reply

* [PATCH 0/2] Branch --set-upstream
From: Ilari Liusvaara @ 2010-01-18 20:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Matthieu Moy

This adds new switch --set-upstream that can set upstream information
on existing branch branch without changing where the branch points
to.

push --set-upstream doesn't help if user doesn't have push priviledges
or if user is not up to date with upstream. Forcing branch recreate
is not possible if user has own patches appiled to the branch.

This series is edited version of Matthieu Moy's branch --track patch
series. The first patch is edited, the second is resent as-is.

Ilari Liusvaara (1):
  Add branch --set-upstream

Matthieu Moy (1):
  branch: warn and refuse to set a branch as a tracking branch of
    itself.

 Documentation/git-branch.txt |    8 ++++++-
 branch.c                     |   45 ++++++++++++++++++++++++++++++-----------
 builtin-branch.c             |    2 +
 cache.h                      |    1 +
 t/t6040-tracking-info.sh     |   21 +++++++++++++++++++
 5 files changed, 64 insertions(+), 13 deletions(-)

^ permalink raw reply

* [PATCH 1/2] Add branch --set-upstream
From: Ilari Liusvaara @ 2010-01-18 20:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Matthieu Moy
In-Reply-To: <1263847452-29634-1-git-send-email-ilari.liusvaara@elisanet.fi>

Add --set-upstream option to branch that works like --track, except that
when branch exists already, its upstream info is changed without changing
the ref value.

Based-on-patch-from: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
---
 Documentation/git-branch.txt |    8 +++++++-
 branch.c                     |   31 +++++++++++++++++++++----------
 builtin-branch.c             |    2 ++
 cache.h                      |    1 +
 t/t6040-tracking-info.sh     |   21 +++++++++++++++++++++
 5 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 0e83680..a0d6a7a 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 'git branch' [--color | --no-color] [-r | -a]
 	[-v [--abbrev=<length> | --no-abbrev]]
 	[(--merged | --no-merged | --contains) [<commit>]]
-'git branch' [--track | --no-track] [-l] [-f] <branchname> [<start-point>]
+'git branch' [--set-upstream | --track | --no-track] [-l] [-f] <branchname> [<start-point>]
 'git branch' (-m | -M) [<oldbranch>] <newbranch>
 'git branch' (-d | -D) [-r] <branchname>...
 
@@ -129,6 +129,12 @@ start-point is either a local or remote branch.
 	Do not set up "upstream" configuration, even if the
 	branch.autosetupmerge configuration variable is true.
 
+--set-upstream::
+	If specified branch does not exist yet or if '--force' has been
+	given, acts exactly like '--track'. Otherwise sets up configuration
+	like '--track' would when creating the branch, except that where
+	branch points to is not changed.
+
 --contains <commit>::
 	Only list branches which contain the specified commit.
 
diff --git a/branch.c b/branch.c
index 05ef3f5..40d3c45 100644
--- a/branch.c
+++ b/branch.c
@@ -108,6 +108,7 @@ static int setup_tracking(const char *new_ref, const char *orig_ref,
 		switch (track) {
 		case BRANCH_TRACK_ALWAYS:
 		case BRANCH_TRACK_EXPLICIT:
+		case BRANCH_TRACK_OVERRIDE:
 			break;
 		default:
 			return 1;
@@ -128,18 +129,25 @@ void create_branch(const char *head,
 		   const char *name, const char *start_name,
 		   int force, int reflog, enum branch_track track)
 {
-	struct ref_lock *lock;
+	struct ref_lock *lock = NULL;
 	struct commit *commit;
 	unsigned char sha1[20];
 	char *real_ref, msg[PATH_MAX + 20];
 	struct strbuf ref = STRBUF_INIT;
 	int forcing = 0;
+	int dont_change_ref = 0;
+	int explicit_tracking = 0;
+
+	if (track == BRANCH_TRACK_EXPLICIT || track == BRANCH_TRACK_OVERRIDE)
+		explicit_tracking = 1;
 
 	if (strbuf_check_branch_ref(&ref, name))
 		die("'%s' is not a valid branch name.", name);
 
 	if (resolve_ref(ref.buf, sha1, 1, NULL)) {
-		if (!force)
+		if (!force && track == BRANCH_TRACK_OVERRIDE)
+			dont_change_ref = 1;
+		else if (!force)
 			die("A branch named '%s' already exists.", name);
 		else if (!is_bare_repository() && !strcmp(head, name))
 			die("Cannot force update the current branch.");
@@ -153,12 +161,12 @@ void create_branch(const char *head,
 	switch (dwim_ref(start_name, strlen(start_name), sha1, &real_ref)) {
 	case 0:
 		/* Not branching from any existing branch */
-		if (track == BRANCH_TRACK_EXPLICIT)
+		if (explicit_tracking)
 			die("Cannot setup tracking information; starting point is not a branch.");
 		break;
 	case 1:
 		/* Unique completion -- good, only if it is a real ref */
-		if (track == BRANCH_TRACK_EXPLICIT && !strcmp(real_ref, "HEAD"))
+		if (explicit_tracking && !strcmp(real_ref, "HEAD"))
 			die("Cannot setup tracking information; starting point is not a branch.");
 		break;
 	default:
@@ -170,9 +178,11 @@ void create_branch(const char *head,
 		die("Not a valid branch point: '%s'.", start_name);
 	hashcpy(sha1, commit->object.sha1);
 
-	lock = lock_any_ref_for_update(ref.buf, NULL, 0);
-	if (!lock)
-		die_errno("Failed to lock ref for update");
+	if (!dont_change_ref) {
+		lock = lock_any_ref_for_update(ref.buf, NULL, 0);
+		if (!lock)
+			die_errno("Failed to lock ref for update");
+	}
 
 	if (reflog)
 		log_all_ref_updates = 1;
@@ -180,15 +190,16 @@ void create_branch(const char *head,
 	if (forcing)
 		snprintf(msg, sizeof msg, "branch: Reset from %s",
 			 start_name);
-	else
+	else if (!dont_change_ref)
 		snprintf(msg, sizeof msg, "branch: Created from %s",
 			 start_name);
 
 	if (real_ref && track)
 		setup_tracking(name, real_ref, track);
 
-	if (write_ref_sha1(lock, sha1, msg) < 0)
-		die_errno("Failed to write ref");
+	if (!dont_change_ref)
+		if (write_ref_sha1(lock, sha1, msg) < 0)
+			die_errno("Failed to write ref");
 
 	strbuf_release(&ref);
 	free(real_ref);
diff --git a/builtin-branch.c b/builtin-branch.c
index ddc9f2d..970cf31 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -564,6 +564,8 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
 		OPT__VERBOSE(&verbose),
 		OPT_SET_INT('t', "track",  &track, "set up tracking mode (see git-pull(1))",
 			BRANCH_TRACK_EXPLICIT),
+		OPT_SET_INT( 0, "set-upstream",  &track, "change upstream info",
+			BRANCH_TRACK_OVERRIDE),
 		OPT_BOOLEAN( 0 , "color",  &branch_use_color, "use colored output"),
 		OPT_SET_INT('r', NULL,     &kinds, "act on remote-tracking branches",
 			REF_REMOTE_BRANCH),
diff --git a/cache.h b/cache.h
index cf36c81..e9ec537 100644
--- a/cache.h
+++ b/cache.h
@@ -553,6 +553,7 @@ enum branch_track {
 	BRANCH_TRACK_REMOTE,
 	BRANCH_TRACK_ALWAYS,
 	BRANCH_TRACK_EXPLICIT,
+	BRANCH_TRACK_OVERRIDE,
 };
 
 enum rebase_setup_type {
diff --git a/t/t6040-tracking-info.sh b/t/t6040-tracking-info.sh
index 664b0f8..1785e17 100755
--- a/t/t6040-tracking-info.sh
+++ b/t/t6040-tracking-info.sh
@@ -89,4 +89,25 @@ test_expect_success 'status when tracking annotated tags' '
 	grep "set up to track" actual &&
 	git checkout heavytrack
 '
+
+test_expect_success 'setup tracking with branch --set-upstream on existing branch' '
+	git branch from-master master &&
+	test_must_fail git config branch.from-master.merge > actual &&
+	git branch --set-upstream from-master master &&
+	git config branch.from-master.merge > actual &&
+	grep -q "^refs/heads/master$" actual
+'
+
+test_expect_success '--set-upstream does not change branch' '
+	git branch from-master2 master &&
+	test_must_fail git config branch.from-master2.merge > actual &&
+	git rev-list from-master2 &&
+	git update-ref refs/heads/from-master2 from-master2^ &&
+	git rev-parse from-master2 >expect2 &&
+	git branch --set-upstream from-master2 master &&
+	git config branch.from-master.merge > actual &&
+	git rev-parse from-master2 >actual2 &&
+	grep -q "^refs/heads/master$" actual &&
+	cmp expect2 actual2
+'
 test_done
-- 
1.6.6.199.gff4b0

^ permalink raw reply related

* [PATCH 2/2] branch: warn and refuse to set a branch as a tracking branch of itself.
From: Ilari Liusvaara @ 2010-01-18 20:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Matthieu Moy
In-Reply-To: <1263847452-29634-1-git-send-email-ilari.liusvaara@elisanet.fi>

From: Matthieu Moy <Matthieu.Moy@imag.fr>

Previous patch allows commands like "git branch foo foo --track", which
doesn't make much sense. Warn the user and don't change the configuration
in this case. Don't die to let the caller finish its job in such case.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
---
 branch.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/branch.c b/branch.c
index 40d3c45..9e1f63e 100644
--- a/branch.c
+++ b/branch.c
@@ -49,9 +49,19 @@ static int should_setup_rebase(const char *origin)
 
 void install_branch_config(int flag, const char *local, const char *origin, const char *remote)
 {
+	const char *shortname = remote + 11;
+	int remote_is_branch = !prefixcmp(remote, "refs/heads/");
 	struct strbuf key = STRBUF_INIT;
 	int rebasing = should_setup_rebase(origin);
 
+	if (remote_is_branch
+	    && !strcmp(local, shortname)
+	    && !origin) {
+		warning("Not setting branch %s as its own upstream.",
+			local);
+		return;
+	}
+
 	strbuf_addf(&key, "branch.%s.remote", local);
 	git_config_set(key.buf, origin ? origin : ".");
 
@@ -71,8 +81,8 @@ void install_branch_config(int flag, const char *local, const char *origin, cons
 		strbuf_addstr(&key, origin ? "remote" : "local");
 
 		/* Are we tracking a proper "branch"? */
-		if (!prefixcmp(remote, "refs/heads/")) {
-			strbuf_addf(&key, " branch %s", remote + 11);
+		if (remote_is_branch) {
+			strbuf_addf(&key, " branch %s", shortname);
 			if (origin)
 				strbuf_addf(&key, " from %s", origin);
 		}
-- 
1.6.6.199.gff4b0

^ permalink raw reply related

* Re: Unmodified submodules shows up as dirty with 1.6.6.443.gd7346
From: Jens Lehmann @ 2010-01-18 20:50 UTC (permalink / raw)
  To: Gustaf Hendeby; +Cc: Junio C Hamano, Jacob Helwig, git
In-Reply-To: <7vr5pnqqem.fsf@alter.siamese.dyndns.org>

Am 18.01.2010 18:22, schrieb Junio C Hamano:
> I am suspecting that your "these files ... are named differently on
> different systems" may fall into the same category.  Your build may not
> produce "frotz.linux" when compiled on a FreeBSD box (and "frotz.fbsd" on
> a Linux box), but would it hurt more than it helps to list them in the
> same .gitignore to cover both?

If you keep source directories separate from directories for compiler
output you can easily simplify your .gitignore by negating it. So
instead of putting a line into the .gitignore for every file type you
do /not/ want, exclude everything in the build directory and add
exceptions for the files you /do/ want to track.

Then a .gitignore could look like this for building with VC6, VC8 and
Makefile based build systems:

path/to/build/directory/*
!*.dsw
!*.dsp
!*.sln
!*.vcproj
!Makefile

^ permalink raw reply

* Забара ♥Armin_Van_Buuren♥ Тимур добавил Вас в друзья на сайте ВКонтакте.ру
From: ВКонтакте.ру @ 2010-01-18 21:06 UTC (permalink / raw)
  To: Здравствуйте

Здравствуйте,

Забара ♥Armin_Van_Buuren♥ Тимур добавил Вас в друзья на сайте ВКонтакте.ру

Вы можете зайти на сайт и просмотреть страницы Ваших друзей, используя
Ваш e-mail и автоматически созданный пароль: TxvUlJH7

ВКонтакте.ру - сайт, который ежедневно позволяет десяткам миллионов людей находить старых друзей и оставаться с ними на связи, делиться фотографиями
и событиями из жизни.

Чтобы войти на сайт, пройдите по ссылке:
http://vk.com/login.php?regemail=git@vger.kernel.org#TxvUlJH7

Внимание: Ваша регистрация не будет активирована, если Вы проигнорируете
это приглашение.

Желаем удачи!
С уважением,
Администрация ВКонтакте.ру

^ permalink raw reply

* builtin-apply.c: fix the --- and +++ header filename consistency check
From: Andreas Gruenbacher @ 2010-01-18 21:37 UTC (permalink / raw)
  To: git

gitdiff_verify_name() only did a filename prefix check because of an
off-by-one error.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
---
 builtin-apply.c            |    2 +-
 t/t4133-apply-filenames.sh |   38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletions(-)
 create mode 100755 t/t4133-apply-filenames.sh

diff --git a/builtin-apply.c b/builtin-apply.c
index 541493e..c8be66e 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -686,7 +686,7 @@ static char *gitdiff_verify_name(const char *line, int 
isnull, char *orig_name,
 		if (isnull)
 			die("git apply: bad git-diff - expected /dev/null, got %s on line %d", 
name, linenr);
 		another = find_name(line, NULL, p_value, TERM_TAB);
-		if (!another || memcmp(another, name, len))
+		if (!another || memcmp(another, name, len + 1))
 			die("git apply: bad git-diff - inconsistent %s filename on line %d", 
oldnew, linenr);
 		free(another);
 		return orig_name;
diff --git a/t/t4133-apply-filenames.sh b/t/t4133-apply-filenames.sh
new file mode 100755
index 0000000..3421807
--- /dev/null
+++ b/t/t4133-apply-filenames.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Andreas Gruenbacher
+#
+
+test_description='git apply filename consistency check'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	cat > bad1.patch <<EOF
+diff --git a/f b/f
+new file mode 100644
+index 0000000..d00491f
+--- /dev/null
++++ b/f-blah
+@@ -0,0 +1 @@
++1
+EOF
+	cat > bad2.patch <<EOF
+diff --git a/f b/f
+deleted file mode 100644
+index d00491f..0000000
+--- b/f-blah
++++ /dev/null
+@@ -1 +0,0 @@
+-1
+EOF
+'
+
+test_expect_success 'apply diff with inconsistent filenames in headers' '
+	test_must_fail git apply bad1.patch 2>err
+	grep "inconsistent new filename" err
+	test_must_fail git apply bad2.patch 2>err
+	grep "inconsistent old filename" err
+'
+
+test_done
-- 
1.6.6.243.gff6d2

^ permalink raw reply related

* [PATCH] grep: don't segfault
From: Jim Meyering @ 2010-01-18 21:55 UTC (permalink / raw)
  To: git list


Today, git grep -l '^	' failed with a segfault.
I tracked it down to a26345b6085340ebd61e156aa8154a80196bee0f.
Debugging it, I found that look_ahead calls regexec with a "bol"
argument pointing to a buffer that is not NUL-terminated.
Yet regexec requires the NUL-termination.  Without that, its
initial strlen will read beyond the end of the buffer.

    $ valgrind ./git grep -l '^  '
    Memcheck, a memory error detector
    Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
    Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
    Command: ./git grep -l ^

    Conditional jump or move depends on uninitialised value(s)
       at 0x4A06318: __GI_strlen (mc_replace_strmem.c:284)
       by 0x3AF00C04C4: regexec@@GLIBC_2.3.4 (regexec.c:243)
       by 0x496A0A: look_ahead (grep.c:644)
       by 0x496D24: grep_buffer_1 (grep.c:726)
       by 0x497161: grep_buffer (grep.c:852)
       by 0x432720: grep_file (builtin-grep.c:201)
       by 0x432854: grep_cache (builtin-grep.c:230)
       by 0x433F2F: cmd_grep (builtin-grep.c:621)
       by 0x40488D: run_builtin (git.c:257)
       by 0x404A28: handle_internal_command (git.c:401)
       by 0x404B1D: run_argv (git.c:443)
       by 0x404C9A: main (git.c:514)
    ...

That buffer is read by builtin-grep.c(grep_file) and here's the code.
You can see from the allocation of "sz + 1" and read of one less
that there is already room for the required trailing NUL byte:

	sz = xsize_t(st.st_size);
	i = open(filename, O_RDONLY);
	if (i < 0)
		goto err_ret;
	data = xmalloc(sz + 1);
	if (st.st_size != read_in_full(i, data, sz)) {
		error("'%s': short read %s", filename, strerror(errno));
		close(i);
		free(data);
		return 0;
	}
	close(i);
>>>     data[sz] = 0;    <======  added line
	if (opt->relative && opt->prefix_length)
		filename = quote_path_relative(filename, -1, &buf, opt->prefix);
	i = grep_buffer(opt, filename, data, sz);

Signed-off-by: Jim Meyering <meyering@redhat.com>
---
This is relative to the head of git's next branch,
d7346b15ca5bda881f5abc37b0844e9b35c8cffc
I noticed the segfault by running the same command in
libvirt's git repository, but it's not consistently reproducible.

 builtin-grep.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 4dd5aaf..da854fa 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -196,6 +196,7 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 		return 0;
 	}
 	close(i);
+	data[sz] = 0;
 	if (opt->relative && opt->prefix_length)
 		filename = quote_path_relative(filename, -1, &buf, opt->prefix);
 	i = grep_buffer(opt, filename, data, sz);
--
1.6.4.4.1.ga2634

^ permalink raw reply related

* Re: [PATCH v2 06/14] mingw: use real pid
From: Erik Faye-Lund @ 2010-01-18 22:33 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: msysgit, git
In-Reply-To: <40aa078e1001160112k68c0daafnd6abcb715e1176fe@mail.gmail.com>

On Sat, Jan 16, 2010 at 10:12 AM, Erik Faye-Lund
<kusmabite@googlemail.com> wrote:
> On Sat, Jan 16, 2010 at 9:03 AM, Johannes Sixt <j6t@kdbg.org> wrote:
>> On Freitag, 15. Januar 2010, Erik Faye-Lund wrote:
>>> On Fri, Jan 15, 2010 at 11:30 PM, Johannes Sixt <j6t@kdbg.org> wrote:
>>> > On Freitag, 15. Januar 2010, Erik Faye-Lund wrote:
>>> >> @@ -729,7 +729,7 @@ static pid_t mingw_spawnve(const char *cmd, const
>>> >> char **argv, char **env, return -1;
>>> >>       }
>>> >>       CloseHandle(pi.hThread);
>>> >> -     return (pid_t)pi.hProcess;
>>> >> +     return (pid_t)pi.dwProcessId;
>>> >>  }
>>> >
>>> > You are not using the pi.hProcess anymore, so you must close it.
>>>
>>> No. If I do, the pid becomes invalid after the process is finished,
>>> and waitpid won't work. I couldn't find anywhere were we actually were
>>> closing the handle, even after it was finished. So I don't think we
>>> leak any more than we already did (for non-daemon purposes).
>>
>> Previously, this handle was closed by _cwait() (it was the "pid"), so we
>> didn't leak it.
>
> Oh, I see. My planned route with this (before I looked for where the
> handle was closed), was to maintain some sort of list of each started
> PID and their handle, and lookup in that list instead of using
> OpenProcess. I guess that would solve the problem here, but it feels a
> bit nasty. Not as nasty as introducing a leak, though.
>

What I had in mind was something along these lines:

diff --git a/compat/mingw.c b/compat/mingw.c
index e2821b3..71201d0 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -638,6 +638,13 @@ static int env_compare(const void *a, const void *b)
 	return strcasecmp(*ea, *eb);
 }

+struct pid_info {
+	pid_t pid;
+	HANDLE proc;
+};
+static struct pid_info *pinfo;
+static int num_pinfo;
+
 static pid_t mingw_spawnve(const char *cmd, const char **argv, char **env,
 			   int prepend_cmd)
 {
@@ -729,6 +736,13 @@ static pid_t mingw_spawnve(const char *cmd, const
char **argv, char **env,
 		return -1;
 	}
 	CloseHandle(pi.hThread);
+
+	/* store process handle */
+	num_pinfo++;
+	pinfo = xrealloc(pinfo, sizeof(struct pid_info) * num_pinfo);
+	pinfo[num_pinfo - 1].pid = pi.dwProcessId;
+	pinfo[num_pinfo - 1].proc = pi.hProcess;
+
 	return (pid_t)pi.dwProcessId;
 }

@@ -1536,6 +1550,7 @@ int waitpid(pid_t pid, int *status, unsigned options)
 	}

 	if (options == 0) {
+		int i;
 		if (WaitForSingleObject(h, INFINITE) != WAIT_OBJECT_0) {
 			CloseHandle(h);
 			return 0;
@@ -1544,6 +1559,19 @@ int waitpid(pid_t pid, int *status, unsigned options)
 		if (status)
 			GetExitCodeProcess(h, (LPDWORD)status);

+		for (i = 0; i < num_pinfo; ++i)
+			if (pinfo[i].pid == pid)
+				break;
+
+		if (i < num_pinfo) {
+			CloseHandle(pinfo[i].proc);
+			memmove(pinfo + i, pinfo + i + 1,
+			    sizeof(struct pid_info) * (num_pinfo - i - 1));
+			num_pinfo--;
+			pinfo = xrealloc(pinfo,
+			    sizeof(struct pid_info) * num_pinfo);
+		}
+
 		CloseHandle(h);
 		return pid;
 	}



-- 
Erik "kusma" Faye-Lund

^ permalink raw reply related

* Re: [PATCH] grep: don't segfault
From: Linus Torvalds @ 2010-01-18 22:40 UTC (permalink / raw)
  To: Jim Meyering; +Cc: git list
In-Reply-To: <871vhnhydg.fsf@meyering.net>

On Mon, 18 Jan 2010, Jim Meyering wrote:
> 
> 	sz = xsize_t(st.st_size);
> 	i = open(filename, O_RDONLY);
> 	if (i < 0)
> 		goto err_ret;
> 	data = xmalloc(sz + 1);
> 	if (st.st_size != read_in_full(i, data, sz)) {
> 		error("'%s': short read %s", filename, strerror(errno));
> 		close(i);
> 		free(data);
> 		return 0;
> 	}
> 	close(i);
> >>>     data[sz] = 0;    <======  added line

Heh. Fredrik's threaded grep patch also fixed this, although he did the 
"data[*sz] = 0;" up front.

So an obvious ack on the whole thing (the xmalloc() itself makes it 
obvious that that thing should be NULL-terminated).

That said, I also suspect that we have a _lot_ of these patterns, and I 
wonder if we should just add a 

	void *read_file(int fd, struct stat *st);

helper function, which reads a file and NULL-terminates it. Doing a

	git grep -5 read_in_full

on the git sources, and it looks like there are several cases that 
basically do that xmalloc+read_in_full+error handling pattern.

		Linus

^ permalink raw reply

* "warning: Updating the currently checked out branch may cause confusion" on bare repositories
From: Adam Megacz @ 2010-01-18 22:41 UTC (permalink / raw)
  To: git

It seems that the message "warning: Updating the currently checked out
branch may cause confusion" is unnecessary when pushing to a bare
"myproject.git" repository (instead of a "myproject/.git" repository).

Is this the case?

If so, perhaps the warning should be omitted when the target is a bare
repository.

  - a

^ permalink raw reply

* Re: Filenames and prefixes in extended diffs
From: Andreas Gruenbacher @ 2010-01-18 23:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfx69k0bu.fsf@alter.siamese.dyndns.org>

On Thursday 14 January 2010 01:16:21 am Junio C Hamano wrote:
> Andreas Gruenbacher <agruen@suse.de> writes:
> > Can git be changed to ...
> 
> Just to save your time coming up with more ways to *change* git diff...
> 
> Even though I wouldn't say _any_ change is too late to bring in, change in
> the output format from "git diff" family _must_ be usable by "git apply"
> people have been using for the last 4 years or so.
> 
> Suppose your updated version of "git diff" with a certain set of options
> produces output A, which is different from the output B you would get out
> of today's "git diff" that is run with the same set of options.
> 
> If "git apply" people have been using understands B (i.e. current output)
> and does something, the format change between A and B must be designed in
> such a way that the same "git apply" accepts A (i.e. your output) and do
> the same thing.
> 
> Two examples:
> 
>  - "git diff -M" (or "git show -M") is _defined_ to show the filenames
>    without prefix on "rename from" line, and deployed "git apply" relies
>    on this definition to apply the patch to the file the patch was meant
>    to apply.  If your modified "git diff -M" changes it to add the prefix,
>    and existing "git apply" changes behaviour (either by rejecting your
>    output, or applying the patch to a wrong file), then such a change has
>    *no chance* of getting in.  It is merely a breakage.

Git apply is currently broken in some (uncommon) cases.  Consider the 
following two patches:

	$ cat add.diff
	diff --git a/d/f b/d/f
	new file mode 100644
	index 0000000..6a69f92
	--- /dev/null
	+++ b/d/f
	@@ -0,0 +1 @@
	+f

	$ cat rename.diff
	diff --git a/d/f b/d/g
	similarity index 100%
	rename from d/f
	rename to d/g

They apply fine after each other with plain "git apply".  When you try to 
apply them to a different location, things break though:

	$ git apply --directory e -p2 add.diff
	$ git apply --directory e -p2 rename.diff
	error: e/d/f: No such file or directory

Had the second patch not been generated with "git diff -M", things would just 
have worked; in other words, the -M format is broken.

I think the easiest way to fix this in "git apply" would be to figure out what 
the missing prefixes are in the rename and copy lines (in this case, just "a/" 
and "b/"), and to prepend those prefixes to the decoded filenames before 
stripping off pathname components.  This is the same as just taking the 
filenames from the "diff --git" line, and ignoring the filenames in the "copy 
from", "copy to", "rename from", and "rename to" headers.

I don't see a way how to fix this in the existing headers in a backwards 
compatible way.  Do you?

Independent of whether and how this is addressed, can I please at least have 
the "diff --git" line parsing problem fixed so that filenames which contain 
spaces are put in double quotes there?  Then I can at least ignore all the 
prefix-less filenames in GNU patch and still make it understand git's output.

>  - If you say "git diff --src-prefix=a/b/c --dst-prefix=x/y", it _might_
>    produce something "git apply" won't grok (I haven't checked this,
>    though).  You can suggest to change the output from such a case to work
>    better.  We didn't work as expected so a change _could_ be a fix.

The output format for that is fine.

> The output from "git diff --no-index" is an exception to the above rule.
> It is primarily for people who have unmanaged contents and want to use
> features of the git diff engine that are not found in other people's diff
> implementations (e.g. wordwise colored diff), and the header part of its
> output does not currently follow "git diff" convention to be grokkable by
> "git apply".
> 
> Fixing _that_ is a welcome change, but I suspect that there are corner
> cases, e.g. "git diff --no-index frotz-1.2.36/ /tmp/frotz/" (i.e. you have
> a pristine version in frotz-1.2.36 directory, but your modified version is
> in /tmp/frtoz/) that might make fixing it fundamentally impossible (I
> haven't looked into it for a long time, so it could be easy, but my gut
> feeling is it isn't).

Patches with a different number of components in the from and to prefixes are 
a really bad idea.  (GNU patch will prefer the pathname with the fewer 
components, but this yould just as well be the wrong one.)

Thanks,
Andreas

^ permalink raw reply

* git describe --contains
From: John Tapsell @ 2010-01-18 23:37 UTC (permalink / raw)
  To: Git List

Hi all,

Doing:   git describe --contains SHA

on a SHA that is not contained by any tags gives a confusing error
message that it cannot describe the SHA.
Could this error message be changed please to just say something like
"SHA1 is not in any tags."

^ permalink raw reply

* Re: [PATCH v3] Threaded grep
From: Fredrik Kuivinen @ 2010-01-19  0:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List, Johannes Sixt
In-Reply-To: <7vmy0bhxn1.fsf@alter.siamese.dyndns.org>

[I am adding git@vger to the Cc list as this may be interesting to
others as well.]

On Mon, Jan 18, 2010 at 23:10, Junio C Hamano <gitster@pobox.com> wrote:
> Fredrik Kuivinen <frekui@gmail.com> writes:
>
>> diff --git a/builtin-grep.c b/builtin-grep.c
>> index 24ae1ce..dc07e9e 100644
>> --- a/builtin-grep.c
>> +++ b/builtin-grep.c
>> @@ -15,11 +15,257 @@
>>  #include "grep.h"
>>  #include "quote.h"
>>
>> +#ifndef NO_PTHREADS
>> +#include "thread-utils.h"
>> +#include <pthread.h>
>> +#endif
>> +
>>  static char const * const grep_usage[] = {
>>       "git grep [options] [-e] <pattern> [<rev>...] [[--] path...]",
>>       NULL
>>  };
>>
>> +static int use_threads = 1;
>> +
>> +#ifndef NO_PTHREADS
>> +#define THREADS 8
>> +static pthread_t threads[THREADS];
>
> I had to wonder why builtin-pack-objects.c doesn't have an array like
> this.  It uses an array of structure each of which bundles per-thread
> work item data and pthread_t.  Would an arrangement like that make it
> easier to read for this patch as well?

From a cursory look at builtin-pack-objects.c I think it depends on
how you structure the work you are about to do.

In builtin-pack-objects.c the work seems to be divided evenly among
the threads at the start. When some thread don't have anything to do
it then steals work from some other thread.

In the current patch we instead add work to a FIFO from one thread and
then the consumer threads pick work from the FIFO. I think the current
structure suits the FIFO style quite well.

>> +static void* load_file(const char *filename, size_t *sz);
>
> Style: asterisks stick to identifiers, not types.  There are other
> places with the same issue in the patch, e.g.
>> +static struct work_item* get_work()
> that I'll omit from the rest of my comments.

Will fix. I looked carefully at the arguments, but forgot the return types.

>> +enum work_type {WORK_BUF, WORK_FILE};
>> +
>> +/* We use one producer thread and number_of_threads consumer
>> +   threads. The producer adds struct work_items to 'todo' and the
>> +   consumers pick work items from the same array. */
>
> Style:
>
>        /*
>         * We write multi line comments
>         * like this.
>         */

Will fix.

>> +#define TODO_SIZE 128
>
> How was this size tuned?  Can everybody (from teeny machines to beefy
> ones) use the same setting?

I simply increased it until I didn't see any more significant
performance improvements on two the boxes I have access to. As I only
have access to two boxes, it can most certainly be better tuned for
boxes with significantly different characteristics. However, I think
that 128 is reasonable in the sense that it gives a nice improvement
for everyone with more than one core.

>> +static struct work_item todo[TODO_SIZE];
>> +static int todo_start = 0;
>
> We try not to explicitly initialize statics to 0 and let BSS segment
> take care of them.

Will fix.

>> +/* This lock protects all the variables above. */
>> +static pthread_mutex_t grep_lock = PTHREAD_MUTEX_INITIALIZER;
>
> J6t had comments on these initializers; do they also apply to
> builtin-pack-objects.c?

I believe so, but I do not know. Johannes?

>> +/* This function takes ownership of 'name' and 'buf'. They will be
>> +   deallocated with free. */
>> +static int grep_buffer_async(struct grep_opt *opt, char *name, char *buf,
>> +                          size_t size)
>> +{
>> +     add_work(WORK_BUF, name, buf, size);
>> +     return 0;
>> +}
>> +
>> +static int grep_file_async(struct grep_opt *opt, char *name,
>> +                        const char *filename)
>> +{
>> +     add_work(WORK_FILE, name, (char*) filename, 0);
>> +     return 0;
>> +}
>
> If I am reading the code correctly, it seems that you read quite many
> blobs in core and queue them in todo[] (up to the size limit of the
> array), and the worker bees in run() consumes them.  Shouldn't the
> rate of feeding blobs be limited in some way (other than just having a
> fixed TODO_SIZE, but somehow linked to the rate the worker bees finish
> their work) to avoid wasting memory like this?  One approach might be
> to queue only the blob SHA-1 not the data in work item queue just like
> you queue only the filename not the contents, and have the worker bee
> to read the blob into memory.

Hm, yes that would be an improvement. One just have serialize the
calls to read_sha1_file (as I don't think read_sha1_file safely can be
called simultaneously from multiple threads).

>> ...
>> +static int grep_buffer_internal(struct grep_opt *opt, char *name, char *buf,
>> +                             size_t size)
>> +{
>> +#ifndef NO_PTHREADS
>> +     if (use_threads)
>> +             return grep_buffer_async(opt, name, buf, size);
>> +#endif
>> +     {
>> +             int hit = grep_buffer(opt, name, buf, size);
>> +             free(name);
>> +             free(buf);
>> +             return hit;
>> +     }
>> +}
>> +
>>  static int grep_config(const char *var, const char *value, void *cb)
>>  {
>>       struct grep_opt *opt = cb;
>> @@ -159,21 +405,21 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, const char
>>       if (opt->relative && opt->prefix_length) {
>>               quote_path_relative(name + tree_name_len, -1, &pathbuf, opt->prefix);
>>               strbuf_insert(&pathbuf, 0, name, tree_name_len);
>> -             name = pathbuf.buf;
>> +     } else {
>> +             strbuf_addstr(&pathbuf, name);
>>       }
>> -     hit = grep_buffer(opt, name, data, size);
>> -     strbuf_release(&pathbuf);
>> -     free(data);
>> +
>> +     hit = grep_buffer_internal(opt, strbuf_detach(&pathbuf, NULL),
>> +                                data, size);
>> +
>>       return hit;
>>  }
>
> This is quite a puzzling code.  Your grep_buffer_internal() returns
> the return value from grep_buffer_async() which always return 0 but
> then that 0 is returned to the caller as "we didn't see any hit".
> What happens if we start optimizing for a case where the end user is
> interested only in one bit of information "does it find anything, or
> is there absolutely no hits?" and wanted to add an early return
> codepath to grep_tree() and grep_cache() so that we say "Ah, we have
> seen a hit, so we don't have to look in others"?
>
> IOW, this patch makes the variable "hit" meaningless and confuses the
> readers.  This actually is the point that most worries me from longer
> term maintenance point of view.

The idea was to keep the current calling conventions as much as
possible, so that grep_cache and grep_tree could stay nearly
unchanged. So hit = 1 now means "we found something" and hit = 0 means
"we didn't find anything right now, but we may find something later
for this entry". The early return you sketched above will still
produce correct results, but it will not be an optimization for the
threaded code.

With the patch the return value from grep_file and grep_sha1 has to be
ORed with the return value from wait_all to determine if we found
something.

> How is the order of output controlled by the way?  I see you are
> keeping more than one threads from doing write_or_die() of the
> accumulated output at the same time with a mutex in work_done(), but
> if we have large files whose names come near the beginning and then a
> small file whose name come later in the input, we should defer the
> output of hits from the later smaller one before the earlier larger
> ones even if the worker bee assigned to the smaller one finishes while
> the other worker bees are still searching in the larger ones; it is
> not obvious to me where you are guaranteeing that ordering.

The output order is controlled by todo_done, todo_start, and work_item::done.

The rule is that the range [todo_done, todo_start) (modulo
ARRAY_SIZE(todo)) in "todo" contains work_items have been or are
processed by a consumer thread. If a particular work_item has been
processed, then .done = 1 and otherwise .done = 0. When a work_item is
marked as done in "work_done" we traverse a prefix of the range
[todo_done, todo_start): we stop as soon as .done = 0. We then set
todo_done to the first work_item with .done = 0.

In your example todo_done = 0 and todo_start = 2. todo[0] represents a
large file and todo[1] represents a small file. If the small file gets
done first we set todo[1].done to 1 in work_done. As todo_done = 0 and
todo[0].done = 0 we will not write any output now. When work_done is
called for todo[0], we set todo[0].done to 1. We will then write the
output for todo[0] (as todo_done = 0 and todo[0].done = 1) and after
that we write the output for todo[1].

In the current patch the mutex grep_lock is held when the output is
written. I'm guessing that this unnecessary serialization is the cause
of the waste of CPU time Linus is seeing. (Strangely enough I haven't
observed this phenomena on the boxes I have tested it on.)

> I am wondering if this can somehow be made into a change with alot
> smaller inpact, in spirit similar to "preloading".  The higher level
> loop may walk the input (be it cache, tree, or directory), issuing one
> grep_file() or grep_sha1() at a time just like the existing code, but
> the thread-aware code adds "priming" phase that (1) populates a "work
> queue" with a very small memory footprint, and that (2) starts more
> than one underlying grep on different files and blobs (up to the
> number of threads you are allowed, like your "#define THREADS 8"), and
> that (3) upon completion of one "work queue" item, the work queue item
> is marked with its result.
>
> Then grep_file() and grep_sha1() _may_ notice that the work queue item
> hasn't completed, and would wait in such a case, or just emits the
> output produced earlier (could be much earlier) by the worker bee.
>
> The low-memory-footprint work queue could be implemented as a lazy
> one, and may be very different depending on how the input is created.
> If we are grepping in the index, it could be a small array of <array
> index in active_cache[], done-bit, result-bit, result-strbuf> with a
> single integer that is a pointer into the index to indicate up to
> which index entry the work queue has been populated.

I will have to think about this a bit more. It's getting late here.

- Fredrik

^ permalink raw reply

* bug: git-bundle create foo --stdin -> segfault
From: Joey Hess @ 2010-01-19  0:26 UTC (permalink / raw)
  To: git; +Cc: 554682

[-- Attachment #1: Type: text/plain, Size: 896 bytes --]

joey@gnu:~/tmp/new>echo master | git bundle create ../my.bundle --stdin
zsh: segmentation fault  git bundle create ../my.bundle --stdin

I noticed that git bundle --stdin actually attempts to read from stdin
past EOF. You can see this if you manually type into its stdin.

% git-bundle create ../bundle --stdin
master
^D
master
^D
fatal: Refusing to create empty bundle.

The first stdin read is done by the internal call to rev-list --stdin.
The second stdin read is done by the call to setup_revisions(),
which has its own handler for --stdin.

Bug seen with git version 1.6.5.7 / 1.6.6.243.gff6d2

I also tried going back to 22568f0a336ac37ae7329c917857b455839d1d09, but
still see a bug with Adam Brewster's initial code to add --stdin to
git-bundle. That code still tries to read stdin twice. If it sees
"master" both times, it does create a bundle.

-- 
see shy jo

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [PATCH v2] Performance optimization for detection of modified submodules
From: Junio C Hamano @ 2010-01-19  1:44 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Git Mailing List
In-Reply-To: <4B54C3EA.9090200@web.de>

Jens Lehmann <Jens.Lehmann@web.de> writes:

> - Changed the type of the new dirty_submodule flags to unsigned.

Why?  An unsigned used in place of a bool raises an eyebrow as it is more
common to use "int" (the most natural type on the platform).  Going
against tradition makes readers waste time wondering if there were some
other reason why the code couldn't use "int" and had to use "unsigned"
(e.g. "Hmmpphh, it looked like a mere boolean but the use of 'unsigned'
suggests there might be something deeper going on.  Is this used as a
bitfield?  Does this count and cannot go negative?").

> @@ -173,12 +174,16 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  			if (silent_on_removed)
>  				continue;
>  			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
> -				       ce->sha1, ce->name);
> +				       ce->sha1, ce->name, 0);

Removed from work tree; cannot be dirty---good.

> +		if (S_ISGITLINK(ce->ce_mode)
> +		    && (!changed || (revs->diffopt.output_format & DIFF_FORMAT_PATCH))
> +		    && is_submodule_modified(ce->name)) {
> +			changed = 1;
> +			dirty_submodule = 1;
> +		}

If the index records a submodule commit, and the commit checked out in the
submodule is different from that commit, ce_compare_gitlink() called from
ce_match_stat() would have already said "changed".  If we want "-dirty",
we need to call is_submodule_modified(), but otherwise we don't.  Looks
good.

Does DIFF_FORMAT_PATCH cover all cases where we may want a reliable value
in "dirty_submodule" here?  Should use of "--submodule=log" affect this
decision?

> @@ -188,7 +193,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  		newmode = ce_mode_from_stat(ce, st.st_mode);
>  		diff_change(&revs->diffopt, oldmode, newmode,
>  			    ce->sha1, (changed ? null_sha1 : ce->sha1),
> -			    ce->name);
> +			    ce->name, 0, dirty_submodule);

LHS is always index and RHS is work tree whose dirtiness is in
dirty_submodule; good.

> @@ -204,16 +209,18 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  static void diff_index_show_file(struct rev_info *revs,
>  				 const char *prefix,
>  				 struct cache_entry *ce,
> -				 const unsigned char *sha1, unsigned int mode)
> +				 const unsigned char *sha1, unsigned int mode,
> +				 unsigned dirty_submodule)
>  {
>  	diff_addremove(&revs->diffopt, prefix[0], mode,
> -		       sha1, ce->name);
> +		       sha1, ce->name, dirty_submodule);

Mental note to myself.  prefix[0] is either '-' (removed from the work
tree) or '+' (added to the work tree).  Added submodule could be
immediately dirty.

> @@ -251,15 +263,17 @@ static void show_new_file(struct rev_info *revs,
>  {
> ...
> -	diff_index_show_file(revs, "+", new, sha1, mode);
> +	diff_index_show_file(revs, "+", new, sha1, mode, dirty_submodule);

And this caller handles that case correctly; good.

> @@ -270,11 +284,13 @@ static int show_modified(struct rev_info *revs,
>  {
>  	unsigned int mode, oldmode;
>  	const unsigned char *sha1;
> +	unsigned dirty_submodule = 0;
>
> -	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
> +	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
> +			  &dirty_submodule, revs->diffopt.output_format) < 0) {
>  		if (report_missing)
>  			diff_index_show_file(revs, "-", old,
> -					     old->sha1, old->ce_mode);
> +					     old->sha1, old->ce_mode, 0);

Again, removed from work tree cannot be dirty; good.

> @@ -309,7 +325,7 @@ static int show_modified(struct rev_info *revs,
>  		return 0;
>
>  	diff_change(&revs->diffopt, oldmode, mode,
> -		    old->sha1, sha1, old->name);
> +		    old->sha1, sha1, old->name, 0, dirty_submodule);
>  	return 0;
>  }

Comparing between a tree entry and either an index entry or a file in the
work tree.  When cached (i.e. not looking at work tree), get_stat_data()
doesn't touch dirty_submodule, so this won't make noises about submodule
dirtyness, which is good.

> @@ -356,7 +372,7 @@ static void do_oneway_diff(struct unpack_trees_options *o,
>  	 * Something removed from the tree?
>  	 */
>  	if (!idx) {
> -		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
> +		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode, 0);

Removed from the work tree and canont be dirty; good.

> diff --git a/diff.c b/diff.c
> index 012b3d3..f130a36 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3736,7 +3736,7 @@ int diff_result_code(struct diff_options *opt, int status)
>  void diff_addremove(struct diff_options *options,
>  		    int addremove, unsigned mode,
>  		    const unsigned char *sha1,
> -		    const char *concatpath)
> +		    const char *concatpath, unsigned dirty_submodule)

A removed submodule cannot be dirty, so we can get away with only one
dirty_submodule that always refers to the RHS (i.e. "two"); I see.

> diff --git a/diffcore.h b/diffcore.h
> index 5b63458..66687c3 100644
> --- a/diffcore.h
> +++ b/diffcore.h
> @@ -42,6 +42,7 @@ struct diff_filespec {
>  #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
>  	unsigned should_free : 1; /* data should be free()'ed */
>  	unsigned should_munmap : 1; /* data should be munmap()'ed */
> +	unsigned dirty_submodule : 1;  /* For submodules: its work tree is dirty */

By the way, we might want to consolidate these bitfields into a single 

	unsigned should_free:1,
        	 should_munmap:1,
                 dirty_submodule:1;

in a separate clean-up patch, after all the dust settles.

^ permalink raw reply

* [PATCH v2 3/5] Documentation: reset: describe new "--keep" option
From: Christian Couder @ 2010-01-19  4:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Linus Torvalds, Johannes Schindelin, Stephan Beyer,
	Daniel Barkalow, Jakub Narebski, Paolo Bonzini, Johannes Sixt,
	Stephen Boyd, Andreas Schwab, Daniel Convissor
In-Reply-To: <20100119042404.4510.48855.chriscool@tuxfamily.org>

and give an example to show how it can be used.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-reset.txt |   47 ++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index 168db08..90239f5 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -8,7 +8,7 @@ git-reset - Reset current HEAD to the specified state
 SYNOPSIS
 --------
 [verse]
-'git reset' [--mixed | --soft | --hard | --merge] [-q] [<commit>]
+'git reset' [--mixed | --soft | --hard | --merge | --keep] [-q] [<commit>]
 'git reset' [-q] [<commit>] [--] <paths>...
 'git reset' --patch [<commit>] [--] [<paths>...]
 
@@ -52,6 +52,11 @@ OPTIONS
 	and updates the files that are different between the named commit
 	and the current commit in the working tree.
 
+--keep::
+	Resets the index to match the tree recorded by the named commit,
+	but keep changes in the working tree. Aborts if the reset would
+	change files that are already modified in the working tree.
+
 -p::
 --patch::
 	Interactively select hunks in the difference between the index
@@ -93,6 +98,7 @@ in the index and in state D in HEAD.
 				--mixed  A       D     D
 				--hard   D       D     D
 				--merge (disallowed)
+				--keep  (disallowed)
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -100,6 +106,7 @@ in the index and in state D in HEAD.
 				--mixed  A       C     C
 				--hard   C       C     C
 				--merge (disallowed)
+				--keep   A       C     C
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -107,6 +114,7 @@ in the index and in state D in HEAD.
 				--mixed  B       D     D
 				--hard   D       D     D
 				--merge  D       D     D
+				--keep  (disallowed)
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -114,6 +122,7 @@ in the index and in state D in HEAD.
 				--mixed  B       C     C
 				--hard   C       C     C
 				--merge  C       C     C
+				--keep   B       C     C
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -121,6 +130,7 @@ in the index and in state D in HEAD.
 				--mixed  B       D     D
 				--hard   D       D     D
 				--merge (disallowed)
+				--keep  (disallowed)
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -128,6 +138,7 @@ in the index and in state D in HEAD.
 				--mixed  B       C     C
 				--hard   C       C     C
 				--merge  B       C     C
+				--keep   B       C     C
 
 "reset --merge" is meant to be used when resetting out of a conflicted
 merge. Any mergy operation guarantees that the work tree file that is
@@ -138,6 +149,14 @@ between the index and the work tree, then it means that we are not
 resetting out from a state that a mergy operation left after failing
 with a conflict. That is why we disallow --merge option in this case.
 
+"reset --keep" is meant to be used when remove some of the last
+commits in the current branch while keep changes in the working tree.
+If there could be conflicts between the changes in the commit we want
+to remove and the changes in the working tree we want to keep, the
+reset is disallowed. That's why it is disallowed if there are both
+changes between the working tree and HEAD, and between HEAD and the
+target.
+
 The following tables show what happens when there are unmerged
 entries:
 
@@ -147,6 +166,7 @@ entries:
 				--mixed  X       B     B
 				--hard   B       B     B
 				--merge  B       B     B
+				--keep  (disallowed)
 
       working index HEAD target         working index HEAD
       ----------------------------------------------------
@@ -154,6 +174,7 @@ entries:
 				--mixed  X       A     A
 				--hard   A       A     A
 				--merge  A       A     A
+				--keep   X       A     A
 
 X means any state and U means an unmerged index.
 
@@ -325,6 +346,30 @@ $ git add frotz.c                           <3>
 <2> This commits all other changes in the index.
 <3> Adds the file to the index again.
 
+Keep changes in working tree while discarding some previous commits::
++
+Suppose you are working on something and you commit it, and then you
+continue working a bit more, but now you think that what you have in
+your working tree should be in another branch that has nothing to do
+with what you commited previously. You can start a new branch and
+reset it while keeping the changes in your work tree.
++
+------------
+$ git tag start
+$ git branch branch1
+$ edit
+$ git commit ...                            <1>
+$ edit
+$ git branch branch2                        <2>
+$ git reset --keep start                    <3>
+------------
++
+<1> This commits your first edits in branch1.
+<2> This creates branch2, but unfortunately it contains the previous
+commit that you don't want in this branch.
+<3> This removes the unwanted previous commit, but this keeps the
+changes in your working tree.
+
 Author
 ------
 Written by Junio C Hamano <gitster@pobox.com> and Linus Torvalds <torvalds@osdl.org>
-- 
1.6.6.271.gc8799

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox