Git development
* Re: git 1.2 works on Solaris, AIX
From: Junio C Hamano @ 2006-02-14  6:13 UTC (permalink / raw)
  To: Jason Riedy; +Cc: git
In-Reply-To: <12579.1139893961@lotus.CS.Berkeley.EDU>

Thanks.  So let's leave the 1.2.X maintenance series as is at
least for now.

* Re: diffstat wierdness with 'git format-patch' output
From: Junio C Hamano @ 2006-02-14  6:09 UTC (permalink / raw)
  To: Greg KH; +Cc: git
In-Reply-To: <20060214055648.GA592@kroah.com>

Greg KH <greg@kroah.com> writes:

> Hm, in looking at it closer, it's probably the last two lines of the
> file, the signature that git format-patch adds to the message:
> 	-- 
> 	1.2.0

If that is the case, it's unfortunate that diffstat is broken
and is not properly counting lines to tell which lines are part
of the patch and which lines are not.

Have you tried "git apply --stat" instead?

> Any way to suppress these?

Sorry, there is no option to disable that, but the stuff is
GPLv2 so you can do whatever ;-).

The string "-- \n" is an established convention to mark the
beginning of the signature (or whatever immaterial stuff
follows the message contents), so changing the marker is
pointless -- if we want the option it should be to delete those
two lines altogether.

I personally find it useful to see the trend in which versions of
the tools people use on the public mailing list, and that was the
primary reason it is there.

Have you tried "git apply --stat --summary" instead?
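
As a stopgap for the diffstat case discussed above, the two trailing
lines can be dropped before the file is handed to diffstat.  The
following is only a minimal sketch: strip_signature is a hypothetical
helper name, and it assumes the last line that is exactly "-- " starts
the signature.  Presumably diffstat counts that line as one more
removed line, which would match the extra deletion in the report.

```python
def strip_signature(patch_text):
    """Return patch_text without a trailing "-- " signature block.

    Assumption: the last line that is exactly "-- " starts the
    signature; everything from there to the end is dropped.
    """
    lines = patch_text.splitlines(keepends=True)
    # Scan from the end so an earlier diff line that happens to read
    # "-- " (a removed line whose content is "- ") is left alone.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].rstrip("\r\n") == "-- ":
            return "".join(lines[:i])
    return patch_text  # no signature marker found
```

The result can then be written to a temporary file or piped into
diffstat in place of the original patch file.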

* Re: diffstat wierdness with 'git format-patch' output
From: Greg KH @ 2006-02-14  5:56 UTC (permalink / raw)
  To: git
In-Reply-To: <20060214055425.GA32261@kroah.com>

On Mon, Feb 13, 2006 at 09:54:25PM -0800, Greg KH wrote:
> I was trying to use the built-in git tools to send patches off, instead
> of my horribly hacked up scripts that use the git low-level stuff, when
> I noticed that git format-patch's output confuses diffstat a bit, and
> causes it to add another line to its count.
> 
> This isn't good when I do a 'diffstat -p1 *.txt' of the output and add
> it to an email to send off for someone to pull from, as the result will
> be off from what is really there.
> 
> Here's what I get:
> 
>  $ git format-patch -n origin..HEAD
>  0001-USB-fix-up-the-usb-early-handoff-logic-for-EHCI.txt
>  0002-USB-add-new-device-ids-to-ldusb.txt
>  0003-USB-change-ldusb-s-experimental-state.txt
>  0004-USB-PL2303-Leadtek-9531-GPS-Mouse.txt
>  0005-USB-sl811_cs-needs-platform_device-conversion-too.txt
>  0006-usb-storage-new-unusual_devs-entry.txt
>  0007-usb-storage-unusual_devs-entry.txt
>  0008-USB-unusual_devs.h-entry-TrekStor-i.Beat.txt
>  0009-USB-unusual_devs.h-entry-iAUDIO-M5.txt
>  0010-USB-unusual-devs-bugfix.txt
> 
>  $ git log | head -n 1
>  commit 16f05be7be0bf121491d83bd97337fe179b3b323
> 
>  $ git show 16f05be7be0bf121491d83bd97337fe179b3b323 | diffstat -p1
>   drivers/usb/storage/unusual_devs.h |   25 ++++++++++++++++++-------
>   1 file changed, 18 insertions(+), 7 deletions(-)
> 
>  $ diffstat -p1 0010-USB-unusual-devs-bugfix.txt
>   drivers/usb/storage/unusual_devs.h |   26 ++++++++++++++++++--------
>   1 file changed, 18 insertions(+), 8 deletions(-)
> 
> Any thoughts?

Hm, in looking at it closer, it's probably the last two lines of the
file, the signature that git format-patch adds to the message:
	-- 
	1.2.0

Any way to suppress these?

thanks,

greg k-h

> 
> thanks,
> 
> greg k-h

* diffstat wierdness with 'git format-patch' output
From: Greg KH @ 2006-02-14  5:54 UTC (permalink / raw)
  To: git

I was trying to use the built-in git tools to send patches off, instead
of my horribly hacked up scripts that use the git low-level stuff, when
I noticed that git format-patch's output confuses diffstat a bit, and
causes it to add another line to its count.

This isn't good when I do a 'diffstat -p1 *.txt' of the output and add
it to an email to send off for someone to pull from, as the result will
be off from what is really there.

Here's what I get:

 $ git format-patch -n origin..HEAD
 0001-USB-fix-up-the-usb-early-handoff-logic-for-EHCI.txt
 0002-USB-add-new-device-ids-to-ldusb.txt
 0003-USB-change-ldusb-s-experimental-state.txt
 0004-USB-PL2303-Leadtek-9531-GPS-Mouse.txt
 0005-USB-sl811_cs-needs-platform_device-conversion-too.txt
 0006-usb-storage-new-unusual_devs-entry.txt
 0007-usb-storage-unusual_devs-entry.txt
 0008-USB-unusual_devs.h-entry-TrekStor-i.Beat.txt
 0009-USB-unusual_devs.h-entry-iAUDIO-M5.txt
 0010-USB-unusual-devs-bugfix.txt

 $ git log | head -n 1
 commit 16f05be7be0bf121491d83bd97337fe179b3b323

 $ git show 16f05be7be0bf121491d83bd97337fe179b3b323 | diffstat -p1
  drivers/usb/storage/unusual_devs.h |   25 ++++++++++++++++++-------
  1 file changed, 18 insertions(+), 7 deletions(-)

 $ diffstat -p1 0010-USB-unusual-devs-bugfix.txt
  drivers/usb/storage/unusual_devs.h |   26 ++++++++++++++++++--------
  1 file changed, 18 insertions(+), 8 deletions(-)

Any thoughts?

thanks,

greg k-h

* Re: older git archive access broken in 1.2.0?
From: Junio C Hamano @ 2006-02-14  5:48 UTC (permalink / raw)
  To: Greg KH; +Cc: git
In-Reply-To: <20060214050616.GA28528@kroah.com>

Greg KH <greg@kroah.com> writes:

> I was trying to find where something changed in the historical Linux
> kernel git tree:
> 	rsync://rsync.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/
>
> when I noticed that the latest version of git doesn't seem to like this
> archive.  I can't clone it, but 'git log' and 'git whatchanged' seem to
> work fine.

I think 1.2.0 may be a coincidence.  history.git/ mistakenly has
an extra .git subdirectory underneath it.  Removing it should
make things work again, I suspect.

* git 1.2 works on Solaris, AIX [was Re: [PATCH 1/3] Call extended-semantics commands through variables.]
From: Jason Riedy @ 2006-02-14  5:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v64nllbdj.fsf@assigned-by-dhcp.cox.net>

The AIX machines I work on are back, and it looks like my 
patches are unnecessary, at least for my use.  It'd be cute 
to allow builders to point at GNU tools, but not terribly 
useful.  The File::Find patch to git-archimport.perl might 
be nice, but it functions as-is.

To have diff and merge on my path with this AIX platform, I have 
to pull *all* the GNU tools into my path.  (NERSC uses modules.)  
I suspect that is a rather common setup, so it's not worth the 
serious surgery to redirect diff and merge.  diff is used in C 
and shell, and merge is in shell, Perl, and Python sources.

And pkgsrc on Solaris appears happy using GNU's cpio (under
archivers/gcpio) rather than its default, plain one.  I hadn't 
realized I could replace it easily.

So with the GNU tools in the path and a properly built Python, 
the mainline code works on Solaris 8 and AIX.

For posterity: Any problems with git-merge-recursive.py on AIX
likely are a yucky Python/AIX problem.  The sha has 'sem_trywait: 
Permission denied\n' prepended to it a few times.  You need to 
rebuild Python with HAVE_BROKEN_POSIX_SEMAPHORES:
  https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1106262&group_id=5470

If anyone really wants to point at particular tools but not 
require them in the user's path, the simplest way would be to 
link the correct tools (or wrappers) into the GIT_EXEC_PATH 
and prepend that to the PATH *everywhere*.  But it's not worth 
the effort until someone really needs it.
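
The "prepend that to the PATH" idea above could look roughly like
this in a wrapper script.  It is a sketch only: env_with_tools_first
is a hypothetical helper, and the directory passed in is whatever
location the preferred tools (or wrappers) were linked into.

```python
import os


def env_with_tools_first(tool_dir, base_env=None):
    """Return a copy of the environment with tool_dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    # Put tool_dir in front so its diff/merge shadow the system ones.
    env["PATH"] = tool_dir + (os.pathsep + old_path if old_path else "")
    return env
```

The returned mapping can be passed as env= to subprocess calls, so
only the child processes see the changed PATH.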

Jason

* older git archive access broken in 1.2.0?
From: Greg KH @ 2006-02-14  5:06 UTC (permalink / raw)
  To: git

I was trying to find where something changed in the historical Linux
kernel git tree:
	rsync://rsync.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/

when I noticed that the latest version of git doesn't seem to like this
archive.  I can't clone it, but 'git log' and 'git whatchanged' seem to
work fine.

Also, gitk dies with the following error:
$ gitk
Error in startup script: fatal: '.git': unable to chdir or not a git archive
fatal: unexpected EOF
Failed to find remote refs
    while executing
"close $refd"
    (procedure "readrefs" line 41)
    invoked from within
"readrefs"
    (file "/home/greg/bin/gitk" line 3744)


Is this because I just synced the whole tree over using rsync and didn't
use git to clone it a long time ago?  Or that it was created with an
older version of git?

I'm not able to clone it locally either:
	$ git clone history.git/ test
	fatal: '/home/greg/linux/git/history.git/.git': unable to chdir or not a git archive
	fatal: unexpected EOF
	clone-pack from '/home/greg/linux/git/history.git/.git' failed.

I don't know when this broke with git, as it's been a long time since I
looked at this tree...

thanks,

greg k-h

* Re: [ANNOUNCE] pg - A patch porcelain for GIT
From: Shawn Pearce @ 2006-02-14  4:56 UTC (permalink / raw)
  To: Catalin Marinas, git
In-Reply-To: <tnxy80fe2zo.fsf@arm.com>

Catalin Marinas <catalin.marinas@arm.com> wrote:
> Without much testing, I think pg is a good tool but it is different
> from StGIT in many ways. It mainly resembles the topic branches way of
> working with the advantage of having them stacked on each-other. Each
> patch seems to be equivalent to a topic branch where you can commit
> changes. Rebasing a patch is equivalent to a merge in a branch with
> the merge commit having a general description like "Refreshed patch
> ..." and two parents - the new base and the old top.

Yes, exactly.
 
> While I don't say the above is a bad thing, it is pretty different
> from StGIT. With StGIT, the history of the tree only shows one commit
> per patch with the patch description chosen by the user. If you edit
> the description or modify the patch, the old patch or description is
> dropped from the main branch (visible via HEAD) and you only get the
> latest one. This clean history has many advantages when sending
> patches upstream either via e-mail or by asking for a pull.

Yes.  I didn't intend on exporting the entire patch history for
delivery upstream; I only intended on exporting the batch between
its base and last markers, which amounts to giving a single diff
such as what StGIT would generate.  But I had planned on pulling
the commit comments from all history into the header of the patch
during export, I just haven't gotten there yet.
 
> > - Preserves change history of patches.
> >
> >     The complete change history associated with each patch is
> >     maintained directly within GIT.  By storing the evolution of a
> >     patch as a sequence of GIT commits standard GIT history tools
> >     such as gitk can be used.
> 
> There have been discussions to adding this to StGIT as well (and there
> is a patch already from Chuck). It is a good thing to have but I'm
> opposed to the idea of having the history accessible from the top of
> the patch. Since the patch can be refreshed indefinitely, it would
> make the main history (visible from HEAD) really ugly and also cause
> problems with people pulling from a tree. I prefer to have a separate
> command (like 'stg id patch/log') that gives access to the history.

I definitely agree.  I have been rather unhappy with the log
structure that pg is giving me when I flip patches around on the
stack.  So I'm certainly considering keeping the history of the
patch in a parallel tree stored within the same object and refs
database but I haven't really figured out how this could work.
 
> > - It's prune proof.
> >
> >     The metadata structure is stored entirely within the refs
> >     directory and the object database, which means you can safely use
> >     git-prune without damaging your work, even for unapplied
> >     patches.
> 
> That's missing indeed in StGIT but it will be available in the next
> release. I didn't push this yet because I wasn't sure what to do with
> the refresh history of a patch.

I see you actually already pushed out a change for this for StGIT.
That's good news.  :-) I noticed the solution StGIT used is close to
pg's, except that StGIT has the simplified single-commit-per-patch
model so it needs fewer refs than pg.

> > - Automatic detection (and cancellation) of returning patches.
> >
> >     pg automatically detects when a patch is received from
> >     the upstream GIT repository during a pg-rebase and deletes
> >     (cancels) the local version of the patch from the patch series.
> >     The automatic cancelling makes it easy to use pg to track and
> >     develop changes on top of a GIT project.
> 
> StGIT has been doing this from the beginning. You would need to run a
> 'stg clean' after a rebase (or push). I prefer to run this command
> manually so that 'stg series -e' would show the empty patches and let
> me decide what to do with them.

Actually StGIT didn't do this correctly for one of my use cases
and that's one of the things that drove me to trying to write pg
(because I wondered if there was a way to resolve it automatically).
Try building a patch series such as:

	... start with an empty stack ...

	... create patch A ...
	... edit file hello.c ...
	... refresh patch A ...

	... create patch B ...
	... edit file hello.c (same line region as patch A) ...
	... refresh patch B ...

	... generate patch A+B (as one patch!) ...
	... send A+B upstream ...

	... pull upstream down ...

StGIT seemed to not handle this when it tried to reapply the two
already applied patches.  A won't apply because the file coming
down is actually A+B, not A's predecessor and not A.  B won't apply
because the file also isn't A (B's predecessor).

pg resolves this by attempting to automatically fold patches during
a pg-rebase (equiv. of stg pull).  If a patch fails to push cleanly
and there's another patch immediately behind it which also should
be reapplied pg aborts and retries pushing the combination of the
patches.  This fixes my A+B case quite nicely during a rebase.  :-)
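
The retry-with-fold behaviour described above can be illustrated with
a toy model.  This is not how pg is implemented (pg works on GIT
commits and merges); here a patch is reduced to the whole-file content
it expects (pre) and produces (post), and Patch, push, fold and rebase
are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    pre: str   # file content the patch expects to find
    post: str  # file content the patch leaves behind


def push(state, patch):
    """Return (new_state, status): 'applied', 'empty' or 'conflict'."""
    if state == patch.pre:
        return patch.post, "applied"
    if state == patch.post:
        return state, "empty"  # upstream already contains the change
    return state, "conflict"


def fold(first, second):
    """Combine two adjacent patches into a single patch."""
    return Patch(first.name + "+" + second.name, first.pre, second.post)


def rebase(upstream, series):
    """Re-push series on top of upstream, folding on conflict."""
    state, kept, pending = upstream, [], list(series)
    while pending:
        patch = pending.pop(0)
        new_state, status = push(state, patch)
        if status == "conflict" and pending:
            # Abort this push and retry with patch + its successor.
            pending.insert(0, fold(patch, pending.pop(0)))
            continue
        if status == "conflict":
            raise RuntimeError("cannot re-apply " + patch.name)
        if status == "applied":
            kept.append(patch)  # 'empty' patches are cancelled
        state = new_state
    return state, kept
```

With A as Patch("A", "base", "A") and B as Patch("B", "A", "AB"),
rebasing onto an upstream that already holds "AB" first fails on A,
folds it with B, finds the combination already present, and ends with
an empty series.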

Of course it doesn't deal with the upstream giving me A+B+C and I
have only A+B locally in my patches.  But I can't have everything
now can I.  :-)
 
> > - Fast
> >
> >     pg operations generally perform faster than StGIT operations,
> >     at least on my large (~7000 file) repositories.
> 
> Might be possible but I haven't done any tests. There are some
> optimisations in StGIT that make it pretty fast: (1) if the base of
> the patch has not changed, it can fast-forward the pushed patches
> which is O(1) and (2) StGIT first tries to use git-apply when pushing
> a patch and use a three-way merge only if this fails (the operation
> usually succeeds for most of the patches). There are some speed
> problems with three-way merging if there are many file
> removals/additions because the external merge tool is called for each
> of them but the same problem exists for any other tool.

pg uses the same optimization for pushing and popping patches. It
also has a special case for the trivially empty patch which StGIT
doesn't seem to have (as StGIT must have a commit for every patch,
pg doesn't require a commit in an empty patch).

However one thing I'm playing around with is using git-read-tree -u
-m to rebase a patch rather than git-diff-tree piped to git-apply
(at least when it's not a trivial forward or rewind).  I found that
most of the time to push a patch was spent in git-diff-tree and
hardly any time was in git-apply.  Using git-read-tree to merge in the
change works nicely in the common case of different patches changing
different files with it falling back to the external merge strategy
when there are unmerged stages in the index.  The open question is
what percentage is this one way or the other?

So I think StGIT is causing a bit more CPU and disk IO than pg is,
but some of these `optimizations' were only put into pg today (and
pushed to my website around 5 pm EST).  I'm actually considering
benchmarking StGIT and pg against the same set of changes to see
how pg compares to StGIT - because I'm now rather curious if pg is
better or worse.


Another difference is the fast-forward when the base of the patch
isn't changed.  In pg this is just:

	git-update-ref HEAD $last $head &&
	git-read-tree -u -m $head $last

which should be slightly faster than StGIT as pg is skipping the
update-index step:

	git-update-index -q --unmerged --refresh
	git-read-tree -u -m head patch
	git-update-ref HEAD patch head

because like StGIT I check for a clean tree before starting the
push; a tree is only clean if the index doesn't need to be refreshed
(plus all the other normal considerations like no unmerged files).
This drops a working directory scan before the read-tree.  :-)

-- 
Shawn.

* Re: git-ls-files handling of 'missing' files
From: Junio C Hamano @ 2006-02-14  3:56 UTC (permalink / raw)
  To: Jon Nelson; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0602132126210.6352@gheavc.wnzcbav.cig>

Jon Nelson <jnelson-git@jamponi.net> writes:

> The documentation confuses me when it says that files marked with a 'K' 
> are "to be killed / other" - I don't understand why 'killed' and 
> 'other' are lumped together.

I think there is a typo in asciidoc source (probably ?:: is needed).
Whenever you see funky things in the documentation please first
check the Documentation/that-file.txt to see if you are just
seeing a bad rendition of what was meant.

> The docs for git-ls-files indicate that a file marked as 'killed' (wrong 
> tense?) is a file that needs to be removed for git-checkout-index to 
> succeed. The manpage doesn't say why git-checkout-index needs to succeed 
> or under what conditions git-checkout-index would be invoked. (ie, "why" 
> should I manually remove this file).

This was from a long time ago so I may be misremembering things
but it is for D/F conflicts.  index has "doc/file1" stored but
your working tree has a regular file doc.  To check "doc/file1"
out you would need to remove that file.  Or index has a regular
file "path2" stored when you have "path2/file2" on the working
tree (hence path2 is a directory), in which case "path2/file2"
needs to disappear.
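
The two directory/file cases above can be spelled out as a small
check.  This is a sketch of the rule as described here, not of what
git-ls-files actually does internally; killed_files is a hypothetical
name, and both arguments are plain lists of slash-separated paths.

```python
def killed_files(index_paths, worktree_files):
    """List working tree files that block checking out the index."""
    index = set(index_paths)
    # Every directory that must exist to check out the index entries.
    index_dirs = set()
    for path in index:
        parts = path.split("/")
        for i in range(1, len(parts)):
            index_dirs.add("/".join(parts[:i]))

    killed = []
    for wt in sorted(worktree_files):
        parts = wt.split("/")
        parents = {"/".join(parts[:i]) for i in range(1, len(parts))}
        # Case 1: a regular file sits where the index needs a directory.
        # Case 2: the file lives under a path the index stores as a file.
        if wt in index_dirs or parents & index:
            killed.append(wt)
    return killed
```

For an index holding "doc/file1" and "path2", and a working tree with
the files "doc", "path2/file2" and "README", this reports "doc" and
"path2/file2" and leaves "README" alone.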

> It seems to me that files can also exist in the state 'new' or 'added' 
> (is this the same as unmerged?) Is there a state for 'conflict'?

Remember, ls-files is about index vs working tree files.  It
works before your initial commit, and never looks at the HEAD
commit.  'new' or 'added' has no meaning.  working tree file is
either known to be the same (thanks to stat information that is
cached in the index), known to be different (ditto), or unknown
(when stat information is stale), relative to index.

Unmerged and conflict should be the same, I think.

* git-ls-files handling of 'missing' files
From: Jon Nelson @ 2006-02-14  3:46 UTC (permalink / raw)
  To: git


git-ls-files appears to treat missing files as both removed /and/ 
modified, neither of which really seems right. Perhaps a new state, 
'missing', is worthwhile?

Also, the documentation for git-ls-files is a bit confusing to me:

(aside: I assume that the '?' is a mis-typed '/')

The documentation confuses me when it says that files marked with a 'K' 
are "to be killed / other" - I don't understand why 'killed' and 
'other' are lumped together.

The docs for git-ls-files indicate that a file marked as 'killed' (wrong 
tense?) is a file that needs to be removed for git-checkout-index to 
succeed. The manpage doesn't say why git-checkout-index needs to succeed 
or under what conditions git-checkout-index would be invoked. (ie, "why" 
should I manually remove this file).

Would it also be worthwhile to change the terminology used? 
Specifically, it seems that 'unchanged' is more readily understandable 
than 'cached', and the past tense of 'killed' throws me. I can offer no 
improvement there, however.

It seems to me that files can also exist in the state 'new' or 'added' 
(is this the same as unmerged?) Is there a state for 'conflict'?

Sorry for all of the questions, I've really been enjoying using git but 
every now and again something throws me - tonight it was git-ls-files. 
;-)

--
Jon Nelson <jnelson-git@jamponi.net>

* [PATCH] git-svnimport: -r adds svn revision number to commit messages
From: Karl  Hasselström @ 2006-02-14  2:43 UTC (permalink / raw)
  To: git; +Cc: Karl  Hasselström

New -r flag for prepending the corresponding Subversion revision
number to each commit message.

Signed-off-by: Karl Hasselström <kha@treskal.com>


---

 Documentation/git-svnimport.txt |    4 ++++
 git-svnimport.perl              |    7 ++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

2aad1f8f976b8bb2cd6fd3b225199373008e14b4
diff --git a/Documentation/git-svnimport.txt b/Documentation/git-svnimport.txt
index 63e28b8..5c543d5 100644
--- a/Documentation/git-svnimport.txt
+++ b/Documentation/git-svnimport.txt
@@ -61,6 +61,10 @@ When importing incrementally, you might 
 	the git repository. Use this option if you want to import into a
 	different branch.
 
+-r::
+	Prepend 'rX: ' to commit messages, where X is the imported
+	subversion revision.
+
 -m::
 	Attempt to detect merges based on the commit message. This option
 	will enable default regexes that try to capture the name source
diff --git a/git-svnimport.perl b/git-svnimport.perl
index f17d5a2..c536d70 100755
--- a/git-svnimport.perl
+++ b/git-svnimport.perl
@@ -30,19 +30,19 @@ die "Need SVN:Core 1.2.1 or better" if $
 $SIG{'PIPE'}="IGNORE";
 $ENV{'TZ'}="UTC";
 
-our($opt_h,$opt_o,$opt_v,$opt_u,$opt_C,$opt_i,$opt_m,$opt_M,$opt_t,$opt_T,$opt_b,$opt_s,$opt_l,$opt_d,$opt_D);
+our($opt_h,$opt_o,$opt_v,$opt_u,$opt_C,$opt_i,$opt_m,$opt_M,$opt_t,$opt_T,$opt_b,$opt_r,$opt_s,$opt_l,$opt_d,$opt_D);
 
 sub usage() {
 	print STDERR <<END;
 Usage: ${\basename $0}     # fetch/update GIT from SVN
        [-o branch-for-HEAD] [-h] [-v] [-l max_rev]
        [-C GIT_repository] [-t tagname] [-T trunkname] [-b branchname]
-       [-d|-D] [-i] [-u] [-s start_chg] [-m] [-M regex] [SVN_URL]
+       [-d|-D] [-i] [-u] [-r] [-s start_chg] [-m] [-M regex] [SVN_URL]
 END
 	exit(1);
 }
 
-getopts("b:C:dDhil:mM:o:s:t:T:uv") or usage();
+getopts("b:C:dDhil:mM:o:rs:t:T:uv") or usage();
 usage if $opt_h;
 
 my $tag_name = $opt_t || "tags";
@@ -650,6 +650,7 @@ sub commit {
 		$pr->reader();
 
 		$message =~ s/[\s\n]+\z//;
+		$message = "r$revision: $message" if $opt_r;
 
 		print $pw "$message\n"
 			or die "Error writing to git-commit-tree: $!\n";
-- 
1.2.0.g812d

* Re: maildir / read-tree trivial merging getting in the way?
From: Ben Clifford @ 2006-02-14  2:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy80ewu6n.fsf@assigned-by-dhcp.cox.net>

On Mon, 13 Feb 2006, Junio C Hamano wrote:

>
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.

yeah, looks like ls-tree x 3 is what I want and quite possibly I'll end up 
constructing a new index from scratch.

-- 
Ben
http://www.hawaga.org.uk/ben/

* Re: maildir / read-tree trivial merging getting in the way?
From: Linus Torvalds @ 2006-02-14  2:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ben Clifford, git
In-Reply-To: <7vy80ewu6n.fsf@assigned-by-dhcp.cox.net>



On Mon, 13 Feb 2006, Junio C Hamano wrote:
> 
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.

Exactly. Except that it probably makes sense to use "git-diff-tree" to try 
to avoid doing lots of unnecessary work in a script, if the normal case is 
that there's still a lot of stuff that hasn't changed.

So conceptually you would do three "git-ls-tree" invocations, but in 
_practice_ it's probably better to do just one "git-ls-tree", and then use 
"git-diff-tree" to basically generate the differences from that one 
ls-tree to the other cases of interest.

So start with the merge-base, for example, and then basically generate the 
"what changed" between the merge base and the two branch heads. 

That was the plan for doing merges initially, it just turned out that 
doing them in the index made things easier.

			Linus

* Re: maildir / read-tree trivial merging getting in the way?
From: Linus Torvalds @ 2006-02-14  2:32 UTC (permalink / raw)
  To: Ben Clifford; +Cc: git
In-Reply-To: <Pine.LNX.4.60.0602140217380.19093@mundungus.clifford.ac>



On Tue, 14 Feb 2006, Ben Clifford wrote:
> 
> I've spent a few hours playing round with maildir-aware merging.
> 
> The basic idea I'm trying to implement is to flip the index round so that
> instead of looking at how the content has changed for a particular filename,
> I'm looking at how the filenames have changed for a particular content.
> 
> So I'm using git read-tree -m to populate the index with entries for the
> branches to merge so that I can then diddle round with those.
> 
> But the read-tree trivial merge logic seems to be getting in the way a bit.

You are much better off working with "git-ls-tree", or perhaps 
"git-diff-tree".

The latter in particular will show you what got added and what got 
deleted, but will quickly ignore any common entries (which is probably 
exactly what you want).

> So basically my question is: should I feel dirty about doing this and diddle
> read-tree so that there's a flag to not do the trivial merges automatically?

You should try to avoid git-read-tree entirely, I suspect.

All the things git-read-tree does are wrong for you. Notably, it very much 
on purpose will match things up name-by-name, and it does a lot of extra 
work to create a sorted version of the index to do the trivial merges 
quickly. The thing is, it doesn't even do that the smart way.

Now, git-read-tree actually does a _great_ job - don't get me wrong. It's 
just that the job it does isn't really suitable for your usage, and it's 
doing some things the "simple and stupid" way instead of being very smart 
about them, just because they aren't that important under normal loads.

For example, in a three-way merge (with an index), it will basically have 
four sorted inputs that it needs to interleave. Now, there's a _smart_ way 
to interleave sorted input, and there's a stupid one. The smart way is to 
read the sources all together, and just pick the right sorted order, and 
interleave them all together.

That's not what git-read-tree does.

git-read-tree will read them one by one, and use "insertion sort" to 
maintain the result in sorted order. Now, insertion sort isn't totally 
idiotic (it's not doing a bogo-sort, at least), but it _is_ pretty damn 
silly when all the sources are already sorted and known ahead of time.
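
To make the difference concrete, here are the two strategies side by
side on already-sorted inputs.  This only illustrates the general
technique; git-read-tree itself is C code and neither function below
is taken from it.  The function names are made up for this sketch.

```python
import bisect
import heapq


def merge_by_insertion(sources):
    """Read the sources one by one, inserting each entry in place."""
    result = []
    for source in sources:
        for entry in source:
            # Each insert may shift the tail of the list, so the
            # total cost grows quadratically on big inputs.
            bisect.insort(result, entry)
    return result


def merge_interleaved(sources):
    """Read all sorted sources together and pick the next smallest."""
    # heapq.merge relies on every input already being sorted.
    return list(heapq.merge(*sources))
```

Both return the same sorted list; the second touches each entry once
and never moves an entry after it has been emitted.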

So git-read-tree does some stupid things, and scales badly with really big 
trees. The good news is that we can fix it - the bad news is that my 
motivation for it is pretty low, since "really big" means "much bigger 
than the kernel" ;)

In contrast "git-diff-tree -r a b" does the _smart_ thing, and scales 
linearly with tree size _and_ can take advantage of subdirectories not 
changing (the latter is apparently not an issue for you, but can be one in 
other circumstances).

(The "raw output" from git-diff-tree is also very easy to parse. Don't do 
the "-p" (patch) form, the raw "this is how the SHA's changed" sounds 
like it's exactly what you want, assuming you're interested in renames 
with no content change)
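
A sketch of how the raw output might be consumed for the
rename-without-content-change case.  It assumes each raw line has the
form ":oldmode newmode oldsha newsha status<TAB>path" with plain
'A' and 'D' status letters (no rename detection requested), and
that paths are not quoted; parse_raw_line and renames_by_content are
hypothetical helper names.

```python
from collections import defaultdict


def parse_raw_line(line):
    """Split one raw diff-tree line into its fields."""
    meta, path = line.rstrip("\n").split("\t", 1)
    old_mode, new_mode, old_sha, new_sha, status = meta.lstrip(":").split(" ")
    return old_mode, new_mode, old_sha, new_sha, status, path


def renames_by_content(raw_lines):
    """Map blob SHA -> (deleted paths, added paths) sharing that blob."""
    removed, added = defaultdict(list), defaultdict(list)
    for line in raw_lines:
        _, _, old_sha, new_sha, status, path = parse_raw_line(line)
        if status == "D":
            removed[old_sha].append(path)
        elif status == "A":
            added[new_sha].append(path)
    # Same content deleted under one name and added under another.
    return {sha: (removed[sha], added[sha]) for sha in removed if sha in added}
```

Fed the 'D' line for "fish" and the 'A' line for "lion" from the
example tree, this yields one entry keyed by their shared blob SHA.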

		Linus

* Re: maildir / read-tree trivial merging getting in the way?
From: Junio C Hamano @ 2006-02-14  2:28 UTC (permalink / raw)
  To: Ben Clifford; +Cc: git
In-Reply-To: <Pine.LNX.4.60.0602140217380.19093@mundungus.clifford.ac>

Ben Clifford <benc@hawaga.org.uk> writes:

> So basically my question is: should I feel dirty about doing this and
> diddle read-tree so that there's a flag to not do the trivial merges
> automatically?

I am mildly negative about touching read-tree for this kind of
non-SCM'ish usage.

If you are doing read-tree without doing any trivial merge, then
you would use ls-files to inspect each stage, decide what the
final shape of the tree you want, and construct such a tree in
the index.

That would be more naturally done by writing that thing in a
more reasonable scripting language (not shell, but Perl or
Python), call ls-tree three times, do whatever merge to come up
with the final shape of the tree, and then construct the tree
with a single invocation of "update-index --index-info", maybe
even starting from an empty index file.
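
The two ends of that pipeline could be sketched as follows, with the
actual merge policy left out.  The sketch assumes ls-tree lines of the
form "mode type sha<TAB>path" and emits "mode sha<TAB>path" lines for
"update-index --index-info"; parse_ls_tree and index_info are
hypothetical names, and running the git commands is left to the
caller.

```python
def parse_ls_tree(output):
    """Map path -> (mode, sha) for the blob entries of ls-tree output."""
    entries = {}
    for line in output.splitlines():
        meta, path = line.split("\t", 1)
        mode, kind, sha = meta.split(" ")
        if kind == "blob":  # subtrees are skipped in this sketch
            entries[path] = (mode, sha)
    return entries


def index_info(entries):
    """Format entries as input for 'update-index --index-info'."""
    return "".join(
        "%s %s\t%s\n" % (mode, sha, path)
        for path, (mode, sha) in sorted(entries.items())
    )
```

A script would call parse_ls_tree once per tree (base and the two
branches), compute the final path-to-blob mapping however it likes,
and write index_info(result) to a single update-index process.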

* Re: Quick question
From: Radoslaw Szkodzinski @ 2006-02-14  2:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3bimy9wn.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> Wow, you have a strong voice.
>
I didn't want to sound rude at all, of course.

>> That's why I used the -o (--others).
> 
> You asked it to show either ignored or others.
> 

So here's the catch? I don't think so.
But the manpage isn't totally clear in this matter.

When I specify just -o, it gives me files which weren't ignored too.
-o -i gives me only ignored files.
Plain -i returns nothing.

With git directory, compare:
git-ls-files -o -i -X .gitignore

with:
git-ls-files -o

The remainder is:
git-ls-files -o -X .gitignore

I have the documentation built.
(Yes, I'm not including its .gitignore on purpose)
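
The relationship between the three invocations above can be modelled
as a simple partition of the untracked files.  This is only an
approximation: split_others is a hypothetical helper, and matching
patterns against the file's basename with fnmatch is an assumption
that does not cover every rule of real exclude files.

```python
from fnmatch import fnmatch


def split_others(untracked, patterns):
    """Split untracked paths into (ignored, remainder)."""
    def is_ignored(path):
        name = path.rsplit("/", 1)[-1]  # match on the basename only
        return any(fnmatch(name, pattern) for pattern in patterns)

    ignored = [p for p in untracked if is_ignored(p)]       # like -o -i -X
    remainder = [p for p in untracked if not is_ignored(p)]  # like -o -X
    return ignored, remainder
```

In this model the full "-o" listing is the union of the two returned
lists.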

> 
>> I would like to use it for backup~ hunting purposes in a script
>> and not have to worry about find and other less portable tools.
> 
> I usually do this for that:
> 
> 	git ls-files -o '*~'
> 

Also good. I have *~ in ignored too, so I think -o -i will suffice.

-- 
GPG Key id:  0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0  9105 9543 0453 D1F1 0BA2

AstralStorm


* maildir / read-tree trivial merging getting in the way?
From: Ben Clifford @ 2006-02-14  2:18 UTC (permalink / raw)
  To: git


I've spent a few hours playing round with maildir-aware merging.

The basic idea I'm trying to implement is to flip the index round so that 
instead of looking at how the content has changed for a particular filename, 
I'm looking at how the filenames have changed for a particular content.

So I'm using git read-tree -m to populate the index with entries for the 
branches to merge so that I can then diddle round with those.

But the read-tree trivial merge logic seems to be getting in the way a bit.

In my test repo, I have two branches ('master' and 'red') forked from the base 
point 'base':

in 'base':

$ ls
A    fish one

in 'red':

$ ls
B         billygoat one

in 'master'

$ ls
A    lion two

From base, I renamed and cg add / cg rm'd to change A to B, one to two, 
and fish to billygoat and lion to give the above.

When I read in the tree I get automatic resolving (down to stage 0) for the 
added files. But actually in the output of my merge, I'm not always going to 
want that to happen: In the A->B case, I do want to keep B (and need to remove 
A), likewise in the one->two case.

But for fish->{billygoat,lion}, I only want one file to end up at stage 0, and 
it might not be called either billygoat or lion - in maildir, the filenames are 
more structured, and given a filename like
foo:2,SR and foo:2,SF I would want to compose the filenames to give me 
foo:2,SRF.
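
Composing the two names could be done along these lines.  It is a
sketch that assumes both names share the same part before ":2," and
that the flags are single characters; compose_maildir_names is a
hypothetical name.  By default the flags keep first-seen order, which
reproduces the SRF spelling above; sort_flags=True gives alphabetical
order instead, for readers that expect it.

```python
def compose_maildir_names(name_a, name_b, sort_flags=False):
    """Return one maildir filename carrying the flags of both inputs."""
    base_a, sep_a, flags_a = name_a.partition(":2,")
    base_b, sep_b, flags_b = name_b.partition(":2,")
    if not sep_a or not sep_b or base_a != base_b:
        raise ValueError("names do not refer to the same message")
    # Union of the flag characters, keeping first-seen order.
    merged = list(dict.fromkeys(flags_a + flags_b))
    if sort_flags:
        merged.sort()
    return base_a + ":2," + "".join(merged)
```

Names with different parts before ":2," are rejected rather than
guessed at, since they would not be the same message.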


$ git read-tree -m base master red

$ git ls-files  --stage
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 1       A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 2       A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 0       B
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0       billygoat
100644 a8150e61a3a4c9941d29169ee639396547f40de2 1       fish
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0       lion
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 1       one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 3       one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 0       two

Now, I think maybe I can just look at what has made it to stage 0 and play 
round with those, but it makes me feel a little dirty - if anything, the index 
indicates that a bunch of stuff has been correctly merged (by being at stage 0) 
when in fact it hasn't.

So basically my question is: should I feel dirty about doing this and diddle 
read-tree so that there's a flag to not do the trivial merges automatically?
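
As a rough illustration of "flipping the index round", the sketch below
regroups a stage listing by blob id instead of by filename.  It works on a
pasted copy of the listing above (blob ids shortened by hand, and spaces
where the real output has a tab before the path), so it does not need a
repository; in real use the input would come from git ls-files --stage.
The path:stage notation in the output is my own invention.

```shell
#!/bin/sh
# Sample input copied from the listing above, ids abbreviated for width.
stage_listing='100644 40e0a6f 1 A
100644 40e0a6f 2 A
100644 40e0a6f 0 B
100644 a8150e6 0 billygoat
100644 a8150e6 1 fish
100644 a8150e6 0 lion
100644 b67e17a 1 one
100644 b67e17a 3 one
100644 b67e17a 0 two'

# Collect "path:stage" pairs per blob id, one output line per blob.
# Paths containing whitespace would need a smarter split than awk fields.
by_blob=$(printf '%s\n' "$stage_listing" |
    awk '{ names[$2] = names[$2] " " $4 ":" $3 }
         END { for (id in names) print id names[id] }' |
    LC_ALL=C sort)

echo "$by_blob"
```

Each output line then shows every name a given content has in the three
trees, which is the view the maildir merge wants to reason about.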

-- 

^ permalink raw reply

* Re: Quick question
From: Junio C Hamano @ 2006-02-14  2:03 UTC (permalink / raw)
  To: Radoslaw Szkodzinski; +Cc: git
In-Reply-To: <43F13776.9000501@gorzow.mm.pl>

Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> writes:

> Wrong. I wanted to display files that are ignored and not checked in.
> (unlike your example)

Wow, you have a strong voice.

> That's why I used the -o (--others).

You asked it to show either ignored or others.


> I would like to use it for backup~ hunting purposes in a script
> and not have to worry about find and other less portable tools.

I usually do this for that:

	git ls-files -o '*~'
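
For anyone who wants to try this without touching a real tree, here is a
small throwaway-repository sketch.  The file names and the temporary
pattern file are invented for the demo, and it assumes a reasonably
current git where "git ls-files" accepts -o together with -i and -X.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)       # scratch repository, safe to delete afterwards
patterns=$(mktemp)      # exclude patterns kept outside the repository
cd "$repo"
git init -q

mkdir sub
echo tracked > kept.txt
echo backup  > 'kept.txt~'
echo backup  > 'sub/notes~'
echo object  > sub/x.o
git add kept.txt        # only this file is known to the index

printf '%s\n' '*~' > "$patterns"

# Untracked files whose names match the pathspec, in any directory.
others=$(git ls-files -o '*~')

# Untracked files that match the exclude patterns in $patterns.
ignored=$(git ls-files -o -i -X "$patterns")

printf 'others:\n%s\nignored:\n%s\n' "$others" "$ignored"
```

Both commands should list kept.txt~ and sub/notes~ here, and neither
should mention sub/x.o or the tracked kept.txt.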

^ permalink raw reply

* Re: git-bisect problem
From: Junio C Hamano @ 2006-02-14  1:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git
In-Reply-To: <20060213165620.11ec6051.akpm@osdl.org>

Andrew Morton <akpm@osdl.org> writes:

> The bug is in Jeff's tree only
> (git+ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL)
> so I wanted to perform the bisection on the git-netdev-all branch.
>
> So I did a `git log git-netdev-all' and looked at where the ipw2200 changes
> were and then decided that the 2.6.16-rc1 and 2.6.16-rc2 commits straddled
> those changes nicely, so I chose those as the bisection starting points.

Ah.  Jeff merges from Linus and that causes things on Linus's tree
to appear in his tree.  So you saw -rc1 and -rc2 in the output,
but neither of them may contain the problematic change, and they
are not a good/bad pair at all.  They are probably both good ones.

git log output is chronological, and there is no guarantee that
this ordering has much to do with the ancestry of the commits,
especially when merges are involved.  In fact, "Jeff's tree
only" suggests to me that 2.6.16-rc2 has not merged those
changes, but you thought (arguably rightly so) rc1 and rc2
straddled them.


              -rc1                     -rc2
    ---o---o---o---o---o---o---o---o---o---o---o---o--- Linus
                                            \ 
                                             \ 
       ---o---o---o---*---o---o---o---*---o---o---o---o--- Jeff
                       <- ipw2200 ->

So you would perhaps want to pick two commits like the ones
marked * above and bisect.  If the one marked as bad on Linus's tree
initially (-rc2) is not bad and does not reach the allegedly bad
commit on Jeff's line, there is no way for bisect to find it.
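
One way to check a candidate good/bad pair before bisecting is to ask
whether the suspect commit is in the range at all.  The sketch below does
that in a toy repository; the branch names linus and jeff, the tags rc1 and
rc2, and the in_range helper are all invented for the demo, and the toy
history leaves out the merge from Linus into Jeff's tree.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Hypothetical helper: record one empty commit with the given subject.
commit() { git commit -q --allow-empty -m "$1"; }

commit root
base=$(git rev-parse HEAD)

git checkout -q -b linus "$base"
commit rc1; git tag rc1
commit rc2; git tag rc2

git checkout -q -b jeff "$base"
commit "netdev work"
commit "ipw2200 change"
suspect=$(git rev-parse HEAD)
commit "more netdev work"

# Hypothetical helper: is $suspect among the commits of the given range?
in_range() {
    if git rev-list "$1" | grep -qx "$suspect"; then echo yes; else echo no; fi
}

rc_pair=$(in_range rc1..rc2)          # the pair that only looked right
branch_pair=$(in_range linus..jeff)   # what is in jeff but not in linus

echo "rc1..rc2: $rc_pair, linus..jeff: $branch_pair"
```

If the answer for the chosen pair is "no", bisect has nothing to find
between those two points, however plausible they looked in the log.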

If you are suspecting ipw2200, 2f633db and 747af1e might be a
pair of good anchor points to start bisecting.

Here is how I came up with these two.  I should be using gitk
for this kind of thing, but I do not work in X during daytime,
so I am guessing them from:
 
        $ git rev-list --pretty=oneline linus..garzik/netdev |
          grep -C4 -i ipw2200 | less

This gets the list of commits that are in Jeff's tree but not in
Linus's, in reverse chronological order, and grabs the ones with
ipw2200 in their titles.  It shows that 2f633db is (close to) the
latest that touches ipw2200, and 747af1e is a reasonably old one
that touches ipw2200.  To review these two points, I did this:

	$ git log 747af1e..2f633db

Hope it helps this time...

^ permalink raw reply

* Re: Quick question
From: Radoslaw Szkodzinski @ 2006-02-14  1:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy80eydq0.fsf@assigned-by-dhcp.cox.net>


Junio C Hamano wrote:
> With the git.git repository itself, I tried:
> 
> $ cat /var/tmp/i
> *.c
> $ git ls-files -i -X /var/tmp/i | head -n 6
> apply.c
> arm/sha1.c
> blob.c
> cat-file.c
> check-ref-format.c
> checkout-index.c
> 
> So I am not sure what you mean.  You wanted to "display ignored
> files of the whole project", right?  I am getting arm/sha1.c
> here in my output, so I do not understand the issue here...
> 

Wrong. I wanted to display files that are ignored and not checked in.
(unlike your example)

That's why I used the -o (--others).

Try your example with git repo's .gitignore and any .o file.
I would like to use it for backup~ hunting purposes in a script
and not have to worry about find and other less portable tools.

-- 
GPG Key id:  0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0  9105 9543 0453 D1F1 0BA2

AstralStorm



^ permalink raw reply

* Re: git-bisect problem
From: Petr Baudis @ 2006-02-14  1:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Junio C Hamano, git
In-Reply-To: <20060214011512.GB31278@pasky.or.cz>

Dear diary, on Tue, Feb 14, 2006 at 02:15:12AM CET, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Tue, Feb 14, 2006 at 01:56:20AM CET, I got a letter
> where Andrew Morton <akpm@osdl.org> said that...
> > Junio C Hamano <junkio@cox.net> wrote:
> > >
> > > Sorry, this question is what I do not quite understand.
> > > 
> > >  Here is my understanding of the situation.
> > > 
> > >   - Between 2.6.16-rc1 and 2.6.16-rc2 a bug you are chasing was
> > >     introduced.  You know rc1 works fine but rc2 is bad.
> > > 
> > >   - You suspect that changes introduced by merging Jeff's tree
> > >     at some point between -rc1 and -rc2 may be causing this.
> > > 
> > >  Am I totally misunderstanding the situation?
> > 
> > yup ;)
> > 
> > The bug is in Jeff's tree only
> > (git+ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL)
> > so I wanted to perform the bisection on the git-netdev-all branch.
> > 
> > So I did a `git log git-netdev-all' and looked at where the ipw2200 changes
> > were and then decided that the 2.6.16-rc1 and 2.6.16-rc2 commits straddled
> > those changes nicely, so I chose those as the bisection starting points.
> 
> But aren't those commits on the Linus' "branch", not containing any
> commits specific to git-netdev-all?
> 
> I imagine the situation is like:
> 
> * -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2  -- * - -  (linus)
>   \                        \              \
> * -- * -- * -- * -- * -- * -- * -- * -- * -- M - -  (git-netdev-all)
> 
> Then, if you bisect between -rc2 and -rc1, you will never actually get
> to the git-netdev-all branch, since there are no such commits in between
> -rc2 and -rc1. Even if you consider this:
> 
> * -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2  -- * - -  (linus)
>   \              /         \              \
> * -- X -- Y -- Z -- A -- * -- * -- * -- * -- M - -  (git-netdev-all)
> 
> git-bisect will consider the X, Y, Z commits (since they are part of the
> ancestry between -rc1 and -rc2), but not commits from A on - it can't
> reach them topologically if it considers only commits between -rc1 and
> -rc2:
> 
> * -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2
>   \              /
>    - X -- Y -- Z

I got this one (and consequently, the following one) wrong - obviously,
it should read as

       2.6.16-rc1  -- * -- * -- 2.6.16-rc2
                   /
       X -- Y -- Z

since the "asterisk" commit is already behind -rc1.


Pedagogical excursion:

All those commit intervals are really set differences - if you have
commit A and commit B,

	[A,B] = B \cup (ancestry(B) \ ancestry(A))

or if you don't like math, color B and all its ancestors blue in
your head, and then color all the A ancestors black. The commits
that stay blue are in the [A,B] interval.
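
The coloring exercise can be replayed with git itself.  This is only a
sketch on a toy history (the repository, the commit subjects and the
blue/black file names are made up); it compares the hand-computed set
difference with what "git rev-list A..B" reports.

```shell
#!/bin/sh
set -e
LC_ALL=C; export LC_ALL      # keep sort and comm in agreement
repo=$(mktemp -d)
tmp=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Hypothetical helper: record one empty commit with the given subject.
commit() { git commit -q --allow-empty -m "$1"; }

commit root
base=$(git rev-parse HEAD)

git checkout -q -b side-a "$base"
commit a1
A=$(git rev-parse HEAD)

git checkout -q -b side-b "$base"
commit b1
commit b2
B=$(git rev-parse HEAD)

git rev-list "$B" | sort > "$tmp/blue"    # B and all its ancestors
git rev-list "$A" | sort > "$tmp/black"   # A and all its ancestors

# Commits that stay blue: lines only present in the first file.
by_hand=$(comm -23 "$tmp/blue" "$tmp/black")
by_git=$(git rev-list "$A..$B" | sort)

echo "$by_hand"
```

Here the interval holds b1 and b2; root drops out because A reaches it
too, and a1 was never blue in the first place.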

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe.  -- Douglas Adams

^ permalink raw reply

* Re: git-bisect problem
From: Petr Baudis @ 2006-02-14  1:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Junio C Hamano, git
In-Reply-To: <20060213165620.11ec6051.akpm@osdl.org>

Dear diary, on Tue, Feb 14, 2006 at 01:56:20AM CET, I got a letter
where Andrew Morton <akpm@osdl.org> said that...
> Junio C Hamano <junkio@cox.net> wrote:
> >
> > Sorry, this question is what I do not quite understand.
> > 
> >  Here is my understanding of the situation.
> > 
> >   - Between 2.6.16-rc1 and 2.6.16-rc2 a bug you are chasing was
> >     introduced.  You know rc1 works fine but rc2 is bad.
> > 
> >   - You suspect that changes introduced by merging Jeff's tree
> >     at some point between -rc1 and -rc2 may be causing this.
> > 
> >  Am I totally misunderstanding the situation?
> 
> yup ;)
> 
> The bug is in Jeff's tree only
> (git+ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL)
> so I wanted to perform the bisection on the git-netdev-all branch.
> 
> So I did a `git log git-netdev-all' and looked at where the ipw2200 changes
> were and then decided that the 2.6.16-rc1 and 2.6.16-rc2 commits straddled
> those changes nicely, so I chose those as the bisection starting points.

But aren't those commits on the Linus' "branch", not containing any
commits specific to git-netdev-all?

I imagine the situation is like:

* -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2  -- * - -  (linus)
  \                        \              \
* -- * -- * -- * -- * -- * -- * -- * -- * -- M - -  (git-netdev-all)

Then, if you bisect between -rc2 and -rc1, you will never actually get
to the git-netdev-all branch, since there are no such commits in between
-rc2 and -rc1. Even if you consider this:

* -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2  -- * - -  (linus)
  \              /         \              \
* -- X -- Y -- Z -- A -- * -- * -- * -- * -- M - -  (git-netdev-all)

git-bisect will consider the X, Y, Z commits (since they are part of the
ancestry between -rc1 and -rc2), but not commits from A on - it can't
reach them topologically if it considers only commits between -rc1 and
-rc2:

* -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2
  \              /
   - X -- Y -- Z

Now, perhaps what you meant is that "when -rc2 got merged to netdev-all,
things were already broken". In this case, what you want to do is to use
the commit M as the bisect bad point. Then, bisect will walk this
subgraph:

* -- 2.6.16-rc1  -- * -- * -- 2.6.16-rc2
  \              /         \              \
   - X -- Y -- Z -- A -- * -- * -- * -- * -- M

I agree that this can be kind of confusing; I'm not sure how to avoid
this. Perhaps git-bisect should warn if, when bisecting between Q and P,
there exists a path between HEAD and P avoiding Q...?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe.  -- Douglas Adams

^ permalink raw reply

* Re: git-bisect problem
From: Linus Torvalds @ 2006-02-14  1:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Junio C Hamano, git
In-Reply-To: <20060213165620.11ec6051.akpm@osdl.org>



On Mon, 13 Feb 2006, Andrew Morton wrote:
> 
> The bug is in Jeff's tree only
> (git+ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL)
> so I wanted to perform the bisection on the git-netdev-all branch.

Actually, what you should do, is not play any games at all, but just tell 
"git bisect" what the problem is. It will do the right thing.

So in this case, what you do is _literally_ to

	# fetch Jeff's tree (you obviously had this already, but I just 
	# want to point it out as a "name that branch" thing)
	git fetch netdev-all

	# we know that that tree is broken
	git bisect start
	git bisect bad netdev-all

	# We know that Linus' top-of-tree doesn't have the bug
	git bisect good origin

and off you go. It absolutely magically does the right thing, and will 
bisect stuff that is only in the netdev branch and not in my tree. No 
guessing necessary, no need to try to figure out what the differences are. 
git will do it all for you.

And notice how it will work perfectly well, even if the two points you 
have tested AREN'T EVEN DIRECTLY RELATED! The "good" and "bad" points do 
not have to have any direct relationship other than a common parent 
_somewhere_. "git bisect" really is that good.

(The above is obviously assuming that "origin" is set to my tree, 
self-aggrandizing bastard that I am, and that you've set up a 
.git/remotes/netdev-all file pointing to Jeff's tree - your setup may vary 
from this, so you'd have to change the lines to match)
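
The claim that the good and bad points only need a common ancestor can be
tried on a toy history.  What follows is a sketch under assumptions: the
branch names linus and jeff stand in for origin and netdev-all, the "bug"
is simply the presence of a file called bug, and it relies on
"git bisect run", which needs a git newer than the one discussed in this
thread.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Hypothetical helper: add a file named after the subject and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit root
base=$(git rev-parse HEAD)

git checkout -q -b linus "$base"
commit linus-work                 # known good, has no bug file

git checkout -q -b jeff "$base"
commit jeff-1
echo oops > bug; git add bug; git commit -q -m "introduce bug"
culprit=$(git rev-parse HEAD)
commit jeff-2                     # known bad tip

git bisect start          >/dev/null
git bisect bad jeff       >/dev/null
git bisect good linus     >/dev/null
# Exit 0 (good) when the bug file is absent, non-zero (bad) otherwise.
git bisect run sh -c 'test ! -f bug' >/dev/null

found=$(git rev-parse refs/bisect/bad)   # first bad commit once bisect ends
git bisect reset >/dev/null 2>&1

echo "culprit: $culprit"
echo "found:   $found"
```

Neither linus nor jeff is an ancestor of the other in this history, yet
the two printed ids should match.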

			Linus

^ permalink raw reply

* Re: git-bisect problem
From: Andrew Morton @ 2006-02-14  0:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v8xsezsni.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
>
> Sorry, this question is what I do not quite understand.
> 
>  Here is my understanding of the situation.
> 
>   - Between 2.6.16-rc1 and 2.6.16-rc2 a bug you are chasing was
>     introduced.  You know rc1 works fine but rc2 is bad.
> 
>   - You suspect that changes introduced by merging Jeff's tree
>     at some point between -rc1 and -rc2 may be causing this.
> 
>  Am I totally misunderstanding the situation?

yup ;)

The bug is in Jeff's tree only
(git+ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL)
so I wanted to perform the bisection on the git-netdev-all branch.

So I did a `git log git-netdev-all' and looked at where the ipw2200 changes
were and then decided that the 2.6.16-rc1 and 2.6.16-rc2 commits straddled
those changes nicely, so I chose those as the bisection starting points.

^ permalink raw reply

* Re: Quick question
From: Junio C Hamano @ 2006-02-14  0:40 UTC (permalink / raw)
  To: Radoslaw Szkodzinski; +Cc: git
In-Reply-To: <43F0B577.4070608@gorzow.mm.pl>

Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> writes:

> How to display ignored files of the whole project using only core git?
>
> I've tried:
>
> git-ls-files -o -i -X .git/info/exclude
>
> and it only showed me the excluded files in the current directory...

With the git.git repository itself, I tried:

$ cat /var/tmp/i
*.c
$ git ls-files -i -X /var/tmp/i | head -n 6
apply.c
arm/sha1.c
blob.c
cat-file.c
check-ref-format.c
checkout-index.c

So I am not sure what you mean.  You wanted to "display ignored
files of the whole project", right?  I am getting arm/sha1.c
here in my output, so I do not understand the issue here...

^ permalink raw reply

