git.vger.kernel.org archive mirror
* Odd problems trying to build an orphaned branch
@ 2015-11-05 21:16 alan
  2015-11-06  0:18 ` Jeff King
From: alan @ 2015-11-05 21:16 UTC
  To: git; +Cc: alan

I am trying to create an orphaned branch that contains the linux-3.12.y
branch from linux-stable. Each time I try a method to make this work I
encounter a blocker that halts my progress.

I expect that at least one of these is a bug, but I am not sure.

Here is what I did. I have read the docs and tried a huge pile of
suggestions. How is this supposed to be done?

I am using git version 2.6.2.402.g2635c2b. It passes all the tests.

I created an orphan branch from 3.12-rc1. I then used git format-patch to
generate patches from 3.12-rc1 to HEAD. (Over 7000 patches.) I use git am
to apply them to the orphan branch. At patch 237 it fails to apply. (It
appears the patch is from a block of code added with a merge commit, but
it is somewhere in the middle of the block.)

Are merge commits supposed to screw up git-format-patch?

I also tried using clone with depth and --single-branch set.  It ignored
the depth setting and gave me the whole branch all the way back to 2.6.x.

All the examples of shallow clones use depth=1. Is it broken for values
bigger than 1 or am I missing something?

I tried using graft and filter-branch. None of the descriptions are very
clear. None of them worked either. Filter-branch died on a commit
somewhere in 2.6 land that had no author. (Which is outside of the commits
I want to keep.)

I tried creating an orphan branch and using cherry-pick
v3.12-rc1..linux-3.12.y. It blew up on the first merge commit it hit. I
tried adding in "-m 1" to try to get it to pick a parent, but then it died
on the first commit because it was not a merge.

Why is this so hard?

All I want to do is take a branch from linux-stable and create a branch
that contains just the commits from where it was branched off of master
until it hits HEAD. That is it. All the scripts that I have seen that
claim to do just what I want break when it hits a merge or a bogus author.
(How that got into linux-stable, I have no idea. The commit is 10 years
old!)

Ideas? Do I need to create a new command? ("cake-cutter". Cut from
commit..commit and make a new branch out of it.)


* Re: Odd problems trying to build an orphaned branch
  2015-11-05 21:16 Odd problems trying to build an orphaned branch alan
@ 2015-11-06  0:18 ` Jeff King
  2015-11-06  0:20   ` Jeff King
  2015-11-06 18:32   ` Odd problems trying to build an orphaned branch alan
From: Jeff King @ 2015-11-06  0:18 UTC
  To: alan; +Cc: git

On Thu, Nov 05, 2015 at 01:16:54PM -0800, alan@clueserver.org wrote:

> I created an orphan branch from 3.12-rc1. I then used git format-patch to
> generate patches from 3.12-rc1 to HEAD. (Over 7000 patches.) I use git am
> to apply them to the orphan branch. At patch 237 it fails to apply. (It
> appears the patch is from a block of code added with a merge commit, but
> it is somewhere in the middle of the block.)
> 
> Are merge commits supposed to screw up git-format-patch?

Yes. There is no defined format for merge patches, so git-format-patch
cannot show them. What you're trying to do won't work.

If your goal is to have the history at HEAD truncated at 3.12-rc1, you
are probably better off using a graft and having "filter-branch" rewrite
the history based on that. That will preserve merges and the general
shape of history.

> I also tried using clone with depth and --single-branch set.  It ignored
> the depth setting and gave me the whole branch all the way back to 2.6.x.

Was it a local clone? Depth is ignored for those (it _should_ print a
warning). If so, try --no-local to make it act like a "regular" clone.

> I tried using graft and filter-branch. None of the descriptions are very
> clear. None of them worked either. Filter-branch died on a commit
> somewhere in 2.6 land that had no author. (Which is outside of the commits
> I want to keep.)

I suspect you need to graft more than just the commit at v3.12-rc1. For
example, consider this history graph:

  --A--B--C--D---G--H
           \    /
            E--F

If we imagine that H is the current HEAD, and D is our tag (v3.12-rc1),
then making a cut between D and C will not have any effect on the side
branch that contains E and F. Commits A and B are still reachable
through them.

You can find the complete set of boundary commits like this:

  git log --boundary --format='%m %H' v3.12-rc1..HEAD

and then graft them all like this:

  git log --boundary --format='%m %H' v3.12-rc1..HEAD |
    grep ^- | cut -d' ' -f2 >.git/info/grafts

Then you should be able to run "git filter-branch" to rewrite the
history based on that.
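
To see which commits end up in that boundary set, here is a minimal
sketch on a throwaway repository shaped like the graph above. Everything
in it is made up for illustration: the one-letter tags, the empty
commits, and the "commit" helper. D plays the role of v3.12-rc1.

```shell
#!/bin/sh
# Build a toy history matching the graph above and list the boundary
# commits of D..H. All names are hypothetical stand-ins.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

# Create an empty commit and tag it with its one-letter name.
commit() {
	git commit -q --allow-empty -m "$1" &&
	git tag "$1"
}

main=$(git symbolic-ref --short HEAD)
commit A; commit B; commit C
git checkout -q -b side          # the side branch forks at C
commit E; commit F
git checkout -q "$main"
commit D                         # stands in for v3.12-rc1
git merge -q --no-ff -m G side   # G merges the side branch back in
git tag G
commit H

# Boundary commits are the ones printed with a leading "-".
boundary=$(git log --boundary --format='%m %s' D..H |
	grep '^-' | cut -d' ' -f2 | sort | paste -sd' ' -)
echo "boundary commits: $boundary"
```

With the side branch forking at C, as drawn, this should report both C
and D, which is why grafting D alone leaves the older history reachable.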

I think you can probably get the same effect by running:

  git filter-branch v3.12-rc1..HEAD

Of course that leaves only the problem that filter-branch is
horrendously slow (for the kernel, most of the time goes to populating
the index for each commit; I think filter-branch could probably learn to
skip this step if there is no index or tree filter at work).

> I tried creating an orphan branch and using cherry-pick
> v3.12-rc1..linux-3.12.y. It blew up on the first merge commit it hit. I
> tried adding in "-m 1" to try to get it to pick a parent, but then it died
> on the first commit because it was not a merge.

That won't do what you want. Cherry-pick doesn't preserve merges. When
you pick a merge and choose a mainline, it is effectively saying "treat
that as the only interesting parent" and squashes the result down to a
single non-merge commit.

If you wanted to follow this path (starting at an orphan and moving the
patches over), I think rebase's "--preserve-merges" would be your best
bet. It used to have some corner cases, though, and I don't know if
those were ever fixed. I'd say filter-branch is the most-supported way
to do what you want.

> All I want to do is take a branch from linux-stable and create a branch
> that contains just the commits from where it was branched off of master
> until it hits HEAD. That is it. All the scripts that I have seen that
> claim to do just what I want break when it hits a merge or a bogus author.
> (How that got into linux-stable, I have no idea. The commit is 10 year
> old!)

As an aside, which commit caused the bogus-author problem? Filter-branch
generally tries to preserve or fix problems rather than barfing, exactly
because it is often used to rewrite-out crap. I wonder if there is
something it could be doing better (though again, I think in your case
you are hitting the commit only because of an incomplete cut with your
grafts).

-Peff


* Re: Odd problems trying to build an orphaned branch
  2015-11-06  0:18 ` Jeff King
@ 2015-11-06  0:20   ` Jeff King
  2015-11-06  6:24     ` [PATCH] filter-branch: skip index read/write when possible Jeff King
  2015-11-06 18:32   ` Odd problems trying to build an orphaned branch alan
From: Jeff King @ 2015-11-06  0:20 UTC
  To: alan; +Cc: git

On Thu, Nov 05, 2015 at 07:18:32PM -0500, Jeff King wrote:

> Of course that leaves only the problem that filter-branch is
> horrendously slow (for the kernel, most of the time goes to populating
> the index for each commit; I think filter-branch could probably learn to
> skip this step if there is no index or tree filter at work).

Here's a totally untested patch that seems to make a filter-branch like
this on the kernel orders of magnitude faster:

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 27c9c54..9df5185 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -306,6 +306,13 @@ then
 	start_timestamp=$(date '+%s')
 fi
 
+if test -n "$filter_index" || test -n "$filter_tree"
+then
+	need_index=t
+else
+	need_index=
+fi
+
 while read commit parents; do
 	git_filter_branch__commit_count=$(($git_filter_branch__commit_count+1))
 
@@ -313,7 +320,10 @@ while read commit parents; do
 
 	case "$filter_subdir" in
 	"")
-		GIT_ALLOW_NULL_SHA1=1 git read-tree -i -m $commit
+		if test -n "$need_index"
+		then
+			GIT_ALLOW_NULL_SHA1=1 git read-tree -i -m $commit
+		fi
 		;;
 	*)
 		# The commit may not have the subdirectory at all
@@ -387,8 +397,15 @@ while read commit parents; do
 	} <../commit |
 		eval "$filter_msg" > ../message ||
 			die "msg filter failed: $filter_msg"
+
+	if test -n "$need_index"
+	then
+		tree=$(git write-tree)
+	else
+		tree="$commit^{tree}"
+	fi
 	workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
-		$(git write-tree) $parentstr < ../message > ../map/$commit ||
+		"$tree" $parentstr < ../message > ../map/$commit ||
 			die "could not write rewritten commit"
 done <../revs
 


* [PATCH] filter-branch: skip index read/write when possible
  2015-11-06  0:20   ` Jeff King
@ 2015-11-06  6:24     ` Jeff King
From: Jeff King @ 2015-11-06  6:24 UTC
  To: git; +Cc: alan

On Thu, Nov 05, 2015 at 07:20:48PM -0500, Jeff King wrote:

> Here's a totally untested patch that seems to make a filter-branch like
> this on the kernel orders of magnitude faster:

Testing shows that it is indeed broken. :)

If $filter_subdir is set, it handles the index read itself, but my
earlier patch did not correctly do the write.

This one should work for all cases (unless the user does something
really strange, like expect to manipulate the index inside the
--env-filter or something, but IMHO it is insane for anyone to rely on
that working).
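
As a sanity check on the idea (not part of the patch), the sketch below
compares the two ways of getting a tree for an unmodified commit: the
read-tree/write-tree round trip versus asking for "$commit^{tree}"
directly. The repository and file name are made up for the demo.

```shell
#!/bin/sh
# Show that, with no index or tree filter involved, the tree written
# from the index is the same object as the commit's own tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com
echo hello >file
git add file
git commit -q -m "add file"

commit=$(git rev-parse HEAD)

# Index round trip: load the commit into the index, then write it out.
git read-tree -i -m "$commit"
from_index=$(git write-tree)

# Shortcut: resolve the commit's tree without touching the index.
from_commit=$(git rev-parse "$commit^{tree}")

echo "from index:  $from_index"
echo "from commit: $from_commit"
```

If the two object names match, reading and rewriting the index adds
nothing when no filter changes the tree.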

-- >8 --
Subject: filter-branch: skip index read/write when possible

If the user specifies an index filter but not a tree filter,
filter-branch cleverly avoids checking out the tree
entirely. But we don't do the next level of optimization: if
you have no index or tree filter, we do not need to read the
index at all.

This can greatly speed up cases where we are only changing
the commit objects (e.g., cementing a graft into place).
Here are numbers from the newly-added perf test:

  Test                  HEAD^              HEAD
  ---------------------------------------------------------------
  7000.2: noop filter   13.81(4.95+0.83)   5.43(0.42+0.43) -60.7%

Signed-off-by: Jeff King <peff@peff.net>
---
Those numbers are from git.git. The bigger your tree, the better the
speedup. (I didn't run the perf test on the kernel, because even the span
of HEAD~100..HEAD takes tens of minutes for each trial with the old code.
With the new code it's less than 30 seconds.)

 git-filter-branch.sh          | 23 +++++++++++++++++++++--
 t/perf/p7000-filter-branch.sh | 19 +++++++++++++++++++
 2 files changed, 40 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p7000-filter-branch.sh

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 27c9c54..d61f9ba 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -306,6 +306,15 @@ then
 	start_timestamp=$(date '+%s')
 fi
 
+if test -n "$filter_index" ||
+   test -n "$filter_tree" ||
+   test -n "$filter_subdir"
+then
+	need_index=t
+else
+	need_index=
+fi
+
 while read commit parents; do
 	git_filter_branch__commit_count=$(($git_filter_branch__commit_count+1))
 
@@ -313,7 +322,10 @@ while read commit parents; do
 
 	case "$filter_subdir" in
 	"")
-		GIT_ALLOW_NULL_SHA1=1 git read-tree -i -m $commit
+		if test -n "$need_index"
+		then
+			GIT_ALLOW_NULL_SHA1=1 git read-tree -i -m $commit
+		fi
 		;;
 	*)
 		# The commit may not have the subdirectory at all
@@ -387,8 +399,15 @@ while read commit parents; do
 	} <../commit |
 		eval "$filter_msg" > ../message ||
 			die "msg filter failed: $filter_msg"
+
+	if test -n "$need_index"
+	then
+		tree=$(git write-tree)
+	else
+		tree="$commit^{tree}"
+	fi
 	workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
-		$(git write-tree) $parentstr < ../message > ../map/$commit ||
+		"$tree" $parentstr < ../message > ../map/$commit ||
 			die "could not write rewritten commit"
 done <../revs
 
diff --git a/t/perf/p7000-filter-branch.sh b/t/perf/p7000-filter-branch.sh
new file mode 100755
index 0000000..15ee5d1
--- /dev/null
+++ b/t/perf/p7000-filter-branch.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='performance of filter-branch'
+. ./perf-lib.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+test_expect_success 'mark bases for tests' '
+	git tag -f tip &&
+	git tag -f base HEAD~100
+'
+
+test_perf 'noop filter' '
+	git checkout --detach tip &&
+	git filter-branch -f base..HEAD
+'
+
+test_done
-- 
2.6.2.711.g30c79de


* Re: Odd problems trying to build an orphaned branch
  2015-11-06  0:18 ` Jeff King
  2015-11-06  0:20   ` Jeff King
@ 2015-11-06 18:32   ` alan
  2015-11-06 19:00     ` Jeff King
From: alan @ 2015-11-06 18:32 UTC
  To: Jeff King; +Cc: alan, git

> On Thu, Nov 05, 2015 at 01:16:54PM -0800, alan@clueserver.org wrote:
>
>> I created an orphan branch from 3.12-rc1. I then used git format-patch
>> to
>> generate patches from 3.12-rc1 to HEAD. (Over 7000 patches.) I use git
>> am
>> to apply them to the orphan branch. At patch 237 it fails to apply. (It
>> appears the patch is from a block of code added with a merge commit, but
>> it is somewhere in the middle of the block.)
>>
>> Are merge commits supposed to screw up git-format-patch?
>
> Yes. There is no defined format for merge patches, so git-format-patch
> cannot show them. What you're trying to do won't work.

This makes me worry about using git-format-patch. If it cannot handle
merge commits correctly, then using it to send patches to customers is
risky at best. (I work for a place that does not want to distribute the
kernel, just patches on top of the kernel. The case of having a large
number of merge commits in the tree seems to break that.)

> If your goal is to have the history at HEAD truncated at 3.12-rc1, you
> are probably better off using a graft and having "filter-branch" rewrite
> the history based on that. That will preserve merges and the general
> shape of history.

I tried using that.  The documentation on how to do it correctly is vague.
It seemed to want to take the patches before the graft point, not after.
When filter-branch hit a commit with no author, it died. (It does not
allow a rewrite of a commit that does not have an author.)

>
>> I also tried using clone with depth and --single-branch set.  It ignored
>> the depth setting and gave me the whole branch all the way back to
>> 2.6.x.
>
> Was it a local clone? Depth is ignored for those (it _should_ print a
> warning). If so, try --no-local to make it act like a "regular" clone.

I did not add any options for "local" vs "regular". What defines that?

>> I tried using graft and filter-branch. None of the descriptions are very
>> clear. None of them worked either. Filter-branch died on a commit
>> somewhere in 2.6 land that had no author. (Which is outside of the
>> commits
>> I want to keep.)
>
> I suspect you need to graft more than just the commit at v3.12-rc1. For
> example, consider this history graph:
>
>   --A--B--C--D---G--H
>            \    /
>             E--F
>
> If we imagine that H is the current HEAD, and D is our tag (v3.12-rc1),
> then making a cut between D and C will not have any effect on the side
> branch that contains E and F. Commits A and B are still reachable
> through them.
>
> You can find the complete set of boundary commits like this:
>
>   git log --boundary --format='%m %H' v3.12-rc1..HEAD
>
> and then graft them all like this:
>
>   git log --boundary --format='%m %H' v3.12-rc1..HEAD |
>     grep ^- | cut -d' ' -f2 >.git/info/grafts
>
> Then you should be able to run "git filter-branch" to rewrite the
> history based on that.
>
> I think you can probably get the same effect by running:
>
>   git filter-branch v3.12-rc1..HEAD

I will try this and see what happens.

> Of course that leaves only the problem that filter-branch is
> horrendously slow (for the kernel, most of the time goes to populating
> the index for each commit; I think filter-branch could probably learn to
> skip this step if there is no index or tree filter at work).

I have to only run this once, so I don't care. Running at all would be nice.

>> I tried creating an orphan branch and using cherry-pick
>> v3.12-rc1..linux-3.12.y. It blew up on the first merge commit it hit. I
>> tried adding in "-m 1" to try to get it to pick a parent, but then it
>> died
>> on the first commit because it was not a merge.
>
> That won't do what you want. Cherry-pick doesn't preserve merges. When
> you pick a merge and choose a mainline, it is effectively saying "treat
> that as the only interesting parent" and squashes the result down to a
> single non-merge commit.
>
> If you wanted to follow this path (starting at an orphan and moving the
> patches over), I think rebase's "--preserve-merges" would be your best
> bet. It used to have some corner cases, though, and I don't know if
> those were ever fixed. I'd say filter-branch is the most-supported way
> to do what you want.
>
>> All I want to do is take a branch from linux-stable and create a branch
>> that contains just the commits from where it was branched off of master
>> until it hits HEAD. That is it. All the scripts that I have seen that
>> claim to do just what I want break when it hits a merge or a bogus
>> author.
>> (How that got into linux-stable, I have no idea. The commit is 10 year
>> old!)
>
> As an aside, which commit caused the bogus-author problem? Filter-branch
> generally tries to preserve or fix problems rather than barfing, exactly
> because it is often used to rewrite-out crap. I wonder if there is
> something it could be doing better (though again, I think in your case
> you are hitting the commit only because of an incomplete cut with your
> grafts).

I will try and find it again. It is in the 2.6 tree from 2005.


* Re: Odd problems trying to build an orphaned branch
  2015-11-06 18:32   ` Odd problems trying to build an orphaned branch alan
@ 2015-11-06 19:00     ` Jeff King
From: Jeff King @ 2015-11-06 19:00 UTC
  To: alan; +Cc: git

On Fri, Nov 06, 2015 at 10:32:56AM -0800, alan@clueserver.org wrote:

> > Yes. There is no defined format for merge patches, so git-format-patch
> > cannot show them. What you're trying to do won't work.
> 
> This makes me worry about using git-format-patch. If it cannot handle
> merge commits correctly, then using it to send patches to customers is
> risky at best. (I work for a place that does not want to distribute the
> kernel, just patches on top of the kernel. The case of having a large
> number of merge commits in the tree seems to break that.)

If you do not know if the history contains merges and are blindly using
format-patch, you are right to be worried. It will not work well for
your case.
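
One way to find out before sending anything is to count the merges in
the range. A minimal sketch on a throwaway repository follows; the
branch and tag names are made up, and on a real tree you would
substitute your own range for base..HEAD.

```shell
#!/bin/sh
# Count how many commits in a range are merges, i.e. commits that
# format-patch has no way to represent as a patch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com
main=$(git symbolic-ref --short HEAD)

git commit -q --allow-empty -m base
git tag base
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q "$main"
git commit -q --allow-empty -m "mainline work"
git merge -q --no-ff -m "merge topic" topic

merges=$(git rev-list --merges --count base..HEAD)
total=$(git rev-list --count base..HEAD)
echo "$merges of $total commits in base..HEAD are merges"
```

A nonzero merge count is the signal that a plain patch series will not
reproduce the history in that range.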

> > Was it a local clone? Depth is ignored for those (it _should_ print a
> > warning). If so, try --no-local to make it act like a "regular" clone.
> 
> I did not add any options for "local" vs "regular". What defines that?

If the clone is on the local filesystem (i.e., the source is a regular
path, not a URL or an ssh endpoint), git will optimize some of the
transfer. For example, it will hardlink objects, which makes computing a
shallow clone more expensive than simply providing all of the objects.

But it should warn in this case.

For example:

  $ git clone --depth=1 /home/peff/compile/linux clone-of-linux
  Cloning into 'clone-of-linux'...
  warning: --depth is ignored in local clones; use file:// instead.
  done.

You can disable these local optimizations by using a "file://" URL
instead of just a filename, or by using the "--no-local" flag.
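
A minimal sketch of the difference, using a throwaway five-commit
repository (the paths and commit messages are made up) and a depth
bigger than 1:

```shell
#!/bin/sh
# Clone the same repository twice with --depth=2: once by plain path
# (local clone, depth ignored) and once through a file:// URL.
set -e
work=$(mktemp -d)
src="$work/src"
git init -q "$src"
git -C "$src" config user.name demo
git -C "$src" config user.email demo@example.com
for i in 1 2 3 4 5
do
	git -C "$src" commit -q --allow-empty -m "commit $i"
done

# Plain path: the warning about --depth goes to stderr; hide it here.
git clone -q --depth=2 "$src" "$work/by-path" 2>/dev/null
# file:// URL: uses the regular transport, so --depth is honored.
git clone -q --depth=2 "file://$src" "$work/by-url"

by_path=$(git -C "$work/by-path" rev-list --count HEAD)
by_url=$(git -C "$work/by-url" rev-list --count HEAD)
echo "plain path: $by_path commits, file:// URL: $by_url commits"
```

The plain-path clone should carry all five commits, while the file://
clone should stop at two.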

> > Of course that leaves only the problem that filter-branch is
> > horrendously slow (for the kernel, most of the time goes to populating
> > the index for each commit; I think filter-branch could probably learn to
> > skip this step if there is no index or tree filter at work).
> 
> I have to only run this once, so I don't care. Running at all would be nice.

It may be sufficiently slow on the kernel that it will not count as
"running at all".  :)

The patch I posted earlier seemed to make it workable.

Yet another option (because you wanted more, right?) is to pipe
git-fast-export into git-fast-import. Something like:

  git fast-export \
    --no-data \
    --refspec refs/heads/master:refs/heads/filtered \
    v3.12-rc1..master |
  git fast-import

I don't know if there are any gotchas there, though.

-Peff
