git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP
@ 2008-08-21 20:57 Shawn O. Pearce
  2008-08-21 20:57 ` [JGIT PATCH 2/4] Support post order tree handling during TreeWalk Shawn O. Pearce
  2008-08-22 20:45 ` [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Robin Rosenberg
  0 siblings, 2 replies; 6+ messages in thread
From: Shawn O. Pearce @ 2008-08-21 20:57 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

We already did this simplification work to AbstractTreeIterator's
pathCompare method, and this is based upon that same structure.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 .../spearce/jgit/treewalk/WorkingTreeIterator.java |   44 ++-----------------
 1 files changed, 5 insertions(+), 39 deletions(-)

diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/WorkingTreeIterator.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/WorkingTreeIterator.java
index c6664f5..6fce150 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/WorkingTreeIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/WorkingTreeIterator.java
@@ -297,45 +297,11 @@ public int compare(final Entry o1, final Entry o2) {
 					return cmp;
 			}
 
-			if (cPos < aLen) {
-				final int aj = a[cPos] & 0xff;
-				final int lastb = lastPathChar(o2);
-				if (aj < lastb)
-					return -1;
-				else if (aj > lastb)
-					return 1;
-				else if (cPos == aLen - 1)
-					return 0;
-				else
-					return -1;
-			}
-
-			if (cPos < bLen) {
-				final int bk = b[cPos] & 0xff;
-				final int lasta = lastPathChar(o1);
-				if (lasta < bk)
-					return -1;
-				else if (lasta > bk)
-					return 1;
-				else if (cPos == bLen - 1)
-					return 0;
-				else
-					return 1;
-			}
-
-			final int lasta = lastPathChar(o1);
-			final int lastb = lastPathChar(o2);
-			if (lasta < lastb)
-				return -1;
-			else if (lasta > lastb)
-				return 1;
-
-			if (aLen == bLen)
-				return 0;
-			else if (aLen < bLen)
-				return -1;
-			else
-				return 1;
+			if (cPos < aLen)
+				return (a[cPos] & 0xff) - lastPathChar(o2);
+			if (cPos < bLen)
+				return lastPathChar(o1) - (b[cPos] & 0xff);
+			return lastPathChar(o1) - lastPathChar(o2);
 		}
 	};
 
-- 
1.6.0.112.g9c75

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [JGIT PATCH 2/4] Support post order tree handling during TreeWalk
  2008-08-21 20:57 [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Shawn O. Pearce
@ 2008-08-21 20:57 ` Shawn O. Pearce
  2008-08-21 20:57   ` [JGIT PATCH 3/4] Micro-optimize TreeWalk's exitSubtree implementation Shawn O. Pearce
  2008-08-22 20:45 ` [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Robin Rosenberg
  1 sibling, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2008-08-21 20:57 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

Some (rare) applications need to handle a tree after they have dealt
with the tree's children.  An example of such an application is one
implementing rmdir, where the directory should be cleaned up after
all files/subdirectories were already removed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 .../jgit/treewalk/PostOrderTreeWalkTest.java       |  189 ++++++++++++++++++++
 .../src/org/spearce/jgit/treewalk/TreeWalk.java    |   53 ++++++-
 2 files changed, 241 insertions(+), 1 deletions(-)
 create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/treewalk/PostOrderTreeWalkTest.java

diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/treewalk/PostOrderTreeWalkTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/treewalk/PostOrderTreeWalkTest.java
new file mode 100644
index 0000000..145a56e
--- /dev/null
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/treewalk/PostOrderTreeWalkTest.java
@@ -0,0 +1,189 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ *   copyright notice, this list of conditions and the following
+ *   disclaimer in the documentation and/or other materials provided
+ *   with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ *   names of its contributors may be used to endorse or promote
+ *   products derived from this software without specific prior
+ *   written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.treewalk;
+
+import java.io.ByteArrayInputStream;
+
+import org.spearce.jgit.dircache.DirCache;
+import org.spearce.jgit.dircache.DirCacheBuilder;
+import org.spearce.jgit.dircache.DirCacheEntry;
+import org.spearce.jgit.dircache.DirCacheIterator;
+import org.spearce.jgit.lib.Constants;
+import org.spearce.jgit.lib.FileMode;
+import org.spearce.jgit.lib.ObjectWriter;
+import org.spearce.jgit.lib.RepositoryTestCase;
+
+import static org.spearce.jgit.lib.FileMode.REGULAR_FILE;
+import static org.spearce.jgit.lib.FileMode.TREE;
+
+public class PostOrderTreeWalkTest extends RepositoryTestCase {
+	public void testInitialize_NoPostOrder() throws Exception {
+		final TreeWalk tw = new TreeWalk(db);
+		assertFalse(tw.isPostOrderTraversal());
+	}
+
+	public void testInitialize_TogglePostOrder() throws Exception {
+		final TreeWalk tw = new TreeWalk(db);
+		assertFalse(tw.isPostOrderTraversal());
+		tw.setPostOrderTraversal(true);
+		assertTrue(tw.isPostOrderTraversal());
+		tw.setPostOrderTraversal(false);
+		assertFalse(tw.isPostOrderTraversal());
+	}
+
+	public void testResetDoesNotAffectPostOrder() throws Exception {
+		final TreeWalk tw = new TreeWalk(db);
+		tw.setPostOrderTraversal(true);
+		assertTrue(tw.isPostOrderTraversal());
+		tw.reset();
+		assertTrue(tw.isPostOrderTraversal());
+
+		tw.setPostOrderTraversal(false);
+		assertFalse(tw.isPostOrderTraversal());
+		tw.reset();
+		assertFalse(tw.isPostOrderTraversal());
+	}
+
+	public void testNoPostOrder() throws Exception {
+		final DirCache tree = DirCache.read(db);
+		{
+			final DirCacheBuilder b = tree.builder();
+
+			b.add(makeFile("a"));
+			b.add(makeFile("b/c"));
+			b.add(makeFile("b/d"));
+			b.add(makeFile("q"));
+
+			b.finish();
+			assertEquals(4, tree.getEntryCount());
+		}
+
+		final TreeWalk tw = new TreeWalk(db);
+		tw.reset();
+		tw.setPostOrderTraversal(false);
+		tw.addTree(new DirCacheIterator(tree));
+
+		assertModes("a", REGULAR_FILE, tw);
+		assertModes("b", TREE, tw);
+		assertTrue(tw.isSubtree());
+		assertFalse(tw.isPostChildren());
+		tw.enterSubtree();
+		assertModes("b/c", REGULAR_FILE, tw);
+		assertModes("b/d", REGULAR_FILE, tw);
+		assertModes("q", REGULAR_FILE, tw);
+	}
+
+	public void testWithPostOrder_EnterSubtree() throws Exception {
+		final DirCache tree = DirCache.read(db);
+		{
+			final DirCacheBuilder b = tree.builder();
+
+			b.add(makeFile("a"));
+			b.add(makeFile("b/c"));
+			b.add(makeFile("b/d"));
+			b.add(makeFile("q"));
+
+			b.finish();
+			assertEquals(4, tree.getEntryCount());
+		}
+
+		final TreeWalk tw = new TreeWalk(db);
+		tw.reset();
+		tw.setPostOrderTraversal(true);
+		tw.addTree(new DirCacheIterator(tree));
+
+		assertModes("a", REGULAR_FILE, tw);
+
+		assertModes("b", TREE, tw);
+		assertTrue(tw.isSubtree());
+		assertFalse(tw.isPostChildren());
+		tw.enterSubtree();
+		assertModes("b/c", REGULAR_FILE, tw);
+		assertModes("b/d", REGULAR_FILE, tw);
+
+		assertModes("b", TREE, tw);
+		assertTrue(tw.isSubtree());
+		assertTrue(tw.isPostChildren());
+
+		assertModes("q", REGULAR_FILE, tw);
+	}
+
+	public void testWithPostOrder_NoEnterSubtree() throws Exception {
+		final DirCache tree = DirCache.read(db);
+		{
+			final DirCacheBuilder b = tree.builder();
+
+			b.add(makeFile("a"));
+			b.add(makeFile("b/c"));
+			b.add(makeFile("b/d"));
+			b.add(makeFile("q"));
+
+			b.finish();
+			assertEquals(4, tree.getEntryCount());
+		}
+
+		final TreeWalk tw = new TreeWalk(db);
+		tw.reset();
+		tw.setPostOrderTraversal(true);
+		tw.addTree(new DirCacheIterator(tree));
+
+		assertModes("a", REGULAR_FILE, tw);
+
+		assertModes("b", TREE, tw);
+		assertTrue(tw.isSubtree());
+		assertFalse(tw.isPostChildren());
+
+		assertModes("q", REGULAR_FILE, tw);
+	}
+
+	private DirCacheEntry makeFile(final String path) throws Exception {
+		final byte[] pathBytes = Constants.encode(path);
+		final DirCacheEntry ent = new DirCacheEntry(path);
+		ent.setFileMode(REGULAR_FILE);
+		ent.setObjectId(new ObjectWriter(db).computeBlobSha1(pathBytes.length,
+				new ByteArrayInputStream(pathBytes)));
+		return ent;
+	}
+
+	private static void assertModes(final String path, final FileMode mode0,
+			final TreeWalk tw) throws Exception {
+		assertTrue("has " + path, tw.next());
+		assertEquals(path, tw.getPathString());
+		assertEquals(mode0, tw.getFileMode(0));
+	}
+}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index 7a09878..6d0ef02 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -147,10 +147,14 @@ public static TreeWalk forPath(final Repository db, final String path,
 
 	private boolean recursive;
 
+	private boolean postOrderTraversal;
+
 	private int depth;
 
 	private boolean advance;
 
+	private boolean postChildren;
+
 	AbstractTreeIterator currentHead;
 
 	/**
@@ -235,6 +239,36 @@ public void setRecursive(final boolean b) {
 		recursive = b;
 	}
 
+	/**
+	 * Does this walker return a tree entry after it exits the subtree?
+	 * <p>
+	 * If post order traversal is enabled then the walker will return a subtree
+	 * after it has returned the last entry within that subtree. This may cause
+	 * a subtree to be seen by the application twice if {@link #isRecursive()}
+	 * is false, as the application will see it once, call
+	 * {@link #enterSubtree()}, and then see it again as it leaves the subtree.
+	 * <p>
+	 * If an application does not enable {@link #isRecursive()} and it does not
+	 * call {@link #enterSubtree()} then the tree is returned only once as none
+	 * of the children were processed.
+	 * 
+	 * @return true if subtrees are returned after entries within the subtree.
+	 */
+	public boolean isPostOrderTraversal() {
+		return postOrderTraversal;
+	}
+
+	/**
+	 * Set the walker to return trees after their children.
+	 * 
+	 * @param b
+	 *            true to get trees after their children.
+	 * @see #isPostOrderTraversal()
+	 */
+	public void setPostOrderTraversal(final boolean b) {
+		postOrderTraversal = b;
+	}
+
 	/** Reset this walker so new tree iterators can be added to it. */
 	public void reset() {
 		trees = new AbstractTreeIterator[0];
@@ -379,6 +413,7 @@ public boolean next() throws MissingObjectException,
 		try {
 			if (advance) {
 				advance = false;
+				postChildren = false;
 				popEntriesEqual();
 			}
 
@@ -387,6 +422,12 @@ public boolean next() throws MissingObjectException,
 				if (t.eof()) {
 					if (depth > 0) {
 						exitSubtree();
+						if (postOrderTraversal) {
+							advance = true;
+							postChildren = true;
+							return true;
+						}
+						popEntriesEqual();
 						continue;
 					}
 					return false;
@@ -587,6 +628,17 @@ public boolean isSubtree() {
 	}
 
 	/**
+	 * Is the current entry a subtree returned after its children?
+	 * 
+	 * @return true if the current node is a tree that has been returned after
+	 *         its children were already processed.
+	 * @see #isPostOrderTraversal()
+	 */
+	public boolean isPostChildren() {
+		return postChildren && isSubtree();
+	}
+
+	/**
 	 * Enter into the current subtree.
 	 * <p>
 	 * If the current entry is a subtree this method arranges for its children
@@ -675,7 +727,6 @@ private void exitSubtree() throws CorruptObjectException {
 		for (int i = 0; i < trees.length; i++)
 			trees[i] = trees[i].parent;
 		currentHead = min();
-		popEntriesEqual();
 	}
 
 	private CanonicalTreeParser parserFor(final ObjectId id)
-- 
1.6.0.112.g9c75

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [JGIT PATCH 3/4] Micro-optimize TreeWalk's exitSubtree implementation
  2008-08-21 20:57 ` [JGIT PATCH 2/4] Support post order tree handling during TreeWalk Shawn O. Pearce
@ 2008-08-21 20:57   ` Shawn O. Pearce
  2008-08-21 20:57     ` [JGIT PATCH 4/4] Add getNameString to TreeWalk to get only the last path component Shawn O. Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2008-08-21 20:57 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

Rather than recomputing the min over again we can take the hint that
the prior min (the one that describes the tree we just left) must be
one that matches itself.  There may be more than one such case, as a
min could be found and match itself and later another min is found.
So we fall back into a pathCompare if we identify more than one.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 .../src/org/spearce/jgit/treewalk/TreeWalk.java    |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index 6d0ef02..ef27e4e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -722,11 +722,19 @@ void skipEntriesEqual() throws CorruptObjectException {
 		}
 	}
 
-	private void exitSubtree() throws CorruptObjectException {
+	private void exitSubtree() {
 		depth--;
 		for (int i = 0; i < trees.length; i++)
 			trees[i] = trees[i].parent;
-		currentHead = min();
+
+		AbstractTreeIterator minRef = null;
+		for (final AbstractTreeIterator t : trees) {
+			if (t.matches != t)
+				continue;
+			if (minRef == null || t.pathCompare(minRef) < 0)
+				minRef = t;
+		}
+		currentHead = minRef;
 	}
 
 	private CanonicalTreeParser parserFor(final ObjectId id)
-- 
1.6.0.112.g9c75

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [JGIT PATCH 4/4] Add getNameString to TreeWalk to get only the last path component
  2008-08-21 20:57   ` [JGIT PATCH 3/4] Micro-optimize TreeWalk's exitSubtree implementation Shawn O. Pearce
@ 2008-08-21 20:57     ` Shawn O. Pearce
  0 siblings, 0 replies; 6+ messages in thread
From: Shawn O. Pearce @ 2008-08-21 20:57 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

Some applications may need to see only the name of the current
entry within its parent tree and not require the full path from
the root of the repository.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 .../src/org/spearce/jgit/treewalk/TreeWalk.java    |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index ef27e4e..53b94d2 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -548,6 +548,24 @@ public boolean idEqual(final int nthA, final int nthB) {
 	}
 
 	/**
+	 * Get the current entry's name within its parent tree.
+	 * <p>
+	 * This method is not very efficient and is primarily meant for debugging
+	 * and final output generation. Applications should try to avoid calling it,
+	 * and if invoked do so only once per interesting entry, where the name is
+	 * absolutely required for correct function.
+	 * 
+	 * @return name of the current entry within the parent tree (or directory).
+	 *         The name never includes a '/'.
+	 */
+	public String getNameString() {
+		final AbstractTreeIterator t = currentHead;
+		final int off = t.pathOffset;
+		final int end = t.pathLen;
+		return RawParseUtils.decode(Constants.CHARSET, t.path, off, end);
+	}
+
+	/**
 	 * Get the current entry's complete path.
 	 * <p>
 	 * This method is not very efficient and is primarily meant for debugging
-- 
1.6.0.112.g9c75

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP
  2008-08-21 20:57 [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Shawn O. Pearce
  2008-08-21 20:57 ` [JGIT PATCH 2/4] Support post order tree handling during TreeWalk Shawn O. Pearce
@ 2008-08-22 20:45 ` Robin Rosenberg
  2008-08-22 23:07   ` Shawn O. Pearce
  1 sibling, 1 reply; 6+ messages in thread
From: Robin Rosenberg @ 2008-08-22 20:45 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

torsdagen den 21 augusti 2008 22.57.35 skrev Shawn O. Pearce:
> We already did this simplification work to AbstractTreeIterator's
> pathCompare method, and this is based upon that same structure.

I noticed we have very bad code coverage in unit tests in this area of the code. 
Maybe we should increase the coverage before optimizing. I'll accept the patches
anyway and pray they are ok.

-- robin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP
  2008-08-22 20:45 ` [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Robin Rosenberg
@ 2008-08-22 23:07   ` Shawn O. Pearce
  0 siblings, 0 replies; 6+ messages in thread
From: Shawn O. Pearce @ 2008-08-22 23:07 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

Robin Rosenberg <robin.rosenberg.lists@dewire.com> wrote:
> torsdagen den 21 augusti 2008 22.57.35 skrev Shawn O. Pearce:
> > We already did this simplification work to AbstractTreeIterator's
> > pathCompare method, and this is based upon that same structure.
> 
> I noticed we have very bad code coverage in unit tests in this area of the code. 
> Maybe we should increase the coverage before optimizing. I'll accept the patches
> anyway and pray they are ok.

Yea, I didn't go into here just to optimize the code.  But I looked
at it and said "WTF was I smoking when I wrote that?  Must have
been some good quality stuff!".

I was in there studying the method because I wanted to call it
from someplace else, and then realized how crappy it was.  So as I
simplified it, I noticed we didn't have to do as many compares to
come up with the same result.

I'm working on improving test coverage.  Slowly.  But I am trying
to spend at least a few hours a week on it.  I know we are really,
really bad here.  It will get better.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-08-22 23:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-21 20:57 [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Shawn O. Pearce
2008-08-21 20:57 ` [JGIT PATCH 2/4] Support post order tree handling during TreeWalk Shawn O. Pearce
2008-08-21 20:57   ` [JGIT PATCH 3/4] Micro-optimize TreeWalk's exitSubtree implementation Shawn O. Pearce
2008-08-21 20:57     ` [JGIT PATCH 4/4] Add getNameString to TreeWalk to get only the last path component Shawn O. Pearce
2008-08-22 20:45 ` [JGIT PATCH 1/4] Simplify and micro-optimize WorkingTreeIterator.ENTRY_CMP Robin Rosenberg
2008-08-22 23:07   ` Shawn O. Pearce

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).