* [JGIT PATCH 0/6] RawParseUtil improvements
@ 2008-12-10 22:05 Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 1/6] Simplify RawParseUtils.nextLF invocations Shawn O. Pearce
0 siblings, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
I'm working on patch parsing support for JGit, so I can rely on
it in Gerrit 2. Currently Gerrit 1 uses a bastard chunk of code
I ripped out in an hour to scan the output of "git diff"; it's not
suitable for long-term application.
This series improves RawParseUtils code clarity, and then adds
support for the C-style quoting rules used by git diff when file
names contain "special" characters like LF.
I'm still working on patch parsing itself, but I wanted to send this
preliminary series out so you aren't drowning in code to review.
Shawn O. Pearce (6):
Simplify RawParseUtils.nextLF invocations
Simplify RawParseUtils next and nextLF loops
Correct Javadoc of RawParseUtils next and nextLF methods
Add QuotedString class to handle C-style quoting rules
Add Bourne style quoting for TransportGitSsh
Add ~user friendly Bourne style quoting for TransportGitSsh
.../jgit/util/QuotedStringBourneStyleTest.java | 111 ++++++
.../util/QuotedStringBourneUserPathStyleTest.java | 130 +++++++
.../spearce/jgit/util/QuotedStringC_StyleTest.java | 144 ++++++++
.../src/org/spearce/jgit/lib/ObjectChecker.java | 4 +-
.../src/org/spearce/jgit/revwalk/RevTag.java | 2 +-
.../spearce/jgit/transport/TransportGitSsh.java | 38 +--
.../src/org/spearce/jgit/util/QuotedString.java | 364 ++++++++++++++++++++
.../src/org/spearce/jgit/util/RawParseUtils.java | 45 ++-
8 files changed, 783 insertions(+), 55 deletions(-)
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneStyleTest.java
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneUserPathStyleTest.java
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringC_StyleTest.java
create mode 100644 org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
* [JGIT PATCH 1/6] Simplify RawParseUtils.nextLF invocations
2008-12-10 22:05 [JGIT PATCH 0/6] RawParseUtil improvements Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 2/6] Simplify RawParseUtils next and nextLF loops Shawn O. Pearce
0 siblings, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Most of the time when we call next('\n') or nextLF('\n') we really
meant to just say nextLF(), which is logically identical to next()
but could be micro-optimized for the LF byte.
This refactoring shifts the calls to use the new nextLF wrapper for
next('\n'), so we can later choose to make this optimization, or to
leave the code as-is. But either way the call sites are now much
clearer to read.
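
As an illustration of how the call sites read once the wrapper exists, here
is a minimal, self-contained sketch. The class name NextLfSketch and the
helper countHeaderLines are hypothetical and exist only for this example;
next and nextLF mirror the shapes shown in the diff below but are not the
real RawParseUtils code.

```java
final class NextLfSketch {
    /** Return the position just after the first chrA at or after ptr. */
    static int next(final byte[] b, int ptr, final char chrA) {
        final int sz = b.length;
        while (ptr < sz) {
            if (b[ptr] == chrA)
                return ptr + 1;
            else
                ptr++;
        }
        return ptr; // chrA not found: end of buffer
    }

    /** Thin wrapper so call sites do not repeat the '\n' argument. */
    static int nextLF(final byte[] b, final int ptr) {
        return next(b, ptr, '\n');
    }

    /** Hypothetical caller: count header lines before the first blank line. */
    static int countHeaderLines(final byte[] b) {
        int ptr = 0;
        int n = 0;
        while (ptr < b.length && b[ptr] != '\n') {
            ptr = nextLF(b, ptr); // skip to the start of the next line
            n++;
        }
        return n;
    }
}
```

The loop in countHeaderLines has the same shape as the commitMessage and
endOfParagraph loops touched by this patch; only the wrapper call differs
from the old next(b, ptr, '\n') spelling.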
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/ObjectChecker.java | 4 +-
.../src/org/spearce/jgit/revwalk/RevTag.java | 2 +-
.../src/org/spearce/jgit/util/RawParseUtils.java | 25 ++++++++++++++++----
3 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectChecker.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectChecker.java
index b303d6f..75e3c77 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectChecker.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectChecker.java
@@ -205,11 +205,11 @@ public void checkTag(final byte[] raw) throws CorruptObjectException {
if ((ptr = match(raw, ptr, type)) < 0)
throw new CorruptObjectException("no type header");
- ptr = nextLF(raw, ptr, '\n');
+ ptr = nextLF(raw, ptr);
if ((ptr = match(raw, ptr, tag)) < 0)
throw new CorruptObjectException("no tag header");
- ptr = nextLF(raw, ptr, '\n');
+ ptr = nextLF(raw, ptr);
if ((ptr = match(raw, ptr, tagger)) < 0)
throw new CorruptObjectException("no tagger header");
diff --git a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevTag.java b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevTag.java
index bbb18ee..77a55cd 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevTag.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevTag.java
@@ -90,7 +90,7 @@ void parseCanonical(final RevWalk walk, final byte[] rawTag)
object = walk.lookupAny(walk.idBuffer, oType);
int p = pos.value += 4; // "tag "
- final int nameEnd = RawParseUtils.next(rawTag, p, '\n') - 1;
+ final int nameEnd = RawParseUtils.nextLF(rawTag, p) - 1;
name = RawParseUtils.decode(Constants.CHARSET, rawTag, p, nameEnd);
buffer = rawTag;
flags |= PARSED;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
index 4b96439..10c2239 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
@@ -230,6 +230,21 @@ public static final int next(final byte[] b, int ptr, final char chrA) {
}
/**
+ * Locate the first position after the next LF.
+ * <p>
+ * This method stops on the first '\n' it finds.
+ *
+ * @param b
+ * buffer to scan.
+ * @param ptr
+ * position within buffer to start looking for LF at.
+ * @return new position just after the first LF found.
+ */
+ public static final int nextLF(final byte[] b, int ptr) {
+ return next(b, ptr, '\n');
+ }
+
+ /**
* Locate the first position after either the given character or LF.
* <p>
* This method stops on the first match it finds from either chrA or '\n'.
@@ -296,7 +311,7 @@ public static final int committer(final byte[] b, int ptr) {
while (ptr < sz && b[ptr] == 'p')
ptr += 48; // skip this parent.
if (ptr < sz && b[ptr] == 'a')
- ptr = next(b, ptr, '\n');
+ ptr = nextLF(b, ptr);
return match(b, ptr, committer);
}
@@ -320,7 +335,7 @@ public static final int encoding(final byte[] b, int ptr) {
return -1;
if (b[ptr] == 'e')
break;
- ptr = next(b, ptr, '\n');
+ ptr = nextLF(b, ptr);
}
return match(b, ptr, encoding);
}
@@ -342,7 +357,7 @@ public static Charset parseEncoding(final byte[] b) {
final int enc = encoding(b, 0);
if (enc < 0)
return Constants.CHARSET;
- final int lf = next(b, enc, '\n');
+ final int lf = nextLF(b, enc);
return Charset.forName(decode(Constants.CHARSET, b, enc, lf - 1));
}
@@ -505,7 +520,7 @@ public static final int commitMessage(final byte[] b, int ptr) {
// header line type is.
//
while (ptr < sz && b[ptr] != '\n')
- ptr = next(b, ptr, '\n');
+ ptr = nextLF(b, ptr);
if (ptr < sz && b[ptr] == '\n')
return ptr + 1;
return -1;
@@ -529,7 +544,7 @@ public static final int endOfParagraph(final byte[] b, final int start) {
int ptr = start;
final int sz = b.length;
while (ptr < sz && b[ptr] != '\n')
- ptr = next(b, ptr, '\n');
+ ptr = nextLF(b, ptr);
while (0 < ptr && start < ptr && b[ptr - 1] == '\n')
ptr--;
return ptr;
--
1.6.1.rc2.299.gead4c
* [JGIT PATCH 2/6] Simplify RawParseUtils next and nextLF loops
2008-12-10 22:05 ` [JGIT PATCH 1/6] Simplify RawParseUtils.nextLF invocations Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 3/6] Correct Javadoc of RawParseUtils next and nextLF methods Shawn O. Pearce
0 siblings, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
We always need ptr + 1 after we read the current position,
so we might as well do the much more common foo[ptr++]. A
good JIT would be more likely to optimize this case over
the weird else branch we had.
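
To make the claim concrete that the two loop shapes compute the same result,
here is a small sketch that places them side by side. The class
NextLoopSketch and the names nextOld, nextNew and agree are hypothetical and
exist only for this comparison; they are not part of RawParseUtils.

```java
final class NextLoopSketch {
    /** Pre-patch shape: test the byte, then either return or advance. */
    static int nextOld(final byte[] b, int ptr, final char chrA) {
        final int sz = b.length;
        while (ptr < sz) {
            if (b[ptr] == chrA)
                return ptr + 1;
            else
                ptr++;
        }
        return ptr;
    }

    /** Post-patch shape: read and advance in a single step. */
    static int nextNew(final byte[] b, int ptr, final char chrA) {
        final int sz = b.length;
        while (ptr < sz) {
            if (b[ptr++] == chrA)
                return ptr; // ptr already points past the match
        }
        return ptr;
    }

    /** Check that both shapes agree for every start position in b. */
    static boolean agree(final byte[] b, final char chrA) {
        for (int start = 0; start <= b.length; start++) {
            if (nextOld(b, start, chrA) != nextNew(b, start, chrA))
                return false;
        }
        return true;
    }
}
```

Whether a given JIT actually compiles the second form better is not something
this sketch can show; it only demonstrates that the behavior is unchanged.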
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/util/RawParseUtils.java | 12 ++++--------
1 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
index 10c2239..5a40911 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
@@ -221,10 +221,8 @@ public static final int parseTimeZoneOffset(final byte[] b, int ptr) {
public static final int next(final byte[] b, int ptr, final char chrA) {
final int sz = b.length;
while (ptr < sz) {
- if (b[ptr] == chrA)
- return ptr + 1;
- else
- ptr++;
+ if (b[ptr++] == chrA)
+ return ptr;
}
return ptr;
}
@@ -260,11 +258,9 @@ public static final int nextLF(final byte[] b, int ptr) {
public static final int nextLF(final byte[] b, int ptr, final char chrA) {
final int sz = b.length;
while (ptr < sz) {
- final byte c = b[ptr];
+ final byte c = b[ptr++];
if (c == chrA || c == '\n')
- return ptr + 1;
- else
- ptr++;
+ return ptr;
}
return ptr;
}
--
1.6.1.rc2.299.gead4c
* [JGIT PATCH 3/6] Correct Javadoc of RawParseUtils next and nextLF methods
2008-12-10 22:05 ` [JGIT PATCH 2/6] Simplify RawParseUtils next and nextLF loops Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Shawn O. Pearce
0 siblings, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/util/RawParseUtils.java | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
index 5a40911..8896d38 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/RawParseUtils.java
@@ -213,10 +213,10 @@ public static final int parseTimeZoneOffset(final byte[] b, int ptr) {
* @param b
* buffer to scan.
* @param ptr
- * position within buffer to start looking for LF at.
+ * position within buffer to start looking for chrA at.
* @param chrA
* character to find.
- * @return new position just after chr.
+ * @return new position just after chrA.
*/
public static final int next(final byte[] b, int ptr, final char chrA) {
final int sz = b.length;
@@ -250,10 +250,10 @@ public static final int nextLF(final byte[] b, int ptr) {
* @param b
* buffer to scan.
* @param ptr
- * position within buffer to start looking for LF at.
+ * position within buffer to start looking for chrA or LF at.
* @param chrA
* character to find.
- * @return new position just after the first chrA or chrB to be found.
+ * @return new position just after the first chrA or LF to be found.
*/
public static final int nextLF(final byte[] b, int ptr, final char chrA) {
final int sz = b.length;
--
1.6.1.rc2.299.gead4c
* [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules
2008-12-10 22:05 ` [JGIT PATCH 3/6] Correct Javadoc of RawParseUtils next and nextLF methods Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 5/6] Add Bourne style quoting for TransportGitSsh Shawn O. Pearce
2008-12-10 23:22 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Robin Rosenberg
0 siblings, 2 replies; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Git patch files can contain file names which are quoted using the
C language quoting rules. In order to correctly create or parse
these files we must implement a quoting style that matches those
specific rules.
QuotedString itself is an abstract API so callers can be passed a
quoting style based on the context of where their output will be
used, and multiple styles could be supported. This may be useful
if jgit ever grows a "git for-each-ref" style of output where Perl,
Python, Tcl and Bourne style quoting might be necessary.
References through the singleton QuotedString.C should be able to
bypass the virtual function table, as the specific type is mentioned
in the field declaration and that type is final. A good JIT should
be able to remove the abstraction costs when the caller has hardcoded
the quoting style.
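
The design described above can be sketched roughly as follows. This is a
cut-down illustration, not the class added by the patch: QuotingSketch,
CStyle and describe are hypothetical names, only a few named escapes are
handled, and dequote is omitted. It assumes the same safe-character set and
3-digit octal fallback as the real C_Style in the diff below.

```java
import java.nio.charset.StandardCharsets;

abstract class QuotingSketch {
    /** Field is declared with the final concrete type, not the base type. */
    static final CStyle C = new CStyle();

    abstract String quote(String in);

    static final class CStyle extends QuotingSketch {
        private CStyle() {
            // Singleton
        }

        @Override
        String quote(final String in) {
            if (in.isEmpty())
                return "\"\"";
            final byte[] raw = in.getBytes(StandardCharsets.UTF_8);
            final StringBuilder r = new StringBuilder(2 + raw.length);
            boolean reuse = true;
            r.append('"');
            for (final byte b : raw) {
                final int c = b & 0xff;
                if (isSafe(c)) {
                    r.append((char) c);
                    continue;
                }
                reuse = false;
                r.append('\\');
                switch (c) {
                case '\n': r.append('n'); break;
                case '\t': r.append('t'); break;
                case '\\': r.append('\\'); break;
                case '"': r.append('"'); break;
                default: // everything else becomes a 3-digit octal escape
                    r.append((char) ('0' + ((c >> 6) & 3)));
                    r.append((char) ('0' + ((c >> 3) & 7)));
                    r.append((char) ('0' + (c & 7)));
                }
            }
            if (reuse)
                return in; // nothing needed quoting: same reference back
            return r.append('"').toString();
        }

        private static boolean isSafe(final int c) {
            return ('0' <= c && c <= '9') || ('a' <= c && c <= 'z')
                    || ('A' <= c && c <= 'Z') || " +,-./=_^".indexOf(c) >= 0;
        }
    }

    /** A caller that is handed a style based on where its output goes. */
    static String describe(final QuotingSketch style, final String path) {
        return "rename to " + style.quote(path);
    }
}
```

A caller that hardcodes QuotingSketch.C.quote(path) names a final class in the
field type, which is the situation the message expects a JIT to devirtualize;
describe() shows the other case, where the style is chosen by the caller.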
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../spearce/jgit/util/QuotedStringC_StyleTest.java | 144 +++++++++++
.../src/org/spearce/jgit/util/QuotedString.java | 270 ++++++++++++++++++++
2 files changed, 414 insertions(+), 0 deletions(-)
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringC_StyleTest.java
create mode 100644 org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringC_StyleTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringC_StyleTest.java
new file mode 100644
index 0000000..493869c
--- /dev/null
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringC_StyleTest.java
@@ -0,0 +1,144 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import static org.spearce.jgit.util.QuotedString.C;
+import junit.framework.TestCase;
+
+import org.spearce.jgit.lib.Constants;
+
+public class QuotedStringC_StyleTest extends TestCase {
+ private static void assertQuote(final String in, final String exp) {
+ final String r = C.quote(in);
+ assertNotSame(in, r);
+ assertFalse(in.equals(r));
+ assertEquals('"' + exp + '"', r);
+ }
+
+ private static void assertDequote(final String exp, final String in) {
+ final byte[] b = Constants.encode('"' + in + '"');
+ final String r = C.dequote(b, 0, b.length);
+ assertEquals(exp, r);
+ }
+
+ public void testQuote_Empty() {
+ assertEquals("\"\"", C.quote(""));
+ }
+
+ public void testDequote_Empty1() {
+ assertEquals("", C.dequote(new byte[0], 0, 0));
+ }
+
+ public void testDequote_Empty2() {
+ assertEquals("", C.dequote(new byte[] { '"', '"' }, 0, 2));
+ }
+
+ public void testDequote_SoleDq() {
+ assertEquals("\"", C.dequote(new byte[] { '"' }, 0, 1));
+ }
+
+ public void testQuote_BareA() {
+ final String in = "a";
+ assertSame(in, C.quote(in));
+ }
+
+ public void testDequote_BareA() {
+ final String in = "a";
+ final byte[] b = Constants.encode(in);
+ assertEquals(in, C.dequote(b, 0, b.length));
+ }
+
+ public void testDequote_BareABCZ_OnlyBC() {
+ final String in = "abcz";
+ final byte[] b = Constants.encode(in);
+ final int p = in.indexOf('b');
+ assertEquals("bc", C.dequote(b, p, p + 2));
+ }
+
+ public void testDequote_LoneBackslash() {
+ assertDequote("\\", "\\");
+ }
+
+ public void testQuote_NamedEscapes() {
+ assertQuote("\u0007", "\\a");
+ assertQuote("\b", "\\b");
+ assertQuote("\f", "\\f");
+ assertQuote("\n", "\\n");
+ assertQuote("\r", "\\r");
+ assertQuote("\t", "\\t");
+ assertQuote("\u000B", "\\v");
+ assertQuote("\\", "\\\\");
+ assertQuote("\"", "\\\"");
+ }
+
+ public void testDequote_NamedEscapes() {
+ assertDequote("\u0007", "\\a");
+ assertDequote("\b", "\\b");
+ assertDequote("\f", "\\f");
+ assertDequote("\n", "\\n");
+ assertDequote("\r", "\\r");
+ assertDequote("\t", "\\t");
+ assertDequote("\u000B", "\\v");
+ assertDequote("\\", "\\\\");
+ assertDequote("\"", "\\\"");
+ }
+
+ public void testDequote_OctalAll() {
+ for (int i = 0; i < 256; i++) {
+ String s = Integer.toOctalString(i);
+ while (s.length() < 3) {
+ s = "0" + s;
+ }
+ assertDequote("" + (char) i, "\\" + s);
+ }
+ }
+
+ public void testQuote_OctalAll() {
+ assertQuote("\1", "\\001");
+ assertQuote("~", "\\176");
+ assertQuote("\u00ff", "\\303\\277"); // \u00ff in UTF-8
+ }
+
+ public void testDequote_UnknownEscapeQ() {
+ assertDequote("\\q", "\\q");
+ }
+
+ public void testDequote_FooTabBar() {
+ assertDequote("foo\tbar", "foo\\tbar");
+ }
+}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
new file mode 100644
index 0000000..4aaa8ff
--- /dev/null
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
@@ -0,0 +1,270 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import static org.spearce.jgit.util.RawParseUtils.decode;
+
+import java.util.Arrays;
+
+import org.spearce.jgit.lib.Constants;
+
+/** Utility functions related to quoted string handling. */
+public abstract class QuotedString {
+ /** Quoting style that obeys the rules of the C programming language. */
+ public static final C_Style C = new C_Style();
+
+ /**
+ * Quote an input string by the quoting rules.
+ * <p>
+ * If the input string does not require any quoting, the same String
+ * reference is returned to the caller.
+ * <p>
+ * Otherwise a quoted string is returned, including the opening and closing
+ * quotation marks at the start and end of the string. If the style does not
+ * permit raw Unicode characters then the string will first be encoded in
+ * UTF-8, with unprintable sequences possibly escaped by the rules.
+ *
+ * @param in
+ * any non-null Unicode string.
+ * @return a quoted string. See above for details.
+ */
+ public abstract String quote(String in);
+
+ /**
+ * Clean a previously quoted input, decoding the result via UTF-8.
+ * <p>
+ * This method must match quote such that:
+ *
+ * <pre>
+ * a.equals(dequote(quote(a)));
+ * </pre>
+ *
+ * is true for any <code>a</code>.
+ *
+ * @param in
+ * a Unicode string to remove quoting from.
+ * @return the cleaned string.
+ * @see #dequote(byte[], int, int)
+ */
+ public String dequote(final String in) {
+ final byte[] b = Constants.encode(in);
+ return dequote(b, 0, b.length);
+ }
+
+ /**
+ * Decode a previously quoted input, scanning a UTF-8 encoded buffer.
+ * <p>
+ * This method must match quote such that:
+ *
+ * <pre>
+ * a.equals(dequote(Constants.encode(quote(a))));
+ * </pre>
+ *
+ * is true for any <code>a</code>.
+ * <p>
+ * This method removes any opening/closing quotation marks added by
+ * {@link #quote(String)}.
+ *
+ * @param in
+ * the input buffer to parse.
+ * @param offset
+ * first position within <code>in</code> to scan.
+ * @param end
+ * one position past in <code>in</code> to scan.
+ * @return the cleaned string.
+ */
+ public abstract String dequote(byte[] in, int offset, int end);
+
+ /** Quoting style that obeys the rules of the C programming language. */
+ public static final class C_Style extends QuotedString {
+ private static final byte[] quote;
+ static {
+ quote = new byte[128];
+ Arrays.fill(quote, (byte) -1);
+
+ for (int i = '0'; i <= '9'; i++)
+ quote[i] = 0;
+ for (int i = 'a'; i <= 'z'; i++)
+ quote[i] = 0;
+ for (int i = 'A'; i <= 'Z'; i++)
+ quote[i] = 0;
+ quote[' '] = 0;
+ quote['+'] = 0;
+ quote[','] = 0;
+ quote['-'] = 0;
+ quote['.'] = 0;
+ quote['/'] = 0;
+ quote['='] = 0;
+ quote['_'] = 0;
+ quote['^'] = 0;
+
+ quote['\u0007'] = 'a';
+ quote['\b'] = 'b';
+ quote['\f'] = 'f';
+ quote['\n'] = 'n';
+ quote['\r'] = 'r';
+ quote['\t'] = 't';
+ quote['\u000B'] = 'v';
+ quote['\\'] = '\\';
+ quote['"'] = '"';
+ }
+
+ @Override
+ public String quote(final String instr) {
+ if (instr.length() == 0)
+ return "\"\"";
+ boolean reuse = true;
+ final byte[] in = Constants.encode(instr);
+ final StringBuilder r = new StringBuilder(2 + in.length);
+ r.append('"');
+ for (int i = 0; i < in.length; i++) {
+ final int c = in[i] & 0xff;
+ if (c < quote.length) {
+ final byte style = quote[c];
+ if (style == 0) {
+ r.append((char) c);
+ continue;
+ }
+ if (style > 0) {
+ reuse = false;
+ r.append('\\');
+ r.append((char) style);
+ continue;
+ }
+ }
+
+ reuse = false;
+ r.append('\\');
+ r.append((char) (((c >> 6) & 03) + '0'));
+ r.append((char) (((c >> 3) & 07) + '0'));
+ r.append((char) (((c >> 0) & 07) + '0'));
+ }
+ if (reuse)
+ return instr;
+ r.append('"');
+ return r.toString();
+ }
+
+ @Override
+ public String dequote(final byte[] in, final int inPtr, final int inEnd) {
+ if (2 <= inEnd - inPtr && in[inPtr] == '"' && in[inEnd - 1] == '"')
+ return dq(in, inPtr + 1, inEnd - 1);
+ return decode(Constants.CHARSET, in, inPtr, inEnd);
+ }
+
+ private static String dq(final byte[] in, int inPtr, final int inEnd) {
+ final byte[] r = new byte[inEnd - inPtr];
+ int rPtr = 0;
+ while (inPtr < inEnd) {
+ final byte b = in[inPtr++];
+ if (b != '\\') {
+ r[rPtr++] = b;
+ continue;
+ }
+
+ if (inPtr == inEnd) {
+ // Lone trailing backslash. Treat it as a literal.
+ //
+ r[rPtr++] = '\\';
+ break;
+ }
+
+ switch (in[inPtr++]) {
+ case 'a':
+ r[rPtr++] = 0x07 /* \a = BEL */;
+ continue;
+ case 'b':
+ r[rPtr++] = '\b';
+ continue;
+ case 'f':
+ r[rPtr++] = '\f';
+ continue;
+ case 'n':
+ r[rPtr++] = '\n';
+ continue;
+ case 'r':
+ r[rPtr++] = '\r';
+ continue;
+ case 't':
+ r[rPtr++] = '\t';
+ continue;
+ case 'v':
+ r[rPtr++] = 0x0B/* \v = VT */;
+ continue;
+
+ case '\\':
+ case '"':
+ r[rPtr++] = in[inPtr - 1];
+ continue;
+
+ case '0':
+ case '1':
+ case '2':
+ case '3': {
+ int cp = in[inPtr - 1] - '0';
+ while (inPtr < inEnd) {
+ final byte c = in[inPtr];
+ if ('0' <= c && c <= '7') {
+ cp <<= 3;
+ cp |= c - '0';
+ inPtr++;
+ } else {
+ break;
+ }
+ }
+ r[rPtr++] = (byte) cp;
+ continue;
+ }
+
+ default:
+ // Any other code is taken literally.
+ //
+ r[rPtr++] = '\\';
+ r[rPtr++] = in[inPtr - 1];
+ continue;
+ }
+ }
+
+ return decode(Constants.CHARSET, r, 0, rPtr);
+ }
+
+ private C_Style() {
+ // Singleton
+ }
+ }
+}
--
1.6.1.rc2.299.gead4c
* [JGIT PATCH 5/6] Add Bourne style quoting for TransportGitSsh
2008-12-10 22:05 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 6/6] Add ~user friendly " Shawn O. Pearce
2008-12-10 23:22 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Robin Rosenberg
1 sibling, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Now that we have a nice QuotedString abstraction we can port our
string quoting logic from being private within the SSH transport
code to being available in the rest of the library.
Currently we only support the super-restrictive quoting style used
for the repository path name argument over SSH. We don't support the
"minimal" style used to invoke the command name, nor do we support
the ~user/ style format, which cannot be quoted.
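
For readers unfamiliar with the restrictive style, here is a minimal sketch
of the rule: the whole value is wrapped in single quotes, and each ' or ! is
emitted by closing the quote, writing a backslash-escaped copy, and reopening
the quote. BourneQuoteSketch is a hypothetical class for illustration; unlike
the BourneStyle class in the diff below, its dequote works on a String rather
than a UTF-8 byte buffer.

```java
final class BourneQuoteSketch {
    /** Always quotes; ' and ! are stepped outside the single quotes. */
    static String quote(final String in) {
        final StringBuilder r = new StringBuilder(in.length() + 2);
        r.append('\'');
        for (int i = 0; i < in.length(); i++) {
            final char c = in.charAt(i);
            if (c == '\'' || c == '!')
                r.append('\'').append('\\').append(c).append('\'');
            else
                r.append(c);
        }
        return r.append('\'').toString();
    }

    /** Inverse of quote: drop the quotes, honor backslash outside them. */
    static String dequote(final String in) {
        final StringBuilder r = new StringBuilder(in.length());
        boolean inquote = false;
        int i = 0;
        while (i < in.length()) {
            final char c = in.charAt(i++);
            if (c == '\'')
                inquote = !inquote;
            else if (c == '\\' && !inquote && i < in.length())
                r.append(in.charAt(i++)); // escaped char outside quotes
            else
                r.append(c); // includes a backslash inside quotes
        }
        return r.toString();
    }
}
```

Because quoting is unconditional, a leading ~user/ would be protected from
shell expansion as well, which is why that form is not handled by this style.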
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../jgit/util/QuotedStringBourneStyleTest.java | 111 ++++++++++++++++++++
.../spearce/jgit/transport/TransportGitSsh.java | 13 +--
.../src/org/spearce/jgit/util/QuotedString.java | 66 ++++++++++++
3 files changed, 179 insertions(+), 11 deletions(-)
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneStyleTest.java
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneStyleTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneStyleTest.java
new file mode 100644
index 0000000..86d46fe
--- /dev/null
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneStyleTest.java
@@ -0,0 +1,111 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import static org.spearce.jgit.util.QuotedString.BOURNE;
+import junit.framework.TestCase;
+
+import org.spearce.jgit.lib.Constants;
+
+public class QuotedStringBourneStyleTest extends TestCase {
+ private static void assertQuote(final String in, final String exp) {
+ final String r = BOURNE.quote(in);
+ assertNotSame(in, r);
+ assertFalse(in.equals(r));
+ assertEquals('\'' + exp + '\'', r);
+ }
+
+ private static void assertDequote(final String exp, final String in) {
+ final byte[] b = Constants.encode('\'' + in + '\'');
+ final String r = BOURNE.dequote(b, 0, b.length);
+ assertEquals(exp, r);
+ }
+
+ public void testQuote_Empty() {
+ assertEquals("''", BOURNE.quote(""));
+ }
+
+ public void testDequote_Empty1() {
+ assertEquals("", BOURNE.dequote(new byte[0], 0, 0));
+ }
+
+ public void testDequote_Empty2() {
+ assertEquals("", BOURNE.dequote(new byte[] { '\'', '\'' }, 0, 2));
+ }
+
+ public void testDequote_SoleSq() {
+ assertEquals("", BOURNE.dequote(new byte[] { '\'' }, 0, 1));
+ }
+
+ public void testQuote_BareA() {
+ assertQuote("a", "a");
+ }
+
+ public void testDequote_BareA() {
+ final String in = "a";
+ final byte[] b = Constants.encode(in);
+ assertEquals(in, BOURNE.dequote(b, 0, b.length));
+ }
+
+ public void testDequote_BareABCZ_OnlyBC() {
+ final String in = "abcz";
+ final byte[] b = Constants.encode(in);
+ final int p = in.indexOf('b');
+ assertEquals("bc", BOURNE.dequote(b, p, p + 2));
+ }
+
+ public void testDequote_LoneBackslash() {
+ assertDequote("\\", "\\");
+ }
+
+ public void testQuote_NamedEscapes() {
+ assertQuote("'", "'\\''");
+ assertQuote("!", "'\\!'");
+
+ assertQuote("a'b", "a'\\''b");
+ assertQuote("a!b", "a'\\!'b");
+ }
+
+ public void testDequote_NamedEscapes() {
+ assertDequote("'", "'\\''");
+ assertDequote("!", "'\\!'");
+
+ assertDequote("a'b", "a'\\''b");
+ assertDequote("a!b", "a'\\!'b");
+ }
+}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
index 3f2cd37..e3f5ae8 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
@@ -47,6 +47,7 @@
import org.spearce.jgit.errors.NoRemoteRepositoryException;
import org.spearce.jgit.errors.TransportException;
import org.spearce.jgit.lib.Repository;
+import org.spearce.jgit.util.QuotedString;
import com.jcraft.jsch.ChannelExec;
import com.jcraft.jsch.JSchException;
@@ -154,17 +155,7 @@ private static void sq(final StringBuilder cmd, final String val) {
return;
}
- cmd.append('\'');
- for (; i < val.length(); i++) {
- final char c = val.charAt(i);
- if (c == '\'')
- cmd.append("'\\''");
- else if (c == '!')
- cmd.append("'\\!'");
- else
- cmd.append(c);
- }
- cmd.append('\'');
+ cmd.append(QuotedString.BOURNE.quote(val.substring(i)));
}
private void initSession() throws TransportException {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
index 4aaa8ff..1089e9e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
@@ -49,6 +49,15 @@
public static final C_Style C = new C_Style();
/**
+ * Quoting style used by the Bourne shell.
+ * <p>
+ * Quotes are unconditionally inserted during {@link #quote(String)}. This
+ * protects shell meta-characters like <code>$</code> or <code>~</code> from
+ * being recognized as special.
+ */
+ public static final BourneStyle BOURNE = new BourneStyle();
+
+ /**
* Quote an input string by the quoting rules.
* <p>
* If the input string does not require any quoting, the same String
@@ -110,6 +119,63 @@ public String dequote(final String in) {
*/
public abstract String dequote(byte[] in, int offset, int end);
+ /**
+ * Quoting style used by the Bourne shell.
+ * <p>
+ * Quotes are unconditionally inserted during {@link #quote(String)}. This
+ * protects shell meta-characters like <code>$</code> or <code>~</code> from
+ * being recognized as special.
+ */
+ public static class BourneStyle extends QuotedString {
+ @Override
+ public String quote(final String in) {
+ final StringBuilder r = new StringBuilder();
+ r.append('\'');
+ int start = 0, i = 0;
+ for (; i < in.length(); i++) {
+ switch (in.charAt(i)) {
+ case '\'':
+ case '!':
+ r.append(in, start, i);
+ r.append('\'');
+ r.append('\\');
+ r.append(in.charAt(i));
+ r.append('\'');
+ start = i + 1;
+ break;
+ }
+ }
+ r.append(in, start, i);
+ r.append('\'');
+ return r.toString();
+ }
+
+ @Override
+ public String dequote(final byte[] in, int ip, final int ie) {
+ boolean inquote = false;
+ final byte[] r = new byte[ie - ip];
+ int rPtr = 0;
+ while (ip < ie) {
+ final byte b = in[ip++];
+ switch (b) {
+ case '\'':
+ inquote = !inquote;
+ continue;
+ case '\\':
+ if (inquote || ip == ie)
+ r[rPtr++] = b; // literal within a quote
+ else
+ r[rPtr++] = in[ip++];
+ continue;
+ default:
+ r[rPtr++] = b;
+ continue;
+ }
+ }
+ return decode(Constants.CHARSET, r, 0, rPtr);
+ }
+ }
+
/** Quoting style that obeys the rules of the C programming language. */
public static final class C_Style extends QuotedString {
private static final byte[] quote;
--
1.6.1.rc2.299.gead4c
* [JGIT PATCH 6/6] Add ~user friendly Bourne style quoting for TransportGitSsh
2008-12-10 22:05 ` [JGIT PATCH 5/6] Add Bourne style quoting for TransportGitSsh Shawn O. Pearce
@ 2008-12-10 22:05 ` Shawn O. Pearce
0 siblings, 0 replies; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 22:05 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
This mostly completes the migration of our quoting rules from the
SSH transport to our QuotedString pattern. User names may be left
alone for the shell to expand when the string is evaluated, if the
caller wants that sort of behavior in a particular context.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../util/QuotedStringBourneUserPathStyleTest.java | 130 ++++++++++++++++++++
.../spearce/jgit/transport/TransportGitSsh.java | 27 +----
.../src/org/spearce/jgit/util/QuotedString.java | 28 ++++
3 files changed, 160 insertions(+), 25 deletions(-)
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneUserPathStyleTest.java
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneUserPathStyleTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneUserPathStyleTest.java
new file mode 100644
index 0000000..36fb52a
--- /dev/null
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringBourneUserPathStyleTest.java
@@ -0,0 +1,130 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import static org.spearce.jgit.util.QuotedString.BOURNE_USER_PATH;
+import junit.framework.TestCase;
+
+import org.spearce.jgit.lib.Constants;
+
+public class QuotedStringBourneUserPathStyleTest extends TestCase {
+ private static void assertQuote(final String in, final String exp) {
+ final String r = BOURNE_USER_PATH.quote(in);
+ assertNotSame(in, r);
+ assertFalse(in.equals(r));
+ assertEquals('\'' + exp + '\'', r);
+ }
+
+ private static void assertDequote(final String exp, final String in) {
+ final byte[] b = Constants.encode('\'' + in + '\'');
+ final String r = BOURNE_USER_PATH.dequote(b, 0, b.length);
+ assertEquals(exp, r);
+ }
+
+ public void testQuote_Empty() {
+ assertEquals("''", BOURNE_USER_PATH.quote(""));
+ }
+
+ public void testDequote_Empty1() {
+ assertEquals("", BOURNE_USER_PATH.dequote(new byte[0], 0, 0));
+ }
+
+ public void testDequote_Empty2() {
+ assertEquals("", BOURNE_USER_PATH.dequote(new byte[] { '\'', '\'' }, 0,
+ 2));
+ }
+
+ public void testDequote_SoleSq() {
+ assertEquals("", BOURNE_USER_PATH.dequote(new byte[] { '\'' }, 0, 1));
+ }
+
+ public void testQuote_BareA() {
+ assertQuote("a", "a");
+ }
+
+ public void testDequote_BareA() {
+ final String in = "a";
+ final byte[] b = Constants.encode(in);
+ assertEquals(in, BOURNE_USER_PATH.dequote(b, 0, b.length));
+ }
+
+ public void testDequote_BareABCZ_OnlyBC() {
+ final String in = "abcz";
+ final byte[] b = Constants.encode(in);
+ final int p = in.indexOf('b');
+ assertEquals("bc", BOURNE_USER_PATH.dequote(b, p, p + 2));
+ }
+
+ public void testDequote_LoneBackslash() {
+ assertDequote("\\", "\\");
+ }
+
+ public void testQuote_NamedEscapes() {
+ assertQuote("'", "'\\''");
+ assertQuote("!", "'\\!'");
+
+ assertQuote("a'b", "a'\\''b");
+ assertQuote("a!b", "a'\\!'b");
+ }
+
+ public void testDequote_NamedEscapes() {
+ assertDequote("'", "'\\''");
+ assertDequote("!", "'\\!'");
+
+ assertDequote("a'b", "a'\\''b");
+ assertDequote("a!b", "a'\\!'b");
+ }
+
+ public void testQuote_User() {
+ assertEquals("~foo/", BOURNE_USER_PATH.quote("~foo"));
+ assertEquals("~foo/", BOURNE_USER_PATH.quote("~foo/"));
+ assertEquals("~/", BOURNE_USER_PATH.quote("~/"));
+
+ assertEquals("~foo/'a'", BOURNE_USER_PATH.quote("~foo/a"));
+ assertEquals("~/'a'", BOURNE_USER_PATH.quote("~/a"));
+ }
+
+ public void testDequote_User() {
+ assertEquals("~foo", BOURNE_USER_PATH.dequote("~foo"));
+ assertEquals("~foo/", BOURNE_USER_PATH.dequote("~foo/"));
+ assertEquals("~/", BOURNE_USER_PATH.dequote("~/"));
+
+ assertEquals("~foo/a", BOURNE_USER_PATH.dequote("~foo/'a'"));
+ assertEquals("~/a", BOURNE_USER_PATH.dequote("~/'a'"));
+ }
+}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
index e3f5ae8..d4bf466 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportGitSsh.java
@@ -131,31 +131,8 @@ private static void sqAlways(final StringBuilder cmd, final String val) {
}
private static void sq(final StringBuilder cmd, final String val) {
- int i = 0;
-
- if (val.length() == 0)
- return;
- if (val.matches("^~[A-Za-z0-9_-]+$")) {
- // If the string is just "~user" we can assume they
- // mean "~user/" and evaluate it within the shell.
- //
- cmd.append(val);
- cmd.append('/');
- return;
- }
-
- if (val.matches("^~[A-Za-z0-9_-]*/.*$")) {
- // If the string is of "~/path" or "~user/path"
- // we must not escape ~/ or ~user/ from the shell
- // as we need that portion to be evaluated.
- //
- i = val.indexOf('/') + 1;
- cmd.append(val.substring(0, i));
- if (i == val.length())
- return;
- }
-
- cmd.append(QuotedString.BOURNE.quote(val.substring(i)));
+ if (val.length() > 0)
+ cmd.append(QuotedString.BOURNE.quote(val));
}
private void initSession() throws TransportException {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
index 1089e9e..d2f71b4 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
@@ -57,6 +57,9 @@
*/
public static final BourneStyle BOURNE = new BourneStyle();
+ /** Bourne style, but permits <code>~user</code> at the start of the string. */
+ public static final BourneUserPathStyle BOURNE_USER_PATH = new BourneUserPathStyle();
+
/**
* Quote an input string by the quoting rules.
* <p>
@@ -176,6 +179,31 @@ public String dequote(final byte[] in, int ip, final int ie) {
}
}
+ /** Bourne style, but permits <code>~user</code> at the start of the string. */
+ public static class BourneUserPathStyle extends BourneStyle {
+ @Override
+ public String quote(final String in) {
+ if (in.matches("^~[A-Za-z0-9_-]+$")) {
+ // If the string is just "~user" we can assume they
+ // mean "~user/".
+ //
+ return in + "/";
+ }
+
+ if (in.matches("^~[A-Za-z0-9_-]*/.*$")) {
+ // If the string is of "~/path" or "~user/path"
+ // we must not escape ~/ or ~user/ from the shell.
+ //
+ final int i = in.indexOf('/') + 1;
+ if (i == in.length())
+ return in;
+ return in.substring(0, i) + super.quote(in.substring(i));
+ }
+
+ return super.quote(in);
+ }
+ }
+
/** Quoting style that obeys the rules of the C programming language. */
public static final class C_Style extends QuotedString {
private static final byte[] quote;
--
1.6.1.rc2.299.gead4c
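
For readers following the two Bourne patches, the quoting rule they describe can be sketched outside of JGit as shown below. This is only a minimal illustration under stated assumptions: the class `BourneQuoteSketch` and its method names are hypothetical and are not the code in the patch, although they aim to follow the same rules that the tests above exercise (unconditional single quotes, `'` and `!` closed out and backslash-escaped, and a leading `~user/` left bare for the shell to expand).

```java
final class BourneQuoteSketch {
    // Hypothetical helper: always wrap the value in single quotes.
    // A ' or ! ends the quoted run, is emitted backslash-escaped,
    // and then the quoted run is reopened.
    static String quote(String in) {
        StringBuilder r = new StringBuilder(in.length() + 2);
        r.append('\'');
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\'' || c == '!')
                r.append('\'').append('\\').append(c).append('\'');
            else
                r.append(c);
        }
        return r.append('\'').toString();
    }

    // Hypothetical helper: keep a leading ~user/ or ~/ unquoted so the
    // remote shell can still expand it, and quote only the remainder.
    static String quoteUserPath(String in) {
        if (in.matches("^~[A-Za-z0-9_-]+$"))
            return in + "/"; // bare "~user" is treated as "~user/"
        if (in.matches("^~[A-Za-z0-9_-]*/.*$")) {
            int i = in.indexOf('/') + 1;
            if (i == in.length())
                return in; // nothing after the slash to quote
            return in.substring(0, i) + quote(in.substring(i));
        }
        return quote(in);
    }

    public static void main(String[] args) {
        System.out.println(quote("a'b"));
        System.out.println(quoteUserPath("~foo/a"));
    }
}
```

The split into two methods mirrors the idea in the series that the caller picks the style that fits the context: plain quoting when `~` must stay literal, the user-path variant when shell expansion of the home directory is wanted.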
* Re: [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules
2008-12-10 22:05 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 5/6] Add Bourne style quoting for TransportGitSsh Shawn O. Pearce
@ 2008-12-10 23:22 ` Robin Rosenberg
2008-12-10 23:41 ` Shawn O. Pearce
1 sibling, 1 reply; 11+ messages in thread
From: Robin Rosenberg @ 2008-12-10 23:22 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
onsdag 10 december 2008 23:05:49 skrev Shawn O. Pearce:
> Git patch files can contain file names which are quoted using the
> C language quoting rules. In order to correctly create or parse
Should we maybe call this Git-style since we really do not care
about C (which version btw?).
> QuotedString itself is an abstract API so callers can be passed a
> quoting style based on the context of where their output will be
> used, and multiple styles could be supported. This may be useful
> if jgit ever grows a "git for-each-ref" style of output where Perl,
> Python, Tcl and Bourne style quoting might be necessary.
>
> References through the singleton QuotedString.C should be able to
> bypass the virtual function table, as the specific type is mentioned
> in the field declaration and that type is final. A good JIT should
> be able to remove the abstraction costs when the caller has hardcoded
> the quoting style.
Making two interfaces is better. We may share the implementation initially,
but parsing file names in Git patches and parsing C strings are different
operations.
> + public void testQuote_OctalAll() {
> + assertQuote("\1", "\\001");
> + assertQuote("~", "\\176");
> + assertQuote("\u00ff", "\\303\\277"); // \u00ff in UTF-8
> + }
What do we do with non-UTF8 names? I think we should
follow the logic we use when parsing commits and paths
in other places.
> +
> + public void testDequote_UnknownEscapeQ() {
> + assertDequote("\\q", "\\q");
> + }
Would Git generate this style in a name?
> + quote[' '] = 0;
> + quote['+'] = 0;
> + quote[','] = 0;
> + quote['-'] = 0;
> + quote['.'] = 0;
> + quote['/'] = 0;
> + quote['='] = 0;
> + quote['_'] = 0;
> + quote['^'] = 0;
> +
> + quote['\u0007'] = 'a';
> + quote['\b'] = 'b';
\e = esc
> + default:
> + // Any other code is taken literally.
> + //
> + r[rPtr++] = '\\';
> + r[rPtr++] = in[inPtr - 1];
> + continue;
> + }
> + }
> +
> + return decode(Constants.CHARSET, r, 0, rPtr);
Importing methods really obscures things. Please qualify with the class name
RawParseUtils here instead.
-- robin
* Re: [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules
2008-12-10 23:22 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Robin Rosenberg
@ 2008-12-10 23:41 ` Shawn O. Pearce
2008-12-11 0:33 ` Robin Rosenberg
0 siblings, 1 reply; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-10 23:41 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Robin Rosenberg <robin.rosenberg@dewire.com> wrote:
> onsdag 10 december 2008 23:05:49 skrev Shawn O. Pearce:
> > Git patch files can contain file names which are quoted using the
> > C language quoting rules. In order to correctly create or parse
>
> Should we maybe call this Git-style since we really do not care
> about C (which version btw?).
Yea, I think you are right. I'll change the name.
> Making two interfaces is better. We may share the implementation initially,
> but parsing file names in Git patches and parsing C strings are different
> operations.
Well, if we ever supported other C-style string names we could just
swap the implementation reference. I'll try to make it clearer that
this format, although it looks like a C-style string, is really
meant for Git path names within patches.
> > + public void testQuote_OctalAll() {
> > + assertQuote("\1", "\\001");
> > + assertQuote("~", "\\176");
> > + assertQuote("\u00ff", "\\303\\277"); // \u00ff in UTF-8
> > + }
>
> What do we do with non-UTF8 names? I think we should
> follow the logic we use when parsing commits and paths
> in other places.
Then we're totally f'd.
Git has no specific encoding on file names. If we get a standard
Java Unicode string and are asked to quote it, characters with
code points above 127 need to be escaped as octal escape codes
according to the Git style. Further, the Git style only permits
octal escapes that result in a value <= 255, aka an unsigned char.
The name needs to be encoded into an 8-bit encoding, and UTF-8 is
the only encoding that will represent every valid Unicode character.
Elsewhere we sort of take the attitude that when writing data *out*
we produce UTF-8, even if we read in ISO-whatever. Here I'm doing
the same thing.
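
To make that rule concrete, here is a minimal sketch of it, not the patch code itself. It assumes Java 7 or later for `StandardCharsets`, and the names `OctalQuoteSketch` and `escapeHighBytes` are hypothetical. It only covers the high-byte case discussed here: the name is encoded to UTF-8 and every byte above 127 is written as a three-digit octal escape.

```java
import java.nio.charset.StandardCharsets;

final class OctalQuoteSketch {
    // Hypothetical helper: encode to UTF-8, then write each byte with
    // the high bit set as \ooo. Bytes below 0x80 are copied as-is;
    // the real style also escapes other characters, omitted here.
    static String escapeHighBytes(String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        StringBuilder r = new StringBuilder(utf8.length);
        for (byte b : utf8) {
            int c = b & 0xff; // treat the byte as an unsigned char
            if (c < 0x80) {
                r.append((char) c);
            } else {
                r.append('\\');
                r.append((char) ('0' + ((c >> 6) & 03)));
                r.append((char) ('0' + ((c >> 3) & 07)));
                r.append((char) ('0' + (c & 07)));
            }
        }
        return r.toString();
    }

    public static void main(String[] args) {
        // U+00C5 is C3 85 in UTF-8, U+00F6 is C3 B6.
        System.out.println(escapeHighBytes("\u00c5ngstr\u00f6m"));
        // prints \303\205ngstr\303\266m
    }
}
```

Because each escape describes exactly one byte, a single character outside ASCII always turns into two or more escapes on output.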
> > + public void testDequote_UnknownEscapeQ() {
> > + assertDequote("\\q", "\\q");
> > + }
>
> Would Git generate this style in a name?
No, but a mangled patch might have it. Rather than throw an
exception I try to parse the string as faithfully as I can, so we
can continue on and mine the stream for more information. We may
be able to work around the breakage (there's a lot of redundant
data about path names in the git patch format).
As far as I can tell C Git handles this style of escape the same way:
$ cat F
diff --git "a/\q" "b/\q"
--- "a/\q"
+++ "b/\q"
$ git apply --stat F
"\\q\"" | 0
1 files changed, 0 insertions(+), 0 deletions(-)
Actually, I think there's a bug: it's reading the closing quote as
part of the file name. But anyway, my point is that C git handles
an unknown escape sequence as though it were a literal.
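
The lenient behaviour described above can be sketched as follows. This is an illustration under assumptions rather than the QuotedString code: `UnknownEscapeSketch` and `dequoteLenient` are hypothetical names, the sketch works on a String instead of a byte buffer, and it recognises only a small subset of the named escapes. The point it shows is the default branch, where an unrecognised escape keeps both the backslash and the following character instead of raising an error.

```java
final class UnknownEscapeSketch {
    // Hypothetical helper: decode a few escapes, pass unknown ones through.
    static String dequoteLenient(String in) {
        StringBuilder r = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c != '\\' || i + 1 == in.length()) {
                // Ordinary character, or a lone trailing backslash.
                r.append(c);
                continue;
            }
            char e = in.charAt(++i);
            switch (e) {
            case 'n':
                r.append('\n');
                break;
            case 't':
                r.append('\t');
                break;
            case '\\':
            case '"':
                r.append(e);
                break;
            default:
                // Unknown escape such as \q: keep it literally so the
                // caller can continue scanning the rest of the patch.
                r.append('\\').append(e);
                break;
            }
        }
        return r.toString();
    }

    public static void main(String[] args) {
        System.out.println(dequoteLenient("a\\qb"));
    }
}
```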
> > + quote['\b'] = 'b';
> \e = esc
Oddly enough, C Git doesn't output \e. It uses \033. It doesn't
recognize \e either, so we can't produce it even if we wanted to
honor it on input. I can add it to our input table (but keep it
out of our output table), but then we'd treat \e as \033 while C
Git would treat \e as a literal. Bad.
So no, I won't add \e here.
> > + return decode(Constants.CHARSET, r, 0, rPtr);
>
> Importing methods really obscures things. Please qualify with the class name
> RawParseUtils here instead.
Hmmph. I wonder how you'll feel then about the patch parser code.
I import 4 methods from RawParseUtils because they are so commonly
invoked that I got tired of reading RawParseUtils.foo.
There are only two calls here, so I changed them to be qualified.
--
Shawn.
* Re: [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules
2008-12-10 23:41 ` Shawn O. Pearce
@ 2008-12-11 0:33 ` Robin Rosenberg
2008-12-11 0:57 ` [JGIT PATCH 4/6 v3] Add QuotedString class to handle Git path style " Shawn O. Pearce
0 siblings, 1 reply; 11+ messages in thread
From: Robin Rosenberg @ 2008-12-11 0:33 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
torsdag 11 december 2008 00:41:30 skrev Shawn O. Pearce:
> > > + public void testQuote_OctalAll() {
> > > + assertQuote("\1", "\\001");
> > > + assertQuote("~", "\\176");
> > > + assertQuote("\u00ff", "\\303\\277"); // \u00ff in UTF-8
> > > + }
> >
> > What do we do with non-UTF8 names? I think we should
> > follow the logic we use when parsing commits and paths
> > in other places.
>
> Then we're totally f'd.
>
> Git has no specific encoding on file names. If we get a standard
> Java Unicode string and are asked to quote it, characters with
> code points above 127 need to be escaped as octal escape codes
> according to the Git style. Further, the Git style only permits
> octal escapes that result in a value <= 255, aka an unsigned char.
>
> The name needs to be encoded into an 8-bit encoding, and UTF-8 is
> the only encoding that will represent every valid Unicode character.
> Elsewhere we sort of take the attitude that when writing data *out*
> we produce UTF-8, even if we read in ISO-whatever. Here I'm doing
> the same thing.
So this should pass, right?
public void testDeQuote_Latin1() {
assertDequote("\u00c5ngstr\u00f6m", "\\305ngstr\\366m"); // Latin1
}
public void testDeQuote_UTF8() {
assertDequote("\u00c5ngstr\u00f6m", "\\303\\205ngstr\\303\\266m");
}
And possibly these actual unquoted names, which can be produced when
core.quotepath is false:
public void testDeQuote_Rawlatin() {
assertDequote("\u00c5ngstr\u00f6m", "\305ngstr\366m");
}
public void testDeQuote_RawUTF8() {
assertDequote("\u00c5ngstr\u00f6m", "\303\205ngstr\303\266m");
}
You also reversed the arguments to assertQuote. I think we should follow the
"expected"-first convention here too. The case above works neither way.
Using Constants.encode in the test is kind of dangerous as it does too
many conversions, so you don't know what you're testing anymore. Changing
assertDequote like this lets us feed byte sequences as strings
to the test method (which we cannot do if we assume UTF-8 encoding).
ISO-8859-1 (Latin-1) encoding allows any byte sequence to be entered conveniently.
private static void assertDequote(final String exp, final String in) {
final byte[] b;
try {
b = ('"' + in + '"').getBytes("ISO-8859-1");
} catch (UnsupportedEncodingException e) {
throw new RuntimeException(e);
}
final String r = C.dequote(b, 0, b.length);
assertEquals(exp, r);
}
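
The reason ISO-8859-1 works for this is that it maps the chars U+0000 through U+00FF one-to-one onto the byte values 0 through 255, whereas UTF-8 expands anything above 127 into several bytes. A small sketch of the difference, assuming Java 7 or later for `StandardCharsets` (the class `Latin1BytesSketch` and the helper `rawBytes` are hypothetical names used only for illustration):

```java
import java.nio.charset.StandardCharsets;

final class Latin1BytesSketch {
    // Hypothetical test helper: each char in U+0000..U+00FF becomes
    // exactly one byte with the same numeric value.
    static byte[] rawBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\305" and "\366" are Java octal escapes for single chars.
        String in = "\305ngstr\366m";
        byte[] latin1 = rawBytes(in);
        byte[] utf8 = in.getBytes(StandardCharsets.UTF_8);
        System.out.println(latin1.length); // 8, one byte per char
        System.out.println(utf8.length);   // 10, two chars need two bytes
    }
}
```

With the Latin-1 conversion the test string is effectively a byte array written as text, so the test controls exactly which bytes reach dequote.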
-- robin
* [JGIT PATCH 4/6 v3] Add QuotedString class to handle Git path style quoting rules
2008-12-11 0:33 ` Robin Rosenberg
@ 2008-12-11 0:57 ` Shawn O. Pearce
0 siblings, 0 replies; 11+ messages in thread
From: Shawn O. Pearce @ 2008-12-11 0:57 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
Git patch files can contain file names which are quoted using
roughly the C language quoting rules. In order to correctly create
or parse these files we must implement a quoting style that matches
those specific rules.
QuotedString itself is an abstract API so callers can be passed a
quoting style based on the context of where their output will be
used, and multiple styles could be supported. This may be useful
if jgit ever grows a "git for-each-ref" style of output where Perl,
Python, Tcl and Bourne style quoting might be necessary.
References through the singleton QuotedString.GIT_PATH should be
able to bypass the virtual function table, as the specific type is
mentioned in the field declaration and that type is final. A good
JIT should be able to remove the abstraction costs when the caller
has hardcoded the quoting style.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
Robin Rosenberg <robin.rosenberg@dewire.com> wrote:
> So this should pass, right?
These tests have been added, and they pass.
> You also reversed the arguments to assertQuote. I think we should follow the
> "expected"-first convention here too.
Fixed.
> Using Constants.encode in the test is kind of dangerous as it does too
> many conversions,
Fixed.
.../jgit/util/QuotedStringGitPathStyleTest.java | 172 +++++++++++++
.../src/org/spearce/jgit/util/QuotedString.java | 268 ++++++++++++++++++++
2 files changed, 440 insertions(+), 0 deletions(-)
create mode 100644 org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringGitPathStyleTest.java
create mode 100644 org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringGitPathStyleTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringGitPathStyleTest.java
new file mode 100644
index 0000000..54fbd31
--- /dev/null
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/util/QuotedStringGitPathStyleTest.java
@@ -0,0 +1,172 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import static org.spearce.jgit.util.QuotedString.GIT_PATH;
+
+import java.io.UnsupportedEncodingException;
+
+import junit.framework.TestCase;
+
+import org.spearce.jgit.lib.Constants;
+
+public class QuotedStringGitPathStyleTest extends TestCase {
+ private static void assertQuote(final String exp, final String in) {
+ final String r = GIT_PATH.quote(in);
+ assertNotSame(in, r);
+ assertFalse(in.equals(r));
+ assertEquals('"' + exp + '"', r);
+ }
+
+ private static void assertDequote(final String exp, final String in) {
+ final byte[] b;
+ try {
+ b = ('"' + in + '"').getBytes("ISO-8859-1");
+ } catch (UnsupportedEncodingException e) {
+ throw new RuntimeException(e);
+ }
+ final String r = GIT_PATH.dequote(b, 0, b.length);
+ assertEquals(exp, r);
+ }
+
+ public void testQuote_Empty() {
+ assertEquals("\"\"", GIT_PATH.quote(""));
+ }
+
+ public void testDequote_Empty1() {
+ assertEquals("", GIT_PATH.dequote(new byte[0], 0, 0));
+ }
+
+ public void testDequote_Empty2() {
+ assertEquals("", GIT_PATH.dequote(new byte[] { '"', '"' }, 0, 2));
+ }
+
+ public void testDequote_SoleDq() {
+ assertEquals("\"", GIT_PATH.dequote(new byte[] { '"' }, 0, 1));
+ }
+
+ public void testQuote_BareA() {
+ final String in = "a";
+ assertSame(in, GIT_PATH.quote(in));
+ }
+
+ public void testDequote_BareA() {
+ final String in = "a";
+ final byte[] b = Constants.encode(in);
+ assertEquals(in, GIT_PATH.dequote(b, 0, b.length));
+ }
+
+ public void testDequote_BareABCZ_OnlyBC() {
+ final String in = "abcz";
+ final byte[] b = Constants.encode(in);
+ final int p = in.indexOf('b');
+ assertEquals("bc", GIT_PATH.dequote(b, p, p + 2));
+ }
+
+ public void testDequote_LoneBackslash() {
+ assertDequote("\\", "\\");
+ }
+
+ public void testQuote_NamedEscapes() {
+ assertQuote("\\a", "\u0007");
+ assertQuote("\\b", "\b");
+ assertQuote("\\f", "\f");
+ assertQuote("\\n", "\n");
+ assertQuote("\\r", "\r");
+ assertQuote("\\t", "\t");
+ assertQuote("\\v", "\u000B");
+ assertQuote("\\\\", "\\");
+ assertQuote("\\\"", "\"");
+ }
+
+ public void testDequote_NamedEscapes() {
+ assertDequote("\u0007", "\\a");
+ assertDequote("\b", "\\b");
+ assertDequote("\f", "\\f");
+ assertDequote("\n", "\\n");
+ assertDequote("\r", "\\r");
+ assertDequote("\t", "\\t");
+ assertDequote("\u000B", "\\v");
+ assertDequote("\\", "\\\\");
+ assertDequote("\"", "\\\"");
+ }
+
+ public void testDequote_OctalAll() {
+ for (int i = 0; i < 256; i++) {
+ String s = Integer.toOctalString(i);
+ while (s.length() < 3) {
+ s = "0" + s;
+ }
+ assertDequote("" + (char) i, "\\" + s);
+ }
+ }
+
+ public void testQuote_OctalAll() {
+ assertQuote("\\001", "\1");
+ assertQuote("\\176", "~");
+ assertQuote("\\303\\277", "\u00ff"); // \u00ff in UTF-8
+ }
+
+ public void testDequote_UnknownEscapeQ() {
+ assertDequote("\\q", "\\q");
+ }
+
+ public void testDequote_FooTabBar() {
+ assertDequote("foo\tbar", "foo\\tbar");
+ }
+
+ public void testDequote_Latin1() {
+ assertDequote("\u00c5ngstr\u00f6m", "\\305ngstr\\366m"); // Latin1
+ }
+
+ public void testDequote_UTF8() {
+ assertDequote("\u00c5ngstr\u00f6m", "\\303\\205ngstr\\303\\266m");
+ }
+
+ public void testDequote_RawUTF8() {
+ assertDequote("\u00c5ngstr\u00f6m", "\303\205ngstr\303\266m");
+ }
+
+ public void testDequote_RawLatin1() {
+ assertDequote("\u00c5ngstr\u00f6m", "\305ngstr\366m");
+ }
+
+ public void testQuote_Ang() {
+ assertQuote("\\303\\205ngstr\\303\\266m", "\u00c5ngstr\u00f6m");
+ }
+}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
new file mode 100644
index 0000000..279b713
--- /dev/null
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/QuotedString.java
@@ -0,0 +1,268 @@
+/*
+ * Copyright (C) 2008, Google Inc.
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials provided
+ * with the distribution.
+ *
+ * - Neither the name of the Git Development Community nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior
+ * written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.spearce.jgit.util;
+
+import java.util.Arrays;
+
+import org.spearce.jgit.lib.Constants;
+
+/** Utility functions related to quoted string handling. */
+public abstract class QuotedString {
+ /** Quoting style that obeys the rules Git applies to file names */
+ public static final GitPathStyle GIT_PATH = new GitPathStyle();
+
+ /**
+ * Quote an input string by the quoting rules.
+ * <p>
+ * If the input string does not require any quoting, the same String
+ * reference is returned to the caller.
+ * <p>
+ * Otherwise a quoted string is returned, including the opening and closing
+ * quotation marks at the start and end of the string. If the style does not
+ * permit raw Unicode characters then the string will first be encoded in
+ * UTF-8, with unprintable sequences possibly escaped by the rules.
+ *
+ * @param in
+ * any non-null Unicode string.
+ * @return a quoted string. See above for details.
+ */
+ public abstract String quote(String in);
+
+ /**
+ * Clean a previously quoted input, decoding the result via UTF-8.
+ * <p>
+ * This method must match quote such that:
+ *
+ * <pre>
+ * a.equals(dequote(quote(a)));
+ * </pre>
+ *
+ * is true for any <code>a</code>.
+ *
+ * @param in
+ * a Unicode string to remove quoting from.
+ * @return the cleaned string.
+ * @see #dequote(byte[], int, int)
+ */
+ public String dequote(final String in) {
+ final byte[] b = Constants.encode(in);
+ return dequote(b, 0, b.length);
+ }
+
+ /**
+ * Decode a previously quoted input, scanning a UTF-8 encoded buffer.
+ * <p>
+ * This method must match quote such that:
+ *
+ * <pre>
+ * a.equals(dequote(Constants.encode(quote(a))));
+ * </pre>
+ *
+ * is true for any <code>a</code>.
+ * <p>
+ * This method removes any opening/closing quotation marks added by
+ * {@link #quote(String)}.
+ *
+ * @param in
+ * the input buffer to parse.
+ * @param offset
+ * first position within <code>in</code> to scan.
+ * @param end
+ * one position past in <code>in</code> to scan.
+ * @return the cleaned string.
+ */
+ public abstract String dequote(byte[] in, int offset, int end);
+
+ /** Quoting style that obeys the rules Git applies to file names */
+ public static final class GitPathStyle extends QuotedString {
+ private static final byte[] quote;
+ static {
+ quote = new byte[128];
+ Arrays.fill(quote, (byte) -1);
+
+ for (int i = '0'; i <= '9'; i++)
+ quote[i] = 0;
+ for (int i = 'a'; i <= 'z'; i++)
+ quote[i] = 0;
+ for (int i = 'A'; i <= 'Z'; i++)
+ quote[i] = 0;
+ quote[' '] = 0;
+ quote['+'] = 0;
+ quote[','] = 0;
+ quote['-'] = 0;
+ quote['.'] = 0;
+ quote['/'] = 0;
+ quote['='] = 0;
+ quote['_'] = 0;
+ quote['^'] = 0;
+
+ quote['\u0007'] = 'a';
+ quote['\b'] = 'b';
+ quote['\f'] = 'f';
+ quote['\n'] = 'n';
+ quote['\r'] = 'r';
+ quote['\t'] = 't';
+ quote['\u000B'] = 'v';
+ quote['\\'] = '\\';
+ quote['"'] = '"';
+ }
+
+ @Override
+ public String quote(final String instr) {
+ if (instr.length() == 0)
+ return "\"\"";
+ boolean reuse = true;
+ final byte[] in = Constants.encode(instr);
+ final StringBuilder r = new StringBuilder(2 + in.length);
+ r.append('"');
+ for (int i = 0; i < in.length; i++) {
+ final int c = in[i] & 0xff;
+ if (c < quote.length) {
+ final byte style = quote[c];
+ if (style == 0) {
+ r.append((char) c);
+ continue;
+ }
+ if (style > 0) {
+ reuse = false;
+ r.append('\\');
+ r.append((char) style);
+ continue;
+ }
+ }
+
+ reuse = false;
+ r.append('\\');
+ r.append((char) (((c >> 6) & 03) + '0'));
+ r.append((char) (((c >> 3) & 07) + '0'));
+ r.append((char) (((c >> 0) & 07) + '0'));
+ }
+ if (reuse)
+ return instr;
+ r.append('"');
+ return r.toString();
+ }
+
+ @Override
+ public String dequote(final byte[] in, final int inPtr, final int inEnd) {
+ if (2 <= inEnd - inPtr && in[inPtr] == '"' && in[inEnd - 1] == '"')
+ return dq(in, inPtr + 1, inEnd - 1);
+ return RawParseUtils.decode(Constants.CHARSET, in, inPtr, inEnd);
+ }
+
+ private static String dq(final byte[] in, int inPtr, final int inEnd) {
+ final byte[] r = new byte[inEnd - inPtr];
+ int rPtr = 0;
+ while (inPtr < inEnd) {
+ final byte b = in[inPtr++];
+ if (b != '\\') {
+ r[rPtr++] = b;
+ continue;
+ }
+
+ if (inPtr == inEnd) {
+ // Lone trailing backslash. Treat it as a literal.
+ //
+ r[rPtr++] = '\\';
+ break;
+ }
+
+ switch (in[inPtr++]) {
+ case 'a':
+ r[rPtr++] = 0x07 /* \a = BEL */;
+ continue;
+ case 'b':
+ r[rPtr++] = '\b';
+ continue;
+ case 'f':
+ r[rPtr++] = '\f';
+ continue;
+ case 'n':
+ r[rPtr++] = '\n';
+ continue;
+ case 'r':
+ r[rPtr++] = '\r';
+ continue;
+ case 't':
+ r[rPtr++] = '\t';
+ continue;
+ case 'v':
+ r[rPtr++] = 0x0B /* \v = VT */;
+ continue;
+
+ case '\\':
+ case '"':
+ r[rPtr++] = in[inPtr - 1];
+ continue;
+
+ case '0':
+ case '1':
+ case '2':
+ case '3': {
+ int cp = in[inPtr - 1] - '0';
+ // An octal escape has at most three digits; stop after the third.
+ for (int n = 1; n < 3 && inPtr < inEnd; n++) {
+ final byte c = in[inPtr];
+ if ('0' <= c && c <= '7') {
+ cp <<= 3;
+ cp |= c - '0';
+ inPtr++;
+ } else {
+ break;
+ }
+ }
+ r[rPtr++] = (byte) cp;
+ continue;
+ }
+
+ default:
+ // Any other code is taken literally.
+ //
+ r[rPtr++] = '\\';
+ r[rPtr++] = in[inPtr - 1];
+ continue;
+ }
+ }
+
+ return RawParseUtils.decode(Constants.CHARSET, r, 0, rPtr);
+ }
+
+ private GitPathStyle() {
+ // Singleton
+ }
+ }
+}
--
1.6.1.rc2.299.gead4c
--
Shawn.
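
For anyone who wants to poke at the Git path quoting rules without building JGit, here is a rough standalone sketch of the same round trip. It is not part of the patch: the class name GitPathQuoteSketch and its String-based dequote signature are made up for illustration, and UTF-8 stands in for Constants.CHARSET as an assumption. The sketch caps an octal escape at three digits, which matches what quote() emits.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch of the Git path quoting rules; not JGit code.
final class GitPathQuoteSketch {
    // Punctuation that passes through unescaped (same set as the patch's table).
    private static final String SAFE = " +,-./=_^";
    // Raw bytes that get a one-letter escape, and the letter used for each.
    private static final String ESC_RAW = "\u0007\b\f\n\r\t\u000B\\\"";
    private static final String ESC_SYM = "abfnrtv\\\"";

    private static boolean isSafe(int c) {
        return ('0' <= c && c <= '9') || ('a' <= c && c <= 'z')
                || ('A' <= c && c <= 'Z') || SAFE.indexOf(c) >= 0;
    }

    static String quote(String s) {
        if (s.isEmpty())
            return "\"\"";
        byte[] in = s.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8
        StringBuilder r = new StringBuilder(2 + in.length).append('"');
        boolean reuse = true; // stays true while nothing needed escaping
        for (byte b : in) {
            int c = b & 0xff;
            if (isSafe(c)) {
                r.append((char) c);
                continue;
            }
            reuse = false;
            r.append('\\');
            int e = ESC_RAW.indexOf(c);
            if (e >= 0) {
                r.append(ESC_SYM.charAt(e));
            } else {
                // Everything else becomes a three-digit octal escape.
                r.append((char) ('0' + ((c >> 6) & 3)));
                r.append((char) ('0' + ((c >> 3) & 7)));
                r.append((char) ('0' + (c & 7)));
            }
        }
        return reuse ? s : r.append('"').toString();
    }

    static String dequote(String q) {
        byte[] in = q.getBytes(StandardCharsets.UTF_8);
        if (in.length < 2 || in[0] != '"' || in[in.length - 1] != '"')
            return q; // not quoted, hand it back unchanged
        byte[] r = new byte[in.length - 2];
        int rPtr = 0, p = 1, end = in.length - 1;
        while (p < end) {
            byte b = in[p++];
            if (b != '\\') {
                r[rPtr++] = b;
                continue;
            }
            if (p == end) { // lone trailing backslash, keep it literally
                r[rPtr++] = '\\';
                break;
            }
            int k = in[p++] & 0xff;
            int e = ESC_SYM.indexOf(k);
            if (e >= 0) {
                r[rPtr++] = (byte) ESC_RAW.charAt(e);
            } else if ('0' <= k && k <= '3') {
                int cp = k - '0';
                for (int n = 1; n < 3 && p < end && '0' <= in[p] && in[p] <= '7'; n++)
                    cp = (cp << 3) | (in[p++] - '0');
                r[rPtr++] = (byte) cp;
            } else { // unknown escape, taken literally
                r[rPtr++] = '\\';
                r[rPtr++] = (byte) k;
            }
        }
        return new String(r, 0, rPtr, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "a\nb";
        String quoted = quote(name);
        System.out.println(quoted);
        System.out.println(dequote(quoted).equals(name));
    }
}
```

Escaping at the byte level rather than per char is what lets a non-ASCII name come out as a run of octal escapes and still decode back to the same string.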
end of thread, other threads:[~2008-12-11 0:59 UTC | newest]
Thread overview: 11+ messages
2008-12-10 22:05 [JGIT PATCH 0/6] RawParseUtil improvements Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 1/6] Simplify RawParseUtils.nextLF invocations Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 2/6] Simplify RawParseUtils next and nextLF loops Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 3/6] Correct Javadoc of RawParseUtils next and nextLF methods Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 5/6] Add Bourne style quoting for TransportGitSsh Shawn O. Pearce
2008-12-10 22:05 ` [JGIT PATCH 6/6] Add ~user friendly " Shawn O. Pearce
2008-12-10 23:22 ` [JGIT PATCH 4/6] Add QuotedString class to handle C-style quoting rules Robin Rosenberg
2008-12-10 23:41 ` Shawn O. Pearce
2008-12-11 0:33 ` Robin Rosenberg
2008-12-11 0:57 ` [JGIT PATCH 4/6 v3] Add QuotedString class to handle Git path style " Shawn O. Pearce