* [JGIT PATCH v2] The default encoding for reading commits is UTF-8 rather than system default
@ 2009-10-07 16:26 Constantine Plotnikov
From: Constantine Plotnikov @ 2009-10-07 16:26 UTC
  To: git; +Cc: Constantine Plotnikov

When reading commits, the system default encoding was used if no encoding
was specified in the commit. This patch extends the test to check that the
commit message is read back correctly (the test fails with the old
implementation if the system encoding is not UTF-8) and fixes the
Commit.decode() method to use UTF-8 if no encoding is specified in the
commit object.

Signed-off-by: Constantine Plotnikov <constantine.plotnikov@gmail.com>
---

This version is against the eclipse.org packages.

See the "DISCUSSION" section of man git-commit for the justification of
why UTF-8 should be used. Note that the ObjectWriter.writeCommit(...)
method already implements this correctly, but Commit.decode() was not
implemented in the same way for some reason.
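
To illustrate the effect of the fallback outside of JGit, here is a minimal
standalone sketch. It is not the JGit code: the class name CommitTextDecoder
and its decode() helper are hypothetical, and it uses
java.nio.charset.StandardCharsets (Java 7 and later) where the patch uses
Constants.CHARSET. The old behaviour on a non-UTF-8 platform is simulated by
decoding explicitly as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of JGit.
public class CommitTextDecoder {

    // Decodes raw commit bytes. A null charset stands for a commit object
    // without an "encoding" header, in which case UTF-8 is assumed.
    public static String decode(byte[] raw, Charset declared) {
        Charset cs = (declared != null) ? declared : StandardCharsets.UTF_8;
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        String message = "\u00dcbergeeks";
        // Commit messages are written as UTF-8 when no encoding is given.
        byte[] raw = message.getBytes(StandardCharsets.UTF_8);

        // With the UTF-8 fallback the message round-trips unchanged.
        System.out.println(decode(raw, null).equals(message)); // prints true

        // Simulated old behaviour on a platform whose default is ISO-8859-1:
        // the two UTF-8 bytes of U+00DC are decoded as two separate characters.
        String misdecoded = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(misdecoded.equals(message)); // prints false
    }
}
```

The round-trip comparison in main() corresponds to the assertEquals added to
the test in this patch, which only passes on every platform once the decoder
stops depending on the system default.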
 
 .../tst/org/eclipse/jgit/lib/T0003_Basic.java      |    3 +++
 .../src/org/eclipse/jgit/lib/Commit.java           |   18 +++++++-----------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/org.eclipse.jgit.test/tst/org/eclipse/jgit/lib/T0003_Basic.java b/org.eclipse.jgit.test/tst/org/eclipse/jgit/lib/T0003_Basic.java
index 98fb794..f3bc9b1 100644
--- a/org.eclipse.jgit.test/tst/org/eclipse/jgit/lib/T0003_Basic.java
+++ b/org.eclipse.jgit.test/tst/org/eclipse/jgit/lib/T0003_Basic.java
@@ -348,6 +348,9 @@ public void test023_createCommitNonAnullii() throws IOException {
 		commit.setMessage("\u00dcbergeeks");
 		ObjectId cid = new ObjectWriter(db).writeCommit(commit);
 		assertEquals("4680908112778718f37e686cbebcc912730b3154", cid.name());
+		Commit loadedCommit = db.mapCommit(cid);
+		assertNotSame(loadedCommit, commit);
+		assertEquals(commit.getMessage(), loadedCommit.getMessage());
 	}
 
 	public void test024_createCommitNonAscii() throws IOException {
diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/lib/Commit.java b/org.eclipse.jgit/src/org/eclipse/jgit/lib/Commit.java
index b2cf9b1..430cddc 100644
--- a/org.eclipse.jgit/src/org/eclipse/jgit/lib/Commit.java
+++ b/org.eclipse.jgit/src/org/eclipse/jgit/lib/Commit.java
@@ -299,17 +299,13 @@ private void decode() {
 				br.read(readBuf);
 				int msgstart = readBuf.length != 0 ? ( readBuf[0] == '\n' ? 1 : 0 ) : 0;
 
-				if (encoding != null) {
-					// TODO: this isn't reliable so we need to guess the encoding from the actual content
-					author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
-					message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
-				} else {
-					// TODO: use config setting / platform / ascii / iso-latin
-					author = new PersonIdent(new String(rawAuthor.getBytes()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes()));
-					message = new String(readBuf, msgstart, readBuf.length-msgstart);
-				}
+				// If encoding is not specified, the default for commit is UTF-8
+				if (encoding == null) encoding = Constants.CHARSET;
+
+				// TODO: this isn't reliable so we need to guess the encoding from the actual content
+				author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
+				committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
+				message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
 			} catch (IOException e) {
 				e.printStackTrace();
 			} finally {
-- 
1.6.1.2
