From: Constantine Plotnikov <constantine.plotnikov@gmail.com>
To: git@vger.kernel.org
Cc: Constantine Plotnikov <constantine.plotnikov@gmail.com>
Subject: [JGIT PATCH] The default encoding for reading commits is UTF-8 rather than system default
Date: Wed,  7 Oct 2009 19:44:33 +0400
Message-ID: <1254930273-1796-1-git-send-email-constantine.plotnikov@gmail.com>

When reading commits, the system default encoding was used if no encoding
was specified in the commit. This patch extends the test to check that the
commit message is read back correctly (the test fails on the old
implementation if the system encoding is not UTF-8) and fixes the
Commit.decode() method to use UTF-8 if no encoding is specified in the
commit object.

Signed-off-by: Constantine Plotnikov <constantine.plotnikov@gmail.com>
---

See the "DISCUSSION" section of git-commit(1) for the justification of
why UTF-8 should be used. Note that ObjectWriter.writeCommit(...) already
implements this correctly; Commit.decode() was not implemented in the
same way for some reason.
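
To illustrate the failure mode outside of JGit, here is a minimal,
standalone sketch. The class and method names (CommitEncodingSketch,
effectiveCharset, decodeMessage) are hypothetical and are not part of
JGit; ISO-8859-1 merely stands in for a non-UTF-8 platform default. It
decodes the UTF-8 bytes of the test message once with the UTF-8 fallback
and once with the stand-in platform charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in JGit.
final class CommitEncodingSketch {
    // Use the charset declared by the commit's "encoding" header,
    // falling back to UTF-8 when the header is absent (declared == null).
    static Charset effectiveCharset(Charset declared) {
        return declared != null ? declared : StandardCharsets.UTF_8;
    }

    // Decode raw commit message bytes with the effective charset.
    static String decodeMessage(byte[] raw, Charset declared) {
        return new String(raw, effectiveCharset(declared));
    }

    public static void main(String[] args) {
        String original = "\u00dcbergeeks";
        // Bytes as they would be stored when the commit is written as UTF-8.
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // No encoding declared: decoded as UTF-8, the message round-trips.
        String viaFallback = decodeMessage(raw, null);
        // Assumed stand-in for a non-UTF-8 platform default charset.
        String viaLatin1 = new String(raw, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(viaFallback)); // prints "true"
        System.out.println(original.equals(viaLatin1));   // prints "false"
    }
}
```

The second comparison fails because U+00DC is two bytes in UTF-8, and
ISO-8859-1 decodes each of those bytes as a separate character.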
 
 .../tst/org/spearce/jgit/lib/T0003_Basic.java      |    3 +++
 .../src/org/spearce/jgit/lib/Commit.java           |   18 +++++++-----------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
index c2b1b91..4702aaf 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
@@ -348,6 +348,9 @@ public void test023_createCommitNonAnullii() throws IOException {
 		commit.setMessage("\u00dcbergeeks");
 		ObjectId cid = new ObjectWriter(db).writeCommit(commit);
 		assertEquals("4680908112778718f37e686cbebcc912730b3154", cid.name());
+		Commit loadedCommit = db.mapCommit(cid);
+		assertNotSame(loadedCommit, commit);
+		assertEquals(commit.getMessage(), loadedCommit.getMessage());
 	}
 
 	public void test024_createCommitNonAscii() throws IOException {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
index 030d4a4..933b929 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
@@ -299,17 +299,13 @@ private void decode() {
 				br.read(readBuf);
 				int msgstart = readBuf.length != 0 ? ( readBuf[0] == '\n' ? 1 : 0 ) : 0;
 
-				if (encoding != null) {
-					// TODO: this isn't reliable so we need to guess the encoding from the actual content
-					author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
-					message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
-				} else {
-					// TODO: use config setting / platform / ascii / iso-latin
-					author = new PersonIdent(new String(rawAuthor.getBytes()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes()));
-					message = new String(readBuf, msgstart, readBuf.length-msgstart);
-				}
+				// If encoding is not specified, the default for commit is UTF-8
+				if (encoding == null) encoding = Constants.CHARSET;
+
+				// TODO: this isn't reliable so we need to guess the encoding from the actual content
+				author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
+				committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
+				message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
 			} catch (IOException e) {
 				e.printStackTrace();
 			} finally {
-- 
1.6.1.2
