From: Constantine Plotnikov <constantine.plotnikov@gmail.com>
To: git@vger.kernel.org
Cc: Constantine Plotnikov <constantine.plotnikov@gmail.com>
Subject: [JGIT PATCH] The default encoding for reading commits is UTF-8 rather than system default
Date: Wed,  7 Oct 2009 19:44:33 +0400
Message-ID: <1254930273-1796-1-git-send-email-constantine.plotnikov@gmail.com>

When reading a commit, the system default encoding was used if the commit
did not specify an encoding. This patch extends the test to check that the
commit message is decoded correctly (the test fails with the old
implementation if the system encoding is not UTF-8) and fixes the
Commit.decode() method to use UTF-8 if no encoding is specified in the
commit object.

Signed-off-by: Constantine Plotnikov <constantine.plotnikov@gmail.com>
---

See man git-commit (the "DISCUSSION" section) for the justification of why
UTF-8 should be used. Note that this was already implemented correctly in
the ObjectWriter.writeCommit(...) method, but Commit.decode() was not
implemented in the same way for some reason.
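
To illustrate why the fallback matters, here is a minimal, self-contained
sketch. It is not JGit code: the class and method names are hypothetical,
and ISO-8859-1 merely stands in for a non-UTF-8 platform default. It decodes
the UTF-8 bytes of the test message with and without the UTF-8 fallback.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of JGit.
public class CommitDecodingSketch {

    // Decode raw commit bytes. When the commit declares no encoding
    // (declared == null), fall back to UTF-8 instead of the platform default.
    public static String decode(byte[] raw, Charset declared) {
        Charset cs = (declared != null) ? declared : StandardCharsets.UTF_8;
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        String message = "\u00dcbergeeks";
        // Bytes as they would be stored for a commit written in UTF-8.
        byte[] raw = message.getBytes(StandardCharsets.UTF_8);

        // With the UTF-8 fallback the message round-trips.
        System.out.println(decode(raw, null).equals(message));

        // Assumed non-UTF-8 platform default: the two UTF-8 bytes of
        // U+00DC are decoded as two separate characters, so the result differs.
        String misdecoded = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(misdecoded.equals(message));
    }
}
```

On a system whose default encoding is already UTF-8 both decoding paths
agree, which is why the old behaviour went unnoticed there.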
 
 .../tst/org/spearce/jgit/lib/T0003_Basic.java      |    3 +++
 .../src/org/spearce/jgit/lib/Commit.java           |   18 +++++++-----------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
index c2b1b91..4702aaf 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0003_Basic.java
@@ -348,6 +348,9 @@ public void test023_createCommitNonAnullii() throws IOException {
 		commit.setMessage("\u00dcbergeeks");
 		ObjectId cid = new ObjectWriter(db).writeCommit(commit);
 		assertEquals("4680908112778718f37e686cbebcc912730b3154", cid.name());
+		Commit loadedCommit = db.mapCommit(cid);
+		assertNotSame(loadedCommit, commit);
+		assertEquals(commit.getMessage(), loadedCommit.getMessage());
 	}
 
 	public void test024_createCommitNonAscii() throws IOException {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
index 030d4a4..933b929 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Commit.java
@@ -299,17 +299,13 @@ private void decode() {
 				br.read(readBuf);
 				int msgstart = readBuf.length != 0 ? ( readBuf[0] == '\n' ? 1 : 0 ) : 0;
 
-				if (encoding != null) {
-					// TODO: this isn't reliable so we need to guess the encoding from the actual content
-					author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
-					message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
-				} else {
-					// TODO: use config setting / platform / ascii / iso-latin
-					author = new PersonIdent(new String(rawAuthor.getBytes()));
-					committer = new PersonIdent(new String(rawCommitter.getBytes()));
-					message = new String(readBuf, msgstart, readBuf.length-msgstart);
-				}
+				// If encoding is not specified, the default for commit is UTF-8
+				if (encoding == null) encoding = Constants.CHARSET;
+
+				// TODO: this isn't reliable so we need to guess the encoding from the actual content
+				author = new PersonIdent(new String(rawAuthor.getBytes(),encoding.name()));
+				committer = new PersonIdent(new String(rawCommitter.getBytes(),encoding.name()));
+				message = new String(readBuf,msgstart, readBuf.length-msgstart, encoding.name());
 			} catch (IOException e) {
 				e.printStackTrace();
 			} finally {
-- 
1.6.1.2
