From: larsxschneider@gmail.com
To: git@vger.kernel.org
Cc: luke@diamand.org, tboegi@web.de, sunshine@sunshineco.com,
	remi.galan-alfonso@ensimag.grenoble-inp.fr,
	Lars Schneider <larsxschneider@gmail.com>
Subject: [PATCH v6] git-p4: add config git-p4.pathEncoding
Date: Thu,  3 Sep 2015 11:14:07 +0200
Message-ID: <1441271647-67824-2-git-send-email-larsxschneider@gmail.com>
In-Reply-To: <1441271647-67824-1-git-send-email-larsxschneider@gmail.com>

From: Lars Schneider <larsxschneider@gmail.com>

Perforce stores each path in the encoding used by the OS that submitted
it, whereas Git expects paths encoded as UTF-8. Add a config variable
that tells git-p4 which encoding Perforce used for the paths, so that
git-p4 can transcode them to UTF-8. As an example, Perforce on Windows
often uses “cp1252” to encode path names.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Acked-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
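For readers who want to try the transcoding step in isolation, here is a
minimal sketch. It is a Python 3 rendering of the decode/encode pair in the
git-p4.py hunk below, not the patch itself: the helper names
`transcode_path` and `is_ascii` and the sample path `caf\xe9.txt` are
assumptions made for illustration, and the real code reads the encoding from
`gitConfig("git-p4.pathEncoding")`.

```python
# Sketch only: hypothetical helpers, not functions that exist in git-p4.py.

def transcode_path(raw_path: bytes, path_encoding: str) -> bytes:
    """Decode the raw depot path with the configured encoding and
    re-encode it as UTF-8, mirroring the decode/encode pair in the patch."""
    return raw_path.decode(path_encoding).encode("utf8", "replace")


def is_ascii(raw_path: bytes) -> bool:
    """Return True if the path is pure ASCII, i.e. no warning is needed."""
    try:
        raw_path.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Assumed sample: a file name written by a Windows client using cp1252,
# where byte 0xE9 is "e with acute accent".
raw = b"caf\xe9.txt"

if not is_ascii(raw):
    # In the patch this branch only prints a warning when no encoding is set.
    print("non-ASCII path detected")

print(transcode_path(raw, "cp1252"))
```

With git-p4.pathEncoding set to cp1252, the single byte 0xE9 becomes the
two-byte UTF-8 sequence 0xC3 0xA9, which is what Git expects in the
fast-import stream.
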
 Documentation/git-p4.txt        |  7 +++++
 git-p4.py                       | 11 ++++++++
 t/t9822-git-p4-path-encoding.sh | 60 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 78 insertions(+)
 create mode 100755 t/t9822-git-p4-path-encoding.sh

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index 82aa5d6..12a57d4 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -510,6 +510,13 @@ git-p4.useClientSpec::
 	option '--use-client-spec'.  See the "CLIENT SPEC" section above.
 	This variable is a boolean, not the name of a p4 client.
 
+git-p4.pathEncoding::
+	Perforce keeps the encoding of a path as given by the originating OS.
+	Git expects paths encoded as UTF-8. Use this config to tell git-p4
+	what encoding Perforce had used for the paths. This encoding is used
+	to transcode the paths to UTF-8. As an example, Perforce on Windows
+	often uses “cp1252” to encode path names.
+
 Submit variables
 ~~~~~~~~~~~~~~~~
 git-p4.detectRenames::
diff --git a/git-p4.py b/git-p4.py
index 073f87b..b1ad86d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2213,6 +2213,17 @@ class P4Sync(Command, P4UserMap):
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]
 
+        if gitConfig("git-p4.pathEncoding"):
+            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
+        elif self.verbose:
+            try:
+                relPath.decode('ascii')
+            except:
+                print (
+                    "Path with Non-ASCII characters detected and no path encoding defined. "
+                    "Please check the encoding: %s" % relPath
+                )
+
         self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
 
         # total length...
diff --git a/t/t9822-git-p4-path-encoding.sh b/t/t9822-git-p4-path-encoding.sh
new file mode 100755
index 0000000..e507ad7
--- /dev/null
+++ b/t/t9822-git-p4-path-encoding.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='Clone repositories with non ASCII paths'
+
+. ./lib-git-p4.sh
+
+UTF8_ESCAPED="a-\303\244_o-\303\266_u-\303\274.txt"
+ISO8859_ESCAPED="a-\344_o-\366_u-\374.txt"
+
+test_expect_success 'start p4d' '
+	start_p4d
+'
+
+test_expect_success 'Create a repo containing iso8859-1 encoded paths' '
+	(
+		cd "$cli" &&
+		ISO8859="$(printf "$ISO8859_ESCAPED")" &&
+		echo content123 >"$ISO8859" &&
+		p4 add "$ISO8859" &&
+		p4 submit -d "test commit"
+	)
+'
+
+test_expect_success 'Clone repo containing iso8859-1 encoded paths without git-p4.pathEncoding' '
+	git p4 clone --destination="$git" //depot &&
+	test_when_finished cleanup_git &&
+	(
+		cd "$git" &&
+		UTF8="$(printf "$UTF8_ESCAPED")" &&
+		echo $UTF8 >expect &&
+		git -c core.quotepath=false ls-files >actual &&
+		test_must_fail test_cmp expect actual
+	)
+'
+
+test_expect_success 'Clone repo containing iso8859-1 encoded paths with git-p4.pathEncoding' '
+
+	test_when_finished cleanup_git &&
+	(
+		cd "$git" &&
+		git init . &&
+		test_config git-p4.pathEncoding iso8859-1 &&
+		git p4 clone --use-client-spec --destination="$git" //depot &&
+		UTF8="$(printf "$UTF8_ESCAPED")" &&
+		echo $UTF8 >expect &&
+		git -c core.quotepath=false ls-files >actual &&
+		test_cmp expect actual &&
+		cat >expect <<-\EOF &&
+		content123
+		EOF
+		cat $UTF8 >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'kill p4d' '
+	kill_p4d
+'
+
+test_done
-- 
1.9.5 (Apple Git-50.3)
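
The two escaped file names in t9822 can be cross-checked outside the test
suite. The sketch below is an illustration under assumptions, not part of
the patch: `printf_bytes` is a hypothetical helper that interprets the octal
escapes roughly the way the shell's printf does for these two strings, and
the check confirms that UTF8_ESCAPED is the UTF-8 transcoding of
ISO8859_ESCAPED.

```python
# Sketch only: `printf_bytes` is a hypothetical helper for this check.

def printf_bytes(escaped: str) -> bytes:
    """Turn printf-style octal escapes (e.g. \\344) into raw bytes."""
    # unicode_escape maps \ooo to the code point with that value; encoding
    # the result as latin-1 turns each code point back into a single byte.
    return escaped.encode("ascii").decode("unicode_escape").encode("latin-1")


# Values copied from the test script in the patch.
UTF8_ESCAPED = r"a-\303\244_o-\303\266_u-\303\274.txt"
ISO8859_ESCAPED = r"a-\344_o-\366_u-\374.txt"

iso_bytes = printf_bytes(ISO8859_ESCAPED)
utf8_bytes = printf_bytes(UTF8_ESCAPED)

# The same transcoding the patch applies when pathEncoding is iso8859-1.
transcoded = iso_bytes.decode("iso8859-1").encode("utf8")

print(transcoded == utf8_bytes)
```

If the comparison holds, the `expect` file written by the second clone test
matches what git-p4 emits once git-p4.pathEncoding is set to iso8859-1.
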

Thread overview: 5+ messages
2015-09-03  9:14 [PATCH v6] git-p4: add config git-p4.pathEncoding larsxschneider
2015-09-03  9:14 ` larsxschneider [this message]
2015-09-03 17:03   ` Junio C Hamano
2015-09-03 17:24     ` Lars Schneider
2015-09-03 18:18       ` Junio C Hamano
