git.vger.kernel.org archive mirror
* [PATCH v2] git-p4: add "--path-encoding" option
@ 2015-08-31 20:43 larsxschneider
  2015-08-31 20:43 ` larsxschneider
  2015-08-31 21:08 ` Junio C Hamano
  0 siblings, 2 replies; 4+ messages in thread
From: larsxschneider @ 2015-08-31 20:43 UTC
  To: git; +Cc: luke, gitster, tboegi, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

Diff to v1:
* switch example conversions from cp1252 to iso8859-1 (thanks Torsten!)
* fix git-p4.txt line length and double dashes (thanks Junio!)
* remove bare UTF-8 sequence (thanks Junio!)

As with v1, I ensured the unit test runs on OS X and Linux.

I noticed one odd thing, though: "git ls-files" prints the UTF-8 characters escaped on both Linux and OS X. Is there a problem with my setup, or is this a Git bug?

Thanks,
Lars

Lars Schneider (1):
  git-p4: add "--path-encoding" option

 Documentation/git-p4.txt        |  5 +++++
 git-p4.py                       |  6 ++++++
 t/t9821-git-p4-path-encoding.sh | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+)
 create mode 100755 t/t9821-git-p4-path-encoding.sh

--
2.5.1.1.g9071995.dirty


* [PATCH v2] git-p4: add "--path-encoding" option
  2015-08-31 20:43 [PATCH v2] git-p4: add "--path-encoding" option larsxschneider
@ 2015-08-31 20:43 ` larsxschneider
  2015-08-31 21:10   ` Junio C Hamano
  2015-08-31 21:08 ` Junio C Hamano
  1 sibling, 1 reply; 4+ messages in thread
From: larsxschneider @ 2015-08-31 20:43 UTC
  To: git; +Cc: luke, gitster, tboegi, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 Documentation/git-p4.txt        |  5 +++++
 git-p4.py                       |  6 ++++++
 t/t9821-git-p4-path-encoding.sh | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+)
 create mode 100755 t/t9821-git-p4-path-encoding.sh

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index 82aa5d6..14bb79c 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -252,6 +252,11 @@ Git repository:
 	Use a client spec to find the list of interesting files in p4.
 	See the "CLIENT SPEC" section below.
 
+--path-encoding <encoding>::
+	The encoding to use when reading p4 client paths. With this option
+	non ASCII paths are properly stored in Git. For example, the encoding
+	'cp1252' is often used on Windows systems.
+
 -/ <path>::
 	Exclude selected depot paths when cloning or syncing.
 
diff --git a/git-p4.py b/git-p4.py
index 073f87b..2b3bfc4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1981,6 +1981,8 @@ class P4Sync(Command, P4UserMap):
                 optparse.make_option("--silent", dest="silent", action="store_true"),
                 optparse.make_option("--detect-labels", dest="detectLabels", action="store_true"),
                 optparse.make_option("--import-labels", dest="importLabels", action="store_true"),
+                optparse.make_option("--path-encoding", dest="pathEncoding", type="string",
+                                     help="Encoding to use for paths"),
                 optparse.make_option("--import-local", dest="importIntoRemotes", action="store_false",
                                      help="Import into refs/heads/ , not refs/remotes"),
                 optparse.make_option("--max-changes", dest="maxChanges",
@@ -2025,6 +2027,7 @@ class P4Sync(Command, P4UserMap):
         self.clientSpecDirs = None
         self.tempBranches = []
         self.tempBranchLocation = "git-p4-tmp"
+        self.pathEncoding = None
 
         if gitConfig("git-p4.syncFromOrigin") == "false":
             self.syncWithOrigin = False
@@ -2213,6 +2216,9 @@ class P4Sync(Command, P4UserMap):
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]
 
+        if self.pathEncoding:
+            relPath = relPath.decode(self.pathEncoding).encode('utf8', 'replace')
+
         self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
 
         # total length...
diff --git a/t/t9821-git-p4-path-encoding.sh b/t/t9821-git-p4-path-encoding.sh
new file mode 100755
index 0000000..bb85074
--- /dev/null
+++ b/t/t9821-git-p4-path-encoding.sh
@@ -0,0 +1,39 @@
+#!/bin/sh
+
+test_description='Clone repositories with non ASCII paths'
+
+. ./lib-git-p4.sh
+
+UTF8_ESCAPED="a-\303\244_o-\303\266_u-\303\274.txt"
+
+test_expect_success 'start p4d' '
+	start_p4d
+'
+
+test_expect_success 'Create a repo containing iso8859-1 encoded paths' '
+	cd "$cli" &&
+
+	ISO8859="$(printf "$UTF8_ESCAPED" | iconv -f utf-8 -t iso8859-1)" &&
+	>"$ISO8859" &&
+	p4 add "$ISO8859" &&
+	p4 submit -d "test commit"
+'
+
+test_expect_success 'Clone repo containing iso8859-1 encoded paths' '
+	git p4 clone --destination="$git" --path-encoding=iso8859-1 //depot &&
+	test_when_finished cleanup_git &&
+	(
+		cd "$git" &&
+		printf "\"$UTF8_ESCAPED\"" >expect &&
+		# I wonder why Git prints "ls-files" output UTF-8 escaped.
+		# This behavior is consistent on Linux and OS X.
+		printf $(git ls-files) >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'kill p4d' '
+	kill_p4d
+'
+
+test_done
-- 
2.5.1.1.g9071995.dirty
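
The path conversion in the git-p4.py hunk above can be tried in isolation. The following is only a sketch, not the code from the patch: the patch targets Python 2 (str.decode), whereas this version is written for Python 3 bytes, and the helper name to_utf8_path is hypothetical.

```python
from typing import Optional


def to_utf8_path(raw_path: bytes, path_encoding: Optional[str]) -> bytes:
    """Re-encode a path from path_encoding to UTF-8.

    Hypothetical helper that mirrors the decode/encode step in the patch.
    When no encoding is given, the bytes are passed through untouched,
    as in the patch's "if self.pathEncoding:" guard.
    """
    if not path_encoding:
        return raw_path
    return raw_path.decode(path_encoding).encode("utf8", "replace")


# The file name from the test, as ISO 8859-1 bytes ...
iso_name = b"a-\xe4_o-\xf6_u-\xfc.txt"
# ... and the same name after conversion to UTF-8.
utf8_name = to_utf8_path(iso_name, "iso8859-1")
print(utf8_name)
```

Each umlaut occupies one byte in ISO 8859-1 and two bytes in UTF-8, which is why the test's UTF8_ESCAPED string carries two octal escapes per character.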


* Re: [PATCH v2] git-p4: add "--path-encoding" option
  2015-08-31 20:43 [PATCH v2] git-p4: add "--path-encoding" option larsxschneider
  2015-08-31 20:43 ` larsxschneider
@ 2015-08-31 21:08 ` Junio C Hamano
  1 sibling, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2015-08-31 21:08 UTC
  To: larsxschneider; +Cc: git, luke, tboegi

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> Diff to v1:
> * switch example conversions from cp1252 to iso8859-1 (thanks Torsten!)
> * fix git-p4.txt line length and double dashes (thanks Junio!)
> * remove bare UTF-8 sequence (thanks Junio!)
>
> As with v1, I ensured the unit test runs on OS X and Linux.
>
> I noticed one odd thing, though: "git ls-files" prints the UTF-8 characters escaped on both Linux and OS X. Is there a problem with my setup, or is this a Git bug?

There is no bug, there is no misconfiguration on your part.  It is
very much deliberate, I think.  core.quotepath defaults to true.

Asking is very good, but please don't do so in an in-code comment.
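
To illustrate the quoting described above, here is a simplified sketch of what path quoting does to bytes outside printable ASCII. It is not Git's implementation: the function name quote_path is hypothetical, and unlike Git it writes every control character as an octal escape rather than using short forms such as \n or \t.

```python
def quote_path(name: bytes) -> str:
    """Approximate the quoted form of a path when core.quotepath is true.

    Simplified assumption: double quotes and backslashes get a backslash,
    control bytes and bytes >= 0x7f become three-digit octal escapes, and
    the result is wrapped in double quotes only if something was escaped.
    """
    pieces = []
    needs_quotes = False
    for byte in name:
        if byte in (0x22, 0x5C):  # double quote or backslash
            pieces.append("\\" + chr(byte))
            needs_quotes = True
        elif byte < 0x20 or byte >= 0x7F:  # control or non-ASCII byte
            pieces.append("\\%03o" % byte)
            needs_quotes = True
        else:
            pieces.append(chr(byte))
    text = "".join(pieces)
    return '"' + text + '"' if needs_quotes else text


# The UTF-8 encoded test file name comes out in the escaped form
# that the patch stores in UTF8_ESCAPED, wrapped in double quotes.
print(quote_path("a-\u00e4_o-\u00f6_u-\u00fc.txt".encode("utf-8")))
```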


* Re: [PATCH v2] git-p4: add "--path-encoding" option
  2015-08-31 20:43 ` larsxschneider
@ 2015-08-31 21:10   ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2015-08-31 21:10 UTC
  To: larsxschneider; +Cc: git, luke, tboegi

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
>  Documentation/git-p4.txt        |  5 +++++
>  git-p4.py                       |  6 ++++++
>  t/t9821-git-p4-path-encoding.sh | 39 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 50 insertions(+)
>  create mode 100755 t/t9821-git-p4-path-encoding.sh
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index 82aa5d6..14bb79c 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -252,6 +252,11 @@ Git repository:
>  	Use a client spec to find the list of interesting files in p4.
>  	See the "CLIENT SPEC" section below.
>  
> +--path-encoding <encoding>::
> +	The encoding to use when reading p4 client paths. With this option
> +	non ASCII paths are properly stored in Git. For example, the encoding
> +	'cp1252' is often used on Windows systems.
> +
>  -/ <path>::
>  	Exclude selected depot paths when cloning or syncing.
>  
> diff --git a/git-p4.py b/git-p4.py
> index 073f87b..2b3bfc4 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -1981,6 +1981,8 @@ class P4Sync(Command, P4UserMap):
>                  optparse.make_option("--silent", dest="silent", action="store_true"),
>                  optparse.make_option("--detect-labels", dest="detectLabels", action="store_true"),
>                  optparse.make_option("--import-labels", dest="importLabels", action="store_true"),
> +                optparse.make_option("--path-encoding", dest="pathEncoding", type="string",
> +                                     help="Encoding to use for paths"),
>                  optparse.make_option("--import-local", dest="importIntoRemotes", action="store_false",
>                                       help="Import into refs/heads/ , not refs/remotes"),
>                  optparse.make_option("--max-changes", dest="maxChanges",
> @@ -2025,6 +2027,7 @@ class P4Sync(Command, P4UserMap):
>          self.clientSpecDirs = None
>          self.tempBranches = []
>          self.tempBranchLocation = "git-p4-tmp"
> +        self.pathEncoding = None
>  
>          if gitConfig("git-p4.syncFromOrigin") == "false":
>              self.syncWithOrigin = False
> @@ -2213,6 +2216,9 @@ class P4Sync(Command, P4UserMap):
>              text = regexp.sub(r'$\1$', text)
>              contents = [ text ]
>  
> +        if self.pathEncoding:
> +            relPath = relPath.decode(self.pathEncoding).encode('utf8', 'replace')
> +
>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
>  
>          # total length...
> diff --git a/t/t9821-git-p4-path-encoding.sh b/t/t9821-git-p4-path-encoding.sh
> new file mode 100755
> index 0000000..bb85074
> --- /dev/null
> +++ b/t/t9821-git-p4-path-encoding.sh
> @@ -0,0 +1,39 @@
> +#!/bin/sh
> +
> +test_description='Clone repositories with non ASCII paths'
> +
> +. ./lib-git-p4.sh
> +
> +UTF8_ESCAPED="a-\303\244_o-\303\266_u-\303\274.txt"
> +
> +test_expect_success 'start p4d' '
> +	start_p4d
> +'
> +
> +test_expect_success 'Create a repo containing iso8859-1 encoded paths' '
> +	cd "$cli" &&
> +
> +	ISO8859="$(printf "$UTF8_ESCAPED" | iconv -f utf-8 -t iso8859-1)" &&
> +	>"$ISO8859" &&
> +	p4 add "$ISO8859" &&
> +	p4 submit -d "test commit"
> +'
> +
> +test_expect_success 'Clone repo containing iso8859-1 encoded paths' '
> +	git p4 clone --destination="$git" --path-encoding=iso8859-1 //depot &&
> +	test_when_finished cleanup_git &&
> +	(
> +		cd "$git" &&
> +		printf "\"$UTF8_ESCAPED\"" >expect &&

Did this pass your test?  I suspect 'expect' lacks the final LF.

> +		# I wonder why Git prints "ls-files" output UTF-8 escaped.
> +		# This behavior is consistent on Linux and OS X.
> +		printf $(git ls-files) >actual &&

Yuck.  Perhaps do it like so:

	test_config core.quotepath false &&
	git ls-files >actual &&
	test_cmp expect actual

> +		test_cmp expect actual
> +	)
> +'
> +
> +test_expect_success 'kill p4d' '
> +	kill_p4d
> +'
> +
> +test_done
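
As a rough check of the two remarks above, the sketch below builds the bytes that 'expect' would need to hold if the test compared against unquoted "git ls-files" output. It assumes that, with core.quotepath set to false, ls-files prints the raw UTF-8 name followed by a single LF; the helper name unescape_octal is hypothetical.

```python
import re

# Same value as UTF8_ESCAPED in the test script.
UTF8_ESCAPED = r"a-\303\244_o-\303\266_u-\303\274.txt"


def unescape_octal(text: str) -> bytes:
    """Turn three-digit octal escapes into raw bytes.

    Covers only the \\NNN form used in this test, not every escape
    that a shell printf understands.
    """
    return re.sub(
        rb"\\([0-7]{3})",
        lambda match: bytes([int(match.group(1), 8)]),
        text.encode("ascii"),
    )


raw_name = unescape_octal(UTF8_ESCAPED)

# Unquoted ls-files output ends each entry with LF, so 'expect'
# needs the trailing newline and no surrounding double quotes.
expected_file_contents = raw_name + b"\n"
print(expected_file_contents)
```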

