git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7] git-p4: add config git-p4.pathEncoding
@ 2015-09-14 17:10 larsxschneider
  2015-09-14 17:10 ` [PATCH v7] git-p4: improve path encoding verbose output larsxschneider
  2015-09-14 18:25 ` [PATCH v7] git-p4: add config git-p4.pathEncoding Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: larsxschneider @ 2015-09-14 17:10 UTC (permalink / raw)
  To: git; +Cc: luke, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

diff to v6:
* Always print the encoded path in verbose mode.

v6 is already on next (a9e383). This patch must be applied on next.
Is this the right way to handle this situation?

Thanks,
Lars

Lars Schneider (1):
  git-p4: improve path encoding verbose output

 git-p4.py | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

--
2.5.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-14 17:10 [PATCH v7] git-p4: add config git-p4.pathEncoding larsxschneider
@ 2015-09-14 17:10 ` larsxschneider
  2015-09-14 18:40   ` Junio C Hamano
  2015-09-15  7:31   ` Luke Diamand
  2015-09-14 18:25 ` [PATCH v7] git-p4: add config git-p4.pathEncoding Junio C Hamano
  1 sibling, 2 replies; 8+ messages in thread
From: larsxschneider @ 2015-09-14 17:10 UTC (permalink / raw)
  To: git; +Cc: luke, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

If a path with non-ASCII characters is detected then print always the
encoding and the encoded string in verbose mode.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 git-p4.py | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index d45cf2b..da25d3f 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]
 
-        if gitConfig("git-p4.pathEncoding"):
-            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
-        elif self.verbose:
-            try:
-                relPath.decode('ascii')
-            except:
-                print (
-                    "Path with Non-ASCII characters detected and no path encoding defined. "
-                    "Please check the encoding: %s" % relPath
-                )
+        try:
+            relPath.decode('ascii')
+        except:
+            encoding = 'utf8'
+            if gitConfig('git-p4.pathEncoding'):
+                encoding = gitConfig('git-p4.pathEncoding')
+                relPath = relPath.decode(encoding).encode('utf8', 'replace')
+            if self.verbose:
+                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
 
         self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
 
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: add config git-p4.pathEncoding
  2015-09-14 17:10 [PATCH v7] git-p4: add config git-p4.pathEncoding larsxschneider
  2015-09-14 17:10 ` [PATCH v7] git-p4: improve path encoding verbose output larsxschneider
@ 2015-09-14 18:25 ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2015-09-14 18:25 UTC (permalink / raw)
  To: larsxschneider; +Cc: git, luke

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> diff to v6:
> * Always print the encoded path in verbose mode.
>
> v6 is already on next (a9e383). This patch must be applied on next.
> Is this the right way to handle this situation?

Yes, I thought that the reviewers found v6 was solid enough, and it
is now in 'next'.  It is best to do further polishing on the tip of
the topic that is in 'next', i.e. a9e38359 (git-p4: add config
git-p4.pathEncoding, 2015-09-03).

Labeling the patch as v7 is misleading, but from the context (like
your cover letter description) it was clear that the patch is a
follow-up, so everything looks good from procedural point of view.

Thanks for making reviewer's life easier.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-14 17:10 ` [PATCH v7] git-p4: improve path encoding verbose output larsxschneider
@ 2015-09-14 18:40   ` Junio C Hamano
  2015-09-15 15:55     ` Lars Schneider
  2015-09-15  7:31   ` Luke Diamand
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2015-09-14 18:40 UTC (permalink / raw)
  To: larsxschneider; +Cc: git, luke

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> If a path with non-ASCII characters is detected then print always the
> encoding and the encoded string in verbose mode.

Earlier if the user tells us that s/he knows what she is doing
by setting the configuration, we just followed the instruction
without complaining or notifying.  The differences in this version
are

 (1) if the path is in ASCII, the configuration is not even
     consulted, and we didn't do any path munging.

 (2) for a non-ASCII path, even if the user tells us that s/he knows
     what she is doing, we notify what we did under "--verbose"
     mode.

I think (1) is a definite improvement, but it is not immediately
obvious why (2) is an improvement.  It is clearly a good thing to
let the user know when we munged the path without being told, but
when the configuration is given, it can be argued both ways.  It may
be a good thing to reassure that the configuration is kicking in, or
it may be a needless noise to tell the user that we did what we were
told to do.

In any case, I suspectq that the call to decode-encode to munge
relPath is indented one level too deep in this patch.  You would
want to use the configured value if exists and utf8 if there is no
configuration, but in either case you would want to munge relPath
when it does not decode as ASCII, no?

> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
>  git-p4.py | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index d45cf2b..da25d3f 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>              text = regexp.sub(r'$\1$', text)
>              contents = [ text ]
>  
> -        if gitConfig("git-p4.pathEncoding"):
> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
> -        elif self.verbose:
> -            try:
> -                relPath.decode('ascii')
> -            except:
> -                print (
> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
> -                    "Please check the encoding: %s" % relPath
> -                )
> +        try:
> +            relPath.decode('ascii')
> +        except:
> +            encoding = 'utf8'
> +            if gitConfig('git-p4.pathEncoding'):
> +                encoding = gitConfig('git-p4.pathEncoding')
> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
> +            if self.verbose:
> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>  
>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-14 17:10 ` [PATCH v7] git-p4: improve path encoding verbose output larsxschneider
  2015-09-14 18:40   ` Junio C Hamano
@ 2015-09-15  7:31   ` Luke Diamand
  2015-09-15  7:36     ` Luke Diamand
  2015-09-15 15:59     ` Lars Schneider
  1 sibling, 2 replies; 8+ messages in thread
From: Luke Diamand @ 2015-09-15  7:31 UTC (permalink / raw)
  To: larsxschneider, git

On 14/09/15 18:10, larsxschneider@gmail.com wrote:
> From: Lars Schneider <larsxschneider@gmail.com>
>
> If a path with non-ASCII characters is detected then print always the

s/print always/print/


> encoding and the encoded string in verbose mode.
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
>   git-p4.py | 19 +++++++++----------
>   1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index d45cf2b..da25d3f 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>               text = regexp.sub(r'$\1$', text)
>               contents = [ text ]
>
> -        if gitConfig("git-p4.pathEncoding"):
> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
> -        elif self.verbose:
> -            try:
> -                relPath.decode('ascii')
> -            except:
> -                print (
> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
> -                    "Please check the encoding: %s" % relPath
> -                )
> +        try:
> +            relPath.decode('ascii')
> +        except:
> +            encoding = 'utf8'
> +            if gitConfig('git-p4.pathEncoding'):
> +                encoding = gitConfig('git-p4.pathEncoding')

It would be better to query this once at startup. Otherwise we're 
potentially forking "git config" twice per file which on a large repo 
could become significant. Make it an instance variable perhaps?

> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
> +            if self.verbose:
> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>
>           self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-15  7:31   ` Luke Diamand
@ 2015-09-15  7:36     ` Luke Diamand
  2015-09-15 15:59     ` Lars Schneider
  1 sibling, 0 replies; 8+ messages in thread
From: Luke Diamand @ 2015-09-15  7:36 UTC (permalink / raw)
  To: larsxschneider, git

On 15/09/15 08:31, Luke Diamand wrote:
> On 14/09/15 18:10, larsxschneider@gmail.com wrote:
>
> It would be better to query this once at startup. Otherwise we're
> potentially forking "git config" twice per file which on a large repo
> could become significant. Make it an instance variable perhaps?

This is of course complete nonsense since gitConfig caches its results!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-14 18:40   ` Junio C Hamano
@ 2015-09-15 15:55     ` Lars Schneider
  0 siblings, 0 replies; 8+ messages in thread
From: Lars Schneider @ 2015-09-15 15:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, luke


On 14 Sep 2015, at 20:40, Junio C Hamano <gitster@pobox.com> wrote:

> larsxschneider@gmail.com writes:
> 
>> From: Lars Schneider <larsxschneider@gmail.com>
>> 
>> If a path with non-ASCII characters is detected then print always the
>> encoding and the encoded string in verbose mode.
> 
> Earlier if the user tells us that s/he knows what she is doing
> by setting the configuration, we just followed the instruction
> without complaining or notifying.  The differences in this version
> are
> 
> (1) if the path is in ASCII, the configuration is not even
>     consulted, and we didn't do any path munging.
Correct!


> (2) for a non-ASCII path, even if the user tells us that s/he knows
>     what she is doing, we notify what we did under "--verbose"
>     mode.
Correct!


> I think (1) is a definite improvement, but it is not immediately
> obvious why (2) is an improvement.  It is clearly a good thing to
> let the user know when we munged the path without being told, but
> when the configuration is given, it can be argued both ways.  It may
> be a good thing to reassure that the configuration is kicking in, or
> it may be a needless noise to tell the user that we did what we were
> told to do.
I get your point. However, changing file names in a repository is a pretty significant action and therefore I would prefer to explicitly tell the user about it. Some encodings differ only slightly I would like to have an easy way to look at all the changed paths to ensure I picked the right encoding (e.g. grep “Path with non-ASCII characters detected”). I also assume the user is OK with noise since s/he enabled “verbose” mode :-)


> In any case, I suspectq that the call to decode-encode to munge
> relPath is indented one level too deep in this patch.  You would
> want to use the configured value if exists and utf8 if there is no
> configuration, but in either case you would want to munge relPath
> when it does not decode as ASCII, no?
Good catch! It works with the indented code too because UTF8 is the default encoding for relPath later on. However, with your suggestion the code is more explicit. I will change it in the next roll

Thanks!

> 
>> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
>> ---
>> git-p4.py | 19 +++++++++----------
>> 1 file changed, 9 insertions(+), 10 deletions(-)
>> 
>> diff --git a/git-p4.py b/git-p4.py
>> index d45cf2b..da25d3f 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>>             text = regexp.sub(r'$\1$', text)
>>             contents = [ text ]
>> 
>> -        if gitConfig("git-p4.pathEncoding"):
>> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
>> -        elif self.verbose:
>> -            try:
>> -                relPath.decode('ascii')
>> -            except:
>> -                print (
>> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
>> -                    "Please check the encoding: %s" % relPath
>> -                )
>> +        try:
>> +            relPath.decode('ascii')
>> +        except:
>> +            encoding = 'utf8'
>> +            if gitConfig('git-p4.pathEncoding'):
>> +                encoding = gitConfig('git-p4.pathEncoding')
>> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
>> +            if self.verbose:
>> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>> 
>>         self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v7] git-p4: improve path encoding verbose output
  2015-09-15  7:31   ` Luke Diamand
  2015-09-15  7:36     ` Luke Diamand
@ 2015-09-15 15:59     ` Lars Schneider
  1 sibling, 0 replies; 8+ messages in thread
From: Lars Schneider @ 2015-09-15 15:59 UTC (permalink / raw)
  To: Luke Diamand; +Cc: git


On 15 Sep 2015, at 09:31, Luke Diamand <luke@diamand.org> wrote:

> On 14/09/15 18:10, larsxschneider@gmail.com wrote:
>> From: Lars Schneider <larsxschneider@gmail.com>
>> 
>> If a path with non-ASCII characters is detected then print always the
> 
> s/print always/print/
I will fix it.

> 
> 
>> encoding and the encoded string in verbose mode.
>> 
>> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
>> ---
>>  git-p4.py | 19 +++++++++----------
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>> 
>> diff --git a/git-p4.py b/git-p4.py
>> index d45cf2b..da25d3f 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>>              text = regexp.sub(r'$\1$', text)
>>              contents = [ text ]
>> 
>> -        if gitConfig("git-p4.pathEncoding"):
>> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
>> -        elif self.verbose:
>> -            try:
>> -                relPath.decode('ascii')
>> -            except:
>> -                print (
>> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
>> -                    "Please check the encoding: %s" % relPath
>> -                )
>> +        try:
>> +            relPath.decode('ascii')
>> +        except:
>> +            encoding = 'utf8'
>> +            if gitConfig('git-p4.pathEncoding'):
>> +                encoding = gitConfig('git-p4.pathEncoding')
> 
> It would be better to query this once at startup. Otherwise we're potentially forking "git config" twice per file which on a large repo could become significant. Make it an instance variable perhaps?
solved in other email

> 
>> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
>> +            if self.verbose:
>> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>> 
>>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))

Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-09-15 15:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-14 17:10 [PATCH v7] git-p4: add config git-p4.pathEncoding larsxschneider
2015-09-14 17:10 ` [PATCH v7] git-p4: improve path encoding verbose output larsxschneider
2015-09-14 18:40   ` Junio C Hamano
2015-09-15 15:55     ` Lars Schneider
2015-09-15  7:31   ` Luke Diamand
2015-09-15  7:36     ` Luke Diamand
2015-09-15 15:59     ` Lars Schneider
2015-09-14 18:25 ` [PATCH v7] git-p4: add config git-p4.pathEncoding Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).