* [PATCH] git-p4: don't convert utf16 files.
From: Chris Li @ 2011-08-19 22:50 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
Some repositories contain utf16 files that git-p4 doesn't know
how to convert. For those files, git-p4 just writes the utf8
data. That is wrong, because git gets a different file than
perforce has, causing some Windows resource files to fail
to compile.
Using "p4 print -o tmpfile depotfile" avoids this
conversion (and the possible failure) altogether.
Signed-off-by: Chris Li <git@chrisli.org>
---
git-p4 | 11 +++++------
1 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/git-p4 b/git-p4
index 672b0c2..0c6a5cc 100755
--- a/git-p4
+++ b/git-p4
@@ -755,12 +755,11 @@ class P4FileReader:
break
if header['type'].startswith('utf16'):
- try:
- text = textBuffer.getvalue().encode('utf_16')
- except UnicodeDecodeError:
- # File checked in to Perforce has an error. Try without encoding
- print "Corrupt UTF-16 file in Perforce: %s" % header['depotFile']
- text = textBuffer.getvalue()
+ # Don't even try to convert utf16. Ask p4 to write the file directly.
+ tmpFile = tempfile.NamedTemporaryFile()
+ P4Helper().p4_system("print -o %s %s"%(tmpFile.name, header['depotFile']))
+ text = open(tmpFile.name).read()
+ tmpFile.close()
else:
text = textBuffer.getvalue()
textBuffer.close()
--
1.7.6
* Re: [PATCH] git-p4: don't convert utf16 files.
From: Pete Wyckoff @ 2011-08-21 15:21 UTC (permalink / raw)
To: Chris Li
Cc: git, Junio C Hamano, Eberhard Beilharz, Jordan Zimmerman,
Mike Crowe
christ.li@gmail.com wrote on Fri, 19 Aug 2011 15:50 -0700:
> Some repositories contain utf16 files that git-p4 doesn't know
> how to convert. For those files, git-p4 just writes the utf8
> data. That is wrong, because git gets a different file than
> perforce has, causing some Windows resource files to fail
> to compile.
>
> Using "p4 print -o tmpfile depotfile" avoids this
> conversion (and the possible failure) altogether.
This isn't contrib/fast-import/git-p4. Searching around, I
discovered a 2009 fork of git-p4 that is fairly active. CC-ing
some of the names I found on github.
Here's one such repo:
http://github.com/ermshiperete/git-p4
Git's git-p4 doesn't try to do anything special with utf-16. It
does \r\n mangling, but not $Keyword$ removal, then just streams
it to disk however p4 sends it. That's close to what you're
trying to do here.
-- Pete
* Re: [PATCH] git-p4: don't convert utf16 files.
From: Chris Li @ 2011-08-21 19:09 UTC (permalink / raw)
To: Pete Wyckoff
Cc: git, Junio C Hamano, Eberhard Beilharz, Jordan Zimmerman,
Mike Crowe
On Sun, Aug 21, 2011 at 8:21 AM, Pete Wyckoff <pw@padd.com> wrote:
>> Using "p4 print -o tmpfile depotfile" avoids this
>> conversion (and the possible failure) altogether.
>
> This isn't contrib/fast-import/git-p4. Searching around, I
> discovered a 2009 fork of git-p4 that is fairly active. CC-ing
> some of the names I found on github.
Oops. I forgot that I was trying out another git-p4, so this patch is
not for git-core. The git-p4 in the git-core repository has the same
problem. I was originally using the git-core one and hit the bug. Then
I tried another branch and found the same problem; I just forgot
which git repository I was in. Silly me.
Let me redo the patch for git.
>
> Here's one such repo:
>
> http://github.com/ermshiperete/git-p4
>
> Git's git-p4 doesn't try to do anything special with utf-16. It
> does \r\n mangling, but not $Keyword$ removal, then just streams
> it to disk however p4 sends it. That's close to what you're
> trying to do here.
No, it is not the same. Here is what I found. If you use
"p4 print" to fetch a perforce utf16 file, perforce converts that file to
utf8 before sending it to stdout. I guess the perforce guys assume
p4 print is used for the terminal console. If git-p4 just streams that to
disk, then those files are effectively checked out as utf8, which is about
half the size of the original utf16 file.
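To illustrate the size difference, here is a minimal Python 3 sketch. The sample text is made up for the example and is not taken from the real .mc file; it only shows that mostly-ASCII text takes about twice as many bytes in utf16 as in utf8, so a utf8 checkout cannot be byte-identical to the utf16 original.

```python
# Illustration only: the sample text below is hypothetical.
sample = "MessageId=1\r\nLanguage=English\r\nThe operation failed.\r\n" * 100

# The "utf-16" codec emits a 2-byte BOM plus 2 bytes per ASCII character.
utf16_bytes = sample.encode("utf-16")
# utf8 needs 1 byte per ASCII character and no BOM here.
utf8_bytes = sample.encode("utf-8")

ratio = len(utf8_bytes) / len(utf16_bytes)
print(len(utf16_bytes), len(utf8_bytes), round(ratio, 2))
```

Text with many non-ASCII characters would shift the ratio, since those take more than one byte in utf8.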
I never liked those utf16 files; I think they should be checked in
as binary. But someone did check in utf16 files in the repository I
use at work, so now I have to deal with them. I guess the reason perforce
wants to know about utf16 is to do the \r\n conversion properly on
those files. The file in question is a Windows .mc file.
It contains translations of the error messages into different languages.
So git-p4 in git-core effectively converts the perforce utf16 file to
utf8 during import.
However, if I run "p4 print -o filename", perforce writes the file in
the original utf16 format. I don't need to do the utf8 to utf16 conversion,
which can fail, and does in my case.
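A rough Python 3 sketch of that approach follows. It is not the git-p4 code: the helper names (build_print_command, read_depot_file_raw) and the `run` parameter are hypothetical, and it assumes the `p4` client is on PATH and already configured. It asks p4 to write the file to a path inside a fresh temporary directory and then reads the bytes back in binary mode, so no encoding or newline translation happens on the Python side.

```python
import os
import subprocess
import tempfile


def build_print_command(depot_path, out_path):
    # argv for: p4 print -q -o <out_path> <depot_path>
    # -q suppresses the header line, -o names the local output file.
    return ["p4", "print", "-q", "-o", out_path, depot_path]


def read_depot_file_raw(depot_path, run=subprocess.check_call):
    """Return the bytes p4 writes for depot_path, without re-encoding.

    `run` is a hypothetical hook so the command runner can be swapped out;
    by default the real p4 client is invoked and a non-zero exit raises.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Use a path that does not exist yet, so p4 creates the file itself.
        out_path = os.path.join(tmp_dir, "p4-print-output")
        run(build_print_command(depot_path, out_path))
        # Binary mode: keep the utf16 bytes exactly as p4 wrote them.
        with open(out_path, "rb") as f:
            return f.read()
```

Writing to a new path in a temporary directory, rather than reopening a still-open NamedTemporaryFile by name, is a design choice made here because reopening by name is not portable to Windows. If p4 leaves the output file read-only, cleanup on Windows may need extra handling.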
The git-p4 in git-core needs the same treatment. I will work out a patch
and resubmit it later.
Thanks
Chris