git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] git-p4: Fetch the proper revision of utf16 files
@ 2015-04-03 21:13 Daniel Bingham
  2015-04-03 21:13 ` Daniel Bingham
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Bingham @ 2015-04-03 21:13 UTC (permalink / raw)
  To: git; +Cc: luke, pw, Daniel Bingham

I discovered what appears to be a bug with the handling of utf16 type files 
when cloning from P4.  It appears that it always fetches the content of the 
latest version rather than getting the revision specified in the changelist 
being processed.  This patch is an attempt to resolve that by formatting
the print command using the revision number as is done in streamP4Files().

Daniel Bingham (1):
  git-p4: Fetch the proper revision of utf16 files

 git-p4.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.3.5

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] git-p4: Fetch the proper revision of utf16 files
  2015-04-03 21:13 [PATCH] git-p4: Fetch the proper revision of utf16 files Daniel Bingham
@ 2015-04-03 21:13 ` Daniel Bingham
  2015-04-03 21:46   ` Eric Sunshine
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Bingham @ 2015-04-03 21:13 UTC (permalink / raw)
  To: git; +Cc: luke, pw, Daniel Bingham

git-p4 always fetches the latest revision of UTF16
files from P4 rather than the revision at the commit being sync'd.

The print command should, instead, specify the revision number from the
commit in question using the file#revision syntax.

The file#revision syntax is preferable over file@changelist for
consistency with how streamP4Files formats the fileArgs list.

Signed-off-by: Daniel Bingham <git@dbingham.com>
---
 git-p4.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index ff132b2..156f3a4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2101,7 +2101,8 @@ class P4Sync(Command, P4UserMap):
             # them back too.  This is not needed to the cygwin windows version,
             # just the native "NT" type.
             #
-            text = p4_read_pipe(['print', '-q', '-o', '-', file['depotFile']])
+            ufile = "%s#%s" % (file['depotFile'], file['rev'])
+            text = p4_read_pipe(['print', '-q', '-o', '-', ufile])
             if p4_version_string().find("/NT") >= 0:
                 text = text.replace("\r\n", "\n")
             contents = [ text ]
-- 
2.3.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] git-p4: Fetch the proper revision of utf16 files
  2015-04-03 21:13 ` Daniel Bingham
@ 2015-04-03 21:46   ` Eric Sunshine
  2015-04-03 22:49     ` Domain Admin
       [not found]     ` <CAFzU1R+m8Gw3O_HHKmKq0rTALh+TJ1kK84=L2rsGCNft4XcT=Q@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Sunshine @ 2015-04-03 21:46 UTC (permalink / raw)
  To: Daniel Bingham; +Cc: Git List, luke, Pete Wyckoff, Daniel Bingham

On Fri, Apr 3, 2015 at 5:13 PM, Daniel Bingham <daniel@dbingham.com> wrote:
> git-p4 always fetches the latest revision of UTF16
> files from P4 rather than the revision at the commit being sync'd.
>
> The print command should, instead, specify the revision number from the
> commit in question using the file#revision syntax.
>
> The file#revision syntax is preferable over file@changelist for
> consistency with how streamP4Files formats the fileArgs list.

As a non-Perforce reader trying to understand this patch, there are a
couple issues which are unclear or inadequately explained. Perhaps you
could provide a bit more detail or cite relevant sources.

First, does "UTF16 file" refer to the content or the filename?

Second, I may be entirely missing it, but the commit message doesn't
seem to explain why this impacts only "UTF16 files", and why this
solution is the appropriate fix.

If the answer to the first question is that the filename is UTF-16,
then would an alternate fix be to convert the value of
file['depotFile'] to have the same encoding as the "print -q -o - ..."
command-line? (Again, please excuse my Perforce-ignorance if I'm
completely off the mark.)

> Signed-off-by: Daniel Bingham <git@dbingham.com>
> ---
>  git-p4.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index ff132b2..156f3a4 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2101,7 +2101,8 @@ class P4Sync(Command, P4UserMap):
>              # them back too.  This is not needed to the cygwin windows version,
>              # just the native "NT" type.
>              #
> -            text = p4_read_pipe(['print', '-q', '-o', '-', file['depotFile']])
> +            ufile = "%s#%s" % (file['depotFile'], file['rev'])
> +            text = p4_read_pipe(['print', '-q', '-o', '-', ufile])
>              if p4_version_string().find("/NT") >= 0:
>                  text = text.replace("\r\n", "\n")
>              contents = [ text ]
> --
> 2.3.5

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] git-p4: Fetch the proper revision of utf16 files
  2015-04-03 21:46   ` Eric Sunshine
@ 2015-04-03 22:49     ` Domain Admin
       [not found]     ` <CAFzU1R+m8Gw3O_HHKmKq0rTALh+TJ1kK84=L2rsGCNft4XcT=Q@mail.gmail.com>
  1 sibling, 0 replies; 5+ messages in thread
From: Domain Admin @ 2015-04-03 22:49 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Pete Wyckoff

*note* I tried to send this to the list earlier but forgot to set the
mode to plain-text so it got rejected.  Resending.  Apologies for any
duplicates.

I think the context of the patch doesn't make this clear, but if you
look at git-p4.py in this spot you'll see that this is in a block that
begins:

if type_base == "utf16":

Where "type_base" is extracted, by the script, from the filetype
provided by p4 (i.e. metadata provided by p4). In the particular
scenario I encountered, P4 said that the file was of type xutf16 from
which the split_p4_type() method extracts "utf16" as the type_base.

The essence of the issue here has nothing to do with UTF16, its just
coincidental that it happens for files of that type.  The problem is
that the script wasn't requesting the proper revision from P4 and thus
would always get the *latest* revision.  This corrects that by having
the script request the appropriate revision from P4.

Daniel

On Fri, Apr 3, 2015 at 2:46 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Fri, Apr 3, 2015 at 5:13 PM, Daniel Bingham <daniel@dbingham.com> wrote:
>> git-p4 always fetches the latest revision of UTF16
>> files from P4 rather than the revision at the commit being sync'd.
>>
>> The print command should, instead, specify the revision number from the
>> commit in question using the file#revision syntax.
>>
>> The file#revision syntax is preferable over file@changelist for
>> consistency with how streamP4Files formats the fileArgs list.
>
> As a non-Perforce reader trying to understand this patch, there are a
> couple issues which are unclear or inadequately explained. Perhaps you
> could provide a bit more detail or cite relevant sources.
>
> First, does "UTF16 file" refer to the content or the filename?
>
> Second, I may be entirely missing it, but the commit message doesn't
> seem to explain why this impacts only "UTF16 files", and why this
> solution is the appropriate fix.
>
> If the answer to the first question is that the filename is UTF-16,
> then would an alternate fix be to convert the value of
> file['depotFile'] to have the same encoding as the "print -q -o - ..."
> command-line? (Again, please excuse my Perforce-ignorance if I'm
> completely off the mark.)
>
>> Signed-off-by: Daniel Bingham <git@dbingham.com>
>> ---
>>  git-p4.py | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/git-p4.py b/git-p4.py
>> index ff132b2..156f3a4 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2101,7 +2101,8 @@ class P4Sync(Command, P4UserMap):
>>              # them back too.  This is not needed to the cygwin windows version,
>>              # just the native "NT" type.
>>              #
>> -            text = p4_read_pipe(['print', '-q', '-o', '-', file['depotFile']])
>> +            ufile = "%s#%s" % (file['depotFile'], file['rev'])
>> +            text = p4_read_pipe(['print', '-q', '-o', '-', ufile])
>>              if p4_version_string().find("/NT") >= 0:
>>                  text = text.replace("\r\n", "\n")
>>              contents = [ text ]
>> --
>> 2.3.5

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] git-p4: Fetch the proper revision of utf16 files
       [not found]     ` <CAFzU1R+m8Gw3O_HHKmKq0rTALh+TJ1kK84=L2rsGCNft4XcT=Q@mail.gmail.com>
@ 2015-04-03 22:49       ` Eric Sunshine
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Sunshine @ 2015-04-03 22:49 UTC (permalink / raw)
  To: Domain Admin; +Cc: Git List, luke, Pete Wyckoff

[Daniel: Your HTML-formatted response probably did not make it to the
Git mailing list since only plain text is accepted. Also, please
respond inline on this list rather than top-posting.]

On Fri, Apr 3, 2015 at 5:54 PM, Domain Admin <daniel@dbingham.com> wrote:
> I think the context of the patch doesn't make this clear, but if you look at
> git-p4.py in this spot you'll see that this is in a block that begins:
>
> if type_base == "utf16":
>
> Where "type_base" is extracted, by the script, from the filetype provided by
> p4 (i.e. metadata provided by p4). In the particular scenario I encountered,
> P4 said that the file was of type xutf16 from which the split_p4_type()
> method extracts "utf16" as the type_base.

So, the patch's mention of "UTF16 file" indeed refers to the content
rather than the filename.

Now that I've studied up on Perforce a bit and read through some of
the git-p4.py code, I understand that there is a special-case for
files with utf-16 content in which git-p4 re-reads the content in
order to work around a problem with ASCII text saved with type utf-16
getting mangled. The work-around, 55aa5714 (git-p4: handle utf16
filetype properly; 2011-10-17), is buggy in that it neglects to
specify a particular revision of the file, and your patch fixes that.
Perhaps your commit message could make that a bit clearer by
mentioning the special-case and citing 55aa5714? Maybe something like:

    The special-case handling of files with UTF-16 content done by
    55aa5714 (git-p4: handle utf16 filetype properly; 2011-10-17)
    neglects to specify a revision when re-reading the file content.
    As a result, the latest revision of the UTF-16 file is always
    fetched rather than the desired one.

Also, is it possible to craft a test for this issue and place it in
one of the t98xx scripts?

> Daniel
>
> On Fri, Apr 3, 2015 at 2:46 PM, Eric Sunshine <sunshine@sunshineco.com>
> wrote:
>>
>> On Fri, Apr 3, 2015 at 5:13 PM, Daniel Bingham <daniel@dbingham.com>
>> wrote:
>> > git-p4 always fetches the latest revision of UTF16
>> > files from P4 rather than the revision at the commit being sync'd.
>> >
>> > The print command should, instead, specify the revision number from the
>> > commit in question using the file#revision syntax.
>> >
>> > The file#revision syntax is preferable over file@changelist for
>> > consistency with how streamP4Files formats the fileArgs list.
>>
>> As a non-Perforce reader trying to understand this patch, there are a
>> couple issues which are unclear or inadequately explained. Perhaps you
>> could provide a bit more detail or cite relevant sources.
>>
>> First, does "UTF16 file" refer to the content or the filename?
>>
>> Second, I may be entirely missing it, but the commit message doesn't
>> seem to explain why this impacts only "UTF16 files", and why this
>> solution is the appropriate fix.
>>
>> If the answer to the first question is that the filename is UTF-16,
>> then would an alternate fix be to convert the value of
>> file['depotFile'] to have the same encoding as the "print -q -o - ..."
>> command-line? (Again, please excuse my Perforce-ignorance if I'm
>> completely off the mark.)
>>
>> > Signed-off-by: Daniel Bingham <git@dbingham.com>
>> > ---
>> >  git-p4.py | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/git-p4.py b/git-p4.py
>> > index ff132b2..156f3a4 100755
>> > --- a/git-p4.py
>> > +++ b/git-p4.py
>> > @@ -2101,7 +2101,8 @@ class P4Sync(Command, P4UserMap):
>> >              # them back too.  This is not needed to the cygwin windows
>> > version,
>> >              # just the native "NT" type.
>> >              #
>> > -            text = p4_read_pipe(['print', '-q', '-o', '-',
>> > file['depotFile']])
>> > +            ufile = "%s#%s" % (file['depotFile'], file['rev'])
>> > +            text = p4_read_pipe(['print', '-q', '-o', '-', ufile])
>> >              if p4_version_string().find("/NT") >= 0:
>> >                  text = text.replace("\r\n", "\n")
>> >              contents = [ text ]
>> > --
>> > 2.3.5

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-04-03 22:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-03 21:13 [PATCH] git-p4: Fetch the proper revision of utf16 files Daniel Bingham
2015-04-03 21:13 ` Daniel Bingham
2015-04-03 21:46   ` Eric Sunshine
2015-04-03 22:49     ` Domain Admin
     [not found]     ` <CAFzU1R+m8Gw3O_HHKmKq0rTALh+TJ1kK84=L2rsGCNft4XcT=Q@mail.gmail.com>
2015-04-03 22:49       ` Eric Sunshine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).