* [RFC v1] git-p4: test case for RCS keyword problem
@ 2011-05-09 7:49 Luke Diamand
2011-05-09 23:00 ` Pete Wyckoff
0 siblings, 1 reply; 4+ messages in thread
From: Luke Diamand @ 2011-05-09 7:49 UTC (permalink / raw)
To: git; +Cc: Pete Wyckoff, Michael Horowitz, Luke Diamand
This is following on from some earlier threads about RCS keywords
and git-p4:
http://marc.info/?l=git&m=122145837226632&w=2
http://marc.info/?l=git&m=130470278328964&w=2
The problem is that git-p4 imports into git with RCS keywords
unexpanded (e.g. as $Id$), which is certainly the right thing
to do given how nasty RCS keywords are.
However, when it comes to try to apply your changes, it
applies them against a checked-out p4 tree, where the RCS keywords
*are* expanded. This then fails if in git you modify any lines
that contain RCS keywords (i.e. deleting them, or deleting the
entire file).
You would think you could just tell p4 to not expand RCS keywords
in your client view, but sadly that option doesn't exist :-(
This isn't a fix, it's just a test case that shows the problem,
and doesn't even try to test the whole-file deletion case.
I'm hoping someone will suggest a good way to handle this.
Otherwise, I've got a possible scheme that involves spotting the
failure to apply the patch, patching up RCS keywords in the
p4 client shadow and then trying again. It's not pretty but it seems
like it ought to work. My current version doesn't handle deletion,
and zaps *all* RCS keywords rather than just the ones zapped in git;
more work is needed before I can submit it.
Regards!
Luke
Signed-off-by: Luke Diamand <luke@diamand.org>
---
t/t9800-git-p4.sh | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/t/t9800-git-p4.sh b/t/t9800-git-p4.sh
index 0dcaa9c..8e79331 100755
--- a/t/t9800-git-p4.sh
+++ b/t/t9800-git-p4.sh
@@ -258,6 +258,50 @@ test_expect_success 'not preserving user with mixed authorship' '
rm -rf "$git" && mkdir "$git"
'
+create_kw_file() {
+ cat <<'EOF' > $1
+/* A file
+ Id: $Id:$
+ Revision: $Revision:$
+ File: $File:$
+ */
+int main(int argc, const char **argv) {
+ return 0;
+}
+EOF
+ git add $1
+}
+
+p4_append_to_file() {
+ f=$1
+ p4 edit -t ktext $f &&
+ echo "/* $(date) */" >> $f &&
+ p4 submit -d "appending a line in p4" &&
+ cat $f
+}
+
+# Create some files with RCS keywords. If they get modified
+# elsewhere then the version number gets bumped which then
+# results in a merge conflict if we touch the RCS kw lines,
+# even though the change itself would otherwise apply cleanly.
+test_expect_failure 'cope with rcs keyword expansion damage' '
+ "$GITP4" clone --dest="$git" //depot &&
+ cd "$git" &&
+ git config git-p4.skipSubmitEditCheck true &&
+ create_kw_file kwfile1.c &&
+ git commit -m "Files with RCS keywords" &&
+ P4EDITOR=touch "$GITP4" commit && "$GITP4" rebase &&
+ (cd ../cli && p4_append_to_file kwfile1.c) &&
+ perl -n -i -e "print unless m/Revision:/" kwfile1.c &&
+ git add kwfile1.c &&
+ git commit -m "Zap an RCS kw line" &&
+ (echo 'w' | P4EDITOR=touch "$GITP4" submit) && "$GITP4" rebase &&
+ git diff p4/master &&
+ "$GITP4" commit &&
+ cd "$TRASH_DIRECTORY" &&
+ rm -rf "$git" && mkdir "$git"
+'
+
test_expect_success 'shutdown' '
pid=`pgrep -f p4d` &&
--
1.7.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC v1] git-p4: test case for RCS keyword problem
2011-05-09 7:49 [RFC v1] git-p4: test case for RCS keyword problem Luke Diamand
@ 2011-05-09 23:00 ` Pete Wyckoff
2011-05-10 8:13 ` Luke Diamand
0 siblings, 1 reply; 4+ messages in thread
From: Pete Wyckoff @ 2011-05-09 23:00 UTC (permalink / raw)
To: Luke Diamand; +Cc: git, Michael Horowitz
luke@diamand.org wrote on Mon, 09 May 2011 08:49 +0100:
> This is following on from some earlier threads about RCS keywords
> and git-p4:
>
> http://marc.info/?l=git&m=122145837226632&w=2
> http://marc.info/?l=git&m=130470278328964&w=2
I hadn't read Mike's mail. Not that deep into my backlog yet.
> The problem is that git-p4 imports into git with RCS keywords
> unexpanded (e.g. as $Id$), which is certainly the right thing
> to do given how nasty RCS keywords are.
>
> However, when it comes to try to apply your changes, it
> applies them against a checked-out p4 tree, where the RCS keywords
> *are* expanded. This then fails if in git you modify any lines
> that contain RCS keywords (i.e. deleting them, or deleting the
> entire file).
>
> You would think you could just tell p4 to not expand RCS keywords
> in your client view, but sadly that option doesn't exist :-(
One idea. I snuck in a2b665d "convert filter: supply path to
external driver" that lets you do:
git config filter.p4.clean git-p4-filter --clean %f
git config filter.p4.smudge git-p4-filter --smudge %f
echo "*.c filter=p4" >> .gitattributes
Then your git tree can have expanded keywords too. The script to
clean is pretty easy; to smudge you have to ask p4 for the
information using fstat and filelog. My script is pretty nasty
and full of dependencies, but I could clean it up if you think
this is a good way to go.
We'd need to discover text+k files at clone and sync time
and maintain the .gitattributes accordingly. Filtering all
files would be wrong, and slow.
Dhruva's approach has the downside of always including RCS lines
in every commit when the file changes in p4.
> This isn't a fix, it's just a test case that shows the problem,
> and doesn't even try to test the whole-file deletion case.
>
> I'm hoping someone will suggest a good way to handle this.
>
> Otherwise, I've got a possible scheme that involves spotting the
> failure to apply the patch, patching up RCS keywords in the
> p4 client shadow and then trying again. It's not pretty but it seems
> like it ought to work. My current version doesn't handle deletion,
> and zaps *all* RCS keywords rather than just the ones zapped in git;
> more work is needed before I can submit it.
I'm a little hazy on how you would identify a patch failure as
due to an RCS keyword, but maybe it's possible. The deleted line
case is surely hard.
Curious what you think of switching to having expanded keywords
in the repo, but using smudge/clean instead.
Maybe hang onto this test case until we figure out how we want
to deal with it.
-- Pete
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC v1] git-p4: test case for RCS keyword problem
2011-05-09 23:00 ` Pete Wyckoff
@ 2011-05-10 8:13 ` Luke Diamand
2011-05-10 11:47 ` Pete Wyckoff
0 siblings, 1 reply; 4+ messages in thread
From: Luke Diamand @ 2011-05-10 8:13 UTC (permalink / raw)
To: Pete Wyckoff; +Cc: git, Michael Horowitz
On 10/05/11 00:00, Pete Wyckoff wrote:
> luke@diamand.org wrote on Mon, 09 May 2011 08:49 +0100:
<snip>
>>
>> You would think you could just tell p4 to not expand RCS keywords
>> in your client view, but sadly that option doesn't exist :-(
>
> One idea. I snuck in a2b665d "convert filter: supply path to
> external driver" that lets you do:
>
> git config filter.p4.clean git-p4-filter --clean %f
> git config filter.p4.smudge git-p4-filter --smudge %f
> echo "*.c filter=p4">> .gitattributes
>
> Then your git tree can have expanded keywords too. The script to
> clean is pretty easy; to smudge you have to ask p4 for the
> information using fstat and filelog. My script is pretty nasty
> and full of dependencies, but I could clean it up if you think
> this is a good way to go.
Would this cause problems switching from one branch to another? If
there's a file with $Date:$ or $Revision:$ then every time you do "git
checkout other_branch" or even just "git stash" git would have to patch
up all files tagged with ktext (tens of thousands in my case). If you
didn't you'd be right back at square one.
At that point it would be easier just to submit a one-time change to
Perforce that simply zapped every RCS keyword in every file (probably
resulting in all kinds of pain for perforce users though).
My feeling is that there's no way to elegantly handle RCS keywords.
They're a really nasty horrible hangover from the dark ages, and if
you've got them in your code base you're going to have to accept that
anything that tries to deal with them is going to be ugly.
> We'd need to discover text+k files at clone and sync time
> and maintain the .gitattributes accordingly. Filtering all
> files would be wrong, and slow.
>
> Dhruva's approach has the downside of always including RCS lines
> in every commit when the file changes in p4.
>
>> This isn't a fix, it's just a test case that shows the problem,
>> and doesn't even try to test the whole-file deletion case.
>>
>> I'm hoping someone will suggest a good way to handle this.
>>
>> Otherwise, I've got a possible scheme that involves spotting the
>> failure to apply the patch, patching up RCS keywords in the
>> p4 client shadow and then trying again. It's not pretty but it seems
>> like it ought to work. My current version doesn't handle deletion,
>> and zaps *all* RCS keywords rather than just the ones zapped in git;
>> more work is needed before I can submit it.
>
> I'm a little hazy on how you would identify a patch failure as
> due to an RCS keyword, but maybe it's possible. The deleted line
> case is surely hard.
My existing patch is pretty crude - if "git apply --check" fails then it
parses the output, finds the files causing the problem, checks if
they're +ktext, and then zaps every RCS keyword in the file in the P4
shadow repo, replacing them with just $Keyword:$.
Amazingly this seems to work, since git then sees the text it is
expecting and is happy, while untouched RCS keywords go back to $KW:$
which P4 is quite happy to re-expand!
I haven't tried to handle the case where the entire file is deleted.
This is a bit harder - the file won't be opened for edit so I imagine
you have to do some kind of 'chmod +w' :-(
>
> Curious what you think of switching to having expanded keywords
> in the repo, but using smudge/clean instead.
One of the many things I like about git is that by default it never
messes with the contents of my files. I've spent too much of my life
trawling through diffs full of changes that are down to RCS keywords!
>
> Maybe hang onto this test case until we figure out how we want
> to deal with it.
Another option I wondered about is to modify git apply so that it spots
RCS keywords and ignores them? This would then mimic what p4 itself does.
And another option is to just admit that RCS keywords are stupid and
can't be sorted out properly, put some explanation into git-p4.txt to
explain the problem to avoid having these threads pop up, and then
everyone just has to zap them in Perforce.
Regards!
Luke
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC v1] git-p4: test case for RCS keyword problem
2011-05-10 8:13 ` Luke Diamand
@ 2011-05-10 11:47 ` Pete Wyckoff
0 siblings, 0 replies; 4+ messages in thread
From: Pete Wyckoff @ 2011-05-10 11:47 UTC (permalink / raw)
To: Luke Diamand; +Cc: git, Michael Horowitz
luke@diamand.org wrote on Tue, 10 May 2011 09:13 +0100:
> Would this cause problems switching from one branch to another? If
> there's a file with $Date:$ or $Revision:$ then every time you do
> "git checkout other_branch" or even just "git stash" git would have
> to patch up all files tagged with ktext (tens of thousands in my
> case). If you didn't you'd be right back at square one.
>
> At that point it would be easier just to submit a one-time change to
> Perforce that simply zapped every RCS keyword in every file
> (probably resulting in all kinds of pain for perforce users though).
Checkout/branch work fine, but scaling is definitely an issue.
The clean/smudge approach scales up to tens of files, but for
thousands it would be too painful without some sort of cache of
p4 fstat info. Which maybe could be gotten at clone time.
We've been destroying RCS keywords in our legacy p4, but there
are still a handful left to satisfy other systems that look for
$Id$ or whatever.
We can forget this approach for your use case too. Maybe it
would work for Mike, depending on scale.
> My existing patch is pretty crude - if "git apply --check" fails
> then it parses the output, finds the files causing the problem,
> checks if they're +ktext, and then zaps every RCS keyword in the
> file in the P4 shadow repo, replacing them with just $Keyword:$.
>
> Amazingly this seems to work, since git then sees the text it is
> expecting and is happy, while untouched RCS keywords go back to
> $KW:$ which P4 is quite happy to re-expand!
>
> I haven't tried to handle the case where the entire file is deleted.
> This is a bit harder - the file won't be opened for edit so I
> imagine you have to do some kind of 'chmod +w' :-(
Okay. So code would do the apply normally, then if it fails,
go and try applying to a version of the file with RCS keywords
removed, and if that worked, great. It would be nice to do this
RCS-keyword-scrubbed checking somewhere not in the p4 workspace,
since we'd have to undo it if it turned out that was not the
problem. I.e., it would be good if this were all quite
failsafe.
What about copying the RCS keywords _in_ to the git repo just to
create a clean diff? Maybe the same difference, but if it eases
the deleted file case? For each file in the diff, the keyword
expansions are just read out of the corresponding p4 file.
> Another option I wondered about is to modify git apply so that it
> spots RCS keywords and ignores them? This would then mimic what p4
> itself does.
That might make sense. Although it feels harder to me, and
less likely to generalize in a way that would be acceptable.
Something akin to fuzzy_matchlines() that can decide "$Id$" is the
same as "$Id: //depot/proj/foo.c#3 $". Hrm.
> And another option is to just admit that RCS keywords are stupid and
> can't be sorted out properly, put some explanation into git-p4.txt
> to explain the problem to avoid having these threads pop up, and
> then everyone just has to zap them in Perforce.
Or maybe just small steps toward a solution: check for failure,
and if the diff chunks have what look like unexpanded keywords,
print out a message guessing why the check failed?
-- Pete
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2011-05-10 11:47 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-09 7:49 [RFC v1] git-p4: test case for RCS keyword problem Luke Diamand
2011-05-09 23:00 ` Pete Wyckoff
2011-05-10 8:13 ` Luke Diamand
2011-05-10 11:47 ` Pete Wyckoff
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).