* Using UTF-8 encodings for SVN commit message
@ 2007-01-25 3:19 Anthony Liguori
[not found] ` <45B821DC.5020200-NZpS4cJIG2HvQtjrzfazuQ@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Anthony Liguori @ 2007-01-25 3:19 UTC (permalink / raw)
To: kvm-devel; +Cc: ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg
Howdy,
I have a bit of an odd request. I've been using tailor to mirror KVM in
Mercurial, which is my preferred SCM. This had worked quite well up
until sometime in December, when tailor started throwing errors.
I finally got around to looking into it and discovered the source of the
problem. It seems that on December 26th, Avi committed a patch from
Yoshimi Ichiyanagi "kvm: initialize kvm_arch_ops in kvm_init()".
The problem is that the commit message (which was likely copy-pasted
from an email) contains a character that is neither ASCII nor valid
UTF-8. SVN commit messages should be encoded in UTF-8. The reason for
this is that SVN allows exporting information about the repository in
XML which is marked as UTF-8 encoded.
If commit messages aren't valid UTF-8, SVN generates invalid XML. While
SVN should probably generate an error when committing non-UTF8 messages,
we shouldn't be doing it in the first place.
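A quick way to check a message for this kind of problem can be sketched in Python. This is only a rough illustration, not anything tailor or SVN ships: the function name `log_message_problems` is made up, and it assumes the raw bytes of the log message are already at hand. It reports bytes that do not decode as UTF-8, and also control characters such as ESC (0x1b) that XML 1.0 does not allow even when the bytes decode cleanly.

```python
import re

# C0 control characters that XML 1.0 forbids (everything below 0x20
# except tab, newline and carriage return).
_XML_ILLEGAL = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def log_message_problems(raw):
    """Return a list of reasons why raw bytes are unsafe in UTF-8 XML."""
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError as exc:
        # Decoding failed, so the message is not UTF-8 at all.
        return ['not valid UTF-8 at byte offset %d' % exc.start]
    problems = []
    if _XML_ILLEGAL.search(text):
        problems.append('contains a control character not allowed in XML 1.0')
    return problems
```

An empty list means the message passed both checks; it does not cover every character XML 1.0 excludes, only the C0 controls.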
Is there any way to fix this? I'm going to try to find a way to hack
around it in tailor but I suspect this will break other SVN tools too.
If nothing else, I was hoping we could be a bit more careful about this
in the future (if that's at all possible).
Thanks!
Regards,
Anthony Liguori
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
* Re: Using UTF-8 encodings for SVN commit message
[not found] ` <45B821DC.5020200-NZpS4cJIG2HvQtjrzfazuQ@public.gmane.org>
@ 2007-01-25 4:19 ` Anthony Liguori
2007-01-25 7:13 ` Muli Ben-Yehuda
2007-01-25 7:47 ` Avi Kivity
2 siblings, 0 replies; 6+ messages in thread
From: Anthony Liguori @ 2007-01-25 4:19 UTC (permalink / raw)
To: kvm-devel
[-- Attachment #1: Type: text/plain, Size: 1919 bytes --]
FYI, the following patch is a work-around for this particular problem.
It'll only work for this particular changeset but just in case anyone
else is interested in using tailor with KVM I thought I'd post it.
Regards,
Anthony Liguori
[-- Attachment #2: tailor-jp.diff --]
[-- Type: text/x-patch, Size: 747 bytes --]
--- vcpx/repository/svn.py 2006-12-11 15:08:24.000000000 -0600
+++ /usr/lib/python2.4/site-packages/vcpx/repository/svn.py 2007-01-24 22:16:37.000000000 -0600
@@ -299,9 +299,19 @@
         parser.setContentHandler(handler)
         parser.setErrorHandler(ErrorHandler())
 
+        def scrub(text):
+            scrubbed_text = ''
+            i = text.find('\x1b')
+            while i != -1:
+                scrubbed_text += text[0:i]
+                text = text[i+3:]
+                i = text.find('\x1b')
+            scrubbed_text += text
+            return scrubbed_text
+
         chunk = log.read(chunksize)
         while chunk:
-            parser.feed(chunk)
+            parser.feed(scrub(chunk))
             for cs in handler.changesets:
                 yield cs
             handler.changesets = []
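
The helper in the patch can also be tried on its own. The sketch below restates it as a standalone function; the sample string is invented for illustration, and it assumes the offending bytes are three-character escape sequences beginning with ESC (0x1b), which is what the `i+3` in the patch implies.

```python
def scrub(text):
    """Drop every ESC character together with the two characters after it."""
    pieces = []
    i = text.find('\x1b')
    while i != -1:
        pieces.append(text[:i])   # keep everything before the ESC
        text = text[i + 3:]       # skip the ESC plus the next two characters
        i = text.find('\x1b')
    pieces.append(text)           # whatever remains has no ESC in it
    return ''.join(pieces)


# Hypothetical input: two escape sequences wrapped around a name.
sample = 'From: \x1b$Bname\x1b(B <user@example.com>'
print(scrub(sample))
```

Since the patch applies the function to each chunk separately, an escape sequence that straddles a chunk boundary would lose its ESC but could keep its trailing characters, which fits the caveat that this only targets one particular changeset.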
[-- Attachment #4: Type: text/plain, Size: 186 bytes --]
_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel
* Re: Using UTF-8 encodings for SVN commit message
[not found] ` <45B821DC.5020200-NZpS4cJIG2HvQtjrzfazuQ@public.gmane.org>
2007-01-25 4:19 ` Anthony Liguori
@ 2007-01-25 7:13 ` Muli Ben-Yehuda
2007-01-25 7:47 ` Avi Kivity
2 siblings, 0 replies; 6+ messages in thread
From: Muli Ben-Yehuda @ 2007-01-25 7:13 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg
On Wed, Jan 24, 2007 at 09:19:56PM -0600, Anthony Liguori wrote:
> If commit messages aren't valid UTF-8, SVN generates invalid XML. While
> SVN should probably generate an error when committing non-UTF8 messages,
> we shouldn't be doing it in the first place.
Agreed.
> Is there any way to fix this? I'm going to try to find a way to hack
> around it in tailor but I suspect this will break other SVN tools
> too.
Check out the tailor option 'encoding-errors-policy'; I use
'encoding-errors-policy = replace', which is brute force but works.
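To illustrate what a 'replace' policy does in general, here is a small Python sketch. It assumes the tailor option corresponds to Python's standard codec error handlers, which is an assumption and not something verified here; the sample bytes are made up.

```python
raw = b'caf\xe9 au lait'   # Latin-1 bytes, which are not valid UTF-8

# With the default strict policy, decoding raises an error.
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With 'replace', the bad byte becomes U+FFFD and decoding carries on.
replaced = raw.decode('utf-8', errors='replace')

print(strict_ok)
print(replaced)
```

The original byte is lost in the process, which is why this is brute force: the result is always decodable, but not necessarily what the author wrote.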
Cheers,
Muli
* Re: Using UTF-8 encodings for SVN commit message
[not found] ` <45B821DC.5020200-NZpS4cJIG2HvQtjrzfazuQ@public.gmane.org>
2007-01-25 4:19 ` Anthony Liguori
2007-01-25 7:13 ` Muli Ben-Yehuda
@ 2007-01-25 7:47 ` Avi Kivity
[not found] ` <45B860A4.80000-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2007-01-25 7:47 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg
Anthony Liguori wrote:
> Howdy,
>
> I have a bit of an odd request. I've been using tailor to mirror KVM
> in mercurial which is my preferred SCM. This has worked quite well up
> until sometime in December when tailor started throwing errors.
>
> I finally got around to looking into it and discovered the source of
> the problem. It seems that on December 26th, Avi committed a patch
> from Yoshimi Ichiyanagi "kvm: initialize kvm_arch_ops in kvm_init()".
>
> The problem is that the commit message (which was likely copy-pasted
> from an email) contains a character that is neither ASCII nor is it
> UTF-8. SVN commit messages should be encoded in UTF-8. The reason
> for this is that SVN allows exporting information about the repository
> in XML which is marked as UTF-8 encoded.
>
> If commit messages aren't valid UTF-8, SVN generates invalid XML.
> While SVN should probably generate an error when committing non-UTF8
> messages, we shouldn't be doing it in the first place.
>
> Is there any way to fix this? I'm going to try to find a way to hack
> around it in tailor but I suspect this will break other SVN tools too.
>
> If nothing else, I was hoping we could be a bit more careful about
> this in the future (if that's at all possible).
I can do a dump/edit/load cycle if you like.
--
error compiling committee.c: too many arguments to function
* Re: Using UTF-8 encodings for SVN commit message
[not found] ` <45B860A4.80000-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-01-26 14:49 ` Christopher Boumenot
2007-01-26 18:19 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Christopher Boumenot @ 2007-01-26 14:49 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
> I can do a dump/edit/load cycle if you like.
>
Commit messages in SVN are not immutable. If you have configured SVN
to allow it, you can just re-edit the commit message.
$ svn propedit svn:log --revprop -r <revision>
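The "configured to allow it" part is the repository's pre-revprop-change hook, which has to exit with status 0 for a revision property change to go through. Below is a minimal sketch of such a hook in Python, assuming the standard hook arguments (repository path, revision, user, property name, action) with the new value on standard input. The function names and the policy of accepting only UTF-8 edits to svn:log are illustrative choices, not something taken from the KVM repository.

```python
def allow_change(propname, action, new_value):
    """Allow only modifications of svn:log whose new value is valid UTF-8."""
    if propname != 'svn:log' or action != 'M':
        return False
    try:
        new_value.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


def main(argv, stdin_bytes):
    """Return the hook's exit status: 0 allows the change, 1 rejects it."""
    # argv: [hook path, repository path, revision, user, propname, action]
    if len(argv) < 6:
        return 1
    return 0 if allow_change(argv[4], argv[5], stdin_bytes) else 1


# In an installed hook script, the last lines would be:
#   import sys
#   sys.exit(main(sys.argv, sys.stdin.buffer.read()))
```

Keeping the decision in `allow_change` makes the policy easy to test without a repository.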
* Re: Using UTF-8 encodings for SVN commit message
2007-01-26 14:49 ` Christopher Boumenot
@ 2007-01-26 18:19 ` Avi Kivity
0 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2007-01-26 18:19 UTC (permalink / raw)
To: Christopher Boumenot
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
aliguori-NZpS4cJIG2HvQtjrzfazuQ
Christopher Boumenot wrote:
> > I can do a dump/edit/load cycle if you like.
>
>
> Commit messages in SVN are not immutable. If you have configured SVN
> to allow it, you can just re-edit the commit message.
>
> $ svn propedit svn:log --revprop -r <revision>
>
>
Thanks for the tip. I fixed the two instances of the error I could find.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.