public inbox for kvm@vger.kernel.org
* Using UTF-8 encodings for SVN commit message
From: Anthony Liguori @ 2007-01-25  3:19 UTC
  To: kvm-devel; +Cc: ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg

Howdy,

I have a bit of an odd request.  I've been using tailor to mirror KVM 
into Mercurial, which is my preferred SCM.  This worked quite well 
until sometime in December, when tailor started throwing errors.

I finally got around to looking into it and discovered the source of the 
problem.  It seems that on December 26th, Avi committed a patch from 
Yoshimi Ichiyanagi "kvm: initialize kvm_arch_ops in kvm_init()".

The problem is that the commit message (which was likely copy-pasted 
from an email) contains a character that is neither ASCII nor valid 
UTF-8.  SVN commit messages should be encoded in UTF-8, because SVN 
can export information about the repository as XML, and that XML is 
marked as UTF-8 encoded.

If commit messages aren't valid UTF-8, SVN generates invalid XML.  While 
SVN should probably reject non-UTF-8 messages at commit time, we 
shouldn't be committing them in the first place.
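
A minimal sketch of the kind of check that would catch such a message
before it is committed.  It is not part of SVN or tailor; the helper
name first_invalid_utf8_offset and the sample byte strings are made up
for illustration.

```python
def first_invalid_utf8_offset(data):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Pure ASCII is always valid UTF-8.
ok_message = b"kvm: initialize kvm_arch_ops in kvm_init()"
# 0xE9 is "e acute" in Latin-1, but on its own it is a truncated UTF-8 sequence.
bad_message = b"caf\xe9"

print(first_invalid_utf8_offset(ok_message))
print(first_invalid_utf8_offset(bad_message))
```

A commit wrapper could refuse any message for which the helper returns
an offset, and report that offset so the offending character is easy to
find.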

Is there any way to fix this?  I'm going to try to find a way to hack 
around it in tailor but I suspect this will break other SVN tools too.

If nothing else, I was hoping we could be a bit more careful about this 
in the future (if that's at all possible).

Thanks!

Regards,

Anthony Liguori

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV


* Re: Using UTF-8 encodings for SVN commit message
From: Anthony Liguori @ 2007-01-25  4:19 UTC
  To: kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1919 bytes --]

FYI, the following patch is a work-around for this particular problem. 
It'll only work for this particular changeset but just in case anyone 
else is interested in using tailor with KVM I thought I'd post it.

Regards,

Anthony Liguori



[-- Attachment #2: tailor-jp.diff --]
[-- Type: text/x-patch, Size: 747 bytes --]

--- vcpx/repository/svn.py	2006-12-11 15:08:24.000000000 -0600
+++ /usr/lib/python2.4/site-packages/vcpx/repository/svn.py	2007-01-24 22:16:37.000000000 -0600
@@ -299,9 +299,19 @@
     parser.setContentHandler(handler)
     parser.setErrorHandler(ErrorHandler())
 
+    def scrub(text):
+        scrubbed_text = ''
+        i = text.find('\x1b')
+        while i != -1:
+            scrubbed_text += text[0:i]
+            text = text[i+3:]
+            i = text.find('\x1b')
+        scrubbed_text += text
+        return scrubbed_text
+
     chunk = log.read(chunksize)
     while chunk:
-        parser.feed(chunk)
+        parser.feed(scrub(chunk))
         for cs in handler.changesets:
             yield cs
         handler.changesets = []
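
For readers who want to try the work-around outside tailor, here is the
scrub helper from the patch as a standalone sketch.  It removes each ESC
character ('\x1b') plus the two characters that follow it.  That matches
three-character escape sequences such as the ones ISO-2022-JP uses, but
the thread never names the encoding, so treat that as a guess.  The
sample strings are invented.

```python
def scrub(text):
    """Drop every ESC character together with the two characters after it."""
    pieces = []
    i = text.find('\x1b')
    while i != -1:
        pieces.append(text[:i])   # keep everything before the ESC
        text = text[i + 3:]       # skip ESC and the next two characters
        i = text.find('\x1b')
    pieces.append(text)
    return ''.join(pieces)


sample = 'fix by \x1b$BAB\x1b(B, tested'
whole = scrub(sample)

# The same idea applied chunk by chunk, with a sequence split across chunks.
split = scrub('abc\x1b') + scrub('$Bdef')
joined = scrub('abc\x1b$Bdef')

print(repr(whole))
print(repr(split), repr(joined))
```

Because the patch scrubs each chunk separately, an escape sequence that
straddles a chunk boundary is only partly removed, which is what the
split and joined results illustrate.  That fits the caveat in the mail
that the patch only works for this particular changeset.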

[-- Attachment #4: Type: text/plain, Size: 186 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel


* Re: Using UTF-8 encodings for SVN commit message
From: Muli Ben-Yehuda @ 2007-01-25  7:13 UTC
  To: Anthony Liguori; +Cc: kvm-devel, ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg

On Wed, Jan 24, 2007 at 09:19:56PM -0600, Anthony Liguori wrote:

> If commit messages aren't valid UTF-8, SVN generates invalid XML.  While 
> SVN should probably generate an error when committing non-UTF8 messages, 
> we shouldn't be doing it in the first place.

Agreed.

> Is there any way to fix this?  I'm going to try to find a way to hack 
> around it in tailor but I suspect this will break other SVN tools
> too.

Check out the tailor option 'encoding-errors-policy'.  I use
'encoding-errors-policy = replace', which is brute force but works.
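
As a rough illustration of what a "replace" policy does, here is Python's
own codec error handler of the same name.  Whether tailor maps its option
directly onto this handler is an assumption, not something stated in this
thread, and the sample bytes are invented.

```python
raw = b"caf\xe9 patch"  # 0xE9 followed by a space is not valid UTF-8

# Strict decoding (the default) raises on the bad byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" the bad byte becomes U+FFFD and decoding continues.
replaced = raw.decode("utf-8", errors="replace")

print(strict_failed)
print(repr(replaced))
```

The result is always decodable text, at the cost of losing the original
character, which is why it counts as brute force.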

Cheers,
Muli





* Re: Using UTF-8 encodings for SVN commit message
From: Avi Kivity @ 2007-01-25  7:47 UTC
  To: Anthony Liguori; +Cc: kvm-devel, ichiyanagi.yoshimi-Zyj7fXuS5i5L9jVzuh4AOg

Anthony Liguori wrote:
> Is there any way to fix this?  I'm going to try to find a way to hack 
> around it in tailor but I suspect this will break other SVN tools too.
>
> If nothing else, I was hoping we could be a bit more careful about 
> this in the future (if that's at all possible).

I can do a dump/edit/load cycle if you like.


-- 
error compiling committee.c: too many arguments to function



* Re: Using UTF-8 encodings for SVN commit message
From: Christopher Boumenot @ 2007-01-26 14:49 UTC
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

> I can do a dump/edit/load cycle if you like.

Commit messages in SVN are not immutable.  If you have configured SVN 
to allow it, you can just re-edit the commit message.

$ svn propedit svn:log --revprop -r <revision>
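
The "configured to allow it" part refers to the repository's
pre-revprop-change hook: Subversion rejects revision property changes
unless that hook exists and exits with status 0.  Below is a minimal
sketch of such a hook in Python, assuming you only want to permit edits
to svn:log.  The function names are illustrative; the argument order
(repository path, revision, user, property name, action) is the one
Subversion passes to the hook.

```python
import sys


def allow_change(propname, action):
    """Permit only modifications ("M") of the svn:log property."""
    return propname == "svn:log" and action == "M"


def main(argv):
    # argv: [hook, repos_path, revision, user, propname, action]
    if len(argv) != 6:
        sys.stderr.write(
            "usage: pre-revprop-change REPOS REV USER PROPNAME ACTION\n")
        return 1
    propname, action = argv[4], argv[5]
    if allow_change(propname, action):
        return 0
    sys.stderr.write("Only changes to svn:log are allowed\n")
    return 1


# Installed as hooks/pre-revprop-change, the script would end with:
#     sys.exit(main(sys.argv))
```

Keeping the policy in allow_change makes it easy to tighten later, for
example by also checking the user name.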







* Re: Using UTF-8 encodings for SVN commit message
From: Avi Kivity @ 2007-01-26 18:19 UTC
  To: Christopher Boumenot
  Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	aliguori-NZpS4cJIG2HvQtjrzfazuQ

Christopher Boumenot wrote:
> > I can do a dump/edit/load cycle if you like.
>
> Commit messages in SVN are not immutable.  If you have configured SVN 
> to allow it, you can just re-edit the commit message.
>
> $ svn propedit svn:log --revprop -r <revision>

Thanks for the tip.  I fixed the two instances of the error I could find.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


