From: Jakub Narebski <jnareb@gmail.com>
To: Anders Waldenborg <anders@0x63.nu>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH] gitweb: Fix chop_str not to cut in middle of utf8 multibyte chars.
Date: Sat, 24 May 2008 15:34:23 +0200
Message-ID: <200805241534.25517.jnareb@gmail.com>
In-Reply-To: <4833D314.4010904@0x63.nu>

On Wed, 21 May 2008, Anders Waldenborg wrote:
> Junio C Hamano wrote:

>> I haven't followed the codepath but what do the callers do to the string
>> returned from chop_str?  Don't they assume the string hasn't been decoded
>> (because the old implementation of chop_str did not do this to_utf8), and
>> emit the result directly to the output because it also assumes the
>> undecoded format is what the outside world wants?  In other words, don't
>> they now need to do different things because returned string has gone
>> through the to_utf8() processing already?
> 
> The to_utf8() (defined in gitweb.perl, not part of Perl itself) is kind
> of sneaky: it checks whether the string is already valid UTF-8. (I guess
> it should be called ensure_utf8().)

Perhaps it should...
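
For illustration only, the "ensure" behaviour described above could be
sketched roughly as follows. This is Python rather than gitweb's Perl, and
it is only a loose analogue: ensure_text and the latin-1 fallback are
hypothetical choices made for this sketch, not a copy of what gitweb.perl
does.

```python
def ensure_text(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode raw as UTF-8 when it is valid UTF-8, else use the fallback.

    Hypothetical helper; the fallback encoding is an assumption.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


utf8_name = "Narębski".encode("utf-8")
latin1_name = "Müller".encode("latin-1")  # b'M\xfcller' is not valid UTF-8

decoded_utf8 = ensure_text(utf8_name)      # taken as UTF-8
decoded_latin1 = ensure_text(latin1_name)  # falls back to latin-1
```

Either way the caller gets a decoded string back, which is why a name
like ensure_utf8() would describe the behaviour better.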

> chop_str needs to work on decoded string, otherwise character count goes 
> all wrong. But maybe it is better to add the to_utf8() to the callsites?

Or do "binmode $fd, ':utf8'" on the input filehandle.

But yes, I guess converting to Perl internal form on input would be
a good idea.  Gitweb currently does it only partially...
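
A minimal sketch of why the chopping has to happen on the decoded string
(again in Python rather than Perl, purely as an illustration; chop_bytes
and chop_chars are hypothetical names, not gitweb functions):

```python
def chop_bytes(raw: bytes, limit: int) -> bytes:
    """Cut an undecoded byte string after `limit` bytes."""
    return raw[:limit]


def chop_chars(raw: bytes, limit: int) -> str:
    """Decode first, then cut after `limit` characters."""
    return raw.decode("utf-8")[:limit]


name = "Narębski".encode("utf-8")  # 'ę' takes two bytes in UTF-8

cut_bytes = chop_bytes(name, 4)  # ends with only the first byte of 'ę'
cut_chars = chop_chars(name, 4)  # ends after the whole character 'ę'

# Check whether the byte-wise cut is still valid UTF-8.
try:
    cut_bytes.decode("utf-8")
    bytes_still_valid = True
except UnicodeDecodeError:
    bytes_still_valid = False
```

Here the byte-wise cut splits the two-byte character and leaves invalid
UTF-8 behind, while the cut on the decoded string counts characters.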

-- 
Jakub Narebski
Poland
