git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyen Thai Ngoc Duy" <pclouds@gmail.com>,
	git@vger.kernel.org, "Torsten Bögershausen" <tboegi@web.de>
Subject: Re: [PATCH v2] fetch: align new ref summary printout in UTF-8 locales
Date: Wed, 12 Sep 2012 22:12:02 +0200	[thread overview]
Message-ID: <5050EC92.6000103@web.de> (raw)
In-Reply-To: <7vpq5rce6y.fsf@alter.siamese.dyndns.org>

On 12.09.12 20:02, Junio C Hamano wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> 
>> Ping.. what happens to this patch? Do you want to support other
>> encodings as well via iconv()? I can't test that though.
> 
> I thought I refuted a potential blocker, which was an implied
> objection from Torsten, in $gmane/204912 already.  As long as we
> make it clear that your patch helps only "ASCII/UTF-8 only" audience
> but we still "try to be nicer to 'others'" by doing two things I
> said in the message, I think we should proceed without iconv() to
> keep things simple.

Please unblock and proceed (as I can't test other encodings either)

And even special thanks for the excellent lines from Junio,
which explained the update philosophy in git so well, 
that I take the freedom to re-post them here:

>> I try to re-phrase my question:
>>
>> Do installations still exist which use e.g. BIG5 or any other
>> multi byte encoding which is not UTF-8?
>
>Yes.
>
>> Do we want to support other encodings than ASCII or UTF-8?
>> (Because then the screen width needs to be calculate different, I think)
>
>That depends on who "we" are and what timeframe you have in mind.
>
>Do our developers care about these encodings so much that we would
>reject "ASCCI/UTF-8 only" patch and wait until we enhance it to do
>the right thing for other encodings that we do not use ourselves?
>No.  That does not make any sense, especially when we know we will
>not have a good test coverage on the additional parts that we will
>not be using ourselves.
>
>"This change only helps people with ASCII or UTF-8 and does not help
>others" alone is never a valid reason to reject a change, but we
>still try to be nicer to "others" that may come after we leave this
>topic behind by doing a few things:
>
> - If the change will make things worse than it currently is for
>   "others", we would try to minimize the regression for them.
>
> - If the change will make the code harder to update later to
>   enhance with additional change to support "others", we would try
>   to anticipate what kind of changes are needed on top, and
>   structure the code in such a way that future changes can be made
>   cleanly.
>
>For the first point, for multi-byte encodings (e.g. ISO-2022), the
>display columns and byte length do not match and in general byte
>length is longer than the display columns in the current code.  With
>the current code that measures the required columns across elements
>by taking the maximum of byte length, they will see wrong number of
>filler, so they are already getting a wrong alignment.  With the
>"UTF-8 only" change, the required columns and the filler will be
>calculated by assuming that the string is in UTF-8, which may make
>the computation off in a different way, and if we underestimate the
>display columns for a string, they may see the strings truncated,
>which is bad.
>
>So as long as gettext_width() punts and returns strlen() for a
>malformed UTF-8, it would be OK [*1*].
>
>For the second point, I think the API "here is a string, give me the
>number of display columns it will occupy, as I am interested in
>aligning them" is a good abstraction that can be later enhanced to
>other encodings fairly easily, so I do not see a problem in the
>patch that goes in that direction.
>
>
>
>[Footnote]
>
>*1* If the patch used utf_strwidth() (which I didn't bother to go
>back and check, though), it should be OK.  The underlying
>utf8_width() will reject a malformed UTF-8 sequence and the code
>falls back to strlen().

 

      reply	other threads:[~2012-09-12 20:12 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-03 11:10 [PATCH] fetch: align new ref summary printout in UTF-8 locales Nguyễn Thái Ngọc Duy
2012-09-03 19:26 ` Junio C Hamano
2012-09-03 20:01   ` Johannes Sixt
2012-09-03 20:30     ` Junio C Hamano
2012-09-04  1:31   ` Nguyen Thai Ngoc Duy
2012-09-04 10:39 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
     [not found]   ` <504796A5.4070700@gmail.com>
2012-09-05 19:20     ` Torsten Bögershausen
2012-09-06 12:11       ` Nguyen Thai Ngoc Duy
2012-09-06 15:36       ` Nguyen Thai Ngoc Duy
2012-09-06 16:21         ` Torsten Bögershausen
2012-09-06 18:08           ` Junio C Hamano
2012-09-12 14:02   ` Nguyen Thai Ngoc Duy
2012-09-12 18:02     ` Junio C Hamano
2012-09-12 20:12       ` Torsten Bögershausen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5050EC92.6000103@web.de \
    --to=tboegi@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).