From: Jeff Epler <jepler@unpythonic.net>
To: Michael J Gruber <git@drmicha.warpmail.net>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Jakub Narebski" <jnareb@gmail.com>,
	"Dmitry Potapov" <dpotapov@gmail.com>, "Jan Hudec" <bulb@ucw.cz>,
	"Thomas Rast" <trast@student.ethz.ch>,
	"Marc Weber" <marco-oweber@gmx.de>
Subject: Re: [PATCH] WIP: begin to translate git with gettext
Date: Tue, 18 May 2010 11:40:02 -0500
Message-ID: <20100518164002.GC20842@unpythonic.net>
In-Reply-To: <4BF24467.7000204@drmicha.warpmail.net>

On Tue, May 18, 2010 at 09:40:23AM +0200, Michael J Gruber wrote:
> > -		color_fprintf(s->fp, c, "unmerged:   %s", one);
> > +		color_fprintf(s->fp, c, _("unmerged:   %s"), one);
> 
> I have no experience whatsoever with gettext, but it looks quite
> dangerous to me to have printf format specifiers as part of the
> localized text. It means that our programs can crash depending on the
> LANG setting at run time if localisers mess up. We'll never catch this
> unless we run all tests in all languages!

This is exactly how gettext works.  Yes, you can get crashes if the
translated string does not have the right arguments--and I would not be
at all surprised to hear of at least one privilege escalation bug
due to a bad message catalog, since printf format errors can be used in
such interesting ways.

Anyway, for printf-style formats, 'msgfmt' can be directed to check for
this situation:
    $ cat bad.po
    msgid ""
    msgstr "Content-Type: text/plain; charset=UTF-8\n"

    #,c-format
    msgid "foo %s %d"
    msgstr "föö %d %d"

    $ msgfmt --check-format bad.po
    bad.po:6: format specifications in 'msgid' and 'msgstr' for argument 1 are not the same
    msgfmt: found 1 fatal error
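
To make concrete what such a check compares, here is a minimal sketch in C.
It is not msgfmt's actual algorithm, and the names next_conversion() and
first_format_mismatch() are made up for this illustration.  It assumes
plain (non-positional) printf conversions and deliberately ignores flags,
width, precision and length modifiers, so it would miss a "%d" vs "%ld"
mismatch and miscounts "*" widths.

```c
#include <string.h>

/*
 * Advance *p past the next conversion in a printf-style format and return
 * its conversion character (e.g. 's' or 'd'), or 0 when none is left.
 * Simplified on purpose: flags, width, precision and length modifiers are
 * skipped rather than compared, and positional "%1$s" forms are not handled.
 */
static char next_conversion(const char **p)
{
    const char *s = *p;

    while ((s = strchr(s, '%')) != NULL) {
        s++;                                  /* step past '%' */
        if (*s == '%') {                      /* "%%" is a literal percent */
            s++;
            continue;
        }
        s += strspn(s, "-+ #0123456789.*");   /* flags, width, precision */
        s += strspn(s, "hljztL");             /* length modifiers */
        if (*s == '\0')
            break;                            /* dangling '%' at the end */
        *p = s + 1;
        return *s;
    }
    *p += strlen(*p);                         /* nothing left to scan */
    return 0;
}

/*
 * Return 0 if msgid and msgstr use the same conversion characters in the
 * same order, else the 1-based number of the first argument that differs.
 */
static int first_format_mismatch(const char *msgid, const char *msgstr)
{
    int argno;

    for (argno = 1; ; argno++) {
        char a = next_conversion(&msgid);
        char b = next_conversion(&msgstr);

        if (a != b)
            return argno;
        if (a == 0)
            return 0;
    }
}
```

Called on the pair from bad.po, first_format_mismatch("foo %s %d",
"föö %d %d") reports argument 1, the same argument msgfmt complains about.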
 
> Also, the basic structure of the output should probably be independent
> of the language, preferring consistent structure across languages over
> linguistically consistent structure within a language.

No, the ability of gettext+printf to use the right structure of the
user's language is a strength.  For instance, consider the translation
into Yoda's locale of the following sentence:

    printf("The %s is %s.\n", "Future", "Clouded");

The proper localized message is

    Clouded the Future is.

Anything else will range from confusing to unintelligible to the
native speaker.  You get that with gettext by writing

    printf(_("The %s is %s.\n"), _("Future"), _("Clouded"));

together with the message catalog entry
    msgid "The %s is %s.\n"
    msgstr "%2$s the %1$s is.\n"
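
The reordering can be tried without a real message catalog.  The sketch
below is only an illustration: yoda() is a hypothetical one-entry lookup
standing in for gettext(), and say() is a made-up helper.  It relies on the
"%n$" numbered-argument syntax, which is a POSIX extension to printf
(supported by glibc among others) and not part of ISO C.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical one-entry "Yoda" catalog standing in for gettext().
 * Like gettext(), it returns the msgid itself when no translation exists.
 */
static const char *yoda(const char *msgid)
{
    if (strcmp(msgid, "The %s is %s.\n") == 0)
        return "%2$s the %1$s is.\n";
    return msgid;
}

/*
 * Format the sentence through the catalog.  The call site still passes the
 * arguments in the source order (subject, then state); only the translated
 * format decides the order in which they appear in the output.
 */
static int say(char *buf, size_t size, const char *subject, const char *state)
{
    return snprintf(buf, size, yoda("The %s is %s.\n"), subject, state);
}
```

With a large enough buffer, say(buf, sizeof buf, "Future", "Clouded") leaves
"Clouded the Future is.\n" in buf, even though the C code never changed its
argument order.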

> >  	if (extra.len) {
> > -		color_fprintf(s->fp, color(WT_STATUS_HEADER, s), "%s", extra.buf);
> > +		color_fprintf(s->fp, color(WT_STATUS_HEADER, s), _("%s"), extra.buf);
> 
> Seriously?

No, that one's a mistake.  I did not take care when choosing which
strings to mark, because I was mostly interested in showing a
proof-of-concept for using gettext to translate core parts of git.

The amount of work to mark all the source files and then to keep the
marks up to date should not be underestimated--and that's just the work
to enable translators to localize the software.  It is important to
gauge the interest in the git community in actually doing this work.

As my own primary language is English, I have only a theoretical
interest in this feature.  However, the existence of translations for
gitk and git-gui indicates to me that the community probably does desire
this.

Jeff
