* [RFC] Introducing AI Agents to Git Localization
@ 2026-02-04 9:31 Jiang Xin
2026-02-04 11:58 ` Peter Krefting
` (3 more replies)
0 siblings, 4 replies; 42+ messages in thread
From: Jiang Xin @ 2026-02-04 9:31 UTC (permalink / raw)
To: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan, Jordi Mas,
Matthias Rüster, Phillip Szelat, Sébastien Helleu,
insolor, Kateryna Golovanova, Trần Ngọc Quân,
Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Jiang Xin, Franklin Weng
Cc: Git List
Dear Git l10n team members,
Two commits have been introduced in the next branch of the git-po
repository to better support AI-assisted workflows for Git l10n
translation and quality checking:
- https://github.com/git-l10n/git-po/commits/next/
Before submitting patches upstream, I invite the community to test
using AI agents for day-to-day Git l10n tasks.
To get started, work on the next branch:
git clone git@github.com:git-l10n/git-po.git
cd git-po
git checkout -b next origin/next
Please try using AI coding tools to update translations in po/XX.po or
review historical translations, following the prompts below:
- "Refer to @po/README.md to update translations in po/XX.po."
- "Refer to @po/README.md to review all translations in po/XX.po."
--
Jiang Xin
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-04 9:31 [RFC] Introducing AI Agents to Git Localization Jiang Xin
@ 2026-02-04 11:58 ` Peter Krefting
2026-02-04 13:00 ` Michal Suchánek
2026-02-05 1:04 ` Jiang Xin
[not found] ` <0207CD38-C811-499D-AFA6-131B0CA825CD@gmail.com>
` (2 subsequent siblings)
3 siblings, 2 replies; 42+ messages in thread
From: Peter Krefting @ 2026-02-04 11:58 UTC (permalink / raw)
To: Jiang Xin
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

2026-02-04 10:31, Jiang Xin wrote:

> Please try using AI coding tools to update translations in po/XX.po or
> review historical translations, following the prompts below:

No.

Please disable this altogether for the Swedish localization. "Translation"
using stochastic parrots is not mature and just creates gibberish that takes
more time to clean up than to do the translation from scratch manually.

--
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-04 11:58 ` Peter Krefting
@ 2026-02-04 13:00 ` Michal Suchánek
2026-02-04 14:38 ` 依云
2026-02-05 2:06 ` Jiang Xin
2026-02-05 1:04 ` Jiang Xin
1 sibling, 2 replies; 42+ messages in thread
From: Michal Suchánek @ 2026-02-04 13:00 UTC (permalink / raw)
To: Peter Krefting
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> 2026-02-04 10:31, Jiang Xin wrote:
>
> > Please try using AI coding tools to update translations in po/XX.po or
> > review historical translations, following the prompts below:
>
> No.
>
> Please disable this altogether for the Swedish localization. "Translation"
> using stochastic parrots is not mature and just creates gibberish that takes
> more time to clean up than to do the translation from scratch manually.

Hello,

a similar attempt was widely reported, e.g. here:
https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/

As pointed out, the availability of the tools is not necessarily a
problem in itself. The problem in that particular case was that Mozilla
automatically applied the tools to existing translations, even
well-maintained ones.

Abandoned or completely missing translations may benefit from AI
translation when the topic is general enough that there is likely a lot
of training data available. Unfortunately, git with its specific jargon
may not be the most optimal project for automated translation.

When the generated change needs to be sent to a maintainer for review,
gibberish translations would in theory be avoided.

There is the caveat that maintainers might receive more gibberish
submissions when the tools are available.

Thanks

Michal

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-04 13:00 ` Michal Suchánek
@ 2026-02-04 14:38 ` 依云
2026-02-05 2:06 ` Jiang Xin
1 sibling, 0 replies; 42+ messages in thread
From: 依云 @ 2026-02-04 14:38 UTC (permalink / raw)
To: Michal Suchánek
Cc: Peter Krefting, Jiang Xin, Alexander Shopov, Mikel Forcada,
Ralf Thielow, Jean-Noël Avila, Bagas Sanjaya,
Dimitriy Ryazantcev, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan, Jordi Mas,
Matthias Rüster, Phillip Szelat, Sébastien Helleu,
insolor, Kateryna Golovanova, Trần Ngọc Quân,
Nguyễn Thái Ngọc Duy, Ray Chen, Fangyi Zhou,
Franklin Weng, Git List

On Wed, Feb 04, 2026 at 02:00:21PM +0100, Michal Suchánek wrote:
> On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> > 2026-02-04 10:31, Jiang Xin wrote:
> >
> > > Please try using AI coding tools to update translations in po/XX.po or
> > > review historical translations, following the prompts below:
> >
> > No.
> >
> > Please disable this altogether for the Swedish localization. "Translation"
> > using stochastic parrots is not mature and just creates gibberish that takes
> > more time to clean up than to do the translation from scratch manually.
>
> Hello,
>
> a similar attempt was widely reported, e.g. here:
> https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/

FYI, less well known is the fish (a command line shell) zh-CN translation
fiasco. Last time I reviewed the zh-CN translation for git there was a
bunch of nonsense. And I've pretty much given up on the docs.python.org
zh-CN translation.

I don't really care how others do things, but I hope I won't be the
gatekeeper who has to review, or be fed up with, rubbish sentences.

--
Best regards,
lilydjwg

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-04 13:00 ` Michal Suchánek
2026-02-04 14:38 ` 依云
@ 2026-02-05 2:06 ` Jiang Xin
2026-02-05 8:30 ` Michal Suchánek
1 sibling, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-02-05 2:06 UTC (permalink / raw)
To: Michal Suchánek
Cc: Peter Krefting, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Wed, Feb 4, 2026 at 9:00 PM Michal Suchánek <msuchanek@suse.de> wrote:
>
> On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> > 2026-02-04 10:31, Jiang Xin wrote:
> >
> > > Please try using AI coding tools to update translations in po/XX.po or
> > > review historical translations, following the prompts below:
> >
> > No.
> >
> > Please disable this altogether for the Swedish localization. "Translation"
> > using stochastic parrots is not mature and just creates gibberish that takes
> > more time to clean up than to do the translation from scratch manually.
>
> Hello,
>
> a similar attempt was widely reported, e.g. here:
> https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/
>
> As pointed out, the availability of the tools is not necessarily a
> problem in itself. The problem in that particular case was that Mozilla
> automatically applied the tools to existing translations, even
> well-maintained ones.

Thank you for the context; this is a good reminder that automation
should never override community judgment.

To be clear, using AI as a translation aid is entirely up to each
contributor. In Git 2.53’s l10n cycle, I temporarily handled the
Chinese translation (as the usual lead was unavailable), translated
all new strings, and fixed many issues in older translations; both
speed and quality were surprisingly good.

As an l10n coordinator, I’ve long struggled with reviewing PRs: while
git-po-helper catches technical errors, it can’t assess translation
quality or detect irrelevant content like ads or political text. Here,
AI can help flag such issues during review.

That’s why we updated po/README.md and invited teams to try these
tools: not to impose automation, but to explore how AI can assist
human reviewers, language by language.

> Abandoned or completely missing translations may benefit from AI
> translation when the topic is general enough that there is likely a lot
> of training data available. Unfortunately, git with its specific jargon
> may not be the most optimal project for automated translation.

For Git’s Simplified Chinese localization, we follow a simple
practice: a glossary is included in the header of `po/zh_CN.po`, and
all contributors, whether human or AI, are expected to adhere to it.

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-05 2:06 ` Jiang Xin
@ 2026-02-05 8:30 ` Michal Suchánek
2026-02-05 11:16 ` Jiang Xin
0 siblings, 1 reply; 42+ messages in thread
From: Michal Suchánek @ 2026-02-05 8:30 UTC (permalink / raw)
To: Jiang Xin
Cc: Peter Krefting, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Thu, Feb 05, 2026 at 10:06:58AM +0800, Jiang Xin wrote:
> On Wed, Feb 4, 2026 at 9:00 PM Michal Suchánek <msuchanek@suse.de> wrote:
> >
> > On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> > > 2026-02-04 10:31, Jiang Xin wrote:
> > >
> > > > Please try using AI coding tools to update translations in po/XX.po or
> > > > review historical translations, following the prompts below:
> > >
> > > No.
> > >
> > > Please disable this altogether for the Swedish localization. "Translation"
> > > using stochastic parrots is not mature and just creates gibberish that takes
> > > more time to clean up than to do the translation from scratch manually.
> >
> > Hello,
> >
> > a similar attempt was widely reported, e.g. here:
> > https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/
> >
> > As pointed out, the availability of the tools is not necessarily a
> > problem in itself. The problem in that particular case was that Mozilla
> > automatically applied the tools to existing translations, even
> > well-maintained ones.
>
> Thank you for the context; this is a good reminder that automation
> should never override community judgment.
>
> To be clear, using AI as a translation aid is entirely up to each
> contributor. In Git 2.53’s l10n cycle, I temporarily handled the
> Chinese translation (as the usual lead was unavailable), translated
> all new strings, and fixed many issues in older translations; both
> speed and quality were surprisingly good.
>
> As an l10n coordinator, I’ve long struggled with reviewing PRs: while
> git-po-helper catches technical errors, it can’t assess translation
> quality or detect irrelevant content like ads or political text. Here,
> AI can help flag such issues during review.

That is really sad. 'ads or political text' sounds like something that
would be visible immediately if somebody looked at the change at all.
Which implies that you do not want to look at it, and have AI review
it. That is, put AI in charge. That's not going to go well.

Translation quality is something AI cannot help with unless you want it
to decrease.

Thanks

Michal

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-05 8:30 ` Michal Suchánek
@ 2026-02-05 11:16 ` Jiang Xin
2026-02-05 13:18 ` Michal Suchánek
0 siblings, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-02-05 11:16 UTC (permalink / raw)
To: Michal Suchánek
Cc: Peter Krefting, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Thu, Feb 5, 2026 at 4:30 PM Michal Suchánek <msuchanek@suse.de> wrote:
>
> On Thu, Feb 05, 2026 at 10:06:58AM +0800, Jiang Xin wrote:
> > On Wed, Feb 4, 2026 at 9:00 PM Michal Suchánek <msuchanek@suse.de> wrote:
> > >
> > > On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> > > > 2026-02-04 10:31, Jiang Xin wrote:
> > > >
> > > > > Please try using AI coding tools to update translations in po/XX.po or
> > > > > review historical translations, following the prompts below:
> > > >
> > > > No.
> > > >
> > > > Please disable this altogether for the Swedish localization. "Translation"
> > > > using stochastic parrots is not mature and just creates gibberish that takes
> > > > more time to clean up than to do the translation from scratch manually.
> > >
> > > Hello,
> > >
> > > a similar attempt was widely reported, e.g. here:
> > > https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/
> > >
> > > As pointed out, the availability of the tools is not necessarily a
> > > problem in itself. The problem in that particular case was that Mozilla
> > > automatically applied the tools to existing translations, even
> > > well-maintained ones.
> >
> > Thank you for the context; this is a good reminder that automation
> > should never override community judgment.
> >
> > To be clear, using AI as a translation aid is entirely up to each
> > contributor. In Git 2.53’s l10n cycle, I temporarily handled the
> > Chinese translation (as the usual lead was unavailable), translated
> > all new strings, and fixed many issues in older translations; both
> > speed and quality were surprisingly good.
> >
> > As an l10n coordinator, I’ve long struggled with reviewing PRs: while
> > git-po-helper catches technical errors, it can’t assess translation
> > quality or detect irrelevant content like ads or political text. Here,
> > AI can help flag such issues during review.
>
> That is really sad. 'ads or political text' sounds like something that
> would be visible immediately if somebody looked at the change at all.
> Which implies that you do not want to look at it, and have AI review
> it. That is, put AI in charge. That's not going to go well.

Git supports 19 languages, 14 of which have received active updates in
the past year. How am I supposed to perform semantic-level reviews for
languages I'm not familiar with?

In principle, I should trust all pull requests provided by team
leaders, but having an AI-powered semantic-level code review
available, especially for extreme scenarios or to assist contributors,
isn't necessarily a bad idea.

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-05 11:16 ` Jiang Xin
@ 2026-02-05 13:18 ` Michal Suchánek
0 siblings, 0 replies; 42+ messages in thread
From: Michal Suchánek @ 2026-02-05 13:18 UTC (permalink / raw)
To: Jiang Xin
Cc: Peter Krefting, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Thu, Feb 05, 2026 at 07:16:44PM +0800, Jiang Xin wrote:
> On Thu, Feb 5, 2026 at 4:30 PM Michal Suchánek <msuchanek@suse.de> wrote:
> >
> > On Thu, Feb 05, 2026 at 10:06:58AM +0800, Jiang Xin wrote:
> > > On Wed, Feb 4, 2026 at 9:00 PM Michal Suchánek <msuchanek@suse.de> wrote:
> > > >
> > > > On Wed, Feb 04, 2026 at 12:58:05PM +0100, Peter Krefting wrote:
> > > > > 2026-02-04 10:31, Jiang Xin wrote:
> > > > >
> > > > > > Please try using AI coding tools to update translations in po/XX.po or
> > > > > > review historical translations, following the prompts below:
> > > > >
> > > > > No.
> > > > >
> > > > > Please disable this altogether for the Swedish localization. "Translation"
> > > > > using stochastic parrots is not mature and just creates gibberish that takes
> > > > > more time to clean up than to do the translation from scratch manually.
> > > >
> > > > Hello,
> > > >
> > > > a similar attempt was widely reported, e.g. here:
> > > > https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/
> > > >
> > > > As pointed out, the availability of the tools is not necessarily a
> > > > problem in itself. The problem in that particular case was that Mozilla
> > > > automatically applied the tools to existing translations, even
> > > > well-maintained ones.
> > >
> > > Thank you for the context; this is a good reminder that automation
> > > should never override community judgment.
> > >
> > > To be clear, using AI as a translation aid is entirely up to each
> > > contributor. In Git 2.53’s l10n cycle, I temporarily handled the
> > > Chinese translation (as the usual lead was unavailable), translated
> > > all new strings, and fixed many issues in older translations; both
> > > speed and quality were surprisingly good.
> > >
> > > As an l10n coordinator, I’ve long struggled with reviewing PRs: while
> > > git-po-helper catches technical errors, it can’t assess translation
> > > quality or detect irrelevant content like ads or political text. Here,
> > > AI can help flag such issues during review.
> >
> > That is really sad. 'ads or political text' sounds like something that
> > would be visible immediately if somebody looked at the change at all.
> > Which implies that you do not want to look at it, and have AI review
> > it. That is, put AI in charge. That's not going to go well.
>
> Git supports 19 languages, 14 of which have received active updates in
> the past year. How am I supposed to perform semantic-level reviews for
> languages I'm not familiar with?
>
> In principle, I should trust all pull requests provided by team
> leaders, but having an AI-powered semantic-level code review
> available, especially for extreme scenarios or to assist contributors,
> isn't necessarily a bad idea.

Is it not or is it?

When you do not understand the language in question you cannot verify
the AI review. Neither for false positives nor for false negatives.

So far AI has been shown to provide lower quality reviews than actual
humans.

If the team leaders employ AI for typo and grammar review they can rule
out the false positives but you cannot.

In the end you need to trust them or learn all those 19 languages.

Thanks

Michal

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-04 11:58 ` Peter Krefting
2026-02-04 13:00 ` Michal Suchánek
@ 2026-02-05 1:04 ` Jiang Xin
2026-02-05 1:53 ` brian m. carlson
1 sibling, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-02-05 1:04 UTC (permalink / raw)
To: Peter Krefting
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

On Wed, Feb 4, 2026 at 8:28 PM Peter Krefting <peter@softwolves.pp.se> wrote:
>
> 2026-02-04 10:31, Jiang Xin wrote:
>
> > Please try using AI coding tools to update translations in po/XX.po or
> > review historical translations, following the prompts below:
>
> No.
>
> Please disable this altogether for the Swedish localization.
> "Translation" using stochastic parrots is not mature and just creates
> gibberish that takes more time to clean up than to do the translation
> from scratch manually.

Thank you for your feedback; I completely understand your concerns.

To clarify, the intention is not to enforce automated translations via
a central bot. Instead, each l10n team should retain full control over
whether or not to use AI assistance in their workflow. The recent
commits in the git-po next branch only add optional guidance in
po/README.md to help AI agents (if a team chooses to use them) perform
specific tasks more effectively, such as recognizing glossary terms
from the .po header, locating untranslated or fuzzy strings, and
splitting large files for easier handling.

We fully acknowledge that AI translation quality varies significantly
across languages, and for some (like Swedish) it may not yet be reliable
enough for direct use. The goal is to provide tools that teams can
optionally leverage, not to replace human judgment or community
oversight.

^ permalink raw reply [flat|nested] 42+ messages in thread
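The message above mentions locating untranslated or fuzzy strings as one of the tasks the README guidance covers. As a rough illustration of what that task involves (this is not the procedure from po/README.md, which the thread does not quote), the sketch below counts both kinds of entry in a small sample catalogue using only grep and awk. The sample file and its strings are made up for the example, and the awk logic is a simplification: it treats a plural entry as translated as soon as any of its msgstr[N] forms is non-empty. Where gettext is installed, its own msgattrib tool (--untranslated, --only-fuzzy) is the more robust choice.

```shell
# Sample catalogue; the entries are hypothetical, not taken from Git's po/ directory.
po=$(mktemp)
cat > "$po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "commit"
msgstr "incheckning"

#, fuzzy
msgid "branch"
msgstr "gren"

msgid "stash"
msgstr ""
EOF

# Fuzzy entries carry a "#," flag comment that includes the word "fuzzy".
fuzzy=$(grep -c '^#,.*fuzzy' "$po")

# Untranslated entries have a non-empty msgid and an empty msgstr.
# Continuation lines (bare quoted strings) are appended to whichever
# field came last, so multi-line strings are handled. The header entry
# (empty msgid) is skipped.
untranslated=$(awk '
  function body(line) { sub(/^[^"]*"/, "", line); sub(/"$/, "", line); return line }
  function flush()    { if (seen && id != "" && str == "") n++ }
  /^msgid /               { flush(); seen = 1; id = body($0); str = ""; field = "id"; next }
  /^msgstr(\[[0-9]+\])? / { str = str body($0); field = "str"; next }
  /^"/                    { if (field == "id") id = id body($0); else str = str body($0); next }
  END                     { flush(); print n + 0 }
' "$po")

echo "fuzzy: $fuzzy, untranslated: $untranslated"
rm -f "$po"
```

For the sample above this reports one fuzzy entry ("branch") and one untranslated entry ("stash"). Counting is only the mechanical half of the task; deciding what the missing translation should be is where the disagreement in this thread lies.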
* Re: [RFC] Introducing AI Agents to Git Localization
2026-02-05 1:04 ` Jiang Xin
@ 2026-02-05 1:53 ` brian m. carlson
0 siblings, 0 replies; 42+ messages in thread
From: brian m. carlson @ 2026-02-05 1:53 UTC (permalink / raw)
To: Jiang Xin
Cc: Peter Krefting, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long,
Yi-Jyun Pan, Jordi Mas, Matthias Rüster, Phillip Szelat,
Sébastien Helleu, insolor, Kateryna Golovanova,
Trần Ngọc Quân, Nguyễn Thái Ngọc Duy, Ray Chen, 依云,
Fangyi Zhou, Franklin Weng, Git List

[-- Attachment #1: Type: text/plain, Size: 3831 bytes --]

On 2026-02-05 at 01:04:51, Jiang Xin wrote:
> To clarify, the intention is not to enforce automated translations via
> a central bot. Instead, each l10n team should retain full control over
> whether or not to use AI assistance in their workflow. The recent
> commits in the git-po next branch only add optional guidance in
> po/README.md to help AI agents (if a team chooses to use them) perform
> specific tasks more effectively, such as recognizing glossary terms
> from the .po header, locating untranslated or fuzzy strings, and
> splitting large files for easier handling.
>
> We fully acknowledge that AI translation quality varies significantly
> across languages, and for some (like Swedish) it may not yet be reliable
> enough for direct use. The goal is to provide tools that teams can
> optionally leverage, not to replace human judgment or community
> oversight.

My experience in seeing AI translations is that they tend to be of poor
quality. Certainly, I'm only a native speaker of English, but my
reading and writing skills in Spanish and French are somewhere around B2
or C1 and I've seen some AI translations that are frankly just wrong,
making errors that no human would ever make. And Spanish and French are
two of the most spoken languages on the planet, with hundreds of
millions of speakers each.

I also will share with you the experience of a colleague of mine who is
a European Portuguese speaker. Most of the AI models produce Brazilian
Portuguese, which is much more common (since there are more people in
Brazil than in Portugal), but can vary substantially from European
Portuguese. (Most FLOSS I've seen has separate pt_BR and pt_PT
translations for this reason.) This means that these tools are going to
produce bad translations in such a case.

I strongly feel that we should provide people good quality software,
which includes good quality translations, to the best of our ability. I
realize that this demands extra effort from humans to do good quality
translations, but I feel really positively about the quality of the
translations we have in Git: they are overall excellent and it is only
extremely infrequently that I've found a problem (which is usually a
typo of some sort that anyone could have made).

Considering that most people on the planet do not speak English and that
even those that do may not speak it fluently, it's of the utmost
importance that we produce the best quality translations we can. I
don't feel using AI-generated translations would be honouring our users
in that way.

Finally, we have this text in SubmittingPatches:

  The Developer's Certificate of Origin requires contributors to certify
  that they know the origin of their contributions to the project and
  that they have the right to submit it under the project's license.
  It's not yet clear that this can be legally satisfied when submitting
  significant amount of content that has been generated by AI tools.

  Another issue with AI generated content is that AIs still often
  hallucinate or just produce bad code, commit messages, documentation
  or output, even when you point out their mistakes.

  To avoid these issues, we will reject anything that looks AI
  generated, that sounds overly formal or bloated, that looks like AI
  slop, that looks good on the surface but makes no sense, or that
  senders don’t understand or cannot explain.

It's my understanding that copyright attaches to translations, at least
under U.S. and Canadian law, and so the sign-off requirements would need
to be met here. So I'm afraid that we wouldn't be able to accept such
contributions if they were made due to the need for sign-off with the
DCO.

--
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
[parent not found: <0207CD38-C811-499D-AFA6-131B0CA825CD@gmail.com>]
* Re: [RFC] Introducing AI Agents to Git Localization
[not found] ` <0207CD38-C811-499D-AFA6-131B0CA825CD@gmail.com>
@ 2026-02-05 12:54 ` Jiang Xin
0 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-02-05 12:54 UTC (permalink / raw)
To: Yi-Jyun Pan
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Jordi Mas, Matthias Rüster,
Phillip Szelat, Sébastien Helleu, insolor,
Kateryna Golovanova, Trần Ngọc Quân,
Nguyễn Thái Ngọc Duy, Ray Chen, 依云, Fangyi Zhou,
Franklin Weng, Git List

On Thu, Feb 5, 2026 at 7:47 PM Yi-Jyun Pan <pan93412@gmail.com> wrote:
>
> Hi Jiang Xin,
>
> Not going to focus on if we should use AI or not; just share some best practices
> and opinions for writing content for AI agents.
>
> For the "Technical guidelines for AI tools" section, I would recommend placing this
> section in AGENTS.md (https://agents.md/), which is an open standard for guidelines
> for AI agents to reference, so we leave the main README file clear and concise.
> AI will also refer to them without explicitly mentioning it (@po/README.md).

The best AI coding tools have their own memory files, such as
"CLAUDE.md", ".cursorrules", and "GEMINI.md". I tried using
"po/AGENTS.md", but it was not picked up as context automatically.

Many capabilities are already documented in "po/README.md", including
how to generate po/git.pot, how to update "po/XX.po" files, and how to
create location-less PO files. Therefore, I have added the additional
AI-related capabilities to "po/README.md".

> I hope my agent can assist me with some tedious preparation tasks, so I would like to
> suggest a few useful use cases that would be fantastic if you could document the best
> practices of these use cases in AGENTS.md:
>
> 1. Allow agents to update the base and translation by executing "make po-update PO_FILE=po/XX.po".
> 2. Based on point (1), I can instruct my agent to retain the previous strings for fuzzy strings translation (msgmerge --previous, custom arguments).
> 3. When I complete the translation, it can also run "msgcat" to create a location-less file.

This change includes a commit that defines a filter for PO files in
.gitattributes, so contributors (including AI coding tools) no longer
need to worry about submitting location-less PO files.

^ permalink raw reply [flat|nested] 42+ messages in thread
* [RFC PATCH 1/2] l10n: add .gitattributes to simplify location filtering
From: Jiang Xin @ 2026-02-05 13:00 UTC
To: Git List; +Cc: Jiang Xin

To simplify the location filtering process for l10n contributors when
committing po/XX.po files, add the filter attributes for .po files to
the repository. This ensures all contributors automatically get the
same filter configuration without manual setup in .git/info/attributes.

Contributors still need to manually define the filter drivers using
git-config as documented in po/README.md.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/.gitattributes | 37 +++++++++++++++++++++++++++++++++++++
 po/README.md      | 19 +++++++++++--------
 2 files changed, 48 insertions(+), 8 deletions(-)
 create mode 100644 po/.gitattributes

diff --git a/po/.gitattributes b/po/.gitattributes
new file mode 100644
index 0000000000..1a83c8027c
--- /dev/null
+++ b/po/.gitattributes
@@ -0,0 +1,37 @@
+# Git Attributes for PO Files
+#
+# This file configures Git filters to automatically strip location information
+# from PO files when committing, producing cleaner diffs and saving repository
+# space.
+#
+# Two filter types are used:
+# 1. gettext-no-file-no-location: Strips both filenames and line numbers
+#    (e.g., removes "#: main.c:123" entirely)
+# 2. gettext-no-location: Preserves filenames but removes line numbers, which
+#    requires gettext 0.20 or higher
+#    (e.g., "#: main.c:123" becomes "#: main.c")
+#
+# See `po/README.md` for instructions on setting up the required filter drivers.
+
+# Default: Strip both filenames and locations for all .po files
+*.po filter=gettext-no-file-no-location
+
+# Explicitly documented languages using the default filter
+# (These inherit the *.po rule above but are listed here for documentation)
+bg.po filter=gettext-no-file-no-location
+de.po filter=gettext-no-file-no-location
+es.po filter=gettext-no-file-no-location
+fr.po filter=gettext-no-file-no-location
+ga.po filter=gettext-no-file-no-location
+ru.po filter=gettext-no-file-no-location
+sv.po filter=gettext-no-file-no-location
+tr.po filter=gettext-no-file-no-location
+uk.po filter=gettext-no-file-no-location
+vi.po filter=gettext-no-file-no-location
+
+# Languages that preserve filenames but strip line numbers
+# (These override the *.po rule above with a different filter)
+ca.po filter=gettext-no-location
+id.po filter=gettext-no-location
+zh_CN.po filter=gettext-no-location
+zh_TW.po filter=gettext-no-location
diff --git a/po/README.md b/po/README.md
index ec08aa24ad..ad7f72ba83 100644
--- a/po/README.md
+++ b/po/README.md
@@ -166,23 +166,26 @@ and make a user-friendly patch for review.
 
 To save a location-less "po/XX.po" automatically in repository, you can:
 
-First define a new attribute for "po/XX.po" by appending the following
-line in ".git/info/attributes":
+First, check which filter is configured for your "po/XX.po" file:
 
 ```
-/po/XX.po filter=gettext-no-location
+git check-attr filter po/XX.po
 ```
 
-Then define the driver for the "gettext-no-location" clean filter to
-strip out both filenames and locations from the contents as follows:
+The filter configuration is defined in the "po/.gitattributes" file.
+
+Then define the driver for the filter. Most languages use the
+"gettext-no-file-no-location" clean filter, which strips out both filenames and
+locations from the comments. To set this up, run the following command:
 
 ```shell
-git config --global filter.gettext-no-location.clean \
+git config --global filter.gettext-no-file-no-location.clean \
    "msgcat --no-location -"
 ```
 
-For users who have gettext version 0.20 or higher, it is also possible
-to define a clean filter to preserve filenames but not locations:
+Some languages use the "gettext-no-location" clean filter, which preserves
+filenames but not locations. For these, install gettext version 0.20 or higher
+and setup the driver as below:
 
 ```shell
 git config --global filter.gettext-no-location.clean \
-- 
2.51.0.rc2
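The override comments in po/.gitattributes rely on Git's rule that, within one attributes file, the last matching line wins. The following is a minimal sketch of that behaviour, not part of the patch: it builds a throwaway repository (the temporary directory and the `demo_po_filters` name are assumptions of this example) containing only the catch-all rule and one override, then asks `git check-attr` which filter applies.

```shell
# Sketch only: demo_po_filters is a hypothetical helper for illustration.
demo_po_filters() {
    tmp=$(mktemp -d) || return 1
    (
        cd "$tmp" &&
        git init -q . &&
        mkdir po &&
        # Same ordering as the patch: the catch-all rule first, then a
        # per-language override. Within one file the last match wins.
        printf '%s\n' \
            '*.po filter=gettext-no-file-no-location' \
            'ca.po filter=gettext-no-location' >po/.gitattributes &&
        # check-attr works on path names; the .po files need not exist.
        git check-attr filter po/sv.po po/ca.po
    )
    rc=$?
    rm -rf "$tmp"
    return $rc
}

demo_po_filters
```

With these two rules, po/sv.po should resolve to gettext-no-file-no-location and po/ca.po to gettext-no-location. The attribute only names the filter; the driver still has to be defined with `git config` as the README describes.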
* Re: [RFC PATCH 1/2] l10n: add .gitattributes to simplify location filtering
From: Junio C Hamano @ 2026-02-05 20:07 UTC
To: Jiang Xin; +Cc: Git List

Jiang Xin <worldhello.net@gmail.com> writes:

> To simplify the location filtering process for l10n contributors when
> committing po/XX.po files, add the filter attributes for .po files to
> the repository. This ensures all contributors automatically get the
> same filter configuration without manual setup in .git/info/attributes.
>
> Contributors still need to manually define the filter drivers using
> git-config as documented in po/README.md.
>
> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> ---
>  po/.gitattributes | 37 +++++++++++++++++++++++++++++++++++++
>  po/README.md      | 19 +++++++++++--------
>  2 files changed, 48 insertions(+), 8 deletions(-)
>  create mode 100644 po/.gitattributes

OK. It is slightly sad that the two camps cannot agree on a single
filter, which would allow us to just do

    *.po filter=one-single-filter-agreed-upon

but I guess this is a good start. And having sample configuration
lines that can readily be cut-and-pasted to help the contributors is
very good.

> +The filter configuration is defined in the "po/.gitattributes" file.
> +
> +Then define the driver for the filter. Most languages use the
> +"gettext-no-file-no-location" clean filter, which strips out both filenames and
> +locations from the comments. To set this up, run the following command:
>
>  ```shell
> -git config --global filter.gettext-no-location.clean \
> +git config --global filter.gettext-no-file-no-location.clean \
>     "msgcat --no-location -"
>  ```
>
> -For users who have gettext version 0.20 or higher, it is also possible
> -to define a clean filter to preserve filenames but not locations:
> +Some languages use the "gettext-no-location" clean filter, which preserves
> +filenames but not locations. For these, install gettext version 0.20 or higher
> +and setup the driver as below:
>
>  ```shell
>  git config --global filter.gettext-no-location.clean \
* [RFC PATCH 2/2] l10n: README: document AI assistant guidelines
From: Jiang Xin @ 2026-02-05 13:00 UTC
To: Git List; +Cc: Jiang Xin

Add guidelines for using AI tools as optional assistants in Git
localization work, while emphasizing human translators remain in
control.

Also update `git-po-helper` command examples to include the
`--pot-file=build` option.

Example usage in prompts to AI assistants:

- "Update translations in `po/XX.po` following the guidelines
  in @po/README.md"
- "Review all translations in `po/XX.po` following the guidelines
  in @po/README.md"

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/README.md | 294 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 291 insertions(+), 3 deletions(-)

diff --git a/po/README.md b/po/README.md
index ad7f72ba83..6ba082376a 100644
--- a/po/README.md
+++ b/po/README.md
@@ -227,8 +227,8 @@ L10n coordinator will check your contributions using a helper program
 (see "PO helper" section below):
 
 ```shell
-git-po-helper check-po po/XX.po
-git-po-helper check-commits <rev-list-opts>
+git-po-helper check-po --pot-file=build po/XX.po
+git-po-helper check-commits --pot-file=build <rev-list-opts>
 ```
 
 
@@ -430,7 +430,7 @@ There are some conventions that l10n contributors must follow:
   your commit:
 
   ```shell
-  git-po-helper check-po <XX.po>
+  git-po-helper check-po --pot-file=build <XX.po>
   ```
 
 - Squash trivial commits to make history clear.
@@ -459,5 +459,293 @@ additional conventions:
 ```
 
 
+## Artificial Intelligence (AI) as Translation Assistant
+
+This section provides guidance for human translators who choose to use AI tools
+as assistants in their localization work. The use of AI is entirely optional.
+Many successful translation teams work effectively without it.
+
+
+### Human translators remain in control
+
+Translation of Git is a human-driven community effort. Language team leaders and
+contributors are responsible for:
+
+- Understanding the technical context of Git commands and messages
+- Making linguistic and cultural adaptation decisions for their target language
+- Maintaining translation quality and consistency within their language team
+- Ensuring translations follow Git l10n conventions and community standards
+- Building and maintaining language-specific glossaries
+- Reviewing and approving all changes before submission
+
+AI tools, if used, serve only to accelerate routine tasks. They do not make
+decisions, do not replace human judgment, and do not understand cultural
+nuances or community needs.
+
+
+### When AI assistance may be helpful
+
+AI tools can help speed up certain mechanical aspects of translation work:
+
+- Generating first-draft translations for new or updated messages
+- Identifying untranslated or fuzzy entries across large PO files
+- Checking consistency with existing translations and glossary terms
+- Detecting technical errors (missing placeholders, formatting issues)
+- Reviewing translations against quality criteria
+
+However, AI-generated output should always be treated as rough drafts requiring
+human review, editing, and approval by someone who understands both the
+technical context and the target language.
+
+
+### Preparing your translation environment for effective AI use
+
+If you choose to use AI assistance, investing time in preparation will
+significantly improve the quality of AI-generated suggestions:
+
+1. **Maintain a glossary**: Add a "Git glossary for XX translators" section in
+   the header comments of your `po/XX.po` file (before the first `msgid`). List
+   key Git terms with their approved translations. AI tools can read and follow
+   this glossary.
+
+2. **Keep translations up-to-date**: Regularly sync your `po/XX.po` with
+   upstream. AI learns from existing translations. The more complete and
+   consistent your PO file, the better AI suggestions will be.
+
+3. **Document style guidelines**: If your language team has specific formatting
+   or style preferences, document them in your `po/XX.po` header. AI can
+   incorporate these guidelines into its output.
+
+4. **Choose appropriate AI coding tools**: Evaluate and use models and tools
+   that work best for your target language. Different AI models have varying
+   levels of proficiency across languages. Test multiple tools to find which
+   produces the most natural and accurate translations for your language.
+
+
+### Technical guidelines for AI tools
+
+The following sections provide technical specifications for AI tools that
+assist with Git translation. These guidelines ensure AI-generated suggestions
+are technically correct and follow Git l10n conventions. Human translators
+should be familiar with these requirements to effectively review AI output.
+
+
+#### Scope and context
+
+- Primary files: `po/XX.po` for translations, `po/git.pot` for the source
+  template (generated on demand; see "Dynamically generated POT files").
+- Source language: English. Target language: derived from the language code in
+  the `po/XX.po` filename based on ISO 639 and ISO 3166.
+- Glossary: Git l10n teams may add glossary sections (e.g. "Git glossary for
+  Chinese translators") in the header comments of `po/XX.po` immediately before
+  the first `msgid` entry. If a glossary exists, read it and keep terminology
+  consistent.
+
+
+#### Quality checklist
+
+- Accuracy: faithfully conveys the original meaning; no omissions or distortions.
+- Terminology: uses correct, consistent terms per glossary or domain standards.
+- Grammar and fluency: grammatically correct and reads naturally.
+- Placeholders: preserves variables (e.g. `%s`, `{name}`, `$1`) exactly. If
+  reordering is needed for the target language, use positional parameters as
+  described below.
+- Plurals and gender: handles plural forms, gender, and agreement correctly.
+- Context fit: suitable for UI space, tone, and usage (e.g. error vs. tooltip).
+- Cultural appropriateness: avoids offensive or ambiguous content.
+- Consistency: matches prior translations of the same source string.
+- Technical integrity: do not translate code, paths, commands, brand names, or
+  proper nouns.
+- Readability: clear, concise, and user-friendly.
+
+
+#### Locating untranslated, fuzzy, and obsolete entries
+
+Use GNU gettext tools to parse PO structure reliably (safe for multi-line
+`msgid`/`msgstr`):
+
+- Untranslated entries:
+
+  ```shell
+  msgattrib --untranslated --no-obsolete po/XX.po
+  ```
+
+- Fuzzy entries:
+
+  ```shell
+  msgattrib --only-fuzzy --no-obsolete po/XX.po
+  ```
+
+- Obsolete entries (marked with `#~`):
+
+  ```shell
+  msgattrib --obsolete --no-wrap po/XX.po
+  ```
+
+If you only want the message IDs, you can pipe to:
+
+```shell
+msgattrib --untranslated --no-obsolete po/XX.po | sed -n '/^msgid /,/^$/p'
+```
+
+```shell
+msgattrib --only-fuzzy --no-obsolete po/XX.po | sed -n '/^msgid /,/^$/p'
+```
+
+
+#### Translation workflow (`po/XX.po`)
+
+When asked to update translations, follow the steps in this section in order
+and reference this section in your plan before making edits.
+
+- Generate `po/git.pot` from source code (see "Dynamically generated POT files").
+- Update `po/XX.po` with the new template.
+- Translate new entries identified by `msgattrib --untranslated` (see above).
+- Fix fuzzy entries identified by `msgattrib --only-fuzzy` (see above) by
+  re-translating and removing the `fuzzy` tag after updating `msgstr`.
+- For entries with `msgid_plural`, consult [Plural forms](#plural-forms) to
+  supply all required `msgstr[n]` forms based on the `Plural-Forms` header.
+- Apply the quality checklist to every translation.
+
+
+#### Review workflow
+
+Review workflow has two modes: direct review against local `po/XX.po`, and
+review based on a patch.
+
+##### Full file review
+
+- When explicitly asked to review all translated content, review `po/XX.po`
+  in chunks (see [Handling large inputs](#handling-large-inputs) for splitting).
+- Apply the quality checklist to each message you review.
+- Unless otherwise specified, update `po/XX.po` directly; if a summary is
+  requested, provide a consolidated report of the issues.
+
+
+##### Patch review
+
+- Review requests may come as patches of `po/XX.po`:
+  - Workspace changes: `git diff HEAD -- po/XX.po`
+  - Changes since a commit-ish: `git diff <commit-ish> -- po/XX.po`
+  - Changes in a specific commit: `git show <commit-ish> -- po/XX.po`
+- For large patches, follow the split guidance in
+  [Handling large inputs](#handling-large-inputs) when splitting.
+- When diff context is incomplete (truncated `msgid` or `msgstr`), use file
+  viewing tools to pull nearby context for accurate review.
+- Apply the same quality checklist as in full file reviews.
+- If the patch is based on workspace changes, update `po/XX.po` directly
+  unless a summary is requested.
+- If the patch is from a specific commit, report issues or apply fixes when
+  comparing against the current `po/XX.po` in the workspace.
+
+
+#### Handling large inputs
+
+When a `po/XX.po` file or a patch is too large for LLM context, split it into
+chunks while keeping `msgid` and `msgstr` pairs intact. This includes plural
+forms: `msgid`, `msgid_plural`, `msgstr[0]`, `msgstr[1]`, and any additional
+plural indices required by the language.
+
+For `po/XX.po`, split on the line immediately before each `msgid` entry. This
+guarantees no chunk begins with an unpaired `msgid`. Use
+`grep -n '^msgid' po/XX.po` to locate split points, and group the file into
+chunks of no more than 200 `msgid` entries (about 50K bytes each).
+
+For patch files, check the patch size first:
+
+- If the patch is <= 100KB, do not split.
+- If the patch is > 100KB, split it using the same rule as for `po/XX.po`:
+  split on the line immediately before each `msgid` entry so message pairs
+  stay together.
+
+
+#### Plural forms
+
+This section defines how translators should handle `msgid_plural` entries,
+including how many `msgstr[n]` forms are required and how to index them. It
+provides the canonical example and points to the `Plural-Forms` header for the
+language-specific rule set.
+
+For entries with `msgid_plural`, provide plural forms:
+
+```po
+msgid "..."
+msgid_plural "..."
+msgstr[0] "..."
+msgstr[1] "..."
+```
+
+Use `msgstr[0]`/`msgstr[1]` as required. If the language has more plural forms,
+follow the `Plural-Forms` header in `po/XX.po` to determine the required number
+of `msgstr[n]` entries.
+
+
+#### Placeholder reordering
+
+When a translation reorders placeholders, mark them with positional parameter
+syntax (`%n$`) so each argument maps to the correct source value. Keep the
+width/precision modifiers intact and place the position specifier before them.
+
+Example:
+
+```po
+msgid "missing environment variable '%s' for configuration '%.*s'"
+msgstr "配置 '%3$.*2$s' 缺少环境变量 '%1$s'"
+```
+
+Here the translation swaps the two placeholders. `%1$s` still refers to the
+first argument (`%s`), while `%3$.*2$s` refers to the third string argument
+with the precision taken from the second argument (`%.*s`).
+
+
+### Integrating AI tools into your workflow
+
+If you decide to use AI assistance, here's how to integrate it responsibly:
+
+
+#### For AI tool developers and users
+
+When building or configuring AI-assisted translation tools:
+
+- Use the quality checklist (above) to score or filter draft suggestions
+- Apply the `msgattrib` + `sed` commands to extract relevant entries for processing
+- Ensure AI tools read and respect glossary terms from the `po/XX.po` header
+- Configure tools to follow the technical workflows documented above
+
+
+#### Human oversight is mandatory
+
+**Never submit AI-generated translations without human review.** The human
+translator must:
+
+- Verify technical accuracy (correct placeholders, plural forms, formatting)
+- Ensure linguistic quality (natural phrasing, appropriate terminology)
+- Check cultural appropriateness for the target audience
+- Confirm consistency with the language team's established style
+- Take full responsibility for the final translation
+
+Example usage in prompts to AI assistants:
+
+- "Update translations in `po/XX.po` following the guidelines in @po/README.md"
+- "Review all translations in `po/XX.po` following the guidelines in @po/README.md"
+
+
+### Summary: AI as a tool, humans as translators
+
+AI can accelerate translation work, but it is not a substitute for human
+translators. The Git localization community values:
+
+- **Human expertise**: Deep understanding of Git's technical context and the
+  cultural nuances of each target language
+- **Community standards**: Consistency across releases and alignment with
+  language team conventions
+- **Accountability**: Human translators who stand behind their work and respond
+  to feedback from users
+
+If you choose to use AI tools, they should enhance these human contributions,
+not replace them. The best results come from combining AI efficiency with human
+judgment, cultural insight, and community engagement.
+
+
 [git-po-helper/README]: https://github.com/git-l10n/git-po-helper#readme
 [Documentation/SubmittingPatches]: Documentation/SubmittingPatches
-- 
2.51.0.rc2
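The chunking rule in "Handling large inputs" (cut immediately before a `msgid` line, at most 200 entries per chunk) can be sketched with awk. This is only an illustration under stated assumptions: `split_po`, its argument order, and the `PREFIX-NNN.po` output names are hypothetical and not part of the patch or of git-po-helper. The pattern used here is `^msgid ` with a trailing space, so `msgid_plural` lines are not counted as new entries (the bare `^msgid` in the README's grep would match them as well).

```shell
# Sketch only: split_po is a hypothetical helper.
# Usage: split_po FILE [MAX_ENTRIES] [PREFIX]
split_po() {
    awk -v max="${2:-200}" -v prefix="${3:-chunk}" '
        BEGIN { chunk = 1; out = sprintf("%s-%03d.po", prefix, chunk) }
        # "msgid " with a trailing space opens a new entry; it does not
        # match "msgid_plural", so plural entries are never cut in half.
        /^msgid / {
            if (count > 0 && count % max == 0) {
                close(out)
                out = sprintf("%s-%03d.po", prefix, ++chunk)
            }
            count++
        }
        # Every input line goes to the current chunk, so concatenating
        # the chunks in order reproduces the original file.
        { print > out }
    ' "$1"
}
```

For example, `split_po po/XX.po 200 /tmp/XX` would write `/tmp/XX-001.po`, `/tmp/XX-002.po`, and so on. Because the cut is made directly before the `msgid` line, as the README text specifies, comment lines such as `#, fuzzy` that belong to the first entry of a chunk end up at the tail of the previous chunk; a tool that needs those flags may prefer to cut at the blank line between entries instead.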
* Re: [RFC PATCH 2/2] l10n: README: document AI assistant guidelines
From: Junio C Hamano @ 2026-02-05 20:35 UTC
To: Jiang Xin; +Cc: Git List

Jiang Xin <worldhello.net@gmail.com> writes:

> Add guidelines for using AI tools as optional assistants in Git
> localization work, while emphasizing human translators remain in
> control.
>
> Also update `git-po-helper` command examples to include the
> `--pot-file=build` option.
>
> Example usage in prompts to AI assistants:
>
> - "Update translations in `po/XX.po` following the guidelines
>   in @po/README.md"
> - "Review all translations in `po/XX.po` following the guidelines
>   in @po/README.md"
>
> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> ---
>  po/README.md | 294 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 291 insertions(+), 3 deletions(-)
>
> diff --git a/po/README.md b/po/README.md
> index ad7f72ba83..6ba082376a 100644
> --- a/po/README.md
> +++ b/po/README.md
> @@ -227,8 +227,8 @@ L10n coordinator will check your contributions using a helper program
>  (see "PO helper" section below):
>
>  ```shell
> -git-po-helper check-po po/XX.po
> -git-po-helper check-commits <rev-list-opts>
> +git-po-helper check-po --pot-file=build po/XX.po
> +git-po-helper check-commits --pot-file=build <rev-list-opts>
>  ```
>
>
> @@ -430,7 +430,7 @@ There are some conventions that l10n contributors must follow:
>    your commit:
>
>    ```shell
> -  git-po-helper check-po <XX.po>
> +  git-po-helper check-po --pot-file=build <XX.po>
>    ```
>
>  - Squash trivial commits to make history clear.

Is everything above specific to using AI assistants to help your
translation process, or do people who do not (yet) use them also
benefit from these updated examples? If the latter, it probably
should belong to a separate patch.

> +AI tools, if used, serve only to accelerate routine tasks. They do not make
> +decisions, do not replace human judgment, and do not understand cultural
> +nuances or community needs.

They may very well do any of the above. It is your responsibility
as humans to monitor their decisions, judgement, and understanding,
and countermand them as needed.

> +### Preparing your translation environment for effective AI use
> +
> +If you choose to use AI assistance, investing time in preparation will
> +significantly improve the quality of AI-generated suggestions:
> +
> +1. **Maintain a glossary**: Add a "Git glossary for XX translators" section in
> +   the header comments of your `po/XX.po` file (before the first `msgid`). List
> +   key Git terms with their approved translations. AI tools can read and follow
> +   this glossary.

A random sampling of a few po/XX.po files seems to tell me that this
is already the case for some languages but not all of them. Perhaps
refer translators for other languages to an existing example to help
them start their glossary in their po/XX.po file?

> +2. **Keep translations up-to-date**: Regularly sync your `po/XX.po` with
> +   upstream. AI learns from existing translations. The more complete and
> +   consistent your PO file, the better AI suggestions will be.

I am not sure what this means. When you are working on updating
translations for your language, you'd want to be working from or
near the tip anyway, regardless of what tools you would use, no?

> +3. **Document style guidelines**: If your language team has specific formatting
> +   or style preferences, document them in your `po/XX.po` header. AI can
> +   incorporate these guidelines into its output.

Do we have an example in po/XY.po that translators for other
languages can learn from?

> +4. **Choose appropriate AI coding tools**: Evaluate and use models and tools
> +   that work best for your target language. Different AI models have varying
> +   levels of proficiency across languages. Test multiple tools to find which
> +   produces the most natural and accurate translations for your language.
> +
> +
> +### Technical guidelines for AI tools
> +
> +The following sections provide technical specifications for AI tools that
> +assist with Git translation. These guidelines ensure AI-generated suggestions
> +are technically correct and follow Git l10n conventions. Human translators
> +should be familiar with these requirements to effectively review AI output.

Are the subsections of this section meant to be fed as part of a
prompt to the tools? Otherwise they look mostly like repetitions of
what human translators already have learned elsewhere in the document.

> +#### Scope and context
> +
> +- Primary files: `po/XX.po` for translations, `po/git.pot` for the source
> +  template (generated on demand; see "Dynamically generated POT files").
> +- Source language: English. Target language: derived from the language code in
> +  the `po/XX.po` filename based on ISO 639 and ISO 3166.
> +- Glossary: Git l10n teams may add glossary sections (e.g. "Git glossary for
> +  Chinese translators") in the header comments of `po/XX.po` immediately before
> +  the first `msgid` entry. If a glossary exists, read it and keep terminology
> +  consistent.

This overlaps "Preparing #1"; do you want to cover "Preparing #4" as well?

> +#### Quality checklist
> +
> +- Accuracy: faithfully conveys the original meaning; no omissions or distortions.
> +- Terminology: uses correct, consistent terms per glossary or domain standards.
> +- Grammar and fluency: grammatically correct and reads naturally.
> +- Placeholders: preserves variables (e.g. `%s`, `{name}`, `$1`) exactly. If
> +  reordering is needed for the target language, use positional parameters as
> +  described below.
> +- Plurals and gender: handles plural forms, gender, and agreement correctly.
> +- Context fit: suitable for UI space, tone, and usage (e.g. error vs. tooltip).
> +- Cultural appropriateness: avoids offensive or ambiguous content.
> +- Consistency: matches prior translations of the same source string.
> +- Technical integrity: do not translate code, paths, commands, brand names, or
> +  proper nouns.
> +- Readability: clear, concise, and user-friendly.

The fact that these are important does not change if you use AI
tools or not, no?

As I am not sure about the purpose of these repeated instructions in
the "Tech guidelines for AI tools" section, I've trimmed most of the
contents in it here.

> +### Integrating AI tools into your workflow
> +
> +If you decide to use AI assistance, here's how to integrate it responsibly:
> +
> +
> +#### For AI tool developers and users
> +
> +When building or configuring AI-assisted translation tools:
> +
> +- Use the quality checklist (above) to score or filter draft suggestions
> +- Apply the `msgattrib` + `sed` commands to extract relevant entries for processing

Referring to the section (e.g., "commands listed in the 'Locating
untranslated, fuzzy, and obsolete entries' section") would be
clearer. You have necessary commands ready to be cut-and-pasted
there.
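To make the `sed` half of the "`msgattrib` + `sed`" pairing concrete without needing gettext installed, the sketch below runs the same `sed` expression from the "Locating untranslated, fuzzy, and obsolete entries" section over a hand-written sample. The sample entries, file names in the `#:` comments, and the helper names `sample_po` and `only_messages` are made up for illustration.

```shell
# Sketch only: the two entries below are invented sample data.
sample_po() {
    cat <<'EOF'
#: builtin/add.c
#, c-format
msgid "adding '%s'"
msgstr ""

#: builtin/commit.c
msgid "nothing to commit"
msgstr ""
EOF
}

# Same sed expression as in the README section: print from each line that
# starts with "msgid " through the next empty line, which drops the "#"
# comment lines that precede every entry.
only_messages() {
    sed -n '/^msgid /,/^$/p'
}

sample_po | only_messages
```

In real use the input would come from `msgattrib --untranslated --no-obsolete po/XX.po` rather than from `sample_po`.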
* Re: [RFC PATCH 2/2] l10n: README: document AI assistant guidelines
From: Jiang Xin @ 2026-02-06 2:38 UTC
To: Junio C Hamano; +Cc: Git List

On Fri, Feb 6, 2026 at 4:35 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jiang Xin <worldhello.net@gmail.com> writes:
>
> > Add guidelines for using AI tools as optional assistants in Git
> > localization work, while emphasizing human translators remain in
> > control.
> >
> > Also update `git-po-helper` command examples to include the
> > `--pot-file=build` option.
> >
> > Example usage in prompts to AI assistants:
> >
> > - "Update translations in `po/XX.po` following the guidelines
> >   in @po/README.md"
> > - "Review all translations in `po/XX.po` following the guidelines
> >   in @po/README.md"
> >
> > Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> > ---
> >  po/README.md | 294 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 291 insertions(+), 3 deletions(-)
> >
> > diff --git a/po/README.md b/po/README.md
> > index ad7f72ba83..6ba082376a 100644
> > --- a/po/README.md
> > +++ b/po/README.md
> > @@ -227,8 +227,8 @@ L10n coordinator will check your contributions using a helper program
> >  (see "PO helper" section below):
> >
> >  ```shell
> > -git-po-helper check-po po/XX.po
> > -git-po-helper check-commits <rev-list-opts>
> > +git-po-helper check-po --pot-file=build po/XX.po
> > +git-po-helper check-commits --pot-file=build <rev-list-opts>
> >  ```
> >
> >
> > @@ -430,7 +430,7 @@ There are some conventions that l10n contributors must follow:
> >    your commit:
> >
> >    ```shell
> > -  git-po-helper check-po <XX.po>
> > +  git-po-helper check-po --pot-file=build <XX.po>
> >    ```
> >
> >  - Squash trivial commits to make history clear.
>
> Is everything above specific to using AI assistants to help your
> translation process, or do people who do not (yet) use them also
> benefit from these updated examples? If the latter, it probably
> should belong to a separate patch.

When git-po-helper is used in GitHub Actions, it cannot build the POT
template from source code because the CI workflow uses a partial clone
of the Git repository with only "po/*.po" files checked out. Therefore,
by default, git-po-helper downloads a prebuilt POT template file instead
of compiling from source.

However, building from source code (--pot-file=build) should be the safe
and default behavior when working in a complete source tree. I will
update the git-po-helper code to automatically detect the environment
and set the appropriate default behavior for both scenarios, eliminating
the need to document the --pot-file=build option explicitly.

> > +AI tools, if used, serve only to accelerate routine tasks. They do not make
> > +decisions, do not replace human judgment, and do not understand cultural
> > +nuances or community needs.
>
> They may very well do any of the above. It is your responsibility
> as humans to monitor their decisions, judgement, and understanding,
> and countermand them as needed.

Agreed. I'll adopt your suggested wording.

> > +### Preparing your translation environment for effective AI use
> > +
> > +If you choose to use AI assistance, investing time in preparation will
> > +significantly improve the quality of AI-generated suggestions:
> > +
> > +1. **Maintain a glossary**: Add a "Git glossary for XX translators" section in
> > +   the header comments of your `po/XX.po` file (before the first `msgid`). List
> > +   key Git terms with their approved translations. AI tools can read and follow
> > +   this glossary.
>
> A random sampling of a few po/XX.po files seems to tell me that this
> is already the case for some languages but not all of them. Perhaps
> refer translators for other languages to an existing example to help
> them start their glossary in their po/XX.po file?

Will do. I'll add: "See `po/zh_CN.po` for an example."

> > +2. **Keep translations up-to-date**: Regularly sync your `po/XX.po` with
> > +   upstream. AI learns from existing translations. The more complete and
> > +   consistent your PO file, the better AI suggestions will be.
>
> I am not sure what this means. When you are working on updating
> translations for your language, you'd want to be working from or
> near the tip anyway, regardless of what tools you would use, no?

Agreed. I'll remove this redundant point.

> > +3. **Document style guidelines**: If your language team has specific formatting
> > +   or style preferences, document them in your `po/XX.po` header. AI can
> > +   incorporate these guidelines into its output.
>
> Do we have an example in po/XY.po that translators for other
> languages can learn from?

I originally kept that point because I wanted to document how to
generate the location-less file format in the PO file header, but it's
now obsolete since I added a repository-level gitattributes file in a
previous commit.

> > +4. **Choose appropriate AI coding tools**: Evaluate and use models and tools
> > +   that work best for your target language. Different AI models have varying
> > +   levels of proficiency across languages. Test multiple tools to find which
> > +   produces the most natural and accurate translations for your language.
> > +
> > +
> > +### Technical guidelines for AI tools
> > +
> > +The following sections provide technical specifications for AI tools that
> > +assist with Git translation. These guidelines ensure AI-generated suggestions
> > +are technically correct and follow Git l10n conventions. Human translators
> > +should be familiar with these requirements to effectively review AI output.
>
> Are the subsections of this section meant to be fed as part of a
> prompt to the tools? Otherwise they look mostly like repetitions of
> what human translators already have learned elsewhere in the document.
>
>
> > +#### Scope and context
> > +
> > +- Primary files: `po/XX.po` for translations, `po/git.pot` for the source
> > +  template (generated on demand; see "Dynamically generated POT files").
> > +- Source language: English. Target language: derived from the language code in
> > +  the `po/XX.po` filename based on ISO 639 and ISO 3166.
> > +- Glossary: Git l10n teams may add glossary sections (e.g. "Git glossary for
> > +  Chinese translators") in the header comments of `po/XX.po` immediately before
> > +  the first `msgid` entry. If a glossary exists, read it and keep terminology
> > +  consistent.
>
> This overlaps "Preparing #1"; do you want to cover "Preparing #4" as well?

"Preparing #1" tells humans to maintain a glossary; this section tells
AI tools to read and use it (add to the context). Different audiences,
complementary purposes.

> > +#### Quality checklist
> > +
> > +- Accuracy: faithfully conveys the original meaning; no omissions or distortions.
> > +- Terminology: uses correct, consistent terms per glossary or domain standards.
> > +- Grammar and fluency: grammatically correct and reads naturally.
> > +- Placeholders: preserves variables (e.g. `%s`, `{name}`, `$1`) exactly. If
> > +  reordering is needed for the target language, use positional parameters as
> > +  described below.
> > +- Plurals and gender: handles plural forms, gender, and agreement correctly.
> > +- Context fit: suitable for UI space, tone, and usage (e.g. error vs. tooltip).
> > +- Cultural appropriateness: avoids offensive or ambiguous content.
> > +- Consistency: matches prior translations of the same source string.
> > +- Technical integrity: do not translate code, paths, commands, brand names, or
> > +  proper nouns.
> > +- Readability: clear, concise, and user-friendly.
>
> The fact that these are important does not change if you use AI
> tools or not, no?
>
> As I am not sure about the purpose of these repeated instructions in
> the "Tech guidelines for AI tools" section, I've trimmed most of the
> contents in it here.

You're right that these standards apply universally. However, the
following sections reference this checklist explicitly (e.g., "Apply the
quality checklist to every translation" in the workflow section, and
"Apply the quality checklist to each message you review" in the review
process). Without defining the checklist here, we'd need to repeat a
shorter version of quality standards in multiple places.

I'll evaluate translation quality with different versions of
"po/README.md" and share some data in v2 to demonstrate whether the
AI-specific guidance adds value.

Best regards,
Jiang Xin
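The point that AI tools should read the glossary and add it to their context can be sketched as a small extraction step. The helper below is hypothetical (the name `po_header_comments` is not from the patch or from git-po-helper); it assumes, as the README text states, that the glossary sits in the comment lines before the first `msgid` of `po/XX.po`.

```shell
# Sketch only: po_header_comments is a hypothetical helper. It prints the
# comment lines that precede the first "msgid" line of a PO file, which is
# where the README asks language teams to keep their glossary.
po_header_comments() {
    awk '
        /^msgid / { exit }    # stop at the header entry (msgid "")
        /^#/      { print }   # keep only comment lines before it
    ' "$1"
}
```

A call such as `po_header_comments po/zh_CN.po` would then yield text that a tool could prepend to its prompt. Whether a given PO file actually carries a glossary in that position has to be checked per language.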
* [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements
2026-02-06 2:38 ` Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
` (4 more replies)
2026-03-14 14:38 ` [PATCH v3 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2 siblings, 5 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

## Introduction

This series introduces AI agent instructions for Git localization (l10n)
workflows to help localization contributors quickly complete drafts and
use AI to check translation quality. The changes focus on:

1. Separating agent-specific documentation into po/AGENTS.md for
   targeted optimization of AI-assisted workflows
2. Providing step-by-step instructions for update-pot, update-po,
   translation, and review tasks
3. Simplifying location filtering for PO file commits via .gitattributes

AI-assisted translation is optional; many successful l10n teams work
well without it. When used, AI output serves as reference only; human
contributors must review and approve before submission.

## Performance summary

| Task        | Before        | After        | Improvement                     |
|-------------|---------------|--------------|---------------------------------|
| update-pot  | 17 turns, 34s | 3 turns, 8s  | -82% turns, -76% time           |
| update-po   | 22 turns, 38s | 4 turns, 9s  | -82% turns, -76% time           |
| translate   | 86 turns      | 56 turns     | -35% turns (git-po-helper flow) |
| review      | N/A           | 96/100 score | New workflow documented         |

These improvements reduce API costs and make agent workflows more
efficient while maintaining human oversight of translation quality.

## Testing

All changes have been evaluated with the qwen model via git-po-helper
agent-test and agent-run. The po/AGENTS.md instructions are designed to
work with coding tools that support file references (e.g., "Translate
po/zh_CN.po by referring to @po/AGENTS.md").

## Changes

Jiang Xin (5):
  l10n: add .gitattributes to simplify location filtering
  docs(l10n): add AGENTS.md with optimized update-pot instructions
  docs(l10n): add AI agent instructions for updating po/XX.po files
  docs(l10n): add AI agent instructions for translating PO files
  docs(l10n): add AI agent instructions to review translations

 po/.gitattributes |  36 ++
 po/AGENTS.md      | 941 ++++++++++++++++++++++++++++++++++++++++++++++
 po/README.md      |  70 ++--
 3 files changed, 1015 insertions(+), 32 deletions(-)
 create mode 100644 po/.gitattributes
 create mode 100644 po/AGENTS.md
^ permalink raw reply [flat|nested] 42+ messages in thread
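
The percentages in the performance summary of the v2 cover letter can be re-derived from its before/after figures. The sketch below is only an arithmetic check, assuming each improvement is the relative change rounded to a whole percent; `pct_change` is a helper name made up for this illustration and is not part of git-po-helper.

```shell
# Relative change from "before" to "after", rounded to a whole percent.
# pct_change is a hypothetical helper used only for this check.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%+.0f%%\n", (after - before) / before * 100 }'
}

pot_turns=$(pct_change 17 3)   # update-pot, turns
pot_time=$(pct_change 34 8)    # update-pot, seconds
po_turns=$(pct_change 22 4)    # update-po, turns
po_time=$(pct_change 38 9)     # update-po, seconds
tr_turns=$(pct_change 86 56)   # translate, turns

echo "update-pot: $pot_turns turns, $pot_time time"
echo "update-po:  $po_turns turns, $po_time time"
echo "translate:  $tr_turns turns"
```

Under that assumption the computed values agree with the table in the cover letter.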
* [PATCH v2 1/5] l10n: add .gitattributes to simplify location filtering
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
` (3 subsequent siblings)
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

To simplify the location filtering process for l10n contributors when
committing po/XX.po files, add the filter attributes for selected PO
files to the repository. This ensures all contributors automatically
get the same filter configuration without manual setup in
.git/info/attributes.

The filter attribute is only applied to specific PO files that have
been properly prepared. Files without the filter attribute fall into
two categories:

- Legacy files that lack maintenance and still contain location
  comments that have not been cleaned up
- Files whose formatting (such as line wrapping) differs from the
  output of msgcat processing

To avoid discrepancies between the filtered blob in the index and the
unfiltered working tree for these files, the filter attribute is not
applied to them.

Contributors still need to manually define the filter drivers using
git-config as documented in po/README.md.

Additionally, po/README.md has been reorganized: the instructions for
handling location-less PO files have been moved from the "Updating a
XX.po file" section to a separate "Preparing a XX.po file for commit"
section. This prevents AI agents from introducing unrelated operations
when updating PO files.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com> --- po/.gitattributes | 36 ++++++++++++++++++++++++ po/README.md | 70 +++++++++++++++++++++++++---------------------- 2 files changed, 74 insertions(+), 32 deletions(-) create mode 100644 po/.gitattributes diff --git a/po/.gitattributes b/po/.gitattributes new file mode 100644 index 0000000000..7100b7050e --- /dev/null +++ b/po/.gitattributes @@ -0,0 +1,36 @@ +# Git Attributes for PO Files +# +# This file configures Git filters to automatically strip location information +# from PO files when committing, producing cleaner diffs and saving repository +# space. +# +# Two filter types are used: +# 1. gettext-no-file-no-location: Strips both filenames and line numbers +# (e.g., removes "#: main.c:123" entirely) +# 2. gettext-no-location: Preserves filenames but removes line numbers, which +# requires gettext 0.20 or higher +# (e.g., "#: main.c:123" becomes "#: main.c") +# +# See `po/README.md` for instructions on setting up the required filter drivers. + +# Do not configure default attributes for `*.po` files, as this would cause +# differences between the filtered blob stored in the index and the unfiltered +# working tree version for legacy, unmaintained PO files. 
+ +# Languages that strip both filenames and line numbers +bg.po filter=gettext-no-file-no-location +de.po filter=gettext-no-file-no-location +#es.po filter=gettext-no-file-no-location +fr.po filter=gettext-no-file-no-location +#ga.po filter=gettext-no-file-no-location +#ru.po filter=gettext-no-file-no-location +sv.po filter=gettext-no-file-no-location +tr.po filter=gettext-no-file-no-location +uk.po filter=gettext-no-file-no-location +vi.po filter=gettext-no-file-no-location + +# Languages that preserve filenames but strip line numbers +#ca.po filter=gettext-no-location +id.po filter=gettext-no-location +zh_CN.po filter=gettext-no-location +zh_TW.po filter=gettext-no-location diff --git a/po/README.md b/po/README.md index ec08aa24ad..79757d4c21 100644 --- a/po/README.md +++ b/po/README.md @@ -159,38 +159,6 @@ It will: and these location lines will help translation tools to locate translation context easily. -Once you are done testing the translation (see below), it's better -to commit a location-less "po/XX.po" file to save repository space -and make a user-friendly patch for review. - -To save a location-less "po/XX.po" automatically in repository, you -can: - -First define a new attribute for "po/XX.po" by appending the following -line in ".git/info/attributes": - -``` -/po/XX.po filter=gettext-no-location -``` - -Then define the driver for the "gettext-no-location" clean filter to -strip out both filenames and locations from the contents as follows: - -```shell -git config --global filter.gettext-no-location.clean \ - "msgcat --no-location -" -``` - -For users who have gettext version 0.20 or higher, it is also possible -to define a clean filter to preserve filenames but not locations: - -```shell -git config --global filter.gettext-no-location.clean \ - "msgcat --add-location=file -" -``` - -You're now ready to ask the l10n coordinator to pull from you. 
- ## Fuzzy translation @@ -229,6 +197,44 @@ git-po-helper check-commits <rev-list-opts> ``` +## Preparing a "XX.po" file for commit + +Once you are done testing the translation, it's better to commit a +location-less "po/XX.po" file to save repository space and make a +user-friendly patch for review. + +To save a location-less "po/XX.po" automatically in repository, you +can: + +First, check which filter is configured for your "po/XX.po" file: + +``` +git check-attr filter po/XX.po +``` + +The filter configuration is defined in the "po/.gitattributes" file. + +Then define the driver for the filter. Most languages use the +"gettext-no-file-no-location" clean filter, which strips out both filenames and +locations from the comments. To set this up, run the following command: + +```shell +git config --global filter.gettext-no-file-no-location.clean \ + "msgcat --no-location -" +``` + +Some languages use the "gettext-no-location" clean filter, which preserves +filenames but not locations. For these, install gettext version 0.20 or higher +and setup the driver as below: + +```shell +git config --global filter.gettext-no-location.clean \ + "msgcat --add-location=file -" +``` + +You're now ready to ask the l10n coordinator to pull from you. + + ## Marking strings for translation (This is done by the core developers). -- 2.53.0.rc2.20.g532543fa46 ^ permalink raw reply related [flat|nested] 42+ messages in thread
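
The attribute lookup that the patch above adds to po/README.md can be tried without touching a real checkout. This is a minimal sketch, assuming `git` is installed; the throwaway repository and its two-line attributes file are made up for the illustration, and only `git check-attr` and the filter names come from the patch.

```shell
# Throwaway repository; only po/.gitattributes matters for the lookup.
repo=$(mktemp -d)
git init -q "$repo"
mkdir "$repo/po"
cat >"$repo/po/.gitattributes" <<'EOF'
sv.po filter=gettext-no-file-no-location
zh_CN.po filter=gettext-no-location
EOF

# check-attr answers from the attribute files alone, so the PO files
# themselves do not need to exist.
sv_filter=$(git -C "$repo" check-attr filter po/sv.po)
es_filter=$(git -C "$repo" check-attr filter po/es.po)
echo "$sv_filter"
echo "$es_filter"

rm -rf "$repo"
```

A file that is not listed reports `unspecified`, which corresponds to the case where a contributor still has to configure the attribute locally.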
* [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-03 15:33 ` [PATCH v2 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-12 2:11 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
` (2 subsequent siblings)
4 siblings, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new documentation file po/AGENTS.md that provides agent-specific
instructions for generating or updating po/git.pot, separating them
from the general po/README.md. This separation allows for more targeted
optimization of AI agent workflows.

Performance evaluation using the qwen model:

    # Before: add the instruction to po/README.md; the prompt references
    # po/README.md for execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot \
        --prompt="Update po/git.pot according to po/README.md"

    # After: add the instruction to po/AGENTS.md; use builtin prompt
    # that references po/AGENTS.md for execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot

Benchmark results (5-run average):

Phase 1 - Optimizing po/README.md:

| Metric      | Before  | After  | Improvement |
|-------------|---------|--------|-------------|
| Turns       | 17      | 5      | -71%        |
| Exec time   | 34s     | 14s    | -59%        |
| Turn range  | 3-36    | 3-7    |             |
| Time range  | 10s-59s | 9s-19s |             |

Phase 2 - Adding po/AGENTS.md (further optimization):

| Metric      | Before  | After  | Improvement |
|-------------|---------|--------|-------------|
| Turns       | 17      | 3      | -82%        |
| Exec time   | 34s     | 8s     | -76%        |
| Turn range  | 3-36    | 3-3    |             |
| Time range  | 10s-59s | 6s-9s  |             |

Separating agent-specific instructions into AGENTS.md provides:

- More focused and concise instructions for AI agents
- Cleaner README.md for human readers
- Additional reduction of 11 percentage points in turns and 17 in
  execution time (both relative to the original baseline)
- More consistent behavior (turn range reduced from 3-7 to 3-3)

This change makes agent workflows more efficient and reduces API costs
by minimizing redundant LLM interactions.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)
 create mode 100644 po/AGENTS.md

diff --git a/po/AGENTS.md b/po/AGENTS.md
new file mode 100644
index 0000000000..1fcef9119a
--- /dev/null
+++ b/po/AGENTS.md
@@ -0,0 +1,92 @@
+# Instructions for AI Agents
+
+This file gives specific instructions for AI agents that perform
+housekeeping tasks for Git l10n. Use of AI is optional; many successful
+l10n teams work well without it.
+ +The section "Housekeeping tasks for localization workflows" documents the +most commonly used housekeeping tasks. + + +## Background knowledge for localization workflows + +Essential background for the workflows below; understand these concepts before +performing any housekeeping tasks in this document. + +### Language code and notation (XX, ll, ll\_CC) + +XX is a placeholder for the language code. The code is either `ll` (ISO 639) +or `ll_CC` (e.g. `de`, `zh_CN` for Simplified Chinese). It appears in the PO +file's header entry metadata (e.g. `"Language: zh_CN\n"`) and is typically used +as the filename: `po/XX.po`. + + +### Header Entry + +Every PO file (`po/XX.po`) contains a special entry called the "header entry" +at the beginning of the file. This entry has an empty `msgid` and contains +metadata about the translation in its `msgstr`: + +```po +msgid "" +msgstr "" +"Project-Id-Version: Git\n" +"Report-Msgid-Bugs-To: Git Mailing List <git@vger.kernel.org>\n" +"POT-Creation-Date: 2026-02-14 13:38+0800\n" +"PO-Revision-Date: 2026-02-14 11:41+0800\n" +"Last-Translator: Teng Long <dyroneteng@gmail.com>\n" +"Language-Team: GitHub <https://github.com/dyrone/git/>\n" +"Language: zh_CN\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"Plural-Forms: nplurals=2; plural=(n != 1);\n" +"X-Generator: Gtranslator 42.0\n" +``` + +**CRITICAL**: Do not modify the header's `msgstr` during translation. Extracted +files (e.g. `po/l10n-pending.po`) include this header; preserve it exactly. + +The header provides: translation metadata (translator, language, dates); +pluralization rules (`Plural-Forms`); encoding and MIME type; project/version. + + +## Housekeeping tasks for localization workflows + +This section describes housekeeping tasks listed in the introduction. Read +"Background knowledge for localization workflows" above before performing +any task. 
+ + +### Task 1: Generating or updating po/git.pot + +When asked to "update po/git.pot" or similar requests: + +1. **Directly execute** the command `make po/git.pot` without checking + if the file exists beforehand. + +2. **Do not verify** the generated file after execution. Simply run the + command and consider the task complete. + +The command will handle all necessary steps including file creation or +update automatically. + + +## Human translators remain in control + +Git translation is human-driven; language team leaders and contributors are +responsible for: + +- Understanding technical context of Git commands and messages +- Making linguistic and cultural decisions for the target language +- Maintaining translation quality and consistency +- Ensuring translations follow Git l10n conventions and standards +- Building and maintaining language glossaries +- Reviewing and approving all changes before submission + +AI tools, if used, only accelerate routine tasks. + +AI-generated output should always be treated as rough drafts requiring human +review, editing, and approval by someone who understands both the technical +context and the target language. The best results come from combining AI +efficiency with human judgment, cultural insight, and community engagement. -- 2.53.0.rc2.20.g532543fa46 ^ permalink raw reply related [flat|nested] 42+ messages in thread
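
The patch above notes that the language code appears in the header entry as a `"Language: ...\n"` line. As a rough illustration, the sketch below pulls that field out of a PO file with `sed`; the sample file is a trimmed-down header written for this example, and the variable names `lang` and `ll` are assumptions, not something the patch defines.

```shell
# Minimal stand-in for po/XX.po, containing only what the example needs.
po=$(mktemp)
cat >"$po" <<'EOF'
msgid ""
msgstr ""
"Project-Id-Version: Git\n"
"Language-Team: GitHub <https://github.com/dyrone/git/>\n"
"Language: zh_CN\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

msgid "commit"
msgstr "提交"
EOF

# Each header field is a quoted line ending in a literal backslash-n.
# Anchoring on '"Language: ' avoids matching the Language-Team field.
lang=$(sed -n 's/^"Language: \(.*\)\\n"$/\1/p' "$po")

# Strip an optional _CC territory suffix to get the bare ll code.
ll=${lang%%_*}

echo "language code: $lang (ll part: $ll)"
rm -f "$po"
```

The lookup is read-only, which is consistent with the patch's instruction not to modify the header's `msgstr` during translation.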
* Re: [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions
2026-03-03 15:33 ` [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
@ 2026-03-12 2:11 ` Jiang Xin
0 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-12 2:11 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

On Tue, Mar 3, 2026 at 11:33 PM Jiang Xin <worldhello.net@gmail.com> wrote:
> +### Header Entry
> +
> +Every PO file (`po/XX.po`) contains a special entry called the "header entry"
> +at the beginning of the file. This entry has an empty `msgid` and contains
> +metadata about the translation in its `msgstr`:
> +
> +```po
> +msgid ""
> +msgstr ""
> +"Project-Id-Version: Git\n"
> +"Report-Msgid-Bugs-To: Git Mailing List <git@vger.kernel.org>\n"
> +"POT-Creation-Date: 2026-02-14 13:38+0800\n"
> +"PO-Revision-Date: 2026-02-14 11:41+0800\n"
> +"Last-Translator: Teng Long <dyroneteng@gmail.com>\n"
> +"Language-Team: GitHub <https://github.com/dyrone/git/>\n"
> +"Language: zh_CN\n"
> +"MIME-Version: 1.0\n"
> +"Content-Type: text/plain; charset=UTF-8\n"
> +"Content-Transfer-Encoding: 8bit\n"
> +"Plural-Forms: nplurals=2; plural=(n != 1);\n"
> +"X-Generator: Gtranslator 42.0\n"
> +```

Will remove unnecessary header entries in the v3 reroll to make this
file smaller and more concise.
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v2 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-03 15:33 ` [PATCH v2 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-03-03 15:33 ` [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
2026-03-03 15:33 ` [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new section to po/AGENTS.md to provide clear instructions for
updating language-specific PO files. The improved documentation
significantly reduces both model interaction rounds and execution time.

Performance evaluation using the qwen model:

    # Before: instructions in po/README.md; custom prompt references it
    git-po-helper agent-test --runs=5 --agent=qwen update-po \
        --prompt="Update po/zh_CN.po according to po/README.md"

    # After: instructions in po/AGENTS.md; built-in prompt references it
    git-po-helper agent-test --runs=5 --agent=qwen update-po

Benchmark results (5-run average):

| Metric      | Before  | After  | Improvement |
|-------------|---------|--------|-------------|
| Turns       | 22      | 4      | -82%        |
| Exec time   | 38s     | 9s     | -76%        |
| Turn range  | 17-39   | 3-9    |             |
| Time range  | 25s-68s | 7s-14s |             |

This change makes agent workflows more efficient and reduces API costs
by minimizing redundant LLM interactions and file content checks.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 1fcef9119a..5eb1a606e1 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -72,6 +72,22 @@ The command will handle all necessary steps including file creation or
 update automatically.
 
 
+### Task 2: Updating po/XX.po
+
+When asked to "update po/XX.po" or similar requests (where XX is a
+language code):
+
+1. **Directly execute** the command `make po-update PO_FILE=po/XX.po`
+   without reading or checking the file content beforehand.
+
+2. **Do not verify, translate, or review** the updated file after execution.
+   Simply run the command and consider the task complete.
+
+The command will handle all necessary steps including generating
+"po/git.pot" and merging new translatable strings into "po/XX.po"
+automatically.
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
-- 
2.53.0.rc2.20.g532543fa46
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
` (2 preceding siblings ...)
2026-03-03 15:33 ` [PATCH v2 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-12 2:26 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
4 siblings, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Translating po/XX.po" section to po/AGENTS.md with detailed
workflow and procedures for AI agents to translate language-specific PO
files. Users can invoke AI-assisted translation in coding tools with a
prompt such as:

    "Translate the po/XX.po file by referring to @po/AGENTS.md"

Translation results serve as a reference; human contributors must
review and approve before submission.

To address the translation efficiency issues of some LLMs, batch
translation replaces entry-by-entry translation. git-po-helper
implements a gettext JSON format for translation files, replacing PO
format during translation to enable batch processing.

Evaluation test using the qwen model:

    git-po-helper agent-run --agent=qwen translate po/zh_CN.po

Test translation (127 entries, 50 per batch):

    Initial state: 5998 translated, 91 fuzzy, 36 untranslated
    Final state: 6125 translated, 0 fuzzy, 0 untranslated

    Successfully translated: 127 entries (91 fuzzy + 36 untranslated)
    Success rate: 100%

Benchmark results (3-run average):

AI Agent using gettext tools:

| Metric           | Value                          |
|------------------|--------------------------------|
| Avg Num turns    | 86 (176, 44, 40)               |
| Avg Exec Time    | 20m44s (39m56s, 14m38s, 7m38s) |
| Successful runs  | 3/3                            |

AI Agent using git-po-helper (JSON batch flow):

| Metric           | Value                          |
|------------------|--------------------------------|
| Avg Num turns    | 56 (68, 39, 63)                |
| Avg Exec Time    | 19m8s (28m55s, 9m1s, 19m28s)   |
| Successful runs  | 3/3                            |

The git-po-helper flow reduces turns (86 → 56) with similar execution
time; the bottleneck appears to be LLM processing rather than network
interaction.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 643 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 642 insertions(+), 1 deletion(-)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 5eb1a606e1..3bb8fb3858 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -5,7 +5,11 @@ housekeeping tasks for Git l10n. Use of AI is optional; many successful
 l10n teams work well without it.
 
 The section "Housekeeping tasks for localization workflows" documents the
-most commonly used housekeeping tasks.
+most commonly used housekeeping tasks:
+
+1. Generating or updating po/git.pot
+2. Updating po/XX.po
+3. Translating po/XX.po
 
 
 ## Background knowledge for localization workflows
@@ -51,6 +55,384 @@ The header provides: translation metadata (translator, language, dates);
 pluralization rules (`Plural-Forms`); encoding and MIME type; project/version.
+### Glossary Section + +PO files may have a glossary in comments before the header entry (first +`msgid ""`), giving terminology guidelines: + +```po +# Git glossary for Chinese translators +# +# English | Chinese +# ---------------------------------+-------------------------------------- +# 3-way merge | 三路合并 +# branch | 分支 +# commit | 提交 +# ... +``` + +**IMPORTANT**: Read and use the glossary when translating or reviewing. It is +in `#` comments and is preserved when extracting with `msgattrib`. + + +### Single-line vs Multi-line Entries + +**Single-line entries**: +```po +msgid "commit message" +msgstr "提交说明" +``` + +**Multi-line entries** (first line of `msgid` and `msgstr` is empty string): +```po +msgid "" +"Line 1\n" +"Line 2" +msgstr "" +"行 1\n" +"行 2" +``` + +**CRITICAL** for multi-line: first line is `msgid ""` / `msgstr ""`; following +lines are quoted strings; use `\n` for line breaks. Preserve quotes and +structure exactly. + +Because multi-line entries also use `msgstr ""` on the first line, `grep +'^msgstr ""'` yields false positives when locating untranslated strings. See +the next section for the correct approach. + + +### Locating untranslated, fuzzy, and obsolete entries + +**The commands below are used in "Task 3: translating po/XX.po".** For +translation tasks, follow Task 3 steps strictly; do not run these commands in +isolation. + +This section describes how to locate untranslated, fuzzy, and obsolete entries. +Do **not** use `grep '^msgstr ""$'`—it matches multi-line entries and causes +false positives. Use `msgattrib`: + +- **Untranslated**: `msgattrib --untranslated --no-obsolete po/XX.po` +- **Fuzzy**: `msgattrib --only-fuzzy --no-obsolete po/XX.po` +- **Obsolete** (`#~`): `msgattrib --obsolete --no-wrap po/XX.po` + +To get only message IDs: +`msgattrib --untranslated --no-obsolete po/XX.po | sed -n '/^msgid /,/^$/p'` +(Same pattern for fuzzy with `--only-fuzzy`.) 
+ +When counting entries, the header is included; subtract 1 to exclude it. + + +### Translating fuzzy entries + +Fuzzy entries need re-translation because the source text changed. The format +differs by file type: + +- **PO file**: A `#, fuzzy` tag in the entry comments marks the entry as fuzzy. +- **JSON file**: The entry has `"fuzzy": true`. + +**Translation principles**: Re-translate the `msgstr` (and, for plural entries, +`msgstr[n]`) into the target language. Do **not** modify `msgid` or +`msgid_plural`. After translation, **clear the fuzzy mark**: in PO, remove the +`#, fuzzy` tag from comments; in JSON, omit or set `fuzzy` to `false`. + + +### Preserving Special Characters + +Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders (`%s`, `%d`, +etc.), and quotes exactly as in `msgid`. Only reorder placeholders with +positional syntax when needed (see Placeholder Reordering below). + +**Correct**: `msgstr "行 1\n行 2"` (keep `\n` as escape). +**Wrong**: `msgstr "行 1\\n行 2"` or actual line breaks inside the string. + + +### Placeholder Reordering + +When reordering placeholders from the original `msgid`, use positional syntax +(`%n$`) so each argument maps to the correct value. Keep width/precision +modifiers and put the position before them. + +**Example 1** (precision): +```po +#, c-format +msgid "missing environment variable '%s' for configuration '%.*s'" +msgstr "配置 '%3$.*2$s' 缺少环境变量 '%1$s'" +``` +`%s` → argument 1 → `%1$s`. `%.*s` needs precision (arg 2) and string (arg 3) → +`%3$.*2$s`. + +**Example 2** (multi-line, four `%s` reordered): +```po +#, c-format +msgid "" +"the 'submodule.%s.gitdir' config does not exist for module '%s'. Please " +"ensure it is set, for example by running something like: 'git config " +"submodule.%s.gitdir .git/modules/%s'. For details see the " +"extensions.submodulePathConfig documentation." 
+msgstr "" +"模块 '%2$s' 的 'submodule.%1$s.gitdir' 配置不存在。请确保已设置,例如运行类" +"似:'git config submodule.%3$s.gitdir .git/modules/%4$s'。详细信息请参见 " +"extensions.submodulePathConfig 文档。" +``` + +Original order 1,2,3,4; in translation 2,1,3,4. Each line must be a complete +quoted string. + +**Rules**: Use `%n$` (n = 1-based position); place position before +width/precision; for `%.*s` map both precision and string; verify all +placeholders are mapped. + + +### Validating PO File Format + +Validate any PO file (e.g. `po/XX.po`, `po/l10n-pending.po`): + +```shell +msgfmt --check -o /dev/null po/XX.po +``` + +Common validation errors include: +- Unclosed quotes +- Missing escape sequences +- Invalid placeholder syntax +- Malformed multi-line entries +- Incorrect line breaks in multi-line strings + +**Handling validation errors with automatic repair**: +When `msgfmt` reports an error, it provides the line number where the error +was detected. Use this information to locate and fix the issue. + + +### Using git-po-helper + +[git-po-helper](https://github.com/git-l10n/git-po-helper) is a helper program +for Git localization (l10n) contributions. It serves two main purposes: +**quality checking** (conventions for git-l10n pull requests) and +**AI-assisted translation** (evaluate; help establish and assess the impact +of this document on automated translation). git-po-helper provides subcommands +that simplify the AI translation workflow and improve efficiency. When +available, this document uses `git-po-helper` for PO operations; otherwise it +falls back to gettext tools. + +**This section serves as reference for Housekeeping tasks.** AI Agent should +follow the Task steps when executing; this content provides command reference +information. Do not run commands in isolation. + + +#### Splitting large PO files + +When a PO file is too large for translation or review, use `git-po-helper +msg-select` to split it by entry index. 
+
+- **Entry 0** is the header (included by default; use `--no-header` to omit).
+- **Entries 1, 2, 3, …** are content entries.
+- **Range format**: `--range "1-50"` (entries 1 through 50), `--range "-50"`
+  (first 50 entries), `--range "51-"` (from entry 51 to end).
+- **Output format**: PO by default; use `--json` for GETTEXT JSON. See the
+  "GETTEXT JSON format" section (under git-po-helper) for details.
+- **State filter**: Use `--translated`, `--untranslated`, `--fuzzy` to filter
+  by state (OR relationship). Use `--no-obsolete` to exclude obsolete entries;
+  `--with-obsolete` to include (default). Use `--only-same` or `--only-obsolete`
+  for a single state. Range applies to the filtered list.
+
+```shell
+# First 50 entries (header + entries 1–50)
+git-po-helper msg-select --range "-50" po/in.po -o po/out1.po
+
+# Entries 51–100
+git-po-helper msg-select --range "51-100" po/in.po -o po/out2.po
+
+# Entries 101 to end
+git-po-helper msg-select --range "101-" po/in.po -o po/out3.po
+
+# Entries 1–50 without header (content only)
+git-po-helper msg-select --range "1-50" --no-header po/in.po -o po/frag.po
+
+# Output as JSON; select untranslated and fuzzy entries, exclude obsolete
+git-po-helper msg-select --json --untranslated --fuzzy --no-obsolete po/in.po >po/filtered.json
+```
+
+
+#### Comparing PO files for translation and review
+
+Use `git-po-helper compare` for scenarios that `git diff` or `git show` cannot
+handle well:
+
+- **Show changes with full context**: Get new and modified entries with
+  complete `msgid` and `msgstr`. Plain `git diff` fragments or loses PO
+  context.
+- **Detect msgid tampering**: When an AI-generated PO file may have altered
+  `msgid`, a translation becomes an add instead of a replace. Use `--msgid`
+  to compare by msgid only. No diff output means the target and source files
+  are consistent in the data source (msgid).
+
+These capabilities support both translation workflows and code review. Redirect
+output to a file:
+
+```shell
+# Check msgid consistency (detect tampering); no output means target matches source
+git-po-helper compare --msgid po/old.po po/new.po >po/out.po
+
+# Get full context of local changes (HEAD vs working tree)
+git-po-helper compare po/XX.po -o po/out.po
+
+# Get full context of changes in a specific commit (parent vs commit)
+git-po-helper compare --commit <commit> po/XX.po -o po/out.po
+
+# Get full context of changes since a commit (commit vs working tree)
+git-po-helper compare --since <commit> po/XX.po -o po/out.po
+
+# Get full context between two commits
+git-po-helper compare -r <commit1>..<commit2> po/XX.po -o po/out.po
+
+# Get full context of two worktree files
+git-po-helper compare po/old.po po/new.po -o po/out.po
+```
+
+**Options summary**
+
+| Option              | Meaning                                        |
+|---------------------|------------------------------------------------|
+| (none)              | Compare HEAD with working tree (local changes) |
+| `--commit <commit>` | Compare parent of commit with the commit       |
+| `--since <commit>`  | Compare commit with working tree               |
+| `-r x..y`           | Compare revision x with revision y             |
+| `-r x..`            | Compare revision x with working tree           |
+| `-r x`              | Compare parent of x with x                     |
+
+Output is empty when there are no new or changed entries; otherwise it
+includes a valid PO header.
+
+
+#### Concatenating multiple PO/JSON files
+
+Use `git-po-helper msg-cat` to merge one or more input files (PO, POT, or
+gettext JSON) into a single output. Input format is auto-detected by content
+or extension. For duplicate `msgid`, the first occurrence by file order wins.
+Use `-o <file>` for output; omit or use `-o -` for stdout. Use `--json` for
+JSON output; otherwise output is PO format.
+
+```shell
+# Convert JSON to PO (e.g. after translation)
+git-po-helper msg-cat --unset-fuzzy -o po/out.po po/in.json
+
+# Merge multiple PO files
+git-po-helper msg-cat -o po/out.po po/in-1.po po/in-2.po
+```
+
+
+#### GETTEXT JSON format
+
+The **GETTEXT JSON** format is an internal format defined by `git-po-helper`
+for convenient batch processing of translation and related tasks by AI models.
+`git-po-helper msg-select`, `git-po-helper msg-cat`, and `git-po-helper compare`
+read and write this format.
+
+**Top-level structure**:
+
+```json
+{
+  "header_comment": "string",
+  "header_meta": "string",
+  "entries": [ /* array of entry objects */ ]
+}
+```
+
+| Field            | Description                                                                   |
+|------------------|-------------------------------------------------------------------------------|
+| `header_comment` | Lines above the first `msgid ""` (comments, glossary). Directly concatenated. |
+| `header_meta`    | Decoded `msgstr` of the header entry (Project-Id-Version, Plural-Forms, etc.).|
+| `entries`        | List of PO entries. Order matches source.                                     |
+
+**Entry object** (each element of `entries`):
+
+| Field           | Type     | Description                                           |
+|-----------------|----------|-------------------------------------------------------|
+| `msgid`         | string   | Singular message ID. PO escapes encoded.              |
+| `msgstr`        | string   | Singular message string. Empty for plural entries.    |
+| `msgid_plural`  | string   | Plural form of msgid. Omit for non-plural.            |
+| `msgstr_plural` | []string | Array of msgstr[0], msgstr[1], … Omit for non-plural. |
+| `comments`      | []string | Comment lines (`#`, `#.`, `#:`, `#,`, etc.).          |
+| `fuzzy`         | bool     | True if entry has fuzzy flag.                         |
+| `obsolete`      | bool     | True for `#~` obsolete entries. Omit if false.        |
+
+**Example (single-line entry)**:
+
+```json
+{
+  "header_comment": "# Glossary:\\n# term1\\tTranslation 1\\n#\\n",
+  "header_meta": "Project-Id-Version: git\\nContent-Type: text/plain; charset=UTF-8\\n",
+  "entries": [
+    {
+      "msgid": "Hello",
+      "msgstr": "你好",
+      "comments": ["#. Comment for translator\\n", "#: src/file.c:10\\n"],
+      "fuzzy": false
+    }
+  ]
+}
+```
+
+**Example (plural entry)**:
+
+```json
+{
+  "msgid": "One file",
+  "msgstr": "",
+  "msgid_plural": "%d files",
+  "msgstr_plural": ["一个文件", "%d 个文件"],
+  "comments": ["#, c-format\\n"],
+  "fuzzy": false
+}
+```
+
+**Example (fuzzy entry before translation)**:
+
+```json
+{
+  "msgid": "Old message",
+  "msgstr": "旧翻译",
+  "comments": ["#, fuzzy\\n"],
+  "fuzzy": true
+}
+```
+
+**Translation notes for GETTEXT JSON files**:
+
+- **Preserve structure**: Keep `header_comment`, `header_meta`, `comments`,
+  `msgid`, `msgid_plural` unchanged. Only modify `msgstr` and `msgstr_plural`.
+- **Fuzzy entries**: Entries extracted from fuzzy PO entries have `"fuzzy": true`.
+  After translating, **remove the `fuzzy` field** or set it to `false` in the
+  output (`po/l10n-done.json`). The merge step uses `--unset-fuzzy`, which can
+  also remove the `fuzzy` field.
+- **Placeholders**: Preserve `%s`, `%d`, etc. exactly; use `%n$` when
+  reordering (see "Placeholder Reordering" above).
+
+
+### Quality checklist
+
+- **Accuracy**: Faithful to original meaning; no omissions or distortions.
+- **Fuzzy entries**: Re-translate fully and clear the fuzzy flag (see
+  "Translating fuzzy entries" above).
+- **Terminology**: Consistent with glossary (see "Glossary Section" above) or
+  domain standards.
+- **Grammar and fluency**: Correct and natural in the target language.
+- **Placeholders**: Preserve variables (`%s`, `{name}`, `$1`) exactly; use
+  positional parameters when reordering (see "Placeholder Reordering" above).
+- **Special characters**: Preserve escape sequences (`\n`, `\"`, `\\`, `\t`),
+  placeholders, and quotes exactly as in `msgid`. Correct: `msgstr "行 1\n行 2"`
+  (keep `\n` as escape). Wrong: `"行 1\\n行 2"` or actual line breaks inside the
+  string. See "Preserving Special Characters" above.
+- **Plurals and gender**: Correct forms and agreement.
+- **Context fit**: Suitable for UI space, tone, and use (e.g. error vs. tooltip).
+- **Cultural appropriateness**: No offensive or ambiguous content.
+- **Consistency**: Match prior translations of the same source.
+- **Technical integrity**: Do not translate code, paths, commands, brands, or
+  proper nouns.
+- **Readability**: Clear, concise, and user-friendly.
+
+
 ## Housekeeping tasks for localization workflows

 This section describes housekeeping tasks listed in the introduction. Read
@@ -88,6 +470,265 @@ The command will handle all necessary steps including generating
 automatically.

+### Task 3: Translating po/XX.po
+
+When asked to translate `po/XX.po`, follow the steps below. The workflow
+**automatically selects** the tool based on availability: use `git-po-helper`
+if present, otherwise use gettext tools. With `git-po-helper`, the content to
+translate is converted to JSON, enabling batch translation instead of
+entry-by-entry translation for better efficiency. Translate every untranslated
+and fuzzy entry; do not stop before the loop completes.
+
+1. **Extract entries to translate**: Generate `po/l10n-pending.po` with
+   untranslated and fuzzy messages. If the generated `po/l10n-pending.po` file
+   is empty or does not exist, translation is complete. In that case, you
+   **MUST** skip to the last step (clean up); do not run further translation
+   steps.
+
+   ```shell
+   po_extract_pending () {
+       test $# -ge 1 || { echo "Usage: po_extract_pending <po-file>" >&2; exit 1; }
+       PO_FILE="$1"
+       PENDING="po/l10n-pending.po"
+       rm -f "$PENDING"
+
+       if command -v git-po-helper >/dev/null 2>&1
+       then
+           git-po-helper msg-select --untranslated --fuzzy --no-obsolete -o "$PENDING" "$PO_FILE"
+       else
+           msgattrib --untranslated --no-obsolete "$PO_FILE" >"${PENDING}.untranslated"
+           msgattrib --only-fuzzy --no-obsolete --clear-fuzzy --empty "$PO_FILE" >"${PENDING}.fuzzy"
+           msgattrib --only-fuzzy --no-obsolete "$PO_FILE" >"${PENDING}.fuzzy.reference"
+           msgcat --use-first "${PENDING}.untranslated" "${PENDING}.fuzzy" >"$PENDING"
+           rm -f "${PENDING}.untranslated" "${PENDING}.fuzzy"
+       fi
+   }
+   # Run the extraction. Example: po_extract_pending po/zh_CN.po
+   po_extract_pending po/XX.po
+   ```
+
+2. **Prepare one batch for translation**: **BEFORE translating**, run the
+   script below. It truncates large tasks so each run processes one chunk,
+   keeping file size within model capacity.
+
+   Output: `po/l10n-todo.json` (git-po-helper) or `po/l10n-todo.po` (gettext
+   only). If `po/l10n-todo.json` exists, go to step 3a; if `po/l10n-todo.po`
+   exists, go to step 3b.
+
+   ```shell
+   l10n_one_batch () {
+       test $# -ge 1 || { echo "Usage: l10n_one_batch <po-file> [min_batch_size]" >&2; exit 1; }
+       PO_FILE="$1"
+       min_batch_size=${2:-100}
+       PENDING="po/l10n-pending.po"
+       rm -f po/l10n-todo.json po/l10n-done.json po/l10n-todo.po po/l10n-done.po
+
+       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null || true)
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+           BATCHING=1
+       else
+           NUM=$ENTRY_COUNT
+           BATCHING=
+       fi
+
+       if command -v git-po-helper >/dev/null 2>&1
+       then
+           if test -n "$BATCHING"
+           then
+               git-po-helper msg-select --json --range "-$NUM" -o po/l10n-todo.json "$PENDING"
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               git-po-helper msg-select --json -o po/l10n-todo.json "$PENDING"
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       else
+           if test -n "$BATCHING"
+           then
+               awk -v num="$NUM" '/^msgid / && count++ > num {exit} 1' "$PENDING" |
+                   tac | awk '/^$/ {found=1} found' | tac >po/l10n-todo.po
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               cp "$PENDING" po/l10n-todo.po
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       fi
+   }
+   # Prepare batch for translation. Second param controls batch size; reduce if
+   # the batch file is too large for the Agent to process.
+   l10n_one_batch po/XX.po 100
+   ```
+
+3a. **Translate JSON batch** (`po/l10n-todo.json` → `po/l10n-done.json`):
+
+   - **Task**: Translate `po/l10n-todo.json` (input, GETTEXT JSON) into
+     `po/l10n-done.json` (output, GETTEXT JSON). See the "GETTEXT JSON format"
+     section above for format details and translation rules.
+   - **Reference glossary**: Read the glossary from the batch file's
+     `header_comment` (see "Glossary Section" above) and use it for
+     consistent terminology.
+   - **When translating**: Follow the "Quality checklist" above for correctness
+     and quality. Handle escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
+     and quotes correctly as in `msgid`. For JSON, correctly escape and unescape
+     these sequences when reading and writing. Modify `msgstr` and `msgstr[n]`
+     (for plural entries); clear the fuzzy flag (omit or set `fuzzy` to `false`).
+     Do **not** modify `msgid` or `msgid_plural`.
+
+3b. **Translate PO batch** (`po/l10n-todo.po` → `po/l10n-done.po`):
+
+   - **Task**: Translate `po/l10n-todo.po` (input, PO) into `po/l10n-done.po`
+     (output, PO).
+   - **Reference glossary**: Read the glossary from the pending file header
+     (see "Glossary Section" above) and use it for consistent terminology.
+   - **When translating**: Follow the "Quality checklist" above for correctness
+     and quality. Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
+     and quotes as in `msgid`. Modify `msgstr` and `msgstr[n]` (for plural
+     entries); remove the `#, fuzzy` tag from comments when done. Do **not**
+     modify `msgid` or `msgid_plural`.
+
+4. **Validate `po/l10n-done.po`**:
+
+   Whether from step 3a (JSON converted to PO) or step 3b (PO output directly),
+   the result may have two kinds of issues. Run the validation script; proceed to
+   step 5 only if it succeeds:
+
+   ```shell
+   l10n_validate_done () {
+       DONE_PO="po/l10n-done.po"
+       DONE_JSON="po/l10n-done.json"
+       PENDING="po/l10n-pending.po"
+
+       if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
+       then
+           git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
+               echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
+               return 1
+           }
+       fi
+
+       # Check 1: msgid should not be modified
+       MSGID_OUT=$(git-po-helper compare -q --msgid --assert-no-changes \
+           "$PENDING" "$DONE_PO" 2>&1)
+       MSGID_RC=$?
+       if test $MSGID_RC -ne 0 || test -n "$MSGID_OUT"
+       then
+           echo "ERROR [msgid modified]: The following entries appeared after" >&2
+           echo "translation because msgid was altered. Fix in $DONE_PO." >&2
+           echo "$MSGID_OUT" >&2
+           return 1
+       fi
+
+       # Check 2: PO format (see "Validating PO File Format" for error handling)
+       MSGFMT_OUT=$(msgfmt --check -o /dev/null "$DONE_PO" 2>&1)
+       MSGFMT_RC=$?
+       if test $MSGFMT_RC -ne 0
+       then
+           echo "ERROR [PO format]: Fix errors in $DONE_PO." >&2
+           echo "$MSGFMT_OUT" >&2
+           return 1
+       fi
+
+       echo "Validation passed."
+   }
+   l10n_validate_done
+   ```
+
+   If the script fails, fix **directly in `po/l10n-done.po`**. Editing
+   `po/l10n-done.json` is not recommended because it adds an extra JSON-to-PO
+   conversion step. Use the error message to decide:
+
+   - **`[msgid modified]`**: The listed entries have altered `msgid`; restore
+     them to match `po/l10n-pending.po`.
+   - **`[PO format]`**: `msgfmt` reports line numbers; fix the errors in place.
+     See "Validating PO File Format" for common issues.
+
+   Re-run `l10n_validate_done` until it succeeds. If repair fails, exit
+   immediately.
+
+5. **Merge translation results into `po/XX.po`**: Run the following script:
+
+   ```shell
+   l10n_merge_batch () {
+       test $# -ge 1 || { echo "Usage: l10n_merge_batch <po-file>" >&2; exit 1; }
+       PO_FILE="$1"
+       DONE_PO="po/l10n-done.po"
+       DONE_JSON="po/l10n-done.json"
+       MERGED="po/l10n-done.merged"
+       PENDING="po/l10n-pending.po"
+       if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
+       then
+           git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
+               echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
+               return 1
+           }
+       fi
+       msgcat --use-first "$DONE_PO" "$PO_FILE" >"$MERGED" || {
+           echo "ERROR [msgcat merge]: Fix errors in $DONE_PO and re-run." >&2
+           exit 1
+       }
+       mv "$MERGED" "$PO_FILE"
+       rm -f "$PENDING"
+   }
+   # Run the merge. Example: l10n_merge_batch po/zh_CN.po
+   l10n_merge_batch po/XX.po
+   ```
+
+   If `msgcat` fails, fix **directly in `po/l10n-done.po`**. Editing
+   `po/l10n-done.json` is not recommended because it adds an extra JSON-to-PO
+   conversion step. If repair fails, exit immediately.
+
+6. **Repeat steps 1–5** until `po/l10n-pending.po` is empty (or does not exist).
+   Do not stop early.
+
+7. **Final verification**:
+
+   ```shell
+   # Final check
+   UNTRANS=$(msgattrib --untranslated --no-obsolete po/XX.po 2>/dev/null | grep -c '^msgid ' || true)
+   UNTRANS=$((UNTRANS > 0 ? UNTRANS - 1 : 0))
+   FUZZY=$(msgattrib --only-fuzzy --no-obsolete po/XX.po 2>/dev/null | grep -c '^msgid ' || true)
+   FUZZY=$((FUZZY > 0 ? FUZZY - 1 : 0))
+   if test "$UNTRANS" -eq 0 && test "$FUZZY" -eq 0
+   then
+       echo "Translation complete! All entries translated."
+   else
+       echo "WARNING: Still have $UNTRANS untranslated + $FUZZY fuzzy entries."
+       echo "Do not clean up. Continue with step 1."
+       exit 1
+   fi
+   ```
+
+8. **Clean up** (only after step 7 passes):
+
+   ```shell
+   po_cleanup () {
+       rm -f "po/l10n-pending.po"
+       rm -f "po/l10n-pending.po.fuzzy"
+       rm -f "po/l10n-pending.po.fuzzy.reference"
+       rm -f "po/l10n-pending.po.untranslated"
+       rm -f "po/l10n-todo.json"
+       rm -f "po/l10n-todo.po"
+       rm -f "po/l10n-done.json"
+       rm -f "po/l10n-done.merged"
+       rm -f "po/l10n-done.po"
+       echo "Cleanup complete. Translation finished successfully."
+   }
+   # Run cleanup
+   po_cleanup
+   ```
+
+
 ## Human translators remain in control

 Git translation is human-driven; language team leaders and contributors are
--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
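
The batch sizing inside `l10n_one_batch` in the patch above can be read in isolation. The following is only an illustrative sketch: `pick_batch_size` is a hypothetical helper name that does not appear in the patch, and it simply mirrors the thresholds shown there (more than 8x, more than 4x, or more than 1x the minimum batch size).

```shell
# Hypothetical helper (not part of the patch): mirrors the thresholds that
# l10n_one_batch uses to decide how many entries go into one batch.
pick_batch_size () {
    entry_count=$1
    min_batch_size=${2:-100}
    if test "$entry_count" -gt $((min_batch_size * 8))
    then
        # Very large backlog: double the minimum batch size.
        echo $((min_batch_size * 2))
    elif test "$entry_count" -gt $((min_batch_size * 4))
    then
        # Large backlog: one and a half times the minimum.
        echo $((min_batch_size + min_batch_size / 2))
    elif test "$entry_count" -gt "$min_batch_size"
    then
        echo "$min_batch_size"
    else
        # Small enough to process everything at once.
        echo "$entry_count"
    fi
}

pick_batch_size 1000 100   # more than 8x the minimum
pick_batch_size 500 100    # more than 4x the minimum
pick_batch_size 150 100    # more than the minimum
pick_batch_size 80 100     # fits in a single batch
```

With a minimum of 100, the four calls print 200, 150, 100 and 80, so larger backlogs are worked off in larger chunks while small ones are handled in a single pass.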
* Re: [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files
2026-03-03 15:33 ` [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
@ 2026-03-12 2:26 ` Jiang Xin
0 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-12 2:26 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

On Tue, Mar 3, 2026 at 11:34 PM Jiang Xin <worldhello.net@gmail.com> wrote:
> +#### GETTEXT JSON format
> +
> +The **GETTEXT JSON** format is an internal format defined by `git-po-helper`
> +for convenient batch processing of translation and related tasks by AI models.
> +`git-po-helper msg-select`, `git-po-helper msg-cat`, and `git-po-helper compare`
> +read and write this format.
> +
> +**Top-level structure**:
> +
> +```json
> +{
> +  "header_comment": "string",
> +  "header_meta": "string",
> +  "entries": [ /* array of entry objects */ ]
> +}
> +```
> +
> +| Field            | Description                                                                   |
> +|------------------|-------------------------------------------------------------------------------|
> +| `header_comment` | Lines above the first `msgid ""` (comments, glossary). Directly concatenated. |
> +| `header_meta`    | Decoded `msgstr` of the header entry (Project-Id-Version, Plural-Forms, etc.).|
> +| `entries`        | List of PO entries. Order matches source.                                     |
> +
> +**Entry object** (each element of `entries`):
> +
> +| Field           | Type     | Description                                           |
> +|-----------------|----------|-------------------------------------------------------|
> +| `msgid`         | string   | Singular message ID. PO escapes encoded.              |
> +| `msgstr`        | string   | Singular message string. Empty for plural entries.    |
> +| `msgid_plural`  | string   | Plural form of msgid. Omit for non-plural.            |
> +| `msgstr_plural` | []string | Array of msgstr[0], msgstr[1], … Omit for non-plural. |
> +| `comments`      | []string | Comment lines (`#`, `#.`, `#:`, `#,`, etc.).          |
> +| `fuzzy`         | bool     | True if entry has fuzzy flag.                         |
> +| `obsolete`      | bool     | True for `#~` obsolete entries. Omit if false.        |

Having both msgstr (string) and msgstr_plural (string array) is redundant
and increases the risk of model generation errors. To resolve this, v3 will
unify all translations into a single msgstr array:

- Single element: Represents the singular form (equivalent to PO msgstr
  or msgstr[0]).
- Multiple elements: Represent plural forms in sequential order
  (msgstr[0], msgstr[1], …).

> +
> +**Example (single-line entry)**:
> +
> +```json
> +{
> +  "header_comment": "# Glossary:\\n# term1\\tTranslation 1\\n#\\n",
> +  "header_meta": "Project-Id-Version: git\\nContent-Type: text/plain; charset=UTF-8\\n",
> +  "entries": [
> +    {
> +      "msgid": "Hello",
> +      "msgstr": "你好",

      "msgstr": ["你好"],

> +**Example (plural entry)**:
> +
> +```json
> +{
> +  "msgid": "One file",
> +  "msgstr": "",
> +  "msgid_plural": "%d files",
> +  "msgstr_plural": ["一个文件", "%d 个文件"],

  "msgstr": ["一个文件", "%d 个文件"],

> +6. **Repeat steps 1–5** until `po/l10n-pending.po` is empty (or does not exist).
> +   Do not stop early.
> +
> +7. **Final verification**:

Some LLMs occasionally fail to follow this instruction and go straight from
step 6 to step 7 without repeating the loop. This can be resolved by
renaming the step 7 title to "7. **Only after loop exits**".

^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
` (3 preceding siblings ...)
2026-03-03 15:33 ` [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
@ 2026-03-03 15:33 ` Jiang Xin
2026-03-12 2:34 ` Jiang Xin
4 siblings, 1 reply; 42+ messages in thread
From: Jiang Xin @ 2026-03-03 15:33 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides
comprehensive guidance for AI agents to review translation files.

Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming. To address
this, git-po-helper implements a compare subcommand that extracts new
or modified translations with full context (complete msgid/msgstr
pairs), significantly improving review efficiency.

A limitation is that extracted content lacks other already translated
content for reference, which may affect terminology consistency. This
is mitigated by including a glossary in the PO file header.
git-po-helper-generated review files include the header entry and
glossary (if present) by default.

The review workflow leverages git-po-helper subcommands:

- git-po-helper compare: Extract new or changed entries between two PO
  file versions into a valid PO file for review. Supports multiple
  modes:

  * Compare HEAD with working tree (local changes)
  * Compare parent of commit with the commit (--commit)
  * Compare commit with working tree (--since)
  * Compare two arbitrary revisions (-r)

- git-po-helper msg-select: Split large review files into smaller
  batches by entry index range for manageable review sessions. Supports
  range formats like "-50" (first 50), "51-100", "101-" (to end).

Evaluation test using qwen model:

    git-po-helper agent-run review --commit 2000abefba --agent qwen

Benchmark results:

    | Metric           | Value                            |
    |------------------|----------------------------------|
    | Num turns        | 22                               |
    | Input tokens     | 537263                           |
    | Output tokens    | 4397                             |
    | API duration     | 167.84 s                         |
    | Review score     | 96/100                           |
    | Total entries    | 63                               |
    | With issues      | 4 (1 critical, 2 major, 1 minor) |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 194 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 193 insertions(+), 1 deletion(-)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 3bb8fb3858..08be73ada5 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
 1. Generating or updating po/git.pot
 2. Updating po/XX.po
 3. Translating po/XX.po
+4. Reviewing translation quality

 ## Background knowledge for localization workflows

@@ -729,6 +730,191 @@ and fuzzy entry; do not stop before the loop completes.
    ```

+### Task 4: Review translation quality
+
+Review may target the full `po/XX.po`, a specific commit, or changes since a
+commit. When asked to review, follow the steps below. **Note**: This task uses
+`git-po-helper compare`; if `git-po-helper` is not available, the task
+cannot be performed.
+
+1. **Check for existing review**: Evaluate the following in order:
+
+   - If `po/review-input.po` does **not** exist, proceed to step 2 regardless
+     of any other files (e.g., batch or JSON files).
+   - If both `po/review-input.po` and `po/review-result.json` exist, go
+     directly to step 5 (Merge and summary) and display the report.
+     Do **not** check for batch or other temporary files; no further review
+     steps are needed.
+   - If `po/review-input.po` exists but `po/review-result.json` does not,
+     go to step 4 (Process one batch) to continue the previous review.
+
+2. **Extract entries**: Run `git-po-helper compare` with the desired range and
+   redirect the output to `po/review-input.po`. Do not use `git show` or
+   `git diff`—they can fragment or lose PO context (see "Comparing PO files
+   for translation and review" under git-po-helper).
+
+3. **Prepare review batches**: Run the script below to clean up any leftover
+   files from previous reviews and split `po/review-input.po` into one or
+   more `po/review-input-<N>.json` files (dynamic batch sizing). Run as a
+   single script (define the function, then call it):
+
+   ```shell
+   review_split_batches () {
+       min_batch_size=${1:-50}
+       rm -f po/review-input-*.json
+       rm -f po/review-result-*.json
+       rm -f po/review-result.json
+       rm -f po/review-output.po
+
+       ENTRY_COUNT=$(grep -c '^msgid ' po/review-input.po 2>/dev/null || true)
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+           BATCH_COUNT=$(( (ENTRY_COUNT + NUM - 1) / NUM ))
+           for i in $(seq 1 "$BATCH_COUNT")
+           do
+               START=$(((i - 1) * NUM + 1))
+               END=$((i * NUM))
+               if test "$END" -gt "$ENTRY_COUNT"
+               then
+                   END=$ENTRY_COUNT
+               fi
+               if test "$i" -eq 1
+               then
+                   git-po-helper msg-select --json --range "-$NUM" \
+                       -o "po/review-input-$i.json" po/review-input.po
+               elif test "$END" -ge "$ENTRY_COUNT"
+               then
+                   git-po-helper msg-select --json --range "$START-" \
+                       -o "po/review-input-$i.json" po/review-input.po
+               else
+                   git-po-helper msg-select --json --range "$START-$END" \
+                       -o "po/review-input-$i.json" po/review-input.po
+               fi
+           done
+       else
+           git-po-helper msg-cat --json \
+               -o po/review-input-1.json po/review-input.po
+       fi
+   }
+   # Parameter controls batch size; reduce if the batch file is too large for
+   # the Agent to process.
+   review_split_batches 20
+   ```
+
+4. **Process one batch (repeat until none left)**:
+
+   a. If no `po/review-input-*.json` files exist, proceed to step 5.
+
+   b. Select the smallest remaining index N (e.g. `po/review-input-1.json`).
+      The current batch is `po/review-input-<N>.json`.
+
+   c. Review translation quality in the current batch: Read the current
+      batch file (`po/review-input-<N>.json`) and:
+      - Consult the "Background knowledge for localization workflows" section
+        for PO format, JSON format, placeholder rules, and terminology. If the
+        current batch file has a glossary in the `header_comment` field, add
+        it to your context for consistent terminology.
+      - Do not review or modify the header entry (in PO format: empty `msgid`
+        with metadata in `msgstr`; in JSON format: `header_comment` and
+        `header_meta`).
+      - For all other entries, check the quality of translations in `msgstr`
+        (singular form) and `msgstr_plural` (plural forms) against `msgid` and
+        `msgid_plural`. See the "Quality checklist" above for criteria.
+
+   d. After reviewing all entries in the current batch, write the issues you
+      found to `po/review-result-<N>.json` using the format described in the
+      "Review result JSON format" section below. If no issues found, write
+      `{"issues": []}` to `po/review-result-<N>.json`. Always write this file;
+      it marks the batch as complete.
+
+   e. Delete the current batch file (`po/review-input-<N>.json`).
+
+   f. Return to step 4a.
+
+   This loop is resumable: remaining `po/review-input-*.json` files indicate
+   batches still to process.
+
+5. **Merge and summary**: Run the command below to merge all
+   `po/review-result-*.json` files into `po/review-result.json`, apply the
+   result to `po/review-output.po`, and display the report.
+
+   ```shell
+   git-po-helper agent-run report
+   ```
+
+   **Do not delete** `po/review-result.json`, `po/review-output.po`, or
+   `po/review-input.po`.
+
+**Review result JSON format**:
+
+The **Review result JSON** format defines the structure for translation
+review reports. For each entry with translation issues, create an issue
+object as follows:
+
+- Copy the original entry's `msgid`, `msgstr`, `msgid_plural` and
+  `msgstr_plural` (if present) to the corresponding fields in the
+  result issue object.
+- Write a summary of all issues found for this entry in `description`.
+- Set `score` according to the severity of issues found for this entry,
+  from 0 to 3 (3 = perfect, no issues; 0 = critical, 1 = major, 2 = minor).
+- Place the suggested translation in `suggest_msgstr` (singular) or
+  `suggest_msgstr_plural` (plural).
+- Include only entries with issues (score less than 3). When no issues
+  are found in the batch, write `{"issues": []}`.
+
+Example review result (with issues):
+
+```json
+{
+  "issues": [
+    {
+      "msgid": "commit",
+      "msgid_plural": "",
+      "msgstr": "委托",
+      "msgstr_plural": [],
+      "suggest_msgstr": "提交",
+      "suggest_msgstr_plural": [],
+      "score": 0,
+      "description": "Terminology error: 'commit' should be translated as '提交'"
+    },
+    {
+      "msgid": "repository",
+      "msgid_plural": "repositories",
+      "msgstr": "",
+      "msgstr_plural": ["版本库", "版本库"],
+      "suggest_msgstr": "",
+      "suggest_msgstr_plural": ["仓库", "仓库"],
+      "score": 2,
+      "description": "Consistency issue: '版本库' and '仓库' are used interchangeably; suggest using '仓库' consistently"
+    }
+  ]
+}
+```
+
+Field descriptions for each issue object (element of the `issues` array):
+
+- `msgid` (and `msgid_plural` for plural entries): Original source text.
+- `msgstr` (and `msgstr_plural` for plural entries): Original translation.
+- `suggest_msgstr`: Suggested translation for the singular form.
+- `suggest_msgstr_plural`: Array of suggested translations for plural forms;
+  `suggest_msgstr` is empty for plural-only entries.
+- `score`: 0–3 (see scale below).
+- `description`: Brief summary of the issue.
+- Score scale: 0 = critical (must fix before release), 1 = major (should fix),
+  2 = minor (improve later), 3 = perfect.
+
+
 ## Human translators remain in control

 Git translation is human-driven; language team leaders and contributors are
@@ -741,7 +927,13 @@ responsible for:
 - Building and maintaining language glossaries
 - Reviewing and approving all changes before submission

-AI tools, if used, only accelerate routine tasks.
+AI tools, if used, only accelerate routine tasks:
+
+- First-draft translations for new or updated messages
+- Finding untranslated or fuzzy entries
+- Checking consistency with glossary and existing translations
+- Detecting technical errors (placeholders, formatting)
+- Reviewing against quality criteria

 AI-generated output should always be treated as rough drafts requiring human
 review, editing, and approval by someone who understands both the technical
--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
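
The range arithmetic in `review_split_batches` above is easier to follow when separated from the `git-po-helper` calls. The sketch below is illustrative only: `review_ranges` is a hypothetical helper that is not part of the patch, it assumes the entry count is larger than the batch size (the branch in which the patch splits at all), and it only prints the `--range` arguments that would be passed to `msg-select`.

```shell
# Hypothetical helper (not part of the patch): print the --range argument
# for each batch, given the number of entries and the batch size.
review_ranges () {
    entry_count=$1
    num=$2
    # Number of batches, rounding up.
    batch_count=$(( (entry_count + num - 1) / num ))
    i=1
    while test "$i" -le "$batch_count"
    do
        start=$(( (i - 1) * num + 1 ))
        end=$(( i * num ))
        if test "$i" -eq 1
        then
            # First batch: "first N entries" form.
            echo "-$num"
        elif test "$end" -ge "$entry_count"
        then
            # Last batch: open-ended range up to the final entry.
            echo "$start-"
        else
            echo "$start-$end"
        fi
        i=$((i + 1))
    done
}

review_ranges 130 50
```

For 130 entries and a batch size of 50 this prints `-50`, `51-100` and `101-`, the same three range forms described for `msg-select --range` earlier in the series.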
* Re: [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations
2026-03-03 15:33 ` [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
@ 2026-03-12 2:34 ` Jiang Xin
0 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-12 2:34 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

On Tue, Mar 3, 2026 at 11:34 PM Jiang Xin <worldhello.net@gmail.com> wrote:
> +### Task 4: Review translation quality
> +
> +Review may target the full `po/XX.po`, a specific commit, or changes since a
> +commit. When asked to review, follow the steps below. **Note**: This task uses
> +`git-po-helper compare`; if `git-po-helper` is not available, the task
> +cannot be performed.
> +
> +1. **Check for existing review**: Evaluate the following in order:
> +
> +   - If `po/review-input.po` does **not** exist, proceed to step 2 regardless
> +     of any other files (e.g., batch or JSON files).
> +   - If both `po/review-input.po` and `po/review-result.json` exist, go
> +     directly to step 5 (Merge and summary) and display the report.
> +     Do **not** check for batch or other temporary files; no further review
> +     steps are needed.
> +   - If `po/review-input.po` exists but `po/review-result.json` does not,
> +     go to step 4 (Process one batch) to continue the previous review.
> +
> +2. **Extract entries**: Run `git-po-helper compare` with the desired range and
> +   redirect the output to `po/review-input.po`. Do not use `git show` or
> +   `git diff`—they can fragment or lose PO context (see "Comparing PO files
> +   for translation and review" under git-po-helper).
> +
> +3. **Prepare review batches**: Run the script below to clean up any leftover
> +   files from previous reviews and split `po/review-input.po` into one or
> +   more `po/review-input-<N>.json` files (dynamic batch sizing). Run as a
> +   single script (define the function, then call it):

In the v3 reroll, we will adopt a solution that is easier for the model
to understand: using a single pending file to record data awaiting
review, rather than splitting review tasks across multiple files and
requiring the model to select among them.

> +                   git-po-helper msg-select --json --range "-$NUM" \
> +                       -o "po/review-input-$i.json" po/review-input.po

There are trailing spaces in this file that are breaking CI. This will
be fixed in the v3 reroll.

^ permalink raw reply [flat|nested] 42+ messages in thread
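
A trailing-whitespace problem like the one mentioned in the reply above can be spotted locally before a reroll is sent. The following is a minimal sketch, assuming a POSIX `grep` and `mktemp`; the temporary file and its contents are made up for illustration and stand in for `po/AGENTS.md`, and the project's actual CI check may work differently.

```shell
# Stand-in file (hypothetical content) with one line that ends in a space.
sample=$(mktemp)
printf 'clean line\ntrailing space \nanother clean line\n' >"$sample"

# Print "line-number:content" for every line that ends in whitespace.
trailing=$(grep -n '[[:space:]]$' "$sample")
echo "$trailing"

rm -f "$sample"
```

Only line 2 of the sample is reported. For changes that are not yet committed, `git diff --check` reports whitespace errors of this kind as well.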
* [PATCH v3 0/5] docs(l10n): AI agent instructions and workflow improvements
2026-02-06 2:38 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
@ 2026-03-14 14:38 ` Jiang Xin
2026-03-14 14:38 ` [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
` (4 more replies)
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2 siblings, 5 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-14 14:38 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

## Changes since v2

- CI: Fix trailing spaces that caused check-whitespace to fail.

- Size: Trim po/AGENTS.md by removing unnecessary examples and
  compressing wording so the file stays smaller and easier to follow.

- GETTEXT JSON: Use a single `msgstr` array for all translations
  instead of separate `msgstr` (string) and `msgstr_plural` (array).
  One element means singular; multiple elements mean plural forms in
  order. This reduces redundancy and model errors.

- Workflow: Rename the final step of the translation workflow to "Only
  after loop exits" so agents do not jump from the merge step to the
  final step before the loop has exited. Simplify the review flow by
  using one pending file for entries awaiting review instead of
  multiple review-input-<N>.json files and file-selection logic.

## Introduction

This series introduces AI agent instructions for Git localization
(l10n) workflows to help localization contributors quickly complete
drafts and use AI to check translation quality. The changes focus on:

1. Separating agent-specific documentation into po/AGENTS.md for
   targeted optimization of AI-assisted workflows
2. Providing step-by-step instructions for update-pot, update-po,
   translation, and review tasks
3. Simplifying location filtering for PO file commits via
   .gitattributes

AI-assisted translation is optional; many successful l10n teams work
well without it. When used, AI output serves as reference only; human
contributors must review and approve it before submission.

## Performance summary

Benchmarks use the Qwen model via git-po-helper. The improvements
reduce API costs and make agent workflows more efficient while
maintaining human oversight of translation quality.

| Task        | Before              | After                    | Improvement                           |
|-------------|---------------------|--------------------------|---------------------------------------|
| update-pot  | 17 turns, 34s       | 3 turns, 8s (range 3–3)  | -82% turns, -76% time                 |
| update-po   | 22 turns, 38s       | 4 turns, 9s (3–9, 7–14s) | -82% turns, -76% time                 |
| translate   | 86 turns, ~21m      | 56 turns, ~19m           | -35% turns (git-po-helper JSON batch) |
| review      | N/A                 | 22 turns (63 entries)    |                                       |

## Testing

All changes have been evaluated with the Qwen model via git-po-helper
agent-test and agent-run. The po/AGENTS.md instructions are designed to
work with coding tools that support file references (e.g., "Translate
po/zh_CN.po by referring to @po/AGENTS.md").

## Changes

Jiang Xin (5):
  l10n: add .gitattributes to simplify location filtering
  docs(l10n): add AGENTS.md with optimized update-pot instructions
  docs(l10n): add AI agent instructions for updating po/XX.po files
  docs(l10n): add AI agent instructions for translating PO files
  docs(l10n): add AI agent instructions to review translations

 po/.gitattributes |  36 ++
 po/AGENTS.md      | 872 ++++++++++++++++++++++++++++++++++++++++++++++
 po/README.md      |  70 ++--
 3 files changed, 946 insertions(+), 32 deletions(-)
 create mode 100644 po/.gitattributes
 create mode 100644 po/AGENTS.md

--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Jiang Xin @ 2026-03-14 14:38 UTC
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

To simplify the location filtering process for l10n contributors when
committing po/XX.po files, add the filter attributes for selected PO
files to the repository. This ensures all contributors automatically get
the same filter configuration without manual setup in
.git/info/attributes.

The filter attribute is only applied to specific PO files that have been
properly prepared. Files without the filter attribute fall into two
categories:

- Legacy files that lack maintenance and still contain location
  comments that have not been cleaned up

- Files that are already location-less but whose formatting (e.g.,
  line wrapping style) differs from the output of msgcat processing

To avoid discrepancies between the filtered blob in the index and the
unfiltered working tree for these files, the filter attribute is not
applied to them.

Contributors still need to manually define the filter drivers using
git-config as documented in po/README.md.

Additionally, po/README.md has been reorganized: the content of handling
location-less PO file content has been moved from the "Updating a XX.po
file" section to a separate "Preparing a XX.po file for commit" section.
This prevents AI agents from introducing unrelated operations when
updating PO files.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/.gitattributes | 36 ++++++++++++++++++++++++
 po/README.md      | 70 +++++++++++++++++++++++++----------------------
 2 files changed, 74 insertions(+), 32 deletions(-)
 create mode 100644 po/.gitattributes

diff --git a/po/.gitattributes b/po/.gitattributes
new file mode 100644
index 0000000000..7b4c1cd9df
--- /dev/null
+++ b/po/.gitattributes
@@ -0,0 +1,36 @@
+# Git Attributes for PO Files
+#
+# This file configures Git filters to automatically strip location information
+# from PO files when committing, producing cleaner diffs and saving repository
+# space.
+#
+# Two filter types are used:
+# 1. gettext-no-file-no-location: Strips both filenames and line numbers
+#    (e.g., removes "#: main.c:123" entirely)
+# 2. gettext-no-location: Preserves filenames but removes line numbers, which
+#    requires gettext 0.20 or higher
+#    (e.g., "#: main.c:123" becomes "#: main.c")
+#
+# See `po/README.md` for instructions on setting up the required filter drivers.
+
+# Do not apply these filters to all `*.po` files via a wildcard. For legacy,
+# unmaintained PO files, that would cause discrepancies between the filtered
+# blob in the index and the unfiltered file in the working tree.
+
+# Languages that strip both filenames and line numbers
+bg.po filter=gettext-no-file-no-location
+de.po filter=gettext-no-file-no-location
+#es.po filter=gettext-no-file-no-location
+fr.po filter=gettext-no-file-no-location
+#ga.po filter=gettext-no-file-no-location
+#ru.po filter=gettext-no-file-no-location
+sv.po filter=gettext-no-file-no-location
+tr.po filter=gettext-no-file-no-location
+uk.po filter=gettext-no-file-no-location
+vi.po filter=gettext-no-file-no-location
+
+# Languages that preserve filenames but strip line numbers
+#ca.po filter=gettext-no-location
+id.po filter=gettext-no-location
+zh_CN.po filter=gettext-no-location
+zh_TW.po filter=gettext-no-location
diff --git a/po/README.md b/po/README.md
index ec08aa24ad..e358371255 100644
--- a/po/README.md
+++ b/po/README.md
@@ -159,38 +159,6 @@ It will:
 and these location lines will help translation tools to locate
 translation context easily.
 
-Once you are done testing the translation (see below), it's better
-to commit a location-less "po/XX.po" file to save repository space
-and make a user-friendly patch for review.
-
-To save a location-less "po/XX.po" automatically in repository, you
-can:
-
-First define a new attribute for "po/XX.po" by appending the following
-line in ".git/info/attributes":
-
-```
-/po/XX.po filter=gettext-no-location
-```
-
-Then define the driver for the "gettext-no-location" clean filter to
-strip out both filenames and locations from the contents as follows:
-
-```shell
-git config --global filter.gettext-no-location.clean \
-           "msgcat --no-location -"
-```
-
-For users who have gettext version 0.20 or higher, it is also possible
-to define a clean filter to preserve filenames but not locations:
-
-```shell
-git config --global filter.gettext-no-location.clean \
-           "msgcat --add-location=file -"
-```
-
-You're now ready to ask the l10n coordinator to pull from you.
-
 
 ## Fuzzy translation
 
@@ -229,6 +197,44 @@ git-po-helper check-commits <rev-list-opts>
 ```
 
 
+## Preparing a "XX.po" file for commit
+
+Once you are done testing the translation, it's better to commit a
+location-less "po/XX.po" file to save repository space and make a
+user-friendly patch for review.
+
+To save a location-less "po/XX.po" automatically in the repository,
+follow these steps:
+
+First, check which filter is configured for your "po/XX.po" file:
+
+```
+git check-attr filter po/XX.po
+```
+
+The filter configuration is defined in the "po/.gitattributes" file.
+
+Then define the driver for the filter. Most languages use the
+"gettext-no-file-no-location" clean filter, which strips out both filenames and
+line numbers from location comments. To set this up, run the following command:
+
+```shell
+git config --global filter.gettext-no-file-no-location.clean \
+           "msgcat --no-location -"
+```
+
+Some languages use the "gettext-no-location" clean filter, which preserves
+filenames but not line numbers. For these, install gettext version 0.20 or
+higher and set up the driver as follows:
+
+```shell
+git config --global filter.gettext-no-location.clean \
+           "msgcat --add-location=file -"
+```
+
+You're now ready to ask the l10n coordinator to pull from you.
+
+
 ## Marking strings for translation
 
 (This is done by the core developers).
-- 
2.53.0.rc2.20.g532543fa46

* Re: [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Johannes Sixt @ 2026-03-15 11:13 UTC
To: Jiang Xin
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow, Jean-Noël Avila,
    Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting, Emir SARI,
    Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan,
    Junio C Hamano, Git List

On 14.03.26 at 15:38, Jiang Xin wrote:
> +# Two filter types are used:
> +# 1. gettext-no-file-no-location: Strips both filenames and line numbers
> +#    (e.g., removes "#: main.c:123" entirely)
> +# 2. gettext-no-location: Preserves filenames but removes line numbers, which
> +#    requires gettext 0.20 or higher
> +#    (e.g., "#: main.c:123" becomes "#: main.c")
> +#
> +# See `po/README.md` for instructions on setting up the required filter drivers.
> +
> +# Do not apply these filters to all `*.po` files via a wildcard. For legacy,
> +# unmaintained PO files, that would cause discrepancies between the filtered
> +# blob in the index and the unfiltered file in the working tree.
> +
> +# Languages that strip both filenames and line numbers
> +bg.po filter=gettext-no-file-no-location
> +de.po filter=gettext-no-file-no-location
> +#es.po filter=gettext-no-file-no-location
> +fr.po filter=gettext-no-file-no-location
> +#ga.po filter=gettext-no-file-no-location
> +#ru.po filter=gettext-no-file-no-location
> +sv.po filter=gettext-no-file-no-location
> +tr.po filter=gettext-no-file-no-location
> +uk.po filter=gettext-no-file-no-location
> +vi.po filter=gettext-no-file-no-location
> +
> +# Languages that preserve filenames but strip line numbers
> +#ca.po filter=gettext-no-location
> +id.po filter=gettext-no-location
> +zh_CN.po filter=gettext-no-location
> +zh_TW.po filter=gettext-no-location

How settled is the use of these two different filters (and names) in the
community of translators? I am asking because I'm about to align the
translation workflow in the Gitk repository with that in the Git
repository. I need to know which of the two variants of filter names I
should ask translators to use.

-- Hannes

* Re: [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Junio C Hamano @ 2026-03-15 16:11 UTC
To: Johannes Sixt
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan,
    Git List

Johannes Sixt <j6t@kdbg.org> writes:

>> +# Languages that strip both filenames and line numbers
>> +bg.po filter=gettext-no-file-no-location
>> +de.po filter=gettext-no-file-no-location
>> +#es.po filter=gettext-no-file-no-location
>> +fr.po filter=gettext-no-file-no-location
>> +#ga.po filter=gettext-no-file-no-location
>> +#ru.po filter=gettext-no-file-no-location
>> +sv.po filter=gettext-no-file-no-location
>> +tr.po filter=gettext-no-file-no-location
>> +uk.po filter=gettext-no-file-no-location
>> +vi.po filter=gettext-no-file-no-location
>> +
>> +# Languages that preserve filenames but strip line numbers
>> +#ca.po filter=gettext-no-location
>> +id.po filter=gettext-no-location
>> +zh_CN.po filter=gettext-no-location
>> +zh_TW.po filter=gettext-no-location
>
> How settled is the use of these two different filters (and names) in the
> community of translators? I am asking because I'm about to align the
> translation workflow in the Gitk repository with that in the Git
> repository. I need to know which of the two variants of filter names I
> should ask translators to use.

I too am curious.

I would imagine that the translation target language has nothing to
do with the choice, and it would be mere personal preference---in
which case it would be better if people can converge on a single
convention fast and stick to it. After all, even if the current
French translators happen to prefer no-file no-location, for
example, existing translators would graduate the project and new
ones would come in, and their preference would change over time.

At least comments like "Languages that strip" is misleading, if this
is just "personal preferences of l10n groups of various languages".

Thanks.

* Re: [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Jiang Xin @ 2026-03-16 5:44 UTC
To: Junio C Hamano
Cc: Johannes Sixt, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan,
    Git List

On Mon, Mar 16, 2026 at 12:11 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Johannes Sixt <j6t@kdbg.org> writes:
>
> >> +# Languages that strip both filenames and line numbers
> >> +bg.po filter=gettext-no-file-no-location
> >> +de.po filter=gettext-no-file-no-location
> >> +#es.po filter=gettext-no-file-no-location
> >> +fr.po filter=gettext-no-file-no-location
> >> +#ga.po filter=gettext-no-file-no-location
> >> +#ru.po filter=gettext-no-file-no-location
> >> +sv.po filter=gettext-no-file-no-location
> >> +tr.po filter=gettext-no-file-no-location
> >> +uk.po filter=gettext-no-file-no-location
> >> +vi.po filter=gettext-no-file-no-location
> >> +
> >> +# Languages that preserve filenames but strip line numbers
> >> +#ca.po filter=gettext-no-location
> >> +id.po filter=gettext-no-location
> >> +zh_CN.po filter=gettext-no-location
> >> +zh_TW.po filter=gettext-no-location
> >
> > How settled is the use of these two different filters (and names) in the
> > community of translators? I am asking because I'm about to align the
> > translation workflow in the Gitk repository with that in the Git
> > repository. I need to know which of the two variants of filter names I
> > should ask translators to use.
>
> I too am curious.
>
> I would imagine that the translation target language has nothing to
> do with the choice, and it would be mere personal preference---in
> which case it would be better if people can converge on a single
> convention fast and stick to it. After all, even if the current
> French translators happen to prefer no-file no-location, for
> example, existing translators would graduate the project and new
> ones would come in, and their preference would change over time.
>
> At least comments like "Languages that strip" is misleading, if this
> is just "personal preferences of l10n groups of various languages".

Will fix as below:

-------- >8 --------
# Default: Strip the whole location comments for all .po files
*.po filter=gettext-no-location

# Legacy, unmaintained PO files: filter disabled to avoid index vs
# working-tree mismatch (these files still have location comments).
el.po -filter
is.po -filter
it.po -filter
ko.po -filter
pl.po -filter
pt_PT.po -filter

# These files use gettext-no-line-number (keep filenames, strip line
# numbers). The choice is per l10n team preference. Requires gettext 0.20+.
# The only benefit is locating source files from location comments when
# the .po file is not updated from the POT via make po-update.
ca.po filter=gettext-no-line-number
id.po filter=gettext-no-line-number
zh_CN.po filter=gettext-no-line-number
zh_TW.po filter=gettext-no-line-number

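The proposed layout depends on attribute precedence: within one
.gitattributes file, a later matching line overrides an earlier one, so the
per-file lines take priority over the `*.po` default. The following is a
minimal sketch of how that would resolve, assuming a throwaway repository
and an abbreviated copy of the proposal; the temporary directory and the
three sample file names are chosen only for illustration.

```shell
# Sketch only: a throwaway repository to see how the proposed rules resolve.
repo=$(mktemp -d)
git init -q "$repo"
mkdir "$repo/po"

# Abbreviated copy of the proposed po/.gitattributes (one file per group).
cat >"$repo/po/.gitattributes" <<'EOF'
*.po filter=gettext-no-location
el.po -filter
zh_CN.po filter=gettext-no-line-number
EOF

# check-attr only reads the attribute files; the .po files need not exist.
resolved=$(git -C "$repo" check-attr filter po/de.po po/el.po po/zh_CN.po)
printf '%s\n' "$resolved"
```

In this sketch de.po should fall through to the default filter, zh_CN.po
should pick up the line-number variant, and el.po should be reported as
`unset`, which is how check-attr shows `-filter`.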
* Re: [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Jiang Xin @ 2026-03-16 3:21 UTC
To: Johannes Sixt
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow, Jean-Noël Avila,
    Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting, Emir SARI,
    Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan,
    Junio C Hamano, Git List

On Sun, Mar 15, 2026 at 7:13 PM Johannes Sixt <j6t@kdbg.org> wrote:
>
> Am 14.03.26 um 15:38 schrieb Jiang Xin:
> > +# Two filter types are used:
> > +# 1. gettext-no-file-no-location: Strips both filenames and line numbers
> > +#    (e.g., removes "#: main.c:123" entirely)
> > +# 2. gettext-no-location: Preserves filenames but removes line numbers, which
> > +#    requires gettext 0.20 or higher
> > +#    (e.g., "#: main.c:123" becomes "#: main.c")
> > +#
> > +# See `po/README.md` for instructions on setting up the required filter drivers.
> > +
> > +# Do not apply these filters to all `*.po` files via a wildcard. For legacy,
> > +# unmaintained PO files, that would cause discrepancies between the filtered
> > +# blob in the index and the unfiltered file in the working tree.
> > +
> > +# Languages that strip both filenames and line numbers
> > +bg.po filter=gettext-no-file-no-location
> > +de.po filter=gettext-no-file-no-location
> > +#es.po filter=gettext-no-file-no-location
> > +fr.po filter=gettext-no-file-no-location
> > +#ga.po filter=gettext-no-file-no-location
> > +#ru.po filter=gettext-no-file-no-location
> > +sv.po filter=gettext-no-file-no-location
> > +tr.po filter=gettext-no-file-no-location
> > +uk.po filter=gettext-no-file-no-location
> > +vi.po filter=gettext-no-file-no-location
> > +
> > +# Languages that preserve filenames but strip line numbers
> > +#ca.po filter=gettext-no-location
> > +id.po filter=gettext-no-location
> > +zh_CN.po filter=gettext-no-location
> > +zh_TW.po filter=gettext-no-location
>
> How settled is the use of these two different filters (and names) in the
> community of translators? I am asking because I'm about to align the
> translation workflow in the Gitk repository with that in the Git
> repository. I need to know which of the two variants of filter names I
> should ask translators to use.

This is a very good question, and it reminds me to revisit the names of
these two filters.

When storing PO files in the repository, filtering location comments is
necessary, because it reduces the storage growth caused by frequent
changes in location comments, and also reduces the “diff churn” in
commits caused by location changes.

Either removing the entire location lines (filenames + line numbers) or
removing only the line numbers can solve the two problems above. Both
also improve blob compression equally well, so there is no difference in
terms of repository storage savings.

Therefore, maintainers may choose either approach according to their own
preference. As long as they do not switch back and forth between the two
frequently, there is no impact. Recording the maintainer’s choice in the
repository through .gitattributes can avoid l10n teams repeatedly
changing their choices.

For the gitk project, there is only one source file, so removing line
numbers while keeping the filename brings no benefit; removing the
entire location is the best choice. However, for the Git project, some
l10n teams keep filenames in PO files while removing line numbers, and
this can still be somewhat helpful for locating the correspondence
between PO entries and the source code when the PO files have not been
regenerated from the POT file.

This also reminds me to rethink the naming of the following filters.
Judging from the msgcat --no-location option, “location” refers to
filename + line number, so defining the filters like this may be more
appropriate:

```shell
git config --global filter.gettext-no-location.clean \
        "msgcat --no-location -"

git config --global filter.gettext-no-line-number.clean \
        "msgcat --add-location=file -"
```

Please let me know your thoughts, and I will make the corresponding
changes in reroll v4.

--
Jiang Xin

* Re: [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering
From: Johannes Sixt @ 2026-03-16 12:43 UTC
To: Jiang Xin
Cc: Alexander Shopov, Mikel Forcada, Ralf Thielow, Jean-Noël Avila,
    Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting, Emir SARI,
    Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan,
    Junio C Hamano, Git List

On 16.03.26 at 04:21, Jiang Xin wrote:
> This also reminds me to rethink the naming of the following filters.
> Judging from the msgcat --no-location option, “location” refers to
> filename + line number, so defining the filters like this may be more
> appropriate:
>
> ```shell
> git config --global filter.gettext-no-location.clean \
>         "msgcat --no-location -"
>
> git config --global filter.gettext-no-line-number.clean \
>         "msgcat --add-location=file -"
> ```

I fully agree with this naming convention.

-- Hannes

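The renamed drivers can be tried without touching the global configuration.
The following is a minimal sketch that assumes a throwaway repository (the
temporary directory is illustrative only) and uses repository-local
settings in place of the `--global` ones quoted in the thread. It only
records the driver definitions; running the filters themselves would
additionally need gettext's msgcat installed.

```shell
# Sketch only: repository-local configuration in a throwaway repository,
# so the global configuration is left untouched.
demo=$(mktemp -d)
git init -q "$demo"

# Driver names follow the convention agreed in the thread.
git -C "$demo" config filter.gettext-no-location.clean \
	"msgcat --no-location -"
git -C "$demo" config filter.gettext-no-line-number.clean \
	"msgcat --add-location=file -"

# List the two clean filters as recorded in this repository's config.
drivers=$(git -C "$demo" config --local --get-regexp '^filter\.gettext-.*\.clean$')
printf '%s\n' "$drivers"
```

Each driver name here is what a `filter=<name>` attribute in
po/.gitattributes would refer to, so an attribute naming a driver that is
not configured is simply not filtered.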
* [PATCH v3 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions
From: Jiang Xin @ 2026-03-14 14:38 UTC
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new documentation file po/AGENTS.md that provides agent-specific
instructions for generating or updating po/git.pot, separating them from
the general po/README.md. This separation allows for more targeted
optimization of AI agent workflows.

Performance evaluation with the Qwen model:

    # Before: No agent-specific instructions; use po/README.md for
    # reference.
    git-po-helper agent-test --runs=5 --agent=qwen update-pot \
        --prompt="Update po/git.pot according to po/README.md"

    # Phase 1: add the instructions to po/README.md; the prompt
    # references po/README.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot \
        --prompt="Update po/git.pot according to po/README.md"

    # Phase 2: add the instructions to po/AGENTS.md; use the built-in
    # prompt that references po/AGENTS.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot

Benchmark results (5-run average):

Phase 1 - Optimizing po/README.md:

    | Metric      | Before  | Phase 1 | Improvement |
    |-------------|---------|---------|-------------|
    | Turns       | 17      | 5       | -71%        |
    | Exec. time  | 34s     | 14s     | -59%        |
    | Turn range  | 3-36    | 3-7     |             |
    | Time range  | 10s-59s | 9s-19s  |             |

Phase 2 - Adding po/AGENTS.md (further optimization):

    | Metric      | Before  | Phase 2 | Improvement |
    |-------------|---------|---------|-------------|
    | Turns       | 17      | 3       | -82%        |
    | Exec. time  | 34s     | 8s      | -76%        |
    | Turn range  | 3-36    | 3-3     |             |
    | Time range  | 10s-59s | 6s-9s   |             |

Separating agent-specific instructions into AGENTS.md provides:

- More focused and concise instructions for AI agents
- Cleaner README.md for human readers
- An additional 11% reduction in turns and 17% reduction in execution
  time
- More consistent behavior (turn range reduced from 3-7 to 3-3)

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 po/AGENTS.md

diff --git a/po/AGENTS.md b/po/AGENTS.md
new file mode 100644
index 0000000000..94b7aa7f28
--- /dev/null
+++ b/po/AGENTS.md
@@ -0,0 +1,70 @@
+# Instructions for AI Agents
+
+This file gives specific instructions for AI agents that perform
+housekeeping tasks for Git l10n. Use of AI is optional; many successful
+l10n teams work well without it.
+
+The section "Housekeeping tasks for localization workflows" documents the
+most commonly used housekeeping tasks.
+
+
+## Background knowledge for localization workflows
+
+Essential background for the workflows below; understand these concepts before
+performing any housekeeping tasks in this document.
+
+### Language code and notation (XX, ll, ll\_CC)
+
+**XX** is a placeholder for the language code: either `ll` (ISO 639) or
+`ll_CC` (e.g. `de`, `zh_CN`). It appears in the PO file header metadata
+(e.g. `"Language: zh_CN\n"`) and is typically used to name the PO file:
+`po/XX.po`.
+
+
+### Header Entry
+
+The **header entry** is the first entry in every `po/XX.po`. It has an empty
+`msgid`; translation metadata (project, language, plural rules, encoding, etc.)
+is stored in `msgstr`, as in this example:
+
+```po
+msgid ""
+msgstr ""
+"Project-Id-Version: Git\n"
+"Language: zh_CN\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=2; plural=(n != 1);\n"
+```
+
+**CRITICAL**: Do not edit the header's `msgstr` while translating. It holds
+metadata only and must be left unchanged.
+
+
+## Housekeeping tasks for localization workflows
+
+For common housekeeping tasks, follow the steps in the matching subsection
+below.
+
+
+### Task 1: Generating or updating po/git.pot
+
+When asked to generate or update `po/git.pot` (or the like):
+
+1. **Directly execute** the command `make po/git.pot` without checking
+   if the file exists beforehand.
+
+2. **Do not verify** the generated file after execution. Simply run the
+   command and consider the task complete.
+
+
+## Human translators remain in control
+
+Git translation is human-driven; language team leaders and contributors are
+responsible for maintaining translation quality and consistency.
+
+AI-generated output should always be treated as drafts that must be reviewed
+and approved by someone who understands both the technical context and the
+target language. The best results come from combining AI efficiency with human
+judgment, cultural insight, and community engagement.
-- 
2.53.0.rc2.20.g532543fa46

* [PATCH v3 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files
From: Jiang Xin @ 2026-03-14 14:38 UTC
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new section to po/AGENTS.md to provide clear instructions for
updating language-specific PO files. The improved documentation
significantly reduces both conversation turns and execution time.

Performance evaluation with the Qwen model:

    # Before: instructions in po/README.md; the custom prompt
    # references po/README.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-po \
        --prompt="Update po/zh_CN.po according to po/README.md"

    # After: instructions in po/AGENTS.md; the built-in prompt
    # references po/AGENTS.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-po

Benchmark results (5-run average):

    | Metric      | Before  | After  | Improvement |
    |-------------|---------|--------|-------------|
    | Turns       | 22      | 4      | -82%        |
    | Exec. time  | 38s     | 9s     | -76%        |
    | Turn range  | 17-39   | 3-9    |             |
    | Time range  | 25s-68s | 7s-14s |             |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 94b7aa7f28..f2b8fc5100 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -59,6 +59,17 @@ When asked to generate or update `po/git.pot` (or the like):
    command and consider the task complete.
 
 
+### Task 2: Updating po/XX.po
+
+When asked to update `po/XX.po` (or the like):
+
+1. **Directly execute** the command `make po-update PO_FILE=po/XX.po`
+   without reading or checking the file content beforehand.
+
+2. **Do not verify, translate, or review** the updated file after execution.
+   Simply run the command and consider the task complete.
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
-- 
2.53.0.rc2.20.g532543fa46

* [PATCH v3 4/5] docs(l10n): add AI agent instructions for translating PO files
From: Jiang Xin @ 2026-03-14 14:38 UTC
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
    Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting,
    Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Translating po/XX.po" section to po/AGENTS.md with detailed
workflow and procedures for AI agents to translate language-specific PO
files. Users can invoke AI-assisted translation in coding tools with a
prompt such as:

    "Translate the po/XX.po file by referring to @po/AGENTS.md"

Translation results serve as drafts; human contributors must review and
approve before submission.

To address the low translation efficiency of some LLMs, batch
translation replaces entry-by-entry translation. git-po-helper
implements a gettext JSON format for translation files, replacing PO
format during translation to enable batch processing.

Evaluation with the Qwen model:

    git-po-helper agent-run --agent=qwen translate po/zh_CN.po

Test translation (127 entries, 50 per batch):

    Initial state:  5998 translated, 91 fuzzy, 36 untranslated
    Final state:    6125 translated, 0 fuzzy, 0 untranslated

    Successfully translated: 127 entries (91 fuzzy + 36 untranslated)
    Success rate: 100%

Benchmark results (3-run average):

AI agent using gettext tools:

    | Metric           | Value                          |
    |------------------|--------------------------------|
    | Avg. Num turns   | 86 (176, 44, 40)               |
    | Avg. Exec. Time  | 20m44s (39m56s, 14m38s, 7m38s) |
    | Successful runs  | 3/3                            |

AI agent using git-po-helper (JSON batch flow):

    | Metric           | Value                          |
    |------------------|--------------------------------|
    | Avg. Num turns   | 56 (68, 39, 63)                |
    | Avg. Exec. Time  | 19m8s (28m55s, 9m1s, 19m28s)   |
    | Successful runs  | 3/3                            |

The git-po-helper flow reduces the number of turns (86 → 56) with
similar execution time; the bottleneck appears to be LLM processing
rather than network interaction.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 596 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 595 insertions(+), 1 deletion(-)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index f2b8fc5100..65017624f7 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -5,7 +5,11 @@ housekeeping tasks for Git l10n. Use of AI is optional; many successful
 l10n teams work well without it.
 
 The section "Housekeeping tasks for localization workflows" documents the
-most commonly used housekeeping tasks.
+most commonly used housekeeping tasks:
+
+1. Generating or updating po/git.pot
+2. Updating po/XX.po
+3. Translating po/XX.po
 
 
 ## Background knowledge for localization workflows
@@ -42,6 +46,351 @@ msgstr ""
 metadata only and must be left unchanged.
 
 
+### Glossary Section
+
+PO files may have a glossary in comments before the header entry (first
+`msgid ""`), giving terminology guidelines (e.g.):
+
+```po
+# Git glossary for Chinese translators
+#
+# English                          | Chinese
+# ---------------------------------+--------------------------------------
+# 3-way merge                      | 三路合并
+# branch                           | 分支
+# ...
+```
+
+**IMPORTANT**: Read and use the glossary when translating or reviewing. It is
+in `#` comments only. Leave that comment block unchanged.
+
+
+### PO entry structure (single-line and multi-line)
+
+PO entries are `msgid` / `msgstr` pairs. Plural messages add `msgid_plural` and
+`msgstr[n]`. The `msgid` is the immutable source; `msgstr` is the target
+translation. Each side may be a single quoted string or a multi-line block.
+In the multi-line form the header line is often `msgid ""` / `msgstr ""`, with
+the real text split across following quoted lines (concatenated by Gettext).
+
+**Single-line entries**:
+
+```po
+msgid "commit message"
+msgstr "提交说明"
+```
+
+**Multi-line entries**:
+
+```po
+msgid ""
+"Line 1\n"
+"Line 2"
+msgstr ""
+"行 1\n"
+"行 2"
+```
+
+**CRITICAL**: Do **not** use `grep '^msgstr ""'` to find untranslated entries;
+multi-line `msgstr` blocks use the same opening line, so grep gives false
+positives. Use `msgattrib` (next section).
+
+
+### Locating untranslated, fuzzy, and obsolete entries
+
+Use `msgattrib` to list untranslated, fuzzy, and obsolete entries. Task 3
+(translating `po/XX.po`) uses these commands.
+
+- **Untranslated**: `msgattrib --untranslated --no-obsolete po/XX.po`
+- **Fuzzy**: `msgattrib --only-fuzzy --no-obsolete po/XX.po`
+- **Obsolete** (`#~`): `msgattrib --obsolete --no-wrap po/XX.po`
+
+
+### Translating fuzzy entries
+
+Fuzzy entries need re-translation because the source text changed. The format
+differs by file type:
+
+- **PO file**: A `#, fuzzy` tag in the entry comments marks the entry as fuzzy.
+- **JSON file**: The entry has `"fuzzy": true`.
+
+**Translation principles**: Re-translate the `msgstr` (and, for plural entries,
+`msgstr[n]`) into the target language. Do **not** modify `msgid` or
+`msgid_plural`. After translation, **clear the fuzzy mark**: in PO, remove the
+`#, fuzzy` tag from comments; in JSON, omit or set `fuzzy` to `false`.
+
+
+### Preserving Special Characters
+
+Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders (`%s`, `%d`,
+etc.), and quotes exactly as in `msgid`. Only reorder placeholders with
+positional syntax when needed (see Placeholder Reordering below).
+
+
+### Placeholder Reordering
+
+When reordering placeholders relative to `msgid`, use positional syntax (`%n$`)
+where *n* is the 1-based argument index, so each argument still binds to the
+right value. Preserve width and precision modifiers, and place `%n$` before
+them (see examples below).
+
+**Example 1** (precision):
+
+```po
+#, c-format
+msgid "missing environment variable '%s' for configuration '%.*s'"
+msgstr "配置 '%3$.*2$s' 缺少环境变量 '%1$s'"
+```
+
+`%s` → argument 1 → `%1$s`. `%.*s` needs precision (arg 2) and string (arg 3) →
+`%3$.*2$s`.
+
+**Example 2** (multi-line, four `%s` reordered):
+
+```po
+#, c-format
+msgid ""
+"the 'submodule.%s.gitdir' config does not exist for module '%s'. Please "
+"ensure it is set, for example by running something like: 'git config "
+"submodule.%s.gitdir .git/modules/%s'. For details see the "
+"extensions.submodulePathConfig documentation."
+msgstr ""
+"模块 '%2$s' 的 'submodule.%1$s.gitdir' 配置不存在。请确保已设置,例如运行类"
+"似:'git config submodule.%3$s.gitdir .git/modules/%4$s'。详细信息请参见 "
+"extensions.submodulePathConfig 文档。"
+```
+
+Original order 1,2,3,4; in translation 2,1,3,4. Each line must be a complete
+quoted string.
+
+
+### Validating PO File Format
+
+Check the PO file using the command below:
+
+```shell
+msgfmt --check -o /dev/null po/XX.po
+```
+
+Common validation errors include:
+- Unclosed quotes
+- Missing escape sequences
+- Invalid placeholder syntax
+- Malformed multi-line entries
+- Incorrect line breaks in multi-line strings
+
+On failure, `msgfmt` prints the line number; fix the PO at that line.
+
+
+### Using git-po-helper
+
+[git-po-helper](https://github.com/git-l10n/git-po-helper) supports Git l10n with
+**quality checking** (git-l10n PR conventions) and **AI-assisted translation**
+(subcommands for automated workflows). Housekeeping tasks in this document use
+it when available; otherwise rely on gettext tools.
+ + +#### Splitting large PO files + +When a PO file is too large for translation or review, use `git-po-helper +msg-select` to split it by entry index. + +- **Entry 0** is the header (included by default; use `--no-header` to omit). +- **Entries 1, 2, 3, …** are content entries. +- **Range format**: `--range "1-50"` (entries 1 through 50), `--range "-50"` + (first 50 entries), `--range "51-"` (from entry 51 to end). Shortcuts: + `--head N` (first N), `--tail N` (last N), `--since N` (from N to end). +- **Output format**: PO by default; use `--json` for GETTEXT JSON. See the + "GETTEXT JSON format" section (under git-po-helper) for details. +- **State filter**: Use `--translated`, `--untranslated`, `--fuzzy` to filter + by state (OR relationship). Use `--no-obsolete` to exclude obsolete entries; + `--with-obsolete` to include (default). Use `--only-same` or `--only-obsolete` + for a single state. Range applies to the filtered list. + +```shell +# First 50 entries (header + entries 1–50) +git-po-helper msg-select --range "-50" po/in.po -o po/out.po + +# Entries 51–100 +git-po-helper msg-select --range "51-100" po/in.po -o po/out.po + +# Entries 101 to end +git-po-helper msg-select --range "101-" po/in.po -o po/out.po + +# Entries 1–50 without header (content only) +git-po-helper msg-select --range "1-50" --no-header po/in.po -o po/frag.po + +# Output as JSON; select untranslated and fuzzy entries, exclude obsolete +git-po-helper msg-select --json --untranslated --fuzzy --no-obsolete po/in.po >po/filtered.json +``` + + +#### Comparing PO files for translation and review + +`git-po-helper compare` shows PO changes with full entry context (unlike +`git diff`). Redirect output to a file: it is empty when there are no new or +changed entries; otherwise it contains a valid PO header. 
+ +```shell +# Get full context of local changes (HEAD vs working tree) +git-po-helper compare po/XX.po -o po/out.po + +# Get full context of changes in a specific commit (parent vs commit) +git-po-helper compare --commit <commit> po/XX.po -o po/out.po + +# Get full context of changes since a commit (commit vs working tree) +git-po-helper compare --since <commit> po/XX.po -o po/out.po + +# Get full context between two commits +git-po-helper compare -r <commit1>..<commit2> po/XX.po -o po/out.po + +# Get full context of two worktree files +git-po-helper compare po/old.po po/new.po -o po/out.po + +# Check msgid consistency (detect tampering); no output means target matches source +git-po-helper compare --msgid po/old.po po/new.po >po/out.po +``` + +**Options summary** + +| Option | Meaning | +|---------------------|------------------------------------------------| +| (none) | Compare HEAD with working tree (local changes) | +| `--commit <commit>` | Compare parent of commit with the commit | +| `--since <commit>` | Compare commit with working tree | +| `-r x..y` | Compare revision x with revision y | +| `-r x..` | Compare revision x with working tree | +| `-r x` | Compare parent of x with x | + + +#### Concatenating multiple PO/JSON files + +`git-po-helper msg-cat` merges PO, POT, or gettext JSON inputs into one stream. +Duplicate `msgid` values keep the first occurrence in file order. Write with +`-o <file>` or stdout (`-o -` or omit); `--json` selects JSON output, else PO. + +```shell +# Convert JSON to PO (e.g. after translation) +git-po-helper msg-cat --unset-fuzzy -o po/out.po po/in.json + +# Merge multiple PO files +git-po-helper msg-cat -o po/out.po po/in-1.po po/in-2.json +``` + + +#### GETTEXT JSON format + +The **GETTEXT JSON** format is an internal format defined by `git-po-helper` +for convenient batch processing of translation and related tasks by AI models. 
+`git-po-helper msg-select`, `git-po-helper msg-cat`, and `git-po-helper compare` +read and write this format. + +**Top-level structure**: + +```json +{ + "header_comment": "string", + "header_meta": "string", + "entries": [ /* array of entry objects */ ] +} +``` + +| Field | Description | +|------------------|--------------------------------------------------------------------------------| +| `header_comment` | Lines above the first `msgid ""` (comments, glossary), directly concatenated. | +| `header_meta` | Encoded `msgstr` of the header entry (Project-Id-Version, Plural-Forms, etc.). | +| `entries` | List of PO entries. Order matches source. | + +**Entry object** (each element of `entries`): + +| Field | Type | Description | +|-----------------|----------|--------------------------------------------------------------| +| `msgid` | string | Singular message ID. PO escapes encoded (e.g. `\n` → `\\n`). | +| `msgstr` | []string | Translation forms as a **JSON array only**. Details below. | +| `msgid_plural` | string | Plural form of msgid. Omit for non-plural. | +| `comments` | []string | Comment lines (`#`, `#.`, `#:`, `#,`, etc.). | +| `fuzzy` | bool | True if entry has fuzzy flag. | +| `obsolete` | bool | True for `#~` obsolete entries. Omit if false. | + +**`msgstr` array (required shape)**: + +- **Always** a JSON array of strings, never a single string. One element = singular + (PO `msgstr` / `msgstr[0]`); multiple elements = plural forms in order + (`msgstr[0]`, `msgstr[1]`, …). +- Omit the key or use an empty array when the entry is untranslated. + +**Example (single-line entry)**: + +```json +{ + "header_comment": "# Glossary:\\n# term1\\tTranslation 1\\n#\\n", + "header_meta": "Project-Id-Version: git\\nContent-Type: text/plain; charset=UTF-8\\n", + "entries": [ + { + "msgid": "Hello", + "msgstr": ["你好"], + "comments": ["#. 
Comment for translator\\n", "#: src/file.c:10\\n"], + "fuzzy": false + } + ] +} +``` + +**Example (plural entry)**: + +```json +{ + "msgid": "One file", + "msgid_plural": "%d files", + "msgstr": ["一个文件", "%d 个文件"], + "comments": ["#, c-format\\n"] +} +``` + +**Example (fuzzy entry before translation)**: + +```json +{ + "msgid": "Old message", + "msgstr": ["旧翻译。"], + "comments": ["#, fuzzy\\n"], + "fuzzy": true +} +``` + +**Translation notes for GETTEXT JSON files**: + +- **Preserve structure**: Keep `header_comment`, `header_meta`, `msgid`, + `msgid_plural` unchanged. +- **Fuzzy entries**: Entries extracted from fuzzy PO entries have `"fuzzy": true`. + After translating, **remove the `fuzzy` field** or set it to `false` in the + output JSON. The merge step uses `--unset-fuzzy`, which can also remove the + `fuzzy` field. +- **Placeholders**: Preserve `%s`, `%d`, etc. exactly; use `%n$` when + reordering (see "Placeholder Reordering" above). + + +### Quality checklist + +- **Accuracy**: Faithful to original meaning; no omissions or distortions. +- **Fuzzy entries**: Re-translate fully and clear the fuzzy flag (see + "Translating fuzzy entries" above). +- **Terminology**: Consistent with glossary (see "Glossary Section" above) or + domain standards. +- **Grammar and fluency**: Correct and natural in the target language. +- **Placeholders**: Preserve variables (`%s`, `{name}`, `$1`) exactly; use + positional parameters when reordering (see "Placeholder Reordering" above). +- **Special characters**: Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), + placeholders exactly as in `msgid`. See "Preserving Special Characters" above. +- **Plurals and gender**: Correct forms and agreement. +- **Context fit**: Suitable for UI space, tone, and use (e.g. error vs. tooltip). +- **Cultural appropriateness**: No offensive or ambiguous content. +- **Consistency**: Match prior translations of the same source. 
+- **Technical integrity**: Do not translate code, paths, commands, brands, or + proper nouns. +- **Readability**: Clear, concise, and user-friendly. + + ## Housekeeping tasks for localization workflows For common housekeeping tasks, follow the steps in the matching subsection @@ -70,6 +419,251 @@ When asked to update `po/XX.po` (or the like): Simply run the command and consider the task complete. +### Task 3: Translating po/XX.po + +To translate `po/XX.po`, use the steps below. The script uses gettext or +`git-po-helper` depending on what is installed; JSON export (when available) +supports batch translation rather than per-entry work. + +**Workflow loop**: Steps 1→2→3→4→5→6→7 form a loop. After step 6 succeeds, +**always** go to step 7, which returns to step 1. The **only** exit to step 8 +is when step 2 finds `po/l10n-pending.po` empty. Do not skip step 7 or jump to +step 8 after step 6. + +1. **Extract entries to translate**: **Directly execute** the script below—it is + authoritative; do not reimplement. It generates `po/l10n-pending.po` with + messages that need translation. 
+ + ```shell + l10n_extract_pending () { + test $# -ge 1 || { echo "Usage: l10n_extract_pending <po-file>" >&2; return 1; } + PO_FILE="$1" + PENDING="po/l10n-pending.po" + PENDING_FUZZY="${PENDING}.fuzzy" + PENDING_REFER="${PENDING}.fuzzy.reference" + PENDING_UNTRANS="${PENDING}.untranslated" + rm -f "$PENDING" + + if command -v git-po-helper >/dev/null 2>&1 + then + git-po-helper msg-select --untranslated --fuzzy --no-obsolete -o "$PENDING" "$PO_FILE" + else + msgattrib --untranslated --no-obsolete "$PO_FILE" >"${PENDING_UNTRANS}" + msgattrib --only-fuzzy --no-obsolete --clear-fuzzy --empty "$PO_FILE" >"${PENDING_FUZZY}" + msgattrib --only-fuzzy --no-obsolete "$PO_FILE" >"${PENDING_REFER}" + msgcat --use-first "${PENDING_UNTRANS}" "${PENDING_FUZZY}" >"$PENDING" + rm -f "${PENDING_UNTRANS}" "${PENDING_FUZZY}" + fi + if test -s "$PENDING" + then + msgfmt --stat -o /dev/null "$PENDING" || true + echo "Pending file is not empty; there are still entries to translate." + else + echo "No entries need translation." + return 1 + fi + } + # Run the extraction. Example: l10n_extract_pending po/zh_CN.po + l10n_extract_pending po/XX.po + ``` + +2. **Check generated file**: If `po/l10n-pending.po` is empty or does not exist, + translation is complete; go to step 8. Otherwise proceed to step 3. + +3. **Prepare one batch for translation**: Batching keeps each run small so the + model can complete translation within limited context. **BEFORE translating**, + **directly execute** the script below—it is authoritative; do not reimplement. + Based on which file the script produces: if `po/l10n-todo.json` exists, go to + step 4a; if `po/l10n-todo.po` exists, go to step 4b. 
+
+   ```shell
+   l10n_one_batch () {
+       test $# -ge 1 || { echo "Usage: l10n_one_batch <po-file> [min_batch_size]" >&2; return 1; }
+       PO_FILE="$1"
+       min_batch_size=${2:-100}
+       PENDING="po/l10n-pending.po"
+       TODO_JSON="po/l10n-todo.json"
+       TODO_PO="po/l10n-todo.po"
+       DONE_JSON="po/l10n-done.json"
+       DONE_PO="po/l10n-done.po"
+       rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO"
+
+       # grep -c prints "0" and exits non-zero when nothing matches, so an
+       # "|| echo 0" inside the substitution would capture "0" twice.
+       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null) || ENTRY_COUNT=0
+       # Subtract the header entry (msgid "").
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+           BATCHING=1
+       else
+           NUM=$ENTRY_COUNT
+           BATCHING=
+       fi
+
+       if command -v git-po-helper >/dev/null 2>&1
+       then
+           if test -n "$BATCHING"
+           then
+               git-po-helper msg-select --json --head "$NUM" -o "$TODO_JSON" "$PENDING"
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               git-po-helper msg-select --json -o "$TODO_JSON" "$PENDING"
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       else
+           if test -n "$BATCHING"
+           then
+               awk -v num="$NUM" '/^msgid / && count++ > num {exit} 1' "$PENDING" |
+                   tac | awk '/^$/ {found=1} found' | tac >"$TODO_PO"
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               cp "$PENDING" "$TODO_PO"
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       fi
+   }
+   # Prepare one batch; shrink 2nd arg when batches exceed agent capacity.
+   l10n_one_batch po/XX.po 100
+   ```
+
+4a. **Translate JSON batch** (`po/l10n-todo.json` → `po/l10n-done.json`):
+
+   - **Task**: Translate `po/l10n-todo.json` (input, GETTEXT JSON) into
+     `po/l10n-done.json` (output, GETTEXT JSON). See the "GETTEXT JSON format"
+     section above for format details and translation rules.
+ - **Reference glossary**: Read the glossary from the batch file's + `header_comment` (see "Glossary Section" above) and use it for + consistent terminology. + - **When translating**: Follow the "Quality checklist" above for correctness + and quality. Handle escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders, + and quotes correctly as in `msgid`. For JSON, correctly escape and unescape + these sequences when reading and writing. Modify `msgstr` and `msgstr[n]` + (for plural entries); clear the fuzzy flag (omit or set `fuzzy` to `false`). + Do **not** modify `msgid` or `msgid_plural`. + +4b. **Translate PO batch** (`po/l10n-todo.po` → `po/l10n-done.po`): + + - **Task**: Translate `po/l10n-todo.po` (input, GETTEXT PO) into + `po/l10n-done.po` (output, GETTEXT PO). + - **Reference glossary**: Read the glossary from the pending file header + (see "Glossary Section" above) and use it for consistent terminology. + - **When translating**: Follow the "Quality checklist" above for correctness + and quality. Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders, + and quotes as in `msgid`. Modify `msgstr` and `msgstr[n]` (for plural + entries); remove the `#, fuzzy` tag from comments when done. Do **not** + modify `msgid` or `msgid_plural`. + +5. **Validate `po/l10n-done.po`**: + + Run the validation script below. If it fails, fix per the errors and notes, + re-run until it succeeds. + + ```shell + l10n_validate_done () { + DONE_PO="po/l10n-done.po" + DONE_JSON="po/l10n-done.json" + PENDING="po/l10n-pending.po" + + if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; } + then + git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || { + echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2 + return 1 + } + fi + + # Check 1: msgid should not be modified + MSGID_OUT=$(git-po-helper compare -q --msgid --assert-no-changes \ + "$PENDING" "$DONE_PO" 2>&1) + MSGID_RC=$? 
+ if test $MSGID_RC -ne 0 || test -n "$MSGID_OUT" + then + echo "ERROR [msgid modified]: The following entries appeared after" >&2 + echo "translation because msgid was altered. Fix in $DONE_PO." >&2 + echo "$MSGID_OUT" >&2 + return 1 + fi + + # Check 2: PO format (see "Validating PO File Format" for error handling) + MSGFMT_OUT=$(msgfmt --check -o /dev/null "$DONE_PO" 2>&1) + MSGFMT_RC=$? + if test $MSGFMT_RC -ne 0 + then + echo "ERROR [PO format]: Fix errors in $DONE_PO." >&2 + echo "$MSGFMT_OUT" >&2 + return 1 + fi + + echo "Validation passed." + } + l10n_validate_done + ``` + + If the script fails, fix **directly in `po/l10n-done.po`**. Re-run + `l10n_validate_done` until it succeeds. Editing `po/l10n-done.json` is not + recommended because it adds an extra JSON-to-PO conversion step. Use the + error message to decide: + + - **`[msgid modified]`**: The listed entries have altered `msgid`; restore + them to match `po/l10n-pending.po`. + - **`[PO format]`**: `msgfmt` reports line numbers; fix the errors in place. + See "Validating PO File Format" for common issues. + + +6. **Merge translation results into `po/XX.po`**: Run the script below. If it + fails, fix the file the error names: **`[JSON to PO conversion]`** → + `po/l10n-done.json`; **`[msgcat merge]`** → `po/l10n-done.po`. Re-run until + it succeeds. + + ```shell + l10n_merge_batch () { + test $# -ge 1 || { echo "Usage: l10n_merge_batch <po-file>" >&2; return 1; } + PO_FILE="$1" + DONE_PO="po/l10n-done.po" + DONE_JSON="po/l10n-done.json" + MERGED="po/l10n-done.merged" + PENDING="po/l10n-pending.po" + PENDING_REFER="${PENDING}.fuzzy.reference" + TODO_JSON="po/l10n-todo.json" + TODO_PO="po/l10n-todo.po" + if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; } + then + git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || { + echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." 
>&2
+               return 1
+           }
+       fi
+       msgcat --use-first "$DONE_PO" "$PO_FILE" >"$MERGED" || {
+           echo "ERROR [msgcat merge]: Fix errors in $DONE_PO and re-run." >&2
+           return 1
+       }
+       mv "$MERGED" "$PO_FILE"
+       rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO" "$PENDING_REFER"
+   }
+   # Run the merge. Example: l10n_merge_batch po/zh_CN.po
+   l10n_merge_batch po/XX.po
+   ```
+
+7. **Loop**: **MUST** return to step 1 (Extract entries) and repeat the cycle.
+   Do **not** skip this step or go to step 8. Step 8 (below) runs **only**
+   when step 2 finds no more entries and redirects there.
+
+8. **Only after loop exits**: Run the command below to validate the PO file and
+   display the report. The process ends here.
+
+   ```shell
+   msgfmt --check --stat -o /dev/null po/XX.po
+   ```
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
-- 
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
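The batch-size selection used by `l10n_one_batch` in the patch above can be tried in isolation. What follows is only a minimal sketch, not part of the patch: the function name `l10n_batch_size` is hypothetical, and it assumes nothing beyond the thresholds visible in the script (more than 8 times the minimum yields a double-size batch, more than 4 times yields 1.5 times the minimum, anything at or below the minimum is taken whole).

```shell
# Hypothetical helper (not in the patch): print how many entries the
# next batch would take, mirroring the thresholds in l10n_one_batch.
#   $1 = number of pending entries, $2 = minimum batch size (default 100)
l10n_batch_size () {
	entry_count=$1
	min_batch_size=${2:-100}

	if test "$entry_count" -le "$min_batch_size"
	then
		# Few enough entries remain: take them all at once.
		echo "$entry_count"
	elif test "$entry_count" -gt $((min_batch_size * 8))
	then
		# Very large backlog: double-size batch.
		echo $((min_batch_size * 2))
	elif test "$entry_count" -gt $((min_batch_size * 4))
	then
		# Large backlog: one and a half times the minimum.
		echo $((min_batch_size + min_batch_size / 2))
	else
		echo "$min_batch_size"
	fi
}

l10n_batch_size 127 50
l10n_batch_size 1000 100
```

The two calls print 50 and 200. For the 127 pending entries and batch size 50 quoted in the commit message, successive calls with 127, 77 and 27 would give batches of 50, 50 and 27, which suggests three passes through the loop.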
* [PATCH v3 5/5] docs(l10n): add AI agent instructions to review translations
2026-03-14 14:38 ` [PATCH v3 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
` (3 preceding siblings ...)
2026-03-14 14:38 ` [PATCH v3 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
@ 2026-03-14 14:38 ` Jiang Xin
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-14 14:38 UTC (permalink / raw)
To: Junio C Hamano, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides
comprehensive guidance for AI agents to review translation files.

Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming. To address
this, git-po-helper implements the compare subcommand, which extracts
new or modified translations with full context (complete msgid/msgstr
pairs), significantly improving review efficiency.

A limitation is that the extracted content lacks other
already-translated content for reference, which may affect terminology
consistency. This is mitigated by including a glossary in the PO file
header. git-po-helper-generated review files include the header entry
and glossary (if present) by default.

The review workflow leverages git-po-helper subcommands:

 - git-po-helper compare: Extract new or changed entries between two
   versions of a PO file into a valid PO file for review. Supports
   multiple modes:

    * Compare HEAD with the working tree (local changes)
    * Compare a commit's parent with the commit (--commit)
    * Compare a commit with the working tree (--since)
    * Compare two arbitrary revisions (-r)

 - git-po-helper msg-select: Split large review files into smaller
   batches by entry index range for manageable review sessions.
   Supports range formats like "-50" (first 50), "51-100",
   "101-" (to end).

Evaluation with the Qwen model:

    git-po-helper agent-run review --commit 2000abefba --agent qwen

Benchmark results:

    | Metric           | Value                            |
    |------------------|----------------------------------|
    | Turns            | 22                               |
    | Input tokens     | 537263                           |
    | Output tokens    | 4397                             |
    | API duration     | 167.84 s                         |
    | Review score     | 96/100                           |
    | Total entries    | 63                               |
    | With issues      | 4 (1 critical, 2 major, 1 minor) |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 197 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 197 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 65017624f7..e9a6ffc7f1 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
 1. Generating or updating po/git.pot
 2. Updating po/XX.po
 3. Translating po/XX.po
+4. Reviewing translation quality
 
 
 ## Background knowledge for localization workflows
@@ -664,6 +665,202 @@ step 8 after step 6.
    ```
 
 
+### Task 4: Review translation quality
+
+Review may target the full `po/XX.po`, a specific commit, or changes since a
+commit. When asked to review, follow the steps below.
+
+**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`,
+`git format-patch`, or similar to get changes—they break PO context; use **only**
+`git-po-helper compare` for extraction. Without `git-po-helper`, refuse the task.
+Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3).
+The **only** ways to step 8 are when step 4 finds `po/review-todo.json` missing
+or empty (no batch left to review), or when step 1 finds `po/review-result.json`
+already present.
+
+1. **Check for existing review (resume support)**: Evaluate the following in order:
+
+   - If `po/review-input.po` does **not** exist, proceed to step 2 (Extract
+     entries) for a fresh start.
+   - Else if `po/review-result.json` exists, go to step 8 (only after loop exits).
+   - Else if `po/review-done.json` exists, go to step 6 (Rename result).
+   - Else if `po/review-todo.json` exists, go to step 5 (Review the current
+     batch).
+   - Else go to step 3 (Prepare one batch).
+
+2. **Extract entries**: Run `git-po-helper compare` with the desired range and
+   redirect the output to `po/review-input.po`. See "Comparing PO files for
+   translation and review" under git-po-helper for options.
+
+3. **Prepare one batch**: Batching keeps each run small so the model can
+   complete review within limited context. **Directly execute** the script
+   below; it is authoritative, so do not reimplement it.
+
+   ```shell
+   review_one_batch () {
+       min_batch_size=${1:-100}
+       INPUT_PO="po/review-input.po"
+       PENDING="po/review-pending.po"
+       TODO="po/review-todo.json"
+       DONE="po/review-done.json"
+       BATCH_FILE="po/review-batch.txt"
+
+       if test ! -f "$INPUT_PO"
+       then
+           rm -f "$TODO"
+           echo >&2 "cannot find $INPUT_PO, nothing for review"
+           return 1
+       fi
+       if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING"
+       then
+           rm -f "$BATCH_FILE" "$TODO" "$DONE"
+           rm -f po/review-result*.json
+           cp "$INPUT_PO" "$PENDING"
+       fi
+
+       # grep -c prints "0" and exits non-zero when nothing matches, so an
+       # "|| echo 0" inside the substitution would capture "0" twice.
+       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null) || ENTRY_COUNT=0
+       # Subtract the header entry (msgid "").
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+       if test "$ENTRY_COUNT" -eq 0
+       then
+           rm -f "$TODO"
+           echo >&2 "No entries left for review"
+           return 1
+       fi
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+       else
+           NUM=$ENTRY_COUNT
+       fi
+
+       BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0)
+       BATCH=$((BATCH + 1))
+       echo "$BATCH" >"$BATCH_FILE"
+
+       git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING"
+       git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING"
+       mv "${PENDING}.tmp" "$PENDING"
+       echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)"
+   }
+   # The parameter controls batch size; reduce if the batch file is too large.
+   review_one_batch 100
+   ```
+
+4. **Check todo file**: If `po/review-todo.json` does not exist or is empty,
+   review is complete; go to step 8 (only after loop exits). Otherwise proceed to
+   step 5.
+
+5. **Review the current batch**: Review translations in `po/review-todo.json`
+   and write findings to `po/review-done.json` as follows:
+   - Use "Background knowledge for localization workflows" for PO/JSON structure,
+     placeholders, and terminology.
+   - If `header_comment` includes a glossary, follow it for consistency.
+   - Do **not** review the header (`header_comment`, `header_meta`).
+   - For every other entry, check the entry's `msgstr` **array** (translation
+     forms) against `msgid` / `msgid_plural` using the "Quality checklist" above.
+   - Write JSON per "Review result JSON format" below; use `{"issues": []}` when
+     there are no issues. **Always** write `po/review-done.json`; it marks the
+     batch complete.
+
+6. **Rename result**: Rename `po/review-done.json` to `po/review-result-<N>.json`,
+   where N is the value in `po/review-batch.txt` (the batch just completed).
+   Run the script below:
+
+   ```shell
+   review_rename_result () {
+       TODO="po/review-todo.json"
+       DONE="po/review-done.json"
+       BATCH_FILE="po/review-batch.txt"
+       if test -f "$DONE"
+       then
+           N=$(cat "$BATCH_FILE" 2>/dev/null) || { echo "ERROR: $BATCH_FILE not found." >&2; return 1; }
+           mv "$DONE" "po/review-result-$N.json"
+           echo "Renamed to po/review-result-$N.json"
+       fi
+       rm -f "$TODO"
+   }
+   review_rename_result
+   ```
+
+7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle.
+   Do **not** skip this step or go to step 8. Step 8 is reached **only** when
+   step 4 finds `po/review-todo.json` missing or empty.
+
+8. **Only after loop exits**: **Directly execute** the command below. It merges
+   results, applies suggestions, and displays the report. The process ends here.
+
+   ```shell
+   git-po-helper agent-run report
+   ```
+
+   **Do not** run cleanup or delete intermediate files. Keep them for inspection
+   or resumption.
+
+**Review result JSON format**:
+
+The **Review result JSON** format defines the structure for translation
+review reports. For each entry with translation issues, create an issue
+object as follows:
+
+- Copy the original entry's `msgid`, optional `msgid_plural`, and optional
+  `msgstr` array (original translation forms) into the issue object. Use the
+  same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present
+  (one element singular, multiple for plural).
+- Write a summary of all issues found for this entry in `description`.
+- Set `score` according to the severity of issues found for this entry,
+  from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
+  **Lower score means more severe issues.**
+- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**:
+  one string for singular, multiple strings for plural forms in order. This is
+  required for `git-po-helper` to apply suggestions.
+- Include only entries with issues (score less than 3).
When no issues are
+  found in the batch, write `{"issues": []}`.
+
+Example review result (with issues):
+
+```json
+{
+  "issues": [
+    {
+      "msgid": "commit",
+      "msgstr": ["委托"],
+      "score": 0,
+      "description": "Terminology error: 'commit' should be translated as '提交'",
+      "suggest_msgstr": ["提交"]
+    },
+    {
+      "msgid": "repository",
+      "msgid_plural": "repositories",
+      "msgstr": ["版本库", "版本库"],
+      "score": 2,
+      "description": "Consistency issue: suggest using '仓库' consistently",
+      "suggest_msgstr": ["仓库", "仓库"]
+    }
+  ]
+}
+```
+
+Field descriptions for each issue object (element of the `issues` array):
+
+- `msgid` (and optional `msgid_plural` for plural entries): Original source text.
+- `msgstr` (optional): JSON array of original translation forms (same meaning as
+  in GETTEXT JSON entries).
+- `suggest_msgstr`: JSON array of suggested translation forms; **must be an
+  array** (e.g. `["提交"]` for singular). Plural entries use multiple elements
+  in order.
+- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
+- `description`: Brief summary of the issue.
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
-- 
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
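The resume check in step 1 of Task 4 above is an ordered decision over which intermediate files exist. A minimal sketch of that ordering follows, assuming only the file names given in the patch; the function name `review_resume_step` and its directory argument are hypothetical and exist here purely so the logic can be exercised in a scratch directory.

```shell
# Hypothetical helper (not in the patch): print the step number that
# step 1 of the review workflow would jump to, based on which files
# exist in directory $1 (defaults to "po").
review_resume_step () {
	dir=${1:-po}

	if ! test -f "$dir/review-input.po"
	then
		echo 2	# fresh start: extract entries
	elif test -f "$dir/review-result.json"
	then
		echo 8	# review already finished: show the report
	elif test -f "$dir/review-done.json"
	then
		echo 6	# batch reviewed but not yet renamed
	elif test -f "$dir/review-todo.json"
	then
		echo 5	# batch prepared but not yet reviewed
	else
		echo 3	# prepare the next batch
	fi
}

# Demonstration in a scratch directory, leaving the real po/ untouched.
tmp=$(mktemp -d)
review_resume_step "$tmp"
touch "$tmp/review-input.po" "$tmp/review-todo.json"
review_resume_step "$tmp"
rm -rf "$tmp"
```

The demonstration prints 2 for the empty directory and 5 once an input file and a prepared batch are present. The order of the tests matters: a finished result takes precedence over a leftover `review-done.json` or `review-todo.json`.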
* [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements 2026-02-06 2:38 ` Jiang Xin 2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin 2026-03-14 14:38 ` [PATCH v3 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin @ 2026-03-16 23:54 ` Jiang Xin 2026-03-16 23:54 ` [PATCH v4 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin ` (4 more replies) 2 siblings, 5 replies; 42+ messages in thread From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw) To: Junio C Hamano, Johannes Sixt, Git List Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow, Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev, Peter Krefting, Emir SARI, Arkadii Yakovets, Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan ## range-diff v3...v4 1: 0c00f09918 ! 1: cb99047e24 l10n: add .gitattributes to simplify location filtering @@ Commit message l10n: add .gitattributes to simplify location filtering To simplify the location filtering process for l10n contributors when - committing po/XX.po files, add the filter attributes for selected PO + committing po/XX.po files, add filter attributes for selected PO files to the repository. This ensures all contributors automatically get the same filter configuration without manual setup in .git/info/attributes. - The filter attribute is only applied to specific PO files that have - been properly prepared. Files without the filter attribute fall into - two categories: + The default filter (gettext-no-location) is applied to all .po files + except: - - Legacy files that lack maintenance and still contain location - comments that have not been cleaned up - - Files that are already location-less but whose formatting (e.g., line - wrapping style) differs from the output of msgcat processing + - Legacy, unmaintained PO files that still contain location comments. + Leaving the filter off avoids index vs working-tree discrepancies for + these files. 
The CI pipeline will report an error when future updates + touch these legacy files. + - Some PO files use a different filter that strips only line numbers + from location comments while keeping filenames. - To avoid discrepancies between the filtered blob in the index and the - unfiltered working tree for these files, the filter attribute is not - applied to them. - - Contributors still need to manually define the filter drivers using + Contributors still need to manually define the filter drivers via git-config as documented in po/README.md. - Additionally, po/README.md has been reorganized: the content of handling - location-less PO file content has been moved from the "Updating a XX.po - file" section to a separate "Preparing a XX.po file for commit" section. - This prevents AI agents from introducing unrelated operations when - updating PO files. + Four PO files that use location filtering (po/ca.po, po/es.po, po/ga.po, + po/ru.po) were batch-modified so their on-disk format matches the filter + output (e.g. line wrapping), avoiding index vs working-tree mismatch. + + Additionally, po/README.md has been reorganized: the material on + preparing location-less PO files for commit has been moved from + "Updating a XX.po file" to a separate "Preparing a XX.po file for + commit" section. This prevents AI agents from introducing unrelated + operations when updating PO files. Signed-off-by: Jiang Xin <worldhello.net@gmail.com> @@ po/.gitattributes (new) +# space. +# +# Two filter types are used: -+# 1. gettext-no-file-no-location: Strips both filenames and line numbers ++# 1. gettext-no-location: Strips both filenames and line numbers +# (e.g., removes "#: main.c:123" entirely) -+# 2. gettext-no-location: Preserves filenames but removes line numbers, which ++# 2. 
gettext-no-line-number: Preserves filenames but removes line numbers, which +# requires gettext 0.20 or higher +# (e.g., "#: main.c:123" becomes "#: main.c") +# +# See `po/README.md` for instructions on setting up the required filter drivers. + -+# Do not apply these filters to all `*.po` files via a wildcard. For legacy, -+# unmaintained PO files, that would cause discrepancies between the filtered -+# blob in the index and the unfiltered file in the working tree. ++# Default: Strip the whole location comments for all .po files ++*.po filter=gettext-no-location + -+# Languages that strip both filenames and line numbers -+bg.po filter=gettext-no-file-no-location -+de.po filter=gettext-no-file-no-location -+#es.po filter=gettext-no-file-no-location -+fr.po filter=gettext-no-file-no-location -+#ga.po filter=gettext-no-file-no-location -+#ru.po filter=gettext-no-file-no-location -+sv.po filter=gettext-no-file-no-location -+tr.po filter=gettext-no-file-no-location -+uk.po filter=gettext-no-file-no-location -+vi.po filter=gettext-no-file-no-location ++# Legacy, unmaintained PO files: filter disabled to avoid index vs ++# working-tree mismatch (these files still have location comments). ++el.po -filter ++is.po -filter ++it.po -filter ++ko.po -filter ++pl.po -filter ++pt_PT.po -filter + -+# Languages that preserve filenames but strip line numbers -+#ca.po filter=gettext-no-location -+id.po filter=gettext-no-location -+zh_CN.po filter=gettext-no-location -+zh_TW.po filter=gettext-no-location ++# These files use gettext-no-line-number (keep filenames, strip line ++# numbers). The choice is per l10n team preference. Requires gettext 0.20+. ++# The only benefit is locating source files from location comments when ++# the .po file is not updated from the POT via make po-update. 
++ca.po filter=gettext-no-line-number ++id.po filter=gettext-no-line-number ++zh_CN.po filter=gettext-no-line-number ++zh_TW.po filter=gettext-no-line-number ## po/README.md ## @@ po/README.md: It will: @@ po/README.md: git-po-helper check-commits <rev-list-opts> +The filter configuration is defined in the "po/.gitattributes" file. + +Then define the driver for the filter. Most languages use the -+"gettext-no-file-no-location" clean filter, which strips out both filenames and -+line numbers from location comments. To set this up, run the following command: ++"gettext-no-location" clean filter, which strips out both filenames and line ++numbers from location comments. To set this up, run the following command: + +```shell -+git config --global filter.gettext-no-file-no-location.clean \ ++git config --global filter.gettext-no-location.clean \ + "msgcat --no-location -" +``` + -+Some languages use the "gettext-no-location" clean filter, which preserves -+filenames but not line numbers. For these, install gettext version 0.20 or -+higher and set up the driver as follows: ++Some PO files use the "gettext-no-line-number" clean filter, which keeps ++filenames but strips line numbers. This filter requires gettext 0.20 or ++later. The only benefit is being able to locate source files from location ++comments when the .po file is not updated from the POT via `make po-update`. 
     +
     +```shell
    -+git config --global filter.gettext-no-location.clean \
    ++git config --global filter.gettext-no-line-number.clean \
     +    "msgcat --add-location=file -"
     +```
     +
2:  573c24e798 = 2:  e1258eac7d docs(l10n): add AGENTS.md with optimized update-pot instructions
3:  bc00ca2d30 = 3:  88f9e2e2cd docs(l10n): add AI agent instructions for updating po/XX.po files
4:  6c61a8ca25 = 4:  5034063c2b docs(l10n): add AI agent instructions for translating PO files
5:  208c1230d1 = 5:  9388b8e9f4 docs(l10n): add AI agent instructions to review translations

## Introduction

This series introduces AI agent instructions for Git localization (l10n)
workflows to help localization contributors quickly complete drafts and use
AI to check translation quality. The changes focus on:

1. Separating agent-specific documentation into po/AGENTS.md for targeted
   optimization of AI-assisted workflows
2. Providing step-by-step instructions for update-pot, update-po,
   translation, and review tasks
3. Simplifying location filtering for PO file commits via .gitattributes

AI-assisted translation is optional; many successful l10n teams work well
without it. When used, AI output serves as reference only; human contributors
must review and approve before submission.

## Performance summary

Benchmarks use the Qwen model via git-po-helper. The improvements reduce API
costs and make agent workflows more efficient while maintaining human
oversight of translation quality.

| Task        | Before              | After                    | Improvement                           |
|-------------|---------------------|--------------------------|---------------------------------------|
| update-pot  | 17 turns, 34s       | 3 turns, 8s (range 3–3)  | -82% turns, -76% time                 |
| update-po   | 22 turns, 38s       | 4 turns, 9s (3–9, 7–14s) | -82% turns, -76% time                 |
| translate   | 86 turns, ~21m      | 56 turns, ~19m           | -35% turns (git-po-helper JSON batch) |
| review      | N/A                 | 22 turns (63 entries)    |                                       |

## Testing

All changes have been evaluated with the qwen model via git-po-helper
agent-test and agent-run. The po/AGENTS.md instructions are designed to work
with coding tools that support file references (e.g., "Translate
po/zh_CN.po by referring to @po/AGENTS.md").

## Changes

Jiang Xin (5):
  l10n: add .gitattributes to simplify location filtering
  docs(l10n): add AGENTS.md with optimized update-pot instructions
  docs(l10n): add AI agent instructions for updating po/XX.po files
  docs(l10n): add AI agent instructions for translating PO files
  docs(l10n): add AI agent instructions to review translations

 po/.gitattributes |  35 ++
 po/AGENTS.md      | 872 ++++++++++++++++++++++++++++++++++++++++++++++
 po/README.md      |  71 ++--
 po/ca.po          |  48 +--
 po/es.po          |  34 +-
 po/ga.po          |  64 ++--
 po/ru.po          |  28 +-
 7 files changed, 1033 insertions(+), 119 deletions(-)
 create mode 100644 po/.gitattributes
 create mode 100644 po/AGENTS.md

--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v4 1/5] l10n: add .gitattributes to simplify location filtering
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
@ 2026-03-16 23:54 ` Jiang Xin
2026-03-16 23:54 ` [PATCH v4 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
` (3 subsequent siblings)
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw)
To: Junio C Hamano, Johannes Sixt, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

To simplify the location filtering process for l10n contributors when
committing po/XX.po files, add filter attributes for selected PO
files to the repository. This ensures all contributors automatically get
the same filter configuration without manual setup in
.git/info/attributes.

The default filter (gettext-no-location) is applied to all .po files
except:

- Legacy, unmaintained PO files that still contain location comments.
  Leaving the filter off avoids index vs working-tree discrepancies for
  these files. The CI pipeline will report an error when future updates
  touch these legacy files.
- Some PO files use a different filter that strips only line numbers
  from location comments while keeping filenames.

Contributors still need to manually define the filter drivers via
git-config as documented in po/README.md.

Four PO files that use location filtering (po/ca.po, po/es.po, po/ga.po,
po/ru.po) were batch-modified so their on-disk format matches the filter
output (e.g. line wrapping), avoiding index vs working-tree mismatch.

Additionally, po/README.md has been reorganized: the material on
preparing location-less PO files for commit has been moved from
"Updating a XX.po file" to a separate "Preparing a XX.po file for
commit" section. This prevents AI agents from introducing unrelated
operations when updating PO files.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/.gitattributes | 35 +++++++++++++++++++++++
 po/README.md      | 71 ++++++++++++++++++++++++++---------------------
 po/ca.po          | 48 ++++++++++++++++----------------
 po/es.po          | 34 +++++++++++------------
 po/ga.po          | 64 +++++++++++++++++++++---------------------
 po/ru.po          | 28 +++++++++----------
 6 files changed, 161 insertions(+), 119 deletions(-)
 create mode 100644 po/.gitattributes

diff --git a/po/.gitattributes b/po/.gitattributes
new file mode 100644
index 0000000000..284af6bcc0
--- /dev/null
+++ b/po/.gitattributes
@@ -0,0 +1,35 @@
+# Git Attributes for PO Files
+#
+# This file configures Git filters to automatically strip location information
+# from PO files when committing, producing cleaner diffs and saving repository
+# space.
+#
+# Two filter types are used:
+# 1. gettext-no-location: Strips both filenames and line numbers
+#    (e.g., removes "#: main.c:123" entirely)
+# 2. gettext-no-line-number: Preserves filenames but removes line numbers, which
+#    requires gettext 0.20 or higher
+#    (e.g., "#: main.c:123" becomes "#: main.c")
+#
+# See `po/README.md` for instructions on setting up the required filter drivers.
+
+# Default: Strip the whole location comments for all .po files
+*.po filter=gettext-no-location
+
+# Legacy, unmaintained PO files: filter disabled to avoid index vs
+# working-tree mismatch (these files still have location comments).
+el.po -filter
+is.po -filter
+it.po -filter
+ko.po -filter
+pl.po -filter
+pt_PT.po -filter
+
+# These files use gettext-no-line-number (keep filenames, strip line
+# numbers). The choice is per l10n team preference. Requires gettext 0.20+.
+# The only benefit is locating source files from location comments when
+# the .po file is not updated from the POT via make po-update.
+ca.po filter=gettext-no-line-number
+id.po filter=gettext-no-line-number
+zh_CN.po filter=gettext-no-line-number
+zh_TW.po filter=gettext-no-line-number
diff --git a/po/README.md b/po/README.md
index ec08aa24ad..ca39160077 100644
--- a/po/README.md
+++ b/po/README.md
@@ -159,38 +159,6 @@ It will:
    and these location lines will help translation tools to locate
    translation context easily.
 
-Once you are done testing the translation (see below), it's better
-to commit a location-less "po/XX.po" file to save repository space
-and make a user-friendly patch for review.
-
-To save a location-less "po/XX.po" automatically in repository, you
-can:
-
-First define a new attribute for "po/XX.po" by appending the following
-line in ".git/info/attributes":
-
-```
-/po/XX.po filter=gettext-no-location
-```
-
-Then define the driver for the "gettext-no-location" clean filter to
-strip out both filenames and locations from the contents as follows:
-
-```shell
-git config --global filter.gettext-no-location.clean \
-    "msgcat --no-location -"
-```
-
-For users who have gettext version 0.20 or higher, it is also possible
-to define a clean filter to preserve filenames but not locations:
-
-```shell
-git config --global filter.gettext-no-location.clean \
-    "msgcat --add-location=file -"
-```
-
-You're now ready to ask the l10n coordinator to pull from you.
-
 
 ## Fuzzy translation
 
@@ -229,6 +197,45 @@ git-po-helper check-commits <rev-list-opts>
 ```
 
 
+## Preparing a "XX.po" file for commit
+
+Once you are done testing the translation, it's better to commit a
+location-less "po/XX.po" file to save repository space and make a
+user-friendly patch for review.
+
+To save a location-less "po/XX.po" automatically in the repository,
+follow these steps:
+
+First, check which filter is configured for your "po/XX.po" file:
+
+```
+git check-attr filter po/XX.po
+```
+
+The filter configuration is defined in the "po/.gitattributes" file.
+
+Then define the driver for the filter.
Most languages use the +"gettext-no-location" clean filter, which strips out both filenames and line +numbers from location comments. To set this up, run the following command: + +```shell +git config --global filter.gettext-no-location.clean \ + "msgcat --no-location -" +``` + +Some PO files use the "gettext-no-line-number" clean filter, which keeps +filenames but strips line numbers. This filter requires gettext 0.20 or +later. The only benefit is being able to locate source files from location +comments when the .po file is not updated from the POT via `make po-update`. + +```shell +git config --global filter.gettext-no-line-number.clean \ + "msgcat --add-location=file -" +``` + +You're now ready to ask the l10n coordinator to pull from you. + + ## Marking strings for translation (This is done by the core developers). diff --git a/po/ca.po b/po/ca.po index 8dc21f36ed..742109c65f 100644 --- a/po/ca.po +++ b/po/ca.po @@ -512,8 +512,8 @@ msgstr "" #, c-format msgid "Discard mode change from index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Descarta el canvi de mode de l'índex i de l'arbre de treball [y,n,q,a,d" -"%s,?]? " +"Descarta el canvi de mode de l'índex i de l'arbre de treball " +"[y,n,q,a,d%s,?]? " #: add-patch.c #, c-format @@ -3879,11 +3879,11 @@ msgstr "HEAD no trobat sota refs/heads!" #: builtin/branch.c msgid "" -"branch with --recurse-submodules can only be used if submodule." -"propagateBranches is enabled" +"branch with --recurse-submodules can only be used if " +"submodule.propagateBranches is enabled" msgstr "" -"la branca amb --recurse-submodules només es pot utilitzar si submodule." 
-"propagateBranches està habilitat" +"la branca amb --recurse-submodules només es pot utilitzar si " +"submodule.propagateBranches està habilitat" #: builtin/branch.c msgid "--recurse-submodules can only be used to create branches" @@ -7983,11 +7983,11 @@ msgstr "el protocol no admet --negotiate-only, se surt" #: builtin/fetch.c msgid "" -"--filter can only be used with the remote configured in extensions." -"partialclone" +"--filter can only be used with the remote configured in " +"extensions.partialclone" msgstr "" -"--filter només es pot utilitzar amb el remot configurat en extensions." -"partialclone" +"--filter només es pot utilitzar amb el remot configurat en " +"extensions.partialclone" #: builtin/fetch.c msgid "--atomic can only be used when fetching from one remote" @@ -11436,8 +11436,8 @@ msgid "" "| -C) <object>] [<object>] [-e]" msgstr "" "git notes [--ref <referència-notes>] add [-f] [--allow-empty] [--" -"[no-]separator|--separator=<salt-paràgraf>] [--[no-]stripspace] [-m <msg> | -" -"F <fitxer> | (-c | -C) <objecte>] [<objecte>]" +"[no-]separator|--separator=<salt-paràgraf>] [--[no-]stripspace] [-m <msg> | " +"-F <fitxer> | (-c | -C) <objecte>] [<objecte>]" #: builtin/notes.c msgid "git notes [--ref <notes-ref>] copy [-f] <from-object> <to-object>" @@ -11451,8 +11451,8 @@ msgid "" "| -C) <object>] [<object>] [-e]" msgstr "" "git notes [--ref <referència-notes>] append [--allow-empty] [--" -"[no-]separator|--separator=<salt-paràgraf>] [--[no-]stripspace] [-m <msg> | -" -"F <fitxer> | (-c | -C) <objecte>] [<objecte>] [-e]" +"[no-]separator|--separator=<salt-paràgraf>] [--[no-]stripspace] [-m <msg> | " +"-F <fitxer> | (-c | -C) <objecte>] [<objecte>] [-e]" #: builtin/notes.c msgid "git notes [--ref <notes-ref>] edit [--allow-empty] [<object>]" @@ -13231,8 +13231,8 @@ msgstr "--empty=ask és obslolet; utilitzeu '--empty=stop' en el seu lloc." 
#: builtin/rebase.c #, c-format msgid "" -"unrecognized empty type '%s'; valid values are \"drop\", \"keep\", and \"stop" -"\"." +"unrecognized empty type '%s'; valid values are \"drop\", \"keep\", and " +"\"stop\"." msgstr "" "tipus buit «%s» no reconegut; els valors vàlids són \"drop\", \"keep\" i " "\"stop\"." @@ -14440,8 +14440,8 @@ msgid "" msgstr "" "Els reempaquetaments incrementals són incompatibles amb els índexs de mapes " "de bits. Useu\n" -"--no-write-bitmap-index o inhabiliteu el paràmetre de configuració pack." -"writeBitmaps." +"--no-write-bitmap-index o inhabiliteu el paràmetre de configuració " +"pack.writeBitmaps." #: builtin/repack.c msgid "could not start pack-objects to repack promisor objects" @@ -19350,11 +19350,11 @@ msgstr "" #: commit-graph.c #, c-format msgid "" -"attempting to write a commit-graph, but 'commitGraph." -"changedPathsVersion' (%d) is not supported" +"attempting to write a commit-graph, but 'commitGraph.changedPathsVersion' " +"(%d) is not supported" msgstr "" -"s'ha intentat escriure un graf de comissió, però no s'admet 'commitGraph." -"changedPathsVersion' (%d)" +"s'ha intentat escriure un graf de comissió, però no s'admet " +"'commitGraph.changedPathsVersion' (%d)" #: commit-graph.c msgid "too many commits to write graph" @@ -22940,8 +22940,8 @@ msgstr "el fitxer de l'índex multipaquet %s és massa petit" #, c-format msgid "multi-pack-index signature 0x%08x does not match signature 0x%08x" msgstr "" -"la signatura de l'índex multipaquet 0x%08x no coincideix amb la signatura 0x" -"%08x" +"la signatura de l'índex multipaquet 0x%08x no coincideix amb la signatura " +"0x%08x" #: midx.c #, c-format diff --git a/po/es.po b/po/es.po index 1ff5ff3911..aa1bb9bf90 100644 --- a/po/es.po +++ b/po/es.po @@ -391,8 +391,8 @@ msgstr "" #, c-format, perl-format msgid "Apply mode change to index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"¿Aplicar cambio de modo para el índice y el árbol de trabajo [y,n,q,a," -"d%s,?]? 
" +"¿Aplicar cambio de modo para el índice y el árbol de trabajo " +"[y,n,q,a,d%s,?]? " #, c-format, perl-format msgid "Apply deletion to index and worktree [y,n,q,a,d%s,?]? " @@ -2294,9 +2294,9 @@ msgid "" "=<term>] [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] " "[<paths>...]" msgstr "" -"git bisect--helper --bisect-start [--term-{new,bad}=<término> --term-{old," -"good}=<término>] [--no-checkout] [--first-parent] [<malo> [<bueno>...]] [--] " -"[<rutas>...]" +"git bisect--helper --bisect-start [--term-{new,bad}=<término> --term-" +"{old,good}=<término>] [--no-checkout] [--first-parent] [<malo> [<bueno>...]] " +"[--] [<rutas>...]" msgid "git bisect--helper --bisect-state (bad|new) [<rev>]" msgstr "git bisect--helper --bisect-state (bad|new) [<rev>]" @@ -2983,11 +2983,11 @@ msgid "HEAD not found below refs/heads!" msgstr "¡HEAD no encontrado dentro de refs/heads!" msgid "" -"branch with --recurse-submodules can only be used if submodule." -"propagateBranches is enabled" +"branch with --recurse-submodules can only be used if " +"submodule.propagateBranches is enabled" msgstr "" -"branch con --recurse-submodules solo se puede usar si submodule." -"propagateBranches está habilitado" +"branch con --recurse-submodules solo se puede usar si " +"submodule.propagateBranches está habilitado" msgid "--recurse-submodules can only be used to create branches" msgstr "--recurse-submodules solo se puede usar para crear ramas" @@ -5983,11 +5983,11 @@ msgid "protocol does not support --negotiate-only, exiting" msgstr "el protocolo no soporta --negotiate-only, saliendo" msgid "" -"--filter can only be used with the remote configured in extensions." -"partialclone" +"--filter can only be used with the remote configured in " +"extensions.partialclone" msgstr "" -"--filter solo puede ser usado con el remoto configurado en extensions." 
-"partialclone" +"--filter solo puede ser usado con el remoto configurado en " +"extensions.partialclone" msgid "--atomic can only be used when fetching from one remote" msgstr "--atomic solo se puede usar cuando se busca desde un control remoto" @@ -8914,8 +8914,8 @@ msgstr "objeto esperado en el desplazamiento %<PRIuMAX> en el paquete %s" msgid "disabling bitmap writing, packs are split due to pack.packSizeLimit" msgstr "" -"deshabilitando escritura bitmap, paquetes son divididos debido a pack." -"packSizeLimit" +"deshabilitando escritura bitmap, paquetes son divididos debido a " +"pack.packSizeLimit" msgid "Writing objects" msgstr "Escribiendo objetos" @@ -9489,8 +9489,8 @@ msgstr "" msgid "" "\n" "To avoid automatically configuring upstream branches when their name\n" -"doesn't match the local branch, see option 'simple' of branch." -"autoSetupMerge\n" +"doesn't match the local branch, see option 'simple' of " +"branch.autoSetupMerge\n" "in 'git help config'.\n" msgstr "" "\n" diff --git a/po/ga.po b/po/ga.po index 4c05a2511e..2d8065b420 100644 --- a/po/ga.po +++ b/po/ga.po @@ -382,8 +382,8 @@ msgstr "Caitheamh breisiú ó innéacs agus crann oibre [y, n, q, a, d %s,?]? " #, c-format msgid "Discard this hunk from index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"An bhfuil an píosa beag seo le fáil réidh ón innéacs agus ón gcrann oibre [y," -"n,q,a,d%s,?]? " +"An bhfuil an píosa beag seo le fáil réidh ón innéacs agus ón gcrann oibre " +"[y,n,q,a,d%s,?]? " msgid "" "y - discard this hunk from index and worktree\n" @@ -402,8 +402,8 @@ msgstr "" #, c-format msgid "Apply mode change to index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Cuir athrú mód i bhfeidhm ar an innéacs agus ar an gcrann oibre [y,n,q,a," -"d%s,?]? " +"Cuir athrú mód i bhfeidhm ar an innéacs agus ar an gcrann oibre " +"[y,n,q,a,d%s,?]? " #, c-format msgid "Apply deletion to index and worktree [y,n,q,a,d%s,?]? 
" @@ -413,14 +413,14 @@ msgstr "" #, c-format msgid "Apply addition to index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Cuir an breiseán i bhfeidhm ar an innéacs agus ar an gcrann oibre [y,n,q,a," -"d%s,?]? " +"Cuir an breiseán i bhfeidhm ar an innéacs agus ar an gcrann oibre " +"[y,n,q,a,d%s,?]? " #, c-format msgid "Apply this hunk to index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Cuir an píosa seo i bhfeidhm ar an innéacs agus ar an gcrann oibre [y,n,q,a," -"d%s,?]? " +"Cuir an píosa seo i bhfeidhm ar an innéacs agus ar an gcrann oibre " +"[y,n,q,a,d%s,?]? " msgid "" "y - apply this hunk to index and worktree\n" @@ -3114,11 +3114,11 @@ msgid "HEAD not found below refs/heads!" msgstr "Ní fhaightear CEAD thíos na refs/heads!" msgid "" -"branch with --recurse-submodules can only be used if submodule." -"propagateBranches is enabled" +"branch with --recurse-submodules can only be used if " +"submodule.propagateBranches is enabled" msgstr "" -"ní féidir brainse le --recurse-submodules a úsáid ach amháin má tá submodule." 
-"propagateBranches cumasaithe" +"ní féidir brainse le --recurse-submodules a úsáid ach amháin má tá " +"submodule.propagateBranches cumasaithe" msgid "--recurse-submodules can only be used to create branches" msgstr "Ní féidir --recurse-submodules a úsáid ach chun brainsí a chruthú" @@ -6034,8 +6034,8 @@ msgid "" "'strip-if-invalid' is not a valid mode for git fast-export with --signed-" "commits=<mode>" msgstr "" -"'strip-if-invalid' ní mód bailí é seo le haghaidh easpórtáil le haghaidh git fast-export le --signed-" -"commits=<mód>" +"'strip-if-invalid' ní mód bailí é seo le haghaidh easpórtáil le haghaidh git " +"fast-export le --signed-commits=<mód>" #, c-format msgid "" @@ -6067,8 +6067,8 @@ msgid "" "'strip-if-invalid' is not a valid mode for git fast-export with --signed-" "tags=<mode>" msgstr "" -"'strip-if-invalid' ní mód bailí é seo le haghaidh git fast-export le --signed-" -"tags=<mode>" +"'strip-if-invalid' ní mód bailí é seo le haghaidh git fast-export le --" +"signed-tags=<mode>" #, c-format msgid "" @@ -7048,8 +7048,8 @@ msgid "protocol does not support --negotiate-only, exiting" msgstr "ní thacaíonn an prótacal le --negotiate-only, ag scoir" msgid "" -"--filter can only be used with the remote configured in extensions." -"partialclone" +"--filter can only be used with the remote configured in " +"extensions.partialclone" msgstr "" "--filter Ní féidir ach an scagaire a úsáid ach leis an iargúlta cumraithe in " "extensions.partialclone" @@ -7584,8 +7584,8 @@ msgstr "Theip ar 'git multi-pack-index repack'" msgid "" "skipping incremental-repack task because core.multiPackIndex is disabled" msgstr "" -"ag scipeáil an tasc athphacála incriminteach mar go bhfuil core." 
-"multiPackIndex díchumasaithe" +"ag scipeáil an tasc athphacála incriminteach mar go bhfuil " +"core.multiPackIndex díchumasaithe" msgid "failed to perform geometric repack" msgstr "theip ar athphacáil gheoiméadrach a dhéanamh" @@ -10097,8 +10097,8 @@ msgstr "" msgid "disabling bitmap writing, packs are split due to pack.packSizeLimit" msgstr "" -"scríobh bitmap a dhíchumasú, roinntear pacáistí mar gheall ar pack." -"packSizeLimit" +"scríobh bitmap a dhíchumasú, roinntear pacáistí mar gheall ar " +"pack.packSizeLimit" msgid "Writing objects" msgstr "Rudaí a scríobh" @@ -11904,8 +11904,8 @@ msgid "" msgstr "" "Tá tagairtí contrártha sa\n" "tagarmharc sprice nua ag an gcianrialtán atá tú ag iarraidh a athainmniú. Is " -"dóichí gur mar gheall ar iarracht a dhéanamh cianrialtán a neadú ann féin, e." -"g. trí 'tuismitheoir' a athainmniú go 'tuismitheoir/leanbh'\n" +"dóichí gur mar gheall ar iarracht a dhéanamh cianrialtán a neadú ann féin, " +"e.g. trí 'tuismitheoir' a athainmniú go 'tuismitheoir/leanbh'\n" "nó trí chianrialtán a dhí-neadú, e.g. an bealach eile.\n" "\n" "Más amhlaidh atá, is féidir leat é seo a réiteach tríd an\n" @@ -16114,11 +16114,11 @@ msgstr "" #, c-format msgid "" -"attempting to write a commit-graph, but 'commitGraph." -"changedPathsVersion' (%d) is not supported" +"attempting to write a commit-graph, but 'commitGraph.changedPathsVersion' " +"(%d) is not supported" msgstr "" -"ag iarraidh commit-graph a scríobh, ach tá 'commitGraph." -"changedPathsVersion' (%d) ní thacaítear leis" +"ag iarraidh commit-graph a scríobh, ach tá 'commitGraph.changedPathsVersion' " +"(%d) ní thacaítear leis" msgid "too many commits to write graph" msgstr "an iomarca gealltanais graf a scríobh" @@ -18266,8 +18266,8 @@ msgid "" "given pattern contains NULL byte (via -f <file>). This is only supported " "with -P under PCRE v2" msgstr "" -"tá byte NULL (trí -f<file>) i bpatrún tugtha. 
Ní thacaítear leis seo ach le -" -"P faoi PCRE v2" +"tá byte NULL (trí -f<file>) i bpatrún tugtha. Ní thacaítear leis seo ach le " +"-P faoi PCRE v2" #, c-format msgid "'%s': unable to read %s" @@ -18433,8 +18433,8 @@ msgid "" msgstr "" "Rinneadh neamhaird ar an gcroca '%s' toisc nach bhfuil sé socraithe mar " "infheidhmithe.\n" -"Is féidir leat an rabhadh seo a dhíchumasú le `git config set advice." -"ignoredHook false `." +"Is féidir leat an rabhadh seo a dhíchumasú le `git config set " +"advice.ignoredHook false `." msgid "not a git repository" msgstr "ní stór git é" diff --git a/po/ru.po b/po/ru.po index 3e56eb546e..e8845ca2c0 100644 --- a/po/ru.po +++ b/po/ru.po @@ -369,8 +369,8 @@ msgstr "" #, c-format msgid "Discard mode change from index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Отменить изменения режима доступа в индексе и рабочем каталоге [y,n,q,a," -"d%s,?]? " +"Отменить изменения режима доступа в индексе и рабочем каталоге " +"[y,n,q,a,d%s,?]? " #, c-format msgid "Discard deletion from index and worktree [y,n,q,a,d%s,?]? " @@ -400,8 +400,8 @@ msgstr "" #, c-format msgid "Apply mode change to index and worktree [y,n,q,a,d%s,?]? " msgstr "" -"Применить изменения режима доступа к индексу и рабочему каталогу [y,n,q,a," -"d%s,?]? " +"Применить изменения режима доступа к индексу и рабочему каталогу " +"[y,n,q,a,d%s,?]? " #, c-format msgid "Apply deletion to index and worktree [y,n,q,a,d%s,?]? " @@ -2966,8 +2966,8 @@ msgid "HEAD not found below refs/heads!" msgstr "HEAD не найден в refs/heads!" msgid "" -"branch with --recurse-submodules can only be used if submodule." 
-"propagateBranches is enabled" +"branch with --recurse-submodules can only be used if " +"submodule.propagateBranches is enabled" msgstr "" msgid "--recurse-submodules can only be used to create branches" @@ -3997,8 +3997,8 @@ msgid "" "clean.requireForce defaults to true and neither -i, -n, nor -f given; " "refusing to clean" msgstr "" -"clean.requireForce установлен по умолчанию как true и ни одна из опций -i, -" -"n или -f не указана; отказ очистки" +"clean.requireForce установлен по умолчанию как true и ни одна из опций -i, " +"-n или -f не указана; отказ очистки" msgid "-x and -X cannot be used together" msgstr "нельзя использовать одновременно -x и -X" @@ -5890,8 +5890,8 @@ msgid "protocol does not support --negotiate-only, exiting" msgstr "" msgid "" -"--filter can only be used with the remote configured in extensions." -"partialclone" +"--filter can only be used with the remote configured in " +"extensions.partialclone" msgstr "" msgid "--atomic can only be used when fetching from one remote" @@ -8385,8 +8385,8 @@ msgstr "Каталог %s в индексе и не является подмо msgid "Please stage your changes to .gitmodules or stash them to proceed" msgstr "" -"Чтобы продолжить, проиндексируйте или спрячьте ваши изменения в файле ." -"gitmodules" +"Чтобы продолжить, проиндексируйте или спрячьте ваши изменения в " +"файле .gitmodules" #, c-format msgid "%.*s is in index" @@ -16134,8 +16134,8 @@ msgid "" msgstr "" "Перехватчик «%s» был проигнорирован, так как он не установлен как " "исполняемый.\n" -"Вы можете отключить это предупреждение с помощью команды «git config advice." -"ignoredHook false»." +"Вы можете отключить это предупреждение с помощью команды «git config " +"advice.ignoredHook false»." #, c-format msgid "argument to --packfile must be a valid hash (got '%s')" -- 2.53.0.rc2.20.g532543fa46 ^ permalink raw reply related [flat|nested] 42+ messages in thread
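To make the difference between the two clean filters in this patch concrete, here is a rough Python stand-in for their effect on `#:` location comments. This is an illustration only, not how the filters run: the real drivers call `msgcat --no-location -` and `msgcat --add-location=file -`, which also normalize line wrapping, something this sketch does not attempt. The function names are hypothetical, and the sketch assumes file names contain no spaces.

```python
import re


def strip_locations(po_text):
    """Drop "#:" location comments entirely (analogue of gettext-no-location)."""
    return "".join(
        line
        for line in po_text.splitlines(keepends=True)
        if not line.startswith("#:")
    )


def strip_line_numbers(po_text):
    """Keep file names in "#:" comments but drop the ":<line>" suffixes
    (analogue of gettext-no-line-number)."""
    out = []
    for line in po_text.splitlines(keepends=True):
        if line.startswith("#:"):
            files = []
            for ref in line[2:].split():
                name = re.sub(r":\d+$", "", ref)
                # Several references to one file collapse to a single name.
                if name not in files:
                    files.append(name)
            line = "#: " + " ".join(files) + "\n"
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = '#: main.c:123\nmsgid "commit"\nmsgstr "提交"\n'
    print(strip_locations(sample), end="")
    print(strip_line_numbers(sample), end="")
```

With the sample entry, the first function removes the `#: main.c:123` line and the second rewrites it to `#: main.c`, matching the examples given in the comments of po/.gitattributes.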
* [PATCH v4 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-16 23:54 ` [PATCH v4 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
@ 2026-03-16 23:54 ` Jiang Xin
2026-03-16 23:54 ` [PATCH v4 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
` (2 subsequent siblings)
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw)
To: Junio C Hamano, Johannes Sixt, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new documentation file po/AGENTS.md that provides agent-specific
instructions for generating or updating po/git.pot, separating them from
the general po/README.md. This separation allows for more targeted
optimization of AI agent workflows.

Performance evaluation with the Qwen model:

    # Before: No agent-specific instructions; use po/README.md for
    # reference.
    git-po-helper agent-test --runs=5 --agent=qwen update-pot \
        --prompt="Update po/git.pot according to po/README.md"

    # Phase 1: add the instructions to po/README.md; the prompt
    # references po/README.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot \
        --prompt="Update po/git.pot according to po/README.md"

    # Phase 2: add the instructions to po/AGENTS.md; use the built-in
    # prompt that references po/AGENTS.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-pot

Benchmark results (5-run average):

Phase 1 - Optimizing po/README.md:

| Metric      | Before  | Phase 1 | Improvement |
|-------------|---------|---------|-------------|
| Turns       | 17      | 5       | -71%        |
| Exec. time  | 34s     | 14s     | -59%        |
| Turn range  | 3-36    | 3-7     |             |
| Time range  | 10s-59s | 9s-19s  |             |

Phase 2 - Adding po/AGENTS.md (further optimization):

| Metric      | Before  | Phase 2 | Improvement |
|-------------|---------|---------|-------------|
| Turns       | 17      | 3       | -82%        |
| Exec. time  | 34s     | 8s      | -76%        |
| Turn range  | 3-36    | 3-3     |             |
| Time range  | 10s-59s | 6s-9s   |             |

Separating agent-specific instructions into AGENTS.md provides:

- More focused and concise instructions for AI agents
- Cleaner README.md for human readers
- An additional 11-point reduction in turns and 17-point reduction in
  execution time, measured against the baseline (82% vs. 71% and
  76% vs. 59%)
- More consistent behavior (turn range reduced from 3-7 to 3-3)

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 po/AGENTS.md

diff --git a/po/AGENTS.md b/po/AGENTS.md
new file mode 100644
index 0000000000..94b7aa7f28
--- /dev/null
+++ b/po/AGENTS.md
@@ -0,0 +1,70 @@
+# Instructions for AI Agents
+
+This file gives specific instructions for AI agents that perform
+housekeeping tasks for Git l10n. Use of AI is optional; many successful
+l10n teams work well without it.
+
+The section "Housekeeping tasks for localization workflows" documents the
+most commonly used housekeeping tasks.
+
+
+## Background knowledge for localization workflows
+
+Essential background for the workflows below; understand these concepts before
+performing any housekeeping tasks in this document.
+
+### Language code and notation (XX, ll, ll\_CC)
+
+**XX** is a placeholder for the language code: either `ll` (ISO 639) or
+`ll_CC` (e.g. `de`, `zh_CN`). It appears in the PO file header metadata
+(e.g. `"Language: zh_CN\n"`) and is typically used to name the PO file:
+`po/XX.po`.
+
+
+### Header Entry
+
+The **header entry** is the first entry in every `po/XX.po`. It has an empty
+`msgid`; translation metadata (project, language, plural rules, encoding, etc.)
+is stored in `msgstr`, as in this example:
+
+```po
+msgid ""
+msgstr ""
+"Project-Id-Version: Git\n"
+"Language: zh_CN\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=2; plural=(n != 1);\n"
+```
+
+**CRITICAL**: Do not edit the header's `msgstr` while translating. It holds
+metadata only and must be left unchanged.
+
+
+## Housekeeping tasks for localization workflows
+
+For common housekeeping tasks, follow the steps in the matching subsection
+below.
+
+
+### Task 1: Generating or updating po/git.pot
+
+When asked to generate or update `po/git.pot` (or the like):
+
+1. **Directly execute** the command `make po/git.pot` without checking
+   if the file exists beforehand.
+
+2. **Do not verify** the generated file after execution. Simply run the
+   command and consider the task complete.
+
+
+## Human translators remain in control
+
+Git translation is human-driven; language team leaders and contributors are
+responsible for maintaining translation quality and consistency.
+
+AI-generated output should always be treated as drafts that must be reviewed
+and approved by someone who understands both the technical context and the
+target language. The best results come from combining AI efficiency with human
+judgment, cultural insight, and community engagement.
--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
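The Improvement column in the benchmark tables of this patch can be re-derived from the Before and Phase columns. A minimal sketch, assuming the column is a plain relative change rounded to the nearest percent; `pct_change` is a hypothetical helper, not a git-po-helper command:

```shell
# Hypothetical helper: relative change from a baseline ($1) to a new
# value ($2), rounded to the nearest whole percent.
pct_change () {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f%%\n", (after - before) / before * 100 }'
}

pct_change 17 5    # turns, Before -> Phase 1
pct_change 34 14   # exec. time in seconds, Before -> Phase 1
pct_change 17 3    # turns, Before -> Phase 2
pct_change 34 8    # exec. time in seconds, Before -> Phase 2
```

These four calls print -71%, -59%, -82% and -76%, matching the tables. The 11 and 17 figures in the commit message are the differences between the two Improvement columns (82 - 71 and 76 - 59); measured from Phase 1 to Phase 2 alone, `pct_change 5 3` and `pct_change 14 8` give -40% and -43%.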
* [PATCH v4 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-16 23:54 ` [PATCH v4 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-03-16 23:54 ` [PATCH v4 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
@ 2026-03-16 23:54 ` Jiang Xin
2026-03-16 23:54 ` [PATCH v4 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
2026-03-16 23:54 ` [PATCH v4 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw)
To: Junio C Hamano, Johannes Sixt, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new section to po/AGENTS.md to provide clear instructions for
updating language-specific PO files. The improved documentation
significantly reduces both conversation turns and execution time.

Performance evaluation with the Qwen model:

    # Before: instructions in po/README.md; the custom prompt
    # references po/README.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-po \
        --prompt="Update po/zh_CN.po according to po/README.md"

    # After: instructions in po/AGENTS.md; the built-in prompt
    # references po/AGENTS.md during execution
    git-po-helper agent-test --runs=5 --agent=qwen update-po

Benchmark results (5-run average):

| Metric      | Before  | After  | Improvement |
|-------------|---------|--------|-------------|
| Turns       | 22      | 4      | -82%        |
| Exec. time  | 38s     | 9s     | -76%        |
| Turn range  | 17-39   | 3-9    |             |
| Time range  | 25s-68s | 7s-14s |             |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 94b7aa7f28..f2b8fc5100 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -59,6 +59,17 @@ When asked to generate or update `po/git.pot` (or the like):
    command and consider the task complete.
 
 
+### Task 2: Updating po/XX.po
+
+When asked to update `po/XX.po` (or the like):
+
+1. **Directly execute** the command `make po-update PO_FILE=po/XX.po`
+   without reading or checking the file content beforehand.
+
+2. **Do not verify, translate, or review** the updated file after execution.
+   Simply run the command and consider the task complete.
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
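In Task 2 above, `XX` stands for a concrete language code such as `de` or `zh_CN`. A minimal sketch of substituting it into the documented command, assuming a strict two-letter `ll` or `ll_CC` shape; that pattern and the `po_update_command` helper are illustrative assumptions, not something po/AGENTS.md prescribes:

```shell
# Hypothetical helper: print the po-update command for a language code.
# The accepted shape (two lowercase letters, optionally followed by "_"
# and two uppercase letters) is an assumption made for this sketch.
po_update_command () {
    case "$1" in
    [a-z][a-z] | [a-z][a-z]_[A-Z][A-Z])
        echo "make po-update PO_FILE=po/$1.po"
        ;;
    *)
        echo "not a language code: $1" >&2
        return 1
        ;;
    esac
}

po_update_command zh_CN
po_update_command de
```

The helper only prints the command; running it is left to the caller, in line with the "directly execute" wording of the task.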
* [PATCH v4 4/5] docs(l10n): add AI agent instructions for translating PO files
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
` (2 preceding siblings ...)
2026-03-16 23:54 ` [PATCH v4 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
@ 2026-03-16 23:54 ` Jiang Xin
2026-03-16 23:54 ` [PATCH v4 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw)
To: Junio C Hamano, Johannes Sixt, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Translating po/XX.po" section to po/AGENTS.md with detailed
workflow and procedures for AI agents to translate language-specific PO
files. Users can invoke AI-assisted translation in coding tools with a
prompt such as:

    "Translate the po/XX.po file by referring to @po/AGENTS.md"

Translation results serve as drafts; human contributors must review and
approve before submission.

To address the low translation efficiency of some LLMs, batch
translation replaces entry-by-entry translation. git-po-helper
implements a gettext JSON format for translation files, replacing PO
format during translation to enable batch processing.

Evaluation with the Qwen model:

    git-po-helper agent-run --agent=qwen translate po/zh_CN.po

Test translation (127 entries, 50 per batch):

    Initial state: 5998 translated, 91 fuzzy, 36 untranslated
    Final state: 6125 translated, 0 fuzzy, 0 untranslated

    Successfully translated: 127 entries (91 fuzzy + 36 untranslated)
    Success rate: 100%

Benchmark results (3-run average):

AI agent using gettext tools:

| Metric           | Value                          |
|------------------|--------------------------------|
| Avg. Num turns   | 86 (176, 44, 40)               |
| Avg. Exec. Time  | 20m44s (39m56s, 14m38s, 7m38s) |
| Successful runs  | 3/3                            |

AI agent using git-po-helper (JSON batch flow):

| Metric           | Value                          |
|------------------|--------------------------------|
| Avg. Num turns   | 56 (68, 39, 63)                |
| Avg. Exec. Time  | 19m8s (28m55s, 9m1s, 19m28s)   |
| Successful runs  | 3/3                            |

The git-po-helper flow reduces the number of turns (86 → 56) with
similar execution time; the bottleneck appears to be LLM processing
rather than network interaction.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 596 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 595 insertions(+), 1 deletion(-)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index f2b8fc5100..65017624f7 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -5,7 +5,11 @@ housekeeping tasks for Git l10n. Use of AI is optional; many successful
 l10n teams work well without it.
 
 The section "Housekeeping tasks for localization workflows" documents the
-most commonly used housekeeping tasks.
+most commonly used housekeeping tasks:
+
+1. Generating or updating po/git.pot
+2. Updating po/XX.po
+3. Translating po/XX.po
 
 
 ## Background knowledge for localization workflows
@@ -42,6 +46,351 @@ msgstr ""
 metadata only and must be left unchanged.
 
 
+### Glossary Section
+
+PO files may have a glossary in comments before the header entry (first
+`msgid ""`), giving terminology guidelines (e.g.):
+
+```po
+# Git glossary for Chinese translators
+#
+# English                          |  Chinese
+# ---------------------------------+--------------------------------------
+# 3-way merge                      |  三路合并
+# branch                           |  分支
+# ...
+```
+
+**IMPORTANT**: Read and use the glossary when translating or reviewing. It is
+in `#` comments only. Leave that comment block unchanged.
+
+
+### PO entry structure (single-line and multi-line)
+
+PO entries are `msgid` / `msgstr` pairs. Plural messages add `msgid_plural` and
+`msgstr[n]`. The `msgid` is the immutable source; `msgstr` is the target
+translation. Each side may be a single quoted string or a multi-line block.
+In the multi-line form the header line is often `msgid ""` / `msgstr ""`, with
+the real text split across following quoted lines (concatenated by Gettext).
+
+**Single-line entries**:
+
+```po
+msgid "commit message"
+msgstr "提交说明"
+```
+
+**Multi-line entries**:
+
+```po
+msgid ""
+"Line 1\n"
+"Line 2"
+msgstr ""
+"行 1\n"
+"行 2"
+```
+
+**CRITICAL**: Do **not** use `grep '^msgstr ""'` to find untranslated entries;
+multi-line `msgstr` blocks use the same opening line, so grep gives false
+positives. Use `msgattrib` (next section).
+
+
+### Locating untranslated, fuzzy, and obsolete entries
+
+Use `msgattrib` to list untranslated, fuzzy, and obsolete entries. Task 3
+(translating `po/XX.po`) uses these commands.
+
+- **Untranslated**: `msgattrib --untranslated --no-obsolete po/XX.po`
+- **Fuzzy**: `msgattrib --only-fuzzy --no-obsolete po/XX.po`
+- **Obsolete** (`#~`): `msgattrib --obsolete --no-wrap po/XX.po`
+
+
+### Translating fuzzy entries
+
+Fuzzy entries need re-translation because the source text changed. The format
+differs by file type:
+
+- **PO file**: A `#, fuzzy` tag in the entry comments marks the entry as fuzzy.
+- **JSON file**: The entry has `"fuzzy": true`.
+
+**Translation principles**: Re-translate the `msgstr` (and, for plural entries,
+`msgstr[n]`) into the target language. Do **not** modify `msgid` or
+`msgid_plural`. After translation, **clear the fuzzy mark**: in PO, remove the
+`#, fuzzy` tag from comments; in JSON, omit or set `fuzzy` to `false`.
+
+
+### Preserving Special Characters
+
+Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders (`%s`, `%d`,
+etc.), and quotes exactly as in `msgid`. Only reorder placeholders with
+positional syntax when needed (see Placeholder Reordering below).
+
+
+### Placeholder Reordering
+
+When reordering placeholders relative to `msgid`, use positional syntax (`%n$`)
+where *n* is the 1-based argument index, so each argument still binds to the
+right value. Preserve width and precision modifiers, and place `%n$` before
+them (see examples below).
+
+**Example 1** (precision):
+
+```po
+#, c-format
+msgid "missing environment variable '%s' for configuration '%.*s'"
+msgstr "配置 '%3$.*2$s' 缺少环境变量 '%1$s'"
+```
+
+`%s` → argument 1 → `%1$s`. `%.*s` needs precision (arg 2) and string (arg 3) →
+`%3$.*2$s`.
+
+**Example 2** (multi-line, four `%s` reordered):
+
+```po
+#, c-format
+msgid ""
+"the 'submodule.%s.gitdir' config does not exist for module '%s'. Please "
+"ensure it is set, for example by running something like: 'git config "
+"submodule.%s.gitdir .git/modules/%s'. For details see the "
+"extensions.submodulePathConfig documentation."
+msgstr ""
+"模块 '%2$s' 的 'submodule.%1$s.gitdir' 配置不存在。请确保已设置,例如运行类"
+"似:'git config submodule.%3$s.gitdir .git/modules/%4$s'。详细信息请参见 "
+"extensions.submodulePathConfig 文档。"
+```
+
+Original order 1,2,3,4; in translation 2,1,3,4. Each line must be a complete
+quoted string.
+
+
+### Validating PO File Format
+
+Check the PO file using the command below:
+
+```shell
+msgfmt --check -o /dev/null po/XX.po
+```
+
+Common validation errors include:
+- Unclosed quotes
+- Missing escape sequences
+- Invalid placeholder syntax
+- Malformed multi-line entries
+- Incorrect line breaks in multi-line strings
+
+On failure, `msgfmt` prints the line number; fix the PO at that line.
+
+
+### Using git-po-helper
+
+[git-po-helper](https://github.com/git-l10n/git-po-helper) supports Git l10n with
+**quality checking** (git-l10n PR conventions) and **AI-assisted translation**
+(subcommands for automated workflows). Housekeeping tasks in this document use
+it when available; otherwise rely on gettext tools.
+
+
+#### Splitting large PO files
+
+When a PO file is too large for translation or review, use `git-po-helper
+msg-select` to split it by entry index.
+
+- **Entry 0** is the header (included by default; use `--no-header` to omit).
+- **Entries 1, 2, 3, …** are content entries.
+- **Range format**: `--range "1-50"` (entries 1 through 50), `--range "-50"`
+  (first 50 entries), `--range "51-"` (from entry 51 to end). Shortcuts:
+  `--head N` (first N), `--tail N` (last N), `--since N` (from N to end).
+- **Output format**: PO by default; use `--json` for GETTEXT JSON. See the
+  "GETTEXT JSON format" section (under git-po-helper) for details.
+- **State filter**: Use `--translated`, `--untranslated`, `--fuzzy` to filter
+  by state (OR relationship). Use `--no-obsolete` to exclude obsolete entries;
+  `--with-obsolete` to include (default). Use `--only-same` or `--only-obsolete`
+  for a single state. Range applies to the filtered list.
+
+```shell
+# First 50 entries (header + entries 1–50)
+git-po-helper msg-select --range "-50" po/in.po -o po/out.po
+
+# Entries 51–100
+git-po-helper msg-select --range "51-100" po/in.po -o po/out.po
+
+# Entries 101 to end
+git-po-helper msg-select --range "101-" po/in.po -o po/out.po
+
+# Entries 1–50 without header (content only)
+git-po-helper msg-select --range "1-50" --no-header po/in.po -o po/frag.po
+
+# Output as JSON; select untranslated and fuzzy entries, exclude obsolete
+git-po-helper msg-select --json --untranslated --fuzzy --no-obsolete po/in.po >po/filtered.json
+```
+
+
+#### Comparing PO files for translation and review
+
+`git-po-helper compare` shows PO changes with full entry context (unlike
+`git diff`). Redirect output to a file: it is empty when there are no new or
+changed entries; otherwise it contains a valid PO header.
+
+```shell
+# Get full context of local changes (HEAD vs working tree)
+git-po-helper compare po/XX.po -o po/out.po
+
+# Get full context of changes in a specific commit (parent vs commit)
+git-po-helper compare --commit <commit> po/XX.po -o po/out.po
+
+# Get full context of changes since a commit (commit vs working tree)
+git-po-helper compare --since <commit> po/XX.po -o po/out.po
+
+# Get full context between two commits
+git-po-helper compare -r <commit1>..<commit2> po/XX.po -o po/out.po
+
+# Get full context of two worktree files
+git-po-helper compare po/old.po po/new.po -o po/out.po
+
+# Check msgid consistency (detect tampering); no output means target matches source
+git-po-helper compare --msgid po/old.po po/new.po >po/out.po
+```
+
+**Options summary**
+
+| Option              | Meaning                                        |
+|---------------------|------------------------------------------------|
+| (none)              | Compare HEAD with working tree (local changes) |
+| `--commit <commit>` | Compare parent of commit with the commit       |
+| `--since <commit>`  | Compare commit with working tree               |
+| `-r x..y`           | Compare revision x with revision y             |
+| `-r x..`            | Compare revision x with working tree           |
+| `-r x`              | Compare parent of x with x                     |
+
+
+#### Concatenating multiple PO/JSON files
+
+`git-po-helper msg-cat` merges PO, POT, or gettext JSON inputs into one stream.
+Duplicate `msgid` values keep the first occurrence in file order. Write with
+`-o <file>` or stdout (`-o -` or omit); `--json` selects JSON output, else PO.
+
+```shell
+# Convert JSON to PO (e.g. after translation)
+git-po-helper msg-cat --unset-fuzzy -o po/out.po po/in.json
+
+# Merge multiple PO files
+git-po-helper msg-cat -o po/out.po po/in-1.po po/in-2.json
+```
+
+
+#### GETTEXT JSON format
+
+The **GETTEXT JSON** format is an internal format defined by `git-po-helper`
+for convenient batch processing of translation and related tasks by AI models.
+`git-po-helper msg-select`, `git-po-helper msg-cat`, and `git-po-helper compare`
+read and write this format.
+
+**Top-level structure**:
+
+```json
+{
+  "header_comment": "string",
+  "header_meta": "string",
+  "entries": [ /* array of entry objects */ ]
+}
+```
+
+| Field            | Description                                                                    |
+|------------------|--------------------------------------------------------------------------------|
+| `header_comment` | Lines above the first `msgid ""` (comments, glossary), directly concatenated.  |
+| `header_meta`    | Encoded `msgstr` of the header entry (Project-Id-Version, Plural-Forms, etc.). |
+| `entries`        | List of PO entries. Order matches source.                                      |
+
+**Entry object** (each element of `entries`):
+
+| Field           | Type     | Description                                                  |
+|-----------------|----------|--------------------------------------------------------------|
+| `msgid`         | string   | Singular message ID. PO escapes encoded (e.g. `\n` → `\\n`). |
+| `msgstr`        | []string | Translation forms as a **JSON array only**. Details below.   |
+| `msgid_plural`  | string   | Plural form of msgid. Omit for non-plural.                   |
+| `comments`      | []string | Comment lines (`#`, `#.`, `#:`, `#,`, etc.).                 |
+| `fuzzy`         | bool     | True if entry has fuzzy flag.                                |
+| `obsolete`      | bool     | True for `#~` obsolete entries. Omit if false.               |
+
+**`msgstr` array (required shape)**:
+
+- **Always** a JSON array of strings, never a single string. One element = singular
+  (PO `msgstr` / `msgstr[0]`); multiple elements = plural forms in order
+  (`msgstr[0]`, `msgstr[1]`, …).
+- Omit the key or use an empty array when the entry is untranslated.
+
+**Example (single-line entry)**:
+
+```json
+{
+  "header_comment": "# Glossary:\\n# term1\\tTranslation 1\\n#\\n",
+  "header_meta": "Project-Id-Version: git\\nContent-Type: text/plain; charset=UTF-8\\n",
+  "entries": [
+    {
+      "msgid": "Hello",
+      "msgstr": ["你好"],
+      "comments": ["#. Comment for translator\\n", "#: src/file.c:10\\n"],
+      "fuzzy": false
+    }
+  ]
+}
+```
+
+**Example (plural entry)**:
+
+```json
+{
+  "msgid": "One file",
+  "msgid_plural": "%d files",
+  "msgstr": ["一个文件", "%d 个文件"],
+  "comments": ["#, c-format\\n"]
+}
+```
+
+**Example (fuzzy entry before translation)**:
+
+```json
+{
+  "msgid": "Old message",
+  "msgstr": ["旧翻译。"],
+  "comments": ["#, fuzzy\\n"],
+  "fuzzy": true
+}
+```
+
+**Translation notes for GETTEXT JSON files**:
+
+- **Preserve structure**: Keep `header_comment`, `header_meta`, `msgid`,
+  `msgid_plural` unchanged.
+- **Fuzzy entries**: Entries extracted from fuzzy PO entries have `"fuzzy": true`.
+  After translating, **remove the `fuzzy` field** or set it to `false` in the
+  output JSON. The merge step uses `--unset-fuzzy`, which can also remove the
+  `fuzzy` field.
+- **Placeholders**: Preserve `%s`, `%d`, etc. exactly; use `%n$` when
+  reordering (see "Placeholder Reordering" above).
+
+
+### Quality checklist
+
+- **Accuracy**: Faithful to original meaning; no omissions or distortions.
+- **Fuzzy entries**: Re-translate fully and clear the fuzzy flag (see
+  "Translating fuzzy entries" above).
+- **Terminology**: Consistent with glossary (see "Glossary Section" above) or
+  domain standards.
+- **Grammar and fluency**: Correct and natural in the target language.
+- **Placeholders**: Preserve variables (`%s`, `{name}`, `$1`) exactly; use
+  positional parameters when reordering (see "Placeholder Reordering" above).
+- **Special characters**: Preserve escape sequences (`\n`, `\"`, `\\`, `\t`),
+  placeholders exactly as in `msgid`. See "Preserving Special Characters" above.
+- **Plurals and gender**: Correct forms and agreement.
+- **Context fit**: Suitable for UI space, tone, and use (e.g. error vs. tooltip).
+- **Cultural appropriateness**: No offensive or ambiguous content.
+- **Consistency**: Match prior translations of the same source.
+- **Technical integrity**: Do not translate code, paths, commands, brands, or
+  proper nouns.
+- **Readability**: Clear, concise, and user-friendly.
+
+
 ## Housekeeping tasks for localization workflows
 
 For common housekeeping tasks, follow the steps in the matching subsection
@@ -70,6 +419,251 @@ When asked to update `po/XX.po` (or the like):
    Simply run the command and consider the task complete.
 
 
+### Task 3: Translating po/XX.po
+
+To translate `po/XX.po`, use the steps below. The script uses gettext or
+`git-po-helper` depending on what is installed; JSON export (when available)
+supports batch translation rather than per-entry work.
+
+**Workflow loop**: Steps 1→2→3→4→5→6→7 form a loop. After step 6 succeeds,
+**always** go to step 7, which returns to step 1. The **only** exit to step 8
+is when step 2 finds `po/l10n-pending.po` empty. Do not skip step 7 or jump to
+step 8 after step 6.
+
+1. **Extract entries to translate**: **Directly execute** the script below—it is
+   authoritative; do not reimplement. It generates `po/l10n-pending.po` with
+   messages that need translation.
+
+   ```shell
+   l10n_extract_pending () {
+       test $# -ge 1 || { echo "Usage: l10n_extract_pending <po-file>" >&2; return 1; }
+       PO_FILE="$1"
+       PENDING="po/l10n-pending.po"
+       PENDING_FUZZY="${PENDING}.fuzzy"
+       PENDING_REFER="${PENDING}.fuzzy.reference"
+       PENDING_UNTRANS="${PENDING}.untranslated"
+       rm -f "$PENDING"
+
+       if command -v git-po-helper >/dev/null 2>&1
+       then
+           git-po-helper msg-select --untranslated --fuzzy --no-obsolete -o "$PENDING" "$PO_FILE"
+       else
+           msgattrib --untranslated --no-obsolete "$PO_FILE" >"${PENDING_UNTRANS}"
+           msgattrib --only-fuzzy --no-obsolete --clear-fuzzy --empty "$PO_FILE" >"${PENDING_FUZZY}"
+           msgattrib --only-fuzzy --no-obsolete "$PO_FILE" >"${PENDING_REFER}"
+           msgcat --use-first "${PENDING_UNTRANS}" "${PENDING_FUZZY}" >"$PENDING"
+           rm -f "${PENDING_UNTRANS}" "${PENDING_FUZZY}"
+       fi
+       if test -s "$PENDING"
+       then
+           msgfmt --stat -o /dev/null "$PENDING" || true
+           echo "Pending file is not empty; there are still entries to translate."
+       else
+           echo "No entries need translation."
+           return 1
+       fi
+   }
+   # Run the extraction. Example: l10n_extract_pending po/zh_CN.po
+   l10n_extract_pending po/XX.po
+   ```
+
+2. **Check generated file**: If `po/l10n-pending.po` is empty or does not exist,
+   translation is complete; go to step 8. Otherwise proceed to step 3.
+
+3. **Prepare one batch for translation**: Batching keeps each run small so the
+   model can complete translation within limited context. **BEFORE translating**,
+   **directly execute** the script below—it is authoritative; do not reimplement.
+   Based on which file the script produces: if `po/l10n-todo.json` exists, go to
+   step 4a; if `po/l10n-todo.po` exists, go to step 4b.
+
+   ```shell
+   l10n_one_batch () {
+       test $# -ge 1 || { echo "Usage: l10n_one_batch <po-file> [min_batch_size]" >&2; return 1; }
+       PO_FILE="$1"
+       min_batch_size=${2:-100}
+       PENDING="po/l10n-pending.po"
+       TODO_JSON="po/l10n-todo.json"
+       TODO_PO="po/l10n-todo.po"
+       DONE_JSON="po/l10n-done.json"
+       DONE_PO="po/l10n-done.po"
+       rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO"
+
+       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null || echo 0)
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+           BATCHING=1
+       else
+           NUM=$ENTRY_COUNT
+           BATCHING=
+       fi
+
+       if command -v git-po-helper >/dev/null 2>&1
+       then
+           if test -n "$BATCHING"
+           then
+               git-po-helper msg-select --json --head "$NUM" -o "$TODO_JSON" "$PENDING"
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               git-po-helper msg-select --json -o "$TODO_JSON" "$PENDING"
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       else
+           if test -n "$BATCHING"
+           then
+               awk -v num="$NUM" '/^msgid / && count++ > num {exit} 1' "$PENDING" |
+                   tac | awk '/^$/ {found=1} found' | tac >"$TODO_PO"
+               echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
+           else
+               cp "$PENDING" "$TODO_PO"
+               echo "Processing all $ENTRY_COUNT entries at once"
+           fi
+       fi
+   }
+   # Prepare one batch; shrink 2nd arg when batches exceed agent capacity.
+   l10n_one_batch po/XX.po 100
+   ```
+
+4a. **Translate JSON batch** (`po/l10n-todo.json` → `po/l10n-done.json`):
+
+    - **Task**: Translate `po/l10n-todo.json` (input, GETTEXT JSON) into
+      `po/l10n-done.json` (output, GETTEXT JSON). See the "GETTEXT JSON format"
+      section above for format details and translation rules.
+    - **Reference glossary**: Read the glossary from the batch file's
+      `header_comment` (see "Glossary Section" above) and use it for
+      consistent terminology.
+    - **When translating**: Follow the "Quality checklist" above for correctness
+      and quality. Handle escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
+      and quotes correctly as in `msgid`. For JSON, correctly escape and unescape
+      these sequences when reading and writing. Modify `msgstr` and `msgstr[n]`
+      (for plural entries); clear the fuzzy flag (omit or set `fuzzy` to `false`).
+      Do **not** modify `msgid` or `msgid_plural`.
+
+4b. **Translate PO batch** (`po/l10n-todo.po` → `po/l10n-done.po`):
+
+    - **Task**: Translate `po/l10n-todo.po` (input, GETTEXT PO) into
+      `po/l10n-done.po` (output, GETTEXT PO).
+    - **Reference glossary**: Read the glossary from the pending file header
+      (see "Glossary Section" above) and use it for consistent terminology.
+    - **When translating**: Follow the "Quality checklist" above for correctness
+      and quality. Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
+      and quotes as in `msgid`. Modify `msgstr` and `msgstr[n]` (for plural
+      entries); remove the `#, fuzzy` tag from comments when done. Do **not**
+      modify `msgid` or `msgid_plural`.
+
+5. **Validate `po/l10n-done.po`**:
+
+   Run the validation script below. If it fails, fix per the errors and notes,
+   re-run until it succeeds.
+
+   ```shell
+   l10n_validate_done () {
+       DONE_PO="po/l10n-done.po"
+       DONE_JSON="po/l10n-done.json"
+       PENDING="po/l10n-pending.po"
+
+       if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
+       then
+           git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
+               echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
+               return 1
+           }
+       fi
+
+       # Check 1: msgid should not be modified
+       MSGID_OUT=$(git-po-helper compare -q --msgid --assert-no-changes \
+           "$PENDING" "$DONE_PO" 2>&1)
+       MSGID_RC=$?
+       if test $MSGID_RC -ne 0 || test -n "$MSGID_OUT"
+       then
+           echo "ERROR [msgid modified]: The following entries appeared after" >&2
+           echo "translation because msgid was altered. Fix in $DONE_PO." >&2
+           echo "$MSGID_OUT" >&2
+           return 1
+       fi
+
+       # Check 2: PO format (see "Validating PO File Format" for error handling)
+       MSGFMT_OUT=$(msgfmt --check -o /dev/null "$DONE_PO" 2>&1)
+       MSGFMT_RC=$?
+       if test $MSGFMT_RC -ne 0
+       then
+           echo "ERROR [PO format]: Fix errors in $DONE_PO." >&2
+           echo "$MSGFMT_OUT" >&2
+           return 1
+       fi
+
+       echo "Validation passed."
+   }
+   l10n_validate_done
+   ```
+
+   If the script fails, fix **directly in `po/l10n-done.po`**. Re-run
+   `l10n_validate_done` until it succeeds. Editing `po/l10n-done.json` is not
+   recommended because it adds an extra JSON-to-PO conversion step. Use the
+   error message to decide:
+
+   - **`[msgid modified]`**: The listed entries have altered `msgid`; restore
+     them to match `po/l10n-pending.po`.
+   - **`[PO format]`**: `msgfmt` reports line numbers; fix the errors in place.
+     See "Validating PO File Format" for common issues.
+
+
+6. **Merge translation results into `po/XX.po`**: Run the script below. If it
+   fails, fix the file the error names: **`[JSON to PO conversion]`** →
+   `po/l10n-done.json`; **`[msgcat merge]`** → `po/l10n-done.po`. Re-run until
+   it succeeds.
+
+   ```shell
+   l10n_merge_batch () {
+       test $# -ge 1 || { echo "Usage: l10n_merge_batch <po-file>" >&2; return 1; }
+       PO_FILE="$1"
+       DONE_PO="po/l10n-done.po"
+       DONE_JSON="po/l10n-done.json"
+       MERGED="po/l10n-done.merged"
+       PENDING="po/l10n-pending.po"
+       PENDING_REFER="${PENDING}.fuzzy.reference"
+       TODO_JSON="po/l10n-todo.json"
+       TODO_PO="po/l10n-todo.po"
+       if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
+       then
+           git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
+               echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
+               return 1
+           }
+       fi
+       msgcat --use-first "$DONE_PO" "$PO_FILE" >"$MERGED" || {
+           echo "ERROR [msgcat merge]: Fix errors in $DONE_PO and re-run." >&2
+           return 1
+       }
+       mv "$MERGED" "$PO_FILE"
+       rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO" "$PENDING_REFER"
+   }
+   # Run the merge. Example: l10n_merge_batch po/zh_CN.po
+   l10n_merge_batch po/XX.po
+   ```
+
+7. **Loop**: **MUST** return to step 1 (Extract entries) and repeat the cycle.
+   Do **not** skip this step or go to step 8. Step 8 (below) runs **only**
+   when step 2 finds no more entries and redirects there.
+
+8. **Only after loop exits**: Run the command below to validate the PO file and
+   display the report. The process ends here.
+
+   ```shell
+   msgfmt --check --stat -o /dev/null po/XX.po
+   ```
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
--
2.53.0.rc2.20.g532543fa46

^ permalink raw reply related [flat|nested] 42+ messages in thread
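The batch-size rule inside `l10n_one_batch` in this patch can be read in isolation. The sketch below copies only that arithmetic into a hypothetical `batch_size` function (the name and the stand-alone form are illustrative, not part of the patch), so the thresholds are easier to see:

```shell
# Hypothetical stand-alone copy of the sizing rule from l10n_one_batch.
# $1: number of pending entries (header excluded)
# $2: min_batch_size (defaults to 100, as in the patch)
batch_size () {
    entry_count=$1
    min_batch_size=${2:-100}
    if test "$entry_count" -gt $((min_batch_size * 8))
    then
        # Large backlog: take double-sized batches.
        echo $((min_batch_size * 2))
    elif test "$entry_count" -gt $((min_batch_size * 4))
    then
        # Medium backlog: take 1.5x batches.
        echo $((min_batch_size + min_batch_size / 2))
    elif test "$entry_count" -gt "$min_batch_size"
    then
        echo "$min_batch_size"
    else
        # Small enough to translate everything in one go.
        echo "$entry_count"
    fi
}

batch_size 127 50   # 127 pending entries with a minimum batch of 50
batch_size 900      # more than 8 x 100 pending
batch_size 450      # more than 4 x 100 pending
batch_size 80       # below the minimum batch size
```

The four calls print 50, 200, 150 and 80. The first is consistent with the "127 entries, 50 per batch" test run in the commit message, assuming that run used a `min_batch_size` of 50.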
* [PATCH v4 5/5] docs(l10n): add AI agent instructions to review translations
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
` (3 preceding siblings ...)
2026-03-16 23:54 ` [PATCH v4 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
@ 2026-03-16 23:54 ` Jiang Xin
4 siblings, 0 replies; 42+ messages in thread
From: Jiang Xin @ 2026-03-16 23:54 UTC (permalink / raw)
To: Junio C Hamano, Johannes Sixt, Git List
Cc: Jiang Xin, Alexander Shopov, Mikel Forcada, Ralf Thielow,
Jean-Noël Avila, Bagas Sanjaya, Dimitriy Ryazantcev,
Peter Krefting, Emir SARI, Arkadii Yakovets,
Vũ Tiến Hưng, Teng Long, Yi-Jyun Pan

Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides
comprehensive guidance for AI agents to review translation files.

Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming. To address
this, git-po-helper implements the compare subcommand, which extracts
new or modified translations with full context (complete msgid/msgstr
pairs), significantly improving review efficiency.

A limitation is that the extracted content lacks other
already-translated content for reference, which may affect terminology
consistency. This is mitigated by including a glossary in the PO file
header. git-po-helper-generated review files include the header entry
and glossary (if present) by default.

The review workflow leverages git-po-helper subcommands:

- git-po-helper compare: Extract new or changed entries between two
  versions of a PO file into a valid PO file for review. Supports
  multiple modes:

  * Compare HEAD with the working tree (local changes)
  * Compare a commit's parent with the commit (--commit)
  * Compare a commit with the working tree (--since)
  * Compare two arbitrary revisions (-r)

- git-po-helper msg-select: Split large review files into smaller
  batches by entry index range for manageable review sessions.
  Supports range formats like "-50" (first 50), "51-100", "101-"
  (to end).

Evaluation with the Qwen model:

    git-po-helper agent-run review --commit 2000abefba --agent qwen

Benchmark results:

| Metric           | Value                            |
|------------------|----------------------------------|
| Turns            | 22                               |
| Input tokens     | 537263                           |
| Output tokens    | 4397                             |
| API duration     | 167.84 s                         |
| Review score     | 96/100                           |
| Total entries    | 63                               |
| With issues      | 4 (1 critical, 2 major, 1 minor) |

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 po/AGENTS.md | 197 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 197 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 65017624f7..e9a6ffc7f1 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
 1. Generating or updating po/git.pot
 2. Updating po/XX.po
 3. Translating po/XX.po
+4. Reviewing translation quality
 
 
 ## Background knowledge for localization workflows
@@ -664,6 +665,202 @@ step 8 after step 6.
    ```
 
 
+### Task 4: Review translation quality
+
+Review may target the full `po/XX.po`, a specific commit, or changes since a
+commit. When asked to review, follow the steps below.
+
+**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`,
+`git format-patch`, or similar to get changes—they break PO context; use **only**
+`git-po-helper compare` for extraction. Without `git-po-helper`, refuse the task.
+Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3).
+The **only** ways to step 8 are when step 4 finds `po/review-todo.json` missing +or empty (no batch left to review), or when step 1 finds `po/review-result.json` +already present. + +1. **Check for existing review (resume support)**: Evaluate the following in order: + + - If `po/review-input.po` does **not** exist, proceed to step 2 (Extract + entries) for a fresh start. + - Else If `po/review-result.json` exists, go to step 8 (only after loop exits). + - Else If `po/review-done.json` exists, go to step 6 (Rename result). + - Else if `po/review-todo.json` exists, go to step 5 (Review the current + batch). + - Else go to step 3 (Prepare one batch). + +2. **Extract entries**: Run `git-po-helper compare` with the desired range and + redirect the output to `po/review-input.po`. See "Comparing PO files for + translation and review" under git-po-helper for options. + +3. **Prepare one batch**: Batching keeps each run small so the model can + complete review within limited context. **Directly execute** the script + below—it is authoritative; do not reimplement. + + ```shell + review_one_batch () { + min_batch_size=${1:-100} + INPUT_PO="po/review-input.po" + PENDING="po/review-pending.po" + TODO="po/review-todo.json" + DONE="po/review-done.json" + BATCH_FILE="po/review-batch.txt" + + if test ! -f "$INPUT_PO" + then + rm -f "$TODO" + echo >&2 "cannot find $INPUT_PO, nothing for review" + return 1 + fi + if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING" + then + rm -f "$BATCH_FILE" "$TODO" "$DONE" + rm -f po/review-result*.json + cp "$INPUT_PO" "$PENDING" + fi + + ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null || echo 0) + ENTRY_COUNT=$((ENTRY_COUNT > 0 ? 
ENTRY_COUNT - 1 : 0)) + if test "$ENTRY_COUNT" -eq 0 + then + rm -f "$TODO" + echo >&2 "No entries left for review" + return 1 + fi + + if test "$ENTRY_COUNT" -gt $min_batch_size + then + if test "$ENTRY_COUNT" -gt $((min_batch_size * 8)) + then + NUM=$((min_batch_size * 2)) + elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4)) + then + NUM=$((min_batch_size + min_batch_size / 2)) + else + NUM=$min_batch_size + fi + else + NUM=$ENTRY_COUNT + fi + + BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0) + BATCH=$((BATCH + 1)) + echo "$BATCH" >"$BATCH_FILE" + + git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING" + git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING" + mv "${PENDING}.tmp" "$PENDING" + echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)" + } + # The parameter controls batch size; reduce if the batch file is too large. + review_one_batch 100 + ``` + +4. **Check todo file**: If `po/review-todo.json` does not exist or is empty, + review is complete; go to step 8 (only after loop exits). Otherwise proceed to + step 5. + +5. **Review the current batch**: Review translations in `po/review-todo.json` + and write findings to `po/review-done.json` as follows: + - Use "Background knowledge for localization workflows" for PO/JSON structure, + placeholders, and terminology. + - If `header_comment` includes a glossary, follow it for consistency. + - Do **not** review the header (`header_comment`, `header_meta`). + - For every other entry, check the entry's `msgstr` **array** (translation + forms) against `msgid` / `msgid_plural` using the "Quality checklist" above. + - Write JSON per "Review result JSON format" below; use `{"issues": []}` when + there are no issues. **Always** write `po/review-done.json`—it marks the + batch complete. + +6. **Rename result**: Rename `po/review-done.json` to `po/review-result-<N>.json`, + where N is the value in `po/review-batch.txt` (the batch just completed). 
+ Run the script below: + + ```shell + review_rename_result () { + TODO="po/review-todo.json" + DONE="po/review-done.json" + BATCH_FILE="po/review-batch.txt" + if test -f "$DONE" + then + N=$(cat "$BATCH_FILE" 2>/dev/null) || { echo "ERROR: $BATCH_FILE not found." >&2; return 1; } + mv "$DONE" "po/review-result-$N.json" + echo "Renamed to po/review-result-$N.json" + fi + rm -f "$TODO" + } + review_rename_result + ``` + +7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle. + Do **not** skip this step or go to step 8. Step 8 is reached **only** when + step 4 finds `po/review-todo.json` missing or empty. + +8. **Only after loop exits**: **Directly execute** the command below. It merges + results, applies suggestions, and displays the report. The process ends here. + + ```shell + git-po-helper agent-run report + ``` + + **Do not** run cleanup or delete intermediate files. Keep them for inspection + or resumption. + +**Review result JSON format**: + +The **Review result JSON** format defines the structure for translation +review reports. For each entry with translation issues, create an issue +object as follows: + +- Copy the original entry's `msgid`, optional `msgid_plural`, and optional + `msgstr` array (original translation forms) into the issue object. Use the + same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present + (one element singular, multiple for plural). +- Write a summary of all issues found for this entry in `description`. +- Set `score` according to the severity of issues found for this entry, + from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues). + **Lower score means more severe issues.** +- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**: + one string for singular, multiple strings for plural forms in order. This is + required for `git-po-helper` to apply suggestions. +- Include only entries with issues (score less than 3). 
When no issues are + found in the batch, write `{"issues": []}`. + +Example review result (with issues): + +```json +{ + "issues": [ + { + "msgid": "commit", + "msgstr": ["委托"], + "score": 0, + "description": "Terminology error: 'commit' should be translated as '提交'", + "suggest_msgstr": ["提交"] + }, + { + "msgid": "repository", + "msgid_plural": "repositories", + "msgstr": ["版本库", "版本库"], + "score": 2, + "description": "Consistency issue: suggest using '仓库' consistently", + "suggest_msgstr": ["仓库", "仓库"] + } + ] +} +``` + +Field descriptions for each issue object (element of the `issues` array): + +- `msgid` (and optional `msgid_plural` for plural entries): Original source text. +- `msgstr` (optional): JSON array of original translation forms (same meaning as + in GETTEXT JSON entries). +- `suggest_msgstr`: JSON array of suggested translation forms; **must be an + array** (e.g. `["提交"]` for singular). Plural entries use multiple elements + in order. +- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues). +- `description`: Brief summary of the issue. + + ## Human translators remain in control Git translation is human-driven; language team leaders and contributors are -- 2.53.0.rc2.20.g532543fa46 ^ permalink raw reply related [flat|nested] 42+ messages in thread
Thread overview: 42+ messages
2026-02-04 9:31 [RFC] Introducing AI Agents to Git Localization Jiang Xin
2026-02-04 11:58 ` Peter Krefting
2026-02-04 13:00 ` Michal Suchánek
2026-02-04 14:38 ` 依云
2026-02-05 2:06 ` Jiang Xin
2026-02-05 8:30 ` Michal Suchánek
2026-02-05 11:16 ` Jiang Xin
2026-02-05 13:18 ` Michal Suchánek
2026-02-05 1:04 ` Jiang Xin
2026-02-05 1:53 ` brian m. carlson
[not found] ` <0207CD38-C811-499D-AFA6-131B0CA825CD@gmail.com>
2026-02-05 12:54 ` Jiang Xin
2026-02-05 13:00 ` [RFC PATCH 1/2] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-02-05 20:07 ` Junio C Hamano
2026-02-05 13:00 ` [RFC PATCH 2/2] l10n: README: document AI assistant guidelines Jiang Xin
2026-02-05 20:35 ` Junio C Hamano
2026-02-06 2:38 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-03 15:33 ` [PATCH v2 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-03-03 15:33 ` [PATCH v2 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
2026-03-12 2:11 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
2026-03-03 15:33 ` [PATCH v2 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
2026-03-12 2:26 ` Jiang Xin
2026-03-03 15:33 ` [PATCH v2 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
2026-03-12 2:34 ` Jiang Xin
2026-03-14 14:38 ` [PATCH v3 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-14 14:38 ` [PATCH v3 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-03-15 11:13 ` Johannes Sixt
2026-03-15 16:11 ` Junio C Hamano
2026-03-16 5:44 ` Jiang Xin
2026-03-16 3:21 ` Jiang Xin
2026-03-16 12:43 ` Johannes Sixt
2026-03-14 14:38 ` [PATCH v3 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
2026-03-14 14:38 ` [PATCH v3 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
2026-03-14 14:38 ` [PATCH v3 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
2026-03-14 14:38 ` [PATCH v3 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin
2026-03-16 23:54 ` [PATCH v4 0/5] docs(l10n): AI agent instructions and workflow improvements Jiang Xin
2026-03-16 23:54 ` [PATCH v4 1/5] l10n: add .gitattributes to simplify location filtering Jiang Xin
2026-03-16 23:54 ` [PATCH v4 2/5] docs(l10n): add AGENTS.md with optimized update-pot instructions Jiang Xin
2026-03-16 23:54 ` [PATCH v4 3/5] docs(l10n): add AI agent instructions for updating po/XX.po files Jiang Xin
2026-03-16 23:54 ` [PATCH v4 4/5] docs(l10n): add AI agent instructions for translating PO files Jiang Xin
2026-03-16 23:54 ` [PATCH v4 5/5] docs(l10n): add AI agent instructions to review translations Jiang Xin