* [PATCH] tools/docs/checktransupdate.py: use metadata to lookup origin path
@ 2026-03-09 9:58 Haoyang LIU
2026-03-09 14:10 ` Jonathan Corbet
0 siblings, 1 reply; 3+ messages in thread
From: Haoyang LIU @ 2026-03-09 9:58 UTC
To: Dongliang Mu, Yanteng Si, Alex Shi, Jonathan Corbet,
Mauro Carvalho Chehab, Shuah Khan
Cc: Haoyang LIU, linux-doc, linux-kernel
The get_origin_path() function assumes that translation files have the
same relative path as their origin files, just with "translations/{locale}"
inserted after "Documentation/". However, this assumption is incorrect
for several translation files where the origin path differs. For example:
    translations/zh_CN/dev-tools/gdb-kernel-debugging.rst
      -> process/debugging/gdb-kernel-debugging.rst
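To make the failure concrete, here is a minimal sketch of the path heuristic applied to the example above. The name heuristic_origin_path is hypothetical, and the step that skips the locale component is an assumption, since the hunk in this patch does not show that line of get_origin_path().

```python
def heuristic_origin_path(file_path):
    """Drop the "translations/<locale>" components from a translation path.

    Hypothetical stand-in for get_origin_path(); the slice that skips the
    locale directory is an assumption, not copied from the script.
    """
    parts = file_path.split("/")
    tidx = parts.index("translations")
    # Keep everything before "translations", then skip it and the locale.
    return "/".join(parts[:tidx] + parts[tidx + 2:])


translated = "Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst"
actual = "Documentation/process/debugging/gdb-kernel-debugging.rst"

guessed = heuristic_origin_path(translated)
print(guessed)            # Documentation/dev-tools/gdb-kernel-debugging.rst
print(guessed == actual)  # False: the heuristic points at the wrong directory
```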
The correct origin path is specified in each translation file's
:Original: metadata field, which can appear in several formats:
  1. Plain path:  :Original: Documentation/path/to/file.rst
  2. With :ref:   :Original: :ref:`Documentation/path/to/file.rst <label>`
  3. With :doc:   :Original: :doc:`../../../path/to/file`
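As an illustration of the three formats, the sketch below classifies a single :Original: line. It is not the code from this patch: parse_original_line and the combined ROLE_RE pattern are hypothetical, and a single regex handling both roles is a simplification chosen for the example.

```python
import re

# Matches the whole ":Original:" field and captures its value.
ORIGINAL_RE = re.compile(r"^:Original:\s*(.+?)\s*$", re.IGNORECASE)
# Matches :ref:`target <label>`, :ref:`target` or :doc:`target`.
ROLE_RE = re.compile(r":(ref|doc):`([^`<]+?)(?:\s*<[^>]+>)?`")


def parse_original_line(line):
    """Return (kind, target) for an :Original: line, or None otherwise."""
    match = ORIGINAL_RE.match(line.strip())
    if not match:
        return None
    value = match.group(1)
    role = ROLE_RE.search(value)
    if role:
        # Role-wrapped value: report which role and the bare target.
        return role.group(1), role.group(2).strip()
    # No role wrapper: the value itself is the path.
    return "plain", value


for sample in (
    ":Original: Documentation/path/to/file.rst",
    ":Original: :ref:`Documentation/path/to/file.rst <label>`",
    ":Original: :doc:`../../../path/to/file`",
):
    print(parse_original_line(sample))
```

Only the :doc: form yields a relative target, so it is the only one that needs to be resolved against the translation file's directory.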
Add get_origin_path_from_metadata() to parse the :Original: metadata
from translation files and extract the actual origin path. Update
check_per_file() to use metadata-based lookup first, falling back to
the path manipulation heuristic only when no metadata is found.
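The lookup order described above can be sketched independently of the script. In this example resolve_origin, the known table and both stand-in lookup functions are hypothetical; they only model "metadata first, heuristic as fallback" and do not read any files.

```python
def resolve_origin(file_path, from_metadata, from_path):
    """Prefer the metadata lookup; fall back to the path heuristic on None.

    Returns (origin_path, source) so the caller can log which one was used.
    """
    origin = from_metadata(file_path)
    if origin is None:
        return from_path(file_path), "path"
    return origin, "metadata"


# Hypothetical stand-ins for the two lookups.
known = {
    "Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst":
        "Documentation/process/debugging/gdb-kernel-debugging.rst",
}


def metadata_lookup(path):
    return known.get(path)  # None when no :Original: entry is recorded


def path_lookup(path):
    return path.replace("translations/zh_CN/", "", 1)


print(resolve_origin(
    "Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst",
    metadata_lookup, path_lookup))
print(resolve_origin(
    "Documentation/translations/zh_CN/process/howto.rst",
    metadata_lookup, path_lookup))
```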
Signed-off-by: Haoyang LIU <tttturtleruss@gmail.com>
---
tools/docs/checktransupdate.py | 63 ++++++++++++++++++++++++++++++++--
1 file changed, 61 insertions(+), 2 deletions(-)
diff --git a/tools/docs/checktransupdate.py b/tools/docs/checktransupdate.py
index cc07cda667fc..b3c695fa0f7a 100755
--- a/tools/docs/checktransupdate.py
+++ b/tools/docs/checktransupdate.py
@@ -32,7 +32,7 @@ from datetime import datetime
 def get_origin_path(file_path):
-    """Get the origin path from the translation path"""
+    """Get the origin path from the translation path by path manipulation (fallback)"""
     paths = file_path.split("/")
     tidx = paths.index("translations")
     opaths = paths[:tidx]
@@ -40,6 +40,62 @@ def get_origin_path(file_path):
     return "/".join(opaths)
+def get_origin_path_from_metadata(file_path):
+    """Get the origin path from the :Original: metadata in the translation file.
+
+    The :Original: metadata can have several formats:
+    1. Plain path: :Original: Documentation/path/to/file.rst
+    2. With :ref: directive: :Original: :ref:`Documentation/path/to/file.rst <label>`
+    3. With :doc: directive: :Original: :doc:`../../../path/to/file`
+
+    Returns the origin path if found, None otherwise.
+    """
+    # Pattern to match :Original: line
+    original_re = re.compile(r'^:Original:\s*(.+?)\s*$', re.IGNORECASE)
+    # Pattern to extract path from :ref:`path <label>` or :ref:`path`
+    ref_re = re.compile(r':ref:`([^`<]+?)(?:\s*<[^>]+>)?`')
+    # Pattern to extract path from :doc:`path`
+    doc_re = re.compile(r':doc:`([^`]+)`')
+
+    try:
+        with open(file_path, 'r', encoding='utf-8') as f:
+            # Only check the first 20 lines for metadata
+            for _ in range(20):
+                line = f.readline()
+                if not line:
+                    break
+                match = original_re.match(line.strip())
+                if match:
+                    original_value = match.group(1).strip()
+
+                    # Try to extract from :ref:`...`
+                    ref_match = ref_re.search(original_value)
+                    if ref_match:
+                        return ref_match.group(1).strip()
+
+                    # Try to extract from :doc:`...`
+                    doc_match = doc_re.search(original_value)
+                    if doc_match:
+                        doc_path = doc_match.group(1).strip()
+                        # Handle relative paths - resolve relative to translation file
+                        if doc_path.startswith('../'):
+                            trans_dir = os.path.dirname(file_path)
+                            resolved = os.path.normpath(os.path.join(trans_dir, doc_path))
+                            # Add .rst extension if not present
+                            if not resolved.endswith('.rst'):
+                                resolved += '.rst'
+                            return resolved
+
+                    # Plain path (no directive wrapper)
+                    if original_value.startswith('Documentation/'):
+                        return original_value
+
+    except (IOError, OSError) as e:
+        logging.debug("Could not read file %s: %s", file_path, e)
+
+    return None
+
+
 def get_latest_commit_from(file_path, commit):
     """Get the latest commit from the specified commit for the specified file"""
     command = f"git log --pretty=format:%H%n%aD%n%cD%n%n%B {commit} -1 -- {file_path}"
@@ -128,7 +184,10 @@ def valid_commit(commit):
 def check_per_file(file_path):
     """Check the translation status for the specified file"""
-    opath = get_origin_path(file_path)
+    opath = get_origin_path_from_metadata(file_path)
+    if opath is None:
+        opath = get_origin_path(file_path)
+        logging.debug("No :Original: metadata found, using path-based fallback for %s", file_path)
     if not os.path.isfile(opath):
         logging.error("Cannot find the origin path for %s", file_path)
--
2.53.0
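For readers following the :doc: branch of the patch, here is a small worked example of resolving a relative target against the translation file's directory. The function name resolve_doc_target is hypothetical, the :doc: target shown is illustrative rather than taken from the real file, and posixpath is swapped in for os.path so that the result does not depend on the platform.

```python
import posixpath


def resolve_doc_target(translation_path, doc_target):
    """Resolve a relative :doc: target against the translation's directory."""
    base = posixpath.dirname(translation_path)
    # normpath collapses each "dir/.." pair left over from the join.
    resolved = posixpath.normpath(posixpath.join(base, doc_target))
    # :doc: targets omit the extension, so add it back.
    if not resolved.endswith(".rst"):
        resolved += ".rst"
    return resolved


print(resolve_doc_target(
    "Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst",
    "../../../process/debugging/gdb-kernel-debugging",
))
# Documentation/process/debugging/gdb-kernel-debugging.rst
```

The three "../" components climb out of dev-tools, zh_CN and translations in turn, which is why the result lands directly under Documentation/.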
* Re: [PATCH] tools/docs/checktransupdate.py: use metadata to lookup origin path
From: Jonathan Corbet @ 2026-03-09 14:10 UTC
To: Haoyang LIU, Dongliang Mu, Yanteng Si, Alex Shi,
Mauro Carvalho Chehab, Shuah Khan
Cc: Haoyang LIU, linux-doc, linux-kernel
Haoyang LIU <tttturtleruss@gmail.com> writes:
> The get_origin_path() function assumes that translation files have the
> same relative path as their origin files, just with "translations/{locale}"
> inserted after "Documentation/". However, this assumption is incorrect
> for several translation files where the origin path differs. For example:
> translations/zh_CN/dev-tools/gdb-kernel-debugging.rst
> -> process/debugging/gdb-kernel-debugging.rst
Honestly, rather than trying to work around such things, I think it
would be far better to fix the places where the translated structure
differs from the original. Those differences can only lead to
confusion, and I've been trying to avoid creating any more of them.
Thanks,
jon
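One way to find the places where the translated structure differs from the original is to compare each translation's declared origin with the path implied by its location. The sketch below is not part of the thread: find_structure_mismatches and strip_locale are hypothetical names, the input table is illustrative, and the declared origins are assumed to have been collected beforehand.

```python
def strip_locale(path):
    """Path implied by the layout: drop "translations/<locale>"."""
    parts = path.split("/")
    tidx = parts.index("translations")
    return "/".join(parts[:tidx] + parts[tidx + 2:])


def find_structure_mismatches(declared_origins):
    """Return (translation, implied, declared) where the layouts disagree."""
    mismatches = []
    for translation, declared in sorted(declared_origins.items()):
        implied = strip_locale(translation)
        if declared != implied:
            mismatches.append((translation, implied, declared))
    return mismatches


# Illustrative input: translation path -> origin named in its metadata.
declared_origins = {
    "Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst":
        "Documentation/process/debugging/gdb-kernel-debugging.rst",
    "Documentation/translations/zh_CN/process/howto.rst":
        "Documentation/process/howto.rst",
}

for translation, implied, declared in find_structure_mismatches(declared_origins):
    print(f"{translation}: layout implies {implied}, metadata says {declared}")
```

Each reported entry is a candidate for moving the translation so that its location mirrors the original's.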
* Re: [PATCH] tools/docs/checktransupdate.py: use metadata to lookup origin path
From: Haoyang Liu @ 2026-03-09 17:05 UTC
To: Jonathan Corbet, Dongliang Mu, Yanteng Si, Alex Shi,
Mauro Carvalho Chehab, Shuah Khan
Cc: linux-doc, linux-kernel
On 3/9/2026 10:10 PM, Jonathan Corbet wrote:
> Haoyang LIU <tttturtleruss@gmail.com> writes:
>
>> The get_origin_path() function assumes that translation files have the
>> same relative path as their origin files, just with "translations/{locale}"
>> inserted after "Documentation/". However, this assumption is incorrect
>> for several translation files where the origin path differs. For example:
>> translations/zh_CN/dev-tools/gdb-kernel-debugging.rst
>> -> process/debugging/gdb-kernel-debugging.rst
> Honestly, rather than trying to work around such things, I think it
> would be far better to fix the places where the translated structure
> differs from the original. Those differences can only lead to
> confusion, and I've been trying to avoid creating any more of them.
Dear Jon,
That makes sense. I agree that keeping the translated directory
structure consistent with the original would avoid confusion and
reduce the need for special handling in the script.
I’ll take a look at the places where the structures differ and see
whether they can be aligned with the original layout instead.
Sincerely,
Haoyang
>
> Thanks,
>
> jon
--
Sincerely,
Haoyang