linux-doc.vger.kernel.org archive mirror
* [PATCH] scripts: add origin commit identification based on specific patterns
@ 2025-07-13 16:34 Zhiyu Zhang
  2025-07-17 20:08 ` Jonathan Corbet
  2025-07-18  0:22 ` Dongliang Mu
  0 siblings, 2 replies; 5+ messages in thread
From: Zhiyu Zhang @ 2025-07-13 16:34 UTC (permalink / raw)
  To: dzm91, corbet, si.yanteng, zhiyuzhang999; +Cc: linux-kernel, linux-doc

This patch adds functionality to identify the origin commit of a
translation by matching the following patterns in the commit log:
1) update to commit HASH
2) Update the translation through commit HASH
If no such pattern is found, the script falls back to the original
workflow.

Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
---
 scripts/checktransupdate.py | 38 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/scripts/checktransupdate.py b/scripts/checktransupdate.py
index 578c3fecfdfd..e39529e46c3d 100755
--- a/scripts/checktransupdate.py
+++ b/scripts/checktransupdate.py
@@ -24,6 +24,7 @@ commit 42fb9cfd5b18 ("Documentation: dev-tools: Add link to RV docs")
 """
 
 import os
+import re
 import time
 import logging
 from argparse import ArgumentParser, ArgumentTypeError, BooleanOptionalAction
@@ -69,6 +70,38 @@ def get_origin_from_trans(origin_path, t_from_head):
     return o_from_t
 
 
+def get_origin_from_trans_smartly(origin_path, t_from_head):
+    """Get the latest origin commit from the formatted translation commit:
+    (1) update to commit HASH (TITLE)
+    (2) Update the translation through commit HASH (TITLE)
+    """
+    # capture group for a 12-character abbreviated commit hash
+    HASH = r'([0-9a-f]{12})'
+    # pattern 1: contains "update to commit HASH"
+    pat_update_to = re.compile(rf'update to commit {HASH}')
+    # pattern 2: contains "Update the translation through commit HASH"
+    pat_update_translation = re.compile(rf'Update the translation through commit {HASH}')
+
+    origin_commit_hash = None
+    for line in t_from_head["message"]:
+        # check if the line matches the first pattern
+        match = pat_update_to.search(line)
+        if match:
+            origin_commit_hash = match.group(1)
+            break
+        # check if the line matches the second pattern
+        match = pat_update_translation.search(line)
+        if match:
+            origin_commit_hash = match.group(1)
+            break
+    if origin_commit_hash is None:
+        return None
+    o_from_t = get_latest_commit_from(origin_path, origin_commit_hash)
+    if o_from_t is not None:
+        logging.debug("tracked origin commit id: %s", o_from_t["hash"])
+    return o_from_t
+
+
 def get_commits_count_between(opath, commit1, commit2):
     """Get the commits count between two commits for the specified file"""
     command = f"git log --pretty=format:%H {commit1}...{commit2} -- {opath}"
@@ -108,7 +141,10 @@ def check_per_file(file_path):
         logging.error("Cannot find the latest commit for %s", file_path)
         return
 
-    o_from_t = get_origin_from_trans(opath, t_from_head)
+    o_from_t = get_origin_from_trans_smartly(opath, t_from_head)
+    # note: a commit named explicitly in the log takes precedence; fall back otherwise
+    if o_from_t is None:
+        o_from_t = get_origin_from_trans(opath, t_from_head)
 
     if o_from_t is None:
         logging.error("Error: Cannot find the latest origin commit for %s", file_path)
-- 
2.34.1
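
For readers who want to try the matching logic outside the kernel tree,
the following is a minimal standalone sketch of the lookup the patch
describes, not the code that was applied. It assumes the commit message
is available as a list of lines, as the loop over t_from_head["message"]
suggests. The names find_origin_hash and resolve_origin, the lookup and
fallback callables, and the sample commit subject are hypothetical; only
the two patterns and the example hash come from the patch.

```python
import re

# Twelve lowercase hex digits, the abbreviated hash length used in the patch.
HASH = r"([0-9a-f]{12})"
PATTERNS = (
    re.compile(rf"update to commit {HASH}"),
    re.compile(rf"Update the translation through commit {HASH}"),
)


def find_origin_hash(message_lines):
    """Return the first abbreviated hash named by a known pattern, else None."""
    for line in message_lines:
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                return match.group(1)
    return None


def resolve_origin(message_lines, lookup, fallback):
    """Prefer the commit named in the log; otherwise use the fallback.

    `lookup` and `fallback` are hypothetical stand-ins for the script's
    get_latest_commit_from() and get_origin_from_trans() helpers.
    """
    commit_hash = find_origin_hash(message_lines)
    origin = lookup(commit_hash) if commit_hash is not None else None
    return origin if origin is not None else fallback()


# Hypothetical translation commit message, split into lines.
sample = [
    "docs/zh_CN: update the translation of dev-tools",
    "",
    'update to commit 42fb9cfd5b18 ("Documentation: dev-tools: Add link to RV docs")',
]
print(find_origin_hash(sample))
print(find_origin_hash(["Update to commit 42fb9cfd5b18"]))
```

Both patterns are case-sensitive, so a line that begins with "Update to
commit" is not matched by the first pattern, and the caller would fall
through to the original workflow.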



* Re: [PATCH] scripts: add origin commit identification based on specific patterns
  2025-07-13 16:34 [PATCH] scripts: add origin commit identification based on specific patterns Zhiyu Zhang
@ 2025-07-17 20:08 ` Jonathan Corbet
  2025-07-18  0:24   ` Dongliang Mu
  2025-07-18  0:22 ` Dongliang Mu
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Corbet @ 2025-07-17 20:08 UTC (permalink / raw)
  To: Zhiyu Zhang, dzm91, si.yanteng, zhiyuzhang999; +Cc: linux-kernel, linux-doc

Zhiyu Zhang <zhiyuzhang999@gmail.com> writes:

> This patch adds the functionality to smartly identify origin commit
> of the translation by matching the following patterns in commit log:
> 1) update to commit HASH
> 2) Update the translation through commit HASH
> If no such pattern is found, script will obey the original workflow.
>
> Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
> ---
>  scripts/checktransupdate.py | 38 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 37 insertions(+), 1 deletion(-)

So I don't have any objection to this, but wouldn't mind hearing from
folks who actually use this script - has anybody else tested it out?

Thanks,

jon


* Re: [PATCH] scripts: add origin commit identification based on specific patterns
  2025-07-13 16:34 [PATCH] scripts: add origin commit identification based on specific patterns Zhiyu Zhang
  2025-07-17 20:08 ` Jonathan Corbet
@ 2025-07-18  0:22 ` Dongliang Mu
  1 sibling, 0 replies; 5+ messages in thread
From: Dongliang Mu @ 2025-07-18  0:22 UTC (permalink / raw)
  To: Zhiyu Zhang; +Cc: dzm91, corbet, si.yanteng, linux-kernel, linux-doc

On Mon, Jul 14, 2025 at 12:38 AM Zhiyu Zhang <zhiyuzhang999@gmail.com> wrote:
>
> This patch adds the functionality to smartly identify origin commit
> of the translation by matching the following patterns in commit log:
> 1) update to commit HASH
> 2) Update the translation through commit HASH
> If no such pattern is found, script will obey the original workflow.
>
> Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>

Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>




* Re: [PATCH] scripts: add origin commit identification based on specific patterns
  2025-07-17 20:08 ` Jonathan Corbet
@ 2025-07-18  0:24   ` Dongliang Mu
  2025-07-24 14:41     ` Jonathan Corbet
  0 siblings, 1 reply; 5+ messages in thread
From: Dongliang Mu @ 2025-07-18  0:24 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Zhiyu Zhang, dzm91, si.yanteng, linux-kernel, linux-doc

On Fri, Jul 18, 2025 at 4:09 AM Jonathan Corbet <corbet@lwn.net> wrote:
>
> Zhiyu Zhang <zhiyuzhang999@gmail.com> writes:
>
> > This patch adds the functionality to smartly identify origin commit
> > of the translation by matching the following patterns in commit log:
> > 1) update to commit HASH
> > 2) Update the translation through commit HASH
> > If no such pattern is found, script will obey the original workflow.
> >
> > Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
> > ---
> >  scripts/checktransupdate.py | 38 ++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 37 insertions(+), 1 deletion(-)
>
> So I don't have any objection to this, but wouldn't mind hearing from
> folks who actually use this script - has anybody else tested it out?

I’ve tested this script, and the new origin commit tracking
functionality is working effectively—it can reduce the number of
commits requiring updates.

Dongliang Mu

>
> Thanks,
>
> jon
>


* Re: [PATCH] scripts: add origin commit identification based on specific patterns
  2025-07-18  0:24   ` Dongliang Mu
@ 2025-07-24 14:41     ` Jonathan Corbet
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Corbet @ 2025-07-24 14:41 UTC (permalink / raw)
  To: Dongliang Mu; +Cc: Zhiyu Zhang, dzm91, si.yanteng, linux-kernel, linux-doc

Dongliang Mu <mudongliangabcd@gmail.com> writes:

> On Fri, Jul 18, 2025 at 4:09 AM Jonathan Corbet <corbet@lwn.net> wrote:
>>
>> Zhiyu Zhang <zhiyuzhang999@gmail.com> writes:
>>
>> > This patch adds the functionality to smartly identify origin commit
>> > of the translation by matching the following patterns in commit log:
>> > 1) update to commit HASH
>> > 2) Update the translation through commit HASH
>> > If no such pattern is found, script will obey the original workflow.
>> >
>> > Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
>> > ---
>> >  scripts/checktransupdate.py | 38 ++++++++++++++++++++++++++++++++++++-
>> >  1 file changed, 37 insertions(+), 1 deletion(-)
>>
>> So I don't have any objection to this, but wouldn't mind hearing from
>> folks who actually use this script - has anybody else tested it out?
>
> I’ve tested this script, and the new origin commit tracking
> functionality is working effectively—it can reduce the number of
> commits requiring updates.

Great, thanks, I've applied it.

jon
