From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
Akira Yokosawa <akiyks@gmail.com>
Subject: Re: [PATCH 12/12] docs: kdoc: Improve the output text accumulation
Date: Thu, 10 Jul 2025 12:10:33 +0200 [thread overview]
Message-ID: <20250710121033.42db5ef3@foz.lan> (raw)
In-Reply-To: <20250710101931.202953d1@foz.lan>
On Thu, 10 Jul 2025 10:19:31 +0200,
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Thu, 10 Jul 2025 09:13:52 +0200,
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> Heh, in these times when LLMs can quickly code trivial things for us,
> I decided to test four different variants:
>
> - using string +=
> - using list append
> - using __add__
> - using __iadd__
I manually reorganized the LLM-generated code in order to get more
precise results. The full script is enclosed at the end.
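For quick reference, here is a minimal sketch of the four accumulation
idioms being compared (simplified from the full script below; the names
are illustrative only):

# 1) plain str concatenation
out = ""
out += "text"

# 2) explicit list append + final join
parts = []
parts.append("text")
result = "".join(parts)

# 3) / 4) a small list-backed wrapper overloading __add__ or __iadd__
class Output:
    def __init__(self):
        self._parts = []
    def __iadd__(self, text):   # the __add__ variant is the same idea
        self._parts.append(text)
        return self
    def __str__(self):
        return "".join(self._parts)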
$ for i in python3.9 python3.13 python3.13t; do echo " $i:"; $i /tmp/bench.py 100000 10 1; $i /tmp/bench.py 1000 1000 1; done
python3.9:
 10 strings in a loop with 100000 iterations, repeating 24 times
str += : time: 25.21
list join : time: 72.65: 188.18% slower than str +=
__add__ : time: 71.82: 184.88% slower than str +=
__iadd__ : time: 67.84: 169.09% slower than str +=
 1000 strings in a loop with 1000 iterations, repeating 24 times
str += : time: 24.29
list join : time: 58.76: 141.88% slower than str +=
__add__ : time: 58.68: 141.54% slower than str +=
__iadd__ : time: 55.48: 128.37% slower than str +=
python3.13:
 10 strings in a loop with 100000 iterations, repeating 24 times
str += : time: 28.01
list join : time: 32.46: 15.91% slower than str +=
__add__ : time: 52.56: 87.66% slower than str +=
__iadd__ : time: 58.69: 109.55% slower than str +=
 1000 strings in a loop with 1000 iterations, repeating 24 times
str += : time: 22.03
list join : time: 23.38: 6.12% slower than str +=
__add__ : time: 44.25: 100.86% slower than str +=
__iadd__ : time: 40.70: 84.74% slower than str +=
python3.13t:
 10 strings in a loop with 100000 iterations, repeating 24 times
str += : time: 25.65
list join : time: 74.95: 192.18% slower than str +=
__add__ : time: 83.04: 223.71% slower than str +=
__iadd__ : time: 79.07: 208.23% slower than str +=
 1000 strings in a loop with 1000 iterations, repeating 24 times
str += : time: 57.39
list join : time: 62.31: 8.58% slower than str +=
__add__ : time: 70.65: 23.10% slower than str +=
__iadd__ : time: 68.67: 19.65% slower than str +=
From the above:
- It is not worth applying patch 12/12 as it makes the code slower;
- Python 3.13t (the no-GIL build) had very bad results; it seems it
  still requires optimization;
- Python 3.9 is a lot worse (140% to 190% slower) when using list append;
- when there are not many concatenations, Python 3.13 is about 15% slower
  with lists than with string concatenation; it only approaches str +=
  when the number of concatenations is high.
Given the above, str += is clearly faster than list append.
So, unless I did something wrong in this benchmark script, please
don't apply patch 12/12.
Regards,
Mauro
---
Benchmark code:
#!/usr/bin/env python3
import argparse
import time
import sys
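# Variant 1 (the baseline): accumulate the output into a plain str with +=.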
def benchmark_str_concat(test_strings, n_ops):
start = time.time()
for _ in range(n_ops):
result = ""
for s in test_strings:
result += s
return (time.time() - start) * 1000
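# Variant 2: append chunks to a list via emit(); output() joins them into
# the final string.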
def benchmark_explicit_list(test_strings, n_ops):
class ExplicitList:
def __init__(self):
self._output = []
def emit(self, text):
self._output.append(text)
def output(self):
return ''.join(self._output)
start = time.time()
for _ in range(n_ops):
obj = ExplicitList()
for s in test_strings:
obj.emit(s)
return (time.time() - start) * 1000
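# Variant 3: list-backed class overloading __add__; 'obj += s' below falls
# back to __add__ because no __iadd__ is defined.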
def benchmark_add_overload(test_strings, n_ops):
class OutputStringAdd:
def __init__(self):
self._output = []
def __add__(self, text):
self._output.append(text)
return self
def __str__(self):
return ''.join(self._output)
start = time.time()
for _ in range(n_ops):
obj = OutputStringAdd()
for s in test_strings:
obj += s
return (time.time() - start) * 1000
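# Variant 4: list-backed class overloading __iadd__ so that 'obj += s'
# appends in place.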
def benchmark_iadd_overload(test_strings, n_ops):
class OutputStringIAdd:
def __init__(self):
self._output = []
def __iadd__(self, text):
self._output.append(text)
return self
def __str__(self):
return ''.join(self._output)
start = time.time()
for _ in range(n_ops):
obj = OutputStringIAdd()
for s in test_strings:
obj += s
return (time.time() - start) * 1000
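# Return (is_faster, percentage) describing how compare_time relates to
# base_time.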
def calculate_comparison(base_time, compare_time):
if compare_time < base_time:
return (True, (1 - compare_time/base_time)*100)
return (False, (compare_time/base_time - 1)*100)
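# Driver: run each variant in all 24 execution orders, 'repeats' times each,
# and report the averaged timings.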
def benchmark(num_reps, strings_per_run, repeats, detail):
test_strings = [f"string_{i:03d}" for i in range(strings_per_run)]
    # List of benchmarks to run: (label, function) pairs
benchmarks = [
("str +=", benchmark_str_concat),
("list join", benchmark_explicit_list),
("__add__", benchmark_add_overload),
("__iadd__", benchmark_iadd_overload)
]
# Use all possible permutations of benchmark order to reduce any
# noise due to CPU caches
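    # (The table below is equivalent to list(itertools.permutations(range(4))).)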
all_orders = [
(0, 1, 2, 3), (0, 1, 3, 2), (0, 2, 1, 3), (0, 2, 3, 1),
(0, 3, 1, 2), (0, 3, 2, 1), (1, 0, 2, 3), (1, 0, 3, 2),
(1, 2, 0, 3), (1, 2, 3, 0), (1, 3, 0, 2), (1, 3, 2, 0),
(2, 0, 1, 3), (2, 0, 3, 1), (2, 1, 0, 3), (2, 1, 3, 0),
(2, 3, 0, 1), (2, 3, 1, 0), (3, 0, 1, 2), (3, 0, 2, 1),
(3, 1, 0, 2), (3, 1, 2, 0), (3, 2, 0, 1), (3, 2, 1, 0)
]
results = {}
for name, _ in benchmarks:
results[name] = 0
# Warm-up phase to reduce caching issues
for name, fn in benchmarks:
fn(test_strings, 1)
n_repeats = len(all_orders) * repeats
print(f" {strings_per_run} strings in a loop with {num_reps} interactions, repeating {n_repeats} times")
# Actual benchmark starts here
i = 0
if detail:
headers = ['Run'] + [name for name, _ in benchmarks]
print()
print(f"\t{headers[0]:<6} {headers[1]:<12} {headers[2]:<12} {headers[3]:<12} {headers[4]:<12}")
print("\t" + "-" * 60)
for _ in range(repeats):
        # Run the benchmarks in every possible order on each pass
for order in all_orders:
run_results = {}
for idx in order:
name, func = benchmarks[idx]
run_results[name] = func(test_strings, num_reps)
results[name] += run_results[name]
if detail:
# Output results in consistent order
print(f"\t{i+1:<6}", end=" ")
for name, _ in benchmarks:
print(f"{run_results[name]:<12.2f}", end=" ")
print()
i += 1
avg_results = {}
for name, _ in benchmarks:
avg_results[name] = results[name] / repeats / len(all_orders)
if detail:
print("\t" + "-" * 60)
print(f"\t ", end=" ")
for name, _ in benchmarks:
print(f"{avg_results[name]:<12.2f}", end=" ")
print()
print()
ref = benchmarks.pop(0)
print(f"\t{ref[0]:<12} : time: {avg_results[ref[0]]:3.2f}")
for name, _ in benchmarks:
is_faster, percentage = calculate_comparison(avg_results[ref[0]], avg_results[name])
direction = "faster" if is_faster else "slower"
print(f"\t{name:<12} : time: {avg_results[name]:3.2f}: {percentage:3.2f}% {direction} than {ref[0]}")
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('-d', '--detail', action='store_true',
help='Enable detailed output')
args, remaining = parser.parse_known_args()
# Then handle the positional arguments manually
if len(remaining) != 3:
print(f"Usage: {sys.argv[0]} [-d] <num_repetitions> <strings_per_op> <repeats>")
sys.exit(1)
num_reps = int(remaining[0])
strings_per_op = int(remaining[1])
repeats = int(remaining[2])
benchmark(num_reps, strings_per_op, repeats, args.detail)