Re: [PATCH 12/12] docs: kdoc: Improve the output text accumulation

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Akira Yokosawa <akiyks@gmail.com>
Subject: Re: [PATCH 12/12] docs: kdoc: Improve the output text accumulation
Date: Thu, 10 Jul 2025 10:19:31 +0200	[thread overview]
Message-ID: <20250710101931.202953d1@foz.lan> (raw)
In-Reply-To: <20250710091352.4ae01211@foz.lan>

Em Thu, 10 Jul 2025 09:13:52 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:

> Em Thu, 10 Jul 2025 08:41:19 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:
> 
> > Em Wed,  2 Jul 2025 16:35:24 -0600
> > Jonathan Corbet <corbet@lwn.net> escreveu:
> >   
> > > Building strings with repeated concatenation is somewhat inefficient in
> > > Python; it is better to make a list and glom them all together at the end.
> > > Add a small set of methods to the OutputFormat superclass to manage the
> > > output string, and use them throughout.
> > > 
> > > Signed-off-by: Jonathan Corbet <corbet@lwn.net>    
> > 
> > The patch looks good to me. Just a minor nit below.
> >   
> > > ---
> > >  scripts/lib/kdoc/kdoc_output.py | 185 +++++++++++++++++---------------
> > >  1 file changed, 98 insertions(+), 87 deletions(-)
> > > 
> > > diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
> > > index ea8914537ba0..d4aabdaa9c51 100644
> > > --- a/scripts/lib/kdoc/kdoc_output.py
> > > +++ b/scripts/lib/kdoc/kdoc_output.py
> > > @@ -73,7 +73,19 @@ class OutputFormat:
> > >          self.config = None
> > >          self.no_doc_sections = False
> > >  
> > > -        self.data = ""
> > > +    #
> > > +    # Accumulation and management of the output text.
> > > +    #
> > > +    def reset_output(self):
> > > +        self._output = []
> > > +
> > > +    def emit(self, text):
> > > +        """Add a string to out output text"""
> > > +        self._output.append(text)
> > > +
> > > +    def output(self):
> > > +        """Obtain the accumulated output text"""
> > > +        return ''.join(self._output)    
> > 
> > I would prefer to use a more Pythonic name for this function:
> > 
> > 	def __str__(self)
> > 
> > This way, all it takes to get the final string is to use str():
> > 
> > 	out_str = str(out)
> > 
> > With that:
> > 
> > Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>  
> 
> 
> Hmm... actually, I would code it on a different way, using something like:
> 
> class OutputString:
>     def __init__(self):
> 	"""Initialize internal list"""
>         self._output = []
>     
>     # Probably not needed - The code can simply do, instead:
>     # a = OutputString() to create a new string.
>     def reset(self):
>         """Reset the output text"""
>         self._output = []
>     
>     def __add__(self, text):
> 	"""Add a string to out output text"""
>         if not isinstance(text, str):
>             raise TypeError("Can only append strings")
>         self._output.append(text)
>         return self
> 
>     def __str__(self):
>         return ''.join(self._output)
>     
>     # and, if needed, add a getter/setter:
> 
>     @property
>     def data(self):
>         """Getter for the current output"""
>         return ''.join(self._output)
> 
>     @data.setter
>     def data(self, new_value):
>         if isinstance(new_value, str):
> 	    self._output = [new_value]
> 	elif isinstance(new_value, list):
>             self._output = new_value
>         else:
>             raise TypeError("Value should be either list or string")
> 
> That would allow things like:
> 
> 	out = OutputString()
> 	out = out + "Foo" + " " + "Bar"
> 	print(out)
> 
> 	out = OutputString()
> 	out += "Foo"
> 	out += " "
> 	out += "Bar"
> 	return(str(out))
> 
> and won't require much changes at the output logic, and IMO will
> provide a cleaner code.
> 
> Thanks,
> Mauro

Heh, on those times where LLM can quickly code trivial things for us,
I actually decided to test 3 different variants:

- using string +=
- using list append
- using __add__
- using __iadd__

Except if the LLM-generated did something wrong (I double checked, and
was unable to find any issues), the results on Python 3.13.5 are:

$ /tmp/bench.py
Benchmarking 1,000 ops × 1000 strings = 1,000,000 appends

Run    str+=        ExplicitList __add__      __iadd__    
------------------------------------------------------------
1      25.26        29.44        53.42        50.71       
2      29.34        29.35        53.45        50.61       
3      29.44        29.56        53.41        50.67       
4      29.28        29.23        53.26        50.64       
5      29.28        29.20        45.90        40.47       
6      23.53        23.62        42.74        40.61       
7      23.43        23.76        42.97        40.78       
8      23.51        23.59        42.67        40.61       
9      23.43        23.52        42.77        40.72       
10     23.53        23.54        42.78        40.67       
11     23.83        23.63        42.98        40.87       
12     23.49        23.45        42.67        40.53       
13     23.43        23.69        42.75        40.66       
14     23.47        23.49        42.70        40.56       
15     23.44        23.63        42.72        40.52       
16     23.51        23.56        42.65        40.66       
17     23.48        23.60        42.86        40.81       
18     23.67        23.53        42.73        40.59       
19     23.75        23.62        42.78        40.58       
20     23.68        23.55        42.77        40.54       
21     23.65        23.67        42.76        40.59       
22     23.73        23.49        42.78        40.61       
23     23.61        23.59        42.78        40.58       
24     23.66        23.51        42.73        40.55       
------------------------------------------------------------
Avg    24.60        24.78        44.67        42.30       

Summary:
ExplicitList : 100.74% slower than str+=
__add__      : 181.56% slower than str+=
__iadd__     : 171.93% slower than str+=

(running it a couple of times sometimes it sometimes show list
 addition a little bit better, bu all at the +/- 1% range)

In practice, it means that my suggestion of using __add__ (or even
using the __iadd__ variant) was not good, but it also showed
that Python 3.13 implementation is actually very efficient
with str += operations.

With that, I would just drop this patch, as the performance is
almost identical, and using "emit()" instead of "+=" IMO makes
the code less clear.

-

Btw, with Python 3.9, "".join(list) is a lot worse than str += :

$ python3.9 /tmp/bench.py
Benchmarking 1,000 ops × 1000 strings = 1,000,000 appends

Run    str+=        ExplicitList __add__      __iadd__    
------------------------------------------------------------
1      28.27        87.24        96.03        88.81       
2      32.76        87.35        87.40        88.92       
3      32.69        85.98        73.01        70.87       
4      26.28        69.80        70.62        71.90       
5      27.21        70.54        71.04        72.00       
6      27.77        70.06        70.22        70.92       
7      27.03        69.75        70.30        70.89       
8      33.31        72.63        70.57        70.59       
9      26.33        70.15        70.27        70.97       
10     26.29        69.84        71.60        70.94       
11     26.59        69.60        70.16        71.26       
12     26.38        69.57        71.64        70.95       
13     26.41        69.89        70.11        70.85       
14     26.38        69.86        70.36        70.93       
15     26.43        69.57        70.18        70.90       
16     26.38        70.04        70.26        71.19       
17     26.40        70.02        80.50        71.01       
18     26.41        71.74        80.39        71.90       
19     28.06        69.60        71.95        70.88       
20     28.28        69.90        71.12        71.07       
21     26.34        69.74        72.42        71.02       
22     26.33        69.86        70.25        70.97       
23     26.40        69.78        71.64        71.10       
24     26.44        69.73        70.23        70.83       
------------------------------------------------------------
Avg    27.55        72.18        73.43        72.57       

Summary:
ExplicitList : 262.00% slower than str+=
__add__      : 266.54% slower than str+=
__iadd__     : 263.42% slower than str+=


Thanks,
Mauro

---

#!/usr/bin/env python3

import timeit

class ExplicitList:
    def __init__(self):
        self._output = []

    def emit(self, text):
        self._output.append(text)

    def output(self):
        return ''.join(self._output)

class OutputStringAdd:
    def __init__(self):
        self._output = []

    def __add__(self, text):
        self._output.append(text)
        return self

    def __str__(self):
        return ''.join(self._output)

class OutputStringIAdd:
    def __init__(self):
        self._output = []

    def __iadd__(self, text):
        self._output.append(text)
        return self

    def __str__(self):
        return ''.join(self._output)

def calculate_comparison(base_time, compare_time):
    """Returns tuple of (is_faster, percentage)"""
    if compare_time < base_time:
        return (True, (1 - compare_time/base_time)*100)
    else:
        return (False, (compare_time/base_time)*100)

def benchmark():
    N = 1000       # Operations
    STRINGS_PER_OP = 1000
    REPEATS = 24

    # Generate test data (1000 unique 10-character strings)
    test_strings = [f"string_{i:03d}" for i in range(STRINGS_PER_OP)]

    print(f"Benchmarking {N:,} ops × {STRINGS_PER_OP} strings = {N*STRINGS_PER_OP:,} appends\n")
    headers = ['Run', 'str+=', 'ExplicitList', '__add__', '__iadd__']
    print(f"{headers[0]:<6} {headers[1]:<12} {headers[2]:<12} {headers[3]:<12} {headers[4]:<12}")
    print("-" * 60)

    results = []

    for i in range(REPEATS):
        # Benchmark normal string +=
        t_str = timeit.timeit(
            'result = ""\nfor s in test_strings: result += s',
            globals={'test_strings': test_strings},
            number=N
        ) * 1000

        # Benchmark ExplicitList
        t_explicit = timeit.timeit(
            'obj = ExplicitList()\nfor s in test_strings: obj.emit(s)',
            globals={'test_strings': test_strings, 'ExplicitList': ExplicitList},
            number=N
        ) * 1000

        # Benchmark __add__ version
        t_add = timeit.timeit(
            'obj = OutputStringAdd()\nfor s in test_strings: obj += s',
            globals={'test_strings': test_strings, 'OutputStringAdd': OutputStringAdd},
            number=N
        ) * 1000

        # Benchmark __iadd__ version
        t_iadd = timeit.timeit(
            'obj = OutputStringIAdd()\nfor s in test_strings: obj += s',
            globals={'test_strings': test_strings, 'OutputStringIAdd': OutputStringIAdd},
            number=N
        ) * 1000

        results.append((t_str, t_explicit, t_add, t_iadd))
        print(f"{i+1:<6} {t_str:<12.2f} {t_explicit:<12.2f} {t_add:<12.2f} {t_iadd:<12.2f}")

    # Calculate averages
    avg_str = sum(r[0] for r in results) / REPEATS
    avg_explicit = sum(r[1] for r in results) / REPEATS
    avg_add = sum(r[2] for r in results) / REPEATS
    avg_iadd = sum(r[3] for r in results) / REPEATS

    print("-" * 60)
    print(f"{'Avg':<6} {avg_str:<12.2f} {avg_explicit:<12.2f} {avg_add:<12.2f} {avg_iadd:<12.2f}")

    print()
    print("Summary:")
    # Calculate and print comparisons
    for name, time in [("ExplicitList", avg_explicit),
                      ("__add__", avg_add),
                      ("__iadd__", avg_iadd)]:
        is_faster, percentage = calculate_comparison(avg_str, time)
        if is_faster:
            print(f"{name:<12} : {percentage:.2f}% faster than str+=")
        else:
            print(f"{name:<12} : {percentage:.2f}% slower than str+=")


if __name__ == "__main__":
    benchmark()

next prev parent reply	other threads:[~2025-07-10  8:19 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-02 22:35 [PATCH 00/12] [PATCH 00/11] Thrash up the parser/output interface Jonathan Corbet
2025-07-02 22:35 ` [PATCH 01/12] docs: kdoc; Add a rudimentary class to represent output items Jonathan Corbet
2025-07-10  5:28   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 02/12] docs: kdoc: simplify the output-item passing Jonathan Corbet
2025-07-10  5:29   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 03/12] docs: kdoc: drop "sectionlist" Jonathan Corbet
2025-07-09 16:27   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 04/12] docs: kdoc: Centralize handling of the item section list Jonathan Corbet
2025-07-10  5:45   ` Mauro Carvalho Chehab
2025-07-10 13:25     ` Jonathan Corbet
2025-07-02 22:35 ` [PATCH 05/12] docs: kdoc: remove the "struct_actual" machinery Jonathan Corbet
2025-07-10  6:11   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 06/12] docs: kdoc: use self.entry.parameterlist directly in check_sections() Jonathan Corbet
2025-07-10  6:12   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 07/12] docs: kdoc: Coalesce parameter-list handling Jonathan Corbet
2025-07-10  6:20   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 08/12] docs: kdoc: Regularize the use of the declaration name Jonathan Corbet
2025-07-10  6:22   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 09/12] docs: kdoc: straighten up dump_declaration() Jonathan Corbet
2025-07-10  6:25   ` Mauro Carvalho Chehab
2025-07-10 13:27     ` Jonathan Corbet
2025-07-10 22:13       ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 10/12] docs: kdoc: directly access the always-there KdocItem fields Jonathan Corbet
2025-07-10  6:27   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 11/12] docs: kdoc: clean up check_sections() Jonathan Corbet
2025-07-10  6:29   ` Mauro Carvalho Chehab
2025-07-02 22:35 ` [PATCH 12/12] docs: kdoc: Improve the output text accumulation Jonathan Corbet
2025-07-10  6:41   ` Mauro Carvalho Chehab
2025-07-10  7:13     ` Mauro Carvalho Chehab
2025-07-10  8:19       ` Mauro Carvalho Chehab [this message]
2025-07-10 10:10         ` Mauro Carvalho Chehab
2025-07-10 10:31           ` Mauro Carvalho Chehab
2025-07-10 10:59             ` Mauro Carvalho Chehab
2025-07-10 23:30         ` Jonathan Corbet
2025-07-11  6:14           ` Mauro Carvalho Chehab
2025-07-11 12:49             ` Jonathan Corbet
2025-07-11 16:28               ` Mauro Carvalho Chehab
2025-07-11 16:39                 ` Jonathan Corbet
2025-07-03  2:07 ` [PATCH 00/12] [PATCH 00/11] Thrash up the parser/output interface Yanteng Si
2025-07-09 15:29 ` Jonathan Corbet
2025-07-09 16:21   ` Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250710101931.202953d1@foz.lan \
    --to=mchehab+huawei@kernel.org \
    --cc=akiyks@gmail.com \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.