* [PATCH 0/1] sphinx-build-wrapper: Fix a performance regression
From: Mauro Carvalho Chehab @ 2025-08-18 18:10 UTC
To: Linux Doc Mailing List, corbet
Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
Akira Yokosawa
As reported by Akira, using sphinx-build-wrapper serializes PDF builds.
This patch addresses that, as discussed in the thread.
This patch applies on top of:
https://lore.kernel.org/linux-doc/cover.1755258303.git.mchehab+huawei@kernel.org/
I opted to send it separately to avoid respinning the entire series just for
a new patch at the end.
Mauro Carvalho Chehab (1):
scripts/sphinx-build-wrapper: allow building PDF files in parallel
scripts/sphinx-build-wrapper | 141 ++++++++++++++++++++++++++---------
1 file changed, 106 insertions(+), 35 deletions(-)
--
2.50.1
* [PATCH 1/1] scripts/sphinx-build-wrapper: allow building PDF files in parallel
From: Mauro Carvalho Chehab @ 2025-08-18 18:10 UTC
To: Linux Doc Mailing List, corbet
Cc: Mauro Carvalho Chehab, Akira Yokosawa, Mauro Carvalho Chehab,
linux-kernel
Use the POSIX jobserver when available, or -j<number>, to run PDF
builds in parallel, restoring PDF build performance. However,
building in parallel while debugging problems is a bad idea, so
when the script is called directly from the command line, the
build is serialized unless "-j" is explicitly requested.
With this change, a PDF docs build now takes around 5 minutes
on a Ryzen 9 machine with 32 CPU threads:
# Explicitly parallelize both Sphinx and LaTeX PDF builds
$ make cleandocs; time scripts/sphinx-build-wrapper pdfdocs -j 33
real 5m17.901s
user 15m1.499s
sys 2m31.482s
# Use the POSIX jobserver to parallelize both sphinx-build and LaTeX
$ make cleandocs; time make pdfdocs
real 5m22.369s
user 15m9.076s
sys 2m31.419s
# Serialize the PDF build while keeping Sphinx parallelized.
# This is equivalent to passing -jauto via the command line
$ make cleandocs; time scripts/sphinx-build-wrapper pdfdocs
real 11m20.901s
user 13m2.910s
sys 1m44.553s
Reported-by: Akira Yokosawa <akiyks@gmail.com>
Closes: https://lore.kernel.org/linux-doc/9b3b0430-066f-486e-89cc-00e6f1f3b096@gmail.com/
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
scripts/sphinx-build-wrapper | 141 ++++++++++++++++++++++++++---------
1 file changed, 106 insertions(+), 35 deletions(-)
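The job-count selection described above can be sketched in isolation as
follows. This is only an illustration of the precedence rules, not the
code from the patch: the function name resolve_pdf_jobs and both
parameter names are hypothetical, and the jobserver claim is passed in
as a plain number instead of being read from a JobserverExec object.

```python
def resolve_pdf_jobs(cli_jobs, jobserver_claim):
    """Pick how many LaTeX builds may run at once.

    cli_jobs        -- value given with -j: an int, a numeric string,
                       "auto" or None (hypothetical parameter name)
    jobserver_claim -- slots claimed from the make jobserver, or 0/None
                       when not running under make (hypothetical)
    """
    # Serialize by default: interleaved LaTeX logs are hard to debug.
    n_jobs = 1

    # An explicit number on the command line enables parallel builds.
    if cli_jobs:
        try:
            n_jobs = int(cli_jobs)
        except ValueError:
            pass  # e.g. "auto": keep the serialized default

    # Under make, the slots claimed from the jobserver take precedence.
    if jobserver_claim:
        n_jobs = jobserver_claim

    return n_jobs
```

With these rules, running the wrapper directly without -j (or with
-jauto) keeps the PDF step serial, "-j 33" gives 33 workers, and a build
started from make uses whatever the jobserver hands out.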
diff --git a/scripts/sphinx-build-wrapper b/scripts/sphinx-build-wrapper
index f21701d34552..0d13c19f6df3 100755
--- a/scripts/sphinx-build-wrapper
+++ b/scripts/sphinx-build-wrapper
@@ -53,6 +53,7 @@ import shutil
import subprocess
import sys
+from concurrent import futures
from glob import glob
LIB_DIR = "lib"
@@ -295,6 +296,76 @@ class SphinxBuilder:
except (OSError, IOError) as e:
print(f"Warning: Failed to copy CSS: {e}", file=sys.stderr)
+    def build_pdf_file(self, latex_cmd, from_dir, path):
+        """Builds a single pdf file using latex_cmd"""
+        try:
+            subprocess.run(latex_cmd + [path],
+                           cwd=from_dir, check=True)
+
+            return True
+        except subprocess.CalledProcessError:
+            # LaTeX PDF error code is almost useless: it returns
+            # error codes even when build succeeds but has warnings.
+            # So, we'll ignore the results
+            return False
+
+    def pdf_parallel_build(self, tex_suffix, latex_cmd, tex_files, n_jobs):
+        """Build PDF files in parallel if possible"""
+        builds = {}
+        build_failed = False
+        max_len = 0
+        has_tex = False
+
+        # Process files in parallel
+        with futures.ThreadPoolExecutor(max_workers=n_jobs) as executor:
+            jobs = {}
+
+            for from_dir, pdf_dir, entry in tex_files:
+                name = entry.name
+
+                if not name.endswith(tex_suffix):
+                    continue
+
+                name = name[:-len(tex_suffix)]
+
+                max_len = max(max_len, len(name))
+
+                has_tex = True
+
+                future = executor.submit(self.build_pdf_file, latex_cmd,
+                                         from_dir, entry.path)
+                jobs[future] = (from_dir, name, entry.path)
+
+            for future in futures.as_completed(jobs):
+                from_dir, name, path = jobs[future]
+
+                pdf_name = name + ".pdf"
+                pdf_from = os.path.join(from_dir, pdf_name)
+
+                try:
+                    success = future.result()
+
+                    if success and os.path.exists(pdf_from):
+                        pdf_to = os.path.join(pdf_dir, pdf_name)
+
+                        os.rename(pdf_from, pdf_to)
+                        builds[name] = os.path.relpath(pdf_to, self.builddir)
+                    else:
+                        builds[name] = "FAILED"
+                        build_failed = True
+                except Exception as e:
+                    builds[name] = f"FAILED ({str(e)})"
+                    build_failed = True
+
+        # Handle case where no .tex files were found
+        if not has_tex:
+            name = "Sphinx LaTeX builder"
+            max_len = max(max_len, len(name))
+            builds[name] = "FAILED (no .tex file was generated)"
+            build_failed = True
+
+        return builds, build_failed, max_len
+
def handle_pdf(self, output_dirs):
"""
Extra steps for PDF output.
@@ -305,7 +376,10 @@ class SphinxBuilder:
"""
builds = {}
max_len = 0
+ tex_suffix = ".tex"
+ # Get all tex files that will be used for PDF build
+ tex_files = []
for from_dir in output_dirs:
pdf_dir = os.path.join(from_dir, "../pdf")
os.makedirs(pdf_dir, exist_ok=True)
@@ -317,49 +391,46 @@ class SphinxBuilder:
latex_cmd.extend(shlex.split(self.latexopts))
- tex_suffix = ".tex"
-
- # Process each .tex file
- has_tex = False
- build_failed = False
+ # Get a list of tex files to process
with os.scandir(from_dir) as it:
for entry in it:
- if not entry.name.endswith(tex_suffix):
- continue
+ if entry.name.endswith(tex_suffix):
+ tex_files.append((from_dir, pdf_dir, entry))
- name = entry.name[:-len(tex_suffix)]
- has_tex = True
+ # When using make, this won't be used, as the number of jobs comes
+ # from POSIX jobserver. So, this covers the case where build comes
+ # from command line. On such case, serialize by default, except if
+ # the user explicitly sets the number of jobs.
+ n_jobs = 1
- try:
- subprocess.run(latex_cmd + [entry.path],
- cwd=from_dir, check=True)
- except subprocess.CalledProcessError:
- # LaTeX PDF error code is almost useless: it returns
- # error codes even when build succeeds but has warnings.
- pass
+ # n_jobs is either an integer or "auto". Only use it if it is a number
+ if self.n_jobs:
+ try:
+ n_jobs = int(self.n_jobs)
+ except ValueError:
+ pass
- # Instead of checking errors, let's do the next best thing:
- # check if the PDF file was actually created.
+ # When using make, jobserver.claim is the number of jobs that were
+ # used with "-j" and that aren't used by other make targets
+ with JobserverExec() as jobserver:
+ n_jobs = 1
- pdf_name = name + ".pdf"
- pdf_from = os.path.join(from_dir, pdf_name)
- pdf_to = os.path.join(pdf_dir, pdf_name)
+ # Handle the case when a parameter is passed via command line,
+ # using it as default, if jobserver doesn't claim anything
+ if self.n_jobs:
+ try:
+ n_jobs = int(self.n_jobs)
+ except ValueError:
+ pass
- if os.path.exists(pdf_from):
- os.rename(pdf_from, pdf_to)
- builds[name] = os.path.relpath(pdf_to, self.builddir)
- else:
- builds[name] = "FAILED"
- build_failed = True
+ if jobserver.claim:
+ n_jobs = jobserver.claim
- name = entry.name.removesuffix(".tex")
- max_len = max(max_len, len(name))
-
- if not has_tex:
- name = os.path.basename(from_dir)
- max_len = max(max_len, len(name))
- builds[name] = "FAILED (no .tex)"
- build_failed = True
+ # Build files in parallel
+ builds, build_failed, max_len = self.pdf_parallel_build(tex_suffix,
+ latex_cmd,
+ tex_files,
+ n_jobs)
msg = "Summary"
msg += "\n" + "=" * len(msg)
--
2.50.1
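For readers who want to try the pattern outside the kernel tree, here is
a minimal standalone sketch of the same idea: submit one build per file
to a thread pool, collect results as they complete, and judge success by
whether the output file exists instead of by the exit status. It is not
the patch's code. The names run_one and build_all and the task tuple
layout are assumptions made for the example, and a small Python command
stands in for LaTeX. The sketch stores the destination directory next to
each future, so the collecting loop reads nothing left over from the
submission loop.

```python
import os
import subprocess
import sys
import tempfile
from concurrent import futures


def run_one(cmd, cwd, output):
    """Run one build; success means the output file exists afterwards."""
    # The exit status is deliberately ignored (check=False).
    subprocess.run(cmd, cwd=cwd, check=False)
    return os.path.exists(os.path.join(cwd, output))


def build_all(tasks, n_jobs):
    """tasks: iterable of (name, cmd, cwd, output, dest_dir) tuples."""
    results = {}
    with futures.ThreadPoolExecutor(max_workers=n_jobs) as executor:
        # Map each future to everything the collector needs later.
        pending = {
            executor.submit(run_one, cmd, cwd, output):
                (name, cwd, output, dest_dir)
            for name, cmd, cwd, output, dest_dir in tasks
        }
        for future in futures.as_completed(pending):
            name, cwd, output, dest_dir = pending[future]
            try:
                if future.result():
                    os.makedirs(dest_dir, exist_ok=True)
                    dest = os.path.join(dest_dir, output)
                    os.rename(os.path.join(cwd, output), dest)
                    results[name] = dest
                else:
                    results[name] = "FAILED"
            except Exception as e:
                results[name] = f"FAILED ({e})"
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        src = os.path.join(top, "latex")
        dst = os.path.join(top, "pdf")
        os.makedirs(src)
        # Stand-in for LaTeX: writes the file, then exits non-zero.
        good = [sys.executable, "-c",
                "open('a.pdf', 'w').close(); raise SystemExit(1)"]
        # Stand-in for a build that produces nothing.
        bad = [sys.executable, "-c", "pass"]
        print(build_all([("a", good, src, "a.pdf", dst),
                         ("b", bad, src, "b.pdf", dst)], n_jobs=2))
```

Threads are enough here because each worker spends its time waiting on
a child process, so the interpreter lock is not a bottleneck.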
* Re: [PATCH 1/1] scripts/sphinx-build-wrapper: allow building PDF files in parallel
From: Akira Yokosawa @ 2025-08-19 9:09 UTC
To: Mauro Carvalho Chehab
Cc: linux-kernel, Linux Doc Mailing List, corbet, Akira Yokosawa
On Mon, 18 Aug 2025 20:10:01 +0200, Mauro Carvalho Chehab wrote:
> Use the POSIX jobserver when available, or -j<number>, to run PDF
> builds in parallel, restoring PDF build performance. However,
> building in parallel while debugging problems is a bad idea, so
> when the script is called directly from the command line, the
> build is serialized unless "-j" is explicitly requested.
>
> With this change, a PDF docs build now takes around 5 minutes
> on a Ryzen 9 machine with 32 CPU threads:
>
> # Explicitly parallelize both Sphinx and LaTeX PDF builds
> $ make cleandocs; time scripts/sphinx-build-wrapper pdfdocs -j 33
>
> real 5m17.901s
> user 15m1.499s
> sys 2m31.482s
>
> # Use the POSIX jobserver to parallelize both sphinx-build and LaTeX
> $ make cleandocs; time make pdfdocs
>
> real 5m22.369s
> user 15m9.076s
> sys 2m31.419s
>
> # Serialize the PDF build while keeping Sphinx parallelized.
> # This is equivalent to passing -jauto via the command line
> $ make cleandocs; time scripts/sphinx-build-wrapper pdfdocs
>
> real 11m20.901s
> user 13m2.910s
> sys 1m44.553s
>
Sounds promising to me!
I couldn't test this because I couldn't apply your sphinx-build-wrapper
series on top of docs-next. :-/
Which commit is it based on?
Thanks,
Akira
> Reported-by: Akira Yokosawa <akiyks@gmail.com>
> Closes: https://lore.kernel.org/linux-doc/9b3b0430-066f-486e-89cc-00e6f1f3b096@gmail.com/
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> scripts/sphinx-build-wrapper | 141 ++++++++++++++++++++++++++---------
> 1 file changed, 106 insertions(+), 35 deletions(-)
* Re: [PATCH 1/1] scripts/sphinx-build-wrapper: allow building PDF files in parallel
From: Jonathan Corbet @ 2025-08-19 13:21 UTC
To: Akira Yokosawa, Mauro Carvalho Chehab
Cc: linux-kernel, Linux Doc Mailing List, Akira Yokosawa
Akira Yokosawa <akiyks@gmail.com> writes:
> I couldn't test this because I couldn't apply your sphinx-build-wrapper
> series on top of docs-next. :-/
> Which commit is it based on?
It is built on top of the PDF series - I ran into that too.
jon
* Re: [PATCH 1/1] scripts/sphinx-build-wrapper: allow building PDF files in parallel
From: Mauro Carvalho Chehab @ 2025-08-19 14:27 UTC
To: Jonathan Corbet
Cc: Akira Yokosawa, Mauro Carvalho Chehab, linux-kernel,
Linux Doc Mailing List
On Tue, Aug 19, 2025 at 07:21:09AM -0600, Jonathan Corbet wrote:
> Akira Yokosawa <akiyks@gmail.com> writes:
>
> > I couldn't test this because I couldn't apply your sphinx-build-wrapper
> > series on top of docs-next. :-/
> > Which commit is it based on?
>
> It is built on top of the PDF series - I ran into that too.
Actually, the sequence is:
- PDF series
- sphinx-build-wrapper
- this one
The dependencies between PDF and sphinx-build-wrapper are
trivial, though: both touch the Makefile; the first one
changes a broken parameter for PDF; the second one simplifies
it a lot.
--
Thanks,
Mauro