From: Alejandro Colomar <alx@kernel.org>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Deri <deri@chuzzlewit.myzen.co.uk>, Jonny Grant <jg@jguk.org>,
linux-man <linux-man@vger.kernel.org>
Subject: Re: PDF book of unreleased pages (was: strncpy clarify result may not be null terminated)
Date: Mon, 20 Nov 2023 10:43:53 +0100 [thread overview]
Message-ID: <ZVsqX_suG6VJe0BG@devuan> (raw)
In-Reply-To: <20231120004525.acgivh3htslijygr@illithid>
[-- Attachment #1: Type: text/plain, Size: 4450 bytes --]
Hi Branden,
On Sun, Nov 19, 2023 at 06:46:29PM -0600, G. Branden Robinson wrote:
> Hi Alex and Deri,
>
> I'm going to address just a few small parts of this message...
>
> At 2023-11-19T21:58:03+0100, Alejandro Colomar wrote:
> > You can always `find ... | xargs cat | troff /dev/stdin`
>
> ...not if you need to preprocess any of the input. With tbl(1), for
> instance.
What I mean is that I can preprocess individually:
find ... | while read f; do eqn $f > $f.troff; done
And only process together in a single invocation what _needs_ to be done
in a single invocation:
find ... | xargs cat | gropdf /dev/stdin
I guess that preprocessors can be run per-file.
I know that gropdf(1) must be run with the entire book as input.
But I don't know if `troff -Tpdf` needs to see the entire book at once,
or if it can process each file separately.
In my laptop, the pipeline for building the Linux Man Book takes 23.3 s.
I've split the processing of the book so that I produce every
intermediary file in the pipeline (except pic(1), which I think we don't
need). From that, I've seen the times it takes for each program to do
its job (and importantly, the overall time wasn't slower; it took again
23.3 s): preconv(1) takes 0.04 s; tbl(1) takes 0.06 s; eqn(1) takes
0.05 s; troff(1) takes 2.8 s; and gropdf(1) takes 17.6 s.
The time taken by gropdf(1) is mandatory, since it can't process the
individual files separately. But if we can reduce the time taken by all
other programs close to 0, it would be good. It depends on which
programs need to see the entire book, and which can process each file
separately.
Nevertheless, I think it's interesting to process the book per-file, as
much as possible, even if the overall time won't change significantly.
It is a good documentation of what needs to be processed together and
what not, when building a PDF document with groff.
> > My problem is probably that I don't know what's done by `gropdf`, and
> > what's done by `troff -Tpdf`. I was hoping that `troff -Tpdf` still
> > didn't need to know about the entire book, and that only gropdf(1)
> > would need that.
>
> This stuff is documented in groff's Texinfo manual, and in the groff(1)
> and roff(7) man pages.
>
> Here's an excerpt of the last.
>
> Using roff
> When you read a man page, often a roff is the program rendering
> it. Some roff implementations provide wrapper programs that make
> it easy to use the roff system from the shell’s command line.
> These can be specific to a macro package, like mmroff(1), or more
> general. groff(1) provides command‐line options sparing the user
> from constructing the long, order‐dependent pipelines familiar to
> AT&T troff users. Further, a heuristic program, grog(1), is
> available to infer from a document’s contents which groff
> arguments should be used to process it.
>
> The roff pipeline
> A typical roff document is prepared by running one or more
> processors in series, followed by a a formatter program and then
> an output driver (or “device postprocessor”). Commonly, these
> programs are structured into a pipeline; that is, each is run in
> sequence such that the output of one is taken as the input to the
> next, without passing through secondary storage. (On non‐Unix
> systems, pipelines may have to be simulated with temporary
> files.)
>
> $ preproc1 < input‐file | preproc2 | ... | troff [option] ... \
> | output‐driver
>
> Once all preprocessors have run, they deliver pure roff language
> input to the formatter, which in turn generates a document in a
> page description language that is then interpreted by a
> postprocessor for viewing, printing, or further processing.
>
> gropdf(1) is the output driver for the PDF "device". So "groff -T pdf
> input.tr" and "troff -T pdf input.tr | gropdf" are equivalent.
>
> (Yes, you still need the `-T pdf` arguments, even to troff proper.
This doesn't answer my doubt. For generating a book, does troff(1) need
to see the entire book, or it enough if gropdf(1) does? My guess is
that troff(1) also needs to see the entire book, but I don't know for
sure.
Cheers,
Alex
--
<https://www.alejandro-colomar.es/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-11-20 9:40 UTC|newest]
Thread overview: 138+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-04 11:27 strncpy clarify result may not be null terminated Jonny Grant
2023-11-04 19:33 ` Alejandro Colomar
2023-11-04 21:18 ` Jonny Grant
2023-11-05 1:36 ` Alejandro Colomar
2023-11-05 21:16 ` Jonny Grant
2023-11-05 23:31 ` Alejandro Colomar
2023-11-07 11:52 ` Jonny Grant
2023-11-07 13:23 ` Alejandro Colomar
2023-11-07 14:19 ` Jonny Grant
2023-11-07 16:17 ` Alejandro Colomar
2023-11-07 17:00 ` Jonny Grant
2023-11-07 17:20 ` Alejandro Colomar
2023-11-08 6:18 ` Oskari Pirhonen
2023-11-08 9:51 ` Alejandro Colomar
2023-11-08 9:59 ` Thorsten Kukuk
2023-11-08 15:09 ` Alejandro Colomar
[not found] ` <6bcad2492ab843019aa63895beaea2ce@DB6PR04MB3255.eurprd04.prod.outlook.com>
2023-11-08 15:44 ` Thorsten Kukuk
2023-11-08 17:26 ` Adhemerval Zanella Netto
2023-11-08 14:06 ` Zack Weinberg
2023-11-08 15:07 ` Alejandro Colomar
2023-11-08 19:45 ` G. Branden Robinson
2023-11-08 21:35 ` Carlos O'Donell
2023-11-08 22:11 ` Alejandro Colomar
2023-11-08 23:31 ` Paul Eggert
2023-11-09 0:29 ` Alejandro Colomar
2023-11-09 10:13 ` Jonny Grant
2023-11-09 11:08 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-09 14:06 ` catenate vs concatenate Jonny Grant
2023-11-27 14:33 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Zack Weinberg
2023-11-27 15:08 ` Alejandro Colomar
2023-11-27 15:13 ` Alejandro Colomar
2023-11-27 16:59 ` G. Branden Robinson
2023-11-27 18:35 ` Zack Weinberg
2023-11-27 23:45 ` G. Branden Robinson
2023-11-09 11:13 ` strncpy clarify result may not be null terminated Alejandro Colomar
2023-11-09 14:05 ` Jonny Grant
2023-11-09 15:04 ` Alejandro Colomar
2023-11-08 19:04 ` DJ Delorie
2023-11-08 19:40 ` Alejandro Colomar
2023-11-08 19:58 ` DJ Delorie
2023-11-08 20:13 ` Alejandro Colomar
2023-11-08 21:07 ` DJ Delorie
2023-11-08 21:50 ` Alejandro Colomar
2023-11-08 22:17 ` [PATCH] stpncpy.3, string_copying.7: Clarify that st[rp]ncpy() do NOT produce a string Alejandro Colomar
2023-11-08 23:06 ` Paul Eggert
2023-11-08 23:28 ` DJ Delorie
2023-11-09 0:24 ` Alejandro Colomar
2023-11-09 14:11 ` Jonny Grant
2023-11-09 14:35 ` Alejandro Colomar
2023-11-09 14:47 ` Jonny Grant
2023-11-09 15:02 ` Alejandro Colomar
2023-11-09 17:30 ` DJ Delorie
2023-11-09 17:54 ` Andreas Schwab
2023-11-09 18:00 ` Alejandro Colomar
2023-11-09 19:42 ` Jonny Grant
2023-11-09 7:23 ` Oskari Pirhonen
2023-11-09 15:20 ` [PATCH v2 1/2] " Alejandro Colomar
2023-11-09 15:20 ` [PATCH v2 2/2] stpncpy.3, string.3, string_copying.7: Clarify that st[rp]ncpy() pad with null bytes Alejandro Colomar
2023-11-10 5:47 ` Oskari Pirhonen
2023-11-10 10:47 ` Alejandro Colomar
2023-11-08 2:12 ` strncpy clarify result may not be null terminated Matthew House
2023-11-08 19:33 ` Alejandro Colomar
2023-11-08 19:40 ` Alejandro Colomar
2023-11-09 3:13 ` Matthew House
2023-11-09 10:26 ` Jonny Grant
2023-11-09 10:31 ` Jonny Grant
2023-11-09 11:38 ` Alejandro Colomar
2023-11-09 12:43 ` Alejandro Colomar
2023-11-09 12:51 ` Xi Ruoyao
2023-11-09 14:01 ` Alejandro Colomar
2023-11-09 18:11 ` Paul Eggert
2023-11-09 23:48 ` Alejandro Colomar
2023-11-10 5:36 ` Paul Eggert
2023-11-10 11:05 ` Alejandro Colomar
2023-11-10 11:47 ` Alejandro Colomar
2023-11-10 17:58 ` Paul Eggert
2023-11-10 18:36 ` Alejandro Colomar
2023-11-10 20:19 ` Alejandro Colomar
2023-11-10 23:44 ` Jonny Grant
2023-11-10 19:52 ` Alejandro Colomar
2023-11-10 22:14 ` Paul Eggert
2023-11-11 21:13 ` Alejandro Colomar
2023-11-11 22:20 ` Paul Eggert
2023-11-12 9:52 ` Jonny Grant
2023-11-12 10:59 ` Alejandro Colomar
2023-11-12 20:49 ` Paul Eggert
2023-11-12 21:00 ` Alejandro Colomar
2023-11-12 21:45 ` Alejandro Colomar
2023-11-13 23:46 ` Jonny Grant
2023-11-17 21:57 ` Jonny Grant
2023-11-18 10:12 ` Alejandro Colomar
2023-11-18 23:03 ` Jonny Grant
2023-11-10 11:36 ` Jonny Grant
2023-11-10 13:15 ` Alejandro Colomar
2023-11-18 23:40 ` Jonny Grant
2023-11-20 11:56 ` Jonny Grant
2023-11-20 15:12 ` Alejandro Colomar
2023-11-20 23:08 ` Jonny Grant
2023-11-20 23:42 ` Alejandro Colomar
2023-11-10 11:23 ` Jonny Grant
2023-11-09 12:23 ` Alejandro Colomar
2023-11-09 12:35 ` Alejandro Colomar
2023-11-10 7:06 ` Oskari Pirhonen
2023-11-10 11:18 ` Alejandro Colomar
2023-11-11 7:55 ` Oskari Pirhonen
2023-11-10 16:06 ` Matthew House
2023-11-10 17:48 ` Alejandro Colomar
2023-11-13 15:01 ` Matthew House
2023-11-11 20:55 ` Jonny Grant
2023-11-11 21:15 ` Jonny Grant
2023-11-11 22:36 ` Alejandro Colomar
2023-11-11 23:19 ` Alejandro Colomar
2023-11-17 21:46 ` Jonny Grant
2023-11-18 9:37 ` PDF book of unreleased pages (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-19 0:22 ` Deri
2023-11-19 1:19 ` Alejandro Colomar
2023-11-19 9:29 ` Alejandro Colomar
2023-11-19 16:21 ` Deri
2023-11-19 20:58 ` Alejandro Colomar
2023-11-20 0:46 ` G. Branden Robinson
2023-11-20 9:43 ` Alejandro Colomar [this message]
2023-11-18 9:44 ` NULL safety " Alejandro Colomar
2023-11-18 23:21 ` NULL safety Jonny Grant
2023-11-24 22:25 ` Alejandro Colomar
2023-11-25 0:57 ` Jonny Grant
2023-11-10 10:40 ` strncpy clarify result may not be null terminated Stefan Puiu
2023-11-10 11:06 ` Jonny Grant
2023-11-10 11:20 ` Alejandro Colomar
2023-11-12 9:17 ` [PATCH 0/2] Expand BUGS section of string_copying(7) Alejandro Colomar
2023-11-12 9:18 ` [PATCH 1/2] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-12 9:18 ` [PATCH 2/2] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 0/3] Improve string_copying(7) Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 1/3] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-17 21:43 ` Jonny Grant
2023-11-18 0:25 ` Signing all patches and email to this list Matthew House
2023-11-18 23:24 ` Jonny Grant
2023-11-12 11:26 ` [PATCH v2 2/3] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:27 ` [PATCH v2 3/3] strtcpy.3, string_copying.7: Add strtcpy(3) Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZVsqX_suG6VJe0BG@devuan \
--to=alx@kernel.org \
--cc=deri@chuzzlewit.myzen.co.uk \
--cc=g.branden.robinson@gmail.com \
--cc=jg@jguk.org \
--cc=linux-man@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox