Date: Mon, 20 Nov 2023 10:43:53 +0100
From: Alejandro Colomar
To: "G. Branden Robinson"
Cc: Deri, Jonny Grant, linux-man
Subject: Re: PDF book of unreleased pages (was: strncpy clarify result may not be null terminated)
References: <4567510.7DdL66CAHx@pip> <12344046.3XHVMEB1Be@pip> <20231120004525.acgivh3htslijygr@illithid>
In-Reply-To: <20231120004525.acgivh3htslijygr@illithid>
X-Mailing-List: linux-man@vger.kernel.org

Hi Branden,

On Sun, Nov 19, 2023 at 06:46:29PM -0600, G. Branden Robinson wrote:
> Hi Alex and Deri,
>
> I'm going to address just a few small parts of this message...
>
> At 2023-11-19T21:58:03+0100, Alejandro Colomar wrote:
> > You can always `find ... | xargs cat | troff /dev/stdin`
>
> ...not if you need to preprocess any of the input.  With tbl(1), for
> instance.

What I mean is that I can preprocess individually:

    find ... | while read -r f; do eqn "$f" > "$f.troff"; done

And only process together in a single invocation what _needs_ to be done
in a single invocation:

    find ... | xargs cat | gropdf /dev/stdin

I guess that preprocessors can be run per-file.  I know that gropdf(1)
must be run with the entire book as input.  But I don't know if
`troff -Tpdf` needs to see the entire book at once, or if it can process
each file separately.

On my laptop, the pipeline for building the Linux Man Book takes 23.3 s.
I've split the processing of the book so that I produce every
intermediary file in the pipeline (except pic(1), which I think we don't
need).  From that, I've seen the time each program takes to do its job
(and importantly, the overall time didn't increase; it again took
23.3 s):

    preconv(1) takes 0.04 s;
    tbl(1) takes 0.06 s;
    eqn(1) takes 0.05 s;
    troff(1) takes 2.8 s; and
    gropdf(1) takes 17.6 s.

The time taken by gropdf(1) is unavoidable, since it can't process the
individual files separately.  But if we can reduce the time taken by all
the other programs to close to 0, that would be good.  It depends on
which programs need to see the entire book, and which can process each
file separately.

Nevertheless, I think it's interesting to process the book per-file, as
much as possible, even if the overall time won't change significantly.
It is good documentation of what needs to be processed together and
what doesn't, when building a PDF document with groff.

> > My problem is probably that I don't know what's done by `gropdf`, and
> > what's done by `troff -Tpdf`.  I was hoping that `troff -Tpdf` still
> > didn't need to know about the entire book, and that only gropdf(1)
> > would need that.
>
> This stuff is documented in groff's Texinfo manual, and in the groff(1)
> and roff(7) man pages.
>
> Here's an excerpt of the last.
>
>    Using roff
>        When you read a man page, often a roff is the program rendering
>        it.  Some roff implementations provide wrapper programs that make
>        it easy to use the roff system from the shell’s command line.
>        These can be specific to a macro package, like mmroff(1), or more
>        general.  groff(1) provides command‐line options sparing the user
>        from constructing the long, order‐dependent pipelines familiar to
>        AT&T troff users.  Further, a heuristic program, grog(1), is
>        available to infer from a document’s contents which groff
>        arguments should be used to process it.
>
>    The roff pipeline
>        A typical roff document is prepared by running one or more
>        processors in series, followed by a formatter program and then
>        an output driver (or “device postprocessor”).  Commonly, these
>        programs are structured into a pipeline; that is, each is run in
>        sequence such that the output of one is taken as the input to the
>        next, without passing through secondary storage.  (On non‐Unix
>        systems, pipelines may have to be simulated with temporary
>        files.)
>
>            $ preproc1 < input‐file | preproc2 | ... | troff [option] ... \
>                | output‐driver
>
>        Once all preprocessors have run, they deliver pure roff language
>        input to the formatter, which in turn generates a document in a
>        page description language that is then interpreted by a
>        postprocessor for viewing, printing, or further processing.
>
> gropdf(1) is the output driver for the PDF "device".  So "groff -T pdf
> input.tr" and "troff -T pdf input.tr | gropdf" are equivalent.
>
> (Yes, you still need the `-T pdf` arguments, even to troff proper.

This doesn't answer my question.  For generating a book, does troff(1)
need to see the entire book, or is it enough if gropdf(1) does?  My guess
is that troff(1) also needs to see the entire book, but I don't know for
sure.
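
A minimal sketch of the split discussed above, under the conservative
assumption that troff(1) does need to see the whole book: the
preprocessors run once per page, and the formatter and the output driver
run once over the concatenation.  The function names (preprocess_page,
build_book), the .troff suffix handling, and the overridable tool
variables are hypothetical, chosen for illustration; whether per-file
preprocessing gives output identical to a single-invocation build is not
verified here.

```shell
#!/bin/sh
# Tool commands are overridable (e.g. TBL=cat) so the flow can be tried
# without groff installed; the defaults are the programs named in this
# thread.  Passing -man to troff is an assumption about the macro package.
PRECONV=${PRECONV:-preconv}
TBL=${TBL:-tbl}
EQN=${EQN:-"eqn -Tpdf"}
TROFF=${TROFF:-"troff -Tpdf -man"}
GROPDF=${GROPDF:-gropdf}

# preprocess_page FILE
# Run the per-file preprocessors on one page; write FILE.troff beside it.
# The tool variables are deliberately unquoted so that they may carry
# options.
preprocess_page() {
    $PRECONV "$1" | $TBL | $EQN > "$1.troff"
}

# build_book OUT FILE...
# Preprocess each page separately, then run the formatter and the output
# driver a single time over all preprocessed pages, in argument order.
build_book() {
    out=$1
    shift
    for f in "$@"; do
        preprocess_page "$f"
    done
    for f in "$@"; do
        cat "$f.troff"
    done | $TROFF | $GROPDF > "$out"
}
```

It could be called as `build_book book.pdf page1.3 page2.3`, with the
pages listed in the order they should appear in the book.  Moving $TROFF
inside preprocess_page would be the experiment that answers the question
above.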

Cheers,
Alex