From: Alejandro Colomar <alx@kernel.org>
To: Deri <deri@chuzzlewit.myzen.co.uk>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>,
linux-man@vger.kernel.org
Subject: Re: Linux man-pages PDF book
Date: Thu, 18 Apr 2024 02:44:59 +0200
Message-ID: <ZiBtHeVlZmlGe0kP@debian>
In-Reply-To: <3935722.768hzMJKAL@pip>

Hi Deri,

On Tue, Apr 16, 2024 at 01:14:12AM +0100, Deri wrote:
> diff --git a/share/mk/build/pdf/book/prepare.pl b/share/mk/build/pdf/book/prepare.pl
> index e23f149c6..bc922bd88 100755
> --- a/share/mk/build/pdf/book/prepare.pl
> +++ b/share/mk/build/pdf/book/prepare.pl
> @@ -1,4 +1,4 @@
> -#!/usr/bin/perl -w
> +#!/usr/bin/perl -wd
> #
> # BuildLinuxMan.pl : Build Linux manpages book
> # Deri James (& Brian Inglis) : 15 Dec 2022
> @@ -49,16 +49,16 @@ my $dir2=$dir;
> $dir2=~tr[.][_];
> my %files;
> my %aliases;
> -my %target;
> +my %revalias;
>
> foreach my $al (`find "$dir"/man*/ -type f \\
> | grep "\\.[[:digit:]]\\([[:alpha:]][[:alnum:]]*\\)\\?\\>\$" \\
> | xargs grep '^\\.so' /dev/null;`)
> {
> #$al=~tr[.][_];
> - $al=~m/^$dir\/man\d[a-z]*\/(.*):\.\s*so\s*man\d[a-z]*\/(.*)/o;
> + $al=~m/^$dir\/man\d[a-z]*\/(.*):\.\s*so\s*man\d[a-z]*\/(.*?)\.(.*)/o;

Your annotation said:

    - $al=~m/^$dir\/man\d[a-z]*\/(.*):\.\s*so\s*man\d[a-z]*\/(.*)/o;
    + $al=~m/^$dir\/man\d[a-z]*\/(.*):\.\s*so\s*man\d[a-z]*\/(.*?)\.(.*)/o;

    Example:-

    ./man2/rt_sigaction.2:.so man2/sigaction.2
           ==============          ========= =
                 $1                   $2     $3

    Capture the file name extension in $3.

But the regex is wrong, I think.  Consider this part of the regex:

    (.*?)\.(.*)

The non-greedy (.*?) stops at the first dot, so for a page like
gai.conf.5, the section would be 'conf.5'.  The '?' is spurious, I
think.
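
To illustrate the point about the '?', here is a minimal sketch that is
not part of the patch.  It applies only the tail of the regex to the
target of a .so line, once with the lazy (.*?) and once with a greedy
(.*).  The helper split_target() and the 'man5/gai.conf.5' input are
assumptions made up for the demonstration; only the two capture groups
come from the diff.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split the target of a ".so" line into
# (name, section).  $greedy selects between the patch's lazy (.*?)
# and a greedy (.*) for the name part.
sub split_target {
    my ($line, $greedy) = @_;
    my $re = $greedy
        ? qr{^\.so\s+man\d[a-z]*/(.*)\.(.*)}
        : qr{^\.so\s+man\d[a-z]*/(.*?)\.(.*)};
    return $line =~ $re ? ($1, $2) : ();
}

# The second input is a made-up target with a dot in the page name.
for my $line ('.so man2/sigaction.2', '.so man5/gai.conf.5') {
    my ($ln, $ls) = split_target($line, 0);
    my ($gn, $gs) = split_target($line, 1);
    print "$line\n";
    print "  lazy:   name='$ln' section='$ls'\n";
    print "  greedy: name='$gn' section='$gs'\n";
}
```

For names without a dot both variants agree; they only differ once the
page name itself contains a dot, because the greedy form splits at the
last dot rather than the first.
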
>
> - $aliases{$1}=$2;
> + $aliases{$1}="$2($3)";
> }
>
> while (my ($k,$v)=each %aliases)
> @@ -68,13 +68,18 @@ while (my ($k,$v)=each %aliases)
> }
> }
>
> +while (my ($k,$v)=each %aliases)
> +{
> + push(@{$revalias{$v}},$k);
> +}
> +
> foreach my $fn (`find "$dir"/man*/ -type f \\
> | grep "\\.[[:digit:]]\\([[:alpha:]][[:alnum:]]*\\)\\?\\>\$";`)
> {
> $fn=~s/\n//;
>
> my ($nm,$sec)=GetNmSec($fn,qr/\.\d[a-z]*/);
> - $files{"${nm}.$sec"}=[$fn,(exists($aliases{"${nm}.$sec"}))?$aliases{"${nm}.$sec"}:"${nm}.$sec"];
> + $files{"${nm}.$sec"}=[$fn,(exists($aliases{"${nm}.$sec"}))?$aliases{"${nm}.$sec"}:"${nm}($sec)"];
> }
>
> my $Section='';
> @@ -97,7 +102,7 @@ sub BuildPage
> my $fn=$files{$bkmark}->[0];
> my ($nm,$sec,$srt)=GetNmSec($bkmark,qr/\.[\da-z]+/);
>
> - my $title= "$nm\\($sec\\)";
> + my $title= "$nm($sec)";
>
> print ".\\\" >>>>>> $nm($sec) <<<<<<\n.lf 0 $bkmark\n";
>
> @@ -112,8 +117,10 @@ sub BuildPage
> $Section=$sec;
> }
>
> - if (exists($aliases{$bkmark})) {
> + if (exists($aliases{$bkmark}))
> + {
> print ".eo\n.device ps:exec [/Dest /$aliases{$bkmark} /Title ($title) /Level 2 /OUT pdfmark\n.ec\n.fl\n";
> +# print ".pdfbookmark 2 $nm($sec)";
> return;
> }
>
> @@ -137,7 +144,7 @@ sub BuildPage
>
> s/\\-/-/g if /^\.[BM]R\s+/;
>
> - if (m/^\.BR\s+([-\w\\.]+)\s+\((.+?)\)(.*)/ or m/^\.MR\s+([-\w\\.]+)\s+(\w+)\s+(.*)/ or m/^\\fB([-\w\\.]+)\\fR\((.+?)\)(.*)$/) {
> + if (m/^\.BR\s+([-\w\\.]+)\s+\(([\d\w]+?)\)(.*)/ or m/^\.MR\s+([-\w\\.]+)\s+(\w+)\s+(.*)/ or m/^\\fB([-\w\\.]+)\\fR\((.+?)\)(.*)$/) {

This regex might have similar issues (although they aren't being
introduced now).  And there might be others too.

BTW, your annotation was:

    Not completely sure if this change is necessary, just nervous
    about (.+?) as a pattern.

Agreed; but there are more (.+?) in the same regex.
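
As a rough illustration of why (.+?) is worth being nervous about, the
sketch below compares the old and new forms of the first (.BR)
alternative.  The two patterns are copied from the diff; parse_br() and
both sample lines are assumptions invented for the demonstration, and
the second line in particular is a made-up input, not taken from any
real page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Old and new forms of the .BR alternative, as they appear in the diff.
my $old = qr/^\.BR\s+([-\w\\.]+)\s+\((.+?)\)(.*)/;
my $new = qr/^\.BR\s+([-\w\\.]+)\s+\(([\d\w]+?)\)(.*)/;

# Hypothetical helper: return (page, section, rest), or an empty list
# when the line does not look like a page reference.
sub parse_br {
    my ($re, $line) = @_;
    return $line =~ $re ? ($1, $2, $3) : ();
}

# The second line is a made-up parenthetical that is not a section.
for my $line ('.BR sigaction (2),', '.BR errno (see below)') {
    for my $pair (['old', $old], ['new', $new]) {
        my ($label, $re) = @$pair;
        my @m = parse_br($re, $line);
        printf "%-24s %s: %s\n", $line, $label,
            @m ? "page='$m[0]' section='$m[1]'" : 'no match';
    }
}
```

With (.+?) any parenthesised text is accepted as a section, whereas the
restricted class only accepts word characters.  Since \d is a subset
of \w and the class cannot match ')', ([\d\w]+?) should behave the same
as (\w+) in this position.
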
Have a lovely night!
Alex
> my $bkmark="$1";
> my $sec=$2;
> my $after=$3;
> @@ -145,12 +152,7 @@ sub BuildPage
> my $dest=$bkmark;
> $dest=~s/\\-/-/g;
>
> - if (exists($files{"${bkmark}.$sec"})) {
> - my $dest=$files{"${bkmark}.$sec"}->[1];
> - $_=".pdfhref L -D \"$dest\" -A \"$after\" -- \\fI$bkmark\\fP($sec)";
> - } else {
> - $_=".IR $bkmark ($sec)\\c\n$after";
> - }
> + $_=".MR \"$bkmark\" $sec $after";
> }
>
> s/^\.BI \\fB/.BI /;
> @@ -175,16 +177,20 @@ sub BuildPage
> s/\n\n/\n/g;
> }
>
> - s/\\&\././ if m/^.TH /;
> -
> - if (m/^\.TH\s+"?([-\w\\.]+)"?\s+"?(\w+)"?/) {
> -
> - print "$_\n";
> -
> - # Add a level two bookmark. We don't set it in the TH macro since the name passed
> - # may be different from the filename, i.e. file = unimplemented.2, TH = UNIMPLEMENTED 2
> -
> - print ".pdfbookmark -T $bkmark 2 $nm($sec)\n";
> +# s/\\&\././ if m/^.TH /;
> +#
> + if (m/^\.TH\s+"?([-\w\\.]+)"?\s+"?(\w+)"?(.*)/)
> + {
> + print ".TH \"$nm\" \"$2\" $3\n";
> +
> + if (exists($revalias{"$nm($sec)"}))
> + {
> + foreach my $dest (@{$revalias{"$nm($sec)"}})
> + {
> + my ($nm,$sec,$srt)=GetNmSec($dest,qr/\.[\da-z]+/);
> + print ".pdfhref M -D $nm($sec)\n";
> + }
> + }
>
> next;
> }
> @@ -199,11 +205,8 @@ sub doMR
> my $nm=shift;
> my $sec=shift;
>
> - if (exists($files{"${nm}.$sec"})) {
> - return("\n.pdfhref L -D \"$files{\"${nm}.$sec\"}->[1]\" -A \"\\c\" -- \\fI$nm\\fP($sec)\n");
> - } else {
> - return("\\fI$nm\\fP($sec)");
> - }
> + return "\n.MR $nm $sec";
> +
> }
>
> sub GetNmSec
--
<https://www.alejandro-colomar.es/>