linux-man.vger.kernel.org archive mirror
From: Alejandro Colomar <alx@kernel.org>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: linux-man@vger.kernel.org, Ingo Schwarze <schwarze@usta.de>,
	Jakub Wilk <jwilk@jwilk.net>, groff <groff@gnu.org>
Subject: Re: [PATCH v2] man*/: ffix (migrate to `MR`)
Date: Tue, 1 Aug 2023 01:15:02 +0200
Message-ID: <c2a8024a-0d56-4fb2-ee12-7dcbca0e75e7@kernel.org>
In-Reply-To: <20230731225016.4fxao4bn4ntmnx35@illithid>



Hi Branden,

On 2023-08-01 00:50, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2023-07-31T23:47:50+0200, Alejandro Colomar wrote:
>>> When the text of all Linux man-pages documents (excluding those
>>> containing only `so` requests) is dumped, with adjustment mode 'l'
>>> ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and
>>> after this change, there is no change to rendered output.
>>
>> It would be interesting to see a script that corroborates the above
>> paragraph.  It might help other projects that may want to migrate to
>> MR.
> 
> Sure.  I used a couple of scripts.
> 
>   $ cat ATTIC/dump-pages.sh
>   #!/bin/sh
> 
>   pages=$(grep -L '^\.so ' man*/* | sort)
>   groff -t "$@" -m andoc -T utf8 -P -cbou $pages
> 
>   $ cat ATTIC/dump-pages-left-adjustment-no-hyphenation.sh
>   #!/bin/sh
> 
>   pages=$(grep -L '^\.so ' man*/* | sort)
>   groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou $pages
> 
> And here's how I ran them.
> 
>   sh ATTIC/dump-pages.sh >| DUMP1
>   sed -i -f ./ATTIC/MR.sed $(grep -L '^\.so ' man*/*)
>   sh ATTIC/dump-pages-left-adjustment-no-hyphenation.sh >| DUMP2
>   diff -U0 -b DUMP1 DUMP2 | less -R
> 
> That confirmed that there were "no changes" (with the caveat noted
> above).
> 
>   sh ATTIC/dump-pages.sh >| DUMP2
>   diff -U0 -b DUMP1 DUMP2 | less -R
>   diff -U0 -b DUMP1 DUMP2 | wc -l
> 
> I used these to eyeball and measure whether there were any formatting
> changes even with default adjustment and hyphenation enabled.  It showed
> me _tons_ of man page names no longer getting broken (and hyphenated)
> across lines, and nothing else that I noticed.
> 
> With the previous empty diff in hand, I decided that I hadn't regressed
> the text of the pages.
> 
> If there are further sanity checks we can apply, I'm open to
> suggestions.

Nah, I eyeballed random samples of the diff and it looked good.  That,
and your extensive tests, make me confident enough.  If we screwed
anything up, we can fix it.
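
For other projects that want to repeat this kind of check, the quoted
procedure could be wrapped up roughly as below.  This is only a sketch,
not the commands actually run in this thread: the function name
same_render and the FORMAT variable are made up for illustration, it
assumes GNU sed (for -i), and it renders both dumps with the same
options so that the diff is meaningful.

```shell
#!/bin/sh
# Rendering command.  The default mirrors the groff invocation quoted above
# (left adjustment, no hyphenation); override FORMAT to use something else.
: "${FORMAT:=groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou}"

# same_render SCRIPT PAGE...
# Render the pages, apply the sed script in place, render again, and diff.
# Returns 0 when the rendered text is unchanged, as diff(1) does.
same_render() {
    script=$1
    shift
    before=$(mktemp) || return 2
    after=$(mktemp) || { rm -f "$before"; return 2; }
    $FORMAT "$@" >"$before"        # unquoted on purpose: split into words
    sed -i -f "$script" "$@"       # GNU sed in-place edit
    $FORMAT "$@" >"$after"
    diff -U0 -b "$before" "$after"
    status=$?
    rm -f "$before" "$after"
    return "$status"
}
```

Usage might look like
"same_render ATTIC/MR.sed $(grep -L '^\.so ' man*/*)", followed by
"git checkout ." to start over if the diff is not empty.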

The only concern I had some time ago was with code like exit(1), but
that should be using italics today, so it shouldn't be a problem.  I
can't imagine big issues.

> 
> Since you had me looking at my shell history, I'll share that I did a
> "git co ." (co = alias for "checkout") 18 times in the course of
> developing MR.sed.  Those drove most of my recent patch submissions
> immediately prior to this one.  I could have done 18 more without
> fatiguing (albeit not necessarily without frustration with myself for not
> getting my sed right).  But that's the beauty of sed, and
> Bash/readline's "reverse-search-history" and "operate-and-get-next"
> features.
> 
> As it turned out, my sed was pretty good, except for the missing use
> case you identified, and my fix for which worked on the first try.  The
> irregularity of the page inputs was the tricky bit.
> 
> At one point I had a fearful episode that I'd misdesigned `MR` for one
> scenario, and much like the Master being terrorized by the Keller
> Machine, I had visions of the Doctor (Ingo Schwarze) laughing at me and
> telling me he told me so and winning the whole world over to mdoc(7) in
> one stroke.  But it was fine (attached).
> 
> There are _still_ some `ad` requests scattered around (outside of tbl(1)
> text blocks), but I didn't go after those because they weren't in the
> way of my objective.  Eventually it'd be good to scrub those too.
> 
>>> I prepared this change with the following GNU sed script.
>>>
>>> \# Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
>>
>> What I do to avoid git messing with these comments is to write a
>> leading space.  For git, only a '#' in column 1 is special.  Since most
>> compilers and interpreters allow a space before a commented line, a
>> leading space is fine.
> 
> Ahh.  A leading backslash is the only workaround I've ever noticed.
> 
>> I've edited the commit message to have spaces, so that it's directly
>> pastable into a MR.sed script.  Oh, and I included "$ cat MR.sed;" in
>> the commit message; I couldn't not do it.  :)
> 
> No worries. :)
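
A small illustration of the leading-space point, as a sketch: GNU sed
accepts blanks before a '#' comment, so a script line written that way
for the commit message still runs unchanged.  The substitution rule
below is hypothetical and is NOT taken from MR.sed; it only assumes
that `MR` takes the page name and the section as separate arguments.

```shell
# Build a two-line sed script whose comment starts with a space.
script=$(mktemp)
cat >"$script" <<'EOF'
 # Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
s/^\.[BI]R \([^ ]*\) (\([0-9][a-z]*\))$/.MR \1 \2/
EOF

# Feed it three sample lines; only the two cross references should change.
out=$(printf '.BR foo (1)\n.IR bar (3)\n.B unrelated\n' | sed -f "$script")
rm -f "$script"
printf '%s\n' "$out"
```

The indented comment is skipped by sed, and the two cross references
come out as ".MR foo 1" and ".MR bar 3" while the ".B" line is left
alone.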
> 
>> I've applied the patch (or rather, the script), but won't push it yet.
>> If you send a run of commands that prove no differences before and
>> after, I'll amend the commit message with it.
> 
> Please do verify it yourself with the tools above (or better ones).  I'm
> well aware that this is a huge change that can make people nervous.

I applied the patch, amended the message with a quote from this email,
and pushed to the MR branch in my private git repo at
<http://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/log/?h=MR>.

Oh, and I also removed a few pages from your patch, per CONTRIBUTING
guidelines:

Notes
   External and autogenerated pages
       A few pages come from external sources.  Fixes to the pages
       should really go to the upstream source.

       tzfile(5), tzselect(8), zdump(8), and zic(8) come from the tz
       project <https://www.iana.org/time-zones>.

       bpf-helpers(7) is autogenerated from the Linux kernel sources
       using scripts.  See man-pages commits 53666f6c3 and 19c7f7839 for
       details.
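
One way to keep those pages out of a scripted change, as a sketch: the
filter below drops them from a list of file names before the list is
handed to sed.  The function name exclude_external is made up, the
file names are assumed from the page names and sections listed above,
and this is not necessarily how the pages were dropped from the patch.

```shell
# Read file names on stdin, one per line, and drop the external and
# autogenerated pages named in CONTRIBUTING.
exclude_external() {
    grep -v -E '(^|/)(tzfile\.5|tzselect\.8|zdump\.8|zic\.8|bpf-helpers\.7)$'
}
```

It could be combined with the page selection quoted earlier, for
example "grep -L '^\.so ' man*/* | exclude_external".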

Anyone who wants to check it, feel free to have a look.


Cheers,
Alex

> 
> Regards,
> Branden

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


