From: Alejandro Colomar <alx.manpages@gmail.com>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: linux-man@vger.kernel.org
Subject: Re: [PATCH 1/9] ldconfig.8: Fix markup nits
Date: Wed, 4 Jan 2023 21:15:00 +0100
Message-ID: <afd3a0d3-9bf4-2687-4f62-2ebd62398447@gmail.com>
In-Reply-To: <20230104191118.xs7jwtjcqz6fhbbx@illithid>


Hi Branden,

On 1/4/23 20:11, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2023-01-04T19:41:51+0100, Alejandro Colomar wrote:
>> On 1/4/23 16:55, G. Branden Robinson wrote:
>>> At 2023-01-04T13:26:33+0100, Alejandro Colomar wrote:
>>>>>     .SH NAME
>>>>>     ldconfig \- configure dynamic linker run-time bindings
>>>>>     .SH SYNOPSIS
>>>>
>>>> We should wrap this in .nf/.fi
>>>
>>> That will have a cost.  It will mean using a lot of \c escape
>>> sequences to connect the output lines.
>>>
>>> The existing synopsis fits within 74 output columns on a terminal.
>>>
>>> Do you think it's worth it?
>>
>> But, if we don't use it, if someone uses a smaller terminal, there
>> might appear our beloved hyphens breaking a word...
> 
> That's true.  But what is _not_ true is that you don't have a minimum
> expected terminal width.  You do, you just might not know what it is and
> it may not even have been consciously chosen.

I haven't consciously chosen it, but I often use 66-col terminals, especially 
when I plan to paste text into an email.

> 
> The minimum expected terminal width for the Linux man-pages corpus is
> the longest output line produced by an unfilled (.nf/.fi) region or a
> tbl(1) row that doesn't use a text block.  Somewhere in the ~2,539 man
> pages, a longest unfilled line lurks...and its identity may change
> depending on the output device used to render it (terminal vs.
> typesetter).

Even if unfilled blocks exceed the terminal width, and therefore might be 
unpleasant to the eyes, they have a nice feature: the line never breaks, and you 
can still pipe it to other programs, or paste it, and it will be a single line.
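
To put a number on the "longest unfilled line" idea quoted above, here
is a minimal sketch.  It assumes the page has already been rendered to
plain text (no overstrike or escape sequences), and the function name
min_terminal_width is made up for illustration; it is not part of
groff or man-db.

```python
def min_terminal_width(rendered: str) -> int:
    """Return the length of the longest line in already-rendered page text.

    Trailing whitespace is ignored, since it does not force a wider terminal.
    """
    return max((len(line.rstrip()) for line in rendered.splitlines()), default=0)


if __name__ == "__main__":
    # Hypothetical rendered output, not taken from any real page.
    sample = (
        "SYNOPSIS\n"
        "       frob [options] file\n"
        "\n"
        "       an unfilled line that the formatter is not allowed to break\n"
    )
    print(min_terminal_width(sample))
```

Running something like this over every rendered page would show which
line sets the de facto minimum width, and for which output device.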

> 
> If you _do_ know what that expected minimum is, please document it.
> 
> The nearest thing I see is:
> 
> "Please limit source code line length to no more than about 75
> characters wherever possible." -- man-pages(7)
> 
> But the relationship between input document line length and
> formatted output line length is a loose one.
> 
> In any event, groff man(7)'s SY/YS extension macros are _built_ for this
> application.  I'm happy to "port" this page to use them; doing so will
> permit removal of the PD macro calls, among other benefits.

Yup, that would be nice!

> 
>> Is there anything that "reverts" \%?  So that if it were the default,
>> we could use \anti-% to say "groff, you might break this word"?
> 
> Yes.  \% itself does that.

I mean something like \X\%foobar, so that the \X "cancels" the \%.  Not manually 
inserting break points.  So, imagining a world in which hyphenation were disabled 
_only_ within font-selection macros, I could specify that a word is fine to be 
hyphenated like this:

.B \Xidontcareifthisishyphenated

> 
> From the groff 1.23 Texinfo manual (with stuff irrelevant to man(7)
> usage stripped out):
> 
>   -- Escape sequence: \%
>   -- Escape sequence: \:
>       To tell GNU 'troff' how to hyphenate words as they occur in input,
>       use the '\%' escape sequence; it is the default "hyphenation
>       character".  Each instance within a word indicates to GNU 'troff'
>       that the word may be hyphenated at that point, while prefixing a
>       word with this escape sequence prevents it from being otherwise
>       hyphenated.  This mechanism affects only that occurrence of the
>       word; [...]
> 
> [...]
> 
>       '\:' inserts a non-printing break point; that is, a word can break
>       there, but the soft hyphen glyph (see below) is not written to the
>       output if it does.  This escape sequence is an input word boundary,
>       so the remainder of the word is subject to hyphenation as normal.
> 
>       You can use '\:' and '\%' in combination to control breaking of a
>       file name or URL or to permit hyphenation only after certain
>       explicit hyphens within a word.
> 
>            The \%Lethbridge-Stewart-\:\%Sackville-Baggins divorce
>            was, in retrospect, inevitable once the contents of
>            \%/var/log/\:\%httpd/\:\%access_log on the family web
>            server came to light, revealing visitors from Hogwarts.
> 
>>> groff man(7) _has_ a mechanism for this, and has since groff 1.19
>>> (2003).  It's the `HY` register.  People can put this in their
>>> man.local files (on Debian-based systems, that's in /etc/groff).
>>>
>>> .nr HY 0
>>
>> I know, but I don't think we should write manual pages in a way that
>> forces distributors to use such a thing.
> 
> As far as I know, most distributors aren't configuring man.local this
> way today, despite it having been possible for almost 20 years.

Touché.

>  Adding
> explicit hyphenation breakpoints (or their suppression) isn't going to
> force them any harder than they have been for 2 decades; it will in fact
> reduce any such pressure by reducing the number of bogus hyphenation
> breaks when hyphenation is enabled.
> 
>> Either the pages are written plagued with \%,
> 
> Like changes in lettercase, this is _information_.

I don't argue against that, but if there were a way to return that information 
explicitly, we wouldn't be losing it.

> 
>> and the distros don't need to use .HY, or we write pages lazily so
>> that distros need to fix the hyphenation.
> 
> Distributors' laziness seems to be a match for Linux man-pages's own;
> users seem to muddle through without much evident complaint.

Oh, I do complain a lot; however, I don't express it too much in the form of bug 
reports, since I don't believe it's the fault of the writer, but rather a lack of 
support from groff(1).

But I do find it very uncomfortable, similar to when manual page authors don't 
use the proper \-.  However, I do think that one is the fault of the author, and 
you can already find many such reports signed by me :)

>  I suppose
> people who copy-and-paste multiple lines from a man page realize they
> need to remove the hyphens along with newlines.  Fortunately, on UTF-8
> terminals, they have hope of seeing the difference between hyphens and
> the ASCII hyphen-minus that is always(?) meant as a literal.

My problem is not about pasting text.  That's very minor.  My problem is finding 
text.

For finding command options, I usually type:

/   --foo

If \- hasn't been used, I need to use:

/   ..foo

and skip all the noise.  When there's too much noise, sometimes using an anchor 
(^) helps.  But it's way nicer when writers use \-.  I keep finding such bugs, 
and reporting them as much as I can.
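
The same workaround can be written as an explicit pattern instead of
the ".." wildcards.  This is only a sketch, assuming that a bare "-" in
the source may be rendered as U+2010 HYPHEN on a UTF-8 terminal while
\- gives the ASCII hyphen-minus; option_pattern is a made-up helper
name.

```python
import re

# ASCII hyphen-minus (what \- produces) or U+2010 HYPHEN (what a bare "-"
# in the source may be rendered as on a UTF-8 terminal).
DASH = "[-\u2010]"


def option_pattern(name: str) -> re.Pattern:
    """Match an indented long option at the start of a line,
    whichever of the two dash glyphs was rendered."""
    return re.compile(rf"^[ \t]+{DASH}{{2}}{re.escape(name)}\b", re.MULTILINE)


if __name__ == "__main__":
    # Hypothetical rendered lines: one written with \-, one without.
    with_minus = "       --foo   do the thing\n"
    with_hyphen = "       \u2010\u2010foo   do the thing\n"
    for text in (with_minus, with_hyphen):
        print(bool(option_pattern("foo").search(text)))
```

Unlike "..foo", this does not match arbitrary two-character prefixes,
so there is less noise to skip.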

When searching for keywords, the problem is the following:  I do `/keyword`, but 
then if the keyword is hyphenated... well, good luck.
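
For completeness, a search that survives hyphenation is possible, just
not something one types at a pager prompt.  A minimal sketch, assuming
the break shows up as a hyphen glyph (ASCII or U+2010) at the end of a
line followed by the next line's indentation; hyphenation_tolerant is
a hypothetical helper, not an existing tool.

```python
import re

# An optional hyphenation break between two letters: a hyphen glyph
# (ASCII hyphen-minus or U+2010), a newline, then the next line's indentation.
BREAK = r"(?:[-\u2010]\n[ \t]*)?"


def hyphenation_tolerant(keyword: str) -> re.Pattern:
    """Match keyword even when it is split by a hyphen at the end of a line."""
    return re.compile(BREAK.join(re.escape(ch) for ch in keyword))


if __name__ == "__main__":
    # Hypothetical rendered text with "directory" broken across two lines.
    text = "Set the direc\u2010\n       tory to scan."
    match = hyphenation_tolerant("directory").search(text)
    print(match is not None)
```

The pattern grows with every letter of the keyword, which is exactly
why I'd rather the word not be hyphenated in the first place.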

> 
>> But writing the pages lazily and having distributors ignore it would
>> result in suboptimal pages for our readers.
> 
> I think marking break points, hyphenated and otherwise (as with URLs),
> is the opposite of laziness.  It is a level of fastidiousness I don't
> actually expect of many man(7) writers apart from myself.

I would want to use \:.  What I want is a tool which re-enables the default 
hyphenation points after they have been cancelled.

> 
>>> Colin Watson's man-db man(1) also has a feature to suppress
>>> hyphenation, using a hack; it's not pretty but it works even on
>>> other *roff formatters.
>>
>> Does that disable hyphenation for macros, or for the entire document?
>> I only want to disable it in highlighting macros.
> 
> I don't quite understand what you mean by "macros" here.  Macro
> interpolation is textual replacement, there isn't really a macro "mode"
> that is visible to the formatter when hyphenation decisions are made.
> 
> But if by "highlighting macros" you mean font selection and
> alternation macros in man(7) (.B, .I, .BR, etc.), then the answer is
> "for the entire document".  You don't always want to disable hyphenation
> when using these macros anyway.  Not everything is a literal.  The font
> macros are presentational, not semantic.
> 
> Even then, I would not suppress hyphenation of a metasyntactic variable,
> like "directory".  The whole point of these is that they are textually
> replaced _by the reader_.

I would.  I wouldn't be able to count how many times I've tried to search for 
such a keyword, and it was hyphenated.

> 
>>> I don't insist that people keep hyphenation enabled, but assuming
>>> that no one will do so will keep us from putting worthwhile
>>> information in our man pages.
>>>
>>> If you dread the tedium of adding \% escape sequences to "keywords"
>>> all over the place, I don't blame you.  This is one reason I
>>> proposed my most ambitious man(7) extension yet, a two-macro
>>> semantic tag mechanism.
>>>
>>> https://marc.info/?l=linux-man&m=165868366126909&w=2
>>
>> I still don't know what to think about that.
> 
> That's okay.  Its realization is some ways off, if ever.  First I need
> Bertrand to recover from holidays.  :-O
> 
>> "XXX - quick hack, should disappear before anyone notices :)."
>>
>> Of course, the quick hack never disappeared after Oct 7, 2007, when it
>> was written in stone.
> 
> Of course!
> 
>> <https://github.com/shadow-maint/shadow/commit/6b6e005ce1cc4a5e4fc7fc40a52f2ed229f54b5b>
>>
>> "XXX - is the above ok or should it be <time.h> on ultrix?"
> 
> If you pine for a stagnant commercial Unix to kick around, Solaris 10
> will be around for another year or so...

As for Solaris 10, I already remove code that supports it every chance I get :)

I wonder when the day will come that things like C89 will officially be declared 
dead by consensus.  I wish GCC would drop -std=gnu89 some day.

Seeing how some people strongly defend portability to dinosaur shells, because 
POSIX is not portable enough...  well, I don't think we'll get rid of C89 in 
another 50 years...

But hey, when you don't care about big piles o'money, you can write 
"non-portable" POSIX-only code.  I prefer writing non-portable code to writing 
Solaris-3-portable code for a big pile o'money. :)

> 
> Regards,
> Branden

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


Thread overview: 10+ messages
2023-01-04  7:38 [PATCH 1/9] ldconfig.8: Fix markup nits G. Branden Robinson
2023-01-04 12:26 ` Alejandro Colomar
2023-01-04 12:36   ` Alejandro Colomar
2023-01-04 16:06     ` G. Branden Robinson
2023-01-04 15:55   ` G. Branden Robinson
2023-01-04 18:41     ` Alejandro Colomar
2023-01-04 19:11       ` G. Branden Robinson
2023-01-04 20:15         ` Alejandro Colomar [this message]
2023-01-04 20:59           ` G. Branden Robinson
2023-01-05  1:21             ` Alejandro Colomar
