From: Alejandro Colomar <alx@kernel.org>
To: Brian.Inglis@Shaw.ca, linux-man@vger.kernel.org
Cc: Deri <deri@chuzzlewit.myzen.co.uk>
Subject: Re: No 6.05/.01 pdf book available
Date: Mon, 28 Aug 2023 23:11:56 +0200
Message-ID: <fab31245-a17a-f472-f570-f5b35d2c79b4@kernel.org>
In-Reply-To: <1435b3f6-b4fb-28b1-3c54-547c9a7e919a@Shaw.ca>
Hi Brian,
On 2023-08-28 20:24, Brian Inglis wrote:
> On 2023-08-28 06:17, Alejandro Colomar wrote:
>> Hi Brian,
>>
>> On 2023-08-22 01:45, Brian Inglis wrote:
>>> I am in favour of all punctuation being treated as word spaces and sorting
>>> "cat ..." before "cat..." but find the real orders more evocative and easier to
>>> decide about than examples.
>>
>> Here's an excerpt of how treating - and _ as spaces looks like. I think
>> it's a reasonable order. Should I apply that diff?
>>
>> Cheers,
>> Alex
>>
>> $ git diff
>> diff --git a/scripts/sortman b/scripts/sortman
>> index a8f70bab5..6d1d92f09 100755
>> --- a/scripts/sortman
>> +++ b/scripts/sortman
>> @@ -9,7 +9,7 @@ sed -E '/\/intro./ s/.*\.([[:digit:]])/\10\t&/' \
>> | sed -E ' s/\t(.*)/&\n\1/' \
>> | sed -E '/\t/ s/\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>.*//' \
>> | sed -E '/\t/ s/\/[_-]*/\//g' \
>> -| sed -E '/\t/ s/[_-]/_/g' \
>> +| sed -E '/\t/ s/[_-]/ /g' \
>> | sed -E '/\t/ {N;s/\n/\t/;}' \
>> | sort -fV -k1,2 \
>> | cut -f3;
>> $ touch man8/ld-z.8
>> $ touch man8/ld.8
>> $ find man8 | ./scripts/sortman
>> man8/intro.8
>> man8/iconvconfig.8
>> man8/ld.8
>> man8/ld-linux.8
>> man8/ld-linux.so.8
>> man8/ld-z.8
>> man8/ld.so.8
>> man8/ldconfig.8
>> man8/nscd.8
>> man8/sln.8
>> man8/tzselect.8
>> man8/zdump.8
>> man8/zic.8
>> man8
>
> Looks better,
Thanks, I've applied and pushed the patch.
> but should your sort *key* field instance also drop the section
> suffix (already in prefix)
It is already dropped. Am I understanding it correctly?
Here's a debug patch to view the sort key field:
diff --git a/scripts/sortman b/scripts/sortman
index 6d1d92f09..e690f23ea 100755
--- a/scripts/sortman
+++ b/scripts/sortman
@@ -12,4 +12,5 @@ sed -E '/\/intro./ s/.*\.([[:digit:]])/\10\t&/' \
| sed -E '/\t/ s/[_-]/ /g' \
| sed -E '/\t/ {N;s/\n/\t/;}' \
| sort -fV -k1,2 \
+| tee /dev/tty \
| cut -f3;
And here's how it looks with man8 (plus the dummy files):
$ find man8 -type f | ./scripts/sortman
80      man8/intro              man8/intro.8
81      man8/iconvconfig        man8/iconvconfig.8
81      man8/ld                 man8/ld.8
81      man8/ld linux           man8/ld-linux.8
81      man8/ld linux.so        man8/ld-linux.so.8
81      man8/ld z               man8/ld-z.8
81      man8/ld.so              man8/ld.so.8
81      man8/ldconfig           man8/ldconfig.8
81      man8/nscd               man8/nscd.8
81      man8/sln                man8/sln.8
81      man8/tzselect           man8/tzselect.8
81      man8/zdump              man8/zdump.8
81      man8/zic                man8/zic.8
man8/intro.8
man8/iconvconfig.8
man8/ld.8
man8/ld-linux.8
man8/ld-linux.so.8
man8/ld-z.8
man8/ld.so.8
man8/ldconfig.8
man8/nscd.8
man8/sln.8
man8/tzselect.8
man8/zdump.8
man8/zic.8
There are no suffixes in the second field.
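As a minimal sketch of why the suffix is gone, the two relevant
substitutions can be run in isolation on a single name.  This
assumes GNU sed (for `\>`), and it leaves out the numeric prefix and
the `/\t/` address that the real script uses; the variable name
`key` is only for illustration.

```shell
# Strip the section suffix (".8", ".3type", ...) from the name,
# then turn '_' and '-' into spaces, as the patched script does.
key=$(printf 'man8/ld-linux.so.8\n' \
      | sed -E 's/\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>.*//' \
      | sed -E 's/[_-]/ /g')
printf '%s\n' "$key"
```

The ".so" part survives because the first expression only matches a
dot that is followed by a digit.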
> and also treat "." as space?
I had been thinking about it, but haven't formed an opinion.
Since they are rare, I think making them stand out a little bit
by having a special order rather than just being mixed with the
underscores would make sense. But I'm open to change that.
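For comparison, here is a sketch of what the keys would look like if
"." were mapped to a space too.  This is a hypothetical variant, not
what the applied patch does; it only shows the key transformation,
not the resulting sort order, and `keys` is an illustrative name.

```shell
# Hypothetical variant: map '.', '_' and '-' to spaces in the key.
keys=$(printf '%s\n' man8/ld-linux.so man8/ld.so man8/ld-z \
       | sed -E 's/[_.-]/ /g')
printf '%s\n' "$keys"
```

With this variant, ld.so would no longer be distinguishable from a
hypothetical ld-so or ld_so in the key field.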
> Where would you expect to see ld.so?
Not sure.
>
> Also, in `sed`, instead of cloning the line, at the start of a series of
> executions, make them all into a single inline command script, start with `h` to
> *hold* the input line, and end with `G` instead of `N` to append '\n' then the
> held line, convert to `\t`, drop the braces, and you can skip the then redundant
> tests, something like the following should get you close (tried it earlier, now
> sadly already gone from history):
>
> | sed -E '
> h
> /\/intro./ s/.*\.([[:digit:]])/\10\t&/
> s/\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>.*//
> s/\/[_-]*/\//g
> s/[_-]/_/g
> s/[_-]/ /g
> G
> s/\n/\t/
> ' \
> | ...
I prefer having many one-liners for a few reasons:
- Not everybody knows what h and G do. I didn't. And I will
soon forget. In contrast, my implementation has nothing
rare in it.
- I can inspect the contents at each of the steps easily by
adding a line with `| tee /dev/tty \`, for debug purposes.
In general, I avoid having large scripts in other languages.
I prefer piping many one-liners, even if it might be less
efficient (but it uses more cores, so it might end up being
faster; I've seen such things happen already many times).
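For reference, a minimal sketch of what `h` and `G` do, reduced to a
single substitution on one name.  It assumes GNU sed (for `\t` in the
replacement and `;` after `h` and `G`); it is an illustration of the
two commands, not a replacement for scripts/sortman.

```shell
# h: copy the pattern space (the input line) into the hold space.
# G: append a newline plus the hold space to the pattern space.
line=$(printf 'man8/ld-linux.8\n' \
       | sed -E 'h; s/[_-]/ /g; G; s/\n/\t/')
printf '%s\n' "$line"
```

The result is the modified key, a tab, and the untouched original
name on one line.  The trade-off is that there's no place between
the commands to insert a `| tee /dev/tty \` line.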
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5