From: Alejandro Colomar <alx@kernel.org>
To: наб <nabijaczleweli@nabijaczleweli.xyz>
Cc: linux-man@vger.kernel.org
Subject: Re: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v
Date: Mon, 16 Dec 2024 10:57:36 +0100
Message-ID: <20241216095736.yzkofmgtgzidlp2j@devuan>
In-Reply-To: <myuppkwnltqtxduoop7g7wfuyou5cdo6sotocrvyztmqnazvph@tarta.nabijaczleweli.xyz>
Hi nab,
On Mon, Dec 16, 2024 at 02:00:45AM +0100, наб wrote:
> On Sun, Dec 15, 2024 at 10:44:26PM +0100, Alejandro Colomar wrote:
> > On Sun, Dec 15, 2024 at 10:02:42PM +0100, наб wrote:
> > > > Should we file a bug against glibc strverscmp(3)? We probably should.
> > > >
> > > > And the reference to sort(1), I'd put it in BUGS, saying that this API
> > > > is broken, and does not sort properly. Sounds good?
> > > No, this API works as-documented, and the implementation is useful.
> > What does useful mean?
> There are applications where a lexicographical-except-numeric comparison
> like this is what you want (it's most of them). Calling it a "version
> sort" is silly + goofy but, whatever.
Hmmm, yeah, we can live with that for historical raisins.
> > > It's just not what ls -v does.
> > While version sort isn't something standard, I think GNU should be
> > self-consistent.
> It is, ls -v and sort -V are consistent.
> Having just implemented the /actual/ algorithm they use for voreutils,
> that is by far /not/ universally applicable, much hairier, and hard-tuned for
> "versions that are kinda like debian describes and sorts them (but not actually)
> AND ALSO we put them in filenames where we can assume the format a little bit
> AND ALSO {4 special cases to make ls -v work}".
> Replacing this well-defined lexicographical-except-numeric sorter with... that,
> isn't really applicable.
Sounds reasonable.
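As a minimal sketch of the lexicographical-except-numeric behaviour
discussed above: the helper names cmp_plain(), cmp_vers(), and
sort_names() are made up for illustration; only strcmp(3),
strverscmp(3), and qsort(3) are real APIs, and strverscmp(3) is assumed
to come from glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* strverscmp() is a GNU extension */
#endif
#include <stdlib.h>
#include <string.h>

/* Harmless redeclaration, in case <string.h> was already included
 * without _GNU_SOURCE earlier in the translation unit. */
int strverscmp(const char *s1, const char *s2);

/* qsort() comparator: plain byte-wise lexicographic order. */
static int cmp_plain(const void *l, const void *r)
{
	return strcmp(*(char *const *)l, *(char *const *)r);
}

/* qsort() comparator: like cmp_plain(), but digit runs compare numerically. */
static int cmp_vers(const void *l, const void *r)
{
	return strverscmp(*(char *const *)l, *(char *const *)r);
}

/* Hypothetical helper: sort n strings in place with either comparator. */
void sort_names(char **v, size_t n, int version_aware)
{
	qsort(v, n, sizeof(*v), version_aware ? cmp_vers : cmp_plain);
}
```

Given jan10, jan2, jan9, jan1, sort_names(v, 4, 0) should yield
jan1 jan10 jan2 jan9, while sort_names(v, 4, 1) should yield
jan1 jan2 jan9 jan10. Both comparators put a-1.0.1a before a-1.0a,
which is where they part ways with sort -V.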
>
> Best,
> -- >8 --
> From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?=
> <nabijaczleweli@nabijaczleweli.xyz>
> Subject: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v
>
> Compare, given:
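> #define _GNU_SOURCE  /* for strverscmp() */
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> /* qsort() comparator: each element is a pointer to an argv string */
> int compar(const void *l, const void *r) {
> 	return strverscmp(*(char *const *)l, *(char *const *)r);
> }
> int main(int argc, char **argv) {
> 	qsort(argv + 1, argc - 1, sizeof(*argv), compar);
> 	for (int i = 1; i < argc; ++i)
> 		puts(argv[i]);
> }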
> yields:
> $ /bin/ls -v1 a* # coreutils ls
> a-1.0a
> a-1.0.1a
> $ ../vers a* # as above
> a-1.0.1a
> a-1.0a
> $ ls -v1 a* # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
> a-1.0.1a
> a-1.0a
> compare also the results for real data like
> netstat-nat-1.{0,1{,.1},2,3.1,4{,.{1,2,3,4,5,6,7,8,9,10}}}.tar.gz
>
> Thus, coreutils ls -v does NOT use strverscmp(3);
> it uses a modified Debian version comparison algorithm with additional
> suffix processing and ls -v-specific exceptions.
>
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Patch applied. Thanks!
Have a lovely day!
Alex
> ---
> man/man3/strverscmp.3 | 23 ++++++++---------------
> 1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/man/man3/strverscmp.3 b/man/man3/strverscmp.3
> index 41bc1ddbd..e028d6788 100644
> --- a/man/man3/strverscmp.3
> +++ b/man/man3/strverscmp.3
> @@ -18,25 +18,14 @@ .SH SYNOPSIS
> .BI "int strverscmp(const char *" s1 ", const char *" s2 );
> .fi
> .SH DESCRIPTION
> -Often one has files
> +For a dataset like
> .IR jan1 ", " jan2 ", ..., " jan9 ", " jan10 ", ..."
> -and it feels wrong when
> -.BR ls (1)
> -orders them
> +sorting it lexicographically yields
> .IR jan1 ", " jan10 ", ..., " jan2 ", ..., " jan9 .
> .\" classical solution: "rename jan jan0 jan?"
> -In order to rectify this, GNU introduced the
> -.I \-v
> -option to
> -.BR ls (1),
> -which is implemented using
> -.BR versionsort (3),
> -which again uses
> -.BR strverscmp ().
> -.P
> -Thus, the task of
> +The task of
> .BR strverscmp ()
> -is to compare two strings and find the "right" order, while
> +is to compare two strings yielding the former order, while
> .BR strcmp (3)
> finds only the lexicographic order.
> This function does not use
> @@ -44,6 +33,10 @@ .SH DESCRIPTION
> .BR LC_COLLATE ,
> so is meant mostly for situations
> where the strings are expected to be in ASCII.
> +This is different from the ordering produced by
> +.BR sort (1)
> +.BR -V .
> +.\" sort -V sorts a-1.0a < a-1.0.1a; strverscmp() does not
> .P
> What this function does is the following.
> If both strings are equal, return 0.
> --
> 2.39.5
>
--
<https://www.alejandro-colomar.es/>
Thread overview: 6+ messages
2024-12-15 20:17 [PATCH] strverscmp.3: this is NOT the ordering used by ls -v Ahelenia Ziemiańska
2024-12-15 20:43 ` Alejandro Colomar
2024-12-15 21:02 ` [PATCH v2] " наб
2024-12-15 21:44 ` Alejandro Colomar
2024-12-16 1:00 ` [PATCH v3] " наб
2024-12-16 9:57 ` Alejandro Colomar [this message]