From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mo4-p01-ob.smtp.rzone.de (mo4-p01-ob.smtp.rzone.de [85.215.255.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A0E71FB3 for ; Thu, 20 Mar 2025 14:27:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=85.215.255.53 ARC-Seal:i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742480869; cv=pass; b=a9NFxXHlLtIHqFKp4Jt5wy+g8p8SFaEqhyxlEe9ScnG4SZTI+KlTRPyv1g/8v6lrSlaTc8AhEkJaplrOY5dGb7A1d9JNZ/2mj46tf9J5UDbF0q5V01/u/nwH6cIXbNd8Tz9ftzdvSWa/dnjaD7BABYQomsNOoHxaPq9qnTGrB7s= ARC-Message-Signature:i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742480869; c=relaxed/simple; bh=nwWsrAs/aE82BOxJHDgRcLf+bk5lTuexaoMkO+l489s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=UbAEllPjJWMO5UVYO0G444Ikco5ViTXEE05QgUdgK6a6t9dOguxBbl6QV3WADefal/5lgwr2YyQkobKdyOG7yLz/1ZjPXgCoZvAEJZWwm2ki2MkKr3bCXnRHBrfYS2QIr7wtv98wjF/2Nw4K3av3ADnO7mViqKoK5rVqm4c3I60= ARC-Authentication-Results:i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=clisp.org; spf=none smtp.mailfrom=clisp.org; dkim=pass (2048-bit key) header.d=clisp.org header.i=@clisp.org header.b=Rf3pPvJv; dkim=permerror (0-bit key) header.d=clisp.org header.i=@clisp.org header.b=ariaSAPr; arc=pass smtp.client-ip=85.215.255.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=clisp.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=clisp.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=clisp.org header.i=@clisp.org header.b="Rf3pPvJv"; dkim=permerror (0-bit key) header.d=clisp.org header.i=@clisp.org header.b="ariaSAPr" ARC-Seal: i=1; a=rsa-sha256; t=1742480809; cv=none; d=strato.com; s=strato-dkim-0002; b=aVcuyy8NG7JtjCBr/v69DJqSGodvvUxJYr0p+k7p5RJBh1uCxrvPRxYl4lUNXpwjv3 jV9nVCuqYRMwjP4xT6FO3BDVskShCXtjcgtqr74WTN2RFhYlU1AnGNIAPsScMYfvqgbT 7y2BsyDUU1BZxOfClrc/vM60IJy1J4EToJ+8KqOM+EDKKf8augVoSlsAX7E5ZmPukAYQ eM+/PmRB8cCpfyL2gUTi3HeGdelnbHsKu/Lc9QFtRPvtXP6Haxkwi9NewXrM2H9N7kJC SKBKCswlmS9T0+uC0oAJSPyIKvpEO+y9IjWCI5rG9Yo9t6ny8Nq55al4fqsuB8lmektI 4T6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; t=1742480809; s=strato-dkim-0002; d=strato.com; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=izl7IX4XMrEI+/YYkzZ5NIoS9San4Q/ONxp4NSVTei0=; b=hhfPKazkKTbNDus3HAQwlRuaFAZOd0tTS1AnsK6lwHQgtddGC9igZ/Ovl3uL4Vg4/j KIp9D1UwaUKeAESbsuHcEwMg4fEhzMaF2H0PMHutREzUrH0i/YE6Uj4sRMOAcN0pLGx9 5zycRoEquQiLxErULY16VAeCB3wQx/HrnfDOe/Zy/roSvEaCtTcJq17lZoSaO2toeXAd P2J/FFSEHSeEOs1l8hqS5CzBwRJbVrXYW8bpmT5beV48QmV4sftmww149r9mHVeFVkx9 89g6hMMMy+JW9yZ+tkMhbWteXpIztNCs0QJNndp3bM4+aJc4slxj88hIOEQ//Lg9XMw7 L51Q== ARC-Authentication-Results: i=1; strato.com; arc=none; dkim=none X-RZG-CLASS-ID: mo01 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1742480809; s=strato-dkim-0002; d=clisp.org; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=izl7IX4XMrEI+/YYkzZ5NIoS9San4Q/ONxp4NSVTei0=; b=Rf3pPvJvlpvx1KIb0ZOGyh9ohxLVutjyfFLZfEDCCuIHjLtSYbMueH6vt1PRLBVr3B 3ctolHwrNADjhsMTHW+UUtT4+LEK+fCWbRNMEB6NPWbhpSjB53ibwzAJiUlULvKZ2xTU lvnzuEDHVG0IRg4KxNSAxn1ed5eMI+UTw4BPFUyLK0YQw33jmJmJq7NglkdHI90k8zh6 OM2mVYwRJZ48U4A/xuZNlSSGg/34/dEJHK6NLQwEzfAq04zif4H+wJfY6hueoROwf9be YVo1AK8sSnJHdeiT8+37S1/f3HU8+aquHQA8OpRvG+xm2RNM6Lac6uhGvAGp026fgn/h xftw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; t=1742480809; s=strato-dkim-0003; d=clisp.org; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=izl7IX4XMrEI+/YYkzZ5NIoS9San4Q/ONxp4NSVTei0=; b=ariaSAPrdMTvmigLWFm0vi2O6qHr/JkkX73Ak3SIcP2vC6PEtQBW7BF3gZIaxfoNlr XoVMWjx9p7tfIQTlGaCQ== X-RZG-AUTH: ":Ln4Re0+Ic/6oZXR1YgKryK8brlshOcZlLnY4jECd2hdUURIbZgL8PX2QiTuZ3cdB8X/nqjiaQjqWehHPXowdmmJ2tGkuj4i6" Received: from nimes.localnet by smtp.strato.de (RZmta 51.3.0 AUTH) with ESMTPSA id N7dcf812KEQlCcG (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits)) (Client did not present a certificate); Thu, 20 Mar 2025 15:26:47 +0100 (CET) From: Bruno Haible To: Alejandro Colomar Cc: liba2i@lists.linux.dev, sc22wg14@open-std.org, libbsd@lists.freedesktop.org, tech-misc@netbsd.org, christos , =?utf-8?B?xJBvw6BuIFRy4bqnbiBDw7RuZw==?= Danh , Paul Eggert , Eli Schwartz , Guillem Jover , Iker Pedrosa , Michael Vetter , Robert Elz , riastradh@netbsd.org, Sam James , "Serge E. Hallyn" Subject: Re: alx-0008 - Standardize strtoi(3) and strtou(3) from NetBSD Date: Thu, 20 Mar 2025 15:26:46 +0100 Message-ID: <6580350.j6PcuT4dK6@nimes> Organization: GNU In-Reply-To: References: <4085563.2iPT33SAM4@nimes> Precedence: bulk X-Mailing-List: liba2i@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Alejandro Colomar wrote: > > 2) nproc.c, omp_init.c > > > > > > gnulib/lib/nproc.c:402 > > > > > > lib/nproc.c-383-/* Parse OMP environment variables without dependence on OMP. > > > lib/nproc.c-384- Return 0 for invalid values. */ > > > lib/nproc.c-385-static unsigned long int > > > lib/nproc.c:386:parse_omp_threads (char const* threads) > > > lib/nproc.c-387-{ > > > > > > ... > > > > > > lib/nproc.c-398- /* Convert it from positive decimal to 'unsigned long'. */ > > > lib/nproc.c-399- if (c_isdigit (*threads)) > > > lib/nproc.c-400- { > > > lib/nproc.c-401- char *endptr = NULL; > > > lib/nproc.c:402: unsigned long int value = strtoul (threads, &endptr, 10); > > > lib/nproc.c-403- > > > lib/nproc.c-404- if (endptr != NULL) > > > lib/nproc.c-405- { > > > lib/nproc.c-406- while (*endptr != '\0' && c_isspace (*endptr)) > > > lib/nproc.c-407- endptr++; > > > lib/nproc.c-408- if (*endptr == '\0') > > > lib/nproc.c-409- return value; > > > lib/nproc.c-410- /* Also accept the first value in a nesting level, > > > lib/nproc.c-411- since we can't determine the nesting level from env vars. */ > > > lib/nproc.c-412- else if (*endptr == ',') > > > lib/nproc.c-413- return value; > > > lib/nproc.c-414- } > > > lib/nproc.c-415- } > > > ... > > > This is one case where you seem to silently ignore saturation. > > > > This is on purpose. A CPU never has more than 1000 processors. Therefore > > all values > 1000 should be treated the same way, whether they are > > <= ULONG_MAX or > ULONG_MAX. Clamping to the number of actually available > > processors occurs later in the code. > > I guess you refer to the MIN() calls within num_processors(), right? > Why do those clampings not result in diagnostics? Couldn't you > calculate the limits before parsing the actual number, and so use them > to perform the range checks during the strtou(3) call? The essential code is like this: unsigned long int omp_env_limit = ULONG_MAX; if (query == NPROC_CURRENT_OVERRIDABLE) { omp_env_limit = parse_omp_threads (getenv ("OMP_THREAD_LIMIT")); if (! omp_env_limit) omp_env_limit = ULONG_MAX; ... query = NPROC_CURRENT; } unsigned long nprocs = num_processors_ignoring_omp (query); return MIN (nprocs, omp_env_limit); and num_processors_ignoring_omp (query) is expensive to compute, since it makes system calls. For this reason, we do the parsing of environment variables first; this gives us the opportunity to optimize away the system calls in particular circumstances.[1] > > 3) msgfmt.c > > > > > > gettext/gettext-tools/src/msgfmt.c:287 > > > > > > gettext-tools/src/msgfmt.c-286- char *endp; > > > gettext-tools/src/msgfmt.c:287: size_t new_align = strtoul (optarg, &endp, 0); > > > gettext-tools/src/msgfmt.c-288- > > > gettext-tools/src/msgfmt.c-289- if (endp != optarg) > > > gettext-tools/src/msgfmt.c-290- alignment = new_align; > > > ... > > > Why don't you have a diagnostic message for invalid input? > > > > Because this option is meant for users who know what an "alignment" is. > > But still, you don't need to actively ignore ERANGE, especially when it > results in more complex code than actually checking. Once the API gives > you the check for free, you should probably use it. > > In this case, I think I'd do > > int status; > size_t new_align; > > new_align = strtou(optarg, NULL, 0, 1, 16, &status); > if (status == 0) > alignment = new_align; > > Don't you think this is more robust and just as simple? > ... > strtoul(3) makes it difficult for you to do the check, > not because you actively want to not check. Yes and no. Yes, it would be nice to have a range check at essentially zero cost. No, clamping is not the right behaviour: an alignment of 5 or 17 should be rejected, not mapped to something valid. > > 4) msgl-check.c > > > > > > gettext/gettext-tools/src/msgl-check.c:379 > > > > > > gettext-tools/src/msgl-check.c-374- while (*nplurals != '\0' && c_isspace ((unsigned char) *nplurals)) > > > gettext-tools/src/msgl-check.c-375- ++nplurals; > > > gettext-tools/src/msgl-check.c-376- endp = nplurals; > > > gettext-tools/src/msgl-check.c-377- nplurals_value = 0; > > > gettext-tools/src/msgl-check.c-378- if (*nplurals >= '0' && *nplurals <= '9') > > > gettext-tools/src/msgl-check.c:379: nplurals_value = strtoul (nplurals, (char **) &endp, 10); > > > gettext-tools/src/msgl-check.c-380- if (nplurals == endp) > > > gettext-tools/src/msgl-check.c-381- { > > > gettext-tools/src/msgl-check.c-382- const char *msg = _("invalid nplurals value"); > > > gettext-tools/src/msgl-check.c-383- char *help = plural_help (nullentry); > > > gettext-tools/src/msgl-check.c-384- > > > gettext-tools/src/msgl-check.c-385- if (help != NULL) > > > gettext-tools/src/msgl-check.c-386- { > > > gettext-tools/src/msgl-check.c-387- char *msgext = xasprintf ("%s\n%s", msg, help); > > > gettext-tools/src/msgl-check.c-388- xeh->xerror (CAT_SEVERITY_ERROR, header, NULL, 0, 0, true, > > > gettext-tools/src/msgl-check.c-389- msgext); > > > gettext-tools/src/msgl-check.c-390- free (msgext); > > > gettext-tools/src/msgl-check.c-391- free (help); > > > gettext-tools/src/msgl-check.c-392- } > > > gettext-tools/src/msgl-check.c-393- else > > > gettext-tools/src/msgl-check.c-394- xeh->xerror (CAT_SEVERITY_ERROR, header, NULL, 0, 0, false, > > > gettext-tools/src/msgl-check.c-395- msg); > > > gettext-tools/src/msgl-check.c-396- > > > gettext-tools/src/msgl-check.c-397- seen_errors++; > > > gettext-tools/src/msgl-check.c-398- } > > > > > On the other hand, I also wonder why you don't diagnose invalid input. > > > Why is -10 an "invalid nplurals value" > > > > "-10" designates a negative number. For the number of plural forms, that's > > invalid. > > > > > but ULONG_MAX+10 is a valid (albeit clamped) one? > > > > Again, as mentioned above, values like 1000 and 10000000000000000 should > > be treated the same way. Since ULONG_MAX+10 gets clamped to ULONG_MAX, that's > > just fine. > > I see in line 433 a check > > if (min_nplurals < nplurals_value) > > And in line 449 a check > > else if (max_nplurals > nplurals_value) > > nplurals_value was parsed via strtoul(3) in line 379. > > And min_plurals and max_plurals were assigned in lines 293 and lines 298. > Which makes me wonder: why don't you move the tests from lines 4** into > the strtou(3) call? > > That is, why not call this? > > nplurals_value = strtou(nplurals, NULL, 10, min_nplurals, max_nplurals, &status); > > And then have the ERANGE handling there? For two reasons: - The program's logic is to parse simple things first and complicated things afterwards. The user is likely to make a mistake in the simple things with a lower probability than in the complicated things. When there is a mismatch, the likely culprit is a mistake in the complicated things; the diagnostics need to be directed at this scenario. - It feels strange to move complicated validation logic into a library function. In summary, while the built-in range checks are welcome for things like port numbers (0..65536), in this case it's better to keep the program's logic explicit and situation-aware. > In cases 2 and 4, I think you'd benefit from refactoring the code to use > the limits in the strtou(3) call. Especially in case 4 where you > already know the limits. No. As the author of that code, I disagree. Bruno [1] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=205078f891d87e0e966ad8616fdf0306437f0370