From: Bruno Haible
To: Alejandro Colomar
Cc: liba2i@lists.linux.dev, sc22wg14@open-std.org,
 libbsd@lists.freedesktop.org, tech-misc@netbsd.org, christos,
 Đoàn Trần Công Danh, Paul Eggert, Eli Schwartz, Guillem Jover,
 Iker Pedrosa, Michael Vetter, Robert Elz, riastradh@netbsd.org,
 Sam James, "Serge E. Hallyn"
Subject: Re: alx-0008 - Standardize strtoi(3) and strtou(3) from NetBSD
Date: Tue, 18 Mar 2025 22:53:09 +0100
Message-ID: <18739733.sWSEgdgrri@nimes>
Organization: GNU

Hi Alejandro,

> Below is a draft of a proposal for standardization of strtoi/u(3) from
> NetBSD in ISO C2y.

First of all: I like your initiative, and I moderately like this proposal.

> The strtol(3) family of functions is so damn hard to use
> correctly.  Only a handful of programmers in the world really
> know how to use it correctly in all the corner cases, and even
> those need to be really careful to not make mistakes.

It would be useful to list the mistakes that are made most frequently,
so as to verify that the proposed strtoi / strtou functions don't tend
to provoke the same mistakes.

(I'd guess that one of the frequent mistakes is this: when the number is
not expected to occupy the entire string, the success test after
(errno = 0, strtol (...)) is
    endptr > nptr && errno == 0
and programmers tend to forget one of the two conditions.)

> +Synopsis
> +1      #include
> +       intmax_t strtoi(const char *restrict s, char **restrict endp, int base,
> +           intmax_t min, intmax_t max, int *rstatus);
> +       uintmax_t strtou(const char *restrict s, char **restrict endp, int base,
> +           uintmax_t min, uintmax_t max, int *rstatus);

It will probably be an impediment to adoption that these functions work
on [u]intmax_t, which are 64-bit or 128-bit integers. That seems overkill
when people want to parse, say, a port number in the range 0..65535.

To address this adoption problem, how about changing these functions to
generic functions (in the sense of )? In such a way that
    strtoi (n, &end, base, LONG_MIN, LONG_MAX, &status)
is known to return a 'long' rather than 'intmax_t', and
    strtoi (n, &end, base, INT_MIN, INT_MAX, &status)
is known to return an 'int' rather than 'intmax_t'.

If the standard does NOT say that these functions are generic, it would
be harder for an implementation to optimize invocations of these
functions for narrower types: I don't see how it could be done without
explicit compiler support.

> +       Instead, they set the object pointed to by rstatus
> +       to an error code,
> +       or to zero on success.
> +
> +12     -- EINVAL       The value in base is not supported.
> +       -- ECANCELED    The given string did not contain
> +                       any characters that were converted.
> +       -- ERANGE       The converted value was out of range
> +                       and has been coerced,
> +                       or the range was invalid (e.g., min > max).
> +       -- ENOTSUP      The given string contained characters
> +                       that did not get converted.
> +
> +13     If various errors happen in the same call,
> +       the first one listed here is reported.

It would be useful to show what a success test looks like, after
    strtoi (s, &end, base, min, max, &status)
for each of the four frequent use-cases:
  -a. expect to parse the initial portion of the string, no coercion,
  -b. expect to parse the initial portion of the string, silent coercion,
  -c. expect to parse the entire string, no coercion,
  -d. expect to parse the entire string, silent coercion.

AFAICS, the success tests are:
  -a. status == 0 || status == ENOTSUP
  -b. status == 0 || status == ENOTSUP || status == ERANGE
  -c. status == 0
  -d. status == 0 || (status == ERANGE && end > s && *end == '\0')

The success test in case d. is so complicated that, to my mind, the goal
of avoiding programmer mistakes is not being met. The complication comes
from the rule that only the first listed error is reported: ERANGE hides
a possible ENOTSUP, so the caller has to inspect 'end' by hand.

I would therefore propose to change the status value to a bit mask, so
that the error conditions
  "The converted value was out of range and has been coerced"
and
  "The given string contains characters that did not get converted"
can both be returned together, without conflicting.

And, while at it, the error condition "min > max" is an error that is
independent of the given string contents; I would rather see it mapped
to EINVAL than to ERANGE.

Bruno
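
For illustration only, the bit-mask idea could look roughly like the
sketch below. Nothing in it is taken from the proposal or from NetBSD:
the function name strtoi_mask, the ST_* flag names and values, the ok_*
predicates, and the choice to build on strtoimax() are all assumptions
made for this example. It also folds "min > max" into the invalid-argument
flag, as suggested above.

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical status bits (names and values are illustrative only). */
#define ST_INVAL    1u  /* unsupported base, or min > max */
#define ST_NOCONV   2u  /* no characters were converted */
#define ST_COERCED  4u  /* value was out of range and has been clamped */
#define ST_TRAILING 8u  /* unconverted characters remain after the number */

/* Sketch of a strtoi()-like function that reports a bit mask. */
intmax_t
strtoi_mask(const char *s, char **endp, int base,
            intmax_t min, intmax_t max, unsigned *status)
{
    char *end = (char *) s;
    intmax_t n;
    int saved_errno = errno;
    int overflow;

    *status = 0;
    if (endp != NULL)
        *endp = end;
    if (base < 0 || base == 1 || base > 36 || min > max) {
        *status = ST_INVAL;
        return 0;
    }

    errno = 0;
    n = strtoimax(s, &end, base);
    overflow = (errno == ERANGE);
    errno = saved_errno;        /* like strtoi(3), leave errno untouched */
    if (endp != NULL)
        *endp = end;

    if (end == s) {
        *status |= ST_NOCONV;
    } else {
        if (overflow || n < min || n > max)
            *status |= ST_COERCED;
        if (*end != '\0')
            *status |= ST_TRAILING;
    }

    /* Clamp the result into [min, max]. */
    if (n < min)
        n = min;
    if (n > max)
        n = max;
    return n;
}

/* The four success tests a. to d., each a single mask comparison. */
static inline int ok_prefix_strict(unsigned st) { return (st & ~ST_TRAILING) == 0; }
static inline int ok_prefix_coerce(unsigned st) { return (st & ~(ST_TRAILING | ST_COERCED)) == 0; }
static inline int ok_whole_strict(unsigned st)  { return st == 0; }
static inline int ok_whole_coerce(unsigned st)  { return (st & ~ST_COERCED) == 0; }
```

With such an encoding, parsing "70000x" against the range 0..65535 would
report both the coercion and the trailing character, and case d. no
longer needs to look at 'end' at all.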