From: Alejandro Colomar <alx@kernel.org>
To: linux-man@vger.kernel.org, Zack Weinberg <zack@owlfolio.org>
Cc: Lee Griffiths <poddster@gmail.com>
Subject: Re: [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS
Date: Wed, 6 Dec 2023 17:36:24 +0100 [thread overview]
Message-ID: <ZXCjD5dP-jaUpeER@debian> (raw)
In-Reply-To: <20231206145132.5538-2-alx@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 3920 bytes --]
Hi Zack,
On Wed, Dec 06, 2023 at 03:52:34PM +0100, Alejandro Colomar wrote:
> Several programmers have been confused about this use of 'deprecated'.
>
> Also, maximum field width can be used with these fields to mitigate the
> problem. Still, it's only a mitigation, since it limits the number of
> characters read, but that means an input of LONG_MAX+1 --which takes up
> the same number of characters than LONG_MAX-- would still cause UB; or
> one can limit that to well below the limit of UB, but then you
> artificially invalidate valid input. No good way to avoid UB with
> sscanf(3), but it's not necessarily bad with trusted input (and
> strtol(3) isn't the panacea either; strtoi(3) is good, though, but not
> standard).
>
> Try to be more convincing in BUGS instead.
>
> Link: <https://stackoverflow.com/questions/77601832/man-sscanf-d-is-deprecated-in-c-or-glibc/>
> Cc: Lee Griffiths <poddster@gmail.com>
> Cc: Zack Weinberg <zack@owlfolio.org>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
>
> Hi Lee!
>
> Thanks for the report. After seeing how much frustration it has caused,
> I propose this change. Does it look good to you?
>
> Thanks,
> Alex
Formatted page:
BUGS
Numeric conversion specifiers
Use of the numeric conversion specifiers produces Undefined Be‐
havior for invalid input. See C11 7.21.6.2/10. This is a bug in
the ISO C standard, and not an inherent design issue with the
API. However, current implementations are not safe from that
bug, so it is not recommended to use them. Instead, programs
should use functions such as strtol(3) to parse numeric input.
This manual page deprecates use of the numeric conversion speci‐
fiers until they are fixed by ISO C.
I think it would be good if glibc would make promises about sscanf(3)
on untrusted input. How about guaranteeing a value of -1 and ERANGE if
the integer would overflow?
The current implementation, AFAIK, uses strtol(3), so it has the
following behavior:
- For %d, if the value is >INT_MAX but <=LONG_MAX, the wrap-around
value is stored, and errno is not set.
- For %d, if the value is >LONG_MAX, -1 is stored, and errno is set.
$ cat sscanf.c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#define wrap(s) do \
{ \
int i, ret; \
\
errno = 0; \
ret = sscanf(s, "%d", &i); \
printf("%s: ret= %d, val= %d, errno= %s\n", #s , ret, i, strerrorname_np(errno)); \
} while (0)
int
main(void)
{
char str_a[] = "9223372036854775828"; // 2^63 + 20
char str_s[] = "8589934599"; // 2^33 + 7
char str_d[] = "4294967290"; // 2^32 - 6
char str_f[] = "2147483678"; // 2^31 + 30
char str_g[] = "2147483638"; // 2^31 - 10
wrap(str_a);
wrap(str_s);
wrap(str_d);
wrap(str_f);
wrap(str_g);
}
$ cc -Wall -Wextra sscanf.c
$ ./a.out
str_a: ret= 1, val= -1, errno= ERANGE
str_s: ret= 1, val= 7, errno= 0
str_d: ret= 1, val= -6, errno= 0
str_f: ret= 1, val= -2147483618, errno= 0
str_g: ret= 1, val= 2147483638, errno= 0
The suggested change would be to act as if
strtoi(str, NULL, 0, INT_MIN, INT_MAX, &err);
would have been called. Does that make sense to you?
Also, I was going to ask for strtoi(3bsd) in glibc, since strtol(3)
isn't easy to use portably (since POSIX allows EINVAL on no conversion,
how to differentiate strtoi(3bsd)'s ECANCELED from EINVAL in strtol(3)?).
Thanks,
Alex
--
<https://www.alejandro-colomar.es/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-12-06 16:36 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-06 14:52 [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS Alejandro Colomar
2023-12-06 16:36 ` Alejandro Colomar [this message]
2023-12-06 18:33 ` Matthew House
2023-12-06 20:17 ` Alejandro Colomar
2023-12-06 20:45 ` Matthew House
2023-12-06 20:54 ` Matthew House
2023-12-06 21:12 ` Alejandro Colomar
[not found] ` <CAKXok1GQvKi2HiBU89CSd+KF_dd9+mOMVhHrMKAVLLwcyJDN2g@mail.gmail.com>
2023-12-07 21:50 ` Fwd: " Lee Griffiths
2023-12-09 11:55 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZXCjD5dP-jaUpeER@debian \
--to=alx@kernel.org \
--cc=linux-man@vger.kernel.org \
--cc=poddster@gmail.com \
--cc=zack@owlfolio.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox