public inbox for linux-man@vger.kernel.org
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx@kernel.org>
To: linux-man@vger.kernel.org, Zack Weinberg <zack@owlfolio.org>
Cc: Lee Griffiths <poddster@gmail.com>
Subject: Re: [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS
Date: Wed, 6 Dec 2023 17:36:24 +0100	[thread overview]
Message-ID: <ZXCjD5dP-jaUpeER@debian> (raw)
In-Reply-To: <20231206145132.5538-2-alx@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 3920 bytes --]

Hi Zack,

On Wed, Dec 06, 2023 at 03:52:34PM +0100, Alejandro Colomar wrote:
> Several programmers have been confused about this use of 'deprecated'.
> 
> Also, maximum field width can be used with these fields to mitigate the
> problem.  Still, it's only a mitigation, since it limits the number of
> characters read, but that means an input of LONG_MAX+1 --which takes up
> the same number of characters than LONG_MAX-- would still cause UB; or
> one can limit that to well below the limit of UB, but then you
> artificially invalidate valid input.  No good way to avoid UB with
> sscanf(3), but it's not necessarily bad with trusted input (and
> strtol(3) isn't the panacea either; strtoi(3) is good, though, but not
> standard).
> 
> Try to be more convincing in BUGS instead.
> 
> Link: <https://stackoverflow.com/questions/77601832/man-sscanf-d-is-deprecated-in-c-or-glibc/>
> Cc: Lee Griffiths <poddster@gmail.com>
> Cc: Zack Weinberg <zack@owlfolio.org>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> 
> Hi Lee!
> 
> Thanks for the report.  After seeing how much frustration it has caused,
> I propose this change.  Does it look good to you?
> 
> Thanks,
> Alex

Formatted page:

BUGS
   Numeric conversion specifiers
     Use of the numeric conversion specifiers produces  Undefined  Be‐
     havior for invalid input.  See C11 7.21.6.2/10.  This is a bug in
     the  ISO  C  standard,  and not an inherent design issue with the
     API.  However, current implementations are  not  safe  from  that
     bug,  so  it  is  not recommended to use them.  Instead, programs
     should use functions such as strtol(3) to  parse  numeric  input.
     This  manual page deprecates use of the numeric conversion speci‐
     fiers until they are fixed by ISO C.

I think it would be good if glibc would make promises about sscanf(3)
on untrusted input.  How about guaranteeing a value of -1 and ERANGE if
the integer would overflow?

The current implementation, AFAIK, uses strtol(3), so it has the
following behavior:

-  For %d, if the value is >INT_MAX but <=LONG_MAX, the wrap-around
   value is stored, and errno is not set.

-  For %d, if the value is >LONG_MAX, -1 is stored, and errno is set.

	$ cat sscanf.c 
	#define _GNU_SOURCE
	#include <errno.h>
	#include <stdio.h>
	#include <string.h>

	#define wrap(s)  do                                                           \
	{                                                                             \
		int  i, ret;                                                          \
										      \
		errno = 0;                                                            \
		ret = sscanf(s, "%d", &i);                                            \
		printf("%s: ret= %d, val= %d, errno= %s\n", #s , ret, i, strerrorname_np(errno)); \
	} while (0)

	int
	main(void)
	{
		char  str_a[] = "9223372036854775828";   // 2^63 + 20
		char  str_s[] = "8589934599";  // 2^33 + 7
		char  str_d[] = "4294967290";  // 2^32 - 6
		char  str_f[] = "2147483678";  // 2^31 + 30
		char  str_g[] = "2147483638";  // 2^31 - 10

		wrap(str_a);
		wrap(str_s);
		wrap(str_d);
		wrap(str_f);
		wrap(str_g);
	}
	$ cc -Wall -Wextra sscanf.c 
	$ ./a.out 
	str_a: ret= 1, val= -1, errno= ERANGE
	str_s: ret= 1, val= 7, errno= 0
	str_d: ret= 1, val= -6, errno= 0
	str_f: ret= 1, val= -2147483618, errno= 0
	str_g: ret= 1, val= 2147483638, errno= 0

The suggested change would be to act as if

	strtoi(str, NULL, 0, INT_MIN, INT_MAX, &err);

would have been called.  Does that make sense to you?

Also, I was going to ask for strtoi(3bsd) in glibc, since strtol(3)
isn't easy to use portably (since POSIX allows EINVAL on no conversion,
how to differentiate strtoi(3bsd)'s ECANCELED from EINVAL in strtol(3)?).

Thanks,
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-12-06 16:36 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-06 14:52 [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS Alejandro Colomar
2023-12-06 16:36 ` Alejandro Colomar [this message]
2023-12-06 18:33   ` Matthew House
2023-12-06 20:17     ` Alejandro Colomar
2023-12-06 20:45       ` Matthew House
2023-12-06 20:54         ` Matthew House
2023-12-06 21:12         ` Alejandro Colomar
     [not found] ` <CAKXok1GQvKi2HiBU89CSd+KF_dd9+mOMVhHrMKAVLLwcyJDN2g@mail.gmail.com>
2023-12-07 21:50   ` Fwd: " Lee Griffiths
2023-12-09 11:55     ` Alejandro Colomar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZXCjD5dP-jaUpeER@debian \
    --to=alx@kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=poddster@gmail.com \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox