public inbox for linux-man@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Alejandro Colomar <alx.manpages@gmail.com>
Cc: Ian Abbott <abbotti@mev.co.uk>, Zack Weinberg <zack@owlfolio.org>,
	linux-man@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Alejandro Colomar <alx@kernel.org>
Subject: Re: [PATCH] scanf.3: Do not mention the ERANGE error
Date: Fri, 20 Jan 2023 11:41:04 -0800	[thread overview]
Message-ID: <Y8ruUIIh/9RdQfeU@sol.localdomain> (raw)
In-Reply-To: <b929eaed-0e8f-bb6c-87cb-8a36573c2769@gmail.com>

On Fri, Jan 20, 2023 at 02:12:07PM +0100, Alejandro Colomar wrote:
> >  Is the undefined behavior here a real world issue
> > anywhere, or is this just a theoretical issue based on interpretation of the C
> > standard?
> 
> All implementations of sscanf(3) produce Undefined Behavior (UB), AFAIK.
> How much you consider UB to be a real-world issue differs for each
> programmer, but I tend to consider all UB to be as bad as nasal demons.  I'm
> not saying UB shouldn't exist, just that you shouldn't invoke it.  And a
> function that is used for scanning user input is one of those places where
> you really want to avoid invoking UB.

Well, according to the C standard, the behavior is undefined when the value
scanned by an integer conversion specifier does not fit in the type.  That's
clearly a bug in the standard.  Obviously, implementations need not implement
that bug, because it is stupid -- much stupider than other cases of UB in the
standard.  It's not unreasonable to focus on what implementations actually do.

In general, compilers optimize code assuming that undefined behavior never
occurs.  However, this specific type of undefined behavior is pretty obscure,
and it is hard to think of any compiler optimization that would apply to it.
I'm doubtful that any has been implemented.

UBSAN does not detect this as undefined behavior either.  (Tested with
'sscanf("99999999999999999999", "%d", &x' with gcc 12.2.0 and clang 14.0.6.)

The remaining question, then, would be whether any actual sscanf()
implementation would actually do something "bad" given one of these "undefined"
inputs.  I think the main pitfall would be that a naive implementation of %d and
other *signed* integer conversion specifiers would execute a signed integer
multiplication that overflows.  That's a more well established undefined
behavior.  Though, some implementations make that behavior defined too.

Unsigned specifiers such as %u should fare better, as unsigned integer overflow
has defined behavior.  I.e., it would be much harder to write an implementation
of %u that invokes undefined behavior due to the value being too large.

It might be fair to say that behavior here is de facto implementation-defined,
despite the standard saying undefined...

Anyway, if you do go the hard-line route of "undefined is undefined, so let's
deprecate", you need to make it clear (a) who is doing the deprecation, (b) what
the actual issue is, and (c) what the replacement is.

- Eric

      parent reply	other threads:[~2023-01-20 19:41 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-08 12:34 [PATCH] scanf.3: Do not mention the ERANGE error Ian Abbott
2022-12-09 18:59 ` Alejandro Colomar
2022-12-09 19:28   ` Ian Abbott
2022-12-09 19:33     ` Alejandro Colomar
2022-12-09 21:41       ` Zack Weinberg
2022-12-11 15:58         ` Alejandro Colomar
2022-12-11 16:03           ` Alejandro Colomar
2022-12-12  2:11           ` Zack Weinberg
2022-12-12 10:21             ` Alejandro Colomar
2022-12-14  2:13               ` Zack Weinberg
2022-12-14 10:47                 ` Alejandro Colomar
2022-12-14 11:03                   ` Ian Abbott
2022-12-29  6:42                     ` Zack Weinberg
2022-12-29  6:39                   ` Zack Weinberg
2022-12-29 10:47                     ` Alejandro Colomar
2022-12-29 16:35                       ` Zack Weinberg
2022-12-29 16:39                         ` Alejandro Colomar
2022-12-12 15:22             ` Ian Abbott
2022-12-14  2:18               ` Zack Weinberg
2022-12-14 10:22                 ` Ian Abbott
2022-12-14 10:39                   ` Alejandro Colomar
2022-12-14 10:52                     ` Ian Abbott
2022-12-14 11:23                       ` Alejandro Colomar
2022-12-14 14:10                         ` Ian Abbott
2022-12-14 16:38                         ` Joseph Myers
2022-12-12 10:07       ` Ian Abbott
2022-12-12 11:33 ` Alejandro Colomar
2023-01-20  4:09   ` Eric Biggers
2023-01-20 13:12     ` Alejandro Colomar
2023-01-20 17:55       ` G. Branden Robinson
2023-01-20 22:02         ` Alejandro Colomar
2023-01-20 19:41       ` Eric Biggers [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8ruUIIh/9RdQfeU@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=abbotti@mev.co.uk \
    --cc=alx.manpages@gmail.com \
    --cc=alx@kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox