git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Cc: Jeff King <peff@peff.org>, git@vger.kernel.org
Subject: Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
Date: Tue, 24 Mar 2015 17:06:12 +0100	[thread overview]
Message-ID: <55118B74.1030201@alum.mit.edu> (raw)
In-Reply-To: <xmqqk2ydjvcd.fsf@gitster.dls.corp.google.com>

On 03/19/2015 08:32 AM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> I wonder how much of the boilerplate in the parse_* functions could be
>> factored out to use a uintmax_t, with the caller just providing the
>> range. That would make it easier to add new types like off_t, and
>> possibly even constrained types (e.g., an integer from 0 to 100). On the
>> other hand, you mentioned to me elsewhere that there may be some bugs in
>> the range-checks of config.c's integer parsing. I suspect they are
>> related to exactly this kind of refactoring, so perhaps writing things
>> out is best.
> 
> I like this idea very well.  I wonder if we can implement the family
> of
> 
>     parse_{type}(const char *, unsigned int flags,
>     		 const char **endptr, {type} *result)
> 
> functions by calls a helper that internally deals with the numbers
> in uintmax_t, and then checks if the value fits within the possible
> range of the *result before returning.
> 
>     int parse_i(const char *str, unsigned flags,
> 		const char **endptr, int *result) {
> 	uintmax_t val;
>         int sign = 1, status;
>         if (*str == '-') {
> 		sign = -1; 
>                 str++;
> 	}
>         status = parse_num(str, flags, endptr, &val, INT_MAX);
>         if (status < 0)
>         	return status;
> 	*result = sign < 0 ? -val : val;
>         return 0;
>     }
> 
> (assuming the last parameter to parse_num is used to check if the
> result fits within that range).  Or that may be easier and cleaner
> to be done in the caller with or something like that:
> 
> 	status = parse_num(str, flags, endptr, &val);
>         if (status < 0)
>         	return status;
> 	if (INT_MAX <= val * sign || val * sign <= INT_MIN) {
>         	errno = ERANGE;
>                 return -1;
> 	}
> 
> If we wanted to, we may even be able to avoid duplication of
> boilerplate by wrapping the above pattern within a macro,
> parameterizing the TYPE_{MIN,MAX} and using token pasting, to
> expand to four necessary result types.
> 
> There is no reason for the implementation of the parse_num() helper
> to be using strtoul(3) or strtoull(3); its behaviour will be under
> our total control.  It can become fancier by enriching the flags
> bits (e.g. allow scaling factor, etc.) only once and all variants
> for various result types will get the same feature.

Parsing numbers is not rocket science, but there are a lot of pitfalls,
especially around overflow. It's even harder to write such code via
macros and the result is less readable.

This patch series is mostly about finding a reasonable API and whipping
the callers into shape. That seems ambitious enough for me. I'd rather
stick with boring wrappers for now and lean on strtol()/strtoul() to do
the dirty work. It will be easy for a motivated person to change the
implementation later.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

  reply	other threads:[~2015-03-24 16:06 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-17 16:00 [PATCH 00/14] numparse module: systematically tighten up integer parsing Michael Haggerty
2015-03-17 16:00 ` [PATCH 01/14] numparse: new module for parsing integral numbers Michael Haggerty
2015-03-18 18:27   ` Eric Sunshine
2015-03-18 22:47     ` Michael Haggerty
2015-03-20  8:54       ` Eric Sunshine
2015-03-20 17:51   ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 02/14] cacheinfo_callback(): use convert_ui() when handling "--cacheinfo" Michael Haggerty
2015-03-17 16:00 ` [PATCH 03/14] write_subdirectory(): use convert_ui() for parsing mode Michael Haggerty
2015-03-17 16:00 ` [PATCH 04/14] handle_revision_opt(): use skip_prefix() in many places Michael Haggerty
2015-03-17 16:00 ` [PATCH 05/14] handle_revision_opt(): use convert_i() when handling "-<digit>" Michael Haggerty
2015-03-19  6:34   ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 06/14] strtoul_ui(), strtol_i(): remove functions Michael Haggerty
2015-03-17 16:00 ` [PATCH 07/14] handle_revision_opt(): use convert_ui() when handling "--abbrev=" Michael Haggerty
2015-03-17 16:00 ` [PATCH 08/14] builtin_diff(): detect errors when parsing --unified argument Michael Haggerty
2015-03-17 16:00 ` [PATCH 09/14] opt_arg(): val is always non-NULL Michael Haggerty
2015-03-17 16:00 ` [PATCH 10/14] opt_arg(): use convert_i() in implementation Michael Haggerty
2015-03-17 16:00 ` [PATCH 11/14] opt_arg(): report errors parsing option values Michael Haggerty
2015-03-17 16:00 ` [PATCH 12/14] opt_arg(): simplify pointer handling Michael Haggerty
2015-03-17 16:00 ` [PATCH 13/14] diff_opt_parse(): use convert_i() when handling "-l<num>" Michael Haggerty
2015-03-17 16:00 ` [PATCH 14/14] diff_opt_parse(): use convert_i() when handling --abbrev=<num> Michael Haggerty
2015-03-19  6:37   ` Junio C Hamano
2015-03-17 18:48 ` [PATCH 00/14] numparse module: systematically tighten up integer parsing Junio C Hamano
2015-03-17 19:46   ` Michael Haggerty
2015-03-19  6:31     ` Junio C Hamano
2015-03-17 23:05 ` Duy Nguyen
2015-03-18  9:47   ` Michael Haggerty
2015-03-18  9:58     ` Duy Nguyen
2015-03-18 10:03     ` Jeff King
2015-03-18 10:20       ` Michael Haggerty
2015-03-19  5:26 ` Jeff King
2015-03-19  6:41   ` Junio C Hamano
2015-03-19  7:32   ` Junio C Hamano
2015-03-24 16:06     ` Michael Haggerty [this message]
2015-03-24 16:49       ` René Scharfe
2015-03-25 21:14         ` Michael Haggerty
2015-03-25 21:59           ` Junio C Hamano
2015-03-24 15:05   ` Michael Haggerty
2015-03-19  6:22 ` Junio C Hamano
2015-03-24 15:42   ` Michael Haggerty
2015-03-24 15:58     ` Junio C Hamano
2015-03-24 16:09       ` Junio C Hamano
2015-03-24 17:39       ` Michael Haggerty
2015-03-24 18:08         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55118B74.1030201@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=peff@peff.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).