public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: David Disseldorp <ddiss@suse.de>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, christophe.jaillet@wanadoo.fr,
	andriy.shevchenko@linux.intel.com, David.Laight@ACULAB.COM
Subject: Re: [PATCH v2 2/4] kstrtox: introduce a safer version of memparse()
Date: Wed, 3 Jan 2024 01:19:08 +1100	[thread overview]
Message-ID: <20240103011908.3a25c567@echidna> (raw)
In-Reply-To: <12f3cbc956800709b2d5e1072bd22edc5720cbae.1704168510.git.wqu@suse.com>

On Tue,  2 Jan 2024 14:42:12 +1030, Qu Wenruo wrote:

> [BUGS]
> Function memparse() lacks error handling:
> 
> - If no valid number string at all
>   In that case @retptr would just be updated and return value would be
>   zero.
> 
> - No overflown detection
>   This applies to both the number string part, and the suffixes part.
>   And since we have no way to indicate errors, we can get weird results
>   like:
> 
>   	"25E" -> 10376293541461622784 (9E)
> 
>   This is due to the fact that for "E" suffix, there is only 4 bits
>   left, and 25 with 60 bits left shift would lead to overflow.
> 
> [CAUSE]
> The root cause is already mentioned in the comments of the function, the
> usage of simple_strtoull() is the source of evil.
> Furthermore the function prototype is no good either, just returning an
> unsigned long long gives us no way to indicate an error.
> 
> [FIX]
> Due to the prototype limits, we can not have a drop-in replacement for
> memparse().
> 
> This patch can only help by introduce a new helper, memparse_safe(), and
> mark the old memparse() deprecated.
> 
> The new memparse_safe() has the following improvement:
> 
> - Invalid string detection
>   If no number string can be detected at all, -EINVAL would be returned.
> 
> - Better overflow detection
>   Both the string part and the extra left shift would have overflow
>   detection.
>   Any overflow would result -ERANGE.
> 
> - Safer default suffix selection
>   The helper allows the caller to choose the suffixes that they want to
>   use.
>   But only "KMGTP" are recommended by default since the "E" leaves only
>   4 bits before overflow.
>   For those callers really know what they are doing, they can still
>   manually to include all suffixes.
> 
> Due to the prototype change, callers should migrate to the new one and
> change their code and add extra error handling.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  include/linux/kernel.h  |  8 +++-
>  include/linux/kstrtox.h | 15 +++++++
>  lib/cmdline.c           |  5 ++-
>  lib/kstrtox.c           | 96 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 122 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index d9ad21058eed..b1b6da60ea43 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -201,7 +201,13 @@ void do_exit(long error_code) __noreturn;
>  
>  extern int get_option(char **str, int *pint);
>  extern char *get_options(const char *str, int nints, int *ints);
> -extern unsigned long long memparse(const char *ptr, char **retptr);
> +
> +/*
> + * DEPRECATED, lack of any kind of error handling.
> + *
> + * Use memparse_safe() from lib/kstrtox.c instead.
> + */
> +extern __deprecated unsigned long long memparse(const char *ptr, char **retptr);
>  extern bool parse_option_str(const char *str, const char *option);
>  extern char *next_arg(char *args, char **param, char **val);
>  
> diff --git a/include/linux/kstrtox.h b/include/linux/kstrtox.h
> index 7fcf29a4e0de..53a1e059dd31 100644
> --- a/include/linux/kstrtox.h
> +++ b/include/linux/kstrtox.h
> @@ -9,6 +9,21 @@
>  int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
>  int __must_check _kstrtol(const char *s, unsigned int base, long *res);
>  
> +enum memparse_suffix {
> +	MEMPARSE_SUFFIX_K = 1 << 0,
> +	MEMPARSE_SUFFIX_M = 1 << 1,
> +	MEMPARSE_SUFFIX_G = 1 << 2,
> +	MEMPARSE_SUFFIX_T = 1 << 3,
> +	MEMPARSE_SUFFIX_P = 1 << 4,
> +	MEMPARSE_SUFFIX_E = 1 << 5,
> +};
> +
> +#define MEMPARSE_SUFFIXES_DEFAULT (MEMPARSE_SUFFIX_K | MEMPARSE_SUFFIX_M |\
> +				   MEMPARSE_SUFFIX_G | MEMPARSE_SUFFIX_T |\
> +				   MEMPARSE_SUFFIX_P)
> +
> +int __must_check memparse_safe(const char *s, enum memparse_suffix suffixes,
> +			       unsigned long long *res, char **retptr);
>  int __must_check kstrtoull(const char *s, unsigned int base, unsigned long long *res);
>  int __must_check kstrtoll(const char *s, unsigned int base, long long *res);
>  
> diff --git a/lib/cmdline.c b/lib/cmdline.c
> index 90ed997d9570..d379157de349 100644
> --- a/lib/cmdline.c
> +++ b/lib/cmdline.c
> @@ -139,12 +139,15 @@ char *get_options(const char *str, int nints, int *ints)
>  EXPORT_SYMBOL(get_options);
>  
>  /**
> - *	memparse - parse a string with mem suffixes into a number
> + *	memparse - DEPRECATED, parse a string with mem suffixes into a number
>   *	@ptr: Where parse begins
>   *	@retptr: (output) Optional pointer to next char after parse completes
>   *
> + *	There is no way to handle errors, and no overflown detection and string
> + *	sanity checks.
>   *	Parses a string into a number.  The number stored at @ptr is
>   *	potentially suffixed with K, M, G, T, P, E.
> + *
>   */
>  
>  unsigned long long memparse(const char *ptr, char **retptr)
> diff --git a/lib/kstrtox.c b/lib/kstrtox.c
> index 41c9a499bbf3..a1e4279f52b3 100644
> --- a/lib/kstrtox.c
> +++ b/lib/kstrtox.c
> @@ -113,6 +113,102 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
>  	return 0;
>  }
>  
> +/**
> + * memparse_safe - convert a string to an unsigned long long, safer version of
> + * memparse()
> + *
> + * @s:		The start of the string. Must be null-terminated.
> + *		The base would be determined automatically, if it starts with
> + *		"0x" the base would be 16, if it starts with "0" the base
> + *		would be 8, otherwise the base would be 10.
> + *		After a valid number string, there can be at most one
> + *		case-insensive suffix character, specified by the @suffixes
> + *		parameter.
> + *
> + * @suffixes:	The suffixes which should be parsed. Use logical ORed
> + *		memparse_suffix enum to indicate the supported suffixes.
> + *		The suffixes are case-insensive, all 2 ^ 10 based.
> + *		Supported ones are "KMGPTE".
> + *		NOTE: If one suffix out of the supported one is hit, it would
> + *		end the parse normally, with @retptr pointed to the unsupported
> + *		suffix.
> + *
> + * @res:	Where to write the result.
> + *
> + * @retptr:	(output) Optional pointer to the next char after parse completes.
> + *
> + * Return 0 if any valid numberic string can be parsed, and @retptr updated.
> + * Return -INVALID if no valid number string can be found.
> + * Return -ERANGE if the number overflows.
> + * For minus return values, @retptr would not be updated.
> + */
> +noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
> +			   unsigned long long *res, char **retptr)
> +{
> +	unsigned long long value;
> +	unsigned int rv;
> +	int shift = 0;
> +	int base = 0;
> +
> +	s = _parse_integer_fixup_radix(s, &base);
> +	rv = _parse_integer(s, base, &value);
> +	if (rv & KSTRTOX_OVERFLOW)
> +		return -ERANGE;
> +	if (rv == 0)
> +		return -EINVAL;
> +
> +	s += rv;
> +	switch (*s) {
> +	case 'K':
> +	case 'k':
> +		if (!(suffixes & MEMPARSE_SUFFIX_K))
> +			break;
> +		shift = 10;
> +		break;
> +	case 'M':
> +	case 'm':
> +		if (!(suffixes & MEMPARSE_SUFFIX_M))
> +			break;
> +		shift = 20;
> +		break;
> +	case 'G':
> +	case 'g':
> +		if (!(suffixes & MEMPARSE_SUFFIX_G))
> +			break;
> +		shift = 30;
> +		break;
> +	case 'T':
> +	case 't':
> +		if (!(suffixes & MEMPARSE_SUFFIX_T))
> +			break;
> +		shift = 40;
> +		break;
> +	case 'P':
> +	case 'p':
> +		if (!(suffixes & MEMPARSE_SUFFIX_P))
> +			break;
> +		shift = 50;
> +		break;
> +	case 'E':
> +	case 'e':
> +		if (!(suffixes & MEMPARSE_SUFFIX_E))
> +			break;
> +		shift = 60;
> +		break;
> +	}
> +	if (shift) {
> +		s++;
> +		if (value >> (64 - shift))
> +			return -ERANGE;
> +		value <<= shift;
> +	}
> +	*res = value;
> +	if (retptr)
> +		*retptr = (char *)s;
> +	return 0;
> +}
> +EXPORT_SYMBOL(memparse_safe);
> +
>  /**
>   * kstrtoull - convert a string to an unsigned long long
>   * @s: The start of the string. The string must be null-terminated, and may also

Still not a fan of the MEMPARSE_SUFFIXES_DEFAULT naming, but won't
bikeshed on that further. Looks good otherwise.
Reviewed-by: David Disseldorp <ddiss@suse.de>

  reply	other threads:[~2024-01-02 14:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-02  4:12 [PATCH v2 0/4] kstrtox: introduce memparse_safe() Qu Wenruo
2024-01-02  4:12 ` [PATCH v2 1/4] kstrtox: always skip the leading "0x" even if no more valid chars Qu Wenruo
2024-01-02 14:16   ` David Disseldorp
2024-01-02  4:12 ` [PATCH v2 2/4] kstrtox: introduce a safer version of memparse() Qu Wenruo
2024-01-02 14:19   ` David Disseldorp [this message]
2024-01-02  4:12 ` [PATCH v2 3/4] kstrtox: add unit tests for memparse_safe() Qu Wenruo
2024-01-02 13:23   ` Geert Uytterhoeven
2024-01-02 20:55     ` Qu Wenruo
2024-01-03  9:27       ` Geert Uytterhoeven
2024-01-03  9:45         ` Qu Wenruo
2024-01-02 14:17   ` David Disseldorp
2024-01-03 22:45     ` Qu Wenruo
2024-01-02  4:12 ` [PATCH v2 4/4] btrfs: migrate to the newer memparse_safe() helper Qu Wenruo
2024-01-02 14:24   ` David Disseldorp
2024-01-02 17:10 ` [PATCH v2 0/4] kstrtox: introduce memparse_safe() David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240103011908.3a25c567@echidna \
    --to=ddiss@suse.de \
    --cc=David.Laight@ACULAB.COM \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=christophe.jaillet@wanadoo.fr \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox