From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 15 Jun 2026 09:43:39 +0100
From: David Laight <david.laight.linux@gmail.com>
To: Lasse Collin <lasse.collin@tukaani.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
 Nathan Chancellor <nathan@kernel.org>, Thorsten Blum
 <thorsten.blum@linux.dev>
Subject: Re: [PATCH 1/2] lib/xz: Use size_t instead of uint32_t in a few
 places
Message-ID: <20260615094339.4e2a1dcb@pumpkin>
In-Reply-To: <20260614160521.924710-1-lasse.collin@tukaani.org>
References: <20260614160521.924710-1-lasse.collin@tukaani.org>
X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; arm-unknown-linux-gnueabihf)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Sun, 14 Jun 2026 19:05:17 +0300
Lasse Collin <lasse.collin@tukaani.org> wrote:

> Reduce the number of uint32_t <-> size_t conversions a little. Eliminating
> such conversions entirely would require changing almost all uint32_t to
> size_t, which would look confusing and increase the sizes of the structs
> even more. Going the other way, converting everything to uint32_t, isn't
> possible because the input and output buffers use size_t in struct xz_buf.
> 
> Now both arguments to min() have the same type. This is required for
> compatibility with PowerPC boot code[1] whose min() is strict like
> include/linux/minmax.h was before the commit d03eba99f5bf ("minmax: allow
> min()/max()/clamp() if the arguments have the same signedness.").
> 
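
For anyone who hasn't met the strict form, here is a minimal sketch of
that style of min(). It assumes GNU C (statement expressions and
__typeof__); strict_min and copy_len are made-up names for illustration
and this isn't copied from the boot code or from minmax.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical strict min(). The pointer comparison has no runtime
 * effect; it exists only so the compiler diagnoses operands whose
 * types differ.
 */
#define strict_min(x, y) ({			\
	__typeof__(x) _x = (x);			\
	__typeof__(y) _y = (y);			\
	(void)(&_x == &_y);			\
	_x < _y ? _x : _y; })

/* Both operands are size_t, so the type check stays silent. */
static size_t copy_len(size_t avail, size_t want)
{
	size_t n = strict_min(avail, want);

	/* The result can never exceed either operand. */
	assert(n <= avail && n <= want);
	return n;
}

/*
 * Where size_t and uint32_t are different types (e.g. 64-bit builds),
 * the following would get a "comparison of distinct pointer types"
 * diagnostic:
 *
 *	uint32_t want32 = 10;
 *	strict_min(avail, want32);
 */
```

So copy_len(5, 10) returns 5 without any complaint, while the mixed
size_t/uint32_t call in the comment is what such a strict min() rejects.
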
> Swap the order of the "state" and "len" in struct lzma_dec to avoid
> padding in the middle of the struct when size_t is 64 bits. The
> reordering doesn't change the size of the struct; the padding just
> appears at the end instead.
> 
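
The padding move can be sketched with two cut-down structs. These are
hypothetical stand-ins, not the real struct lzma_dec, and the comments
assume a typical 64-bit ABI with 4-byte enums and 8-byte aligned size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum lzma_state. */
enum sketch_state { SKETCH_LIT, SKETCH_MATCH };

struct old_order {
	uint32_t rep2;
	uint32_t rep3;
	enum sketch_state state;	/* followed by a 4-byte hole on LP64 */
	size_t len;
	uint32_t lc;
	uint32_t literal_pos_mask;
};

struct new_order {
	uint32_t rep2;
	uint32_t rep3;
	size_t len;			/* naturally aligned, no hole before it */
	enum sketch_state state;
	uint32_t lc;
	uint32_t literal_pos_mask;	/* tail padding follows on LP64 */
};

/* Bytes of padding between "state" and "len" in the old order. */
static size_t old_hole(void)
{
	return offsetof(struct old_order, len)
		- (offsetof(struct old_order, state)
		   + sizeof(enum sketch_state));
}

/* Bytes of padding between "rep3" and "len" in the new order. */
static size_t new_hole(void)
{
	return offsetof(struct new_order, len)
		- (offsetof(struct new_order, rep3) + sizeof(uint32_t));
}

/* The reordering moves the padding; it does not shrink the struct. */
static_assert(sizeof(struct old_order) == sizeof(struct new_order),
	      "reordering must not change the size");
```

With 64-bit size_t, old_hole() is 4 and new_hole() is 0; with 32-bit
size_t neither layout has any padding at all.
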
> dict_flush() used to truncate size_t to uint32_t when returning.
> This wasn't a bug; the value is always small enough.
> 
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Closes: https://lore.kernel.org/lkml/20260610232323.GA1071374@ax162/ [1]
> Cc: Thorsten Blum <thorsten.blum@linux.dev>
> Cc: David Laight <david.laight.linux@gmail.com>
> Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
> ---
>  lib/xz/xz_dec_lzma2.c | 40 ++++++++++++++++++++--------------------
>  lib/xz/xz_lzma2.h     |  2 +-
>  2 files changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/xz/xz_dec_lzma2.c b/lib/xz/xz_dec_lzma2.c
> index 9d80342b9c6b..68ae9c33b6a8 100644
> --- a/lib/xz/xz_dec_lzma2.c
> +++ b/lib/xz/xz_dec_lzma2.c
> @@ -135,14 +135,16 @@ struct lzma_dec {
>  	uint32_t rep2;
>  	uint32_t rep3;
>  
> -	/* Types of the most recently seen LZMA symbols */
> -	enum lzma_state state;
> -
>  	/*
>  	 * Length of a match. This is updated so that dict_repeat can
> -	 * be called again to finish repeating the whole match.
> +	 * be called again to finish repeating the whole match. This is
> +	 * size_t because a pointer to this is passed to dict_repeat,
> +	 * and there it's nicer to have size_t instead of uint32_t.
>  	 */
> -	uint32_t len;
> +	size_t len;
> +
> +	/* Types of the most recently seen LZMA symbols */
> +	enum lzma_state state;
>  
>  	/*
>  	 * LZMA properties or related bit masks (number of literal
> @@ -228,13 +230,13 @@ struct lzma2_dec {
>  	enum lzma2_seq next_sequence;
>  
>  	/* Uncompressed size of LZMA chunk (2 MiB at maximum) */
> -	uint32_t uncompressed;
> +	size_t uncompressed;
>  
>  	/*
>  	 * Compressed size of LZMA chunk or compressed/uncompressed
>  	 * size of uncompressed chunk (64 KiB at maximum)
>  	 */
> -	uint32_t compressed;
> +	size_t compressed;
>  
>  	/*
>  	 * True if dictionary reset is needed. This is false before
> @@ -273,7 +275,7 @@ struct xz_dec_lzma2 {
>  	 * decoder calls. See lzma2_lzma() for details.
>  	 */
>  	struct {
> -		uint32_t size;
> +		size_t size;
>  		uint8_t buf[3 * LZMA_IN_REQUIRED];
>  	} temp;
>  };
> @@ -320,7 +322,7 @@ static inline bool dict_has_space(const struct dictionary *dict)
>   * still empty. This special case is needed for single-call decoding to
>   * avoid writing a '\0' to the end of the destination buffer.
>   */
> -static inline uint32_t dict_get(const struct dictionary *dict, uint32_t dist)
> +static inline uint32_t dict_get(const struct dictionary *dict, size_t dist)
>  {
>  	size_t offset = dict->pos - dist - 1;
>  
> @@ -346,10 +348,10 @@ static inline void dict_put(struct dictionary *dict, uint8_t byte)
>   * invalid, false is returned. On success, true is returned and *len is
>   * updated to indicate how many bytes were left to be repeated.
>   */
> -static bool dict_repeat(struct dictionary *dict, uint32_t *len, uint32_t dist)
> +static bool dict_repeat(struct dictionary *dict, size_t *len, size_t dist)
>  {
>  	size_t back;
> -	uint32_t left;
> +	size_t left;
>  
>  	if (dist >= dict->full || dist >= dict->size)
>  		return false;
> @@ -375,7 +377,7 @@ static bool dict_repeat(struct dictionary *dict, uint32_t *len, uint32_t dist)
>  
>  /* Copy uncompressed data as is from input to dictionary and output buffers. */
>  static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
> -			      uint32_t *left)
> +			      size_t *left)
>  {
>  	size_t copy_size;
>  
> @@ -433,7 +435,7 @@ static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
>   * enough space in b->out. This is guaranteed because caller uses dict_limit()
>   * before decoding data into the dictionary.
>   */
> -static uint32_t dict_flush(struct dictionary *dict, struct xz_buf *b)
> +static size_t dict_flush(struct dictionary *dict, struct xz_buf *b)
>  {
>  	size_t copy_size = dict->pos - dict->start;
>  
> @@ -878,7 +880,7 @@ static bool lzma_props(struct xz_dec_lzma2 *s, uint8_t props)
>  static bool lzma2_lzma(struct xz_dec_lzma2 *s, struct xz_buf *b)
>  {
>  	size_t in_avail;
> -	uint32_t tmp;
> +	size_t tmp;
>  
>  	in_avail = b->in_size - b->in_pos;
>  	if (s->temp.size > 0 || s->lzma2.compressed == 0) {
> @@ -1046,25 +1048,23 @@ enum xz_ret xz_dec_lzma2_run(struct xz_dec_lzma2 *s, struct xz_buf *b)
>  
>  		case SEQ_UNCOMPRESSED_1:
>  			s->lzma2.uncompressed
> -					+= (uint32_t)b->in[b->in_pos++] << 8;
> +					+= (size_t)b->in[b->in_pos++] << 8;

The type of that cast can't matter.
Indeed all of these casts can be deleted.
(The value read from the uint8_t variable is promoted to int before
it is shifted or added to, and the largest result, 0xff << 8, fits
in an int.)
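
A quick sketch of why, assuming int is at least 32 bits (as it is on
every kernel target). The helper names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape used in the patch: cast the byte first, then shift or add. */
static size_t shift_with_cast(uint8_t byte)
{
	return (size_t)byte << 8;
}

static size_t plus_one_with_cast(uint8_t byte)
{
	return (size_t)byte + 1;
}

/*
 * No cast: byte is promoted to int, the arithmetic is done in int,
 * and the non-negative result is converted to size_t on return.
 */
static size_t shift_without_cast(uint8_t byte)
{
	return byte << 8;
}

static size_t plus_one_without_cast(uint8_t byte)
{
	return byte + 1;
}

/* Returns 1 if both forms agree for every possible byte value. */
static int casts_are_redundant(void)
{
	unsigned int v;

	for (v = 0; v <= UINT8_MAX; v++) {
		uint8_t byte = (uint8_t)v;

		if (shift_with_cast(byte) != shift_without_cast(byte))
			return 0;
		if (plus_one_with_cast(byte) != plus_one_without_cast(byte))
			return 0;
	}
	return 1;
}
```

casts_are_redundant() returns 1: the largest values involved are 0xff00
and 0x100, nowhere near the limits of int.
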

	David

>  			s->lzma2.sequence = SEQ_UNCOMPRESSED_2;
>  			break;
>  
>  		case SEQ_UNCOMPRESSED_2:
>  			s->lzma2.uncompressed
> -					+= (uint32_t)b->in[b->in_pos++] + 1;
> +					+= (size_t)b->in[b->in_pos++] + 1;
>  			s->lzma2.sequence = SEQ_COMPRESSED_0;
>  			break;
>  
>  		case SEQ_COMPRESSED_0:
> -			s->lzma2.compressed
> -					= (uint32_t)b->in[b->in_pos++] << 8;
> +			s->lzma2.compressed = (size_t)b->in[b->in_pos++] << 8;
>  			s->lzma2.sequence = SEQ_COMPRESSED_1;
>  			break;
>  
>  		case SEQ_COMPRESSED_1:
> -			s->lzma2.compressed
> -					+= (uint32_t)b->in[b->in_pos++] + 1;
> +			s->lzma2.compressed += (size_t)b->in[b->in_pos++] + 1;
>  			s->lzma2.sequence = s->lzma2.next_sequence;
>  			break;
>  
> diff --git a/lib/xz/xz_lzma2.h b/lib/xz/xz_lzma2.h
> index d2632b7dfb9c..a612ce4fd450 100644
> --- a/lib/xz/xz_lzma2.h
> +++ b/lib/xz/xz_lzma2.h
> @@ -143,7 +143,7 @@ static inline bool lzma_state_is_literal(enum lzma_state state)
>   * Get the index of the appropriate probability array for decoding
>   * the distance slot.
>   */
> -static inline uint32_t lzma_get_dist_state(uint32_t len)
> +static inline size_t lzma_get_dist_state(size_t len)
>  {
>  	return len < DIST_STATES + MATCH_LEN_MIN
>  			? len - MATCH_LEN_MIN : DIST_STATES - 1;