Date: Mon, 15 Jun 2026 09:43:39 +0100
From: David Laight
To: Lasse Collin
Cc: Andrew Morton, linux-kernel@vger.kernel.org, Nathan Chancellor, Thorsten Blum
Subject: Re: [PATCH 1/2] lib/xz: Use size_t instead of uint32_t in a few places
Message-ID: <20260615094339.4e2a1dcb@pumpkin>
In-Reply-To: <20260614160521.924710-1-lasse.collin@tukaani.org>
References: <20260614160521.924710-1-lasse.collin@tukaani.org>

On Sun, 14 Jun 2026 19:05:17 +0300
Lasse Collin wrote:

> Reduce the number of uint32_t <-> size_t conversions a little. Eliminating
> such conversions entirely would require changing almost all uint32_t to
> size_t, which would look confusing and increase the sizes of the structs
> even more. Going the other way, converting everything to uint32_t, isn't
> possible because the input and output buffers use size_t in struct xz_buf.
>
> Now both arguments to min() have the same type. This is required for
> compatibility with PowerPC boot code[1] whose min() is strict like
> include/linux/minmax.h was before the commit d03eba99f5bf ("minmax: allow
> min()/max()/clamp() if the arguments have the same signedness.").
>
> Swap the order of the "state" and "len" in struct lzma_dec to avoid
> padding in the middle of the struct when size_t is 64 bits. The
> reordering doesn't change the size of the struct; the padding just
> appears at the end instead.
>
> dict_flush() used to truncate size_t to uint32_t when returning.
> This wasn't a bug; the value is always small enough.
>
> Reported-by: Nathan Chancellor
> Closes: https://lore.kernel.org/lkml/20260610232323.GA1071374@ax162/ [1]
> Cc: Thorsten Blum
> Cc: David Laight
> Signed-off-by: Lasse Collin
> ---
>  lib/xz/xz_dec_lzma2.c | 40 ++++++++++++++++++++--------------------
>  lib/xz/xz_lzma2.h     |  2 +-
>  2 files changed, 21 insertions(+), 21 deletions(-)
>
> diff --git a/lib/xz/xz_dec_lzma2.c b/lib/xz/xz_dec_lzma2.c
> index 9d80342b9c6b..68ae9c33b6a8 100644
> --- a/lib/xz/xz_dec_lzma2.c
> +++ b/lib/xz/xz_dec_lzma2.c
> @@ -135,14 +135,16 @@ struct lzma_dec {
>  	uint32_t rep2;
>  	uint32_t rep3;
>
> -	/* Types of the most recently seen LZMA symbols */
> -	enum lzma_state state;
> -
>  	/*
>  	 * Length of a match. This is updated so that dict_repeat can
> -	 * be called again to finish repeating the whole match.
> +	 * be called again to finish repeating the whole match. This is
> +	 * size_t because a pointer to this is passed to dict_repeat,
> +	 * and there it's nicer to have size_t instead of uint32_t.
>  	 */
> -	uint32_t len;
> +	size_t len;
> +
> +	/* Types of the most recently seen LZMA symbols */
> +	enum lzma_state state;
>
>  	/*
>  	 * LZMA properties or related bit masks (number of literal
> @@ -228,13 +230,13 @@ struct lzma2_dec {
>  	enum lzma2_seq next_sequence;
>
>  	/* Uncompressed size of LZMA chunk (2 MiB at maximum) */
> -	uint32_t uncompressed;
> +	size_t uncompressed;
>
>  	/*
>  	 * Compressed size of LZMA chunk or compressed/uncompressed
>  	 * size of uncompressed chunk (64 KiB at maximum)
>  	 */
> -	uint32_t compressed;
> +	size_t compressed;
>
>  	/*
>  	 * True if dictionary reset is needed. This is false before
> @@ -273,7 +275,7 @@ struct xz_dec_lzma2 {
>  	 * decoder calls. See lzma2_lzma() for details.
>  	 */
>  	struct {
> -		uint32_t size;
> +		size_t size;
>  		uint8_t buf[3 * LZMA_IN_REQUIRED];
>  	} temp;
>  };
> @@ -320,7 +322,7 @@ static inline bool dict_has_space(const struct dictionary *dict)
>   * still empty. This special case is needed for single-call decoding to
>   * avoid writing a '\0' to the end of the destination buffer.
>   */
> -static inline uint32_t dict_get(const struct dictionary *dict, uint32_t dist)
> +static inline uint32_t dict_get(const struct dictionary *dict, size_t dist)
>  {
>  	size_t offset = dict->pos - dist - 1;
>
> @@ -346,10 +348,10 @@ static inline void dict_put(struct dictionary *dict, uint8_t byte)
>   * invalid, false is returned. On success, true is returned and *len is
>   * updated to indicate how many bytes were left to be repeated.
>   */
> -static bool dict_repeat(struct dictionary *dict, uint32_t *len, uint32_t dist)
> +static bool dict_repeat(struct dictionary *dict, size_t *len, size_t dist)
>  {
>  	size_t back;
> -	uint32_t left;
> +	size_t left;
>
>  	if (dist >= dict->full || dist >= dict->size)
>  		return false;
> @@ -375,7 +377,7 @@ static bool dict_repeat(struct dictionary *dict, uint32_t *len, uint32_t dist)
>
>  /* Copy uncompressed data as is from input to dictionary and output buffers. */
>  static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
> -			      uint32_t *left)
> +			      size_t *left)
>  {
>  	size_t copy_size;
>
> @@ -433,7 +435,7 @@ static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
>   * enough space in b->out. This is guaranteed because caller uses dict_limit()
>   * before decoding data into the dictionary.
>   */
> -static uint32_t dict_flush(struct dictionary *dict, struct xz_buf *b)
> +static size_t dict_flush(struct dictionary *dict, struct xz_buf *b)
>  {
>  	size_t copy_size = dict->pos - dict->start;
>
> @@ -878,7 +880,7 @@ static bool lzma_props(struct xz_dec_lzma2 *s, uint8_t props)
>  static bool lzma2_lzma(struct xz_dec_lzma2 *s, struct xz_buf *b)
>  {
>  	size_t in_avail;
> -	uint32_t tmp;
> +	size_t tmp;
>
>  	in_avail = b->in_size - b->in_pos;
>  	if (s->temp.size > 0 || s->lzma2.compressed == 0) {
> @@ -1046,25 +1048,23 @@ enum xz_ret xz_dec_lzma2_run(struct xz_dec_lzma2 *s, struct xz_buf *b)
>
>  		case SEQ_UNCOMPRESSED_1:
>  			s->lzma2.uncompressed
> -				+= (uint32_t)b->in[b->in_pos++] << 8;
> +				+= (size_t)b->in[b->in_pos++] << 8;

The type of those casts can't matter.
Indeed, all of these casts can be deleted.
(The value read from the uint8_t variable is promoted to int before it is used.)

David

>  			s->lzma2.sequence = SEQ_UNCOMPRESSED_2;
>  			break;
>
>  		case SEQ_UNCOMPRESSED_2:
>  			s->lzma2.uncompressed
> -				+= (uint32_t)b->in[b->in_pos++] + 1;
> +				+= (size_t)b->in[b->in_pos++] + 1;
>  			s->lzma2.sequence = SEQ_COMPRESSED_0;
>  			break;
>
>  		case SEQ_COMPRESSED_0:
> -			s->lzma2.compressed
> -				= (uint32_t)b->in[b->in_pos++] << 8;
> +			s->lzma2.compressed = (size_t)b->in[b->in_pos++] << 8;
>  			s->lzma2.sequence = SEQ_COMPRESSED_1;
>  			break;
>
>  		case SEQ_COMPRESSED_1:
> -			s->lzma2.compressed
> -				+= (uint32_t)b->in[b->in_pos++] + 1;
> +			s->lzma2.compressed += (size_t)b->in[b->in_pos++] + 1;
>  			s->lzma2.sequence = s->lzma2.next_sequence;
>  			break;
>
> diff --git a/lib/xz/xz_lzma2.h b/lib/xz/xz_lzma2.h
> index d2632b7dfb9c..a612ce4fd450 100644
> --- a/lib/xz/xz_lzma2.h
> +++ b/lib/xz/xz_lzma2.h
> @@ -143,7 +143,7 @@ static inline bool lzma_state_is_literal(enum lzma_state state)
>   * Get the index of the appropriate probability array for decoding
>   * the distance slot.
>   */
> -static inline uint32_t lzma_get_dist_state(uint32_t len)
> +static inline size_t lzma_get_dist_state(size_t len)
>  {
>  	return len < DIST_STATES + MATCH_LEN_MIN
>  			? len - MATCH_LEN_MIN : DIST_STATES - 1;
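
For illustration, a minimal sketch of why the casts discussed in the
SEQ_UNCOMPRESSED_* / SEQ_COMPRESSED_* hunks make no difference. It assumes
that int can hold 0xff << 8 (checked by the static_assert; true wherever int
is 32 bits). The helper names are hypothetical and are not part of lib/xz;
they only mirror the two-byte size computation from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: int is wide enough that (byte << 8) cannot overflow. */
static_assert((INT_MAX >> 8) >= UINT8_MAX, "int too narrow for byte << 8");

/* Hypothetical helper: two-byte size field, with the casts from the patch. */
static size_t chunk_size_with_casts(const uint8_t *in)
{
	size_t size;

	size = (size_t)in[0] << 8;
	size += (size_t)in[1] + 1;
	return size;
}

/*
 * Same computation with the casts deleted. in[n] is promoted to int,
 * the arithmetic is done in int, and the non-negative result is
 * converted to size_t by the assignment.
 */
static size_t chunk_size_without_casts(const uint8_t *in)
{
	size_t size;

	size = in[0] << 8;
	size += in[1] + 1;
	return size;
}

/* Count the byte pairs for which the two variants disagree. */
static unsigned int count_mismatches(void)
{
	unsigned int hi, lo, mismatches = 0;

	for (hi = 0; hi <= UINT8_MAX; hi++) {
		for (lo = 0; lo <= UINT8_MAX; lo++) {
			const uint8_t in[2] = { (uint8_t)hi, (uint8_t)lo };

			if (chunk_size_with_casts(in) !=
			    chunk_size_without_casts(in))
				mismatches++;
		}
	}
	return mismatches;
}
```

Under the stated assumption count_mismatches() should return 0: the largest
intermediate value is 0xff00, which fits in an int, so casting the byte
before the shift or the addition changes nothing.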