From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pj1-f49.google.com (mail-pj1-f49.google.com [209.85.216.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CFED12C324C for ; Tue, 2 Dec 2025 18:03:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.49 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764698634; cv=none; b=Ai2QLR88Am2yUlw0MHB6WovgZM8j0/uPc0y9OIexznzClwbdx563sXZjc8Q5sTnOIFQtjNHKgCLACRuZEqeRHNtFQgkdbic3pFtdKWR4CeN8g83M+Jy5yLuSOP5Jj+FuNrsU87BITi3MgKxuyF1T0xfVtb2YY2NabobR9SVGIzk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764698634; c=relaxed/simple; bh=kDvVu0dP1EqrfbA/cz1taWpIeoZsDWPzNFKw2DKRTbA=; h=Message-ID:Subject:From:To:Cc:Date:In-Reply-To:References: Content-Type:MIME-Version; b=sMXbMyI2BjFmE8Ead2xeKhZfzyCDgx1vuacJSjoCbIl3ULdKziLgsfibWXkphusBVAeH5WxvRZ4o4iuxy06aZFPR9hKrThumLsBpp9QuQaqRvisSIQ1TKIk1Nmef09L+fC83Bdcica24MVP7k/Qeb9USScZMzc4oOL1PhZw9Nq0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JuCrpCiQ; arc=none smtp.client-ip=209.85.216.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JuCrpCiQ" Received: by mail-pj1-f49.google.com with SMTP id 98e67ed59e1d1-343ff854297so7598842a91.1 for ; Tue, 02 Dec 2025 10:03:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1764698632; x=1765303432; darn=vger.kernel.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=LabsOKhQyjKD4tx01JVB9reuouUAx25N0i0FBFz7i7E=; b=JuCrpCiQk0bNv6jwOSe8gZkw53KcFlLqUnXaGJcSON6L+7VGJ8pKhJvMZjWO6hYCy1 RcFxpmGjekAki73c7oL1ifziiTvl7ufTxlan2Ulg4+8uAaYh2ezoMBOKcMCQsjQLz5xf mpNLPcN6wvvkMycwV4c1jgnD6ZTOebqq4E2zuu2sGSniThhGDaCpSuCA83T+0JyuBiXZ vmcVPozDisV7ak7wMdTKr8FIzHa0NhpuWLUXtl9lcERJLUE8e+bxVVf1lIeKZ5jXXKdJ F0l1UVqFanEglghnxdxzUCueF1CJYQuZAZcZp/aC89Dm+j1RrRcZomMUPoA08TXmyY5l Os/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764698632; x=1765303432; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=LabsOKhQyjKD4tx01JVB9reuouUAx25N0i0FBFz7i7E=; b=uTk1zdsoDRVxH2uXjPXeizMr71Hv0RDmyreAq3Wb085Dk6Gu/8ooCEUIezZ0Myl7rm Y5TkwRct5uNnAZQjB6wXBXKCY9+DiCvBT9CDStOTPIJWpbuYFb+aCD1thFRSiRfdhHYW z8Slidm4KfDXN1uEQthHWZfKdeSuHB+csPCyzwIHtIZjD90qW2oIK2Mh3v9B3Qyea5D2 6cwFUo+NBs06Ok3UZ6o42YaShWZovTU5U5ujwtGhp8RSfi4LNA1diibpsfI9UdK4h/gQ lxlSMgdcPVoEVmt6kF/fVg1eMn9SyEKSsTEStmtwLedpN9btQngzkE+WOyZwpr1HzaBg 6qww== X-Gm-Message-State: AOJu0Ywvw3wJ96eubQtw7qgwH0Om5kQwGW/kKB+8eJo7DMf1aTnbzAAB c+r2xE4iDJTnWxrlhMtQ8WTeGgX0LnAvU8iCE6kAxOyxpDR5Fto65UOp X-Gm-Gg: ASbGncsUEvYGKjCaGqkGpcw93Ms+OjNHk9s1cnt3bCJMGTEdLmlo01x84fF/6+EL3sn ncdoxEPiryQFokTCzoxvPG179efgADyzTQm2oi3Yy3W9aoCCvyUiRUNR9GixSIluMty3s36wL+b ndFed/2xnfj8rEFX0fuxPJmbiPplUKOoPJ2+pUK7r4IEKez/Ap6Lh5du9WECW7iY14hiBc5aBm8 rWcmCyD3g1xmEGgZMrUmkWIY5MNGPD8Z6T5u70pFBeU2H++7Vm2IlMAoiStA38ZqGSMjpRMwR+M 444PraIgZeLqRR5TxP+0uwhT3K4xOfj7sKfF5rAOlfGLwPdIXN2YY6pfNpNuSxbPLY+mC0nRI/C cOs32WHTX20ghytkx1TMzy1BoFTp9KJuJxW/Twp4h5g6Vl1RIHBWUq4vHqM32HaXvNguFTuV+CX cQxS3+9BLZns9/dMa1oyrWl2UooRY0m5SgsxKp X-Google-Smtp-Source: AGHT+IFApUoayHMZIMUKL5MCEmGhXyw+2vhvCQrfs3IWROSZD9OJbL36vBZmMufIHCFrgMZh7KYeSw== X-Received: by 2002:a17:90b:534f:b0:341:88ba:c6d3 with SMTP id 98e67ed59e1d1-34910825900mr276470a91.23.1764698631653; Tue, 02 Dec 2025 10:03:51 -0800 (PST) Received: from ?IPv6:2a03:83e0:115c:1:efc2:226e:efa5:2264? ([2620:10d:c090:500::5:e42b]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-34910ba58a9sm79899a91.8.2025.12.02.10.03.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Dec 2025 10:03:51 -0800 (PST) Message-ID: Subject: Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast From: Eduard Zingerman To: Dimitar Kanaliev Cc: bpf@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shung-Hsi Yu Date: Tue, 02 Dec 2025 10:03:48 -0800 In-Reply-To: References: <20251125125634.2671-1-dimitar.kanaliev@siteground.com> <20251125125634.2671-3-dimitar.kanaliev@siteground.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.56.2 (3.56.2-2.fc42) Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 On Tue, 2025-12-02 at 12:53 +0200, Dimitar Kanaliev wrote: > On Tue, Dec 2, 2025 at 1:50=E2=80=AFAM Eduard Zingerman wrote: > >=20 > > On Tue, 2025-11-25 at 14:56 +0200, Dimitar Kanaliev wrote: > >=20 > > [...] > >=20 > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > index 766695491bc5..c9a6bf85b4ad 100644 > > > --- a/kernel/bpf/verifier.c > > > +++ b/kernel/bpf/verifier.c > > > @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_re= g_state *reg, int size) > > > reg_bounds_sync(reg); > > > } > > >=20 > > > -static void set_sext64_default_val(struct bpf_reg_state *reg, int si= ze) > > > -{ > > > - if (size =3D=3D 1) { > > > - reg->smin_value =3D reg->s32_min_value =3D S8_MIN; > > > - reg->smax_value =3D reg->s32_max_value =3D S8_MAX; > > > - } else if (size =3D=3D 2) { > > > - reg->smin_value =3D reg->s32_min_value =3D S16_MIN; > > > - reg->smax_value =3D reg->s32_max_value =3D S16_MAX; > > > - } else { > > > - /* size =3D=3D 4 */ > > > - reg->smin_value =3D reg->s32_min_value =3D S32_MIN; > > > - reg->smax_value =3D reg->s32_max_value =3D S32_MAX; > > > - } > > > - reg->umin_value =3D reg->u32_min_value =3D 0; > > > - reg->umax_value =3D U64_MAX; > > > - reg->u32_max_value =3D U32_MAX; > > > - reg->var_off =3D tnum_unknown; > > > -} > > > - > > > static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int siz= e) > > > { > > > - s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval; > > > - u64 top_smax_value, top_smin_value; > > > - u64 num_bits =3D size * 8; > > > + s64 smin_value, smax_value; > > >=20 > > > - if (tnum_is_const(reg->var_off)) { > > > - u64_cval =3D reg->var_off.value; > > > - if (size =3D=3D 1) > > > - reg->var_off =3D tnum_const((s8)u64_cval); > > > - else if (size =3D=3D 2) > > > - reg->var_off =3D tnum_const((s16)u64_cval); > > > - else > > > - /* size =3D=3D 4 */ > > > - reg->var_off =3D tnum_const((s32)u64_cval); > > > - > > > - u64_cval =3D reg->var_off.value; > > > - reg->smax_value =3D reg->smin_value =3D u64_cval; > > > - reg->umax_value =3D reg->umin_value =3D u64_cval; > > > - reg->s32_max_value =3D reg->s32_min_value =3D u64_cval; > > > - reg->u32_max_value =3D reg->u32_min_value =3D u64_cval; > > > + if (size >=3D 8) > > > return; > > > - } > > >=20 > > > - top_smax_value =3D ((u64)reg->smax_value >> num_bits) << num_bi= ts; > > > - top_smin_value =3D ((u64)reg->smin_value >> num_bits) << num_bi= ts; > > > + reg->var_off =3D tnum_scast(reg->var_off, size); > > >=20 > > > - if (top_smax_value !=3D top_smin_value) > > > - goto out; > > > + smin_value =3D -(1LL << (size * 8 - 1)); > > > + smax_value =3D (1LL << (size * 8 - 1)) - 1; > > >=20 > > > - /* find the s64_min and s64_min after sign extension */ > > > - if (size =3D=3D 1) { > > > - init_s64_max =3D (s8)reg->smax_value; > > > - init_s64_min =3D (s8)reg->smin_value; > > > - } else if (size =3D=3D 2) { > > > - init_s64_max =3D (s16)reg->smax_value; > > > - init_s64_min =3D (s16)reg->smin_value; > > > - } else { > > > - init_s64_max =3D (s32)reg->smax_value; > > > - init_s64_min =3D (s32)reg->smin_value; > > > - } > > > - > > > - s64_max =3D max(init_s64_max, init_s64_min); > > > - s64_min =3D min(init_s64_max, init_s64_min); > > > + reg->smin_value =3D smin_value; > > > + reg->smax_value =3D smax_value; > > >=20 > > > - /* both of s64_max/s64_min positive or negative */ > > > - if ((s64_max >=3D 0) =3D=3D (s64_min >=3D 0)) { > > > - reg->s32_min_value =3D reg->smin_value =3D s64_min; > > > - reg->s32_max_value =3D reg->smax_value =3D s64_max; > > > - reg->u32_min_value =3D reg->umin_value =3D s64_min; > > > - reg->u32_max_value =3D reg->umax_value =3D s64_max; > > > - reg->var_off =3D tnum_range(s64_min, s64_max); > > > - return; > > > - } > > > + reg->s32_min_value =3D (s32)smin_value; > > > + reg->s32_max_value =3D (s32)smax_value; > > >=20 > > > -out: > > > - set_sext64_default_val(reg, size); > > > -} > >=20 > > Assume that size =3D=3D 1, s64_min =3D 0b000, s64_max =3D=3D 0b100. > > This corresponds to tnum with value =3D=3D 0b000 and mask =3D=3D 0b111. > > Old algorithm computes more precise range in this situation. > > Old: > >=20 > > 0: (85) call bpf_get_prandom_u32#7 ; R0=3Dscalar() > > 1: (25) if r0 > 0x4 goto pc+2 ; R0=3Dscalar(smin=3Dsmin32=3D0= ,smax=3Dumax=3Dsmax32=3Dumax32=3D4,var_off=3D(0x0; 0x7)) > > 2: (7b) *(u64 *)(r10 -8) =3D r0 ; R0=3Dscalar(id=3D1,smin=3Ds= min32=3D0,smax=3Dumax=3Dsmax32=3Dumax32=3D4,var_off=3D(0x0; 0x7)) ... > > 3: (91) r0 =3D *(s8 *)(r10 -8) ; R0=3Dscalar(id=3D1,smin=3Ds= min32=3D0,smax=3Dumax=3Dsmax32=3Dumax32=3D4,var_off=3D(0x0; 0x7)) ... > > 4: (b7) r0 =3D 0 ; R0=3D0 > > 5: (95) exit > >=20 > > New: > >=20 > > 0: (85) call bpf_get_prandom_u32#7 ; R0=3Dscalar() > > 1: (25) if r0 > 0x4 goto pc+2 ; R0=3Dscalar(smin=3Dsmin32=3D0= ,smax=3Dumax=3Dsmax32=3Dumax32=3D4,var_off=3D(0x0; 0x7)) > > 2: (7b) *(u64 *)(r10 -8) =3D r0 ; R0=3Dscalar(id=3D1,smin=3Ds= min32=3D0,smax=3Dumax=3Dsmax32=3Dumax32=3D4,var_off=3D(0x0; 0x7)) ... > > 3: (91) r0 =3D *(s8 *)(r10 -8) ; R0=3Dscalar(id=3D1,smin=3Ds= min32=3D0,smax=3Dumax=3Dsmax32=3Dumax32=3D7,var_off=3D(0x0; 0x7)) ... > > 4: (b7) r0 =3D 0 ; R0=3D0 > > 5: (95) exit > >=20 > > Note that range for R0 at (3) is 0..4 for old algorithm and 0..7 for > > new algorithm. > >=20 > > Can we keep both algorithms by e.g. replacing set_sext64_default_val() > > implementation with tnum_scast() adding tnum_scast() in > > coerce_reg_to_size_sx()? > >=20 > > In general, for such kinds of patch-sets it is interesting to see how > > much precision is gained/lost with the change. It shouldn't be hard to > > collect such data for e.g. complete s8 range by writing a small > > user-space program that enumerates the s8 x s8 range and applies both > > old an new range computations. > >=20 > > [...] >=20 > I was mostly focused on preserving the info from tnum for sparse ranges, = so > I kind of forgot about continuous ranges entirely. > As per your suggestion, I plucked out anything relevant from the kernel, > and compared the smax / smin values for the entire -1024,1024 range in a > loop, like so: >=20 > struct bpf_reg_state r_old, r_new; >=20 > init_reg(&r_old, start, end); > r_new =3D r_old; >=20 > coerce_reg_to_size_sx_old(&r_old, size); > coerce_reg_to_size_sx_new(&r_new, size); >=20 > s64 range_old =3D r_old.smax_value - r_old.smin_value; > s64 range_new =3D r_new.smax_value - r_new.smin_value; >=20 > if (range_old < range_new) { ... >=20 > In these continous ranges, the old implementation is much better: >=20 > [-1024, 1024]: > Old Better: 128016 > New Better: 0 > Equal: 1972209 >=20 > So I endeed up drafting this: >=20 > static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size) > { > s64 smin_value, smax_value; > u64 num_bits =3D size * 8; > u64 top_smax_value, top_smin_value; >=20 > reg->var_off =3D tnum_scast(reg->var_off, size); >=20 > top_smax_value =3D ((u64)reg->smax_value >> num_bits) << num_bits; > top_smin_value =3D ((u64)reg->smin_value >> num_bits) << num_bits; >=20 > if (top_smax_value =3D=3D top_smin_value) { > if (size =3D=3D 1) { > smin_value =3D (s8)reg->smin_value; > smax_value =3D (s8)reg->smax_value; > } else if (size =3D=3D 2) { > smin_value =3D (s16)reg->smin_value; > smax_value =3D (s16)reg->smax_value; > } else { > smin_value =3D (s32)reg->smin_value; > smax_value =3D (s32)reg->smax_value; > } > } else { > smin_value =3D -(1LL << (num_bits - 1)); > smax_value =3D (1LL << (num_bits - 1)) - 1; > } The current implementation has the following part: s64_max =3D max(init_s64_max, init_s64_min); s64_min =3D min(init_s64_max, init_s64_min); And I think this part is necessary. E.g. consider that smin_value=3D=3D0x00 and smax_value=3D0x80, then with suggested implementation: clang-repl> #include clang-repl> printf("%ld\n", (long)(char)(0x80UL)); -128 Sign of the smin_value will be positive, while sign of the smax_value will become negative. When respining for v2, could you please also provide a link to a repository with test harness you use to check range [-1024,1024]? (E.g. push it to the github). >=20 > reg->smin_value =3D smin_value; > reg->smax_value =3D smax_value; >=20 > reg->umin_value =3D 0; > reg->umax_value =3D U64_MAX; >=20 > reg->s32_min_value =3D (s32)smin_value; > reg->s32_max_value =3D (s32)smax_value; > reg->u32_min_value =3D 0; > reg->u32_max_value =3D U32_MAX; >=20 > __update_reg_bounds(reg); > } >=20 > I'm trying to always perform tnum_scast in order to preserve bitwise > info, but attempt to use the old numeric logic first. If the range fits > into the target size, we preserve the existing numeric bounds. If not, we > fall back to the type limits and let __update_reg_bounds reconstruct the > range from var_off. The imeplementation is similar for the subreg variant= . >=20 > Rerunning the comparison for the same range looks much better, we should = be > consistently seeing precision gains in the cases where the original > implementation bails out via goto: >=20 > [-1024, 1024]: > Old Better: 0 > New Better: 131072 > Equal: 1969153 >=20 > I also went through the CI, the existing selftest in the series still > covers the change. >=20 > wdyt?