From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0ABD91F189
	for <bpf@vger.kernel.org>; Mon, 12 Jun 2023 17:40:16 +0000 (UTC)
Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ABC29118
	for <bpf@vger.kernel.org>; Mon, 12 Jun 2023 10:40:14 -0700 (PDT)
Received: by mail-ej1-x634.google.com with SMTP id a640c23a62f3a-977d6aa3758so839208966b.0
        for <bpf@vger.kernel.org>; Mon, 12 Jun 2023 10:40:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1686591613; x=1689183613;
        h=in-reply-to:content-disposition:mime-version:references:message-id
         :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to;
        bh=UTDx4jf602cbOKdXqkoDk4cQOeQcYfGazQMP/vfhTeE=;
        b=YGocuszVbmhqFyb2D09Qb6DLNW2FemHVtAoXQ48a9W45f1P3Uid8RkJEl9EyWd7dyh
         Q3B7KbHlYZRcAMlWrYylmthcTtj+a2PZedDovbnK5JFnEulbkoIyQ+rDXsywvvWu8ghL
         Q0RVl2b2yoIRtlpW+EdAmd8mH7UwbCouDya/FAe3CvhcAosezAGYkrGRNrR7cmEl/RzT
         LPE+li1xQe2wv+9DeQ9IeB1o6fUzeEQZd7B4leHZnNYNzt+lS5YhPRMQhIlaaDIkWFfS
         URB0TMhgQhquAQ5XZeIZdm8APM3tNdcWGEcbelyBv111GNE4HHV8qfCWBbhscGtYSvsf
         TaPg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1686591613; x=1689183613;
        h=in-reply-to:content-disposition:mime-version:references:message-id
         :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date
         :message-id:reply-to;
        bh=UTDx4jf602cbOKdXqkoDk4cQOeQcYfGazQMP/vfhTeE=;
        b=WvzKEmqgwyi2Qrg/HQZWVm5dPm1iBNR3EQSwuzegdpLQT3x0aHnu1l/QTA5j/6H6FD
         ZcNMmNkZ8IelEAbLzoZZaTekMiqUpWnXBHbfoebhnnqMvnvhi3Q31hF2mXyK9KPjjgXf
         Sszl7R3g9MOxbQRe2DAXdkv7C8lM3nNrRmq2eGxWaAccRKPuV9HCuvqhL4vUBKB8hH5x
         KRkEefhwoJ2gKPnty05k02MHxCicfZh7IO2DNTB56Kyb5BhYrgqBjxB9blfFFcznpWiz
         cdURXnZUJXH+hTz4gqIrXnM6Sjdp2is8damaiQkxmlZlXvha3JWR7rV2cP0G8WNrbKFD
         KHiw==
X-Gm-Message-State: AC+VfDyE0IvzV5yaAlLP3ATBppTh/9zmsgL6hV4FgVzNzT/QnzVAhyok
	v90pO5Uvp51YUZ2Ms351c8U=
X-Google-Smtp-Source: ACHHUZ63u3fPOOs1CbjQloqP4YB/6WNXZTTZ13aGk8lsDjkHhMz/f4GIz9X1+hbad0Bv0a3ZUZOpkQ==
X-Received: by 2002:a17:906:730b:b0:973:bcf6:1d4 with SMTP id di11-20020a170906730b00b00973bcf601d4mr12071877ejc.76.1686591612803;
        Mon, 12 Jun 2023 10:40:12 -0700 (PDT)
Received: from localhost ([179.43.159.200])
        by smtp.gmail.com with ESMTPSA id lo11-20020a170906fa0b00b009784915c660sm5533062ejb.136.2023.06.12.10.40.11
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 12 Jun 2023 10:40:12 -0700 (PDT)
Date: Mon, 12 Jun 2023 20:40:10 +0300
From: Maxim Mikityanskiy <maxtram95@gmail.com>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev, kernel-team@fb.com,
	yhs@fb.com
Subject: Re: [PATCH bpf-next v5 4/4] selftests/bpf: verify that check_ids()
 is used for scalars in regsafe()
Message-ID: <ZIdYepniUlHlmtvO@mail.gmail.com>
References: <20230612160801.2804666-1-eddyz87@gmail.com>
 <20230612160801.2804666-5-eddyz87@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id: <bpf.vger.kernel.org>
List-Subscribe: <mailto:bpf+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:bpf+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20230612160801.2804666-5-eddyz87@gmail.com>
X-Spam-Status: No, score=1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_ENVFROM_END_DIGIT,
	FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,
	SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no
	version=3.4.6
X-Spam-Level: *
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net

On Mon, 12 Jun 2023 at 19:08:01 +0300, Eduard Zingerman wrote:
> Verify that the following example is rejected by verifier:
> 
>   r9 = ... some pointer with range X ...
>   r6 = ... unbound scalar ID=a ...
>   r7 = ... unbound scalar ID=b ...
>   if (r6 > r7) goto +1
>   r7 = r6
>   if (r7 > X) goto exit
>   r9 += r6
>   *(u64 *)r9 = Y
> 
> Also add test cases to:
> - check that check_alu_op() for BPF_MOV instruction does not allocate
>   scalar ID if source register is a constant;
> - check that unique scalar IDs are ignored when new verifier state is
>   compared to cached verifier state;
> - check that two different scalar IDs in a verified state can't be
>   mapped to the same scalar ID in current state.
> 
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  .../selftests/bpf/progs/verifier_scalar_ids.c | 313 ++++++++++++++++++
>  1 file changed, 313 insertions(+)
> 
> diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
> index 8a5203fb14ca..5d56e764fe43 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
> @@ -341,4 +341,317 @@ __naked void precision_two_ids(void)
>  	: __clobber_all);
>  }
>  
> +/* Verify that check_ids() is used by regsafe() for scalars.
> + *
> + * r9 = ... some pointer with range X ...
> + * r6 = ... unbound scalar ID=a ...
> + * r7 = ... unbound scalar ID=b ...
> + * if (r6 > r7) goto +1
> + * r6 = r7
> + * if (r6 > X) goto exit
> + * r9 += r7
> + * *(u8 *)r9 = Y
> + *
> + * The memory access is safe only if r7 is bounded,
> + * which is true for one branch and not true for another.
> + */
> +SEC("socket")
> +__failure __msg("register with unbounded min value")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void check_ids_in_regsafe(void)
> +{
> +	asm volatile (
> +	/* Bump allocated stack */
> +	"r1 = 0;"
> +	"*(u64*)(r10 - 8) = r1;"
> +	/* r9 = pointer to stack */
> +	"r9 = r10;"
> +	"r9 += -8;"
> +	/* r7 = ktime_get_ns() */
> +	"call %[bpf_ktime_get_ns];"
> +	"r7 = r0;"
> +	/* r6 = ktime_get_ns() */
> +	"call %[bpf_ktime_get_ns];"
> +	"r6 = r0;"
> +	/* if r6 > r7 is an unpredictable jump */
> +	"if r6 > r7 goto l1_%=;"
> +	"r7 = r6;"
> +"l1_%=:"
> +	/* if r6 > 4 exit(0) */
> +	"if r7 > 4 goto l2_%=;"
> +	/* Access memory at r9[r7] */
> +	"r9 += r6;"

Sorry if I'm missing some context, but there seem to be discrepancies
between the code of this test, the comments right here, the comment
above the test and the commit message. r6 vs r7 don't match in a few
places.

The code matches the commit message and looks correct (unsafe). The code
sample in the comments, however, is different and looks safe to me
(r7 <= r6 <= X, accessing r9[r7]).

> +	"r0 = *(u8*)(r9 + 0);"
> +"l2_%=:"
> +	"r0 = 0;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Similar to check_ids_in_regsafe.
> + * The l0 could be reached in two states:
> + *
> + *   (1) r6{.id=A}, r7{.id=A}, r8{.id=B}
> + *   (2) r6{.id=B}, r7{.id=A}, r8{.id=B}
> + *
> + * Where (2) is not safe, as "r7 > 4" check won't propagate range for it.
> + * This example would be considered safe without changes to
> + * mark_chain_precision() to track scalar values with equal IDs.
> + */
> +SEC("socket")
> +__failure __msg("register with unbounded min value")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void check_ids_in_regsafe_2(void)
> +{
> +	asm volatile (
> +	/* Bump allocated stack */
> +	"r1 = 0;"
> +	"*(u64*)(r10 - 8) = r1;"
> +	/* r9 = pointer to stack */
> +	"r9 = r10;"
> +	"r9 += -8;"
> +	/* r8 = ktime_get_ns() */
> +	"call %[bpf_ktime_get_ns];"
> +	"r8 = r0;"
> +	/* r7 = ktime_get_ns() */
> +	"call %[bpf_ktime_get_ns];"
> +	"r7 = r0;"
> +	/* r6 = ktime_get_ns() */
> +	"call %[bpf_ktime_get_ns];"
> +	"r6 = r0;"
> +	/* scratch .id from r0 */
> +	"r0 = 0;"
> +	/* if r6 > r7 is an unpredictable jump */
> +	"if r6 > r7 goto l1_%=;"
> +	/* tie r6 and r7 .id */
> +	"r6 = r7;"
> +"l0_%=:"
> +	/* if r7 > 4 exit(0) */
> +	"if r7 > 4 goto l2_%=;"
> +	/* Access memory at r9[r7] */
> +	"r9 += r6;"
> +	"r0 = *(u8*)(r9 + 0);"
> +"l2_%=:"
> +	"r0 = 0;"
> +	"exit;"
> +"l1_%=:"
> +	/* tie r6 and r8 .id */
> +	"r6 = r8;"
> +	"goto l0_%=;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Check that scalar IDs *are not* generated on register to register
> + * assignments if source register is a constant.
> + *
> + * If such IDs *are* generated the 'l1' below would be reached in
> + * two states:
> + *
> + *   (1) r1{.id=A}, r2{.id=A}
> + *   (2) r1{.id=C}, r2{.id=C}
> + *
> + * Thus forcing 'if r1 == r2' verification twice.
> + */
> +SEC("socket")
> +__success __log_level(2)
> +__msg("11: (1d) if r3 == r4 goto pc+0")
> +__msg("frame 0: propagating r3,r4")
> +__msg("11: safe")
> +__msg("processed 15 insns")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void no_scalar_id_for_const(void)
> +{
> +	asm volatile (
> +	"call %[bpf_ktime_get_ns];"
> +	/* unpredictable jump */
> +	"if r0 > 7 goto l0_%=;"
> +	/* possibly generate same scalar ids for r3 and r4 */
> +	"r1 = 0;"
> +	"r1 = r1;"
> +	"r3 = r1;"
> +	"r4 = r1;"
> +	"goto l1_%=;"
> +"l0_%=:"
> +	/* possibly generate different scalar ids for r3 and r4 */
> +	"r1 = 0;"
> +	"r2 = 0;"
> +	"r3 = r1;"
> +	"r4 = r2;"
> +"l1_%=:"
> +	/* predictable jump, marks r3 and r4 precise */
> +	"if r3 == r4 goto +0;"
> +	"r0 = 0;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Same as no_scalar_id_for_const() but for 32-bit values */
> +SEC("socket")
> +__success __log_level(2)
> +__msg("11: (1e) if w3 == w4 goto pc+0")
> +__msg("frame 0: propagating r3,r4")
> +__msg("11: safe")
> +__msg("processed 15 insns")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void no_scalar_id_for_const32(void)
> +{
> +	asm volatile (
> +	"call %[bpf_ktime_get_ns];"
> +	/* unpredictable jump */
> +	"if r0 > 7 goto l0_%=;"
> +	/* possibly generate same scalar ids for r3 and r4 */
> +	"w1 = 0;"
> +	"w1 = w1;"
> +	"w3 = w1;"
> +	"w4 = w1;"
> +	"goto l1_%=;"
> +"l0_%=:"
> +	/* possibly generate different scalar ids for r3 and r4 */
> +	"w1 = 0;"
> +	"w2 = 0;"
> +	"w3 = w1;"
> +	"w4 = w2;"
> +"l1_%=:"
> +	/* predictable jump, marks r1 and r2 precise */
> +	"if w3 == w4 goto +0;"
> +	"r0 = 0;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Check that unique scalar IDs are ignored when new verifier state is
> + * compared to cached verifier state. For this test:
> + * - cached state has no id on r1
> + * - new state has a unique id on r1
> + */
> +SEC("socket")
> +__success __log_level(2)
> +__msg("6: (25) if r6 > 0x7 goto pc+1")
> +__msg("7: (57) r1 &= 255")
> +__msg("8: (bf) r2 = r10")
> +__msg("from 6 to 8: safe")
> +__msg("processed 12 insns")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void ignore_unique_scalar_ids_cur(void)
> +{
> +	asm volatile (
> +	"call %[bpf_ktime_get_ns];"
> +	"r6 = r0;"
> +	"call %[bpf_ktime_get_ns];"
> +	"r0 &= 0xff;"
> +	/* r1.id == r0.id */
> +	"r1 = r0;"
> +	/* make r1.id unique */
> +	"r0 = 0;"
> +	"if r6 > 7 goto l0_%=;"
> +	/* clear r1 id, but keep the range compatible */
> +	"r1 &= 0xff;"
> +"l0_%=:"
> +	/* get here in two states:
> +	 * - first: r1 has no id (cached state)
> +	 * - second: r1 has a unique id (should be considered equivalent)
> +	 */
> +	"r2 = r10;"
> +	"r2 += r1;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Check that unique scalar IDs are ignored when new verifier state is
> + * compared to cached verifier state. For this test:
> + * - cached state has a unique id on r1
> + * - new state has no id on r1
> + */
> +SEC("socket")
> +__success __log_level(2)
> +__msg("6: (25) if r6 > 0x7 goto pc+1")
> +__msg("7: (05) goto pc+1")
> +__msg("9: (bf) r2 = r10")
> +__msg("9: safe")
> +__msg("processed 13 insns")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void ignore_unique_scalar_ids_old(void)
> +{
> +	asm volatile (
> +	"call %[bpf_ktime_get_ns];"
> +	"r6 = r0;"
> +	"call %[bpf_ktime_get_ns];"
> +	"r0 &= 0xff;"
> +	/* r1.id == r0.id */
> +	"r1 = r0;"
> +	/* make r1.id unique */
> +	"r0 = 0;"
> +	"if r6 > 7 goto l1_%=;"
> +	"goto l0_%=;"
> +"l1_%=:"
> +	/* clear r1 id, but keep the range compatible */
> +	"r1 &= 0xff;"
> +"l0_%=:"
> +	/* get here in two states:
> +	 * - first: r1 has a unique id (cached state)
> +	 * - second: r1 has no id (should be considered equivalent)
> +	 */
> +	"r2 = r10;"
> +	"r2 += r1;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
> +/* Check that two different scalar IDs in a verified state can't be
> + * mapped to the same scalar ID in current state.
> + */
> +SEC("socket")
> +__success __log_level(2)
> +/* The exit instruction should be reachable from two states,
> + * use two matches and "processed .. insns" to ensure this.
> + */
> +__msg("13: (95) exit")
> +__msg("13: (95) exit")
> +__msg("processed 18 insns")
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__naked void two_old_ids_one_cur_id(void)
> +{
> +	asm volatile (
> +	/* Give unique scalar IDs to r{6,7} */
> +	"call %[bpf_ktime_get_ns];"
> +	"r0 &= 0xff;"
> +	"r6 = r0;"
> +	"call %[bpf_ktime_get_ns];"
> +	"r0 &= 0xff;"
> +	"r7 = r0;"
> +	"r0 = 0;"
> +	/* Maybe make r{6,7} IDs identical */
> +	"if r6 > r7 goto l0_%=;"
> +	"goto l1_%=;"
> +"l0_%=:"
> +	"r6 = r7;"
> +"l1_%=:"
> +	/* Mark r{6,7} precise.
> +	 * Get here in two states:
> +	 * - first:  r6{.id=A}, r7{.id=B} (cached state)
> +	 * - second: r6{.id=A}, r7{.id=A}
> +	 * Currently we don't want to consider such states equivalent.
> +	 * Thus, marker instruction "r0 = r0;" would be verified twice.
> +	 */
> +	"r2 = r10;"
> +	"r2 += r6;"
> +	"r2 += r7;"
> +	"exit;"
> +	:
> +	: __imm(bpf_ktime_get_ns)
> +	: __clobber_all);
> +}
> +
>  char _license[] SEC("license") = "GPL";
> -- 
> 2.40.1
> 
>