From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 60FBC328DF for ; Tue, 3 Oct 2023 15:33:54 +0000 (UTC) Received: from mail-lf1-x129.google.com (mail-lf1-x129.google.com [IPv6:2a00:1450:4864:20::129]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68190AB for ; Tue, 3 Oct 2023 08:33:52 -0700 (PDT) Received: by mail-lf1-x129.google.com with SMTP id 2adb3069b0e04-5044dd5b561so1265206e87.1 for ; Tue, 03 Oct 2023 08:33:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696347230; x=1696952030; darn=vger.kernel.org; h=mime-version:user-agent:content-transfer-encoding:autocrypt :references:in-reply-to:date:cc:to:from:subject:message-id:from:to :cc:subject:date:message-id:reply-to; bh=F8xIQzJty4+9K8/+BMP+c0isEPX6YD2aMBdd/lN9Zmo=; b=AqJaWO8fE7/y7/MaMnAwAGIMlj4a3LjjYLYDBoGvlgcYjXTbLhJzR/uK8/Eu1FYrlo pKESWWz6XKf8SLm8oAsU0AfvEj8oTLWvW9ame1RYjaRjw2gVyRwoH8MtNzH3WrRm9C3F 72PNIl1F1fFsBrbesZTIre0ZOMG7Eih9s6Ywhr7h04hlGySUl94uIjeVJ45m0j0/u/iE NmjEqzpWvxb5RVnignZ8JllizP449qH41ns0DxrN4uKduuzKWdMKosQAPWFgWiF/+cFQ QOtdrEUgBCsgOe2zBg2hlPUolbHb22/4MoGMCP1omqbMjVNOuJvmPdxmLI7wOqkjVdOc UuPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696347230; x=1696952030; h=mime-version:user-agent:content-transfer-encoding:autocrypt :references:in-reply-to:date:cc:to:from:subject:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=F8xIQzJty4+9K8/+BMP+c0isEPX6YD2aMBdd/lN9Zmo=; b=saXD40nIsJwJKe1SNqccp3xxNWCmVfBCTK54kUjxDYWY+EckpGAsw1JjjJZc/ELupc pHYgKtvZexYS7gCUVYXOmWZcTg7h3t6ZNi7ak7WYw+psHT2kIWkpn+mhCLqbg04VpxG7 DYQqaidtQ0iNHOLosnAllApcmaZkNsBGxSM7wOgzEkwR5zc3y6A1otfdAEyesSIb32Zs 9oC5Jauu3eAKKxJ/YoJJ9+3Fj1exE2hgj0PnewiaZvIo1Xe0EN3IRGrMs3sstX4EiWD0 FjPmli+W7LPGaA7nILgc6azrCunqfqWLJwRYhGMo1et4uCswxYeT9jrDuMbVSVWpzGSO BY4Q== X-Gm-Message-State: AOJu0YzH7StoIf/GmZXUjpuwBfGTE+oWrJdM9NYAcZ3H3EsnUv9i66Ue X022ZFzwcQDThQS21a8E4ci4NIB7mBrSLza1 X-Google-Smtp-Source: AGHT+IF3jNCBVqazF/odZduYj2zSj8EC7i2Po5Ug/J1jJi6HokKEcJnSoem1643WUErIc1LKULar4g== X-Received: by 2002:a05:6512:4019:b0:502:d84d:e893 with SMTP id br25-20020a056512401900b00502d84de893mr14709082lfb.36.1696347230124; Tue, 03 Oct 2023 08:33:50 -0700 (PDT) Received: from [192.168.1.95] (host-176-36-0-241.b024.la.net.ua. [176.36.0.241]) by smtp.gmail.com with ESMTPSA id a8-20020ac25208000000b004fe461aab36sm225636lfl.129.2023.10.03.08.33.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Oct 2023 08:33:49 -0700 (PDT) Message-ID: <8b75e01d27696cd6661890e49bdc06b1e96092c7.camel@gmail.com> Subject: Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...) From: Eduard Zingerman To: Alexei Starovoitov Cc: Andrii Nakryiko , Andrew Werner , bpf , Andrei Matei , Tamir Duberstein , Joanne Koong , kernel-team@dataexmachina.dev, Song Liu , Kumar Kartikeya Dwivedi Date: Tue, 03 Oct 2023 18:33:48 +0300 In-Reply-To: References: <97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com> <5649df64315467c67b969e145afda8bbf7e60445.camel@gmail.com> <4b121c3b96dcc0322ea111062ed2260d2d1d0ed7.camel@gmail.com> <52df1240415be1ee8827cb6395fd339a720e229c.camel@gmail.com> <44363f61c49bafa7901ae2aa43897b525805192c.camel@gmail.com> <9dd331b31755632f0528bfb1d0acbf904cedbd98.camel@gmail.com> <680e69504eabbae2abd5e9e2b745319c561c86ef.camel@gmail.com> Autocrypt: addr=eddyz87@gmail.com; prefer-encrypt=mutual; keydata=mQGNBGKNNQEBDACwcUNXZOGTzn4rr7Sd18SA5Wv0Wna/ONE0ZwZEx+sIjyGrPOIhR14/DsOr3ZJer9UJ/WAJwbxOBj6E5Y2iF7grehljNbLr/jMjzPJ+hJpfOEAb5xjCB8xIqDoric1WRcCaRB+tDSk7jcsIIiMish0diTK3qTdu4MB6i/sh4aeFs2nifkNi3LdBuk8Xnk+RJHRoKFJ+C+EoSmQPuDQIRaF9N2m4yO0eG36N8jLwvUXnZzGvHkphoQ9ztbRJp58oh6xT7uH62m98OHbsVgzYKvHyBu/IU2ku5kVG9pLrFp25xfD4YdlMMkJH6l+jk+cpY0cvMTS1b6/g+1fyPM+uzD8Wy+9LtZ4PHwLZX+t4ONb/48i5AKq/jSsb5HWdciLuKEwlMyFAihZamZpEj+9n91NLPX4n7XeThXHaEvaeVVl4hfW/1Qsao7l1YjU/NCHuLaDeH4U1P59bagjwo9d1n5/PESeuD4QJFNqW+zkmE4tmyTZ6bPV6T5xdDRHeiITGc00AEQEAAbQkRWR1YXJkIFppbmdlcm1hbiA8ZWRkeXo4N0BnbWFpbC5jb20+iQHUBBMBCgA+FiEEx+6LrjApQyqnXCYELgxleklgRAkFAmKNNQECGwMFCQPCZwAFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQLgxleklgRAlWZAv/cJ5v3zlEyP0/jMKQBqbVCCHTirPEw+nqxbkeSO6r2FUds0NnGA9a6NPOpBH+qW7a6+n6q3sIbvH7jlss4pzLI7LYlDC6z+egTv7KR5X1xFrY1uR5UGs1beAjnzYeV2hK4yqRUfygsT0Wk5e4FiNBv4+DUZ8r0cNDkO6swJxU55DO21mcteC147+4aDoHZ40R0tsAu+brDGSSoOPpb0RWVsEf9XOBJqWWA+T7mluw nYzhLWGcczc6J71q1Dje0l5vIPaSFOgwmWD4DA+WvuxM/shH4rtWeodbv iCTce6yYIygHgUAtJcHozAlgRrL0jz44cggBTcoeXp/atckXK546OugZPnl00J3qmm5uWAznU6T5YDv2vCvAMEbz69ib+kHtnOSBvR0Jb86UZZqSb4ATfwMOWe9htGTjKMb0QQOLK0mTcrk/TtymaG+T4Fsos0kgrxqjgfrxxEhYcVNW8v8HISmFGFbqsJmFbVtgk68BcU0wgF8oFxo7u+XYQDdKbI1uQGNBGKNNQEBDADbQIdo8L3sdSWGQtu+LnFqCZoAbYurZCmUjLV3df1b+sg+GJZvVTmMZnzDP/ADufcbjopBBjGTRAY4L76T2niu2EpjclMMM3mtrOc738Kr3+RvPjUupdkZ1ZEZaWpf4cZm+4wH5GUfyu5pmD5WXX2i1r9XaUjeVtebvbuXWmWI1ZDTfOkiz/6Z0GDSeQeEqx2PXYBcepU7S9UNWttDtiZ0+IH4DZcvyKPUcK3tOj4u8GvO3RnOrglERzNCM/WhVdG1+vgU9fXO83TB/PcfAsvxYSie7u792s/I+yA4XKKh82PSTvTzg2/4vEDGpI9yubkfXRkQN28w+HKF5qoRB8/L1ZW/brlXkNzA6SveJhCnH7aOF0Yezl6TfX27w1CW5Xmvfi7X33V/SPvo0tY1THrO1c+bOjt5F+2/K3tvejmXMS/I6URwa8n1e767y5ErFKyXAYRweE9zarEgpNZTuSIGNNAqK+SiLLXt51G7P30TVavIeB6s2lCt1QKt62ccLqUAEQEAAYkBvAQYAQoAJhYhBMfui64wKUMqp1wmBC4MZXpJYEQJBQJijTUBAhsMBQkDwmcAAAoJEC4MZXpJYEQJkRAMAKNvWVwtXm/WxWoiLnXyF2WGXKoDe5+itTLvBmKcV/b1OKZF1s90V7WfSBz712eFAynEzyeezPbwU8QBiTpZcHXwQni3IYKvsh7s t1iq+gsfnXbPz5AnS598ScZI1oP7OrPSFJkt/z4acEbOQDQs8aUqrd46PV jsdqGvKnXZxzylux29UTNby4jTlz9pNJM+wPrDRmGfchLDUmf6CffaUYCbu4FiId+9+dcTCDvxbABRy1C3OJ8QY7cxfJ+pEZW18fRJ0XCl/fiV/ecAOfB3HsqgTzAn555h0rkFgay0hAvMU/mAW/CFNSIxV397zm749ZNLA0L2dMy1AKuOqH+/B+/ImBfJMDjmdyJQ8WU/OFRuGLdqOd2oZrA1iuPIa+yUYyZkaZfz/emQwpIL1+Q4p1R/OplA4yc301AqruXXUcVDbEB+joHW3hy5FwK5t5OwTKatrSJBkydSF9zdXy98fYzGniRyRA65P0Ix/8J3BYB4edY2/w0Ip/mdYsYQljBY0A== Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.50.0 Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net On Mon, 2023-10-02 at 17:05 -0700, Alexei Starovoitov wrote: [...] > > Consider that we get to the environment state where: > > - all env->head states are exhausted; > > - all potentially looping states (stored in as a separate set of > > states instead of env->head) are states_equal() to some already > > explored state. > >=20 > > I argue that if such environment state is reached the program should > > be safe, because: > > - Each looping state L is a sub-state of some explored state V and > > every path from V leads to either safe exit or another loop. > > - Iterator loops are guaranteed to exit eventually. >=20 > It sounds correct, but I don't like that the new mechanism > with two stacks of states completely changes the way the verifier works. > The check you proposed: > if (env->stack_size !=3D 0) > push_iter_stack() > rings alarm bells. >=20 > env->stack_size =3D=3D 0 (same as env->head exhausted) means we're done > with verification (ignoring bpf_exit in callbacks and subprogs). > So above check looks like a hack for something that I don't understand ye= t. > Also there could be branches in the code before and after iter loop. > With "opportunistic" states_equal() for looping states and delayed > reschedule_loop_states() to throw states back at the verifier > the whole verification model is non comprehensible (at least to me). > The stack + iter_stack + reschedule_loop_states means that in the followi= ng: > foo() > { > br1 // some if() {...} block > loop { > br2 > } > br3 > } >=20 > the normal verifier pop_stack logic will check br3 and br1, > but br2 may or may not be checked depending on "luck" of states_equal > and looping states that will be in iter_stack. > Then the verifier will restart from checking loop-ing states. > > If they somehow go past the end of the loop all kinds of things go crazy. > update_branch_counts() might warn, propagate_liveness, propagate_precisio= n > will do nonsensical things. When I put states to the loop stack I do copy_verifier_state() and increase .branches counter for each state parent, so this should not trigger warnings with update_branch_counts(). Logically, any state that has a loop state as it's grandchild is not verified to be safe until loops steady state is achieved, thus such states could not be used for states pruning and it is correct to keep their branch counters > 0. propagate_liveness() and propagate_precision() should not be called for the delayed states (work-in-progress I'll update the patch). Behavior for non-delayed states should not be altered by these changes. > This out-of-order state processing distorts the existing model so > much that I don't see how we can reason about these two stacks verificati= on. Iterators (and callbacks) try to diverge from regular verification logic of visiting all possible paths by making some general assumptions about loop bodies. There is no way to verify safety of such bodies w/o computing steady states. > I think the cleaner solution is to keep current single stack model. > In the above example the verifier would reach the end, then check br3, > then check br2, > then we need to split branches counter somehow, so that we can > compare loop iter states with previous visited states that are known > to be safe. > In visited states we explored everything in br3 and in br2, > so no concerns that some path inside the loop or after the loop > missed precision or liveness. > > Maybe we can split branches counter into branches due to 'if' blocks > and branches due to process_iter_next_call(). > If there are pending process_iter_next_call-caused states it's still > ok to call states_equal on such visited states. Using a stack forces DFS states traversal, if there is a loop state with branching this is prone to infinite nesting, e.g.: 0. // Complete assembly code is in the end of the email. 1. while (next(i)) { 2. if (random()) 3. continue; 4. r0 +=3D 0; 5. } Would lead to an infinite loop when using the patch shared in [1]. At (2) verifier would always create a state with {.loop_state=3Dfalse}, thus checkpoint state at (1) would always have `sl->branches !=3D sl->state.looping_states`: - looping state is current one, - non-looping state is the "else" branch scheduled in (2). Therefore checkpoint in (1) is not eligible for pruning and verifier would eagerly descend via path 1,2,3,1,2,3,... I don't think this is a quirk of the patch, with only a single stack there are no means to postpone exploration. Note, that removal of `elem->st.looping_state =3D false;` from push_stack() is not safe either, precision marks and unsafe values could be concealed in nested iteration states as shown in example [2], so we are risking pruning some states too early. [1] https://lore.kernel.org/bpf/CAADnVQJ3=3Dx8hfv7d29FQ-ckzh9=3DMXo54cnFShF= p=3DeG0fJjdDow@mail.gmail.com/T/#m8fc0fc3e338f57845f9fb65e0c3798a2ef5fb2e7 [2] https://lore.kernel.org/bpf/CAADnVQJ3=3Dx8hfv7d29FQ-ckzh9=3DMXo54cnFShF= p=3DeG0fJjdDow@mail.gmail.com/T/#m6014e44a00ab7732890c13b83b5497f8d856fc81 --- In theory, loops steady state does not have to be tracked globally it can be tracked per some key states, e.g. entry states for top-level loops: - at the entry state E for a top level loop: - establish a separate regular and loop state stacks: E.stack, E.loop_sta= ck; - add E to explored states as a checkpoint and continue verification from E' a copy of E (at this point E branches counter is > 0); - in is_state_visited() / .branches > 0 / is_bpf_next_insn(): - only consider current state C for states_equal(V, C) if V and C have a common grandparent E; - use E.loop_stack to delay C; - use same logic for steady state as before: - E.stack should be exhausted; - states in E.loop_stack should all be states_equal() to some explored state V that has E as it's grandparent. - if such steady state is achieved E's loop states could be dropped and branches counter for E could be set to 0, thus opening it up for regular states pruning. =20 Such logic would make verification of loops more similar to verification of regular conditionals, e.g.: foo() { if (random()) { r0 =3D 0; // branch B1 } else { r0 =3D 42; // branch B2 } i =3D iter_new(); // loop entry E while (iter_next()) { // loop body LB do_something(); } } Here verifier would first visit path B1,E,LB, second visit to LB would be delayed and steady state for E would be declared, E.branches would be set to 0. Then path B1,E would be visited and declared safe as E is eligible for states pruning. This gives more opportunities for states pruning and makes user's reasoning about iterators verification similar to user's reasoning about regular loops verification. But it adds some more complexity to the implementation (albeit, I think it's not much, env->head becomes a stack of stacks and a few other tweaks). --- SEC("?raw_tp") __success __naked int loop1(void) { asm volatile ( "r1 =3D r10;" "r1 +=3D -8;" "r2 =3D 0;" "r3 =3D 10;" "call %[bpf_iter_num_new];" "loop_%=3D:" "r1 =3D r10;" "r1 +=3D -8;" "call %[bpf_iter_num_next];" "if r0 =3D=3D 0 goto loop_end_%=3D;" "call %[bpf_get_prandom_u32];" "if r0 !=3D 42 goto loop_%=3D;" "r0 +=3D 0;" "goto loop_%=3D;" "loop_end_%=3D:" "r1 =3D r10;" "r1 +=3D -8;" "call %[bpf_iter_num_destroy];" "r0 =3D 0;" "exit;" : : __imm(bpf_get_prandom_u32), __imm(bpf_iter_num_new), __imm(bpf_iter_num_next), __imm(bpf_iter_num_destroy) : __clobber_all ); }