Re: [PATCH v2 bpf-next 7/7] selftests/bpf: BPF register range bounds tester

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Paul Chaignon <paul@isovalent.com>
Cc: Srinivas Narayana Ganapathy <sn624@cs.rutgers.edu>,
	Andrii Nakryiko <andrii@kernel.org>,
	Langston Barrett <langston.barrett@gmail.com>,
	Srinivas Narayana <srinivas.narayana@rutgers.edu>,
	Santosh Nagarakatte <sn349@cs.rutgers.edu>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"ast@kernel.org" <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"martin.lau@kernel.org" <martin.lau@kernel.org>,
	"kernel-team@meta.com" <kernel-team@meta.com>,
	Matan Shachnai <m.shachnai@rutgers.edu>,
	Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>
Subject: Re: [PATCH v2 bpf-next 7/7] selftests/bpf: BPF register range bounds tester
Date: Mon, 23 Oct 2023 22:05:41 +0800	[thread overview]
Message-ID: <ZTZ9tYlVgt9DVcgi@u94a> (raw)
In-Reply-To: <CAEf4BzbFhA585gSN1YfaDaeEmmUvWSdpMY605fmV_RvSQ7+xeQ@mail.gmail.com>

On Sat, Oct 21, 2023 at 09:42:46PM -0700, Andrii Nakryiko wrote:
> On Fri, Oct 20, 2023 at 10:37 AM Srinivas Narayana Ganapathy
> <sn624@cs.rutgers.edu> wrote:
> >
> > Hi all,
> >
> > Thanks, @Shung-Hsi, for bringing up this conversation about
> > integrating formal verification approaches into the BPF CI and testing.
> >
> > > On 19-Oct-2023, at 1:34 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > On Thu, Oct 19, 2023 at 12:52 AM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> > >> On Thu, Oct 19, 2023 at 03:30:33PM +0800, Shung-Hsi Yu wrote:
> > >>> On Wed, Oct 18, 2023 at 09:24:05PM -0700, Andrii Nakryiko wrote:
> > >>>> Add tests that validate correctness and completeness of BPF verifier's
> > >>>> register range bounds.
> > >>>
> > >>> Nitpick: in abstract-interpretation-speak, completeness seems to mean
> > >>> something different. I believe what we're trying to check here is
> > >>> soundness[1], again, in abstraction-interpretation-speak), so using
> > >>> completeness here may be misleading to some. (I'll leave explanation to
> > >>> other that understand this concept better than I do, rather than making an
> > >>> ill attempt that would probably just make things worst)
> > >>>
> > >>>> The main bulk is a lot of auto-generated tests based on a small set of
> > >>>> seed values for lower and upper 32 bits of full 64-bit values.
> > >>>> Currently we validate only range vs const comparisons, but the idea is
> > >>>> to start validating range over range comparisons in subsequent patch set.
> > >>>
> > >>> CC Langston Barrett who had previously send kunit-based tnum checks[2] a
> > >>> while back. If this patch is merged, perhaps we can consider adding
> > >>> validation for tnum as well in the future using similar framework.
> > >>>
> > >>> More comments below
> > >>>
> > >>>> When setting up initial register ranges we treat registers as one of
> > >>>> u64/s64/u32/s32 numeric types, and then independently perform conditional
> > >>>> comparisons based on a potentially different u64/s64/u32/s32 types. This
> > >>>> tests lots of tricky cases of deriving bounds information across
> > >>>> different numeric domains.
> > >>>>
> > >>>> Given there are lots of auto-generated cases, we guard them behind
> > >>>> SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise.
> > >>>> With current full set of upper/lower seed value, all supported
> > >>>> comparison operators and all the combinations of u64/s64/u32/s32 number
> > >>>> domains, we get about 7.7 million tests, which run in about 35 minutes
> > >>>> on my local qemu instance. So it's something that can be run manually
> > >>>> for exhaustive check in a reasonable time, and perhaps as a nightly CI
> > >>>> test, but certainly is too slow to run as part of a default test_progs run.
> > >>>
> > >>> FWIW an alternative approach that speeds things up is to use model checkers
> > >>> like Z3 or CBMC. On my laptop, using Z3 to validate tnum_add() against *all*
> > >>> possible inputs takes less than 1.3 seconds[3] (based on code from [1]
> > >>> paper, but I somehow lost the link to their GitHub repository).
> > >>
> > >> Found it. For reference, code used in "Sound, Precise, and Fast Abstract
> > >> Interpretation with Tristate Numbers"[1] can be found at
> > >> https://github.com/bpfverif/tnums-cgo22/blob/main/verification/tnum.py
> > >>
> > >> Below is a truncated form of the above that only check tnum_add(), requires
> > >> a package called python3-z3 on most distros:
> > >
> > > Great! I'd be curious to see how range tracking logic can be encoded
> > > using this approach, please give it a go!
> >
> > We have some recent work that applies formal verification approaches
> > to the entirety of range tracking in the eBPF verifier. We posted a
> > note to the eBPF mailing list about it sometime ago:
> >
> > [1] https://lore.kernel.org/bpf/SJ2PR14MB6501E906064EE19F5D1666BFF93BA@SJ2PR14MB6501.namprd14.prod.outlook.com/T/#u
> 
> Oh, I totally missed this, as I just went on a long vacation a few
> days before that and declared email bankruptcy afterwards. I'll try to
> give it a read, though I see lots of math symbols there and make no
> promises ;)

Feels the same when I start reading their previous work, but I can vouch
their work their work are definitely worth the read. (Though I had to admit
I secretly chant "math is easier than code, math is easier than code" to
convincing my mind to not go into flight mode when seeing math symbols ;D)

> > Our paper, also posted on [1], appeared at Computer Aided Verification (CAV)’23.
> >
> > [2] https://people.cs.rutgers.edu/~sn624/papers/agni-cav23.pdf
> >
> > Together with @Paul Chaignon and @Harishankar Vishwanathan (CC'ed), we
> > are working to get our tooling into a form that is integrable into BPF
> > CI. We will look forward to your feedback when we post patches.
> 
> If this could be integrated in a way that we can regularly run this
> and validate latest version of verifier, that would be great. I have a
> second part of verifier changes coming up that extends range tracking
> logic further to support range vs range (as opposed to range vs const
> that we do currently) comparisons and is_branch_taken, so having
> independent and formal verification of these changes would be great!

+1 (from a quick skim) this work is already great as-is, and it'd be even
better once it get's in the CI. From the paper there's this

  We conducted our experiments on ... a machine with two 10-core Intel
  Skylake CPUs running at 2.20 GHz with 192 GB of memory...

I suppose the memory requirement comes from the vast amount of state space
that the Z3 SMT solver have to go through, and perhaps that poses a
challenge for CI integration?

Just wondering is there are some low-hanging fruit the can make things
easier for the SMT solver.

> > Thanks,
> >
> > --
> > Srinivas
> > The fastest algorithm can frequently be replaced by one that is almost as fast and much easier to understand.
> >

next prev parent reply	other threads:[~2023-10-23 14:05 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-19  4:23 [PATCH v2 bpf-next 0/7] BPF register bounds logic and testing improvements Andrii Nakryiko
2023-10-19  4:23 ` [PATCH v2 bpf-next 1/7] bpf: improve JEQ/JNE branch taken logic Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 2/7] bpf: derive smin/smax from umin/max bounds Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 3/7] bpf: enhance subregister bounds deduction logic Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 4/7] bpf: improve deduction of 64-bit bounds from 32-bit bounds Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 5/7] bpf: try harder to deduce register bounds from different numeric domains Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 6/7] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Andrii Nakryiko
2023-10-19  4:24 ` [PATCH v2 bpf-next 7/7] selftests/bpf: BPF register range bounds tester Andrii Nakryiko
2023-10-19  7:08   ` kernel test robot
2023-10-19 18:27     ` Andrii Nakryiko
2023-10-19  7:30   ` Shung-Hsi Yu
2023-10-19  7:52     ` Shung-Hsi Yu
2023-10-19 18:34       ` Andrii Nakryiko
2023-10-20 17:37         ` Srinivas Narayana Ganapathy
2023-10-22  4:42           ` Andrii Nakryiko
2023-10-23 14:05             ` Shung-Hsi Yu [this message]
2023-10-23 15:52               ` Paul Chaignon
2023-10-23 22:50                 ` Andrii Nakryiko
2023-10-24  5:51                   ` Andrii Nakryiko
2023-10-24 21:26                     ` Paul Chaignon
2023-10-26 22:47                       ` Andrii Nakryiko
2023-10-19 18:31     ` Andrii Nakryiko
2023-10-20 12:27       ` Shung-Hsi Yu
2023-10-21  4:13 ` [PATCH v2 bpf-next 0/7] BPF register bounds logic and testing improvements John Fastabend
2023-10-22  4:32   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZTZ9tYlVgt9DVcgi@u94a \
    --to=shung-hsi.yu@suse.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=harishankar.vishwanathan@rutgers.edu \
    --cc=kernel-team@meta.com \
    --cc=langston.barrett@gmail.com \
    --cc=m.shachnai@rutgers.edu \
    --cc=martin.lau@kernel.org \
    --cc=paul@isovalent.com \
    --cc=sn349@cs.rutgers.edu \
    --cc=sn624@cs.rutgers.edu \
    --cc=srinivas.narayana@rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.