From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jiong Wang <jiong.wang@netronome.com>
Subject: Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
Date: Wed, 3 Oct 2018 16:36:31 +0100
Message-ID: <858638d7-dad4-cb70-478c-29a54a131e68@netronome.com>
References: <d16ea072-61a0-8f8a-aca1-13cac09d3d14@solarflare.com>
 <0252cca7-82e4-24d3-8682-e1a613d6cd78@netronome.com>
 <df4c315c-f023-a200-4fbf-64bd3b508241@solarflare.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: netdev@vger.kernel.org
To: Edward Cree <ecree@solarflare.com>, ast@kernel.org,
        daniel@iogearbox.net
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-wr1-f66.google.com ([209.85.221.66]:38673 "EHLO
        mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1726748AbeJCWZb (ORCPT
        <rfc822;netdev@vger.kernel.org>); Wed, 3 Oct 2018 18:25:31 -0400
Received: by mail-wr1-f66.google.com with SMTP id a13-v6so6648237wrt.5
        for <netdev@vger.kernel.org>; Wed, 03 Oct 2018 08:36:36 -0700 (PDT)
In-Reply-To: <df4c315c-f023-a200-4fbf-64bd3b508241@solarflare.com>
Content-Language: en-GB
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 28/09/2018 14:36, Edward Cree wrote:
 > On 26/09/18 23:16, Jiong Wang wrote:
 >> On 22/08/2018 20:00, Edward Cree wrote:
 >>> In the future this idea may be extended to form use-def chains.
 >>
 >>   1. instruction level use->def chain
 >>
 >>      - new use->def chains for each instruction. one eBPF insn could 
have two
 >>        uses at maximum.
 > I was thinking of something a lot weaker/simpler, just making
 >     ld rX, rY
 >  copy rY.parent into rX.parent and not read-mark rY (whereas actual
 >  arithmetic, pointer deref etc. would still create read marks).

Thanks for the feedback Edward.

 > But what you've described sounds interesting; perhaps it would also
 >  help later with loop-variable handling?

Haven't considered how to use this for loop-variable handling, guess you 
mean
applying what I have described to your previous loop detection RFC? I 
will look
into your RFC later.

At the moment the design of the use->def chain is mainly to optimize 32-bit
code-gen. I was about to satisfied with a local implementation and to 
share it
to ML for further discussion. However, when manually check the optimization
result on testcase with medium size (~1000 eBPF insns) and proper complexity
(make sure path prunes etc are triggered inside verifier), I found the 
code-gen
doesn't meet my expectation.

For example, for the following sequence, insn at 25 should operate on 
full-64
bit but I found it is marked as 32-bit safe.

   25:    r7 = 1
   26:    if r4 > r8 goto +1200 <L>
   27:    r1 = *(u8 *)(r1 + 0)
   28:    r1 &= 15
   29:    r7 = 1
   ...

L:
   1227:  r0 = r7
   1228:  exit

As described at previous email, the algorithm assume all insns are 
32-bit safe
first, then start to insns back to "64-bit" if there is any 64-bit use found
for a insn.

Insn 25 is defining r7 which is used at the 1227 where its value 
propagated to
r0 and then r0 is implicitly used at insn 1228 as it is a exit from main
function to external.

For above example, as we don't know the external use of r0 at 1228 (exit 
from
main to external), so r0 is treated as 64-bit implicit use. The define is at
1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should
propagate to source register operand through register move and 
arithmetic, so
r7 at insn 1227 is a "64-bit" use and should make its definition 
instruction,
insn 25, marked as "64-bit". This is my thinking of how insn 25 should be
marked.

Now this hasn't happened. I am still debugging the root cause, but kind 
of feel
"64-bit" attribute propagation is the issue, it seems to me it can't be 
nicely
integrated into the existing register read/write propagation 
infrastructure. For
example, for a slightly more complex sequence which is composed of three 
states:

State A
   ...
   10: r6 = *(u32 *)(r10 - 4)
   11: r7 = *(u32 *)(r10 - 8)
   12: *(u64 *)(r10 - 16) = r6
   13: *(u64 *)(r10 - 24) = r7

State B
   14: r6 += 1
   15: r7 += r6
   16: *(u32 *)(r10 - 28) = r7

State C
   ...
   17: r3 += r7
   18: r4 = 1
   19: *(u64 *)(r10 - 32) = r3
   20: *(u64 *)(r10 - 40) = r4

State A is parent of state B which is parent of state C.

Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
marked as "64-bit". There is no register source at 18, so "64-bit" attribute
propagation is stopped.

Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
"64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
"64-bit" now, and their definition should be marked as "64-bit".

Now if the definition of r3 or r7 comes from parent state, then the 
parent state
should receive a "REG_LIVE_READ64", this is necessary if later another path
reaches state C and triggers prune path, for which case that path should 
know
there is "64-bit" use inside state C on some registers, and should use this
information to mark "64-bit" insn.

If the definition of r3 or r7 is still inside state C, we need to keep 
walking
up the instruction sequences, and propagate "64-bit" attribute upward 
until it
goes beyond the state C.

The above propagation logic is quite different from existing register 
read/write
propagation. For the latter, a write just screen up all following read, 
and a
read would propagate directly to its parent is there is not previous 
write, no
instruction analysis is required.

I am just describing what I have run into and trying to resolve, any 
thoughts
and suggestions are appreciated.

Regards,
Jiong