From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Ahern <dsahern@gmail.com>
Subject: Re: [PATCH v15 ] net/veth/XDP: Line-rate packet forwarding in kernel
Date: Tue, 3 Apr 2018 11:14:00 -0600
Message-ID: <dea228d2-8483-b291-493d-3627b824fd00@gmail.com>
References: <CAFgPn1DX9cOpDRGj=wFwvZq_bpq6VFnEOzR1YbMuC0+=DFEWxA@mail.gmail.com>
 <7cfca503-3e17-6287-8888-92d43ce7a2e7@gmail.com>
 <2ac3c590-8f13-b983-7efb-021f82ee3295@gmail.com>
 <20180402181602.jpdb25ytmffg2gei@ast-mbp.dhcp.thefacebook.com>
 <9cb8a162-3b6a-abfa-4f6e-524995bbfb8d@gmail.com>
 <4f0c0f20-ce25-4996-4f28-14a73c988446@gmail.com>
 <8832a8a4-24bb-7841-bb84-31f7c05c2b72@gmail.com>
 <20180403170624.ojofd737zh5pul5m@ast-mbp.dhcp.thefacebook.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: John Fastabend <john.fastabend@gmail.com>,
        "Md. Islam" <mislam4@kent.edu>, netdev@vger.kernel.org,
        David Miller <davem@davemloft.net>, stephen@networkplumber.org,
        agaceph@gmail.com, Pavel Emelyanov <xemul@openvz.org>,
        Eric Dumazet <edumazet@google.com>, brouer@redhat.com
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-pl0-f47.google.com ([209.85.160.47]:34073 "EHLO
        mail-pl0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751926AbeDCROD (ORCPT
        <rfc822;netdev@vger.kernel.org>); Tue, 3 Apr 2018 13:14:03 -0400
Received: by mail-pl0-f47.google.com with SMTP id u11-v6so9748394plq.1
        for <netdev@vger.kernel.org>; Tue, 03 Apr 2018 10:14:02 -0700 (PDT)
In-Reply-To: <20180403170624.ojofd737zh5pul5m@ast-mbp.dhcp.thefacebook.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 4/3/18 11:06 AM, Alexei Starovoitov wrote:
>> For 3 and 4 above I was referring to the route lookup part of it; sorry
>> for not being clear.
>>
>> For example, eth1 is enslaved to bond1 which is in VRF red. The lookup
>> needs to go to the table associated with the VRF. That is not known by
>> just looking at eth1. The code exists to walk the upper layers and do
>> the effective translations, just need to cover those cases.
>>
>> The VLAN part of it is a bit more difficult - ingress device for the
>> lookup should be eth1.100 for example not eth1, and then if eth1.100 is
>> enslaved to a VRF, ...
>>
>> None of it is that complex, just need to walk through the various use
>> cases and make sure bpf_ipv4_fwd_lookup and bpf_ipv6_fwd_lookup can do
>> the right thing for these common use cases.
> I'm a bit lost here. Why this is a concern?
> 'index' as argument that bpf prog is passing into the helper.
> The clsbpf program may choose to pass ifindex of the netdev
> it's attached to or some other one.
> In your patch you have:
> +BPF_CALL_3(bpf_ipv4_fwd_lookup, int, index, const struct iphdr *, iph,
> +	   struct ethhdr *, eth)
> +{
> +	struct flowi4 fl4 = {
> +		.daddr = iph->daddr,
> +		.saddr = iph->saddr,
> +		.flowi4_iif = index,
> +		.flowi4_tos = iph->tos & IPTOS_RT_MASK,
> +		.flowi4_scope = RT_SCOPE_UNIVERSE,
> +	};
> 
> As you saying there is concern with .flowi4_iif = index line ?

yes. BPF / XDP programs are installed on the bottom device ... e.g.,
eth1. The L3 lookup is not necessarily done on that device index.


> In the above the only thing Daniel and myself pointed out that
> passing struct iphdr * like this is not safe.
> We either need size argument which would be a bit cumbersome or
> extend verifier a little to specify size as part of helper proto,
> so that verifier can eforce it without having program to pass it.
> imo that's the only bit missing from that patch to upstream it.

sure. I did not mean that item 1. was a big deal, just something that
needed to be fixed.

> 
> Also the helper isn't really related to XDP. It should work as-is
> for clsbpf and xdp programs as far as I can tell.
> 

yes.