From: Toke Høiland-Jørgensen
To: Luigi Rizzo
Cc: Daniel Borkmann, netdev@vger.kernel.org, Jesper Dangaard Brouer, "David S.
Miller", sameehj@amazon.com
Subject: Re: [PATCH] net-xdp: netdev attribute to control xdpgeneric skb linearization
References: <20200122203253.20652-1-lrizzo@google.com> <875zh2bis0.fsf@toke.dk> <953c8fee-91f0-85e7-6c7b-b9a2f8df5aa6@iogearbox.net> <87blqui1zu.fsf@toke.dk> <875zh2hx20.fsf@toke.dk> <87r1zpgosp.fsf@toke.dk>
Date: Fri, 24 Jan 2020 16:30:52 +0100
Message-ID: <87r1zog9cj.fsf@toke.dk>

Luigi Rizzo writes:

> On Fri, Jan 24, 2020 at 1:57 AM Toke Høiland-Jørgensen wrote:
>>
>> Daniel Borkmann writes:
>>
>> > On 1/23/20 7:06 PM, Luigi Rizzo wrote:
>> >> On Thu, Jan 23, 2020 at 10:01 AM Toke Høiland-Jørgensen wrote:
>> >>> Luigi Rizzo writes:
>> >>>> On Thu, Jan 23, 2020 at 8:14 AM Toke Høiland-Jørgensen wrote:
>> >>>>> Daniel Borkmann writes:
>> >>>>>> On 1/23/20 10:53 AM, Toke Høiland-Jørgensen wrote:
>> >>>>>>> Luigi Rizzo writes:
>> >>>>>>>
>> >>>>>>>> Add a netdevice flag to control skb linearization in generic xdp mode.
>> >>>>>>>> Among the various mechanism to control the flag, the sysfs
>> >>>>>>>> interface seems sufficiently simple and self-contained.
>> >>>>>>>> The attribute can be modified through
>> >>>>>>>> /sys/class/net//xdp_linearize
>> >>>>>>>> The default is 1 (on)
>> >>>>>>
>> >>>>>> Needs documentation in Documentation/ABI/testing/sysfs-class-net.
>> >>>>>>
>> >>>>>>> Erm, won't turning off linearization break the XDP program's ability to
>> >>>>>>> do direct packet access?
>> >>>>>>
>> >>>>>> Yes, in the worst case you only have eth header pulled into linear
>> >>>>>> section. :/
>> >>>>>
>> >>>>> In which case an eBPF program could read/write out of bounds since the
>> >>>>> verifier only verifies checks against xdp->data_end. Right?
>> >>>>
>> >>>> Why out of bounds? Without linearization we construct xdp_buff as follows:
>> >>>>
>> >>>> mac_len = skb->data - skb_mac_header(skb);
>> >>>> hlen = skb_headlen(skb) + mac_len;
>> >>>> xdp->data = skb->data - mac_len;
>> >>>> xdp->data_end = xdp->data + hlen;
>> >>>> xdp->data_hard_start = skb->data - skb_headroom(skb);
>> >>>>
>> >>>> so we shouldn't go out of bounds.
>> >>>
>> >>> Hmm, right, as long as it's guaranteed that the bit up to hlen is
>> >>> already linear; is it? :)
>> >>
>> >> honest question: that would be skb->len - skb->data_len, isn't that
>> >> the linear part by definition ?
>> >
>> > Yep, that's the linear part by definition. Generic XDP with ->data/->data_end is in
>> > this aspect no different from tc/BPF where we operate on skb context. Only linear part
>> > can be covered from skb (unless you pull in more via helper for the
>> > latter).
>>
>> OK, but then why are we linearising in the first place? Just to get
>> sufficient headroom?
>
> Looking at the condition in the if() it is both to make sufficient
> headroom available and have linear data so the bpf code can access all
> the packet data.

Ohhh, didn't realise that linearising also changes skb_headlen() - makes
so much more sense now :)

> My motivation for this change is that enforcing those guarantees has
> significant cost (even for native xdp in the cases I mentioned - mtu >
> 1 page, hw LRO, header split), and this is an interim solution to make
> generic skb usable without too much penalty.

Sure, that part I understand; I just don't like that this "interim"
solution makes generic and native XDP diverge further in their
semantics...

> In the long term I think it would be good if the xdp program could
> express its requirements at load time ("i just need header, I need at
> least 18 bytes of headroom..") and have the netdev or nic driver
> reconfigure as appropriate.

This may be interesting to include in the XDP feature detection
capabilities we've been discussing for some time. Our current thinking
is that the verifier should detect what a program does, rather than the
program having to explicitly declare what features it needs. See
https://github.com/xdp-project/xdp-project/blob/master/xdp-project.org#notes-implementation-plan
for some notes on this :)

-Toke
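
For reference, the xdp_buff construction quoted in the thread can be
modelled in userspace as below. This is only a sketch under stated
assumptions: struct mock_skb, struct mock_xdp_buff and every mock_*
helper are hypothetical stand-ins invented for illustration, not the
kernel's definitions. It shows that data_end stops at the end of the
linear part, so a bounds check against data_end rejects any access that
would reach into the paged fragments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff. */
struct mock_skb {
	unsigned char *head;       /* start of the allocated buffer */
	unsigned char *mac_header; /* start of the Ethernet header */
	unsigned char *data;       /* current data pointer, past the MAC header */
	unsigned int len;          /* bytes from data: linear plus paged */
	unsigned int data_len;     /* bytes held in paged fragments only */
};

/* Hypothetical stand-in for the three xdp_buff pointers from the thread. */
struct mock_xdp_buff {
	unsigned char *data;
	unsigned char *data_end;
	unsigned char *data_hard_start;
};

/* Linear bytes starting at skb->data: total length minus paged length. */
static unsigned int mock_skb_headlen(const struct mock_skb *skb)
{
	return skb->len - skb->data_len;
}

/* Free space between the start of the buffer and skb->data. */
static unsigned int mock_skb_headroom(const struct mock_skb *skb)
{
	return (unsigned int)(skb->data - skb->head);
}

/* Same pointer arithmetic as the snippet quoted in the thread. */
static void mock_build_xdp(const struct mock_skb *skb,
			   struct mock_xdp_buff *xdp)
{
	unsigned int mac_len, hlen;

	assert(skb->data >= skb->mac_header);
	mac_len = (unsigned int)(skb->data - skb->mac_header);
	hlen = mock_skb_headlen(skb) + mac_len;

	xdp->data = skb->data - mac_len;
	xdp->data_end = xdp->data + hlen;
	xdp->data_hard_start = skb->data - mock_skb_headroom(skb);
}

/*
 * The kind of check a program has to make before direct packet access:
 * returns 1 if [off, off + size) lies inside [data, data_end), else 0.
 */
static int mock_access_ok(const struct mock_xdp_buff *xdp,
			  size_t off, size_t size)
{
	size_t window = (size_t)(xdp->data_end - xdp->data);

	return off <= window && size <= window - off;
}
```

With a 14-byte MAC header, 100 linear bytes and 1400 paged bytes, the
window between data and data_end is 114 bytes: reading the Ethernet
header passes the check, while anything reaching into the paged part is
rejected rather than read out of bounds.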