From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 28 Feb 2023 06:04:55 -0500
From: "Michael S. Tsirkin"
Subject: Re: [PATCH v9] virtio-net: support inner header hash
Message-ID: <20230228060352-mutt-send-email-mst@kernel.org>
References: <0f53212f-a89b-ad3c-73e3-a7a7b5533058@linux.alibaba.com>
 <1047920c-5dd5-8f31-0c4c-a108f36155f8@redhat.com>
 <20230223075934-mutt-send-email-mst@kernel.org>
 <20230224030509-mutt-send-email-mst@kernel.org>
 <20230227023657-mutt-send-email-mst@kernel.org>
 <20230227124800-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
To: Jason Wang
Cc: Heng Qi, virtio-comment@lists.oasis-open.org,
 virtio-dev@lists.oasis-open.org, Parav Pandit, Yuri Benditovich,
 Cornelia Huck, Xuan Zhuo
List-ID:

On Tue, Feb 28, 2023 at 11:04:26AM +0800, Jason Wang wrote:
> On Tue, Feb 28, 2023 at 1:49 AM Michael S. Tsirkin wrote:
> >
> > On Mon, Feb 27, 2023 at 04:35:09PM +0800, Jason Wang wrote:
> > > On Mon, Feb 27, 2023 at 3:39 PM Michael S. Tsirkin wrote:
> > > >
> > > > On Mon, Feb 27, 2023 at 12:07:17PM +0800, Jason Wang wrote:
> > > > > Btw, this kind of 1:1 hash features seems not scalable and flexible.
> > > > > It requires an endless extension on bits/fields. Modern NICs allow the
> > > > > user to customize the hash calculation, for virtio-net we can allow to
> > > > > use eBPF program to classify the packets. It seems to be more flexible
> > > > > and scalable and there's almost no maintain burden in the spec (only
> > > > > bytecode is required, no need any fancy features/interactions like
> > > > > maps), easy to be migrated etc.
> > > > >
> > > > > Prototype is also easy, tun/tap had an eBPF classifier for years.
> > > > >
> > > > > Thanks
> > > >
> > > > Yea BPF offload would be great to have. We have been discussing it for
> > > > years though - security issues keep blocking it. *Maybe* it's finally
> > > > going to be there but I'm not going to block this work waiting for BPF
> > > > offload. And easily migrated is what BPF is not.
> > >
> > > Just to make sure we're at the same page. I meant to find a way to
> > > allow the driver/user to fully customize what it wants to
> > > hash/classify. Similar technologies which is based on private solution
> > > has been used by some vendors, which allow user to customize the
> > > classifier[1]
> > >
> > > eBPF looks like a good open-source solution candidate for this (there
> > > could be others). But there could be many kinds of eBPF programs that
> > > could be offloaded. One famous one is XDP which requires many features
> > > other than the bytecode/VM like map access, tailcall. Starting from
> > > such a complicated type is hard. Instead, we can start from a simple
> > > type, that is the eBPF classifier. All it needs is to pass the
> > > bytecode to the device, the device can choose to run it or compile it
> > > to what it can understand for classifying. We don't need maps, tail
> > > calls and other features.
> >
> > Until people start asking exactly for maps because they want
> > state for their classifier?
>
> Yes, but let's compare the eBPF without maps with the static feature
> proposed here. It is much more scalable and flexible.

I looked for some examples of RSS using BPF and only found this:
https://github.com/Netronome/bpf-samples/blob/master/programmable_rss/rss_user.c

seems to use maps.

> > And it makes sense - if you want
> > e.g. load balancing you need stats which needs maps.
>
> Yes, but we know it's possible to have that (through the XDP offload).

Not without a lot more work to make xdp offload happen.

> This is impossible with the approach proposed here.
>
> >
> > > We don't need to worry about the security
> > > because of its simplicity: the eBPF program is only in charge of doing
> > > classification, no other interactions with the driver and packet
> > > modification is prohibited. The feature is limited only to the
> > > VM/bytecode abstraction itself.
> > >
> > > What's more, it's a good first step to achieve full eBPF offloading in
> > > the future.
> > >
> > > Thanks
> > >
> > > [1] https://www.intel.com/content/www/us/en/architecture-and-technology/ethernet/dynamic-device-personalization-brief.html
> >
> > Dave seems to have nacked this approach, no?
>
> I may miss something but looking at kernel commit, there are few
> patches to support that:
>
> E.g
>
> commit c7648810961682b9388be2dd041df06915647445
> Author: Tony Nguyen
> Date: Mon Sep 9 06:47:44 2019 -0700
>
> ice: Implement Dynamic Device Personalization (DDP) download
>
> And it has been used by DPDK drivers.
>
> Thanks
>
> >
> > > >
> > > > --
> > > > MST
> > > >
> >