Date: Mon, 9 Oct 2023 07:50:10 -0400
From: "Michael S.
Tsirkin"
To: Akihiko Odaki
Cc: Willem de Bruijn, Jason Wang, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org,
 linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
 davem@davemloft.net, kuba@kernel.org, ast@kernel.org,
 daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
 songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
 kpsingh@kernel.org, rdunlap@infradead.org, willemb@google.com,
 gustavoars@kernel.org, herbert@gondor.apana.org.au,
 steffen.klassert@secunet.com, nogikh@google.com, pablo@netfilter.org,
 decui@microsoft.com, cai@lca.pw, jakub@cloudflare.com, elver@google.com,
 pabeni@redhat.com, Yuri Benditovich
Subject: Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature
Message-ID: <20231009074840-mutt-send-email-mst@kernel.org>
References: <20231008052101.144422-1-akihiko.odaki@daynix.com>
 <20231008052101.144422-6-akihiko.odaki@daynix.com>
 <48e20be1-b658-4117-8856-89ff1df6f48f@daynix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <48e20be1-b658-4117-8856-89ff1df6f48f@daynix.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

On Mon, Oct 09, 2023 at 05:44:20PM +0900, Akihiko Odaki wrote:
> On 2023/10/09 17:13, Willem de Bruijn wrote:
> > On Sun, Oct 8, 2023 at 12:22 AM Akihiko Odaki wrote:
> > >
> > > virtio-net have two usage of hashes: one is RSS and another is hash
> > > reporting. Conventionally the hash calculation was done by the VMM.
> > > However, computing the hash after the queue was chosen defeats the
> > > purpose of RSS.
> > >
> > > Another approach is to use eBPF steering program. This approach has
> > > another downside: it cannot report the calculated hash due to the
> > > restrictive nature of eBPF.
> > >
> > > Introduce the code to compute hashes to the kernel in order to overcome
> > > these challenges. An alternative solution is to extend the eBPF steering
> > > program so that it will be able to report to the userspace, but it makes
> > > little sense to allow to implement different hashing algorithms with
> > > eBPF since the hash value reported by virtio-net is strictly defined by
> > > the specification.
> > >
> > > The hash value already stored in sk_buff is not used and computed
> > > independently since it may have been computed in a way not conformant
> > > with the specification.
> > >
> > > Signed-off-by: Akihiko Odaki
> >
> > > @@ -2116,31 +2172,49 @@ static ssize_t tun_put_user(struct tun_struct *tun,
> > >  	}
> > >
> > >  	if (vnet_hdr_sz) {
> > > -		struct virtio_net_hdr gso;
> > > +		union {
> > > +			struct virtio_net_hdr hdr;
> > > +			struct virtio_net_hdr_v1_hash v1_hash_hdr;
> > > +		} hdr;
> > > +		int ret;
> > >
> > >  		if (iov_iter_count(iter) < vnet_hdr_sz)
> > >  			return -EINVAL;
> > >
> > > -		if (virtio_net_hdr_from_skb(skb, &gso,
> > > -					    tun_is_little_endian(tun), true,
> > > -					    vlan_hlen)) {
> > > +		if ((READ_ONCE(tun->vnet_hash.flags) & TUN_VNET_HASH_REPORT) &&
> > > +		    vnet_hdr_sz >= sizeof(hdr.v1_hash_hdr) &&
> > > +		    skb->tun_vnet_hash) {
> >
> > Isn't vnet_hdr_sz guaranteed to be >= hdr.v1_hash_hdr, by virtue of
> > the set hash ioctl failing otherwise?
> >
> > Such checks should be limited to control path where possible
>
> There is a potential race since tun->vnet_hash.flags and vnet_hdr_sz are not
> read at once.

And then it's a complete mess and you get inconsistent behaviour with
packets getting sent all over the place, right?
So maybe keep a pointer to this struct so it can be changed
atomically then. Maybe even something with RCU, I dunno.

--
MST