Date: Mon, 28 Dec 2020 16:36:35 -0500
From: "Michael S. Tsirkin"
To: Willem de Bruijn
Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
 jasowang@redhat.com, Willem de Bruijn
Subject: Re: [PATCH rfc 1/3] virtio-net: support transmit hash report
Message-ID: <20201228163359-mutt-send-email-mst@kernel.org>
References: <20201228162233.2032571-1-willemdebruijn.kernel@gmail.com>
 <20201228162233.2032571-2-willemdebruijn.kernel@gmail.com>
In-Reply-To: <20201228162233.2032571-2-willemdebruijn.kernel@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

On Mon, Dec 28, 2020 at 11:22:31AM -0500, Willem de Bruijn wrote:
> From: Willem de Bruijn
>
> Virtio-net supports sharing the flow hash from host to guest on rx.
> Do the same on transmit, to allow the host to infer connection state
> for more robust routing and telemetry.
>
> Linux derives ipv6 flowlabel and ECMP multipath from sk->sk_txhash,
> and updates these fields on error with sk_rethink_txhash. This feature
> allows the host to make similar decisions.
>
> Besides the raw hash, optionally also convey connection state for
> this hash. Specifically, the hash rotates on transmit timeout. To
> avoid having to keep a stateful table in the host to detect flow
> changes, explicitly notify when a hash changed due to timeout.
>
> Signed-off-by: Willem de Bruijn
> ---
>  drivers/net/virtio_net.c        | 24 +++++++++++++++++++++---
>  include/uapi/linux/virtio_net.h | 10 +++++++++-
>  2 files changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 21b71148c532..b917b7333928 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -201,6 +201,9 @@ struct virtnet_info {
>  	/* Host will merge rx buffers for big packets (shake it! shake it!) */
>  	bool mergeable_rx_bufs;
>
> +	/* Guest will pass tx path info to the host */
> +	bool has_tx_hash;
> +
>  	/* Has control virtqueue */
>  	bool has_cvq;
>
> @@ -394,9 +397,9 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>
>  	hdr_len = vi->hdr_len;
>  	if (vi->mergeable_rx_bufs)
> -		hdr_padded_len = sizeof(*hdr);
> +		hdr_padded_len = max_t(unsigned int, hdr_len, sizeof(*hdr));
>  	else
> -		hdr_padded_len = sizeof(struct padded_vnet_hdr);
> +		hdr_padded_len = ALIGN(hdr_len, 16);
>
>  	/* hdr_valid means no XDP, so we can copy the vnet header */
>  	if (hdr_valid)
> @@ -1534,6 +1537,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>  	struct virtio_net_hdr_mrg_rxbuf *hdr;
>  	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
>  	struct virtnet_info *vi = sq->vq->vdev->priv;
> +	struct virtio_net_hdr_v1_hash *ht;
>  	int num_sg;
>  	unsigned hdr_len = vi->hdr_len;
>  	bool can_push;
> @@ -1558,6 +1562,14 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>  	if (vi->mergeable_rx_bufs)
>  		hdr->num_buffers = 0;
>
> +	ht = (void *)hdr;
> +	if (vi->has_tx_hash) {
> +		ht->hash_value = cpu_to_virtio32(vi->vdev, skb->hash);
> +		ht->hash_report = skb->l4_hash ? VIRTIO_NET_HASH_REPORT_L4 :
> +						 VIRTIO_NET_HASH_REPORT_OTHER;
> +		ht->hash_state = VIRTIO_NET_HASH_STATE_DEFAULT;
> +	}
> +
>  	sg_init_table(sq->sg, skb_shinfo(skb)->nr_frags + (can_push ? 1 : 2));
>  	if (can_push) {
>  		__skb_push(skb, hdr_len);
> @@ -3054,6 +3066,11 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	else
>  		vi->hdr_len = sizeof(struct virtio_net_hdr);
>
> +	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_TX_HASH)) {
> +		vi->has_tx_hash = true;
> +		vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
> +	}
> +
>  	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
>  	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
>  		vi->any_header_sg = true;
> @@ -3243,7 +3260,8 @@ static struct virtio_device_id id_table[] = {
>  	VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \
>  	VIRTIO_NET_F_CTRL_MAC_ADDR, \
>  	VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, \
> -	VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_STANDBY
> +	VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_STANDBY, \
> +	VIRTIO_NET_F_TX_HASH
>
>  static unsigned int features[] = {
>  	VIRTNET_FEATURES,
> diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
> index 3f55a4215f11..f6881b5b77ee 100644
> --- a/include/uapi/linux/virtio_net.h
> +++ b/include/uapi/linux/virtio_net.h
> @@ -57,6 +57,7 @@
>  					 * Steering */
>  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
>
> +#define VIRTIO_NET_F_TX_HASH	  56	/* Guest sends hash report */
>  #define VIRTIO_NET_F_HASH_REPORT  57	/* Supports hash report */
>  #define VIRTIO_NET_F_RSS	  60	/* Supports RSS RX steering */
>  #define VIRTIO_NET_F_RSC_EXT	  61	/* extended coalescing info */
> @@ -170,8 +171,15 @@ struct virtio_net_hdr_v1_hash {
>  #define VIRTIO_NET_HASH_REPORT_IPv6_EX         7
>  #define VIRTIO_NET_HASH_REPORT_TCPv6_EX        8
>  #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
> +#define VIRTIO_NET_HASH_REPORT_L4              10
> +#define VIRTIO_NET_HASH_REPORT_OTHER           11

Need to specify these I guess ...
Can't there be any consistency with RX hash? Handy for VM2VM ...

>  	__le16 hash_report;
> -	__le16 padding;
> +	union {
> +		__le16 padding;
> +#define VIRTIO_NET_HASH_STATE_DEFAULT          0
> +#define VIRTIO_NET_HASH_STATE_TIMEOUT_BIT      0x1
> +		__le16 hash_state;
> +	};
>  };
>
>  #ifndef VIRTIO_NET_NO_LEGACY
> --
> 2.29.2.729.g45daf8777d-goog
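
For reference, a minimal host-side sketch in C of how the proposed
transmit header could be consumed. It is not part of the patch. It
assumes the RFC layout is adopted unchanged: a 20-byte little-endian
struct virtio_net_hdr_v1_hash with hash_value at byte offset 12,
hash_report at 16, and hash_state reusing the old padding at 18. The
names struct tx_hash_info, get_le16(), get_le32() and parse_tx_hash()
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the quoted RFC patch; they are proposals, not spec. */
#define VIRTIO_NET_HASH_REPORT_L4          10
#define VIRTIO_NET_HASH_REPORT_OTHER       11
#define VIRTIO_NET_HASH_STATE_DEFAULT      0
#define VIRTIO_NET_HASH_STATE_TIMEOUT_BIT  0x1

/* Assumed byte offsets inside struct virtio_net_hdr_v1_hash. */
enum {
	TX_HDR_HASH_VALUE_OFF  = 12,	/* follows the 12-byte virtio_net_hdr_v1 */
	TX_HDR_HASH_REPORT_OFF = 16,
	TX_HDR_HASH_STATE_OFF  = 18,	/* formerly "padding" */
	TX_HDR_LEN             = 20,
};

static_assert(TX_HDR_HASH_STATE_OFF + 2 == TX_HDR_LEN,
	      "hash_state is the last 16-bit field of the header");

/* Hypothetical decoded view of the hash fields. */
struct tx_hash_info {
	uint32_t value;			/* guest flow hash (skb->hash) */
	uint16_t report;		/* VIRTIO_NET_HASH_REPORT_* */
	bool rotated_on_timeout;	/* TIMEOUT_BIT set in hash_state */
};

/* Read little-endian integers byte by byte, independent of host endianness. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the hash fields; returns false if the buffer is too short. */
static bool parse_tx_hash(const uint8_t *hdr, size_t len,
			  struct tx_hash_info *out)
{
	uint16_t state;

	if (len < TX_HDR_LEN)
		return false;

	out->value = get_le32(hdr + TX_HDR_HASH_VALUE_OFF);
	out->report = get_le16(hdr + TX_HDR_HASH_REPORT_OFF);
	state = get_le16(hdr + TX_HDR_HASH_STATE_OFF);
	out->rotated_on_timeout =
		(state & VIRTIO_NET_HASH_STATE_TIMEOUT_BIT) != 0;
	return true;
}
```

A host could use rotated_on_timeout as the explicit notification the
commit message describes, for example to repick a path for the flow
without keeping a per-flow table. As quoted, xmit_skb() always writes
VIRTIO_NET_HASH_STATE_DEFAULT, so with this patch alone the flag would
read as false.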