From: "Michael S. Tsirkin" <mst@redhat.com>
To: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Jason Wang <jasowang@redhat.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
	davem@davemloft.net, kuba@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
	songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
	kpsingh@kernel.org, rdunlap@infradead.org, willemb@google.com,
	gustavoars@kernel.org, herbert@gondor.apana.org.au,
	steffen.klassert@secunet.com, nogikh@google.com,
	pablo@netfilter.org, decui@microsoft.com, cai@lca.pw,
	jakub@cloudflare.com, elver@google.com, pabeni@redhat.com,
	Yuri Benditovich <yuri.benditovich@daynix.com>
Subject: Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature
Date: Mon, 9 Oct 2023 07:50:10 -0400
Message-ID: <20231009074840-mutt-send-email-mst@kernel.org>
In-Reply-To: <48e20be1-b658-4117-8856-89ff1df6f48f@daynix.com>

On Mon, Oct 09, 2023 at 05:44:20PM +0900, Akihiko Odaki wrote:
> On 2023/10/09 17:13, Willem de Bruijn wrote:
> > On Sun, Oct 8, 2023 at 12:22 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
> > > 
> > > > virtio-net has two uses for hashes: one is RSS and the other is hash
> > > reporting. Conventionally the hash calculation was done by the VMM.
> > > However, computing the hash after the queue was chosen defeats the
> > > purpose of RSS.
> > > 
> > > > Another approach is to use an eBPF steering program. This approach has
> > > > its own downside: it cannot report the calculated hash due to the
> > > > restrictive nature of eBPF.
> > > 
> > > > Introduce code to compute hashes in the kernel in order to overcome
> > > > these challenges. An alternative solution is to extend the eBPF steering
> > > > program so that it can report the hash to userspace, but it makes
> > > > little sense to allow implementing different hashing algorithms with
> > > > eBPF since the hash value reported by virtio-net is strictly defined by
> > > > the specification.
> > > 
> > > > The hash value already stored in sk_buff is not used; the hash is computed
> > > > independently since the stored value may have been computed in a way not
> > > > conformant with the specification.
> > > 
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > 
> > > @@ -2116,31 +2172,49 @@ static ssize_t tun_put_user(struct tun_struct *tun,
> > >          }
> > > 
> > >          if (vnet_hdr_sz) {
> > > -               struct virtio_net_hdr gso;
> > > +               union {
> > > +                       struct virtio_net_hdr hdr;
> > > +                       struct virtio_net_hdr_v1_hash v1_hash_hdr;
> > > +               } hdr;
> > > +               int ret;
> > > 
> > >                  if (iov_iter_count(iter) < vnet_hdr_sz)
> > >                          return -EINVAL;
> > > 
> > > -               if (virtio_net_hdr_from_skb(skb, &gso,
> > > -                                           tun_is_little_endian(tun), true,
> > > -                                           vlan_hlen)) {
> > > +               if ((READ_ONCE(tun->vnet_hash.flags) & TUN_VNET_HASH_REPORT) &&
> > > +                   vnet_hdr_sz >= sizeof(hdr.v1_hash_hdr) &&
> > > +                   skb->tun_vnet_hash) {
> > 
> > > Isn't vnet_hdr_sz guaranteed to be >= sizeof(hdr.v1_hash_hdr), by virtue of
> > > the set hash ioctl failing otherwise?
> > > 
> > > Such checks should be limited to the control path where possible.
> 
> > There is a potential race since tun->vnet_hash.flags and vnet_hdr_sz are not
> > read together atomically.

And then it's a complete mess and you get inconsistent
behaviour with packets getting sent all over the place, right?
So maybe keep a pointer to this struct so it can be
changed atomically then. Maybe even something with RCU, I dunno.

-- 
MST
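
The pointer-swap idea above can be illustrated with a small userspace sketch. It is not code from the patch: struct vnet_cfg, cfg_publish(), cfg_wants_hash_hdr(), CFG_HASH_REPORT and the two header sizes are hypothetical names and values chosen for illustration, and C11 atomics stand in for RCU. In kernel code the analogous primitives would presumably be rcu_assign_pointer() for publishing, rcu_dereference() under rcu_read_lock() for reading, and kfree_rcu() for releasing the old object. The point is that the flags and the header size live in one object, so a reader that loads the pointer once cannot observe a new flag value paired with an old size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the sizes are illustrative, not taken from the patch. */
#define BASIC_HDR_SZ    12
#define HASH_HDR_SZ     20
#define CFG_HASH_REPORT 0x1u

static_assert(HASH_HDR_SZ > BASIC_HDR_SZ, "hash header must extend the basic header");

/* Everything the data path must see consistently lives in one object. */
struct vnet_cfg {
	unsigned int flags;
	size_t hdr_sz;
};

/* Readers only ever reach a fully initialised snapshot through this pointer. */
static _Atomic(struct vnet_cfg *) current_cfg;

/*
 * Control path: validate, build a new snapshot, publish it with one exchange.
 * On success *old receives the previous snapshot (or NULL); the caller frees
 * it once no reader can still hold it. On failure nothing is changed.
 */
static int cfg_publish(unsigned int flags, size_t hdr_sz, struct vnet_cfg **old)
{
	struct vnet_cfg *next;

	/* The size check happens here, once, instead of per packet. */
	if ((flags & CFG_HASH_REPORT) && hdr_sz < HASH_HDR_SZ)
		return -1;

	next = malloc(sizeof(*next));
	if (!next)
		return -1;
	next->flags = flags;
	next->hdr_sz = hdr_sz;

	*old = atomic_exchange_explicit(&current_cfg, next, memory_order_acq_rel);
	return 0;
}

/* Data path: a single load, so both fields come from the same snapshot. */
static bool cfg_wants_hash_hdr(void)
{
	const struct vnet_cfg *cfg =
		atomic_load_explicit(&current_cfg, memory_order_acquire);

	return cfg && (cfg->flags & CFG_HASH_REPORT);
}
```

Because cfg_publish() rejects a hash-report flag with a too-small header size, the reader in this sketch does not need to re-check the size on every packet. The sketch leaves reclamation to the caller; deciding when the old snapshot is safe to free is exactly the part RCU would handle in the kernel.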

