All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "沈安琪(凛玥)" <amy.saq@antgroup.com>
Cc: "Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	netdev@vger.kernel.org, davem@davemloft.net, jasowang@redhat.com,
	谈鉴锋 <henry.tjf@antgroup.com>
Subject: Re: [PATCH 2/2] net/packet: send and receive pkt with given vnet_hdr_sz
Date: Wed, 22 Feb 2023 06:37:59 -0500	[thread overview]
Message-ID: <20230222063242-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <4b431f19-b5f2-6704-318e-6bde113a3e0a@antgroup.com>

On Wed, Feb 22, 2023 at 04:04:34PM +0800, 沈安琪(凛玥) wrote:
> 
> 在 2023/2/21 下午11:03, Willem de Bruijn 写道:
> > 沈安琪(凛玥) wrote:
> > > 在 2023/2/14 下午10:28, Willem de Bruijn 写道:
> > > > 沈安琪(凛玥) wrote:
> > > > > 在 2023/2/10 下午11:39, Willem de Bruijn 写道:
> > > > > > Michael S. Tsirkin wrote:
> > > > > > > On Fri, Feb 10, 2023 at 12:01:03PM +0800, 沈安琪(凛玥) wrote:
> > > > > > > > 在 2023/2/9 下午9:07, Michael S. Tsirkin 写道:
> > > > > > > > > On Thu, Feb 09, 2023 at 08:43:15PM +0800, 沈安琪(凛玥) wrote:
> > > > > > > > > > From: "Jianfeng Tan" <henry.tjf@antgroup.com>
> > > > > > > > > > 
> > > > > > > > > > When raw socket is used as the backend for kernel vhost, currently it
> > > > > > > > > > will regard the virtio net header as 10-byte, which is not always the
> > > > > > > > > > case since some virtio features need virtio net header other than
> > > > > > > > > > 10-byte, such as mrg_rxbuf and VERSION_1 that both need 12-byte virtio
> > > > > > > > > > net header.
> > > > > > > > > > 
> > > > > > > > > > Instead of hardcoding virtio net header length to 10 bytes, tpacket_snd,
> > > > > > > > > > tpacket_rcv, packet_snd and packet_recvmsg now get the virtio net header
> > > > > > > > > > size that is recorded in packet_sock to indicate the exact virtio net
> > > > > > > > > > header size that virtio user actually prepares in the packets. By doing
> > > > > > > > > > so, it can fix the issue of incorrect mac header parsing when these
> > > > > > > > > > virtio features that need virtio net header other than 10-byte are
> > > > > > > > > > enable.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Jianfeng Tan <henry.tjf@antgroup.com>
> > > > > > > > > > Co-developed-by: Anqi Shen <amy.saq@antgroup.com>
> > > > > > > > > > Signed-off-by: Anqi Shen <amy.saq@antgroup.com>
> > > > > > > > > Does it handle VERSION_1 though? That one is also LE.
> > > > > > > > > Would it be better to pass a features bitmap instead?
> > > > > > > > Thanks for quick reply!
> > > > > > > > 
> > > > > > > > I am a little confused abot what "LE" presents here?
> > > > > > > LE == little_endian.
> > > > > > > Little endian format.
> > > > > > > 
> > > > > > > > For passing a features bitmap to af_packet here, our consideration is
> > > > > > > > whether it will be too complicated for af_packet to understand the virtio
> > > > > > > > features bitmap in order to get the vnet header size. For now, all the
> > > > > > > > virtio features stuff is handled by vhost worker and af_packet actually does
> > > > > > > > not need to know much about virtio features. Would it be better if we keep
> > > > > > > > the virtio feature stuff in user-level and let user-level tell af_packet how
> > > > > > > > much space it should reserve?
> > > > > > > Presumably, we'd add an API in include/linux/virtio_net.h ?
> > > > > > Better leave this opaque to packet sockets if they won't act on this
> > > > > > type info.
> > > > > > This patch series probably should be a single patch btw. As else the
> > > > > > socket option introduced in the first is broken at that commit, since
> > > > > > the behavior is only introduced in patch 2.
> > > > > Good point, will merge this patch series into one patch.
> > > > > 
> > > > > 
> > > > > Thanks for Michael's enlightening advice, we plan to modify current UAPI
> > > > > change of adding an extra socketopt from only setting vnet header size
> > > > > only to setting a bit-map of virtio features, and implement another
> > > > > helper function in include/linux/virtio_net.h to parse the feature
> > > > > bit-map. In this case, packet sockets have no need to understand the
> > > > > feature bit-map but only pass this bit-map to virtio_net helper and get
> > > > > back the information, such as vnet header size, it needs.
> > > > > 
> > > > > This change will make the new UAPI more general and avoid further
> > > > > modification if there are more virtio features to support in the future.
> > > > > 
> > > > Please also comment how these UAPI extension are intended to be used.
> > > > As that use is not included in this initial patch series.
> > > > 
> > > > If the only intended user is vhost-net, we can consider not exposing
> > > > outside the kernel at all. That makes it easier to iterate if
> > > > necessary (no stable ABI) and avoids accidentally opening up new
> > > > avenues for bugs and exploits (syzkaller has a history with
> > > > virtio_net_header options).
> > > 
> > > Our concern is, it seems there is no other solution than uapi to let
> > > packet sockets know the vnet header size they should use.
> > > 
> > > Receiving packets in vhost driver, implemented in drivers/vhost/net.c:
> > > 1109 handle_rx(), will abstract the backend device it uses and directly
> > > invoke the corresponding socket ops with no extra information indicating
> > > it is invoked by vhost worker. Vhost worker actually does not know the
> > > type of backend device it is using; only virito-user knows what type of
> > > backend device it uses. Therefore, it seems impossible to let vhost set
> > > the vnet header information to the target backend device.
> > > 
> > > Tap, another kind of backend device vhost may use, lets virtio-user set
> > > whether it needs vnet header and how long the vnet header is through
> > > ioctl. (implemented in drivers/net/tap.c:1066)
> > > 
> > > In this case, we wonder whether we should align with what tap does and
> > > set vnet hdr size through setsockopt for packet_sockets.
> > > 
> > > We really appreciate suggestions on if any, potential approachs to pass
> > > this vnet header size information from virtio-user to packet-socket.
> > You're right. This is configured from userspace before the FD is passed
> > to vhost-net, so indeed this will require packet socket UAPI support.
> 
> 
> Thanks for quick reply. We will go with adding an extra UAPI here then.
> 
> 
> Another discussion for designing this UAPI is, whether it will be better to
> support setting only vnet header size, just like what TAP does in its ioctl,
> or to support setting a virtio feature bit-map.
> 
> 
> UAPI setting only vnet header size
> 
> Pros:
> 
> 1. It aligns with how other virito backend devices communicate with
> virtio-user
> 
> 2. We can use the holes in struct packet_socket (net/packet/internal.h:120)
> to record the extra information since the size info only takes 8 bits.
> 
> Cons:
> 
> 1. It may have more information that virtio-user needs to communicate with
> packet socket in the future and needs to add more UAPI supports here.
> 
> To Michael: Is there any other information that backend device needs and
> will be given from virtio-user?


Yes e.g. I already mentioned virtio 1.0 wrt LE versus native endian
format.


> 
> UAPI setting a virtio feature bit-map
> 
> Pros:
> 
> 1. It is more general and may reduce future UAPI changes.
> 
> Cons:
> 
> 1. A virtio feature bit-map needs 64 bits, which needs to add an extra field
> in packet_sock struct
> 
> 2. Virtio-user needs to aware that using packet socket as backend supports
> different approach to negotiate the vnet header size.
> 
> 
> We really appreciate any suggestion or discussion on this design choice of
> UAPI.

In the end it's ok with just size too, you just probably shouldn't say
you support VERSION_1 if you are not passing that bit.

-- 
MST


  reply	other threads:[~2023-02-22 11:39 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-09 12:43 [PATCH 0/2] net/packet: support of specifying virtio net header size 沈安琪(凛玥)
2023-02-09 12:43 ` [PATCH 1/2] net/packet: add socketopt to set/get vnet_hdr_sz 沈安琪(凛玥)
2023-02-09 14:45   ` Willem de Bruijn
2023-02-10  4:00     ` 沈安琪(凛玥)
2023-02-09 12:43 ` [PATCH 2/2] net/packet: send and receive pkt with given vnet_hdr_sz 沈安琪(凛玥)
2023-02-09 13:07   ` Michael S. Tsirkin
2023-02-10  4:01     ` 沈安琪(凛玥)
2023-02-10  8:10       ` Michael S. Tsirkin
2023-02-10 15:36         ` Willem de Bruijn
2023-02-12  9:54           ` Michael S. Tsirkin
2023-02-10 15:39         ` Willem de Bruijn
2023-02-13 11:06           ` 沈安琪(凛玥)
2023-02-14 14:28             ` Willem de Bruijn
2023-02-21  9:40               ` 沈安琪(凛玥)
2023-02-21 15:03                 ` Willem de Bruijn
2023-02-22  8:04                   ` 沈安琪(凛玥)
2023-02-22 11:37                     ` Michael S. Tsirkin [this message]
2023-02-22 11:43                       ` 沈安琪(凛玥)
2023-02-22 15:04                         ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230222063242-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=amy.saq@antgroup.com \
    --cc=davem@davemloft.net \
    --cc=henry.tjf@antgroup.com \
    --cc=jasowang@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.