Linux virtualization list
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>
Cc: "alexander.h.duyck@intel.com" <alexander.h.duyck@intel.com>,
	Virtio-Dev <virtio-dev@lists.oasis-open.org>,
	"kubakici@wp.pl" <kubakici@wp.pl>,
	"sridhar.samudrala@intel.com" <sridhar.samudrala@intel.com>,
	"jesse.brandeburg@intel.com" <jesse.brandeburg@intel.com>,
	Gavi Teitz <gavi@nvidia.com>,
	virtualization <virtualization@lists.linux-foundation.org>,
	"Hemminger, Stephen" <stephen@networkplumber.org>,
	"loseweigh@gmail.com" <loseweigh@gmail.com>,
	davem <davem@davemloft.net>, Gavin Li <gavinl@nvidia.com>
Subject: Re: [virtio-dev] [PATCH] virtio-net: use mtu size as buffer length for big packets
Date: Tue, 9 Aug 2022 19:03:09 -0400
Message-ID: <20220809190206-mutt-send-email-mst@kernel.org>
In-Reply-To: <0c6c876b-1d52-bfc8-87d4-edbe6b8581bc@oracle.com>

On Tue, Aug 09, 2022 at 03:54:57PM -0700, Si-Wei Liu wrote:
> 
> 
> On 8/9/2022 3:37 PM, Michael S. Tsirkin wrote:
> > On Tue, Aug 09, 2022 at 03:32:26PM -0700, Si-Wei Liu wrote:
> > > 
> > > On 8/9/2022 2:37 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Aug 09, 2022 at 07:18:30PM +0000, Parav Pandit wrote:
> > > > > > From: Si-Wei Liu <si-wei.liu@oracle.com>
> > > > > > Sent: Tuesday, August 9, 2022 3:09 PM
> > > > > > > > From: Si-Wei Liu <si-wei.liu@oracle.com>
> > > > > > > > Sent: Tuesday, August 9, 2022 2:39 PM
> > > > > > > > Currently it is not. Not a
> > > > > > > > single patch nor this patch, but the context for the eventual goal is
> > > > > > > > to allow XDP on a MTU=9000 link when guest users intentionally lower
> > > > > > > > down MTU to 1500.
> > > > > > > Which application benefits from the asymmetry of lowering the mtu to 1500
> > > > > > > to send packets while still wanting to receive 9K packets?
> > > > > > The details below don't answer the question of asymmetry. :)
> > > > > 
> > > > > > I think virtio-net driver doesn't differentiate MTU and MRU, in which case
> > > > > > the receive buffer will be reduced to fit the 1500B payload size when mtu is
> > > > > > lowered down to 1500 from 9000.
> > > > > How? Driver reduced the mXu to 1500, say it is improved to post buffers of 1500 bytes.
> > > > > 
> > > > > > The device doesn't know about it because mtu in config space is an RO field.
> > > > > > The device keeps dropping 9K packets because the buffers posted are 1500 bytes.
> > > > > > This is because the device follows the spec: "The device MUST NOT pass received packets that exceed mtu".
> > > > The "mtu" here is the device config field, which is
> > > > 
> > > >           /* Default maximum transmit unit advice */
> > > > 
> > > > there is no guarantee device will not get a bigger packet.
> > > > And there is no guarantee such a packet will be dropped
> > > > as opposed to wedging the device if userspace insists on
> > > > adding smaller buffers.
> > > > It'd be nice to document this requirement or statement in the spec for
> > > > clarity.
> > It's not a requirement, more of a bug. But it's been like this for
> > years.
> Well, I'm not sure how it may wedge the device if not capable of posting to
> smaller buffers, is there other option than drop? Truncate to what the
> buffer size may fit and deliver up? Seems even worse than drop...
> 
> > 
> > > Otherwise various device implementations are hard to
> > > follow. The capture is that vhost-net drops bigger packets while the driver
> > > only supplied smaller buffers. This is the status quo, and users seemingly
> > > have relied on this behavior for some while.
> > > 
> > > -Siwei
> > Weird where do you see this in code? I see
> > 
> >                  sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
> >                                                        &busyloop_intr);
> >                  if (!sock_len)
> >                          break;
> >                  sock_len += sock_hlen;
> >                  vhost_len = sock_len + vhost_hlen;
> >                  headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
> >                                          vhost_len, &in, vq_log, &log,
> >                                          likely(mergeable) ? UIO_MAXIOV : 1);
> >                  /* On error, stop handling until the next kick. */
> >                  if (unlikely(headcount < 0))
> >                          goto out;
> > 
> > 
> > so it does not drop a packet, it just stops processing the queue.
> Here
> 
>                 /* On overrun, truncate and discard */
>                 if (unlikely(headcount > UIO_MAXIOV)) {
>                         iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
>                         err = sock->ops->recvmsg(sock, &msg,
>                                                  1, MSG_DONTWAIT | MSG_TRUNC);
>                         pr_debug("Discarded rx packet: len %zd\n", sock_len);
>                         continue;
>                 }
> 
> recvmsg(, , 1, ) essentially drops the oversized packet.
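
For anyone who wants to see the drop-by-short-read behaviour outside the kernel, here is a minimal userspace sketch. It is only an analogue: vhost-net calls sock->ops->recvmsg() on its backend socket inside the kernel, whereas this uses an AF_UNIX datagram socketpair as a stand-in. The helper names discard_one_datagram() and demo_discard_then_read() are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Discard the datagram at the head of fd's receive queue by reading at
 * most one byte of it; the rest of that datagram is thrown away.
 * Returns recv()'s result (-1 on error).
 */
static ssize_t discard_one_datagram(int fd)
{
	char scratch;

	return recv(fd, &scratch, 1, MSG_DONTWAIT | MSG_TRUNC);
}

/*
 * Hypothetical demo: queue a large and a small datagram on a socketpair,
 * discard the large one, then read the small one into out.
 * Returns the number of bytes read for the second datagram, or -1.
 */
static ssize_t demo_discard_then_read(char *out, size_t outlen)
{
	static const char small[] = "next";
	char big[9000];
	ssize_t n = -1;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	memset(big, 'x', sizeof(big));
	if (send(sv[0], big, sizeof(big), 0) == (ssize_t)sizeof(big) &&
	    send(sv[0], small, sizeof(small), 0) == (ssize_t)sizeof(small) &&
	    discard_one_datagram(sv[1]) >= 0)
		n = recv(sv[1], out, outlen, MSG_DONTWAIT);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

After the one-byte read, the next recv() sees the second datagram rather than the remainder of the first, which is the sense in which the short read "drops" the oversized packet.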
> 
> 
> In get_rx_bufs(), overrun detection will return something larger than
> UIO_MAXIOV as an indicator:
> 
> static int get_rx_bufs()
> {
> :
> ;
>         /* Detect overrun */
>         if (unlikely(datalen > 0)) {
>                 r = UIO_MAXIOV + 1;
>                 goto err;
>         }
> :
> :
> 
> 
> -Siwei


Hmm you are right. I'll check but it seems I have misread the code.
Sorry about wasting your time on this.
So maybe the approach is ok then.
It's late, I'll recheck tomorrow.
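
To make the flow quoted above concrete, here is a rough userspace model of it, not the kernel code. It assumes the posted buffers are described by a plain array of lengths; model_get_rx_bufs() and model_should_discard() are hypothetical names, and an empty ring is modelled as a return of 0 (caller waits for more buffers), which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define UIO_MAXIOV 1024 /* same value as the kernel's UIO_MAXIOV */

/*
 * Hypothetical model of the overrun check in get_rx_bufs().
 * buf_len[] holds the sizes of the buffers the driver posted; quota is
 * the maximum number of buffers one packet may use (1 without mergeable
 * buffers, UIO_MAXIOV with them).
 * Returns the number of buffers consumed, 0 if the ring ran out of
 * buffers before the quota was reached, or UIO_MAXIOV + 1 on overrun.
 */
static int model_get_rx_bufs(const size_t *buf_len, size_t nbufs,
			     size_t datalen, size_t quota)
{
	size_t used = 0;

	assert(quota <= UIO_MAXIOV);
	while (datalen > 0 && used < quota) {
		size_t take;

		if (used == nbufs)
			return 0; /* nothing left in the ring: wait */
		take = buf_len[used] < datalen ? buf_len[used] : datalen;
		datalen -= take;
		used++;
	}
	/* Detect overrun: quota exhausted but data is still left over. */
	if (datalen > 0)
		return UIO_MAXIOV + 1;
	return (int)used;
}

/* Mirrors the caller's test: anything above UIO_MAXIOV means "discard". */
static int model_should_discard(int headcount)
{
	return headcount > UIO_MAXIOV;
}
```

With a single 1500-byte buffer and quota 1, a 9000-byte packet comes back as UIO_MAXIOV + 1, so the caller takes the discard branch and continues with the next packet instead of stopping the queue.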


> > 
> > 
> > > > 
> > > > > So, I am lost what virtio net device user application is trying to achieve by sending smaller packets and dropping all receive packets.
> > > > > (it doesn’t have any relation to mergeable or otherwise).
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > > 

