From: "Michael S. Tsirkin" <mst@redhat.com>
To: Parav Pandit <parav@nvidia.com>
Cc: "alexander.h.duyck@intel.com" <alexander.h.duyck@intel.com>,
	Virtio-Dev <virtio-dev@lists.oasis-open.org>,
	"kubakici@wp.pl" <kubakici@wp.pl>,
	"sridhar.samudrala@intel.com" <sridhar.samudrala@intel.com>,
	"jesse.brandeburg@intel.com" <jesse.brandeburg@intel.com>,
	Gavi Teitz <gavi@nvidia.com>,
	virtualization <virtualization@lists.linux-foundation.org>,
	"Hemminger, Stephen" <stephen@networkplumber.org>,
	"loseweigh@gmail.com" <loseweigh@gmail.com>,
	davem <davem@davemloft.net>, Gavin Li <gavinl@nvidia.com>
Subject: Re: [virtio-dev] [PATCH] virtio-net: use mtu size as buffer length for big packets
Date: Tue, 9 Aug 2022 18:59:50 -0400
Message-ID: <20220809185747-mutt-send-email-mst@kernel.org>
In-Reply-To: <PH0PR12MB54816FFF167D3EA3EF5F075FDC629@PH0PR12MB5481.namprd12.prod.outlook.com>

On Tue, Aug 09, 2022 at 10:49:48PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, August 9, 2022 6:26 PM
> > To: Parav Pandit <parav@nvidia.com>
> > Cc: Si-Wei Liu <si-wei.liu@oracle.com>; Jason Wang
> > <jasowang@redhat.com>; Gavin Li <gavinl@nvidia.com>; Hemminger,
> > Stephen <stephen@networkplumber.org>; davem
> > <davem@davemloft.net>; virtualization
> > <virtualization@lists.linux-foundation.org>;
> > Virtio-Dev <virtio-dev@lists.oasis-open.org>;
> > jesse.brandeburg@intel.com; alexander.h.duyck@intel.com;
> > kubakici@wp.pl; sridhar.samudrala@intel.com; loseweigh@gmail.com; Gavi
> > Teitz <gavi@nvidia.com>
> > Subject: Re: [virtio-dev] [PATCH] virtio-net: use mtu size as buffer length for
> > big packets
> > 
> > On Tue, Aug 09, 2022 at 09:49:03PM +0000, Parav Pandit wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, August 9, 2022 5:38 PM
> > >
> > > [..]
> > > > > > I think virtio-net driver doesn't differentiate MTU and MRU, in
> > > > > > which case the receive buffer will be reduced to fit the 1500B
> > > > > > payload size when mtu is lowered down to 1500 from 9000.
> > > > > > How? Driver reduced the mXu to 1500, say it is improved to post
> > > > > > buffers of 1500 bytes.
> > > > >
> > > > > Device doesn't know about it because mtu in config space is RO field.
> > > > > > Device keep dropping 9K packets because buffers posted are 1500
> > > > > > bytes.
> > > > > > This is because device follows the spec " The device MUST NOT pass
> > > > > > received packets that exceed mtu".
> > > >
> > > >
> > > > The "mtu" here is the device config field, which is
> > > >
> > > >         /* Default maximum transmit unit advice */
> > > >
> > >
> > > It is the field from struct virtio_net_config.mtu. right?
> > > This is RO field for driver.
> > >
> > > > there is no guarantee device will not get a bigger packet.
> > > Right. That is what I also hinted.
> > > Hence, allocating buffers worth upto mtu is safer.
> > 
> > yes
> > 
> > > When user overrides it, driver can be further optimized to honor such
> > > new value on rx buffer posting.
> > 
> > no, not without a feature bit promising device won't get wedged.
> > 
> I mean to say as_it_stands today, driver can decide to post smaller buffers with larger mtu.
> Why device should be affected with it?
> ( I am not proposing such weird configuration but asking for sake of correctness).

They just are because drivers did not do this.

> > > > And there is no guarantee such a packet will be dropped as opposed
> > > > to wedging the device if userspace insists on adding smaller buffers.
> > > >
> > > If user space insists on small buffers, so be it.
> > 
> > If previously things worked, the "so be it" is a regression and blaming users
> > won't help us.
> > 
> I am not suggesting above.
> This was Si-Wei's suggestion that somehow driver wants to post smaller buffers than the mtu because user knows what peer is doing.
> So may be driver can be extended to give more weight on user config.
> 
> > > It only works when user exactly know what user is doing in the whole
> > > network.
> > 
> > If you want to claim this you need a new feature bit.
> > 
> Why is a new bit needed to tell device?
> User is doing something its own config mismatching the buffers and mtu.
> A solid use case hasn't emerged for this yet.
> 
> If user wants to modify the mtu, we should just make virtio_net_config.mtu as RW field using new feature bit.
> Is that what you mean?
> If so, yes, it makes things very neat where driver and device are aligned to each other, the way they are today.
> Only limitation is that its one-way. = device tells to driver.
> 
> > > When user prefers to override the device RO field, device is in the dark
> > > and things work on best effort basis.
> > 
> > Dropping packets is best effort. Getting stuck forever isn't, that's a quality of
> > implementation issue.
> >
> Not sure, why things get stuck for ever. Maybe you have example to explain.
> I am mostly missing something.

I sent an explanation a bit earlier. It's more or less a bug.

> > > This must be a reasonably advance user who has good knowledge of its
> > > network topology etc.
> > >
> > > For such case, may be yes, driver should be further optimized.
> > >
