Discussion of the VIRTIO specification
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Vitaly Mireyno <vmireyno@marvell.com>,
	"virtio-comment@lists.oasis-open.org"
	<virtio-comment@lists.oasis-open.org>
Subject: [virtio-comment] Re: [EXT] Re: [virtio-comment] [PATCH] virtio-net: Add an optional device control over the receive buffers length
Date: Thu, 23 Jan 2020 02:46:11 -0500	[thread overview]
Message-ID: <20200123023750-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <99c01a11-cff7-03c6-4e4b-fc05e5a31405@redhat.com>

On Thu, Jan 23, 2020 at 03:10:52PM +0800, Jason Wang wrote:
> 
> On 2020/1/23 下午2:55, Michael S. Tsirkin wrote:
> > On Wed, Jan 22, 2020 at 04:06:30PM +0000, Vitaly Mireyno wrote:
> > > 
> > > > -----Original Message-----
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, 20 January, 2020 18:25
> > > > To: Vitaly Mireyno <vmireyno@marvell.com>
> > > > Cc: virtio-comment@lists.oasis-open.org; Jason Wang
> > > > <jasowang@redhat.com>
> > > > Subject: [EXT] Re: [virtio-comment] [PATCH] virtio-net: Add an optional device
> > > > control over the receive buffers length
> > > > 
> > > > External Email
> > > > 
> > > > ----------------------------------------------------------------------
> > > > On Mon, Dec 30, 2019 at 01:59:17PM +0000, Vitaly Mireyno wrote:
> > > > > This patch gives devices some level of control over the receive buffers
> > > > length.
> > > > > The driver declares the minimum receive buffer length, and the device
> > > > requests max/min buffer length ratio.
> > > > > This is a follow-up to the "[PATCH] virtio-net: Add equal-sized
> > > > > receive buffers feature" discussion:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.oasis-2Dope
> > > > > n.org_archives_virtio-
> > > > 2Dcomment_201912_msg00007.html&d=DwIFAg&c=nKjWec
> > > > 2b6R0mOyPaz7xtfQ&r=lDHJ2FW52oJ3lqqsArgFRdcevq01tbLQAw4A_NO7xgI&
> > > > m=jrQgZ
> > > > > -
> > > > iN3QnvFqxQQkoPTmztJRTNCXTF2dz68TEV0cA&s=pDtISMJpw1N9ohot9fluo1N
> > > > j1X2s1
> > > > > J_sEVJGg2uwkZk&e=
> > > > > 
> > > > > 
> > > > > Signed-off-by: Vitaly Mireyno <vmireyno@marvell.com>
> > > > > ---
> > > > >   content.tex | 41 +++++++++++++++++++++++++++++++++++++++--
> > > > >   1 file changed, 39 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/content.tex b/content.tex index d68cfaf..0a4cfba 100644
> > > > > --- a/content.tex
> > > > > +++ b/content.tex
> > > > > @@ -2815,6 +2815,13 @@ \subsection{Feature bits}\label{sec:Device
> > > > > Types / Network Device / Feature bits
> > > > \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through
> > > > control
> > > > >       channel.
> > > > > 
> > > > > +\item[VIRTIO_NET_F_RXBUF_LEN_RATIO(57)] Device requests to limit the
> > > > > +    maximum/minimum receive buffers length ratio.
> > > > > +
> > > > > +\item[VIRTIO_NET_F_RXBUF_MIN_LEN(58)] Device requests to know the
> > > > minimum
> > > > > +    receive buffers length. Driver declares the minimum receive buffers
> > > > > +    length.
> > > > > +
> > > > >   \item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact
> > > > \field{hdr_len}
> > > > >       value. Device benefits from knowing the exact header length.
> > > > > 
> > > > So I am wondering, when we are switching from regular skbs to XDP, all
> > > > buffers become 4K. When we switch back, we get variability again.
> > > > All the while some old buffers can be outstanding.
> > > > 
> > > > How will we handle this?
> > > > 
> > > The feature limits the flexibility of the driver to set the descriptors length.
> > > It doesn't limit the actual buffer allocation size. So if XDP requires a larger headroom, I don't see a problem. Please excuse my ignorance, but what do you mean by "all buffers become 4K"?
> > So linux can switch between skb and XDP mode. In skb mode buffer size
> > varies, it works by merging multiple buffers. In XDP a single buffer
> > is made big enough to hold the whole packet. Length is fixed.
> > 
> > If we are in XDP mode but buffer was added while in skb mode,
> > or vice versa, we recover generally by copying data to
> > the correct buffer. This is a temporary slowdown -
> > better than dropping packets.
> > 
> > So let's assume the device ratio is 1.
> > I guess while in XDP mode, we'll write XDP buffer length into
> > this field. But in skb mode, we can make the buffer smaller.
> > This implies driver needs to change the min_rx_buf_len?
> 
> 
> I'm not sure I get here. The header room should be invisible from device
> point of view I think.
> 
> Thanks
> 

In fact I am confused. We have this comment:

                /* This happens when rx buffer size is underestimated
                 * or headroom is not enough because of the buffer
                 * was refilled before XDP is set. This should only
                 * happen for the first several packets, so we don't
                 * care much about its performance.
                 */

but what does ensure that num_buf == 1 except for the first several
packets?

We seem to be calling add_recvbuf_mergeable which in turn uses ewma
to estimate the packet size, but it seems that XDP never updates ewma so
the size will be whatever it happened to be, no?

I guess we need a counter in this slow path so we can notice it
happening ...

> > 
> > 
> > 
> > > > > @@ -2861,8 +2868,8 @@ \subsubsection{Legacy Interface: Feature
> > > > > bits}\label{sec:Device Types / Network  \subsection{Device
> > > > > configuration layout}\label{sec:Device Types / Network Device / Device
> > > > > configuration layout}  \label{sec:Device Types / Block Device /
> > > > > Feature bits / Device configuration layout}
> > > > > 
> > > > > -Three driver-read-only configuration fields are currently defined.
> > > > > The \field{mac} address field -always exists (though is only valid if
> > > > > VIRTIO_NET_F_MAC is set), and
> > > > > +The driver-read-only \field{mac} address field always exists (though
> > > > > +is only valid if VIRTIO_NET_F_MAC is set), and
> > > > >   \field{status} only exists if VIRTIO_NET_F_STATUS is set. Two
> > > > > read-only bits (for the driver) are currently defined for the status field:
> > > > >   VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE.
> > > > > @@ -2882,12 +2889,22 @@ \subsection{Device configuration
> > > > > layout}\label{sec:Device Types / Network Device  VIRTIO_NET_F_MTU is
> > > > > set. This field specifies the maximum MTU for the driver to  use.
> > > > > 
> > > > > +The device-read-only field \field{min_rx_buf_len} only exists if
> > > > > +VIRTIO_NET_F_RXBUF_MIN_LEN is set. This field specifies the minimum
> > > > > +receive buffers length.
> > > > > +
> > > > > +The driver-read-only field \field{rx_buf_len_ratio} only exists if
> > > > > +VIRTIO_NET_F_RXBUF_LEN_RATIO is set. This field specifies the
> > > > > +maximum/minimum receive buffers length ratio.
> > > > > +
> > Config space access is from point of view of driver (it's part of device
> > so it does not make sense to talk about device reading it).
> > The point is driver is not supposed to read it, right?
> > So s/device-read-only/driver-write-only/ or better "write-only for
> > driver".
> > 
> > > > >   \begin{lstlisting}
> > > > >   struct virtio_net_config {
> > > > >           u8 mac[6];
> > > > >           le16 status;
> > > > >           le16 max_virtqueue_pairs;
> > > > >           le16 mtu;
> > > > > +        le32 min_rx_buf_len;
> > > > > +        le16 rx_buf_len_ratio;
> > > > >   };
> > > > >   \end{lstlisting}
> > > > > 
> > > > > @@ -2916,6 +2933,15 @@ \subsection{Device configuration
> > > > > layout}\label{sec:Device Types / Network Device  If the driver
> > > > > negotiates the VIRTIO_NET_F_STANDBY feature, the device MAY act  as a
> > > > standby device for a primary device with the same MAC address.
> > > > > +A driver SHOULD accept the VIRTIO_NET_F_RXBUF_MIN_LEN feature if
> > > > offered.
> > > > > +
> > > > > +If VIRTIO_NET_F_RXBUF_MIN_LEN feature has been negotiated, the
> > > > driver
> > > > > +MUST set \field{min_rx_buf_len}.
> > > > > +
> > > > > +A driver MUST NOT modify \field{min_rx_buf_len} once it has been set.
> > > > > +
> > > > > +A driver SHOULD accept the VIRTIO_NET_F_RXBUF_LEN_RATIO feature if
> > > > offered.
> > > > > +
> > > > >   \drivernormative{\subsubsection}{Device configuration layout}{Device
> > > > > Types / Network Device / Device configuration layout}
> > > > > 
> > > > >   A driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it.
> > > > > @@ -3281,6 +3307,17 @@ \subsubsection{Setting Up Receive
> > > > > Buffers}\label{sec:Device Types / Network Devi  If VIRTIO_NET_F_MQ is
> > > > > negotiated, each of receiveq1\ldots receiveqN  that will be used SHOULD be
> > > > populated with receive buffers.
> > > > > +If VIRTIO_NET_F_RXBUF_MIN_LEN feature has been negotiated, the
> > > > driver
> > > > > +MUST initialize all receive virtqueue descriptors \field{len} field
> > > > > +with the value greater than or equal to the value configured in the
> > > > > +\field{min_rx_buf_len} device configuration field, and allocate
> > > > > +receive buffers accordingly.
> > > > > +
> > > > > +If VIRTIO_NET_F_RXBUF_LEN_RATIO feature has been negotiated, the
> > > > > +driver MUST initialize all receive virtqueue descriptors \field{len}
> > > > > +field with the value less than or equal to (\field{min_rx_buf_len} *
> > > > > +\field{rx_buf_len_ratio}), and allocate receive buffers accordingly.
> > > > > +
> > > > >   \devicenormative{\paragraph}{Setting Up Receive Buffers}{Device Types
> > > > > / Network Device / Device Operation / Setting Up Receive Buffers}
> > > > > 
> > > > >   The device MUST set \field{num_buffers} to the number of descriptors
> > > > > used to
> > > > > --
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and to
> > > > > minimize spam in the list archive, subscription is required before
> > > > > posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.oasis-2Dope
> > > > > n.org_archives_virtio-
> > > > 2Dcomment_&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=l
> > > > > DHJ2FW52oJ3lqqsArgFRdcevq01tbLQAw4A_NO7xgI&m=jrQgZ-
> > > > iN3QnvFqxQQkoPTmztJ
> > > > > RTNCXTF2dz68TEV0cA&s=yHWSFWSoze0yJhs-
> > > > YBtijO34uqnnV0LoPfizi0miMNQ&e=
> > > > > Feedback License:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.oasis-
> > > > 2Dopen.
> > > > > org_who_ipr_feedback-
> > > > 5Flicense.pdf&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r
> > > > > =lDHJ2FW52oJ3lqqsArgFRdcevq01tbLQAw4A_NO7xgI&m=jrQgZ-
> > > > iN3QnvFqxQQkoPTmz
> > > > tJRTNCXTF2dz68TEV0cA&s=19yavBBd2Ai05CwKnyFOLRNlBnNAtmpKVsTMB6sI
> > > > TRE&e=
> > > > > List Guidelines:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.oasis-
> > > > 2Dopen.
> > > > > org_policies-2Dguidelines_mailing-
> > > > 2Dlists&d=DwIFAg&c=nKjWec2b6R0mOyPaz
> > > > > 7xtfQ&r=lDHJ2FW52oJ3lqqsArgFRdcevq01tbLQAw4A_NO7xgI&m=jrQgZ-
> > > > iN3QnvFqxQ
> > > > > QkoPTmztJRTNCXTF2dz68TEV0cA&s=3bbeWIoIuAK6RwARN2q-
> > > > t9hPTDAKbwOJ8fMRykpq
> > > > > ryk&e=
> > > > > Committee:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.oasis-
> > > > 2Dopen.
> > > > org_committees_virtio_&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=lDHJ2F
> > > > W52oJ
> > > > > 3lqqsArgFRdcevq01tbLQAw4A_NO7xgI&m=jrQgZ-
> > > > iN3QnvFqxQQkoPTmztJRTNCXTF2dz
> > > > > 68TEV0cA&s=U8z-sIxuNy_keXvWYnlIzloNziRKqC1pxOG-x3Hy6SY&e=
> > > > > Join OASIS:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.oasis-
> > > > 2Dopen.
> > > > org_join_&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=lDHJ2FW52oJ3lqqsAr
> > > > gFRdce
> > > > > vq01tbLQAw4A_NO7xgI&m=jrQgZ-
> > > > iN3QnvFqxQQkoPTmztJRTNCXTF2dz68TEV0cA&s=C4
> > > > > SQkiR2V1N3I_ri_7fwSiHmcVRNPp2FnVA4gR0-cgM&e=


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


  reply	other threads:[~2020-01-23  7:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-30 13:59 [virtio-comment] [PATCH] virtio-net: Add an optional device control over the receive buffers length Vitaly Mireyno
2020-01-03 15:22 ` Stefan Hajnoczi
2020-01-05 10:28   ` Michael S. Tsirkin
2020-01-05 10:35 ` [virtio-comment] " Michael S. Tsirkin
2020-01-20 16:24 ` [virtio-comment] " Michael S. Tsirkin
2020-01-22 16:06   ` [virtio-comment] RE: [EXT] " Vitaly Mireyno
2020-01-23  6:55     ` [virtio-comment] " Michael S. Tsirkin
2020-01-23  7:10       ` Jason Wang
2020-01-23  7:46         ` Michael S. Tsirkin [this message]
2020-01-23 10:00           ` Jason Wang
2020-01-23 12:08             ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200123023750-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=vmireyno@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox