All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Michael Dalton <mwdalton@google.com>
Cc: netdev@vger.kernel.org,
	lf-virt <virtualization@lists.linux-foundation.org>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
Date: Thu, 26 Dec 2013 22:24:13 +0200	[thread overview]
Message-ID: <20131226202413.GA10215@redhat.com> (raw)
In-Reply-To: <CANJ5vPJX1yTKJXpFe+PO7u89-tYRjNPdJwrBLjAop3veye1pDg@mail.gmail.com>

On Thu, Dec 26, 2013 at 12:06:23PM -0800, Michael Dalton wrote:
> On Mon, Dec 23, 2013 at 4:51 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > OK so a high level benchmark shows it's worth it,
> > but how well does the logic work?
> > I think we should make the buffer size accessible in sysfs
> > or debugfs, and look at it, otherwise we don't really know.
> >
> Exporting the size sounds good to me, it is definitely an
> important metric and would give more visibility to the admin.
> 
> Do you have a preference for implementation strategy? I was
> thinking just add a DEVICE_ATTR to create a read-only sysfs file,
> 'mergeable_rx_buffer_size', and return a space-separated list of the
> current buffer size (computed from the average packet size) for each
> receive queue. -EINVAL or a similar error could be returned if the
> netdev was not configured for mergeable rx buffers.

Not too important really. Maybe an attribute per queue? Might be
easier to parse and does not need tricky formatting.

> > I don't get the real motivation for this.
> >
> > We have skbs A,B,C sharing a page, with chunk D being unused.
> > This randomly charges chunk D to an skb that ended up last
> > in the page.
> > Correct?
> > Why does this make sense?
> 
> The intent of this code is to adjust the SKB true size for
> the packet. We should completely use each packet buffer except
> for the last buffer. For all buffers except the last buffer, it
> should be the case that 'len' (bytes received) = buffer size. For
> the last buffer, this code adjusts the truesize by comparing the
> approximated buffer size with the bytes received into the buffer,
> and adding the difference to the SKB truesize if the buffer size
> is greater than the number of bytes received.
> 
> We approximate the buffer size by using the last packet buffer size
> from that same page, which as you have correctly noted may be a buffer
> that belongs to a different packet on the same virtio-net device. This
> buffer size should be very close to the actual buffer size because our
> EWMA estimator uses a high weight (so the packet buffer size changes very
> slowly) and there are only a handful packets on a page (even order-3).

I'll need to ponder it, don't understand it exactly.

> > Why head_skb only? Why not full buffer size that comes from host?
> > This is simply len.
> 
> Sorry, I believe this code fragment should be clearer. Basically, we
> have a corner case in that for packets with size <= GOOD_COPY_LEN, there
> are no frags because page_to_skb() already unref'd the page and the entire
> packet contents are copied to skb->data. In this case, the SKB truesize
> is already accurate and should not be updated (and it would be unsafe to
> access page->private as page is already unref'd).
> 
> I'll look at the above code again and cleanup (please let me know if you
> have a preference) and/or add a comment to clarify.
> 
> Best,
> 
> Mike

  reply	other threads:[~2013-12-26 20:24 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-17  0:16 [PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill Michael Dalton
2013-12-17  0:16 ` [PATCH net-next 2/3] virtio-net: use per-receive queue page frag alloc for mergeable bufs Michael Dalton
2013-12-23  8:12   ` Jason Wang
2013-12-23 17:27     ` Eric Dumazet
2013-12-23 19:37       ` Michael S. Tsirkin
2013-12-26 21:28         ` Michael Dalton
2013-12-26 21:37           ` Michael S. Tsirkin
2013-12-26 22:00             ` Eric Dumazet
2014-01-08 17:21               ` Michael S. Tsirkin
2014-01-08 18:09                 ` Eric Dumazet
2014-01-08 18:57                   ` Michael S. Tsirkin
2014-01-08 19:54                   ` David Miller
2014-01-08 21:16                   ` Rick Jones
2013-12-26 22:00             ` Eric Dumazet
2013-12-26 21:56           ` Eric Dumazet
2013-12-26 21:56           ` Eric Dumazet
2013-12-27  4:55             ` Jason Wang
2013-12-27  5:46               ` Eric Dumazet
2013-12-27  6:12                 ` Jason Wang
2013-12-23 13:31   ` Michael S. Tsirkin
2013-12-17  0:16 ` [PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance Michael Dalton
2013-12-17  0:16 ` Michael Dalton
2013-12-23 12:51   ` Michael S. Tsirkin
2013-12-23 13:33   ` Michael S. Tsirkin
2013-12-30 10:14     ` Amos Kong
2014-01-08 17:41       ` Michael S. Tsirkin
2013-12-26  7:33   ` Jason Wang
2013-12-26 20:06     ` Michael Dalton
2013-12-26 20:24       ` Michael S. Tsirkin [this message]
2013-12-27  3:04       ` Jason Wang
2013-12-27 21:41         ` Michael Dalton
2013-12-30  4:50           ` Jason Wang
2013-12-30  5:38           ` Jason Wang
2014-01-08 17:37           ` Michael S. Tsirkin
2013-12-19 19:58 ` [PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill David Miller
2013-12-23 13:35   ` Michael S. Tsirkin
2013-12-23  7:52 ` Jason Wang
2013-12-23 17:24   ` Eric Dumazet
2013-12-23 17:24   ` Eric Dumazet
2013-12-23 12:53 ` Michael S. Tsirkin
2013-12-23 17:30   ` Eric Dumazet
2013-12-23 19:19     ` Michael S. Tsirkin
2013-12-24 22:46 ` David Miller
2014-01-03  0:42   ` Debabrata Banerjee
2014-01-03  0:42   ` Debabrata Banerjee
2014-01-03  0:56     ` Eric Dumazet
2014-01-03  1:26       ` Eric Dumazet
2014-01-03  1:59         ` Debabrata Banerjee
2014-01-03 22:47           ` Debabrata Banerjee
2014-01-03 22:47           ` Debabrata Banerjee
2014-01-03 22:54             ` Eric Dumazet
2014-01-03 23:27               ` Debabrata Banerjee
2014-01-03 23:27               ` Debabrata Banerjee
2014-01-03  1:59         ` Debabrata Banerjee
2013-12-24 22:46 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131226202413.GA10215@redhat.com \
    --to=mst@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=mwdalton@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.