From: Jason Wang <jasowang@redhat.com>
To: Michael Dalton <mwdalton@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
netdev@vger.kernel.org,
lf-virt <virtualization@lists.linux-foundation.org>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
Date: Fri, 27 Dec 2013 11:04:46 +0800 [thread overview]
Message-ID: <52BCEE4E.7070106@redhat.com> (raw)
In-Reply-To: <CANJ5vPJX1yTKJXpFe+PO7u89-tYRjNPdJwrBLjAop3veye1pDg@mail.gmail.com>
On 12/27/2013 04:06 AM, Michael Dalton wrote:
> On Mon, Dec 23, 2013 at 4:51 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> OK so a high level benchmark shows it's worth it,
>> but how well does the logic work?
>> I think we should make the buffer size accessible in sysfs
>> or debugfs, and look at it, otherwise we don't really know.
>>
> Exporting the size sounds good to me, it is definitely an
> important metric and would give more visibility to the admin.
>
> Do you have a preference for implementation strategy? I was
> thinking just add a DEVICE_ATTR to create a read-only sysfs file,
> 'mergeable_rx_buffer_size', and return a space-separated list of the
> current buffer size (computed from the average packet size) for each
> receive queue. -EINVAL or a similar error could be returned if the
> netdev was not configured for mergeable rx buffers.
>
>> I don't get the real motivation for this.
>>
>> We have skbs A,B,C sharing a page, with chunk D being unused.
>> This randomly charges chunk D to an skb that ended up last
>> in the page.
>> Correct?
>> Why does this make sense?
> The intent of this code is to adjust the SKB true size for
> the packet. We should completely use each packet buffer except
> for the last buffer. For all buffers except the last buffer, it
> should be the case that 'len' (bytes received) = buffer size. For
> the last buffer, this code adjusts the truesize by comparing the
> approximated buffer size with the bytes received into the buffer,
> and adding the difference to the SKB truesize if the buffer size
> is greater than the number of bytes received.
But look like it does not take the 'hole' into account which may cause
under estimation of trusize.
>
> We approximate the buffer size by using the last packet buffer size
> from that same page, which as you have correctly noted may be a buffer
> that belongs to a different packet on the same virtio-net device. This
> buffer size should be very close to the actual buffer size because our
> EWMA estimator uses a high weight (so the packet buffer size changes very
> slowly) and there are only a handful packets on a page (even order-3).
We can make this more accurate by using extra data structure to track
the real buf size and using it as token.
>> Why head_skb only? Why not full buffer size that comes from host?
>> This is simply len.
> Sorry, I believe this code fragment should be clearer. Basically, we
> have a corner case in that for packets with size <= GOOD_COPY_LEN, there
> are no frags because page_to_skb() already unref'd the page and the entire
> packet contents are copied to skb->data. In this case, the SKB truesize
> is already accurate and should not be updated (and it would be unsafe to
> access page->private as page is already unref'd).
>
> I'll look at the above code again and cleanup (please let me know if you
> have a preference) and/or add a comment to clarify.
>
> Best,
>
> Mike
next prev parent reply other threads:[~2013-12-27 3:04 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-17 0:16 [PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill Michael Dalton
2013-12-17 0:16 ` [PATCH net-next 2/3] virtio-net: use per-receive queue page frag alloc for mergeable bufs Michael Dalton
2013-12-23 8:12 ` Jason Wang
2013-12-23 17:27 ` Eric Dumazet
2013-12-23 19:37 ` Michael S. Tsirkin
2013-12-26 21:28 ` Michael Dalton
2013-12-26 21:37 ` Michael S. Tsirkin
2013-12-26 22:00 ` Eric Dumazet
2014-01-08 17:21 ` Michael S. Tsirkin
2014-01-08 18:09 ` Eric Dumazet
2014-01-08 18:57 ` Michael S. Tsirkin
2014-01-08 19:54 ` David Miller
2014-01-08 21:16 ` Rick Jones
2013-12-26 21:56 ` Eric Dumazet
2013-12-27 4:55 ` Jason Wang
2013-12-27 5:46 ` Eric Dumazet
2013-12-27 6:12 ` Jason Wang
2013-12-23 13:31 ` Michael S. Tsirkin
2013-12-17 0:16 ` [PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance Michael Dalton
2013-12-23 12:51 ` Michael S. Tsirkin
2013-12-23 13:33 ` Michael S. Tsirkin
2013-12-30 10:14 ` Amos Kong
2014-01-08 17:41 ` Michael S. Tsirkin
2013-12-26 7:33 ` Jason Wang
2013-12-26 20:06 ` Michael Dalton
2013-12-26 20:24 ` Michael S. Tsirkin
2013-12-27 3:04 ` Jason Wang [this message]
2013-12-27 21:41 ` Michael Dalton
2013-12-30 4:50 ` Jason Wang
2013-12-30 5:38 ` Jason Wang
2014-01-08 17:37 ` Michael S. Tsirkin
2013-12-19 19:58 ` [PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill David Miller
2013-12-23 13:35 ` Michael S. Tsirkin
2013-12-23 7:52 ` Jason Wang
2013-12-23 17:24 ` Eric Dumazet
2013-12-23 12:53 ` Michael S. Tsirkin
2013-12-23 17:30 ` Eric Dumazet
2013-12-23 19:19 ` Michael S. Tsirkin
2013-12-24 22:46 ` David Miller
2014-01-03 0:42 ` Debabrata Banerjee
2014-01-03 0:56 ` Eric Dumazet
2014-01-03 1:26 ` Eric Dumazet
2014-01-03 1:59 ` Debabrata Banerjee
2014-01-03 22:47 ` Debabrata Banerjee
2014-01-03 22:54 ` Eric Dumazet
2014-01-03 23:27 ` Debabrata Banerjee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52BCEE4E.7070106@redhat.com \
--to=jasowang@redhat.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=mst@redhat.com \
--cc=mwdalton@google.com \
--cc=netdev@vger.kernel.org \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).