xen-devel.lists.xenproject.org archive mirror
From: Matt Wilson <msw@amazon.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"Palagummi, Siva" <Siva.Palagummi@ca.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH RFC V2] xen/netback: Count ring slots properly when larger MTU sizes are used
Date: Mon, 17 Dec 2012 12:09:52 -0800	[thread overview]
Message-ID: <20121217200950.GA29382@u109add4315675089e695.ant.amazon.com> (raw)
In-Reply-To: <1355743598.14620.43.camel@zakaz.uk.xensource.com>

On Mon, Dec 17, 2012 at 11:26:38AM +0000, Ian Campbell wrote:
> On Fri, 2012-12-14 at 18:53 +0000, Matt Wilson wrote:
> > On Thu, Dec 13, 2012 at 11:12:50PM +0000, Palagummi, Siva wrote:
> > > > -----Original Message-----
> > > > From: Matt Wilson [mailto:msw@amazon.com]
> > > > Sent: Wednesday, December 12, 2012 3:05 AM
> > > >
> > > > On Tue, Dec 11, 2012 at 10:25:51AM +0000, Palagummi, Siva wrote:
> > > > >
> > > > > You can clearly see below that copy_off is input to
> > > > > start_new_rx_buffer while copying frags.
> > > > 
> > > > Yes, but that's the right thing to do. copy_off should be set to the
> > > > destination offset after copying the last byte of linear data, which
> > > > means "skb_headlen(skb) % PAGE_SIZE" is correct.
> > > > 
> > >
> > > No, it is not correct, for two reasons. First, what if
> > > skb_headlen(skb) is exactly a multiple of PAGE_SIZE? copy_off
> > > would be set to zero, and if any data exists in the frags, zero
> > > would be passed in as the copy_off value and start_new_rx_buffer
> > > would return false. Second, there is the obvious case in the
> > > current code where a hole of offset_in_page(skb->data) bytes is
> > > left in the first buffer after the first pass, when the remaining
> > > data to be copied would overflow the first buffer.
> > 
> > Right, and I'm arguing that having the code leave a hole is less
> > desirable than potentially increasing the number of copy
> > operations. I'd like to hear from Ian and others whether using the
> > buffers efficiently is more important than reducing copy operations.
> > Intuitively, I think it's more important to use the ring efficiently.
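To make the corner case above concrete, here's a rough model of the
decision logic (reconstructed from memory, so it's illustrative rather
than the exact netback code):

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE		4096UL
#define MAX_BUFFER_OFFSET	PAGE_SIZE

/*
 * Illustrative approximation of start_new_rx_buffer(): decide whether
 * the next chunk of data should open a fresh receive buffer.  Not the
 * actual kernel code.
 */
static bool start_new_rx_buffer(unsigned long offset, unsigned long size,
				bool head)
{
	/* The current buffer is completely full. */
	if (offset == MAX_BUFFER_OFFSET)
		return true;

	/*
	 * A chunk that would cross a buffer boundary, but could fit in
	 * an empty buffer by itself, gets a fresh buffer (avoiding a
	 * split copy).
	 */
	if (offset + size > MAX_BUFFER_OFFSET &&
	    size <= MAX_BUFFER_OFFSET && offset != 0 && !head)
		return true;

	return false;
}
```

When skb_headlen(skb) is an exact multiple of PAGE_SIZE, copy_off
becomes skb_headlen(skb) % PAGE_SIZE == 0, so offset is zero and this
returns false for a following frag even though the previous buffer was
filled completely.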
> 
> Do you mean the ring or the actual buffers?

Sorry, the actual buffers.

> The current code tries to coalesce multiple small frags/heads because it
> is usually trivial but doesn't try too hard with multiple larger frags,
> since they take up most of a page by themselves anyway. I suppose this
> does waste a bit of buffer space and therefore could take more ring
> slots, but it's not clear to me how much this matters in practice (it
> might be tricky to measure this with any realistic workload).

In the case where we're consistently handling large heads (as when
using an MTU of 9000 for streaming traffic), we're wasting about a
third of the available buffers.
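As a back-of-the-envelope sketch (illustrative numbers, assuming
PAGE_SIZE-sized buffers and a head that keeps its source page offset
rather than being packed tightly):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/*
 * Rough model: number of receive buffers a linear head consumes when
 * each source-page chunk keeps its page offset instead of being packed
 * into partially filled buffers.  Illustrative only.
 */
static unsigned long slots_for_head(unsigned long headlen,
				    unsigned long offset)
{
	return (offset + headlen + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

A ~9000-byte head at a small offset lands in 3 buffers, i.e. 12288
bytes of buffer space holding ~9000 bytes of data, so roughly a quarter
to a third of the space sits idle, depending on header sizes.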

> The cost of splitting a copy into two should be low, though: the copies
> are already batched into a single hypercall, and I'd expect things to be
> mostly dominated by the data copy itself rather than by the setup of
> each individual op. That would argue for splitting a copy in two if it
> helps fill the buffers.

That was my thought as well. We're testing a patch that does just this
now.
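The shape of the idea (a sketch, not the patch itself): walk each
source chunk and emit one grant-copy operation per destination buffer
it touches, so buffers get filled completely; all of the emitted ops
are still batched into the same GNTTABOP_copy hypercall.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/*
 * Sketch: split a source chunk at destination buffer boundaries and
 * count the grant-copy operations that would be emitted.  In netback
 * these would all go into one GNTTABOP_copy batch.
 */
static unsigned int emit_copies(unsigned long copy_off, unsigned long size)
{
	unsigned int ops = 0;

	while (size > 0) {
		unsigned long space = PAGE_SIZE - copy_off;
		unsigned long chunk = size < space ? size : space;

		/* One copy op covering [copy_off, copy_off + chunk). */
		ops++;
		size -= chunk;
		copy_off = (copy_off + chunk) % PAGE_SIZE;
	}
	return ops;
}
```

So a 4096-byte frag landing at copy_off 100 costs two ops instead of
one, but both destination buffers end up completely full.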

> The flip side is that once you get past the headers etc. the paged frags
> likely tend to be either bits and bobs (fine) or mostly whole pages. In
> the whole-pages case, trying to fill the buffers will result in every
> copy getting split. My gut tells me that the whole-pages case probably
> dominates, but I'm not sure what the real-world impact of splitting all
> the copies would be.

Right, I'm less concerned about the paged frags. It might make sense
to skip some space so that the copying can be page aligned. I suppose
it depends on how many different pages are in the frag list, and what
the total size is.

In practice I'd think it would be rare to see a paged SKB for ingress
traffic to domUs unless there is significant intra-host communication
(dom0->domU, domU->domU). When domU ingress traffic is originating
from an Ethernet device it shouldn't be paged. Paged SKBs would come
into play when an SKB is formed for transmit on an egress device that
is SG-capable. Or am I misunderstanding how paged SKBs are used these
days?

Matt
