xen-devel.lists.xenproject.org archive mirror
From: Matt Wilson <msw@amazon.com>
To: "Palagummi, Siva" <Siva.Palagummi@ca.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH RFC V2] xen/netback: Count ring slots properly when larger MTU sizes are used
Date: Tue, 11 Dec 2012 13:34:39 -0800	[thread overview]
Message-ID: <20121211213437.GA29869@u109add4315675089e695.ant.amazon.com> (raw)
In-Reply-To: <7D7C26B1462EB14CB0E7246697A18C13145668@INHYMS111A.ca.com>

On Tue, Dec 11, 2012 at 10:25:51AM +0000, Palagummi, Siva wrote:
> > -----Original Message-----
> > From: Matt Wilson [mailto:msw@amazon.com]
> > Sent: Thursday, December 06, 2012 11:05 AM
> > To: Palagummi, Siva
> > Cc: Ian Campbell; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH RFC V2] xen/netback: Count ring slots
> > properly when larger MTU sizes are used
> > 
> > On Wed, Dec 05, 2012 at 11:56:32AM +0000, Palagummi, Siva wrote:
> > > Matt,
> > [...]
> > > You are right. The above chunk which is already part of the upstream
> > > is unfortunately incorrect for some cases. We also ran into issues
> > > in our environment around a week back and found this problem. The
> > > count will be different based on head len because of the
> > > optimization that start_new_rx_buffer is trying to do for large
> > > buffers.  A hole of size "offset_in_page" will be left in first page
> > > during copy if the remaining buffer size is >=PAG_SIZE. This
> > > subsequently affects the copy_off as well.
> > >
> > > So xen_netbk_count_skb_slots actually needs a fix to calculate the
> > > count correctly based on head len. And also a fix to calculate the
> > > copy_off properly to which the data from fragments gets copied.
> > 
> > Can you explain more about the copy_off problem? I'm not seeing it.
>
> You can clearly see below that copy_off is input to
> start_new_rx_buffer while copying frags.

Yes, but that's the right thing to do. copy_off should be set to the
destination offset after copying the last byte of linear data, which
means "skb_headlen(skb) % PAGE_SIZE" is correct.
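In other words, the accounting for the linear data amounts to something like this (a simplified standalone model, not the actual netback code; helper names are mine):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* One PAGE_SIZE destination buffer per page of linear data. */
static unsigned long head_slots(unsigned long headlen)
{
        return (headlen + PAGE_SIZE - 1) / PAGE_SIZE; /* DIV_ROUND_UP */
}

/* Destination offset just past the last byte of linear data copied. */
static unsigned long head_copy_off(unsigned long headlen)
{
        return headlen % PAGE_SIZE;
}
```

No hole is assumed in the destination buffers here; that is exactly the point of contention below.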

> So if the buggy "count" calculation below is fixed based on
> offset_in_page value then copy_off value also will change
> accordingly.

This calculation is not incorrect. You should only need as many
PAGE_SIZE buffers as you have linear data to fill.

>         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> 
>         copy_off = skb_headlen(skb) % PAGE_SIZE;
> 
>         if (skb_shinfo(skb)->gso_size)
>                 count++;
> 
>         for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>                 unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>                 unsigned long bytes;
>                 while (size > 0) {
>                         BUG_ON(copy_off > MAX_BUFFER_OFFSET);
> 
>                         if (start_new_rx_buffer(copy_off, size, 0)) {
>                                 count++;
>                                 copy_off = 0;
>                         }
> 
>
> So a correct calculation should be somewhat like below because of
> the optimization in start_new_rx_buffer for larger sizes.

start_new_rx_buffer() should not be starting a new buffer after the
first pass copying the linear data.
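For reference, the decision it makes when copying frag data is roughly the following (paraphrased from memory, so treat it as a sketch rather than the exact upstream source):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BUFFER_OFFSET 4096UL /* == PAGE_SIZE in netback */

/* Start a fresh ring buffer if the current one is full, or if this
 * chunk would straddle a buffer boundary but could fit cleanly in an
 * empty buffer -- provided some data is already in the current buffer
 * and we are not copying the head. */
static bool start_new_rx_buffer(unsigned long offset, unsigned long size,
                                int head)
{
        if (offset == MAX_BUFFER_OFFSET)
                return true;
        if (offset + size > MAX_BUFFER_OFFSET &&
            size <= MAX_BUFFER_OFFSET && offset && !head)
                return true;
        return false;
}
```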

>       linear_len = skb_headlen(skb);
>
>       count = (linear_len <= PAGE_SIZE)
>               ? 1
>               : DIV_ROUND_UP(offset_in_page(skb->data) + linear_len, PAGE_SIZE);
>
>       copy_off = ((offset_in_page(skb->data) + linear_len) < 2 * PAGE_SIZE)
>               ? linear_len % PAGE_SIZE
>               : (offset_in_page(skb->data) + linear_len) % PAGE_SIZE;

A change like this makes the code much more difficult to understand.

> > > max_required_rx_slots also may require a fix to account for the
> > > additional slot that may be required in case mtu >= PAGE_SIZE. For
> > > the worst case scenario, at least another +1.  One thing that is still
> > > puzzling here is, max_required_rx_slots seems to be assuming that
> > > linear length in head will never be greater than mtu size. But that
> > > doesn't seem to be the case all the time. I wonder if it requires
> > > some kind of fix there or special handling when count_skb_slots
> > > exceeds max_required_rx_slots.
> > 
> > We should only be using the number of pages required to copy the
> > data. The fix shouldn't be to anticipate wasting ring space by
> > increasing the return value of max_required_rx_slots().
> > 
>
> I do not think we are wasting any ring space. But just ensuring that
> we have enough before proceeding ahead.

For some SKBs with large linear buffers, we certainly are wasting
space. Go back and read the explanation in
  http://lists.xen.org/archives/html/xen-devel/2012-12/msg00274.html
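To make the waste concrete (hypothetical numbers, PAGE_SIZE = 4096): if a hole of offset_in_page bytes is preserved in the first destination buffer, a large linear area can consume one more ring slot than it needs.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Minimal slots needed to hold headlen bytes of linear data. */
static unsigned long slots_needed(unsigned long headlen)
{
        return (headlen + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Slots consumed if the copy leaves a hole of offset_in_page bytes in
 * the first buffer, i.e. preserves the source page alignment. */
static unsigned long slots_with_hole(unsigned long offset_in_page,
                                     unsigned long headlen)
{
        return (offset_in_page + headlen + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

E.g. 8192 bytes of linear data starting 64 bytes into a page fit in two slots, but the hole-preserving copy burns three.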

> > [...]
> > 
> > > > Why increment count by the /estimated/ count instead of the actual
> > > > number of slots used? We have the number of slots in the line just
> > > > above, in sco->meta_slots_used.
> > > >
> > >
> > > Count actually refers to ring slots consumed rather than meta_slots
> > > used.  Count can be different from meta_slots_used.
> > 
> > Aah, indeed. This can end up being too pessimistic if you have lots of
> > frags that require multiple copy operations. I still think that it
> > would be better to calculate the actual number of ring slots consumed
> > by netbk_gop_skb() to avoid other bugs like the one you originally
> > fixed.
> > 
>
> The counting done in count_skb_slots is exactly that. The fix above
> makes the two match, so there is no need to recalculate.

Today, the counting done in count_skb_slots() *does not* match the
number of buffer slots consumed by netbk_gop_skb().

Matt


Thread overview: 16+ messages
2012-08-29 12:21 [PATCH RFC V2] xen/netback: Count ring slots properly when larger MTU sizes are used Palagummi, Siva
2012-08-30  8:07 ` Ian Campbell
2012-08-30 10:26   ` Palagummi, Siva
2012-12-04 23:23   ` Matt Wilson
2012-12-05 11:56     ` Palagummi, Siva
2012-12-06  5:35       ` Matt Wilson
2012-12-11 10:25         ` Palagummi, Siva
2012-12-11 21:34           ` Matt Wilson [this message]
2012-12-13 23:12             ` Palagummi, Siva
2012-12-14 18:53               ` Matt Wilson
2012-12-17 11:26                 ` Ian Campbell
2012-12-17 20:09                   ` Matt Wilson
2012-12-18 10:02                     ` Ian Campbell
2012-12-18 19:43                       ` Matt Wilson
2012-12-20 10:05                         ` Ian Campbell
2012-12-20 21:42                           ` Matt Wilson
